Lapis app: a segmentation engine that understands grammar, the best dictionary lookups, SRS designed for language learning

Current project status: Development is currently slow. Subscriptions won’t be enabled until a full release with good support, so the app remains free to use until further notice.


TL;DR: I’m introducing a project called Lapis. Give it a try and let me know what you think: lapisapp.com.

This is a project by me and @alihmd. We would like to talk about why we think it can be the best companion for learning Japanese once you’re comfortable with kanji.

Features

Workspace (Segmentation)

Workspace is where you’ll be when you’re reading/listening. The main feature here is segmentation.

Segmentation

Segmentation is powered by an engine that understands how the language is structured. We’re striving to make it the best segmentation engine/parser out there: one that not only segments words but also recognizes grammar and shows you definitions and examples in place.

We think this is the defining feature of Lapis, as it integrates with many other features. And as we’ll see later in the SRS section, this page gives us the simplest, fastest way of creating sentence cards.

There are two targets here:

  • Beginners: It’s sometimes hard to know what to look up when the whole sentence is full of unknown structures and grammar. You also sometimes can’t tell where words start and end in a string of text, which makes a correct lookup even harder.
  • Advanced: Discover new grammar. Sometimes you might understand a sentence but not notice that a particular part is actually a well-defined grammar point.

Lapis aims to be a solution for both levels through segmentation. It recognizes all kinds of conjugations and nonstandard slang like すげぇ/うまっ, and can show a detailed construction, even when grammar is involved.

Here’s an example with “してあげたくなります”:

(Grammar points are visually distinguished: they are colored yellow.)
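
A breakdown of this kind could be represented roughly as follows. This is a hand-written illustration, not Lapis’s actual output or data model: the `Segment` type, its fields, and the choice of which pieces count as grammar points are all assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    surface: str      # text as it appears in the sentence
    base: str         # dictionary form of the piece
    note: str         # conjugation form or role (hand-written, illustrative)
    is_grammar: bool  # assumed flag for "grammar point" (yellow in the UI)


# Hand-written breakdown of してあげたくなります; not produced by any engine.
BREAKDOWN = [
    Segment("して", "する", "te-form of する (to do)", False),
    Segment("あげ", "あげる", "stem; てあげる = do something for someone", True),
    Segment("たく", "たい", "adverbial form of たい (want to)", True),
    Segment("なり", "なる", "masu-stem of なる (to become)", False),
    Segment("ます", "ます", "polite ending", False),
]


def surface_of(segments):
    """Join the segments back into the original text."""
    return "".join(s.surface for s in segments)


def grammar_points(segments):
    """Return only the segments flagged as grammar points."""
    return [s for s in segments if s.is_grammar]


print(surface_of(BREAKDOWN))
for seg in grammar_points(BREAKDOWN):
    print(seg.base, "-", seg.note)
```

Keeping the surface text on every segment means the original sentence can always be rebuilt from the pieces, which is a simple sanity check for any segmentation.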

Selecting any of these constructs shows a grammar definition (if the construct is a grammar point), and rich example sentences containing the construct:

This is an ongoing process. We haven’t yet added every grammar point in existence; we’ll keep adding more and improving the overall accuracy over time, as well as adding completely new helpful things we have planned.

Boards

A board is a temporary space where you pin sentences. It acts as an intermediary from which you can create cards or vocab sheets (coming later) in one go, or where you can pin example sentences you’re interested in checking later. We won’t talk much about it now, as it’ll be most helpful after we finish working on vocab sheets.

Lookup

This is an advanced lookup page with some nice features:

  • Results from JP-EN, JP-JP, example sentences, kanji, and names.
  • Pitch accent patterns and audio.
  • Advanced lookup queries such as ..人 to match anything that starts with any 2 characters and ends with 人. More info in the lookup page.
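
To make the ..人 query concrete, here is a minimal sketch of how such a wildcard could be matched against a word list. It is not how Lapis implements lookups. It assumes that each `.` stands for exactly one character and that every other character is literal; the function names and the sample word list are made up for the example.

```python
import re


def query_to_regex(query: str) -> re.Pattern:
    """Translate a lookup query into a regex.

    Assumption: "." means "any single character"; everything else is literal.
    """
    parts = ["." if ch == "." else re.escape(ch) for ch in query]
    return re.compile("".join(parts))


def lookup(query: str, headwords: list[str]) -> list[str]:
    """Return the headwords that match the whole query."""
    pattern = query_to_regex(query)
    return [word for word in headwords if pattern.fullmatch(word)]


# Hypothetical headword list, only for demonstration.
WORDS = ["日本人", "外国人", "人", "大人", "人間", "アメリカ人"]

print(lookup("..人", WORDS))
```

Under these assumptions, ..人 matches three-character words ending in 人, such as 日本人 and 外国人, but not 大人 or アメリカ人.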

Recursive lookups/segments are also supported. Just select and right-click any Japanese text in JP-JP or example sentences.

SRS

Why yet another SRS app?

There are lots of well-made SRS apps right now, old and new. But we had our own goals with this. Designing a system for language learning in particular can lead to more efficient reviews. We also wanted a much easier creation process, and we wanted smart dictionary entries to be integrated into cards.

Configuration

First, how often cards appear is controlled through two simple settings that are part of the deck settings:

  • Repetition count: This tells us how many times we want this card to appear in reviews until we consider it learned (aka how many levels the card will have).
  • Retirement interval: When the review interval reaches this value, the card is considered learned and won’t appear in reviews again.

Based on these two settings, we calculate the interval levels for each card. The defaults are 12 for repetition count and 233 days (about 7.7 months) for retirement interval; these defaults imitate a Fibonacci sequence. This means a card appears 12 times, and with each level the interval gets closer to the retirement interval. The last level’s interval is exactly the retirement interval, and once you pass that level the card is learned.
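
One way to picture how two settings can produce a full list of levels is the sketch below. It is only an illustration, not the formula Lapis uses: it assumes the intervals grow geometrically from 1 day up to the retirement interval, and `interval_levels` is a made-up name. For reference, the Fibonacci run 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233 has exactly 12 terms and ends at 233, which lines up with the defaults.

```python
def interval_levels(repetition_count: int = 12, retirement_days: int = 233) -> list[int]:
    """Return one review interval (in days) per level.

    Assumption: intervals grow by a constant ratio from 1 day to the
    retirement interval. The real calculation may differ.
    """
    if repetition_count < 1 or retirement_days < 1:
        raise ValueError("both settings must be at least 1")
    if repetition_count == 1:
        return [retirement_days]

    # Constant growth ratio so that the last level lands on the retirement interval.
    ratio = retirement_days ** (1 / (repetition_count - 1))
    levels = [max(1, round(ratio ** k)) for k in range(repetition_count)]

    # Pin the last level so it is exactly the retirement interval.
    levels[-1] = retirement_days
    return levels


print(interval_levels())        # 12 levels, ending at 233 days
print(interval_levels(5, 30))   # a shorter schedule with the same shape
```

With the defaults, the growth ratio comes out at roughly 1.64, close to the golden ratio that consecutive Fibonacci numbers approach, so the resulting levels stay near the Fibonacci values without matching them exactly.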

You can also optionally configure a daily distribution, in which you define percentages for certain times of day, so that a day’s reviews are spread out instead of all becoming available at once.
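
As a rough sketch of the idea, the function below splits a day’s due reviews across time slots by percentage. The slot format, the function name, and the rounding rule (leftover cards go to the slots with the largest fractional share) are assumptions for illustration, not a description of the actual setting.

```python
def distribute_reviews(total_due: int, schedule: dict[str, float]) -> dict[str, int]:
    """Split total_due reviews across time slots given as percentages.

    Assumptions: slots are "HH:MM" strings and percentages sum to 100.
    """
    if abs(sum(schedule.values()) - 100) > 1e-9:
        raise ValueError("percentages must sum to 100")

    # Exact (possibly fractional) share per slot, then round down.
    exact = {slot: total_due * pct / 100 for slot, pct in schedule.items()}
    counts = {slot: int(share) for slot, share in exact.items()}

    # Hand out the cards lost to rounding, largest fractional part first.
    leftover = total_due - sum(counts.values())
    by_fraction = sorted(exact, key=lambda slot: exact[slot] - counts[slot], reverse=True)
    for slot in by_fraction[:leftover]:
        counts[slot] += 1
    return counts


print(distribute_reviews(50, {"08:00": 50, "13:00": 30, "20:00": 20}))
```

With 50 reviews due and a 50/30/20 split, this yields 25, 15, and 10 reviews for the three slots.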

Creation

Creating cards from sentences is extremely easy from Workspace:

You type in a sentence, enter selection mode, and select the definitions and grammar that interest you. With one click, a card containing this info is created. Then in reviews, you’ll have the same nice dictionary entries with pitch/audio/etc., and you can even view the segmentation result of the original sentence from the review page.

In addition, creating cards this way actually creates a contextually aware card for the sentence. What we mean is that readings/definitions that don’t apply to the current context of the word are automatically removed from the card. Let’s see an example. If we’re adding this sentence to SRS:

This is how the 建前 entry will appear in reviews:

All the faded info in the segmentation page (that is, data that doesn’t apply to the current context) is automatically removed. This helps you always learn things in context instead of being bombarded with all the different meanings of a word that don’t apply to the current sentence.
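
Conceptually, the filtering step can be sketched like this. The `Sense` and `Entry` types and the `applies` flag are hypothetical, the glosses are paraphrased, and which sense is marked as applying is set by hand purely for the example; how Lapis actually decides what fits the context is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    gloss: str
    applies: bool  # hypothetical flag: does this sense fit the sentence?


@dataclass(frozen=True)
class Entry:
    word: str
    reading: str
    senses: tuple[Sense, ...]


def for_card(entry: Entry) -> Entry:
    """Keep only the senses that apply to the current sentence."""
    kept = tuple(sense for sense in entry.senses if sense.applies)
    return Entry(entry.word, entry.reading, kept)


# Hand-written entry; the flags are illustrative, not engine output.
tatemae = Entry(
    word="建前",
    reading="たてまえ",
    senses=(
        Sense("official stance; public position (as opposed to one's real feelings)", True),
        Sense("ceremony for raising the framework of a house", False),
    ),
)

card_entry = for_card(tatemae)
for sense in card_entry.senses:
    print(card_entry.word, card_entry.reading, "-", sense.gloss)
```

The card then carries only the one sense that matters for the sentence, while the full entry stays available in the dictionary.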


We believe the cards you create should be personal, that is, taken from what you’re reading or listening to. This creates a personal attachment to the card and will make you more motivated to acquire it with each review. Using decks created by others is, in our opinion, a bad idea, so we discourage it. We don’t have a “share your deck” feature, and that’s by design.


State of the project

Lapis is a learning platform for any language, but will be focusing purely on Japanese first until we reach a comfortable place. More support for other languages will come later if the project lives on.

Right now, we’re doing a public beta release. “Beta” not in the sense of stability, but in the sense of feature workflows. Feel free to give feedback that can help shape the platform.

Future Roadmap

We have a huge list of enhancements and new features alike planned for the future. This is a short list:

  • Register more and more grammar.
  • Vocabulary sheets: What if, when reading a chapter, you’d like to save the sentences with interesting vocab/grammar for easy viewing later on? What if you also want to share your vocabulary sheet with others (as in the case of book clubs)? The answer is vocab sheets.
  • Browser extension (similar to rikaikun, but applies segmentation and integrates with SRS).
  • Simulator graph for SRS.

Platform-wise:

  • Of course, a mobile app.

Financial plan

We plan to have several subscription plans. Details to be determined later.

Feedback

For now, we want all discussions to be focused in the same place, so we’ll be using this thread for that. Questions, feedback, suggestions? Let us know in this thread.

Later on, we’ll be using Discord for announcements and other things. If you’re interested, please be sure to join our Discord server: Discord


Thanks for reading!
Give it a try and let us know what you think: lapisapp.com.

78 Likes

Looks great! Easy to use, incredibly useful, aesthetically pleasing website, would recommend it for everyone to try out. Will definitely use it in the future.

7 Likes

This looks great!

Can you create a way to import a csv or list of sentences? I’m using a Kindle and exporting lists of vocab and context sentences, so pulling them all in at once would save a lot of menial copy/paste.

Also, it looks like selecting items for SRS does not create individual cards for each item, unless I’m missing something. I’m not sure I’d want to SRS one giant sentence with multiple explanations on it.

8 Likes

Looks like you’ve been doing some good work on this ^-^

I hope you don’t mind, but I’ve moved this post over to Japanese Language > Resources and I’m just recommending tagging @mods or emailing hello@wanikani.com for their blessing, since you’ve got a Patreon linked ^-^

9 Likes

This looks amazing! Will be giving it a proper test in a few weeks, but just based on a cursory glance it looks fantastic! This must have taken a lot of time and effort, thank you for sharing!

3 Likes

Thanks!

We prefer sentence cards, which many believe are the better way to review. So the idea is to enter a sentence in which you don’t know a word (or more, but preferably one), and then create a card from that, with the word selected to be shown on the back. Of course, you can always just write that word directly if you want to create a single-word card. We just chose to design this around sentence cards. If you want to talk about this more, let me know.

We want to do that, but it’s not straightforward, because for a single sentence you’d usually select the definitions/grammar you want to include (that is, the ones you want to appear automatically on the back of the card).

An option is to just not select anything and instead use the “segment” action for a card in the reviews page. So maybe we can just skip the inclusion step when importing big sentences. That might be a good compromise.

4 Likes

Ohh, thanks, I didn’t know where to best post this.

@mods If it’s against the rules I can remove the link.

2 Likes

Thanks!

It did! There was a very long period of prototyping, but we think we now have a really good foundation to build a new kind of segmentation engine.

1 Like

No probs - it’s only that campfire isn’t externally indexed, so people will only see it if they look in and some people have it muted entirely ^-^

3 Likes

In my experience, long and complicated sentences are not very useful in an SRS, so in order to drill certain key words into your head, it is best to just learn that word and add a shorter sentence or collocation instead. If you are “pinning” words (as the robot lady says) while reading native material, SRSing the whole convoluted sentence is unnecessary.

I was imagining just importing the whole sentence list and using the parser to select the bits you want in something like a sentence queue and just flip through them. No need to import the word field from the Kindle, and there is no way I’ve found to get bulk definitions out of a Kindle anyway.

Well, the aim was for this to be used in shorter sentences. Forgive the example sentence in the page.
I’ve tried both approaches for a long time (and I’m also aware of the pinning idea), and I still prefer a sentence card, if only to narrow down a small context for the word you’re learning. But in any case, we can introduce some kind of mode when selecting words that would create individual cards instead. That might be a good alternative for people who prefer word cards.

Wow, I think this is absolutely amazing! Will definitely try this out in detail, do you want feedback here?

I looked up a lot of vocabulary today while reading (ahem stumbling through) a few pages in a book, and I kind of created as notes what your app does for me with a simple copy and paste, I am stunned! :star2:

3 Likes

Nice. An option would be good (but also include the context sentence on the back, maybe?). I prefer learning in context too, but when mining real native material that may be slightly above one’s level, it is often just too much in one sentence without breaking it down into more digestible chunks.

2 Likes

Thanks!

Definitely. We have a lot of things we want to do so we’re deciding priorities, feedback will also help with that.

1 Like

This is so awesome! I’m really excited to try this app out and will start using it as my SRS for the next few weeks to help with feedback. After playing around with Lapis for a few minutes, here is my immediate feedback:

  • The tutorial is a little bit confusing. I’m not sure if it’s because of Dark Mode, but it was really difficult to actually understand what elements were being referenced by the tutorial boxes. For example, when it was talking about selecting checkboxes I genuinely couldn’t find any checkboxes on the screen.
  • Include a “Back” button on the cards during the tutorial so that I can reference the previous step. There is a lot of unique info/vocabulary thrown at us in the intro tutorial, and going back and forth would be helpful.
  • Like someone else mentioned, maybe try using a simpler sentence during the tutorial. I really like that the engine is capable of complex sentence parsing, but it would be easier to absorb some of these initial concepts with a much simpler example sentence and fewer cards on the screen.

I’m joining the discord and if I continue using after the first 2 weeks I’ll join the Patreon as well. Super excited for this!!! :slight_smile:

5 Likes

Also - can you add a Feedback channel to the Discord for us to use as we test?

1 Like

One option to consider is a subscription model where the user chooses the subscription amount, with a recommended amount shown by default. For example, the subscription may default to $5/month, but the user can change that to $0/month.

The justification is people who would be willing to pay might not do so if there’s too much friction between using the service and subscribing. If all users have to go through the subscription screen, it’s much easier to get people to pay.

Having this extra subscription screen is also friction for everyone who won’t pay, but they should be able to handle it for what the service will provide them. It may still keep some people out, though, due to friction.

Just a thought.

3 Likes

Looks awesome! Getting an ichi.moe vibe, but with a much cleaner interface and with SRS built in! Went to give it a test ride and after signing up… I can’t log in. :sweat_smile: Typing in my email and password just refreshes the login screen. Confirmed the email and reset my password to something new with no luck. :sweat_smile: Still, looks awesome though. :slight_smile:

5 Likes

I’m experiencing the same issue.

Safari on iOS14 (iPad)

First of all, very nice work. Lapis app looks really nice, and feels polished and responsive.

I registered an account and I’ve been testing a bit while reading today. The first issue I’ve run into leads me to the following question. Since this is also aimed at beginners, what’s the current state of parsing かな strings, especially for things like children’s books?

I tried with a small phrase でむかえた枝元さん. Lapis currently seems to have trouble recognizing 出迎えた.

(枝元 is a person’s name, which is also often a pain point for beginners, but I guess that’s way too difficult to recognize for current parsers. At least for the ones I’ve tried…)

5 Likes