Hey everyone! I wanted to let you know about some updates I’ve rolled out to this project in the last couple of days.
Improved accuracy
Based on everyone’s helpful feedback, I’ve retrained the model on a much larger dataset, including data from Twitter and Japanese web novels. You may have noticed a fairly big accuracy boost lately, especially on colloquial text. Thanks to everyone who gave advice, and keep it coming!
Open sourcing models and code
If you’re curious about this sort of thing or know how to code, you might like to check out the source code I used to create the data. You can also download the full model for use in whatever you like, at my GitHub page:
As always let me know if you have suggestions or issues!
Thanks to everyone who gave feedback to the model through the mechanism on the site! Thousands of sentences were sent as feedback, which was really helpful to me in seeing where the model was going wrong. Since the performance is now pretty good, I’ve removed the feedback mechanism, which means no more annoying messages asking you for feedback! That should make checking things much quicker. As always though, feel free to send me your thoughts or problems with the site, either here or at the email address on the website.
Make-over!
I decided to re-design the site with a brighter theme, since some people couldn’t read the text clearly. I’ve also set the default language to English.
Thanks for all your interest. Almost 10,000 sentences have now been checked on 文法ーCHECK!
Hmm, Ctrl+Z works for me on Chrome. What browser are you using? Yeah, all the formatting needs to be stripped on paste to stop weird fonts and characters coming in. There might be a way to preserve the line breaks though, I’ll have to look into that - thanks for bringing it up.
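For anyone curious what "strip the formatting but keep the line breaks" could look like, here is a minimal sketch of the text-cleaning step. It is not the site's actual code (which isn't shown in this thread); the function name `clean_pasted_text` is made up for illustration, and it assumes the paste has already been reduced to a plain string, which is what gets rid of fonts and styling. The function only deals with the "weird characters" part: it drops control and invisible formatting characters and keeps newlines.

```python
import unicodedata


def clean_pasted_text(raw: str) -> str:
    """Drop control/invisible characters from pasted text, keeping line breaks.

    Hypothetical helper for illustration; not taken from the site's code.
    """
    # Normalise Windows (\r\n) and old Mac (\r) line endings to \n,
    # and turn tabs into spaces so neighbouring words are not glued together.
    text = raw.replace("\r\n", "\n").replace("\r", "\n").replace("\t", " ")

    kept = []
    for ch in text:
        if ch == "\n":
            kept.append(ch)  # line breaks are preserved on purpose
            continue
        # Cc = control characters, Cf = invisible format characters
        # (zero-width space, byte-order mark, direction marks, ...).
        if unicodedata.category(ch) in ("Cc", "Cf"):
            continue
        kept.append(ch)
    return "".join(kept)


if __name__ == "__main__":
    print(clean_pasted_text("一行目\r\n二行目\u200b"))
```

Ordinary spaces, including the full-width space used in Japanese text, are left alone, since they are not in the control or format categories.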
Mm yeaaaah it is a whole other beast to give suggestions, hehe. It’s definitely something I want to look into and it’s been requested quite a bit. I’m a bit busy with my degree at the mo but when I have a bit more time I’m going to look into building a new model which is capable of this.
Thanks for the thoughts and for giving it a try! Very glad to hear it stood up to the text corrected by a native.
A big update which has been a long time coming! Starting today, 文法ーCHECK offers suggestions for fixing your sentence.
After hitting check, underlines will appear under any mistakes. Tap an underlined section to see the suggestion for fixing it.
Note that this is still a work in progress and, as always, might give incorrect results. Any and all feedback is welcome. I’m going to keep working to make the accuracy better and better over time.
I was using your website for a bit to quickly check if I had any major flaws in my grammar and it was very useful. Now the website is useless because I ran out of uses after less than 30 seconds. It’s all good, since there are plenty of other grammar checkers which let you have unlimited checks and also give you suggestions, such as Free Japanese Grammar Checker Online | Sapling
Thanks for taking the time to give some feedback! It’s always really useful to hear about how people are using the site. I can understand that the rate limiting on the new model prevents you from using the site in the same way as before, which must be frustrating. I’m sorry about that.
Part of the reason is that, in the AI world, giving better quality outputs usually means using a bigger model, and using a bigger model means paying a lot more money. It’s always a trade-off between quality and cost… in the case of Bunpo Check, my priority is on giving high-quality suggestions for smaller pieces of text. If you’re trying to do broader checks on large chunks of text, the site you linked may indeed be a better solution. But as a note on quality… I tried pasting the Wiki page about axolotls into both services. Here’s an example sentence comparison:
Bunpo Check basically just tries to make the last sentence more polite, which I think is probably a reasonable suggestion given the rest of the context. The suggestions by Sapling all seem quite incorrect or like possible bugs, as far as I can tell.
That’s not to say my site is always better for all sentences, or that Sapling is bad and that people shouldn’t use it. This is only a single example. I’m just trying to highlight the trade-off between quality and quantity here - giving people the ability to check lots of stuff for free means they probably can’t run the large, powerful models needed for higher accuracy (if I had to guess).
I’d urge anyone interested in this type of tool to try out all the available services and see what fits their use case best. Regardless of what you choose, I hope it can be helpful in your learning.
With all that said, I am currently working on changing some things in the backend to substantially increase the number of sentences people can check before hitting limits. Keep an eye out for updates coming soon.
I think the training data was drawn from Wikipedia articles so my guess would be no… hopefully @gilledtothegills can get this back online because I think it’s a really cool tool.
I’m going to add on that I have tried reaching out to @gilledtothegills via email, GitHub, and LinkedIn with no response so far. For the time being I think we have to assume that this may not be coming back online for the foreseeable future.
If there are others with ML or software development experience who are interested in working on a grammar checker project, let’s get in touch! I would love to work on something like this on the side.
Just tried it for the first time and it’s absolutely amazing! I know AI has limitations, but these kinds of fixes are exactly what I’ve been missing as a self-learner.
Very nice tool!
Thank you!
It will prove to be useful.
I have a question.
The tool suggests using ため rather than 為, or かかる rather than 掛かる. I know that Japanese people usually write these without kanji, but the use of those kanji is still pretty common.
Do you think you could improve the tool so that it marks those kinds of suggestions in a different colour? (I don’t think they are mistakes.)
Also, if you’re looking for improvements, it would be awesome if clicking on an underlined word gave us a link that jumps further down the page to an explanation of that particular grammar point, with a few examples. I don’t know what AI tool you’re using, but I think an AI tool like ChatGPT might be able to provide such information.
I also wanted to implement something to check the ‘importance’ of a suggestion, but I didn’t manage to find a way that was reliable with my current setup. If I get some time I want to come back to this!
One thing to bear in mind is that some of these ‘style’ changes might be informed by the rest of the sentence. For example, if the rest of the sentence uses a formal literary style, the model will be more inclined to convert ため to 為. Of course it’s also possible the model is just going overboard in many cases, but sometimes these types of suggestions make me think about the consistency of the overall style of my sentence.
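As a rough illustration of the "mark style suggestions differently" idea, here is a minimal sketch of one possible heuristic. It is not how the site works, and it is not the reliable method mentioned above as still missing. It assumes each suggestion is available as an (original, suggested) pair of strings, and it uses a tiny hypothetical lookup table, `KANJI_TO_KANA`, seeded only with the two examples from this thread.

```python
# Hypothetical table of spelling variants: kanji form -> kana form.
# Only the two pairs mentioned in this thread are included; a real tool
# would need a far larger list or a dictionary lookup.
KANJI_TO_KANA = {
    "為": "ため",
    "掛かる": "かかる",
}


def classify_suggestion(original: str, suggested: str) -> str:
    """Return "style" for a pure kanji/kana spelling swap, else "mistake"."""
    for kanji, kana in KANJI_TO_KANA.items():
        # The swap can go in either direction (kanji -> kana or kana -> kanji).
        if {original, suggested} == {kanji, kana}:
            return "style"
    return "mistake"


if __name__ == "__main__":
    print(classify_suggestion("為", "ため"))
    print(classify_suggestion("は", "が"))
```

The front-end could then draw "style" underlines in a different colour. The obvious weakness is that this is an exact-match lookup: conjugated forms such as 掛かった would not be recognised, and it ignores the point above that the surrounding sentence can make a spelling change more or less appropriate.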
The suggestion regarding explanations is also great! ChatGPT would probably do a decent job there, yeah. The UI code is kind of outdated and it makes adding these types of features a bit difficult for now… I wanna migrate the front-end, so I’ll add that to my list of ideas for once the UI is upgraded.