Hey everyone! I wanted to let you know about some updates I’ve rolled out to this project in the last couple of days.
Based on everyone’s helpful feedback, I’ve retrained the model on a much larger dataset, including data from Twitter and Japanese web novels. You may have noticed a fairly big accuracy boost lately, especially on colloquial text. Thanks to everyone who gave advice, and keep it coming!
Open-sourcing models and code
If you’re interested in this sort of thing or know how to code, you might want to check out the source code I used to create the data. You can also download the full model to use in whatever you like, at my GitHub page:
As always, let me know if you have suggestions or issues!