Deathnetworks said... Did the audio url (http://assets.languagepod101.com/dictionary/japanese/audiomp3.php?kanji=%kanji%&kana=%kana%) change?
Apparently not, I can still get the audio clips in my browser. Can someone else confirm that audio still works in Houhou when configured with this URL? (I'm at work at the moment but will check ASAP.)
Doublevil said... Can someone else confirm that audio still works in Houhou when configured with this URL?
Works fine for me on Windows 10 and that URL.
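For anyone else who wants to sanity-check that audio source before restarting anything, here's a quick sketch in plain Python (nothing Houhou-specific; 食べる/たべる is just an arbitrary test word, and the manual placeholder substitution below only mimics what Houhou presumably does with %kanji% and %kana%):

```python
# Quick check that the LanguagePod101 audio URL template still resolves.
import urllib.parse
import urllib.request

TEMPLATE = ("http://assets.languagepod101.com/dictionary/japanese/audiomp3.php"
            "?kanji=%kanji%&kana=%kana%")

def audio_url(kanji: str, kana: str) -> str:
    # Substitute the %kanji% / %kana% placeholders by hand, URL-encoding the Japanese.
    return (TEMPLATE
            .replace("%kanji%", urllib.parse.quote(kanji))
            .replace("%kana%", urllib.parse.quote(kana)))

url = audio_url("食べる", "たべる")  # arbitrary test word
with urllib.request.urlopen(url) as resp:
    print(url)
    print("HTTP", resp.status, "-", len(resp.read()), "bytes received")
```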
I'll turn it off and back on again.
That did the job. Note to self… actually close down Houhou now and again.
This looks fantastic, I can't wait to try it.
ocac said... @Doublevil
Howdy! Way back when, we had a wee chat, and I learned that your 本 commonality marker (so useful!) came from your own study of books and not from any other source.
https://www.wanikani.com/chat/kanji-and-japanese/8689
I'm looking through the massive TXT of the frequencies that you shared (never realised you had shared it until I went looking for said file for a project of my own - thank you!). 'Tis wonderful!
One thing, though: what lexical tool did you use? In particular, what dictionary was it parsed/de-conjugated with? I was guessing MeCab, as that's the most common, but I've recently started using UniDic, so I see there are other viable options. JWParser, even.
I'm looking to parse more text and match it to these frequencies to judge the most useful (common) words, so I'd like to copy the type of parsing for compatibility's sake.
EDIT: Actually, looking at the format of the entries, I think I would have to ask you for a new version to use it.
The setup is [Number]|[Item]|[Reading], with no line breaks (at least as it renders for me), so there's no spacing between the entries, i.e. [N]|[I]|[R][N]|[I]|[R][N]|[I]|[R]...
Would you be able to change the format to CSV/TSV, or some other line-broken (or easily line-broken) format? I would really appreciate it.
Doublevil said...
Hey ocac! The tool I used is ChaSen.
I'm not only sharing the file on WaniKani, I'm also sharing it with the rest of the world as it is included in Houhou's source code on GitHub.
There are Windows-format line breaks (\r\n). The format is:
<number of occurrences>|<kanji reading>|<kana reading>
You should be able to use it as a |-separated CSV file already. If somehow you can't, I'm willing to transform it for you; it's easy enough. The upload would be by far the longest part.
If you just want to read it, you should be able to turn windows-style line breaks into your system's line breaks with any modern text editor. I'm using Notepad++.
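In case it helps anyone else who grabs the frequency file, here's a minimal sketch of reading it exactly as described above. The filename is an assumption on my part (use whatever the file is called in the Houhou repository or wherever you saved it):

```python
# Parse the pipe-separated frequency list:
#   <number of occurrences>|<kanji reading>|<kana reading>
# one entry per Windows-style (\r\n) line.
import csv

entries = []
with open("word_frequencies.txt", encoding="utf-8", newline="") as f:
    # newline="" lets the csv module handle the \r\n line endings itself.
    for row in csv.reader(f, delimiter="|"):
        if len(row) != 3:
            continue  # skip any malformed or empty lines
        count, kanji, kana = row
        entries.append((int(count), kanji, kana))

# Example use: print the ten most frequent words.
entries.sort(reverse=True)
for count, kanji, kana in entries[:10]:
    print(count, kanji, kana)
```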
ocac said...
You are great. If what I'm working on looks to be useful to others in a similar way, I'll try to get it to them, yourself included. It has certainly benefited greatly from the tools and work of others.
I uninstalled Notepad++ a long time ago for some bizarre, vengeful reason. Yeah, it handled the file size (which Notepad barely could) and the line breaks (which Notepad couldn't handle at all). Thank you for prompting me to patch up my relationship with this useful tool. ;-)
I realise I confused the parser tool with the dictionary - MeCab being the former, UniDic the latter. I have yet to investigate ChaSen as a tool (another one @_@), but simply knowing of it is great. It seems it can be used with a variety of dictionaries. The lesser Windows version (the only one available to me in my present setup) defaults to "ipadic", which I think is the same one used in cb's JTAT. However, ChaSen seems to work with UniDic and JAIST dictionaries as well?
Do you know what dictionary you used for your frequency study? Even if I'm using a different tool/version, matching the dictionary is probably the most important thing.
PS: I'm now thinking that ChaSen could be great, since cb's JTAT, the tool I'm presently parsing with, lacks some control. If I need help with ChaSen, might I get in touch?
Doublevil said...
Ah, good question. I can't remember the dictionary I used. I can tell you that I ran ChaSen in a Linux virtual machine, with some parameters to adjust the output to what I wanted, but that's about all. I'll be able to investigate some more this weekend.
If you want to get in touch, send me a mail at hello@houhou-srs.com. And if you want to chat on Skype, LINE, etc., include your IDs in the mail. :)
ocac said...
Sending!
Can't download, it says that the file is corrupted.
juanpra said... Can't download, it says that the file is corrupted :(
Try again. I just installed 1.2 on two computers yesterday and the download worked fine on both of them.
juanpra said... Can't download, it says that the file is corrupted :(
Try this link:
https://www.dropbox.com/s/sbc97yp2surfpkj/Houhou_Setup_1.2.exe?dl=0
Hmm… SRS Import from WK doesn't seem to capture vocab from levels 51-60. Kanji is okay, though.
rfindley said... Hmm... SRS Import from WK doesn't seem to capture vocab from levels 51-60. Kanji is okay, though.
Yeah, it was done before the addition of levels 51-60. The code needs an update only because WaniKani's API will kind of crash if you request all vocab at once, so you have to get it in multiple parts by explicitly specifying the levels you want.
Ahh, thanks.
BTW... I discovered through experimentation that 5 levels at a time seems to be pretty optimal for fetching vocab. In the Ultimate Timeline script, I run 5-at-a-time queries in parallel, and it usually ends up getting everything in 5-10 seconds. Not sure why the backend seems to like that better.
Ah thanks, that's helpful. I don't know why it crashes either, but Tofugu didn't seem to care the last time I contacted them about the issue.
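For anyone who wants to script this themselves, here's a rough sketch of the batching idea discussed above. The endpoint shape (the old /api/user/<key>/vocabulary/<levels> form with a comma-separated level list) and the response layout are assumptions on my part, so check the current WaniKani API documentation before relying on them:

```python
# Fetch WaniKani vocabulary a few levels at a time instead of all at once,
# since requesting everything in a single call tends to fail (see above).
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

API_KEY = "your-api-key-here"  # placeholder
BASE = "https://www.wanikani.com/api/user/{key}/vocabulary/{levels}"  # assumed endpoint shape
BATCH_SIZE = 5  # 5 levels per request, per rfindley's observation

def fetch_levels(levels):
    url = BASE.format(key=API_KEY, levels=",".join(str(n) for n in levels))
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)

# Levels 1-60 split into batches of 5.
batches = [list(range(start, min(start + BATCH_SIZE, 61)))
           for start in range(1, 61, BATCH_SIZE)]

# Run the batches in parallel, like the Ultimate Timeline script does.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fetch_levels, batches))

vocab = []
for payload in results:
    # "requested_information" is an assumed key; adapt to the real response.
    vocab.extend(payload.get("requested_information", []))
print(len(vocab), "vocab items fetched")
```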
Is there any way to reset Houhou back to zero?
I imported my WaniKani data but I accidentally duplicated everything, and I don't wanna be erasing words one by one.
Never mind, I fixed it.
Thank you so much for making this great application. I would like to second the suggestion to add a lessons queue. What I currently do is write down all the unknown vocab I encounter while reading, then later add it to Houhou all at once. That means something like 50-100 new words at a time, so the next review session ends up being a ton of stuff I don't know, which is really intimidating the first time around. My current workaround is to suspend all the newly added items and only unsuspend about 10 at a time, but I feel like there could be a better solution. Maybe I'm just doing something wrong, idk.
Also, and this is maybe crazy, it would be great if there were a checkbox when you add a vocab item that says "I don't know this/these kanji". If you check that checkbox, Houhou would add all the kanji used in that vocab word to your review queue. Then, once those kanji get past a certain SRS level, the vocab word gets released into your review queue (unsuspended), kind of like WaniKani.
Alternatively, there could be a setting where you mark which kanji you already know, and whenever you add a vocab word containing a kanji you don't know, it automatically does the above.
Lastly, there seems to be a bug with my audio. It randomly says the wrong words when I do reviews. It will be words that I am reviewing, just not the one I just typed in. I turned it off because it was bugging me.
Thanks again, I love this so much more than Anki…
Edit - also, is there a way to view only suspended items in the SRS items list?
Reading the WaniKani radicals in Houhou: take, for example, "animal" and "snake" for the kanji 犯. Firstly, "cobra" seems to have been renamed to "snake" as the primary name on WaniKani, so it does not appear in the radical search unless you search for the old name. Secondly, when selecting these two radicals, which WK deems a match for this kanji, the kanji will not appear. It seems that the kanji themselves continue to use the Kangxi(?)/Jim Breen breakdown rather than the WK one. I understand this would be a time-consuming thing to implement, but it would be worthwhile, I think. It would also be nice if (besides editing the XML file) you could rename radicals.