Houhou 1.2 - Dictionary and SRS application for Windows

jutendoji said... Thank you so much! I found the problem with the eras: there was a space before the word I entered (copy-paste :-p).
Ah, something I should fix.
Thanks!

I kinda love this already, planning to replace anki decks with this :stuck_out_tongue:

Thanks

Planning to add a couple of things to my SRS, such as:
Katakana (I keep forgetting a couple of characters, or more so mixing them up…)
Core ?K deck, most likely 10k like I have on Anki atm… if I can find a clean copy that would be great, or I'm going the long route…
Manga/LN words, I was planning on just using a text file to write out, translate and identify words for Anki… but your program suits me much better than Anki

Is there a way to increase the font size? If not that would be a great feature for an upcoming version :wink:

Anyway, loving the application! Has pretty much replaced Anki for me, because adding items is so much more convenient. I also use it as a dictionary a lot!

DeepChord said... Is there a way to increase the font size? If not that would be a great feature for an upcoming version ;) […]
Thanks for your feedback!
There is currently no setting for the font size and I don't know of any way to change the font size on specific applications.
I'll consider adding a setting in future versions though.

Deathnetworks of Sect Turtles says...

I kinda love this already, planning to replace anki decks with this :P […]
It's a lot better than Anki, really. For the katakana, honestly the best is practicing reading words. Breadstickninja made two amazing decks if you're interested: one adding extra vocabulary according to WK levels, and another one with 6000 kana-only words. Just let me find them so I can paste the links.


Here you go:
/t/Vocab-WaniKani-Expansion-Pack-v10-1132016-Now-on-Memrise/6958/1
/t/Why-doesnt-WaniKani-teach-non-kanji-vocabulary/7508/1  (in the first reply)



jutendoji said...
Deathnetworks of Sect Turtles says... I kinda love this already, planning to replace anki decks with this :P […]
It's a lot better than Anki, really. For the katakana, honestly the best is practicing reading words. […]


Aye, already grabbed all of those... and a core 10k deck, lots of duplication though... so I'm only on 11757 items in Houhou, but I was thinking of the characters that don't come up very often and the V-sound ones like ヴェ、ヴァ、ヴィ etc...

Cleaning up the core10k CSV file has made my eyes bleed... well, not literally, but they bloody hurt.

@Doublevil
Howdy! Way back when, we had a wee chat, and I learned that your 本 commonality marker (so useful!) was from your own study of books and not any other.
/t/VDRJ-Japanese-Vocabulary-Frequency-Lists/8487/1
I’m looking through the massive TXT of the frequencies that you shared (never realised you had shared it until I went looking for said file for a project of my own - thank you!). 'Tis wonderful!
One thing though: what lexical tool did you use? Especially, what dictionary was it parsed/de-conjugated with? I was guessing MeCab, as that's most common, but I've recently started using UniDic, so I see that there are other viable options. JWParser, even.
I’m looking to parse more text and match it to these frequencies to judge the most useful (common) words, so I’d like to copy the type of parsing for compatibility’s sake.

EDIT: Actually, looking at the format of the entries, I think I would have to ask you for a new version to use it.
The setup is [Number]|[Item]|[Reading], with no line breaks (at least as it renders for me), so there's no spacing between the entries, i.e. [N]|[I]|[R][N]|[I]|[R][N]|[I]|[R]…
Would you be able to change the format to CSV/TSV, or some other line-broken (or easily line-broken) format? Would really appreciate it.

ocac said... What lexical tool did you use? Especially, what dictionary was it parsed/de-conjugated via? […] Would you be able to change the format to C/TSV, or some other line-broken format? […]
Hey ocac!
The tool I used is ChaSen.

I'm not only sharing the file on WaniKani, I'm also sharing it with the rest of the world as it is included in Houhou's source code on GitHub.
There are Windows-format line breaks (\r\n). The format is:
<number of occurrences>|<kanji reading>|<kana reading>
You should be able to use it as a pipe-separated CSV file already. If somehow you can't, I'm willing to transform it for you, it's easy enough. The upload would be by far the longest part.

If you just want to read it, you should be able to turn windows-style line breaks into your system's line breaks with any modern text editor. I'm using Notepad++.
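For anyone who wants to load the file programmatically rather than in a text editor, here's a minimal sketch of reading the format described above with Python's standard csv module. The function name and file path are made up for illustration; the only assumptions taken from the thread are the pipe delimiter and the Windows (\r\n) line endings.

```python
import csv

def load_frequencies(path):
    # Parse lines of the form <occurrences>|<kanji reading>|<kana reading>.
    # Opening with newline="" lets the csv module handle \r\n endings itself.
    entries = []
    with open(path, encoding="utf-8", newline="") as f:
        for row in csv.reader(f, delimiter="|"):
            if len(row) == 3:  # skip any blank or malformed lines
                count, kanji, kana = row
                entries.append((int(count), kanji, kana))
    return entries
```

With a file containing `120|勉強|べんきょう` per line, this yields tuples like `(120, "勉強", "べんきょう")`, ready for sorting or matching against other word lists.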

Did the audio url (http://assets.languagepod101.com/dictionary/japanese/audiomp3.php?kanji=%kanji%&kana=%kana%) change?

Deathnetworks said... Did the audio url (http://assets.languagepod101.com/dictionary/japanese/audiomp3.php?kanji=%kanji%&kana=%kana%) change?
Apparently not, I can still get the audio clips in my browser. Can someone else confirm that audio still works in Houhou when configured with this URL? (I'm at work at the moment but will check ASAP.)
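
If you want to check the URL by hand outside Houhou, here's a small sketch of filling in the %kanji%/%kana% placeholders from the template quoted above. The function name is hypothetical; only the template string itself comes from the thread.

```python
from urllib.parse import quote

# Template as posted in the thread, with %kanji%/%kana% placeholders.
TEMPLATE = ("http://assets.languagepod101.com/dictionary/japanese/"
            "audiomp3.php?kanji=%kanji%&kana=%kana%")

def build_audio_url(template, kanji, kana):
    # Percent-encode the Japanese text so it is safe in a query string.
    return (template
            .replace("%kanji%", quote(kanji))
            .replace("%kana%", quote(kana)))
```

For example, `build_audio_url(TEMPLATE, "本", "ほん")` produces a URL you can paste into a browser to confirm the audio endpoint still responds.
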
Doublevil said... Can someone else confirm that audio still works on Houhou when configured with this URL? […]
 Works fine for me on Windows 10 and that URL.

I’ll turn it off and back on again :stuck_out_tongue:

That did the job, note to self… actually close down Houhou now and again.

This looks fantastic, I can't wait to try it.

Doublevil said...
Hey ocac! The tool I used is ChaSen. […] You should be able to use it as a pipe-separated CSV file already. […]
You are great. If what I'm working on looks to be useful to others similarly, I'll try to get it to them, yourself included. Certainly, it has benefited greatly from the tools and work of others.
I uninstalled Notepad++ a long time ago for some bizarre, vengeful reason. Yeah, it totally managed the file size (which Notepad barely could) and the line spacing (which Notepad couldn't at all). Thank you for prompting me to patch up my relationship with this useful tool. ;-)
I realise I confused parser tool with dictionary - MeCab being the former, UniDic the latter. I have yet to investigate ChaSen as a tool (another one @_@), but simply knowing of it is great. It seems it can be used with a variety of dictionaries. The lesser Windows version (the only one available to me in my present setup) defaults to using "ipadic", which I think is the same one used in cb's JTAT. However, ChaSen seems to work with UniDic and JAIST dictionaries as well?
Do you know what dictionary you used for your frequency study? Even if I'm using a different tool/version, matching the dictionary is probably the most important thing.

PS: I'm now thinking that ChaSen could be great, as cb's JTAT, the tool I'm presently parsing with, lacks some control. If I need help with ChaSen, might I get in touch?
ocac said... Do you know what dictionary you used for your frequency study? Even if I'm using a different tool/version, matching the dictionary is probably the most important thing. […] If I need help with ChaSen, might I get in touch?
Ah, good question. I can't remember the dictionary I used. I can tell you that I ran ChaSen in a Linux virtual machine, with some parameters to adjust the output to what I wanted, but that's about all. I'll be able to investigate some more this weekend.
If you want to get in touch, send me a mail at hello@houhou-srs.com. And if you want to chat on Skype, LINE, etc., tell me your IDs in the mail. :)
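
Whatever parser and dictionary end up being used, the aggregation step is parser-independent: count the (surface, reading) pairs the tool emits and write them out in the pipe-delimited format discussed earlier. A hypothetical sketch (the function name and token shape are mine, not from the thread):

```python
from collections import Counter

def build_frequency_lines(tokens):
    # tokens: an iterable of (kanji_form, kana_reading) pairs as produced
    # by any morphological parser (ChaSen, MeCab, ...). Returns lines in
    # the <occurrences>|<kanji reading>|<kana reading> format, most
    # frequent words first.
    counts = Counter(tokens)
    return ["%d|%s|%s" % (n, kanji, kana)
            for (kanji, kana), n in counts.most_common()]
```

For example, a token stream with 本/ほん appearing twice and 勉強/べんきょう once produces the lines `2|本|ほん` and `1|勉強|べんきょう`.
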
Doublevil said... If you want to get in touch, send me a mail at hello@houhou-srs.com. […]
 Sending!

Can't download, it says that the file is corrupted :frowning:

juanpra said... Can't download, it says that the file is corrupted :(
 Try again. I just installed 1.2 on two computers yesterday and the download worked fine on both of them.