Resource for finding the most common vocab reading?

Tried to search for this topic, but ironically there are too many ways to phrase this question for me to find out if someone’s asked it already (which they assuredly have).

So anyways, I just hit 根本 and it has two readings (こんぼん and ねもと).

Naturally, it gets me thinking: “Which of these is going to be more common in everyday speech?”

My biggest research resource is usually just Jisho, but for compound words it doesn’t really help that much with different readings.

Is there a good resource for finding this out or should I just pick a reading and apply it whenever I encounter the compound?

Example sentences in jisho or weblio might help: 「根本」の英語・英語例文・英語表現 - Weblio和英辞書

Here it seems to me that ねもと is primarily for literal tree roots, while こんぽん is used for that and for more metaphorical senses.

Ahhh, I didn’t even think about looking at the example sentences!

The context definitely helps. I do wonder if, say, a novel would have the furigana just for clarity, though.

I’ll def have to start using Weblio more often though. Just need to get over my aversion to Japanese web pages.

jpdb is also really nice for this (it’s even better for finding out which kanji representation is most often used for a word).

In the top right you can see that こんぽん (not こんぼん) is in the top 6700 most frequently used words, while ねもと is only in the top 24700.
It also shows that ねもと is written 根元 in 87% of cases, and 根本 in only 11% of cases.

Which lets you guess that when you encounter 根本 written like this, it’s most likely read こんぽん.

I don’t know how their scraping algorithm works and how they know which reading is intended in written text, so I can say nothing about that, but I trust them :smiley:

YES YES YES

This is exactly what I wanted!

Now I can become even more granular in my study sessions!

I would be super cautious about assuming JPDB is right about that kind of question. Where the input text has no furigana, the parser is just going to be guessing at the reading, so a big difference in frequency like that could easily just be telling you “the parser guesses this way every time”, and that guess isn’t necessarily right.

It’s more trustworthy on “which kanji is more common”, but even there you need to watch out for cases where it has assumed that one kanji representation has a different reading and assigned all those appearances to a different word.
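
To make the "parser guesses this way every time" point concrete, here is a toy sketch. The corpus, the `DEFAULT_READING` table, and `guess_reading` are all made up for illustration; I don't know how JPDB's parser actually works, so treat this as an assumption about one way the bias could arise, not a description of their code.

```python
from collections import Counter

# Hypothetical toy corpus: (written form, reading the author intended).
# These entries are invented for illustration, not real frequency data.
corpus = [
    ("根本", "こんぽん"),
    ("根本", "ねもと"),
    ("根本", "こんぽん"),
    ("根元", "ねもと"),
    ("根本", "ねもと"),
]

# Assumption: with no furigana, the parser always picks one default
# reading per spelling.
DEFAULT_READING = {"根本": "こんぽん", "根元": "ねもと"}


def guess_reading(written: str) -> str:
    """Return the parser's fixed guess for a written form."""
    return DEFAULT_READING[written]


# What the authors actually meant.
true_counts = Counter(corpus)

# What a frequency list built from the parser's guesses would report.
parsed_counts = Counter((written, guess_reading(written)) for written, _ in corpus)

print("intended:", dict(true_counts))
print("parsed:  ", dict(parsed_counts))
```

In this made-up corpus, 根本 is split evenly between the two readings, but the parsed counts credit every 根本 to こんぽん, so the stats would look far more lopsided than the text really is.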

Agreed. Some of that frequency might also come from こんぽん appearing inside longer words, like 根本的. I’m not sure if this is true, but I could imagine an instance of 根本的 counting towards the frequency of こんぽん.
You should probably be reasonably cautious about the results, but it’s a starting point :slight_smile:
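
A rough sketch of what I mean, using an invented sample sentence and plain string matching (a real tokenizer would behave differently, and I'm only guessing that something like this could happen in a frequency list):

```python
import re

# Invented sample text for illustration only.
text = "根本的な問題だ。木の根本を見る。根本的に違う。根本から考え直す。"

# Naive count: every occurrence of 根本, including those inside 根本的.
naive = text.count("根本")

# Stricter count: 根本 that is not immediately followed by 的.
standalone = len(re.findall(r"根本(?!的)", text))

print(naive, standalone)
```

Here the naive count is double the standalone one, so whether compounds are folded in can change the numbers quite a bit.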
