I noticed this quite a while ago, but it’s nagging me a bit. We’ll be introduced to kanji and vocab that have exactly the same meanings and pronunciations, for instance 糸 as thread and pronounced いと as both kanji/vocabulary…
So my concern is, if a lot of the kanji are duplicated as vocabulary (a lot of these seem to come up!), are we really getting 2k kanji and 6k vocab? Are they counting these in both columns or just one? It seems like if they’re counting both, the numbers are a bit inflated for what we learn here.
The thing is, 糸 the kanji and 糸 the vocab are different things. Sure, they’re obviously related, but when you see a kanji in a compound, different concepts and readings should pop into your head first than when you see the same kanji on its own as a word in a sentence.
And there needs to be some information in there anyway to say “this can be a word on its own” since that’s not true of every kanji.
So it is worth learning them both separately.
Either way though, I wouldn’t get too hung up on exact numbers! Suffice to say, WaniKani has a lot of kanji and vocab in it! I don’t really see the value in quibbling over the exact count. 6k vs. 5k and change vocab are gonna feel the same when you’re in it, you know?
I’m not too worried about it, it just seems a little inefficient and I wanted to know what people thought. I already paid for a year, so I’m here no matter what. I’d be just a tad happier if they simply marked ones like this, like 皿 and 糸 and 赤… as Kanji/Vocab, since I learned them already. It’s no big deal though.
It might be a little inefficient for ones like 糸, where the kunyomi / standalone word reading is probably the most common in general (not to mention learning it as a radical too…), but I think the consistency of the system, which hammers home the vocab/kanji difference, definitely makes it worth it!
And hey, worst case it’s a little extra reinforcement.
Right, so the main question, then, is what is the count when you exclude the single kanji vocab words that use the pronunciation you are taught from the kanji card?
I haven’t really seen too many of these, so I’d be willing to bet the numbers aren’t too far inflated. Even assuming an average of 5 per level, there would still be over 6000 vocab words. Nonetheless, even if the average is 10 per level, over 5000 (around 5700) vocab is a decent starting point.
All that said, I don’t care too much, myself. I use the vocab in WK as reinforcement for kanji readings only. My real vocab studies end up in Anki.
Only 404 single kanji vocabulary items share their reading with one of the accepted kanji readings, so that’s about the number of items that overlap. This doesn’t mean that they’re identical though; the kanji meaning might still be different in some cases. So worst case that still leaves you at 5954 vocabulary items, which would be fair to round up to 6000. I’m assuming that if you add back the ones where the meaning is significantly different from the one taught for the kanji during the lesson, you’ll probably end up over 6000.
How did you figure this out? Did you check for a common primary reading that counts as an accepted answer? Or did you just compare readings without checking for primary and accepted_answer attributes?
I only checked for common accepted answers, so it’d be readings that are valid for both vocabulary and kanji reviews. A significant part of what you’re left with are just counters and suffixes, which tend to differ only in meaning from the actual kanji itself.
If you want to I could rerun the script with only the primary readings available, I’d just have to change one condition.
I used the length of the vocab’s component_subject_ids field to see how many unique kanji they contain. This would cause a problem if some trailing hiragana caused it to somehow overlap the readings, but apparently there are no items for which this leads to a conflict, so it’s reliable enough. If I filter out any elements with a ~ in them, then you’re left with just 353 items.
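For anyone who wants to try this themselves, here is a rough Python sketch of that kind of check. It is not the actual script used above, just one way the described conditions could be written. The field names component_subject_ids, characters, primary and accepted_answer come from this thread; the rest of the record shape (id, object, data, readings, reading) is my assumption about how the subject data is laid out. The names overlapping_vocab, SAMPLE, primary_only and exclude_tilde are made up for illustration, and the sample records (including their primary flags) are hand-written toy data, not real WK data.

```python
# Hypothetical sketch: find single-kanji vocab that shares an accepted
# reading with its component kanji. Record shape is an assumption.

TILDES = ("~", "〜", "～")  # assumption: which characters mark counters/suffixes


def accepted_readings(subject, primary_only=False):
    """Collect the readings that count as a correct answer for a subject."""
    return {
        r["reading"]
        for r in subject["data"]["readings"]
        if r["accepted_answer"] and (r["primary"] or not primary_only)
    }


def overlapping_vocab(subjects, primary_only=False, exclude_tilde=False):
    """Return the characters of vocab items that overlap with their kanji."""
    by_id = {s["id"]: s for s in subjects}
    hits = []
    for s in subjects:
        if s["object"] != "vocabulary":
            continue
        components = s["data"]["component_subject_ids"]
        if len(components) != 1:  # more than one kanji: not a candidate
            continue
        chars = s["data"]["characters"]
        if exclude_tilde and any(t in chars for t in TILDES):
            continue
        kanji = by_id.get(components[0])
        if kanji is None:
            continue
        # Overlap means at least one reading is valid for both reviews.
        if accepted_readings(s, primary_only) & accepted_readings(kanji, primary_only):
            hits.append(chars)
    return hits


def _reading(text, primary=True, accepted=True):
    return {"reading": text, "primary": primary, "accepted_answer": accepted}


def _subject(id_, kind, chars, readings, components=()):
    data = {"characters": chars, "readings": readings}
    if kind == "vocabulary":
        data["component_subject_ids"] = list(components)
    return {"id": id_, "object": kind, "data": data}


# Toy records; the flags are invented purely to exercise each branch.
SAMPLE = [
    _subject(1, "kanji", "糸", [_reading("いと")]),
    _subject(2, "kanji", "付", [_reading("ふ"), _reading("つ", False, False)]),
    _subject(3, "kanji", "円", [_reading("えん")]),
    _subject(4, "kanji", "皿", [_reading("べい"), _reading("さら", primary=False)]),
    _subject(11, "vocabulary", "糸", [_reading("いと")], [1]),
    _subject(12, "vocabulary", "付く", [_reading("つく")], [2]),
    _subject(13, "vocabulary", "〜円", [_reading("えん")], [3]),
    _subject(14, "vocabulary", "皿", [_reading("さら")], [4]),
]

print(overlapping_vocab(SAMPLE))
print(overlapping_vocab(SAMPLE, exclude_tilde=True))
print(overlapping_vocab(SAMPLE, primary_only=True, exclude_tilde=True))
```

Switching between "any accepted reading" and "primary readings only" is the single condition inside accepted_readings, which matches the "I’d just have to change one condition" remark. 付く drops out on its own because the trailing kana makes its reading differ from every kanji reading.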
I agree. The kana would add some moras to the reading of the kanji and that would exclude the vocab on the reading test. A way to make sure is to test on item.data.characters for a length of 1 but that seems superfluous to me.
Then there would be 6005 vocabs distinct from the kanji.
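To double-check how the numbers in this thread fit together, here is a small back-of-the-envelope calculation in Python. The 404, 5954 and 353 figures are taken from the posts above; the total is only implied by them (5954 + 404), and the 60 levels is my assumption about the level count. The variable and function names are just for illustration.

```python
# Figures quoted in the thread.
OVERLAP_ALL = 404            # single-kanji vocab sharing an accepted reading
REMAINING_WORST_CASE = 5954  # vocab left after removing all 404
OVERLAP_NO_TILDE = 353       # overlap after dropping items containing "~"

LEVELS = 60  # assumption: number of levels

# Total vocab implied by the two figures above.
total_vocab = REMAINING_WORST_CASE + OVERLAP_ALL

# Vocab distinct from kanji if only the 353 non-"~" items are duplicates.
distinct_vocab = total_vocab - OVERLAP_NO_TILDE


def remaining_if(per_level):
    """Vocab left if `per_level` duplicates per level were removed."""
    return total_vocab - per_level * LEVELS


print(total_vocab)       # 6358
print(distinct_vocab)    # 6005
print(remaining_if(5))   # 6058
print(remaining_if(10))  # 5758
```

So the earlier guesses of "over 6000" at 5 per level and "around 5700" at 10 per level both line up with the counted totals.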
You guys really know your stuff. I had to stop and catch up, since the Crabigator went into maintenance mode when I was halfway through my reviews (I forgot sniff). I’ma mark one of these as the solution, but really there are many great answers. Thank you!
I just want to bring semantics into this discussion: you are learning 2000+ kanji and 6000+ vocabulary. The fact that some items end up on both lists doesn’t mean they shouldn’t be counted on both lists.
In my opinion the only actual duplicates in WK are those vocab that are identical, except one is the noun and the other has する behind it. You already get the ‘this is a suru verb’ info with the parts of speech for the noun only card. Those always feel like unnecessary padding to me, since you aren’t learning any new readings or combinations at all. Luckily there aren’t many of those either.
Oh and 誕生日おめでとうございます。Making you type a bunch of kana, only to teach you to read the first three kanji, that you already learn separately as well. Just put it in an example sentence, if you want to teach it so bad.
Yeah, this was the point I was trying to make above. 糸 being a kanji is undisputed, I’m sure, but the fact that いと is a reading for it does not guarantee that いと is also a word on its own. You have to learn that fact separately.
For instance the kunyomi of 付 is つ, but つ is not a word here (not in the same sense as いと anyway). The vocabulary you learn later on is 付く (つく).
For all one knows at the time of learning a kanji and its readings, いと could be similar. You just don’t know yet necessarily.
So no matter how you slice it, I don’t see how it can be seen as WK inflating numbers or something.