Most Incorrectly Answered Kanji/Vocabulary

Hi All,

First time poster here working in data analytics. This might be more of a question for the WaniKani dev team but I would be very interested to see which kanji and vocab terms have the highest percent of incorrect answers in the apprentice/guru phases. If they even collect data this granular, potentially there could be a community thread of mnemonics and tips for these most troublesome items.

Comment any thoughts or suggestions!

-SalamiBobby

5 Likes

They have the data, and it’s exposed at an individual level in the API.

https://docs.api.wanikani.com/20170710/#review-statistics

I agree it would be interesting to see the entire set of WK items graphed with this data.
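For anyone who wants to try this on their own account, here is a rough sketch of turning review statistics into a "percent incorrect" ranking. It works on hard-coded sample records rather than calling the API. The field names (`meaning_correct`, `meaning_incorrect`, `reading_correct`, `reading_incorrect`) are my reading of the linked docs, so treat them as an assumption, and the numbers are made up. The helper names are hypothetical too.

```python
# Sample records shaped like the `data` part of a review statistic.
# Field names are assumed from the linked docs; the values are invented.
sample_stats = [
    {"subject_id": 1, "meaning_correct": 8, "meaning_incorrect": 2,
     "reading_correct": 6, "reading_incorrect": 4},
    {"subject_id": 2, "meaning_correct": 10, "meaning_incorrect": 0,
     "reading_correct": 5, "reading_incorrect": 5},
    {"subject_id": 3, "meaning_correct": 0, "meaning_incorrect": 0,
     "reading_correct": 0, "reading_incorrect": 0},
    {"subject_id": 4, "meaning_correct": 3, "meaning_incorrect": 3,
     "reading_correct": 2, "reading_incorrect": 2},
]


def incorrect_percentage(stat):
    """Share of incorrect answers (meaning + reading), or None if never reviewed."""
    correct = stat["meaning_correct"] + stat["reading_correct"]
    incorrect = stat["meaning_incorrect"] + stat["reading_incorrect"]
    total = correct + incorrect
    if total == 0:
        return None
    return 100.0 * incorrect / total


def rank_by_incorrect(stats):
    """Return (subject_id, percent incorrect) pairs, worst first."""
    scored = [(s["subject_id"], incorrect_percentage(s)) for s in stats]
    scored = [(sid, pct) for sid, pct in scored if pct is not None]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


for subject_id, pct in rank_by_incorrect(sample_stats):
    print(subject_id, pct)
```

Since the API only exposes this per user, a site-wide ranking would still need the WK team (or a lot of volunteers pooling their numbers).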

5 Likes

I’d be curious to know too!

2 Likes

慉栄 - こうえい - honor
æ „ć…‰ - えいこう - glory

just kiddin’, I have no idea! :smile_cat:


Something might(?) be gleaned from taking a look at the "Most Incorrectly Answered Kanji/Vocabulary - #4 by ekg" thread?

Which makes me think there will not be much "accumulated" data, really. The WK team has a habit of making weekly updates, both for things that have been suggested and for improvements they've been working on for the long-term upkeep of the site. There are items being block-listed every week, for example, so yeah, that changes the statistics you can gather long-term on which items are failed and why (before or after the weekly changes). :eyes:

You'd likely have to put the question directly to the Tofugu team, and ask whether they have some other means of measuring the impact of their changes on the site.

2 Likes

I kept mixing up 蔷蚎猶äșˆ and ćŸ·èĄŒçŒ¶äșˆ

Most likely the statistics would show lower level items as the most incorrect just because fewer people get to the higher levels and not because they’re more difficult.

3 Likes

Yes, but since the WaniKani team moves items up and down levels, sometimes by quite a large number of levels, there's no real way of measuring this, is there? Only week by week, before a specific item gets a weekly update and moves between levels. :thinking:

So, data might be accumulated “per item”?? But, how to measure the whole site’s items? and various update impacts?

I'm sure there are ways to exclude interference in the data, perhaps, but I'm no statistical analysis person, so I wouldn't know the first thing about this.

I assume one would look at “percentage failed” and not at “absolute number failed”, no? :thinking:

Or are you saying that you think people generally fail more items in lower levels because they are new to kanji learning and stuff? I don’t expect this to be the case, but it would be an interesting result nonetheless!

Percentage failed favors items with fewer data points, so the information is less useful. If there’s an item that only 5 people have reviewed and they all got it wrong, it would be considered worse than an item that 1000 people reviewed and 900 got wrong. We have to determine what we consider to be most incorrect accounting for both the number of people who reviewed it as well as percentage of failure. Statistics is hard.
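One common way to weigh both things at once is to rank by the lower bound of a Wilson score interval on the failure rate instead of the raw percentage. This is just a technique I'm swapping in for illustration, not anything WaniKani is known to use, and the function name and the 95% level (`z=1.96`) are my own choices. A minimal sketch using the two items from the example above:

```python
import math


def wilson_lower_bound(failures, reviews, z=1.96):
    """Lower bound of the Wilson score interval for the failure rate.

    Small samples get pulled down, so 5 out of 5 no longer beats
    900 out of 1000 automatically.
    """
    if reviews == 0:
        return 0.0
    p = failures / reviews
    z2 = z * z
    denominator = 1 + z2 / reviews
    centre = p + z2 / (2 * reviews)
    margin = z * math.sqrt(p * (1 - p) / reviews + z2 / (4 * reviews * reviews))
    return (centre - margin) / denominator


small_sample = wilson_lower_bound(5, 5)       # 100% failed, only 5 reviews
large_sample = wilson_lower_bound(900, 1000)  # 90% failed, 1000 reviews
print(small_sample, large_sample)
```

With these inputs the 900-of-1000 item comes out with the higher bound (roughly 0.88 versus roughly 0.57), so it would rank as the more troublesome one.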

What I meant is that you’ll have a larger absolute number of incorrect items in the lower levels because there are more people there to be incorrect.

Oh you mean the information is less useful because the population is smaller? Interesting.

Well, to be honest I would also consider the item where 5 of 5 people got it wrong to be worse than the other. But that’s of course my interpretation, then. I see your point now, thanks for clarifying!

It’s just that with 5 people you don’t know if you just have the wrong 5 people. Maybe a different set of 5 would have gotten them all right. A larger data set reduces biases.

1 Like

If people could add their own cards, you’d have that tail-end effect where there are things only one or two people have ever reviewed and they kept missing it, so the percent failed is higher than any of the common items.

But with the curriculum in WK being a fixed corpus, and enough people having gone all the way through it to 60, there should be a large enough sample of all the items to do statistically significant analysis with. The worst case of what you are describing would be for words that were just added, and at a high enough level that not many people have seen them yet, but there are ways to deal with that if it pollutes the results (like "show me the top five items by fail percentage out of those that have been in the WK corpus over a year").
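That kind of query could look something like the sketch below. The record layout (`item`, `added_on`, `reviews`, `failed`) and the function name are hypothetical, and the data is invented; I have no idea how WK actually stores any of this.

```python
from datetime import date, timedelta

# Invented aggregate records; the field names are placeholders.
items = [
    {"item": "A", "added_on": date(2015, 1, 1), "reviews": 1000, "failed": 300},
    {"item": "B", "added_on": date(2016, 6, 1), "reviews": 800, "failed": 400},
    {"item": "C", "added_on": date(2023, 12, 1), "reviews": 4, "failed": 4},
]


def top_failed(records, today, min_age_days=365, limit=5):
    """Top items by fail percentage among those in the corpus at least min_age_days."""
    cutoff = today - timedelta(days=min_age_days)
    eligible = [
        r for r in records
        if r["added_on"] <= cutoff and r["reviews"] > 0
    ]
    ranked = sorted(eligible, key=lambda r: r["failed"] / r["reviews"], reverse=True)
    return ranked[:limit]


for record in top_failed(items, today=date(2024, 1, 1)):
    print(record["item"], record["failed"] / record["reviews"])
```

Item C fails 100% of the time but was only added a month before the cutoff date, so it gets filtered out before the ranking happens.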

2 Likes

This is a link back to the same post, though? That’s taking self-promotion to a whole new level. :stuck_out_tongue:

4 Likes

um
I really don't know

I’ll try to dig up some info if I can. All matters aside:
:confetti_ball: :crabigator:Welcome to the community! :crabigator: :confetti_ball:

Yes! And every time I come up with a mnemonic to try and distinguish them, it never works.

Is there a badge to be obtained here?

On the one hand, my percentage correct was much better in the single-digit levels, a bit worse in the teens, and then I died a few months ago on my way to Hell. On the other hand, I've gotten to the point where I can guess readings from radicals for those that aren't rendaku'd or aren't äșș. So it's a wash? Uh, no, it's still Death.

To which I'd assert that words containing äșș must win hands down in the lowest percentage correct category. I mean really, if the monks back in the day had an ISO standardization committee this all would be much easier.

1 Like
