How much vocab is required to be able to read things comfortably?

I mean, this is WaniKani, so the average user probably knows more kanji and fewer words than they should.


Vocab doesn’t matter if you don’t have the grammar to comprehend it. I know lots of words but fail to comprehend the sentence. I spend too much time on vocab, oof.


I hope I can start reading Yotsubato at level 28 and understand at least 30% of each chapter without having to check the dictionary all the time :sweat_smile:

Yes that is true and the examples are really cool to see. Although, perhaps from a purely semantic perspective, I have some issue with the concept nonetheless.

To me, reading “comprehension” is not simply recognizing the words in a sentence. It’s understanding the whole of what you’ve read. So when someone says to me that they can read something with 90% comprehension, my thought isn’t that they literally recognized and could assign some meaning to 90% of the words written in the text, but that they got 90% of the overall meaning of the written work they were reading. For example, they could follow the story and give you a decent summary of what happened in it, but maybe they missed some details and/or misunderstood some parts of it.

If they more literally couldn’t recognize 10% of the words they saw and these were such key words that they got little if anything out of the story/article/whatever, then that’s something far below 90% reading comprehension to me.

It’s a difference between word recognition and reading comprehension (which also entails grammar, idiomatic usage of words, etc.). But reading comprehension as I’m defining it is certainly harder to measure objectively.
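To make the gap between recognition and comprehension concrete, here is a rough Python sketch of what a "90% of words known" figure actually measures. The sentence, the word list, and the function name `token_coverage` are all made up for illustration, and the text is assumed to be already split into words (real Japanese would need a tokenizer first).

```python
def token_coverage(tokens, known_words):
    """Return the fraction of running tokens that appear in known_words."""
    if not tokens:
        return 0.0
    known = sum(1 for t in tokens if t in known_words)
    return known / len(tokens)


# Hypothetical, pre-tokenized sentence (invented for this example).
tokens = ["the", "detective", "found", "the", "poison",
          "in", "the", "tea", "cup", "quickly"]

# Hypothetical set of words the reader recognizes.
known_words = {"the", "detective", "found", "in", "tea", "cup", "quickly"}

coverage = token_coverage(tokens, known_words)
unknown = [t for t in tokens if t not in known_words]

print(f"coverage: {coverage:.0%}")
print("unknown words:", unknown)
```

In this toy case the reader recognizes 9 of the 10 words, yet the one unknown word is the one the sentence hinges on, so the coverage number says little about how much of the meaning got through.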

Again though, a semantic issue I suppose, but also part of the reason why the 5k words = 90% comprehension thing seemed so preposterous to me in the first place.


I don’t know about Japanese, but for Chinese, actually, I can see why 2500 hanzi was the number selected. I’m from a country that expects students to be bilingual, and with the current syllabus for the course I took a few years ago, students are expected to know about 2700-2800 hanzi actively (I think) by the end of the course. With that number + some reading of my own (which frankly probably taught me more new words than new hanzi), I’m able to understand most newspaper articles fairly easily unless they’re far beyond the domains I’m used to reading about in Mandarin. I’m also fairly comfortable with Classical Chinese despite never having had lessons in it. (Granted, I do struggle with the original classical versions of Chinese stories, but poems are generally more manageable, so one shouldn’t overestimate my ability either.) Also, while I’m not sure what level they were on, a Chinese teacher I met at the university level in France, whose classes were roughly pitched at a C1 level, was surprised to see that I almost never had trouble reading any of the hanzi in the articles she selected, which were mostly about topics like Chinese students abroad or technological developments in China like mobile payments. I’ve also translated an article about the real estate market from Chinese into English for my mother, only needing to look up technical terms, most of which were written with familiar hanzi. Like @seanblue said, vocabulary in these languages is still mostly a matter of knowing words, not just of knowing characters.

PS: If we’re talking about CEFR proficiency levels, I’ve heard people say that HSK 6 is really only equivalent to a B2 level. I’m not sure if that’s entirely true given how much I’m able to understand (I have the right hanzi numbers anyhow, even though I only have a HSK 5 cert, which I got when I was much less confident in Chinese), but I’d say it’s probably a pretty high B2 nonetheless, and should easily prime you for C1, so I think it’s more like a C1.1, whereas the JLPT N1 is much more of a range (B2-C1). My reasoning is test conditions: you can’t summarise a recording of a text of over a thousand characters into 400 characters from memory without decent written fluency and high comprehension. On the other hand, you can do without expressive fluency in the JLPT because it’s multiple choice.



I see that very many sites claim that HSK6 = 5000+ vocab = C2… I hope they really mean 5000 Hanzi? Because the numbers claimed in all these charts I’m looking at seem ridiculous


Yeah, I agree that it’s more like B2/N2. They’ve extended the levels to HSK 7-9 now. I always thought that knowing just 5000 words is very little for someone to be called an advanced learner.
Still, 2500 hanzi covers almost 96% of common text, which is honestly a very good number to learn vocabulary just through context… in other languages, but in Chinese, it looks like it doesn’t work the same way.
https://www.chinesethehardway.com/article/hsk-6-gets-you-halfway/
I never studied Chinese though so I dunno how critical it would be to not understand even just one single character.


In my experience, about 600 words is all you need.

Disclaimer

First you must commit to this being the only thing you’ll ever read:


That’s because that’s how HSK used to be marketed: HSK 1-6 = A1-C2. I don’t know if that’s still the case, and @alexsandred says new levels are going to be added (which I just verified, but it seems the new levels are being billed as tests for people who really need very advanced mastery of Chinese). I don’t know about C1 – like I said, I think HSK 6 gets you pretty close – but HSK 6 is definitely not C2 by the European definition (‘advanced technical language’).

I just checked another site though, and it seems like it really is ‘5000 words’ that they’re talking about. I’m not sure how that adds up – maybe the standard for fluency really is just ‘daily conversation’ – but to be fair, it’s very hard to define a single word in Chinese, especially when so many more advanced words are just combinations of common hanzi or other known words. I have no idea how many words I know in kanji anyhow, and if China’s official standard for literacy is really 2000 hanzi, then I guess the HSK test isn’t too far off, as shockingly small as 5000 words seems.

Well, all I can say is that I think this article did a poor job in one very crucial aspect: the example sentence provided was not highlighted as being one containing a hanzi that doesn’t come out on HSK 6. Also, I’m sorry, but I think that novels are not a great standard for fluency insofar as they use the most rare words, including words that most native speakers don’t hear very often. I have a C2 diploma in French; I score better on many French tests than most of the native speakers around me in my science-related university course; I still have words to look up when I read novels or philosophical writings, albeit only about 5 at most per page of A5 text. That doesn’t mean that you’ll be struggling with most everyday news, even if it’s about scientific discoveries or politics: I certainly don’t.

As for how crucial it might be… it depends. If you’re like me, you’ll be frustrated and look it up because you want to learn the precise nuance (and the reading) of the word, but if you’re like one of my friends, you’ll just guess the reading and the gist and move on provided context makes things clear. Can you do that? Yes, to an extent, though you might be wrong about the reading. (The prevalence of phono-semantic hanzi makes this less likely, however.) Meaning is much easier to guess, however, and this for two reasons:

  1. Most of modern Mandarin uses what we call 配詞 (pèicí) – character pairs. In other words, the meaning of most modern Mandarin words is determined by two hanzi, not just one, so you can guess about half of the meaning of each word just by knowing one hanzi. This helps in addition to context.
  2. You can guess character meanings from character components by using lateral thinking. This doesn’t always work, especially if large shifts from original meaning have occurred or obscure borrowings were used for hanzi components resulting in component deletions, but even up to… an intermediate level of Mandarin, I’d say that this is possible.

In any case, these Quora answers (those addressing the question at the top of the page, that is) all agree – about 3500-4000 hanzi are enough for reading almost anything:

I’m probably around 3000 (I don’t want to overestimate because I know I haven’t studied Chinese actively in a long time) and I run into… what, 3 new hanzi per paragraph at most? Here’s an example from the People’s Daily about Xi Jinping discussing China’s future plans:

The two characters in bolded italics are the only two I can’t read. Context, the hanzi around them, and the presence of the water radical and the horse radical suggest to me that they’re part of a four-character idiom that refers to sudden changes and developments. (Why the radicals? Because the sea is quick to change, and horses often represent speed. You need to know what radicals mean/are associated with in order to feel these things instinctively, and as I mentioned above, hanzi/kanji rely on a lot of lateral thinking.) I have about two candidate readings per hanzi immediately, not accounting for tones, which are hard to guess, but can usually be narrowed down based on which tones are more common for particular sounds. Upon checking, I find that I was right about 骇: it’s ‘hài’, whose tone I also guessed correctly. I was wrong about 涛, which is ‘tāo’ and not ‘chóu’ or ‘zhù’.

However, the real difficulty of this text is not new hanzi, but rather combinations that are relatively rare in everyday conversation, which can be understood with some thought, but are unfamiliar, such as 战略定力, which I think should translate literally as ‘battle strategy staying power/resolve’.

Point is though, novels are really not a good measure of everyday functional fluency, if you ask me. Newspapers and television programmes should be the measure, because they’re the biggest providers of advanced language in an everyday context, and I think that if you think about it in that sense, HSK 6 probably does bring you pretty close to handling a good amount of it, even if it doesn’t mean you’ll be able to understand everything made of hanzi you know, as I’ve just demonstrated.


I think the problem with this is that 90% comprehension means something completely different than knowing 90% of the words.


For sure, I completely agree that the two concepts are significantly different! Both interpretations seem applicable to “reading comprehension” though - semantics innit. However, the more abstract definition of “understanding the meaning of the text” mentioned by you and @RDavid311 is, of course, nearly impossible to measure objectively (as RDavid already mentioned).


Ty for the input. Really interesting read. I’m planning to start learning Mandarin later. Hope kanji will help with recognition. Afaik kanji share a similar meaning with hanzi (in 70%+ of cases).

I agree about the language in novels. I consider myself fluent in fr/en even though my vocab is approximately 12k words for each, and I almost never encounter any new words while watching movies, browsing sites and forums, playing games, etc. Books, on the other hand... uhhh, like a completely different language. It’s like a very narrow domain where you need to know tons of adjectives and descriptive words which are specifically only used in written language.

Although even in games I sometimes question the extent of my vocabulary. )


Somehow I got the exact same score as you. I don’t really understand quizzes like this. First of all, I have no idea how many I got right. I was only sure of my answer for literally three of the 25 questions. Sometimes I knew the word only by a superficial English translation, so I couldn’t connect it to any of the answers. And also, quizzes like this should really have an “I don’t know” button. What’s the point of randomly guessing and possibly inflating your score?

In any case, this score is nonsense, as there’s absolutely no way I know 21k words. And even if I did know that many words, there’s no way I demonstrated that in this quiz.


I think this vocabulary quiz just assumes that the person taking it is a native speaker. 20k is a lot. I don’t think it’s even possible to acquire that many without being exposed to the language 24/7 for many years + consuming very specific content.

Even native English speakers get like 25k words on average on a test like this http://testyourvocab.com/

I’m going to test it to see what you get for different amounts of correct answers, and to see if it matters which ones you get right.

First test… intentionally answering everything incorrectly netted me this.

image


You could do “reverse cheating” - if you don’t know, check it with Yomichan or another dictionary extension, and then deliberately pick the wrong answer :wink:

I did that for the second half of the quiz, yeah.

I tried it again, and this time I intentionally answered everything wrong except for the very hardest question.

It gave me the same result as when I answered with 0 correct. My hunch is that there are brackets like 0-5 right, 5-10 right, etc. and that’s why it’s easy to get the same score as someone else.
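If the quiz really does work in brackets, the scoring might look something like this sketch. Everything here is a guess: the bracket size, the word counts, and the name `bracket_estimate` are hypothetical and not taken from the actual quiz. I'm also using non-overlapping brackets (0-4, 5-9, ...) since a score of 5 has to land in exactly one of them.

```python
def bracket_estimate(num_correct, bracket_size=5,
                     words_per_bracket=4000, floor=1000):
    """Map a raw quiz score to a vocabulary estimate in coarse steps.

    All default numbers are invented placeholders, not the real quiz's values.
    """
    if num_correct < 0:
        raise ValueError("num_correct must be non-negative")
    # Integer division groups scores: 0-4 -> bracket 0, 5-9 -> bracket 1, ...
    bracket = num_correct // bracket_size
    return floor + bracket * words_per_bracket


# Under this scheme, 0 correct and 1 correct produce the same estimate.
for score in (0, 1, 4, 5):
    print(score, bracket_estimate(score))
```

A scheme like this would explain both observations: one extra correct answer doesn't move the result, and different people easily end up with identical scores.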

But I’ll try more combos later.


Yes, the same score occurrences do seem like there are brackets like you’re saying, as I got the same score as you and @ABCDEFGHIJ. Out of curiosity, if you took the other test linked here, what did you get on that one? :grin:
