How proficient in vocabulary will I be when I am done with Wanikani?

I think counting words is an okay-ish way to measure vocabulary acquisition, but it’s best to realise that it’s human lexicographers who decide what’s a word (and what isn’t). What counts as a word is not very clear-cut.

Say, 足 and 脚. Both are pronounced あし. Both generally mean ‘leg’ (though 足 tends to refer to the foot part). Or 木 and 樹, pronounced き, meaning tree/wood (樹 tends to refer to more impressive trees). Are the words 書く (to write) and 描く (to draw), both pronounced as かく, the same word, written differently depending on context, or are they 2 different words? 町 (town) and 街 (bigger town), both are まち.

… and the list goes on. I do not envy lexicographers who have to make all those decisions.

2 Likes

Heyyyy…if you have a moment and interest, could you please tell more about this? (Or point me in the right direction to begin a fruitful web search?) I have recently been infected with a suspicion that the idea of “word” as I had previously understood it as an English speaker maybe doesn’t really apply to Japanese…maybe not at all… that maybe the language is so granular that it’s basically about stringing together syllables that can be contextually understood as ideas but the idea of it being a bunch of “words” with nouns and verbs and adverbs and adjectives etc. might be less true than it is for, say, English or German … is this possible? (I mean, German likes to assemble words together into big long specific ideas that turn out to be like long nouns… and it’s looking like Japanese does that too only the pieces are soooo smallllll…am I holding a hammer and seeing everything as nails ?)

Or is Japanese just looking extra suspicious because it’s my first time seeing a language with only 50 possible syllables?

Agreed, the vocabulary section on the N4 was not hard at all with just Wanikani exposure. However, what resource would you recommend for doing the Core 10k or for studying vocabulary after N4? Also, how do you find time to balance all the SRS systems and actually using the language through reading? :slight_smile:

2 Likes

A little of column A, a little of column B. Japanese does have more granular units of meaning that may not necessarily be traditional “words” than English does. Technically most “conjugations” (as we might think of them as foreign learners) of verbs except negative ones are just different helper-verbs being attached to the stem (連用形 in Japanese, indicating that it is literally a form meant to be continued by other words, with 連 indicating “continuation/combination/attachment”). Then you have the 未然形 (みぜんけい) for the negative stem, so named because it’s the only true variant of the word with no discrete meaning of its own, to which negative verbs are attached. (Cluttering all these distinctions up further is thousands of years of evolution. Ex. “い adjectives” were verbs way back when/still kind of are. Fun! Is 高い distinct from 高う? I mean, probably not, but that kind of argument could maybe be made. This evolution is still clear in some dialects and even set phrases like ありがとうございます, so it’s just one of many lines where history makes what is a discrete unit of meaning and what isn’t a bit blurred.)

And then you have a lot of compound-kanji phrases and compound verbs that blur the line–not even natives will know whether they can expect all of these to be found under their own entries in native dictionaries, unless they’re dedicated linguists. As pointed out above, there are also all the alternate-kanji uses–some with discrete nuances, some without (in which case usually one will be considered outdated, but that doesn’t mean two or more won’t continue to be used, since no one can really police it). There are a bunch of different はかるs meaning “to measure” with different nuances and contexts. Different words? Not?

That said–the ideas of 語彙, and 単語数, and 語彙力 all exist natively in Japanese and as Japanese topics. I guess I’m not sure how far back they run, and to what extent they may be efforts to facilitate translation and global interlingual interaction, but at any rate, there’s a pretty firm idea of “words” and “vocabulary” as countable, discrete things, even if the lines on what actually constitutes one may be blurred a bit more often than with English. So if you were to tell a native to, say, count the number of words in a given sentence, the answer you get back may be different from what you expect. (And even different between individuals; it’s generally just not a concept applied to anything except word lists. I feel like that’s probably one of the reasons character-count is the ubiquitous space-measurement for writing rather than words, though their consistent spacing also helps.)

Edit – If you look up 単語 or 語 on Japanese Wikipedia (the former redirects into the latter), it basically pulls a mea culpa on how the concept might apply to its own language.

日本語文法では自立語付属語という名前の「語」があるが、これらは言語学的には語よりも細かい形態素であり、語に相当するのはむしろ文節のほうが近い。また、付属語は言語学的には接語あるいは接辞である。

Essentially, a “word” is normally defined as the smallest unit of a language with both discrete pronunciation and meaning, but Japanese has units of meaning considered “words” in its own grammar studies that don’t quite meet the broader linguistic requirements and create trouble for comparisons.

Basically, “We call all these things ‘words’ in Japanese grammar, but they’re so granular that in broader linguistics, it’s probably more useful to look at larger units of meaning instead.”

That doesn’t stop Japanese dictionaries from boasting word counts in Japanese, but it’s more a case of “entry count” in that case, with distinctions determined by the publisher.

Also as a fun fact, 単語数 (“word count”) still exists as a feature in Japanese Microsoft Word, but from my experience it’s functionally useless, always reporting something close to the character count, since there are very few things it can discount as potential discrete units. I’m honestly not sure why it even exists, except for just being lazily carried over.

4 Likes

Except that what a word is IS properly defined, and everyone knows what is and isn’t a word.

Just because some academics with nothing to do like to stir up some drama and argue about simple things like that doesn’t mean we should start doubting established conventions.

Ask a Japanese native to count the number of discrete words in a sentence. It’s not going to be what you think.

Scroll down to the 単語 section on this native grammar-study page for some potentially surprising/debatable “word” distinctions: 言葉の単位をマスターしよう - 国語の文法(口語文法)

If you ask someone how many words are in this sentence, every one will say “16.” The same sentence in Japanese? Eeeeeeeh, lots of wiggle room, and likely not something non-grammarian native speakers would ever even really think of.
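For anyone who wants to see why the English count feels so automatic, here is a minimal Python sketch of the "split on spaces" habit. The function name `count_space_delimited` is made up for illustration, and the Japanese sample is the example sentence quoted elsewhere in this thread, not a translation of the English one.

```python
def count_space_delimited(text: str) -> int:
    """Count the chunks separated by whitespace (the naive 'word count')."""
    return len(text.split())


# The English sentence from the post above (straight quotes used for simplicity).
english = 'If you ask someone how many words are in this sentence, every one will say "16."'

# A Japanese sentence written normally, i.e. with no spaces at all.
japanese = "菜の花が咲く季節になりました"

print(count_space_delimited(english))   # 16
print(count_space_delimited(japanese))  # 1
```

The rule works for English only because the spelling already marks the boundaries. With no spaces to lean on, the same rule sees the whole Japanese sentence as a single chunk, so some other (debatable) criterion has to do the dividing.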

I mean, like, obviously this is outside the real question this thread was asking, as, yes, we can count useful bits of meaning in Japanese for learners to pick up, and it’s totally reasonable to say, okay, yeah, I know X-thousand “words” in Japanese, in an everyday practical sense. This was all just in response to @Slooshy’s question of how the distinction works (or doesn’t work) in Japanese, since I think it’s an interesting topic. And like, really, practically, the distinction is blurred for natives in a way that you wouldn’t expect, although there is a concept of vocabulary and word-count to the extent that, like, word lists and dictionaries can exist by wading through the many blurred lines on their own. But “everyone knows what is and isn’t a word” is definitely not true for Japanese in the way it is for English.

文節を単語に分ける作業は、文を文節に分ける作業にくらべて難しい作業です。

“Breaking down phrases into words is more difficult than breaking down sentences into phrases.”

The exact opposite is true for English, right?

5 Likes

I love their “word” (単語) decomposition, it’s insanely counter-intuitive for an English speaker :stuck_out_tongue_closed_eyes:

From your link, in quiz form, so that everybody can try:

菜の花が咲く季節になりました
How many “words” ?

Answer

菜の花|が|咲く|季節|に|なり|まし|た。→ 8 words :exploding_head:

大人らしく落ち着いて話してください。
How many “words” ?

Answer

大人らしく|落ち着い|て|話し|て|ください。→ 6 words :exploding_head:
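If you want to check the tallies yourself, here is a small Python sketch. It assumes the sentence has already been divided by hand with the full-width bar used on that grammar page; `count_units` is a made-up helper name, and it does not do any segmentation itself.

```python
def count_units(segmented: str, bar: str = "|") -> int:
    """Count the pieces of a sentence that a human has already split with bars."""
    cleaned = segmented.rstrip("。")  # drop the sentence-final full stop
    return len([piece for piece in cleaned.split(bar) if piece])


print(count_units("菜の花|が|咲く|季節|に|なり|まし|た。"))      # 8
print(count_units("大人らしく|落ち着い|て|話し|て|ください。"))  # 6
```

The counting is the trivial part. Deciding where the bars go is the judgment call, and that is exactly where different people (or textbooks) may disagree.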

3 Likes

Lol, I was just editing my post again to specifically include that one.

Ask someone how many words are in it.

A learner coming from English or a similar language might tell you:

Five.

菜, 花, 咲く, 季節, なりました, omitting the particles

“The season when the rape blossoms bloom has come,” by the way, for meaning.

A strict grammarian native speaker will tell you:

Eight.

菜の花|が|咲く|季節|に|なり|まし|た

Counting all the particles as discrete bits of vocabulary, and highlighting the compound nature of what learners tend to think of as verb “conjugations” (but which are actually just additional units of meaning being stacked onto the 連用形/stem, its own noun).

And then 菜の花 is actually a compound that can’t be broken apart because it loses its specific meaning, per this writer.

A native who isn’t attentive to grammar in any particular way might give you any number in between.

: /

Notice that their “phrase” breakdown problems above keep the verb “conjugations” whole, though, which is why that Wikipedia entry on “words” in Japanese specified that they might be more useful for linguistic comparison. But even then there’s a lot more subjectivity than just counting words in English.

Anyway, sorry for hijacking the thread. WK will teach you plenty of words, but you’ll need outside study to contextualize them and fill in a lot of lower-register gaps. That’s the answer for everything WK-related that isn’t reading kanji. “It’s very helpful but can’t be your only source.” :man_shrugging:

1 Like

朝に|散歩する|ことに|して|いる
Except, somehow, for the ている “conjugation”, which gets broken down into two “phrases” (文節)!? Madness!

1 Like

That one actually makes sense when you think of phrases like してばかりいる–there can sometimes be modifiers breaking up the “て-form” and the “いる,” and you also have “てある” for certain inanimate states, so treating “ている” as one conjugation is an even bigger cheat for foreign instruction than most things verb-related. But yeah, it just highlights the logic differences in terms of discrete bits of vocabulary even further.

1 Like

To add to the excellent replies by @IanD, English doesn’t have a clear-cut definition of ‘word’ either. This is what MW has to say about it:

1a(1) : a speech sound or series of speech sounds that symbolizes and communicates a meaning usually without being divisible into smaller units capable of independent use
(2) : the entire set of linguistic forms produced by combining a single base with various inflectional elements without change in the part of speech elements

Except that’s not exactly definitive either – is “without” one word, or two (with + out)? “Within”? What about “to go”? Is that the same as “go”, “goes”, “going”, or a different word? And wait – what’s the past tense of “go”? It’s “went”. Is that a whole new word?

Most native speakers have a fuzzy understanding of what a word in their language is. How come? Speech is a single unit, with barely any pauses. It’s an orthographic idiosyncrasy that a lot of written languages have spaces between what we call “words”, and it’s based on the native speakers’ understanding at a certain point in time, which can change. Take a look at the word “apron”. It wasn’t originally “an apron”, it was “a napron”. In speech, both of these sound like “anapron”, so it got redivided.

Did you know that English has 2 pluralisation systems? One is the -s (books), and one is the -en (children). We can probably agree that “book” is the same word as “books”, but what happens in this case?:

  • Brother → Brethren
  • Brother → Brothers

Take the word “clear-cut” from my first paragraph. Is it actually a single word? Or is it two? Does knowing it require knowing the meanings of “clear” and “cut”? Does knowing it mean you know 1 word, or 3?

And this is just all off the top of my head, and 10 years outside of linguistics. I’m sure there’s more.

3 Likes

Farout, that thing about how they divide up the words is an eye opener.

Nup, it’s very simple. I’m sticking with the definition of anything between two spaces as a single word.

  • Without is one word.
  • To go is two words.
  • No, to go is not the same as just go, goes, going, or any other transformation. They are each their own word(s). If it has a different spelling it’s its own word. Meanings have little to do with it when you’re trying to distinguish word count regardless of context. Right is one word but has a few homophones, such as to turn right, to be right, and to make something right. The meaning/context doesn’t matter, right is still one word in each of those cases. But of course you’d have to learn those three separate meanings and effectively treat them as three separate words that for whatever reason share the exact same spelling if you want to understand the text.
  • Went is indeed a whole new word but one word nonetheless.

Languages change naturally over time, I very much agree with you on that.

What? Does this not include other pluralisations such as radii?

  • It is a single word. Hyphens incorporate multiple words and since they are no longer divided by a space they are effectively joined as one word.
  • No, clear-cut does not require knowing the meanings of clear and cut but it can sure help when learning it. However, it can be independently learned like any other word. Funny enough I tend to avoid hyphens and now I’m not sure if I were to write it as clearcut or clear cut.
  • Knowing it would mean you know one word, just like any other word that you learn. However, it makes learning its component words easier.

TL;DR It really does not need to be as complicated as you make it. In cases of such ambiguity, simple definitions are the best. Apply Occam’s razor, KISS, and all that and you’ll save yourself time and pain.

1 Like

There are several (free) options available, but the platform that made me actually start learning vocabulary was Kitsun. It offers all the mainstream decks (including Core 10k - probably the best version around) and they actually look good visually (not bland like most decks on other platforms). It also has a dictionary integrated, so you can literally search for a term and make a flashcard for it right away. It’s very user-friendly :slight_smile: It is a paid platform, though it offers a 14-day free trial :slight_smile:

If you really can’t afford a paid platform, you can always study the deck on Anki. I recommend this version of the deck :slight_smile:

I usually try to do reviews during the day (most of them in the morning) and save reading for around dinner time :slight_smile: I think it’s best to have different schedules for the two because it lets you focus better. You won’t be worrying about doing reviews during your reading time because you do them at different times of the day :grin:

4 Likes

So you’re saying that “lead” (the verb) and “lead” (the material) are the same word? They’re spelt the same.

Also according to you, “to go” and its conjugations are all different words. Does that mean that when you count them to estimate how many words you know, you actually know 5 different words for every single verb in English? If so, why does it show up in the dictionary only once (or twice)? Does that mean that every single conjugation of よむ is its own word?

I’m pretty sure everybody would agree that English spelling is a mess. It hasn’t really changed ever since the printing press was invented, several centuries ago. It no longer really represents the spoken language. Orthography, in any language, will always be accurate for a very short time until it becomes slightly irrelevant, and then steadily more and more irrelevant (just look at Arabic or Tibetan). A real language is not its writing system or its script or how it’s spelt - it’s how people speak.

You’re right, I did forget about that. Does that mean that Cacti and Cactuses are different words?

Ahhh thank you so much for this - especially for taking the time to find and translate to English from the Japanese Wikipedia article. It’s really helpful for me to know that different native speakers might come up with different word counts, and it’s also a relief to find that this is truly a difference between the languages and not just me being extreme. :slight_smile:

It also helps to know that we are aiming at the larger groupings specifically because that allows for more easily relatable meaning vs. English word correspondences.

1 Like

Thank you soooooo much for the detailed breakdown and explanation. It’s extremely helpful.

1 Like

That may be the case in many languages or, at least for English, when you get a bunch of linguist types in a room who love to unpack the granular meanings in a word.

But there’s no way to dismiss the significance of this: when English speakers count words in sentences they come up with the same totals.

And thank you!!! I love unpacking my language!!! :smiley: I am delighted to have read your post! :slight_smile:

1 Like

I meant that natives would generally know what a word is and what isn’t, and besides having spaces in an orthography really primes people to consider everything that comes between 2 spaces ‘a word’. Japanese doesn’t have that priming.

1 Like

Ohhhhh ok. I see what you are saying.
Thank you!

All: sorry for the threadjacking and thank you very very much!

1 Like

No, in that case where you’ve distinguished their meanings they’re of course two different words that just happened to share the same spelling. I mentioned that a bit further down.

I thought you were talking about counting words in a text though, not whether or not to give a word its own dictionary entry, which is a completely separate matter where meaning is everything and spelling is of secondary importance. The priorities are opposite to word count! Speaking of whether or not to give a word its own dictionary entry…

To go, goes, going and so on all stem from the same meaning. They are each words in their own right, as you need to learn each of them to be able to use the verb ‘go’ properly. For the number of words a person knows, though, that seems more grey and I feel more inclined to say no: if you know all the conjugations of ‘go’ you still only know one word, as their meaning is essentially the same, which is why it makes sense to only use up one dictionary entry.

In fact, I think it would be more correct to say that you only really know the word once you know all of its conjugations because realistically in a text it would be conjugated. Latin declensions are probably a good example of this because you can’t really claim you know 10+ words from just knowing lingua’s variants linguae, linguam, linguas and so on.
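To make the two ways of counting concrete, here is a rough Python sketch. The `HEADWORDS` table is a tiny hand-written mapping invented for this example, not a real dictionary or lemmatizer, and it only covers the forms mentioned in this thread.

```python
# Hypothetical lookup table: each inflected form points at its dictionary headword.
HEADWORDS = {
    "go": "go", "goes": "go", "going": "go", "went": "go",
    "lingua": "lingua", "linguae": "lingua", "linguam": "lingua", "linguas": "lingua",
}


def count_spellings(tokens: list[str]) -> int:
    """Every distinct spelling counts as its own word."""
    return len(set(tokens))


def count_headwords(tokens: list[str]) -> int:
    """Inflected forms are folded into one dictionary entry (unknown forms stay as-is)."""
    return len({HEADWORDS.get(token, token) for token in tokens})


tokens = "go goes going went lingua linguae linguam linguas".split()

print(count_spellings(tokens))  # 8
print(count_headwords(tokens))  # 2
```

Same list, two defensible answers. Which one is "how many words you know" depends entirely on which convention you picked beforehand.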

For Japanese I have no idea how they do their word count, as it appears it is not as cut and dried as it is in English. よみました is apparently three words even though it was just conjugated from よむ, which I think is just one. I guess you could look at it like ‘to go’ in English being transformed to ‘goes’ (2 words to 1), but I’m fine with this since to go is separated by a space. Whereas I have no idea what indication you would need, other than memorisation, that よみました is three words.

1 Like