On the nature of written language and "useless" words

In 1799, Pierre-François Bouchard, a French military engineer, uncovered the Rosetta Stone during Napoleon’s expedition to Egypt. The slab was rather remarkable; it carried the same decree in three versions: Ancient Greek, Demotic and, more importantly, hieroglyphic Egyptian. With only minor variations between the texts, the hitherto unappreciated lump of granodiorite became the key to translating the yet-untranslated Ancient Egyptian language.

One funny thing the Egyptologists of the 19th century noted, as they became more confident in deciphering the language and went further back in time, was how little the written language had actually changed. When compared to Demotic, the hieroglyphs from the Ptolemaic era still displayed the same writing conventions and grammar as writing from as far back as the Old Kingdom period, around two thousand years earlier. Two thousand!

Admittedly, we live in much different times now. The advent of mass literacy has radically transformed the process by which languages build consensus and evolve. And that’s good! However, one aspect remains: even as spoken language evolves rapidly, written language remains notoriously conservative.

Now tell me: do father, bother and law have the same “a” vowel? Well, you tell me, as I said. If you call them chips, yes they do, but not if you call them crisps. And this is just one of the many examples of how conservative written language is!

The point that I’m trying to make is that reading and writing is a vastly different experience from speaking. Spoken language is all about efficiency and it moves fast and breaks things, but the written word is more thoughtful and deliberate.

And not only that, but the written word has to accommodate all of human experience and knowledge and make it transmittable. This is why the concept of “useless” vocabulary irks me quite a lot; I first met people complaining of such malarkey when I was learning English. You’re telling me that in all of the human condition of the English-speaking peoples, nobody will ever write down the word “acumen”?

It’s a similar thing with Japanese vocabulary. To me, the whole concept of a “useless” word says more about the person who raised the complaint than anything else. It’s pretty presumptuous to say that the word will never come up and is utterly irrelevant to the plurality of the Japanese human condition, and all you’re saying is that you like comfort zones.

But WaniKani is a reading resource, for better or worse, and its vocabulary selection is geared towards that. And because of the conservative, thoughtful and pensive nature of written language, what qualifies as a common word is a vastly wider net than it is for speaking, if only for the simple fact that you can just pause and think about what you’re writing. That leads to vocabulary, grammar and turns of phrase that you may not necessarily use in everyday conversation, just because of the choice of medium for your language inquiry.

And that doesn’t even get into specialised knowledge one may possess and one may want to express in another language. Who am I, or who are you, to deny an engineer the privilege of learning how to say “torque” in Japanese, Bulgarian or Azerbaijani? Taking professional fields into account, what is and is not “useful” suddenly becomes an entirely different question: in what context am I going to find that language, and is that context relevant to me? You’d be surprised at how much specialised language one has to learn just by interacting with other disciplines (I know far too much about electricity due to developing electricity distribution software; let us not go into that incident where I misinterpreted how many Newtons per Coulomb is too much).

So, if you, in any language you’re ever going to learn, decide a word is useless, think: “what exactly is my goal with this language?” You might just want to be conversational and not ABSOLUTE ULTIMATE MASTERY, and that’s fine. But that’s your goal. Someone else might have a more involved goal, and to him the word is useful because of the material he engages with or specialised knowledge he possesses. And we’ll walk up to the sun, hand in hand.


thank you for this detailed essay


I’m sorry, but I’m now intrigued by this


In all seriousness though, I do agree with your overall point. Yes, some words may be relatively rare, but it only takes one guy trying to sound smart and you’re going to be reading about someone’s exquisite coiffure - or, in Japanese, you’re gonna see 河豚 written in kanji.

That said, I also see the other side of the argument: while no vocabulary is inherently useless, one could ask whether learning rarely-used words is the best use of one’s time when one is still very much at the beginner level. One could forgive a non-native still learning about the intricacies of the present simple for not quite understanding how one’s coiffure may or may not be splendiferous, I’m sure, and at that point that’s hardly the thing they should be focused on learning.


I’d imagine that the presumably tedious process of carving glyphs into stone would tend to work against changing the written language (although I suppose that writing on papyrus would be easier and thus more conducive to accommodating changes).


You’d be surprised

Classical Latin, based on the Latin of the late Roman Republic, remained virtually unchanged in its written form even as Vulgar Latin developed.

Vulgar Latin had the last laugh, though, as all Romance languages are derived from it and not from Classical Latin.


People who talk about “useless” words are usually trying to communicate the idea that a word is, for them at that point in their learning, probably pretty far out on the end of the “how likely am I to need to say this or encounter it in what I hear or read in the near future” spectrum. At some point they’ll probably need to learn it, but maybe right now it’s not very efficient. They have a point, even if they’re maybe not expressing it as clearly as they could.

And underlying that gripe are some tensions, including:

  • to what extent is it practical or economically feasible to tailor learning to the individual and what they want to read or listen to, versus having everybody take the same path, at least initially?

  • for Japanese in particular, there’s a tension between introducing kanji in a “simple components first” order and a “most common first” order; similarly, a desire to teach all or most readings conflicts with wanting to focus on more common vocabulary.

People can disagree about the best point to aim for in the design space when coming up with an educational tool, and also about whether the point being aimed at could have been hit slightly better.

PS: enthusiasts for the idea that written communication is not merely a recorded and lesser form of spoken language might like some of the arguments in the book-length essay 日本語が亡びるとき.


I am such a person

The written word is art


Back when writing meant having a scribe painstakingly engrave or draw symbols on often expensive materials, that was probably true. But recently the situation has changed tremendously with the internet. People read and write more than ever, at basically every layer of society. And while high literature of course still exists, there’s no shortage of extremely casual written text on social networks, which does indeed move very fast and break many things.

Well I mean, of course. If a word exists then it must have a reason to exist. When people talk about useless vocabulary they mean “useless to them at this point in their language learning journey”, which is absolutely valid and in no way presumptuous. That’s a very reasonable point to raise actually.

On Anki or Bunpro (and many other SRS solutions) I can trivially suspend a card I don’t want to study, or mark it as known if I already know it. Without this possibility, adding more “niche” vocab is just frustrating for a large portion of the userbase.

But beyond all that I want to point out that I think that you make a mistake when you say:

I don’t think that’s true, or at least it’s not completely true. Some rather common words are missing from WK, while some rather obscure ones (in writing or otherwise) are present. Many words seem to have been chosen specifically to reinforce the readings and meanings of the kanji, not because the word itself is very useful or important.

Take words like 仁義, 雑費 or 出獄: I don’t think they would make it into anybody’s list of “top 6000 kanji words to learn”, but they’re in WK. I think they’re here purely to reinforce the kanji, not because they’re very “useful” words to memorize.

To be clear, I don’t think it’s an invalid approach. I personally use WK almost purely for kanji study and not for vocabulary, so that works for me.


For better or worse, we can agree that kanji are primarily for reading.

Imagine my shock the first time I landed in Japan and found out that Japanese people didn’t talk in kanji at all!

It’s all hiragana!


