When do we learn stuff?

It occurred to me recently that even though WaniKani teaches you the kanji 浅 at level 15, and 草 at level 5, at no point does the Almighty Crabigator pat you on the head and say “congratulations, now you’re able to read 浅草” (i.e. Asakusa, the location of Senso-ji and the Kaminarimon). I mean, I get that it’s a kanji learning machine and not a geography quiz, but sometimes I just think it’s nice to know that your knowledge has meaning in the real world.

So I thought I’d make up a massive list of Japanese geographic terms, and the level at which you’ve learnt all the component kanji. And here it is:

So far I’ve only done prefectures and regions, cities, towns and villages. The number in the “Level Learnt” column is the level at which every kanji in the name has been taught to you - if “#N/A” appears, it means there’s at least one kanji in the name that WaniKani never teaches you, which means you’ll need to learn to spot them on your own or forever hold your peace. (A surprising number of places I’ve actually visited come up as #N/A, which is kind of interesting.) Please note that you still may not be able to read the place names, just recognise them (神戸, I’m looking at you, here).
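For anyone curious how the “Level Learnt” column could be worked out, here is a minimal sketch in Python. It is not the actual spreadsheet formula, just one way to express the rule described above. The function name `level_learnt` and the tiny `KANJI_LEVELS` table are made up for illustration, and the table only holds the two levels quoted in the first paragraph. It also assumes the name contains nothing but kanji.

```python
from typing import Dict, Optional

# Illustrative subset only: the two levels quoted in the opening paragraph.
KANJI_LEVELS: Dict[str, int] = {"浅": 15, "草": 5}


def level_learnt(name: str, kanji_levels: Dict[str, int]) -> Optional[int]:
    """Return the level by which every character in `name` has been taught.

    Returns None (the spreadsheet's "#N/A") if any character is missing
    from `kanji_levels`.
    """
    levels = []
    for char in name:
        if char not in kanji_levels:
            return None  # at least one kanji is never taught
        levels.append(kanji_levels[char])
    # A name is only fully covered once its latest-taught kanji is learnt.
    return max(levels) if levels else None


print(level_learnt("浅草", KANJI_LEVELS))  # 15
print(level_learnt("串", KANJI_LEVELS))    # None
```

With a full kanji-to-level table in place of the two-entry one, the same function would cover the whole list.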

As a further pet project, I’m also intending to do the list of all train stations in Tokyo (as a way of listing all the neighbourhoods), but oh boy, there’s one and a half thousand of them, and I haven’t been able to Google up a complete list of their names in kanji, meaning I’d have to write them all out manually. Also, or otherwise, I could do a list of the most famous sightseeing places with TripAdvisor’s list as a guide, but where do I stop?

If anyone else wants to suggest a list of things they’d be interested in seeing on my spreadsheet, feel free to post it here.

Common Japanese names? Unless that’s been done before.

I’ve bookmarked this as a very, very in-the-future learning goal, thank you!

I was thinking to add Japanese names, cities, and train stations to WK as some kind of mini-quiz.

Anyway, there is a nice list of train stations at Wikipedia (https://en.wikipedia.org/wiki/List_of_railway_stations_in_Japan:_A). I already extracted it as English, Kana, Kanji if you need it. There are also huge lists from JPPost for all addresses possible down to the street level, with Kana and English.

For family names there is this:


For given names I found this (most popular names since Shouwa); there is no kana, however:
http://www.tonsuke.com/nebin.html

Also Tofugu Store sells a list (PDF) of most common given names and surnames of boys/girls ordered by frequency.

Just finished doing the list of the top hundred surnames off Wikipedia. I’ll have to poke through that big Github list tomorrow, because it’s nearing midnight. I wonder if I’ve bought that Tofugu list at some point in the past…

As for Wikipedia’s list of stations, no idea how I failed to spot that. This seems workable.

I have this Anki deck with the 10,000 most common proper nouns found on Wikipedia. It contains the various readings as well as the found frequency.

If you’re interested I could get you a CSV

I love this and can’t wait to look it over. I run into this a lot, where I know I know the kanji I’m seeing but I can’t put them together to make sense, and then I look it up and realize oh… it’s a place. I saw 神戸 the other day and in no world would I have thought that said KO BE. But I learned!

Because I suck at going to bed when I ought and get easily distracted, I started poking through the stations list. There’s over nine thousaaaand of them. To be precise, 9078. :slightly_smiling_face: Thinking I’ll strip the 駅 off the back of each station name - if I don’t, that puts the earliest possible time you’d learn any station name at level 13. It’s easy enough to learn 駅 anyway - I learnt it on my first trip to Japan, before I started learning Japanese.

This does sound intriguing. Is this just names and places all tossed into one bucket?

It’s your list, so you do what you feel is best, but I think it makes more sense to keep 駅, as that’s a part of the name. 渋谷駅 isn’t called 渋谷. As you said, learning 駅 isn’t hard, so why not keep it?

Yep, I believe so. It has a lot of train stations too; the most frequent, 無人駅, being mentioned on Wikipedia 6936 times, and the least frequent, 水上駅, being mentioned 184 times.

Here’s the CSV if you want to have a look at it

This is great! I love these kinds of data sets, so thank you for making it. If you want any help with manual parts just shout, as I’d be happy to chip in :slight_smile:

I made this on Tiny Cards a while back if you are interested:

Oh thank you! I haven’t used Tiny Cards before but I’ll definitely give this a try.

Station list is up. I still need to run down the list and do a sanity check, though (ugh, so many station names with kana - the biggest offender being 東京ディズニーランド・ステーション駅, which is eighteen characters long, the longest of any station name, and why does it need to have ステーション and 駅?) but for most of it, it’s accurate. For reasons probably known only to a devoted railfan, precisely eight stations in all Japan have 停留場 instead of 駅, all of them on tram lines, but not all the same line (there’s three on the Arakawa Toden Line, two in Kumamoto, and one each in Takaoka, Okayama and Kagoshima).
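A sanity check like this can be partly automated. The sketch below is only a guess at what such a check might look like, not the process actually used; `suffix_outliers` is a made-up helper, and `names` holds just three station names mentioned in this thread rather than the real list.

```python
from typing import Iterable, List

# Sample rows only; the real list has roughly nine thousand entries.
names: List[str] = ["渋谷駅", "東京ディズニーランド・ステーション駅", "駅前駅"]


def suffix_outliers(rows: Iterable[str], suffix: str = "駅") -> List[str]:
    """Return every name that does not end in `suffix` (e.g. a 停留場 stop)."""
    return [name for name in rows if not name.endswith(suffix)]


# Longest name by character count.
longest = max(names, key=len)
print(longest, len(longest))   # 東京ディズニーランド・ステーション駅 18
print(suffix_outliers(names))  # []
```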

It’s running a teensy bit sluggish, though. Who woulda thought a table over nine thousand rows long would not entirely play nice with Google Drive? :stuck_out_tongue:

Why not both?

Figured out a simple way to do it. Basically, I added a second column that checks whether the “level learnt” value is 13, and returns the second highest level number instead if so. My feeling is that 渋谷 may not be the name of Shibuya Station, but it’s sure the name of Shibuya, and part of my motivation for making a list of all station names is that it was a convenient way to make a list of all neighbourhoods. Or, all the important ones, at least.
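In case the second column is hard to picture, here is a rough Python equivalent. It is a sketch of the rule as described, not the spreadsheet formula itself. `neighbourhood_level` and the three-entry `LEVELS` table are hypothetical, the table only uses levels quoted earlier in the thread, and 草駅 is a made-up name used purely to trigger the level-13 case.

```python
from typing import Dict, List, Optional

EKI_LEVEL = 13  # level at which 駅 is taught, per the earlier post

# Illustrative subset: only levels quoted in this thread.
LEVELS: Dict[str, int] = {"浅": 15, "草": 5, "駅": EKI_LEVEL}


def neighbourhood_level(name: str, kanji_levels: Dict[str, int]) -> Optional[int]:
    """Level learnt, falling back to the second-highest level when 駅 is the bottleneck."""
    levels: List[int] = []
    for char in name:
        if char not in kanji_levels:
            return None  # "#N/A": some kanji is never taught
        levels.append(kanji_levels[char])
    if not levels:
        return None
    levels.sort(reverse=True)
    # If the overall level is 13, report the next level down instead.
    if levels[0] == EKI_LEVEL and len(levels) > 1:
        return levels[1]
    return levels[0]


print(neighbourhood_level("浅草駅", LEVELS))  # 15 (駅 is not the bottleneck)
print(neighbourhood_level("草駅", LEVELS))    # 5  (hypothetical name)
```

Because duplicate levels are kept in the sorted list, a name whose other kanji is also taught at level 13 still comes out as 13.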

Thanks. I certainly will take a look at it.

Thanks for the offer, but if all goes well, there shouldn’t be much by way of manual parts. :slightly_smiling_face:

Tofugu actually has a list of Japanese names. It’s come in very handy.

The こう reading does show up in some other words, like 神々しい こうごうしい (divine) and 神月 こうづき (tenth lunar month), but not much more than that.

へ is a nanori reading for 戸, so something most of us are unlikely to know.

Just a guess but I think 停留場 can be used for buses or trains and then the people in those areas couldn’t decide if the tram was a bus or a train so went with this option.

Another guess: perhaps there was already another station with that name and they used 停留場 to differentiate themselves.

One more guess: those tram stops were originally bus stops, or are also bus stops, so they continue to use that suffix.

Just happened to re-read this. It’s unclear if you’re aware of this from your wording (so apologies if you are), but 無人駅 isn’t a station name - it means “unmanned station”.

These are all fair guesses. I just find it a bit odd that only those eight stations use it, and not the adjacent stations on the same lines.

Alrighty, I’ve finished my sanity check of the train stations - discovered in the process that in my pre-processing I’d accidentally deleted the names of three stations which start with 駅 as well as end with it. Also found that Genbaku Dome Station in Hiroshima was there twice (once in English), so there’s only 9077 stations.

I’ve also added the list of surnames that @acm2010 suggested, and the list of Wikipedia nouns that @Kumirei provided (gotta say, though, some of those nouns labelled as locations are more like… concepts instead). As for given names, I can’t seem to find if I ever purchased the Tofugu names list pack in the past. Or downloaded it when it was the free offer-of-the-week…

Kanshudo also has lists (the same lists?) but in order to download them so that I can actually do anything with them, I need to have a Pro account.

One thing that jumped out at me when I was processing things is that 串 seems to be a surprising omission from the WaniKani lists - surprising considering how often I saw it in the wild in my last two trips. Perhaps Koichi was scratched by a splinter in his childhood, and he’s borne a grudge ever since?

For the fun of it, I thought I’d play around a bit with the station name kanji data. Here’s the kanji sorted by how often they appear in the list:

Had to do that on its own spreadsheet, because Google Drive was complaining the other one was getting too big - did you know they have a limit of 400,000 cells? That’s a lot of cells.

Anyway, there’s some fairly interesting results there. For example, the fact that 田 appears in the name of six hundred and fifty-seven stations. And more stations have a ヶ or a ノ than, say, 水 or 白. And I noticed this one before, but 井 is taught as a kanji surprisingly late - at level 45 - especially considering (a) it’s taught as a radical at level 14, and (b) the very similar but slightly more complex kanji 囲 is also taught at level 14.

Of the cardinal directions, 西 is the most common, coming in 9th place with 322 appearances, followed by 東 at 12th (287), 中 at 13th (280), 南 at 19th (203) and lastly 北 at 22nd (190). As an added bonus, 上 is at 14th (259) and 下 is at 26th (180).
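A ranking like the one above can be reproduced outside a spreadsheet with a few lines of Python. This is a sketch under assumptions, not how the spreadsheet does it: `stations_per_character` is a made-up name, it counts each station once per character (so a repeated character in one name is not double-counted), and the sample uses three unstripped station names from this thread so that 駅 gives a visible repeat.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def stations_per_character(names: Iterable[str]) -> List[Tuple[str, int]]:
    """Rank characters by the number of station names they appear in."""
    counts: Counter = Counter()
    for name in names:
        # set() so a character repeated within one name counts only once
        counts.update(set(name))
    return counts.most_common()


sample = ["渋谷駅", "水上駅", "駅前駅"]
print(stations_per_character(sample)[0])  # ('駅', 3)
```

Kana such as ヶ and ノ are counted too, since nothing here filters for kanji.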

The first kanji on the list that’s not taught by WaniKani is 幡, which appears in the name of forty stations - wondering if it’s worth petitioning for it to be added to WaniKani, especially considering the 八幡 shrines are the second most numerous in Japan, after the 稲荷 shrines.

On a different note, I’ve decided I’m a bit uneasy about using the data from Tofugu’s paid-for names list to produce something that is essentially exactly the same data but released for free. Maybe if there was a free list somewhere. The one you linked to for given names, @acm2010, only seems to have ten names per year.

Just noticed I made a terrible mistake, and accidentally stripped the back end off any station which had 駅 appearing in the middle of the name - not just 駅前駅, but all the other stations named (X)駅前駅 and so forth. I’ve fixed that now, for both the list of names and the kanji data.
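For anyone doing similar pre-processing, one way to avoid this class of mistake is to remove only a single trailing 駅 and leave every other occurrence alone. The sketch below shows that idea in Python; it is not the fix actually applied in the spreadsheet, and `strip_station_suffix` is a made-up name.

```python
def strip_station_suffix(name: str, suffix: str = "駅") -> str:
    """Remove one trailing `suffix`, leaving any earlier occurrence untouched."""
    # The length check keeps a name that is nothing but the suffix intact.
    if name.endswith(suffix) and len(name) > len(suffix):
        return name[: -len(suffix)]
    return name


print(strip_station_suffix("渋谷駅"))  # 渋谷
print(strip_station_suffix("駅前駅"))  # 駅前
```

By contrast, splitting or replacing on every 駅 would turn 駅前駅 into an empty string or just 前.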

Edit: Wait, wait. 曽, which appears in the names of twenty-four stations, is taught as a radical in level eighteen, but never as a kanji?