Manga Wordlist Wiki

Hey everyone. My name is Dan. I studied Japanese for 10 years and lived in Nagoya for 2 years. I want to create a website that is a wiki of wordlists for manga. If you think this is a good idea, please check out the demo site below and give me feedback. If you have any wordlists in Excel/CSV format, please send them my way and I will put them up on the site. I have about 5 that I made myself and will be adding shortly.
/edit 3: updated link to site
https://www.mangakotoba.com/

/edit yes only volume one of One Piece is currently available :joy:

Also, if you want to help, please comment below. I could use help gathering wordlists, and I am looking for people active in book clubs (especially book club organizers) and coders (React, Node.js).

More about me and my plan.
For years I have read manga with my phone open, trying to type out the kanji and get a definition, then find example sentences, and maybe make a flashcard. I have also gone through and made wordlists so I can read without stopping. I have found that you can’t learn every word, so wordlists are the way to go if you want to learn but also enjoy reading.

/edit2 I wanted to add my thoughts about flashcards and memorizing words. I am actually a person who studies. All day, every day. Not always Japanese, but before I moved to Japan I spent about 8 years, probably 3 hours a day, studying Japanese. In Japan I worked at MRJ, speaking Japanese at work, and I joined a Japanese rugby team and made a lot of friends. I spent a lot of time on flashcards, memorizing words and kanji, and trying to read manga. My conclusion was that unless you are spending 10 hours a day, you cannot memorize kanji to this extent. So the best thing to do is focus on what’s important, and then read manga using wordlists. I don’t ever need to know how to say or read 環境汚染 (environmental pollution). I want a wordlist to remind me when I read it. If I read it enough this way, I will remember it; if I don’t ever see it again, I won’t.

A good example of this is that I speak French and have lived in France for over a year. I have never looked up how to say or read environmental pollution, but the French word for environment is environnement and the word for pollution is pollution, so if I see those words I can figure it out. To me this is the same concept as having a wordlist in Japanese. When speaking French, I might not remember that the word for pollution is the same in French as it is in English, but when I read it I remember. Also, being able to read French easily because of the word similarities allowed me to read in French for as long as I wanted without burning out. Wordlists allowed me to do the same thing in Japanese.

WaniKani is the only flashcard system I think people should use and finish. I found WaniKani a few years ago, after I had learned Japanese, but it puts readings, words, and hiragana together in an understandable format that is amazing for beginners and even intermediate learners. After WaniKani, I think flashcards are useful, but only in the short term to memorize new words; then throw the flashcards away, even if you forget the words.

So I don’t have any real intention of adding flashcards to this site. The one feature I was thinking about is to have pre-reading flashcards. I have done this many times. You make flashcards and study them for the book you want to read. Spend a reasonable amount of time on them, and then try to read through without looking anything up, but use the wordlist if needed. This can make the reading even more natural without stops. Again I would do this for 環境汚染. But then it is deleted from my study routine as I don’t expect to see it again anytime soon.

13 Likes

It’s a cool idea. I have some questions.

  • Is the plan to add every single word, even if it repeats on a later page? If so, do you have any concerns about copyright violations?
  • If a word has multiple unrelated meanings, will you show all of them or only include the one you think is most accurate?
  • How will you ensure quality/correctness? Will there be some kind of crowdsourcing/editing/wiki process?

As for feedback, I opened it on my phone and it doesn’t render particularly well. :sweat_smile: But as I said, it’s a cool idea. Definitely looking forward to following your progress.

6 Likes

@seanblue I said demo! I’ll get mobile working tomorrow!

  • Honestly, 80% coverage is the goal. It will never be perfect, so as long as it makes someone’s reading time 10x more enjoyable, it is working. But in general I would list a word again when it repeats.
  • I would include multiple definitions.
  • To start, it will just be me loading them manually. I can build out a dashboard for people to do it themselves, just like sharing Google Docs. Yes, I think there will be some sort of review.

I have also had Japanese teachers explain some of the sentences and what they actually mean, and this is a future addition. I had a teacher explain the grammar and change it to “textbook” grammar, and seeing that made it so much easier to understand the sentence as written in the manga.

4 Likes

This is the kind of project I’d like to have done if I weren’t so lazy. It has a lot of potential.

I’m very positive about what the site can become, so forgive me if the remainder of my comment sounds critical.

Here are some downsides of this concept that I've thought of before.

(After typing the below, I realized you’re probably looking to accept partial word lists as well, even if the goal is at least 80%. Again, these comments are based on what I’ve thought of in the past for such a site.)

Typing up all the words from a manga volume, matching them up with J-to-E dictionary results (which can be automated, but then you have to pick out the proper English word in context), and tracking which page each word is on can easily be a five-hour task for a 200-page manga volume with a lower density of text.

There are many useful options and features that could be included, but any that require extra information alongside the words add friction and make data entry take longer.

In all, it’s a big ask for an audience who, by the time they finish entering all the words, won’t have any use for the work they just completed.

And there’s always the potential for copyright issues. Even providing a word translation list can potentially be stepping on the toes of a company that owns the rights for an English release of the manga.

I gather the site is sort of pre-alpha release. Just some comments on the current status.

Presentation and first impressions are very important. Aside from getting those other five manga volumes you mentioned added, you’ll want to be sure to fix any big issues as soon as possible before showing it off to too many people.

You’re probably aware of issues that need clean-up, and basic features pending addition. I’m hesitant to list any for that reason, but if you’re looking for some things to tackle first, here’s what stood out to me:

  • The previous/next buttons are both labeled “Prev”. (Easy fix?)
  • There’s no way to jump to a certain page. (Moderate work to add?)
  • There’s no way to bookmark/link to a specific page. (Maybe not much need for it?)
  • Vertical centering of left and right pages might be a poor choice if one side has a lot of dialogue, and the other has very little.
  • Showing two pages at the same time is a good visual match with a physical manga having two pages visible at the same time, but being able to view one page at a time can make better use of monitor space. (Well, I’m on a 19", which is probably uncommon these days… I imagine you’ll want to support showing one page at a time for mobile, though.)
  • Clicking the “next” button moves the page on the right to the left. I would expect it to advance two pages, not one, since two pages are showing.
  • Removing that trailing semi-colon from the end of definitions seems like an easy fix.

Some thoughts on the One Piece entry you have:

  • Page 5, you have かつて in the kanji column. Having validation to ensure this column contains kanji will help ensure you have clean data from the start.
  • Page 5, you have a lot of unnecessary words listed. For はなす, you have “to release; to let go; to free; to set free; to let loose; to turn loose”. Although it takes longer to do, clearing out as many unnecessary words as possible will make it easier for a reader.
  • Page 6, you have the definition “means ぐらい”. That’s great if the reader knows what ぐらい means. But if they don’t…
  • Page 8…has issues. Another validation to put into place is to ensure the hiragana column has kana.
  • Along those lines, consider whether the hiragana column should be named “kana” instead. A lot of mangaka put Japanese words in katakana for style reasons, plus loan words are usually in katakana. Having the column marked “hiragana” may confuse a potential contributor once they hit a katakana word to enter.
  • Page 9 is a good example of why it’s good to cull the definitions. しょうこ is listed as “evidence; proof”. These English words have different meanings (even though they are very close in relation). By having only the more accurate one, the reader doesn’t have to stop and consider which one is correct. (But that depends on your goal for the site, whether you want the reader to have as little friction in their reading as possible, or want to learn the wide range of English meanings possible in various contexts beyond the manga panel.)
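
The two validation ideas above (the kanji column should contain kanji, the kana column should contain only kana) could be sketched roughly like this. It is only a sketch under assumptions: the function names and the row shape are hypothetical, not the site's actual code, and the Unicode ranges cover just the common blocks.

```python
import re
from typing import List

# Common kanji blocks plus the repetition mark; rarer extension blocks are not covered.
KANJI = re.compile(r"[\u4e00-\u9fff\u3400-\u4dbf々]")
# Hiragana and katakana blocks (the katakana block includes the long-vowel mark ー).
KANA_ONLY = re.compile(r"[\u3040-\u309f\u30a0-\u30ff]+")


def has_kanji(text: str) -> bool:
    """True if the text contains at least one kanji character."""
    return KANJI.search(text) is not None


def is_kana(text: str) -> bool:
    """True if the text is non-empty and consists only of hiragana/katakana."""
    return KANA_ONLY.fullmatch(text) is not None


def validate_row(kanji: str, kana: str) -> List[str]:
    """Return a list of problems for one wordlist row (empty list means OK)."""
    problems = []
    # An empty kanji cell is allowed here, since some words are written in kana only.
    if kanji and not has_kanji(kanji):
        problems.append(f"kanji column has no kanji: {kanji}")
    if not is_kana(kana):
        problems.append(f"kana column has non-kana characters: {kana}")
    return problems


print(validate_row("かつて", "かつて"))
print(validate_row("証拠", "しょうこ"))
```

Allowing an empty kanji cell is a design guess; a stricter rule would work the same way with one fewer condition.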

I’m a lazy person who often starts a personal project, puts a lot of work into it, then gets bored and moves on to a new fun personal project. What assurances are there for potential contributors that this site will remain (outside of copyright issues), and that it will continue to grow over time?

6 Likes

Hey Christopher. Thanks for the feedback. First, Japanese is like my first kid, so I would never abandon this project :innocent:. But realistically, I might just make it open source, so anyone can contribute code, and in case I die, anyone can copy everything.

/edit
Here is the GitHub repo if anyone wants to get involved or check out the code.

1 Like

OK, mobile styling is done. If anyone has a screen or something that doesn’t work, let me know. I am now going to try to get 10 books on there. Once there are 10, that will be enough to start working on other styling.

If anyone has a wordlist please send it, or if you have time search through the bookclubs here and find those. I will get to that though after I upload a few of the wordlists I created myself.

2 Likes

Interesting concept but I’m concerned you haven’t responded to the queries about potential copyright infringement.

I also noticed that the website has no copyright declaration on it which might put you in violation of copyright laws.

It might be worth going onto Reddit and asking about the viability of such a project before investing loads of time and energy into a non-starter.

I’d hate it if you put valuable time into this project only to have it taken down by the owners of the manga copyrights, or even worse sued.

3 Likes

@zyoeru Hey, thanks for the concern! There are no copyright issues with displaying a list of words. The only issue would be if large parts of the text were reprinted. Reprinting small parts of text, photos, etc., is allowed if I add content on top of it, like breaking down a difficult sentence and adding a grammar explanation. This is similar to reaction videos. If you have been on YouTube in the last few years, you will have seen a lot of reaction videos, where people watch something and comment. This is allowed because they are taking a small part of something and adding content on top.

3 Likes

Adding to the idea of it being just a list of words, are you expecting to list just the “base” forms of words?

For example, consider the following line:

「こんなテスト持って帰りたくない」

This boils down to the following words:

  • こんな
  • テスト
  • 持つ
  • 帰る

Like that, it’s even easier to say it’s a list of words, even if it covers a whole manga volume.

4 Likes

Additionally, I think websites need an imprint with at least a contact form for reaching the admin or whoever is responsible for the site.

Also I could offer you vocab for p.1+5 (so far) for Haikyu!! Vol.1.

1 Like

That’s somewhat correct, but reaction videos are legally precarious. Fair use covers only certain circumstances, and the work has to be “transformative”.

Often content creators on YouTube don’t get pursued by copyright holders because the holders are receiving free publicity, not because the creators are legitimately relying on the fair use exemption for copyrighted material.

1 Like

@ChristopherFritz If someone gives me a wordlist, I will upload it as is. However, if I am making the wordlist or approving an addition, I would not include words everyone should know. So if it is Yotsubato, I would probably include words like こんな and テスト, because all words are hard if you are reading an entry-level book. There are also lots of hiragana words that are hard to look up, or are just sounds, so including them would help.

However, if it is for One Punch Man, then you should know all the words you listed above, and the focus would be on the harder words. If you look at the only book currently on the site, One Piece volume one, I took the same approach, since I made it by myself. If it is a word I consider a prerequisite, I personally would not spend the time to list out a definition. The other goal of the site, once there are lots of wordlists, would be to guide people along a reading path.

@zyoeru There are dozens of sites that have been up for decades that flat out copy translated manga, raw manga, and every anime in existence. So I am not worried, in the slightest, about having a wordlist up from a book.

@JuiceS Great feedback, but it is a demo, so more features are coming soon! I am adding a contact form today or tomorrow, as I need to hear from people who want to volunteer or who want to know when the books they are interested in are up.

I’m a bit biased because the first manga volume I read was way above my level. (I used it as a means to learn grammar and pick up some vocabulary along the way.) I imagine it’ll be difficult to target what others should or should not know.

That said, it sounds like what you’ll ultimately have (if you allow for user contributions) will be not unlike our book club vocabulary lists, where people may go to the list to look up a word they don’t know, find it’s not on the list, look it up elsewhere, and then (hopefully!) add it to the list for others to see. I’d say it works well enough for the book clubs.

To reiterate what I said before, I’m absolutely looking forward to your project’s potential and what the result will ultimately be.

1 Like

I’d love to find a way to contribute! I save a wordlist for every manga I read, which has accumulated to 20+ titles by now. Some of them are better constructed than others, but hopefully they can still be a resource!

3 Likes

@pucko Awesome. Let me figure out the best way for people to send me stuff, but if it is in Google Docs you can link it here (I think…). Any and all submissions are welcome. Also, if a list is, say, half done and most people would want more words in it, then I can work on how to make that happen. Can you list the titles? I am really curious about what you have been reading.

The easiest thing to implement would be to put everything up on Google Docs, and then have some way to share the links with people who want to help. I could put the links up on the site, but I am worried about random people vandalizing the spreadsheets. So I am waiting for more feedback before deciding. But I am hoping that if I get enough people to help, we can just do what the book clubs here do and work on it together as a community, and have the data saved for the next person who wants to read it.

Lots of different titles! I will leave out the single-volume titles to keep the list short:

  • Rookies
  • Rokudenashi Blues
  • Slam Dunk
  • Gokinjo Monogatari
  • Tenshi nanka ja’nai
  • Nodame Cantabile
  • Kimi wa Petto
  • Dr Slump
  • Haikara-san ga Tooru
  • Happy Mania
  • Sugar Sugar Rune
  • Tokimeki Tonight
  • What’s Michael?

Currently reading through Black Jack, Ashita no Joe, and Hime-chan no Ribon :grin:

It would definitely be worth making it as open source as possible to really get the ground covered. Maybe further down the road there could be some sort of high-frequency dictionary, generated from how often a word shows up across the various wordlists?
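
For what it's worth, a frequency dictionary like that could be quite small to build once the wordlists exist. Here is a rough sketch that counts how many wordlists each word appears in; the function name, the volume labels, and the sample words are all made up for illustration.

```python
from collections import Counter


def build_frequency(wordlists):
    """Count, for each word, how many wordlists contain it.

    `wordlists` maps a (hypothetical) volume label to its list of words.
    A word repeated inside one list is counted only once for that list.
    """
    counts = Counter()
    for words in wordlists.values():
        counts.update(set(words))
    return counts


# Made-up sample data, only to show the shape of the input.
sample_lists = {
    "volume_a": ["証拠", "海賊", "証拠"],
    "volume_b": ["海賊", "仲間"],
    "volume_c": ["海賊", "証拠"],
}

frequency = build_frequency(sample_lists)
for word, n in frequency.most_common():
    print(word, n)
```

Counting lists rather than raw occurrences keeps one very long volume from dominating the ranking, but either choice would be reasonable.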

2 Likes

The 2nd book is on the site. I created an automated tool to convert Excel, Google Docs, and CSV files into a format I can easily put on the site.
https://mangakotoba.herokuapp.com/
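
To give contributors an idea of what such a conversion might involve, here is a minimal sketch of the CSV case. It is not the actual tool: the column names (page, kanji, kana, definition) and the page-grouped output shape are assumptions made for the example.

```python
import csv
import io
import json
from collections import defaultdict


def csv_to_pages(csv_text):
    """Group wordlist rows by page number.

    Assumes a header row with the hypothetical columns
    page, kanji, kana, definition.
    """
    pages = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        pages[int(row["page"])].append({
            "kanji": row["kanji"].strip(),
            "kana": row["kana"].strip(),
            # Drop a trailing semicolon left over from dictionary exports.
            "definition": row["definition"].strip().rstrip(";").strip(),
        })
    # Return a plain dict ordered by page number.
    return dict(sorted(pages.items()))


sample = (
    "page,kanji,kana,definition\n"
    "9,証拠,しょうこ,evidence\n"
    "5,,かつて,once; formerly;\n"
)

print(json.dumps(csv_to_pages(sample), ensure_ascii=False, indent=2))
```

A spreadsheet exported as CSV from Excel or Google Docs could go through the same function, as long as its header row matches the assumed column names.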

@hanashippanashi What format is the data in? Soon I’ll throw an email signup or something on the site so files can be sent to me, but feel free to send me the spreadsheets or share a file on Google Docs. Again, if they are not complete, it is no problem.

dwestlund531@gmail.com

2 Likes

http://japaneseapp.com/ I use this dictionary app as my wordlist! It has a share function, but I am unsure what file format it uses or if it can be converted somehow. Being able to share directly from the app would be the most convenient for me due to the number of lists I have :sweat_smile:

1 Like
