Manga Kotoba: Manga Frequency Lists and Stats

Correct, this export option is only at the series level. I could add more options upon request; this is just a default minimal implementation.

The path should work as:

  1. Access a series page.

  2. Select this option:
    image

  3. This should give the export options page:


    You can change the minimum frequency here.

  4. Select “Export” to get a textbox with Japanese words and frequencies:

There's also an admin-only unfiltered volume-level XLSX.

I should really build an improved export overall, with series-level and volume-level splitting, a choice of formats such as CSV, XLSX, and JSON, an option to include English translations, etc. It’s just a matter of finding out what people would actively use, then implementing that.

The dashboard setting is for getting exports from the database. There’s a lot of room for improvement there as well, but that’s intended for someone who wants to export a copy of their data from Manga Kotoba.

Almost!

I had Friday off from work, so I made use of it to get through a lot of pending items:


Across all imports (multiple users), I’m down to 27 volumes left to get to. (It was over 100 yesterday.)

I’ll have to look into this and improve the interface for it. I don’t think I have pagination implemented yet, and it’d be good to be able to filter by status (pending, added, or not added with a specific reason).


when I click this… I’m not seeing item 3 at all…hence the confusion I think

as an example when I hover over the link I see

but when I click on the link I am ending up at the dashboard…

which put me back in the json mess (no tab delimited option)

am I doing something stupid or did I find a bug? or a 3rd option?


Looks like the developer really messed up some of these routes:

router
  .get('/series/:slug/export-words', [SeriesVocabularyController, 'exportWords'])
  .use(middleware.admin())
router
  .post('/series/:slug/export-words', [SeriesVocabularyController, 'exportWordsPost'])
  .use(middleware.admin())
  .use(middleware.admin())

Question 1) How did I get this to be set to only admin rather than simply authenticated users?

Question 2) Why do I have the second route limited to admin…twice???

Copy and paste errors. This is what happens when I don’t ask AI to look for anything suspicious.

Corrected routes:

router
  .get('/series/:slug/export-words', [SeriesVocabularyController, 'exportWords'])
  .use(middleware.auth())
router
  .post('/series/:slug/export-words', [SeriesVocabularyController, 'exportWordsPost'])
  .use(middleware.auth())
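
To illustrate why the original routes locked out regular users, here is a minimal, framework-free sketch. The `User`, `Guard`, and `Route` types and both helper functions are hypothetical stand-ins, not AdonisJS's API or the site's actual code; the only thing taken from the routes above is the path and the choice of `admin` versus `auth`.

```typescript
// Hypothetical stand-ins for illustration; not AdonisJS or Manga Kotoba code.
type User = { name: string; isAdmin: boolean };
type Guard = { name: string; allows: (user: User | null) => boolean };
type Route = { method: "GET" | "POST"; path: string; guards: Guard[] };

// "auth" only needs a logged-in user; "admin" also needs the admin flag.
const auth: Guard = { name: "auth", allows: (user) => user !== null };
const admin: Guard = {
  name: "admin",
  allows: (user) => user !== null && user.isAdmin,
};

// A request passes only if every guard attached to the route allows it.
function canAccess(route: Route, user: User | null): boolean {
  return route.guards.every((guard) => guard.allows(user));
}

// Lists guard names that were attached to the same route more than once.
function duplicateGuards(route: Route): string[] {
  const names = route.guards.map((guard) => guard.name);
  return names.filter(
    (name, index) =>
      names.indexOf(name) !== index && names.lastIndexOf(name) === index
  );
}

const buggyPost: Route = {
  method: "POST",
  path: "/series/:slug/export-words",
  guards: [admin, admin],
};
const fixedPost: Route = {
  method: "POST",
  path: "/series/:slug/export-words",
  guards: [auth],
};
const reader: User = { name: "reader", isAdmin: false };

console.log(canAccess(buggyPost, reader)); // false
console.log(canAccess(fixedPost, reader)); // true
console.log(duplicateGuards(buggyPost).join(", ")); // admin
```

A check along the lines of `duplicateGuards` is the kind of thing a small unit test over the route table could catch before deploying.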

Deploying update now. Should be working within a couple of minutes.


I’m so good at finding hidden bugs…no matter the platform :relieved_face:

I’m good at hiding bugs. Especially in plain sight.

Deploy (applying the update to the site) has completed!

It took a while this time because for some reason the script decided to delete a bunch of files from the server then re-upload all the same files… (At least it doesn’t break the site when doing that.)


yet… :smiling_face_with_horns:

hey it does what you said it does now :slight_smile: woo hoo!!!

this should make this much much easier… just export everything dump and then scan through and delete what we want to keep unknown and then import :wink: nice

will try doing this (if i find any more bugs) i’ll let you know

it looks like this replaced known words on import (instead of appending)?
Am I interpreting that correctly?

meaning we would need to keep a master known list and append to our own each time we wanted to do this?

Your known words list in the database is retained and appended to.

It’s only the uploaded import list that gets overwritten if you upload a new one, as there’s (probably?) no need to retain the import list of words to mark as known after they’ve been marked as known in the database.
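
As a rough sketch of that behavior (the `Account` shape and both function names are made up for illustration, not the site's actual schema): uploading replaces only the pending import list, while marking words as known appends to the stored list without dropping anything.

```typescript
// Hypothetical model of the two lists; not Manga Kotoba's actual schema.
type Account = { knownWords: string[]; importList: string[] };

// Uploading a new import list overwrites the previous import list only.
function uploadImportList(account: Account, words: string[]): Account {
  return { knownWords: account.knownWords.slice(), importList: words.slice() };
}

// Marking the import list as known appends any new words to the known list.
function markImportAsKnown(account: Account): Account {
  const known = account.knownWords.slice();
  account.importList.forEach((word) => {
    if (known.indexOf(word) === -1) known.push(word);
  });
  return { knownWords: known, importList: account.importList.slice() };
}

let account: Account = { knownWords: ["猫"], importList: [] };
account = markImportAsKnown(uploadImportList(account, ["犬", "猫"]));
account = markImportAsKnown(uploadImportList(account, ["鳥"]));

console.log(account.knownWords.join(" ")); // 猫 犬 鳥
console.log(account.importList.join(" ")); // 鳥
```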


thanks for clarifying… I was confuzzled by the warning at the bottom …

doing this turns out to be super easy though… especially having the whole series… for something like Takagi-san… even if a low-frequency word slips through and gets marked known when maybe it shouldn’t be, it’s not a big deal if it’s only gonna be a handful of them anyway (for looking at my manga library that is in MK, being able to see what’s known, or at least encountered before, is really helpful in sorting through some of the harder stuff for future reading choices)


That’s my number one use for the site. (Although I should really use it for learning more words. And kanji.)


did I stumble upon a bug?

so I exported all of Takagi-san and imported the whole list as known…but it’s only saying 71%

the export I used was

would have expected 100% known with the whole series if I forced known for the export

also, separately… if I uncheck “group words” I get this message

It’s possible the site hasn’t updated all the frequency lists for the volumes. I’ve clicked a button that should force it to do so the next time you view the page.

I guarantee this worked when I first implemented it. I’ll check into it. (I really need to create unit tests for all these so I can catch if something I do breaks a random thing.)


no worries… I’m sure I’ll still keep tripping over bugs that are wandering around…

for some reason I just find them… w/o even really trying to break anything :bug:


Boring solution: Look into what’s happening and fix it.

Fun solution: Re-do the entire export system.


Updated Export System

Series and Volume Level

Previously, exporting was only available at the series level.

The updated system now allows exporting a single volume.

New Menu Placement

The export option now appears on the series and volume menus.

Export Formats

The old format provided a list of words and their frequencies. This format is still available (listed as “JPDB”), but there are a few new formats available:

  • Text (Tab-delimited): Simple plain-text format with tab-separated columns
  • CSV: Comma-separated values for spreadsheets
  • Excel (XLSX): Microsoft Excel format with metadata sheet
  • JSON: Machine-readable JSON format with metadata
  • JPDB: Word list for JPDB deck import (includes duplicates)
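
For anyone planning to post-process these files, here is a minimal sketch of how the same word list could be rendered in three of the formats. The `Row` type, the function names, and the `metadata` contents are assumptions for illustration; the real exports may use different columns, headers, and metadata.

```typescript
// Hypothetical row shape; the real export has more columns.
type Row = { word: string; frequency: number };

// Quote a CSV field only when it contains a comma, quote, or line break,
// doubling any embedded quotes (the usual CSV convention).
function csvField(value: string): string {
  return /[",\r\n]/.test(value) ? '"' + value.replace(/"/g, '""') + '"' : value;
}

// One "word<TAB>frequency" pair per line.
function toTsv(rows: Row[]): string {
  return rows.map((row) => row.word + "\t" + row.frequency).join("\n");
}

// One "word,frequency" pair per line, with quoting where needed.
function toCsv(rows: Row[]): string {
  return rows.map((row) => csvField(row.word) + "," + row.frequency).join("\n");
}

// The metadata block here is invented; only "JSON with metadata" is known.
function toJson(rows: Row[], title: string): string {
  return JSON.stringify(
    { metadata: { title: title, wordCount: rows.length }, words: rows },
    null,
    2
  );
}

const rows: Row[] = [
  { word: "猫", frequency: 12 },
  { word: "ねこ", frequency: 3 },
];

console.log(toTsv(rows));
console.log(toCsv(rows));
console.log(toJson(rows, "Example Series"));
```

Tab-delimited output avoids most quoting concerns, which is why it tends to be the easiest format to paste into a spreadsheet or an import box.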

Export Options

You can still specify the minimum frequency to include in the export, and you can now optionally include your tracked and known words. The export doesn’t list them as tracked or known, but this can be useful if you want a base to create an Anki deck for others to use.

(I’ll shrink that minimum frequency box later.)

Data Fields

More columns are available:

image

  • Kanji: If the word appears with kanji in the source material, it appears here.
  • Kana: If the word appears with kanji in the source material, its kana reading is listed here. Otherwise, the kana exactly as written in the source material is used.
  • English: English translation. Text and CSV formats use only the first line of the definition. JSON and Excel formats use all lines.
  • Frequency: The number of times the word appears in the series/volume. This is based on how it appears in the manga, so 猫, ねこ, and ネコ all appearing in a manga will be three separate entries with their own frequency numbers.
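
A small sketch of two of the rules above: counting by the form as written (so 猫, ねこ, and ネコ are separate entries) and trimming a definition to its first line for the Text and CSV formats. The function names and the token list are illustrative assumptions, not the site's code.

```typescript
// Count occurrences per surface form, i.e. exactly as written in the manga.
function countBySurfaceForm(tokens: string[]): Record<string, number> {
  // Object.create(null) avoids clashes with built-in property names.
  const counts: Record<string, number> = Object.create(null);
  tokens.forEach((token) => {
    counts[token] = (counts[token] || 0) + 1;
  });
  return counts;
}

// Keep only the first line of a multi-line definition.
function firstLine(definition: string): string {
  const lineBreak = definition.search(/\r?\n/);
  return lineBreak === -1 ? definition : definition.slice(0, lineBreak);
}

const counts = countBySurfaceForm(["猫", "ねこ", "ネコ", "猫"]);
console.log(counts["猫"], counts["ねこ"], counts["ネコ"]); // 2 1 1
console.log(firstLine("cat\n(esp. the domestic cat)")); // cat
```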

Rate Limiting

I’m sure there’s room for improvement on the implementation, so for the moment, I’m not certain what issues may occur if a large number of exports are done.

As such, I’ve set some default rate limits. If you find these are too limiting, let me know what you think would be a more reasonable limit, and if I agree, I’ll adjust it.

The current limits are:

  • Generate up to 5 exports per hour
  • Generate up to 3 series exports per day
  • Generate up to 10 volume exports per day

Rate limiting applies to generating new exports. You can re-download your three most recent exports for each volume and series without impacting the rate limit.
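
To make the limits concrete, here is a minimal sketch of a check that applies all three. It assumes rolling one-hour and 24-hour windows and an in-memory history; whether the site uses rolling or calendar windows is not stated, and every name here is hypothetical. Re-downloads simply never get recorded in the history.

```typescript
type ExportKind = "series" | "volume";
// One entry per generated export; `at` is a timestamp in milliseconds.
type ExportEvent = { kind: ExportKind; at: number };

const HOUR = 60 * 60 * 1000;
const DAY = 24 * HOUR;
// Values taken from the limits listed above.
const LIMITS = { perHour: 5, seriesPerDay: 3, volumePerDay: 10 };

// Count events newer than `since`, optionally restricted to one kind.
function countSince(
  history: ExportEvent[],
  since: number,
  kind?: ExportKind
): number {
  return history.filter(
    (event) => event.at > since && (kind === undefined || event.kind === kind)
  ).length;
}

// True if a new export of this kind would stay within every limit.
function canGenerate(
  history: ExportEvent[],
  kind: ExportKind,
  now: number
): boolean {
  if (countSince(history, now - HOUR) >= LIMITS.perHour) return false;
  const dailyLimit =
    kind === "series" ? LIMITS.seriesPerDay : LIMITS.volumePerDay;
  return countSince(history, now - DAY, kind) < dailyLimit;
}

const now = 10 * DAY; // fixed "current time" so the example is repeatable
const history: ExportEvent[] = [
  { kind: "series", at: now - 6 * HOUR },
  { kind: "series", at: now - 4 * HOUR },
  { kind: "series", at: now - 2 * HOUR },
];

console.log(canGenerate(history, "series", now)); // false
console.log(canGenerate(history, "volume", now)); // true
```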


i’ll have to give it a shot later…but I did check this morning and Takagi-san still says 71%…

since you have made some major changes, I think what I’ll do is clear out all the known words and start again fresh and see what happens…will update later after brekky

never went to 100%… cleared all the known words and it hasn’t gone to zero… (doesn’t seem to be clearing on its own)

not sure how often scripts/backend things run for cleanup but something still seems fishy

gonna leave my known words list empty for now and then I’ll see about exporting Takagi-san again and importing known words later tonight or tomorrow and see what happens…

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

edit…checking again this morning haven’t done anything since last night :
known words list is empty (or should be?)

but manga vols are showing known words:

am I doing something silly or did I find another bug… ?

I won’t do anything else until hearing from you (for troubleshooting/bug chasing)… don’t wanna make things more complicated - no rush just let me know when to tinker again :wink:


I have released a proof-of-concept UserScript for syncing Natively changes to Manga Kotoba.

Features:

  • Puts a not-so-beautiful “MK Sync” button on your otherwise clean Natively pages
  • Collects manga volume status changes in real-time and stores them
  • Allows manually syncing stored changes via a button click
  • Clutters your JavaScript console with log messages

Planned for future:

  • Support syncing when a status is removed from a volume
  • Sync in real time
  • Remove “MK Sync” button

Disclaimer: This will not sync your entire existing collection to Manga Kotoba. But it includes a link where you can import from a Natively export, which is a better way to do the same thing.


I’ve loaded a copy of the database locally, so looking into this will be one of the next things I work on.


Hey hey @ChristopherFritz, we’re opening up a new book club here:

I’ve scraped and uploaded the first two volumes of 推しの子 to Manga Kotoba. If you could look into them and potentially approve them, I’d add the volumes to the OP of the book club as well!


Good news is volume one was already available!

I’ll get volume two up a bit later (after book club duties, etc.)


Not sure what happened, but logging into my profile seems to have suddenly shifted some books around in my reading/paused/owned lists. I found books I am sure I do not own in the “Owned” tab, and it also showed that I was reading “Frieren” (I was not). The wishlist seems untouched.
Maybe during some Natively mass sync with the newest feature, some items got accidentally assigned to my account?
I haven’t tried most of the new features yet.
(Also, marking words as known is really slow/buggy on PC now too.)
