(RebBlue’s response covers it, but I like writing about these stats, so:)
Eventually I want to include both total and unique known word counts. I just need to determine the best way to fit everything into the interface.
With only one of the two shown currently, consider the following stats. For a manga volume, learning the top 100 highest frequency words may give you coverage such as:
- Unique words known: 6%
- Total words known: 40%
If instead you had opted to learn the 100 lowest frequency words, not the highest, you would be looking at:
- Unique words known: 6%
- Total words known: 2%
In both of these situations, the known unique words are 6% of the volume’s unique words, but one scenario gives you 40% total word coverage and the other gives you only 2%.
Here, the total word coverage gives insight into how readable the material will be, whereas the unique word percentage tells you nothing useful about readability. (Other stats, such as the percent of sentences you can read without lookups, supplement this in either case.)
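To make the difference between the two percentages concrete, here is a minimal Python sketch. It is not how Manga Kotoba computes its stats; the `coverage` function and the toy word list are made up for illustration, and it assumes the volume has already been split into a list of words.

```python
from collections import Counter

def coverage(tokens, known):
    """Return (unique coverage, total coverage) as fractions.

    tokens: every word occurrence in a volume (hypothetical, pre-tokenized).
    known:  the set of words the reader already knows.
    """
    counts = Counter(tokens)
    unique_known = sum(1 for word in counts if word in known)
    total_known = sum(n for word, n in counts.items() if word in known)
    return unique_known / len(counts), total_known / sum(counts.values())

# Toy volume: 4 unique words, 10 word occurrences in total.
volume = ["a"] * 6 + ["b"] * 2 + ["c", "d"]

# Knowing the single most frequent word versus the single rarest word.
high = coverage(volume, {"a"})
low = coverage(volume, {"d"})
print(high)
print(low)
```

Both calls report the same unique coverage (1 of 4 words), but the total coverage differs by a wide margin, which is the same effect as the 40% versus 2% example above.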
That’s for individual volumes.
When it comes to a series where multiple volumes are on Manga Kotoba, the stats become even more interesting.
Consider the first 30 volumes of Detective Conan.
Here, to know 75% of the total words in volume one, you need to learn 56% of its unique words (848 words).
But for those 30 volumes collectively, knowing 75% of the total words requires learning the top 23% of unique words (which, across 30 volumes, is 3,758 words).
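The "how many unique words do I need for 75% total coverage" question can be sketched the same way. Again, this is only an illustration under assumptions: `words_needed` is a hypothetical helper, the volumes are toy data, and it assumes you learn words strictly from most frequent to least frequent.

```python
from collections import Counter

def words_needed(tokens, target):
    """Count how many of the most frequent words reach `target` total coverage.

    Returns (number of words, that number as a fraction of unique words).
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    covered = 0
    needed = 0
    # Walk the words from most frequent to least frequent.
    for _, n in counts.most_common():
        if covered / total >= target:
            break
        covered += n
        needed += 1
    return needed, needed / len(counts)

# Two toy volumes sharing their common words but not their rare ones.
volume_1 = ["a"] * 6 + ["b"] * 2 + ["c", "d"]
volume_2 = ["a"] * 6 + ["b"] * 2 + ["e", "f"]

single = words_needed(volume_1, 0.75)
series = words_needed(volume_1 + volume_2, 0.75)
print(single)
print(series)
```

In this toy data the same two words reach the target in both cases, but they are half of the unique words in one volume and only a third of the unique words across both, because each added volume brings new rare words while the common ones repeat.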
At the series level, the unique word percentage becomes less and less reliable as more volumes of a series are added to the site.