Introducing Natively - Now with Movies & TV Shows, Korean, Spanish, German and More!

sweetbeems · June 28, 2021, 3:33am

Woohoo!! I’m so happy to hear that, glad you like it!

seanblue · June 28, 2021, 4:50am

I do like how the profile shows specifically favorites and recently finished. I just think favorites could be improved by being a custom list instead of being assumed by 5 star ratings. So I think general custom lists could be useful, but maybe always include a favorites list (which can act similarly to other custom lists) so it can still keep a dedicated spot on the profile. Or maybe instead of that, allow users to designate a “featured” list, so they can include any custom list on their profile. That might be better since “favorites” wouldn’t need to be special, and it’s more flexible.

seanblue · June 29, 2021, 12:05am

Nichijou is missing the WK icon even though it has a WaniKani bookclub (which is already on the first volume’s page).

seanblue · June 29, 2021, 12:12am

Apparently I found this series much much harder than @Belerith did.
(It was one of the hardest manga I’ve read in the last couple years for reasons I can’t figure out or explain.)

(Sorry for the off topic post.)

sweetbeems · June 29, 2021, 4:50am

People certainly see things differently.

Eventually I’d love to do some analysis on the books with the largest variance in gradings… at the very least list them out. I’m sure that as written reviews accumulate, a lot of the trends in the high variance books (kanji, dialects, weird content topics…etc) would become apparent.

Belerith · June 29, 2021, 6:34am

That’s really interesting since I found 狼少年 very smooth, easy to read.
But also, the manga that came up for grading were on the more difficult side (by my perception) and it would probably look less ridiculous had, say, レンタルおにいちゃん or にじいろフォトグラフ or something come up… Did you read any of the manga I graded it against other than きんいろモザイク? I did think about that grading for a while. For some reason 4koma always feel more difficult to me, though, so I ended up choosing ‘easier’ over ‘roughly the same’.

(If there were options for much easier, I’d have chosen that for the comparison with 累 at least. Probably けいおん！ as well.)

Natsuha · June 29, 2021, 8:32am

Hardly anything I own is on that site. I will need to request all my manga and books to be added.

The difficulty level may be off. I mean, how can I judge the difficulty before I have actually read it?

seanblue · June 29, 2021, 11:09am

Nope, that’s the only one. And in that case, I don’t truly remember how difficult きんいろモザイク is. I just associate it as being similar to ご注文はうさぎですか in difficulty, which is how I made my decision. I do generally agree that 4-koma manga are harder. There was just something about 狼少年… I found myself having to reread sections much more often than usual.

By the way, if you ever want to read a much easier 4-koma, I’d recommend ひとりぼっちの◯◯生活. Very light on text, but still very cute and funny.

Hilbert90 · June 29, 2021, 11:34am

It seems most of my thoughts have been expressed (joined and added a book yesterday).

There are so many posts, so maybe this was addressed. Are you planning on adding a “confidence factor” so that books with more comparisons are “stickier,” or did I see that you don’t want that for some reason?

I’m pretty sure the chess version has this. We’re all very certain of Magnus Carlsen’s rating, so if he loses to a 1900 player in a 100-person simul match or something, his rating isn’t going to change noticeably. But if someone goes and joins Lichess today, their rating will swing wildly for the first 10-20 games and then start moving less and less as their rating gains confidence.

Getting that factor into the algorithm should make it so that wider comparisons can be made without one person being able to manipulate big swings.

The levels are completely new, so no one actually knows what a “level 30” book looks like. If the notion of “level 30” difficulty drifts as a result of more comparisons, I think this is good! All that matters is an ordering with a lot of confidence. Once a huge number of comparisons have been done, only then will we find out what “level 30” looks like.

You could even list the confidence as part of the book info. Like 本好きの下剋上, Level 30, Confidence: 77%.

(For the record, I think 本好きの下剋上 is harder than キッチン, but I wasn’t asked to make that comparison, and it’s currently rated the other way around).

Anyway, this site is awesome. Keep it up!

Oh, and also, are you manually adding all these books? I have a bunch I’ve read, but I feel bad adding them since they aren’t popular books and hence won’t get any comparisons, which is the point. I’d just add them myself if that was a feature.

pocketcat · June 29, 2021, 11:39am

IIRC @sweetbeems has previously said he’s working on automating the process now that he has access to an Amazon API. But re: unpopular books – I say add 'em. I’m sure I read unpopular books as I routinely buy used paperbacks off eBay, but I tend to think if the title interests someone and it strikes them as the right level it might inspire them to pick up a copy too, and then there will be more ratings!

Hilbert90 · June 29, 2021, 12:27pm

This is exactly how I got so many too.

seanblue · June 29, 2021, 1:37pm

Another example of where allowing a few custom comparisons would be helpful.

That would be pretty cool if the data is available for it.

I added two manga so niche that they aren’t on AniList or MAL.

sweetbeems · June 29, 2021, 1:49pm

I wouldn’t worry too much about the difficulty - if you really aren’t sure I’d simply put it at level 30. I do have a temporary rating system where the rating fluctuates wildly for the first few gradings in an attempt to place the book. I know it’s a bit confusing to have to guess and eventually I’ll simply have an ‘unrated’ option, I just haven’t added it yet.

I’m bummed to hear not many of your things are on the site! I’ll soon have an amazon search option where you can request to add things very easily.

Long discussion of rating systems

So for the official chess ranking system, they use an Elo system which does not have any certainty factor, but it does have a max rating change factor - if Magnus lost to a 1900 player even in an official tournament, he’d only lose around 32 Elo points. There’s no sticky factor.
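To make the "max rating change" point concrete, here is a minimal sketch of a standard Elo update. The function names and the K value of 32 are assumptions chosen to match the figure mentioned in the post; this is not Natively's actual code.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return the new (A, B) ratings; score_a is 1 for a win, 0.5 draw, 0 loss.

    k caps how far a single result can move a rating (assumed value).
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# A 2850-rated player loses to a 1900-rated player.
new_top, new_low = elo_update(2850.0, 1900.0, score_a=0.0)
points_lost = 2850.0 - new_top  # just under k, however big the upset
```

Because the update depends only on the two current ratings and K, the number of games either player has already played does not enter into it, which is the "no sticky factor" behaviour described here.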

Lichess is an interesting call out and something I modeled my ‘temporary ratings’ after (yes, when a book is added the first 5 gradings change the book’s rating drastically, similar to Lichess’s system). You may not have encountered the temporary ratings for the book you added if you immediately graded it more than 5 times, but around the site you’ll see some books’ ratings with a ‘??’ which means they don’t have many gradings. I certainly could improve this system a bit, but it does work moderately well.
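One way to picture the temporary ratings is a larger step size while a book is still being placed. The sketch below is only illustrative: the cutoff of 5 gradings comes from the post, while both K values and the rule that ‘??’ disappears at the same cutoff are assumptions.

```python
PROVISIONAL_GRADINGS = 5  # from the post: the first 5 gradings move a book drastically
K_PROVISIONAL = 128.0     # assumed value, not Natively's actual setting
K_SETTLED = 32.0          # assumed value


def k_factor(gradings_so_far: int) -> float:
    """Use a large step while the book is being placed, a smaller one afterwards."""
    return K_PROVISIONAL if gradings_so_far < PROVISIONAL_GRADINGS else K_SETTLED


def display_level(level: int, gradings_so_far: int) -> str:
    """Append '??' while the rating is still provisional (assumed to share the cutoff)."""
    suffix = "??" if gradings_so_far < PROVISIONAL_GRADINGS else ""
    return f"{level}{suffix}"
```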

Now Lichess and Chess.com do have a ‘certainty’ factor in their algorithms as they use the other most popular skill rating system, Glicko. It’s pretty similar to Elo but the handling of the certainty factor complicates things a bit. Essentially, as you play chess matches the certainty factor quickly becomes very certain, but every so often (maybe once a month, depends on the configuration), the certainty factor is bumped down. This mitigates issues where a player might play a lot of chess, leave for 2 years and come back at the same rating. In this case, Glicko would treat that rating as uncertain and the rating would fluctuate wildly, whereas Elo would need quite a few more matches to figure out the right rating. Do note though, this ‘certainty’ factor is different from the mechanism they use to onboard new players, which is just some custom handling they built.
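The "bumping down" of certainty can be sketched with the rating-deviation (RD) step from Glicko-1, where RD grows with the number of rating periods a player has been inactive and is capped at 350. The value of `c` is the timing factor that has to be tuned; the one used in the example is only an illustration, picked so that an RD of 50 decays back to 350 after 100 idle periods.

```python
import math

MAX_RD = 350.0  # Glicko-1's rating deviation for a completely unrated player


def inflate_rd(rd: float, periods_inactive: int, c: float) -> float:
    """Glicko-1 pre-period step: uncertainty grows with inactivity, capped at MAX_RD."""
    return min(math.sqrt(rd ** 2 + c ** 2 * periods_inactive), MAX_RD)


# Illustrative tuning: c chosen so RD 50 returns to 350 after 100 idle periods.
c_example = math.sqrt((MAX_RD ** 2 - 50.0 ** 2) / 100)
rd_after_two_years = inflate_rd(50.0, periods_inactive=24, c=c_example)  # monthly periods
```

A larger RD means the next few results move the rating more, which is why a returning player's rating fluctuates at first.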

That Glicko system, which I have played around with in other contexts, is a bit more complicated to deal with as you also have to tune your timing factor on bumping down your certainty (I’ve often had poorer results with it because of this). It also isn’t optimal for this use case (just like Elo isn’t perfectly optimal either), as the inherent book difficulty isn’t changing like a player’s inherent skill might be… so Natively shouldn’t care about recency like these skill rating systems do.

I did think about using that algorithm as well, but it wouldn’t really offer the certainty you’re looking for (it becomes certain quickly and expects to be bumped down) and so is unnecessarily more complicated, which is why I went with Elo. Eventually, as has been discussed a bit before, I will need to improve the algorithm and switch to something more optimal for book difficulty, but Elo is doing a pretty good job for a V1 and, most importantly, it’s allowing me to collect the correct sort of comparison data. Figuring out a really optimal algorithm when I’m in relatively unexplored territory is something which may take a while, which is why I’ve pushed it off while Elo has been performing relatively well.

With regards to doing more gradings, I can’t really allow people to do a ton of gradings on one book in the current system without potentially adding a lot of volatility. I had a message in this thread I was going to link you to on this, but somehow I deleted it. Essentially, if someone came on and thought Harry Potter 1 was insanely easy and was allowed to grade it 100 times, they could potentially push Harry Potter 1 from lvl 30 → lvl 20 all by themselves, even if it had 1000 previous comparisons on it. Elo / Glicko / other skill systems don’t really care about how many previous matches / comparisons have occurred, as any recent gradings are heavily weighted.
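The volatility concern can be illustrated with a toy simulation. It assumes plain Elo with K = 32, that a book judged "easier" counts as a loss, and that every comparison book offered sits right at the graded book's current rating. The rating scale is in Elo points, since the mapping from Elo to levels isn't described here.

```python
def expected(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def simulate_single_grader(start: float, n_gradings: int, k: float = 32.0) -> float:
    """One user repeatedly grades the same book as easier than its comparison book."""
    rating = start
    for _ in range(n_gradings):
        opponent = rating  # assumption: comparisons are drawn near the current rating
        rating += k * (0.0 - expected(rating, opponent))  # "easier" = a loss
    return rating


# Nothing in the update knows how many comparisons came before, so each
# grading costs the same k/2 points and the drift never slows down.
after_ten = simulate_single_grader(1500.0, 10)
```

The sketch leaves the comparison books' own ratings untouched for simplicity; in a real Elo update they would move in the opposite direction.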

This is also why I’m hesitant to add a ‘certainty’ metric as it’d mostly be me just making it up. Once I get a more optimal algorithm, perhaps.

If you want more information on the grading system, there was an even longer discussion from about 3 days ago in this thread which you should check out.

As @pocketcat said, I am in the process of drastically improving my book addition mechanism, and it should be much more automated. In general though, you should add all the books you like! It’s my issue to solve how to quickly add them.

NickNickovich · June 29, 2021, 4:14pm

I will say straight away that I don’t know much about math and statistics, so sorry if this is super obvious.

I agree with you that the Elo rating system is doing a good job at the current stage, it helps to relatively quickly establish the position of the book compared to other books. But I can also see that you are aware of its limitations.
The two main problems of the Elo rating system for books, as I see it, are:

It was created to show a “current/dynamic” rating of a player, because it’s assumed that it always changes. When we try to determine a rating for a book, we are looking for some “objective” rating.
Elo rating is a “race to the top” type, so higher rating is always better. This becomes a factor when we try to manipulate the K-factor. In chess K=32 for players with rating below 2100, K=24 for players with rating between 2100 and 2400, K=16 for players with rating above 2400.

In my understanding, the time factor for books is basically irrelevant, so the variables you are left with are the number of gradings and the number of users who graded the book. I don’t know if you are already doing this, but a simple way to count that in would be to use something like this:

$$K=\frac{1000}{1000+N_{g}+10N_{u}}$$

$N_{g}$ - number of gradings
$N_{u}$ - number of users who graded
1000, 10 - arbitrary constants

For example, at the time of writing, よつばと! 1 has 117 gradings from 18 users, so K-factor will be 0.77 instead of 1. There are, obviously, more elegant ways to calculate weighted K, this is just the one that came to mind.
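As a quick check of that arithmetic, the formula can be written as a small function. The function and parameter names are made up for illustration, and treating K as a multiplier on the usual step size is an assumption based on the "0.77 instead of 1" wording.

```python
def weighted_k(n_gradings: int, n_users: int,
               base: float = 1000.0, user_weight: float = 10.0) -> float:
    """K shrinks as a book accumulates gradings and distinct graders.

    base and user_weight are the arbitrary constants 1000 and 10 from the post.
    """
    return base / (base + n_gradings + user_weight * n_users)


# よつばと! 1 at the time of the post: 117 gradings from 18 users.
k_yotsuba = weighted_k(117, 18)  # 1000 / 1297
```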

Using those numbers, you can probably generate approximate confidence level as well.

In the long term, I think the method Nicole mentioned is the way to go, but then again, I feel that people are naturally drawn to having a number describing the difficulty, so the information you gather now will be valuable even if the method changes, because pairwise comparison doesn’t produce numbers itself (if I understand correctly).

NicoleIsEnough · June 29, 2021, 5:27pm

Yep, that’s my understanding as well. But the current Elo number is invisible, so in that respect I don’t see a difference to not having a difficulty number at all. Of course we‘d want a level number, which I assume is derived from the Elo number somehow at the moment? For the ranking I suggested, one could define equivalence classes to determine which books go into the same level.

sweetbeems · June 29, 2021, 7:09pm

Right, I certainly could do something like this! This is something along the lines of what I was thinking about if I made books with a lot of gradings ‘stickier’, but it’s not entirely without potential complications. One such complication is that books which have a lot of gradings are less receptive to changes in the general rating system … for example if a bunch of new books are added and slightly change the notion of a lvl 30 book, the ‘sticky’ books will react to this change slower.

Another thing to consider is I really only care about the diversity of users’ opinions of a book, rather than the number of gradings. While your solution does incorporate both, it’s not entirely straightforward to figure out the right correspondence. Your solution also doesn’t totally solve the grading bias issue and it may make things worse, I’m not sure. For instance, 1Q84 already would be quite sticky in this situation, but perhaps that’s just because it started there… perhaps @NicoleIsEnough’s perception of its difficulty is actually more popular. This stickiness would prevent it from quickly moving to a better grading.

Now, that’s not to say that ultimately it’s not better than my current solution and perhaps it would allow more gradings, so it could be a better solution than my current solution! And it’s potentially something I can implement. However, what I do find attractive about my current solution is that it’s incredibly simple and the warts are easy to mitigate, whereas I think this (slightly) more complicated solution might have more hidden effects, like the stickiness of bad gradings.

Yep, I agree with both these statements! Any new algorithm I implement will, for all intents and purposes, be invisible to everyone. You’ll still do the same comparisons and you’ll still see difficulty levels as they appear now. While the pairwise comparison may not generate a rating number, I’m sure I’ll be able to figure out some level correspondence from the ‘group’ ordering it generates.

And just to reiterate, as I think it’s important to, all the comparison data you’re currently submitting will still be the core data I continue to collect in any new algorithm. So, there’s no need to worry that any gradings you do will be for naught!

Edit: I will say too - @NickNickovich and @NicoleIsEnough I really do appreciate all the thought both of you are putting into the grading algorithm! It’s super helpful and it’s a tricky issue to figure out a better algorithm. I do like @NickNickovich’s approach, it’s a bit more elegant than what I was initially thinking.

seanblue · June 29, 2021, 7:46pm

I like anything that lets me grade All The Things^TM

But more seriously, there’s probably a way to adjust this K-factor like Nick suggested that would let users grade more without too much adverse effect to the entire grading system.

By the way, do you have some kind of issue/feature tracker for us to look at? Not for all the smaller issues or bugs, but just for the larger features.

seanblue · June 30, 2021, 2:14am

I find it a little odd that the book comparisons are part of the first book in a series rather than for the series itself. This manifests in the following ways:

The comparisons show up on the first book’s page, not the series page. This is particularly weird since the Gradings section is always blank for non-first volumes of a series.
The comparisons that you can make say they are for the first volume, but they are applied to the whole series. This appears when doing comparisons, when looking at comparisons on a book’s page, and when looking at your own past comparisons from the “My Difficulty Gradings” page.

I get that this may be easier since everything has a “first” book, either because it’s literally the first book in the series or because it’s the only book and not part of a series. But I do think this can be improved quite a bit.

NickNickovich · June 30, 2021, 8:48am

I don’t think that situation is very likely to happen. What I can imagine happening at this point is that a bunch of books can create a “bubble” where they all have consistent levels within their group, but these levels are different from more established levels. I think that will just be a temporary fluctuation, and with time they will get a proper rating once the distribution flattens. They won’t change “the notion of level 30”, they will just be wrongly ranked as level 30 for a while.

1Q84 has 60 gradings from 4 users (what?), so even using my unoptimised formula the K would be:

$$K=\frac{1000}{1000+(60-1)+10(4-1)}=0.918$$
(I used N-1 instead of N so that K is 1 on the first grading from the first person. I didn’t want to add that -1 to the original formula because it makes the formula less readable.)
So, not that sticky.

You can also make it so K is constant until a certain number of users who graded the book and after that it starts to decrease.
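Both tweaks, the N-1 adjustment and holding K constant until enough users have graded, can be sketched together. The names and the `min_users` parameter are hypothetical, and this is only one reading of the suggestion: as written, K jumps straight from 1 to the formula's value once the threshold is passed.

```python
def sticky_k(n_gradings: int, n_users: int, min_users: int = 0,
             base: float = 1000.0, user_weight: float = 10.0) -> float:
    """K with the N-1 adjustment, held at 1 until more than min_users have graded."""
    if n_users <= min_users:
        return 1.0  # not enough distinct graders yet: full-size updates
    gradings = max(n_gradings - 1, 0)  # first grading leaves K at exactly 1
    users = max(n_users - 1, 0)
    return base / (base + gradings + user_weight * users)


# 1Q84 at the time of the post: 60 gradings from 4 users.
k_1q84 = sticky_k(60, 4)  # 1000 / 1089
```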

Another suggestion for mitigating the single-user bias would be to increase the number of available gradings as the number of users who graded the book increases:

Baseline is 6 gradings.
When the number of people who graded the book is more than 4 AND the number of gradings is more than 16:
The number of gradings is increased to 12.
When the number of people who graded the book is more than 8 AND the number of gradings is more than 48:
The number of gradings is increased to 18.
When the number of people who graded the book is more than 16 AND the number of gradings is more than 72:
Unlimited number of gradings.

The numbers are totally arbitrary and will probably need to be different for different media. For example, some of the books I added will be lucky if they get two or three other people who’ve read them, while Yotsuba already has more than 16.
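The tiers could be expressed as a simple lookup. This is a sketch using the arbitrary thresholds from the post; the function name is hypothetical and `math.inf` stands in for "unlimited".

```python
import math


def max_gradings_per_user(n_users: int, n_gradings: int) -> float:
    """How many gradings one user may submit for a book, by how established it is."""
    if n_users > 16 and n_gradings > 72:
        return math.inf  # unlimited
    if n_users > 8 and n_gradings > 48:
        return 18
    if n_users > 4 and n_gradings > 16:
        return 12
    return 6  # baseline


yotsuba_limit = max_gradings_per_user(18, 117)  # 18 users, 117 gradings
niche_limit = max_gradings_per_user(3, 10)      # a book only a few people have read
```

Checking the tiers from the top down means a book always gets the most generous allowance it qualifies for.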

Hilbert90 · June 30, 2021, 10:54am

It took me a while, but it finally clicked in my brain what you’re concerned about. You keep saying “if so-and-so rates a book 100 times” and my brain was like, “how would they do that?”

I kept overlooking the fact that allowing all/large level differences would potentially show a single person a huge number of comparisons, and as the level got pushed to a different place, it would open up more comparisons, allowing it to get pushed more.

That’s a really interesting problem.
