Really? I felt like 1Q84 and 容疑者 were about the same
*checks grading*
And that’s how I have graded them too.
And this is what makes a site like this so useful. Just look how wonderfully different all our impressions of just these few books are.
I already shook my head enough about this decision of yours
Now, in all seriousness, I think this all boils down to the questions of “what aspects is ‘difficulty’ comprised of” and “how much am I affected by each aspect”.
The main aspects I can think of right now are kanji usage (what words are written in kanji, which kanji are used), vocab usage (this is of course somewhat related to kanji usage), grammar usage, writing style and contents.
I agree that the three books are pretty close in terms of grammar usage and writing style. But when it comes to kanji and vocab usage, my feeling is that Murakami’s level is way above the others’. I can totally see that this does not matter too much to somebody like @Naphthalene or @Belerith but it does matter to me
I know that this of course depends on the kanji and vocab I chose to study until now. But as this site tries to build a loose connection to WK and JLPT levels, I think it should be fair to grade a book as more difficult than another one if it contains more unknown kanji for a person at a given WK level.
I actually remember that I was around WK level 33-38 when I read 容疑者, and I would literally discover freshly learned kanji each time I read it, while the number of kanji I didn’t know was not that high (maybe on average 1-3 per page or so? And they got repeated often). While for 1Q84 I’m constantly looking up unknown kanji.
Yes, when it comes to technical background knowledge, probably すべて is the hardest. But we did have a bunch of maths talk in 容疑者 as well… Did that not affect you as much? Or was it easier to skim over as it did not tie so tightly into the main story?
That’s what I value as well! And that’s why I would be a bit sad if we were only able to compare neighbouring books, which to me feels more like only manifesting the existing gradings instead of being able to stir things up a bit if needed.
I’m just better with maths than computers. (I even took the advanced maths track (Mathe-LK) in school once upon a time, not that that’s super complicated maths, haha)
But yes, I think the fact that it didn’t matter so much to the main story / wasn’t as omnipresent also helped, like you said.
Right, that was my point about local optimums earlier. I’d like to also add another factor. Since I’ve read some harder manga (particularly in terms of kanji usage), but haven’t read any previously graded hard manga, I mostly got to compare them with books. This can only be so useful. How am I supposed to compare ご注文はうさぎですか with 夜市 for example? They are difficult in different ways, so can’t be easily compared. I think it’s important that comparisons happen both within a media type and between types.
Wow lots of chatter this morning!
The limitation would be how many ‘custom’ comparisons you could do per book. Perhaps it’d be 4, perhaps 6.
The reason why I prefer to limit how many gradings you do is mostly due to Elo. Elo does not give ‘weight’ to a book with more comparisons, so allowing unlimited gradings would let someone change コンビニ人間 from level 28 to level 33 if they wanted to. This would be especially bad as a lot of people have graded コンビニ人間, so new people looking at the book would expect it to be graded well since we have so much data… but in fact it’d be mostly graded by only the last person.
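To illustrate what I mean, here’s roughly what a plain Elo update looks like (just a rough sketch in Python, not Natively’s actual code; the K value and function names are made up for illustration):

```python
import math

K = 32  # fixed step size; note it does NOT shrink as a book accumulates comparisons

def expected(r_a: float, r_b: float) -> float:
    """Probability (under standard Elo) that book A is judged harder than book B."""
    return 1 / (1 + math.pow(10, (r_b - r_a) / 400))

def update(r_a: float, r_b: float, a_was_harder: bool) -> tuple[float, float]:
    """Apply a single 'which book is harder?' comparison to both ratings."""
    e_a = expected(r_a, r_b)
    s_a = 1.0 if a_was_harder else 0.0
    delta = K * (s_a - e_a)
    return r_a + delta, r_b - delta

# Whether a book has 10 or 1000 prior comparisons, each new grading moves it by
# up to K points, which is why a burst of gradings from one person can drag
# a well-established rating.
```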
I do appreciate that it seems odd not to collect comparisons when people are willing to submit them, especially at this early stage… but I’m balancing spending a lot of time solving the aggregate rating algorithm’s limitations (Elo) vs rolling out a lot more engaging features. I will solve this eventually; I just think the limited custom comparisons feature I’ve laid out will probably resolve a lot of this sort of frustration (i.e. you could say you thought 1Q84 is harder than すべて) until I do get around to optimizing the algorithm.
I think this is the crux of the matter - the difficulty ratings are very much an approximate, monolithic measure. They will never be totally accurate for each individual person, and certain books will have a wider variance in people’s perception of their difficulty. I could certainly show how large a variance a book has in the future and I think that’d be illuminating, however I don’t think I’ll dig into the specific nature of the difficulties… it’s too complicated I think.
Rather, these sorts of things should be hashed out in written reviews. I think written reviews are much better at giving someone a sense of a book than a rating number. The difficulty rating does set the context for a review, which is helpful and allows for easy filtering / ordering, but yeah, saying a book has a lot of maths talk or difficult grammar, that’s well suited to a review.
I’m not sure if you’re implying this, but others have certainly been under the impression that Natively levels are somehow related to WK levels, which I must admit they are not. I made the levels with no thought to WK, and difficulty is much more than kanji & vocabulary… so while there may be a loose correlation, it is certainly not something I’m trying to do; it may simply occur by accident.
This is something I’ve thought about implementing sooner rather than later, you’re correct. I think I probably should prioritize same-media-type comparisons (and that’s relatively easily done) because yeah, they’ll be less variable and probably just better comparisons!
Exactly!
I’m a bit confused here. Are you saying that a book with 100 comparisons can be easily manipulated by just a few more comparisons by a new user? Or would that only happen if the new user did dozens or hundreds of comparisons? What if the book already had 10000 comparisons? In that case, could a new user doing 100 comparisons still shift the difficulty rating significantly?
This would definitely be helpful, yeah.
Overall, while I think you can afford to wait on making significant changes to the algorithm, allowing unlimited ratings by accounting for it in the algorithm, etc., I do think a few changes are needed sooner rather than later. Making sure comparisons happen within a type (even if you have to widen the level range from 5 to 10 or something like that) seems important to me. And really just manga and non-manga. I think novels and light novels can and should be considered the same group for comparisons. In any case, this feels more urgent to me than things like author and publisher.
Honestly, even just upping the limit from 6 to 10 might be a nice start. If you think that wouldn’t give a single user too much influence. While you want to avoid giving a single user too much power, for less-read manga, it would be better to give that single user accurate power at least. Right now, I feel like I can’t even give accurate scores to the manga I requested because I only got six comparisons. Several of those manga are unlikely to get comparisons from other users any time soon.
You see now why I want to limit gradings. A book with 1000 comparisons is treated the same as one with 10. Not optimal, but not a major issue quite yet… though it will be something I need to address! It’s also not as simple (I don’t think) as making books with a lot of comparisons very sticky… that can cause other issues, as the notion of what a ‘level 30’ book is may change slightly, or perhaps the levels of the books it’s compared to may change… and you want those changes to permeate throughout the system reasonably quickly, which is something Elo does well! And the latter is pretty important for a nascent grading system.
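(If it helps, “sticky” could hypothetically mean something like shrinking the update size with the number of past comparisons. Purely illustrative, not something I’ve implemented:)

```python
def sticky_k(base_k: float, n_comparisons: int) -> float:
    # Hypothetical only: let a book's rating move less the more comparisons it has.
    # The downside mentioned above: if the meaning of "level 30" drifts over time,
    # heavily-compared books would be very slow to follow that drift.
    return base_k / (1 + n_comparisons / 50)
```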
The current system I think is a reasonable compromise - it takes a few users with a very different notion of a book’s difficulty to change a book by 2-3 levels. Nicole, for instance, I think mostly changed 1Q84 from 33 → 34. It’d be rather unlikely for a large rating change to happen unless the book’s difficulty actually is underrated.
That’s probably right, I’d certainly do it before author and publisher.
I think I’d prefer to allow you to do 2 / 4 custom comparisons in addition to your six, but yeah I’ll think on it.
And I don’t know, it seems like your manga ended up right where you expected them to!
Edit: And I want to emphasize again, I just think this is a good approximation for the current situation. Eventually I’ll certainly have difficulty rating changes take into account all past comparisons as well.
That is so odd… that just screams to me that this algorithm isn’t a good fit for this use case.
To an extent. Several of them landed exactly on level 20, and I think that’s partly because they were all compared to each other, plus only 1-2 previously added manga. Funny enough, I think after I removed a few comparisons and redid them, the higher level manga difficulties ended up worse than before (specifically, きんいろモザイク is too high now, though I think doing a few custom comparisons would resolve that).
Partly, this is also probably a matter of my perception being bad. I see 26 and 30 and think “these are similar difficulty”. But when 99% of manga are between level 20 and level 35, the difference between 26 and 30 becomes more significant.
Now I just need to get other people to read all the Manga Time Kirara 4-koma manga so I’m not the only one deciding their relative difficulties.
It’s not optimal and it will be something to change, but like I said, I think it is a pretty good approximation for the current state. The essential difference between my utilization of Elo for book difficulty vs for chess is that a chess match done yesterday is more valuable than a chess match done years ago… whereas that’s not true for a comparison of books. A comparison done years ago is just as valid as one today. A book’s difficulty does not change, whereas a player’s inherent skill may change.
This means that the algorithm for book difficulty can be made more accurate than an Elo skill rating system, agreed, but it’s not entirely straightforward! And from my experience with rating systems, you have to tread carefully.
That does look accurate actually… but perhaps I can name it better. It’s really ‘most written reviews’.
Oh, I understand now! I skimmed the Wikipedia article on Elo, but I didn’t catch the part about recency. Makes complete sense for chess, as you said!
Yeah, absolutely. Sounds like it would be really hard to get it right!
Oh, I see. Maybe having a “most rated” would also be useful then. That’s what I thought it meant.
By the way, it is a little confusing that some of the sort options show an entire series and others show individual volumes. Not sure I have a good suggestion for that though. Maybe a toggle for series/individual, assuming all sort options can make sense for both. But I’m not actually sure that would be better.
I was also going to ask that you show half-stars for average scores, but then I noticed you’re doing a continuous fill of the 5 green stars to show exact scores. That approach doesn’t work well, though. Because of the gap between the stars, a continuous fill makes (for example) all scores between about 76% and 85% look exactly the same. You may want to consider filling each star individually, and perhaps rounding to the nearest 10% / half star visually (you could show the exact rating on hover or something like that).
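In case it helps, here’s roughly what I mean (a hypothetical sketch, none of these names come from your actual code):

```python
def star_fills(score: float, step: float = 0.5) -> list[float]:
    """Per-star fill fractions for a 0-5 score, rounded to the nearest half star.

    Hypothetical helper: each of the 5 stars gets its own fill, so e.g. a 3.8
    rounds to 4.0 and renders as [1, 1, 1, 1, 0], while the exact score could
    still be shown on hover.
    """
    rounded = round(score / step) * step
    return [max(0.0, min(1.0, rounded - i)) for i in range(5)]
```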
Well it probably wouldn’t, as Elo simply doesn’t acknowledge historical matches - it only holds that information within the rating. That is a relatively ok approximation for chess for the reason I described above, but for my use with book difficulty it probably doesn’t make sense!
And this sort of recency bias is pretty fundamental in any skill rating system you’ll find, which makes sense. Glicko, the other most popular rating system, has a time variable baked into the algorithm. So I’m somewhat in unexplored territory here, but like I said, it just means I can make the algorithm better than a normal ‘skill’ use case, as I don’t need to worry about inherent book difficulty changing.
I do, however, need to worry about the biases which graders introduce, which is why I need to limit gradings for the time being, until I can leverage past gradings in an effective manner.
Right, I do have a series / individual book toggle on the to-do list… where I would have some disabling effect for certain sorts (like most reviewed) to make this more clear, but yeah it’s a tad unfortunate.
Hah! Good catch, I should do that
I’m sorry, but I still don’t get it… If a book with 1000 comparisons is treated the same as one with 10, why do you need to limit the number of comparisons then?
And I’m probably too inexperienced and/or too naive regarding these rankings, but thinking about the problem as a whole, it’s not clear to me why you would want to create and constantly adjust an invisible Elo number when basically everything you’d need is a partial order of the books based on their pairwise comparisons?
I don’t actually think that was me - as I had mentioned, I only discovered the situation with 1Q84 at level 33 a short while ago, when I had already rated it. So there seem to be others who think it’s more on the harder side.
But if you only allow comparisons between relatively close books (level-wise), then you’re basically removing the ability for large rating changes, no? Because with Elo, the amount of change depends on the difference between the involved parties as far as I understand.
I think that’s only because you use each grading to adapt the book’s invisible Elo number (this is how the Elo algorithm works to my understanding).
Well, I would only be able to grade it against 100 different other books, right? In this case, my additional data should be used only to refine the relations between the books, but should not keep adding weight to the same book. That’s what I’m grappling with at the moment. The gradings should modify the edges of the graph, not the nodes.
If you were to use each grading only as additional ranking information (to help turn the books into an ordered sequence), then many gradings would not necessarily move one book, but they would influence the relative ordering of the books.
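Just to sketch what I have in mind (purely illustrative Python, assuming the comparisons form a directed acyclic graph of “easier than” edges):

```python
from collections import defaultdict
from graphlib import TopologicalSorter  # stdlib since Python 3.9

# Each comparison becomes an edge "easier -> harder"; grading the same pair
# again only reinforces (or flips) that edge instead of piling onto one node.
comparisons = [
    ("コンビニ人間", "容疑者"),
    ("容疑者", "1Q84"),
]

predecessors = defaultdict(set)
for easier, harder in comparisons:
    predecessors[harder].add(easier)

# Any topological order of this graph is a ranking consistent with the partial
# order; books never connected by a path simply stay incomparable.
print(list(TopologicalSorter(predecessors).static_order()))
# ['コンビニ人間', '容疑者', '1Q84']
```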
Could you run that “the algorithm is suitable” bit past me once again please?
Therefore I believe that those kinds of algorithms are not suitable in general for our ranking problem…
I don’t see that the percentage of readers doing a given comparison is an important aspect, as you would get a partial order anyway, and you would need to work with that no matter whether each edge expresses the opinion of 100% of the users or only of a small fraction.
Research on ranking based on non-numerical pairwise comparisons is quite young, it seems, so maybe it’s about time to part ways with tradition. I found a guy who did a lot of research in that area:
Especially the Janicki-Zhai paper from 2012 seems quite interesting.
It also ties into the notion of possibly having more comparison options than simply “harder” and “easier” - something I was actually wondering about as well recently, as I think you will get better results faster if you allow something like “much easier” and “much harder” as well. This might be especially interesting because of the low number of gradings that you will usually have per pair.
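For example (completely made-up numbers, just to illustrate the idea), the extra options could feed into whatever update you use as fractional outcomes instead of pure win/lose:

```python
# Hypothetical mapping from comparison choices to an outcome score for book A,
# where 1.0 = "A is much harder" and 0.0 = "A is much easier".
OUTCOME_SCORE = {
    "much harder": 1.0,
    "harder": 0.8,
    "about the same": 0.5,
    "easier": 0.2,
    "much easier": 0.0,
}
```

A stronger statement would then push the pair further apart per grading, which seems valuable when most pairs only ever get one or two gradings.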
Well yes, like I said it’s approximate and can be improved!
Wow, thanks for that! To be clear, I am certainly not an expert in this area and like I said, I will have to dedicate more time to research and develop a better algorithm, so really appreciate you passing that along!
I do still think, however, that Elo is doing a pretty darn good job currently and it is allowing Natively to collect the correct sort of comparison data. Yes, there are warts with regards to trying to prevent grader bias, but I think it’s pretty good for a V1!
If the pairwise algorithm is easy to implement though and it’s convincing, perhaps I’ll play around with it sooner rather than later, so thank you!
I think you should reformat the profile “about me” section. Even though I don’t have much text, the section being narrow and off-center just makes it feel weird to me. (Not a huge thing obviously, but figured I’d mention.)
Also, you might want to add a “you have unsaved changes” warning message if you try to leave the page when editing the profile.
Finally, any thoughts on supporting basic markdown? I’d like to have proper bullet points and be able to link my favorites to the Natively book/series pages for them. I suppose an alternative would be to allow us to explicitly select favorites and have them shown in a dedicated section of the profile. I find the current favorites section to be lacking, as it’s mostly recency based. Not to mention, I’d want to be able to select either individual books or entire series (and I’d usually pick series).
I also just noticed that the “Edit Profile” link is clickable even all the way on the left side of the page. That’s not very intuitive.
Apologies for those new error logs you probably have. I was just stress testing the finished date by setting future and invalid dates.
I tried:
- Future year, month, and day - got an error
- Invalid date (e.g. April 31st, June 45th) - got an error
- A valid year that’s not in the dropdown naturally (e.g. year 190) - successfully saved
Wow! I just checked this website and this is amazing!
This will change my life and prevent me from buying books that are way above my level. Thank you so much for putting this together, I will definitely recommend this to all my friends who read in Japanese!!!
Yeah, I agree with all of those things and they’re all on a to-do list!
I really want to get community features going… improving the custom lists / follows / profile page / reviews / adding markdown are all in the batch of features I want to do next, I just keep having high-priority things pushed up in front of it (ex: right now, I absolutely need to make my book addition process more scalable with the Amazon API).
I do think custom lists are the way to go here. If you can adjust your favorites, you could simply say in your about me to check out your favorites. Of course, though, I do agree it should be markdown, which really shouldn’t be hard.
And perhaps I can take care of those styling bugs in the meantime, however. I have noticed them, I just haven’t gotten around to them quite yet. Very easy though.
Heh, thanks for the note, I was going to investigate as those shouldn’t be possible in the UI which is why the backend errors. I’m not terribly concerned with any of them thankfully, but I really appreciate you checking… as it’s certainly not always the case!