In my understanding, an index maps a key to specific rows of a table, not just blocks of binary data, which is why I can’t really follow. Admittedly, there may be I/O overhead I’m not aware of, especially on slow storage media without random access, but at least you save the time of comparing every row in the table.
If an index on a user ID isn’t selective when there are probably at least thousands of users, I wonder what kind of index is.
But admittedly, this is not my area of expertise, so I must have fundamentally misunderstood something.
An index maps to a rowid (or similar), which tells the DB where to look for the row; it then needs to load the specific data page from disk if it is not already in memory, and even on an SSD this I/O takes a long time. This might help: How Postgres Stores Rows
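A tiny sketch of that two-step lookup (using SQLite for portability; the principle is the same in Postgres, and all table/column names here are made up): the index search finds row locations, but the engine still has to fetch the actual rows from the table’s data pages.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reviews (id INTEGER PRIMARY KEY, user_id INTEGER, item TEXT)"
)
conn.executemany(
    "INSERT INTO reviews (user_id, item) VALUES (?, ?)",
    [(i % 50, f"item-{i}") for i in range(1000)],
)

# Without an index the engine has no choice but a full table scan.
plan_no_index = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM reviews WHERE user_id = 7"
).fetchall()
print(plan_no_index)  # detail column reads e.g. "SCAN reviews"

# With an index it searches the index first, then fetches matching rows.
conn.execute("CREATE INDEX idx_reviews_user ON reviews (user_id)")
plan_with_index = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM reviews WHERE user_id = 7"
).fetchall()
print(plan_with_index)  # e.g. "SEARCH reviews USING INDEX idx_reviews_user"
```

The plan changes from a scan to an index search, but note that the index only locates rows; every matching row still costs a page fetch.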
Imagine you need to obtain 5 books from 5 different libraries: knowing which book is in which library does not help you if you still need to go to all of them.
You are correct that the user ID will reduce the amount of data fetched, but that does not help in this case, where the rows are so scattered across pages. I admit, though, that I might be completely wrong (I have not seen the execution plan for the query).
Fair enough, but the way I would have understood it:
A full table scan needs to load all blocks into memory too, and so even if the indexed query had to load most or even all blocks, the only additional I/O it needs to perform is loading the index which… probably is not a lot in comparison?
The indexed query avoids more CPU work by not having to compare each row to the value.
So I could see that in a specific case the index might not make the query much faster, but could it actually make it slower?
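To put some (entirely made-up) numbers on that question: if one user’s rows are scattered uniformly across the table, even a moderately selective index scan can touch a large fraction of the data pages, and it does so with random reads rather than one sequential pass. A back-of-envelope sketch, all figures assumed:

```python
# Invented numbers: how many data pages does an index scan touch if one
# user's matching rows are scattered uniformly at random over the table?
table_pages = 10_000      # data pages in the table (assumed)
matching_rows = 5_000     # rows matching the indexed user ID (assumed)

# Probability that a given page holds none of the matching rows
# (simple with-replacement model):
p_untouched = (1 - 1 / table_pages) ** matching_rows
pages_touched = table_pages * (1 - p_untouched)
print(round(pages_touched))  # ~3935 of 10000 pages, each via a random read

# If a random page read costs roughly 4x a sequential one (a common
# planner-style assumption), the index scan already loses to a full scan:
seq_cost = table_pages * 1.0
index_cost = pages_touched * 4.0
print(index_cost > seq_cost)  # True
```

So yes, under these assumptions an index scan fetching only 0.05% of distinct key values can be slower than reading the whole table sequentially, which is why planners sometimes ignore an unselective index.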
That’s just focusing on the read side, though. Adding an index likely makes the write side slower, and for a table that’s written to every time a review is completed, that could have a big impact.
Yes, but it’s still something the WaniKani devs would have to be mindful of. Making the reads faster doesn’t help if it takes down the app when writing. Not that I have any idea if that would be relevant to them in practice.
It seems to me like partitioning the database by user is the way to go then. As a bonus, you could find inactive users and put them in a separate partition to reduce the load on the main partitions.
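The routing idea could be sketched like this (purely hypothetical; Postgres can do this declaratively with `PARTITION BY HASH`, and all names here are invented for illustration):

```python
# Minimal sketch of routing a user's rows to one of N hash partitions,
# with a separate catch-all partition for inactive users.
NUM_PARTITIONS = 16  # assumed partition count


def partition_for(user_id: int, inactive: bool = False) -> str:
    """Return the (hypothetical) partition table name for a user's reviews."""
    if inactive:
        return "reviews_inactive"
    return f"reviews_p{user_id % NUM_PARTITIONS}"


print(partition_for(12345))                 # reviews_p9
print(partition_for(12345, inactive=True))  # reviews_inactive
```

The point of the scheme is that any per-user query only ever hits one small partition instead of the whole table.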
Reorganising databases is not that simple. It would probably take hours just for the update process itself, and even then it would need to be thoroughly tested to ensure nothing went awry during the process.
This is the first time I’ve seen https://nihongostats.com/, pretty cool site! I wanted to develop something roughly along those lines myself. But I assume the site is affected by the /v2/reviews API change and there is no workaround, right?
Databases are not my forte, but this sounds like the typical use case for MapReduce to me: basically what Google, who coined the technique, did with their search results, though of course on a much smaller scale. However, this would require a NoSQL database (like MongoDB), which seems to me the better choice for this type of data, given the usual problem that a relational user table eventually doesn’t fit on one server and doesn’t lend itself to sharding. But of course, hindsight is always 20/20…
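For anyone unfamiliar with the shape of MapReduce, here is a toy single-process sketch (all data invented): map each review record to a `(user, 1)` pair, shuffle the pairs by key, then reduce each group by summing — the same structure as the classic word-count example, just over review records.

```python
from collections import defaultdict
from itertools import chain

# Toy input: three review records (made up).
reviews = [{"user": "alice"}, {"user": "bob"}, {"user": "alice"}]


def mapper(review):
    # Map phase: emit one (key, value) pair per record.
    yield (review["user"], 1)


def shuffle(pairs):
    # Shuffle phase: group all values by key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(key, values):
    # Reduce phase: combine each key's values into one result.
    return key, sum(values)


mapped = chain.from_iterable(mapper(r) for r in reviews)
result = dict(reducer(k, v) for k, v in shuffle(mapped).items())
print(result)  # {'alice': 2, 'bob': 1}
```

In a real deployment the map and reduce steps run in parallel across shards, which is exactly why the data layout matters so much.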
@tofugu-scott
I’ve read a bit in [Userscript] Wanikani Heatmap and it seems most people, myself included, are missing the streak counter above all. So I would suggest implementing this as two new endpoints, for example GET /user/streak/review and GET /user/streak/lesson to soften the blow from removing GET /reviews.
From what I understand about WaniKani’s architecture and data model, the backend looks to me like the better place to do this than the client, because querying the number of reviews for a user for every day since account creation is probably not cheap.
It might even be preferable to add this as a field on the user model, which (after initialization) gets incremented or reset by an hourly cronjob (hourly to avoid timezone trouble) that only checks the past day’s reviews/lessons relative to the user-configured “New day starts at” time, avoiding the JIT calculation in the first place. Changing the “New day starts at” time would probably require reinitializing the two streak fields, but that happens much more rarely than a page refresh from someone with the heatmap userscript installed.
Let me second that, dear sir.
Same here, missing the streak feature for the Heatmap script. @tofugu-scott I should admit that watching the daily tiles get painted was pleasant as well, but most of all I miss the streak counter itself.
I’m sorry for bringing up another service as an example, but the Bunpro folks seem to handle it in pretty much the way I’d be happy with.
I agree with this. My thinking is that the reviews table is very big and write-heavy, which is what caused the negative impact when the new index was added.
Partitioning the table on user id would almost certainly fix the problem but the challenge would be getting there if the reviews table is essential for the main application to work.
(I had MySQL / MariaDB in mind when I wrote this but I could be wrong…)
Heatmap has a fix because it can get by with just pulling recent reviews and keeps the rest of the data cached. It breaks if you move between devices, though.
Other scripts (e.g. ganbarometer) aren’t working, so it doesn’t look like it’s fixed.