That’s great news, thank you!
Really, I am kind of speechless. First, the summary pages were removed without even a pretense of solving this problem for the last three weeks (“working on it…”), and now this stats feature is removed too. So again, no stats. What a pitiful performance. And how foolish of me to get a lifetime subscription to this site.
This. Shameless conduct.
They forgot the basics of teaching and focused more on reducing their costs.
I don’t think it’s fair to be so critical. While I am very disappointed by this as a user, it sounds like this is a real technical issue that gave them no choice, and they are trying to find a workaround. And of course, the reviews API has often broken in the past, so it’s not hard to believe that it would be straining under the load.
P.S. I just realized that even if I manually track the number of reviews each day for my log, there is no way for me to get accuracy data any more. I can’t even try to calculate accuracy manually because the session summary page is gone too.
Maybe if the WK team had worked with or included userscript creators in the conversation earlier, some of the loss of stats and streaks could have been prevented, seeing as Heatmap was updated so quickly to regain at least some of its previous functionality. I think the userscripts are a big part of what makes WK a great learning tool, so I just hope the WK team isn’t taking them for granted.
I’d think that 90% of the heatmap users are using it to track review metadata, not the actual reviews themselves. I realize that the actual review data is present, but most people care about the streak number and review count, and the nice colored graph.
To that end, a review metadata table is a possible solution that covers what 90% of people care about. It keeps a count of reviews for any given day. Maybe it has other things, maybe it doesn’t, but it would give you a much smaller table to query, one that provides at least enough data to display this…
I realize this is work, and work is not free, so I won’t make any judgements about ease of implementation. Just something to think about.
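To make the rollup idea concrete, here is a minimal sketch of what such a metadata table would hold; none of this is WaniKani’s actual schema or code, just the shape of the suggestion: collapse one row per review into one row per day.

```javascript
// Hypothetical sketch of a per-day review-count rollup, as suggested above.
// All names here are illustrative, not WaniKani's.

// Fold an array of ISO review timestamps into { 'YYYY-MM-DD': count }.
function rollupByDay(reviewTimestamps) {
  const counts = {};
  for (const ts of reviewTimestamps) {
    const day = ts.slice(0, 10); // the 'YYYY-MM-DD' portion of the ISO string
    counts[day] = (counts[day] || 0) + 1;
  }
  return counts;
}

// A heatmap only needs this small table, not the underlying reviews.
const counts = rollupByDay([
  '2024-01-05T09:12:00.000Z',
  '2024-01-05T21:40:00.000Z',
  '2024-01-06T08:00:00.000Z',
]);
// counts -> { '2024-01-05': 2, '2024-01-06': 1 }
```

Note that slicing the UTC string bakes in UTC day boundaries; what counts as “a day” depends on the user’s timezone, which is exactly the wrinkle with aggregating server-side.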
work is not free
Unless you want a Heatmap
The metadata table keeps a count of reviews for any given day.
I would be entirely satisfied with this. The extra information I provide in the Heatmap (mainly by clicking on or selecting dates) is more or less superfluous.
Although then they would have to account for user timezone, which is tricky with aggregate data
I did quite like the extra details from clicking on the individual dates - always gave a nice idea of the success rate at the various levels.
Here’s the script that I’ve been using for myself: WaniKani Stats / Fedor Indutny | Observable (and it got broken obviously). It is a sort of heatmap-like view of wanikani progress, and I made sure in the script to cache the previous results and request only the latest entries to keep the server load minimal. FWIW, I really only need the number of reviews/lessons done per day. Is there any other way to request that?
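The caching strategy described above can be sketched roughly like this; the collection field names (`data`, `data_updated_at`) follow the WaniKani API’s documented format, but the cache object and `mergeIncremental` helper are my own stand-ins, not the actual script.

```javascript
// Sketch of the caching approach: remember the newest data_updated_at we
// have seen, and on the next run only request entries after that watermark.

function mergeIncremental(cache, page) {
  // Append only entries we don't already hold (matched by id).
  const known = new Set(cache.entries.map((e) => e.id));
  for (const entry of page.data) {
    if (!known.has(entry.id)) cache.entries.push(entry);
  }
  // Advance the watermark used for the next updated_after request.
  if (page.data_updated_at && page.data_updated_at > (cache.updatedAt || '')) {
    cache.updatedAt = page.data_updated_at;
  }
  return cache;
}

// The next request would then use:
//   `...?updated_after=${encodeURIComponent(cache.updatedAt)}`
// so the server only has to return entries newer than the cache.
```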
tl;dr: This broke my personal script that counts “reviews I’ve done today”; I was able to fix it, but I don’t think the fix generalizes to most people’s use cases in this thread.
This temporarily broke my personal script, which calls `'/v2/reviews?updated_after=' + todays_date_at_midnight.toISOString()` and checks the `total_count` to see how many reviews I’ve done today.
For that use case, I was able to just change reviews to review_statistics, which works almost the same. (It does undercount items that have been reviewed more than once in a day, but that’s fine.) And I imagine it’s easier on the database because the number of review statistics is much smaller than the number of reviews.
Unfortunately, I don’t think the review_statistics endpoint is useful for getting review stats for any day before ‘today’, so it won’t be useful for e.g. heatmap scripts unless they run every day from the start. A new endpoint for review metadata would be nice!
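The swap described above can be sketched as follows. The endpoint and parameter names (`/v2/reviews`, `/v2/review_statistics`, `updated_after`, `total_count`) are from the WaniKani API as discussed in this thread; the `countUrl` helper is mine, a sketch rather than the actual script.

```javascript
// Sketch of the "reviews done today" fix: same query, different endpoint.
const API_BASE = 'https://api.wanikani.com/v2';

// Build the URL that asks "what changed since midnight today?".
function countUrl(endpoint, now = new Date()) {
  const midnight = new Date(now);
  midnight.setHours(0, 0, 0, 0); // local midnight, serialized as UTC below
  return `${API_BASE}/${endpoint}?updated_after=${midnight.toISOString()}`;
}

// Before: countUrl('reviews')           -- now returns an empty dataset
// After:  countUrl('review_statistics') -- near-equivalent, but undercounts
//                                          items reviewed twice in one day
//
// The caller then reads total_count from the collection response, e.g.:
//   const res = await fetch(countUrl('review_statistics'), {
//     headers: { Authorization: 'Bearer <api token>' },
//   });
//   const { total_count } = await res.json();
```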
@tofugu-scott Would it be easy to update the documentation with a warning that this endpoint currently returns an empty dataset?
As upsetting as this is, thank you for being forthcoming on the technical side, Scott. It sounds like a migration is needed if you have hopes of getting this endpoint back up; that’s daunting to hear.
Underrated feature >.> The detailed data on the dates is great; it’s a great collection of your items and stats. I use it as a quick review method with the mouse-over data, and after reviews I use it to go to items’ pages when I think I need more work on an item. Since the summary page was removed, Heatmap’s details have been the replacement for it, too.
Might be worth updating the API docs while this is in place.
@tofugu-scott Would it be easy to update the documentation with a warning that this endpoint currently returns an empty dataset?
I have updated the documentation while we continue our evaluation.
FWIW, I really only need the number of reviews/lessons done per day. Is there any other way to request that?
You could potentially use the `assignments` endpoint in conjunction with the `updated_after` parameter. It won’t be perfect, as it won’t account for reviewing the same item twice in one day.
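A rough sketch of that workaround, counting assignments updated since midnight as a proxy for reviews done today. The collection shape (`data`, `pages.next_url`) follows the WaniKani API’s documented pagination format; `fetchJson` is a hypothetical stand-in for an authenticated fetch helper, not a real API client.

```javascript
// Approximate "reviews done today" by counting assignments whose records
// changed since local midnight. Repeat reviews of the same item in one day
// are missed, as noted above.

async function countAssignmentsUpdatedToday(fetchJson) {
  const midnight = new Date();
  midnight.setHours(0, 0, 0, 0);
  let url =
    'https://api.wanikani.com/v2/assignments?updated_after=' +
    midnight.toISOString();
  let count = 0;
  while (url) {
    const page = await fetchJson(url); // one collection page of assignments
    count += page.data.length; // one entry per item updated today
    url = page.pages && page.pages.next_url; // follow pagination, if any
  }
  return count;
}
```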
@tofugu-scott Thanks for making the post to inform us. I hope WK is able to restore the API functionality soon.
I’ll add a note to the https://nihongostats.com page
That’s because a filter on a particular user is not very selective. The data is scattered across the disk anyway, and the DB still reads the whole thing (except in this case it first needs to access the index on disk, too).
I’m still puzzling over this reply (purely out of self-interest, not necessarily just related to the issue at hand). Why wouldn’t the user be very selective? True, a user will have a ton of reviews, but that’s still several orders of magnitude smaller than the total number of reviews across all users. Why would the DB need to read “the whole thing”? Isn’t an index exactly what’s supposed to solve the problem of full table scans?
Or, as tofugu-scott mentioned, table partitioning might be even better.
Here’s the deal:
Data comes in continuously for all users simultaneously. The DB writes it into data blocks one after the other; it is not pre-sorted (i.e., the table is not partitioned). Each data block has a finite size and fills up; then the next one is used, and then another. Then you have an index on the user. Great, so the index just tells you: user Fryie has review data in blocks AA, AC, AK, AM, AZ, …
Or user fundy has data in blocks AB, AC, AK, AZ, …
When a particular user is queried, the reason the filter is not selective enough is that we all have data scattered across nearly all of the blocks of the reviews table. Now the DB needs to read the data blocks of the index first (an index is nothing other than another table which says Fryie → AA, AC, AK, …),
and then read nearly the whole table anyway.
Does it make sense?
If the table was partitioned, let’s say by date (and having the accordingly modified api), we would not be having this discussion.
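The argument above can be illustrated with a toy model (all numbers here are invented, nothing about WaniKani’s actual database): when users’ reviews arrive interleaved, one user’s rows end up touching essentially every block.

```javascript
// Toy model: 50 users submitting reviews round-robin, 100 rows per block.

const ROWS_PER_BLOCK = 100;
const USERS = 50;

// Simulate arrival order: each "day", every user submits one review,
// so consecutive rows belong to different users.
const rows = [];
for (let day = 0; day < 200; day++) {
  for (let user = 0; user < USERS; user++) {
    rows.push({ user, day });
  }
}

// Which blocks hold rows for user 0?
const blocksForUser0 = new Set();
rows.forEach((row, i) => {
  if (row.user === 0) blocksForUser0.add(Math.floor(i / ROWS_PER_BLOCK));
});

const totalBlocks = rows.length / ROWS_PER_BLOCK;
// With 50 interleaved users and 100 rows per block, EVERY block contains
// user 0's rows: blocksForUser0.size === totalBlocks. The per-user index
// faithfully lists the blocks, but that list is nearly the whole table.
// Partitioning by date would instead confine a date-range query to the
// few blocks holding those dates.
```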
If the table was partitioned, let’s say by date (and having the accordingly modified api), we would not be having this discussion.
This might be a stupid question as I know little about this, but why partition by date? Wouldn’t it make much more sense to partition by user, as all read and write requests to this endpoint would always only affect one user per request?
Not a stupid question at all. But that would be a LOT of partitions. Not sure whether the actual wk db supports that.
I mean, you can always group some users in the same partition to get however many partitions you want.
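One common way to do that grouping, sketched here purely as an illustration (the partition count and function are made up, not anything WK uses): hash each user id into a fixed number of partitions.

```javascript
// Hash-partitioning sketch: cap the partition count regardless of how many
// users exist, while keeping each user's data in exactly one partition.

const NUM_PARTITIONS = 64; // pick however many partitions you want

function partitionForUser(userId) {
  return userId % NUM_PARTITIONS; // simple modulo hash on a numeric id
}

// Users 3, 67, and 131 all share partition 3, but any single user's
// per-user query still only has to touch one partition.
```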
