API changes - Get All Reviews

I am really sorry that I broke your heart. If it is any consolation, I want to assure you that we are still collecting the review data while we continue to investigate the path forward.

5 Likes

Thanks, I have updated the list.

2 Likes

Any chance you could just limit the API to return data from the last 24 hours while you figure it out? The Heatmap caches the reviews, so that would keep it working.

2 Likes

Unfortunately not. I have had a think about it and I have come up blank, sorry.

The bottleneck is caused by filtering the large review table, not by the volume of data that is sent across the wire, so unfortunately it is not so straightforward. It is good to know, though, that you could get by with much less data on an ongoing basis. But a new user of the Heatmap who has been a long-time WaniKani user would still require you to fetch all the reviews for the initial cache, right?!

2 Likes

Thank you, I figured that might be the case. Indeed, new users still have their whole history retrieved.

1 Like

You could have a userscript that listens for completed quiz items on the review page and then stores that data locally. I think that could work.

3 Likes

That’s a good idea. I might add something like that to the Heatmap to keep the review cache updated.

3 Likes

I think initial retrieval is a marginal load compared to the number of users who retrieve updates whenever they reload the dashboard.
It makes sense that, depending on whether the table has an index on review date (and/or user+date), fetching those updates would be an expensive operation once the table gets large.
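To illustrate what an index on user+date means for a query like "this user's reviews after a given date", here is a minimal sketch using SQLite from Python. Everything in it is an assumption made for illustration: the `reviews` table, its columns, and the index name are hypothetical, and nothing in this thread says which database or schema WaniKani actually uses.

```python
import sqlite3


def plan_for_user_window(with_index: bool) -> str:
    """Return SQLite's query plan for one user's reviews after a given date."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema; the real reviews table is not described in this thread.
    conn.execute(
        "CREATE TABLE reviews ("
        "id INTEGER PRIMARY KEY, user_id INTEGER, created_at TEXT)"
    )
    if with_index:
        # Composite index: user first, then date, matching the query's filters.
        conn.execute(
            "CREATE INDEX idx_reviews_user_date ON reviews (user_id, created_at)"
        )
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT id FROM reviews WHERE user_id = ? AND created_at > ?",
        (42, "2023-01-01T00:00:00Z"),
    ).fetchall()
    conn.close()
    # The last column of each plan row is the human-readable description.
    return " | ".join(row[-1] for row in rows)


if __name__ == "__main__":
    print("without index:", plan_for_user_window(with_index=False))
    print("with index:   ", plan_for_user_window(with_index=True))
```

Without the index the plan is a scan of the whole table; with it, the plan names the index. Whether an index like this actually helps on a very large production table depends on the database engine and on how the data is laid out.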

Already worked out, but indeed correct! Review Summary uses the underlying Review Cache utility script. Do note that in this structure, users who already had the heatmap as part of their dashboard aren’t actually putting any additional load on the endpoint by adding Review Summary, as the single retrieval is used for both the heatmap and the summary.

2 Likes

The review cache now tracks completed reviews locally as long as you are letting it run on the reviews page (which is why the Heatmap now runs on the reviews page). People should start seeing new reviews roll in once they update. Note that this is per-device, and reviews done on one device won’t show up on another.
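As a rough illustration of what per-device tracking means, here is a small Python sketch of a local review log. It is not the Review Cache script's code (that is a userscript running in the browser); the class name, the JSON file, and the `subject_id` / `completed_at` fields are all made up for the example.

```python
import json
import tempfile
from pathlib import Path


class LocalReviewLog:
    """Append-only log of completed reviews, stored in a JSON file on this device."""

    def __init__(self, path: Path) -> None:
        self.path = path
        # Load earlier entries if this device already has a log file.
        self.entries = json.loads(path.read_text()) if path.exists() else []

    def record(self, subject_id: int, completed_at: str) -> bool:
        """Store one completed review; return False if it was already logged."""
        entry = {"subject_id": subject_id, "completed_at": completed_at}
        if entry in self.entries:
            return False
        self.entries.append(entry)
        self.path.write_text(json.dumps(self.entries))
        return True

    def count_on(self, day: str) -> int:
        """Count logged reviews whose timestamp starts with a YYYY-MM-DD day."""
        return sum(1 for e in self.entries if e["completed_at"].startswith(day))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        laptop = LocalReviewLog(Path(tmp) / "laptop.json")
        phone = LocalReviewLog(Path(tmp) / "phone.json")
        laptop.record(440, "2023-03-14T08:15:00Z")
        # Each device only knows about the reviews it saw itself.
        print(laptop.count_on("2023-03-14"), phone.count_on("2023-03-14"))
```

Because each log lives in its own file, a review recorded on one device never appears in the other's counts, which is the per-device limitation described above.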

11 Likes

Really bummed to hear about this change, although it does make sense. Sorry you’re in the situation that you are :frowning:

One of my main motivators in doing WaniKani every day is seeing the streak counter and looking at my heatmap and feeling like, “I can’t break it now, right?” (Was on a 1554 day streak :smiley:)

I don’t know the tech behind WaniKani, but if a streak counter or API could be created that would certainly help remedy this situation for me – even if I couldn’t get a heatmap of all the reviews I’ve done.

Best of luck figuring out something!

Edit: If possible, it would be great if the API could be re-enabled for a day at some point so I (and others) can grab all of our review data to have. I refreshed my cache before I saw this thread and don’t have my data anymore :frowning:

4 Likes

Is there a user_id index on that table?

3 Likes

Are any of the changes that have been made related to the endpoints that @viet created as part of the WaniKani Google Sheet connection?

I’ve had my sheet pulling data for nearly three years now and today is the first time I’ve had a failure notification. I’m hoping that this connection won’t be abandoned, because I find it extremely useful to keep track of historical progress.

Exception: Request failed for https://api.wanikani.com returned code 503. Truncated server response: <!DOCTYPE html>
	<html>
	  <head>
		<meta name="viewport" content="width=device-width, initial-scale=1">
		<meta charset="utf-8">
		<title>Applicat... (use muteHttpExceptions option to examine full response)
    at fetchData_(Code:240:34)
    at generateDataFromAssignments(Code:166:24)
    at generateNewDailyEntry_(Code:251:3)

Looking at the gist, I can’t see that it calls the Get All Reviews endpoint, so I’m hoping it’s just a bug.

2 Likes

This broke my (personal) script. My script periodically pulls the last few days’ worth of reviews using the updated_after parameter.

1 Like

I wrote a discord bot and one of the features is to periodically check how many reviews users have done (it just grabs all reviews that were done after midnight from the previous day based on their timezone and then looks at the total_count, never more than 1 page). This change ends up breaking this and showing that the user has completed 0 reviews for the previous day. Doesn’t break the bot or any notifications, but it would be nice to see how many reviews someone did each day.
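For anyone curious, the timestamp part of that check looks roughly like the sketch below. It is a simplified Python version, not the bot's actual code: the function names are made up, and the only thing assumed about the response is that it is a dict carrying the `total_count` field mentioned above.

```python
from datetime import datetime, time, timedelta, timezone, tzinfo


def previous_midnight_utc(now_utc: datetime, tz: tzinfo) -> str:
    """ISO 8601 UTC timestamp for 00:00 of the previous day in the user's timezone."""
    local_now = now_utc.astimezone(tz)
    yesterday = local_now.date() - timedelta(days=1)
    # Midnight at the start of yesterday, expressed in the user's timezone.
    local_midnight = datetime.combine(yesterday, time.min, tzinfo=tz)
    return local_midnight.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def reviews_done(response: dict) -> int:
    """Read the review count from a response; treat a missing field as zero."""
    return int(response.get("total_count", 0))


if __name__ == "__main__":
    jst = timezone(timedelta(hours=9))  # fixed UTC+9 offset, used as an example
    now = datetime(2023, 3, 15, 12, 0, tzinfo=timezone.utc)
    print(previous_midnight_utc(now, jst))
    print(reviews_done({"total_count": 0}))
```

The resulting string is what would be passed as `updated_after`. Any `tzinfo` works here, so a `zoneinfo.ZoneInfo` for a named timezone can be passed instead of a fixed offset when daylight saving time matters.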

2 Likes

Might be a bit of work, but you could set up a read-only slave instance on another server that the reviews API hits (or other APIs where data being very slightly out of sync isn’t an issue). We use a master and two slaves (MySQL) on our website. The master automatically pushes updates to the slaves to keep them in sync, and the slaves lag behind by only a few milliseconds most of the time, which is enough for us when we’re using them for larger batch scripts that run like once a day or whatever (vs normal web traffic that usually hits the master). The reads from the slaves don’t affect the master (and thus don’t affect the site itself), which means we can do much more expensive and long-running queries on the slaves without issue.

6 Likes

I’ve been keeping a log of my epic WaniKani journey for the last two years, and have been using the WK API every night in order to record the number of reviews I did that day in my log so that people can track my progress. Unfortunately, as of tonight, the API doesn’t return any reviews, so I can’t record the number of reviews completed to my log anymore.

1 Like

Will this break the wkstats website too? I checked and it is working now - I hope it will keep working, but if not, I want to brace myself for losing it.

We tried to add this last year, but it came as a surprise to us that it made performance much worse. I have been thinking about table partitioning, but this course of action requires careful consideration and needs to be done without downtime. However, please be assured we have been trying for some time to carefully improve the performance without disruption. Unfortunately, this disruption became necessary.

I looked at the gist and I can’t see where it is calling the reviews endpoint. The 503 error is also not caused by the changes I made to the reviews endpoint. It might be best to address this back in the original post.

I wish it were that simple. We have a lot of data and a lot of moving parts too. Thank you for the suggestions though.

I am sorry that this had to happen. We are working on features that may address your use case, as well as continuing to investigate the right course of action for the reviews table.

No, I believe wkstats doesn’t pull data from this endpoint, so it should continue to work as the developer who built it intended.

5 Likes

That’s because filtering on a particular user is not very selective. The data is scattered across the disk anyway, and the db still reads the whole thing (but in this case it first needs to access the index on the disk).

Having the API unrestricted by time range really limits the options you have. If you need to solve this without changing the API, the best way forward is to offload the data as it is coming in to a service that is designed for this (let’s say stream the data through Kafka, index it in Elasticsearch, and then have the endpoint read it from there).

4 Likes

Ah, thank you for clarifying - I had assumed the thread would be closed, failing to take into account that threads in announcements don’t close, hence tagging Viet, haha.

I’ll go and post there now :+1:t2: