API changes - Get All Reviews

Hey, it is almost one year already!

5 Likes

boooooo

4 Likes

We’ll have to prepare the “celebrations” for the big day when it arrives… :skull: :birthday: :skull:

2 Likes

The cake is a lie

6 Likes

I think this problem could be solved extremely easily (like, in 3-4 hours, with no risk of breaking anything and the ability to roll it back with the click of a button if something goes wrong) by simply creating a new API endpoint, something like enterqueueforreviewdata (or literally anything), that takes a number specifying how many reviews the client wants to receive (or a string like “all” if they want all of them), generates an ID, adds the request to a queue, and sends that ID back as the response to the request. Then another endpoint, we’ll call it getreviewdata, that expects incoming requests to include the previously generated ID and either tells the requester to try again later if the job hasn’t finished yet, or, if it has, responds with the reviews as requested.

This way you can throttle the database any way you want to, for example, only allowing 5 requests to go through every second or something like that. The client will then just keep pinging the getreviewdata endpoint with their ID every 5 seconds (or any number of seconds) until their data has been collected and is ready. As soon as the data is collected it can be wiped from the server’s memory - or wiped if it isn’t collected within 20 seconds of being pulled from the database (or any amount of time, really). That way it won’t fill up the RAM on the server, since at most around 100 users’ worth of data (5 requests per second × 20 seconds of retention) needs to be stored at any given time.
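For anyone who wants it concrete, here’s a rough TypeScript (Express-style) sketch of what I’m describing. Every name in it - enterqueueforreviewdata, getreviewdata, the fetchReviewsFromDb placeholder, the 5-per-second rate, the 20-second expiry - is something I made up for illustration; it’s a sketch of the idea, not anything from WaniKani’s actual codebase:

```typescript
// Hypothetical sketch only - endpoint names, rates, and fetchReviewsFromDb
// are illustrative, not anything WaniKani actually exposes.
import express from "express";
import { randomUUID } from "crypto";

type Job = {
  id: string;
  count: number | "all";
  status: "queued" | "ready";
  result?: unknown[];
  readyAt?: number; // when the result became available
};

const jobs = new Map<string, Job>();
const queue: Job[] = [];

const REQUESTS_PER_SECOND = 5; // throttle on the database
const RESULT_TTL_MS = 20_000;  // wipe uncollected results after 20 seconds

// Placeholder for the real database query.
async function fetchReviewsFromDb(count: number | "all"): Promise<unknown[]> {
  return []; // ...run the actual (paginated) query here
}

// Worker: drain the queue at a fixed rate so the database is never overloaded.
// (A real version would also guard against one tick's queries overlapping the next.)
setInterval(async () => {
  const batch = queue.splice(0, REQUESTS_PER_SECOND);
  for (const job of batch) {
    job.result = await fetchReviewsFromDb(job.count);
    job.status = "ready";
    job.readyAt = Date.now();
  }
  // Evict results nobody picked up in time.
  for (const [id, job] of jobs) {
    if (job.status === "ready" && Date.now() - (job.readyAt ?? 0) > RESULT_TTL_MS) {
      jobs.delete(id);
    }
  }
}, 1000);

const app = express();

// Enqueue a request and hand back an ID.
app.post("/enterqueueforreviewdata", express.json(), (req, res) => {
  const job: Job = { id: randomUUID(), count: req.body.count ?? "all", status: "queued" };
  jobs.set(job.id, job);
  queue.push(job);
  res.json({ id: job.id });
});

// Poll with the ID until the data is ready.
app.get("/getreviewdata/:id", (req, res) => {
  const job = jobs.get(req.params.id);
  if (!job) return res.status(404).json({ error: "unknown or expired id" });
  if (job.status !== "ready") return res.status(202).json({ status: "try again later" });
  jobs.delete(job.id); // collected once, then freed immediately
  return res.json({ reviews: job.result });
});

app.listen(3000);
```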

This also avoids breaking existing applications, since it isn’t replacing the reviews endpoint but simply adding two new ones - applications can make the switch if they want to, but aren’t forced to.

This way, when every third-party app wants to update all its data at the same moment of the day, it won’t overload the server. Those 1,000 (or however many) requests all coming in at once will be stretched over a bit more than three minutes (1,000 requests at 5 per second is about 200 seconds) and processed as the database becomes available. This not only makes it impossible to overload the database, but also incentivizes third-party app developers to vary their request times for their own sake - since making the request during a dead period reduces the wait time - thus naturally spreading out the requests throughout the day. It also allows apps like the one I want to make, which only need the data once (not on a recurring basis) and at random times throughout the day (at the user’s discretion), to still access the data regardless of the time of day.

I see no downsides to this solution at all, and if Tofugu would give me access to their infrastructure (please don’t) for one single day I could have it finished by the time the day’s up. I guarantee their engineer(s) is/are MORE than capable of doing this within a few hours, and like I said - if anything goes wrong they can roll back the changes instantly. Implementing this runs no risk to the existing infrastructure whatsoever - it’s simply an isolated addition that won’t affect anything else. There’s no reason NOT to do this unless they’re just being extremely stingy and really don’t want to shell out the ~$400 worth of labor this would cost in total.

Additionally, the endpoints could even respond with a (nearly exact) time estimate to help the client know when and how frequently to poll for the results. If a request to enterqueueforreviewdata comes in when there are already 800 people in the queue, and the server (obviously) knows it’s processing 5 requests per second, it can tell the client that their results should be ready in approximately 2 minutes and 40 seconds (it could even add a ~2% buffer to account for query times). Then if the client requests again after that amount of time and their result is still not ready, the getreviewdata endpoint could respond with a new estimate. This way the client application could display a progress bar to the end user if necessary and give them an extremely accurate (usually down to the second) estimate of how long the wait will be. It also greatly reduces how many requests need to be made to the getreviewdata endpoint (though that doesn’t matter much, since those requests would only be about 30 bytes each anyway).
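A quick sketch of that estimate, using the same made-up numbers as above (queue position, 5 requests per second, and an arbitrary 2% buffer - none of this is real WaniKani behaviour):

```typescript
// Rough ETA calculation for a queued job. Both parameters and the 2% buffer
// are the hypothetical values from the sketch above.
function estimateWaitSeconds(positionInQueue: number, requestsPerSecond = 5): number {
  const base = positionInQueue / requestsPerSecond;
  return Math.ceil(base * 1.02); // small buffer to account for query time
}

// 800 people already queued at 5 requests per second:
estimateWaitSeconds(800); // ≈ 164 seconds, i.e. the ~2 minutes 40 seconds above plus the buffer
```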

Alright, thank you for coming to my TED talk. I really wish Tofugu would just implement this already. It really is as easy as I made it sound. Sure, it’s a “band-aid” solution, but considering it would take 4 hours to set up and 5 minutes to take down if and when the time comes, I think it’s a pretty darn good band-aid if you ask me. And the user experience is a heck of a lot nicer than “sorry, this data is unavailable, deal with it.”

Oh, this also greatly reduces the number of requests made to the database to begin with, since the old endpoint was limited to only 1,000 reviews at a time - if you wanted 10,000 reviews you previously needed to make 10 queries on the database. With this new method, you specify how much data you want upfront (and have a really big incentive to give an accurate value because of the wait times), so you’ll rarely ever need to run more than one single query to get all the data you want - and you’ll definitely never need to run more than 2.

And if Tofugu is worried about bandwidth usage (e.g. they want to send as little data over the internet as possible and discourage people from just constantly requesting all the reviews), then they could set up additional parameters. For example, 1 second is artificially added to your wait time for every 100 reviews you request - or requests are processed in order of smallest first, largest last - or really anything. If they decided to go this route they’d have an essentially limitless number of options and an incredible amount of control and fine-tuning at their fingertips.
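As a sketch of one such knob (again, the numbers and names are purely illustrative, not anything Tofugu has said they would do):

```typescript
// One possible policy, purely illustrative: add artificial delay proportional
// to the size of the request, and process small requests before large ones.
type QueuedRequest = { id: string; reviewCount: number; extraDelayMs: number };

function applySizePolicy(queue: QueuedRequest[]): QueuedRequest[] {
  for (const req of queue) {
    // 1 extra second of wait for every 100 reviews requested
    req.extraDelayMs = Math.floor(req.reviewCount / 100) * 1000;
  }
  // Smallest requests are processed first, largest last
  return queue.sort((a, b) => a.reviewCount - b.reviewCount);
}
```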

I have no clue why they haven’t already done this. Please. Do this. I’ll literally pay you $50 out of my own pocket if you do - that alone offsets 1/8 of the total cost (or more).

Edit - I forgot to add: you could even store the results of the database queries in .json files on the server itself. This would decrease the overall speed (write times would add to the delay between processing requests, and read time to the time it takes to respond to a getreviewdata request once the data is ready) but would decrease the cost, since it would use significantly less RAM. If RAM is extremely limited, this would solve that issue. You could even have another database running and use that to store the intermediary results instead of .json files. That would be even faster (and possibly cheaper) while still using barely any RAM - but would take a tiny bit of extra setup beforehand. But anyway, these are both very good options if (somehow) the server can’t afford to give up 50MB of RAM for 5 minutes a day. So even that’s not an excuse. (You could even use a non-relational database like MongoDB to store the intermediary results. That would probably be even easier, cheaper, and faster.)
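If RAM really were the constraint, the sketch above could spill finished results to disk instead of keeping them in memory - something like this (the ./job-results directory and the helper names are made up for illustration):

```typescript
// Spill finished results to disk instead of holding them in memory.
// The ./job-results directory and file naming are assumptions for this sketch.
import { promises as fs } from "fs";
import path from "path";

const RESULTS_DIR = "./job-results";

// Called by the worker once a job's reviews have been fetched.
async function storeResult(jobId: string, reviews: unknown[]): Promise<void> {
  await fs.mkdir(RESULTS_DIR, { recursive: true });
  await fs.writeFile(path.join(RESULTS_DIR, `${jobId}.json`), JSON.stringify(reviews));
}

// Called by the getreviewdata handler; returns null if not ready or expired.
async function loadAndDeleteResult(jobId: string): Promise<unknown[] | null> {
  const file = path.join(RESULTS_DIR, `${jobId}.json`);
  try {
    const data = JSON.parse(await fs.readFile(file, "utf8"));
    await fs.unlink(file); // collected once, then wiped
    return data;
  } catch {
    return null;
  }
}
```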

:sparkles: limitless :sparkles: options here. Please choose one and implement it. :upside_down_face: I’m tired of waiting.

15 Likes

Maybe send this to them in an email if you haven’t yet? At least that way it might get them thinking about it again even if only for 5 seconds :upside_down_face:

6 Likes

Alright, I’ll send them an email with a (much condensed) version of what I wrote there and update here with their response. They tend to be very good about replying to any and all contact (which I appreciate a lot) so hopefully we’ll hear at least something.

10 Likes

I am very happy to disable all my TamperMonkey scripts, as I come to WaniKani to learn Japanese, not to show off my stats. :slight_smile:

Technically, there are many solutions to this problem, like creating an analytics read-only replica that is aggregated by day or hour instead of serving so much detail on the public API. I am sure the WaniKani team is competent enough to figure all this out. :+1:

4 Likes

Alright, finally got a response… And it’s just them completely dodging the question and leaving us in the dark again.

Hi there,

I wanted to start off by saying thank you for writing such a comprehensive and thoughtful email to us about this topic. We don’t come across this level of detail very often, and we can tell you’re a dedicated WaniKani user who wants the best for our program and our learners.

We understand that there are learners like yourself who used the information retrieved from the ‘Get All Reviews’ endpoint, but the priority right now is to build and tweak new features for native WaniKani, and to do it as efficiently as possible (Crabigator willing :pray:). There are other potential consequences we have to evaluate first and make decisions on, which is why we haven’t addressed this topic yet.

We know this answer is not the one you wanted, but we will share more information related to any changes on WaniKani when we can. Thanks for being patient with us, and sharing your thoughts.

Regards,

[employee name]

No ETA, no explanation of what “other potential consequences” they’re referencing… no information at all other than “we don’t want to communicate with you and expect that to not change any time soon.”

13 Likes

They must have decided the problem wasn’t worth fixing, or even worth mentioning ever again.

3 Likes

Apparently.

Smh WaniKani. Huge disappointment from an otherwise well-managed product. Do better.

2 Likes

That’s right, with Mr. Wiggles out of commission there is no way we’ll be able to take back the Reichstag!

Insane that it’s been almost a year with zero real info.

10 Likes

Really frustrating that this functionality has not been restored in some way. I had a catastrophic PC failure the other day, and after getting it up and running again with a fresh install of Windows I had to go about adding dummy reviews to 1,000+ days so the heatmap would have a streak to continue. Seeing that number tick forward is a big motivator in keeping me coming back daily, and almost losing it was a gut punch. It’s already annoying enough that the data isn’t accurate at the moment, so I do hope someone on WK’s engineering team is still going to look into reimplementing this in some fashion at some point.

10 Likes

Not sure how much difference this actually makes, but I’d recommend anyone who wants to see this fixed write them a (polite, of course) email. I imagine they have some way of keeping track of common bugs and requests they get by email, so showing them that there’s more than just one or two random people who want this fixed might help. And they actually reply to emails, so they have to read them - which it feels like they definitely don’t do here :sweat_smile:

8 Likes

One year :tada::tada::tada:

6 Likes

385 days. I still haven’t forgotten.

10 Likes

Over 400 days now :tada: :frowning_face:
I wonder if the devs even remember they killed this feature off…

7 Likes

Coming back from a long break and I am pretty bummed this API is still not restored :frowning:

I suppose as a workaround, someone could make a user script that manually tracks reviews in the browser and can generate a downloadable file with the data, which could then be used by 3rd party apps (a rough sketch of what that might look like is at the end of this post). Of course, the downside is that it would not work across computers or with 3rd party mobile apps like Tsurukame (unless the app maintainers also added that as a feature :eyes:).

If someone wants to create something like that I would be willing to update https://nihongostats.com to add support for using an uploaded data file instead of the API.

Not a great solution but I suppose something is better than nothing :person_shrugging:
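To make the idea concrete, here’s a minimal sketch of what such a script could look like - written as TypeScript for readability, though a real TamperMonkey script would be plain JavaScript. The record shape, storage key, and file name are all assumptions, and hooking into the actual review submission (e.g. by wrapping window.fetch) is left out, since that depends on WaniKani internals I won’t guess at:

```typescript
// Sketch only: the ReviewRecord shape and storage key are assumptions,
// not WaniKani's actual internals.
type ReviewRecord = { subjectId: number; timestamp: string; correct: boolean };

const STORAGE_KEY = "local-review-log";

// Call this whenever the script detects a review being answered
// (detection itself is out of scope for this sketch).
function logReview(record: ReviewRecord): void {
  const log: ReviewRecord[] = JSON.parse(localStorage.getItem(STORAGE_KEY) ?? "[]");
  log.push(record);
  localStorage.setItem(STORAGE_KEY, JSON.stringify(log));
}

// Trigger a download of everything captured so far, for import into a
// third-party app that accepts an uploaded file instead of the API.
function exportLog(): void {
  const blob = new Blob([localStorage.getItem(STORAGE_KEY) ?? "[]"], { type: "application/json" });
  const url = URL.createObjectURL(blob);
  const a = document.createElement("a");
  a.href = url;
  a.download = "review-log.json";
  a.click();
  URL.revokeObjectURL(url);
}
```

localStorage has the same single-browser limitation mentioned above, so this would only ever be a stopgap.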

1 Like

It doesn’t do the extra stuff, but this is actually what Heatmap already uses internally - it’s called Review Cache. It doesn’t solve much, though, as the Heatmap thread has shown: the browser’s IndexedDB is prone to issues that lead to data loss.

1 Like