[Userscript] Wanikani Heatmap

They both rely on the Open Framework, so maybe you could try uninstalling and reinstalling that. Also, do you have any privacy-related extensions installed that might prevent scripts from running?

I uninstalled and reinstalled Open Framework, but I still can’t get the Heatmap and Level Duration to work. As for the extensions, I only have Tampermonkey, rikaikun, Grammarly, and Adobe.

Make sure to make it the #1 script, too (just in case you forgot)

I don’t know what else it could be. In the console you could try entering wkof.version.value to see if it’s being installed at all, but I expect it to be if you’re not seeing any errors…
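
For anyone following along, a slightly more defensive version of that console check might look like the sketch below. `describeWkof` is a made-up helper name, not part of the framework. The only thing it assumes about WKOF is what is mentioned above: a working install exposes `wkof.version.value` on the page.

```javascript
// Hypothetical helper: report whether the Open Framework global is present.
// Only assumption about WKOF: a working install exposes wkof.version.value.
function describeWkof(globalObj) {
  const wkof = globalObj.wkof;
  if (!wkof) {
    return 'wkof is not defined: the Open Framework did not load';
  }
  if (!wkof.version || !wkof.version.value) {
    return 'wkof exists but reports no version';
  }
  return 'Open Framework loaded, version ' + wkof.version.value;
}

// In the browser console on the dashboard: describeWkof(window)
```

If the first message comes back, the framework script itself is being blocked or is not installed, which points at the script manager or an extension rather than at the Heatmap.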

It’s working! After trying your suggestion, I thought of reinstalling Chrome, and now it’s working again. Thank you very much. I may be able to enjoy learning again.

Happy to hear it!

Does the heatmap download your entire WK history every time you load the homepage?

No, the reviews are cached in IndexedDB through WKOF. It only fetches the reviews that are not already cached.

Thanks. I have 76,000+ reviews from over 4,000 sessions over 800+ days, and it seems to take an inordinately long time to load. I presume, then, that it is loading from the database and not over the network, which was the motivation for the question.

Hmmm, that’s a good point… I’m observing very slow loading as well (and I’m sitting on 118,000 reviews as of now). I just checked, and the call to the backend takes 45 seconds (!), even though it restricts the reviews (https://api.wanikani.com/v2/reviews?updated_after=2022-02-16T22:04:25.935890Z) and returns an empty array (because I have not done any reviews since I last loaded the dashboard). So I assume it’s just WaniKani that’s slow here?
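
For context, the incremental pattern behind that request can be sketched roughly as follows. The endpoint and the `updated_after` parameter are taken from the URL above. The helper names and the assumption that every review record carries a unique `id` are mine for illustration, and this is not the actual WKOF or Heatmap code.

```javascript
// Endpoint and query parameter as they appear in the request above.
const REVIEWS_URL = 'https://api.wanikani.com/v2/reviews';

// Hypothetical helper: build the incremental request from the newest cached
// timestamp. With no timestamp (empty cache) it asks for everything.
function buildReviewsUrl(lastUpdatedAt) {
  const url = new URL(REVIEWS_URL);
  if (lastUpdatedAt) {
    url.searchParams.set('updated_after', lastUpdatedAt);
  }
  return url.toString();
}

// Hypothetical helper: merge fresh records into the cached ones. Assumes each
// record has a unique `id`, so a record returned twice is not duplicated.
function mergeReviews(cached, fresh) {
  const byId = new Map(cached.map((r) => [r.id, r]));
  for (const r of fresh) byId.set(r.id, r);
  return [...byId.values()];
}
```

With this shape, an empty response leaves the cache untouched, so the 45 seconds above is spent waiting on the server rather than transferring or merging data.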

That’s quite interesting. I had not thought to check the network tab of the browser’s developer tools. I’m going to start watching that when I navigate back to the main page after finishing a review session.

It’s similar for me. If I want any of these to load, I have to leave the WaniKani home screen open in a separate tab for some time. I went to level 60 twice and am now on my third attempt, which also leaves me sitting on 110,024 reviews.

Yup.

18.6 seconds for that endpoint to return upon navigation back to the main page after doing a set of reviews.

This would appear to be entirely a WaniKani problem. I hope @Kumirei can forgive me.

I could make it display the data that’s cached first and then update when the API responds, but I’ve learned so much since I rewrote the script last time that I would rather rewrite the whole thing than work on any significant features. Although if @Rrwrex decides to rewrite the Review Cache, it might be easy to subscribe to reviews instead of having a single call.
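
As a rough sketch of the "cached first, then update" idea (not how the Heatmap is written today): `loadCached`, `fetchFresh`, and `render` are placeholder callbacks invented for this example and would have to be wired to the real cache, API call, and drawing code.

```javascript
// Hypothetical helper: draw whatever is cached right away, then draw again
// once the slow network call settles. All three callbacks are placeholders
// supplied by the caller, not functions from the Heatmap or WKOF.
async function renderCachedThenFresh(loadCached, fetchFresh, render) {
  const cached = await loadCached();
  render(cached);                     // first paint from the local cache
  const fresh = await fetchFresh();   // may take many seconds
  if (fresh.length > 0) {
    render(cached.concat(fresh));     // second paint only if something changed
  }
}
```

When the API returns an empty array, as in the reports above, the user only ever sees the first paint and never waits on the server.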

I’m currently in toolchain hell trying to get everything playing together nicely in node for testing as well as actually, you know, working in the browser. I became so annoyed with node not yet supporting fetch and spending so much time on tooling vs. coding that I haven’t worked on it in a week or two.

I’ll bump up the priority with the, uh, dev team.

The plan is to make get_reviews() function identically to the current version. Planned functional improvements:

  • Use indexedDB with an index so you don’t have to retrieve the entire array full of reviews every time. Will probably use Dexie.js for this.
  • Update the cache as you perform reviews, so that when you navigate back to the dashboard it should already be up to date.
  • Provide additional functions for retrieving review-sessions as well as an array of individual reviews.
  • Provide a pub/sub store using svelte semantics (you can subscribe to async updates by passing a function to the subscribe() call).

The last, I think, is what you’re looking for.
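
For readers who have not used Svelte stores, the contract is small: `subscribe(fn)` calls `fn` immediately with the current value, calls it again on every change, and returns a function that unsubscribes. A minimal sketch of that contract follows. `createReviewStore` is a hypothetical name, and this is not the planned Review Cache code.

```javascript
// Minimal store following the Svelte store contract. createReviewStore is a
// hypothetical name used only for this illustration.
function createReviewStore(initial = []) {
  let value = initial;
  const subscribers = new Set();

  return {
    subscribe(fn) {
      subscribers.add(fn);
      fn(value);                           // deliver the current value at once
      return () => subscribers.delete(fn); // caller keeps this to unsubscribe
    },
    set(next) {
      value = next;
      for (const fn of subscribers) fn(value);
    },
  };
}

// Usage: show cached reviews at once, then redraw when an async update lands.
const store = createReviewStore(['cached review']);
const seen = [];
const unsubscribe = store.subscribe((reviews) => seen.push(reviews.length));
store.set(['cached review', 'new review']);
unsubscribe();
store.set([]); // no longer observed by the subscriber above
```

A script like the Heatmap could then pass its redraw function to `subscribe()` and get both the cached data and any later updates through the same callback.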

I also need to perform some experiments with various browsers to determine the best way to compress the data.

Needless to say, it will be a while before I finish all of this.

That would solve the user-perception problem of “what is it doing all this time?”, but TBH, and I don’t want to bag on WK too much, they should spend a bit of time looking at performance optimizations for that API route. The Heatmap has to be one of the 10 most common userscripts, so I’d presume a rather large portion of their long-time users are seeing this.

No hurry, I’m just happy that you’re willing to work on it so that I don’t have to. I’m way too busy learning other things so that I can get a good job.

In their defense, it can be a lot of data (I’ve got over 100,000 review records just for myself). They appear to be doing some sort of server-side caching but with a pretty aggressive cache-expiry policy. The server can take a long time to respond to the first call to the reviews endpoint, but it’s very quick on subsequent calls. “First call” seems to mean the first call in the past several minutes or so.

Better client-side caching and asynchronous/reactive updates are a good idea regardless of server speed due to the sheer amount of data that might need to be transferred.

If I could wish for one server-side improvement, it would be pre-loading the cache with the past few days’ worth of reviews for each user, as client-side caching only helps with older data. It would be nice if requests for recent data were always quick, but it’s reasonable for older records to take longer to retrieve.

I noticed this as well and assumed some sort of caching was being done.
But wouldn’t this just be something along the lines of

SELECT * FROM Reviews WHERE userId = {userId} AND reviewDate > {queryDate}

I obviously don’t know what their db schema looks like, and this is probably way too simplistic, and they probably have a few hundred million reviews. Could they have a few billion? I guess a billion is only 1,000,000 users doing 1,000 reviews each, or half a million doing 2,000 reviews, which doesn’t seem out of the question.

But either way, I would not expect that query’s time to grow roughly linearly with the number of reviews a user has done when that number is small relative to the whole table, where “small” is flexible but can exceed 100,000. I’d expect “give me all of a user’s reviews” to be faster, not counting data transfer time, than “give me this user’s reviews after this timestamp”, but again, not as glacially slow as we are seeing in the anecdotal reports here.

Okay, now in all seriousness. I do seriously hope they have a database in place to hold that data. Now, databases are specifically built to efficiently handle large amounts of data. Give a database 100,000 records and it will be mildly amused. The filter operation is just a date comparison that should run efficiently as well, if they use a suitable datatype for this. The data transfer cannot be the issue either, as I get an empty array back most of the time. So what’s blocking the backend? It’s a real, serious puzzle for me.

(Just for comparison, in my current day job I wrote a React app that gets around 100,000 records of serious data from the backend and holds them in an immutable.js Map in the frontend, and operating on this data in the frontend takes way less than one second, no matter which sort, filter, or edit operations the user performs.)

Fair point! But if the number of reviews seriously is an issue, wouldn’t you split up the database into several, each holding a range of users? After all, the user would be the first filter that’s being applied, like you sketched above. Also, you can tell a database to add an index to a field and this will unleash lots of performance magic.
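
To make the "performance magic" a bit more concrete: an index is essentially a sorted structure, so finding "everything after this timestamp" becomes a binary search instead of a scan over every row. Here is a toy sketch in JavaScript, assuming one user’s review timestamps are already kept sorted. The function names are made up, and nothing here reflects WaniKani’s actual backend.

```javascript
// Toy model of an index on reviewDate for a single user: a sorted array of
// timestamps (ms since epoch). Returns the position of the first entry
// strictly after queryDate using binary search.
function firstIndexAfter(sortedDates, queryDate) {
  let lo = 0;
  let hi = sortedDates.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (sortedDates[mid] > queryDate) {
      hi = mid;       // answer is at mid or to its left
    } else {
      lo = mid + 1;   // answer is strictly to the right of mid
    }
  }
  return lo;
}

// Hypothetical helper: all reviews after queryDate, i.e. the date filter in
// the WHERE clause sketched above.
function reviewsAfter(sortedDates, queryDate) {
  return sortedDates.slice(firstIndexAfter(sortedDates, queryDate));
}
```

For 100,000 entries the loop runs at most 17 times, which is why the number of reviews per user should barely matter for a query that returns an empty array.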

I’m giving them the benefit of the doubt that they don’t have an index on reviewDate. I’d assume they have one on userId, though, which I’d think would make selecting the entire review dataset for any given user relatively efficient.

But what do I know. I just peeked at our production database ’cause I was curious, and its largest table only has about 225,000 records. My day job for most of my career was writing software that slings packets. There was no SQL involved, so you can take my comments for what they are worth. :)
