{"error":"Rate limit exceeded","code":429}

Yep. The API rate limit counter is tied to the API key, and not the IP. Hope this helps.

We are open to adjusting the rate limit as we come to a better understanding of the current setup.

Is there any way to tell which scripts the requests are coming from? If not, maybe in the future APIv2 could optionally take a script id/name for logging purposes or something? I don’t know if that would make sense, let alone help. But maybe there’s something that can be done. :thinking:

We could probably learn a lot from the particular queries being made.

Scripts using Open Framework would generally not see rate-limit errors on the ‘subjects’, ‘assignments’, ‘review_statistics’, and ‘study_materials’ endpoints.

But it doesn’t cache the ‘reviews’ endpoint, since that data can get too large for IndexedDB. So if any scripts are using that endpoint without some kind of cache of their own, those would be likely candidates.

Edit: Another possible culprit would be Firefox in always-private mode, which prevents IndexedDB from working and therefore disables the Open Framework cache. But then again, I think the default is for the framework to fail silently if IndexedDB isn’t working.
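For what it’s worth, a script can probe whether IndexedDB is actually usable before depending on it. A minimal sketch (the probe database name is just an example, and this is not how Open Framework itself does it):

    // Sketch only: probe whether IndexedDB is usable before relying on it.
    // In some private-browsing modes the open request fires an error instead.
    function indexedDbAvailable() {
        return new Promise(function (resolve) {
            try {
                var req = indexedDB.open('idb_probe');   // example probe DB name
                req.onerror = function () { resolve(false); };
                req.onsuccess = function () {
                    req.result.close();
                    indexedDB.deleteDatabase('idb_probe');
                    resolve(true);
                };
            } catch (e) {
                resolve(false);
            }
        });
    }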

The majority of the 429’d requests are hitting the study-queue endpoint. Like, hard. Correct me if I’m wrong, @oldbonsai

There are a few rate limits going on. API v1 and v2 each have a 60 RPM limit, independent of each other. There’s also an overall per-IP throttle, but that’s pretty generous and shouldn’t count API requests.
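Just to illustrate (this isn’t an official client, and the auth header handling is a placeholder), a script that hits one of those 60 RPM limits could back off when it sees a 429 instead of retrying immediately:

    // Illustrative sketch: wait out the per-minute limit when a 429 comes back,
    // instead of hammering the endpoint again right away.
    async function fetchWithBackoff(url, apiKey, attempt) {
        attempt = attempt || 0;
        var resp = await fetch(url, { headers: { 'Authorization': 'Bearer ' + apiKey } });
        if (resp.status === 429 && attempt < 3) {
            // The limit is per minute, so waiting ~60s is a reasonable worst case.
            await new Promise(function (r) { setTimeout(r, 60000); });
            return fetchWithBackoff(url, apiKey, attempt + 1);
        }
        return resp;
    }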

As @viet mentioned, there looks to be some kind of misbehaved tool or script out there that checks the API v1 study-queue endpoint as rapidly as possible. When monitoring our throttling changes, I found a few API keys that were querying that endpoint at ~1500 RPM — above the rate limit, and all from the same IP address. :dizzy_face:

If people wanted to append query arguments to their GET requests, those’d show up in our logs. It’s also something that could be set in the headers, although I’d need to check whether our logging records extra headers without any extra setup on our part.
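To make the query-argument idea concrete (the parameter name below is made up; it just has to appear in the request URL so it lands in the logs):

    // Sketch: tack an identifying query argument onto each request so the
    // script shows up in the server logs. "script_id" is an example name only.
    function tagRequestUrl(baseUrl, scriptName) {
        var url = new URL(baseUrl);
        url.searchParams.set('script_id', scriptName);
        return url.toString();
    }
    // e.g. tagRequestUrl('https://api.wanikani.com/v2/assignments', 'my-stats-script')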

With 60 RPM, that only leaves about 5 requests per script per minute. That should be plenty, since it’s all the same data, but the scripts are probably all fetching it independently and eating up the API limit. (Points quietly at @rfindley’s excellent framework for caching all that goodness as the solution to that problem…)
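As a rough illustration of the caching idea (a generic sketch, not how Open Framework actually stores its data): keep a timestamped copy of a response and only refetch once it is stale.

    // Generic caching sketch: only hit the API again once the stored copy
    // is older than maxAgeMs.
    async function cachedGet(url, apiKey, maxAgeMs) {
        var key = 'cache:' + url;
        var cached = JSON.parse(localStorage.getItem(key) || 'null');
        if (cached && Date.now() - cached.fetchedAt < maxAgeMs) {
            return cached.data;
        }
        var resp = await fetch(url, { headers: { 'Authorization': 'Bearer ' + apiKey } });
        var data = await resp.json();
        localStorage.setItem(key, JSON.stringify({ fetchedAt: Date.now(), data: data }));
        return data;
    }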

Let us know if you figure out which one is causing the problem so we can share it with the author or the community at large. Like @viet said, we’re open to feedback on the limits. We want to find the right balance between keeping server load steady (and therefore the responses speedy) and giving people what they need to use the site and the API. :slight_smile:


Happy to help if I can, but I’ll need some direction in accessing the cache. @rfindley If it wouldn’t be too much trouble, could you point me in the right direction?

First, a list of scripts you’re running.

I’ve started to see this page a few times as well. For me, the story goes like this:

1. I’m doing my reviews when, all of a sudden, I get the screen “You lost connection. Refresh?” - even though I’m on a LAN, so I don’t expect the connection to drop for any length of time, and other apps usually aren’t complaining either.
2. When I click “Refresh”, I get redirected to the error message mentioned in the OP.

I’ve tried to investigate this and checked the background.html network connections to see whether any userscript was hammering the WK API, but I couldn’t see anything suspicious. Unfortunately, the last time it happened I didn’t have my console open, so I can’t report on the requests that were going out from the page itself, but usually there is nothing suspicious in the console during reviews either. :thinking:

I will try to investigate this further, but I thought I’d start to mention it here already in case somebody has any ideas.

Today I remembered to keep the console open during reviews, and I managed to trigger the error again. Interestingly, the rate limit applies not only to API requests but also to requests to the CDN (that wasn’t clear to me, so I had ignored those so far). What my console showed me was this:

I’m seeing around 15 or so of those per radical or kanji review. (Interestingly, they are all 404’s anyway…)

So when I’m pretty fast, I seem to be able to hit the access limit:

As you can see, this is where the 429’s start. Initially that doesn’t matter for the script, as it doesn’t use the requested data anyway. But as soon as I want to switch to the next lesson item, WK itself gets blocked as well, which triggers the error. (Sorry if this is obvious to all of you, I just needed to write it down.)

When I hover over the rightmost entry, the tooltip contains an “onload” entry that points to a userscript. When I click on that, I always end up in @polv’s “WaniKani External Definition” script. The highlighted line is

  var result2 = $('<div />').append(data.responseText).find('#kanjiRightSection p').html();

which doesn’t look as if the script were accessing WK on purpose…?! The rest of the script doesn’t contain any references to WK either. So at that point I got really curious :laughing:

I looked at the responseText and discovered that it is a full HTML page (from Kanjipedia, I guess), and this page contains lots of images - whose URLs are of course unqualified. And that’s where those funny requests came from: the Kanjipedia HTML is being plugged into the WaniKani page, so all the image URLs get resolved against wanikani.com :joy_cat:
(Luckily this means the fix should be simple: just qualify all the image URLs with the correct server name.)
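For the record, a sketch of what such a fix could look like (the origin is assumed to be kanjipedia.jp, and the selector is the one from the script line above). Rewriting the string before it is parsed also keeps the browser from firing the bad requests in the first place:

    // Sketch of the fix: qualify root-relative URLs in the fetched HTML before it
    // is parsed, so the browser never resolves them against wanikani.com.
    // Only handles double-quoted, root-relative src/href values - enough to illustrate.
    var KANJIPEDIA_ORIGIN = 'https://www.kanjipedia.jp';   // assumed source of the page
    var fixedHtml = data.responseText.replace(/(src|href)="\//g, '$1="' + KANJIPEDIA_ORIGIN + '/');
    var result2 = $('<div />').append(fixedHtml).find('#kanjiRightSection p').html();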

I think I’d better deactivate this script for now…


Maybe I shouldn’t do web scraping with client-side JavaScript alone. It would be better to use an API, or to create my own web server.

However, I am not doing WaniKani anymore, so it is hard for me to test this.


Yeah, it apparently needs some kind of post-processing :wink:

If you like, I could try to take over maintenance of the script (a hotfix should be doable now that I have figured out where the problem lies)? I don’t know if it is possible for me to release a new version of the script, though - do you know how that might work?

You might try creating a Greasyfork account and copying the code from here: WaniKani JJ External Definition - Source code

You can filter out the img tags using a regex, without jQuery, like this:

// "is" flags: case-insensitive, and let "." span newlines; <img> is a void element, so strip just the tag itself
html.match(new RegExp(`<ul[^>]*id=['"]?${id}['"]?[^>]*>(.*?)<\/ul[^>]*>`, "is"))[0]
        .replace(/<img[^>]*>/gi, "")

That part is clear to me so far, although I would rather lean towards rewriting the image URLs to be fully qualified, so one can actually see the images (if they are not too large, that is). I haven’t looked into it yet; I’ll experiment over the weekend, I guess.

I was wondering more about how to publish the new version on Greasyfork under the same script name (if that’s possible at all).
I assume the script “belongs to you” on Greasyfork, so that not just any arbitrary person can replace it?

You can create a new script with the same name, because every script gets a different prefix number, which is the “35970” in polv’s link above. People will have to manually install from your version’s url the first time, but after that it should update automatically from the new url (if the user has updates enabled, of course).
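For anyone unfamiliar with how the auto-update part works: script managers compare the @version in the metadata block and pull new versions from the URL the script was installed from, or from an explicit @updateURL / @downloadURL. A sketch with placeholder values:

    // ==UserScript==
    // @name         WaniKani External Definition
    // @version      0.6
    // @updateURL    https://greasyfork.org/scripts/XXXXX/code/example.user.js
    // @downloadURL  https://greasyfork.org/scripts/XXXXX/code/example.user.js
    // ==/UserScript==
    // (XXXXX and the file name above are placeholders for the real Greasyfork script URL.)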


Thank you! I will do that then.

Okay, I have just created a pseudo-API for the userscript, see here: GitHub - patarapolw/wanikani-userscript: A GitHub source for `polv`'s userscripts

As I cannot test the userscripts myself, I probably won’t take it any further.

I might explain how to use Node.js modules in a userscript in another topic sometime later.

Hint: I used Cheerio.
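A minimal sketch of what that kind of server-side processing can look like with Cheerio (the selector is the one from the original script; the function itself is just an illustration, not the actual repo code):

    // Sketch: do the scraping/post-processing on a server with Cheerio instead
    // of in the browser, so the HTML never gets parsed inside the WaniKani page.
    const cheerio = require('cheerio');

    function kanjipediaDefinition(html) {
        const $ = cheerio.load(html);
        // Make root-relative image URLs absolute on the server side.
        $('#kanjiRightSection img').each(function (i, el) {
            const src = $(el).attr('src');
            if (src && src.startsWith('/')) {
                $(el).attr('src', 'https://www.kanjipedia.jp' + src);
            }
        });
        return $('#kanjiRightSection p').html();
    }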

As a first step, I have released version 0.6 of the WaniKani External Definition script today. As the original post for the script has since been closed, I will create a new one.

I’ve been getting this fairly constantly. So it’s caused by add-ons making too many requests?
I have a few add-ons that aren’t essential, so I’ll try turning those off.

I have not encountered this from any script other than the old External Definition script, but yes, that’s usually the root cause.
You may want to look at the browser console to get an idea of what might be triggering these requests.
