subject_ids isn’t a valid filter on the subjects endpoint. Unless I’m mistaken, you’ll have to request one subject at a time, like /subjects/1, /subjects/2 and so on.
@rfindley Comma-separating the IDs only returns the first item.
I’m working on a client-side userscript that caches all of the /subjects data in IndexedDB. It uses incremental updates, so it will be super efficient. Other userscripts will be able to include mine as a @require.
It’s going to be a little while before it’s all done, though.
Seems like a major oversight on our part. I think we had a reason not to include a subject_ids filter on the /subjects endpoint, but it doesn’t make sense to exclude it now.
We’ve been working on some changes to pagination for v2, and I wanted to send out a general warning that we’re going to be pushing out some breaking changes later this week.
We’ll respond with an empty data set if you try to go too far forward or too far back.
We’ll respond with null for the appropriate pagination URLs if you’re at the beginning or the end of the result set.
The value for page_before_id/page_after_id is the ID of the record that acts as the cursor.
If you want to prep for this change, you can send the new request parameters along with the old ones — we’ll simply ignore any parameters we don’t yet support.
It’s far more performant when there are a ton of records and you’re further into the results, and this handles some of the weirdness of pagination when records can appear, disappear, or move around in the results.
We know there are some disadvantages to this scheme, namely that it’s hard/impossible to get the pages in parallel. We’re hoping that the performance gains from this (and ongoing work on performance) will let us bump the per page limit, reducing the number of calls necessary.
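A minimal sketch of what walking these cursors looks like for a client. The field names here (data, next_cursor) and the fetchPage helper are illustrative assumptions, not the documented response shape; a real fetchPage would send page_after_id=cursor to the endpoint:

```javascript
// Sketch only: fetchPage(cursor) stands in for a real API call that
// would send page_after_id=cursor; field names are assumptions.
async function fetchAll(fetchPage) {
  const all = [];
  let cursor = null;           // null = start from the beginning
  do {
    const page = await fetchPage(cursor);
    all.push(...page.data);
    cursor = page.next_cursor; // ID of the last record on this page, or null at the end
  } while (cursor !== null);
  return all;
}

// Mocked three-page result set to show the walk:
const pages = {
  null: { data: [{ id: 1 }, { id: 2 }], next_cursor: 2 },
  2:    { data: [{ id: 3 }, { id: 4 }], next_cursor: 4 },
  4:    { data: [{ id: 5 }],            next_cursor: null },
};
```

Each page’s last ID becomes the cursor for the next request, which is why the walk is inherently sequential.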
Let us know what you think. It’s a reasonably common scheme, but it’s not quite as easy to process (mentally or programmatically) as ye olde page numbering.
This answers some latent questions that I had so far left unasked.
I’ve done a lot of projects where the “weirdness of pagination” you describe would be detrimental, so it’s sort of second nature for my mind to go there.
Losing this ability is quite unfortunate, and I see this as a net loss unless “far more performant” means >4x increase in throughput.
Another, slightly less obvious downside: we also lose the ability to give meaningful progress information, since we no longer know how many pages there will be until we reach the final one.
This is just an idea and we’d need to see how costly it is, but I think a good middle ground would be to include a list of the “stepping” cursors with the result. That would address the parallelization.
This can be calculated from the total number of results / max size of the page. Obviously not as simple as the current provided info of what the max page is.
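A hedged sketch of that middle ground: the page count follows from the totals, and a server-supplied cursor list would let pages be fetched in parallel with Promise.all. The names here (fetchPage, cursors as plain IDs) are assumptions for illustration, not anything the API actually returns:

```javascript
// Page count from the totals mentioned above.
function pageCount(totalCount, perPage) {
  return Math.ceil(totalCount / perPage);
}

// If the server returned one "stepping" cursor per page, the pages could
// be requested all at once instead of walked one by one. fetchPage is a
// stand-in for a real request sending page_after_id=cursor.
async function fetchParallel(cursors, fetchPage) {
  const pages = await Promise.all(cursors.map(fetchPage));
  return pages.flatMap(p => p.data);
}
```

For example, 103 results at 25 per page is 5 pages, so the server would hand back 5 stepping cursors.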
I don’t really want to start a new topic for this, but would it be too difficult to create an additional cookie just for the API key, or to have it included in jStorage?
Right now, to get a simple userscript running you have to: fetch the API key from the account page, handle possible errors, validate it (optionally), store it, make a request, and check for the user_not_found error; only then can you work with your data. And if the API key changes, you do it all over again. On top of that, you have to somehow display possible errors to the user.
That’s a bit excessive for something that is going to run maybe once or twice, but every API-dependent userscript has to deal with this insanity.
I do think that would be pretty fantastic for scripts that live on WK itself (as opposed to external websites and apps).
As it is now, I wonder how many scripts are simultaneously trying to fetch the API key when it’s not present… though I do know of at least a few scripts that just let other scripts populate the apikey, or require you to manually paste it in the script itself.
The reasoning is this: APIs are meant to be used by external services. Users need to opt-in to/authorize those external services, and they do so by providing that API key explicitly (or authorizing through something like OAuth, which we might do someday). “External services” include userscripts — even though the scripts are running within the context of our site, they’re external to the application.
We’re not trying to make anybody’s life harder, but it’s the way almost all APIs are designed. It’s always up to the developers who use those APIs to figure out nice ways of handling authentication.
Keep up the good work, though, and let us know if you have any other thoughts about API v2.
Oh wow. This is terrible for my particular use of the WK API. Previously I could get crazy-fast parallel response times: instead of, say, 21 sequential calls to review_statistics (at 0.40 to 0.63 seconds each), the total would be the worst response time plus a few milliseconds. See here for how much faster that can be for the end user (in this case a cache miss): 1.056 seconds in total for about 40 calls, where the slowest individual response was 1.044 seconds.
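The arithmetic behind that (parallel calls cost roughly the slowest single call, not the sum) can be demonstrated with plain timers; delay here is just a stand-in for an API request:

```javascript
// Demonstration that parallel calls finish in roughly the slowest call's
// time, not the sum of all calls. delay(ms) stands in for one API request.
const delay = ms => new Promise(resolve => setTimeout(resolve, ms));

async function timeAll(tasks) {
  const start = Date.now();
  await Promise.all(tasks.map(fn => fn()));
  return Date.now() - start;
}

// Three "requests" of 50, 100, and 200 ms finish together in about 200 ms,
// where a sequential loop would take about 350 ms.
```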
Edit: I’ve switched my code over to use the new pagination that doesn’t allow for parallelization. I must say that while many requests sometimes need to be made in sequence, I’m seeing a much faster median response time per request (more like 170 ms, rather than 750 ms). Well done!