API V2 Beta Documentation

subject_ids isn’t a valid filter on the subjects endpoint. Unless I’m mistaken, you’ll have to request one subject at a time, like /subjects/1, /subjects/2 and so on.
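Until a subject_ids filter exists, the workaround is one request per subject. A minimal sketch in Python; the URL shape matches the endpoint mentioned above, but the `fetch` callable is a stand-in for whatever HTTP client you use (e.g. requests with your token in the Authorization header):

```python
# Sketch: fetch subjects one at a time, since /subjects has no
# subject_ids filter yet. Only the URL shape is taken from the thread;
# fetch() is whatever authenticated HTTP getter you already have.

BASE = "https://www.wanikani.com/api/v2"

def build_url(subject_id):
    """Build the per-subject URL, e.g. .../subjects/1."""
    return f"{BASE}/subjects/{subject_id}"

def fetch_subjects(subject_ids, fetch):
    """fetch is any callable taking a URL and returning parsed JSON."""
    return [fetch(build_url(sid)) for sid in subject_ids]
```

With requests this might be `fetch = lambda url: requests.get(url, headers={"Authorization": "Token token=YOUR_KEY"}).json()`.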

@rfindley Comma-separating the IDs only returns the first item.

EDIT: Encoding the commas as %2C doesn’t help.

1 Like

That’s annoying. Is there an easier way of getting all this data?

1 Like

Do you know of an easier way to get all this data, then?

1 Like

I’m working on a client-side userscript that caches all of the /subjects data in IndexedDB. It uses incremental updates, so it will be super efficient. Other userscripts will be able to include mine as a @require.

It’s going to be a little while before it’s all done, though.

2 Likes

Seems like a major oversight on our part. I think we had a reason not to include a subject_ids filter on the /subjects endpoint, but it doesn’t make sense to exclude it now.

We’ll implement it.

5 Likes

We’ve been working on some changes to pagination for v2, and I wanted to send out a general warning that we’re going to be pushing out some breaking changes later this week.

Short story

We’re going to a cursor-based pagination scheme.

Long story

Requests will look like:

https://www.wanikani.com/api/v2/assignments?page_after_id=12345678
https://www.wanikani.com/api/v2/assignments?page_before_id=12345678

The pagination section of the response will look like:

pages: {
    next_url: :string_or_null,
    previous_url: :string_or_null 
}

Notes

  • We’ll respond with an empty data set if you try to go too far forward or too far back.
  • We’ll respond with nulls on the appropriate URLs if you’re at the beginning or the end of the result set.
  • The value for the page_(before/after)_id is the ID of the record that acts as the cursor.
  • If you want to prep for this change, you can send those new request parameters along with the old ones — we’ll ignore the ones we don’t support when we don’t support them. :wink:
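Client-side, the consumption pattern for this scheme is: request a page, read pages.next_url, repeat until it is null. A sketch in Python, where `fetch` is a stand-in for an authenticated HTTP getter returning the parsed JSON body:

```python
def fetch_all(first_url, fetch):
    """Walk a cursor-paginated collection by following pages.next_url
    until it is null. fetch(url) must return the parsed JSON body
    in the shape shown above (a "data" list and a "pages" object)."""
    items, url = [], first_url
    while url is not None:
        page = fetch(url)
        items.extend(page["data"])
        url = page["pages"]["next_url"]
    return items
```

The same loop run backwards over previous_url works symmetrically, since the beginning of the result set reports null there.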

Cursor-based pagination is far more performant when there are a ton of records and you’re deep into the results, and it handles some of the weirdness of pagination when records can appear, disappear, or move around in the result set.

We know there are some disadvantages to this scheme, namely that it’s hard or impossible to fetch the pages in parallel. We’re hoping that the performance gains from this (and ongoing work on performance) will let us bump the per-page limit, reducing the number of calls necessary.

Thoughts?

Let us know what you think. It’s a reasonably common scheme, but it’s not quite as easy to process (mentally or programmatically) as ye old page numbering.

4 Likes

For those interested in reading up on SQL pagination techniques, here are some resources:

3 Likes

:+1:
I’m looking forward to giving it a try.

This answers some latent questions that I had so far left unasked.
I’ve done a lot of projects where the “weirdness of pagination” that you describe would be detrimental, so it’s sort of second-nature for my mind to go there :slight_smile:

1 Like

Losing this ability is quite unfortunate, and I see this as a net loss unless “far more performant” means >4x increase in throughput.

Another slightly less obvious downside is that we also lose the ability to give meaningful progress information, since we no longer know how many pages there will be until we reach the final one.

1 Like

This is just an idea and we’d need to see how costly it is, but I think a good middle ground would be to include a list of “stepping” cursors with the result. That would address the parallelization problem.
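Hypothetically, such a list could be derived server-side from the ordered record IDs: every per_page-th ID becomes a cursor that a client can hand to an independent parallel request. A sketch of the computation (the function name is mine, not part of the API):

```python
def stepping_cursors(sorted_ids, per_page):
    """Return the page_after_id cursor for each page boundary.
    Page 1 needs no cursor; page n starts after the last ID of page n-1,
    i.e. sorted_ids[(n - 1) * per_page - 1]."""
    return [sorted_ids[i - 1] for i in range(per_page, len(sorted_ids), per_page)]
```

A client given these cursors could fire one page_after_id request per cursor (plus one cursorless request for page 1) concurrently.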

1 Like

This can be calculated as the total number of results divided by the maximum page size, rounded up. Obviously not as simple as the currently provided info of what the max page is.
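Concretely, assuming the collection header exposes a total count (it is total_count in the v2 responses I’ve seen) alongside the per_page value shown in the pages block:

```python
import math

def expected_pages(total_count, per_page):
    """Number of pages a full cursor walk will take, rounded up.
    Useful for progress reporting even without page numbers."""
    return math.ceil(total_count / per_page)
```

Progress is then pages_fetched / expected_pages(total_count, per_page), which degrades gracefully if records are added or removed mid-walk.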

2 Likes

Ah, yes. I had forgotten the total item count is in the header of the collection. That’ll work.

1 Like

I don’t really want to start a new topic for this, but would it be too difficult to create an additional cookie just for the API key, or to have it included in jStorage?

Right now, to get a simple userscript running you have to: fetch the API key from the account page, handle possible errors, validate it (optionally), store it, make a request, and check for the user_not_found error; only then can you work with your data. If the API key changes, you do it all over again. Additionally, you have to somehow display possible errors to the user.
That’s a bit excessive for something that will run maybe once or twice, yet every API-dependent userscript has to deal with this insanity.
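The ritual described above, sketched in Python for clarity (a real userscript would do this in JavaScript against the account page and jStorage/localStorage; the three callables are stand-ins for scraping, storage, and a validation request):

```python
def get_api_key(storage, scrape_key, validate):
    """Return a working API key, refreshing the cached one if it fails.
    storage:    dict-like cache (stand-in for jStorage/localStorage)
    scrape_key: pull the key from the account page (may itself fail)
    validate:   True if the API accepts the key (no user_not_found)"""
    key = storage.get("api_key")
    if key and validate(key):
        return key
    key = scrape_key()  # error handling for the scrape is still on you
    if not validate(key):
        raise RuntimeError("scraped API key was rejected")
    storage["api_key"] = key
    return key
```

Even condensed like this, it is a lot of ceremony compared to the site simply exposing the key to scripts running on its own pages.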

1 Like

Awesome! You guys are the best!

1 Like

@anon72902257,
Did you mean:

I do think that would be pretty fantastic for scripts that live on WK itself (as opposed to external websites and apps).

As it is now, I wonder how many scripts are simultaneously trying to fetch the API key when it’s not present… though I do know of at least a few scripts that just let other scripts populate the API key, or require you to paste it into the script manually.

1 Like

Yes. I think it should be easy enough, but maybe there’s a legitimate reason to make people’s lives harder ¯\_(ツ)_/¯

1 Like

We’re most likely not going to do this.

The reasoning is this: APIs are meant to be used by external services. Users need to opt in to and authorize those external services, and they do so by providing that API key explicitly (or by authorizing through something like OAuth, which we might do someday). “External services” include userscripts: even though the scripts run within the context of our site, they’re external to the application.

We’re not trying to make anybody’s life harder, but it’s the way almost all APIs are designed. It’s always up to the developers who use those APIs to figure out nice ways of handling authentication.

Keep up the good work, though, and let us know if you have any other thoughts about API v2. :slight_smile:

5 Likes

Aaaand it’s live! Let us know if there are any questions on the new pagination scheme!

2 Likes

Oh wow. This is terrible for my particular use of the WK API. Previously I could get crazy-fast parallel response times: instead of, say, 21 sequential calls to review_statistics (at 0.40 to 0.63 seconds each), the total would be the worst single response time plus a few milliseconds. See the log below for how much faster that can be for the end user (in this case a cache miss): 1.056 seconds in total for about 40 calls, where the slowest individual response was 1.044 seconds:

2017-11-17T02:56:40.558428802Z app[web.1]: 0.251375: https://wanikani.com/api/v2/summary
2017-11-17T02:56:40.576694283Z app[web.1]: 0.269222: https://wanikani.com/api/v2/user
2017-11-17T02:56:40.715775608Z app[web.1]: 0.408080: https://wanikani.com/api/v2/review_statistics?page=21
2017-11-17T02:56:40.728467504Z app[web.1]: 0.417233: https://wanikani.com/api/v2/review_statistics?page=19
2017-11-17T02:56:40.752233242Z app[web.1]: 0.442377: https://wanikani.com/api/v2/review_statistics?page=12
2017-11-17T02:56:40.758368732Z app[web.1]: 0.448057: https://wanikani.com/api/v2/review_statistics?page=15
2017-11-17T02:56:40.779070662Z app[web.1]: 0.470245: https://wanikani.com/api/v2/review_statistics?page=6
2017-11-17T02:56:40.794019963Z app[web.1]: 0.485666: https://wanikani.com/api/v2/review_statistics?page=3
2017-11-17T02:56:40.811820229Z app[web.1]: 0.502195: https://wanikani.com/api/v2/review_statistics?page=11
2017-11-17T02:56:40.820313039Z app[web.1]: 0.510985: https://wanikani.com/api/v2/review_statistics?page=9
2017-11-17T02:56:40.833479667Z app[web.1]: 0.524795: https://wanikani.com/api/v2/review_statistics?page=5
2017-11-17T02:56:40.851473888Z app[web.1]: 0.540107: https://wanikani.com/api/v2/review_statistics?page=20
2017-11-17T02:56:40.858738879Z app[web.1]: 0.549531: https://wanikani.com/api/v2/review_statistics?page=8
2017-11-17T02:56:40.885855718Z app[web.1]: 0.576844: https://wanikani.com/api/v2/review_statistics?page=7
2017-11-17T02:56:40.916933862Z app[web.1]: 0.608831: https://wanikani.com/api/v2/review_statistics?page=1
2017-11-17T02:56:40.922829987Z app[web.1]: 0.612945: https://wanikani.com/api/v2/review_statistics?page=13
2017-11-17T02:56:40.927967465Z app[web.1]: 0.619802: https://wanikani.com/api/v2/review_statistics?page=2
2017-11-17T02:56:40.933256746Z app[web.1]: 0.622409: https://wanikani.com/api/v2/review_statistics?page=17
2017-11-17T02:56:40.938656740Z app[web.1]: 0.629200: https://wanikani.com/api/v2/review_statistics?page=10
2017-11-17T02:56:40.975947683Z app[web.1]: 0.664255: https://wanikani.com/api/v2/assignments?page=2
2017-11-17T02:56:41.001338592Z app[web.1]: 0.689579: https://wanikani.com/api/v2/assignments?page=3
2017-11-17T02:56:41.043199637Z app[web.1]: 0.734606: https://wanikani.com/api/v2/review_statistics?page=4
2017-11-17T02:56:41.055108629Z app[web.1]: 0.742043: https://wanikani.com/api/v2/assignments?page=16
2017-11-17T02:56:41.065736533Z app[web.1]: 0.754678: https://wanikani.com/api/v2/review_statistics?page=18
2017-11-17T02:56:41.075012527Z app[web.1]: 0.763536: https://wanikani.com/api/v2/assignments?page=1
2017-11-17T02:56:41.079506185Z app[web.1]: 0.769463: https://wanikani.com/api/v2/review_statistics?page=14
2017-11-17T02:56:41.084677377Z app[web.1]: 0.772831: https://wanikani.com/api/v2/assignments?page=4
2017-11-17T02:56:41.089543776Z app[web.1]: 0.777474: https://wanikani.com/api/v2/assignments?page=6
2017-11-17T02:56:41.094865061Z app[web.1]: 0.782907: https://wanikani.com/api/v2/assignments?page=5
2017-11-17T02:56:41.134776950Z app[web.1]: 0.826525: https://wanikani.com/api/v2/assignments?page=18
2017-11-17T02:56:41.202074334Z app[web.1]: 0.891531: https://wanikani.com/api/v2/review_statistics?page=16
2017-11-17T02:56:41.222351306Z app[web.1]: 0.909805: https://wanikani.com/api/v2/assignments?page=8
2017-11-17T02:56:41.290977298Z app[web.1]: 0.978310: https://wanikani.com/api/v2/assignments?page=10
2017-11-17T02:56:41.298248166Z app[web.1]: 0.985805: https://wanikani.com/api/v2/assignments?page=7
2017-11-17T02:56:41.302488441Z app[web.1]: 0.989800: https://wanikani.com/api/v2/assignments?page=12
2017-11-17T02:56:41.309573015Z app[web.1]: 0.996880: https://wanikani.com/api/v2/assignments?page=11
2017-11-17T02:56:41.313524224Z app[web.1]: 1.000745: https://wanikani.com/api/v2/assignments?page=13
2017-11-17T02:56:41.317789580Z app[web.1]: 1.004859: https://wanikani.com/api/v2/assignments?page=15
2017-11-17T02:56:41.321781246Z app[web.1]: 1.008918: https://wanikani.com/api/v2/assignments?page=14
2017-11-17T02:56:41.350864973Z app[web.1]: 1.037726: https://wanikani.com/api/v2/assignments?page=17
2017-11-17T02:56:41.357209612Z app[web.1]: 1.044632: https://wanikani.com/api/v2/assignments?page=9
2017-11-17T02:56:41.363361610Z app[web.1]: [GIN] 2017/11/17 - 02:56:41 | 200 |  1.056432854s |   113.37.82.201 | GET      /srs/status?api_key=XXXX
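For reference, the fan-out in the log above is trivial under the old numbered-page scheme. A sketch with a thread pool; `fetch` is a stand-in for an authenticated HTTP GET (the log above comes from a Go service, but the pattern is the same in any language):

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_pages_parallel(base_url, num_pages, fetch, workers=8):
    """Fetch ?page=1..num_pages concurrently. Total wall time is roughly
    the slowest single response rather than the sum of all of them."""
    urls = [f"{base_url}?page={n}" for n in range(1, num_pages + 1)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with pages.
        return list(pool.map(fetch, urls))
```

Cursor pagination removes exactly this option, because page n's URL is unknown until page n-1 has been fetched.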

Edit: I’ve switched my code over to the new pagination, which doesn’t allow for parallelization. I must say that while many requests sometimes still need to be made in serial, I’m seeing a much faster median response time per request (more like 170 ms rather than 750 ms). Well done! :dancing_men:

1 Like

@viet I’m seeing a reproducible 500 error from the WK API. I’m not sure when it started happening:

curl -X GET -H 'Authorization: Token token=747731af-c6e2-4a6d-801b-b54514dec7d2' 'https://www.wanikani.com/api/v2/review_statistics?page_after_id=87365227'

responds with:

{"status":500,"error":"Internal Server Error"}

Note that this link came from the successful previous page’s next_url:

curl -X GET -H 'Authorization: Token token=747731af-c6e2-4a6d-801b-b54514dec7d2' 'https://www.wanikani.com/api/v2/review_statistics?page_after_id=85343379'

{
  "object":"collection", 
  "url":"https://www.wanikani.com/api/v2/review_statistics?page_after_id=85343379",
  "pages":{
    "per_page":250,
    "next_url":"https://www.wanikani.com/api/v2/review_statisticspage_after_id=87365227",
    "previous_url":"https://www.wanikani.com/api/v2/review_statistics?page_before_id=85343394"
  },
  ...
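One detail worth flagging in the response quoted above: the next_url appears to be missing the ? before its query string (review_statisticspage_after_id=…), which a quick parse makes visible. The URL below is copied exactly from the response body:

```python
from urllib.parse import urlparse

# The next_url exactly as it appears in the quoted response body.
next_url = "https://www.wanikani.com/api/v2/review_statisticspage_after_id=87365227"

parsed = urlparse(next_url)
# With the '?' missing, the cursor parameter is fused into the path
# and there is no query string at all.
assert parsed.query == ""
assert parsed.path.endswith("review_statisticspage_after_id=87365227")
```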

1 Like