IndexedDB has a storage limit of about 50 MB, but it’s not as easy to use. There are libraries you can use to make it easier.
I imagine you’d wrap the ajax call in a while loop driven by a flag variable. Outside the function, you’ll need to define a variable holding the starting URL and an empty data array.
Inside the while loop, make the ajax call with the URL variable, push the data from the response body onto the array, and then check whether pages.next_url exists. If it does, leave the flag alone and assign pages.next_url to the URL variable. If pages.next_url does not exist, toggle the flag to false and the loop finishes.
That’ll be my first approach to getting through the pagination.
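The flag-driven loop described above can be sketched like so (in Python for brevity; fetch_page is a stand-in for the real ajax/HTTP call and just simulates two pages of results):

```python
def fetch_page(url):
    # Stand-in for the real ajax/HTTP call; returns a parsed response body.
    pages = {
        'page1': {'data': [1, 2], 'pages': {'next_url': 'page2'}},
        'page2': {'data': [3], 'pages': {'next_url': None}},
    }
    return pages[url]

url = 'page1'       # starting URL, defined outside the loop
all_data = []       # accumulator for every page's data
has_more = True     # flag variable driving the loop

while has_more:
    body = fetch_page(url)
    all_data.extend(body['data'])
    next_url = body['pages']['next_url']
    if next_url:
        url = next_url      # leave the flag alone, follow the next page
    else:
        has_more = False    # no more pages: toggle the flag, loop ends

print(all_data)  # → [1, 2, 3]
```

The same shape works with any HTTP client; only fetch_page changes.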
I switched to Python instead. Really, I just wanted something that could save the results to a file.
import requests
import json

def get_pretty_print(json_object):
    return json.dumps(json_object, sort_keys=True, indent=4, separators=(',', ': '))

apiKey = 'XXXXX'
url = 'https://www.wanikani.com/api/v2/'
endpoint = 'review_statistics'
parameters = ''
HEADERS = {'Authorization': 'Bearer {}'.format(apiKey)}

with requests.Session() as s:
    s.headers.update(HEADERS)
    r = s.get(url + endpoint + parameters)
    with open('stat.tsv', 'w') as fout:
        j = 0
        while True:
            j += 1
            body = r.json()  # parse once per page instead of on every access
            for i, item in enumerate(body['data']):
                print('Printing line', i, j)
                fout.write(str(item['data']['subject_id']))
                fout.write('\t')
                fout.write(json.dumps(item['data']))
                fout.write('\n')
            full_url = body['pages']['next_url']
            if full_url is None:
                break
            print('Opening', full_url)
            r = s.get(full_url)
The response code when the rate limit is triggered has been updated to 429. It was previously a 403.
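For scripts hitting the API in a loop, a minimal sketch of coping with the new 429 status (generic exponential backoff, not WaniKani-specific; fetch_with_retry and its parameters are my own names):

```python
import time

def fetch_with_retry(get, url, max_retries=3, backoff=1.0):
    """Call get(url) and retry with exponential backoff on HTTP 429."""
    for attempt in range(max_retries):
        r = get(url)
        if r.status_code != 429:
            return r
        time.sleep(backoff * (2 ** attempt))  # wait before retrying
    return r  # still rate-limited after max_retries attempts
```

You’d pass in requests.get or a session’s get, e.g. fetch_with_retry(s.get, full_url).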
FYI, the /subjects endpoint’s type filter doesn’t seem to accept multiple types. I wasn’t sure if that was intentional.
https://www.wanikani.com/api/v2/subjects?type=radical,kanji&levels=1
=> 422 Unprocessable Entity
https://www.wanikani.com/api/v2/subjects?type=radical&levels=1
=> 200 OK
https://www.wanikani.com/api/v2/subjects?type=kanji&levels=1
=> 200 OK
That’s how it was specced out and built. If the filter name is not pluralized, it’s a safe assumption that it can only take one entry.
I am assuming there are use cases for multiple types?
Ahh, I’m sure I read that, but apparently forgot.
In my Burn Manager script, when the user is selecting what items they want to select from for resurrection or retirement, they can select by various criteria such as level and type. Under the old API, I’m just sending multiple requests, so it’s no problem to continue doing so. I’m content with APIv2 as-is.
Gotcha. We had a quick discussion about this and we see nothing wrong with expanding out type to be types. We’ll refactor and have it accept multiple types. Will let you know when it is live.
We’ll also cascade this change to endpoints which take in subject_type
So just a heads up, next round of updates coming out soonish will have subject_type filter on Assignment, Review Statistics, and Study Material endpoints updated to subject_types. And the Subject endpoints type filter will be updated to types. This will be a breaking change.
Updates:
- subject_type filters for the assignments, review_statistics, and study_materials endpoints have been converted to subject_types, which can take in a comma-delimited list.
- The type filter for the subjects endpoint has been converted to types, which can take in a comma-delimited list.
- Added max_level_granted_by_subscription to the user endpoint. It returns the maximum level access the user has, based on their subscription status. At the moment it’s 3 for freemium and 60 for paid users.
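With the plural filters live, a multi-type request can be built like this (a sketch reusing the URL from earlier in the thread; urlencode’s safe parameter just keeps the comma readable):

```python
from urllib.parse import urlencode

base = 'https://www.wanikani.com/api/v2/subjects'
params = {'types': ','.join(['radical', 'kanji']), 'levels': '1'}
full_url = base + '?' + urlencode(params, safe=',')
print(full_url)
# → https://www.wanikani.com/api/v2/subjects?types=radical,kanji&levels=1
```

The previously rejected radical,kanji combination should now come back 200 OK.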
Does this imply there will be additional options down the line? Or was it done this way (as opposed to just returning subscription status) as a convenient way of getting the max level for looping API requests and to gracefully handle if more levels are added or become free?
Maybe. Nothing planned right now.
Yep, you can use it for this purpose.
We plan on updating the terms and conditions for the API, which will have some clauses on what content can be displayed to the user. This attribute will help the developer respect the T&C and to make the capped levels explicit in the API.
Great! I’m looking forward to this.
Hopefully, the T&C will include a lot of detail about which stuff is considered proprietary. Even just listing what kanji are on each level could theoretically be an issue, because there’s a surprising amount of value (and work!) just in the curated organization of the items by level.
I’m seeing really slow query times on the /review endpoint.
- Without any filters, it times out after 30-60s
- With a single value for subject_ids, it’s about 0.5 to 2s. (depending on whether there’s data for that subject_id)
We’ll take a look at it Friday. Sorry about that.
No problem. I’m just exercising my framework at this point, and finally getting a real feel for what the API has to offer. Looking great!
On endpoint /summary, what does result.data.review_subject_ids signify?
I have 3 items in [/summary]->data.review_subject_ids that aren’t in [/subjects],
and 33 items in [/subjects] that aren’t in [/summary]->data.review_subject_ids.
Here’s what I’m doing:
// Fetch all /subjects data
// [not shown]
// Extract the ids
var subject_ids = result.data.map(item => item.id);
console.log(subject_ids.length); // 8792
// Fetch all /summary data
// [not shown]
// Extract the ids
var summary_ids = result.data.review_subject_ids;
console.log(summary_ids.length); // 8762
Then I compare the two arrays:
// Items in /summary that aren't in /subjects
summary_ids.filter(item => subject_ids.indexOf(item) === -1)
--> (3) [3877, 6390, 7633]
// Items in /subjects that aren't in /summary
subject_ids.filter(item => summary_ids.indexOf(item) === -1)
--> (33) [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 18, 19, 20,
21, 22, 23, 25, 443, 6956, 7280, 7453, 7457, 7461, 7715, 8761, 8762, 8763]
[edit]
Six of the 33 subject_ids that aren’t in review_subject_ids are new items from June 2017 (up for Burn review in 5 days). The rest are all of the level 1 radicals, which doesn’t make sense to me.
data.review_subject_ids contains the subjects currently in the API key owner’s review queue. They should map to subject resources by id. /subjects, by contrast, is a collection endpoint of all the published subjects (radicals, kanji, and vocabulary) on WK.
I’ll have to look into the IDs from review_subject_ids that aren’t matching anything from /subjects; it could be we forgot to scope it down on our end (possibly unpublished/removed subjects…). Is this data under your API key?
Also, a bit of a tangent: we reviewed the /reviews performance on Friday and made some db query and index improvements. The changes will be on production Monday.
Yes, it’s under my API key.
Should I have 8762 items in my review_subject_ids? All but 6 items are burned (and not resurrected).
