Autoplay Audio is slow

Will this take more time than the previous version? There are times where I’ve submitted my answers and it’s been a minute but nothing’s played back yet :frowning:

I’ve even tried leaving it for an hour, and nothing played back. My internet connection is pretty solid: latency of 6ms on average, and consistently faster than in any other country I’ve been to so far.

For me, I’m at the stage where I’m trying to get the pronunciation right. I don’t want to end up mispronouncing my 端 and 箸 and 橋 :joy:

Thanks for your input, though. I just cleared 100 items and only one of them had audio feedback. Sigh.


Is it possible to get the old behavior of pre-loading everything reverted?

The CDN doesn’t really speed things up. It works sometimes, but at other times it results in a long delay (in the range of 1s).

Personally, this issue is taking away from the good experience that I previously had - I just want the audio to play immediately after I answer.

The way that it was done in the past was working great.


Here’s additional evidence that there’s something wrong with the CDN.

Sometimes, it takes me 882ms to load the audio, which is why the audio feels delayed.
(I’ve seen cases where it took > 1s, but didn’t manage to capture it)


Pinging the CDN yields 3ms response, so this shouldn’t be a network issue.
Instead, the CDN is just slow to respond.

Pinging 18.155.68.83 with 32 bytes of data:
Reply from 18.155.68.83: bytes=32 time=3ms TTL=248
Reply from 18.155.68.83: bytes=32 time=3ms TTL=248
Reply from 18.155.68.83: bytes=32 time=3ms TTL=248
Reply from 18.155.68.83: bytes=32 time=3ms TTL=248

I think we need to have either:

  • A method to cache the audio files locally in the user’s browser. (Maybe some kind of Cache-Control: public, max-age=XXX header in the response? I have no idea if this will work.)
  • Pre-load all the audio files when the user hits the page.

Thank you for the extra info. I see that the request was a RefreshHit from CloudFront, which means it went back to S3, and S3 seems to be far away from your edge location, hence the longer response time.

I investigated this and found that we weren’t setting the Cache-Control header in the response from the origin server, so the CDN (CloudFront in this instance) was using the default TTL of 24 hours. I have updated the origin server to return a far-future expiration (1 year), so hopefully that will resolve the issue.
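To make the "far future expiration (1 year)" concrete, here is a minimal sketch of what such a header value could look like. The helper name `farFutureCacheControl` is hypothetical, and the `public` directive is an assumption borrowed from the suggestion earlier in the thread; the exact value the origin server now returns is not stated here.

```javascript
// Hypothetical helper: builds a Cache-Control value for long-lived static audio.
// One year is expressed in seconds, which is the unit max-age uses.
const SECONDS_PER_YEAR = 365 * 24 * 60 * 60; // 31,536,000

function farFutureCacheControl(maxAgeSeconds = SECONDS_PER_YEAR) {
  if (!Number.isInteger(maxAgeSeconds) || maxAgeSeconds < 0) {
    throw new RangeError('max-age must be a non-negative integer');
  }
  // "public" is an assumption; it allows shared caches (such as a CDN) to store the response.
  return `public, max-age=${maxAgeSeconds}`;
}

console.log(farFutureCacheControl());      // public, max-age=31536000
console.log(farFutureCacheControl(86400)); // public, max-age=86400 (24 hours)
```

A long max-age is generally only safe when a given URL always returns the same bytes, which is likely the case for recorded audio files.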

I did not invalidate the CDN cache as I thought it was better that it can use whatever cache it has to keep serving quickly and the RefreshHits would sort themselves out after 24 hours.

I have also not discounted preloading the audio, but it is not as straightforward as some might believe, so it will take some time and caution, as I don’t want to reintroduce the same bugs that I got rid of by simplifying the code. I am still investigating this. In the meantime, if this cache control change does improve the situation, please do let me know.


It seems to somewhat help because many audio files are being loaded from the disk cache now. This is probably because of the Cache-Control header that’s new.
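As a rough illustration of why the new header lets the browser serve audio from its disk cache, the sketch below applies a simplified freshness rule: a stored response can be reused while its age is below `max-age`. The function names are hypothetical, and real browser caches also take `no-store`, `no-cache`, `Expires`, and eviction into account, so treat this as a model rather than the browser’s actual logic.

```javascript
// Extracts max-age (in seconds) from a Cache-Control value, or null if absent.
function parseMaxAge(cacheControl) {
  const match = /(?:^|,)\s*max-age\s*=\s*"?(\d+)"?/i.exec(cacheControl || '');
  return match ? Number(match[1]) : null;
}

// Simplified rule: a stored response is fresh while its age is below max-age.
// Without max-age this sketch treats the response as not reusable.
function isFresh(cacheControl, ageSeconds) {
  const maxAge = parseMaxAge(cacheControl);
  return maxAge !== null && ageSeconds < maxAge;
}

console.log(isFresh('public, max-age=31536000', 7 * 24 * 3600)); // true
console.log(isFresh('public, max-age=86400', 2 * 24 * 3600));    // false
```

Even a fresh entry only helps if the file was downloaded once before and has not been evicted, which matches the observation that a cleared disk cache brings the delay back.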


The original issue might still be around if the files are not cached locally on my computer’s disk.

For example, if my computer’s disk cache does not have the file (I cleared my cache to test this), then it’s potentially going to be slow. In the case below, it took 1s to load the audio even though it was a Hit from CloudFront.


Thanks for the feedback and I am glad there is an improvement for you.

The original issue should not be as bad now because CloudFront no longer needs to go back to the origin server after 24 hours. You should now always get a cache hit from CloudFront, provided the cache is warm (which I expect it to be), so the latency you were experiencing before should be reduced.

Oh sorry, I just edited my previous reply.
I decided to mess around by clearing my browser’s disk cache, and got an issue whereby it took 1s to load the audio. It looks like CloudFront was a hit, so I’m not sure why it took so long for the audio to load.

Anyways, the cache control header definitely helps and improves my experience because it causes my browser to cache it locally. But I’m not 100% certain that it actually fixes the actual issue.

Could you possibly take a screenshot of the timing tab too when you have the latency?

I think I did it in my edit of the previous post.


Here’s another: 937ms. I didn’t capture which word it was, but I guess if we really want to, we can manually download the audio and listen to it.


Sorry I wasn’t clear. The timing tab is the one shown by the blue arrow:

Today I got a few misses from CloudFront that caused audio delays of up to 888ms.
This was 3/106, so it isn’t too bad. It is already much better than before.
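For anyone who wants to count hits and misses instead of eyeballing the network tab, here is a small sketch that classifies the `X-Cache` response header values mentioned in this thread (Hit, RefreshHit, Miss). The function names and the sample values are illustrative only; the header strings would have to be copied from the browser’s network panel.

```javascript
// Maps a CloudFront X-Cache header value to a short label.
function classifyXCache(value) {
  const v = String(value || '').trim().toLowerCase();
  if (v.startsWith('refreshhit')) return 'refresh-hit'; // cached copy had expired and was revalidated with the origin
  if (v.startsWith('hit')) return 'hit';                // served from the edge cache
  if (v.startsWith('miss')) return 'miss';              // fetched from the origin
  return 'unknown';
}

// Counts how many responses fell into each category.
function tally(values) {
  const counts = { hit: 0, 'refresh-hit': 0, miss: 0, unknown: 0 };
  for (const value of values) counts[classifyXCache(value)] += 1;
  return counts;
}

// Illustrative sample, not real measurements.
const sample = [
  'Hit from cloudfront',
  'RefreshHit from cloudfront',
  'Miss from cloudfront',
  'Hit from cloudfront',
];
console.log(tally(sample)); // { hit: 2, 'refresh-hit': 1, miss: 1, unknown: 0 }
```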


Thanks for the update. I am much more interested in the timing when you experience latency for a cache hit. If you come across a cache hit that has poor performance please send me a screenshot.

The misses will resolve themselves as more people that use the same edge location in your region do reviews and lessons.

Audio autoplay only works for me about 40% of the time (that is, plays at all, unless I need to wait reeeeeeeally long). I’m using: Google Chrome Version 109.0.5414.119 (Official Build) (x86_64)
I’m using zero scripts.

Can you provide screenshots of network latency and timings?
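If screenshots are awkward to capture, the same numbers shown in the timing tab can usually be read from the Resource Timing API in the browser console. The sketch below is an assumption-laden helper (the name `breakdown` and the audio file extensions in the filter are hypothetical). Note that for cross-origin files the detailed fields are reported as zero unless the server sends a `Timing-Allow-Origin` header, so the output may be less detailed than the timing tab.

```javascript
// Splits one PerformanceResourceTiming-like entry into phases (all values in ms).
function breakdown(entry) {
  // secureConnectionStart is 0 when no TLS handshake took place (e.g. a reused connection).
  const tlsStart = entry.secureConnectionStart > 0 ? entry.secureConnectionStart : entry.connectEnd;
  return {
    dns: entry.domainLookupEnd - entry.domainLookupStart,
    tcp: tlsStart - entry.connectStart,
    tls: entry.connectEnd - tlsStart,
    waiting: entry.responseStart - entry.requestStart, // time to first byte
    download: entry.responseEnd - entry.responseStart,
    total: entry.responseEnd - entry.startTime,
  };
}

// Only runs in a browser; the extension filter is a guess at what the audio URLs look like.
if (typeof window !== 'undefined') {
  const rows = performance.getEntriesByType('resource')
    .filter((e) => /\.(mp3|ogg|webm)(\?|$)/.test(e.name))
    .map((e) => ({ url: e.name, ...breakdown(e) }));
  console.table(rows);
}
```

A large `tcp` or `tls` value points at connection setup and round-trip time, while a large `waiting` value points at the server or CDN being slow to respond.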

Found one right now.


I’ve realized that I was connected to a VPN and the ping to CloudFront is actually 250ms. Will also see if this can be reproduced without the VPN (which should give me around 3ms RTT to CloudFront)

Pinging 18.67.17.72 with 32 bytes of data:
Reply from 18.67.17.72: bytes=32 time=251ms TTL=234
Reply from 18.67.17.72: bytes=32 time=251ms TTL=234
Reply from 18.67.17.72: bytes=32 time=251ms TTL=234

For me, most of the delays happen because I’m on a VPN connection, which changes my region and increases the round-trip time. The experience without preloading is honestly not the best. There are tons of random delays whenever something isn’t cached locally on disk.

I don’t want to have to disconnect the VPN whenever I want to use WK.

It’s honestly slightly annoying at this time and it’s a bit of a pain to do reviews on WK right now.

And what happens when you turn off the VPN just to test it out?
The connection start takes more than 700ms which isn’t great.

Yes, it’s not as noticeable when the VPN is off.

It’s slow because the DNS to the S3 bucket gave me a DNS server in a different country.
The network RTT to this S3 bucket is around and we have to do the SSL handshake, which involves a few RTT.
Therefore, the delay to load the file is terrible.
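The round-trip arithmetic behind this can be sketched roughly as follows. This is a back-of-the-envelope model, not a measurement: it assumes one round trip for the TCP handshake, two for a full TLS 1.2 handshake (one for TLS 1.3), and one for the HTTP request and first response byte. The 250 ms figure is borrowed from the VPN ping shown earlier in the thread purely for illustration, and `estimateColdFetchMs` is a hypothetical name.

```javascript
// Rough estimate of the time to the first response byte on a brand-new HTTPS connection.
function estimateColdFetchMs(rttMs, { tlsRoundTrips = 2, dnsMs = 0 } = {}) {
  const tcpRoundTrips = 1;     // SYN / SYN-ACK
  const requestRoundTrips = 1; // HTTP request out, first response byte back
  return dnsMs + (tcpRoundTrips + tlsRoundTrips + requestRoundTrips) * rttMs;
}

console.log(estimateColdFetchMs(250));                       // 1000
console.log(estimateColdFetchMs(250, { tlsRoundTrips: 1 })); // 750
console.log(estimateColdFetchMs(3));                         // 12
```

Under these assumptions, a high RTT alone is enough to push a cold request into the range of the delays reported in this thread, while the same request at 3 ms RTT would be barely noticeable.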


Anyway, as I understand it, this is the current state:

  • The auto playback of audio during reviews can be delayed because the audio file must be downloaded before playing.
    Unlike in the past, audio files are no longer preloaded.

  • In the past, this was not noticeable because the audio file was being preloaded while the user was inputting the answer.
    It’s going to take a few seconds for the user to input the answer.
    This is more than sufficient time to handle any slow loading of audio.

  • Improvement was made in this thread to use the “Cache-Control” header.

    • This helps the CDN (CloudFront) cache the file for longer.
    • It also helps the web browser cache the file locally on disk.
      This helps (and I appreciate it), but it’s not 100%.
  • We are now relying on two levels of caches to try to mitigate slowness (instead of pre-loading):
    the browser disk cache and the CDN cache.

  • Caches are not guaranteed to contain the data.
    Cache misses are always possible, and each one results in an annoying delay in the autoplay of audio.

  • If you use a VPN, you can be directed to a CDN edge location that’s far away from you.
    This means that the CDN cache isn’t helpful (because it’s far away, with high RTT) and you are now relying solely on your browser’s disk cache for speed.

  • Yes, I know I can turn off the VPN, but that’s a bit of a pain.
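One way the pre-loading idea from this summary could be structured is a small prefetcher that starts each download at most once, so the transfer overlaps with the time spent typing the answer. This is only a sketch: `createPrefetcher` is a hypothetical name, the fetch function is injected so the logic can be tried outside a browser, and whether a prefetched response is later reused for playback depends on the browser’s caching behavior.

```javascript
// Returns a prefetch(urls) function that requests each URL at most once.
function createPrefetcher(fetchFn) {
  const requested = new Set();
  return function prefetch(urls) {
    const started = [];
    for (const url of urls) {
      if (requested.has(url)) continue; // already requested during this session
      requested.add(url);
      started.push(url);
      // The response is ignored; the goal is only to warm the browser's HTTP cache.
      // On failure, forget the URL so a later call can retry it.
      new Promise((resolve) => resolve(fetchFn(url))).catch(() => requested.delete(url));
    }
    return started; // the URLs for which a request was actually started
  };
}

// In a browser, the injected function could be (assumption, mirroring the no-cors idea):
//   const prefetch = createPrefetcher((url) => fetch(url, { mode: 'no-cors' }));
```

Keeping the set of requested URLs in memory avoids firing duplicate requests when the same item comes up again in one session.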


I’m going to try looking for a user script (or maybe even work on one) that can pre-fetch the audio file.
Thanks!


I’ve tested for the last week and pre-loading the audio basically solves all issues with delays caused by slow loading.

I’ve added the following into my browser’s console and this solves all problems for me.
@tofugu-scott, would it be possible to consider pre-loading the audio again? If not, I guess we can make the following into a real user script.

// For now, just get all audio for the next page.
// Maybe some optimizations can be done in the future.
// We are just doing a fetch and ignoring the output to force the browser to pre-load the audio.

window.addEventListener('willShowNextQuestion', (ev) => {
  // Some subjects have no readings, so there is nothing to pre-fetch for them.
  const readings = ev.detail?.subject?.readings ?? [];
  for (const reading of readings) {
    for (const pronunciation of reading.pronunciations ?? []) {
      for (const source of pronunciation.sources ?? []) {
        console.log("Pre-fetching " + source.url);
        // The response is ignored; a failed pre-fetch should not raise an error.
        fetch(source.url, { mode: 'no-cors' }).catch(() => {});
      }
    }
  }
});
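If the console snippet above were turned into a real user script, the URL extraction could be pulled out into a pure function so it can be checked without the review page. The sketch below assumes the same event shape that the snippet relies on (`subject.readings[].pronunciations[].sources[].url`); the function name and the example URLs are hypothetical.

```javascript
// Collects the unique audio URLs for one subject, tolerating missing fields.
function collectAudioUrls(subject) {
  const urls = new Set();
  for (const reading of subject?.readings ?? []) {
    for (const pronunciation of reading?.pronunciations ?? []) {
      for (const source of pronunciation?.sources ?? []) {
        if (source?.url) urls.add(source.url);
      }
    }
  }
  return [...urls];
}

// Made-up subject for illustration; real URLs would come from the event's detail.
const exampleSubject = {
  readings: [
    {
      pronunciations: [
        { sources: [{ url: 'https://example.com/a.mp3' }, { url: 'https://example.com/a.webm' }] },
        { sources: [{ url: 'https://example.com/a.mp3' }] },
      ],
    },
  ],
};
console.log(collectAudioUrls(exampleSubject));
// [ 'https://example.com/a.mp3', 'https://example.com/a.webm' ]
```

Deduplicating here keeps the number of pre-fetch requests down when several pronunciations point at the same file.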