A way to back up Discourse threads

Hello! Apologies if this is the wrong place or the wrong way to ask. Recently I’ve been thinking that it would be nice to be able to make a backup/archive of a thread: my replies, others’ replies, poll results, stuff in summary tags, images, etc. Saving the page as HTML seems to be the closest I can find at the moment, but it doesn’t save poll results. Requesting an archive of your data through your user profile only seems to give you the bare minimum, and you lose others’ posts and such in the thread.

Does a tool like this exist already? Is this a difficult ask for a new tool?

5 Likes

Not exactly an answer to your question, but you can bookmark a thread. Of course, this is not the same as a backup: you would still need internet access and the thread would still need to exist. As a temporary solution until you find the right tool for making an actual backup, though, it might be useful.


Sorry, I know this is not of much help :sweat_smile:

1 Like

How about appending .json to the end of the URL?
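For anyone who wants to script this, here is a minimal sketch of the ".json" trick in Python. The function name `topic_json_url` and the example forum address are made up for illustration; the only thing taken from this thread is that adding .json to the topic URL returns the data.

```python
from urllib.parse import urlsplit, urlunsplit


def topic_json_url(topic_url: str) -> str:
    """Return the .json variant of a Discourse topic URL (hypothetical helper)."""
    parts = urlsplit(topic_url)
    # Remove a trailing slash so we don't end up with ".../12345/.json".
    path = parts.path.rstrip("/")
    if not path.endswith(".json"):
        path += ".json"
    # Keep any existing query string, drop the #fragment.
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))


# forum.example.com is a placeholder, not a real forum.
print(topic_json_url("https://forum.example.com/t/some-thread/12345"))
```

You can paste the resulting URL into the browser, or fetch it with any HTTP client.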

There is also the Discourse API, but I don’t know how to use it. Following the reverse engineer thread may work, though.

2 Likes

That does bring back quite a bit, but it’s not exactly easily readable:

1 Like

Firefox has a built-in JSON viewer. Chrome would need an extension, like this one.

3 Likes

Double-checking with the JSON viewer, it seems the JSON only includes the first 20 posts. I tried scrolling for a bit to force more to load, but posts appear to be unloaded again after a certain point, so it’s only ever a subset of posts.
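One possible way around the 20-post window, sketched in Python. This rests on two assumptions you should verify against your forum: that the topic JSON lists every post ID under `post_stream.stream` while only the loaded posts appear under `post_stream.posts`, and that `/t/{topic_id}/posts.json` accepts repeated `post_ids[]` parameters. The helper names are hypothetical.

```python
from urllib.parse import urlencode


def missing_post_batches(topic: dict, batch_size: int = 20) -> list:
    """Split the IDs of posts that are not in the JSON yet into batches."""
    stream = topic["post_stream"]
    loaded = {post["id"] for post in stream["posts"]}
    # Assumption: "stream" holds the IDs of all posts in the topic, in order.
    missing = [pid for pid in stream["stream"] if pid not in loaded]
    return [missing[i:i + batch_size] for i in range(0, len(missing), batch_size)]


def posts_query(topic_id: int, post_ids: list) -> str:
    """Build the path and query for fetching one batch of posts."""
    # Assumption: the endpoint takes post_ids[]=1&post_ids[]=2 (URL-encoded here).
    return f"/t/{topic_id}/posts.json?" + urlencode([("post_ids[]", pid) for pid in post_ids])


# Tiny made-up topic: 5 posts exist, only the first 2 were returned.
sample = {"post_stream": {"posts": [{"id": 1}, {"id": 2}], "stream": [1, 2, 3, 4, 5]}}
for batch in missing_post_batches(sample, batch_size=2):
    print(posts_query(42, batch))
```

Each printed path would be requested in turn, ideally with a pause between requests to stay under any rate limit.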

1 Like

Here’s a thread that discusses handling this issue and some person seems to have written a script to solve it? Did not read too far though, so maybe the links don’t work any more… (sorry if that’s the case!)

5 Likes

Reading through that topic was pretty interesting! For the .json bit, if you add ?print=true to the end of the URL it sets the chunk size to 1000, which is more than enough to bring back all the needed posts. I believe there’s a cap on how many requests/hr you can do for that, but it seems to pull back all the data.
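A rough Python sketch of the ?print=true approach, in case it helps. The forum address, topic ID, and function names are placeholders, and I am assuming the posts sit under `post_stream.posts` with `post_number`, `username`, and `cooked` (the rendered HTML) fields, so check the JSON from your own forum first.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def print_json_url(base_url: str, topic_id: int) -> str:
    """URL for the topic JSON with print=true (larger chunk size, per this thread)."""
    return f"{base_url.rstrip('/')}/t/{topic_id}.json?" + urlencode({"print": "true"})


def extract_posts(topic: dict) -> list:
    """Return (post_number, username, html) for every post present in the JSON."""
    posts = topic.get("post_stream", {}).get("posts", [])
    return [(p["post_number"], p["username"], p["cooked"]) for p in posts]


def fetch_topic(base_url: str, topic_id: int) -> dict:
    """Download and parse the topic JSON. Rate-limited, so call it sparingly."""
    with urlopen(print_json_url(base_url, topic_id)) as response:
        return json.load(response)


# Offline demo with a made-up response, so nothing is requested here.
demo = {"post_stream": {"posts": [
    {"post_number": 1, "username": "alice", "cooked": "<p>Hello!</p>"},
    {"post_number": 2, "username": "bob", "cooked": "<p>Hi there</p>"},
]}}
for number, user, html in extract_posts(demo):
    print(number, user, html)
```

Against a real forum you would call `fetch_topic(...)` once and pass the result to `extract_posts`.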

2 Likes

Maybe a UserScript like this. Open the DevTools console (F12 / Inspect Element).

Type backupThread() or backupThread(thread_id) and press Enter.

?print=true is rate-limited. For a large thread, use backupThread(true) or backupThread(thread_id, true).

The script doesn’t download images. To get them, open the generated HTML and save it with the browser’s “Web page, complete” option.

3 Likes

Nice, that works really well! Looks like poll results aren’t captured, however:

There should be 17 voters in the Interest Poll, 5 in the first Volumes Read, and 3 in the second.

I believe the filename generation might be a bit bugged as well. I got “t.html” as the downloaded file name; the original thread title was: “Flesh&Blood” Pirate Series Reading Club :pirate_flag: :sailboat:
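I don’t know what the script does internally or why it ended up with “t.html”, but here is one way a filename could be derived from a title like that, as a Python sketch. The function name, the fallback name, and the choice to drop :emoji: shortcodes and non-ASCII characters are all my own assumptions.

```python
import re
import unicodedata


def safe_filename(title: str, fallback: str = "thread", ext: str = ".html") -> str:
    """Turn a thread title into a conservative, filesystem-friendly filename."""
    # Drop :emoji_shortcodes: such as :pirate_flag:.
    text = re.sub(r":[a-z0-9_+-]+:", " ", title)
    # Split accented letters into base letter + mark, then discard non-ASCII
    # characters (this also removes curly quotes).
    text = unicodedata.normalize("NFKD", text)
    text = text.encode("ascii", "ignore").decode("ascii")
    # Collapse every run of other characters into a single hyphen.
    slug = re.sub(r"[^A-Za-z0-9]+", "-", text).strip("-").lower()
    # Fall back to a fixed name if nothing usable is left.
    return (slug or fallback) + ext


print(safe_filename('“Flesh&Blood” Pirate Series Reading Club :pirate_flag: :sailboat:'))
# prints: flesh-blood-pirate-series-reading-club.html
```

The fallback matters for titles that consist only of emoji or punctuation, which would otherwise produce an empty name.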

Edit: Looks like testing backupThread(false) failed:

I double-checked and there was no download.

1 Like

Unless you try to back up something like the POLLs thread… But then again why would anyone ever back up the POLLs thread? :sweat_smile:
Other threads have a reasonable number of posts, so this method should work for them :slightly_smiling_face:

1 Like

Poll results are captured in the data, but rebuilding an accurate poll visual from them looks like it will be troublesome.
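For a plain-text fallback instead of the full visual, something like this Python sketch might be enough. I am assuming each post’s JSON carries a `polls` list whose entries have `name`, `voters`, and `options` with `html` and `votes`; that is my reading of the data, not something confirmed in this thread, and the vote split in the demo is invented.

```python
def poll_summary(poll: dict) -> list:
    """Render one poll as text lines: a header plus one line per option."""
    voters = poll.get("voters", 0)
    lines = [f"{poll.get('name', 'poll')} ({voters} voters)"]
    for option in poll.get("options", []):
        votes = option.get("votes", 0)
        # Guard against division by zero for polls nobody has voted in.
        share = round(100 * votes / voters) if voters else 0
        lines.append(f"  {option.get('html', '?')}: {votes} ({share}%)")
    return lines


# Made-up poll data for illustration only.
demo_poll = {
    "name": "interest",
    "voters": 17,
    "options": [{"html": "Yes", "votes": 12}, {"html": "No", "votes": 5}],
}
print("\n".join(poll_summary(demo_poll)))
```

For multiple-choice polls the option votes can add up to more than the voter count, so the percentages there would be per voter rather than shares of a total.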

Also, it really should default to print=false.

2 Likes

Or backing up CAT pics.

Truthfully, though, trying it with the Optical Illusion thread (1.7k posts), it takes less than a minute. Re-saving as “Web page, complete” to download the images takes around 6 minutes.

2 Likes

The script now displays polls.

1 Like

Thank you! I won’t be able to double-check the script for a bit as I’m traveling, but I’ll definitely do so when I have the chance!

This is waaaay later, sorry, but just wanted to say again that the script works fantastically; thank you so much!

1 Like