Import all the things!

Last year Elon acquired Twitter and began running it into the ground, so I imported my tweets here & stopped posting there. A few months later I also imported all the posts from my Mastodon account. Here are a couple of examples:

This is a static website, and the static site generator I use is home-grown. So in both cases I wrote an importer from scratch. I really half-assed it, to be honest. Neither importer handles media, so any photos in my posts are missing. I told myself I’d get back to that but I have yet to.

The other day I decided to have a go at importing my Bluesky and Threads posts as well. I started with Threads. I’ve only posted there a couple dozen times, so I figured it wouldn’t be that hard.

It turns out you can’t even download your Threads archive without also downloading your Instagram archive, and the same page made it very easy to request an archive of my Facebook posts too. The archives of all three Meta services are very similar, which I suppose shouldn’t be much of a surprise. So instead of writing a Threads importer, I wrote an importer that handles all three archives. It handles media too. Here are some examples:

My first Facebook post is apparently just a link to my Twitter account.
My first Instagram post is a shot of my father-out-law’s old homebrewing setup.
My first Thread is literally just the 💯 emoji.

Having written several importers like this over the last year, here are some disconnected observations about these services’ archive formats:

Twitter

Their archive format isn’t technically JSON; it’s a JavaScript file you’re expected to eval(). 🤮
The data for a tweet doesn’t contain a URL to the tweet on Twitter’s website. It does contain the tweet ID, and you know which account tweeted it, so you can construct the URL yourself.
Twitter didn’t originally have native support for at-mentions, hashtags, or other such syntax, so they had to kind of bolt on support for formatted tweet content, and it really shows. It’s super awkward to code for.
The per-tweet field called full_text, well, isn’t. It’s often truncated.
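
For what it’s worth, here is a rough Python sketch of how the first two points can be handled without actually eval()ing anything. It assumes the archive file is a single assignment of the form `window.YTD.tweets.part0 = [...]`, that each entry wraps its data in a "tweet" object with an "id_str" field, and that tweet URLs follow the usual https://twitter.com/<account>/status/<id> pattern. The function names are made up for this example and your archive may differ.

```python
import json


def parse_archive_js(source: str) -> list:
    """Parse a Twitter archive .js file without eval().

    Assumption: the file is one assignment, `<name> = <JSON array>`.
    Everything up to the first '=' is discarded and the rest is read as JSON.
    """
    _, _, payload = source.partition("=")
    # Drop surrounding whitespace and a trailing semicolon, if present.
    payload = payload.strip().rstrip(";").rstrip()
    return json.loads(payload)


def tweet_url(screen_name: str, tweet_id: str) -> str:
    """Build a tweet URL from the account name and tweet ID.

    Assumption: the archive does not include this URL, so it is
    reconstructed from the conventional URL pattern.
    """
    return f"https://twitter.com/{screen_name}/status/{tweet_id}"


if __name__ == "__main__":
    # Hypothetical one-tweet archive, for illustration only.
    sample = 'window.YTD.tweets.part0 = [{"tweet": {"id_str": "123", "full_text": "hello"}}]'
    for entry in parse_archive_js(sample):
        tweet = entry["tweet"]
        print(tweet_url("someone", tweet["id_str"]), tweet["full_text"])
```

Splitting on the first "=" is crude, but it means the importer only ever treats the file as data, never as code.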

Mastodon

Unlike Twitter and Meta, Mastodon’s archive format feels like it was designed by web nerds for web nerds. It’s super-easy to generate a high-fidelity version of a Mastodon post on your own site.
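
To give a sense of what that looks like, here is a minimal Python sketch. It assumes the archive contains an ActivityPub-style outbox.json whose "orderedItems" are activities, that original posts are "Create" activities wrapping an object with "url", "published", and "content" (already HTML) fields, and that boosts carry only a URL string as their object. Those field names are my assumptions about the export, and the function name is invented for the example.

```python
import json


def posts_from_outbox(outbox: dict) -> list:
    """Collect the fields needed to re-render each original post.

    Assumption: `outbox` is the parsed outbox.json from a Mastodon export.
    """
    posts = []
    for activity in outbox.get("orderedItems", []):
        obj = activity.get("object")
        # Skip anything that is not an original post, e.g. boosts,
        # whose object is assumed to be a bare URL string.
        if activity.get("type") != "Create" or not isinstance(obj, dict):
            continue
        posts.append({
            "url": obj.get("url"),
            "published": obj.get("published"),
            "html": obj.get("content"),  # assumed to be ready-made HTML
        })
    return posts


if __name__ == "__main__":
    # Hypothetical two-item outbox, for illustration only.
    sample = json.loads("""
    {"orderedItems": [
      {"type": "Create",
       "object": {"url": "https://example.social/@me/1",
                  "published": "2023-01-01T00:00:00Z",
                  "content": "<p>Hello</p>"}},
      {"type": "Announce", "object": "https://example.social/@you/2"}
    ]}
    """)
    for post in posts_from_outbox(sample):
        print(post["published"], post["url"], post["html"])
```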