Publish Your Mastodon Posts to Astro


I’m a big fan of hosting your own content. Partly because I like keeping control over what I’ve posted, partly because platforms like MySpace or some older video-sharing sites will just lose all of the content you hadn’t backed up anywhere else because you were young(er) and stupid(er).

Since the implosion of The Site Formerly Known as Twitter, I’ve been using Mastodon to post infrequent nonsense that crosses my mind. With the Fediverse, it’s certainly possible to run your own instance and own your content that way, but I was looking for a simpler solution: just copy the posts into this site.

Anyway — whatever your reason — let’s look at automatically importing Toots into Astro.

The General Workflow

How this is going to work is:

  1. Make a post/toot on Mastodon
  2. Trigger a webhook to grab the content of that toot as JSON
  3. Write that JSON file directly into an Astro repository on GitHub
  4. Trigger a build and deployment of the site

Easy, right? Let’s get things set up to work.

Examining the Toot Structure

Getting the JSON representation of a Toot is pretty simple — copy the URL of a single post and add .json to the end of the URL. Doing that with a test toot gives us something like this:

{
  "@context": [
    "https://www.w3.org/ns/activitystreams",
    {
      "ostatus": "http://ostatus.org#",
      "atomUri": "ostatus:atomUri",
      "inReplyToAtomUri": "ostatus:inReplyToAtomUri",
      "conversation": "ostatus:conversation",
      "sensitive": "as:sensitive",
      "toot": "http://joinmastodon.org/ns#",
      "votersCount": "toot:votersCount",
      "blurhash": "toot:blurhash",
      "focalPoint": { "@container": "@list", "@id": "toot:focalPoint" }
    }
  ],
  "id": "https://mastodon.social/users/kwilson81/statuses/112357239872765151",
  "type": "Note",
  "summary": null,
  "inReplyTo": null,
  "published": "2024-04-30T00:03:42Z",
  "url": "https://mastodon.social/@kwilson81/112357239872765151",
  "attributedTo": "https://mastodon.social/users/kwilson81",
  "to": ["https://www.w3.org/ns/activitystreams#Public"],
  "cc": ["https://mastodon.social/users/kwilson81/followers"],
  "sensitive": false,
  "atomUri": "https://mastodon.social/users/kwilson81/statuses/112357239872765151",
  "inReplyToAtomUri": null,
  "conversation": "tag:mastodon.social,2024-04-30:objectId=696705597:objectType=Conversation",
  "content": "\u003cp\u003ePenny clearly excited to find out if this Cloudflare pipeline thing actually works... 🤞\u003c/p\u003e",
  "contentMap": {
    "en": "\u003cp\u003ePenny clearly excited to find out if this Cloudflare pipeline thing actually works... 🤞\u003c/p\u003e"
  },
  "attachment": [
    {
      "type": "Document",
      "mediaType": "image/jpeg",
      "url": "https://files.mastodon.social/media_attachments/files/112/357/232/898/119/052/original/14e0ac3b6bcebf12.jpg",
      "name": "Small Chorkie dog looking pensive.",
      "blurhash": "UHD93Kj;Na?F~U%1Ip%1^%%1xHbF%LoMRoWV",
      "width": 2499,
      "height": 3319
    }
  ],
  "tag": [],
  "replies": {
    "id": "https://mastodon.social/users/kwilson81/statuses/112357239872765151/replies",
    "type": "Collection",
    "first": {
      "type": "CollectionPage",
      "next": "https://mastodon.social/users/kwilson81/statuses/112357239872765151/replies?only_other_accounts=true\u0026page=true",
      "partOf": "https://mastodon.social/users/kwilson81/statuses/112357239872765151/replies",
      "items": []
    }
  }
}

For the setup I’m going for (at the moment), I’m only really interested in a subset of this data:

  • id
  • published
  • url
  • content
  • attachment

So that’s what we’ll be using inside Astro.

Creating a Toot Collection in Astro

Collections in Astro are stored under ~/src/content, so let’s create a mastodon folder under there that will host the content.

We can now add a definition for that collection using Zod inside our ~/src/content.config.ts file. We’ll just ignore attachments for now.

So that gives us this:

import { defineCollection, z } from 'astro:content';
import { glob } from 'astro/loaders';
 
const mastodon = defineCollection({
  loader: glob({ pattern: '**/*.json', base: './src/content/mastodon' }),
  schema: z
    .object({
      id: z.string(),
      published: z.coerce.date(),
      url: z.string(),
      content: z.string(),
    }),
});

Looking at the JSON of the Toot, the ID is just the URL:

https://mastodon.social/users/kwilson81/statuses/112357239872765151

If we want a local URL pointing at that content (something like mywebsite/notes/[tootId]), we’ll need to pull the numeric ID from the end of the URL so we have something specific to look the toot up with.

We can do this at a collection level using a Zod transform:

const mastodon = defineCollection({
  loader: glob({ pattern: '**/*.json', base: './src/content/mastodon' }),
  schema: z
    .object({
      id: z.string(),
      published: z.coerce.date(),
      url: z.string(),
      content: z.string(),
    })
    .transform((data) => ({
      ...data,
      tootId: /(?<tootId>\d+)$/.exec(data.id)?.groups?.['tootId'] ?? null,
    })),
});
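
As a quick sanity check, here’s a rough sketch of how a src/pages/notes/[tootId].astro route could consume that tootId. The route path and markup are illustrative; we’re not building the page out in this post:

---
// Hypothetical src/pages/notes/[tootId].astro
import { getCollection } from 'astro:content';

export async function getStaticPaths() {
  const toots = await getCollection('mastodon');
  // One static route per toot, keyed by the numeric ID from the transform
  return toots
    .filter((toot) => toot.data.tootId !== null)
    .map((toot) => ({
      params: { tootId: toot.data.tootId },
      props: { toot },
    }));
}

const { toot } = Astro.props;
---
<article set:html={toot.data.content} />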

Now we’ve got a collection set up, let’s look at pulling the data into the repo.

Cloudflare Worker

Cloudflare Workers are easy to use, fast, and free. So that makes them a good candidate for running our grabbing pipeline.

How we want this to work is:

  1. Worker receives a POST with the Toot URL
  2. Worker grabs the JSON data of the Toot
  3. Worker uploads any image attachments to Cloudinary
  4. Worker creates a new commit with the JSON file directly into the collection folder we created above

Again, let’s break this down step-by-step.

The Basic Worker

Creating a basic worker with npm create cloudflare@latest will give us something like this:

export default {
	async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
		return new Response('Hello World!');
	},
};

Now running npm run deploy will take you through the wizard to get that deployed into Cloudflare. I’m not going to go into the specifics of that, but you’ll need to create an account and decide where your worker is going to be deployed. It’s all free, though, and a pretty simple process.

Got that deployed somewhere? Okay, take a note of the URL and let’s continue.

We want to validate that we can receive a POST with data such as { "uri": "https://the-toot-url/id" } so let’s use Zod again (npm i zod) to set up validation for that:

import { z } from 'zod';
 
const schema = z.object({
  uri: z.string(),
});
 
export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    const { uri } = schema.parse(await request.json());
    return new Response(uri);
  },
};

If we now POST to the URL of our worker, we should get that uri value echoed back to us:

> curl -d '{"uri":"http://test.invalid"}' \
       -H "Content-Type: application/json" \
       -X POST https://YOUR-WORKER-URL/
 
http://test.invalid

Now that we have a worker that can accept the URL as an input, let’s grab the content from Mastodon.

Fetching the Toot

Remember how we could get the JSON for any toot by appending .json to its URL? That’s exactly what we’ll do here. We’ll fetch the toot data, then validate it with another Zod schema:

const documentSchema = z
  .object({
    mediaType: z.string(),
    url: z.string(),
  })
  .passthrough();
 
const mastodonSchema = z
  .object({
    id: z.string(),
    published: z.string(),
    attachment: z.array(documentSchema).optional(),
  })
  .passthrough();

Using .passthrough() means the schema will validate the fields we care about while keeping all the other ActivityPub data intact. This is handy because the JSON structure is quite verbose — we don’t need to define every field, just the ones we want to validate.

Now we can fetch and parse the toot inside the worker:

const { uri } = schema.parse(await request.json());
 
const response = await fetch(`${uri}.json`);
const postData = mastodonSchema.parse(await response.json());
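
Worth knowing: the .json suffix works on mastodon.social, but the standards-based way to request this data is ActivityPub content negotiation. If an instance doesn’t honour the suffix, fetching the plain URL with an Accept header should (as far as I can tell) return the same document:

const response = await fetch(uri, {
  // Ask for the ActivityPub representation instead of the HTML page
  headers: { Accept: 'application/activity+json' },
});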

Handling Image Attachments

If the toot has images attached, we need to deal with them. The attachment URLs point to Mastodon’s file server, but we can’t rely on those URLs being stable forever (which kind of defeats the purpose of archiving our content). So we’ll upload them to Cloudinary and reference them from there instead.

First, we need a stable identifier for each image. The Mastodon URL itself is long and messy, so let’s generate a SHA-256 hash of it:

import crypto from 'node:crypto';
 
const hashedAttachments = postData.attachment?.map((file) => ({
  ...file,
  urlHash: crypto.createHash('sha256').update(file.url).digest('base64url'),
}));

This gives us a short, URL-safe string we can use as both a Cloudinary public ID and a way to reference the image later in our Astro components.
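
On the Astro side, that hash is all we need to rebuild the Cloudinary delivery URL later. Here’s a minimal sketch (attachmentUrl is a made-up helper, and the constants need to match the cloud name and folder we configure in wrangler.toml further down):

// Must match CLOUDINARY_CLOUD_NAME and CLOUDINARY_UPLOAD_FOLDER_PATH
const CLOUD_NAME = 'your-cloud-name';
const FOLDER = 'mastodon/images';

function attachmentUrl(urlHash: string): string {
  // Cloudinary serves uploads from res.cloudinary.com/<cloud>/image/upload/<public id>
  return `https://res.cloudinary.com/${CLOUD_NAME}/image/upload/${FOLDER}/${urlHash}`;
}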

Now we need to upload the images to Cloudinary. To do that, we’ll use Cloudinary’s Upload API with a signed request:

async function uploadImageToCloudinary(imageUrl: string, imageName: string, env: Env) {
  // Cloudinary expects a Unix timestamp in seconds, not milliseconds
  const timestamp = Math.floor(Date.now() / 1000);
  const api_key = env.CLOUDINARY_API_KEY;
  const api_secret = env.CLOUDINARY_API_SECRET;
  const folder_path = env.CLOUDINARY_UPLOAD_FOLDER_PATH;
  const cloud_name = env.CLOUDINARY_CLOUD_NAME;
 
  // The parameters being signed must be sorted alphabetically
  const signatureInput = [
    `folder=${folder_path}`,
    `public_id=${imageName}`,
    `timestamp=${timestamp}`,
  ].join('&');
 
  const signature = crypto
    .createHash('sha1')
    .update(`${signatureInput}${api_secret}`)
    .digest('hex');
 
  const parameters = [
    `api_key=${api_key}`,
    // Encode the remote image URL so it survives the form body intact
    `file=${encodeURIComponent(imageUrl)}`,
    `folder=${folder_path}`,
    `public_id=${imageName}`,
    `timestamp=${timestamp}`,
    `signature=${signature}`,
  ].join('&');
 
  const response = await fetch(
    `https://api.cloudinary.com/v1_1/${cloud_name}/image/upload`,
    {
      method: 'POST',
      body: parameters,
      headers: {
        'Content-Type': 'application/x-www-form-urlencoded',
      },
    },
  );
 
  return response;
}

The Cloudinary signed upload API requires you to create a SHA-1 hash of the upload parameters (alphabetically sorted) concatenated with your API secret. It sounds more complicated than it is — the key thing is that Cloudinary can then verify the request is legitimate without you needing to expose your API secret.
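
As a concrete example with made-up values, the signature works out to:

// signature = SHA-1 of the sorted parameters with the secret appended directly
sha1('folder=mastodon/images&public_id=abc123&timestamp=1714435200' + API_SECRET)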

We can then upload all attachments in parallel:

await Promise.all(
  hashedAttachments?.map((attachment) => {
    return uploadImageToCloudinary(attachment.url, attachment.urlHash, env);
  }) ?? [],
);

Committing to GitHub

Now for the clever bit. Rather than having a server pull the repo, make a commit and push, we can use the GitHub API to create a commit directly. This is perfect for a serverless worker where we don’t have a filesystem to play with.

We’ll use Octokit, GitHub’s official API client:

npm i @octokit/core base-64 utf8

The GitHub Contents API expects the file content to be base64-encoded, so we need to encode our JSON:

import { Octokit } from '@octokit/core';
import utf8 from 'utf8';
import base64 from 'base-64';
 
const octokit = new Octokit({
  auth: env.GITHUB_ACCESS_TOKEN,
});
 
const hashedPostData = {
  ...postData,
  attachment: hashedAttachments,
};
 
const contentBytes = utf8.encode(JSON.stringify(hashedPostData));
const encodedContent = base64.encode(contentBytes);
 
await octokit.request('PUT /repos/{owner}/{repo}/contents/{path}', {
  owner: 'YOUR-GITHUB-USERNAME',
  repo: 'YOUR-REPO-NAME',
  path: `src/content/mastodon/${postData.published}.json`,
  message: `new post - ${postData.published}`,
  committer: {
    name: 'Mastodon Post',
    email: 'YOUR-EMAIL',
  },
  content: encodedContent,
  headers: {
    'X-GitHub-Api-Version': '2022-11-28',
  },
});

A few things to note here:

  • The file path uses the toot’s published timestamp as the filename (e.g. 2024-04-30T00:03:42Z.json). This gives us unique, chronologically sortable filenames for free.
  • You’ll need a GitHub Personal Access Token with repo scope to write to your repository.
  • The committer can be whatever you like — I set it to something identifiable so I can see at a glance which commits were automated.

Environment Variables

The worker needs a few secrets to be configured. You can set these using the Wrangler CLI:

npx wrangler secret put GITHUB_ACCESS_TOKEN
npx wrangler secret put CLOUDINARY_API_KEY
npx wrangler secret put CLOUDINARY_API_SECRET

For the non-secret values, you can set them in your wrangler.toml:

[vars]
CLOUDINARY_CLOUD_NAME = "your-cloud-name"
CLOUDINARY_UPLOAD_FOLDER_PATH = "mastodon/images"

We also need to enable the nodejs_compat compatibility flag in wrangler.toml so we can use the crypto module:

compatibility_flags = ["nodejs_compat"]
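
If you’re writing the worker in TypeScript, these bindings also need to exist on the Env type. Running npx wrangler types will generate this for you; a hand-written sketch looks like this:

interface Env {
  // Secrets set via `wrangler secret put`
  GITHUB_ACCESS_TOKEN: string;
  CLOUDINARY_API_KEY: string;
  CLOUDINARY_API_SECRET: string;
  // Plain vars from wrangler.toml
  CLOUDINARY_CLOUD_NAME: string;
  CLOUDINARY_UPLOAD_FOLDER_PATH: string;
}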

The Full Worker

Putting it all together, the complete worker looks like this:

import { Octokit } from '@octokit/core';
import { z } from 'zod';
import utf8 from 'utf8';
import base64 from 'base-64';
import crypto from 'node:crypto';
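// uploadImageToCloudinary is the signed-upload helper from the
// "Handling Image Attachments" section above; it needs to live in this file too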
 
const schema = z.object({
  uri: z.string(),
});
 
const documentSchema = z
  .object({
    mediaType: z.string(),
    url: z.string(),
  })
  .passthrough();
 
const mastodonSchema = z
  .object({
    id: z.string(),
    published: z.string(),
    attachment: z.array(documentSchema).optional(),
  })
  .passthrough();
 
export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    const { uri } = schema.parse(await request.json());
 
    const response = await fetch(`${uri}.json`);
    const postData = mastodonSchema.parse(await response.json());
 
    const octokit = new Octokit({
      auth: env.GITHUB_ACCESS_TOKEN,
    });
 
    const hashedAttachments = postData.attachment?.map((file) => ({
      ...file,
      urlHash: crypto.createHash('sha256').update(file.url).digest('base64url'),
    }));
 
    const hashedPostData = {
      ...postData,
      attachment: hashedAttachments,
    };
 
    const contentBytes = utf8.encode(JSON.stringify(hashedPostData));
    const encodedContent = base64.encode(contentBytes);
 
    await Promise.all(
      hashedAttachments?.map((attachment) => {
        return uploadImageToCloudinary(attachment.url, attachment.urlHash, env);
      }) ?? [],
    );
 
    const result = await octokit.request('PUT /repos/{owner}/{repo}/contents/{path}', {
      owner: 'YOUR-GITHUB-USERNAME',
      repo: 'YOUR-REPO-NAME',
      path: `src/content/mastodon/${postData.published}.json`,
      message: `new post - ${postData.published}`,
      committer: {
        name: 'Mastodon Post',
        email: 'YOUR-EMAIL',
      },
      content: encodedContent,
      headers: {
        'X-GitHub-Api-Version': '2022-11-28',
      },
    });
 
    return new Response(JSON.stringify(result));
  },
};
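
Before wiring up any automation, you can exercise the whole pipeline by hand by POSTing a real toot URL to the deployed worker (substituting your own worker URL):

> curl -d '{"uri":"https://mastodon.social/@kwilson81/112357239872765151"}' \
       -H "Content-Type: application/json" \
       -X POST https://YOUR-WORKER-URL/

If everything is configured correctly, a new JSON file should appear in the repo a few seconds later.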

Triggering the Pipeline with IFTTT

We’ve got a worker that can process a toot — but we still need something to trigger it when we actually post something. This is where IFTTT (If This Then That) comes in.

Mastodon provides an RSS feed for every user at https://mastodon.social/@YOUR-USERNAME.rss. IFTTT can monitor this feed and fire a webhook whenever a new item appears.

Here’s the setup:

  1. Create a new IFTTT Applet
  2. For the “If This” trigger, choose RSS Feed → New feed item
  3. Set the feed URL to your Mastodon RSS feed (e.g. https://mastodon.social/@kwilson81.rss)
  4. For the “Then That” action, choose Webhooks → Make a web request
  5. Configure the webhook:
    • URL: Your Cloudflare Worker URL
    • Method: POST
    • Content Type: application/json
    • Body: {"uri": "{{EntryUrl}}"}

The {{EntryUrl}} is an IFTTT ingredient that gets replaced with the URL of the new RSS item — which is exactly the toot URL that our worker expects.

Auto-Deployment

The final piece of the puzzle is getting the site to rebuild when a new toot is committed. If you’re hosting on Cloudflare Pages (or Netlify, Vercel, etc.) and have it connected to your GitHub repository, this happens automatically — any new commit triggers a build and deployment.

So the full flow ends up being:

  1. Post a toot on Mastodon
  2. IFTTT detects the new post via the RSS feed
  3. IFTTT sends the toot URL to the Cloudflare Worker
  4. The worker fetches the toot JSON, uploads any images to Cloudinary, and commits the JSON file to GitHub
  5. Cloudflare Pages detects the new commit and rebuilds the site
  6. The toot appears on the site

All fully automated — post and forget.