Along with many others in the InfoSec community on Twitter, I’ve moved onwards to Mastodon (find me here!). Also like many others in the InfoSec community, I perked up when I heard that Mastodon servers were sending out huge request stampedes (enough that some people called it a “DDoS”) and wanted to learn more.

Essentially, the problem is that Mastodon wants to generate a link preview for any link you put in a post, but every Mastodon server generates that preview independently - servers share the link itself, not the preview. This means that for every post you make, both the server you’re on and every server receiving your post will access the sites you’ve linked.

This is intended behavior and not an exploit or anything of the sort - it’s been listed as an issue in Mastodon since 2017, originally filed as “Mastodon can be used as a DDoS tool.” Offhand, it doesn’t sound… too bad? But consider this:

  • It typically takes one or two requests to generate a link preview (e.g. one for the page, one for a referenced image).
  • Even a small Mastodon server can easily federate with over 100 other Mastodon servers. Large servers could federate with >1,000.
  • Your post can travel far - if someone on another Mastodon server boosts your post, their boost is sent to all the servers their server federates with, triggering link preview generation on any Mastodon server seeing that link for the first time (tens, hundreds, or thousands more…).
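The multiplication above can be sketched as a quick back-of-the-envelope calculation. All of the counts below are illustrative assumptions, not measurements:

```python
# Back-of-the-envelope estimate of Mastodon link preview amplification.
# Every number here is an illustrative assumption, not a measurement.

requests_per_preview = 2     # one for the page, one for the og:image
servers_seeing_post = 100    # a small server's federation reach
servers_after_boost = 1000   # additional reach after a boost

initial_requests = servers_seeing_post * requests_per_preview
boosted_requests = servers_after_boost * requests_per_preview

print(initial_requests)                      # requests from the post alone
print(initial_requests + boosted_requests)   # total once boosted
```

Even with these modest assumptions, one post quietly fans out into thousands of requests.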

Of course, I needed to trigger that behavior on my own site to find out more!

What are the real-world results of this request amplification?

So to test the real-world impact, I posted a link to a novel URL (https://chris.partridge.tech/?testodon) on my small Mastodon server and waited for requests to come in as all the Mastodon servers I federate with generated a link preview. Each Mastodon server I federate with should query my website twice, since I have OpenGraph tags configured correctly: once to fetch the page itself, then once more to fetch the image used as the preview.
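That second request happens because the crawler finds an og:image tag in the page and fetches it. A minimal sketch of that extraction step, using only the Python standard library - the HTML and URL here are made up for illustration, and this is not Mastodon’s actual parser:

```python
from html.parser import HTMLParser

# Sketch: extract the og:image URL a link-preview crawler would fetch
# as its second request. The sample HTML below is illustrative only.
class OGImageParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.og_image = None

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attrs = dict(attrs)
            if attrs.get("property") == "og:image":
                self.og_image = attrs.get("content")

html = """<html><head>
<meta property="og:title" content="Example post">
<meta property="og:image" content="https://example.com/preview.png">
</head><body></body></html>"""

parser = OGImageParser()
parser.feed(html)
print(parser.og_image)  # https://example.com/preview.png
```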

Within two minutes, 381 requests were made by Mastodon servers, peaking at 20 requests per second. That’s not so bad, but then a friend boosted my post on their midsize Mastodon server, and another 761 requests piled in over two more minutes, peaking at 35 requests per second.

So in terms of requests, the impact is obvious - making a post on Mastodon turned one POST request into 1147 GET requests against an arbitrary target, a request amplification of 1147:1. However, I got a bit of a nasty surprise when I looked into how much traffic this had consumed - a single ~3KB POST to Mastodon caused servers to pull a bit of HTML and… fuck, an image. In total, 114.7 MB of data was requested from my site in just under five minutes - a traffic amplification of 36704:1.
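For the curious, the arithmetic behind those two ratios is below. The POST size of 3.2 KiB is inferred from the reported ratio, not a separate measurement:

```python
# Reproducing the amplification ratios above. The 3.2 KiB POST size is
# inferred from the reported 36704:1 ratio, not measured directly.
get_requests = 1147
post_requests = 1
request_amplification = get_requests // post_requests  # 1147:1

traffic_out_kib = 114.7 * 1024   # 114.7 MiB requested from the site
post_size_kib = 3.2              # inferred size of the original POST

traffic_amplification = traffic_out_kib / post_size_kib
print(round(traffic_amplification))  # 36704
```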

That’s more than enough to let someone intentionally knock sites that use serverside rendering offline (by requests/s - this is the most common case where Mastodon knocks sites offline), or to saturate a poorly-connected website’s link to the world (by traffic amplification - or, at minimum, drive up its bandwidth costs).

Could the request amplification change from what I observed?

Absolutely. To be clear, what I did was a tiny test, and the amplification observed in practice could vary by an order of magnitude or more, based on three main factors:

  • The request amplification factor will vary by how many Mastodon servers see a given post (and thus generate a link preview for the target). There are over 11,000 Mastodon servers currently (source) - by my count, only ~500 generated a link preview within my little corner of the fediverse. You could see significantly different results if this test were conducted on a larger instance, by someone with many more followers spread across many instances, or if the post were boosted by a better-connected instance.
  • The traffic amplification factor is dictated by how much traffic it takes to generate a link preview for the target’s website. Using a small or well-optimized image could dramatically reduce how much bandwidth is consumed - or using a large image could do the opposite.
  • The request and traffic amplification factors could also vary by whether or not the target website uses OpenGraph tags to generate a link preview. If the target doesn’t provide an image, each Mastodon server would only access the target once (fetching the page to look for OpenGraph tags, but not making any additional requests).
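The interplay of these factors can be sketched with a simple estimator. Again, every input here is an illustrative assumption - real page and image sizes will vary:

```python
# Rough traffic estimate under the three factors above; all inputs are
# illustrative assumptions, not measurements.
def estimated_traffic_kib(servers, page_kib, image_kib, has_og_image=True):
    """Total KiB served if every server fetches the page, plus the
    preview image when an og:image tag is present."""
    per_server = page_kib + (image_kib if has_og_image else 0)
    return servers * per_server

# Same 500 servers, small image vs. large image:
print(estimated_traffic_kib(500, 20, 30))    # 25000 KiB (~24 MiB)
print(estimated_traffic_kib(500, 20, 300))   # 160000 KiB (~156 MiB)

# No og:image at all -> each server makes only the page request:
print(estimated_traffic_kib(500, 20, 0, has_og_image=False))  # 10000 KiB
```

Swapping a large preview image for a small one changes the bandwidth bill by an order of magnitude, exactly as described above.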

There are other factors to consider, but these tend to be less impactful. For example, if an HTTP link to the target is posted, but the target upgrades all requests to HTTPS, that will increase the number of requests made by each Mastodon server by one (though that upgrade consumes few resources).

Whether or not this qualifies as a “DDoS” is the subject of recurrent debate, usually because people expect a DDoS to be an intentional attack. A more conservative name for Mastodon’s link preview behavior is “request amplification,” and that’s the term I use throughout this post - though I don’t think there’s much of a semantic difference.

Fact is, this particular feature is known to take sites offline, and falls somewhere between hilariously wasteful and outright irresponsible. Even if you don’t agree that it qualifies as a “DDoS” on its own, it can easily be abused as part of a DDoS attack.

It’s a great subject for the Stallman copypasta though.

I’d just like to interject for a moment. What you’re referring to as a Mastodon DDoS, is in fact, Mastodon Request Amplification, or as I’ve recently taken to calling it, Mastodon plus Request Amplification. Request Amplification is not a DDoS unto itself, but rather another free component of a fully functioning DDoS system made useful by the REST API, Sidekiq jobs, and vital link preview generating components comprising a full Mastodon server as defined by Eugen.

What - if anything - should you do with this information?

If you suspect your website is being attacked with request amplification from Mastodon, you should see many concurrent requests with a User Agent similar to what’s below:

http.rb/5.1.0 (Mastodon/4.0.2; +https://example.com/)

You probably don’t need any specialized security service or WAF to defend against this (though those might mitigate such an attack automatically, so there’s that). The simplest fix - though one that provides no other resilience against DDoS attacks - is to block requests with “Mastodon” and/or “http.rb” in the User-Agent, as those represented >92% of link preview traffic to my site during my test (see Appendix). Of course, this will prevent link previews from being shown for your site on Mastodon.
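A sketch of that User-Agent check is below. The pattern and function name are my own choices - adapt the same matching rule to whatever web server or middleware you run:

```python
import re

# Sketch of the simple UA-based block described above: deny requests
# whose User-Agent mentions "Mastodon" or "http.rb". The pattern and
# helper name here are illustrative, not a standard API.
BLOCK_PATTERN = re.compile(r"Mastodon|http\.rb", re.IGNORECASE)

def should_block(user_agent: str) -> bool:
    return bool(BLOCK_PATTERN.search(user_agent or ""))

print(should_block("http.rb/5.1.0 (Mastodon/4.0.2; +https://example.com/)"))  # True
print(should_block("Mozilla/5.0 (X11; Linux x86_64)"))                        # False
```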

If you’re a website operator concerned about your site’s overall resilience, ideally you should be thoughtful about how your website performs under load and what resilience it has against concurrent or sustained requests. Depending on your needs, you may want to integrate a CDN or security service into your site preemptively - though these can take expertise to set up and, depending on your requirements and the traffic you serve, can also add substantial ongoing costs.

Yes, I know, in an ideal world you shouldn’t need to be a tuning/debugging/security/etc. expert to have an online presence. Unfortunately, we live in a world where snot-nosed CoD-lobby-inhabiting children with daddy’s credit card can spend $20 on a booter service and generate more than enough traffic to knock small sites offline anyway.

While Mastodon certainly isn’t the only source of bad behavior on the internet, it shouldn’t be a vector to enable bad behavior.

So if you’re a Mastodon server admin, please advocate for better solutions from Mastodon and other ActivityPub-implementing software, such as:

  • Administrative control over when (or if!) link previews are generated, or
  • Letting viewers request a link preview instead of generating one automatically, or
  • Removing the link preview feature entirely.

As far as I know, nobody has (intentionally) knocked sites offline with request amplification in Mastodon yet. That day is likely coming soon though, as this method is free and requires little or no expertise - as we’ve seen throughout, some sites already have to wait out the “Mastodon stampede” for a minute or two.

Appendix

A CSV file containing the following logs from my test is available if you’d like to take a peek at approximately where traffic was coming from, what the traffic pattern looked like, etc. This contains only link preview generation traffic from my target site when I ran the experiment, and has the following columns of data:

  • Cache Status
  • HTTP Code
  • Time (Unix time, in milliseconds)
  • Response Size
  • Path
  • Requester Country
  • Requester Software

No IP addresses, Mastodon instance information, etc. are shared. The CSV is available for download here for those curious.