It's possible to spin up alternative relays. For example: https://whtwnd.com/bne...

boramalper · 2025-04-26T08:10:46 1745655046

Genuine question: if it's so easy and cheap to host a relay, why then "Free Our Feeds" initiative [0] looking to raise $4,000,000 [1] to establish a second relay [2]? Most of that money must be earmarked for administrative and human expenses then, right?

[0] https://freeourfeeds.com/

[1] https://www.gofundme.com/f/help-us-free-social-media-from-bi...

[2] https://freeourfeeds.com/ § FAQ § What will the money be used for?

yellowapple · 2025-04-26T08:42:40 1745656960

> if it's so easy and cheap to host a relay

That wasn't asserted anywhere. Quite the opposite: as I explained above, the expense is why few people have done it (and even fewer have done it in production). It's the PDSes which are (relatively) cheap and easy to self-host.

> why then "Free Our Feeds" initiative [0] looking to raise $4,000,000 [1] to establish a second relay [2]?

Per the section you cite, they're doing a lot more with that money than running a second relay: they're spinning up an entirely separate organization independent of Bluesky to develop ATproto and applications using it. That includes, but is nowhere implied or explied to be limited to, the "second relay" they mention.

In any case, even the self-hosted relay described in that above-linked blog post (let alone some RPi under someone's bed) is in all likelihood a long ways off from one that's even remotely production-ready. There's no mention of redundancy, no mention of future-proofing, etc. It's reasonable to assume that the "second relay" would be multiple such relays, likely on machines with even beefier specs - in other words, at least as capable as the existing Bluesky-managed relay. I'd also be unsurprised if it expanded to a "third relay" and "fourth relay" and so on.

Further, there's more to running a relay than just the hardware; you need someone to maintain it. $4 million pays for 40 employee-years (assuming every employee is full-time with an annual salary of $100k). That could be one sysadmin for 40 years, or an 8 person team for 5 years, or a 40 person team for 1 year, or what have you. Free our feeds claims they'll need $30 million over 3 years, i.e. $10 million per year; if half that goes to salaries, we end up with a napkin-math-guesstimated team size of 50 - which is about the size I'd expect for an organization that wants to independently maintain a bunch of technical infrastructure, develop applications, prod whomever needs prodded to get ATproto formally standardized, etc.

xethos · 2025-04-26T09:26:54 1745659614

I at least am contrasting

> [running a relay being cheap and easy] wasn't asserted anywhere

With

> I recall posts of people running their own relays on RPi4s with NVMe drives

I would absolutely consider software I can host at home, on a RPi, cheap and easy to self-host. That's the assertion that's being called out here. Bluesky's relays do not scale down easily, and are difficult and expensive to host

yellowapple · 2025-04-26T09:34:32 1745660072

> I would absolutely consider software I can host at home, on a RPi, cheap and easy to self-host.

That's expensive and difficult compared to running a PDS or appview (either of which can run with a tiny fraction of even an RPi's resources), which is exactly what I said. And to reiterate: an RPi4 with an NVMe SSD is very far off from something that's production-ready and suitable for public use. You can run your own relay, but it's probably not going to handle 30+ million users like Bluesky's relay does, or like Free Our Feeds' "second relay" presumably seeks to do.

haileyok · 2025-04-27T02:26:56 1745720816

- A raspberry pi with a nvme drive costs 200 dollars one time. Are you seriously going to assert that is expensive?

- You don’t understand what a relay does in atproto given the rest of your reply and should look.

yellowapple · 2025-04-27T23:57:56 1745798276

> A raspberry pi with a nvme drive costs 200 dollars one time. Are you seriously going to assert that is expensive?

Compared to the hardware required to run a PDS, yes, absolutely. "Expensive" is relative.

And like I've said above, a Raspberry Pi with an NVMe drive is surely a long ways off from Bluesky's own relay. It's good enough for personal needs, not for any sort of production use.

> You don’t understand what a relay does in atproto given the rest of your reply and should look.

I'd appreciate specific corrections, so that I (and anyone else reading these comments) can be better-informed.

(EDIT: my apologies for the previous version of this comment, which might've come across a bit hostile. That ain't my intention.)

Karliss · 2025-04-26T08:26:45 1745656005

Scale, its always question of scale. Whatever youncan easily and cheaply host yourself will likely be only good enough for you and your friends or family. Anything more will require more hardware and dedicated people for maintaining that hardware and software.

half-kh-hacker · 2025-04-26T09:51:27 1745661087

plainly, free our feeds are grifting.

the relay at this point is non-archival and can be spun up trivially. with a small sliding history window for subscriber catchup u can use like 32gb of scratch disk space and keep a few hours, the relay is literally just a subscribeRepos forwarder from PDSes.

the AppView is vastly more expensive to run since you need to handle the write volume of all bsky activity. if you build a non-bsky app on atproto this is a non-issue

the issue here really is that nobody writes about the state of things in long form outside the network so it's not really known how fast things move and change by those not engaged with the platform

yellowapple · 2025-04-26T12:58:44 1745672324

Re: the relay, that depends on your needs. My impression from these sorts of "run a relay on an RPi" projects is that they're only dealing with a subset of the full firehose Bluesky's relay has to deal with - be it a shorter timeline (as you mention) or only concerning themselves with specific accounts (like the relay operator's own "following" feed, in the case of someone running a personal relay) or what have you. Pretty sure even Bluesky's relay doesn't try to drink from the whole firehose (or if it does, it's tolerant of "dribbling" so to speak; I recall a Bluesky dev blog post about how temporarily dropping posts from users' feeds is acceptable if the relay can't keep up).

Re: write traffic, my understanding is that the appviews shift most (if not all) of that burden to the PDSes, no?

half-kh-hacker · 2025-04-27T03:36:39 1745724999

- the relay storage volume scales only with the backfill window for consumers that drop briefly - the bluesky pbc operated relays let you reconnect up to 24h later and not miss any events but that requires around 200gb of scratch disk space -- live tailing an rpi relay without dropping a connection can give you events from the full network span (ie the complete set of the firehose) without requiring any backfill window, but it's nice to use a few tens of gigabytes anyway. -- the full firehose is like 20mbps at maximum so it's far from hard to serve a few live consumers

- bluesky's feed gen post-dropping is about internal operation of their appview and not anything to do with network sync semantics

- if you're running an AppView for the bsky data you are likely keeping a copy of all bsky posts in a database, since fetching from PDSes on-the-fly is network intensive over a relatively small pipe, which is what i mean by write volume requirements.

yellowapple · 2025-04-28T00:30:11 1745800211

> the relay storage volume scales only with the backfill window for consumers that drop briefly […] bluesky's feed gen post-dropping is about internal operation of their appview and not anything to do with network sync semantics

Gotcha; thanks for the clarifications/corrections. Good to know that the firehose bandwidth is a lot less than I thought (though 20Mbps can certainly add up to some hefty pricetags depending on how you're billed for traffic).

> if you're running an AppView for the bsky data you are likely keeping a copy of all bsky posts in a database, since fetching from PDSes on-the-fly is network intensive over a relatively small pipe, which is what i mean by write volume requirements.

Right, but how much of that actually needs to hit the disk? I'd imagine most appviews can readily get away with just keeping posts in RAM, and even if disk storage is desired (e.g. to avoid needing to pull everything from the PDSes if an appview server reboots), it ain't like the writes need to be synchronous or low-latency. A full-blown ACID-compliant DBMS is probably overkill.

It'd also be overkill to cache all posts, rather than subsets (e.g. each users' "Discover" and "Following" feeds), so I reckon that'd also reduce the in-appview caching needs further.

half-kh-hacker · 2025-04-28T11:08:58 1745838538

the reference bluesky backend does just keep everything around but this idea has merit!! you're actually reinventing something like AppViewLite right now, which does throw away old data: https://github.com/alnkesq/AppViewLite

bluesky chooses to not refetch data from PDSes all the time so that the load for a PDS stays low (they like it to be possible to run on a home connection)