Skip to main content

2 posts tagged with "federation"

View All Tags

Early Access Federation for Self-Hosters

· 5 min read

For a high-level introduction to data federation, as well as a comparison to other federated social protocols, check out the Bluesky blog.

Update May 2024: we have removed the Discord registration requirement, and PDS instances can now connect to the network directly. You are still welcome to join the PDS Admins Discord for community support.

Today, we’re releasing an early-access version of federation intended for self-hosters and developers.

The atproto network is built upon a layer of self-authenticating data. This foundation is critical to guaranteeing the network’s long term integrity. But the protocol’s openness ultimately flows from a diverse set of hosts broadcasting this data across the network.

Up until now, every user on the network used a Bluesky PDS (Personal Data Server) to host their data. We’ve already federated our own data hosting on the backend, both to help operationally scale our service, and to prove out the technical underpinnings of an openly federated network. But today we’re opening up federation for anyone else to begin connecting with the network.

The PDS, in many ways, fulfills a simple role: it hosts your account and gives you the ability to log in, it holds the signing keys for your data, and it keeps your data online and highly available. Unlike a Mastodon instance, it does not need to function as a full-fledged social media service. We wanted to make atproto data hosting—like web hosting—into a fairly simple commoditized service. The PDS’s role has been limited in scope to achieve this goal. By limiting the scope, the role of a PDS in maintaining an open and fluid data network has become all the more powerful.

We’ve packaged the PDS into a friendly distribution with an installer script that handles much of the complexity of setting up a PDS. After you set up your PDS and join the PDS Admins Discord to submit a request for your PDS to be added to the network, your PDS’s data will get routed to other services in the network (like feed generators and the Bluesky Appview) through our Relay, the firehose provider. Check out our Federation Overview for more information on how data flows through the atproto network.

Early Access Limitations

As with many open systems, Relays will never be totally unconstrained in terms of what data they’re willing to crawl and rebroadcast. To prevent network and resource abuse, it will be necessary to place rate limits on the PDS hosts that they consume data from. As trust and reputation is established with PDS hosts, those rate limits will increase. We’ll gain capacity to increase the baseline rate limits we have in place for new PDSs in the network as we build better tools for detecting and mitigating abuse..

For a smooth transition into a federated network, we’re starting with some lower limits. Specifically, each PDS will be able to host 10 accounts and limited to 1500 evts/hr and 10,000 evts/day. After those limits are surpassed, we’ll stop crawling the PDS until the rate limit period resets. This is intended to keep the network and firehose running smoothly for everyone in the ecosystem.

These are early days, and we have some big changes still planned for the PDS distribution (including the introduction of OAuth!). The software will be updating frequently and things may break. We will not be breaking things indiscriminately. However, in this early period, in order to avoid cruft in the protocol and PDS distribution, we are not making promises of backwards compatibility. We will be supporting a migration path for each release, but if you do not keep your PDS distribution up to date, it may break and render the app unusable until you do so.

Because the PDS distribution is not totally settled, we want to have a line of communication with PDS admins in the network, so we’re asking any developer that plans to run a PDS to join the PDS Admins Discord. You’ll need to provide the hostname of your PDS and a contact email in order to get your PDS added to the Relay’s allowlist. This Discord will serve as a channel where we can announce updates about the PDS distribution, relay policy, and federation more generally. It will also serve as a community where PDS admins can experiment, chat, and help each other debug issues.

Account Migration

A major promise of the AT Protocol is the ability to migrate accounts between PDS hosts. This is an important check against potential abuse, and further safeguards the fluid open layer of data hosting that underpins the network.

The PDS distribution that we’re releasing has all of the facilities required to migrate accounts between servers. We’re also opening routes on our PDS that will allow you to migrate your account off our server. However — we do not yet provide the capability to migrate back to the Bluesky PDS, so for the time being, this is a one way street. Be warned: these migrations involve possibly destructive identity operations. While we have some guardrails in place, it may still be possible for you to break your account and we will not be able to help you recover it. So although it’s technically possible, we do not recommend migrating your main account between servers yet. We especially recommend against doing so if you do not have familiarity with how DID PLC operations work.

In the coming months we will be hardening this feature and making it safer and easier to do, including creating an in-app flow for moving between servers.

Getting Started

To get started, join the PDS Administrators Discord, and check out the bluesky-social/pds repo on Github. The README will provide all necessary information on getting your PDS setup and running.

Federation Developer Sandbox Guidelines

· 7 min read

Update June 2024: Since earlier this year it has been possible to federate directly in the live atproto network. The sandbox environment is no longer active and has been shut down.

Welcome to the atproto federation developer sandbox! ✨

This is a completely separate network from our production services that allows us to test out the federation architecture and wire protocol.

The federation sandbox environment is an area set up for exploration and testing of the technical components of the AT Protocol distributed social network. It is intended for developers and self-hosters to test out data availability in a federated environment.

To maintain a positive and productive developer experience, we've established this Code of Conduct that outlines our expectations and guidelines. This sandbox environment is initially meant to test the technical components of federation.

Given that this is a testing environment, we will be defederating from any instances that do not abide by these guidelines, or that cause unnecessary trouble, and will not be providing specific justifications for these decisions.

Using the sandbox environment means you agree to adhere to our Guidelines. Please read the following carefully:

Post responsibly. The sandbox environment is intended to test infrastructure, but user content may be created as part of this testing process. Content generation can be automated or manual. Do not post content that requires active moderation or violates the Bluesky Community Guidelines.

Keep the emphasis on testing. We’re striving to maintain a sandbox environment that fosters learning and technical growth. We will defederate with instances that recruit users without making it clear that this is a test environment.

Do limit account creation. We don't want any one server using a majority of the resources in the sandbox. To keep things balanced, to start, we’re only federating with Personal Data Servers (PDS) with up to 1000 accounts. However, we may change this if needed.

Don’t expect persistence or uptime. We will routinely be wiping the data on our infrastructure. This is intended to reset the network state and to test sync protocols. Accounts and content should not be mirrored or migrated between the sandbox and real-world environments.

Don't advertise your service as being "Bluesky." This is a developer sandbox and is meant for technical users. Do not promote your service as being a way for non-technical users to use Bluesky.

Do not mirror sandbox did:plcs to production.

Status and Wipes

🐉 Beware of dragons!

This hasn’t been production tested yet. It seems to work pretty well, but who knows what’s lurking under the surface — that's what this sandbox is for! Have patience with us as we prep for federation.

On that note, please give us feedback either in Issues (actual bugs) or Discussions (higher-level questions/discussions) on the atproto repo.

🗓 Routine wipes

As part of the sandbox, we will be doing routine wipes of all network data.

We expect to perform wipes on a weekly or bi-weekly basis, though we reserve the right to do a wipe at any point.

When we wipe data, we will be wiping it on all services (BGS, App View, PLC). We will also mark any existing DIDs as “invalid” & will refuse to index those accounts in the next epoch of the network to discourage users from attempting to “rollover” their accounts across wipes.

Getting started ✨

Now that you've read the sandbox guidelines, you're ready to self-host a PDS in the developer sandbox. For complete instructions on getting your PDS set up, check out the README.

To access your account, you’ll log in with the client of your choice in the exact same way that you log into production Bluesky, for instance the Bluesky web client. When you do so, please provide the url of your PDS as the service that you wish to log in to.

Auto-updates

We’ve included Watchtower in the PDS distribution. Every day at midnight PST, this will check our GitHub container registry to see if there is a new version of the PDS container & update it on your service.

This will allow us to rapidly iterate on protocol changes, as we’ll be able to push them out to the network on a daily basis.

When we do routine network wipes, we will be pushing out a database migration to participating PDS that wipes content and accounts.

You are within your rights to disable Watchtower auto-updates, but we strongly encourage their use and will not be providing support if you decide not to run the most up-to-date PDS distribution.

Odds & Ends & Warnings & Reminders

🧪 Experiment & have fun!

🤖 Run feed generators. They should work the exact same way as production - be sure to adjust your env to listen to Sandbox BGS!

🌈 Feel free to run your own AppView or BGS - although it’s a bit more involved & we’ll be providing limited support for this.

👤 Your PDS will provide your handle by default. Custom domain handles should work exactly the same in sandbox as they do on production Bluesky. Although you will not be able to re-use your handle from production Bluesky as you can only have one DID set per handle.

🚨 If you follow the self-hosted PDS setup instructions, you’ll have private key material in your env file - be careful about sharing that!

📣 This is a sandbox version of a public broadcast protocol - please do not share sensitive information.

🤝 Help each other out! Respond to issues & discussions, chat in the community-run Matrix or Discord, etc.

Learn more about atproto federation

Check out the high-level view of federation.

Dive deeper with the atproto docs.

Network Services

We are running three services: PLC, BGS, Bluesky "App View"

PLC

Hostname: plc.bsky-sandbox.dev

Code: https://github.com/did-method-plc/did-method-plc

PLC is the default DID provider for the network. DIDs are the root of your identity in the network. Sandbox PLC functions exactly the same as production PLC, but it is run as a separate service with a separate dataset. The DID resolution client in the self-hosted PDS package is set up to talk the correct PLC service.

BGS

Hostname: bgs.bsky-sandbox.dev

Code: https://github.com/bluesky-social/indigo/tree/main/bgs

BGS (Big Graph Service) is the firehose for the entire network. It collates data from PDSs & rebroadcasts them out on one giant websocket.

BGS has to find out about your server somehow, so when we do any sort of write, we ping BGS with com.atproto.sync.requestCrawl to notify it of new data. This is done automatically in the self-hosted PDS package.

If you’re familiar with the Bluesky production firehose, you can subscribe to the BGS firehose in the exact same manner, the interface & data should be identical

Bluesky App View

Hostname: api.bsky-sandbox.dev

Code: https://github.com/bluesky-social/atproto/tree/main/packages/bsky

The Bluesky App View aggregates data from across the network to service the Bluesky microblogging application. It consumes the firehose from the BGS, processing it into serviceable views of the network such as feeds, post threads, and user profiles. It functions as a fairly traditional web service.

When you request a Bluesky-related view from your PDS (getProfile for instance), your PDS will actually proxy the request up to App View.

Feel free to experiment with running your own App View if you like!

The PDS

The PDS (Personal Data Server) is where users host their social data such as posts, profiles, likes, and follows. The goal of the sandbox is to federate many PDS together, so we hope you’ll run your own.

We’re not actually running a Bluesky PDS in sandbox. You might see Bluesky team members' accounts in the sandbox environment, but those are self-hosted too.

The PDS that you’ll be running is much of the same code that is running on the Bluesky production PDS. Notably, all of the in-pds-appview code has been torn out. You can see the actual PDS code that you’re running on the atproto/simplify-pds branch.

Feedback

We're excited for you to join us in the developer sandbox soon! Please give us feedback either in Issues (actual bugs) or Discussions (higher-level questions/discussions) on the atproto repo.