4 posts tagged with "updates"

2024 Protocol Roadmap

May 6, 2024 · 11 min read

Discuss this post in our Github Discussion forums here

This roadmap is an update on our progress and lays out our general goals and focus for the coming months. This document is written for developers working on atproto clients, implementations, and applications (including Bluesky-specific projects). This is not a product announcement: while some product features are hinted at, we aren't promising specific timelines here. As always, most Bluesky software is free and open source, and observant folks can follow along with our progress week by week in GitHub.

In the big picture, we made a lot of progress on the protocol in early 2024. We opened up federation on the production network, demonstrated account migration, specified and launched stackable moderation (labeling and Ozone), shared our plan for OAuth, specified a generic proxying mechanism, built a new API documentation website (docs.bsky.app), and more.

After this big push on the protocol, the Bluesky engineering team is spending a few months catching up on some long-requested features like GIFs, video, and DMs. At the same time, we do have a few "enabling" pieces of protocol work underway, and continue to make progress towards a milestone of protocol maturity and stability.

Summary-level notes:

Federation is now open: you don't need to pre-register in Discord any more.
It is increasingly possible to build independent apps and integrations on atproto. One early example is https://whtwnd.com/, a blogging web app built on atproto.
The timeline for a formal standards body process is being pushed back until we have additional independent active projects building on the protocol.

Current Work

Proxying of Independent Lexicons: earlier this year we added a generic HTTP proxying mechanism, which allows clients to specify which onward service (eg, AppView) instance they want to communicate with. To date this has been limited to known Lexicons, but we will soon relax this restriction and make arbitrary XRPC query and procedure requests. Combined with allowing records with independent Lexicon schemas (now allowed), this finally enables building new independent atproto applications. PR for this work

Open Federation: the Bluesky Relay service initially required pre-registration before new PDS instances were crawled. This was a very informal process (using Discord) to prevent automated abuse, but we have removed this requirement, making it even easier to set up PDS instances. We will also bump the per-PDS account limits, though we will still enforce some limits to minimize automated abuse; these limits can be bumped for rapidly growing communities and projects.

Email 2FA: while OAuth is our main focus for improving account security (OAuth flows will enable arbitrary MFA, including passkeys, hardware tokens, authenticators, etc), we are rapidly rolling out a basic form of 2FA, using an emailed code in addition to account password for sign-in. This will be an optional opt-in functionality. Announcement with details

OAuth: we continue to make progress implementing our plan for OAuth. Ultimately this will completely replace the current account sign-up, session, and app-password API endpoints, though we will maintain backwards compatibility for a long period. With OAuth, account lifecycle, sign-in, and permission flows will be implementation-specific web views. This means that PDS implementations can add any sign-up screening or MFA methods they see fit, without needing support in the com.atproto.* Lexicons. Detailed Proposal

Product Features

These are not directly protocol-related, but are likely to impact many developers, so we wanted to give a heads up on these.

Harassment Mitigations: additional controls and mechanisms to reduce the prevalence, visibility, and impact of abusive mentions and replies, particularly coming from newly created single-purpose or throw-away accounts. May expand on the existing thread-gating and reply-gating functionality.

Post Embeds: the ability to embed Bluesky posts in external public websites. Including oEmbed support. This has already shipped! See embed.bsky.app

Basic "Off-Protocol" Direct Messages (DMs): having some mechanism to privately contact other Bluesky accounts is the most requested product feature. We looked closely at alternatives like linking to external services, re-using an existing protocol like Matrix, or rushing out on-protocol encrypted DMs, but ultimately decided to launch a basic centralized system to take the time pressure off our team and make our user community happy. We intend to iterate and fully support E2EE DMs as part of atproto itself, without a centralized service, and will take the time to get the user experience, security, and privacy polished. This will be a distinct part of the protocol from the repository abstraction, which is only used for public content.

Better GIF and Video support: the first step is improving embeds from external platforms (like Tenor for GIFs, and YouTube for video). Both the post-creation flow and embed-view experience will be improved.

Feed Interaction Metrics: feed services currently have no feedback on how users are interacting with the content that they curate. There is no way for users to tell specific feeds that they want to see more or less of certain kinds of content, or whether they have already seen content. We are adding a new endpoint for clients to submit behavior metrics to feed generators as a feedback mechanism. This feedback will be most useful for personalized feeds, and less useful for topic or community-oriented feeds. It also raises privacy and efficiency concerns, so sending of this metadata will both be controlled by clients (optional), and will require feed generator opt-in in the feed declaration record.

Topic/Community Feeds: one of the more common uses for feed generators is to categorize content by topic or community. These feeds are not personalized (they look the same to all users), are not particularly "algorithmic" (posts are either in the feed or not), and often have relatively clear inclusion criteria (though they may be additionally curated or filtered). We are exploring ways to make it easier to create, curate, and explore this type of feed.

User/Labeler Messaging: currently, independent moderators have no private mechanism to communicate with accounts which have reported content, or account which moderation actions have been taken against. All reports, including appeals, are uni-directional, and accounts have no record of the reports they have submitted. While Bluesky can send notification emails to accounts hosted on our own PDS instance, this does not work cross-provider with self-hosted PDS instances or independent labelers.

Protocol Stability Milestone

A lot of progress has been made in recent months on the parts of the protocol relevant to large-scale public conversation. The core concepts of autonomous identity (DIDs and handles), self-certifying data (repositories), content curation (feed generators), and stackable moderation (labelers) have now all been demonstrated on the live network.

While we will continue to make progress on additional objectives (see below), we feel we are approaching a milestone in development and stability of these components of the protocol. There are a few smaller tasks to resolve towards this milestone.

Takedowns: we have a written proposal for how content and account takedowns will work across different pieces of infrastructure in the network. Takedowns are a stronger intervention that complement the labeling system. Bluesky already has mechanisms to enact takedowns on our own infrastructure when needed, but there are some details of how inter-provider takedown requests are communicated.

Remaining Written Specifications: a few parts of the protocol have not been written up in the specifications at atproto.com.

Guidance on Building Apps and Integrations: while we hope the protocol will be adopted and built upon in unexpected ways, it would be helpful to have some basic pointers and advice on creating new applications and integrations. These will probably be informal tutorials and example code to start.

Account and Identity Firehose Events: while account and identity state are authoritatively managed across the DID, DNS, and PDS systems, it is efficient and helpful for changes to this state to be broadcast over the repository event stream ("firehose"). The semantics and behavior of the existing #identity event type will be updated and clarified, and an additional #account event type will be added to communicate PDS account deletion and takedown state to downstream services (Relay, and on to AppView, feed generator, labelers, etc). Downstream services might still need to resolve state from an authoritative source after being notified on the firehose.

Private Account Data Iteration: the app.bsky Lexicons currently include a preferences API, as well as some additional private state like mutes. The design of the current API is somewhat error-prone, difficult for independent developers to extend, and has unclear expectations around providing access to service providers (like independent AppViews). We are planning to iterate on this API, though it might not end up part of the near-term protocol milestone.

Protocol Tech Debt: there are a few other small technical issues to resolve or clean up; these are tracked in this GitHub discussion

On the Horizon

There are a few other pieces of protocol work which we are starting to plan out, but which are not currently scheduled to complete in 2024. It is very possible that priorities and schedules will be shuffled, but we mostly want to call these out as things we do want to complete, but will take a bit more time.

Protocol-Native DMs: as mentioned above, we want to have a "proper" DM solution as part of atproto, which is decentralized, E2EE, and follows modern security best practices.

Limited-Audience (Non-Public) Content: to start, we have prioritized the large-scale public conversation use cases in our protocol design, centered around the public data repository concept. While we support using the right tool for the job, and atproto is not trying to encompass every possible social modality, there are many situations and use-cases where having limited-audience content in the same overall application would be helpful. We intend to build a mechanism for group-private content sharing. It will likely be distinct from public data repositories and the Relay/firehose mechanism, but retain other parts of the protocol stack.

Firehose Bandwidth Efficiency: as the network grows, and the volume and rate of repository commits increases, the cost of subscribing to the entire Relay firehose increases. There are a number of ways to significantly improve bandwidth requirements: removing MST metadata for most use-cases; filtering by record types or subsets of accounts; batch compression; etc.

Record Versioning (Post Editing): atproto already supports updating records in repositories: one example is updating bsky profile records. And preparations were made early in the protocol design to support post editing while avoiding misleading edits. Ideally, it would also be possible to (optionally) keep old versions of records around in the repository, and allow referencing and accessing multiple versions of the same record.

PLC Transparency Log: we are exploring technical and organizational mechanisms to further de-centralize the DID PLC directory service. The most promising next step looks to be publishing a transparency log of all directory operations. This will make it easier for other organizations to audit the behavior of the directory and maintain verifiable replicas. The recent "tiling" transparency log design used for https://sunlight.dev/ (described here) is particularly promising. Compatibility with RFC 6962 (Certificate Transparency) could allow future integration with an existing ecosystem of witnesses and auditors.

Identity Key Self-Management UX: the DID PLC system has a concept of "rotation keys" to control the identity itself (in the form of the DID document). We would like to make it possible for users to optionally register additional keys on their personal devices, password managers, or hardware security keys. If done right, this should improve the resilience of the system and reduce some of the burden of responsibility on PDS operators. While this is technically possible today, it will require careful product design and security review to make this a safe and widely-adopted option.

Standards Body Timeline

As described in our 2023 Protocol Roadmap, we hope to bring atproto to an existing standards body to solidify governance and interoperability of the lower levels of the protocol. We had planned to start the formal process this summer, but as we talked to more people experienced with this process, we realized that we should wait until the design of the protocol has been explored by more developers. It would be ideal to have a couple organizations with atproto experience collaborate on the standards process together. If you are interested in being part of the atproto standards process, leave a message in the discussion thread for this post, or email protocol@blueskyweb.xyz.

While there has been a flowering of many projects built around the app.bsky microblogging application, there have been very few additional Lexicons and applications built from scratch. Some of this stemmed from restrictions on data schemas and proxying behavior on the Bluesky-hosted PDS instances, only relaxed just recently. We hope that new apps and Lexicons will exercise the full capabilities and corner-cases of the protocol.

We will continue to participate in adjacent standards efforts to make connections and get experience. Bluesky staff will attend IETF 120 in July, and are always happy to discuss responsible DNS integrations, OAuth, and HTTP API best practices.

Bluesky BGS and DID Document Formatting Changes

October 6, 2023 · 3 min read

We have a number of protocol and infrastructure changes rolling out in the next three months, and want to keep everybody in the loop.

This update was also emailed to the developer mailing list, which you can subscribe to here.

TL;DR

As of this week, the Bluesky AppView instance now consumes from a Bluesky BGS, instead of directly from the PDS. Devs can access the current streaming API at https://bsky.network/xrpc/com.atproto.sync.subscribeRepos or for WebSocket directly, wss://bsky.network/xrpc/com.atproto.sync.subscribeRepos
Your existing cursor for bsky.social will not be in sync with bsky.network, so check the live stream first to grab a recent seq before connecting!
We are updating the DID document public key syntax to “Multikey” format next week on the main network PLC directory (plc.directory). This change is already live on the sandbox PLC directory.

How will this affect me?

For today, if you're consuming the firehose, grab a new cursor from bsky.network and restart your firehose consumer pointed at bsky.network.

Bluesky BGS

The Bluesky services themselves are moving to a federated deployment, with multiple Bluesky (the company) PDS instances aggregated by a BGS, and the AppView downstream of that. As of yesterday, the Bluesky Appview instance (api.bsky.app) consumes from a Bluesky PBC BGS (bsky.network), which consumes from the Bluesky PDS (bsky.social). Until now, the AppView consumed directly from the PDS.

How close are we to federation?

Technically, the main network BGS could start consuming from independent PDS instances today, the same as the sandbox BGS does. We have configured it not to do so until we finish implementing some more details, and do our own round of security hardening. If you want to bang on the BGS implementation (written in Go, code in the indigo github repository), please do so in the sandbox environment, not the main network.

This change impacts devs in two ways:

In the next couple weeks, new Bluesky (company) PDS instances will appear in the main network. Our plan is to optionally abstract this away for most client developers, so they can continue to connect to bsky.social as a virtual PDS. But the actual PDS hostnames will be distinct and will show up in DID documents.
Firehose consumers (feed generators, mirrors, metrics dashboards, etc) will need to switch over and consume from the BGS instead of the PDS directly. If they do not, they will miss content from the new (Bluesky) PDS instances.

The firehose subscription endpoint, which works as of today, is https://bsky.network/xrpc/com.atproto.sync.subscribeRepos (or wss:// for WebSocket directly). Note that this endpoint has different sequence numbers. When switching over, we recommend folks consume from both the BGS and PDS for a period to ensure no events are lost, or to scroll back the BGS cursor to ensure there is reasonable overlap in streams.

We encourage developers and operators to switch to the BGS firehose sooner than later.

DID Document Formatting Changes

We also want to remind folks that we are planning to update the DID document public key syntax to “Multikey” format next week on the main network PLC directory (plc.directory). These changes are described here, with example documents for testing, and are live now on the sandbox PLC directory.

Rate Limits, PDS Distribution v3, and More

September 15, 2023 · 5 min read

To get future blog posts directly in your email, you can now subscribe to Bluesky’s Developer Mailing List here.

Adding Rate Limits

Now that we have a better sense of user activity on the network, we’re adding some application rate limits. This helps us keep the network secure — for example, by limiting the number of requests a user or bot can make in a given time period, it prevents bad actors from brute-forcing certain requests and helps us limit spammy behavior.

We’re adding a rate limit for the number of created actions per DID. These numbers shouldn’t affect typical Bluesky users, and won’t affect the majority of developers either, but it will affect prolific bots, such as the ones that follow every user or like every post on the network. The limit is 5,000 points per hour and 35,000 points per day, where:

Action Type	Value
CREATE	3 points
UPDATE	2 points
DELETE	1 point

To reiterate, these limits should be high enough to affect no human users, but low enough to constrain abusive or spammy bots. We decided to release this new rate limit immediately instead of giving developers an advance notice to secure the network from abusive behavior as soon as possible, especially since bad actors might take this blog post as an open invite!

Per this system, an account may create at most 1,666 records per hour and 11,666 records per day. That means an account can like up to 1,666 records in one hour with no problem. We took the most active human users on the network into account when we set this threshold (you surpassed our expectations!).

In case you missed it, in August, we added some other rate limits as well.

Global limit (aggregated across all routes)
- Rate limited by IP
- 3000/5 min
updateHandle
- Rate limited by DID
- 10/5 min
- 50/day
createAccount
- Rate limited by IP
- 100/5 min
createSession
- Rate limited by handle
- 30/5 min
- 300/day
deleteAccount
- Rate limited by IP
- 50/5 min
resetPassword
- Rate limited by IP
- 50/5 min

We’ll also return rate limit headers on each response so developers can dynamically adapt to these standards.

In a future update (in about a week), we’re also lowering the applyWrites limit from 200 to 10. This function applies a batch transaction of creates, updates, and deletes. This is part of the PDS distribution upgrade to v3 (read more below) — now that repos are ahistorical, we no longer need a higher limit to account for batch writes. applyWrites is used for transactional writes, and logic that requires more than 10 transactional records is rare.

PDS Distribution v3

We’re rolling out v3 of the PDS distribution. This shouldn’t be a breaking change, though we will be wiping the PLC sandbox. PDSs in parallel networks should still continue to operate with the new distribution.

Reminder: The PDS distribution auto-updates via the Watchtower companion Docker container, unless you specifically disabled that option. We’re adding the admin upgradeRepoVersion endpoint to the upgraded PDS distribution, so PDS admins can also upgrade their repos by hand.

Handle Invalidations on App View

Last month, we began proxying requests to the App View. In our federation architecture, the App View is the piece of the stack that gives you all your views of data, such as profiles and threads. Initially, we started out by serving all of these requests from our bsky.social PDS, but proxying these to the App View is one way of scaling our infrastructure to handle many more users on the network. (Read our federation architecture overview blog post for more information.)

For some users, this caused an invalid handle error. If you have an invalid handle, the user-facing UI will display this instead of your handle:

Screenshot of a profile with an invalid handle

You can use our debugging tool to investigate this: https://bsky-debug.app/handle. Just type your handle in. If it shows no error, please try updating your handle to the same handle you currently have to resolve this issue.

If the debugging page shows an error for your handle, follow this guide to make sure you set up your handle properly.

If that still isn’t working for you, file a support ticket through the app (“Help” button in the left menu on mobile or right side on desktop) and a Bluesky team member will assist you.

You can subscribe to Bluesky’s Developer Mailing List here to receive future updates in your email. If you received your invite code from the developer waitlist, you’re already subscribed. Each email will have the option to unsubscribe.

We’ll continue to publish updates to our technical blog as well as on the app from @atproto.com.

Updates to Repository Sync Semantics

August 24, 2023 · 4 min read

We’re excited to announce that we’re rolling out a new version of atproto repositories that removes history from the canonical structure of repositories, and replaces it with a logical clock. We’ll start rolling out this update next week (August 28, 2023).

For most developers with projects subscribed to the firehose, such as feed generators, this change shouldn’t affect you. These will only affect you if you’re doing commit-aware repo sync (a good rule of thumb is if you’ve ever passed earliest or latest to the com.atproto.sync.getRepo method) or are explicitly checking the repo version when processing commits.

Removing Repository History

Repositories on the AT Protocol are like Git repositories, but for structured records. Just like Git, each commit to an atproto repository currently includes a pointer to the previous commit. However, this approach has caused a couple of pain points:

Record deletions are difficult to process. If a user deletes a record, that commit needs to be erased from their repository to match their intent.
Increased storage cost. Maintaining repo history can cause anywhere from a 5-10x increase in repo size.

We attempted to resolve both of these in the current model through rebases (discrete moments when the history of a repository is deleted/mutated, like in Git). However, this is a tricky and sensitive operation that is expensive to conduct and complex to communicate across the network.

Using a Logical Clock for Repositories

To address the above issues, we’re replacing the prev pointer in commits with a logical clock. We originally published our intention to do so a few weeks ago. These are the changes we’re making to the way we handle repository history:

Incrementing the repo version to 3
Making the prev field on repo commits optional
Adding a new required rev (revision) field which is a logical clock
Removing or adjusting commit-aware repo sync mechanisms

Note: If you explicitly verify the version of a repo commit or do strict type checking on commit repo commits (which you shouldn’t — the spec allows unspecified fields!), you will need to make that check inclusive of version 3.

To facilitate backwards compatibility with software that is still running repo v2, we will continue setting the prev field on commits in the interim.

Even though we are setting the prev field, this can be considered a “hint” and the history is no longer considered a canonical part of the repository.

Repository Revisions

The new sync semantics for the repository rely on a logical clock included in each signed commit.

This “revision” takes the form of a TID and must be monotonically increasing.

The included revision serves a few functions:

Ordering

The clock provides a simple ordering mechanism for encountered repos or commits. If a consumer encounters the same repo from two different sources, each with a valid signature and structure, the revision gives a simple mechanism to determine which is the most recent repository.

Sync

When syncing a repository, revisions give a series of signposts that allow you to request everything from a given repo since a previously seen version. Because revisions are ordered and monotonically increasing, the provider does not necessarily need the exact revision that the consumer is asking for (as with a commit hash), rather they can provide all repo contents from the latest version of the repo that they remember that is before the requested revision.

The PDS for instance will track the revision at which each repo block or record was introduced into a repository. If a consumer asks for every block or record since a given revision, the PDS has a simple mechanism by which to give that information, without needing a complicated sync algorithm.

Stale Reads

Finally, a logical clock on the repo gives us a mechanism through which we can detect stale reads. (We actually already snuck this in with an optional revision field on v2 repos!)

Repo revisions may be returned in response headers to most requests. A client will know their own repo’s current revision and can compare that with the upstream service’s revision.

We use this today on the PDS to paper over some read-after-write concerns that are inherent in eventually consistent architectures. Some clients may use these headers to alert their users that their PDS is “out of sync” with other services in the network (for instance an AppView).

Available sync methods

If you have questions about these changes, join us on GitHub Discussions here.

Current Work​

Product Features​

Protocol Stability Milestone​

On the Horizon​

Standards Body Timeline​

TL;DR​

Bluesky BGS​

How close are we to federation?​

This change impacts devs in two ways:​

DID Document Formatting Changes​

Adding Rate Limits​

PDS Distribution v3​

Handle Invalidations on App View​

Subscribe for Developer Updates​

Removing Repository History​

Using a Logical Clock for Repositories​

Repository Revisions​

Ordering​

Sync​

Stale Reads​

Available sync methods​