ActivityPub and WebFinger

Final Community Group Report

This version:
https://www.w3.org/community/reports/socialcg/CG-FINAL-apwf-20240608/
Latest published version:
https://www.w3.org/community/reports/socialcg/apwf/
Latest editor's draft:
https://swicg.github.io/activitypub-webfinger/
Editors:
a
Evan Prodromou
Feedback:
GitHub swicg/activitypub-webfinger (pull requests, new issue, open issues)

Abstract

Identifiers in ActivityPub tend to be HTTPS URIs. The use of WebFinger (as defined in [RFC7033]) allows for discovery of an actor's identifier given a username and a hostname, which may be more socially salient or otherwise easier to communicate across various contexts and media. The username and hostname are resolved at the WebFinger endpoint of the hostname in order to discover a link to an actor associated with the user's account, and that actor similarly can be back-linked to the username and hostname.

Status of This Document

This specification was published by the Social Web Incubator Community Group. It is not a W3C Standard nor is it on the W3C Standards Track. Please note that under the W3C Community Final Specification Agreement (FSA) other conditions apply. Learn more about W3C Community and Business Groups.

GitHub Issues are preferred for discussion of this specification.

1. Motivation

This section is non-normative.

Consider an HTTPS URI of the form https://social.example/actors/9c5b94b1-35ad-49bb-b118-8e8fc24abf80 being used as an identifier for an actor associated with a user account. Communicating this digitally may be done by simply using the HTTPS URI as-is, as a hyperlink reference. However, communicating this verbally or in a space-constrained visual format can be difficult. WebFinger allows communicating aliases of the form alyssa@social.example, which are easier to work with in the previously-cited cases.

Additional benefits of using WebFinger include smoothing over the differences between varying actor URI schemas. Different software implementations may provide human-friendly URLs for an actor's profile, but these URLs may take several different forms.

Conventionally, people can be identified by their user@domain address, while documents can be identified by their HTTPS location.

2. Discovery

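Discovery can occur in one of two directions:

  • Given a WebFinger address, discover the associated actor document
  • Given an actor document, discover the associated WebFinger address

The former will be referred to as "forward discovery" and the latter will be referred to as "reverse discovery".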

2.1 Forward discovery of an actor document given a WebFinger address

Given a username and hostname in the form user@domain:

  1. Construct an acct: URI of the form acct:user@domain (as defined in [RFC7565])
  2. Make an HTTP GET request to that hostname's WebFinger well-known endpoint, using the acct: URI as the value of the resource query parameter (as described in [RFC7033])

For example, the WebFinger address alyssa@social.example can be resolved as a resource by making an HTTP GET request for https://social.example/.well-known/webfinger?resource=acct:alyssa@social.example (which is https://social.example/.well-known/webfinger?resource=acct:alyssa%40social.example when percent-encoded). This request MUST return a JRD (JSON Resource Descriptor, as defined in [RFC6415]) with application/jrd+json as the content type (assuming no Accept header is specified).
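The two construction steps can be sketched in Python as follows. This is a minimal illustration, not part of the specification: the function name webfinger_url is hypothetical, and the sketch assumes the address has already been reduced to a plain user@domain string.

```python
from urllib.parse import quote


def webfinger_url(address: str) -> str:
    """Build the WebFinger lookup URL for an address of the form user@domain."""
    # Split on the last "@" so that the hostname is whatever follows it.
    user, sep, host = address.rpartition("@")
    if not sep or not user or not host:
        raise ValueError(f"not a user@domain address: {address!r}")
    # Step 1: construct the acct: URI.
    acct_uri = f"acct:{user}@{host}"
    # Step 2: use it as the resource query parameter. ":" is kept literal
    # and "@" is percent-encoded, matching the encoded form shown above.
    resource = quote(acct_uri, safe=":")
    return f"https://{host}/.well-known/webfinger?resource={resource}"
```

Calling webfinger_url("alyssa@social.example") yields the percent-encoded URL from the example above.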

The WebFinger request and response may look like this:

GET /.well-known/webfinger?resource=acct:alyssa@social.example HTTP/1.1
Host: social.example

HTTP/1.1 200 OK
Content-Type: application/jrd+json

{
  "subject": "acct:alyssa@social.example",
  "aliases": [
    "https://social.example/@alyssa",
    "https://social.example/actors/9c5b94b1-35ad-49bb-b118-8e8fc24abf80"
  ],
  "links": [
    {
      "rel": "http://webfinger.net/rel/profile-page",
      "type": "text/html",
      "href": "https://social.example/@alyssa"
    },
    {
      "rel": "self",
      "type": "application/activity+json",
      "href": "https://social.example/actors/9c5b94b1-35ad-49bb-b118-8e8fc24abf80"
    }
  ]
}

At this point, you can parse for the href of the element of links that has a rel of self and a type of either application/ld+json; profile="https://www.w3.org/ns/activitystreams" or application/activity+json (depending on the implementation). See 3.2 Establishing a link between the WebFinger resource and the actor document for more information about this.
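Selecting the link can be sketched as below, assuming the JRD has already been parsed into a Python dict. The names AS2_PROFILE, is_activitypub_type and find_actor_href are illustrative only, and the media type handling is deliberately simple rather than a full parser.

```python
from typing import Optional

AS2_PROFILE = "https://www.w3.org/ns/activitystreams"


def is_activitypub_type(media_type: str) -> bool:
    """Return True for the two media types named in the paragraph above."""
    base, *params = [part.strip() for part in media_type.split(";")]
    base = base.lower()
    if base == "application/activity+json":
        return True
    if base == "application/ld+json":
        # Require the ActivityStreams profile parameter, quoted or unquoted.
        for param in params:
            name, _, value = param.partition("=")
            profiles = value.strip().strip('"').split()
            if name.strip().lower() == "profile" and AS2_PROFILE in profiles:
                return True
    return False


def find_actor_href(jrd: dict) -> Optional[str]:
    """Return the href of the first rel="self" link with an ActivityPub type."""
    for link in jrd.get("links", []):
        if link.get("rel") == "self" and is_activitypub_type(link.get("type", "")):
            return link.get("href")
    return None
```

Applied to the response shown above, find_actor_href returns the actor URI from the self link and skips the profile-page link.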

2.2 Reverse discovery of a WebFinger address given an actor document

Given an actor with an id and a preferredUsername:

  1. Take the hostname of the id to discover the WebFinger domain
  2. Combine the preferredUsername and the WebFinger domain in order to form a WebFinger address
  3. Verify that this WebFinger address links back to the same actor when performing discovery as described in 2.1 Forward discovery of an actor document given a WebFinger address
  4. Optionally: If the JRD from the previous step has a subject and it contains an acct: URI different from the one you constructed, perform a verification discovery against that acct: URI afterward. (In such cases, the subject of the JRD denotes the expected canonical identifier.)

For example, given an actor document at https://activitypub.example.com/actors/1 like so:

{
  "@context": "https://www.w3.org/ns/activitystreams",
  "id": "https://activitypub.example.com/actors/1",
  "preferredUsername": "alice",
  "name": "Alice P. Hacker"
}

The reverse discovery process would extract alice and activitypub.example.com, construct the acct: URI acct:alice@activitypub.example.com, then request https://activitypub.example.com/.well-known/webfinger?resource=acct:alice@activitypub.example.com like so:

GET /.well-known/webfinger?resource=acct:alice@activitypub.example.com HTTP/1.1
Host: activitypub.example.com

HTTP/1.1 200 OK
Content-Type: application/jrd+json

{
  "subject": "acct:alice@example.com",
  "aliases": [
    "https://example.com/@alice",
    "https://activitypub.example.com/actors/1"
  ],
  "links": [
    {
      "rel": "http://webfinger.net/rel/profile-page",
      "type": "text/html",
      "href": "https://example.com/@alice"
    },
    {
      "rel": "self",
      "type": "application/ld+json; profile=\"https://www.w3.org/ns/activitystreams\"",
      "href": "https://activitypub.example.com/actors/1"
    }
  ]
}

At this point, we have validated that alice@activitypub.example.com links back to our actor document, but we can optionally verify that the canonical WebFinger address of alice@example.com also links back to the same actor document:

GET /.well-known/webfinger?resource=acct:alice@example.com HTTP/1.1
Host: example.com

HTTP/1.1 307 Temporary Redirect
Location: https://activitypub.example.com/.well-known/webfinger?resource=acct:alice@example.com

GET /.well-known/webfinger?resource=acct:alice@example.com HTTP/1.1
Host: activitypub.example.com

HTTP/1.1 200 OK
Content-Type: application/jrd+json

{
  "subject": "acct:alice@example.com",
  "aliases": [
    "https://example.com/@alice",
    "https://activitypub.example.com/actors/1"
  ],
  "links": [
    {
      "rel": "http://webfinger.net/rel/profile-page",
      "type": "text/html",
      "href": "https://example.com/@alice"
    },
    {
      "rel": "self",
      "type": "application/ld+json; profile=\"https://www.w3.org/ns/activitystreams\"",
      "href": "https://activitypub.example.com/actors/1"
    }
  ]
}
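
The reverse discovery steps can be sketched as follows. This is an illustration under stated assumptions rather than a reference implementation: fetch_jrd is a hypothetical callable that performs the WebFinger request (following redirects) and returns the parsed JRD, and falling back to the constructed address when the optional canonical check fails is a design choice of this sketch, not something the text above prescribes.

```python
from typing import Callable, Optional
from urllib.parse import urlsplit

# Hypothetical fetcher: takes (host, acct_uri) and returns the parsed JRD.
# Passing it in keeps the sketch free of any particular HTTP library.
FetchJrd = Callable[[str, str], dict]

ACTIVITYPUB_TYPES = ("application/activity+json", "application/ld+json")


def links_to_actor(jrd: dict, actor_id: str) -> bool:
    """True if the JRD has a rel="self" ActivityPub link pointing at actor_id."""
    for link in jrd.get("links", []):
        if (
            link.get("rel") == "self"
            and link.get("type", "").startswith(ACTIVITYPUB_TYPES)
            and link.get("href") == actor_id
        ):
            return True
    return False


def reverse_discover(actor: dict, fetch_jrd: FetchJrd) -> Optional[str]:
    """Return a verified user@domain address for the actor, or None."""
    # Steps 1 and 2: hostname of the id plus preferredUsername.
    host = urlsplit(actor["id"]).hostname
    acct_uri = f"acct:{actor['preferredUsername']}@{host}"
    # Step 3: the constructed address must link back to the same actor.
    jrd = fetch_jrd(host, acct_uri)
    if not links_to_actor(jrd, actor["id"]):
        return None
    # Step 4 (optional): if the subject differs, verify it as well.
    subject = jrd.get("subject", "")
    if subject.startswith("acct:") and subject != acct_uri:
        canonical_host = subject.rpartition("@")[2]
        if links_to_actor(fetch_jrd(canonical_host, subject), actor["id"]):
            return subject[len("acct:"):]
    return acct_uri[len("acct:"):]
```

The function returns the canonical address when the subject verifies, the constructed address otherwise, and None when the constructed address does not link back to the actor.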

3. Encoding

To ensure smooth operation of the WebFinger discovery flows, identifiers and responses should follow certain guidelines for encoding.

3.1 Limitations on usernames and hostnames

The acct: URI scheme is defined in [RFC7565], which contains ABNF ([RFC5234]) for allowed characters (inheriting from [RFC3986] as well):

acctURI      = "acct" ":" userpart "@" host
userpart     = unreserved / sub-delims
               0*( unreserved / pct-encoded / sub-delims )
; userpart regex: [A-Za-z0-9\-\.\_\~\!\$\&\'\(\)\*\+\,\;\=](?:[A-Za-z0-9\-\.\_\~\!\$\&\'\(\)\*\+\,\;\=]|(?:%[0-9A-Fa-f]{2}))*
unreserved   = ALPHA / DIGIT / "-" / "." / "_" / "~"
sub-delims   = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
pct-encoded  = "%" HEXDIG HEXDIG

host         = IP-literal / IPv4address / reg-name
reg-name     = *( unreserved / pct-encoded / sub-delims )
; reg-name regex: (?:[A-Za-z0-9\-\.\_\~\!\$\&\'\(\)\*\+\,\;\=]|(?:%[0-9A-Fa-f]{2}))*
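
As a non-normative illustration, the regular expressions in the comments above can be used to validate an acct: URI. The sketch below makes two simplifying assumptions that are not part of the grammar: it only handles hosts of the reg-name form (not IP-literal), and it rejects an empty host even though reg-name permits one. The names USERPART, REG_NAME and parse_acct are hypothetical.

```python
import re

# Character classes transcribed from the ABNF comments above.
_CHAR = r"[A-Za-z0-9\-._~!$&'()*+,;=]"
_PCT = r"%[0-9A-Fa-f]{2}"
USERPART = re.compile(rf"{_CHAR}(?:{_CHAR}|{_PCT})*")
REG_NAME = re.compile(rf"(?:{_CHAR}|{_PCT})*")


def parse_acct(uri: str) -> tuple[str, str]:
    """Split an acct: URI into (userpart, host), validating both parts."""
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme.lower() != "acct":
        raise ValueError("not an acct: URI")
    # Neither userpart nor reg-name may contain a literal "@",
    # so the first "@" is the separator.
    userpart, sep, host = rest.partition("@")
    if not sep:
        raise ValueError("missing '@' separator")
    if not USERPART.fullmatch(userpart):
        raise ValueError("invalid userpart")
    if not host or not REG_NAME.fullmatch(host):
        raise ValueError("invalid host")
    return userpart, host
```

For instance, acct:alyssa@social.example parses successfully, while a userpart that begins with a percent-encoded octet is rejected because the first character must be unreserved or a sub-delim.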

Further restrictions are specified in [RFC7565]:

3.1.1 Current implementation rules

This section is non-normative.

Note that while there are several symbols allowed in the userpart, the de facto limits set by some current implementers are much more restrictive.

At the time of this writing, Mastodon enforces the following rules:

  • The username must be at least one character
  • ASCII alphanumeric characters (A through Z, a through z, 0 through 9) and underscores (_) are generally allowed anywhere in the username
  • Dots (.) and dashes (-) are allowed in the middle of a username, but not as the first character or the last character
  • All other symbols are disallowed
  • Usernames are case-insensitive

In other words, Mastodon accepts usernames that match the following ABNF, which can also be expressed as a regular expression:

; As a regular expression, this can be expressed as follows:
; /[a-z0-9_]+([a-z0-9_.-]+[a-z0-9_]+)?/i

username = word
           *( rest )
word     = ALPHA / DIGIT / "_"
rest     = *( extended )
           word
extended = word / "." / "-"
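
A check against these rules can be sketched as below. The pattern is transcribed from the regular expression quoted above, not taken from Mastodon's source, and the function name is hypothetical. The sketch anchors the match to the whole string and restricts case folding to ASCII, both of which are assumptions about the intended behavior.

```python
import re

# Same pattern as quoted above; re.ASCII keeps case-insensitive matching
# from admitting non-ASCII letters that fold to a-z.
MASTODON_STYLE_USERNAME = re.compile(
    r"[a-z0-9_]+([a-z0-9_.-]+[a-z0-9_]+)?",
    re.IGNORECASE | re.ASCII,
)


def is_mastodon_style_username(username: str) -> bool:
    """True if the whole username satisfies the rules listed above."""
    return MASTODON_STYLE_USERNAME.fullmatch(username) is not None
```

Under this sketch, names such as alice, Alice and a.b-c are accepted, while .alice, alice. and al!ce are rejected.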

Similarly at the time of this writing, Misskey is subject to the following limitations:

  • Usernames are limited to 128 characters
  • Hostnames are limited to 128 characters

3.1.2 Recommendations for handling usernames

  • Implementers SHOULD treat local usernames as case-insensitive.
  • Implementers SHOULD NOT assume case insensitivity for external usernames.
  • Implementers SHOULD NOT treat usernames as stable identifiers that will always map to the same actor, and SHOULD use the actor id in any references to an actor. (Implementers that currently treat usernames as canonical identifiers SHOULD take steps to avoid doing so in the future.)
  • Implementers SHOULD limit the length of local usernames. The exact limit is not specified, but it is noteworthy that similar systems such as email often limit the localpart to 64 characters (per [RFC2821]).
  • Implementers SHOULD support remote usernames containing valid characters per [RFC7565]. For short-term compatibility, implementers SHOULD NOT use characters other than alphanumeric (A-Z, a-z, 0-9) and underscores (_).
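
The first, second and fourth recommendations can be illustrated with the following sketch. All names are hypothetical, and the 64-character limit is only an assumed value borrowed from the email convention mentioned above; the specification does not set an exact limit.

```python
# Assumed limit, borrowed from the email localpart convention mentioned above.
MAX_LOCAL_USERNAME_LENGTH = 64


def account_key(username: str, host: str, local_host: str) -> tuple[str, str]:
    """Build a lookup key: case-folded for local accounts only."""
    host = host.lower()  # hostnames compare case-insensitively
    if host == local_host.lower():
        # Local usernames are treated as case-insensitive.
        return (username.lower(), host)
    # Remote usernames keep their case; no insensitivity is assumed.
    return (username, host)


def check_local_username_length(username: str) -> None:
    """Raise ValueError if a new local username is empty or too long."""
    if not 1 <= len(username) <= MAX_LOCAL_USERNAME_LENGTH:
        raise ValueError(
            f"local usernames must be 1 to {MAX_LOCAL_USERNAME_LENGTH} characters"
        )
```

A key like this is suitable for lookups only; in line with the third recommendation, stored references to an actor would still use the actor id rather than the username.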

4. Other uses of WebFinger

This section is non-normative.

Aside from the self-link to the associated actor, resolving a WebFinger query may expose some other links of potential interest. The following link relations are currently common among WebFinger implementers, and are recommended for use especially when the actor document is not publicly available:

The following link relations are less common, but offer useful information to ActivityPub implementers:

Also uncommon but supported by at least one implementation (WordPress) is the ability to query non-actor, non-user resources via WebFinger. The following link relations are exposed:

5. Security Considerations

This section is non-normative.

Using WebFinger can provide proof of existence of an associated actor document, as well as make it easier to discover that associated actor document; following this, an actor's inbox can be likewise discovered, and spam or other unwanted messages can be delivered to that actor's inbox. It may be desirable for some systems to not publicly expose an actor's existence and instead rely on the user manually entering their actor's HTTPS URI, or maintaining a "contact list" of bookmarked actors or resources. For such systems, the use of WebFinger is not advisable.

6. Future Enhancements

This section is non-normative.

The current use of WebFinger with ActivityPub could be improved in several ways:

7. Conformance

As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

The key words MUST, SHOULD, and SHOULD NOT in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

A. References

A.1 Normative references

[RFC2119]
Key words for use in RFCs to Indicate Requirement Levels. S. Bradner. IETF. March 1997. Best Current Practice. URL: https://www.rfc-editor.org/rfc/rfc2119
[RFC2821]
Simple Mail Transfer Protocol. J. Klensin, Ed. IETF. April 2001. Proposed Standard. URL: https://www.rfc-editor.org/rfc/rfc2821
[RFC3986]
Uniform Resource Identifier (URI): Generic Syntax. T. Berners-Lee; R. Fielding; L. Masinter. IETF. January 2005. Internet Standard. URL: https://www.rfc-editor.org/rfc/rfc3986
[RFC5234]
Augmented BNF for Syntax Specifications: ABNF. D. Crocker, Ed.; P. Overell. IETF. January 2008. Internet Standard. URL: https://www.rfc-editor.org/rfc/rfc5234
[RFC5980]
NSIS Protocol Operation in Mobile Environments. T. Sanda, Ed.; X. Fu; S. Jeong; J. Manner; H. Tschofenig. IETF. March 2011. Informational. URL: https://www.rfc-editor.org/rfc/rfc5980
[RFC5982]
IP Flow Information Export (IPFIX) Mediation: Problem Statement. A. Kobayashi, Ed.; B. Claise, Ed. IETF. August 2010. Informational. URL: https://www.rfc-editor.org/rfc/rfc5982
[RFC6415]
Web Host Metadata. E. Hammer-Lahav, Ed.; B. Cook. IETF. October 2011. Proposed Standard. URL: https://www.rfc-editor.org/rfc/rfc6415
[RFC7033]
WebFinger. P. Jones; G. Salgueiro; M. Jones; J. Smarr. IETF. September 2013. Proposed Standard. URL: https://www.rfc-editor.org/rfc/rfc7033
[RFC7564]
PRECIS Framework: Preparation, Enforcement, and Comparison of Internationalized Strings in Application Protocols. P. Saint-Andre; M. Blanchet. IETF. May 2015. Proposed Standard. URL: https://www.rfc-editor.org/rfc/rfc7564
[RFC7565]
The 'acct' URI Scheme. P. Saint-Andre. IETF. May 2015. Proposed Standard. URL: https://www.rfc-editor.org/rfc/rfc7565
[RFC8174]
Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words. B. Leiba. IETF. May 2017. Best Current Practice. URL: https://www.rfc-editor.org/rfc/rfc8174