Verifiable Credential Data Integrity 1.0

Abstract

This specification describes mechanisms for ensuring the authenticity and integrity of Verifiable Credentials and similar types of constrained digital documents using cryptography, especially through the use of digital signatures and related mathematical proofs.

This section specifies the data model that is used for expressing data integrity proofs, controller documents, and verification methods.

All of the data model properties and types in this specification map to URLs. The vocabulary where these URLs are defined is the [SECURITY-VOCABULARY]. The explicit mechanism that is used to perform this mapping in a secured document is the @context property.

The mapping mechanism is defined by JSON-LD [JSON-LD11]. To ensure a document can be interoperably consumed without the use of a JSON-LD library, document authors are advised to ensure that domain experts have 1) specified the expected order for all values associated with a @context property, 2) published cryptographic hashes for each @context file, and 3) deemed that the contents of each @context file are appropriate for the intended use case.

When a document is processed by a non-JSON-LD processor and there is a requirement to use the same semantics as those used in a JSON-LD environment, implementers are advised to 1) enforce the expected order and values in the @context property, and 2) ensure that each @context file matches the known cryptographic hashes for each @context file.

Using static, versioned @context files with published cryptographic hashes in conjunction with JSON Schema is one acceptable approach to implementing the mechanisms described above, which ensures proper term identification, typing, and order, when a non-JSON-LD processor is used.
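The checks described above can be sketched for a non-JSON-LD processor as follows. This is a minimal illustration, not a normative algorithm: the function names, the `expected` allow-list, and the `cached_files` map of locally stored context files are all assumptions introduced here, and SHA-256 is assumed as the published hash.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of a context file's bytes."""
    return hashlib.sha256(data).hexdigest()


def check_contexts(document, expected, cached_files):
    """Return True only if @context lists exactly the expected URLs, in order,
    and every locally cached context file matches its published hash.

    expected:     list of (url, sha256_hex) pairs vetted by the application
    cached_files: dict mapping url -> bytes of the locally stored context file
    """
    contexts = document.get("@context")
    if isinstance(contexts, str):
        contexts = [contexts]  # a single context is treated as a one-item list
    if not isinstance(contexts, list):
        return False
    # Enforce both the values and their order.
    if contexts != [url for url, _ in expected]:
        return False
    for url, digest in expected:
        data = cached_files.get(url)
        if data is None or sha256_hex(data) != digest:
            return False
    return True
```

An application would typically pair a check like this with a JSON Schema validation step so that term typing is also enforced.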

A data integrity proof provides information about the proof mechanism, parameters required to verify that proof, and the proof value itself. All of this information is provided using Linked Data vocabularies such as the [SECURITY-VOCABULARY].

When expressing a data integrity proof on an object, a proof property MUST be used. If present, its value MUST be either a single object, or an unordered set of objects, expressed using the properties below:

id: An optional identifier for the proof, which MUST be a URL [URL], such as a UUID as a URN (urn:uuid:6a1676b8-b51f-11ed-937b-d76685a20ff5). The usage of this property is further explained in Section 2.1.2 Proof Chains.
type: The specific proof type used for the cryptographic proof MUST be specified as a string that maps to a URL [URL]. Examples of proof types include DataIntegrityProof and Ed25519Signature2020. Proof types determine what other fields are required to secure and verify the proof.
proofPurpose: The reason the proof was created MUST be specified as a string that maps to a URL [URL]. The proof purpose acts as a safeguard to prevent the proof from being misused by being applied to a purpose other than the one that was intended. For example, without this value the creator of a proof could be tricked into using cryptographic material typically used to create a Verifiable Credential (assertionMethod) during a login process (authentication), which would then result in the creation of a Verifiable Credential they never meant to create instead of the intended action, which was merely to log into a website.
verificationMethod: The means and information needed to verify the proof MUST be specified as a string that maps to a [URL]. An example of a verification method is a link to a public key which includes cryptographic material that is used by a verifier during the verification process.
created: The date and time the proof was created is OPTIONAL and, if included, MUST be specified as an [XMLSCHEMA11-2] dateTimeStamp string.
expires: The expires property is OPTIONAL. If present, it MUST be an [XMLSCHEMA11-2] dateTimeStamp string specifying when the proof expires.
domain: The domain property is OPTIONAL. It conveys one or more security domains in which the proof is meant to be used. If specified, the associated value MUST be either a string, or an unordered set of strings. A verifier SHOULD use the value to ensure that the proof was intended to be used in the security domain in which the verifier is operating. The specification of the domain parameter is useful in challenge-response protocols where the verifier is operating from within a security domain known to the creator of the proof. Example domain values include: domain.example (DNS domain), https://domain.example:8443 (Web origin), mycorp-intranet (bespoke text string), and b31d37d4-dd59-47d3-9dd8-c973da43b63a (UUID).
challenge: A string value that SHOULD be included in a proof if a domain is specified. The value is used once for a particular domain and window of time. This value is used to mitigate replay attacks. Examples of a challenge value include: 1235abcd6789, 79d34551-ae81-44ae-823b-6dadbab9ebd4, and ruby.
proofValue: A string value that contains the base-encoded binary data necessary to verify the digital proof using the verificationMethod specified. The contents of the value MUST be expressed with a header and encoding as described in Section 2.4 Multibase. The contents of this value are determined by a specific cryptosuite and set to the proof value generated by the Add Proof Algorithm for that cryptosuite. Alternative properties with different encodings specified by the cryptosuite MAY be used, instead of this property, to encode the data necessary to verify the digital proof.
previousProof: An OPTIONAL string value or unordered list of string values. Each value identifies another data integrity proof that MUST verify before the current proof is processed. If an unordered list, all referenced proofs in the array MUST verify. This property is used in Section 2.1.2 Proof Chains.
nonce: An OPTIONAL string value supplied by the proof creator. One use of this field is to increase privacy by decreasing linkability that is the result of deterministically generated signatures.
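The structural requirements in the list above can be checked before any cryptographic verification is attempted. The following is a rough sketch under stated assumptions: `proof_problems` and `is_string_or_strings` are hypothetical helper names, the regular expression only approximates the dateTimeStamp lexical form (it does not check calendar validity), and URL syntax is not validated.

```python
import re

# Approximate dateTimeStamp pattern: a dateTime with a mandatory time zone.
DATE_TIME_STAMP = re.compile(
    r"^-?\d{4,}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})$"
)


def is_string_or_strings(value):
    """True for a string or a list consisting only of strings."""
    return isinstance(value, str) or (
        isinstance(value, list) and all(isinstance(item, str) for item in value)
    )


def proof_problems(proof):
    """Return a list of human-readable problems; an empty list means the
    proof object has the expected shape."""
    problems = []
    # Properties the list above marks with MUST.
    for name in ("type", "proofPurpose", "verificationMethod"):
        if not isinstance(proof.get(name), str):
            problems.append(f"{name} is required and must be a string")
    # Optional timestamps.
    for name in ("created", "expires"):
        value = proof.get(name)
        if value is not None and not (
            isinstance(value, str) and DATE_TIME_STAMP.match(value)
        ):
            problems.append(f"{name} must be a dateTimeStamp string")
    # Optional properties that accept one string or a set of strings.
    for name in ("domain", "previousProof"):
        value = proof.get(name)
        if value is not None and not is_string_or_strings(value):
            problems.append(f"{name} must be a string or a set of strings")
    # Optional single-string properties.
    for name in ("id", "challenge", "nonce", "proofValue"):
        value = proof.get(name)
        if value is not None and not isinstance(value, str):
            problems.append(f"{name} must be a string")
    return problems
```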

A proof can be added to a JSON document like the following:

Example 1: A simple JSON data document

{
  "myWebsite": "https://hello.world.example/"
}

The following proof secures the document above using the jcs-eddsa-2022 cryptography suite [DI-EDDSA], which produces a verifiable digital proof by transforming the input data using the JSON Canonicalization Scheme (JCS) [RFC8785] and then digitally signing it using an Edwards Digital Signature Algorithm (EdDSA).

Example 2: A simple signed JSON data document

{
  "myWebsite": "https://hello.world.example/",
  "proof": {
    "type": "DataIntegrityProof",
    "cryptosuite": "jcs-eddsa-2022",
    "created": "2023-03-05T19:23:24Z",
    "verificationMethod": "https://di.example/issuer#z6MkjLrk3gKS2nnkeWcmcxiZPGskmesDpuwRBorgHxUXfxnG",
    "proofPurpose": "assertionMethod",
    "proofValue": "zQeVbY4oey5q2M3XKaxup3tmzN4DRFTLVqpLMweBrSxMY2xHX5XTYV8nQApmEcqaqA3Q1gVHMrXFkXJeV6doDwLWx"
  }
}
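As a rough illustration of the transformation step only, the sketch below separates the proof from the secured document and serializes the remaining data in a canonical form. It is not the jcs-eddsa-2022 algorithm, which is defined in [DI-EDDSA], and it performs no signing or verification. The function names are hypothetical, and the serialization shown is assumed to coincide with JCS [RFC8785] only for simple documents such as Example 1, whose keys and values are plain strings; numbers and characters outside the Basic Multilingual Plane need a full JCS implementation.

```python
import json


def split_secured_document(secured):
    """Return (unsecured_document, proof) without modifying the input."""
    unsecured = {key: value for key, value in secured.items() if key != "proof"}
    return unsecured, secured.get("proof")


def canonicalize_simple(value):
    """Serialize with sorted keys and no insignificant whitespace.

    Assumption: adequate only for string-valued JSON; not a complete
    RFC 8785 implementation.
    """
    text = json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")
```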

Similarly, a proof can be added to a JSON-LD data document like the following:

Example 3: A simple JSON-LD data document

{
  "@context": {"myWebsite": "https://vocabulary.example/myWebsite"},
  "myWebsite": "https://hello.world.example/"
}

The following proof secures the document above by using the ecdsa-2019 cryptography suite [DI-ECDSA], which produces a verifiable digital proof by transforming the input data using the RDF Dataset Canonicalization Scheme [RDF-CANON] and then digitally signing it using the Elliptic Curve Digital Signature Algorithm (ECDSA).

Example 4: A simple signed JSON-LD data document

{
  "@context": [
    {"myWebsite": "https://vocabulary.example/myWebsite"},
    "https://w3id.org/security/data-integrity/v2"
  ],
  "myWebsite": "https://hello.world.example/",
  "proof": {
    "type": "DataIntegrityProof",
    "cryptosuite": "ecdsa-2019",
    "created": "2020-06-11T19:14:04Z",
    "verificationMethod": "https://ldi.example/issuer#zDnaepBuvsQ8cpsWrVKw8fbpGpvPeNSjVPTWoq6cRqaYzBKVP",
    "proofPurpose": "assertionMethod",
    "proofValue": "zXb23ZkdakfJNUhiTEdwyE598X7RLrkjnXEADLQZ7vZyUGXX8cyJZRBkNw813SGsJHWrcpo4Y8hRJ7adYn35Eetq"
  }
}

Note: Representing time values to individuals

This specification enables the expression of dates and times, such as through the created and expires properties. This information might be indirectly exposed to an individual if a proof is processed and is detected to be outside an allowable time range. When displaying date and time values related to the validity of cryptographic proofs, implementers are advised to respect the locale and local calendar preferences of the individual [LTLI]. Conversion of timestamps to local time values is expected to consider the time zone expectations of the individual. See Verifiable Credentials Data Model v2.0 for more details about representing time values to individuals.

Issue

Add a note indicating that selective disclosure proof mechanisms can be compatible with Data Integrity; for example, an algorithm could produce a merkle tree from a canonicalized set of N-Quads and then sign the root hash. Disclosure would involve including the merkle paths for each N-Quad that is to be revealed. This mechanism would merely consume the normalized output differently (this, and the proof mechanism would be modifications to this core spec). It might also be necessary to generate proof parameters such as a private key/seed that can be used along with an algorithm to deterministically generate nonces that are concatenated with each N-Quad to prevent rainbow table or similar attacks.

Issue

Add a note indicating that this specification should not be construed to indicate that public key controllers should be restricted to a single public key or that systems that use this spec and involve real people should identify each person as only ever being a single entity rather than perhaps N entities with M keys. There are no such restrictions and in many cases those kinds of restrictions are ill-advised due to privacy considerations.

The Data Integrity specification supports the concept of multiple proofs in a single document. There are two types of multi-proof approaches that are identified: Proof Sets (un-ordered) and Proof Chains (ordered).

A proof set is useful when the same data needs to be secured by multiple entities, but where the order of proofs does not matter, such as in the case of a set of signatures on a contract. A proof set, which has no order, is represented by associating a set of proofs with the proof key in a document.

Example 5: A proof set in a data document

{
  "@context": [
    {"myWebsite": "https://vocabulary.example/myWebsite"},
    "https://w3id.org/security/data-integrity/v2"
  ],
  "myWebsite": "https://hello.world.example/",
  "proof": [{
    "type": "DataIntegrityProof",
    "cryptosuite": "eddsa-2022",
    "created": "2020-11-05T19:23:24Z",
    "verificationMethod": "https://ldi.example/issuer/1#z6MkjLrk3gKS2nnkeWcmcxiZPGskmesDpuwRBorgHxUXfxnG",
    "proofPurpose": "assertionMethod",
    "proofValue": "z4oey5q2M3XKaxup3tmzN4DRFTLVqpLMweBrSxMY2xHX5XTYVQeVbY8nQAVHMrXFkXJpmEcqdoDwLWxaqA3Q1geV6"
  }, {
    "type": "DataIntegrityProof",
    "cryptosuite": "eddsa-2022",
    "created": "2020-11-05T13:08:49Z",
    "verificationMethod": "https://pfps.example/issuer/2#z6MkGskxnGjLrk3gKS2mesDpuwRBokeWcmrgHxUXfnncxiZP",
    "proofPurpose": "assertionMethod",
    "proofValue": "z5QLBrp19KiWXerb8ByPnAZ9wujVFN8PDsxxXeMoyvDqhZ6Qnzr5CG9876zNht8BpStWi8H2Mi7XCY3inbLrZrm95"
  }]
}
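Because a proof set has no order, each proof can be checked independently against the same unsecured data. The sketch below shows one way to iterate over a proof set; `verify_one` is a hypothetical callback standing in for a cryptosuite's verification routine, and whether every proof has to succeed is left to the application.

```python
def proofs_of(document):
    """Return the document's proofs as a list, whether `proof` holds a
    single object or a set of objects."""
    proof = document.get("proof")
    if proof is None:
        return []
    return [proof] if isinstance(proof, dict) else list(proof)


def check_proof_set(document, verify_one):
    """Check each proof independently and return (proof, result) pairs.

    verify_one(unsecured_document, proof) -> bool is supplied by the caller.
    """
    unsecured = {key: value for key, value in document.items() if key != "proof"}
    return [(proof, verify_one(unsecured, proof)) for proof in proofs_of(document)]
```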

A proof chain is useful when the same data needs to be signed by multiple entities and the order of when the proofs occurred matters, such as in the case of a notary counter-signing a proof that had been created on a document. A proof chain, where proof order needs to be preserved, is expressed by providing at least one proof with an id, such as a UUID as a URN, and another proof with a previousProof value that identifies the previous proof.

Example 6: A proof chain in a data document

{
  "@context": [
    {"myWebsite": "https://vocabulary.example/myWebsite"},
    "https://w3id.org/security/data-integrity/v2"
  ],
  "myWebsite": "https://hello.world.example/",
  "proof": [{
    "id": "urn:uuid:60102d04-b51e-11ed-acfe-2fcd717666a7",
    "type": "DataIntegrityProof",
    "cryptosuite": "eddsa-2022",
    "created": "2020-11-05T19:23:42Z",
    "verificationMethod": "https://ldi.example/issuer/1#z6MkjLrk3gKS2nnkeWcmcxiZPGskmesDpuwRBorgHxUXfxnG",
    "proofPurpose": "assertionMethod",
    "proofValue": "zVbY8nQAVHMrXFkXJpmEcqdoDwLWxaqA3Q1geV64oey5q2M3XKaxup3tmzN4DRFTLVqpLMweBrSxMY2xHX5XTYVQe"
  }, {
    "type": "DataIntegrityProof",
    "cryptosuite": "eddsa-2022",
    "created": "2020-11-05T21:28:14Z",
    "verificationMethod": "https://pfps.example/issuer/2#z6MkGskxnGjLrk3gKS2mesDpuwRBokeWcmrgHxUXfnncxiZP",
    "proofPurpose": "assertionMethod",
    "proofValue": "z6Qnzr5CG9876zNht8BpStWi8H2Mi7XCY3inbLrZrm955QLBrp19KiWXerb8ByPnAZ9wujVFN8PDsxxXeMoyvDqhZ",
    "previousProof": "urn:uuid:60102d04-b51e-11ed-acfe-2fcd717666a7"
  }]
}
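A verifier processing a proof chain needs to handle each proof only after the proofs it names in previousProof. One possible way to derive such an order is a depth-first traversal, sketched below. The function name `chain_order` is an assumption made for illustration, and the sketch only orders the proofs; it does not verify them.

```python
def chain_order(proofs):
    """Return the proofs ordered so that each one appears after every proof
    it references through previousProof.

    Raises ValueError if a reference is unknown or the references form a cycle.
    """
    by_id = {proof["id"]: proof for proof in proofs if "id" in proof}
    ordered = []
    state = {}  # id(proof) -> "active" while visiting, "done" afterwards

    def visit(proof):
        key = id(proof)
        if state.get(key) == "done":
            return
        if state.get(key) == "active":
            raise ValueError("cycle in previousProof references")
        state[key] = "active"
        previous = proof.get("previousProof", [])
        if isinstance(previous, str):
            previous = [previous]  # a single reference is treated as a list
        for reference in previous:
            if reference not in by_id:
                raise ValueError(f"unknown previousProof: {reference}")
            visit(by_id[reference])
        state[key] = "done"
        ordered.append(proof)

    for proof in proofs:
        visit(proof)
    return ordered
```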

A proof that describes its purpose helps prevent it from being misused for some other purpose.

Issue

Add a mention of JWK's key_ops parameter and WebCrypto's KeyUsage restrictions; explain that Proof Purpose serves a different goal and allows for finer-grained restrictions.

Dave Longley suggested that proof purposes enable verifiers to know what the proof creator's intent was so the message can't be accidentally abused for another purpose, e.g., a message signed for the purpose of merely making an assertion (and thus perhaps intended to be widely shared) being abused as a message to authenticate to a service or take some action (invoke a capability). It's a goal to keep the number of them limited to as few categories as are really needed to accomplish this goal.

The following is a list of commonly used proof purpose values.

authentication: Indicates that a given proof is only to be used for the purposes of an authentication protocol.
assertionMethod: Indicates that a proof can only be used for making assertions, for example signing a Verifiable Credential.
keyAgreement: Indicates that a proof is used for key agreement protocols, such as Elliptic Curve Diffie Hellman key agreement used by popular encryption libraries.
capabilityDelegation: Indicates that the proof can only be used for delegating capabilities. See the Authorization Capabilities [ZCAP] specification for more detail.
capabilityInvocation: Indicates that the proof can only be used for invoking capabilities. See the Authorization Capabilities [ZCAP] specification for more detail.

Note: The Authorization Capabilities [ZCAP] specification defines additional proof purposes for that use case, such as capabilityInvocation and capabilityDelegation.

A controller document is a set of data that specifies one or more relationships between a controller and a set of data, such as a set of public cryptographic keys. The controller document SHOULD contain verification relationships that explicitly permit the use of certain verification methods for specific purposes.

(Feature at Risk) Issue: Potential for stand-alone Controller Document specification

There are many commonalities between this section on Controller Documents and similar sections in other securing mechanisms such as [VC-JOSE-COSE], as well as sections on similar concepts in specifications such as [DID-CORE]. The Working Group is currently discussing the possibility of moving this section to an independent Controller Document specification that can be referenced normatively. If this migration occurs, it is expected that there will be little to no impact on implementations, as the normative statements that exist in this section will remain in this or the new document as an additive set of requirements on top of the base Controller Document specification.

Issue

Add examples of common Controller documents, such as controller documents published on a ledger-based registry, or on a mutable medium in combination with an integrity protection mechanism such as Hashlinks.

A controller document can express verification methods, such as cryptographic public keys, which can be used to authenticate or authorize interactions with the controller or associated parties. For example, a cryptographic public key can be used as a verification method with respect to a digital signature; in such usage, it verifies that the signer could use the associated cryptographic private key. Verification methods might take many parameters. An example of this is a set of five cryptographic keys from which any three are required to contribute to a cryptographic threshold signature.

verificationMethod

The verificationMethod property is OPTIONAL. If present, the value MUST be a set of verification methods, where each verification method is expressed using a map. The verification method map MUST include the id, type, controller, and specific verification material properties that are determined by the value of type and are defined in 2.3.1.1 Verification Material. A verification method MAY include additional properties. Verification methods SHOULD be registered in the Data Integrity Specification Registries [TBD - DIS-REGISTRIES].

Note

The verificationMethod property is REQUIRED for proofs, unlike controller documents, for which it is optional. See section 2.1 Proofs.

id: The value of the id property for a verification method MUST be a string that conforms to the [URL] syntax.
type: The value of the type property MUST be a string that references exactly one verification method type. In order to maximize global interoperability, the verification method type SHOULD be registered in the Data Integrity Specification Registries [TBD -- DIS-REGISTRIES].
controller: The value of the controller property MUST be a string that conforms to the [URL] syntax.
expires: The expires property is OPTIONAL. It is set, in advance, by the controller of a verification method to signal when that method can no longer be used for verification purposes. If provided, it MUST be an [XMLSCHEMA11-2] dateTimeStamp string specifying when the verification method SHOULD cease to be used. Once the value is set, it is not expected to be updated, and systems depending on the value are expected to not verify any proofs associated with the verification method at or after the time of expiration.
revoked: The revoked property is OPTIONAL. It is set by the controller of a verification method to signal when that method is no longer to be used for verification purposes, such as after a security compromise of the verification method. If provided, it MUST be an [XMLSCHEMA11-2] dateTimeStamp string specifying when the verification method SHOULD cease to be used. Once the value is set, it is not expected to be updated, and systems depending on the value are expected to not verify any proofs associated with the verification method at or after the time of revocation.
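The expires and revoked semantics above amount to a time comparison: a method is not used at or after either timestamp. A minimal sketch follows; the function names are hypothetical, and the parser relies on `datetime.fromisoformat`, which on older Python versions accepts only a subset of dateTimeStamp forms.

```python
from datetime import datetime, timezone


def parse_date_time_stamp(value):
    """Parse a dateTimeStamp string into an aware datetime."""
    # A trailing "Z" is only accepted natively from Python 3.11 onwards.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        raise ValueError("dateTimeStamp requires a time zone")
    return parsed


def method_usable_at(method, moment):
    """Return False if `moment` is at or after the method's expires or
    revoked time; a method with neither property is treated as usable."""
    for name in ("expires", "revoked"):
        value = method.get(name)
        if value is not None and moment >= parse_date_time_stamp(value):
            return False
    return True
```

A verifier might call `method_usable_at(method, datetime.now(timezone.utc))`, or pass the time at which the proof is being evaluated.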

Example 7: Example verification method structure

{
  "@context": [
    "https://www.w3.org/ns/did/v1",
    "https://w3id.org/security/data-integrity/v2"
  ],
  "id": "did:example:123456789abcdefghi",
  ...
  "verificationMethod": [{
    "id": ...,
    "type": ...,
    "controller": ...,
    "publicKeyJwk": ...
  }, {
    "id": ...,
    "type": ...,
    "controller": ...,
    "publicKeyMultibase": ...
  }]
}

Note: Verification method controller(s) and controller(s)

The semantics of the controller property are the same when the subject of the relationship is the controller document as when the subject of the relationship is a verification method, such as a cryptographic public key. Since a key can't control itself, and the key controller cannot be inferred from the controller document, it is necessary to explicitly express the identity of the controller of the key. The difference is that the value of controller for a verification method is not necessarily a controller. Controllers are expressed using the `controller` property at the highest level of the controller document.

Verification material is any information that is used by a process that applies a verification method. The type of a verification method is expected to be used to determine its compatibility with such processes. Examples of verification methods include JsonWebKey and Multikey. A cryptographic suite specification is responsible for specifying the verification method type and its associated verification material format. For examples, see the Data Integrity ECDSA Cryptosuites and the Data Integrity EdDSA Cryptosuites. For a list of verification method types, please see the [SECURITY-VOCABULARY].

To increase the likelihood of interoperable implementations, this specification limits the number of formats for expressing verification material in a controller document. The fewer formats that implementers have to implement, the more likely it will be that they will support all of them. This approach attempts to strike a delicate balance between easing implementation and providing support for formats that have historically had broad deployment.

A verification method MUST NOT contain multiple verification material properties for the same material. For example, expressing key material in a verification method using both publicKeyJwk and publicKeyMultibase at the same time is prohibited.
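A processor could detect the prohibited combination with a check along the following lines. This is a sketch under assumptions: the grouping of property names into "public key" and "secret key" material is an interpretation made for illustration, and cryptosuites may define further verification material properties that are not listed here.

```python
# Property names that express the same kind of material in different formats.
MATERIAL_GROUPS = {
    "public key": ("publicKeyJwk", "publicKeyMultibase"),
    "secret key": ("secretKeyJwk", "secretKeyMultibase"),
}


def duplicated_material(method):
    """Return a dict of material kinds that are expressed more than once,
    mapped to the offending property names; empty when the method is fine."""
    duplicates = {}
    for label, names in MATERIAL_GROUPS.items():
        present = [name for name in names if name in method]
        if len(present) > 1:
            duplicates[label] = present
    return duplicates
```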

An example of a controller document containing verification methods using both properties above is shown below.

Example 8: Verification methods using publicKeyJwk and publicKeyMultibase

{
  "@context": [
    "https://www.w3.org/ns/did/v1",
    "https://w3id.org/security/jwk/v1",
    "https://w3id.org/security/multikey/v1"
  ],
  "id": "did:example:123456789abcdefghi",
  ...
  "verificationMethod": [{
    "id": "did:example:123#_Qq0UL2Fq651Q0Fjd6TvnYE-faHiOpRlPVQcY_-tA4A",
    "type": "JsonWebKey", // external (property value)
    "controller": "did:example:123",
    "publicKeyJwk": {
      "crv": "Ed25519", // external (property name)
      "x": "VCpo2LMLhn6iWku8MKvSLg2ZAoC-nlOyPVQaO3FxVeQ", // external (property name)
      "kty": "OKP", // external (property name)
      "kid": "_Qq0UL2Fq651Q0Fjd6TvnYE-faHiOpRlPVQcY_-tA4A" // external (property name)
    }
  }, {
    "id": "did:example:123456789abcdefghi#keys-1",
    "type": "Multikey", // external (property value)
    "controller": "did:example:pqrstuvwxyz0987654321",
    "publicKeyMultibase": "z6MkmM42vxfqZQsv4ehtTjFFxQ4sQKS2w6WR7emozFAn5cxu"
  }],
  ...
}

The Multikey data model is a specific type of verification method that encodes key types into a single binary stream that is then encoded as a Multibase value as described in Section 2.4 Multibase.

When specifying a Multikey, the object takes the following form:

type: The value of the type property MUST contain the string Multikey.
publicKeyMultibase: The publicKeyMultibase property is OPTIONAL. If present, its value MUST be a Multibase encoded value as described in Section 2.4 Multibase.
secretKeyMultibase: The secretKeyMultibase property is OPTIONAL. If present, its value MUST be a Multibase encoded value as described in Section 2.4 Multibase.

An example of a Multikey is provided below:

Example 9: Multikey encoding of an Ed25519 public key

{
  "@context": ["https://w3id.org/security/multikey/v1"],
  "id": "did:example:123456789abcdefghi#keys-1",
  "type": "Multikey",
  "controller": "did:example:123456789abcdefghi",
  "publicKeyMultibase": "z6MkmM42vxfqZQsv4ehtTjFFxQ4sQKS2w6WR7emozFAn5cxu"
}

In the example above, the publicKeyMultibase value starts with the letter z, which is the Multibase header that conveys that the binary data is base-58-btc-encoded using the Bitcoin base-encoding alphabet. The decoded binary data header is 0xed01, which specifies that the remaining data is a 32-byte raw Ed25519 public key.
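The decoding steps just described can be sketched without external libraries. The function names below are hypothetical, and the sketch handles only the case discussed in this section: the z Multibase header followed by the 0xed01 header and a 32-byte raw Ed25519 public key.

```python
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58btc_decode(text):
    """Decode a base-58-btc string (Bitcoin alphabet) into bytes."""
    number = 0
    for char in text:
        # str.index raises ValueError for characters outside the alphabet.
        number = number * 58 + B58_ALPHABET.index(char)
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    # Each leading "1" encodes one leading zero byte.
    leading_zeros = len(text) - len(text.lstrip("1"))
    return b"\x00" * leading_zeros + body


def decode_ed25519_multikey(public_key_multibase):
    """Return the 32-byte raw Ed25519 public key from a publicKeyMultibase value."""
    if not public_key_multibase.startswith("z"):
        raise ValueError("expected the base-58-btc Multibase header 'z'")
    data = base58btc_decode(public_key_multibase[1:])
    if data[:2] != b"\xed\x01":
        raise ValueError("expected the 0xed01 header for an Ed25519 public key")
    if len(data) != 34:
        raise ValueError("expected 2 header bytes followed by a 32-byte key")
    return data[2:]
```

The returned bytes would then be handed to whichever Ed25519 implementation the application uses.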

The Multikey data model is also capable of encoding secret keys, whose subtypes include symmetric keys and private keys.

Example 10: Multikey encoding of an Ed25519 secret key

{
  "@context": ["https://w3id.org/security/suites/secrets/v1"],
  "id": "did:example:123456789abcdefghi#keys-1",
  "type": "Multikey",
  "controller": "did:example:123456789abcdefghi",
  "secretKeyMultibase": "z3u2fprgdREFtGakrHr6zLyTeTEZtivDnYCPZmcSt16EYCER"
}

In the example above, the secretKeyMultibase value starts with the letter z, which is the Multibase header that conveys that the binary data is base-58-btc-encoded using the Bitcoin base-encoding alphabet. The decoded binary data header is 0x8026, which specifies that the remaining data is a 32-byte raw Ed25519 private key.

The JSON Web Key (JWK) data model is a specific type of verification method that uses the JWK specification [RFC7517] to encode key types into a set of parameters.

When specifying a JsonWebKey, the object takes the following form:

type: The value of the type property MUST contain the string JsonWebKey.
publicKeyJwk: The publicKeyJwk property is OPTIONAL. If present, its value MUST be a map representing a JSON Web Key that conforms to [RFC7517]. The map MUST NOT include any members of the private information class, such as d, as described in the JWK Registration Template. It is RECOMMENDED that verification methods that use JWKs [RFC7517] to represent their public keys use the value of kid as their fragment identifier. It is RECOMMENDED that JWK kid values are set to the public key fingerprint [RFC7638]. See the first key in Example 8 for an example of a public key with a compound key identifier.
secretKeyJwk: The secretKeyJwk property is OPTIONAL. If present, its value MUST be a map representing a JSON Web Key that conforms to [RFC7517].

An example of an object that conforms to this data model is provided below:

Example 11: JSON Web Key encoding of an Ed25519 public key

{
  "@context": ["https://www.w3.org/ns/security/jwk/v1"],
  "id": "did:example:123456789abcdefghi#key-1",
  "type": "JsonWebKey",
  "controller": "did:example:123456789abcdefghi",
  "publicKeyJwk": {
    "kty": "OKP",
    "alg": "EdDSA",
    "crv": "Ed25519",
    "kid": "key-1",
    "x": "_1EiHquO2aUx9JARSu0P8jdYT_OVneYxYOnOMAmUcFI"
  }
}

In the example above, the publicKeyJwk value contains the JSON Web Key. The kty property encodes the key type of "OKP", which means "Octet string key pairs". The alg property identifies the algorithm intended for use with the public key. The crv property identifies the particular curve type of the public key. The kid property specifies how the public key might be referenced in software systems; if present, the kid value SHOULD match the id property of the encapsulating JsonWebKey object. Finally, the x property specifies the point on the Ed25519 curve that is associated with the public key.

The publicKeyJwk property MUST NOT contain any property marked as "Private" in any registry contained in the JOSE Registries [JOSE-REGISTRIES].
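A simple guard for this requirement is to reject any publicKeyJwk that carries a private-class member. The sketch below hard-codes a subset of member names that the JOSE registries mark as private for common key types; the set and the function name are assumptions for illustration, and a production implementation would be expected to track the registries themselves.

```python
# Subset of JWK members registered as private (EC, OKP, RSA, and oct keys).
PRIVATE_JWK_MEMBERS = {"d", "p", "q", "dp", "dq", "qi", "oth", "k"}


def private_members(jwk):
    """Return the sorted private-class member names found in the JWK map."""
    return sorted(PRIVATE_JWK_MEMBERS & set(jwk))


def check_public_key_jwk(method):
    """Raise ValueError if the method's publicKeyJwk exposes private members."""
    found = private_members(method.get("publicKeyJwk", {}))
    if found:
        raise ValueError(f"publicKeyJwk contains private members: {found}")
```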

The JSON Web Key data model is also capable of encoding secret keys, sometimes referred to as private keys.

Example 12: JSON Web Key encoding of an Ed25519 secret key

{
  "@context": ["https://www.w3.org/ns/security/jwk/v1"],
  "id": "did:example:123456789abcdefghi#key-1",
  "type": "JsonWebKey",
  "controller": "did:example:123456789abcdefghi",
  "secretKeyJwk": {
    "kty": "OKP",
    "alg": "EdDSA",
    "crv": "Ed25519",
    "kid": "key-1",
    "d": "Q6JwjCUdThSnoxfXHSFt5C1nVFycY_ZpW7qVzK644_g",
    "x": "_1EiHquO2aUx9JARSu0P8jdYT_OVneYxYOnOMAmUcFI"
  }
}

The private key example above is almost identical to the previous example of the public key, except that the information is stored in the secretKeyJwk property (rather than the publicKeyJwk), and the private key value is encoded in the d property thereof (alongside the x property, which still specifies the point on the Ed25519 curve that is associated with the public key).

Verification methods can be embedded in or referenced from properties associated with various verification relationships as described in 2.3.2 Verification Relationships. Referencing verification methods allows them to be used by more than one verification relationship.

If the value of a verification method property is a map, the verification method has been embedded and its properties can be accessed directly. However, if the value is a URL string, the verification method has been included by reference and its properties will need to be retrieved from elsewhere in the controller document or from another controller document. This is done by dereferencing the URL and searching the resulting resource for a verification method map with an id property whose value matches the URL.

Example 13: Embedding and referencing verification methods

    {
...

      "authentication": [
        // this key is referenced and might be used by
        // more than one verification relationship
        "did:example:123456789abcdefghi#keys-1",
        // this key is embedded and may *only* be used for authentication
        {
          "id": "did:example:123456789abcdefghi#keys-2",
          "type": "Multikey", // external (property value)
          "controller": "did:example:123456789abcdefghi",
          "publicKeyMultibase": "z6MkmM42vxfqZQsv4ehtTjFFxQ4sQKS2w6WR7emozFAn5cxu"
        }
      ],

...
    }
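The distinction between embedded and referenced verification methods can be sketched as follows. The names are hypothetical: `load_document` stands in for whatever dereferencing mechanism the application uses (for example, an HTTP fetch or a DID resolver) and is assumed to return the resulting controller document as a dict.

```python
def find_by_id(node, url):
    """Search a parsed document recursively for a map whose id equals url."""
    if isinstance(node, dict):
        if node.get("id") == url:
            return node
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_by_id(child, url)
        if found is not None:
            return found
    return None


def resolve_method(entry, load_document):
    """Return the verification method map for an embedded or referenced entry.

    entry:         a map (embedded) or a URL string (referenced)
    load_document: caller-supplied function that dereferences a URL
    Returns None if a referenced method cannot be found.
    """
    if isinstance(entry, dict):
        return entry  # embedded: properties are directly accessible
    return find_by_id(load_document(entry), entry)
```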

A verification relationship expresses the relationship between the controller and a verification method.

Different verification relationships enable the associated verification methods to be used for different purposes. It is up to a verifier to ascertain the validity of a verification attempt by checking that the verification method used is contained in the appropriate verification relationship property of the controller document.
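The containment check a verifier performs can be sketched as below. This is an illustration under assumptions: the function name is hypothetical, the proof purpose is assumed to share its name with the verification relationship property (as with authentication and assertionMethod), and the controller document is assumed to have been retrieved and parsed already.

```python
def method_authorized(controller_document, method_id, proof_purpose):
    """Return True if the verification method identified by method_id is
    listed, embedded or by reference, under the verification relationship
    named by proof_purpose."""
    entries = controller_document.get(proof_purpose, [])
    if not isinstance(entries, list):
        entries = [entries]
    for entry in entries:
        # A string entry is a reference; a map entry is an embedded method.
        entry_id = entry if isinstance(entry, str) else entry.get("id")
        if entry_id == method_id:
            return True
    return False
```

For a proof, `method_id` would be the proof's verificationMethod value and `proof_purpose` its proofPurpose value.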

The verification relationship between the controller and the verification method is explicit in the controller document. Verification methods that are not associated with a particular verification relationship cannot be used for that verification relationship. For example, a verification method in the value of the `authentication` property cannot be used to engage in key agreement protocols with the controller—the value of the `keyAgreement` property needs to be used for that.

The controller document does not express revoked keys using a verification relationship. If a referenced verification method is not in the latest controller document used to dereference it, then that verification method is considered invalid or revoked.

The following sections define several useful verification relationships. A controller document MAY include any of these, or other properties, to express a specific verification relationship. In order to maximize global interoperability, any such properties used SHOULD be registered in the Data Integrity Specification Registries [TBD: DIS-REGISTRIES].

The authentication verification relationship is used to specify how the controller is expected to be authenticated, for purposes such as logging into a website or engaging in any sort of challenge-response protocol.

authentication: The authentication property is OPTIONAL. If present, the associated value MUST be a set of one or more verification methods. Each verification method MAY be embedded or referenced.

Example 14: Authentication property containing two verification methods

{
  "@context": [
    "https://www.w3.org/ns/did/v1",
    "https://w3id.org/security/multikey/v1"
  ],
  "id": "did:example:123456789abcdefghi",
  ...
  "authentication": [
    // this method can be used to authenticate as did:...fghi
    "did:example:123456789abcdefghi#keys-1",
    // this method is *only* approved for authentication, it may not
    // be used for any other proof purpose, so its full description is
    // embedded here rather than using only a reference
    {
      "id": "did:example:123456789abcdefghi#keys-2",
      "type": "Multikey",
      "controller": "did:example:123456789abcdefghi",
      "publicKeyMultibase": "z6MkmM42vxfqZQsv4ehtTjFFxQ4sQKS2w6WR7emozFAn5cxu"
    }
  ],
  ...
}

If authentication is established, it is up to the application to decide what to do with that information.

This is useful to any authentication verifier that needs to check to see if an entity that is attempting to authenticate is, in fact, presenting a valid proof of authentication. When a verifier receives some data (in some protocol-specific format) that contains a proof that was made for the purpose of "authentication", and that says that an entity is identified by the id, then that verifier checks to ensure that the proof can be verified using a verification method (e.g., public key) listed under `authentication` in the controller document.

Note that the verification method indicated by the `authentication` property of a controller document can only be used to authenticate the controller. To authenticate a different controller, the entity associated with the value of controller needs to authenticate with its own controller document and associated `authentication` verification relationship.

The assertionMethod verification relationship is used to specify how the controller is expected to express claims, such as for the purposes of issuing a Verifiable Credential [VC-DATA-MODEL-2.0].

assertionMethod: The assertionMethod property is OPTIONAL. If present, the associated value MUST be a set of one or more verification methods. Each verification method MAY be embedded or referenced.

This property is useful, for example, during the processing of a verifiable credential by a verifier. During verification, a verifier checks to see if a verifiable credential contains a proof created by the controller by checking that the verification method used to assert the proof is associated with the `assertionMethod` property in the corresponding controller document.

Example 15: Assertion method property containing two verification methods

{
  "@context": [
    "https://www.w3.org/ns/did/v1",
    "https://w3id.org/security/multikey/v1"
  ],
  "id": "did:example:123456789abcdefghi",
  ...
  "assertionMethod": [
    // this method can be used to assert statements as did:...fghi
    "did:example:123456789abcdefghi#keys-1",
    // this method is *only* approved for assertion of statements, it is not
    // used for any other verification relationship, so its full description is
    // embedded here rather than using a reference
    {
      "id": "did:example:123456789abcdefghi#keys-2",
      "type": "Multikey", // external (property value)
      "controller": "did:example:123456789abcdefghi",
      "publicKeyMultibase": "z6MkmM42vxfqZQsv4ehtTjFFxQ4sQKS2w6WR7emozFAn5cxu"
    }
  ],
  ...
}

The keyAgreement verification relationship is used to specify how an entity can generate encryption material in order to transmit confidential information intended for the controller, such as for the purposes of establishing a secure communication channel with the recipient.

keyAgreement: The keyAgreement property is OPTIONAL. If present, the associated value MUST be a set of one or more verification methods. Each verification method MAY be embedded or referenced.

An example of when this property is useful is when encrypting a message intended for the controller. In this case, the counterparty uses the cryptographic public key information in the verification method to wrap a decryption key for the recipient.

Example 16: Key agreement property containing two verification methods

{
  "@context": "https://www.w3.org/ns/did/v1",
  "id": "did:example:123456789abcdefghi",
  ...
  "keyAgreement": [
    // this method can be used to perform key agreement as did:...fghi
    "did:example:123456789abcdefghi#keys-1",
    // this method is *only* approved for key agreement usage, it will not
    // be used for any other verification relationship, so its full description is
    // embedded here rather than using only a reference
    {
      "id": "did:example:123#zC9ByQ8aJs8vrNXyDhPHHNNMSHPcaSgNpjjsBYpMMjsTdS",
      "type": "X25519KeyAgreementKey2019", // external (property value)
      "controller": "did:example:123",
      "publicKeyMultibase": "z6LSn6p3HRxx1ZZk1dT9VwcfTBCYgtNWdzdDMKPZjShLNWG7"
    }
  ],
  ...
}

The capabilityInvocation verification relationship is used to specify a verification method that might be used by the controller to invoke a cryptographic capability, such as the authorization to update the controller document.

capabilityInvocation: The capabilityInvocation property is OPTIONAL. If present, the associated value MUST be a set of one or more verification methods. Each verification method MAY be embedded or referenced.

An example of when this property is useful is when a controller needs to access a protected HTTP API that requires authorization in order to use it. In order to authorize when using the HTTP API, the controller uses a capability that is associated with a particular URL that is exposed via the HTTP API. The invocation of the capability could be expressed in a number of ways, e.g., as a digitally signed message that is placed into the HTTP Headers.

The server providing the HTTP API is the verifier of the capability and it would need to verify that the verification method referred to by the invoked capability exists in the `capabilityInvocation` property of the controller document. The verifier would also check to make sure that the action being performed is valid and the capability is appropriate for the resource being accessed. If the verification is successful, the server has cryptographically determined that the invoker is authorized to access the protected resource.

Example 17: Capability invocation property containing two verification methods

{
  "@context": [
    "https://www.w3.org/ns/did/v1",
    "https://w3id.org/security/multikey/v1"
  ],
  "id": "did:example:123456789abcdefghi",
  ...
  "capabilityInvocation": [
    // this method can be used to invoke capabilities as did:...fghi
    "did:example:123456789abcdefghi#keys-1",
    // this method is *only* approved for capability invocation usage, it will not
    // be used for any other verification relationship, so its full description is
    // embedded here rather than using only a reference
    {
      "id": "did:example:123456789abcdefghi#keys-2",
      "type": "Multikey", // external (property value)
      "controller": "did:example:123456789abcdefghi",
      "publicKeyMultibase": "z6MkmM42vxfqZQsv4ehtTjFFxQ4sQKS2w6WR7emozFAn5cxu"
    }
  ],
  ...
}

The capabilityDelegation verification relationship is used to specify a mechanism that might be used by the controller to delegate a cryptographic capability to another party, such as delegating the authority to access a specific HTTP API to a subordinate.

capabilityDelegation: The capabilityDelegation property is OPTIONAL. If present, the associated value MUST be a set of one or more verification methods. Each verification method MAY be embedded or referenced.

An example of when this property is useful is when a controller chooses to delegate their capability to access a protected HTTP API to a party other than themselves. In order to delegate the capability, the controller would use a verification method associated with the capabilityDelegation verification relationship to cryptographically sign the capability over to another controller. The delegate would then use the capability in a manner that is similar to the example described in 2.3.2.4 Capability Invocation.

Example 18: Capability Delegation property containing two verification methods

{
  "@context": [
    "https://www.w3.org/ns/did/v1",
    "https://w3id.org/security/multikey/v1"
  ],
  "id": "did:example:123456789abcdefghi",
  ...
  "capabilityDelegation": [
    // this method can be used to perform capability delegation as did:...fghi
    "did:example:123456789abcdefghi#keys-1",
    // this method is *only* approved for granting capabilities; it will not
    // be used for any other verification relationship, so its full description is
    // embedded here rather than using only a reference
    {
      "id": "did:example:123456789abcdefghi#keys-2",
      "type": "Multikey", // external (property value)
      "controller": "did:example:123456789abcdefghi",
      "publicKeyMultibase": "z6MkmM42vxfqZQsv4ehtTjFFxQ4sQKS2w6WR7emozFAn5cxu"
    }
  ],
  ...
}

Issue: Multibase may be standardized at IETF

The [MULTIBASE] specification has been dispatched at IETF and may be standardized there. There is active discussion on this initiative in the Multiformats mailing list at IETF. If the Multibase draft is stabilized before this specification goes to the Proposed Recommendation phase, the table below will be replaced with normative references to the Multibase specification at IETF. It is the intention of the Working Group to ensure alignment between the Multibase values used in this specification and the Multibase values defined by the current Multibase community and any potential future IETF Multiformats Working Group.

A Multibase string includes a single-character header, which identifies the base and encoding alphabet used to encode a binary value, followed by the encoded binary value (using that base and alphabet). The common Multibase header values and their associated base encoding alphabets as provided below are normative:

Multibase Header	Description
`u`	The base-64-url-no-pad alphabet is used to encode the bytes. The base-alphabet consists of the following characters, in order: `ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_`
`z`	The base-58-btc alphabet is used to encode the bytes. The base-alphabet consists of the following characters, in order: `123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz`

Other Multibase encoding values MAY be used, but interoperability is not guaranteed between implementations using such values.

To base-encode a binary value into a Multibase string, an implementation MUST apply the algorithm in Section 4.1 Base Encode to the binary value, with the desired base encoding and alphabet from the table above, ensuring to prepend the associated Multibase header from the table above to the result. Any algorithm with equivalent output MAY be used.

To base-decode a Multibase string, an implementation MUST apply the algorithm in Section 4.2 Base Decode to the string following the first character (Multibase header), with the alphabet associated with the Multibase header. Any algorithm with equivalent output MAY be used.
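Because any algorithm with equivalent output is permitted, the `z` (base-58-btc) case can also be sketched with arbitrary-precision integers. The following is an illustrative sketch only: the function names are hypothetical, the code assumes a JavaScript runtime with `BigInt` support, and it is intended to produce the same output as the base encode and decode algorithms for base-58-btc rather than to replace them.

```javascript
const BASE58_BTC = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz';

// Hypothetical helper: encodes bytes as a Multibase string with the 'z' header.
function multibaseEncodeBase58Btc(bytes) {
  // each leading zero byte is represented by the first alphabet character
  let zeroes = 0;
  while (zeroes < bytes.length && bytes[zeroes] === 0) {
    zeroes++;
  }
  // interpret the remaining bytes as one big-endian integer
  let value = 0n;
  for (let i = zeroes; i < bytes.length; i++) {
    value = value * 256n + BigInt(bytes[i]);
  }
  // repeatedly divide by 58, collecting digits from least to most significant
  let digits = '';
  while (value > 0n) {
    digits = BASE58_BTC[Number(value % 58n)] + digits;
    value /= 58n;
  }
  return 'z' + BASE58_BTC[0].repeat(zeroes) + digits;
}

// Hypothetical helper: decodes a 'z'-prefixed Multibase string to bytes.
function multibaseDecodeBase58Btc(text) {
  if (typeof text !== 'string' || text[0] !== 'z') {
    throw new Error('Unsupported Multibase header');
  }
  const body = text.slice(1);
  let zeroes = 0;
  while (zeroes < body.length && body[zeroes] === BASE58_BTC[0]) {
    zeroes++;
  }
  let value = 0n;
  for (let i = zeroes; i < body.length; i++) {
    const digit = BASE58_BTC.indexOf(body[i]);
    if (digit === -1) {
      throw new Error('Invalid base-58-btc character: ' + body[i]);
    }
    value = value * 58n + BigInt(digit);
  }
  const tail = [];
  while (value > 0n) {
    tail.unshift(Number(value % 256n));
    value /= 256n;
  }
  return Uint8Array.from([...new Array(zeroes).fill(0), ...tail]);
}
```

For example, `multibaseEncodeBase58Btc(Uint8Array.of(0, 0, 0x28, 0x7f, 0xb4, 0xcd))` yields `z11233QC4`: the header `z`, one `1` per leading zero byte, then the base-58-btc digits of the remaining bytes.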

Issue: Multihash may be standardized at IETF

The [MULTIHASH] specification has been dispatched at IETF and may be standardized. There is active discussion on this initiative in the Multiformats mailing list at IETF. If the IETF draft is stabilized before this specification goes to the Proposed Recommendation phase, the table below will be replaced with normative references to the Multihash specification. It is the intention of the Working Group to ensure alignment between the Multihash values used in this specification and the Multihash values defined by the current Multihash community and any potential future IETF Multiformats Working Group.

A Multihash value starts with a binary header, which identifies the specific cryptographic hash algorithm and parameters used to generate the digest, followed by the cryptographic digest value. The normative Multihash header values defined by this specification, and their associated output sizes and associated specifications, are provided below:

Multihash Identifier	Multihash Header	Description
`sha2-256`	`0x12`	SHA-2 with 256 bits (32 bytes) of output, as defined by [RFC6234].
`sha2-384`	`0x20`	SHA-2 with 384 bits (48 bytes) of output, as defined by [RFC6234].
`sha3-256`	`0x16`	SHA-3 with 256 bits (32 bytes) of output, as defined by [SHA3].
`sha3-384`	`0x15`	SHA-3 with 384 bits (48 bytes) of output, as defined by [SHA3].

Other Multihash encoding values MAY be used, but interoperability is not guaranteed between implementations.

To encode to a Multihash value, an implementation MUST prepend the associated Multihash header value to the cryptographic hash value.

To decode a Multihash value, an implementation MUST remove the prepended Multihash header value, which identifies the type of cryptographic hashing algorithm as well as its output length, leaving the raw cryptographic hash value which MUST match the output length associated with the Multihash header.

(Feature at Risk) Issue: Unification of cryptographic hash expression formats is under discussion

The Working Group is currently attempting to determine whether cryptographic hash expression formats can be unified across all of the VCWG core specifications. Candidates for this mechanism include digestSRI and digestMultibase. There are arguments for and against unification that the WG is currently debating.

When a link to an external resource is included in a conforming secured document, it is desirable to know whether the resource that is identified has changed since the proof was created. This applies to cases where there is an external resource that is remotely retrieved as well as to cases where the verifier might have a locally cached copy of the resource.

To enable confirmation that a resource referenced by a conforming secured document has not changed since the document was secured, an implementer MAY include a property named digestMultibase in any object that includes an id property. If present, the digestMultibase value MUST be either a single string or an array of strings, each of which is a Multibase-encoded Multihash value.

An example of a resource integrity protected object is shown below:

Example 19: An integrity-protected image that is associated with an object

{
  ...
  "image": {
    "id": "https://university.example.org/images/58473",
    "digestMultibase": "zQmdfTbBqBPQ7VNxZEYEj14VmRuZBkqFbiwReogJgS1zR1n"
  },
  ...
}
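Since digestMultibase can be either a single string or an array of strings, a consumer will typically normalize it before comparing digests. The following sketch shows one way to do that; the function name `getDigestMultibaseValues` is hypothetical, and the sketch only checks the shape of the value, not that each string decodes to a valid Multihash.

```javascript
// Hypothetical helper: returns the digestMultibase values of an object as an
// array, or an empty array when the property is absent.
function getDigestMultibaseValues(object) {
  if (typeof object.id !== 'string') {
    throw new Error('digestMultibase is only expected on objects with an id');
  }
  const value = object.digestMultibase;
  if (value === undefined) {
    return [];
  }
  const values = Array.isArray(value) ? value : [value];
  for (const entry of values) {
    // a Multibase string needs a header character plus at least one more
    if (typeof entry !== 'string' || entry.length < 2) {
      throw new Error('digestMultibase entries must be Multibase strings');
    }
  }
  return values;
}

// The image object from the example above.
const image = {
  id: 'https://university.example.org/images/58473',
  digestMultibase: 'zQmdfTbBqBPQ7VNxZEYEj14VmRuZBkqFbiwReogJgS1zR1n'
};
```

A verifier could then compute the digest of the retrieved (or cached) resource and accept it if it matches any of the returned values.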

Implementers are urged to consult appropriate sources, such as the FIPS 180-4 Secure Hash Standard and the Commercial National Security Algorithm Suite 2.0 to ensure that they are choosing a hash algorithm that is appropriate for their use case.

The term Linked Data is used to describe a recommended best practice for exposing, sharing, and connecting information on the Web using standards, such as URLs, to identify things and their properties. When information is presented as Linked Data, other related information can be easily discovered and new information can be easily linked to it. Linked Data is extensible in a decentralized way, greatly reducing barriers to large scale integration.

With the increase in usage of Linked Data for a variety of applications, there is a need to be able to verify the authenticity and integrity of Linked Data documents. This specification adds authentication and integrity protection to data documents through the use of mathematical proofs without sacrificing Linked Data features such as extensibility and composability.

Note: Use of Linked Data is an optional feature

While this specification provides mechanisms to digitally sign Linked Data, the use of Linked Data is not necessary to gain some of the advantages provided by this specification.

Cryptographic suites that implement this specification can be used to secure verifiable credentials and verifiable presentations. Implementers that are addressing those use cases are cautioned that additional checks might be appropriate when processing those types of documents.

There are some use cases where it is important to ensure that the verification method used in a proof is associated with the issuer in a verifiable credential, or the holder in a verifiable presentation, during the process of validation. One way to check for such an association is to ensure that the value of the controller property of a proof's verification method matches the URL value used to identify the issuer or holder, respectively. This particular association indicates that the issuer or holder, respectively, is the controller of the verification method used to verify the proof.
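The controller check described above can be sketched as a simple comparison. This is one possible approach under stated assumptions: the function name `controllerMatchesIssuer` is hypothetical, the verification method is assumed to have been dereferenced already, and the issuer is assumed to be expressed either as a URL string or as an object with an `id` property.

```javascript
// Hypothetical helper: returns true when the controller of the proof's
// verification method is the same URL that identifies the issuer.
function controllerMatchesIssuer(credential, verificationMethod) {
  const issuer = credential.issuer;
  const issuerId = typeof issuer === 'string'
    ? issuer
    : (issuer !== null && typeof issuer === 'object' ? issuer.id : undefined);
  return typeof issuerId === 'string' &&
    verificationMethod.controller === issuerId;
}
```

The same comparison applies to a verifiable presentation by reading the holder instead of the issuer.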

Document authors and implementers are advised to understand the difference between the validity period of a proof, which is expressed using the created and expires properties, and the validity period of a credential, which is expressed using the validFrom and validUntil properties. While these properties might sometimes express the same validity periods, at other times they might not be aligned. When verifying a proof, it is important to ensure that the time of interest (which might be the current time or any other time) is within the validity period for the proof (that is, between created and expires). When validating a verifiable credential, it is important to ensure that the time of interest is within the validity period for the credential (that is, between validFrom and validUntil). Note that a failure to validate either the validity period for the proof, or the validity period for the credential, might result in accepting data that ought to have been rejected.
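The two validity periods can be checked independently, as the following sketch illustrates. The function names are hypothetical, and the sketch makes two assumptions that an implementation might decide differently: a missing bound is treated as unbounded, and the bounds themselves are treated as inclusive.

```javascript
// Parses a date-time string to milliseconds, rejecting unparseable values.
function parseTime(value, name) {
  const ms = Date.parse(value);
  if (Number.isNaN(ms)) {
    throw new Error('Invalid date-time in ' + name + ': ' + value);
  }
  return ms;
}

// Returns true when `time` falls between the optional start and end bounds.
function isWithinPeriod(time, start, end, startName, endName) {
  const t = time.getTime();
  if (start !== undefined && t < parseTime(start, startName)) {
    return false;
  }
  if (end !== undefined && t > parseTime(end, endName)) {
    return false;
  }
  return true;
}

// Hypothetical helper: reports the proof and credential periods separately,
// because either one failing is a reason to reject the data.
function checkValidityPeriods(credential, proof, timeOfInterest = new Date()) {
  return {
    proofValid: isWithinPeriod(
      timeOfInterest, proof.created, proof.expires, 'created', 'expires'),
    credentialValid: isWithinPeriod(
      timeOfInterest, credential.validFrom, credential.validUntil,
      'validFrom', 'validUntil')
  };
}
```

A caller would accept the data only when both `proofValid` and `credentialValid` are true for the time of interest.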

Finally, implementers are also urged to understand that there is a difference between the revocation time and expiration time for a verification method, and the revocation information associated with a verifiable credential. The revocation time and expiration time for a verification method are expressed using the revocation and expires properties, respectively, and are related to events such as a private key being compromised or expiring. These properties can provide timing information that might reveal details about a controller, such as their security practices or when they might have been compromised. The revocation information for a verifiable credential is expressed using the credentialStatus property and is related to events such as an individual losing the privilege that is granted by the verifiable credential. It does not provide timing information, which enhances privacy.

Issue: (AT RISK) Hash values might change during Candidate Recommendation

This section lists cryptographic hash values that might change during the Candidate Recommendation phase based on implementer feedback that requires the referenced files to be modified.

Implementations that perform JSON-LD processing MUST treat the following JSON-LD context URLs as already resolved, where the resolved document matches the corresponding hash values below:

URL and Media Type	Content
https://w3id.org/security/data-integrity/v2 application/ld+json	sha256: v/POI0jhSjPansxhJAP1fwepCBZ2HK77fRZfCCyBDs0= sha3-512: Sg1PLFxKyEYQns9Zr0BoYXtFeDNfrHUDNMkyq4QEWvwGIaX1v5xovnCG+dfceZEzr7BhBjm396noZF1HEeCM8g==
https://w3id.org/security/multikey/v1 application/ld+json	sha256: uiwYLeLZL35HGEvMqPzwvq7m05hsUnv2ZMGVu8fFhZc= sha3-512: En0TOOp/cC10XtW/aQDtqKrEQZ2lRjGB/KAsJ+BQhRBufT7s6eoOFWvP9cP5nurTl3hcRTvFeffdVJapkeELXw==
https://w3id.org/security/jwk/v1 application/ld+json	sha256: 9h/GLRVuGCl0rn/ye6lzf219aN7Sgzq9yBFqUoOI54k= sha3-512: VDH85TsaX6kH2nwmII0WXKzAi2MRNsJd+rJfYL5cw0b12sAVPKDsVvRNJo0MMGd1RhV5W6Ii6D9GM7PCFeq97A==

The security vocabulary terms that the JSON-LD contexts listed above resolve to are in the https://w3id.org/security# namespace. That is, all security terms in this vocabulary are of the form https://w3id.org/security#TERM, where TERM is the name of a term.

Implementations that perform RDF processing MUST treat the following JSON-LD vocabulary URL as already resolved, where the resolved document matches the corresponding hash values below.

When dereferencing the https://w3id.org/security# URL, the data returned depends on HTTP content negotiation. These are as follows:

Media Type	Description and Cryptographic Hashes
application/ld+json	The vocabulary in JSON-LD format [JSON-LD11]. sha256: LEaoTyf796eTaSlYWjfPe3Yb+poCW9TjWYTbFDmC0tc= sha3-512: f4DhJ3xhT8nT+GZ8UUZi4QC+HT//wXE2fRTgUP4UNwe4kvel2PFfd6jcofHBm9BjwEiGzVFGv4K+fFTKXRD2NA==
text/turtle	The vocabulary in Turtle format [TURTLE]. sha256: McnhLyt7+/A/0iLb3CUXD0itNw+7bwwjtzOww/zwoyI= sha3-512: jZtZsqgPPPo+jphAcN8/St4VdRLLAmN3nEQhzs0twEMTmCY45euQ01Z4Zo7VlJMYNTf0KC6BMpogpSTAi/1J7Q==
text/html	The vocabulary in HTML+RDFa Format [HTML-RDFA]. sha256: eUHP1xiSC157iTPDydZmxg/hvmX3g/nnCn+FO25d4dc= sha3-512: z53j8ryjVeX16Z/dby//ujhw37degwi09+LAZCTUB8WJZjjzW1AydhdEWmgHM0P5KUcPMmSe7edMlGr7G9rmcA==

It is possible to confirm the digests listed above by running the following command from a modern Unix command line interface: curl -sL -H "Accept: <MEDIA_TYPE>" <DOCUMENT_URL> | openssl dgst -<DIGEST_ALGORITHM> -binary | openssl base64 -nopad -a.

Authors of application-specific vocabularies and specifications SHOULD ensure that their JSON-LD context and vocabulary files are permanently cacheable using the approaches to caching described above or a functionally equivalent mechanism.

Implementations MAY load application-specific JSON-LD context files from the network during development, but SHOULD permanently cache JSON-LD context files used in conforming documents in production settings to increase their security and privacy characteristics. Caching goals MAY be achieved through approaches such as those described above or functionally equivalent mechanisms.
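One functionally equivalent way to permanently cache context files is to resolve them from a fixed in-memory table and refuse anything else. The sketch below illustrates the idea; the function name `createStaticContextResolver` is hypothetical, the cached document shown is a placeholder rather than the real context file, and verifying each cached file against its published hash is assumed to happen when the table is built.

```javascript
// Hypothetical helper: builds a resolver that serves only pre-cached
// context documents and never touches the network.
function createStaticContextResolver(cachedContexts) {
  const cache = new Map(Object.entries(cachedContexts));
  return function resolveContext(url) {
    if (!cache.has(url)) {
      throw new Error('Context is not in the static cache: ' + url);
    }
    return cache.get(url);
  };
}

const resolveContext = createStaticContextResolver({
  // placeholder content; a real cache would hold the hash-verified file
  'https://w3id.org/security/data-integrity/v2': {'@context': {}}
});
```

Failing closed on unknown URLs keeps production behavior deterministic; an application such as a digital wallet that needs to load arbitrary contexts would replace the `throw` with a guarded fetch.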

Some applications, such as digital wallets, that are capable of holding arbitrary verifiable credentials or other data-integrity-protected documents, from any issuer and using any contexts, might need to be able to load externally linked resources, such as JSON-LD context files, in production settings. This is expected to increase user choice, scalability, and decentralized upgrades in the ecosystem over time. Authors of such applications are advised to read the security and privacy sections of this document for further considerations.

For further information regarding processing of JSON-LD contexts and vocabularies, see Verifiable Credentials v2.0: Base Context and Verifiable Credentials v2.0: Vocabularies.

The @context property is used to ensure that implementations are using the same semantics when terms in this specification are processed. For example, this can be important when properties like type are processed and their values, such as DataIntegrityProof, are used.

If an @context property is not provided in a document that is being secured or verified, or the Data Integrity terms used in the document are not mapped by existing values in the @context property, implementations MUST inject or add an @context property with a value of https://w3id.org/security/data-integrity/v2.

Context injection is expected to be unnecessary sometimes, such as when the Verifiable Credential Data Model v2.0 context (https://www.w3.org/ns/credentials/v2) exists as a value in the @context property, as that context maps all of the necessary Data Integrity terms that were previously mapped by https://w3id.org/security/data-integrity/v2.
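Context injection can be sketched as follows. This is a simplified illustration: the function name `ensureDataIntegrityContext` is hypothetical, and instead of performing JSON-LD processing to decide whether the Data Integrity terms are already mapped, it assumes that only the two context URLs named in this section map them.

```javascript
const DATA_INTEGRITY_CONTEXT = 'https://w3id.org/security/data-integrity/v2';

// Assumption: these are the only contexts known to map Data Integrity terms.
const CONTEXTS_MAPPING_TERMS = new Set([
  DATA_INTEGRITY_CONTEXT,
  'https://www.w3.org/ns/credentials/v2'
]);

// Hypothetical helper: returns the document unchanged when a known context
// is present, otherwise returns a copy with the context appended.
function ensureDataIntegrityContext(document) {
  const existing = document['@context'];
  let contexts;
  if (existing === undefined) {
    contexts = [];
  } else if (Array.isArray(existing)) {
    contexts = existing;
  } else {
    contexts = [existing];
  }
  const alreadyMapped = contexts.some(
    context => typeof context === 'string' &&
      CONTEXTS_MAPPING_TERMS.has(context));
  if (alreadyMapped) {
    return document;
  }
  return {...document, '@context': [...contexts, DATA_INTEGRITY_CONTEXT]};
}
```

Returning a copy rather than mutating the input keeps the caller's document intact, which matters when the same object is later compared against the secured form.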

HTML processors are designed to continue processing if recoverable errors are detected. JSON-LD processors operate in a similar manner. This design philosophy was meant to ensure that developers could use only the parts of the JSON-LD language that they find useful, without causing the processor to throw errors on things that might not be important to the developer. Among other effects, this philosophy led to JSON-LD processors being designed to not throw errors, but rather warn developers, when encountering things such as undefined terms.

When converting from JSON-LD to an RDF Dataset, such as when canonicalizing a document [RDF-CANON], undefined terms and relative URLs can be dropped silently. When values are dropped, they are not protected by a digital proof. This creates a mismatch of expectations, where a developer, who is unaware of how a JSON-LD processor works, might think that certain data was being secured, and then be surprised to find that it was not, when no error was thrown. This specification requires that any recoverable loss of data when performing JSON-LD transformations result in an error, to avoid a mismatch in the security expectations of developers.

Implementations that use JSON-LD processing, such as RDF Dataset Canonicalization [RDF-CANON], MUST throw an error, which SHOULD be DATA_LOSS_DETECTION_ERROR, when data is dropped by a JSON-LD processor, such as when an undefined term is detected in an input document.

Similarly, since conforming secured documents can be transferred from one security domain to another, conforming processors that process the conforming secured document cannot assume any particular base URL for the document. When deserializing to RDF, implementations MUST ensure that the base URL is set to null.
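The requirement to raise an error when an undefined term would be dropped can be illustrated with a deliberately simplified check. In practice this detection belongs inside the JSON-LD processor; the sketch below only inspects top-level property names against a set of defined terms, and both the function name `assertNoUndefinedTerms` and the precomputed `definedTerms` set are hypothetical.

```javascript
// Hypothetical helper: throws when a top-level property is not a JSON-LD
// keyword and is not among the terms the active context defines.
function assertNoUndefinedTerms(document, definedTerms) {
  for (const key of Object.keys(document)) {
    if (key.startsWith('@')) {
      continue; // JSON-LD keywords such as @context are not terms
    }
    if (!definedTerms.has(key)) {
      const error = new Error('Undefined term would be dropped: ' + key);
      error.name = 'DATA_LOSS_DETECTION_ERROR';
      throw error;
    }
  }
}
```

A complete implementation would apply the same rule to nested objects, scoped contexts, and relative URLs, all of which this sketch ignores.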

This section defines datatypes that are used by this specification.

This specification encodes cryptographic suite identifiers as enumerable strings, which is useful in processes that need to efficiently encode such strings, such as compression algorithms. In environments that support data types for string values, such as RDF [RDF-CONCEPTS], cryptographic identifier content is indicated using a literal value whose datatype is set to https://w3id.org/security#cryptosuiteString.

The cryptosuiteString datatype is defined as follows:

The URL denoting this datatype: https://w3id.org/security#cryptosuiteString
The lexical space: The union of all cryptosuite strings, expressed using American Standard Code for Information Interchange [ASCII] strings, that are defined by the collection of all Data Integrity cryptosuite specifications.
The value space: The union of all cryptosuite types that are expressed using the cryptosuite property, as defined in Section 3.1 DataIntegrityProof.
The lexical-to-value mapping: Any element of the lexical space is mapped to the result of parsing it into an internal representation that uniquely identifies the cryptosuite type from all other possible cryptosuite types.
The canonical mapping: Any element of the value space is mapped to the corresponding string in the lexical space.

Multibase-encoded strings are used to encode binary data into ASCII-only formats, which are useful in environments that cannot directly represent binary values. This specification makes use of this encoding. In environments that support data types for string values, such as RDF [RDF-CONCEPTS], Multibase-encoded content is indicated using a literal value whose datatype is set to https://w3id.org/security#multibase.

The multibase datatype is defined as follows:

The URL denoting this datatype: https://w3id.org/security#multibase
The lexical space: Any string that starts with a Multibase character and the rest of the characters consist of allowable characters in the respective base-encoding alphabet.
The value space: The standard mathematical concept of all integer numbers.
The lexical-to-value mapping: Any element of the lexical space is mapped to the value space by base-decoding the value based on the base-decoding alphabet associated with the first Multibase character in the lexical string.
The canonical mapping: The canonical mapping consists of using the lexical-to-value mapping.
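Membership in the lexical space of the multibase datatype can be checked with a short function. The sketch below covers only the two headers that this specification lists as normative (`u` and `z`); the function name `isMultibaseLexical` is hypothetical, and strings using other Multibase headers are reported as unsupported rather than invalid in any wider sense.

```javascript
// Alphabets for the two Multibase headers defined by this specification.
const MULTIBASE_ALPHABETS = {
  u: 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_',
  z: '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'
};

// Hypothetical helper: true when the string starts with a known Multibase
// header and every following character is in that header's alphabet.
function isMultibaseLexical(value) {
  if (typeof value !== 'string' || value.length === 0) {
    return false;
  }
  const alphabet = MULTIBASE_ALPHABETS[value[0]];
  if (alphabet === undefined) {
    return false;
  }
  for (const character of value.slice(1)) {
    if (!alphabet.includes(character)) {
      return false;
    }
  }
  return true;
}
```

For instance, the `publicKeyMultibase` value used in the earlier examples passes this check, while a `z`-prefixed string containing `0`, `O`, `I`, or `l` does not, since base-58-btc omits those characters.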

The algorithms defined below are generalized in that they require a specific transformation algorithm, hashing algorithm, proof serialization algorithm, and proof verification algorithm to be specified by a particular cryptographic suite (see Section 3. Cryptographic Suites).

Issue: Verification hash algorithm definition

At present the creation of the verification hash is delegated to the cryptographic suite specification when generating and verifying a proof. It is expected that this algorithm is going to be common to most cryptographic suites. It is predicted that the algorithm that generates the verification hash will eventually be defined in this specification.

The following algorithm specifies how to encode an array of bytes, where each byte represents a base-256 value, to a different base representation that uses a particular base alphabet, such as base-64-url-no-pad or base-58-btc. The required inputs are the bytes, targetBase, and baseAlphabet. The output is a string that contains the base-encoded value. All mathematical operations MUST be performed using integer arithmetic. Alternatives to the algorithm provided below MAY be used as long as the outputs of the alternative algorithm remain the same.

Initialize the following variables: zeroes to 0, length to 0, begin to 0, and end to the length of bytes.
Set begin and zeroes to the number of leading 0 byte values in bytes.
Set baseValue to an empty byte array that is the size of the final base-expanded value. Calculate size, the length of baseValue, by dividing log(256) by log(targetBase), multiplying the result by the length of bytes minus the leading zeroes, adding 1, and rounding down to the nearest integer.
Process each byte in bytes as byte starting at offset begin:
1. Set the carry value to byte.
2. Perform base-expansion by starting at the end of the baseValue array. Initialize an iterator i to 0. Set basePosition to size minus 1. Perform the following loop as long as carry does not equal 0 or i is less than length, and basePosition does not equal -1.
  1. Multiply the value in baseValue[basePosition] by 256 and add it to carry.
  2. Set the value at baseValue[basePosition] to the remainder after dividing carry by targetBase.
  3. Set the value of carry to carry divided by targetBase ensuring that integer division is used to perform the division.
  4. Decrement basePosition by 1 and increment i by 1.
3. Set length to i and increment begin by 1.
Set the baseEncodingPosition to size minus length. While the baseEncodingPosition does not equal size and baseValue[baseEncodingPosition] equals 0, increment baseEncodingPosition. This step skips the leading zeros in the base-encoded result.
Initialize the baseEncoding by repeating the first entry in the baseAlphabet by the value of zeroes (the number of leading zeroes in bytes).
Convert the rest of the baseValue to the base-encoding. While the baseEncodingPosition is less than size: set baseEncodedValue to baseValue[baseEncodingPosition], append baseAlphabet[baseEncodedValue] to baseEncoding, and increment the baseEncodingPosition.
Return baseEncoding as the base-encoded value.

Example 20: An implementation of the general base-encoding algorithm above in JavaScript

function baseEncode(bytes, targetBase, baseAlphabet) {
  let zeroes = 0;
  let length = 0;
  let begin = 0;
  let end = bytes.length;

  // count the number of leading bytes that are zero
  while(begin !== end && bytes[begin] === 0) {
    begin++;
    zeroes++;
  }

  // allocate enough space to store the target base value
  const baseExpansionFactor = Math.log(256) / Math.log(targetBase);
  let size = Math.floor((end - begin) * baseExpansionFactor + 1);
  let baseValue = new Uint8Array(size);

  // process the entire input byte array
  while(begin !== end) {
    let carry = bytes[begin];

    // for each byte in the array, perform base-expansion
    let i = 0;
    for(let basePosition = size - 1;
        (carry !== 0 || i < length) && (basePosition !== -1);
        basePosition--, i++) {
      carry += Math.floor(256 * baseValue[basePosition]);
      baseValue[basePosition] = Math.floor(carry % targetBase);
      carry = Math.floor(carry / targetBase);
    }

    length = i;
    begin++;
  }

  // skip leading zeroes in base-encoded result
  let baseEncodingPosition = size - length;
  while(baseEncodingPosition !== size &&
        baseValue[baseEncodingPosition] === 0) {
    baseEncodingPosition++;
  }

  // convert the base value to the base encoding
  let baseEncoding = baseAlphabet.charAt(0).repeat(zeroes);
  for(; baseEncodingPosition < size; ++baseEncodingPosition) {
    baseEncoding += baseAlphabet.charAt(baseValue[baseEncodingPosition]);
  }

  return baseEncoding;
}

The following algorithm specifies how to decode an array of bytes, where each byte represents a base-encoded value, to a different base representation that uses a particular base alphabet, such as base-64-url-no-pad or base-58-btc. The required inputs are the sourceEncoding, sourceBase, and baseAlphabet. The output is an array of bytes that contains the base-decoded value. All mathematical operations MUST be performed using integer arithmetic. Alternatives to the algorithm provided below MAY be used as long as the outputs of the alternative algorithm remain the same.

Initialize a baseMap mapping by associating each character in baseAlphabet to its integer position in the baseAlphabet string.
Initialize the following variables: sourceOffset to 0, zeroes to 0, and decodedLength to 0.
Set zeroes and sourceOffset to the number of leading baseAlphabet[0] values in sourceEncoding.
Set decodedBytes to a zero-filled byte array that is large enough to hold the final base-converted value. Calculate the size of decodedBytes, decodedSize, by dividing log(sourceBase) by log(256), multiplying the result by the length of sourceEncoding minus the leading zeroes, rounding down, and then adding 1.
Process each character in sourceEncoding as character starting at offset sourceOffset:
1. Set the carry value to the integer value in the baseMap that is associated with character.
2. Perform base-decoding by starting at the end of the decodedBytes array. Initialize an iterator i to 0. Set byteOffset to decodedSize minus 1. Perform the following loop as long as carry does not equal 0 or i is less than decodedLength, and byteOffset does not equal -1:
  1. Add the result of multiplying sourceBase by decodedBytes[byteOffset] to carry.
  2. Set decodedBytes[byteOffset] to the remainder of dividing carry by 256.
  3. Set carry to carry divided by 256, ensuring that integer division is used to perform the division.
  4. Decrement byteOffset by 1 and increment i by 1.
3. Set decodedLength to i and increment sourceOffset by 1.
Set the decodedOffset to decodedSize minus decodedLength. While the decodedOffset does not equal the decodedSize and decodedBytes[decodedOffset] equals 0, increment decodedOffset by 1. This step skips the leading zeros in the final base-decoded byte array.
Set the size of the finalBytes array to zeroes plus the result of decodedSize minus decodedOffset. Initialize the first zeroes bytes in finalBytes to 0.
Starting at zero-based offset zeroes in finalBytes (the first position after the leading zero bytes), copy all bytes in decodedBytes from offset decodedOffset up to, but not including, decodedSize.

Example 21: An implementation of the general base-decoding algorithm above in JavaScript

function baseDecode(sourceEncoding, sourceBase, baseAlphabet) {
  // build the base-alphabet to integer value map
  const baseMap = {};
  for(let i = 0; i < baseAlphabet.length; i++) {
    baseMap[baseAlphabet[i]] = i;
  }

  // skip and count zero-byte values in the sourceEncoding
  let sourceOffset = 0;
  let zeroes = 0;
  let decodedLength = 0;
  while(sourceEncoding[sourceOffset] === baseAlphabet[0]) {
    zeroes++;
    sourceOffset++;
  }

  // allocate the decoded byte array
  const baseContractionFactor = Math.log(sourceBase) / Math.log(256);
  let decodedSize = Math.floor((
    (sourceEncoding.length - sourceOffset) * baseContractionFactor) + 1);
  let decodedBytes = new Uint8Array(decodedSize);

  // perform base-conversion on the source encoding
  while(sourceEncoding[sourceOffset]) {
    // process each base-encoded number
    let carry = baseMap[sourceEncoding[sourceOffset]];

    // convert the base-encoded number by performing base-expansion
    let i = 0;
    for(let byteOffset = decodedSize - 1;
      (carry !== 0 || i < decodedLength) && (byteOffset !== -1);
      byteOffset--, i++) {
      carry += Math.floor(sourceBase * decodedBytes[byteOffset]);
      decodedBytes[byteOffset] = Math.floor(carry % 256);
      carry = Math.floor(carry / 256);
    }

    decodedLength = i;
    sourceOffset++;
  }

  // skip leading zeros in the decoded byte array
  let decodedOffset = decodedSize - decodedLength;
  while(decodedOffset !== decodedSize && decodedBytes[decodedOffset] === 0) {
    decodedOffset++;
  }

  // create the final byte array that has been base-decoded
  let finalBytes = new Uint8Array(zeroes + (decodedSize - decodedOffset));
  let j = zeroes;
  while(decodedOffset !== decodedSize) {
    finalBytes[j++] = decodedBytes[decodedOffset++];
  }

  return finalBytes;
}
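
Because alternatives to the algorithm above are permitted as long as their outputs remain the same, the decoding can also be sketched with BigInt arithmetic. The following is a minimal, non-normative sketch: the function name baseDecodeBigInt is hypothetical, a runtime with BigInt support is assumed, and, unlike Example 21, the sketch rejects characters that are not in baseAlphabet.

```javascript
// Hypothetical alternative to Example 21 that uses BigInt integer arithmetic.
function baseDecodeBigInt(sourceEncoding, sourceBase, baseAlphabet) {
  // count leading baseAlphabet[0] characters; each decodes to one zero byte
  let zeroes = 0;
  while(zeroes < sourceEncoding.length &&
    sourceEncoding[zeroes] === baseAlphabet[0]) {
    zeroes++;
  }

  // accumulate the remaining characters into a single integer
  const base = BigInt(sourceBase);
  let value = 0n;
  for(const character of sourceEncoding.slice(zeroes)) {
    const digit = baseAlphabet.indexOf(character);
    if(digit === -1) {
      // assumption: invalid input is rejected rather than ignored
      throw new Error(`Character "${character}" is not in the base alphabet.`);
    }
    value = value * base + BigInt(digit);
  }

  // emit the integer as big-endian bytes
  const decoded = [];
  while(value > 0n) {
    decoded.unshift(Number(value % 256n));
    value = value / 256n;
  }

  // prepend the zero bytes counted above
  const finalBytes = new Uint8Array(zeroes + decoded.length);
  finalBytes.set(decoded, zeroes);
  return finalBytes;
}

// usage with the base-58-btc alphabet: two leading "1" characters become
// two zero bytes, and "2g" is 1 * 58 + 39 = 97
const base58btc = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz';
const decoded = baseDecodeBigInt('112g', 58, base58btc);
```

For inputs made up only of alphabet characters, this sketch is intended to return the same bytes as Example 21; it trades the fixed-size working buffer for arbitrary-precision integers.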

The following algorithm specifies how to add a digital proof to a document, which can be used to verify the authenticity and integrity of an unsecured data document. Required inputs are an unsecured data document (unsecuredDocument) and proof options (options). The proof options MUST contain a type identifier for the cryptographic suite (type) and any other properties needed by the cryptographic suite type; an identifier for the verification method (verificationMethod) that can be used to verify the authenticity of the proof; and an [XMLSCHEMA11-2] dateTimeStamp string (created) containing the current date and time, accurate to at least one second, in Coordinated Universal Time (UTC). A security domain (domain) and/or a receiver-supplied challenge (challenge) MAY also be specified in the options. A secured data document is produced as output. Whenever this algorithm encodes strings, it MUST use UTF-8 encoding.

Let output be a copy of unsecuredDocument.
Let transformedData be the result of transforming unsecuredDocument according to a transformation algorithm associated with the cryptographic suite and the options parameters provided as inputs to the algorithm.
Let hashData be the result of hashing the transformedData according to a hashing algorithm associated with the cryptographic suite and the options parameters provided as inputs to the algorithm.
Let proof be the result of running the proof serialization algorithm associated with the cryptographic suite with the hashData and options parameters provided as inputs to the algorithm.
If the proof.type, proof.verificationMethod, or proof.proofPurpose values are not set, an error MUST be raised and SHOULD convey an error type of PROOF_GENERATION_ERROR.
If the cryptographic suite requires the use of a created timestamp, and the proof.created value is not set, a PROOF_GENERATION_ERROR MUST be raised.
If options.domain is set, it MUST be equal to proof.domain or an error MUST be raised and SHOULD convey an error type of PROOF_GENERATION_ERROR.
If options.challenge is set, it MUST be equal to proof.challenge or an error MUST be raised and SHOULD convey an error type of PROOF_GENERATION_ERROR.
Set output.proof to the value of proof.
Return output as the secured data document.
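
The steps above can be sketched as follows. This is a minimal, non-normative illustration: the suite object and its transform, hash, serializeProof, and requiresCreated members are assumptions standing in for whatever a real cryptographic suite defines, the ProofGenerationError class and the ExampleProofType value are hypothetical, and the stand-in suite provides no cryptographic security. Real suite algorithms are likely to be asynchronous; synchronous calls are used here only to keep the sketch short.

```javascript
// Hypothetical error type; only the PROOF_GENERATION_ERROR name comes from
// this specification.
class ProofGenerationError extends Error {
  constructor(message) {
    super(message);
    this.name = 'PROOF_GENERATION_ERROR';
  }
}

// `suite` is an assumed object exposing the transformation, hashing, and
// proof serialization algorithms of one cryptographic suite.
function addProof(unsecuredDocument, options, suite) {
  // step 1: copy the unsecured document
  const output = JSON.parse(JSON.stringify(unsecuredDocument));

  // steps 2-4: transform, hash, and serialize the proof
  const transformedData = suite.transform(unsecuredDocument, options);
  const hashData = suite.hash(transformedData, options);
  const proof = suite.serializeProof(hashData, options);

  // step 5: required proof properties
  for(const property of ['type', 'verificationMethod', 'proofPurpose']) {
    if(proof[property] === undefined) {
      throw new ProofGenerationError(`proof.${property} is not set.`);
    }
  }

  // step 6: created timestamp, when the suite requires one
  if(suite.requiresCreated && proof.created === undefined) {
    throw new ProofGenerationError('proof.created is not set.');
  }

  // steps 7-8: domain and challenge must be carried into the proof
  for(const property of ['domain', 'challenge']) {
    if(options[property] !== undefined &&
      options[property] !== proof[property]) {
      throw new ProofGenerationError(`proof.${property} does not match.`);
    }
  }

  // steps 9-10: attach the proof and return the secured data document
  output.proof = proof;
  return output;
}

// Stand-in suite for illustration only; it is NOT cryptographically secure.
const standInSuite = {
  requiresCreated: true,
  transform: document => JSON.stringify(document),
  hash: transformedData => transformedData.length,
  serializeProof: (hashData, options) =>
    ({...options, proofValue: `stand-in-${hashData}`})
};

const securedDocument = addProof({name: 'Example'}, {
  type: 'ExampleProofType',
  verificationMethod: 'https://controller.example/123#key-456',
  proofPurpose: 'assertionMethod',
  created: '2024-01-01T00:00:00Z',
  domain: 'example.com'
}, standInSuite);
```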

Note: `hashData` might not consist of a single value

While the output of the hashing algorithm can be a single value, such as a 32-byte SHA2-256 value, implementers are advised that some cryptographic suites might define hashData to be composed of multiple values that might be processed independently in the proof serialization algorithm. For example, this approach is known to be taken in certain cryptographic suites that allow selective disclosure or unlinkability via the digital proof.

The following algorithm specifies how to incrementally add a proof to a proof set or proof chain starting with a secured document containing either a proof or proof set/chain. Required inputs are a secured data document (securedDocument) and proof options (options). The proof options (options) must satisfy the criteria in section 4.3 Add Proof. A new secured data document is produced as output. Whenever this algorithm encodes strings, it MUST use UTF-8 encoding.

Let proof be set to securedDocument.proof. Let allProofs be an empty list. If proof is a list, copy all the elements of proof to allProofs. If proof is an object add a copy of that object to allProofs.
Let the unsecuredDocument be a copy of the securedDocument with the proof attribute removed. Let output be a copy of the unsecuredDocument.
If options contains a previousProof attribute and that attribute is a string, check whether an element of the allProofs has a matching id attribute. If not, an error MUST be raised and SHOULD convey an error type of PROOF_GENERATION_ERROR. If the previousProof attribute is an array, check that each element of that array is a string which matches the id attribute of an element of the allProofs. If not, an error MUST be raised and SHOULD convey an error type of PROOF_GENERATION_ERROR.
Run steps 2 through 8 of the algorithm in section 4.3 Add Proof. If no exceptions are raised, append the generated proof value to the allProofs; otherwise, raise the exception.
Set output.proof to the value of allProofs.
Return output as the new secured data document.
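
The proof set/chain steps above can be sketched as follows. This is a minimal, non-normative illustration: the createProof callback is an assumption that stands in for steps 2 through 8 of the Add Proof algorithm, and the function and class names are hypothetical.

```javascript
// Hypothetical error type; only the error name comes from this specification.
class ProofGenerationError extends Error {
  constructor(message) {
    super(message);
    this.name = 'PROOF_GENERATION_ERROR';
  }
}

const copy = value => JSON.parse(JSON.stringify(value));

// `createProof(unsecuredDocument, options)` is an assumed callback that runs
// steps 2 through 8 of Add Proof and returns the generated proof.
function addProofToSetOrChain(securedDocument, options, createProof) {
  // step 1: gather the existing proof or proofs into allProofs
  const proof = securedDocument.proof;
  const allProofs = [];
  if(Array.isArray(proof)) {
    allProofs.push(...proof.map(copy));
  } else if(proof !== null && typeof proof === 'object') {
    allProofs.push(copy(proof));
  }

  // step 2: remove the proof attribute to recover the unsecured document
  const {proof: removedProof, ...rest} = securedDocument;
  const unsecuredDocument = copy(rest);
  const output = copy(unsecuredDocument);

  // step 3: every previousProof value must name an existing proof id
  if(options.previousProof !== undefined) {
    const previousIds = [].concat(options.previousProof);
    for(const id of previousIds) {
      const found = typeof id === 'string' &&
        allProofs.some(existing => existing.id === id);
      if(!found) {
        throw new ProofGenerationError(`Unknown previousProof "${id}".`);
      }
    }
  }

  // step 4: generate the new proof; exceptions propagate to the caller
  allProofs.push(createProof(unsecuredDocument, options));

  // steps 5-6: attach all proofs and return the new secured data document
  output.proof = allProofs;
  return output;
}

// usage with a stand-in createProof that performs no cryptography
const chained = addProofToSetOrChain(
  {name: 'Example', proof: {id: 'urn:uuid:proof-1', type: 'ExampleProofType'}},
  {previousProof: 'urn:uuid:proof-1'},
  (document, proofOptions) => ({
    id: 'urn:uuid:proof-2',
    type: 'ExampleProofType',
    previousProof: proofOptions.previousProof
  }));
```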

The following algorithm specifies how to check the authenticity and integrity of a secured data document by verifying its digital proof. Required inputs are a secured data document (securedDocument) and proof options (options). The proof options MAY contain the following values that help protect against relay and replay attacks: expectedProofPurpose, domain, and challenge. The expectedProofPurpose value is used to ensure that the proof was generated by the proof creator for the expected reason by the verifier, such as "authentication". The domain value is used by the proof creator to lock a proof to a particular security domain, and used by the verifier to ensure that a proof is not being used across different security domains. The challenge value is used by the verifier to ensure that an attacker is not replaying previously created proofs. Additional options are expected to be provided by libraries to ensure that digital proof time stamps do not deviate more than an acceptable time frame for a given use case. A verification result is produced as output.

Let proof be set to securedDocument.proof.
If the proof.type, proof.verificationMethod, or proof.proofPurpose values are not set, an error MUST be raised and SHOULD convey an error type of MALFORMED_PROOF_ERROR.
If the cryptographic suite requires the proof.created value, and it is not set, an error MUST be raised and SHOULD convey an error type of MALFORMED_PROOF_ERROR.
If the proof.proofPurpose value does not match options.expectedProofPurpose, an error MUST be raised and SHOULD convey an error type of MISMATCHED_PROOF_PURPOSE_ERROR.
Let unsecuredDocument be a copy of securedDocument with the proof value removed.
Let transformedData be the result of transforming the unsecuredDocument according to a transformation algorithm associated with the cryptographic suite specified in proof and the options parameters provided as inputs to the algorithm. The type of cryptographic suite is specified by the proof.type value and MAY be further described by cryptographic suite-specific properties expressed in proof.
Let hashData be the result of hashing the transformedData according to a hashing algorithm associated with the cryptographic suite specified in the proof and options parameters provided as inputs to the algorithm.
Let isProofVerified be the result of running the proof verification algorithm associated with the cryptographic suite with the hashData and options parameters provided as inputs to the algorithm.
If options.domain is set and it does not match proof.domain, an error MUST be raised and SHOULD convey an error type of INVALID_DOMAIN_ERROR.
If options.challenge is set and it does not match proof.challenge, an error MUST be raised and SHOULD convey an error type of INVALID_CHALLENGE_ERROR.
Return isProofVerified as the verification result.
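
The verification steps above can be sketched as follows. This is a minimal, non-normative illustration: the suite object and its members are assumptions, the DataIntegrityError class is hypothetical, and the stand-in suite performs no cryptography. Two further assumptions are made: the proof is passed to the suite's verification function alongside hashData and options, and the proof purpose check is skipped when no expectedProofPurpose is supplied.

```javascript
// Hypothetical error type; the error names come from this specification.
class DataIntegrityError extends Error {
  constructor(name, message) {
    super(message);
    this.name = name;
  }
}

// `suite` is an assumed object exposing the transformation, hashing, and
// proof verification algorithms of one cryptographic suite.
function verifyProof(securedDocument, options, suite) {
  // step 1
  const proof = securedDocument.proof;

  // steps 2-3: structural checks on the proof
  for(const property of ['type', 'verificationMethod', 'proofPurpose']) {
    if(proof === undefined || proof[property] === undefined) {
      throw new DataIntegrityError(
        'MALFORMED_PROOF_ERROR', `proof.${property} is not set.`);
    }
  }
  if(suite.requiresCreated && proof.created === undefined) {
    throw new DataIntegrityError(
      'MALFORMED_PROOF_ERROR', 'proof.created is not set.');
  }

  // step 4 (assumption: only enforced when an expectation is supplied)
  if(options.expectedProofPurpose !== undefined &&
    proof.proofPurpose !== options.expectedProofPurpose) {
    throw new DataIntegrityError(
      'MISMATCHED_PROOF_PURPOSE_ERROR', 'Unexpected proof purpose.');
  }

  // step 5: copy the document without its proof
  const {proof: removedProof, ...rest} = securedDocument;
  const unsecuredDocument = JSON.parse(JSON.stringify(rest));

  // steps 6-8: transform, hash, and verify
  const transformedData = suite.transform(unsecuredDocument, options);
  const hashData = suite.hash(transformedData, options);
  const isProofVerified = suite.verifyProof(hashData, proof, options);

  // steps 9-10: domain and challenge checks
  if(options.domain !== undefined && options.domain !== proof.domain) {
    throw new DataIntegrityError('INVALID_DOMAIN_ERROR', 'Domain mismatch.');
  }
  if(options.challenge !== undefined &&
    options.challenge !== proof.challenge) {
    throw new DataIntegrityError(
      'INVALID_CHALLENGE_ERROR', 'Challenge mismatch.');
  }

  // step 11
  return isProofVerified;
}

// Stand-in suite for illustration only; it is NOT cryptographically secure.
const standInSuite = {
  requiresCreated: false,
  transform: document => JSON.stringify(document),
  hash: transformedData => transformedData.length,
  verifyProof: (hashData, proof) => proof.proofValue === `stand-in-${hashData}`
};

const exampleProof = {
  type: 'ExampleProofType',
  verificationMethod: 'https://controller.example/123#key-456',
  proofPurpose: 'authentication',
  domain: 'example.com',
  proofValue: 'stand-in-18'
};
const verified = verifyProof(
  {name: 'Example', proof: exampleProof},
  {expectedProofPurpose: 'authentication', domain: 'example.com'},
  standInSuite);
```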

In a proof set or proof chain, a secured data document has a proof attribute which contains a list of proofs (allProofs). The following algorithm specifies how to check the authenticity and integrity of a secured data document by verifying each proof in the allProofs. Required inputs are a secured data document (securedDocument). A list of verification results is produced as output.

Set allProofs to securedDocument.proof.
For each proof in allProofs, run steps 2 through 11 of the algorithm in section 4.5 Verify Proof; if no exceptions are raised, associate the isProofVerified value with this proof.
For each proof in allProofs that contains a previousProof attribute, modify the associated isProofVerified by iteratively checking that the proof identified by the id value in previousProof, along with every previousProof up the chain, is valid. If previousProof is an array, then all previous proofs identified by the array elements MUST be valid for the current proof to be considered valid. Cycle detection, such as keeping a list of previously visited proofs, MUST be used to guard against proof cycles. If a cycle is detected in a proof chain, then a PROOF_CHAIN_CYCLE_ERROR MUST be raised.
Return the allProofs along with each proof's associated isProofVerified information.
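
The chain check and cycle detection described above can be sketched as follows. This is a minimal, non-normative illustration: the verifySingle callback is an assumption that stands in for steps 2 through 11 of the Verify Proof algorithm, the function and class names are hypothetical, and the sketch treats a previousProof value that names no known proof as invalid.

```javascript
// Hypothetical error type; only the error name comes from this specification.
class ProofChainCycleError extends Error {
  constructor(message) {
    super(message);
    this.name = 'PROOF_CHAIN_CYCLE_ERROR';
  }
}

// `verifySingle(proof)` is an assumed callback that verifies one proof in
// isolation and returns a boolean.
function verifyProofSetOrChain(securedDocument, verifySingle) {
  // step 1
  const allProofs = [].concat(securedDocument.proof);

  // step 2: verify every proof on its own
  const results = new Map();
  const proofsById = new Map();
  for(const proof of allProofs) {
    results.set(proof, verifySingle(proof));
    if(proof.id !== undefined) {
      proofsById.set(proof.id, proof);
    }
  }

  // step 3: a proof is valid only if every proof it depends on is valid;
  // `visited` holds the proofs on the current path to detect cycles
  const isChainValid = (proof, visited) => {
    if(visited.has(proof)) {
      throw new ProofChainCycleError('Cycle detected in proof chain.');
    }
    visited.add(proof);
    if(!results.get(proof)) {
      return false;
    }
    const previousIds = proof.previousProof === undefined ?
      [] : [].concat(proof.previousProof);
    return previousIds.every(id => {
      const previous = proofsById.get(id);
      return previous !== undefined &&
        isChainValid(previous, new Set(visited));
    });
  };

  // step 4: return each proof with its verification result
  return allProofs.map(proof =>
    ({proof, isProofVerified: isChainValid(proof, new Set())}));
}

// usage: the second proof depends on the first, which fails verification
const chainResults = verifyProofSetOrChain({
  proof: [
    {id: 'urn:uuid:proof-1', proofValue: 'bad'},
    {id: 'urn:uuid:proof-2', previousProof: 'urn:uuid:proof-1',
      proofValue: 'good'}
  ]
}, proof => proof.proofValue === 'good');
```

In this usage both results are false: the first proof fails on its own, and the second fails because a proof it depends on is not valid.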

The following algorithm specifies how to safely retrieve a verification method, such as a cryptographic public key, by using a verification method identifier contained in a data integrity proof. Required inputs are a data integrity proof (proof) and a set of dereferencing options (options). A verification method is produced as output.

Let vmIdentifier be set to proof.verificationMethod.
Let vmPurpose be set to proof.proofPurpose.
If vmIdentifier is not a valid URL, an error MUST be raised and SHOULD convey an error type of INVALID_VERIFICATION_METHOD_URL.
Let controllerDocumentUrl be the result of parsing vmIdentifier according to the rules of the URL scheme and extracting the primary resource identifier (without the fragment identifier).
Let vmFragment be the result of parsing vmIdentifier according to the rules of the URL scheme and extracting the secondary resource identifier (the fragment identifier).
Let controllerDocument be the result of dereferencing controllerDocumentUrl, according to the rules of the URL scheme and using the supplied options.
If controllerDocument.id does not match the controllerDocumentUrl, an error MUST be raised and SHOULD convey an error type of INVALID_CONTROLLER_DOCUMENT_ID.
If controllerDocument is not a valid controller document, an error MUST be raised and SHOULD convey an error type of INVALID_CONTROLLER_DOCUMENT.
Let verificationMethod be the result of dereferencing the vmFragment from the controllerDocument according to the rules of the media type of the controllerDocument.
If verificationMethod is not a valid verification method, an error MUST be raised and SHOULD convey an error type of INVALID_VERIFICATION_METHOD.
If verificationMethod is not associated with the verification relationship array named by vmPurpose in the controllerDocument, either by reference (URL) or by value (object), an error MUST be raised and SHOULD convey an error type of INVALID_PROOF_PURPOSE_FOR_VERIFICATION_METHOD.
Return verificationMethod as the verification method.

The following example provides a minimum conformant controller document containing a minimum conformant verification method as required by the algorithm in this section:

Example 22: Minimum conformant controller document

{
  "id": "https://controller.example/123",
  "verificationMethod": [{
    "id": "https://controller.example/123#key-456",
    "type": "ExampleVerificationMethodType",
    "controller": "https://controller.example/123",
    // public cryptographic material goes here
  }],
  "authentication": ["#key-456"]
}
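
The retrieval algorithm in this section can be sketched against a controller document like the one in Example 22. This is a minimal, non-normative illustration: options.dereference is a hypothetical synchronous loader (real dereferencing is usually asynchronous and scheme-specific), the RetrievalError class is hypothetical, and the validity checks on the controller document and verification method are simplified assumptions rather than the full conformance rules.

```javascript
// Hypothetical error type; the error names come from this specification.
class RetrievalError extends Error {
  constructor(name, message) {
    super(message);
    this.name = name;
  }
}

function retrieveVerificationMethod(proof, options) {
  const vmIdentifier = proof.verificationMethod;
  const vmPurpose = proof.proofPurpose;

  // the identifier must parse as an absolute URL
  try {
    new URL(vmIdentifier);
  } catch(e) {
    throw new RetrievalError('INVALID_VERIFICATION_METHOD_URL',
      'proof.verificationMethod is not a valid URL.');
  }

  // split into the primary resource identifier and the fragment
  const fragmentStart = vmIdentifier.indexOf('#');
  const controllerDocumentUrl = fragmentStart === -1 ?
    vmIdentifier : vmIdentifier.slice(0, fragmentStart);
  const vmFragment = fragmentStart === -1 ?
    '' : vmIdentifier.slice(fragmentStart + 1);

  // resolve possibly relative references against the controller document URL
  const resolve = reference => new URL(reference, controllerDocumentUrl).href;
  const target = resolve('#' + vmFragment);

  const controllerDocument = options.dereference(controllerDocumentUrl);
  if(controllerDocument?.id !== controllerDocumentUrl) {
    throw new RetrievalError('INVALID_CONTROLLER_DOCUMENT_ID',
      'Controller document id does not match its URL.');
  }
  // simplified validity check (assumption): methods are listed in an array
  if(!Array.isArray(controllerDocument.verificationMethod)) {
    throw new RetrievalError('INVALID_CONTROLLER_DOCUMENT',
      'Controller document has no verificationMethod array.');
  }

  const verificationMethod = controllerDocument.verificationMethod.find(
    method => typeof method?.id === 'string' && resolve(method.id) === target);
  if(verificationMethod === undefined ||
    typeof verificationMethod.type !== 'string' ||
    typeof verificationMethod.controller !== 'string') {
    throw new RetrievalError('INVALID_VERIFICATION_METHOD',
      'Verification method is missing or malformed.');
  }

  // the method must be listed, by URL or by value, under the proof purpose
  const relationship = [].concat(controllerDocument[vmPurpose] ?? []);
  const associated = relationship.some(entry =>
    resolve(typeof entry === 'string' ? entry : entry?.id) === target);
  if(!associated) {
    throw new RetrievalError('INVALID_PROOF_PURPOSE_FOR_VERIFICATION_METHOD',
      `Verification method is not listed under "${vmPurpose}".`);
  }
  return verificationMethod;
}

// usage with an in-memory copy of the controller document from Example 22
const exampleControllerDocument = {
  id: 'https://controller.example/123',
  verificationMethod: [{
    id: 'https://controller.example/123#key-456',
    type: 'ExampleVerificationMethodType',
    controller: 'https://controller.example/123'
  }],
  authentication: ['#key-456']
};
const retrieved = retrieveVerificationMethod({
  verificationMethod: 'https://controller.example/123#key-456',
  proofPurpose: 'authentication'
}, {dereference: url => exampleControllerDocument});
```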

The algorithms described in this specification, as well as in various cryptographic suite specifications, throw specific types of errors. Implementers might find it useful to convey these errors to other libraries or software systems. This section provides specific URLs, descriptions, and error codes for the errors, such that an ecosystem implementing technologies described by this specification might interoperate more effectively when errors occur.

When exposing these errors through an HTTP interface, implementers SHOULD use [RFC9457] to encode the error data structure. If [RFC9457] is used:

The type value of the error object MUST be a URL that starts with the value https://w3id.org/security# and ends with the value in the section listed below.
The code value MUST be the integer code described in the table below (in parentheses, beside the type name).
The title value SHOULD provide a short but specific human-readable string for the error.
The detail value SHOULD provide a longer human-readable string for the error.

PROOF_GENERATION_ERROR (-16): A request to generate a proof failed. See Section 4.3 Add Proof, and Section 4.4 Add Proof Set/Chain.
MALFORMED_PROOF_ERROR (-17): A proof that is malformed was detected. See Section 4.5 Verify Proof.
MISMATCHED_PROOF_PURPOSE_ERROR (-18): The proofPurpose value in a proof did not match the expected value. See Section 4.5 Verify Proof.
INVALID_DOMAIN_ERROR (-19): The domain value in a proof did not match the expected value. See Section 4.5 Verify Proof.
INVALID_CHALLENGE_ERROR (-20): The challenge value in a proof did not match the expected value. See Section 4.5 Verify Proof.
INVALID_VERIFICATION_METHOD_URL (-21): The verificationMethod value in a proof was malformed. See Section 4.7 Retrieve Verification Method.
INVALID_CONTROLLER_DOCUMENT_ID (-22): The id value in a controller document was malformed. See Section 4.7 Retrieve Verification Method.
INVALID_CONTROLLER_DOCUMENT (-23): The controller document was malformed. See Section 4.7 Retrieve Verification Method.
INVALID_VERIFICATION_METHOD (-24): The verification method in a controller document was malformed. See Section 4.7 Retrieve Verification Method.
INVALID_PROOF_PURPOSE_FOR_VERIFICATION_METHOD (-25): The verification method in a controller document was not associated using the expected verification relationship as expressed in the proofPurpose property in the proof. See Section 4.7 Retrieve Verification Method.
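
As an illustration of the rules above, the following sketch builds an error object with the type, code, title, and detail members. The function and constant names are hypothetical; only the error names, integer codes, and the https://w3id.org/security# prefix are taken from this section.

```javascript
// Error names and integer codes as listed in this section.
const DATA_INTEGRITY_ERROR_CODES = {
  PROOF_GENERATION_ERROR: -16,
  MALFORMED_PROOF_ERROR: -17,
  MISMATCHED_PROOF_PURPOSE_ERROR: -18,
  INVALID_DOMAIN_ERROR: -19,
  INVALID_CHALLENGE_ERROR: -20,
  INVALID_VERIFICATION_METHOD_URL: -21,
  INVALID_CONTROLLER_DOCUMENT_ID: -22,
  INVALID_CONTROLLER_DOCUMENT: -23,
  INVALID_VERIFICATION_METHOD: -24,
  INVALID_PROOF_PURPOSE_FOR_VERIFICATION_METHOD: -25
};

// Hypothetical helper that produces an object suitable for serialization
// as an [RFC9457] error data structure.
function toProblemDetails(errorName, title, detail) {
  const isKnown = Object.prototype.hasOwnProperty.call(
    DATA_INTEGRITY_ERROR_CODES, errorName);
  if(!isKnown) {
    throw new Error(`Unknown error name "${errorName}".`);
  }
  return {
    // the type URL starts with the security vocabulary prefix and ends
    // with the error name
    type: 'https://w3id.org/security#' + errorName,
    code: DATA_INTEGRITY_ERROR_CODES[errorName],
    title,
    detail
  };
}

const problem = toProblemDetails(
  'INVALID_DOMAIN_ERROR',
  'Domain mismatch',
  'The domain value in the proof did not match the expected value.');
```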

The following section describes security considerations that developers implementing this specification should be aware of in order to create secure software.

Cryptography secures information through the use of secrets. Knowledge of the necessary secret makes it computationally easy to access certain information. The same information can be accessed if a computationally-difficult, brute-force effort successfully guesses the secret. All modern cryptography requires the computationally difficult approach to remain difficult throughout time, which does not always hold due to breakthroughs in science and mathematics. That is to say that cryptography has a shelf life.

This specification plans for the obsolescence of all cryptographic approaches by asserting that whatever cryptography is in use today is highly likely to be broken over time. Software systems have to be able to change the cryptography in use over time in order to continue to secure information. Such changes might involve increasing required secret sizes or modifications to the cryptographic primitives used. However, some combinations of cryptographic parameters might actually reduce security. Given these assumptions, systems need to be able to distinguish different combinations of safe cryptographic parameters, also known as cryptographic suites, from one another. When identifying or versioning cryptographic suites, there are several approaches that can be taken which include: parameters, numbers, and dates.

Parametric versioning specifies the particular cryptographic parameters that are employed in a cryptographic suite. For example, one could use an identifier such as RSASSA-PKCS1-v1_5-SHA1. The benefit to this scheme is that a well-trained cryptographer will be able to determine all of the parameters in play by the identifier. The drawback to this scheme is that most of the population that uses these sorts of identifiers are not well trained and thus will not understand that the previously mentioned identifier is a cryptographic suite that is no longer safe to use. Additionally, this lack of knowledge might lead software developers to generalize the parsing of cryptographic suite identifiers such that any combination of cryptographic primitives becomes acceptable, resulting in reduced security. Ideally, cryptographic suites are implemented in software as specific, acceptable profiles of cryptographic parameters instead.

Numbered versioning might specify a major and minor version number such as 1.0 or 2.1. Numbered versioning conveys a specific order and suggests that higher version numbers are more capable than lower version numbers. The benefit of this approach is that it replaces complex parameters that less expert developers might not understand with a simpler model that conveys that an upgrade might be appropriate. The drawback of this approach is that it is not clear if an upgrade is necessary, as software version number increases often don't require an upgrade for the software to continue functioning. This can lead to developers thinking their usage of a particular version is safe, when it is not. Ideally, additional signals would be given to developers that use cryptographic suites in their software that periodic reviews of those suites for continued security are required.

Date-based versioning specifies a particular release date for a specific cryptographic suite. The benefit of a date, such as a year, is that it is immediately clear to a developer if the date is relatively old or new. Seeing an old date might prompt the developer to go searching for a newer cryptographic suite, whereas a parametric or number-based versioning scheme might not. The downside of a date-based version is that some cryptographic suites might not expire for 5-10 years, prompting the developer to go searching for a newer cryptographic suite only to find that none exists. While this might be an inconvenience, it is one that results in safer ecosystem behavior.

Issue 38: Determine how cryptographic suites are named and versioned

The following text is currently under debate:

It is highly encouraged that cryptographic suite identifiers are versioned using a year designation. For example, the cryptographic suite identifier ecdsa-2022 implies that the suite is probably an acceptable use of ECDSA in the year 2025, but might not be a safe choice in the year 2042. A date-based versioning mechanism, however, is not enough by itself. All cryptographic suites that follow this specification are intended to be registered [VC-SPECS] in a way that clearly signals which cryptosuites are deprecated, standardized, or experimental. Cryptosuite registration will follow CFRG, IETF, NIST, FIPS, and safecurves guidance. Use of deprecated suites is expected to throw errors in implementations unless a useUnsafeCryptosuites option is used specifying exactly the unsafe cryptosuite to use. Use of experimental suites is expected to throw errors in implementations unless a useExperimentalCryptosuites option is used specifying exactly the experimental cryptosuite to use.

Modern cryptographic algorithms provide a number of tunable parameters and options to ensure that the algorithms can meet the varied requirements of different use cases. For example, embedded systems have limited processing and memory environments and might not have the resources to generate the strongest digital signatures for a given algorithm. Other environments, like financial trading systems, might only need to protect data for a day while the trade is occurring, while other environments might need to protect data for multiple decades. To meet these needs, cryptographic algorithm designers often provide multiple ways to configure a cryptographic algorithm.

Cryptographic library implementers often take the specifications created by cryptographic algorithm designers and specification authors and implement them such that all options are available to the application developers that use their libraries. This is often because the implementers do not know which combination of features a particular application developer might need for a given cryptographic deployment.

Application developers that use cryptographic libraries often do not have the requisite cryptographic expertise and knowledge necessary to appropriately select cryptographic parameters and options for a given application. This lack of expertise can lead to an inappropriate selection of cryptographic parameters and options for a particular application.

This specification sets the priority of constituencies to protect application developers over cryptographic library implementers over cryptographic specification authors over cryptographic algorithm designers. Given these priorities, the following recommendations are made:

Cryptographic algorithm designers are advised [RFC7696] to minimize the number of options and parameters to as few as possible to ensure that cryptographic library implementers have a more easily auditable security attack surface for their software libraries.
Cryptographic specification authors are advised to, if possible, further minimize the number of options and parameters to as few as possible to ensure cryptographic agility while also keeping the auditable security attack surface for downstream software libraries to a minimum.
Cryptographic library implementers are advised to, if possible, provide known good combinations of options and parameters to application developers. There would ideally be two pre-set default configurations for any algorithmic class, such as Elliptic Curve Digital Signatures, with no ability to fine tune parameters and options when using these pre-sets. Library options can be provided to experts to fine tune their use of the library, but use of those options by the general application developer population is to be discouraged.
Application developers are advised to choose from a number of pre-set cryptography library configurations and to avoid modifying cryptographic options and parameters, or using experimental or deprecated cryptography.

The guidance above is meant to ensure that useful cryptographic options and parameters are provided at the lower layers of the architecture while not exposing those options and parameters to application developers who may not fully understand the balancing benefits and drawbacks of each option.

Issue: Use of experimental and deprecated cryptography

The VCWG is seeking guidance on adding language to allow the use of experimental or deprecated cryptography. By default, those features will be disabled and will require the application developer to specifically allow use on a per-cryptographic suite basis. There will be requirements for all implementing libraries to throw errors or warnings when deprecated or experimental options are selected without the appropriate override flags.

Section 5.1 Versioning Cryptography Suites emphasized the importance of providing relatively easy to understand information concerning the timeliness of a particular cryptographic suite, while section 5.2 Protecting Application Developers further emphasized minimizing the number of options to be specified. Indeed, section 3. Cryptographic Suites lists requirements for cryptographic suites which include detailed specification of algorithm, transformation, hashing, and serialization. Hence, the name of the cryptographic suite does not need to include all this detail, which implies the parametric versioning mentioned in section 5.1 Versioning Cryptography Suites is neither necessary nor desirable.

The recommended naming convention for cryptographic suites is a string composed of a signature algorithm identifier, separated by a hyphen from an option identifier (if the cryptosuite supports incompatible implementation options), followed by a hyphen and designation of the approximate year that the suite was proposed.

For example, the [DI-EDDSA] is based on EdDSA digital signatures, supports two incompatible options based on canonicalization approaches, and was proposed in roughly the year 2022, so it would have two different cryptosuite names: eddsa-rdfc-2022 and eddsa-jcs-2022.

Although the [DI-ECDSA] is based on ECDSA digital signatures, supports the same two incompatible canonicalization approaches as [DI-EDDSA], and supports two different levels of security (128 bit and 192 bit) via two alternative sets of elliptic curves and hashes, it has only two cryptosuite names: ecdsa-rdfc-2019 and ecdsa-jcs-2019. The security level and corresponding curves and hashes are determined from the multi-key format of the public key used in validation.
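
The recommended naming convention can be illustrated with a small parser. This is a non-normative sketch: the function name and the regular expression are assumptions made for illustration, and the sketch assumes that each name component consists of lowercase letters and digits and that the year has four digits.

```javascript
// Hypothetical parser for names shaped like
// <signature algorithm>[-<option>]-<year>.
function parseCryptosuiteName(name) {
  const match = /^([a-z0-9]+)(?:-([a-z0-9]+))?-(\d{4})$/.exec(name);
  if(match === null) {
    throw new Error(`"${name}" does not follow the naming convention.`);
  }
  const [, algorithm, option, year] = match;
  return {
    algorithm,
    // the option component is only present when the suite supports
    // incompatible implementation options
    option: option === undefined ? null : option,
    year: Number(year)
  };
}

const withOption = parseCryptosuiteName('eddsa-rdfc-2022');
const withoutOption = parseCryptosuiteName('ecdsa-2022');
```

Here withOption holds the algorithm eddsa, the option rdfc, and the year 2022, while withoutOption has a null option. As the text above notes, details such as the security level are not carried in the name.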

Cryptographic agility is a practice by which one designs frequently connected information security systems to support switching between multiple cryptographic primitives and/or algorithms. The primary goal of cryptographic agility is to enable systems to rapidly adapt to new cryptographic primitives and algorithms without making disruptive changes to the systems' infrastructure. Thus, when a particular cryptographic primitive, such as the SHA-1 algorithm, is determined to be no longer safe to use, systems can be reconfigured to use a newer primitive via a simple configuration file change.

Cryptographic agility is most effective when the client and the server in the information security system are in regular contact. However, when the messages protected by a particular cryptographic algorithm are long-lived, as with Verifiable Credentials, and/or when the client (holder) might not be able to easily recontact the server (issuer), then cryptographic agility does not provide the desired protections.

Cryptographic layering is a practice where one designs rarely connected information security systems to employ multiple primitives and/or algorithms at the same time. The primary goal of cryptographic layering is to enable systems to survive the failure of one or more cryptographic algorithms or primitives without losing cryptographic protection on the payload. For example, digitally signing a single piece of information using RSA, ECDSA, and Falcon algorithms in parallel would provide a mechanism that could survive the failure of two of these three digital signature algorithms. When a particular cryptographic protection is compromised, such as an RSA digital signature using 768-bit keys, systems can still utilize the non-compromised cryptographic protections to continue to protect the information. Developers are urged to take advantage of this feature for all signed content that might need to be protected for a year or longer.

This specification provides for both forms of agility. It provides for cryptographic agility, which allows one to easily switch from one algorithm to another. It also provides for cryptographic layering, which allows one to simultaneously use multiple cryptographic algorithms, typically in parallel, such that any of those used to protect information can be used without reliance on or requirement of the others, while still keeping the digital proof format easy to use for developers.

At times, it is beneficial to transform the data being protected during the cryptographic protection process. Such "in-line" transformation can enable a particular type of cryptographic protection to be agnostic to the data format it is carried in. For example, some Data Integrity cryptographic suites utilize RDF Dataset Canonicalization [RDF-CANON] which transforms the initial representation into a canonical form [N-QUADS] that is then serialized, hashed, and digitally signed. As long as any syntax expressing the protected data can be transformed into this canonical form, the digital signature can be verified. This enables the same digital signature over the information to be expressed in JSON, CBOR, YAML, and other compatible syntaxes without having to create a cryptographic proof for every syntax.

Being able to express the same digital signature across a variety of syntaxes is beneficial because systems often have native data formats with which they operate. For example, some systems are written against JSON data, while others are written against CBOR data. Without transformation, systems that process their data internally as CBOR are required to store the digitally signed data structures as JSON (or vice-versa). This leads to double-storing data and can lead to increased security attack surface if the unsigned representation stored in databases accidentally deviates from the signed representation. By using transformations, the digital proof can live in the native data format to help prevent otherwise undetectable database drift over time.

This specification is designed to avoid requiring the duplication of signed information by utilizing "in-line" data transformations. Application developers are urged to work with cryptographically protected data in the native data format for their application and not separate storage of cryptographic proofs from the data being protected. Developers are also urged to regularly confirm that the cryptographically protected data has not been tampered with as it is written to and read from application storage.

Some transformations, such as RDF Dataset Canonicalization [RDF-CANON], have mitigations for input data sets that can be used by attackers to consume excessive processing cycles. This class of attack is called dataset poisoning, and all modern RDF Dataset canonicalizers are required to detect these sorts of bad inputs and halt processing. The test suites for RDF Dataset Canonicalization include such poisoned datasets to ensure that such mitigations exist in all conforming implementations. Generally speaking, cryptographic suite specifications that use transformations are required to mitigate these sorts of attacks, and implementers are urged to ensure that the software libraries that they use enforce these mitigations. These attacks are in the same general category as any resource starvation attack, such as HTTP clients that deliberately slow connections, thus starving connections on the server. Implementers are advised to consider these sorts of attacks when implementing defensive security strategies.
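One general shape such a mitigation can take is a work budget that halts processing once an input demands more effort than the caller allows. The sketch below is illustrative only: the `WorkBudget` class and the `count_matching_pairs` function are hypothetical, and the quadratic loop is merely a stand-in for an expensive transformation step; it does not reproduce how any particular RDF Dataset canonicalizer detects poisoned datasets.

```python
class WorkLimitExceeded(Exception):
    """Raised when a transformation spends more work than the caller allows."""


class WorkBudget:
    """Counts units of work and halts processing when a limit is passed."""

    def __init__(self, max_steps):
        self.max_steps = max_steps
        self.used = 0

    def spend(self, steps=1):
        self.used += steps
        if self.used > self.max_steps:
            raise WorkLimitExceeded(
                f"work limit of {self.max_steps} steps exceeded")


def count_matching_pairs(items, budget):
    """Hypothetical stand-in for a step whose cost grows quadratically
    with the size of the input."""
    matches = 0
    for left in items:
        for right in items:
            budget.spend()  # charge one unit of work per comparison
            if left == right:
                matches += 1
    return matches


# A small input stays within budget.
small_result = count_matching_pairs(["a", "b"], WorkBudget(max_steps=100))

# A larger input exhausts the same budget and is rejected.
try:
    count_matching_pairs(list(range(50)), WorkBudget(max_steps=100))
    large_input_halted = False
except WorkLimitExceeded:
    large_input_halted = True
```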

Issue: Collision-resistant canonicalization requirements

The VCWG is seeking feedback on normative language that cryptographic suite implementers need to follow to ensure that they do not utilize data transformation mechanisms that can map to the same output. That is, given different inputs for canonicalization scheme #1 and canonicalization scheme #2, they must not produce the same output value. As an analogy, this is the same requirement for cryptographic hashing mechanisms and is why those schemes are designed to be collision resistant. Cryptographic canonicalization mechanisms have the same requirement. At present, this isn't a problem because the three expected canonicalization schemes — the Universal RDF Dataset Canonicalization Algorithm 2015 [RDF-CANON], JSON Canonicalization Scheme [RFC8785], and a theoretical future base-encoding canonicalization — have entirely different outputs.

Issue: Avoiding the pitfalls of XML Canonicalization

The VCWG is seeking feedback on whether to explain why modern canonicalization schemes are simpler than the far more complex XML Canonicalization schemes of the early 2000s. Some readers seem to be under the impression that all canonicalization is difficult and has to be avoided at all costs (including costs to application developers). The WG would like to understand if it would be helpful to include a section explaining why some simpler data syntaxes (such as JSON) are easier to canonicalize than more complex data syntaxes (such as XML).

The inspectability of application data has effects on system efficiency and developer productivity. When cryptographically protected application data, such as base-encoded binary data, is not easily processed by application subsystems, such as databases, it increases the effort of working with the cryptographically protected information. For example, a cryptographically protected payload that can be natively stored and indexed by a database will result in a simpler system that:

benefits from utilizing existing industry-standard database features with no changes to the protected information,
avoids the complexity of duplicating data where one copy of the data preserves the message and digital signature, while the other copy only stores and indexes the message and is what drives system behaviour,
avoids the complexity of bespoke solutions that have to structurally modify the protected information, such as serializing and deserializing nested digitally signed data that has multiple nested base-encoded payloads.

Similarly, a cryptographically protected payload that can be processed by multiple upstream networked systems increases the ability to properly layer security architectures. For example, if upstream systems do not have to repeatedly decode the incoming payload, it increases the ability for a system to distribute processing load by specializing upstream subsystems to actively combat attacks. While a digital signature needs to always be checked before taking substantive action, other upstream checks can be performed on transparent payloads — such as identifier-based rate limiting, signature expiration checking, or nonce/challenge checking — to reject obviously bad requests.
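An upstream check of this kind might look like the sketch below. It assumes the secured document carries a transparent `proof` object with optional `expires` and `challenge` members; the `cheap_precheck` function and its return values are hypothetical. Such a check only rejects obviously bad requests early and is not a substitute for verifying the digital signature.

```python
from datetime import datetime, timezone


def cheap_precheck(secured_doc, expected_challenge, now=None):
    """Return a rejection reason, or None if the request may proceed
    to full proof verification. No cryptography is performed here."""
    now = now or datetime.now(timezone.utc)
    proof = secured_doc.get("proof")
    if not isinstance(proof, dict):
        return "missing proof"

    expires = proof.get("expires")
    if expires is not None:
        try:
            # Accept a trailing "Z" as UTC for older Python versions.
            expiry = datetime.fromisoformat(expires.replace("Z", "+00:00"))
        except (AttributeError, ValueError):
            return "malformed expires"
        if expiry.tzinfo is None:
            return "malformed expires"
        if expiry <= now:
            return "proof expired"

    if proof.get("challenge") != expected_challenge:
        return "challenge mismatch"
    return None


example_now = datetime(2024, 1, 1, tzinfo=timezone.utc)
expired_doc = {"proof": {"expires": "2023-12-31T00:00:00Z", "challenge": "abc"}}
fresh_doc = {"proof": {"expires": "2024-06-01T00:00:00Z", "challenge": "abc"}}

expired_reason = cheap_precheck(expired_doc, "abc", example_now)
fresh_reason = cheap_precheck(fresh_doc, "abc", example_now)
```

Because the payload is transparent, a gateway can run this check without base-decoding anything, and pass only the surviving requests to the subsystem that performs full proof verification.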

Additionally, if a developer is not able to easily view data in a system, the ability to easily audit or debug system correctness is hampered. For example, requiring application developers to cut-and-paste base-encoded application data makes development more challenging and increases the chances that obvious bugs will be missed because every message needs to go through a manually operated base-decoding tool.

There are times, however, where the correct design decision is to make data opaque. Data that does not need to be processed by other application subsystems, as well as data that does not need to be modified or accessed by an application developer, can be serialized into opaque formats. Examples include digital signature values, cryptographic key parameters, and other data fields that only need to be accessed by a cryptographic library and need not be modified by the application developer. There are also examples where data opacity is appropriate when the underlying subsystem does not expose the application developer to the underlying complexity of the opaque data, such as databases that perform encryption at rest. In these cases, the application developer continues to develop against transparent application data formats while the database manages the complexity of encrypting and decrypting the application data to and from long-term storage.

This specification strives to provide an architecture where application data remains in its native format and is not made opaque, while other cryptographic data, such as digital signatures, are kept in their opaque binary encoded form. Cryptographic suite implementers are urged to consider appropriate use of data opacity when designing their suites, and to weigh the design trade-offs when making application data opaque versus providing access to cryptographic data at the application layer.

Issue

Implementers must ensure that a verification method is bound to a particular controller by going from the verification method to the controller document, and then ensuring that the controller document also contains the verification method.
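The two-way check described above can be sketched as follows. This is a minimal illustration, not the normative algorithm: the `is_bound_to_controller` function, the `load_document` callback, and the example URLs are hypothetical, and the sketch assumes that a controller document "contains" a verification method when the method appears, embedded or by reference, under `verificationMethod` or one of the verification relationships.

```python
RELATIONSHIPS = ("authentication", "assertionMethod", "keyAgreement",
                 "capabilityInvocation", "capabilityDelegation")


def _as_list(value):
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def is_bound_to_controller(verification_method, load_document):
    """Follow the verification method to its controller document, then
    confirm that the controller document lists the method in return."""
    vm_id = verification_method.get("id")
    controller_id = verification_method.get("controller")
    if not vm_id or not controller_id:
        return False

    controller_doc = load_document(controller_id)
    if not controller_doc or controller_doc.get("id") != controller_id:
        return False

    for prop in ("verificationMethod",) + RELATIONSHIPS:
        for entry in _as_list(controller_doc.get(prop)):
            entry_id = entry if isinstance(entry, str) else entry.get("id")
            if entry_id == vm_id:
                return True
    return False


# Hypothetical example data.
ALICE = "https://controller.example/alice"
genuine_vm = {"id": ALICE + "#key-1", "type": "Multikey", "controller": ALICE}
rogue_vm = {"id": "https://attacker.example/#key-9", "controller": ALICE}
documents = {ALICE: {"id": ALICE, "verificationMethod": [genuine_vm]}}

genuine_is_bound = is_bound_to_controller(genuine_vm, documents.get)
rogue_is_bound = is_bound_to_controller(rogue_vm, documents.get)
```

The rogue method claims Alice as its controller, but Alice's controller document does not list it, so the binding check fails.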

When an implementation is verifying a proof, it is imperative that it verify not only that the verification method used to generate the proof is listed in the controller document, but also that it was intended to be used to generate the proof that is being verified. This process is known as "verification relationship validation".

The process for verification relationship validation is outlined in Section 4.7 Retrieve Verification Method.

This process is used to ensure that cryptographic material, such as a private cryptographic key, is not misused by an application for an unintended purpose. An example of cryptographic material misuse would be if a private cryptographic key meant to be used to issue a Verifiable Credential was instead used to log into a website (that is, for authentication). Not checking a verification relationship is dangerous because the restriction and protection profile for some cryptographic material could be determined by its intended use. For example, some applications could be trusted to use cryptographic material for only one purpose, or some cryptographic material could be more protected, such as through storage in a hardware security module in a data center versus as an unencrypted file on a laptop.
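A simplified form of verification relationship validation is sketched below. It is illustrative only and does not replace the algorithm in Section 4.7 Retrieve Verification Method: the `is_authorized_for_purpose` function and the example identifiers are hypothetical, and the sketch assumes the proof purpose names the verification relationship property to look up in the controller document.

```python
KNOWN_RELATIONSHIPS = ("authentication", "assertionMethod", "keyAgreement",
                       "capabilityInvocation", "capabilityDelegation")


def is_authorized_for_purpose(controller_doc, vm_id, proof_purpose):
    """Return True only if the verification method is listed under the
    verification relationship that matches the proof purpose."""
    if proof_purpose not in KNOWN_RELATIONSHIPS:
        return False
    entries = controller_doc.get(proof_purpose, [])
    if not isinstance(entries, list):
        entries = [entries]
    for entry in entries:
        # Entries can be references (strings) or embedded methods (objects).
        entry_id = entry if isinstance(entry, str) else entry.get("id")
        if entry_id == vm_id:
            return True
    return False


# Hypothetical controller document: key-1 issues credentials,
# key-2 is used for authentication.
example_controller = {
    "id": "https://controller.example/alice",
    "assertionMethod": ["https://controller.example/alice#key-1"],
    "authentication": ["https://controller.example/alice#key-2"],
}
KEY_1 = "https://controller.example/alice#key-1"

may_assert = is_authorized_for_purpose(example_controller, KEY_1, "assertionMethod")
may_authenticate = is_authorized_for_purpose(example_controller, KEY_1, "authentication")
```

Here the issuing key is accepted for `assertionMethod` but rejected for `authentication`, which is the misuse described above.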

When an implementation is verifying a proof, it is imperative that it verify that the proof purpose matches the intended use.

This process is used to ensure that proofs are not misused by an application for an unintended purpose, as this is dangerous for the proof creator. An example of misuse would be if a proof that stated its purpose was for securing assertions in verifiable credentials was instead used for authentication to log into a website. In this case, the proof creator attached proofs to any number of verifiable credentials that they expected to be distributed to an unbounded number of other parties. Any one of these parties could log into a website as the proof creator if the website erroneously accepted such a proof as authentication instead of its intended purpose.
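The login scenario above can be sketched as a check that the verifier performs against the purpose it expects, rather than accepting whatever purpose the proof declares. The `require_proof_purpose` and `login_accepts` names are hypothetical, and the sketch omits every other verification step.

```python
class ProofPurposeMismatch(Exception):
    """Raised when a proof is presented for a purpose it was not made for."""


def require_proof_purpose(proof, expected_purpose):
    """Compare the proof's declared purpose with the purpose the verifier
    expects for the operation being performed."""
    actual = proof.get("proofPurpose")
    if actual != expected_purpose:
        raise ProofPurposeMismatch(
            f"expected {expected_purpose!r}, got {actual!r}")


def login_accepts(proof):
    """A website login flow expects proofs created for authentication."""
    try:
        require_proof_purpose(proof, "authentication")
    except ProofPurposeMismatch:
        return False
    return True


# A proof taken from a widely distributed verifiable credential.
credential_proof = {"type": "DataIntegrityProof",
                    "proofPurpose": "assertionMethod"}
# A proof created specifically for logging in.
login_proof = {"type": "DataIntegrityProof",
               "proofPurpose": "authentication"}

credential_proof_accepted = login_accepts(credential_proof)
login_proof_accepted = login_accepts(login_proof)
```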

The way in which a transformation, such as canonicalization, is performed can affect the security characteristics of a system. Selecting the best canonicalization mechanisms depends on the use case. Often, the simplest mechanism that satisfies the desired security requirements is the best choice. This section attempts to provide simple guidance to help implementers choose between the two main canonicalization mechanisms referred to in this specification, namely JSON Canonicalization Scheme [RFC8785] and RDF Dataset Canonicalization [RDF-CANON].

If an application only uses JSON and does not depend on any form of RDF semantics, then using a cryptography suite that uses JSON Canonicalization Scheme [RFC8785] is an attractive approach.

If an application uses JSON-LD and needs to secure the semantics of the document, then using a cryptography suite that uses RDF Dataset Canonicalization [RDF-CANON] is an attractive approach.

Implementers are also advised that other mechanisms that perform no transformations are available, that secure the data by wrapping it in a cryptographic envelope instead of embedding the proof in the data, such as JWTs [RFC7519] and CWTs [RFC8392]. These approaches have simplicity advantages in some use cases, at the expense of some of the benefits provided by the approach detailed in this specification.

One of the algorithmic processes used by this specification is canonicalization, which is a type of transformation. Canonicalization is the process of taking information that might be expressed in a variety of semantically equivalent ways as input, and expressing all output in a single way, called a "canonical form".

The security of a resulting data integrity proof that utilizes canonicalization is highly dependent on the correctness of the algorithm. For example, if a canonicalization algorithm converts two inputs that have different meanings into the same output, then the author's intentions can be misrepresented to a verifier. This can be used as an attack vector by adversaries.

Additionally, if semantically relevant information in an input is not present in the output, then an attacker could insert such information into a message without causing proof verification to fail. This is similar to another transformation that is commonly used when cryptographically signing messages: cryptographic hashing. If an attacker is able to produce the same cryptographic hash from a different input, then the cryptographic hash algorithm is not considered secure.
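The second failure mode can be demonstrated with a deliberately flawed canonicalizer. The sketch below is a contrived illustration: `flawed_canonicalize` (which silently drops members it does not recognize) and `stricter_canonicalize` are hypothetical functions, and neither is a vetted canonicalization algorithm.

```python
import hashlib
import json

KNOWN_KEYS = ("name", "role")


def flawed_canonicalize(data):
    """Deliberately broken: members outside KNOWN_KEYS are dropped, so
    semantically relevant information never reaches the hash."""
    kept = {key: data[key] for key in KNOWN_KEYS if key in data}
    return json.dumps(kept, sort_keys=True, separators=(",", ":")).encode("utf-8")


def stricter_canonicalize(data):
    """Keeps every member. Still only a simplified stand-in, not a
    conforming RFC 8785 implementation."""
    return json.dumps(data, sort_keys=True, separators=(",", ":")).encode("utf-8")


def digest(data, canonicalize):
    return hashlib.sha256(canonicalize(data)).hexdigest()


signed_message = {"name": "Alice", "role": "viewer"}
tampered_message = {"name": "Alice", "role": "viewer", "admin": True}

# With the flawed canonicalizer, the inserted member goes unnoticed.
undetected = (digest(signed_message, flawed_canonicalize)
              == digest(tampered_message, flawed_canonicalize))

# With a canonicalizer that preserves all members, the hashes differ.
detected = (digest(signed_message, stricter_canonicalize)
            != digest(tampered_message, stricter_canonicalize))
```

A proof created over `signed_message` with the flawed canonicalizer would still verify against `tampered_message`, even though the attacker has added a member that changes its meaning.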

Implementers are strongly urged to ensure proper vetting of any canonicalization algorithms to be used for transformation of input to a hashing process. Proper vetting includes, at a minimum, association with a peer-reviewed mathematical proof of algorithm correctness; multiple implementations and vetting by experts in a standards-setting organization are preferred. Implementers are strongly urged not to invent or use new mechanisms unless they have formal training in information canonicalization and/or access to experts in the field who are capable of producing a peer-reviewed mathematical proof of algorithm correctness.

This specification is designed in such a way that no network requests are required when verifying a proof on a conforming secured document. Readers might note, however, that JSON-LD contexts and verification methods can contain URLs that might be retrieved over a network connection. This concern exists for any URL that might be loaded from the network during or after verification.

To the extent possible, implementers are urged to permanently or aggressively cache such information to reduce the attack surface on an implementation that might need to fetch such URLs over the network. For example, caching techniques for JSON-LD contexts are described in Section 2.9 Contexts and Vocabularies, and some verification methods, such as did:key [DID-KEY], do not need to be fetched from the network at all.
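Permanent caching can be sketched as a loader that serves only bundled resources and never touches the network. The names below (`make_static_loader`, `ContextNotCached`, `ContextHashMismatch`) and the example URL and content are hypothetical. In a real deployment the expected hash would be the one published for the context file; here it is computed from the bundled bytes only to keep the sketch self-contained.

```python
import hashlib


class ContextNotCached(Exception):
    """Raised instead of falling back to a network fetch."""


class ContextHashMismatch(Exception):
    """Raised when cached content does not match its expected hash."""


def make_static_loader(cache):
    """Build a loader over a mapping of URL -> (content bytes, expected
    SHA-256 hex digest). Unknown URLs are refused, never fetched."""
    def load(url):
        if url not in cache:
            raise ContextNotCached(url)
        content, expected_sha256 = cache[url]
        if hashlib.sha256(content).hexdigest() != expected_sha256:
            raise ContextHashMismatch(url)
        return content
    return load


# Hypothetical bundled context file.
EXAMPLE_URL = "https://context.example/v1"
EXAMPLE_BYTES = b'{"@context": {"name": "https://vocab.example/name"}}'
# Assumption: stands in for a hash published by the context author.
EXAMPLE_SHA256 = hashlib.sha256(EXAMPLE_BYTES).hexdigest()

load_context = make_static_loader({EXAMPLE_URL: (EXAMPLE_BYTES, EXAMPLE_SHA256)})
loaded = load_context(EXAMPLE_URL)
```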

When it is not possible to use cached information, such as when a specific HTTP URL-based instance of a verification method is encountered for the first time, implementers are cautioned to use defensive measures to mitigate denial-of-service attacks during any process that might fetch a resource from the network.
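Two common defensive measures are a timeout and a cap on response size. The sketch below assumes illustrative limits (`MAX_BYTES`, `TIMEOUT_SECONDS`) and hypothetical helper names; it is not a complete defense, and real deployments would likely add further controls such as rate limiting and restrictions on which hosts can be contacted.

```python
import io
import urllib.request

MAX_BYTES = 64 * 1024      # illustrative cap, an assumption
TIMEOUT_SECONDS = 5        # illustrative timeout, an assumption


class ResourceTooLarge(Exception):
    """Raised when a response exceeds the allowed size."""


def read_limited(stream, max_bytes=MAX_BYTES):
    """Read from a file-like object, refusing to buffer more than
    max_bytes regardless of how much the remote side sends."""
    chunks = []
    total = 0
    while True:
        chunk = stream.read(min(8192, max_bytes + 1 - total))
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            raise ResourceTooLarge(f"response exceeds {max_bytes} bytes")
        chunks.append(chunk)
    return b"".join(chunks)


def fetch_defensively(url):
    """Fetch a resource with a scheme check, a timeout, and a size cap.
    The timeout applies to individual blocking socket operations, not
    to the transfer as a whole."""
    if not url.startswith("https://"):
        raise ValueError("only https URLs are fetched")
    with urllib.request.urlopen(url, timeout=TIMEOUT_SECONDS) as response:
        return read_limited(response)


# The size cap can be exercised without a network connection.
small_body = read_limited(io.BytesIO(b"abc"), max_bytes=10)
```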

Since the technology to secure documents described by this specification is generalized in nature, the security implications of its use might not be immediately apparent to readers. To understand the sort of security concerns one might need to consider in a complete software system, implementers are urged to read about how this technology is used in the verifiable credentials ecosystem [VC-DATA-MODEL-2.0]; see the section on Verifiable Credential Security Considerations for more information.
