Trace Context

W3C Candidate Recommendation

This version:
https://www.w3.org/TR/2019/CR-trace-context-20190509/
Latest published version:
https://www.w3.org/TR/trace-context/
Latest editor's draft:
https://w3c.github.io/trace-context/
Implementation report:
https://github.com/w3c/trace-context/#reference-implementations
Previous version:
https://www.w3.org/TR/2018/WD-trace-context-20181122/
Editors:
Sergey Kanzhelev (Microsoft)
Morgan McLean (Google)
Alois Reitbauer (Dynatrace)
Bogdan Drutu (Google)
Nik Molnar (Microsoft)
Yuri Shkuro (Invited Expert)
Participate:
GitHub w3c/trace-context
File a bug
Commit history
Pull requests
Discussions:
We are on Gitter.

Abstract

This specification defines standard HTTP headers and a value format to propagate context information that enables distributed tracing scenarios. The specification standardizes how context information is sent and modified between services. Context information uniquely identifies individual requests in a distributed system and also defines a means to add and propagate provider-specific context information.

Status of This Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.

This specification is in the Candidate Recommendation stage. It was widely reviewed and discussed. It satisfies the technical requirements of the Distributed Tracing Working Group. There are a few implementations of this specification available. We are gathering implementation experience and usage feedback. We recommend the wide deployment and use of this recommendation.

This document was published by the Distributed Tracing Working Group as a Candidate Recommendation. This document is intended to become a W3C Recommendation.

GitHub Issues are preferred for discussion of this specification. Alternatively, you can send comments to our mailing list. Please send them to public-trace-context@w3.org (archives) with trace-context at the start of your email's subject.

W3C publishes a Candidate Recommendation to indicate that the document is believed to be stable and to encourage implementation by the developer community. This Candidate Recommendation is expected to advance to Proposed Recommendation no earlier than 08 September 2019.

Please see the Working Group's implementation report.

Publication as a Candidate Recommendation does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

This document is governed by the 1 March 2019 W3C Process Document.

1. Conformance

As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

The key words MAY, MUST, MUST NOT, SHOULD, and SHOULD NOT are to be interpreted as described in [RFC2119].

2. Overview

2.1 Problem Statement

Distributed tracing is a methodology implemented by tracing tools to follow, analyze and debug a transaction across multiple software components. Typically, a distributed trace traverses more than one component, which requires it to be uniquely identifiable across all participating systems. Trace context propagation passes along this unique identification.

Today, trace context propagation is implemented individually by each tracing vendor. In multi-vendor environments, this causes interoperability problems, like:

In the past, these problems did not have a significant impact as most applications were monitored by a single tracing vendor and stayed within the boundaries of a single platform provider. Today, an increasing number of applications are highly distributed and leverage multiple middleware services and cloud platforms.

This transformation of modern applications calls for a distributed tracing context propagation standard.

2.2 Solution

The trace context specification defines a universally agreed-upon format for the exchange of trace context propagation data - referred to as trace context. Trace context solves the problems described above by

A unified approach for propagating trace data improves visibility into the behavior of distributed applications, facilitating problem and performance analysis. The interoperability provided by trace-context is a prerequisite to manage modern micro-service based applications.

2.3 Design overview

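Trace context is split into two individual propagation fields supporting interoperability and vendor-specific extensibility:

  • traceparent identifies the incoming request in a tracing system in a common format (see section 3.2).
  • tracestate carries vendor-specific data as a list of key/value pairs (see section 3.3).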

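Tracing tools can provide two levels of compliant behavior interacting with trace context:

  • At minimum, they propagate the traceparent and tracestate headers to outgoing requests without modifying them.
  • They may also mutate the traceparent header and their own entry in the tracestate header, within the mutations allowed by sections 3.4 and 3.5.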

A tracing tool can choose to change this behavior for each individual request to a component it is monitoring.

3. Trace context HTTP headers format

This section describes the binding of the distributed trace context to the traceparent and tracestate HTTP headers.

3.1 Relationship between the headers

The traceparent header represents the incoming request in a tracing system in a common format. The tracestate header includes the parent in a potentially vendor-specific format.

For example, a client traced in the congo system adds the following headers to an outbound HTTP request.

traceparent: 00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01
tracestate: congo=t61rcWkgMzE

Note: In this case, the value t61rcWkgMzE is the result of Base64 encoding the parent ID (b7ad6b7169203331), though such manipulations are not required in tracestate.

If the receiving server is traced in the rojo tracing system, it carries over the state it received and adds a new entry with the position in its trace.

traceparent: 00-0af7651916cd43dd8448eb211c80319c-00f067aa0ba902b7-01
tracestate: rojo=00f067aa0ba902b7,congo=t61rcWkgMzE

You'll notice that the rojo system reuses the parent-id value of traceparent in its entry in tracestate. This means it is a generic tracing system. Otherwise, tracestate entries are opaque.

If the receiving server of the above is congo again, it continues from its last position, overwriting its entry with one representing the new parent.

traceparent: 00-0af7651916cd43dd8448eb211c80319c-b9c7c989f97918e1-01
tracestate: congo=ucfJifl5GOE,rojo=00f067aa0ba902b7

Note that ucfJifl5GOE is the Base64-encoded parent ID b9c7c989f97918e1.

Notice that when congo wrote its traceparent entry, it reused the last trace ID, which helps consistency for those doing correlation. However, the value of its tracestate entry is opaque and different. This is ok.

Finally, you'll see tracestate retains an entry for rojo exactly as it was, except pushed to the right. The left-most position lets the next server know which tracing system corresponds with traceparent. In this case, since congo wrote traceparent, its tracestate entry should be left-most.
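The congo values in this walkthrough can be reproduced with a few lines of Java. This is only a sketch: the class name CongoStateEncoder and its methods are hypothetical, and the encoding itself is a choice of the fictional congo vendor, not something this specification requires.

```java
import java.util.Base64;

final class CongoStateEncoder {
    // Converts a lowercase hex string such as "b7ad6b7169203331" into raw bytes.
    static byte[] hexToBytes(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // Base64-encodes the parent-id bytes without '=' padding, because '=' is
    // not an allowed character in a tracestate value.
    static String encodeParentId(String parentIdHex) {
        return Base64.getEncoder().withoutPadding().encodeToString(hexToBytes(parentIdHex));
    }
}
```

Calling CongoStateEncoder.encodeParentId("b7ad6b7169203331") yields t61rcWkgMzE, and the parent ID b9c7c989f97918e1 yields ucfJifl5GOE.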

3.2 Traceparent field

Field traceparent identifies the request in a tracing system.

3.2.1 Header name

In order to increase interoperability across multiple protocols and encourage successful integration by default, it is recommended to keep the header name lower case. The header name is a single word without any delimiters such as a hyphen (-).

Header name: traceparent

Platforms and libraries MUST expect header name in any casing and SHOULD send header name in lower case.

3.2.2 Field value

This section uses the Augmented Backus-Naur Form (ABNF) notation of [RFC5234], including the DIGIT rule from that document. DIGIT rule defines a single number character 0-9.

HEXDIGLC = DIGIT / "a" / "b" / "c" / "d" / "e" / "f" ; lower case hex character
value           = version "-" version-format
version         = 2HEXDIGLC   ; this document assumes version 00. Version 255 is forbidden

The value is US-ASCII encoded (which is UTF-8 compliant). Character - is used as a delimiter between fields.

Version (version) is 1 byte representing an 8-bit unsigned integer. Version 255 (ff) is invalid. The current specification assumes the version is set to 00.

The following version-format definition is used for version 00.

version-format   = trace-id "-" parent-id "-" trace-flags
trace-id         = 32HEXDIGLC  ; 16 bytes array identifier. All zeroes forbidden
parent-id        = 16HEXDIGLC  ; 8 bytes array identifier. All zeroes forbidden
trace-flags      = 2HEXDIGLC   ; 8 bit flags. Currently only one bit is used. See below for details

3.2.2.1 Trace-id

This is the ID of the whole trace forest. It is represented as a 16-byte array, for example, 4bf92f3577b34da6a3ce929d0e0e4736. All bytes zero (00000000000000000000000000000000) is considered an invalid value.

Trace-id is used to uniquely identify a distributed trace, so implementations should generate globally unique values. Many unique identifier generation algorithms are based on some constant part (time or host based) and a random value. There are systems that make random sampling decisions based on the value of trace-id. So, to increase interoperability, it is recommended to keep the random part on the right side of the trace-id value.

When a system operates with a shorter trace-id, it is recommended to fill in the extra bytes with random values rather than zeroes. Let's say the system works with an 8-byte trace-id like a3ce929d0e0e4736. Instead of setting the trace-id value to 0000000000000000a3ce929d0e0e4736, it is recommended to generate a value like 4bf92f3577b34da6a3ce929d0e0e4736, where 4bf92f3577b34da6 is a random value or a function of time and host value. Note that even though a system may operate with a shorter trace-id for distributed trace reporting, the full trace-id should be propagated to conform to the specification.
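As a rough illustration of this recommendation, the following Java sketch extends a short identifier to a full trace-id. It assumes the system's own identifier is exactly 8 bytes (16 lowercase hex characters); the class name TraceIdPadding and the use of SecureRandom are choices made for this example, not requirements of the specification.

```java
import java.security.SecureRandom;

final class TraceIdPadding {
    private static final SecureRandom RANDOM = new SecureRandom();
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    // Prepends 8 random bytes to a 16-character hex identifier so that the
    // system's own identifier stays on the right-hand side of the trace-id.
    static String toFullTraceId(String shortTraceIdHex) {
        if (shortTraceIdHex.length() != 16) {
            throw new IllegalArgumentException("expected 16 hex characters");
        }
        byte[] prefix = new byte[8];
        RANDOM.nextBytes(prefix);
        StringBuilder sb = new StringBuilder(32);
        for (byte b : prefix) {
            sb.append(HEX[(b >> 4) & 0x0f]).append(HEX[b & 0x0f]);
        }
        return sb.append(shortTraceIdHex).toString();
    }
}
```

The result is always 32 hex characters long and ends with the identifier that was passed in, so the reporting system can still recover its own shorter value.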

Implementations MUST ignore the traceparent when the trace-id is invalid, for instance, if it contains non-allowed characters.

3.2.2.2 Parent-id

This is the ID of this call as known by the caller. It is also known as span-id, as a few telemetry systems call the execution of a client call a span. It is represented as an 8-byte array, for example, 00f067aa0ba902b7. All bytes zero (0000000000000000) is considered an invalid value.

Implementations MUST ignore the traceparent when the parent-id is invalid, for instance, if it contains characters that are not lower case hex.
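The version 00 grammar and the validity rules for trace-id and parent-id can be checked together. The Java sketch below is one possible approach, not a reference implementation; the class name TraceParentV00 and the convention of returning null for a header that must be ignored are assumptions of this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TraceParentV00 {
    // version "-" trace-id "-" parent-id "-" trace-flags, lower case hex only.
    private static final Pattern FORMAT =
        Pattern.compile("00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})");

    final String traceId;
    final String parentId;
    final int flags;

    private TraceParentV00(String traceId, String parentId, int flags) {
        this.traceId = traceId;
        this.parentId = parentId;
        this.flags = flags;
    }

    // Returns the parsed fields, or null when the header must be ignored.
    static TraceParentV00 parse(String value) {
        if (value == null) {
            return null;
        }
        Matcher m = FORMAT.matcher(value);
        if (!m.matches()) {
            return null; // wrong length, upper case hex, or other characters
        }
        String traceId = m.group(1);
        String parentId = m.group(2);
        if (traceId.matches("0+") || parentId.matches("0+")) {
            return null; // all zeroes is forbidden for both identifiers
        }
        return new TraceParentV00(traceId, parentId, Integer.parseInt(m.group(3), 16));
    }
}
```

A caller would typically start a new trace when parse returns null, and otherwise continue with the returned traceId.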

3.2.3 Trace-flags

An 8-bit field that controls tracing flags such as sampling, trace level etc. These flags are recommendations given by the caller rather than strict rules to follow for three reasons:

  1. Trust and abuse
  2. Bug in caller
  3. Different load between caller service and callee service might force callee to down sample.

You can find more in the Security Considerations section of this specification.

Like other fields, trace-flags is hex-encoded. For example, all 8 flags set would be ff and no flags set would be 00.

As this is a bit field, you cannot interpret flags by decoding the hex value and looking at the resulting number. For example, a flag 00000001 could be encoded as 01 in hex, or 09 in hex if present with the flag 00001000. A common mistake in bit fields is forgetting to mask when interpreting flags.

Here is an example of properly handling trace flags:

static final byte FLAG_RECORDED = 1; // 00000001
...
// Mask first so that other bits set in traceFlags do not affect the result.
boolean recorded = (traceFlags & FLAG_RECORDED) == FLAG_RECORDED;

The current version of this specification supports only a single flag, called recorded.

3.2.3.1 Recorded Flag (00000001)

When set, the least significant bit documents that the caller may have recorded trace data. A caller who does not record trace data out-of-band leaves this flag unset.

Many distributed tracing scenarios may be broken when only a subset of the calls that participated in a distributed trace were recorded. At a certain load, recording information about every incoming and outgoing request becomes prohibitively expensive. Making a random or component-specific decision for data collection will lead to fragmented data in every distributed trace. Thus it is typical for tracing vendors and platforms to pass along the recording decision for a given distributed trace, or the information needed to make this decision.

There is no consensus on the best algorithm for making a recording decision. Various techniques include: probability sampling (sample 1 out of 100 distributed traces by flipping a coin), delayed decision (make the collection decision based on the duration or the result of a call), and deferred sampling (let the callee decide whether information about this request needs to be collected). There are variations and customizations of every technique, which can be tracing vendor specific or application defined.

The tracestate field is designed to handle the variety of techniques for making a recording decision (along with any other information) specific to a given tracing system or platform. The recorded flag is introduced for better interoperability between vendors. It allows vendors to communicate the recording decision and enables a better experience for the customer.

For example, when a SaaS service participates in a distributed trace, this service has no knowledge of the tracing system used by its caller. But this service may produce records of incoming requests for monitoring or troubleshooting purposes. The recorded flag can be used to ensure that information about requests that were marked for recording by the caller will also be recorded by the SaaS service, so the caller can troubleshoot the behavior of every recorded request.

The recorded flag has no restriction on its mutations except that it can only be mutated when parent-id was updated. See section "Mutating the traceparent field". However, there is a set of suggestions that will increase vendor interoperability.

  1. If a component made a definitive recording decision, this decision SHOULD be reflected in the recorded flag.
  2. If a component needs to make a recording decision, it SHOULD respect the recorded flag value. Security considerations should be applied to protect from abusive or malicious use of this flag; see the Security Considerations section.
  3. If a component deferred or delayed the decision and only a subset of telemetry will be recorded, the recorded flag should be propagated unchanged, and it should be set to 0 as the default option when the trace is initiated by this component. There are two additional options:
    1. A component that makes a deferred or delayed recording decision may communicate the priority of recording by setting the recorded flag to 1 for a subset of requests.
    2. A component may also fall back to probability sampling to set the recorded flag to 1 for a subset of requests.

3.2.3.2 Other Flags

The behavior of other flags, such as 00000100, is not defined and is reserved for future use. Implementations MUST set those to zero.

3.2.4 Examples of HTTP headers

Valid traceparent when caller recorded this request:

Value = 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
base16(version) = 00
base16(trace-id) = 4bf92f3577b34da6a3ce929d0e0e4736
base16(parent-id) = 00f067aa0ba902b7
base16(trace-flags) = 01  // recorded

Valid traceparent when caller has not recorded this request:

Value = 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-00
base16(version) = 00
base16(trace-id) = 4bf92f3577b34da6a3ce929d0e0e4736
base16(parent-id) = 00f067aa0ba902b7
base16(trace-flags) = 00  // not recorded
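Both example values can be assembled from their parts. The Java sketch below shows one way to do so, assuming the identifiers are already valid lower case hex strings; the class name TraceParentFormatter and its method signature are hypothetical.

```java
final class TraceParentFormatter {
    static final int FLAG_RECORDED = 0x01; // 00000001

    // Builds a version 00 traceparent value. Only the recorded bit is ever
    // written; every other flag bit is reserved and therefore left at zero.
    static String format(String traceIdHex, String parentIdHex, boolean recorded) {
        int flags = recorded ? FLAG_RECORDED : 0;
        return String.format("00-%s-%s-%02x", traceIdHex, parentIdHex, flags);
    }
}
```

With the identifiers from the examples, passing true for recorded produces the first value (ending in 01) and passing false produces the second (ending in 00).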

3.2.5 Versioning of traceparent

This specification is opinionated about future versions of trace context. The current version of this specification assumes that future versions of the traceparent header will be additive to the current one.

Implementations should follow these rules when parsing headers with an unexpected format:

  1. Pass-thru services should not analyze the version. A pass-thru service needs to expect that headers may have bigger size limits in the future and should only disallow prohibitively large headers.
  2. When the version prefix cannot be parsed (it is not 2 hex characters followed by a dash (-)), the implementation should restart the trace.
  3. If a higher version is detected, the implementation SHOULD try to parse it.
    1. If the header is shorter than 55 characters, the implementation should not parse the header and should restart the trace.
    2. Try to parse trace-id: the 32 characters that follow the first dash. Implementations MUST check that these 32 characters are hex and that they are followed by a dash.
    3. Try to parse parent-id: the 16 characters that follow the second dash, which is at zero-based position 35. Implementations MUST check that these 16 characters are hex and that they are followed by a dash.
    4. Try to parse the recorded bit of trace-flags: the 2 characters that follow the third dash, followed by either the end of the string or a dash. If all three values were parsed successfully, the implementation should use them. Implementations MUST NOT parse or assume anything about any fields unknown for this version. Implementations MUST use these fields to construct the new traceparent field according to the highest version of the specification known to the implementation (in this specification it is 00).
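The parsing rules for higher versions translate fairly directly into index checks. The following Java sketch is one possible reading of them; the class name LenientTraceParentParser, the null return value meaning "restart the trace", and the decision to accept only lower case hex are assumptions of this example.

```java
final class LenientTraceParentParser {
    // Returns {trace-id, parent-id, trace-flags}, or null when the trace
    // should be restarted.
    static String[] parse(String header) {
        if (header == null || header.length() < 55) {
            return null; // shorter than a version 00 value
        }
        if (!isLowerHex(header, 0, 2) || header.charAt(2) != '-') {
            return null; // version prefix cannot be parsed
        }
        if (header.startsWith("ff")) {
            return null; // version 255 is invalid
        }
        if (header.startsWith("00") && header.length() != 55) {
            return null; // version 00 has a fixed length
        }
        if (!isLowerHex(header, 3, 35) || header.charAt(35) != '-') {
            return null; // trace-id: 32 hex characters, then a dash
        }
        if (!isLowerHex(header, 36, 52) || header.charAt(52) != '-') {
            return null; // parent-id: 16 hex characters, then a dash
        }
        if (!isLowerHex(header, 53, 55)) {
            return null; // trace-flags: 2 hex characters
        }
        if (header.length() > 55 && header.charAt(55) != '-') {
            return null; // flags must be followed by end of string or a dash
        }
        String traceId = header.substring(3, 35);
        String parentId = header.substring(36, 52);
        if (traceId.matches("0+") || parentId.matches("0+")) {
            return null; // all zeroes is forbidden
        }
        // Anything after position 55 belongs to fields unknown to this version
        // and is deliberately not inspected.
        return new String[] { traceId, parentId, header.substring(53, 55) };
    }

    private static boolean isLowerHex(String s, int from, int to) {
        for (int i = from; i < to; i++) {
            char c = s.charAt(i);
            if (!((c >= '0' && c <= '9') || (c >= 'a' && c <= 'f'))) {
                return false;
            }
        }
        return true;
    }
}
```

The three returned values are then used to build an outgoing version 00 traceparent; the trailing part of a higher-version header is never copied.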

3.3 Tracestate field

The tracestate HTTP header field conveys information about the request's position in multiple distributed tracing graphs. This header is a companion header for the traceparent. If a library or platform failed to parse traceparent, it MUST NOT attempt to parse the tracestate. Note that the opposite is not true: failure to parse tracestate MUST NOT affect the parsing of traceparent.

3.3.1 Header name

In order to increase interoperability across multiple protocols and encourage successful integration by default, it is recommended to keep the header name lower case. The header name is a single word without any delimiters such as a hyphen (-).

Header name: tracestate

Platforms and libraries MUST expect header name in any casing and SHOULD send header name in lower case.

3.3.2 Header value

Multiple tracestate headers are allowed. Values from multiple headers in incoming requests SHOULD be combined into a single header according to Field Order [RFC7230] and sent as a single header in the outgoing request.

This section uses the Augmented Backus-Naur Form (ABNF) notation of [RFC5234], including the DIGIT rule in appendix B.1 of [RFC5234]. It also includes the OWS rule from [RFC7230] section 3.2.3.

DIGIT rule defines number 0-9.

The OWS rule defines an optional whitespace. It is used where zero or more whitespace characters might appear. When it is preferred to improve readability, a sender SHOULD generate the optional whitespace as a single space; otherwise, a sender SHOULD NOT generate optional whitespace. See details in the corresponding RFC.

The tracestate field value is a list as defined below. The list is a series of list-members separated by commas (,), and a list-member is a key/value pair separated by an equals sign (=). Spaces and horizontal tabs surrounding list-members are ignored. There can be a maximum of 32 list-members in a list.

Empty and whitespace-only list members are allowed. Libraries and platforms MUST accept empty tracestate headers, but SHOULD avoid sending them. Empty list members are allowed in tracestate because it is difficult for an implementor to recognize an empty value when multiple tracestate headers were sent. Whitespace characters are allowed for a similar reason, as some frameworks will automatically inject whitespace after the , separator even in the case of an empty header.

A simple example of a list with two list-members might look like: vendorname1=opaqueValue1,vendorname2=opaqueValue2.

list  = list-member 0*31( OWS "," OWS list-member )
list-member = key "=" value
list-member =/ OWS
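A tolerant reader for this list grammar might look like the Java sketch below. It is an illustration under several assumptions: the class name TraceStateParser is hypothetical, a malformed list is reported by returning null, duplicate keys are rejected, and only non-empty members count towards the limit of 32. Character-level validation of keys and values is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class TraceStateParser {
    static final int MAX_MEMBERS = 32;

    // Splits a tracestate value into key/value pairs, preserving their order.
    static Map<String, String> parse(String headerValue) {
        Map<String, String> entries = new LinkedHashMap<>();
        for (String member : headerValue.split(",", -1)) {
            String trimmed = stripOws(member);
            if (trimmed.isEmpty()) {
                continue; // empty and whitespace-only members are allowed
            }
            int eq = trimmed.indexOf('=');
            if (eq <= 0 || eq == trimmed.length() - 1) {
                return null; // key or value is missing
            }
            String key = trimmed.substring(0, eq);
            if (entries.containsKey(key)) {
                return null; // assumption: duplicate keys make the list invalid
            }
            entries.put(key, trimmed.substring(eq + 1));
            if (entries.size() > MAX_MEMBERS) {
                return null;
            }
        }
        return entries;
    }

    // Removes optional whitespace (space and horizontal tab) on both sides.
    private static String stripOws(String s) {
        int start = 0;
        int end = s.length();
        while (start < end && isOws(s.charAt(start))) {
            start++;
        }
        while (end > start && isOws(s.charAt(end - 1))) {
            end--;
        }
        return s.substring(start, end);
    }

    private static boolean isOws(char c) {
        return c == ' ' || c == '\t';
    }
}
```

A LinkedHashMap is used so that the left-most entry, which corresponds to the system that wrote traceparent, stays first when the map is iterated.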

Identifiers are short (up to 256 characters) textual identifiers.

key = lcalpha 0*255( lcalpha / DIGIT / "_" / "-"/ "*" / "/" )
key =/ lcalpha 0*240( lcalpha / DIGIT / "_" / "-"/ "*" / "/" ) "@" lcalpha 0*13( lcalpha / DIGIT / "_" / "-"/ "*" / "/" )
lcalpha    = %x61-7A ; a-z

Note that identifiers MUST begin with a lowercase letter, and can only contain lowercase letters a-z, digits 0-9, underscores _, dashes -, asterisks *, and forward slashes /. For multi-tenant vendor scenarios, the @ sign can be used to prefix the vendor name. The suggested use is to put the tenant id at the beginning of the key, as in fw529a3039@dt, where fw529a3039 is a tenant id and @dt is a vendor name. Searching for @dt= would then be a more robust way to find all of a vendor's keys.

A value is an opaque string of up to 256 printable ASCII [RFC0020] characters (i.e., the range 0x20 to 0x7E) except comma (,) and equals sign (=). Note that this also excludes tabs, newlines, carriage returns, etc.

value    = 0*255(chr) nblk-chr
nblk-chr = %x21-2B / %x2D-3C / %x3E-7E
chr      = %x20 / nblk-chr
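The key and value rules map closely onto regular expressions. The Java sketch below transcribes them by hand, so it should be treated as an approximation to check against the ABNF rather than as a normative definition; the class name TraceStateSyntax and its method names are hypothetical.

```java
import java.util.regex.Pattern;

final class TraceStateSyntax {
    // lcalpha / DIGIT / "_" / "-" / "*" / "/"
    private static final String KEY_CHARS = "[a-z0-9_\\-*/]";

    // Simple key: one lcalpha followed by up to 255 key characters.
    private static final Pattern SIMPLE_KEY =
        Pattern.compile("[a-z]" + KEY_CHARS + "{0,255}");

    // Multi-tenant key: tenant part, "@", then a vendor part.
    private static final Pattern TENANT_KEY =
        Pattern.compile("[a-z]" + KEY_CHARS + "{0,240}@[a-z]" + KEY_CHARS + "{0,13}");

    // Value: up to 255 chr followed by one nblk-chr, so it cannot be empty,
    // cannot end with a space, and cannot contain ',' (0x2C) or '=' (0x3D).
    private static final Pattern VALUE = Pattern.compile(
        "[\\x20-\\x2B\\x2D-\\x3C\\x3E-\\x7E]{0,255}[\\x21-\\x2B\\x2D-\\x3C\\x3E-\\x7E]");

    static boolean isValidKey(String key) {
        return SIMPLE_KEY.matcher(key).matches() || TENANT_KEY.matcher(key).matches();
    }

    static boolean isValidValue(String value) {
        return VALUE.matcher(value).matches();
    }
}
```

For example, congo and fw529a3039@dt are accepted as keys, while Congo and 1abc are rejected because they do not start with a lowercase letter.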

The length of a combined header MUST be less than or equal to 512 bytes. If the length of a combined header is more than 512 bytes it SHOULD be ignored.

Example: vendorname1=opaqueValue1,vendorname2=opaqueValue2

The value is a concatenation of trace graph key-value pairs. Only one entry per key is allowed because the entry represents the last position in the trace. Hence implementors must overwrite their entry upon reentry to their tracing system.

For example, if tracing system name is congo, and a trace started in their system, went through a system named rojo and later returned to congo, the tracestate value would not be:

congo=congosFirstPosition,rojo=rojosFirstPosition,congo=congosSecondPosition

Rather, the entry would be rewritten to only include the most recent position: congo=congosSecondPosition,rojo=rojosFirstPosition

Limits:

The tracestate field contains essential information for request correlation. Platforms and tracing systems MUST propagate this field. There might be multiple tracestate headers in a single request according to RFC7230 section 3.2.2. Platform or tracing system may propagate them as they came, combine into a single header or split into multiple headers differently following the RFC specification.

Platforms and tracing vendors SHOULD propagate at least 512 characters of a combined header. This length includes the commas required to separate list items, but does not include optional whitespace (OWS) characters.

There are systems where propagating 512 characters of tracestate may be expensive. In this case the maximum size of the propagated tracestate header SHOULD be documented and explained. The cost of propagating tracestate SHOULD be weighed against the value of the monitoring scenarios enabled for the end users.

In situations where tracestate needs to be truncated due to size limitations, the platform or tracing vendor MUST truncate whole entries. Entries longer than 128 characters SHOULD be removed first. Then entries SHOULD be removed starting from the end of tracestate. Note that other truncation strategies, such as safe-list entries, blocked-list entries, or size-based truncation, MAY be used but are highly discouraged, because they decrease the interoperability of tracing vendors.
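The recommended truncation order can be sketched in Java as follows. The class name TraceStateTruncator is hypothetical, the input is assumed to be a list of already trimmed key=value members, and the choice to scan oversized entries from the end of the list is an assumption, since the text does not say in which order they are to be considered.

```java
import java.util.ArrayList;
import java.util.List;

final class TraceStateTruncator {
    static final int LARGE_ENTRY = 128;

    // Returns the members that remain after truncating to maxLength characters.
    // Only whole entries are ever removed.
    static List<String> truncate(List<String> members, int maxLength) {
        List<String> kept = new ArrayList<>(members);
        // Pass 1: while over the limit, drop entries longer than 128 characters.
        for (int i = kept.size() - 1; i >= 0 && combinedLength(kept) > maxLength; i--) {
            if (kept.get(i).length() > LARGE_ENTRY) {
                kept.remove(i);
            }
        }
        // Pass 2: while still over the limit, drop entries from the end.
        while (!kept.isEmpty() && combinedLength(kept) > maxLength) {
            kept.remove(kept.size() - 1);
        }
        return kept;
    }

    // Length of the combined header: all members plus the separating commas,
    // without optional whitespace.
    static int combinedLength(List<String> members) {
        int length = Math.max(0, members.size() - 1);
        for (String member : members) {
            length += member.length();
        }
        return length;
    }
}
```

Note that an oversized entry at the front of the list is removed before any smaller entry at the end, which keeps as many vendors' entries as possible.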

3.3.3 Examples of HTTP headers

Single tracing system (generic format):

tracestate: rojo=00f067aa0ba902b7

Multiple tracing systems (with different formatting):

tracestate: rojo=00f067aa0ba902b7,congo=t61rcWkgMzE

3.3.4 Versioning of tracestate

The version of tracestate is defined by the version prefix of the traceparent header. If a higher version is detected, an implementation needs to attempt to parse tracestate to the best of its ability. It is the implementor's decision whether to use partially-parsed tracestate key-value pairs or not.

3.4 Mutating the traceparent field

A library or platform receiving a traceparent request header MUST send it with outgoing requests. It MAY mutate the value of this header before passing it to outgoing requests.

If the value of the traceparent field wasn't changed before propagation - tracestate MUST NOT be modified as well. Unmodified headers propagation is typically implemented in pass-thru services like proxies. This behavior may also be implemented in a service which currently does not collect distributed tracing information.

Here is the list of allowed mutations:

  1. Update parent-id. The value of property parent-id can be set to the new value representing the ID of the current operation. This is the most typical mutation and should be considered a default.
  2. Indicate recorded state. The value of recorded flag of trace-flags may be set to 1 if it had value 0 before or vice versa. parent-id MUST be set to the new value with the recorded flag update. See details of recorded flag for more information on how this flag is recommended to be used.
  3. Update recorded. The value of recorded reflects the caller's recording behavior: either the trace data were dropped or may have been recorded out-of-band. This mutation gives the downstream tracer information about the likelihood its parent's information was recorded.
  4. Restarting trace. All properties (trace-id, parent-id, trace-flags) are regenerated. This mutation is used in services defined as a front gate into secure networks and eliminates a potential denial of service attack surface. Implementations SHOULD clean up the tracestate collection on traceparent restart. There are rare cases when the original tracestate entries must be preserved after restart, typically when the trace-id will be reverted at some point of the trace flow, for instance when it leaves the secure network. However, this SHOULD be an explicit decision, not a default behavior, as trace vendors may rely on the trace-id matching tracestate values.
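The most typical mutation, updating parent-id, could be implemented along the lines of the Java sketch below. It assumes the incoming value is already a valid version 00 traceparent; the class name TraceParentMutation and the use of SecureRandom are choices made for this illustration.

```java
import java.security.SecureRandom;

final class TraceParentMutation {
    private static final SecureRandom RANDOM = new SecureRandom();

    // Generates a new parent-id as 16 lower case hex characters.
    static String newParentId() {
        long id;
        do {
            id = RANDOM.nextLong();
        } while (id == 0L); // all zeroes is an invalid parent-id
        return String.format("%016x", id);
    }

    // "Update parent-id": version, trace-id and trace-flags are kept,
    // and only the parent-id is replaced.
    static String withNewParentId(String traceparent) {
        String[] parts = traceparent.split("-");
        return parts[0] + "-" + parts[1] + "-" + newParentId() + "-" + parts[3];
    }
}
```

A component that also changes the recorded flag would do so in the same step, since the flag may only be mutated together with parent-id.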

Libraries and platforms MUST NOT make any other mutations to the traceparent header.

3.5 Mutating the tracestate field

A library or platform receiving a tracestate request header MUST send it with outgoing requests. It MAY mutate the value of this header before passing it to outgoing requests. The main concept of tracestate mutations is that the order of unmodified key-value pairs MUST be preserved. Modified keys MUST be moved to the beginning of the list.

Here is the list of allowed mutations:

  1. Update key value. The value of any key can be updated. Modified keys MUST be moved to the beginning of the list. This is the most common mutation when resuming the trace.
  2. Add new key-value pair. A new key-value pair should be added at the beginning of the list.
  3. Delete the key-value pair. Any key-value pair MAY be deleted. It is highly discouraged to delete keys that weren't generated by the same tracing system or platform. Deletion of an unknown key-value pair will break correlation in other systems. This mutation enables two scenarios. The first is that proxies can block certain tracestate keys for privacy and security concerns. The second is the truncation of long tracestate values.
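The update and add mutations share one shape: the written entry goes to the front and everything else keeps its relative order. The Java sketch below illustrates this on the raw header string; the class name TraceStateMutation is hypothetical, and validation of the key and value is assumed to have happened elsewhere.

```java
import java.util.ArrayList;
import java.util.List;

final class TraceStateMutation {
    // Sets key to value and places the entry at the beginning of the list.
    // An existing entry for the same key is dropped; all other entries keep
    // their original order.
    static String update(String tracestate, String key, String value) {
        List<String> members = new ArrayList<>();
        members.add(key + "=" + value);
        for (String member : tracestate.split(",")) {
            String trimmed = member.trim();
            if (trimmed.isEmpty() || trimmed.startsWith(key + "=")) {
                continue; // skip empty members and the entry being replaced
            }
            members.add(trimmed);
        }
        return String.join(",", members);
    }
}
```

Applied to rojo=00f067aa0ba902b7,congo=t61rcWkgMzE with key congo and value ucfJifl5GOE, this returns congo=ucfJifl5GOE,rojo=00f067aa0ba902b7, matching the example in section 3.1.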

4. Other communication protocols

While trace context is defined for HTTP the authors acknowledge it is also relevant for other communication protocols. Extensions of this specification as well as specifications produced by external organizations define the format of trace context serialization and deserialization for other protocols. Note, that these extensions may be at a different maturity level than this specification.

Please refer to the [trace-context-protocols-registry] for the details of trace context implementation for other protocols.

5. Privacy Considerations

Requirements to propagate headers to downstream services, as well as storing the values of these headers, open up potential privacy concerns. Trace vendors MUST NOT use traceparent and tracestate fields for any personally identifiable or otherwise sensitive information. The only purpose of these fields is to enable trace correlation.

Trace vendors MUST assess the risk of header abuse. This section provides some considerations and initial assessment of the risk associated with storing and propagating these headers. Tracing systems or platforms may choose to inspect and remove sensitive information from the fields before allowing the tracing system to execute code that potentially can propagate or store these fields. All mutations should, however, conform to the list of mutations defined in this specification.

5.1 Privacy of traceparent field

The traceparent field is comprised of randomly-generated numbers. If a random number generator leverages any user identifiable information like IP address as seed state - this information may be exposed. Random number generators MUST NOT rely on any information that can potentially be user identifiable.

Another privacy risk of the traceparent field is an ability to correlate calls made as part of a single transaction. A downstream service may track and correlate two or more calls made in a single transaction and make assumptions about the identity of the caller of one call based on information from another call.

Note, both of the mentioned privacy concerns of the traceparent field are theoretical rather than practical. Some services initiating or receiving a call MAY choose to restart a traceparent field to eliminate those risks completely. It is recommended to find a way to minimize the number of distributed trace restarts to promote interoperability of tracing vendors. Instead, different techniques may be used. For example, services may define trust boundaries of upstream and downstream connections and the level of exposure any calls may bring. For instance, only restart traceparent for authentication calls from or to external services.

Services may also define an algorithm and audit mechanism to validate randomness of incoming or outgoing random numbers in the traceparent field. Note, this algorithm will be services-specific and not a part of this specification. One example could be a temporal algorithm where a reversible hash function is applied to the current clock time. The receiver can validate that the time is within agreed upon boundaries, meaning the random number was generated with the required algorithm and in fact doesn't contain any personal identifiable information.

5.2 Privacy of tracestate field

The tracestate field may contain any opaque value in any of the keys. The main purpose of this header is to provide additional vendor-specific trace-identification information across different distributed tracing systems.

Tracing systems MUST NOT include any personally identifiable information in the tracestate header.

Platforms and tracing systems extremely sensitive to personal information exposure MAY implement selective removal of values corresponding to unknown keys. This mutation of the tracestate field is not forbidden but is highly discouraged, as it defeats the purpose of this field: allowing multiple tracing systems to collaborate.

5.3 Other risks

In implementations where traceparent and tracestate headers are included in responses, these values may inadvertently be passed to cross-origin callers. Implementations should ensure that they only include these response headers when responding to systems that participated in the trace.

6. Security Considerations

There are two types of potential security risks associated with this specification: information exposure and denial of service attacks against the tracing system.

Services and platforms relying on traceparent and tracestate headers should also follow all the best practices of parsing potentially malicious headers, including checking the header length and the content of header values. These practices help to avoid buffer overflow and HTML injection attacks.

6.1 Information exposure

As mentioned in the privacy section, information in traceparent and tracestate headers may carry information that can be considered sensitive. For example, traceparent may allow one call to be correlated to the data sent with another call. tracestate may imply the version of monitoring software used by the caller. This information could potentially be used to create a larger attack.

Application owners should either ensure that no proprietary or confidential information is stored in the tracestate, or they should ensure that tracestate isn't present in requests to external systems.

6.2 Denial of service

When distributed tracing is enabled on a service with a public API and naively continues any trace with the recorded flag set, a malicious attacker could overwhelm an application with tracing overhead, forge trace-id collisions that make monitoring data unusable, or run up your tracing bill with your SaaS tracing vendor.

Tracing vendors and platforms should account for these situations and make sure that checks and balances are in place to protect against denial of monitoring by malicious or badly authored callers.

One example of such protection may be different tracing behavior for authenticated and unauthenticated requests. Various rate limiters for data recording can also be implemented.

6.3 Other risks

Application owners need to make sure to test all code paths leading to the sending of traceparent and tracestate headers. For example, in single page browser applications it is typical to make cross-origin calls. If one of these code paths leads to the sending of traceparent and tracestate headers on cross-origin calls that are restricted via Access-Control-Allow-Headers [FETCH], the call may fail.

A. Acknowledgments

Thanks to Adrian Cole, Christoph Neumüller, Daniel Khan, Erika Arnold, Fabian Lange, Matthew Wear, Reiley Yang, Ted Young, Tyler Benson, Victor Soares for their contributions to this work.

B. Glossary

This section is non-normative.

Distributed trace
A distributed trace is a set of events, triggered as a result of a single logical operation, consolidated across various components of an application. A distributed trace contains events that cross process, network and security boundaries. A distributed trace may be initiated when someone presses a button to start an action on a website - in this example, the trace will represent calls made between the downstream services that handled the chain of requests initiated by this button being pressed.

C. References

C.1 Normative references

[BIT-FIELD]
8-bit field. Wikipedia. URL: https://en.wikipedia.org/wiki/Bit_field
[FETCH]
Fetch Standard. Anne van Kesteren. WHATWG. Living Standard. URL: https://fetch.spec.whatwg.org/
[RFC0020]
ASCII format for network interchange. V.G. Cerf. IETF. October 1969. Internet Standard. URL: https://tools.ietf.org/html/rfc20
[RFC2119]
Key words for use in RFCs to Indicate Requirement Levels. S. Bradner. IETF. March 1997. Best Current Practice. URL: https://tools.ietf.org/html/rfc2119
[RFC5234]
Augmented BNF for Syntax Specifications: ABNF. D. Crocker, Ed.; P. Overell. IETF. January 2008. Internet Standard. URL: https://tools.ietf.org/html/rfc5234
[RFC7230]
Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing. R. Fielding, Ed.; J. Reschke, Ed.. IETF. June 2014. Proposed Standard. URL: https://tools.ietf.org/html/rfc7230
[trace-context-protocols-registry]
Trace Context Protocols Registry. Sergey Kanzhelev; Philippe Le Hégaret. W3C. 14 March 2019. W3C Note. URL: https://www.w3.org/TR/trace-context-protocols-registry/