WebGPU Shading Language

W3C Working Draft,

This version:
https://www.w3.org/TR/2021/WD-WGSL-20210715/
Latest published version:
https://www.w3.org/TR/WGSL/
Editor's Draft:
https://gpuweb.github.io/gpuweb/wgsl/
Previous Versions:
Feedback:
public-gpu@w3.org with subject line “[WGSL] … message topic …” (archives)
Issue Tracking:
GitHub
Inline In Spec
Editors:
(Google)
(Apple Inc.)
Former Editor:
(Google)
Participate:
File an issue (open issues)

Abstract

Shading language for WebGPU.

Status of this document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.

Feedback and comments on this specification are welcome. GitHub Issues are preferred for discussion on this specification. Alternatively, you can send comments to the GPU for the Web Working Group’s mailing-list, public-gpu@w3.org (archives). This draft highlights some of the pending issues that are still to be discussed in the working group. No decision has been taken on the outcome of these issues including whether they are valid.

This document was published by the GPU for the Web Working Group as a Working Draft. This document is intended to become a W3C Recommendation.

Publication as a Working Draft does not imply endorsement by the W3C Membership.

This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

This document is governed by the 15 September 2020 W3C Process Document.

1. Introduction

WebGPU Shading Language (WGSL) is the shader language for [WebGPU]. That is, an application using the WebGPU API uses WGSL to express the programs, known as shaders, that run on the GPU.

[[stage(fragment)]]
fn main() -> [[location(0)]] vec4<f32> {
    return vec4<f32>(0.4, 0.4, 0.8, 1.0);
}

1.1. Goals

1.2. Technical Overview

WebGPU issues a unit of work to the GPU in the form of a GPU command. WGSL is concerned with two kinds of GPU commands:

Both kinds of pipelines use shaders written in WGSL.

A shader is the portion of a WGSL program that executes a shader stage in a pipeline. A shader comprises:

When executing a shader stage, the implementation:

A WGSL program is organized into:

WGSL is an imperative language: behaviour is specified as a sequence of statements to execute. Statements:

WGSL is statically typed: each value computed by a particular expression is in a specific type, determined only by examining the program source.

WGSL has types to describe booleans, numbers, vectors, matrices, and aggregations of these in the form of arrays and structures. Additional types describe memory.

WGSL does not have implicit conversions or promotions between numeric or boolean types. Converting a value from one numeric or boolean type to another requires an explicit conversion, construction, or reinterpretation of bits. This also applies to vector types.

WGSL has texture and sampler types. Together with their associated built-in functions, these support functionality commonly used for graphics rendering, and commonly provided by GPUs.

The work of a shader stage is partitioned into one or more invocations, each of which executes the entry point, but under slightly different conditions. Invocations in a shader stage share access to certain variables:

However, the invocations act on different sets of pipeline inputs, including built-in inputs that provide an identifying value to distinguish an invocation from its peers. Also, each invocation has its own independent storage space in the form of variables in the private and function storage classes.

Invocations within a shader stage execute concurrently, and may often execute in parallel. The shader author is responsible for ensuring the dynamic behaviour of the invocations in a shader stage:

WGSL sometimes permits several possible behaviours for a given feature. This is a portability hazard, as different implementations may exhibit the different behaviours. The design of WGSL aims to minimize such cases, but is constrained by feasibility, and goals for achieving high performance across a broad range of devices.

1.3. Notation

The floor expression is defined over real numbers x:

The ceiling expression is defined over real numbers x:

The roundUp function is defined for positive integers k and n as:

The transpose of an n-column m-row matrix A is the m-column n-row matrix AT formed by copying the rows of A as the columns of AT:

The transpose of a column vector is defined by interpreting the column vector as a 1-row matrix. Similarly, the transpose of a row vector is defined by interpreting the row vector as a 1-column matrix.

2. Shader Lifecycle

There are four key events in the lifecycle of a WGSL program and the shaders it may contain. The first two correspond to the WebGPU API methods used to prepare a WGSL program for execution. The last two are the start and end of execution of a shader.

The events are:

  1. Shader module creation

    • This occurs when the WebGPU createShaderModule method is called. The source text for a WGSL program is provided at this time.

  2. Pipeline creation

  3. Shader execution start

  4. Shader execution end

    • This occurs when all work in the shader completes:

      • all its invocations terminate

      • and all accesses to resources complete

      • outputs, if any, are passed to downstream pipeline stages.

The events are ordered due to:

2.1. Kinds of errors

A program error is a failure to satisfy the requirements of this specification.

There are three kinds of errors, corresponding to the shader lifecycle:

Note: For example, a race condition may not be detectable.

Each requirement in this specification corresponds to a single kind of error. Generally, a requirement corresponds to the earliest error kind at which its violation could be feasibly detected. When unclear, the corresponding error kind is explicitly specified.

The WebGPU specification describes the consequences of each kind of error.

TODO: Update the WebGPU spec, referring back to the three kinds of errors defined here.

3. Textual structure TODO

TODO: This is a stub.

A WGSL program is text. This specification does not prescribe a particular encoding for that text. However, UTF-8 is always a valid encoding for a WGSL program.

Note: The intent of promoting UTF-8 like this is to simplify interchange of WGSL programs and to encourage interoperability among tools.

3.1. Comments

Comments begin with // and continue to the end of the current line. There are no multi-line comments.

TODO: What indicates the end of a line? (E.g. A line ends at the next linefeed or at the end of the program)

3.2. Tokens TODO

3.3. Literals TODO

Token Definition
DECIMAL_FLOAT_LITERAL (-?[0-9]*.[0-9]+ | -?[0-9]+.[0-9]*)((e|E)(+|-)?[0-9]+)?
HEX_FLOAT_LITERAL -?0x([0-9a-fA-F]*.?[0-9a-fA-F]+ | [0-9a-fA-F]+.[0-9a-fA-F]*)(p|P)(+|-)?[0-9]+
INT_LITERAL -?0x[0-9a-fA-F]+ | 0 | -?[1-9][0-9]*
UINT_LITERAL 0x[0-9a-fA-F]+u | 0u | [1-9][0-9]*u

Note: literals are parsed greedily. This means that for statements like a -5 this will not parse as a minus 5 but instead as a -5 which may be unexpected. A space must be inserted after the - if the first expression is desired.

const_literal
  : INT_LITERAL
  | UINT_LITERAL
  | FLOAT_LITERAL
  | TRUE
  | FALSE
FLOAT_LITERAL
  : DECIMAL_FLOAT_LITERAL
  | HEX_FLOAT_LITERAL

3.4. Keywords TODO

TODO: Stub

See § 14.1 Keyword Summary for a list of keywords.

3.5. Identifiers TODO

Token Definition
IDENT [a-zA-Z][0-9a-zA-Z_]*

An identifier must not have the same spelling as a keyword or as a reserved keyword.

3.6. Attributes

An attribute modifies an object or type. WGSL provides a unified syntax for applying attributes. Attributes are used for a variety of purposes such as specifying the interface with the API. Generally speaking, from the language’s point-of-view, attributes can be ignored for the purposes of type and semantic checking.

An attribute must not be specified more than once per object or type.

attribute_list
  : ATTR_LEFT (attribute COMMA)* attribute ATTR_RIGHT

attribute
  : IDENT PAREN_LEFT literal_or_ident PAREN_RIGHT
  | IDENT

literal_or_ident
  : FLOAT_LITERAL
  | INT_LITERAL
  | UINT_LITERAL
  | IDENT
Attributes defined in WGSL
Attribute Valid Values Description
align positive i32 literal Must only be applied to a member of a structure type.

Must be a power of 2.

See memory layout alignment and size.

binding non-negative i32 literal Must only be applied to a resource variable.

Specifies the binding number of the resource in a bind group. See § 9.3.2 Resource interface.

block None Must only be applied to a structure type.

Indicates this structure type represents the contents of a buffer resource occupying a single binding slot in the shader’s resource interface.

The block attribute must be applied to a structure type used as the store type of a uniform buffer or storage buffer variable.

A structure type with the block attribute must not be:

  • the element type of an array type

  • the member type in another structure

builtin a builtin variable identifier Must only be applied to an entry point function parameter, entry point return type, or member of a structure.

Declares a builtin variable. See § 15 Built-in variables.

group non-negative i32 literal Must only be applied to a resource variable.

Specifies the binding group of the resource. See § 9.3.2 Resource interface.

interpolate One or two parameters.

The first parameter must be an interpolation type. The second parameter, if present, must specify the interpolation sampling.

Must only be applied to an entry point function parameter, entry point return type, or member of a structure type. Must only be applied to declarations of scalars or vectors of floating-point type. Must not be used with the compute shader stage.

Specifies how the user-defined IO must be interpolated. The attribute is only significant on user-defined vertex outputs and fragment inputs. See § 9.3.1.3 Interpolation.

invariant None Must only be applied to the position built-in variable.

When applied to the position built-in output variable of a vertex shader, the computation of the result is invariant across different programs and different invocations of the same entry point. That is, if the data and control flow match for two position outputs in different entry points, then the result values are guaranteed to be the same. There is no affect on a position built-in input variable.

Note: this attribute maps to the Invariant decoration in SPIR-V, the invariant attribute in Metal shading language, the precise qualifier in HLSL, the invariant qualifier in GLSL.

location non-negative i32 literal Must only be applied to an entry point function parameter, entry point return type, or member of a structure type. Must only be applied to declarations of numeric scalar or numeric vector type. Must not be used with the compute shader stage.

Specifies a part of the user-defined IO of an entry point. See § 9.3.1.4 Input-output Locations.

override An optional, non-negative i32 literal Must only be applied to module scope constant declaration of scalar type.

Specifies a pipeline-overridable constant. In the WebGPU API, pipeline overridable constants are specified by the identifier of the constant the attribute is applied to. If the optional parameter is specified, the pipeline overridable constant is referred to by the numeric id specified instead.

size positive i32 literal Must only be applied to a member of a structure type.

The number of bytes reserved in the struct for this member.

stage compute, vertex, or fragment Must only be applied to a function declaration.

Declares an entry point by specifying its pipeline stage.

stride positive i32 literal Must only be applied to an array type.

The number of bytes from the start of one element of the array to the start of the next element.

workgroup_size One, two or three parameters.

Each parameter is either a literal or module-scope constant. All parameters must be of the same type, either i32 or u32.

Must be applied to a compute shader entry point function. Must not be applied to any other object.

Specifies the x, y, and z dimensions of the workgroup grid for the compute shader.

The first parameter specifies the x dimension. The second parameter, if provided, specifies the y dimension, otherwise is assumed to be 1. The third parameter, if provided, specifies the z dimension, otherwise is assumed to be 1. Each dimension must be at least 1 and at most an upper bound specified by the WebGPU API.

3.7. Directives TODO

A directive is a token sequence which modifies how a WGSL program is processed by a WebGPU implementation. See § 10.1 Enable Directive.

3.8. Declaration and scope

A declaration associates an identifier with one of the following kinds of objects:

In other words, a declaration introduces a name for an object. A name cannot be used before it is declared.

The scope of a declaration is the set of program locations where a use of the declared identifier potentially denotes its associated object. We say the identifier is in scope (of the declaration) at those source locations.

Each kind of declaration has its own rule for determining its scope. In general the scope is a span of text beginning immediately after the end of the declaration.

Certain objects are provided by the WebGPU implementation, and are treated as if they have already been declared at the start of a WGSL program. We say such objects are predeclared. Their scope is the entire WGSL program. Examples of predeclared objects are:

A declaration must not introduce a name when that identifier is already in scope with the same end scope as another instance of that name. When an identifier is used in scope of one or more declarations for that name, the identifier will denote the object of the declaration appearing closest to that use. We say the identifier use resolves to that declaration.

Note: A declaration always precedes its identifier’s scope. Therefore, the nearest in scope declaration of an identifier always precedes the use of the identifier.

EXAMPLE: Valid and invalid declarations
// Invalid, cannot reuse built-in function names.
var<private> modf: f32 = 0.0;

// Valid, foo_1 is in scope until the end of the program.
var<private> foo: f32 = 0.0; // foo_1

// Valid, bar_1 is in scope until the end of the program.
var<private> bar: u32 = 0u; // bar_1

// Valid, my_func_1 is in scope until the end of the program.
// Valid, foo_2 is in scope until the end of the function.
fn my_func(foo: f32) { // my_func_1, foo_2
  // Any reference to 'foo' resolves to the function parameter.

  // Invalid, the scope of foo_3 ends at the of the function.
  var foo: f32; // foo_3

  // Valid, bar_2 is in scope until the end of the function.
  var bar: u32; // bar_2
  // References to 'bar' resolve to bar_2
  {
    // Valid, bar_3 is in scope until the end of the compound statement.
    var bar: u32; // bar_3
    // References to 'bar' resolve to bar_3

    // Invalid, bar_4 has the same end scope as bar_3.
    var bar: i32; // bar_4

    // Valid, i_1 is in scope until the end of the for loop
    for (var i: i32 = 0; i < 10; i = i + 1) { // i_1
      // Invalid, i_2 has the same end scope as i_1.
      var i: i32 = 1; // i_2.
    }
  }

  // Invalid, bar_5 has the same end scope as bar_2.
  var bar: u32; // bar_5
}

// Invalid, bar_6 has the same end scope as bar_1.
var<private> bar: u32 = 1u; // bar_6

// Invalid, my_func_2 has the same end scope as my_func_1.
fn my_func() { } // my_func_2

// Valid, my_foo_1 is in scope until the end of the program.
fn my_foo(
  // Valid, my_foo_2 is in scope until the end of the function.
  my_foo: i32 // my_foo_2
) { }

There are multiple levels of scoping depending on how and where things are declared.

When an identifier is used, it must be in scope for some declaration, or as part of a directive.

A declaration is at module scope if the declaration appears outside the text of any other declaration.

Note: Only a function declaration can contain other declarations.

4. Types

Programs calculate values.

In WGSL, a type is set of values, and each value belongs to exactly one type. A value’s type determines the syntax and semantics of operations that can be performed on that value.

For example, the mathematical number 1 corresponds to three distinct values in WGSL:

WGSL treats these as different because their machine representation and operations differ.

A type is either predeclared, or created in WGSL source via a declaration.

We distinguish between the concept of a type and the syntax in WGSL to denote that type. In many cases the spelling of a type in this specification is the same as its WGSL syntax. For example:

Some WGSL types are only used for analyzing a source program and for determining the program’s runtime behaviour. This specification will describe such types, but they do not appear in WGSL source text.

Note: WGSL reference types are not written in WGSL programs. See TODO forward reference to ptr/ref.

4.1. Type Checking

A WGSL value is computed by evaluating an expression. An expression is a segment of source text parsed as one of the WGSL grammar rules whose name ends with "_expression". An expression E can contain subexpressions which are expressions properly contained in the outer expression E.

The particular value produced by an expression evaluation depends on:

The values that may result from evaluating a particular expression will always belong to a specific WGSL type, known as the static type of the expression. The rules of WGSL are designed so that the static type of an expression depends only on the expression’s static context.

Statements often use expressions, and may place requirements on the static types of those expressions. For example:

Type checking a successfully parsed WGSL program is the process of mapping each expression to its static type, and determining if the type requirements of each statement are satisfied.

A type assertion is a mapping from some WGSL source expression to a WGSL type. The notation

e : T

is a type assertion meaning T is the static type of WGSL expression e.

Note: A type assertion is a statement of fact about the text of a program. It is not a runtime check.

Finding static types for expressions can be performed by recursively applying type rules. A type rule has two parts:

A type rule applies to an expression when:

TODO: write an example such as 1+2, or 3 - a, where a is in-scope of a let declaration with i32 type.

The type rules are designed so that if parsing succeeds, at most one type rule will apply to each expression. If a type rule applies to an expression, then the conclusion is asserted, and therefore determines the static type of the expression.

A WGSL source program is well-typed when:

Otherwise there is a type error and the source program is not a valid WGSL program.

WGSL is a statically typed language because type checking a WGSL program will either succeed or discover a type error, while only having to inspect the program source text.

TODO(dneto): Lazy-decay is a tie-breaking rule. The above description can accomodate it by using priority-levels on potentially-matching type rules.

4.1.1. Type rule tables

The WGSL type rules are organized into type rule tables, with one row per type rule.

The semantics of an expression is the effect of evaluating that expression, and is primarily the production of a result value. The Description column of the type rule that applies to an expression will specify the expression’s semantics. The semantics usually depends on the values of the type rule parameters, including the assumed values of any subexpressions. Sometimes the semantics of an expression includes effects other than producing a result value, such as the non-result-value effects of its subexpressions.

TODO: example: non-result-value effect is any side effect of a function call subexpression.

4.2. Plain Types

Plain types are the types for representing boolean values, numbers, vectors, matrices, or aggregations of such values.

A plain type is either a scalar type, an atomic type, or a composite type.

Note: Plain types in WGSL are similar to Plain-Old-Data types in C++, but also include atomic types.

4.2.1. Boolean Type

The bool type contains the values true and false.

4.2.2. Integer Types

The u32 type is the set of 32-bit unsigned integers.

The i32 type is the set of 32-bit signed integers. It uses a two’s complementation representation, with the sign bit in the most significant bit position.

4.2.3. Floating Point Type

The f32 type is the set of 32-bit floating point values of the IEEE-754 binary32 (single precision) format. See § 12.5 Floating Point Evaluation for details.

4.2.4. Scalar Types

The scalar types are bool, i32, u32, and f32.

The numeric scalar types are i32, u32, and f32.

4.2.5. Vector Types

A vector is a grouped sequence of 2, 3, or 4 scalar components.

Type Description
vecN<T> Vector of N elements of type T. N must be in {2, 3, 4} and T must be one of the scalar types. We say T is the component type of the vector.

A vector is a numeric vector if its component type is a numeric scalar.

Key use cases of a vector include:

Many operations on vectors act component-wise, i.e. the result vector is formed by operating on each component independently.

EXAMPLE: Vector
vec2<f32>  // is a vector of two f32s.
EXAMPLE: Component-wise addition
let x : vec3<f32> = a + b; // a and b are vec3<f32>
// x[0] = a[0] + b[0]
// x[1] = a[1] + b[1]
// x[2] = a[2] + b[2]

4.2.6. Matrix Types

A matrix is a grouped sequence of 2, 3, or 4 floating point vectors.

Type Description
matNxM<f32> Matrix of N columns and M rows, where N and M are both in {2, 3, 4}. Equivalently, it can be viewed as N column vectors of type vecM<f32>.

The key use case for a matrix is to embody a linear transformation. In this interpretation, the vectors of a matrix are treated as column vectors.

The product operator (*) is used to either:

See § 6.9 Arithmetic Expressions.

EXAMPLE: Matrix
mat2x3<f32>  // This is a 2 column, 3 row matrix of 32-bit floats.
             // Equivalently, it is 2 column vectors of type vec3<f32>.

4.2.7. Atomic Types

An atomic type encapsulates a scalar type such that:

Type Description
atomic<T> Atomic of type T. T must be either u32 or i32.

An expression must not evaluate to an atomic type.

Atomic types may only be instantiated by variables in the workgroup storage class or by storage buffer variables with a read_write access mode.

An atomic modification is any operation on an atomic object which sets the content of the object. The operation counts as a modification even if the new value is the same as the object’s existing value.

In WGSL, atomic modifications are mutually ordered, for each object. That is, during execution of a shader stage, for each atomic object A, all agents observe the same order of modification operations applied to A. The ordering for distinct atomic objects may not be related in any way; no causality is implied. Note that variables in workgroup storage are shared within a workgroup, but are not shared between different workgroups.

TODO: Add links the eventual memory model descriptions.

EXAMPLE: Mapping atomics in a storage variable to SPIR-V
[[block]] struct S {
  a: atomic<i32>;
  b: atomic<u32>;
};

[[group(0), binding(0)]]
var<storage,read_write> x: S;

// Maps to the following SPIR-V:
// - When atomic types are members of a struct, the Volatile decoration
//   is annotated on the member.
// OpDecorate %S Block
// OpMemberDecorate %S 0 Volatile
// OpMemberDecorate %S 1 Volatile
// ...
// %i32 = OpTypeInt 32 1
// %u32 = OpTypeInt 32 0
// %S = OpTypeStruct %i32 %u32
// %ptr_storage_S = OpTypePointer StorageBuffer %S
// %x = OpVariable %ptr_storage_S StorageBuffer
EXAMPLE: Mapping atomics in a workgroup variable to SPIR-V
var<workgroup> x: atomic<u32>;

// Maps to the following SPIR-V:
// - When atomic types are directly instantiated by a variable,  the Volatile
//   decoration is annotated on the OpVariable.
// OpDecorate %x Volatile
// ...
// %u32 = OpTypeInt 32 0
// %ptr_workgroup_u32 = OpTypePointer Workgroup %S
// %x = OpVariable %ptr_workgroup_u32 Workgroup

4.2.8. Array Types

An array is an indexable grouping of element values.

Type Description
array<E,N> A sized array with N elements of type E.
N must be 1 or larger.
array<E> A runtime-sized array of elements of type E. These may only appear in specific contexts.

Note: Sized arrays and runtime-sized arrays are always distinct types. That is, array<E,N> and array<E> are different.

The first element in an array is at index 0, and each successive element is at the next integer index. See § 6.7.3 Array Access Expression.

An array element type must be one of:

Note: That is, the element type must be a plain type.

WGSL defines the following attributes that can be applied to array types:

Restrictions on runtime-sized arrays:

4.2.9. Structure Types

A structure is a grouping of named member values.

Type Description
struct<T1,...,TN> An ordered tuple of N members of types T1 through TN, with N being an integer greater than 0. A structure type declaration specifies an identifier name for each member. Two members of the same structure type must not have the same name.

A structure member type must be one of:

Note: That is, each member type must be a plain type.

Note: The structure member type restriction and the array element type restriction are mutually reinforcing. Combined, they imply that a pointer may not appear in any level of nesting within either an array or structure. Similarly, the same limitations apply to textures and samplers.

EXAMPLE: Structure
// A structure with two members.
struct Data {
  a: i32;
  b: vec2<f32>;
};
struct_decl
  : attribute_list* STRUCT IDENT struct_body_decl
struct_body_decl
  : BRACE_LEFT struct_member* BRACE_RIGHT

struct_member
  : attribute_list* variable_ident_decl SEMICOLON

WGSL defines the following attributes that can be applied to structure types:

WGSL defines the following attributes that can be applied to structure members:

Note: Layout attributes may be required if the structure type is used to define a uniform buffer or a storage buffer. See § 4.3.7 Memory Layout.

EXAMPLE: Structure WGSL
struct my_struct {
  a: f32;
  b: vec4<f32>;
};
EXAMPLE: Structure SPIR-V
             OpName %my_struct "my_struct"
             OpMemberName %my_struct 0 "a"
             OpMemberDecorate %my_struct 0 Offset 0
             OpMemberName %my_struct 1 "b"
             OpMemberDecorate %my_struct 1 Offset 4
%my_struct = OpTypeStruct %float %v4float
EXAMPLE: Structure WGSL
// Runtime Array
type RTArr = [[stride(16)]] array<vec4<f32>>;
[[block]] struct S {
  a: f32;
  b: f32;
  data: RTArr;
};
EXAMPLE: Structure SPIR-V
             OpName %my_struct "my_struct"
             OpMemberName %my_struct 0 "a"
             OpMemberDecorate %my_struct 0 Offset 0
             OpMemberName %my_struct 1 "b"
             OpMemberDecorate %my_struct 1 Offset 4
             OpMemberName %my_struct 2 "data"
             OpMemberDecorate %my_struct 2 Offset 16
             OpDecorate %rt_arr ArrayStride 16
   %rt_arr = OpTypeRuntimeArray %v4float
%my_struct = OpTypeStruct %float %v4float %rt_arr

4.2.10. Composite Types

A type is composite if it has internal structure expressed as a composition of other types. The internal parts do not overlap, and are called components.

The composite types are:

4.2.11. Constructible Types

Many kinds of values can be created, loaded, stored, passed into functions, and returned from functions. We call these constructible.

A type is constructible if it is one of:

Note: All constructible types are plain.

Note: Atomic types and runtime-sized array types are not constructible. Composite types containing atomics and runtime-sized arrays are not constructible.

4.3. Memory

In WGSL, a value of storable type may be stored in memory, for later retrieval. This section describes the structure of memory, and how WGSL types are used to describe the contents of memory.

In general WGSL follows the Vulkan Memory Model.

4.3.1. Memory Locations

Memory consists of a set of distinct memory locations. Each memory location is 8-bits in size. An operation affecting memory interacts with a set of one or more memory locations.

Two sets of memory locations overlap if the intersection of their sets of memory locations is non-empty. Each variable declaration has a set of memory locations that does not overlap with the sets of memory locations of any other variable declaration. Memory operations on structures and arrays may access padding between elements, but must not access padding at the end of the structure or array.

4.3.2. Memory Access Mode

A memory access is an operation that acts on memory locations.

A single operation can read, write, or both read and write.

Particular memory locations may support only certain kinds of accesses, expressed as the memory’s access mode:

read

Supports read accesses, but not writes.

write

Supports write accesses, but not reads.

read_write

Supports both read and write accesses.

access_mode
  : READ
  | WRITE
  | READ_WRITE

4.3.3. Storable Types

The value contained in a variable must be of a storable type. A storable type may have an explicit representation defined by WGSL, as described in § 4.3.7.4 Internal Layout of Values, or it may be opaque, such as for textures and samplers.

A type is storable if it is one of:

Note: That is, the storable types are the plain types, texture types, and sampler types.

4.3.4. IO-shareable Types

Pipeline input and output values must be of IO-shareable type.

A type is IO-shareable if it is one of:

The following kinds of values must be of IO-shareable type:

Note: Only built-in pipeline inputs may have a boolean type. A user input or output data attribute must not be of bool type or contain a bool type. See § 9.3.1 Pipeline Input and Output Interface.

4.3.5. Host-shareable Types

Host-shareable types are used to describe the contents of buffers which are shared between the host and the GPU, or copied between host and GPU without format translation. When used for this purpose, the type must be additionally decorated with layout attributes as described in § 4.3.7 Memory Layout. We will see in § 5.1 Module Scope Variables that the store type of uniform buffer and storage buffer variables must be host-shareable.

A type is host-shareable if it is one of:

WGSL defines the following attributes that affect memory layouts:

Note: An IO-shareable type T would also be host-shareable if T and its subtypes have appropriate stride attributes, and if T is not bool and does not contain a bool. Additionally, a runtime-sized array is host-shareable but is not IO-shareable.

Note: Both IO-shareable and host-shareable types have concrete sizes, but counted differently. IO-shareable types are sized by a location-count metric, see § 9.3.1.4 Input-output Locations. Host-shareable types are sized by a byte-count metric, see § 4.3.7 Memory Layout.

4.3.6. Storage Classes

Memory locations are partitioned into storage classes. Each storage class has unique properties determining mutability, visibility, the values it may contain, and how to use variables with it.

Storage Classes
Storage class Sharing among invocations Supported access modes Variable scope Restrictions on stored values Notes
function Same invocation only read_write Function scope Constructible type
private Same invocation only read_write Module scope Constructible type
workgroup Invocations in the same compute shader workgroup read_write Module scope Plain type, excluding runtime-sized arrays, or composite types containing runtime-sized arrays
uniform Invocations in the same shader stage read Module scope Constructible host-shareable types For uniform buffer variables
storage Invocations in the same shader stage read_write, read (default) Module scope Host-shareable For storage buffer variables
handle Invocations in the same shader stage read Module scope Sampler types or texture types For sampler and texture variables.

Note: The token handle is reserved: it is never used in a WGSL program.

Note: A texture variable holds an opaque handle which is used to access the underlying grid of texels. The handle itself is always read-only. In most cases the underlying texels are read-only. For a write-only storage texture, the underlying texels are write-only.

storage_class
  | FUNCTION
  | PRIVATE
  | WORKGROUP
  | UNIFORM
  | STORAGE
WGSL storage class SPIR-V storage class
uniform Uniform
workgroup Workgroup
handle UniformConstant
storage StorageBuffer
private Private
function Function

4.3.7. Memory Layout

Uniform buffer and storage buffer variables are used to share bulk data organized as a sequence of bytes in memory. Buffers are shared between the CPU and the GPU, or between different shader stages in a pipeline, or between different pipelines.

Because buffer data are shared without reformatting or translation, buffer producers and consumers must agree on the memory layout, which is the description of how the bytes in a buffer are organized into typed WGSL values.

The store type of a buffer variable must be host-shareable, with fully elaborated memory layout, as described below.

Each buffer variable must be declared in either the uniform or storage storage classes.

The memory layout of a type is significant only when evaluating an expression with:

An 8-bit byte is the most basic unit of host-shareable memory. The terms defined in this section express counts of 8-bit bytes.

We will use the following notation:

4.3.7.1. Alignment and Size

Each host-shareable data type T has an alignment and size value, denoted by AlignOf(T) and SizeOf(T), respectively.

Alignment guarantees that a value’s address in memory will be a multiple of the specified value. This can enable more efficient hardware instructions to be used to access the value or satisfy more restrictive hardware requirements on certain storage classes. (see storage class layout constraints).

Note: Each alignment value is always a power of two, by construction.

The size of a type or structure member is the number of contiguous bytes reserved in host-shareable memory for the purpose of storing a value of the type or structure member. The size may include non-addressable padding at the end of the type. Consequently, loads and stores of a value might access fewer memory locations than the value’s size.

Alignment and size for host-shareable types are defined recursively in the following table:

Alignment and size for host-shareable types
Host-shareable type T AlignOf(T) SizeOf(T)
i32, u32, or f32 4 4
atomic<T> 4 4
vec2<T> 8 8
vec3<T> 16 12
vec4<T> 16 16
matNxM (col-major)

(General form)

AlignOf(vecM) SizeOf(array<vecM, N>)
mat2x2<f32> 8 16
mat3x2<f32> 8 24
mat4x2<f32> 8 32
mat2x3<f32> 16 32
mat3x3<f32> 16 48
mat4x3<f32> 16 64
mat2x4<f32> 16 32
mat3x4<f32> 16 48
mat4x4<f32> 16 64
struct S max(AlignOfMember(S, M1), ... , AlignOfMember(S, MN))
roundUp(AlignOf(S), OffsetOfMember(S, L) + SizeOfMember(S, L))

Where L is the last member of the structure
array<E, N>

(Implicit stride)

AlignOf(E) N * roundUp(AlignOf(E), SizeOf(E))
array<E>

(Implicit stride)

AlignOf(E) Nruntime * roundUp(AlignOf(E), SizeOf(E))

Where Nruntime is the runtime-determined number of elements of T
[[stride(Q)]]
array<E, N>
AlignOf(E) N * Q
[[stride(Q)]]
array<E>
AlignOf(E) Nruntime * Q
4.3.7.2. Structure Layout Rules

Each structure S member MN has a size and alignment value, denoted by SizeOfMember(S, MN) and AlignOfMember(S, MN), respectively. The member sizes and alignments are used to calculate each member’s byte offset from the start of the structure.

Structures member size and alignment values default to the member type T's SizeOf(T) and AlignOf(T) values.

If a structure member is decorated with the size decoration, then the structure member will use the value of the decoration for its size instead of its type’s size.

If a structure member is decorated with the align decoration, then the structure member will use the value of the decoration for its alignment instead of its type’s alignment.

The first structure member always has a zero byte offset from the start of the structure.

Subsequent members have the following byte offset from the start of the structure:

OffsetOfMember(S, MN) = roundUp(AlignOfMember(S, MN), OffsetOfMember(S, MN-1) + SizeOfMember(S, MN-1)
Where MN is the current member and MN-1 is the previous member

Structure members must not overlap. If a structure member is decorated with the size attribute, the value must be at least as large as the size of the member’s type:

SizeOfMember(S, MN) ≥ SizeOf(T)
Where T is the type of member MN.

The alignment of a structure is equal to the largest alignment of all of its members:

AlignOf(S) = max(AlignOfMember(S, M1), ... , AlignOfMember(S, MN))

The size of a structure is equal to the offset plus the size of its last member, rounded to the next multiple of the structure’s alignment:

SizeOf(S) = roundUp(AlignOf(S), OffsetOfMember(S, L) + SizeOfMember(S, L))
Where L is the last member of the structure

EXAMPLE: Layout of structures using implicit member sizes, alignments and strides
struct A {                                     //             align(8)  size(24)
    u: f32;                                    // offset(0)   align(4)  size(4)
    v: f32;                                    // offset(4)   align(4)  size(4)
    w: vec2<f32>;                              // offset(8)   align(8)  size(8)
    x: f32;                                    // offset(16)  align(4)  size(4)
    // -- implicit struct size padding --      // offset(20)            size(4)
};

[[block]] struct B {                           //             align(16) size(160)
    a: vec2<f32>;                              // offset(0)   align(8)  size(8)
    // -- implicit member alignment padding -- // offset(8)             size(8)
    b: vec3<f32>;                              // offset(16)  align(16) size(12)
    c: f32;                                    // offset(28)  align(4)  size(4)
    d: f32;                                    // offset(32)  align(4)  size(4)
    // -- implicit member alignment padding -- // offset(36)            size(12)
    e: A;                                      // offset(40)  align(8)  size(24)
    f: vec3<f32>;                              // offset(64)  align(16) size(12)
    // -- implicit member alignment padding -- // offset(76)            size(4)
    g: array<A, 3>;                            // offset(80)  align(8)  size(72) stride(24)
    h: i32;                                    // offset(152) align(4)  size(4)
    // -- implicit struct size padding --      // offset(156)           size(4)
};

[[group(0), binding(0)]]
var<storage,read_write> storage_buffer: B;
EXAMPLE: Layout of structures with explicit member sizes, alignments and strides
struct A {                                     //             align(8)  size(32)
    u: f32;                                    // offset(0)   align(4)  size(4)
    v: f32;                                    // offset(4)   align(4)  size(4)
    w: vec2<f32>;                              // offset(8)   align(8)  size(8)
    [[size(16)]] x: f32;                       // offset(16)  align(4)  size(16)
};

[[block]] struct B {                           //             align(16) size(208)
    a: vec2<f32>;                              // offset(0)   align(8)  size(8)
    // -- implicit member alignment padding -- // offset(8)             size(8)
    b: vec3<f32>;                              // offset(16)  align(16) size(12)
    c: f32;                                    // offset(28)  align(4)  size(4)
    d: f32;                                    // offset(32)  align(4)  size(4)
    // -- implicit member alignment padding -- // offset(36)            size(12)
    [[align(16)]] e: A;                        // offset(48)  align(16) size(32)
    f: vec3<f32>;                              // offset(80)  align(16) size(12)
    // -- implicit member alignment padding -- // offset(92)            size(4)
    g: [[stride(32)]] array<A, 3>;             // offset(96)  align(8)  size(96)
    h: i32;                                    // offset(192) align(4)  size(4)
    // -- implicit struct size padding --      // offset(196)           size(12)
};

[[group(0), binding(0)]]
var<uniform> uniform_buffer: B;
4.3.7.3. Array Layout Rules

An array element stride is the number of bytes from the start of one array element to the start of the next element.

The first array element always has a zero byte offset from the start of the array.

If the array type is annotated with an explicit stride decoration then this will be used as the array stride, otherwise the array uses an implicit stride equal to the size of the array’s element type, rounded up to the alignment of the element type:

StrideOf(array<T[, N]>) = roundUp(AlignOf(T), SizeOf(T))

In all cases, the array stride must be a multiple of the element alignment.

EXAMPLE: Implicit / explicit array strides
// Array with an implicit element stride of 16 bytes
var implicit_stride: array<vec3<f32>, 8>;

// Array with an explicit element stride of 32 bytes
var explicit_stride: [[stride(32)]] array<vec3<f32>, 8>;

Arrays decorated with the stride attribute must have a stride that is at least the size of the element type, and be a multiple of the element type’s alignment value.

The array size is equal to the element stride multiplied by the number of elements:

SizeOf(array<T, N>) = StrideOf(array<T, N>) × N
SizeOf(array<T>) = StrideOf(array<T>) × Nruntime

The array alignment is equal to the element alignment:

AlignOf(array<T[, N]>) = AlignOf(T)

For example, the layout for a [[stride(S)]] array<T, 3> type is equivalent to the following structure:

EXAMPLE: Structure equivalent of a three element array
struct Array {
  [[size(S)]] element_0: T;
  [[size(S)]] element_1: T;
  [[size(S)]] element_2: T;
};
4.3.7.4. Internal Layout of Values

This section describes how the internals of a value are placed in the byte locations of a buffer, given an assumed placement of the overall value. These layouts depend on the value’s type, the stride attribute on array types, and the align and size attributes on structure type members.

The data will appear identically regardless of storage class.

When a value V of type u32 or i32 is placed at byte offset k of a host-shared buffer, then:

Note: Recall that i32 uses twos-complement representation, so the sign bit is in bit position 31.

A value V of type f32 is represented in IEEE-754 binary32 format. It has one sign bit, 8 exponent bits, and 23 fraction bits. When V is placed at byte offset k of host-shared buffer, then:

Note: The above rules imply that numeric values in host-shared buffers are stored in little-endian format.

When a value V of atomic type atomic<T> is placed in a host-shared buffer, it has the same internal layout as a value of the underlying type T.

When a value V of vector type vecN<T> is placed at byte offset k of a host-shared buffer, then:

When a matrix value M is placed at byte offset k of a host-shared memory buffer, then:

When a value of array type A is placed at byte offset k of a host-shared memory buffer, then:

When a value of structure type S is placed at byte offset k of a host-shared memory buffer, then:

4.3.7.5. Storage Class Layout Constraints

The storage and uniform storage classes have different buffer layout constraints which are described in this section.

All structure and array types directly or indirectly referenced by a variable must obey the constraints of the variable’s storage class. Violations of a storage class constraint result in a compile-time error.

In this section we define RequiredAlignOf(S, C) as the required byte offset alignment of values of host-shareable type S when used by storage class C.

Alignment requirements of a host-shareable type for storage and uniform storage classes
Host-shareable type S RequiredAlignOf(S, storage) RequiredAlignOf(S, uniform)
i32, u32, or f32 AlignOf(S) AlignOf(S)
atomic<T> AlignOf(S) AlignOf(S)
vecN<T> AlignOf(S) AlignOf(S)
matNxM<f32> AlignOf(S) AlignOf(S)
array<T, N> AlignOf(S) roundUp(16, AlignOf(S))
array<T> AlignOf(S) roundUp(16, AlignOf(S))
struct S AlignOf(S) roundUp(16, AlignOf(S))

Structure members of type T must have a byte offset from the start of the structure that is a multiple of the RequiredAlignOf(T, C) for the storage class C:

OffsetOfMember(S, M) = k × RequiredAlignOf(T, C)
Where k is a non-negative integer and M is a member of structure S with type T

Arrays of element type T must have an element stride that is a multiple of the RequiredAlignOf(T, C) for the storage class C:

StrideOf(array<T[, N]>) = k × RequiredAlignOf(T, C)
Where k is a non-negative integer

Note: RequiredAlignOf(T, C) does not impose any additional restrictions on the values permitted for an align decoration, nor does it affect the rules of AlignOf(T). Data is laid out with the rules defined in previous sections and then the resulting layout is validated against the RequiredAlignOf(T, C) rules.

The uniform storage class also requires that:

EXAMPLE: invalid structure layout for uniform storage class
struct S {
  x: f32;
};
struct Invalid {
  a: S;
  b: f32; // invalid: offset between a and b is 4 bytes, but must be at least 16
};
struct Valid {
  a: S;
  [[align(16)]] b: f32; // valid: offset between a and b is 16 bytes
};
[[group(0), binding(0)]] var<uniform> invalid: Invalid;
[[group(0), binding(1)]] var<uniform> valid: Valid;

4.4. Memory View Types

In addition to calculating with plain values, a WGSL program will also often read values from memory or write values to memory, via memory access operations. Each memory access is performed via a memory view.

A memory view comprises:

The access mode of a memory view must be supported by the storage class. See § 4.3.6 Storage Classes.

WGSL has two kinds of types for representing memory views: reference types and pointer types.

Constraint Type Description
SC is a storage class,
T is a storable type,
A is an access mode
ref<SC,T,A> The reference type identified with the set of memory views for memory locations in SC holding values of type T, supporting memory accesses described by mode A.
In this context T is known as the store type.
Reference types are not written in WGSL progam source; instead they are used to analyze a WGSL program.
SC is a storage class,
T is a storable type,
A is an access mode
ptr<SC,T,A> The pointer type identified with the set of memory views for memory locations in SC holding values of type T, supporting memory accesses described by mode A.
In this context T is known as the pointee type.
Pointer types may appear in WGSL progam source.

When analyzing a WGSL program, reference and pointer types are fully parameterized by a storage class, a storable type, and an access mode. In code examples in this specification, the comments show this fully parameterized form.

However, in WGSL source text:

EXAMPLE: Pointer type
fn my_function(
  // 'ptr<function,i32,read_write>' is the type of a pointer value that references
  // storage for keeping an 'i32' value, using memory locations in the 'function'
  // storage class.  Here 'i32' is the pointee type.
  // The implied access mode is 'read_write'. See below for access mode defaults.
  ptr_int: ptr<function,i32>,

  // 'ptr<private,array<f32,50>,read_write>' is the type of a pointer value that
  // refers to storage for keeping an array of 50 elements of type 'f32', using
  // memory locations in the 'private' storage class.
  // Here the pointee type is 'array<f32,50>'.
  // The implied access mode is 'read_write'. See below for access mode defaults.
  ptr_array: ptr<private, array<f32, 50>>
) { }

Reference types and pointer types are both sets of memory views: a particular memory view is associated with a unique reference value and also a unique pointer value:

Each pointer value p of type ptr<SC,T,A> corresponds to a unique reference value r of type ref<SC,T,A>, and vice versa, where p and r describe the same memory view.

4.4.1. Access Mode Defaults

The access mode for a memory view is often determined by context:

When writing a variable declaration or a pointer type in WGSL source:

4.4.2. Originating variable

In WGSL a reference value always corresponds to the memory view for some or all of the memory locations for some variable. This defines the originating variable for the reference value.

A pointer value always corresponds to a reference value, and so the originating variable of a pointer is the same as the originating variable of the corresponding reference.

Note: The originating variable is a dynamic concept. The originating variable for a formal parameter of a function depends on the call sites for the function. Different call sites may supply pointers into different originating variables.

If a reference or pointer access is out of bounds, an invalid memory reference is produced. Loads from an invalid reference return one of:

Stores to an invalid reference may either:

4.4.3. Use cases for references and pointers

References and pointers are distinguished by how they are used:

Defining references in this way enables simple idiomatic use of variables:

EXAMPLE: Reference types enable simple use of variables
[[stage(compute)]]
fn main() {
  // 'i' has reference type ref<function,i32,read_write>
  // The memory locations for 'i' store the i32 value 0.
  var i: i32 = 0;

  // 'i + 1' can only match a type rule where the 'i' subexpression is of type i32.
  // So the expression 'i + 1' has type i32, and at evaluation, the 'i' subexpression
  // evaluates to the i32 value stored in the memory locations for 'i' at the time
  // of evaluation.
  let one: i32 = i + 1;

  // Update the value in the locations referenced by 'i' so they hold the value 2.
  i = one + 1;

  // Update the value in the locations referenced by 'i' so they hold the value 5.
  // The evaluation of the right-hand-side occurs before the assignment takes effect.
  i = i + 3;
}
EXAMPLE: Returning a reference returns the value loaded via the reference
var<private> age: i32;
fn get_age() -> i32 {
  // The type of the expression in the return statement must be 'i32' since it
  // must match the declared return type of the function.
  // The 'age' expression is of type ref<private,i32,read_write>.
  // Apply the Load Rule, since the store type of the reference matches the
  // required type of the expression, and no other type rule applies.
  // The evaluation of 'age' in this context is the i32 value loaded from the
  // memory locations referenced by 'age' at the time the return statement is
  // executed.
  return age;
}

fn caller() {
  age = 21;
  // The copy_age constant will get the i32 value 21.
  let copy_age: i32 = get_age();
}

Defining pointers in this way enables two key use cases:

Note: The following examples use WGSL features explained later in this specification.

EXAMPLE: Using a pointer as a short name for part of a variable
struct Particle {
  position: vec3<f32>;
  velocity: vec3<f32>;
};
[[block]] struct System {
  active_index: i32;
  timestep: f32;
  particles: array<Particle,100>;
};
[[group(0), binding(0)]] var<storage,read_write> system: System;

[[stage(compute)]]
fn main() {
  // Form a pointer to a specific Particle in storage memory.
  let active_particle: ptr<storage,Particle> =
      &system.particles[system.active_index];

  let delta_position: vec3<f32> = (*active_particle).velocity * system.timestep;
  let current_position: vec3<f32>  = (*active_particle).position;
  (*active_particle).position = delta_position + current_position;
}
EXAMPLE: Using a pointer as a formal parameter
fn add_one(x: ptr<function,i32>) {
  // Update the locations for 'x' to contain the next higher integer value,
  // (or to wrap around to the largest negative i32 value).
  // On the left-hand side, unary '*' converts the pointer to a reference that
  // can then be assigned to. It has a read_write access mode, by default.
  // On the right-hand side:
  //    - Unary '*' converts the pointer to a reference, with a read_write
  //      access mode.
  //    - The only matching type rule is for addition (+) and requires '*x' to
  //      have type i32, which is the store type for '*x'.  So the Load Rule
  //      applies and '*x' evaluates to the value stored in the memory for '*x'
  //      at the time of evaluation, which is the i32 value for 0.
  //    - Add 1 to 0, to produce a final value of 1 for the right-hand side.
  // Store 1 into the memory for '*x'.
  *x = *x + 1;
}

[[stage(compute)]]
fn main() {
  var i: i32 = 0;

  // Modify the contents of 'i' so it will contain 1.
  // Use unary '&' to get a pointer value for 'i'.
  // This is a clear signal that the called function has access to the storage
  // for 'i', and may modify it.
  add_one(&i);
  let one: i32 = i;  // 'one' has value 1.
}

4.4.4. Forming reference and pointer values

A reference value is formed in one of the following ways:

In all cases, the access mode of the result is the same as the access mode of the original reference.

EXAMPLE: Component reference from a composite reference
struct S {
    age: i32;
    weight: f32;
};
var<private> person: S;
// Uses of 'person' denote the reference to the storage underlying the variable,
// and will have type ref<private,S,read_write>.

fn f() {
    var uv: vec2<f32>;
    // Uses of 'uv' denote the reference to the storage underlying the variable,
    // and will have type ref<function,vec2<f32>,read_write>.

    // Evaluate the left-hand side of the assignment:
    //   Evaluate 'uv.x' to yield a reference:
    //   1. First evaluate 'uv', yielding a reference to the storage for
    //      the 'uv' variable. The result has type ref<function,vec2<f32>,read_write>.
    //   2. Then apply the '.x' vector access phrase, yielding a reference to
    //      the storage for the first component of the vector pointed at by the
    //      reference value from the previous step.
    //      The result has type ref<function,f32,read_write>.
    // Evaluating the right-hand side of the assignment yields the f32 value 1.0.
    // Store the f32 value 1.0 into the storage memory locations referenced by uv.x.
    uv.x = 1.0;

    // Evaluate the left-hand side of the assignment:
    //   Evaluate 'uv[1]' to yield a reference:
    //   1. First evaluate 'uv', yielding a reference to the storage for
    //      the 'uv' variable. The result has type ref<function,vec2<f32>,read_write>.
    //   2. Then apply the '[1]' array index phrase, yielding a reference to
    //      the storage for second component of the vector referenced from
    //      the previous step.  The result has type ref<function,f32,read_write>.
    // Evaluating the right-hand side of the assignment yields the f32 value 2.0.
    // Store the f32 value 2.0 into the storage memory locations referenced by uv[1].
    uv[1] = 2.0;

    var m: mat3x2<f32>;
    // When evaluating 'm[2]':
    // 1. First evaluate 'm', yielding a reference to the storage for
    //    the 'm' variable. The result has type ref<function,mat3x2<f32>,read_write>.
    // 2. Then apply the '[2]' array index phrase, yielding a reference to
    //    the storage for the third column vector pointed at by the reference
    //    value from the previous step.
    //    Therefore the 'm[2]' expression has type ref<function,vec2<f32>,read_write>.
    // The 'let' declaration is for type vec2<f32>, so the declaration
    // statement requires the initializer to be of type vec2<f32>.
    // The Load Rule applies (because no other type rule can apply), and
    // the evaluation of the initializer yields the vec2<f32> value loaded
    // from the memory locations referenced by 'm[2]' at the time the declaration
    // is executed.
    let p_m_col2: vec2<f32> = m[2];

    var A: array<i32,5>;
    // When evaluating 'A[4]'
    // 1. First evaluate 'A', yielding a reference to the storage for
    //    the 'A' variable. The result has type ref<function,array<i32,5>,read_write>.
    // 2. Then apply the '[4]' array index phrase, yielding a reference to
    //    the storage for the fifth element of the array referenced by
    //    the reference value from the previous step.
    //    The result value has type ref<function,i32,read_write>.
    // The let declaration requires the right-hand-side to be of type i32.
    // The Load Rule applies (because no other type rule can apply), and
    // the evaluation of the initializer yields the i32 value loaded from
    // the memory locations referenced by 'A[5]' at the time the declaration
    // is executed.
    let A_4_value: i32 = A[4];

    // When evaluating 'person.weight'
    // 1. First evaluate 'person', yielding a reference to the storage for
    //    the 'person' variable declared at module scope.
    //    The result has type ref<private,S,read_write>.
    // 2. Then apply the '.weight' member access phrase, yielding a reference to
    //    the storage for the second member of the memory referenced by
    //    the reference value from the previous step.
    //    The result has type ref<private,f32,read_write>.
    // The let declaration requires the right-hand-side to be of type f32.
    // The Load Rule applies (because no other type rule can apply), and
    // the evaluation of the initializer yields the f32 value loaded from
    // the memory locations referenced by 'person.weight' at the time the
    // declaration is executed.
    let person_weight: f32 = person.weight;
}

A pointer value is formed in one of the following ways:

In all cases, the access mode of the result is the same as the access mode of the original pointer.

EXAMPLE: Pointer from a variable
// Declare a variable in the private storage class, for storing an f32 value.
var<private> x: f32;

fn f() {
    // Declare a variable in the function storage class, for storing an i32 value.
    var y: i32;

    // The name 'x' resolves to the module-scope variable 'x',
    // and has reference type ref<private,f32,read_write>.
    // Applying the unary '&' operator converts the reference to a pointer.
    // The access mode is the same as the access mode of the original variable, so
    // the fully specified type is ptr<private,f32,read_write>.  But read_write
    // is the default access mode for function storage class, so read_write does not
    // have to be spelled in this case
    let x_ptr: ptr<private,f32> = &x;

    // The name 'y' resolves to the function-scope variable 'y',
    // and has reference type ref<private,i32,read_write>.
    // Applying the unary '&' operator converts the reference to a pointer.
    // The access mode defaults to 'read_write'.
    let y_ptr: ptr<function,i32> = &y;

    // A new variable, distinct from the variable declared at module scope.
    var x: u32;

    // Here, the name 'x' resolves to the function-scope variable 'x' declared in
    // the previous statement, and has type ref<function,u32,read_write>.
    // Applying the unary '&' operator converts the reference to a pointer.
    // The acces mode defaults to 'read_write'.
    let inner_x_ptr: ptr<function,u32> = &x;
}

4.4.5. Comparison with references and pointers in other languages

This section is informative, not normative.

References and pointers in WGSL are more restricted than in other languages. In particular:

Note: From the above rules, it is not possible to form a "dangling" pointer, i.e. a pointer that does not reference the storage for a valid (or "live") originating variable.

4.5. Texture and Sampler Types

A texel is a scalar or vector used as the smallest independently accessible element of a texture. The word texel is short for texture element.

A texture is a collection of texels supporting special operations useful for rendering. In WGSL, those operations are invoked via texture builtin functions. See § 16.8 Texture built-in functions for a complete list.

A WGSL texture corresponds to a WebGPU GPUTexture.

A texture is either arrayed, or non-arrayed:

A texture has the following features:

texel format

The data in each texel. See § 4.5.1 Texel formats

dimensionality

The number of dimensions in the grid coordinates, and how the coordinates are interpreted. The number of dimensions is 1, 2, or 3. Most textures use cartesian coordinates. Cube textures have six square faces, and are sampled with a three dimensional coordinate interpreted as a direction vector from the origin toward the cube centered on the orgin.

size

The extent of grid coordinates along each dimension

mip level count

The mip level count is at least 1 for sampled textures, and equal to 1 for storage textures.
Mip level 0 contains a full size version of the texture. Each successive mip level contains a filtered version of the previous mip level at half the size (within rounding) of the previous mip level.
When sampling a texture, an explicit or implicitly-computed level-of-detail is used to select the mip levels from which to read texel data. These are then combined via filtering to produce the sampled value.

arrayed

whether the texture is arrayed

array size

the number of homogeneous grids, if the texture is arrayed

A texture’s representation is typically optimized for rendering operations. To achieve this, many details are hidden from the programmer, including data layouts, data types, and internal operations that cannot be expressed directly in the shader language.

As a consequence, a shader does not have direct access to the texel storage within a texture variable. Instead, access is mediated through an opaque handle:

TODO: update this wording to handle function parameters that are textures or samplers.

In this way, the set of supported operations for a texture type is determined by the availability of texture builtin functions accepting that texture type as the first parameter.

Note: The handle stored by a texture variable cannot be changed by the shader. That is, the variable is read-only, even if the underlying texture to which it provides access may be mutable (e.g. a write-only storage texture).

TODO: Describe the use of samplers, in the same broad framework.

4.5.1. Texel formats

In WGSL, certain texture types are parameterized by texel format.

A texel format is characterized by:

channels

Each channel contains a scalar. A texel format has up to four channels: r, g, b, and a, normally corresponding to the concepts of red, green, blue, and alpha channels.

channel format

The number of bits in the channel, and how those bits are interpreted.

Each texel format in WGSL corresponds to a WebGPU GPUTextureFormat with the same name.

Only certain texel formats are used in WGSL source code. The channel formats used to define those texel formats are listed in the Channel Formats table. The last column specfies the conversion from the stored channel bits to the value used in the shader. This is also known as the channel transfer function, or CTF.

Channel Formats
Channel format Number of stored bits Interpetation of stored bits Shader type Shader value (Channel Transfer Function)
8unorm 8 unsigned integer v ∈ {0,...,255} f32 v ÷ 255
8snorm 8 signed integer v ∈ {-128,...,127} f32 max(-1, v ÷ 127)
8uint 8 unsigned integer v ∈ {0,...,255} u32 v
8sint 8 signed integer v ∈ {-128,...,127} i32 v
16uint 16 unsigned integer v ∈ {0,...,65535} u32 v
16sint 16 signed integer v ∈ {-32768,...,32767} i32 v
16float 16 IEEE-754 binary16 16-bit floating point value v, with 1 sign bit, 5 exponent bits, 10 mantissa bits f32 v
32uint 32 32-bit unsigned integer value v u32 v
32sint 32 32-bit signed integer value v i32 v
32float 32 IEEE-754 binary32 32-bit floating point value v f32 v

The texel formats listed in the Texel Formats for Storage Textures table correspond to the WebGPU plain color formats which support the WebGPU STORAGE usage. These texel formats are used to parameterize the storage texture types defined in § 4.5.5 Storage Texture Types.

When the texel format does not have all four channels, then:

The last column in the table below uses the format-specific channel transfer function from the channel formats table.

Texel Formats for Storage Textures
Texel format Channel format Channels in memory order Corresponding shader value
rgba8unorm 8unorm r, g, b, a vec4<f32>(CTF(r), CTF(g), CTF(b), CTF(a))
rgba8snorm 8snorm r, g, b, a vec4<f32>(CTF(r), CTF(g), CTF(b), CTF(a))
rgba8uint 8uint r, g, b, a vec4<u32>(CTF(r), CTF(g), CTF(b), CTF(a))
rgba8sint 8sint r, g, b, a vec4<i32>(CTF(r), CTF(g), CTF(b), CTF(a))
rgba16uint 16uint r, g, b, a vec4<u32>(CTF(r), CTF(g), CTF(b), CTF(a))
rgba16sint 16sint r, g, b, a vec4<i32>(CTF(r), CTF(g), CTF(b), CTF(a))
rgba16float 16float r, g, b, a vec4<f32>(CTF(r), CTF(g), CTF(b), CTF(a))
r32uint 32uint r vec4<u32>(CTF(r), 0u, 0u, 1u)
r32sint 32sint r vec4<i32>(CTF(r), 0, 0, 1)
r32float 32float r vec4<f32>(CTF(r), 0.0, 0.0, 1.0)
rg32uint 32uint r, g vec4<u32>(CTF(r), CTF(g), 0.0, 1.0)
rg32sint 32sint r, g vec4<i32>(CTF(r), CTF(g), 0.0, 1.0)
rg32float 32float r, g vec4<f32>(CTF(r), CTF(g), 0.0, 1.0)
rgba32uint 32uint r, g, b, a vec4<u32>(CTF(r), CTF(g), CTF(b), CTF(a))
rgba32sint 32sint r, g, b, a vec4<i32>(CTF(r), CTF(g), CTF(b), CTF(a))
rgba32float 32float r, g, b, a vec4<f32>(CTF(r), CTF(g), CTF(b), CTF(a))

The following table lists the correspondence between WGSL texel formats and SPIR-V image formats.

Mapping texel formats to SPIR-V
Texel format SPIR-V Image Format SPIR-V Enabling Capability
rgba8unorm Rgba8 Shader
rgba8snorm Rgba8Snorm Shader
rgba8uint Rgba8ui Shader
rgba8sint Rgba8i Shader
rgba16uint Rgba16ui Shader
rgba16sint Rgba16i Shader
rgba16float Rgba16f Shader
r32uint R32ui Shader
r32sint R32i Shader
r32float R32f Shader
rg32uint Rg32ui StorageImageExtendedFormats
rg32sint Rg32i StorageImageExtendedFormats
rg32float Rg32f StorageImageExtendedFormats
rgba32uint Rgba32ui Shader
rgba32sint Rgba32i Shader
rgba32float Rgba32f Shader

4.5.2. Sampled Texture Types

texture_1d<type>
  %1 = OpTypeImage %type 1D 0 0 0 1 Unknown

texture_2d<type>
  %1 = OpTypeImage %type 2D 0 0 0 1 Unknown

texture_2d_array<type>
  %1 = OpTypeImage %type 2D 0 1 0 1 Unknown

texture_3d<type>
  %1 = OpTypeImage %type 3D 0 0 0 1 Unknown

texture_cube<type>
  %1 = OpTypeImage %type Cube 0 0 0 1 Unknown

texture_cube_array<type>
  %1 = OpTypeImage %type Cube 0 1 0 1 Unknown

4.5.3. Multisampled Texture Types

texture_multisampled_2d<type>
  %1 = OpTypeImage %type 2D 0 0 1 1 Unknown

4.5.4. External Sampled Texture Types

texture_external

texture_external is an opaque 2d float-sampled texture type similar to texture_2d<f32> but potentially with a different representation. It can be read using textureLoad or textureSampleLevel, which handle these different representations opaquely.

See [GPUExternalTexture](https://gpuweb.github.io/gpuweb/#gpu-external-texture) for details.

4.5.5. Storage Texture Types

A storage texture supports accessing a single texel without the use of a sampler.

A storage texture type must be parameterized by one of the texel formats for storage textures. The texel format determines the conversion function as specified in § 4.5.1 Texel formats.

For a write-only storage texture the inverse of the conversion function is used to convert the shader value to the stored texel.

See § 16.8 Texture built-in functions.

TODO(dneto): Move description of the conversion to the builtin function that actually does the reading.

texture_storage_1d<texel_format,access>
  // %1 = OpTypeImage sampled_type 1D 0 0 0 2 image_format

texture_storage_2d<texel_format,access>
  // %1 = OpTypeImage sampled_type 2D 0 0 0 2 image_format

texture_storage_2d_array<texel_format,access>
  // %1 = OpTypeImage sampled_type 2D 0 1 0 2 image_format

texture_storage_3d<texel_format,access>
  // %1 = OpTypeImage sampled_type 3D 0 0 0 2 image_format

In the SPIR-V mapping:

When mapping to SPIR-V, a write-only storage texture variable must have a NonReadable decoration.

For example:

EXAMPLE: Mapping a readable texture_storage_1d variable to SPIR-V
var tbuf: texture_storage_1d<rgba8unorm,read>;

// Maps to the following SPIR-V:
//  OpDecorate %tbuf NonWritable
//  ...
//  %float = OpTypeFloat 32
//  %image_type = OpTypeImage %float 1D 0 0 0 2 Rgba8
//  %image_ptr_type = OpTypePointer UniformConstant %image_type
//  %tbuf = OpVariable %image_ptr_type UniformConstant
EXAMPLE: Mapping a writable texture_storage_1d variable to SPIR-V
var tbuf: texture_storage_1d<rgba8unorm,write>;

// Maps to the following SPIR-V:
//  OpDecorate %tbuf NonReadable
//  ...
//  %float = OpTypeFloat 32
//  %image_type = OpTypeImage %float 1D 0 0 0 2 Rgba8
//  %image_ptr_type = OpTypePointer UniformConstant %image_type
//  %tbuf = OpVariable %image_ptr_type UniformConstant

4.5.6. Depth Texture Types

texture_depth_2d
  %1 = OpTypeImage %f32 2D 1 0 0 1 Unknown

texture_depth_2d_array
  %1 = OpTypeImage %f32 2D 1 1 0 1 Unknown

texture_depth_cube
  %1 = OpTypeImage %f32 Cube 1 0 0 1 Unknown

texture_depth_cube_array
  %1 = OpTypeImage %f32 Cube 1 1 0 1 Unknown

4.5.7. Sampler Type

A sampler mediates access to a sampled texture or a depth texture, by performing a combination of:

Type Description
sampler Sampler. Mediates access to a sampled texture.
sampler_comparison Comparison sampler. Mediates access to a depth texture.

Samplers are parameterized when created in the WebGPU API. They cannot be modified by a WGSL program.

Samplers can only be used by the texture builtin functions.

sampler
  OpTypeSampler

sampler_comparison
  OpTypeSampler

4.5.8. Texture Types Grammar

TODO: Add texture usage validation rules.
texture_sampler_types
  : sampler_type
  | depth_texture_type
  | sampled_texture_type LESS_THAN type_decl GREATER_THAN
  | multisampled_texture_type LESS_THAN type_decl GREATER_THAN
  | storage_texture_type LESS_THAN texel_format COMMA access_mode GREATER_THAN

sampler_type
  : SAMPLER
  | SAMPLER_COMPARISON

sampled_texture_type
  : TEXTURE_1D
  | TEXTURE_2D
  | TEXTURE_2D_ARRAY
  | TEXTURE_3D
  | TEXTURE_CUBE
  | TEXTURE_CUBE_ARRAY

multisampled_texture_type
  : TEXTURE_MULTISAMPLED_2D

storage_texture_type
  : TEXTURE_STORAGE_1D
  | TEXTURE_STORAGE_2D
  | TEXTURE_STORAGE_2D_ARRAY
  | TEXTURE_STORAGE_3D

depth_texture_type
  : TEXTURE_DEPTH_2D
  | TEXTURE_DEPTH_2D_ARRAY
  | TEXTURE_DEPTH_CUBE
  | TEXTURE_DEPTH_CUBE_ARRAY

texel_format
  : R8UNORM
     R8  -- Capability: StorageImageExtendedFormats
  | R8SNORM
     R8Snorm  -- Capability: StorageImageExtendedFormats
  | R8UINT
     R8ui  -- Capability: StorageImageExtendedFormats
  | R8SINT
     R8i  -- Capability: StorageImageExtendedFormats
  | R16UINT
     R16ui  -- Capability: StorageImageExtendedFormats
  | R16SINT
     R16i  -- Capability: StorageImageExtendedFormats
  | R16FLOAT
     R16f  -- Capability: StorageImageExtendedFormats
  | RG8UNORM
     Rg8  -- Capability: StorageImageExtendedFormats
  | RG8SNORM
     Rg8Snorm  -- Capability: StorageImageExtendedFormats
  | RG8UINT
     Rg8ui  -- Capability: StorageImageExtendedFormats
  | RG8SINT
     Rg8i  -- Capability: StorageImageExtendedFormats
  | R32UINT
     R32ui
  | R32SINT
     R32i
  | R32FLOAT
     R32f
  | RG16UINT
     Rg16ui  -- Capability: StorageImageExtendedFormats
  | RG16SINT
     Rg16i  -- Capability: StorageImageExtendedFormats
  | RG16FLOAT
     Rg16f  -- Capability: StorageImageExtendedFormats
  | RGBA8UNORM
     Rgba8
  | RGBA8UNORM-SRGB
     ???
  | RGBA8SNORM
     Rgba8Snorm
  | RGBA8UINT
     Rgba8ui
  | RGBA8SINT
     Rgba8i
  | BGRA8UNORM
     Rgba8  ???
  | BGRA8UNORM-SRGB
     ???
  | RGB10A2UNORM
     Rgb10A2  -- Capability: StorageImageExtendedFormats
  | RG11B10FLOAT
     R11fG11fB10f  -- Capability: StorageImageExtendedFormats
  | RG32UINT
     Rg32ui  -- Capability: StorageImageExtendedFormats
  | RG32SINT
     Rg32i  -- Capability: StorageImageExtendedFormats
  | RG32FLOAT
     Rg32f  -- Capability: StorageImageExtendedFormats
  | RGBA16UINT
     Rgba16ui
  | RGBA16SINT
     Rgba16i
  | RGBA16FLOAT
     Rgba16f
  | RGBA32UINT
     Rgba32ui
  | RGBA32SINT
     Rgba32i
  | RGBA32FLOAT
     Rgba32f

4.6. Type Aliases TODO

type_alias
  : TYPE IDENT EQUAL type_decl
EXAMPLE: Type Alias
type Arr = array<i32, 5>;

type RTArr = [[stride(16)]] array<vec4<f32>>;

4.7. Type Declaration Grammar

type_decl
  : IDENT
  | BOOL
  | FLOAT32
  | INT32
  | UINT32
  | VEC2 LESS_THAN type_decl GREATER_THAN
  | VEC3 LESS_THAN type_decl GREATER_THAN
  | VEC4 LESS_THAN type_decl GREATER_THAN
  | POINTER LESS_THAN storage_class COMMA type_decl (COMMA access_mode)? GREATER_THAN
  | attribute_list* ARRAY LESS_THAN type_decl (COMMA INT_LITERAL)? GREATER_THAN
  | MAT2x2 LESS_THAN type_decl GREATER_THAN
  | MAT2x3 LESS_THAN type_decl GREATER_THAN
  | MAT2x4 LESS_THAN type_decl GREATER_THAN
  | MAT3x2 LESS_THAN type_decl GREATER_THAN
  | MAT3x3 LESS_THAN type_decl GREATER_THAN
  | MAT3x4 LESS_THAN type_decl GREATER_THAN
  | MAT4x2 LESS_THAN type_decl GREATER_THAN
  | MAT4x3 LESS_THAN type_decl GREATER_THAN
  | MAT4x4 LESS_THAN type_decl GREATER_THAN
  | ATOMIC LESS_THAN type_decl GREATER_THAN
  | texture_sampler_types

When the type declaration is an identifer, then the expression must be in scope of a declaration of the identifier as a type alias or structure type.

EXAMPLE: Type Declarations
identifier
  Allows to specify types created by the type command

bool
   %1 = OpTypeBool

f32
   %2 = OpTypeFloat 32

i32
   %3 = OpTypeInt 32 1

u32
   %4 = OpTypeInt 32 0

vec2<f32>
    %7 = OpTypeVector %float 2

array<f32, 4>
   %uint_4 = OpConstant %uint 4
        %9 = OpTypeArray %float %uint_4

[[stride(32)]] array<f32, 4>
             OpDecorate %9 ArrayStride 32
   %uint_4 = OpConstant %uint 4
        %9 = OpTypeArray %float %uint_4

array<f32>
   %rtarr = OpTypeRuntimeArray %float

mat2x3<f32>
   %vec = OpTypeVector %float 3
     %6 = OpTypeMatrix %vec 2
EXAMPLE: Access modes for buffers
// Storage buffers
[[group(0), binding(0)]]
var<storage,read> buf1: Buffer;       // Can read, cannot write.
[[group(0), binding(0)]]
var<storage> buf2: Buffer;            // Can read, cannot write.
[[group(0), binding(1)]
var<storage,read_write> buf3: Buffer; // Can both read and write.

// Uniform buffer. Always read-only, and has more restrictive layout rules.
struct ParamsTable {};
[[group(0), binding(2)]]
var<uniform> params: ParamsTable;     // Can read, cannot write.

5. var and let

A let declaration specifies a name for a value. Once the value for a let-declaration is computed, it is immutable. When an identifier use resolves to a let-declaration, the identifier denotes that value.

When a let identifier is declared without an explicitly specified type, e.g. let foo = 4, the type is automatically inferred from the expression to the right of the equals token (=). When the type is specified, e.g let foo: i32 = 4, the initializer expression must evaluate to that type.

Some rules about let-declarations depend on where the declaration appears. See § 5.2 Module Constants and § 5.3 Function Scope Variables and Constants.

EXAMPLE: let-declared constants at module scope
// 'blockSize' denotes the i32 value 1024.
let blockSize: i32 = 1024;

// 'row_size' denotes the u32 value 16u.  The type is inferred.
let row_size = 16u;

A variable is a named reference to storage that can contain a value of a particular storable type.

Two types are associated with a variable: its store type (the type of value that may be placed in the referenced storage) and its reference type (the type of the variable itself). If a variable has store type T and storage class S, then its reference type is ref<S,T>.

A variable declaration:

When an identifier use resolves to a variable declaration, the identifer is an expression denoting the reference memory view for the variable’s storage, and its type is the variable’s reference type. See § 6.13 Variable Identifier Expression.

See § 5.1 Module Scope Variables and § 5.3 Function Scope Variables and Constants for rules about where a variable in a particular storage class can be declared, and when the storage class decoration is required, optional, or forbidden.

The access mode always has a default, and except for variables in the storage storage class, must not be written. See § 4.4.1 Access Mode Defaults.

variable_statement
  : variable_decl
  | variable_decl EQUAL short_circuit_or_expression
  | LET (IDENT | variable_ident_decl) EQUAL short_circuit_or_expression

variable_decl
  : VAR variable_qualifier? variable_ident_decl

variable_ident_decl
  : IDENT COLON attribute_list* type_decl

variable_qualifier
  : LESS_THAN storage_class ( COMMA access_mode )? GREATER_THAN

The lifetime of a variable is the period during shader execution for which the variable exists. The lifetime of a module scope variable is the entire execution of the shader stage.

For a function scope variable, each invocation has its own indepdendent version of the variable. The lifetime of the variable is determined by its scope:

Two variables with overlapping lifetimes will not have overlapping storage. When a variable’s lifetime ends, its storage may be used for another variable.

When a variable is created, its storage contains an initial value as follows:

Consider the following snippet of WGSL:

EXAMPLE: Variable initial values
var i: i32;         // Initial value is 0.  Not recommended style.
loop {
  var twice: i32 = 2 * i;   // Re-evaluated each iteration.
  i = i + 1;
  if (i == 5) { break; }
}
The loop body will execute five times. Variable i will take on values 0, 1, 2, 3, 4, 5, and variable twice will take on values 0, 2, 4, 6, 8.

Consider the following snippet of WGSL:

EXAMPLE: Reading a variable multiple times
var x: f32 = 1.0;
let y = x * x + x + 1;
Because x is a variable, all accesses to it turn into load and store operations. If this snippet was compiled to SPIR-V, it would be represented as
EXAMPLE: Sample translation for reading a variable multiple times
%temp_1 = OpLoad %float %x
%temp_2 = OpLoad %float %x
%temp_3 = OpFMul %float %temp_1 %temp_2
%temp_4 = OpLoad %float %x
%temp_5 = OpFAdd %float %temp_3 %temp_4
%y      = OpFAdd %float %temp_5 %one
However, it is expected that either the browser or the driver optimizes this intermediate representation such that the redundant loads are eliminated.

5.1. Module Scope Variables

A variable declared outside all functions is at module scope. The variable name is available for use immediately after its declaration statement, until the end of the program.

Variables at module scope are restricted as follows:

A variable in the uniform storage class is a uniform buffer variable. Its store type must be a host-shareable structure type with block attribute, satisfying the storage class layout constraints.

A variable in the storage storage class is a storage buffer variable. Its store type must be a host-shareable structure type with block attribute, satisfying the storage class layout constraints. It may be declared with a read or write access mode; the default is read.

As described in § 9.3.2 Resource interface, uniform buffers, storage buffers, textures, and samplers form the resource interface of a shader. Such variables are declared with group and binding decorations.

EXAMPLE: Module scope variable declarations
var<private> decibels: f32;
var<workgroup> worklist: array<i32,10>;

[[block]] struct Params {
  specular: f32;
  count: i32;
};
[[group(0), binding(2)]]
var<uniform> param: Params;    // A uniform buffer

[[block]] struct PositionsBuffer {
  pos: array<vec2<f32>>;
};
// A storage buffer, for reading and writing
[[group(0), binding(0)]]
var<storage,read_write> pbuf: PositionsBuffer;

// Textures and samplers are always in "handle" storage.
[[group(0), binding(1)]]
var filter_params: sampler;
global_variable_decl
  : attribute_list* variable_decl (EQUAL const_expr)?
EXAMPLE: Variable Decorations
[[group(4), binding(3)]]
   OpDecorate %variable DescriptorSet 4
   OpDecorate %variable Binding 3

WGSL defines the following attributes that can be applied to global variables:

5.2. Module Constants

A let-declaration appearing outside all functions declares a module-scope constant. The name is available for use after the end of the declaration, until the end of the WGSL program.

A module-scope let-declared constant must be of constructible type.

When the declaration has no attributes, an initializer expression must be present, and the name denotes the value of that expression.

EXAMPLE: Module constants
// The golden ratio.
let golden: f32 = 1.61803398875;

// The second unit vector for three dimensions, with inferred type.
let e2 = vec3<i32>(0,1,0);

When the declaration uses the override attribute, the constant is pipeline-overridable. In this case:

EXAMPLE: Module constants, pipeline-overrideable
[[override(0)]]    let has_point_light: bool = true;  // Algorithmic control
[[override(1200)]] let specular_param: f32 = 2.3;     // Numeric control
[[override(1300)]] let gain: f32;                     // Must be overridden
[[override]]       let width: f32 = 0.0;              // Specifed at the API level using
                                                      // the name "width".
[[override]]       let depth: f32;                    // Specifed at the API level using
                                                      // the name "depth".
                                                      // Must be overridden.

When a variable or feature is used within control flow that depends on the value of a constant, then that variable or feature is considered to be used by the program. This is true regardless of the value of the constant, whether that value is the one from the constant’s declaration or from a pipeline override.

global_constant_decl
  : attribute_list* LET variable_ident_decl global_const_initializer?

global_const_initializer
  : EQUAL const_expr

const_expr
  : type_decl PAREN_LEFT ((const_expr COMMA)* const_expr COMMA?)? PAREN_RIGHT
  | const_literal
EXAMPLE: Constants
-1
   %a = OpConstant %int -1

2
   %b = OpConstant %uint 2

3.2
   %c = OpConstant %float 3.2

true
    %d = OpConstantTrue

false
    %e = OpConstant False

vec4<f32>(1.2, 2.3, 3.4, 2.3)
    %f0 = OpConstant %float 1.2
    %f1 = OpConstant %float 2.3
    %f2 = OpConstant %float 3.4
     %f = OpConstantComposite %v4float %f0 %f1 %f2 %f1

The WebGPU pipeline creation API must specify how API-supplied values are mapped to shader scalar values. For booleans, I suggest using a 32-bit integer, where only 0 maps to false. If WGSL gains non-32-bit numeric scalars, I recommend overridable constants continue being 32-bit numeric types.

5.3. Function Scope Variables and Constants

A variable or constant declared in a declaration statement in a function body is in function scope. The name is available for use immediately after its declaration statement, and until the end of the brace-delimited list of statements immediately enclosing the declaration.

A function-scope let-declared constant must be of constructible type, or of pointer type.

For a variable declared in function scope:

EXAMPLE: Function scope variables and constants
fn f() {
   var<function> count: u32;  // A variable in function storage class.
   var delta: i32;            // Another variable in the function storage class.
   var sum: f32 = 0.0;        // A function storage class variable with initializer.
   var pi = 3.14159;          // Infer the f32 store type from the initializer.
   let unit: i32 = 1;         // Let-declared constants don’t use a storage class.
}

A variable or constant declared in the first clause of a for statement is available for use in the second and third clauses and in the body of the for statement.

An instance of a function scope variable is a dynamic context. Each variable that is in scope for some invocation has an overlapping lifetime and, therefore, has non-overlapping storage. Variables with non-overlapping lifetimes may reuse the storage of previous variables; however, new instances of the same variable are not guaranteed to use the same storage.

5.4. Never-alias assumption TODO

6. Expressions TODO

6.1. Literal Expressions TODO

Scalar literal type rules
Precondition Conclusion Notes
true: bool OpConstantTrue %bool
false: bool OpConstantFalse %bool
INT_LITERAL: i32 OpConstant %int literal
UINT_LITERAL: u32 OpConstant %uint literal
FLOAT_LITERAL: f32 OpConstant %float literal

6.2. Parenthesized Expressions

Parenthesized expression type rules
Precondition Conclusion Description
e : T ( e ) : T Evaluates to e.
Use parentheses to isolate an expression from the surrounding text.

6.3. Type Constructor Expressions

Type constructor expressions explicitly create a value of a given type.

The scalar forms are redundant, but provide symmetry with scalar conversion expressions, and can be used to enhance readability.

The vector forms construct vector values from various combinations of components and subvectors with matching component types.

See also § 6.4 Zero Value Expressions and § 6.5 Conversion Expressions.

Scalar constructor type rules
Precondition Conclusion Notes
e: bool bool(e): bool Identity.
In the SPIR-V translation, the ID of this expression reuses the ID of the operand.
e: i32 i32(e): i32 Identity.
In the SPIR-V translation, the ID of this expression reuses the ID of the operand.
e: u32 u32(e): u32 Identity.
In the SPIR-V translation, the ID of this expression reuses the ID of the operand.
e: f32 f32(e): f32 Identity.
In the SPIR-V translation, the ID of this expression reuses the ID of the operand.
Vector constructor type rules, where T is a scalar type
Precondition Conclusion Notes
e: T vecN<T>(e): vecN<T> Evaluates e once. Results in the N-element vector where each component has the value of e.
e1: T
e2: T
vec2<T>(e1,e2): vec2<T> OpCompositeConstruct
e: vec2<T> vec2<T>(e): vec2<T> Identity. The result is e.
e1: T
e2: T
e3: T
vec3<T>(e1,e2,e3): vec3<T> OpCompositeConstruct
e1: T
e2: vec2<T>
vec3<T>(e1,e2): vec3<T>
vec3<T>(e2,e1): vec3<T>
OpCompositeConstruct
e: vec3<T> vec3<T>(e): vec3<T> Identity. The result is e.
e1: T
e2: T
e3: T
e4: T
vec4<T>(e1,e2,e3,e4): vec4<T> OpCompositeConstruct
e1: T
e2: T
e3: vec2<T>
vec4<T>(e1,e2,e3): vec4<T>
vec4<T>(e1,e3,e2): vec4<T>
vec4<T>(e3,e1,e2): vec4<T>
OpCompositeConstruct
e1: vec2<T>
e2: vec2<T>
vec4<T>(e1,e2): vec4<T> OpCompositeConstruct
e1: T
e2: vec3<T>
vec4<T>(e1,e2): vec4<T>
vec4<T>(e2,e1): vec4<T>
OpCompositeConstruct
e: vec4<T> vec4<T>(e): vec4<T> Identity. The result is e.
Matrix constructor type rules
Precondition Conclusion Notes
e1: vec2<f32>
e2: vec2<f32>
e3: vec2<f32>
e4: vec2<f32>
mat2x2<f32>(e1,e2): mat2x2<f32>
mat3x2<f32>(e1,e2,e3): mat3x2<f32>
mat4x2<f32>(e1,e2,e3,e4): mat4x2<f32>
Column by column construction.
OpCompositeConstruct
e1: vec3<f32>
e2: vec3<f32>
e3: vec3<f32>
e4: vec3<f32>
mat2x3<f32>(e1,e2): mat2x3<f32>
mat3x3<f32>(e1,e2,e3): mat3x3<f32>
mat4x3<f32>(e1,e2,e3,e4): mat4x3<f32>
Column by column construction.
OpCompositeConstruct
e1: vec4<f32>
e2: vec4<f32>
e3: vec4<f32>
e4: vec4<f32>
mat2x4<f32>(e1,e2): mat2x4<f32>
mat3x4<f32>(e1,e2,e3): mat3x4<f32>
mat4x4<f32>(e1,e2,e3,e4): mat4x4<f32>
Column by column construction.
OpCompositeConstruct
Array constructor type rules
Precondition Conclusion Notes
e1: T
...
eN: T,
T is a constructible type.
array<T,N>(e1,...,eN) : array<T,N> Construction of an array from elements.
Structure constructor type rules
Precondition Conclusion Notes
e1: T1
...
eN: TN,
S is a constructible structure type with members having types T1 ... TN.
The expression is in the scope of declaration of S.
S(e1,...,eN): S Construction of a structure from members.

6.4. Zero Value Expressions

Each constructible T has a unique zero value written in WGSL as the type followed by an empty pair of parentheses: T ().

The zero values are as follows:

Note: WGSL does not have zero expression for atomic types, runtime-sized arrays, or other types that are not constructible.

Scalar zero value type rules
Precondition Conclusion Notes
bool(): bool false
Zero value (OpConstantNull for bool)
i32(): i32 0
Zero value (OpConstantNull for i32)
u32(): u32 0u
Zero value (OpConstantNull for u32)
f32(): f32 0.0
Zero value (OpConstantNull for f32)
Vector zero type rules, where T is a scalar type
Precondition Conclusion Notes
vec2<T>(): vec2<T> Zero value (OpConstantNull)
vec3<T>(): vec3<T> Zero value (OpConstantNull)
vec4<T>(): vec4<T> Zero value (OpConstantNull)
EXAMPLE: Zero-valued vectors
vec2<f32>()                 // The zero-valued vector of two f32 elements.
vec2<f32>(0.0, 0.0)         // The same value, written explicitly.

vec3<i32>()                 // The zero-valued vector of three i32 elements.
vec3<i32>(0, 0, 0)          // The same value, written explicitly.
Matrix zero type rules
Precondition Conclusion Notes
mat2x2<f32>(): mat2x2<f32>
mat3x2<f32>(): mat3x2<f32>
mat4x2<f32>(): mat4x2<f32>
Zero value (OpConstantNull)
mat2x3<f32>(): mat2x3<f32>
mat3x3<f32>(): mat3x3<f32>
mat4x3<f32>(): mat4x3<f32>
Zero value (OpConstantNull)
mat2x4<f32>(): mat2x4<f32>
mat3x4<f32>(): mat3x4<f32>
mat4x4<f32>(): mat4x4<f32>
Zero value (OpConstantNull)
Array zero type rules
Precondition Conclusion Notes
T is a constructible array<T,N>(): array<T,N> Zero-valued array (OpConstantNull)
EXAMPLE: Zero-valued arrays
array<bool, 2>()               // The zero-valued array of two booleans.
array<bool, 2>(false, false)   // The same value, written explicitly.
Structure zero type rules
Precondition Conclusion Notes
S is a constructible structure type.
The expression is in the scope of declaration of S.
S(): S Zero-valued structure: a structure of type S where each member is the zero value for its member type.
(OpConstantNull)
EXAMPLE: Zero-valued structures
struct Student {
  grade: i32;
  GPA: f32;
  attendance: array<bool,4>;
};

fn func() {
  var s: Student;

  // The zero value for Student
  s = Student();

  // The same value, written explicitly.
  s = Student(0, 0.0, array<bool,4>(false, false, false, false));

  // The same value, written with zero-valued members.
  s = Student(i32(), f32(), array<bool,4>());
}

6.5. Conversion Expressions

WGSL does not implicitly convert or promote a numeric or boolean value to another type. Instead use conversion expressions as defined in the tables below.

See also § 6.3 Type Constructor Expressions.

Scalar conversion type rules
Precondition Conclusion Notes
e: u32 bool(e): bool Coercion to boolean.
The result is false if e is 0, and true otherwise.
(Use OpINotEqual to compare e against 0.)
e: i32 bool(e): bool Coercion to boolean.
The result is false if e is 0, and true otherwise.
(Use OpINotEqual to compare e against 0.)
e: f32 bool(e): bool Coercion to boolean.
The result is false if e is 0.0 or -0.0, and true otherwise. In particular NaN and infinity values map to true.
(Use OpFUnordNotEqual to compare e against 0.0.)
e: bool i32(e): i32 Conversion of a boolean value to a signed integer
The result is 1 if e is true and 0 otherwise.
(Use OpSelect with e as the condition, selecting between 1 and 0.)
e: u32 i32(e): i32 Reinterpretation of bits.
The result is the unique value in i32 that is equal to (e mod 232).
(OpBitcast)
e: f32 i32(e): i32 Value conversion, including invalid cases. (OpConvertFToS)
e: bool u32(e): u32 Conversion of a boolean value to an unsigned integer
The result is 1u if e is true and 0u otherwise.
(Use OpSelect with e as the condition, selecting between 1u and 0u.)
e: i32 u32(e): u32 Reinterpretation of bits.
The result is the unique value in u32 that is equal to (e mod 232).
(OpBitcast)
e: f32 u32(e): u32 Value conversion, including invalid cases. (OpConvertFToU)
e: bool f32(e): f32 Conversion of a boolean value to floating point
The result is 1.0 if e is true and 0.0 otherwise.
(Use OpSelect with e as the condition, selecting between 1.0 and 0.0.)
e: i32 f32(e): f32 Value conversion, including invalid cases. (OpConvertSToF)
e: u32 f32(e): f32 Value conversion, including invalid cases. (OpConvertUToF)

Details of conversion to and from floating point are explained in § 12.5.2 Floating point conversion.

Vector conversion type rules
Precondition Conclusion Notes
e: vecN<u32> vecN<bool>(e): vecN<bool> Component-wise coercion of a unsigned integer vector to a boolean vector.
(OpINotEqual to compare e against a zero vector.)
e: vecN<i32> vecN<bool>(e): vecN<bool> Component-wise coercion of a signed integer vector to a boolean vector.
(OpINotEqual to compare e against a zero vector.)
e: vecN<f32> vecN<bool>(e): vecN<bool> Component-wise coercion of a floating point vector to a boolean vector.
(OpFUnordNotEqual to compare e against a zero vector.)
e: vecN<bool> vecN<i32>(e): vecN<i32> Component-wise conversion of a boolean vector to signed.
Component i of the result is i32(e[i])
(OpSelect, with e as the condition, selecting between vecN(1) and vecN(0).)
e: vecN<u32> vecN<i32>(e): vecN<i32> Component-wise reinterpretation of bits.
Component i of the result is i32(e[i])
(OpBitcast)
e: vecN<f32> vecN<i32>(e): vecN<i32> Component-wise value conversion to signed integer, including invalid cases.
(OpConvertFToS)
e: vecN<bool> vecN<u32>(e): vecN<u32> Component-wise conversion of a boolean vector to unsigned.
Component i of the result is u32(e[i])
(OpSelect, with e as the condition, selecting between vecN(1u) and vecN(0u).)
e: vecN<i32> vecN<u32>(e): vecN<u32> Component-wise reinterpretation of bits.
(OpBitcast)
e: vecN<f32> vecN<u32>(e): vecN<u32> Component-wise value conversion to unsigned integer, including invalid cases.
(OpConvertFToU)
e: vecN<bool> vecN<u32>(e): vecN<u32> Component-wise conversion of a boolean vector to floating point.
Component i of the result is f32(e[i])
(OpSelect, with e as the condition, selecting between vecN(1.0) and vecN(0.0).)
e: vecN<i32> vecN<f32>(e): vecN<f32> Component-wise value conversion to floating point, including invalid cases.
(OpConvertSToF)
e: vecN<u32> vecN<f32>(e): vecN<f32> Component-wise value conversion to floating point, including invalid cases.
(ConvertUToF)

6.6. Reinterpretation of Representation Expressions

A bitcast expression is used to reinterpet the bit representation of a value in one type as a value in another type.

Bitcast type rules
Precondition Conclusion Notes
e: T
T is a numeric scalar or numeric vector type
bitcast<T>(e): T Identity transform. Component-wise when T is a vector.
The result is e.
In the SPIR-V translation, the ID of this expression reuses the ID of the operand.
e: T1
T1 is a numeric scalar or numeric vector type
T2 is not T1 and is a numeric scalar type if T1 is a scalar, or
a numeric vector type if T1 is a vector
bitcast<T2>(e): T2 Reinterpretation of bits as T2. Component-wise when T1 is a vector.
The result is the reinterpretation of the bits in e as a T2 value.
(OpBitcast)

6.7. Composite Value Decomposition Expressions

6.7.1. Vector Access Expression

Accessing members of a vector can be done either using array subscripting (e.g. a[2]) or using a sequence of convenience names, each mapping to an element of the source vector.

The convenience names are accessed using the . notation. (e.g. color.bgra).

NOTE: the convenience letterings can not be mixed. (i.e. you can not use rybw).

Using a convenience letter, or array subscript, which accesses an element past the end of the vector is an error.

The convenience letterings can be applied in any order, including duplicating letters as needed. You can provide 1 to 4 letters when extracting components from a vector. Providing more then 4 letters is an error.

The result type depends on the number of letters provided. Assuming a vec4<f32>

Accessor Result type
r f32
rg vec2<f32>
rgb vec3<f32>
rgba vec4<f32>
var a: vec3<f32> = vec3<f32>(1., 2., 3.);
var b: f32 = a.y;          // b = 2.0
var c: vec2<f32> = a.bb;   // c = (3.0, 3.0)
var d: vec3<f32> = a.zyx;  // d = (3.0, 2.0, 1.0)
var e: f32 = a[1];         // e = 2.0
6.7.1.1. Vector single component selection
Vector decomposition: single component selection
Precondition Conclusion Description
e: vecN<T>
e.x: T
e.r: T
Select the first component of e
(OpCompositeExtract with selection index 0)
e: vecN<T>
e.y: T
e.g: T
Select the second component of e
(OpCompositeExtract with selection index 1)
e: vecN<T>
N is 3 or 4
e.z: T
e.b: T
Select the third component of e
(OpCompositeExtract with selection index 2)
e: vec4<T> e.w: T
e.a: T
Select the fourth component of e
(OpCompositeExtract with selection index 3)
e: vecN<T>
i: i32 or u32
e[i]: T Select the ith component of vector
The first component is at index i=0.
If i is outside the range [0,N-1], then any valid value for T may be returned. (OpVectorExtractDynamic)
6.7.1.2. Vector multiple component selection
Vector decomposition: multiple component selection
Precondition Conclusion Description
e: vecN<T>
I is the letter x, y, z, or w
J is the letter x, y, z, or w
e.IJ: vec2<T>
Computes the two-element vector with first component e.I, and second component e.J.
Letter z is valid only when N is 3 or 4.
Letter w is valid only when N is 4.
(OpVectorShuffle)
e: vecN<T>
I is the letter r, g, b, or a
J is the letter r, g, b, or a
e.IJ: vec2<T>
Computes the two-element vector with first component e.I, and second component e.J.
Letter b is valid only when N is 3 or 4.
Letter a is valid only when N is 4.
(OpVectorShuffle)
e: vecN<T>
I is the letter x, y, z, or w
J is the letter x, y, z, or w
K is the letter x, y, z, or w
e.IJK: vec3<T>
Computes the three-element vector with first component e.I, second component e.J, and third component e.K.
Letter z is valid only when N is 3 or 4.
Letter w is valid only when N is 4.
(OpVectorShuffle)
e: vecN<T>
I is the letter r, g, b, or a
J is the letter r, g, b, or a
K is the letter r, g, b, or a
e.IJK: vec3<T>
Computes the three-element vector with first component e.I, second component e.J, and third component e.K.
Letter b is only valid when N is 3 or 4.
Letter a is only valid when N is 4.
(OpVectorShuffle)
e: vecN<T>
I is the letter x, y, z, or w
J is the letter x, y, z, or w
K is the letter x, y, z, or w
L is the letter x, y, z, or w
e.IJKL: vec4<T>
Computes the four-element vector with first component e.I, second component e.J, third component e.K, and fourth component e.L.
Letter z is valid only when N is 3 or 4.
Letter w is valid only when N is 4.
(OpVectorShuffle)
e: vecN<T>
I is the letter r, g, b, or a
J is the letter r, g, b, or a
K is the letter r, g, b, or a
L is the letter r, g, b, or a
e.IJKL: vec4<T>
Computes the four-element vector with first component e.I, second component e.J, third component e.K, and fourth component e.L.
Letter b is only valid when N is 3 or 4.
Letter a is only valid when N is 4.
(OpVectorShuffle)
6.7.1.3. Component reference from vector reference
Getting a reference to a component from a reference to a vector
Precondition Conclusion Description
r: ref<SC,vecN<T>>
r.x: ref<SC,T>
r.r: ref<SC,T>
Compute a reference to the first component of the vector referenced by the reference r.
The originating variable of the resulting reference is the same as the originating variable of r.
(OpAccessChain with index value 0)
r: ref<SC,vecN<T>>
r.y: ref<SC,T>
r.g: ref<SC,T>
Compute a reference to the second component of the vector referenced by the reference r.
The originating variable of the resulting reference is the same as the originating variable of r.
(OpAccessChain with index value 1)
r: ref<SC,vecN<T>>
N is 3 or 4
r.z: ref<SC,T>
r.b: ref<SC,T>
Compute a reference to the third component of the vector referenced by the reference r.
The originating variable of the resulting reference is the same as the originating variable of r.
(OpAccessChain with index value 2)
r: ref<SC,vec4<T>>
r.w: ref<SC,T>
r.a: ref<SC,T>
Compute a reference to the fourth component of the vector referenced by the reference r.
The originating variable of the resulting reference is the same as the originating variable of r.
(OpAccessChain with index value 3)
r: ref<SC,vecN<T>>
i: i32 or u32
r[i] : ref<SC,T> Compute a reference to the ith component of the vector referenced by the reference r.

If i is outside the range [0,N-1], then the expression evaluates to invalid memory reference.

The originating variable of the resulting reference is the same as the originating variable of r.
(OpAccessChain)

6.7.2. Matrix Access Expression

Column vector extraction
Precondition Conclusion Description
e: matNxM<T>
i: i32 or u32
i is a const_expr expression
e[i]: vecM<T> The result is the ith column vector of e.
If i is outside the range [0,N-1], then any valid value for vecM<T> may be returned. (OpCompositeExtract)
Getting a reference to a column vector from a reference to a matrix
Precondition Conclusion Description
r: ref<SC,matNxM<T>>
i: i32 or u32
r[i] : ref<vecM<SC,T>> Compute a reference to the ith column vector of the matrix referenced by the reference r.

If i is outside the range [0,N-1], then the expression evaluates to invalid memory reference.

The originating variable of the resulting reference is the same as the originating variable of r.
(OpAccessChain)

Note: Reflecting the limitations of the languages WGSL is meant to be translated into, it is only possible to use dynamically computed indices to subscript references to matrices. A matrix not behind a reference may only be indexed by a const_expr. To work around this restriction, consider storing the matrix in a temporary variable, and then subscripting the variable: a variable identifier expression produces a reference to the variable’s value, as required.

6.7.3. Array Access Expression

Array element extraction
Precondition Conclusion Description
e: array<T,N>
i: i32 or u32
i is a const_expr expression
e[i] : T The result is the value of the ith element of the array value e.

If i is outside the range [0,N-1], then any valid value for T may be returned. (OpCompositeExtract)

Getting a reference to an array element from a reference to an array
Precondition Conclusion Description
r: ref<SC,array<T,N>>
i: i32 or u32
r[i] : ref<SC,T> Compute a reference to the ith element of the array referenced by the reference r.

If i is outside the range [0,N-1], then the expression evaluates to an invalid memory reference.

The originating variable of the resulting reference is the same as the originating variable of r.
(OpAccessChain)

r: ref<SC,array<T>>
i: i32 or u32
r[i] : ref<SC,T> Compute a reference to the ith element of the runtime-sized array referenced by the reference r.

If at runtime the array has N elements, and i is outside the range [0,N-1], then the expression evaluates to an invalid memory reference.

The originating variable of the resulting reference is the same as the originating variable of r.
(OpAccessChain)

Note: Reflecting the limitations of the languages WGSL is meant to be translated into, it is only possible to use dynamically computed indices to subscript references to arrays. An array not behind a reference may only be indexed by a const_expr.

6.7.4. Structure Access Expression

Structure member extraction
Precondition Conclusion Description
S is a structure type
M is the identifier name of a member of S, having type T
e: S
e.M: T The result is the value of the member with name M from the structure value e.
(OpCompositeExtract, using the member index)
Getting a reference to a structure member from a reference to a structure
Precondition Conclusion Description
S is a structure type
M is the name of a member of S, having type T
r: ref<SC,S>
r.M: ref<SC,T> Given a reference to a structure, the result is a reference to the structure member with identifier name M.
The originating variable of the resulting reference is the same as the originating variable of r.
(OpAccessChain, using the index of the structure member)

6.8. Logical Expressions

Unary logical operations
Precondition Conclusion Notes
e: bool
T is bool or vecN<bool>
!e: T Logical negation. The result is true when e is false and false when e is true. Component-wise when T is a vector. (OpLogicalNot)
Binary logical expressions
Precondition Conclusion Notes
e1: bool
e2: bool
e1 || e2: bool Short-circuiting "or". Yields true if either e1 or e2 are true; evaluates e2 only if e1 is false.
e1: bool
e2: bool
e1 && e2: bool Short-circuiting "and". Yields true if both e1 and e2 are true; evaluates e2 only if e1 is true.
e1: T
e2: T
T is bool or vecN<bool>
e1 | e2: T Logical "or". Component-wise when T is a vector. Evaluates both e1 and e2.
e1: T
e2: T
T is bool or vecN<bool>
e1 & e2: T Logical "and". Component-wise when T is a vector. Evaluates both e1 and e2.

6.9. Arithmetic Expressions

Unary arithmetic expressions
Precondition Conclusion Notes
e: T
T is i32 or vecN<i32>
-e: T Signed integer negation. Component-wise when T is a vector. If e evaluates to the largest negative value, then the result is e. (OpSNegate)
e: T
T is f32 or vecN<f32>
-e: T Floating point negation. Component-wise when T is a vector. (OpFNegate)
Binary arithmetic expressions
Precondition Conclusion Notes
e1 : T
e2 : T
T is i32, u32, vecN<i32>, or vecN<u32>
e1 + e2 : T Integer addition, modulo 232. Component-wise when T is a vector. (OpIAdd)
e1 : T
e2 : T
T is f32 or vecN<f32>
e1 + e2 : T Floating point addition. Component-wise when T is a vector. (OpFAdd)
e1 : T
e2 : T
T is i32, u32, vecN<i32>, or vecN<u32>
e1 - e2 : T Integer subtraction, modulo 232. Component-wise when T is a vector. (OpISub)
e1 : T
e2 : T
T is f32 or vecN<f32>
e1 - e2 : T Floating point subtraction. Component-wise when T is a vector. (OpFSub)
e1 : T
e2 : T
T is i32, u32, vecN<i32>, or vecN<u32>
e1 * e2 : T Integer multiplication, modulo 232. Component-wise when T is a vector. (OpIMul)
e1 : T
e2 : T
T is f32 or vecN<f32>
e1 * e2 : T Floating point multiplication. Component-wise when T is a vector. (OpFMul)
e1 : T
e2 : T
T is i32 or vecN<i32>
e1 / e2 : T Signed integer division. Component-wise when T is a vector. (OpSDiv)
e1 : T
e2 : T
T is u32 or vecN<u32>
e1 / e2 : T Unsigned integer division. Component-wise when T is a vector. (OpUDiv)
e1 : T
e2 : T
T is f32 or vecN<f32>
e1 / e2 : T Floating point division. Component-wise when T is a vector. (OpFDiv)
e1 : T
e2 : T
T is i32 or vecN<i32>
e1 % e2 : T Signed integer remainder. Component-wise when T is a vector. (OpSMod)
e1 : T
e2 : T
T is u32 or vecN<u32>
e1 % e2 : T Unsigned integer remainder. Component-wise when T is a vector. (OpUMod)
e1 : T
e2 : T
T is f32 or vecN<f32>
e1 % e2 : T Floating point remainder, where sign of non-zero result matches sign of e2. Component-wise when T is a vector. (OpFMod)
Binary arithmetic expressions with mixed scalar and vector operands
Preconditions Conclusions Semantics
S is one of f32, i32, u32
V is vecN<S>
es: S
ev: V
ev + es: V ev + V(es)
es + ev: V V(es) + ev
ev - es: V ev - V(es)
es - ev: V V(es) - ev
ev * es: V ev * V(es)
es * ev: V V(es) * ev
ev / es: V ev / V(es)
es / ev: V V(es) / ev
S is one of i32, u32
V is vecN<S>
es: S
ev: V
ev % es: V ev % V(es)
es % ev: V V(es) % ev
Matrix arithmetic
Preconditions Conclusions Semantics
e1, e2: matMxN<f32> e1 + e2: matMxN<f32>
Matrix addition: column i of the result is e1[i] + e2[i]
e1 - e2: matMxN<f32> Matrix subtraction: column i of the result is e1[i] - e2[i]
m: matMxN<f32>
s: f32
m * s: matMxN<f32>
Component-wise scaling: (m * s)[i][j] is m[i][j] * s
s * m: matMxN<f32>
Component-wise scaling: (s * m)[i][j] is m[i][j] * s
m: matMxN<f32>
v: vecM<f32>
m * v: vecN<f32>
Linear algebra matrix-column-vector product: Component i of the result is dot(m[i],v)
OpMatrixTimesVector
m: matMxN<f32>
v: vecN<f32>
v * m: vecM<f32>
Linear algebra row-vector-matrix product:
transpose(transpose(m) * transpose(v))
OpVectorTimesMatrix
e1: matKxN<f32>
e2: matMxK<f32>
e1 * e2: matMxN<f32>
Linear algebra matrix product.
OpMatrixTimesMatrix

6.10. Comparison Expressions TODO

Comparisons
Precondtion Conclusion Notes
e1: T
e2: T
T is bool or vecN<bool>
e1 == e2: T Equality. Component-wise when T is a vector. (OpLogicalEqual)
e1: T
e2: T
T is bool or vecN<bool>
e1 != e2: T Inequality. Component-wise when T is a vector. (OpLogicalNotEqual)
e1: TI
e2: TI
TI is i32, u32, vecN<i32>, or vecN<u32>
TB is bool if TI is scalar, or
vecN<bool> if TI is a vector
e1 == e2: TB Equality. Component-wise when TI is a vector. (OpIEqual)
e1: TI
e2: TI
TI is i32, u32, vecN<i32>, or vecN<u32>
TB is bool if TI is scalar, or
vecN<bool> if TI is a vector
e1 != e2: TB Inequality. Component-wise when TI is a vector. (OpINotEqual)
e1: TI
e2: TI
TI is i32 or vecN<i32>
TB is bool if TI is scalar, or
vecN<bool> if TI is a vector
e1 < e2: TB Less than. Component-wise when TI is a vector. (OpSLessThan)
e1: TI
e2: TI
TI is i32 or vecN<i32>
TB is bool if TI is scalar, or
vecN<bool> if TI is a vector
e1 <= e2: TB Less than or equal. Component-wise when TI is a vector. (OpSLessThanEqual)
e1: TI
e2: TI
TI is i32 or vecN<i32>
TB is bool if TI is scalar, or
vecN<bool> if TI is a vector
e1 > e2: TB Greater than. Component-wise when TI is a vector. (OpSGreaterThan)
e1: TI
e2: TI
TI is i32 or vecN<i32>
TB is bool if TI is scalar, or
vecN<bool> if TI is a vector
e1 >= e2: TB Greater than or equal. Component-wise when TI is a vector. (OpSGreaterThanEqual)
e1: TI
e2: TI
TI is u32 or vecN<u32>
TB is bool if TI is scalar, or
vecN<bool> if TI is a vector
e1 < e2: TB Less than. Component-wise when TI is a vector. (OpULessThan)
e1: TI
e2: TI
TI is u32 or vecN<u32>
TB is bool if TI is scalar, or
vecN<bool> if TI is a vector
e1 <= e2: TB Less than or equal. Component-wise when TI is a vector. (OpULessThanEqual)
e1: TI
e2: TI
TI is u32 or vecN<u32>
TB is bool if TI is scalar, or
vecN<bool> if TI is a vector
e1 > e2: TB Greater than. Component-wise when TI is a vector. (OpUGreaterThan)
e1: TI
e2: TI
TI is u32 or vecN<u32>
TB is bool if TI is scalar, or
vecN<bool> if TI is a vector
e1 >= e2: TB Greater than or equal. Component-wise when TI is vector. (OpUGreaterThanEqual)
e1: TF
e2: TF
TF is f32 or vecN<f32>
TB is bool if TF is scalar, or
vecN<bool> if TF is a vector.
e1 == e2: TB Equality. Component-wise when TF is a vector. (OpFOrdEqual)
e1: TF
e2: TF
TF is f32 or vecN<f32>
TB is bool if TF is scalar, or
vecN<bool> if TF is a vector.
e1 != e2: TB Inequality. Component-wise when TF is a vector. (OpFOrdEqual)
e1: TF
e2: TF
TF is f32 or vecN<f32>
TB is bool if TF is scalar, or
vecN<bool> if TF is a vector.
e1 < e2: TB Less than. Component-wise when TF is a vector. (OpFOrdLessThan)
e1: TF
e2: TF
TF is f32 or vecN<f32>
TB is bool if TF is scalar, or
vecN<bool> if TF is a vector.
e1 <= e2: TB Less than or equal. Component-wise when TF is a vector. (OpFOrdLessThanEqual)
e1: TF
e2: TF
TF is f32 or vecN<f32>
TB is bool if TF is scalar, or
vecN<bool> if TF is a vector.
e1 > e2: TB Greater than. Component-wise when TF is a vector. (OpFOrdGreaterThan)
e1: TF
e2: TF
TF is f32 or vecN<f32>
TB is bool if TF is scalar, or
vecN<bool> if TF is a vector.
e1 >= e2: TB Greater than or equal. Component-wise when TF is a vector. (OpFOrdGreaterThanEqual)

6.11. Bit Expressions TODO

Unary bitwise operations
Precondition Conclusion Notes
e: T
T is i32, u32, vecN<i32>, or vecN<u32>
~e : T Bitwise complement on e. Each bit in the result is the opposite of the corresponding bit in e. Component-wise when T is a vector. (OpNot)
Binary bitwise operations
Precondition Conclusion Notes
e1: T
e2: T
T is i32, u32, vecN<i32>, or vecN<u32>
e1 | e2: T Bitwise-or. Component-wise when T is a vector.
e1: T
e2: T
T is i32, u32, vecN<i32>, or vecN<u32>
e1 & e2: T Bitwise-and. Component-wise when T is a vector.
e1: T
e2: T
T is i32, u32, vecN<i32>, or vecN<u32>
e1 ^ e2: T Bitwise-exclusive-or. Component-wise when T is a vector.
Bit shift expressions
Precondition Conclusion Notes
e1: T
e2: TS
T is i32, u32, vecN<i32>, or vecN<u32>
TS is u32 if e1 is a scalar, or
vecN<u32>.
e1 << e2: T Shift left:
Shift e1 left, inserting zero bits at the least significant positions, and discarding the most significant bits. The number of bits to shift is the value of e2 modulo the bit width of e1.
Component-wise when T is a vector. (OpShiftLeftLogical)
e1: T
e2: T
T is u32 or vecN<u32>
e1 >> e2: T Logical shift right:
Shift e1 right, inserting zero bits at the most significant positions, and discarding the least significant bits. The number of bits to shift is the value of e2 modulo the bit width of e1. Component-wise when T is a vector. (OpShiftRightLogical)
e1: T
e2: TS
T is i32 or vecN<i32>
TS is u32 if e1 is a scalar, or
vecN<u32>.
e1 >> e2: T Arithmetic shift right:
Shift e1 right, copying the sign bit of e1 into the most significant positions, and discarding the least significant bits. The number of bits to shift is the value of e2 modulo the bit width of e1. Component-wise when T is a vector. (OpShiftRightArithmetic)

6.12. Function Call Expression

A function call expression executes a function call where the called function has a return type. If the called function does not return a value, a function call statement should be used instead. See § 7.4 Function Call Statement.

6.13. Variable Identifier Expression

Getting a reference from a variable name
Precondition Conclusion Description
v is an identifier resolving to an in-scope variable declared in storage class SC with store type T v: ref<SC,T> Result is a reference to the storage for the named variable v.

6.14. Formal Parameter Expression

Getting the value of an identifier declared as a formal parameter to a function
Precondition Conclusion Description
a is an identifier resolving to an in-scope formal paramter declaration with type T a: T Result is the value supplied for the corresponding function call operand at the call site invoking this instance of the function.

6.15. Address-Of Expression

The address-of operator converts a reference to its corresponding pointer.

Getting a pointer from a reference
Precondition Conclusion Description
r: ref<SC,T,A> &r: ptr<SC,T,A> Result is the pointer value corresponding to the same memory view as the reference value r.

If r is an invalid memory reference, then the resulting pointer is also an invalid memory reference.

6.16. Indirection Expression

The indirection operator converts a pointer to its corresponding reference.

Getting a reference from a pointer
Precondition Conclusion Description
p: ptr<SC,T,A> *p: ref<SC,T,A> Result is the reference value corresponding to the same memory view as the pointer value p.

If p is an invalid memory reference, then the resulting reference is also an invalid memory reference.

6.17. Constant Identifier Expression

Getting the value of a let-declared identifier
Precondition Conclusion Description
c is an identifier resolving to an in-scope pipeline-overridable let declaration with type T c: T If pipeline creation specified a value for the constant ID, then the result is that value. This value may be different for different pipeline instances.
Otherwise, the result is the value computed for the initializer expression. Pipeline-overridable constants appear at module-scope, so evaluation occurs before the shader begins execution.
Note: Pipeline creation fails if no initial value was specified in the API call and the let-declaration has no intializer expression.
c is an identifier resolving to an in-scope let declaration with type T, and is not pipeline-overridable c: T Result is the value computed for the initializer expression.
For a let declaration at module scope, evaluation occurs before the shader begins execution.
For a let declaration inside a function, evaluation occurs each time control reaches the declaration.

6.18. Expression Grammar Summary

primary_expression
  : IDENT argument_expression_list?
  | type_decl argument_expression_list
  | const_literal
  | paren_expression
  | BITCAST LESS_THAN type_decl GREATER_THAN paren_expression
      OpBitcast

paren_expression
  : PAREN_LEFT short_circuit_or_expression PAREN_RIGHT

argument_expression_list
  : PAREN_LEFT ((short_circuit_or_expression COMMA)* short_circuit_or_expression COMMA?)? PAREN_RIGHT

postfix_expression
  :
  | BRACKET_LEFT short_circuit_or_expression BRACKET_RIGHT postfix_expression
  | PERIOD IDENT postfix_expression

unary_expression
  : singular_expression
  | MINUS unary_expression
      OpSNegate
      OpFNegate
  | BANG unary_expression
      OpLogicalNot
  | TILDE unary_expression
      OpNot
  | STAR unary_expression
  | AND unary_expression

singular_expression
  : primary_expression postfix_expression

multiplicative_expression
  : unary_expression
  | multiplicative_expression STAR unary_expression
      OpVectorTimesScalar
      OpMatrixTimesScalar
      OpVectorTimesMatrix
      OpMatrixTimesVector
      OpMatrixTimesMatrix
      OpIMul
      OpFMul
  | multiplicative_expression FORWARD_SLASH unary_expression
      OpUDiv
      OpSDiv
      OpFDiv
  | multiplicative_expression MODULO unary_expression
      OpUMOd
      OpSMod
      OpFMod

additive_expression
  : multiplicative_expression
  | additive_expression PLUS multiplicative_expression
      OpIAdd
      OpFAdd
  | additive_expression MINUS multiplicative_expression
      OpFSub
      OpISub

shift_expression
  : additive_expression
  | shift_expression SHIFT_LEFT additive_expression
        OpShiftLeftLogical
  | shift_expression SHIFT_RIGHT additive_expression
        OpShiftRightLogical or OpShiftRightArithmetic

relational_expression
  : shift_expression
  | relational_expression LESS_THAN shift_expression
        OpULessThan
        OpFOrdLessThan
  | relational_expression GREATER_THAN shift_expression
        OpUGreaterThan
        OpFOrdGreaterThan
  | relational_expression LESS_THAN_EQUAL shift_expression
        OpULessThanEqual
        OpFOrdLessThanEqual
  | relational_expression GREATER_THAN_EQUAL shift_expression
        OpUGreaterThanEqual
        OpFOrdGreaterThanEqual

equality_expression
  : relational_expression
  | relational_expression EQUAL_EQUAL relational_expression
        OpIEqual
        OpFOrdEqual
  | relational_expression NOT_EQUAL relational_expression
        OpINotEqual
        OpFOrdNotEqual

and_expression
  : equality_expression
  | and_expression AND equality_expression

exclusive_or_expression
  : and_expression
  | exclusive_or_expression XOR and_expression

inclusive_or_expression
  : exclusive_or_expression
  | inclusive_or_expression OR exclusive_or_expression

short_circuit_and_expression
  : inclusive_or_expression
  | short_circuit_and_expression AND_AND inclusive_or_expression

short_circuit_or_expression
  : short_circuit_and_expression
  | short_circuit_or_expression OR_OR short_circuit_and_expression

7. Statements TODO

7.1. Compound Statement

A compound statement is a brace-enclosed group of zero or more statements. When a declaration is one of those statements, its identifier is in scope from the start of the next statement until the end of the compound statement.

compound_statement
  : BRACE_LEFT statements BRACE_RIGHT

7.2. Assignment Statement

An assignment statement replaces the contents of a variable, or a portion of a variable, with a new value.

The expression to the left of the equals token is the left-hand side, and the expression to the right of the equals token is the right-hand side.

Precondition Statement Description
r: ref<SC,T,A>,
A is write or read_write
e: T,
T is a constructible type,
SC is a writable storage class
r = e; Evaluates e, evaluates r, then writes the value computed for e into the memory locations referenced by r.

Note: if the reference is an invalid memory reference, the write may not execute, or may write to a different memory location than expected.

(OpStore)

In the simplest case, the left hand side of the assignment statement is the name of a variable. See § 4.4.4 Forming reference and pointer values for other cases.

EXAMPLE: Assignments
struct S {
    age: i32;
    weight: f32;
};
var<private> person: S;

fn f() {
    var a: i32 = 20;
    a = 30;           // Replace the contents of 'a' with 30.

    person.age = 31;  // Write 31 into the age field of the person variable.

    var uv: vec2<f32>;
    uv.y = 1.25;      // Place 1.25 into the second component of uv.

    let uv_x_ptr: ptr<function,f32> = &uv.x;
    *uv_x_ptr = 2.5;   // Place 2.5 into the first component of uv.

    var friend: S;
    // Copy the contents of the 'person' variable into the 'friend' variable.
    friend = person;
}
assignment_statement
  : singular_expression EQUAL short_circuit_or_expression
      If singular_expression is a variable, this maps to OpStore to the variable.
      Otherwise, singular expression is a pointer expression in an Assigning (L-value) context
      which maps to OpAccessChain followed by OpStore

7.3. Control flow TODO

7.3.1. Sequence TODO

7.3.2. If Statement

if_statement
  : IF paren_expression compound_statement elseif_statement? else_statement?

elseif_statement
  : ELSE_IF paren_expression compound_statement elseif_statement?

else_statement
  : ELSE compound_statement

An if statement provides provides predicated execution of a compound statement based on the evaluation of an expression.

If statements in WGSL use an if/elseif/else structure, that contains a single required if clause, zero or more elseif clauses and a single optional else clause. Each of the expressions for the if and elseif clause conditions must be a scalar boolean expression.

An if statement is executed as follows:

7.3.3. Switch Statement

switch_statement
  : SWITCH paren_expression BRACE_LEFT switch_body+ BRACE_RIGHT

switch_body
  : CASE case_selectors COLON BRACE_LEFT case_body BRACE_RIGHT
  | DEFAULT COLON BRACE_LEFT case_body BRACE_RIGHT

case_selectors
  : const_literal (COMMA const_literal)* COMMA?

case_body
  :
  | statement case_body
  | FALLTHROUGH SEMICOLON

A switch statement transfers control to one of a set of case clauses, or to the default clause, depending on the evaluation of a selector expression.

The selector expression must be of a scalar integer type. If the selector value equals a value in a case selector list, then control is transferred to the body of that case clause. If the selector value does not equal any of the case selector values, then control is transferred to the default clause.

Each switch statement must have exactly one default clause.

The case selector values must have the same type as the result of evaluating the selector expression.

A literal value must not appear more than once in the case selectors for a switch statement.

Note: The value of the literal is what matters, not the spelling. For example 0, 00, and 0x0000 all denote the zero value.

When control reaches the end of a case body, control normally transfers to the first statement after the switch statement. Alternately, executing a fallthrough statement transfers control to the body of the next case clause or default clause, whichever appears next in the switch body. A fallthrough statement must not appear as the last statement in the last clause of a switch. When a declaration appears in a case body, its identifier is in scope from the start of the next statement until the end of the case body.

Note: Identifiers declared in a case body are not in scope of case bodies which are reachable via a fallthrough statement.

7.3.4. Loop Statement

loop_statement
  : LOOP BRACE_LEFT statements continuing_statement? BRACE_RIGHT

The loop body is special form compound statement that executes repeatedly. Each execution of the loop body is called an iteration.

The identifier of a declaration in a loop is in scope from the start of the next statement until the end of the loop body. The declaration is executed each time it is reached, so each new iteration creates a new instance of the variable or constant, and re-initializes it.

This repetition can be interrupted by a § 7.3.6 Break, return, or discard.

Optionally, the last statement in the loop body may be a § 7.3.8 Continuing Statement.

Note: The loop statement is one of the biggest differences from other shader languages.

This design directly expresses loop idioms commonly found in compiled code. In particular, placing the loop update statements at the end of the loop body allows them to naturally use values defined in the loop body.

EXAMPLE: GLSL Loop
int a = 2;
for (int i = 0; i < 4; i++) {
  a *= 2;
}
EXAMPLE: WGSL Loop
let a: i32 = 2;
var i: i32 = 0;      // <1>
loop {
  if (i >= 4) { break; }

  a = a * 2;

  i = i + 1;
}
EXAMPLE: GLSL Loop with continue
int a = 2;
let int step = 1;
for (int i = 0; i < 4; i += step) {
  if (i % 2 == 0) continue;
  a *= 2;
}
EXAMPLE: WGSL Loop with continue
var a: i32 = 2;
var i: i32 = 0;
loop {
  if (i >= 4) { break; }

  let step: i32 = 1;

  i = i + 1;
  if (i % 2 == 0) { continue; }

  a = a * 2;
}
EXAMPLE: WGSL Loop with continue and continuing
var a: i32 = 2;
var i: i32 = 0;
loop {
  if (i >= 4) { break; }

  let step: i32 = 1;

  if (i % 2 == 0) { continue; }

  a = a * 2;

  continuing {   // <2>
    i = i + step;
  }
}

7.3.5. For Statement

for_statement
  : FOR PAREN_LEFT for_header PAREN_RIGHT compound_statement

for_header
  : (variable_statement | assignment_statement | func_call_statement)? SEMICOLON
     short_circuit_or_expression? SEMICOLON
     (assignment_statement | func_call_statement)?

The for(initializer; condition; continuing) { body } statement is syntactic sugar on top of a § 7.3.4 Loop Statement with the same body. Additionally:

The initializer of a for loop is executed once prior to executing the loop. When a declaration appears in the initializer, its identifier is in scope until the end of the body. Unlike declarations in the body, the declaration is not re-initialized each iteration.

The condition, body and continuing execute in that order to form a loop iteration. The body is a special form of compound statement. The identifier of a declaration in the body is in scope from the start of the next statement until the end of the body. The declaration is executed each time it is reached, so each new iteration creates a new instance of the variable or constant, and re-intializes it.

EXAMPLE: For to Loop transformation
for(var i: i32 = 0; i < 4; i = i + 1) {
  if (a == 0) {
    continue;
  }
  a = a + 2;
}

Converts to:

EXAMPLE: For to Loop transformation
{ // Introduce new scope for loop variable i
  var i: i32 = 0;
  var a: i32 = 0;
  loop {
    if (!(i < 4)) {
      break;
    }

    if (a == 0) {
      continue;
    }
    a = a + 2;

    continuing {
      i = i + 1;
    }
  }
}

7.3.6. Break

break_statement
  : BREAK

Use a break statement to transfer control to the first statement after the body of the nearest-enclosing § 7.3.4 Loop Statement or § 7.3.3 Switch Statement. A break statement must only be used in loop, for, and switch statements.

When a break statement is placed such that it would exit from a loop’s § 7.3.8 Continuing Statement, then:

EXAMPLE: WGSL Valid loop if-break from a continuing clause
var a: i32 = 2;
var i: i32 = 0;
loop {
  let step: i32 = 1;

  if (i % 2 == 0) { continue; }

  a = a * 2;

  continuing {
    i = i + step;
    if (i >= 4) { break; }
  }
}
EXAMPLE: WGSL Valid loop if-else-break from a continuing clause
var a: i32 = 2;
var i: i32 = 0;
loop {
  let step: i32 = 1;

  if (i % 2 == 0) { continue; }

  a = a * 2;

  continuing {
    i = i + step;
    if (i < 4) {} else { break; }
  }
}
EXAMPLE: WGSL Invalid breaks from a continuing clause
var a: i32 = 2;
var i: i32 = 0;

loop {
  let step: i32 = 1;

  if (i % 2 == 0) { continue; }

  a = a * 2;

  continuing {
    i = i + step;
    break;                                     // Invalid: too early
    if (i < 4) { i = i + 1; } else { break; }  // Invalid: if is too complex, and too early
    if (i >= 4) { break; } else { i = i + 1; } // Invalid: if is too complex
  }
}

7.3.7. Continue

continue_statement
  : CONTINUE

Use a continue statement to transfer control in the nearest-enclosing § 7.3.4 Loop Statement:

A continue statement must only be used in a loop or for statement. A continue statement must not be placed such that it would transfer control to an enclosing § 7.3.8 Continuing Statement. (It is a forward branch when branching to a continuing statement.)

A continue statement must not be placed such that it would transfer control past a declaration used in the targeted continuing construct.

EXAMPLE: Invalid continue bypasses declaration
var i: i32 = 0;
loop {
  if (i >= 4) { break; }
  if (i % 2 == 0) { continue; } // <3>

  let step: i32 = 2;

  continuing {
    i = i + step;
  }
}

7.3.8. Continuing Statement

continuing_statement
  : CONTINUING compound_statement

A continuing construct is a block of statements to be executed at the end of a loop iteration. The construct is optional.

The block of statements must not contain a return or discard statement.

7.3.9. Return Statement

return_statement
  : RETURN short_circuit_or_expression?

A return statement ends execution of the current function. If the function is an entry point, then the current shader invocation is terminated. Otherwise, evaluation continues with the next expression or statement after the evaluation of the call site of the current function invocation.

If the function doesn’t have a return type, then the return statement is optional. If the return statement is provided for such a function, it must not supply a value. Otherwise the expression must be present, and is called the return value. In this case the call site of this function invocation evaluates to the return value. The type of the return value must match the return type of the function. If a return statement is present, it must be the last statement in the enclosing compound statement.

7.3.10. Discard Statement

The discard statement must only be used in a fragment shader stage. Executing a discard statement will:

Only statements executed prior to the discard statement will have observable effects.

Note: A discard statement may be executed by any function in a fragment stage and the effect is the same: immediate termination of the invocation.

After a discard statement is executed, control flow is non-uniform for the duration of the entry point.

§ 12.2.1 Uniform control flow TODO needs to state whether all invocations being discarded maintains uniform control flow.

EXAMPLE: Using the discard statement to throw away a fragment
var<private> will_emit_color: bool = false;

fn discard_if_shallow(pos: vec4<f32>) {
  if (pos.z < 0.001) {
    // If this is executed, then the will_emit_color flag will
    // never be set to true.
    discard;
  }
  will_emit_color = true;
}

[[stage(fragment)]]
fn main([[builtin(position)]] coord_in: vec4<f32>)
  -> [[location(0)]] vec4<f32>
{
  discard_if_shallow(coord_in);

  // Set the flag and emit red, but only if the helper function
  // did not execute the discard statement.
  will_emit_color = true;
  return vec4<f32>(1.0, 0.0, 0.0, 1.0);
}

7.4. Function Call Statement

func_call_statement
  : IDENT argument_expression_list

A function call statement executes a function call where the called function does not return a value. If the called function returns a value, that value must be consumed either through assignment, evaluation in another expression or through use of the ignore built-in function (see § 16.13 Value-steering functions).

7.5. Statements Grammar Summary

compound_statement
  : BRACE_LEFT statements BRACE_RIGHT

statements
  : statement*

statement
  : SEMICOLON
  | return_statement SEMICOLON
  | if_statement
  | switch_statement
  | loop_statement
  | for_statement
  | func_call_statement SEMICOLON
  | variable_statement SEMICOLON
  | break_statement SEMICOLON
  | continue_statement SEMICOLON
  | DISCARD SEMICOLON
  | assignment_statement SEMICOLON
  | compound_statement

8. Functions

A function performs computational work when invoked.

A function is invoked in one of the following ways:

There are two kinds of functions:

8.1. Declaring a user-defined function

A function declaration creates a user-defined function, by specifying:

A function declaration must only occur at module scope. The function name is in scope from the start of the formal parameter list until the end of the program.

A formal parameter declaration specifies an identifier name and a type for a value that must be provided when invoking the function. A formal parameter may have attributes. See § 8.2 Function calls. The identifier is in scope until the end of the function. Two formal parameters for a given function must not have the same name.

If the return type is specified, then:

function_decl
  : attribute_list* function_header compound_statement

function_header
  : FN IDENT PAREN_LEFT param_list PAREN_RIGHT function_return_type_decl_optional

function_return_type_decl_optional
  :
  | ARROW attribute_list* type_decl

param_list
  :
  | (param COMMA)* param COMMA?

param
  : attribute_list* variable_ident_decl

WGSL defines the following attributes that can be applied to function declarations:

WGSL defines the following attributes that can be applied to function parameters and return types:

EXAMPLE: Simple functions
// Declare the add_two function.
// It has two formal paramters, i and b.
// It has a return type of i32.
// It has a body with a return statement.
fn add_two(i: i32, b: f32) -> i32 {
  return i + 2;  // A formal parameter is available for use in the body.
}

// A compute shader entry point function, 'main'.
// It has no specified return type.
// It invokes the ordinary_two function, and captures
// the resulting value in the named value 'two'.
[[stage(compute)]] fn main() {
   let six: i32 = add_two(4, 5.0);
}

8.2. Function calls

A function call is a statement or expression which invokes a function.

The function call must supply the same number of argument values as there are formal parameters in the called function. Each argument value must evaluate to the same type as the corresponding formal parameter, by position.

When a function call is executed the following steps occur:

The location of a function call is referred to as a call site. Call sites are a dynamic context. As such, the same textual location may represent multiple call sites.

Note: There are no default parameter values in WGSL. Built-in functions described this way are really overloaded functions.

Note: The current function will not resume execution if the called function or any descendent called function executes a discard statement. See § 7.3.10 Discard Statement.

8.3. Restrictions on functions

Note: The aliasing restriction applies to memory location written by function calls in the function.

Revisit aliasing rules for clarity.

9. Entry Points TODO

9.1. Shader Stages

WebGPU issues work to the GPU in the form of draw or dispatch commands. These commands execute a pipeline in the context of a set of inputs, outputs, and attached resources.

A pipeline describes the behaviour to be performed on the GPU, as a sequence of stages, some of which are programmable. In WebGPU, a pipeline is created before scheduling a draw or dispatch command for execution. There are two kinds of pipelines: GPUComputePipeline, and GPURenderPipeline.

A dispatch command uses a GPUComputePipeline to run a compute shader stage over a logical grid of points with a controllable amount of parallelism, while reading and possibly updating buffer and image resources.

A draw command uses a GPURenderPipeline to run a multi-stage process with two programmable stages among other fixed-function stages:

The WebGPU specification describes pipelines in greater detail.

WGSL defines three shader stages, corresponding to the programmable parts of pipelines:

Each shader stage has its own set of features and constraints, described elsewhere.

9.2. Entry point declaration

An entry point is a user-defined function that is invoked to perform the work for a particular shader stage.

Specify a stage attribute on a function declaration to declare that function as an entry point.

When configuring the stage in the pipeline, the entry point is specified by providing the WGSL module and the entry point’s function name.

The parameters of an entry point have to be within Entry point IO types. The return type of an entry point has to be of an Entry point IO type, if specified.

Note: compute entry points never have a return type.

EXAMPLE: Entry Point
[[stage(vertex)]]
fn vert_main() -> [[builtin(position)]] vec4<f32> {
  return vec4<f32>(0.0, 0.0, 0.0, 1.0);
}
   // OpEntryPoint Vertex %vert_main "vert_main" %return_value
   // OpDecorate %return_value BuiltIn Position
   // %float = OpTypeFloat 32
   // %v4float = OpTypeVector %float 4
   // %ptr = OpTypePointer Output %v4float
   // %return_value = OpVariable %ptr Output

[[stage(fragment)]]
fn frag_main([[builtin(position)]] coord_in: vec4<f32>) -> [[location(0)]] vec4<f32> {
  return vec4<f32>(coord_in.x, coord_in.y, 0.0, 1.0);
}
   // OpEntryPoint Fragment %frag_main "frag_main" %return_value %coord_in
   // OpDecorate %return_value Location 0
   // %float = OpTypeFloat 32
   // %v4float = OpTypeVector %float 4
   // %ptr = OpTypePointer Output %v4float
   // %return_value = OpVariable %ptr Output

[[stage(compute)]]
fn comp_main() { }
   // OpEntryPoint GLCompute %comp_main "comp_main"

The set of functions in a shader stage is the union of:

The union is applied repeatedly until it stabilizes. It will stabilize in a finite number of steps.

9.2.1. Function attributes for entry points

WGSL defines the following attributes that can be applied to entry point declarations:

Can we query upper bounds on workgroup size dimensions? Is it independent of the shader, or a property to be queried after creating the shader module?

EXAMPLE: workgroup_size Attribute
[[ stage(compute), workgroup_size(8,4,1) ]]
fn sorter() { }
   // OpEntryPoint GLCompute %sorter "sorter"
   // OpExecutionMode %sorter LocalSize 8 4 1

[[ stage(compute), workgroup_size(8u) ]]
fn reverser() { }
   // OpEntryPoint GLCompute %reverser "reverser"
   // OpExecutionMode %reverser LocalSize 8 1 1

// Using an pipeline-overridable constant.
[[override(42)]] let block_width = 12u;
[[ stage(compute), workgroup_size(block_width) ]]
fn shuffler() { }
   // The SPIR-V translation uses a WorkgroupSize-decorated constant,
   // where the first component is an OpSpecConstant decorated with
   // SpecID 42, and with default value 12, and second and third components
   // use defaulted values of 1.

// Error: workgroup_size must be specified on compute shader
[[ stage(compute) ]]
fn bad_shader() { }

9.3. Shader Interface

The shader interface is the set of objects through which the shader accesses data external to the shader stage, either for reading or writing. The interface includes:

These objects are represented by module-scope variables in certain storage classes.

We say a variable is statically accessed by a function if any subexpression in the body of the function uses the variable’s identifier, and that subexpression is in scope of the variable’s declaration. Static access of a let-declared constant is defined similarly. Note that being statically accessed is independent of whether an execution of the shader will actually evaluate the subexpression, or even execute the enclosing statement.

More precisely, the interface of a shader stage consists of:

9.3.1. Pipeline Input and Output Interface

The Entry point IO types include the following:

A pipeline input is data provided to the shader stage from upstream in the pipeline. A pipeline input is denoted by the arguments of the entry point.

A pipeline output is data the shader provides for further processing downstream in the pipeline. A pipeline output is denoted by the return type of the entry point.

Each pipeline input or output is one of:

9.3.1.1. Built-in inputs and outputs

A built-in input variable provides access to system-generated control information. The set of built-in inputs are listed in § 15 Built-in variables.

To declare a variable for accessing a particular input built-in X from an entry point:

A built-in output variable is used by the shader to convey control information to later processing steps in the pipeline. The set of built-in outputs are listed in § 15 Built-in variables.

To declare a variable for accessing a particular output built-in Y from an entry point:

Both input and output built-in variables may also be declared as members of structures that are either entry point function parameters (for inputs) or the return type of an entry point (for outputs). The type of the structure member must match the type specified for the built-in variable.

The builtin attribute must not be applied to a variables in module scope, or the local variables in the function scope.

A variable must not have more than one builtin attribute.

Each built-in variable has an associated shader stage, as described in § 15 Built-in variables. If a built-in variable has stage S and is used by a function F, as either an argument or the result type, then F must be a function in a shader for stage S.

in Vulkan, builtin variables occoupy I/O location slots counting toward limits.

9.3.1.2. User Data Attribute TODO

User-defined data can be passed as input to the start of a pipeline, passed between stages of a pipeline or output from the end of a pipeline. User-defined IO must not be passed to compute shader entry points. User-defined IO must be numeric scalar or numeric vector types . All user defined IO must be assigned locations (See § 9.3.1.4 Input-output Locations).

9.3.1.3. Interpolation

Authors can control how user-defined IO data is interpolated through the use of the interpolate attribute. WGSL offers two aspects of interpolation to control: the type of interpolation, and the sampling of the interpolation.

The interpolation type must be one of:

The interpolation sampling must be one of:

For user-defined IO of scalar or vector floating-point type:

User-defined IO of scalar or vector integer type is always [[interpolate(flat)]] and, therefore, must not be specified in a WGSL program.

Interpolation attributes must match between vertex outputs and fragment inputs with the same location assignment within the same pipeline.

9.3.1.4. Input-output Locations

Each location can store a value up to 16 bytes in size. The byte size of a type is defined using the SizeOf column in § 4.3.7.1 Alignment and Size. For example, a four-element vector of floating-point values occupies a single location.

Locations are specified via the location attribute.

Every user-defined input and output must have a fully specified set of locations. Each structure member in the entry point IO must be one of either a builtin variable (see § 9.3.1.1 Built-in inputs and outputs), or assigned a location.

For a given entry point, the locations of the return type are distinct from the locations of the function parameters. Within each set of locations, there must be no overlap.

Note: the number of available locations for an entry point is defined by the WebGPU API.

EXAMPLE: Applying location attributes
struct A {
  [[location(0)]] x: f32;
  // Despite locations being 16-bytes, x and y cannot share a location
  [[location(1)]] y: f32;
};

// in1 occupies locations 0 and 1.
// in2 occupies location 2.
// The return value occupies location 0.
[[stage(fragment)]]
fn fragShader(in1: A, [[location(2)]] in2: f32) -> [[location(0)]] vec4<f32> {
 // ...
}

User-defined IO can be mixed with builtin variables in the same structure. For example,

EXAMPLE: Mixing builtins and user-defined IO
// Mixed builtins and user-defined inputs.
struct MyInputs {
  [[location(0)]] x: vec4<f32>;
  [[builtin(front_facing)]] y: bool;
  [[location(1)]] z: u32;
};

struct MyOutputs {
  [[builtin(frag_depth)]] x: f32;
  [[location(0)]] y: vec4<f32>;
};

[[stage(fragment)]]
fn fragShader(in1: MyInputs) -> MyOutputs {
  // ...
}
EXAMPLE: Invalid location assignments
struct A {
  [[location(0)]] x: u32;
  // Invalid, x and y cannot share a location.
  [[location(0)]] y: u32;
};

struct B {
  [location(0)]] x: f32;
};

struct C {
  // Invalid, structures with user-defined IO cannot be nested.
  b: B;
};

struct D {
  x: vec4<f32>;
};

[[stage(fragment)]]
// Invalid, location cannot be applied to a structure type.
fn fragShader1([[location(0)]] in1: D) {
  // ...
}

[[stage(fragment)]]
// Invalid, in1 and in2 cannot share a location.
fn fragShader2([location(0)]] in1: f32, [[location(0)]] in2: f32) {
  // ...
}

[[stage(fragment)]]
// Invalid, location cannot be applied to a structure.
fn fragShader3([[location(0)]] in1: vec4<f32>) -> [[location(0)]] D {
  // ...
}

9.3.2. Resource interface

A resource is an object, other than a pipeline input or output, which provides access to data external to a shader stage. Resources are shared by all invocations of the shader.

There are four kinds of resources:

The resource interface of a shader is the set of module-scope resource variables statically accessed by functions in the shader stage.

Each resource variable must be declared with both group and binding attributes. Together with the shader’s stage, these identify the binding address of the resource on the shader’s pipeline. See WebGPU § GPUPipelineLayout.

Bindings must not alias within a shader stage: two different variables in the resource interface of a given shader must not have the same group and binding values, when considered as a pair of values.

9.3.3. Resource layout compatibility

WebGPU requires that a shader’s resource interface match the layout of the pipeline using the shader.

Each WGSL variable in a resource interface must be bound to a WebGPU resource with a compatible resource type and binding type, where compatibility is defined by the following table.

WebGPU binding type compatibility
WGSL resource WebGPU
Resource type
WebGPU Binding type
uniform buffer GPUBufferBinding GPUBufferBindingType uniform
storage buffer with read_write access storage
storage buffer with read access read-only-storage
sampler GPUSampler GPUSamplerBindingType filtering
non-filtering
sampler_comparison comparison
sampled texture GPUTextureView GPUTextureSampleType float
unfilterable-float
sint
uint
depth
write-only storage texture GPUTextureView GPUStorageTextureAccess write-only

TODO: Describe when filtering or non-filtering samplers are valid.

TODO: Describe when float vs. unfilterable float sampled textures are valid.

If B is a uniform buffer variable in a resource interface, and WB is the WebGPU GPUBuffer bound to B, then:

If B is a storage buffer variable in a resource interface, and WB is the WebGPU GPUBuffer bound to B, then:

Note: Recall that a runtime-sized array may only appear as the last element in the structure type that is the store type of a storage buffer variable.

TODO: Describe other interface matching requirements, e.g. for images?

9.4. Pipeline compatibility TODO

TODO: match flat attribute

TODO: user data inputs of fragment stage must be subset of user data outputs of vertex stage

9.4.1. Input-output matching rules TODO

10. Language extensions

The WGSL language is expected to evolve over time.

An extension is a named grouping for a coherent set of modifications to a particular version of the WGSL specification, consisting of any combination of:

Hypothetically, extensions could be used to:

10.1. Enable Directive

An enable directive indicates that the functionality described by a particular named extension may be used in the source text after the directive itself. That is, language functionality described by the extension may be used in any source text after the enable directive.

The directive must not appear inside the text of any declaration. (If it were a declaration, it would be at module scope.)

The directive uses an identifier to name the extension, but does not create a scope for the identifier. Use of the identifier by the directive does not conflict with the use of that identifier as the name in any declaration.

enable_directive
  : ENABLE IDENT SEMICOLON

Note: The grammar rule includes the terminating semicolon token, ensuring the additional functionality is usable only after that semicolon. Therefore any WGSL implementation can parse the entire enable directive. When an implementation encounters an enable directive for an unsupported extension, the implementation can issue a clear diagnostic.

EXAMPLE: Using hypothetical extensions
// Enable a hypothetical IEEE-754 binary16 floating point extension.
enable f16;

// Assuming the f16 extension enables use of the f16 type:
//    - as function return value
//    - as the type for let declaration
//    - as a type constructor, with an i32 argument
//    - as operands to the division operator: /
fn halve_it(x: f16) -> f16 {
   let two: f16 = f16(2);
   return x / two;
};

enable f16; // A redundant enable directive is ok.
// Enable a hypothetical extension adding syntax for controlling
// the rounding mode on f16 arithmetic.
enable rounding_mode_f16;

[[round_to_even_f16]] // Attribute enabled by the rounding_mode_f16 extension
fn triple_it(x: f16) -> f16 {
   return x * f16(3); // Uses round-to-even.
};

11. WGSL program TODO

TODO: Stub A WGSL program is a sequence of directives and module scope declarations.

translation_unit
  : global_decl_or_directive* EOF
global_decl_or_directive
  : SEMICOLON
  | global_variable_decl SEMICOLON
  | global_constant_decl SEMICOLON
  | type_alias SEMICOLON
  | struct_decl SEMICOLON
  | function_decl
  | enable_directive

12. Execution TODO

12.1. Invocation of an entry point TODO

12.1.1. Before an entry point begins TODO

TODO: Stub

12.1.2. Program order (within an invocation) TODO

12.1.2.1. Function-scope variable lifetime and initialization TODO
12.1.2.2. Statement order TODO
12.1.2.3. Intra-statement order (or lack) TODO

TODO: Stub: Expression evaluation

12.2. Uniformity TODO

12.2.1. Uniform control flow TODO

12.2.2. Divergence and reconvergence TODO

12.2.3. Uniformity restrictions TODO

12.3. Compute Shaders and Workgroups

A workgroup is a set of invocations which concurrently execute a compute shader stage entry point, and share access to shader variables in the workgroup storage class.

The workgroup grid for a compute shader is the set of points with integer coordinates (i,j,k) with:

where (workgroup_size_x, workgroup_size_y, workgroup_size_z) is the value specified for the workgroup_size attribute of the entry point.

There is exactly one invocation in a workgroup for each point in the workgroup grid.

An invocation’s local invocation ID is the coordinate triple for the invocation’s corresponding workgroup grid point.

When an invocation has local invocation ID (i,j,k), then its local invocation index is

i + (j * workgroup_size_x) + (k * workgroup_size_x * workgroup_size_y)

Note that if a workgroup has W invocations, then each invocation I the workgroup has a unique local invocation index L(I) such that 0 ≤ L(I) < W, and that entire range is covered.

A compute shader begins execution when a WebGPU implementation removes a dispatch command from a queue and begins the specified work on the GPU. The dispatch command specifies a dispatch size, which is an integer triple (group_count_x, group_count_y, group_count_z) indicating the number of workgroups to be executed, as described in the following.

The compute shader grid for a particular dispatch is the set of points with integer coordinates (CSi,CSj,CSk) with:

where workgroup_size_x, workgroup_size_y, and workgroup_size_z are as above for the compute shader entry point.

The work to be performed by a compute shader dispatch is to execute exactly one invocation of the entry point for each point in the compute shader grid.

An invocation’s global invocation ID is the coordinate triple for the invocation’s corresponding compute shader grid point.

The invocations are organized into workgroups, so that each invocation (CSi, CSj, CSk) is identified with the workgroup grid point

( CSi mod workgroup_size_x , CSj mod workgroup_size_y , CSk mod workgroup_size_z )

in workgroup ID

( ⌊ CSi ÷ workgroup_size_x ⌋, ⌊ CSj ÷ workgroup_size_y ⌋, ⌊ CSk ÷ workgroup_size_z ⌋).

WebGPU provides no guarantees about:

WebGPU issue 1045: Dispatch group counts must be positive. However, how do we handle an indirect dispatch that specifies a group count of zero.

12.4. Collective operations TODO

12.4.1. Barrier TODO

12.4.2. Derivatives

A partial derivative is the rate of change of a value along an axis.

Fragment shader invocations operating on neighbouring fragments (in screen-space coordinates) collaborate to compute approximate partial derivatives.

Partial derivatives of the fragment coordinate are computed implicitly as part of operation of the following built-in functions:

For these, the derivatives help determine the mip levels of texels to be sampled, or in the case of textureSampleCompare, sampled and compared against a reference value.

Partial derivatives of invocation-specified values are computed by the built-in functions described in § 16.7 Derivative built-in functions:

Because neighbouring invocations must collaborate to compute derivatives, these functions must only be invoked in uniform control flow in a fragment shader.

12.4.3. Arrayed resource access TODO

12.5. Floating Point Evaluation

WGSL follows the IEEE-754 standard for floating point computation with the following exceptions:

12.5.1. Floating Point Accuracy

Let x be the exact real-valued or infinite result of an operation when computed with unbounded precision. The correctly rounded result of the operation for floating point type T is:

That is, the result may be rounded up or down: WGSL does not specify a rounding mode.

Note: Floating point types include positive and negative infinity, so the correctly rounded result may be finite or infinite.

The units in the last place, ULP, for a floating point number x is the minimum distance between two non-equal floating point numbers a and b such that axb (i.e. ulp(x) = mina,b|b - a|).

In the following tables, the accuracy of an operation is provided among five possibilities:

For any accuracy values specified over a range, the accuracy is undefined for results outside that range.

If an allowable return value for any operation is greater in magnitude than the largest representable finite floating-point value, then that operation may additionally return either the infinity with the same sign or the largest finite value with the same sign.

Accuracy of expressions
Expression Accuracy
x + y Correctly rounded
x - y Correctly rounded
x * y Correctly rounded
x / y 2.5 ULP for |y| in the range [2-126, 2126]
x % y Derived from x - y * trunc(x/y)
-x Correctly rounded
x == y Correct result
x != y Correct result
x < y Correct result
x <= y Correct result
x > y Correct result
x >= y Correct result
Accuracy of built-in functions
Built-in Function Accuracy
abs(x) Correctly rounded
acos(x) Inherited from atan2(sqrt(1.0 - x * x), x)
asin(x) Inherited from atan2(x, sqrt(1.0 - x * x))
atan(x) 4096 ULP
atan2(x) 4096 ULP
ceil(x) Correctly rounded
clamp(x) Correctly rounded
cos(x) Absolute error ≤ 2-11 inside the range of [-π, π]
cosh(x) Inherited from (exp(x) - exp(-x)) * 0.5
cross(x, y) Inherited from (x[i] * y[j] - x[j] * y[j])
distance(x, y) Inherited from length(x - y)
exp(x) 3 + 2 * x ULP
exp2(x) 3 + 2 * x ULP
faceForward(x, y, z) Inherited from select(x, -x, dot(z, y) < 0.0)
floor(x) Correctly rounded
fma(x, y, z) Inherited from x * y + z
fract(x) Correctly rounded
frexp(x, y) Correctly rounded
inverseSqrt(x) 2 ULP
ldexp(x, y) Correctly rounded
length(x) Inherited from sqrt(dot(x, x))
log(x) 3 ULP outside the range [0.5, 2.0].
Absolute error < 2-21 inside the range [0.5, 2.0]
log2(x) 3 ULP outside the range [0.5, 2.0].
Absolute error < 2-21 inside the range [0.5, 2.0]
max(x, y) Correctly rounded
min(x, y) Correctly rounded
mix(x, y, z) Inherited from x - (1.0 - z) + y * z
modf(x, y) Correctly rounded
normalize(x) Inherited from x - length(x)
pow(x, y) Inherited from exp2(y * log2(x))
reflect(x, y) Inherited from x - 2.0 * dot(x, y) * y
refract(x, y, z) Inherited from z * x - (z * dot(y, x) + sqrt(k)) * y,
where k = 1.0 - z * z * (1.0 - dot(y, x) * dot(y, x))
If k < 0.0 the result is precisely 0.0
round(x) Correctly rounded
sign(x) Correctly rounded
sin(x) Absolute error ≤ 2-11 inside the range [-π, π]
sinh(x) Inherited from (exp(x) - exp(-x)) * 0.5
smoothStep(x, y, z) Inherited from t * t * (3.0 - 2.0 * t),
where t = clamp((z - x) / (y - x), 0.0, 1.0)
sqrt(x) Inherited from 1.0 / inverseSqrt(x)
step(x, y) Correctly rounded
tan(x) Inherited from sin(x) / cos(x)
tanh(x) Inherited from sinh(x) / cosh(x)
trunc(x) Correctly rounded

Reassociation is the reordering of operations in an expression such that the answer is the same if computed exactly. For example:

However, the result may not be the same when computed in floating point. The reassociated result may be inaccurate due to approximation, or may trigger an overflow or NaN when computing intermediate results.

An implementation may reassociate and/or fuse operations if the optimization is at least as accurate as the original formulation.

12.5.2. Floating point conversion

In this section, a floating point type may be any of:

Note: Recall that the f32 WGSL type corresponds to the IEEE-754 binary32 format.

When converting a floating point scalar value to an integral type:

When converting a value to a floating point type:

NOTE: An integer value may lie between two adjacent representable floating point values. In particular, the f32 type uses 23 explicit fractional bits. Additionally, when the floating point value is in the normal range (the exponent is neither extreme value), then the mantissa is the set of fractional bits together with an extra 1-bit at the most significant position at bit position 23. Then, for example, integers 228 and 1+228 both map to the same floating point value: the difference in the least significant 1 bit is not representable by the floating point format. This kind of collision occurs for pairs of adjacent integers with a magnitude of at least 225.

(dneto) Default rounding mode is an implementation choice. Is that what we want?

Check behaviour of the f32 to f16 conversion for numbers just beyond the max normal f16 values. I’ve written what an NVIDIA GPU does. See https://github.com/google/amber/pull/918 for an executable test case.

13. Memory Model TODO

14. Keyword and Token Summary

14.1. Keyword Summary

Type-defining keywords
Token Definition
ARRAY array
ATOMIC atomic
BOOL bool
FLOAT32 f32
INT32 i32
MAT2x2 mat2x2 // 2 column x 2 row
MAT2x3 mat2x3 // 2 column x 3 row
MAT2x4 mat2x4 // 2 column x 4 row
MAT3x2 mat3x2 // 3 column x 2 row
MAT3x3 mat3x3 // 3 column x 3 row
MAT3x4 mat3x4 // 3 column x 4 row
MAT4x2 mat4x2 // 4 column x 2 row
MAT4x3 mat4x3 // 4 column x 3 row
MAT4x4 mat4x4 // 4 column x 4 row
POINTER ptr
SAMPLER sampler
SAMPLER_COMPARISON sampler_comparison
STRUCT struct
TEXTURE_1D texture_1d
TEXTURE_2D texture_2d
TEXTURE_2D_ARRAY texture_2d_array
TEXTURE_3D texture_3d
TEXTURE_CUBE texture_cube
TEXTURE_CUBE_ARRAY texture_cube_array
TEXTURE_MULTISAMPLED_2D texture_multisampled_2d
TEXTURE_STORAGE_1D texture_storage_1d
TEXTURE_STORAGE_2D texture_storage_2d
TEXTURE_STORAGE_2D_ARRAY texture_storage_2d_array
TEXTURE_STORAGE_3D texture_storage_3d
TEXTURE_DEPTH_2D texture_depth_2d
TEXTURE_DEPTH_2D_ARRAY texture_depth_2d_array
TEXTURE_DEPTH_CUBE texture_depth_cube
TEXTURE_DEPTH_CUBE_ARRAY texture_depth_cube_array
UINT32 u32
VEC2 vec2
VEC3 vec3
VEC4 vec4
Other keywords
Token Definition
BITCAST bitcast
BLOCK block
BREAK break
CASE case
CONTINUE continue
CONTINUING continuing
DEFAULT default
DISCARD discard
ELSE else
ELSE_IF elseif
ENABLE enable
FALLTHROUGH fallthrough
FALSE false
FN fn
FOR for
FUNCTION function
IF if
LET let
LOOP loop
PRIVATE private
READ read
READ_WRITE read_write
RETURN return
STORAGE storage
SWITCH switch
TRUE true
TYPE type
UNIFORM uniform
VAR var
WORKGROUP workgroup
WRITE write
Issue: Should read, write, and read_write be not completely reserved? They are only used in specific contexts.
Image format keywords
Token Definition
R8UNORM r8unorm
R8SNORM r8snorm
R8UINT r8uint
R8SINT r8sint
R16UINT r16uint
R16SINT r16sint
R16FLOAT r16float
RG8UNORM rg8unorm
RG8SNORM rg8snorm
RG8UINT rg8uint
RG8SINT rg8sint
R32UINT r32uint
R32SINT r32sint
R32FLOAT r32float
RG16UINT rg16uint
RG16SINT rg16sint
RG16FLOAT rg16float
RGBA8UNORM rgba8unorm
RGBA8UNORM-SRGB rgba8unorm_srgb
RGBA8SNORM rgba8snorm
RGBA8UINT rgba8uint
RGBA8SINT rgba8sint
BGRA8UNORM bgra8unorm
BGRA8UNORM-SRGB bgra8unorm_srgb
RGB10A2UNORM rgb10a2unorm
RG11B10FLOAT rg11b10float
RG32UINT rg32uint
RG32SINT rg32sint
RG32FLOAT rg32float
RGBA16UINT rgba16uint
RGBA16SINT rgba16sint
RGBA16FLOAT rgba16float
RGBA32UINT rgba32uint
RGBA32SINT rgba32sint
RGBA32FLOAT rgba32float

TODO(dneto): Eliminate the image formats that are not used in storage images. For example SRGB formats (bgra8unorm_srgb), mixed channel widths (rg11b10float), out-of-order channels (bgra8unorm)

14.2. Reserved Keywords

The following is a list of keywords which are reserved for future expansion.
asm bf16 const do enum
f16 f64 handle i8 i16
i64 mat premerge regardless typedef
u8 u16 u64 unless using
vec void while

14.3. Syntactic Tokens

AND &
AND_AND &&
ARROW ->
ATTR_LEFT [[
ATTR_RIGHT ]]
FORWARD_SLASH /
BANG !
BRACKET_LEFT [
BRACKET_RIGHT ]
BRACE_LEFT {
BRACE_RIGHT }
COLON :
COMMA ,
EQUAL =
EQUAL_EQUAL ==
NOT_EQUAL !=
GREATER_THAN >
GREATER_THAN_EQUAL >=
SHIFT_RIGHT >>
LESS_THAN <
LESS_THAN_EQUAL <=
SHIFT_LEFT <<
MODULO %
MINUS -
MINUS_MINUS --
PERIOD .
PLUS +
PLUS_PLUS ++
OR |
OR_OR ||
PAREN_LEFT (
PAREN_RIGHT )
SEMICOLON ;
STAR *
TILDE ~
XOR ^

Note: The MINUS_MINUS and PLUS_PLUS tokens are reserved, i.e. they are not used in any grammar productions. For example x-- and ++i are not syntactically valid expressions in WGSL.

15. Built-in variables

See § 9.3.1.1 Built-in inputs and outputs for how to declare a built-in variable.

Built-in Stage Input or Output Store type Description
vertex_index vertex in u32 Index of the current vertex within the current API-level draw command, independent of draw instancing.

For a non-indexed draw, the first vertex has an index equal to the firstVertex argument of the draw, whether provided directly or indirectly. The index is incremented by one for each additional vertex in the draw instance.

For an indexed draw, the index is equal to the index buffer entry for vertex, plus the baseVertex argument of the draw, whether provided directly or indirectly.

instance_index vertex in u32 Instance index of the current vertex within the current API-level draw command.

The first instance has an index equal to the firstInstance argument of the draw, whether provided directly or indirectly. The index is incremented by one for each additional instance in the draw.

position vertex out vec4<f32> Output position of the current vertex, using homogeneous coordinates. After homogeneous normalization (where each of the x, y, and z components are divided by the w component), the position is in the WebGPU normalized device coordinate space. See WebGPU § Coordinate Systems.
position fragment in vec4<f32> Framebuffer position of the current fragment, using normalized homogeneous coordinates. (The x, y, and z components have already been scaled such that w is now 1.) See WebGPU § Coordinate Systems.
front_facing fragment in bool True when the current fragment is on a front-facing primitive. False otherwise. See WebGPU § Front-facing.
frag_depth fragment out f32 Updated depth of the fragment, in the viewport depth range. See WebGPU § Coordinate Systems.
local_invocation_id compute in vec3<u32> The current invocation’s local invocation ID, i.e. its position in the workgroup grid.
local_invocation_index compute in u32 The current invocation’s local invocation index, a linearized index of the invocation’s position within the workgroup grid.
global_invocation_id compute in vec3<u32> The current invocation’s global invocation ID, i.e. its position in the compute shader grid.
workgroup_id compute in vec3<u32> The current invocation’s workgroup ID, i.e. the position of the workgroup in the workgroup grid.
num_workgroups compute in vec3<u32> The dispatch size, vec<u32>(group_count_x, group_count_y, group_count_z), of the compute shader dispatched by the API.
sample_index fragment in u32 Sample index for the current fragment. The value is least 0 and at most sampleCount-1, where sampleCount is the number of MSAA samples specified for the GPU render pipeline.
See WebGPU § GPURenderPipeline.
sample_mask fragment in u32 Sample coverage mask for the current fragment. It contains a bitmask indicating which samples in this fragment are covered by the primitive being rendered.
See WebGPU § Sample Masking.
sample_mask fragment out u32 Sample coverage mask control for the current fragment. The last value written to this variable becomes the shader-output mask. Zero bits in the written value will cause corresponding samples in the color attachments to be discarded.
See WebGPU § Sample Masking.
EXAMPLE: Declaring built-in variables
 struct VertexOutput {
   [[builtin(position)]] my_pos: vec4<f32>;
   //   OpDecorate %my_pos BuiltIn Position
   //   %float = OpTypeFloat 32
   //   %v4float = OpTypeVector %float 4
   //   %ptr = OpTypePointer Output %v4float
   //   %my_pos = OpVariable %ptr Output
 };

 [[stage(vertex)]]
 fn vs_main(
   [[builtin(vertex_index)]] my_index: u32,
   //   OpDecorate %my_index BuiltIn VertexIndex
   //   %uint = OpTypeInt 32 0
   //   %ptr = OpTypePointer Input %uint
   //   %my_index = OpVariable %ptr Input
   [[builtin(instance_index)]] my_inst_index: u32,
   //    OpDecorate %my_inst_index BuiltIn InstanceIndex
 ) -> VertexOutput;

 struct FragmentOutput {
   [[builtin(frag_depth)]] depth: f32;
   //     OpDecorate %depth BuiltIn FragDepth
   [[builtin(sample_mask)]] mask_out: u32;
   //      OpDecorate %mask_out BuiltIn SampleMask ; an output variable
 };

 [[stage(fragment)]]
 fn fs_main(
   [[builtin(front_facing)]] is_front: bool,
   //     OpDecorate %is_front BuiltIn FrontFacing
   [[builtin(position)]] coord: vec4<f32>,
   //     OpDecorate %coord BuiltIn FragCoord
   [[builtin(sample_index)]] my_sample_index: u32,
   //      OpDecorate %my_sample_index BuiltIn SampleId
   [[builtin(sample_mask_in)]] mask_in: u32,
   //      OpDecorate %mask_in BuiltIn SampleMask ; an input variable
   //      OpDecorate %mask_in Flat
 ) -> FragmentOutput;

 [[stage(compute)]]
 fn cs_main(
   [[builtin(local_invocation_id)]] local_id: vec3<u32>,
   //     OpDecorate %local_id BuiltIn LocalInvocationId
   [[builtin(local_invocation_index)]] local_index: u32,
   //     OpDecorate %local_index BuiltIn LocalInvocationIndex
   [[builtin(global_invocation_id)]] global_id: vec3<u32>,
   //      OpDecorate %global_id BuiltIn GlobalInvocationId
);

16. Built-in functions

Certain functions are always available in a WGSL program, and are provided by the implementation. These are called built-in functions.

Since a built-in function is always in scope, it is an error to attempt to redefine one or to use the name of a built-in function as an identifier for any other kind of declaration.

Unlike ordinary functions defined in a WGSL program, a built-in function may use the same function name with different sets of parameters. In other words, a built-in function may have more than one overload, but ordinary function definitions in WGSL may not.

When calling a built-in function, all arguments to the function are evaluated before function evaulation begins.

TODO(dneto): Elaborate the descriptions of the built-in functions. So far I’ve only reorganized the contents of the existing table.

TODO: Explain the use of a function prototype in the table: provides name, formal parameter list, and return type. That’s not a full user-defined function declaration.

16.1. Logical built-in functions

Precondition Conclusion Notes
e: vecN<bool> all(e): bool Returns true if each component of e is true. (OpAll)
e: vecN<bool> any(e): bool Returns true if any component of e is true. (OpAny)
T is a scalar or vector select(T,T,bool): T select(a,b,c) evaluates to a when c is true, and b otherwise. (OpSelect)
T is a scalar select(vecN<T>,vecN<T>,vecN<bool>: vecN<T> Component-wise selection. Result component i is evaluated as select(a[i],b[i],c[i]). (OpSelect)

16.2. Value-testing built-in functions

Unary operators
Precondition Conclusion Notes
e: T
T is f32 or vecN<f32>
TR is bool if T is a scalar, or
vecN<bool> if T is a vector
isNan(e) ->TR Test for NaN according to IEEE-754.
Component-wise when T is a vector. (OpIsNan)
isInf(e) ->TR Test for infinity according to IEEE-754.
Component-wise when T is a vector. (OpIsInf)
isFinite(e) ->TR Test a finite value according to IEEE-754.
Component-wise when T is a vector.
isNormal(e) ->TR Test a normal value according to IEEE-754.
Component-wise when T is a vector.
e: ptr<storage,array<T>> arrayLength(e): u32 Returns the number of elements in the runtime-sized array.
(OpArrayLength, but the implementation has to trace back to get the pointer to the enclosing struct.)

16.3. Float built-in functions

Precondition Conclusion Description
T is f32 or vecN<f32> abs(e: T ) -> T Returns the absolute value of e (e.g. e with a positive sign bit). Component-wise when T is a vector. (GLSLstd450Fabs)
T is f32 or vecN<f32> acos(e: T ) -> T Returns the arc cosine of e. Component-wise when T is a vector. (GLSLstd450Acos)
T is f32 or vecN<f32> asin(e: T ) -> T Returns the arc sine of e. Component-wise when T is a vector. (GLSLstd450Asin)
T is f32 or vecN<f32> atan(e: T ) -> T Returns the arc tangent of e. Component-wise when T is a vector. (GLSLstd450Atan)
T is f32 or vecN<f32> atan2(e1: T ,e2: T ) -> T Returns the arc tangent of e1 over e2. Component-wise when T is a vector. (GLSLstd450Atan2)
T is f32 or vecN<f32> ceil(e: T ) -> T Returns the ceiling of e. Component-wise when T is a vector. (GLSLstd450Ceil)
T is f32 or vecN<f32> clamp(e1: T ,e2: T ,e3: T) -> T Returns min(max(e1,e2),e3). Component-wise when T is a vector. (GLSLstd450NClamp)
T is f32 or vecN<f32> cos(e: T ) -> T Returns the cosine of e. Component-wise when T is a vector. (GLSLstd450Cos)
T is f32 or vecN<f32> cosh(e: T ) -> T Returns the hyperbolic cosine of e. Component-wise when T is a vector (GLSLstd450Cosh)
T is f32 cross(e1: vec3<T> ,e2: vec3<T>) -> vec3<T> Returns the cross product of e1 and e2. (GLSLstd450Cross)
T is f32 or vecN<f32> distance(e1: T ,e2: T ) -> T Returns the distance between e1 and e2 (e.g. length(e1-e2)). Component-wise when T is a vector. (GLSLstd450Distance)
T is f32 or vecN<f32> exp(e1: T ) -> T Returns the natural exponentiation of e1 (e.g. ee1). Component-wise when T is a vector. (GLSLstd450Exp)
T is f32 or vecN<f32> exp2(e: T ) -> T Returns 2 raised to the power e (e.g. 2e). Component-wise when T is a vector. (GLSLstd450Exp2)
T is vecN<f32> faceForward(e1: T ,e2: T ,e3: T ) -> T Returns e1 if dot(e2,e3) is negative, and -e1 otherwise. (GLSLstd450FaceForward)
T is f32 or vecN<f32> floor(e: T ) -> T Returns the floor of e. Component-wise when T is a vector. (GLSLstd450Floor)
T is f32 fma(e1: T ,e2: T ,e3: T ) -> T Returns e1 * e2 + e3. Component-wise when T is a vector. (GLSLstd450Fma)
T is f32 or vecN<f32> fract(e: T ) -> T Returns the fractional bits of e (e.g. e - floor(e)). Component-wise when T is a vector. (GLSLstd450Fract)
T is f32 or vecN<f32>
I is i32 or vecN<i32>, where
I is a scalar if T is a scalar, or
a vector when T is a vector
SC is a storage class
A is write or read_write
frexp(e1: T ,e2: ptr<SC,I,A> ) -> T Splits e1 into a signficand and exponent of the form significand * 2exponent and returns the significand. The magnitude of the signficand is in the range of [0.5, 1.0) or 0. The exponent is stored through the pointer e2. Component-wise when T is a vector. (GLSLstd450Frexp)
T is f32 or vecN<f32> inverseSqrt(e: T ) -> T Returns the reciprocal of sqrt(e). Component-wise when T is a vector. (GLSLstd450InverseSqrt)
T is f32 or vecN<f32>
I is i32 or vecN<i32>, where
I is a scalar if T is a scalar, or
a vector when T is a vector
ldexp(e1: T ,e2: I ) -> T Returns e1 * 2e2. Component-wise when T is a vector. (GLSLstd450Ldexp)
T is f32 or vecN<f32> length(e: T ) -> T Returns the length of e (e.g. abs(e) if T is a scalar, or sqrt(e[0]2 + e[1]2 + ...) if T is a vector). Component-wise when T is a vector. (GLSLstd450Length)
T is f32 or vecN<f32> log(e: T ) -> T Returns the natural logaritm of e. Component-wise when T is a vector. (GLSLstd450Log)
T is f32 or vecN<f32> log2(e: T ) -> T Returns the base-2 logarithm of e. Component-wise when T is a vector. (GLSLstd450Log2)
T is f32 or vecN<f32> max(e1: T ,e2: T ) -> T Returns e2 if e1 is less than e2, and e1 otherwise. If one operand is a NaN, the other is returned. If both operands are NaNs, a NaN is returned. Component-wise when T is a vector. (GLSLstd450NMax)
T is f32 or vecN<f32> min(e1: T ,e2: T ) -> T Returns e2 if e2 is less than e1, and e1 otherwise. If one operand is a NaN, the other is returned. If both operands are NaNs, a NaN is returned. Component-wise when T is a vector. (GLSLstd450NMin)
T is f32 or vecN<f32> mix(e1: T ,e2: T ,e3: T) -> T Returns the linear blend of e1 and e2 (e.g. e1*(1-e3)+e2*e3). Component-wise with T is a vector. (GLSLstd450FMix)
T is f32 or vecN<f32> SC is a storage class
A is write or read_write
modf(e1: T ,e2: ptr<SC,T,A> ) -> T Splits e1 into fractional and whole number parts. Returns the fractional part and stores the whole number through the pointer e2. Component-wise when T is a vector. (GLSLstd450Modf)
T is f32 normalize(e: vecN<T> ) -> vecN<T> Returns a unit vector in the same direction as e. (GLSLstd450Normalize)
T is f32 or vecN<f32> pow(e1: T ,e2: T ) -> T Returns e1 raised to the power e2. Component-wise when T is a vector. (GLSLstd450Pow)
T is f32 or vecN<f32> quantizeToF16(e: T ) -> T Quantizes a 32-bit floating point value e as if e were converted to a IEEE 754 binary16 value, and then converted back to a IEEE 754 binary32 value.
See § 12.5.2 Floating point conversion.
Component-wise when T is a vector.

Note: The vec2<f32> case is the same as unpack2x16float(pack2x16float(e)).

(OpQuantizeToF16)

T is vecN<f32> reflect(e1: T ,e2: T ) -> T For the incident vector e1 and surface orientation e2, returns the reflection direction e1-2*dot(e2,e1)*e2. (GLSLstd450Reflect)
T is vecN<f32>
I is f32
refract(e1: T ,e2: T ,e3: I ) -> T For the incident vector e1 and surface normal e2, and the ratio of indices of refraction e3, let k = 1.0 -e3*e3* (1.0 - dot(e2,e1) * dot(e2,e1)). If k < 0.0, returns the refraction vector 0.0, otherwise return the refraction vector e3*e1- (e3* dot(e2,e1) + sqrt(k)) *e2. (GLSLstd450Refract)
T is f32 or vecN<f32> round(e: T ) -> T Result is the integer k nearest to e, as a floating point value.
When e lies halfway between integers k and k+1, the result is k when k is even, and k+1 when k is odd.
Component-wise when T is a vector. (GLSLstd450RoundEven)
T is f32 or vecN<f32> sign(e: T ) -> T Returns the sign of e. Component-wise when T is a vector. (GLSLstd450FSign)
T is f32 or vecN<f32> sin(e: T ) -> T Returns the sine of e. Component-wise when T is a vector. (GLSLstd450Sin)
T is f32 or vecN<f32> sinh(e: T ) -> T Returns the hyperbolic sine of e. Component-wise when T is a vector. (GLSLstd450Sinh)
T is f32 or vecN<f32> smoothStep(e1: T ,e2: T ,e3: T ) -> T Returns the smooth Hermite interpolation between 0 and 1. Component-wise when T is a vector. (GLSLstd450SmoothStep)
T is f32 or vecN<f32> sqrt(e: T ) -> T Returns the square root of e. Component-wise when T is a vector. (GLSLstd450Sqrt)
T is f32 or vecN<f32> step(e1: T ,e2: T ) -> T Returns 0.0 if e1 is less than e2, and 1.0 otherwise. Component-wise when T is a vector. (GLSLstd450Step)
T is f32 or vecN<f32> tan(e: T ) -> T Returns the tangent of e. Component-wise when T is a vector. (GLSLstd450Tan)
T is f32 or vecN<f32> tanh(e: T ) -> T Returns the hyperbolic tangent of e. Component-wise when T is a vector. (GLSLstd450Tanh)
T is f32 or vecN<f32> trunc(e: T ) -> T Returns the nearest whole number whose absolute value is less than or equal to e. Component-wise when T is a vector. (GLSLstd450Trunc)

16.4. Integer built-in functions

Precondition Conclusion Description
T is i32 or vecN<i32> abs(e: T ) -> T The absolute value of e. Component-wise when T is a vector. If e evaluates to the largest negative value, then the result is e. (GLSLstd450SAbs)
T is u32 or vecN<u32> abs(e: T ) -> T Result is e. This is provided for symmetry with abs for signed integers. Component-wise when T is a vector.
T is u32 or vecN<u32> clamp(e1: T ,e2: T,e3: T) -> T Returns min(max(e1,e2),e3). Component-wise when T is a vector. (GLSLstd450UClamp)
T is i32 or vecN<i32> clamp(e1: T ,e2: T,e3: T) -> T Returns min(max(e1,e2),e3). Component-wise when T is a vector. (GLSLstd450SClamp)
T is i32, u32, vecN<i32>, or vecN<u32> countOneBits(e: T ) -> T The number of 1 bits in the representation of e.
Also known as "population count".
Component-wise when T is a vector. (SPIR-V OpBitCount)
T is u32 or vecN<u32> max(e1: T ,e2: T) -> T Returns e2 if e1 is less than e2, and e1 otherwise. Component-wise when T is a vector. (GLSLstd450UMax)
T is i32 or vecN<i32> max(e1: T ,e2: T) -> T Returns e2 if e1 is less than e2, and e1 otherwise. Component-wise when T is a vector. (GLSLstd450SMax)
T is u32 or vecN<u32> min(e1: T ,e2: T) -> T Returns e1 if e1 is less than e2, and e2 otherwise. Component-wise when T is a vector. (GLSLstd450UMin)
T is i32 or vecN<i32> min(e1: T ,e2: T) -> T Returns e1 if e1 is less than e2, and e2 otherwise. Component-wise when T is a vector. (GLSLstd45SUMin)
T is i32, u32, vecN<i32>, or vecN<u32> reverseBits(e: T ) -> T Reverses the bits in e: The bit at position k of the result equals the bit at position 31-k of e.
Component-wise when T is a vector. (SPIR-V OpBitReverse)

16.5. Matrix built-in functions

Precondition Conclusion Description
T is f32 determinant(e: matNxN<T> ) -> T Returns the determinant of e. (GLSLstd450Determinant)
T is f32 transpose(e: matMxN<T> ) -> matNxM<T> Returns the transpose of e. (OpTranspose)

16.6. Vector built-in functions

Precondition Conclusion Notes
T is f32 dot(e1: vecN<T>,e2: vecN<T>) -> T Returns the dot product of e1 and e2. (OpDot)

16.7. Derivative built-in functions

See § 12.4.2 Derivatives.

These functions:

Precondition Conclusion Notes
T is f32 or vecN<f32> dpdx(e:T) -> T Partial derivative of e with respect to window x coordinates. The result is the same as either dpdxFine(e) or dpdxCoarse(e). (OpDPdx)
dpdxCoarse(e:T) ->T Returns the partial derivative of e with respect to window x coordinates using local differences. This may result in fewer unique positions that dpdxFine(e). (OpDPdxCoarse)
dpdxFine(e:T) ->T Returns the partial derivative of e with respect to window x coordinates. (OpDPdxFine)
dpdy(e:T) ->T Partial derivative of e with respect to window y coordinates. The result is the same as either dpdyFine(e) or dpdyCoarse(e). (OpDPdy)
dpdyCoarse(e:T) ->T Returns the partial derivative of e with respect to window y coordinates using local differences. This may result in fewer unique positions that dpdyFine(e). (OpDPdyCoarse)
dpdyFine(e:T) ->T Returns the partial derivative of e with respect to window y coordinates. (OpDPdyFine)
fwidth(e:T) ->T Returns abs(dpdx(e)) + abs(dpdy(e)). (OpFwidth)
fwidthCoarse(e:T) ->T Returns abs(dpdxCoarse(e)) + abs(dpdyCoarse(e)). (OpFwidthCoarse)
fwidthFine(e:T) ->T Returns abs(dpdxFine(e)) + abs(dpdyFine(e)). (OpFwidthFine)

16.8. Texture built-in functions

In this section, texture types are shown with the following parameters:

Parameter values must be valid for the respective texture types.

16.8.1. textureDimensions

Returns the dimensions of a texture, or texture’s mip level in texels.

textureDimensions(t: texture_1d<T>) -> i32
textureDimensions(t: texture_1d<T>, level: i32) -> i32
textureDimensions(t: texture_2d<T>) -> vec2<i32>
textureDimensions(t: texture_2d<T>, level: i32) -> vec2<i32>
textureDimensions(t: texture_2d_array<T>) -> vec2<i32>
textureDimensions(t: texture_2d_array<T>, level: i32) -> vec2<i32>
textureDimensions(t: texture_3d<T>) -> vec3<i32>
textureDimensions(t: texture_3d<T>, level: i32) -> vec3<i32>
textureDimensions(t: texture_cube<T>) -> vec2<i32>
textureDimensions(t: texture_cube<T>, level: i32) -> vec2<i32>
textureDimensions(t: texture_cube_array<T>) -> vec2<i32>
textureDimensions(t: texture_cube_array<T>, level: i32) -> vec2<i32>
textureDimensions(t: texture_multisampled_2d<T>)-> vec2<i32>
textureDimensions(t: texture_depth_2d) -> vec2<i32>
textureDimensions(t: texture_depth_2d, level: i32) -> vec2<i32>
textureDimensions(t: texture_depth_2d_array) -> vec2<i32>
textureDimensions(t: texture_depth_2d_array, level: i32) -> vec2<i32>
textureDimensions(t: texture_depth_cube) -> vec2<i32>
textureDimensions(t: texture_depth_cube, level: i32) -> vec2<i32>
textureDimensions(t: texture_depth_cube_array) -> vec2<i32>
textureDimensions(t: texture_depth_cube_array, level: i32) -> vec2<i32>
textureDimensions(t: texture_storage_1d<F,A>) -> i32
textureDimensions(t: texture_storage_2d<F,A>) -> vec2<i32>
textureDimensions(t: texture_storage_2d_array<F,A>) -> vec2<i32>
textureDimensions(t: texture_storage_3d<F,A>) -> vec3<i32>
textureDimensions(t: texture_external) -> vec2<i32>

Parameters:

t The sampled, multisampled, depth, storage, or external texture.
level The mip level, with level 0 containing a full size version of the texture.
If omitted, the dimensions of level 0 are returned.

Returns:

The dimensions of the texture in texels.

For textures based on cubes, the results are the dimensions of each face of the cube. Cube faces are square, so the x and y components of the result are equal.

If level is outside the range [0, textureNumLevels(t)) then any valid value for the return type may be returned.

16.8.2. textureLoad

Reads a single texel from a texture without sampling or filtering.

textureLoad(t: texture_1d<T>, coords: i32, level: i32) -> vec4<T>
textureLoad(t: texture_2d<T>, coords: vec2<i32>, level: i32) -> vec4<T>
textureLoad(t: texture_2d_array<T>, coords: vec2<i32>, array_index: i32, level: i32) -> vec4<T>
textureLoad(t: texture_3d<T>, coords: vec3<i32>, level: i32) -> vec4<T>
textureLoad(t: texture_multisampled_2d<T>, coords: vec2<i32>, sample_index: i32)-> vec4<T>
textureLoad(t: texture_depth_2d, coords: vec2<i32>, level: i32) -> f32
textureLoad(t: texture_depth_2d_array, coords: vec2<i32>, array_index: i32, level: i32) -> f32
textureLoad(t: texture_external, coords: vec2<i32>) -> vec4<f32>

Parameters:

t The sampled, multisampled, depth, or external texture.
coords The 0-based texel coordinate.
array_index The 0-based texture array index.
level The mip level, with level 0 containing a full size version of the texture.
sample_index The 0-based sample index of the multisampled texture.

Returns:

The unfiltered texel data.

An out of bounds access occurs if:

If an out of bounds access occurs, the built-in function returns one of:

16.8.3. textureNumLayers

Returns the number of layers (elements) of an array texture.

textureNumLayers(t: texture_2d_array<T>) -> i32
textureNumLayers(t: texture_cube_array<T>) -> i32
textureNumLayers(t: texture_depth_2d_array) -> i32
textureNumLayers(t: texture_depth_cube_array) -> i32
textureNumLayers(t: texture_storage_2d_array<F,A>) -> i32

Parameters:

t The sampled, multisampled, depth or storage array texture.

Returns:

If the number of layers (elements) of the array texture.

16.8.4. textureNumLevels

Returns the number of mip levels of a texture.

textureNumLevels(t: texture_1d<T>) -> i32
textureNumLevels(t: texture_2d<T>) -> i32
textureNumLevels(t: texture_2d_array<T>) -> i32
textureNumLevels(t: texture_3d<T>) -> i32
textureNumLevels(t: texture_cube<T>) -> i32
textureNumLevels(t: texture_cube_array<T>) -> i32
textureNumLevels(t: texture_depth_2d) -> i32
textureNumLevels(t: texture_depth_2d_array) -> i32
textureNumLevels(t: texture_depth_cube) -> i32
textureNumLevels(t: texture_depth_cube_array) -> i32

Parameters:

t The sampled or depth texture.

Returns:

If the number of mip levels for the texture.

16.8.5. textureNumSamples

Returns the number samples per texel in a multisampled texture.

textureNumSamples(t: texture_multisampled_2d<T>) -> i32

Parameters:

t The multisampled texture.

Returns:

If the number of samples per texel in the multisampled texture.

16.8.6. textureSample

Samples a texture.

Must only be used in a fragment shader stage. Must only be invoked in uniform control flow.

textureSample(t: texture_1d<f32>, s: sampler, coords: f32) -> vec4<f32>
textureSample(t: texture_2d<f32>, s: sampler, coords: vec2<f32>) -> vec4<f32>
textureSample(t: texture_2d<f32>, s: sampler, coords: vec2<f32>, offset: vec2<i32>) -> vec4<f32>
textureSample(t: texture_2d_array<f32>, s: sampler, coords: vec2<f32>, array_index: i32) -> vec4<f32>
textureSample(t: texture_2d_array<f32>, s: sampler, coords: vec2<f32>, array_index: i32, offset: vec2<i32>) -> vec4<f32>
textureSample(t: texture_3d<f32>, s: sampler, coords: vec3<f32>) -> vec4<f32>
textureSample(t: texture_3d<f32>, s: sampler, coords: vec3<f32>, offset: vec3<i32>) -> vec4<f32>
textureSample(t: texture_cube<f32>, s: sampler, coords: vec3<f32>) -> vec4<f32>
textureSample(t: texture_cube_array<f32>, s: sampler, coords: vec3<f32>, array_index: i32) -> vec4<f32>
textureSample(t: texture_depth_2d, s: sampler, coords: vec2<f32>) -> f32
textureSample(t: texture_depth_2d, s: sampler, coords: vec2<f32>, offset: vec2<i32>) -> f32
textureSample(t: texture_depth_2d_array, s: sampler, coords: vec2<f32>, array_index: i32) -> f32
textureSample(t: texture_depth_2d_array, s: sampler, coords: vec2<f32>, array_index: i32, offset: vec2<i32>) -> f32
textureSample(t: texture_depth_cube, s: sampler, coords: vec3<f32>) -> f32
textureSample(t: texture_depth_cube_array, s: sampler, coords: vec3<f32>, array_index: i32) -> f32

Parameters:

t The sampled, depth, or external texture to sample.
s The sampler type.
coords The texture coordinates used for sampling.
array_index The 0-based texture array index to sample.
offset The optional texel offset applied to the unnormalized texture coordinate before sampling the texture. This offset is applied before applying any texture wrapping modes.
offset must be compile time constant, and may only be provided as a literal or const_expr expression (e.g. vec2<i32>(1, 2)).
Each offset component must be at least -8 and at most 7. Values outside of this range will be treated as a compile time error.

Returns:

The sampled value.

16.8.7. textureSampleBias

Samples a texture with a bias to the mip level.

Must only be used in a fragment shader stage. Must only be invoked in uniform control flow.

textureSampleBias(t: texture_2d<f32>, s: sampler, coords: vec2<f32>, bias: f32) -> vec4<f32>
textureSampleBias(t: texture_2d<f32>, s: sampler, coords: vec2<f32>, bias: f32, offset: vec2<i32>) -> vec4<f32>
textureSampleBias(t: texture_2d_array<f32>, s: sampler, coords: vec2<f32>, array_index: i32, bias: f32) -> vec4<f32>
textureSampleBias(t: texture_2d_array<f32>, s: sampler, coords: vec2<f32>, array_index: i32, bias: f32, offset: vec2<i32>) -> vec4<f32>
textureSampleBias(t: texture_3d<f32>, s: sampler, coords: vec3<f32>, bias: f32) -> vec4<f32>
textureSampleBias(t: texture_3d<f32>, s: sampler, coords: vec3<f32>, bias: f32, offset: vec3<i32>) -> vec4<f32>
textureSampleBias(t: texture_cube<f32>, s: sampler, coords: vec3<f32>, bias: f32) -> vec4<f32>
textureSampleBias(t: texture_cube_array<f32>, s: sampler, coords: vec3<f32>, array_index: i32, bias: f32) -> vec4<f32>

Parameters:

t The texture to sample.
s The sampler type.
coords The texture coordinates used for sampling.
array_index The 0-based texture array index to sample.
bias The bias to apply to the mip level before sampling. bias must be between -16.0 and 15.99.
offset The optional texel offset applied to the unnormalized texture coordinate before sampling the texture. This offset is applied before applying any texture wrapping modes.
offset must be compile time constant, and may only be provided as a literal or const_expr expression (e.g. vec2<i32>(1, 2)).
Each offset component must be at least -8 and at most 7. Values outside of this range will be treated as a compile time error.

Returns:

The sampled value.

16.8.8. textureSampleCompare

Samples a depth texture and compares the sampled depth values against a reference value.

Must only be used in a fragment shader stage. Must only be invoked in uniform control flow.

textureSampleCompare(t: texture_depth_2d, s: sampler_comparison, coords: vec2<f32>, depth_ref: f32) -> f32
textureSampleCompare(t: texture_depth_2d, s: sampler_comparison, coords: vec2<f32>, depth_ref: f32, offset: vec2<i32>) -> f32
textureSampleCompare(t: texture_depth_2d_array, s: sampler_comparison, coords: vec2<f32>, array_index: i32, depth_ref: f32) -> f32
textureSampleCompare(t: texture_depth_2d_array, s: sampler_comparison, coords: vec2<f32>, array_index: i32, depth_ref: f32, offset: vec2<i32>) -> f32
textureSampleCompare(t: texture_depth_cube, s: sampler_comparison, coords: vec3<f32>, depth_ref: f32) -> f32
textureSampleCompare(t: texture_depth_cube_array, s: sampler_comparison, coords: vec3<f32>, array_index: i32, depth_ref: f32) -> f32

Parameters:

t The depth texture to sample.
s The sampler comparision type.
coords The texture coordinates used for sampling.
array_index The 0-based texture array index to sample.
depth_ref The reference value to compare the sampled depth value against.
offset The optional texel offset applied to the unnormalized texture coordinate before sampling the texture. This offset is applied before applying any texture wrapping modes.
offset must be compile time constant, and may only be provided as a literal or const_expr expression (e.g. vec2<i32>(1, 2)).
Each offset component must be at least -8 and at most 7. Values outside of this range will be treated as a compile time error.

Returns:

A value in the range [0.0..1.0].

Each sampled texel is compared against the reference value using the comparision operator defined by the sampler_comparison, resulting in either a 0 or 1 value for each texel.

If the sampler uses bilinear filtering then the returned value is the filtered average of these values, otherwise the comparision result of a single texel is returned.

16.8.9. textureSampleCompareLevel

Samples a depth texture and compares the sampled depth values against a reference value.

textureSampleCompareLevel(t: texture_depth_2d, s: sampler_comparison, coords: vec2<f32>, depth_ref: f32) -> f32
textureSampleCompareLevel(t: texture_depth_2d, s: sampler_comparison, coords: vec2<f32>, depth_ref: f32, offset: vec2<i32>) -> f32
textureSampleCompareLevel(t: texture_depth_2d_array, s: sampler_comparison, coords: vec2<f32>, array_index: i32, depth_ref: f32) -> f32
textureSampleCompareLevel(t: texture_depth_2d_array, s: sampler_comparison, coords: vec2<f32>, array_index: i32, depth_ref: f32, offset: vec2<i32>) -> f32
textureSampleCompareLevel(t: texture_depth_cube, s: sampler_comparison, coords: vec3<f32>, depth_ref: f32) -> f32
textureSampleCompareLevel(t: texture_depth_cube_array, s: sampler_comparison, coords: vec3<f32>, array_index: i32, depth_ref: f32) -> f32

The textureSampleCompareLevel function is the same as textureSampleCompare, except that:

16.8.10. textureSampleGrad

Samples a texture using explicit gradients.

textureSampleGrad(t: texture_2d<f32>, s: sampler, coords: vec2<f32>, ddx: vec2<f32>, ddy: vec2<f32>) -> vec4<f32>
textureSampleGrad(t: texture_2d<f32>, s: sampler, coords: vec2<f32>, ddx: vec2<f32>, ddy: vec2<f32>, offset: vec2<i32>) -> vec4<f32>
textureSampleGrad(t: texture_2d_array<f32>, s: sampler, coords: vec2<f32>, array_index: i32, ddx: vec2<f32>, ddy: vec2<f32>) -> vec4<f32>
textureSampleGrad(t: texture_2d_array<f32>, s: sampler, coords: vec2<f32>, array_index: i32, ddx: vec2<f32>, ddy: vec2<f32>, offset: vec2<i32>) -> vec4<f32>
textureSampleGrad(t: texture_3d<f32>, s: sampler, coords: vec3<f32>, ddx: vec3<f32>, ddy: vec3<f32>) -> vec4<f32>
textureSampleGrad(t: texture_3d<f32>, s: sampler, coords: vec3<f32>, ddx: vec3<f32>, ddy: vec3<f32>, offset: vec3<i32>) -> vec4<f32>
textureSampleGrad(t: texture_cube<f32>, s: sampler, coords: vec3<f32>, ddx: vec3<f32>, ddy: vec3<f32>) -> vec4<f32>
textureSampleGrad(t: texture_cube_array<f32>, s: sampler, coords: vec3<f32>, array_index: i32, ddx: vec3<f32>, ddy: vec3<f32>) -> vec4<f32>

Parameters:

t The texture to sample.
s The sampler type.
coords The texture coordinates used for sampling.
array_index The 0-based texture array index to sample.
ddx The x direction derivative vector used to compute the sampling locations.
ddy The y direction derivative vector used to compute the sampling locations.
offset The optional texel offset applied to the unnormalized texture coordinate before sampling the texture. This offset is applied before applying any texture wrapping modes.
offset must be compile time constant, and may only be provided as a literal or const_expr expression (e.g. vec2<i32>(1, 2)).
Each offset component must be at least -8 and at most 7. Values outside of this range will be treated as a compile time error.

Returns:

The sampled value.

16.8.11. textureSampleLevel

Samples a texture using an explicit mip level, or at mip level 0.

textureSampleLevel(t: texture_2d<f32>, s: sampler, coords: vec2<f32>, level: f32) -> vec4<f32>
textureSampleLevel(t: texture_2d<f32>, s: sampler, coords: vec2<f32>, level: f32, offset: vec2<i32>) -> vec4<f32>
textureSampleLevel(t: texture_2d_array<f32>, s: sampler, coords: vec2<f32>, array_index: i32, level: f32) -> vec4<f32>
textureSampleLevel(t: texture_2d_array<f32>, s: sampler, coords: vec2<f32>, array_index: i32, level: f32, offset: vec2<i32>) -> vec4<f32>
textureSampleLevel(t: texture_3d<f32>, s: sampler, coords: vec3<f32>, level: f32) -> vec4<f32>
textureSampleLevel(t: texture_3d<f32>, s: sampler, coords: vec3<f32>, level: f32, offset: vec3<i32>) -> vec4<f32>
textureSampleLevel(t: texture_cube<f32>, s: sampler, coords: vec3<f32>, level: f32) -> vec4<f32>
textureSampleLevel(t: texture_cube_array<f32>, s: sampler, coords: vec3<f32>, array_index: i32, level: f32) -> vec4<f32>
textureSampleLevel(t: texture_depth_2d, s: sampler, coords: vec2<f32>, level: i32) -> f32
textureSampleLevel(t: texture_depth_2d, s: sampler, coords: vec2<f32>, level: i32, offset: vec2<i32>) -> f32
textureSampleLevel(t: texture_depth_2d_array, s: sampler, coords: vec2<f32>, array_index: i32, level: i32) -> f32
textureSampleLevel(t: texture_depth_2d_array, s: sampler, coords: vec2<f32>, array_index: i32, level: i32, offset: vec2<i32>) -> f32
textureSampleLevel(t: texture_depth_cube, s: sampler, coords: vec3<f32>, level: i32) -> f32
textureSampleLevel(t: texture_depth_cube_array, s: sampler, coords: vec3<f32>, array_index: i32, level: i32) -> f32
textureSampleLevel(t: texture_external, s: sampler, coords: vec2<f32>) -> vec4<f32>

Parameters:

t The sampled or depth texture to sample.
s The sampler type.
coords The texture coordinates used for sampling.
array_index The 0-based texture array index to sample.
level The mip level, with level 0 containing a full size version of the texture. For the functions where level is a f32, fractional values may interpolate between two levels if the format is filterable according to the Texture Format Capabilities.
When not specified, mip level 0 is sampled.
offset The optional texel offset applied to the unnormalized texture coordinate before sampling the texture. This offset is applied before applying any texture wrapping modes.
offset must be compile time constant, and may only be provided as a literal or const_expr expression (e.g. vec2<i32>(1, 2)).
Each offset component must be at least -8 and at most 7. Values outside of this range will be treated as a compile time error.

Returns:

The sampled value.

16.8.12. textureStore

Writes a single texel to a texture.

textureStore(t: texture_storage_1d<F,write>, coords: i32, value: vec4<T>)
textureStore(t: texture_storage_2d<F,write>, coords: vec2<i32>, value: vec4<T>)
textureStore(t: texture_storage_2d_array<F,write>, coords: vec2<i32>, array_index: i32, value: vec4<T>)
textureStore(t: texture_storage_3d<F,write>, coords: vec3<i32>, value: vec4<T>)

The channel format T depends on the storage texel format F. See the texel format table for the mapping of texel format to channel format.

Parameters:

t The write-only storage texture.
coords The 0-based texel coordinate.
array_index The 0-based texture array index.
value The new texel value.

Note:

An out-of-bounds access occurs if:

If an out-of-bounds access occurs, the built-in function may do any of the following:

TODO:

TODO(dsinclair): Need gather operations

16.9. Atomic built-in functions

Atomic built-in functions can be used to read/write/read-modify-write atomic objects. They are the only operations allowed on § 4.2.7 Atomic Types.

All atomic built-in functions use a relaxed memory ordering (0-value integral constant in SPIR-V for all Memory Semantics operands). This means synchronization and ordering guarantees only apply among atomic operations acting on the same memory locations. No synchronization or ordering guarantees apply between atomic and non-atomic memory accesses, or between atomic accesses acting on different memory locations.

Atomic built-in functions must not be used in a vertex shader stage.

The storage class SC of the atomic_ptr parameter in all atomic built-in functions must be either storage or workgroup. workgroup atomics have a Workgroup memory scope in SPIR-V, while storage atomics have a Device memory scope in SPIR-V.

The access mode A in all atomic built-in functions must be read_write.

TODO: Add links to the eventual memory model descriptions.

16.9.1. Atomic Load

atomicLoad(atomic_ptr: ptr<SC, atomic<T>, A>) -> T

// Maps to the SPIR-V instruction OpAtomicLoad.

Returns the atomically loaded the value pointed to by atomic_ptr. It does not modify the object.

16.9.2. Atomic Store

atomicStore(atomic_ptr: ptr<SC, atomic<T>, A>, v: T)

// Maps to the SPIR-V instruction OpAtomicStore.

Atomically stores the value v in the atomic object pointed to by atomic_ptr.

16.9.3. Atomic Read-Modify-Write

atomicAdd(atomic_ptr: ptr<SC, atomic<T>, A>, v: T) -> T
atomicMax(atomic_ptr: ptr<SC, atomic<T>, A>, v: T) -> T
atomicMin(atomic_ptr: ptr<SC, atomic<T>, A>, v: T) -> T
atomicAnd(atomic_ptr: ptr<SC, atomic<T>, A>, v: T) -> T
atomicOr(atomic_ptr: ptr<SC, atomic<T>, A>, v: T) -> T
atomicXor(atomic_ptr: ptr<SC, atomic<T>, A>, v: T) -> T

// Mappings to SPIR-V instructions:
// atomicAdd -> OpAtomicIAdd
// atomicMax -> OpAtomicSMax or OpAtomicUMax (depending on the signedness of T)
// atomicMin -> OpAtomicSMin or OpAtomicUMin (depending on the signedness of T)
// atomicAnd -> OpAtomicAnd
// atomicOr  -> OpAtomicOr
// atomicXor -> OpAtomicXor
Each function performs the following steps atomically:
  1. Load the original value pointed to by atomic_ptr.

  2. Obtains a new value by performing the operation (e.g. max) from the function name with the value v.

  3. Store the new value using atomic_ptr.

Each function returns the original value stored in the atomic object.

atomicExchange(atomic_ptr: ptr<SC, atomic<T>, A>, v: T) -> T

// Maps to the SPIR-V instruction OpAtomicExchange.

Atomically stores the value v in the atomic object pointed to atomic_ptr and returns the original value stored in the atomic object.

atomicCompareExchangeWeak(atomic_ptr: ptr<SC, atomic<T>, A>, cmp: T, v: T) -> vec2<T>

// Maps to the SPIR-V instruction OpAtomicCompareExchange.

Performs the following steps atomically:

  1. Load the original value pointed to by atomic_ptr.

  2. Compare the original value to the value v using an equality operation.

  3. Store the value v only if the result of the equality comparison was true.

Returns a two-element vector, where the first element is the original value of the atomic object and the second element is whether or not the comparison succeeded (1 if successful, 0 otherwise).

Note: the equality comparison may spuriously fail on some implementations. That is, the second element of the result vector may be 0 even if the first element of the result vector equals cmp.

16.10. Data packing built-in functions

Data packing builtin functions can be used to encode values using data formats that do not correspond directly to types in WGSL. This enables a program to write many densely packed values to memory, which can reduce a shader’s memory bandwidth demand.

Conclusion Notes
pack4x8snorm(e: vec4<f32>) -> u32 Converts four normalized floating point values to 8-bit signed integers, and then combines them into one u32 value.
Component e[i] of the input is converted to an 8-bit twos complement integer value ⌊ 0.5 + 127 × min(1, max(-1, e[i])) ⌋ which is then placed in bits 8 × i through 8 × i + 7 of the result.
pack4x8unorm(e: vec4<f32>) -> u32 Converts four normalized floating point values to 8-bit unsigned integers, and then combines them into one u32 value.
Component e[i] of the input is converted to an 8-bit unsigned integer value ⌊ 0.5 + 255 × min(1, max(0, e[i])) ⌋ which is then placed in bits 8 × i through 8 × i + 7 of the result.
pack2x16snorm(e: vec2<f32>) -> u32 Converts two normalized floating point values to 16-bit signed integers, and then combines them into one u32 value.
Component e[i] of the input is converted to a 16-bit twos complement integer value ⌊ 0.5 + 32767 × min(1, max(-1, e[i])) ⌋ which is then placed in bits 16 × i through 16 × i + 15 of the result.
pack2x16unorm(e: vec2<f32>) -> u32 Converts two normalized floating point values to 16-bit unsigned integers, and then combines them into one u32 value.
Component e[i] of the input is converted to a 16-bit unsigned integer value ⌊ 0.5 + 65535 × min(1, max(0, e[i])) ⌋ which is then placed in bits 16 × i through 16 × i + 15 of the result.
pack2x16float(e: vec2<f32>) -> u32 Converts two floating point values to half-precision floating point numbers, and then combines them into one one u32 value.
Component e[i] of the input is converted to a IEEE-754 binary16 value, which is then placed in bits 16 × i through 16 × i + 15 of the result. See § 12.5.2 Floating point conversion.

16.11. Data unpacking built-in functions

Data unpacking builtin functions can be used to decode values in data formats that do not correspond directly to types in WGSL. This enables a program to read many densely packed values from memory, which can reduce a shader’s memory bandwidth demand.

Conclusion Notes
unpack4x8snorm(e: u32) -> vec4<f32> Decomposes a 32-bit value into four 8-bit chunks, then reinterprets each chunk as a signed normalized floating point value.
Component i of the result is max(v ÷ 127, -1), where v is the interpretation of bits 8×i through 8×i+7 of e as a twos-complement signed integer.
unpack4x8unorm(e: u32) -> vec4<f32> Decomposes a 32-bit value into four 8-bit chunks, then reinterprets each chunk as an unsigned normalized floating point value.
Component i of the result is v ÷ 255, where v is the interpretation of bits 8×i through 8×i+7 of e as an unsigned integer.
unpack2x16snorm(e: u32) -> vec2<f32> Decomposes a 32-bit value into two 16-bit chunks, then reinterprets each chunk as a signed normalized floating point value.
Component i of the result is max(v ÷ 32767, -1), where v is the interpretation of bits 16×i through 16×i+15 of e as a twos-complement signed integer.
unpack2x16unorm(e: u32) -> vec2<f32> Decomposes a 32-bit value into two 16-bit chunks, then reinterprets each chunk as an unsigned normalized floating point value.
Component i of the result is v ÷ 65535, where v is the interpretation of bits 16×i through 16×i+15 of e as an unsigned integer.
unpack2x16float(e: u32) -> vec2<f32> Decomposes a 32-bit value into two 16-bit chunks, and reinterpets each chunk as a floating point value.
Component i of the result is the f32 representation of v, where v is the interpretation of bits 16×i through 16×i+15 of e as an IEEE-754 binary16 value. See § 12.5.2 Floating point conversion.

16.12. Synchronization built-in functions

WGSL provides the following synchronization functions:

fn storageBarrier()
fn workgroupBarrier()

All synchronization functions execute a control barrier with Acquire/Release memory ordering. That is, all synchronization functions, and affected memory and atomic operations are ordered in § 12.1.2 Program order (within an invocation) TODO relative to the synchronization function. Additionally, the affected memory and atomic operations program-ordered before the synchronization function must be visible to all other threads in the workgroup before any affected memory or atomic operation program-ordered after the synchronization function is executed by a member of the workgroup. All synchronization functions must only be used in the compute shader stage.

storageBarrier affects memory and atomic operations in the storage storage class.

workgroupBarrier affects memory and atomic operations in the workgroup storage class.

TODO: Add links to the eventual memory model.

EXAMPLE: Mapping workgroupBarrier to SPIR-V
storageBarrier();
// Maps to:
// Execution Scope is Workgroup = %uint_2
// Memory Scope is Device = %uint_1
// Memory Semantics are AcquireRelease | UniformMemory (0x8 | 0x40) = %uint_72
// OpControlBarrier %uint_2 %uint_1 %uint_72

workgroupBarrier();
// Maps to:
// Execution and Memory Scope are Workgroup = %uint_2
// Memory semantics are AcquireRelease | WorkgroupMemory (0x8 | 0x100) = %uint_264
// OpControlBarrier %uint_2 %uint_2 %uint_264

workgroupBarrier();
storageBarrier();
// Or, equivalently:
storageBarrier();
workgroupBarrier();
// Could be mapped to a single OpControlBarrier:
// Execution scope is Workgroup = %uint_2
// Memory Scope is Device = %uint_1
// Memory semantics are AcquireRelease | UniformMemory | WorkgroupMemory
//   (0x8 | 0x40 | 0x100) = %uint_328
// OpControlBarrier %uint_2 %uint_1 %uint_328

16.13. Value-steering functions

Conclusion Notes
ignore(e: T) Evaluates e, and then ignores the result.
Type T is any non-reference type.

17. MATERIAL TO BE MOVED TO A NEW HOME OR DELETED

WGSL has operations for:

Conformance

Document conventions

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

Conformant Algorithms

Requirements phrased in the imperative as part of algorithms (such as "strip any leading space characters" or "return false and abort these steps") are to be interpreted with the meaning of the key word ("must", "should", "may", etc) used in introducing the algorithm.

Conformance requirements phrased as algorithms or specific steps can be implemented in any manner, so long as the end result is equivalent. In particular, the algorithms defined in this specification are intended to be easy to understand and are not intended to be performant. Implementers are encouraged to optimize.

Index

Terms defined by this specification

Terms defined by reference

References

Normative References

[IEEE-754]
IEEE Standard for Floating-Point Arithmetic. 29 August 2008. URL: http://ieeexplore.ieee.org/servlet/opac?punumber=4610933
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://tools.ietf.org/html/rfc2119
[SPIR-V]
John Kessenich; Boaz Ouriel; Raun Krisch. SPIR-V Specification. URL: https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html
[VulkanMemoryModel]
Jeff Bolz; et al. Vulkan Memory Model. URL: https://www.khronos.org/registry/vulkan/specs/1.2-extensions/html/vkspec.html#memory-model
[WebGPU]
Dzmitry Malyshau; Justin Fan; Kai Ninomiya. WebGPU. Editor's Draft. URL: https://gpuweb.github.io/gpuweb/

Issues Index

The WebGPU pipeline creation API must specify how API-supplied values are mapped to shader scalar values. For booleans, I suggest using a 32-bit integer, where only 0 maps to false. If WGSL gains non-32-bit numeric scalars, I recommend overridable constants continue being 32-bit numeric types.
§ 12.2.1 Uniform control flow TODO needs to state whether all invocations being discarded maintains uniform control flow.
Revisit aliasing rules for clarity.
Can we query upper bounds on workgroup size dimensions? Is it independent of the shader, or a property to be queried after creating the shader module?
in Vulkan, builtin variables occoupy I/O location slots counting toward limits.
WebGPU issue 1045: Dispatch group counts must be positive. However, how do we handle an indirect dispatch that specifies a group count of zero.
(dneto) Default rounding mode is an implementation choice. Is that what we want?
Check behaviour of the f32 to f16 conversion for numbers just beyond the max normal f16 values. I’ve written what an NVIDIA GPU does. See https://github.com/google/amber/pull/918 for an executable test case.