Meeting minutes
Repository: webmachinelearning/webnn
Anssi: please join me in welcoming the latest new participants to the WG:
… Yoav Weiss from Shopify
… Fidel Tian from Zoom
… Ali Spivak from Google
Web Neural Network API
Lower limit for conv2d/pool2d kernel sizes, dilations, strides
Anssi: issue #928
<gb> Issue 928 Consider specify lower limit for conv2d/pool2d kernel sizes, dilations, strides (by philloooo) [security-tracker] [Agenda+]
Anssi: I'd like to have a group discussion on the edge cases reported in the issue that may cause integer overflows
… and also discuss the proposed mitigations to be codified in the specification
… Phillis reports she found edge cases in pool2d and conv2d when "unreasonably large" sizes for the parameters are used
… this may cause integer overflows in some backends
… Phillis notes in real-life use cases these parameters never have such large values
… and thus proposes as mitigations parameter upper bounds that are "reasonable for all use cases"
… this helps with fuzz testing by limiting scope
… initial proposal suggests the following size limits:
… kernel size 1024
… strides and dilations 256
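As a hedged sketch only, the initially proposed bounds could be enforced at graph-build time roughly like this; the limit constants come from the proposal above, while the `Conv2dParams` shape and the `validateConv2dParams` helper are illustrative assumptions, not spec text:

```typescript
// Hypothetical upper bounds from the initial proposal (not spec-mandated).
const MAX_KERNEL_SIZE = 1024;
const MAX_STRIDE = 256;
const MAX_DILATION = 256;

// Illustrative parameter bag; the real API spreads these across MLConv2dOptions.
interface Conv2dParams {
  kernelHeight: number;
  kernelWidth: number;
  strides: [number, number];
  dilations: [number, number];
}

// Returns an error message for the first out-of-range parameter, or null if all pass.
function validateConv2dParams(p: Conv2dParams): string | null {
  if (p.kernelHeight < 1 || p.kernelHeight > MAX_KERNEL_SIZE ||
      p.kernelWidth < 1 || p.kernelWidth > MAX_KERNEL_SIZE) {
    return `kernel size must be in [1, ${MAX_KERNEL_SIZE}]`;
  }
  for (const s of p.strides) {
    if (s < 1 || s > MAX_STRIDE) return `strides must be in [1, ${MAX_STRIDE}]`;
  }
  for (const d of p.dilations) {
    if (d < 1 || d > MAX_DILATION) return `dilations must be in [1, ${MAX_DILATION}]`;
  }
  return null;
}
```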
Phillis: Dillon, who works on TFLite, proposed checks to catch "fishy" parameter sizes
<tarek> Sorry I am late.
Dwayne: I agree with limits at a high level, but I want to be cautious about picking arbitrary ranges that we merely feel should be safe
… express in terms of inputs instead, some of our samples e.g. SD use more than the initially proposed limits
… want to be careful to not break things with these limits
… Dillon's statements are in terms of input sizes and as such relative metrics
Anssi: "these rules would catch some valid graphs, but [...] unlikely that "real" graphs would violate these checks"
Dwayne: would help to have collection of cases where we have int overflows in backends
Anssi: Phillis, can that information be shared in public?
Phillis: will check with the team
Ningxin: I don't yet have an update from the investigation on my side
MarkusT: should WebNN do this, or should the individual backends? Could we expose the underlying limits of the backend, and would that cause fragmentation?
Dwayne: I'll review Dillon's checks with additional information provided
conv2d output channels validation
Anssi: issue #925
<gb> Issue 925 Mandate `output_channels % groups == 0` validation for conv2d (by lynne0326) [security-tracker] [Agenda+]
Phillis: Lynne from Google works on WebNN
Anssi: let's discuss the issue and review the proposed spec change
Phillis: this issue also triggers some errors in the backends, and we'd like to add the proposed validation for conv2d to the spec
… proposed spec change: "The specification must mandate that the total number of output channels is a multiple of the groups attribute."
Anssi: motivation: "This eliminates an entire class of potential buffer overflows before the graph is ever compiled."
Anssi: as future work, add the same validation to convTranspose2d if it gains support for groups > 1, which could otherwise cause an OOB write
<DwayneR> I think this restriction should be okay.
Ningxin: I think this is a good addition; looking at the conv2d groups attribute, it already mentions channels are divided into groups, so we just need to add this to the validation steps, +1 from me
MarkusT: it is a good change, +1 from me
Anssi: Dwayne +1'd
RESOLUTION: Add the proposed mitigation to conv2d spec to eliminate potential buffer overflows. (issue #925)
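The resolved check is a one-line divisibility test; as a minimal sketch (the `isValidConv2dGrouping` helper name is hypothetical, not spec text):

```typescript
// Resolved conv2d validation (issue #925): the total number of output
// channels must be an integral multiple of the groups attribute.
function isValidConv2dGrouping(outputChannels: number, groups: number): boolean {
  return Number.isInteger(outputChannels) && Number.isInteger(groups) &&
         groups >= 1 && outputChannels >= 1 &&
         outputChannels % groups === 0;
}
```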
MLComputePolicy naming
Anssi: PR #923
<gb> Pull Request 923 Refactor device selection: Rename to computePolicy, remove accelerated, and add fallback (by mingmingtasd) [device selection] [Agenda+]
Anssi: we are close to landing this PR
… the remaining thing we need to agree on is naming
… MikeW approved with "fallback" as the name
… I see Dwayne's "compatible" alternative name has merit and warrants discussion
… the staged PR has the following names for the MLComputePolicy:
enum MLComputePolicy {
"default",
"high-performance",
"low-power",
"fallback"
};
Anssi: the proposed alternative is "fallback" -> "compatible":
enum MLComputePolicy {
"default",
"high-performance",
"low-power",
"compatible"
};
Anssi: the Web Platform Design Principles document has a section called "Naming principles" to help spec authors choose names:
https://
Anssi: I put these two names, "fallback" and "compatible", through this test to inform the naming decision:
… - 12.1 Use common words - both "fallback" and "compatible" are readable US English
… - 12.2 Use ASCII names - both "fallback" and "compatible" pass this test with flying colors
… - 12.3 Consult others on naming - "fallback" is the more common name on the web platform
"fallback" name in WebGPU and CSS
"compatMode" name in DOM
… - 12.4 Use names that describe a purpose - "compatible" seems better aligned with TAG principles
… TAG says "Name things for what they do, not how they do it."
… "fallback" tells how to do it ("use fallback device")
… "compatible" describes behavior ("use device that provides maximum compatibility")
… - 12.5 Name things consistently - "compatible" seems more consistent within MLComputePolicy names
… TAG says "Naming schemes should aim for consistency, to avoid confusion"
… "compatible", "high-performance" and "low-power" all describe what is to be prioritized
… "fallback" describes how to do it
… the non-scientific TAG naming principles test score, using equal weight for all tests:
… - "fallback" 3/5 - fails tests 12.4 and 12.5
… - "compatible" 4/5 - fails test 12.3
… if we give more weight to naming consistency within the web platform APIs it is a tie
… my gut feeling is folks who try WebNN are also familiar with WebGPU
… I see preference from Dwayne and Zoltan for the "compatible" name, Markus seems indifferent?
MarkusH: I was wondering, as a follow-up, it would be nice to have an effective compute policy on Graph.devices
… slight preference for "fallback"
Zoltan: the original issue connected to this PR used "fallback" as the name
… the argument for "fallback" then was that WebGPU also uses it, but WebGPU ties it to an adapter concept that does not exist in WebNN
… so adapter and ComputePolicy do not go together so well
… still, a "fallback" ComputePolicy is not too far from WebGPU's fallback adapter
… just adding more considerations
<RafaelCintron> +1
Rafael: we prefer "fallback" to keep aligned with WebGPU naming
Dwayne: I can live with "fallback"
MarkusT: should we have a priority list of devices?
<zolkis> we started long ago with a priority list
Zoltan: we have an explainer with background discussion
RESOLUTION: Use MLComputePolicy "fallback" as the name for maximum compatibility preference. (PR #923)
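For illustration, the resolved enum values can be captured as a TypeScript union type; the names come from the staged PR and the resolution above, while the `isComputePolicy` guard is purely illustrative and not part of the API:

```typescript
// MLComputePolicy names as resolved in PR #923.
type MLComputePolicy = "default" | "high-performance" | "low-power" | "fallback";

const COMPUTE_POLICIES: readonly MLComputePolicy[] =
  ["default", "high-performance", "low-power", "fallback"];

// Hypothetical type guard for validating untrusted input against the enum.
function isComputePolicy(value: string): value is MLComputePolicy {
  return (COMPUTE_POLICIES as readonly string[]).includes(value);
}
```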
Anssi: other enhancements from this PR, to be addressed in a separate PR:
… - "low-latency" MLComputePolicy, example use case audio processing
… - "precision" MLContextOptions, to signal chopping off low bits is not preferred for this context
Bounded dynamic dimension
Anssi: issue #883
<gb> Issue 883 Support flexible input sizes (by huningxin) [feature request] [operator specific] [Agenda+]
Anssi: thanks Ningxin and Markus for providing new information for this consideration
… first, ORT Web does not currently support dimension bounds
… Ningxin proposed to add freeDimensionBounds similar to freeDimensionOverrides in WebNN EP session options, Guenther was pinged
Ningxin: I will catch up with Guenther
Anssi: Markus notes some backends can accept any shape size, so the min/max bounds should be considered hints
… this is backend-specific; it's a question to the ONNX team how to handle this in a compatible manner across all backends
Ningxin: two aspects: one is the ONNX Runtime API, the other is the EP interface; for the EP interface we can talk with a person on the ONNX team together with Rafael
… the SOTA image generation model Z-Image-Turbo cannot be supported by only having dynamic input dimensions, as in the previous proposal shared on Feb 11:
webmachinelearning/
<gb> https://github.com/webmachinelearning/webnn/issues/883
Anssi: to address this limitation, Bin and Wanming prototyped 9 new ops to enable Z-Image-Turbo successfully:
+ MLOperand mod(MLOperand a, MLOperand b, optional MLOperatorOptions options = {});
+ MLOperand shape(MLOperand input, optional MLOperatorOptions options = {});
+ MLOperand range(MLOperand start, MLOperand limit, MLOperand delta, optional MLOperatorOptions options = {});
+ MLOperand dynamicReshape(MLOperand input, MLOperand newShape, optional MLOperatorOptions options = {});
+ MLOperand dynamicExpand(MLOperand input, MLOperand newShape, optional MLOperatorOptions options = {});
+ MLOperand dynamicSlice(MLOperand input, MLOperand starts, MLOperand ends, optional MLOperatorOptions options = {});
+ MLOperand dynamicPad(MLOperand input, MLOperand pads, optional MLOperatorOptions options = {});
+ sequence<MLOperand> dynamicSplit(MLOperand input, MLOperand splits, unsigned long numOutputs, optional MLOperatorOptions options = {});
+ MLOperand dynamicResample2d(MLOperand input, MLOperand sizes, optional MLDynamicResample2dOptions options = {});
Ningxin: I want to get the group's feedback on these new proposed ops
… dynamic size calculation required within the graph, initial proposal allows developer to define dynamic dimensions at input, placeholder with name and bounds
… this new model and other Transformers.js models require shape calculation, using pad-to-chunks algorithm
… later we want to pad the input to multiple chunks, padding size needs to be calculated at runtime
… must get shape at runtime and return tensor shape in another tensor, int64 or uint32
… tensor size used by other calculations to calculate e.g. padding to chunk size value
… only static padding supported currently
… dynamic size should be accepted, that's why we proposed dynamicPad
… 9 proposed ops, one missing, sample element-wise unary op
… they are dynamic versions of existing ops, e.g. dynamicReshape
… range is a tensor generation op
… Z-Image-Turbo has three models
… this proposal adds 9 new ops on top of the previous proposal; both are required
… we can validate tensor shapes for static ops, but with the dynamic version of an op you don't know the output tensor size at build time
… next step to explore dispatch-time shape validation when all tensors' sizes are specified or can be inferred
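The pad-to-chunks arithmetic Ningxin describes can be sketched as a pure function; the `padToChunks` name and signature are illustrative assumptions, since in the actual proposal this value would be computed inside the graph at runtime (via `shape`, arithmetic ops such as `mod`, and `dynamicPad`) because the dimension size is unknown at build time:

```typescript
// Illustrative pad-to-chunks calculation: given a runtime dimension size and a
// chunk size, return how much padding rounds the dimension up to a whole
// number of chunks. In the proposal this arithmetic lives in the graph itself
// (shape -> arithmetic -> dynamicPad) rather than in host code.
function padToChunks(dimSize: number, chunkSize: number): number {
  if (chunkSize < 1 || dimSize < 0) throw new RangeError("invalid sizes");
  const remainder = dimSize % chunkSize;
  return remainder === 0 ? 0 : chunkSize - remainder;
}
```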
Anssi: multiple new models require these new features?
Ningxin: correct
MarkusT: thank you Ningxin for this work
Anssi: everyone happy with the direction and the proposed next steps?
Rafael: I'm supportive of this work if we can do this, given some platforms are more challenging wrt dynamic shapes
MarkusT: some NPUs may have problems supporting this, in those cases GPU or CPU could be used as a fallback
Ningxin: I think fallback compute policy provides a solution to that issue