Meeting minutes
Repository: webmachinelearning/webnn
Proposed Charter 2023-2025 under W3C Advisory Committee review
anssik: Please make sure your Advisory Committee representative brings input (and ideally support) on the review.
… You'll find your AC rep contact details at https://
… The deadline for responses is 03:59 UTC 5 April 2023
… or 23:59, Boston time on 4 April 2023
dom: nothing to add; it is important to get support from as many members as possible
WebNN - WebIDL and Infra standard conventions
anssik: we have aligned the core parts of the API with WebIDL conventions to satisfy CR criteria. These are documented as "Done" in webmachinelearning/
… we have editorial enhancements that have been waiting for the CR publication to happen
… because we want to keep the spec in a consistent and cohesive state for the CR publication before we start landing these
… we will discuss 2 of those enhancements today, Zoltan has worked on these, much thanks, Ningxin reviewed
Sync and async algorithms
anssik: issue #316
<ghurlbot> Issue 316 Review sync vs async compute differences (zolkis) Editorial
anssik: PR #329
<ghurlbot> Pull Request 329 Rework the sync async algorithms based on #323 (zolkis)
zoltan: a 2nd take on #323
<ghurlbot> Pull Request 323 [closed] Transfer the input and output views for asynchronous execution (huningxin)
zoltan: updated based on ningxin's changes, factoring out the common part of sync & async executions
… e.g. now we have validate and execute graph steps, referenced from both sync & async methods
… it helps avoid repetition - mostly editorial
… but would like clarity on whether to merge it now to avoid having too many branches in parallel
… to me, it can be merged for our CR release
… would like chai to take a look so that we can merge it soon
<ningxin_hu> webmachinelearning/
ningxin_hu: +1 to merging it for CR - it's an editorial improvement that also fixes #341
<ghurlbot> Issue 341 Should validate MLGraph.[[context]] in MLContext.compute() and MLContext.computeSync() steps (huningxin) question
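A minimal sketch of the factoring described in PR #329, with the [[context]] check from issue #341 shown as one of the shared validation steps. The step names validateGraphResources and executeGraph are illustrative assumptions, not the spec's actual algorithm names:

```js
// Illustrative only, not the spec's algorithm text.
function validateGraphResources(context, graph, inputs, outputs) {
  // Issue #341: the graph's [[context]] must be the context being invoked.
  if (graph.context !== context) throw new TypeError();
  // ... validate input/output names, shapes and types against the graph ...
}

function executeGraph(graph, inputs, outputs) {
  // ... dispatch the compiled graph to the platform backend ...
}

// Both entry points now reference the same shared steps:
function computeSync(context, graph, inputs, outputs) {
  validateGraphResources(context, graph, inputs, outputs);
  executeGraph(graph, inputs, outputs);
}

async function compute(context, graph, inputs, outputs) {
  validateGraphResources(context, graph, inputs, outputs);
  // transfer the views (PR #323), run executeGraph in parallel,
  // then resolve the promise with fresh views
}
```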
chai: I'll take a look at it, this week or next
RafaelCintron: what happens if the Web developer calls the async method passing ArrayBuffers, and changes the values of an ArrayBuffer before the compute promise resolves?
zoltan: that's handled by the transfer-input and transfer-output steps which ningxin fixed
ningxin: right - we've fixed that race condition while avoiding copy overhead in #323
<ghurlbot> Pull Request 323 [closed] Transfer the input and output views for asynchronous execution (huningxin)
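To illustrate the race Rafael asked about: a rough sketch of the post-#323 behaviour, assuming an existing MLContext context and a built MLGraph graph, and the compute() shape of the spec at the time (named ArrayBuffer views in, a promise resolving with fresh views out):

```js
const input = new Float32Array([1, 2, 3, 4]);
const output = new Float32Array(4);

const promise = context.compute(graph, { x: input }, { y: output });

// compute() transferred (detached) the views' backing buffers, so this write
// cannot race with the execution in progress; it simply has no effect:
input[0] = 42;

// The promise resolves with new views over the transferred buffers:
const { outputs } = await promise;
console.log(outputs.y);
```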
MLOperand and MLActivation internal slots
anssik: issue #336
<ghurlbot> Issue 336 Add internal slots to MLOperand, MLActivation and basic algorithms (zolkis) Editorial
anssik: PR #337
<ghurlbot> Pull Request 337 Add internal slots to MLOperand and MLActivation (zolkis)
zoltan: this is to help fix the lack of clarity around MLActivation (née MLOperator)
… #337 uses internal slots for the usual attributes, but also to link to the operator
… I was also experimenting with describing constructors, but removed it based on Ningxin's feedback
… Ningxin reviewed the PR, other feedback welcome
… MLActivation also has an implementation internal slot that will be needed for other algorithm improvements to come in other PRs
… there may be additional changes needed as we improve the algorithms of other functions
… but we can do this iteratively
… this PR is a prerequisite to start these iterations
ningxin: could we have separate PRs for internal slots for MLOperand and MLActivation?
zoltan: we could - right now they're separate commits
… (but further amendments aren't)
… I think they belong to the same PR because they're needed as a combination to iterate
Ningxin: I need to take another look now that you've incorporated my feedback
zoltan: no matter what, I'll need an MLActivation internal slot; but I think the current PR should be consistent with your feedback
… let's see what Ningxin says - if there are still open issues on MLActivation, I'll split the PR
<ningxin_hu> sgtm
zoltan: but if not, then we could land it as one
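To make the internal-slot discussion concrete, here is a sketch of how an implementation might mirror the slots PR #337 adds; the field names are approximations drawn from the discussion, not the PR's exact slot list:

```js
// Hypothetical mirror of the spec's internal slots; names are illustrative.
class MLOperandImpl {
  constructor(builder, descriptor) {
    this.builder = builder;       // the MLGraphBuilder that created it
    this.descriptor = descriptor; // "[[descriptor]]": dataType and dimensions
    this.operator = null;         // "[[operator]]": the operator producing it
  }
}

class MLActivationImpl {
  constructor(builder, name) {
    this.builder = builder;
    this.name = name;             // e.g. "relu"
    this.implementation = null;   // "[[implementation]]": to be filled in by
                                  // the algorithm improvements in follow-up PRs
  }
}
```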
WebNN - enhancements, editorials, questions
Simplify the operand layout support of conv2d and pooling 2d operations
anssik: issue #324
<ghurlbot> Issue 324 Simplify the operand layout support of conv2d and pooling 2d operations (huningxin) enhancement
anssik: "In the existing WebNN spec, conv2d supports two input operand layouts defined by MLInputOperandLayout and four filter operand layouts defined by MLConv2dFilterOperandLayout."
ningxin_hu: this emerged from implementation feedback
… not all OSes support all layouts
… which forces implementations to emulate the unsupported layouts
… it's mostly a matter of inserting a transpose operation in the underlying implementation
… but that adds overhead in the graph compute
… when the transpose is applied to a constant, it is done in the build phase, but for the input tensor, it would have to happen at compute time
… limiting the supported input and filter layouts would let that overhead be handled by the frameworks
… we need the WG's input on which layouts to support
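For reference, the two layout knobs issue #324 proposes to simplify, as a developer sees them; a sketch assuming an MLGraphBuilder builder and existing input and filter operands, with the enum values quoted from the spec above:

```js
const conv = builder.conv2d(input, filter, {
  inputLayout: 'nhwc',   // MLInputOperandLayout: "nchw" | "nhwc"
  filterLayout: 'ohwi',  // MLConv2dFilterOperandLayout: "oihw" | "hwio" | "ohwi" | "ihwo"
});
```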
dom: is there any layout that is supported across all platforms?
ningxin: XNNPACK only supports nhwc, DirectML supports both
… other platforms need more investigation
anssi: is transpose implemented yet? how complete is it?
ningxin_hu: not implemented yet, but native APIs already have it and we could investigate it
anssi: great example of implementation feedback we're seeking
… it probably needs to stay open to gather input
chai: the layout only matters when the hardware prefers something, transposing to match the hardware layout loses performance
… ONNX took the position to stick with one layout and leaves the rest to the backend
… but here WebNN is the backend and has to pick a layout
Anssi: is this blocking implementation work, or can this be left open for a while?
ningxin: it isn't blocking - right now we're throwing an exception if the layout isn't supported
… there probably should be a better way of exposing that to developers
dom: is the throwing behaviour defined in the spec?
ningxin_hu: it probably needs to be defined in the algorithm steps
dom: I think it is more than an enhancement; some of Zoltan's changes are very specific, whereas here the layout is tied to hardware support, something we can address
ningxin: right, based on current spec, an implementation would have to support all layouts
… throwing is an implementation decision that doesn't match the spec at this point
chai: layout is not something you need to query the hardware about to understand what it supports
… it's at the framework level they would define the type of input layout they want
… once that input layout is translated in the backend, the backend has to adapt to the "historical format preference"
… when WebNN compiles the graph (in the build step), that's the time it needs to query the driver for its preference (or anti-preference)
… in some cases, it's only a preference, in some cases a requirement
… the backend can resolve this by inserting a transpose, or manipulating the stride using jump reads instead of block reads
… it depends on the performance characteristics
… a transpose may waste time compared to a stride since it forces a copy
… sometimes it's worth it if there are 50 layers of convolutions with a layout preference
… WebNN as a backend has to support both layouts
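A sketch of the transpose-based emulation described above, assuming an MLGraphBuilder builder, an nchw input operand nchwInput, and a filter already in a supported layout; the permutations are the standard nchw/nhwc axis reorders:

```js
// Emulate inputLayout "nchw" on an nhwc-only backend:
const nhwcInput = builder.transpose(nchwInput, { permutation: [0, 2, 3, 1] });
const nhwcOutput = builder.conv2d(nhwcInput, filter, { inputLayout: 'nhwc' });
// Transpose back so the caller still observes nchw:
const nchwOutput = builder.transpose(nhwcOutput, { permutation: [0, 3, 1, 2] });
```

For a constant this folds away at build time; for a graph input it runs on every compute, which is the overhead under discussion.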
anssi: so I'm hearing WebNN needs to be liberal in what it accepts
Chai: yes, WebNN needs to be flexible in the formats it can handle
… the implementation will make it happen with the info from the driver
… the implementation cannot fail on unsupported formats, it needs to carry on
… if it can't be done, it should fail at the compile stage