W3C

– DRAFT –
WebML WG Teleconference – 20 October 2022


Attendees

Present
Anssi_Kostiainen, Bruce_Dai, Chai_Chaoweeraprasit, Ningxin_Hu, Rafael_Cintron, Zoltan_Kis
Regrets
Dominique_Hazael-Massieux
Chair
Anssi
Scribe
Anssi, anssik

Meeting minutes

WebML WG Charter 2023-2025 under development

anssik: Web Machine Learning Working Group Charter for 2023-2025 is now under development.
… Please review the draft PR and open issues, provide your comments and open new issues as appropriate to help shape the WG's technical scope.

Announcement

Charter PR

Charter open issues

anssik: in the charter PR I included the expected timeline:
… - Q4 '22: Charter development
… - Q1 '23: W3C Advisory Committee review
… - Q2 '23: Charter approved

anssik: I propose we quickly touch on each of the issues recorded based on feedback from the WG
… if you have immediate feedback or comments, feel free to queue yourself; otherwise please provide your feedback in the GH issues
… we expect to have a good draft charter ready by EOY 2022

RafaelCintron: how is the charter changing from last time?

anssik: up to the WG
… we can also ask for an extension

RafaelCintron: what is the proposal then, keep going or change?

RafaelCintron: I'm happy with our current charter, but happy to review any adjustments too

Features deferred to WebNN v2

https://github.com/webmachinelearning/webnn/labels/v2

anssik: "WebNN v2" is a construct that refers to the WebNN API spec post initial Candidate Recommendation
… v1 in itself is expected to be a useful API
… the WG has labeled a few issues as "v2", so please check out those and let us know if you have other suggestions

https://github.com/webmachinelearning/webnn/issues/128

Dedicated ML hardware accelerators: NPU, VPU, xPU

anssik: The initial version of WebNN specifies two device types, "cpu" and "gpu".
… However, the API is extensible with new device types, and in our discussions support for NPU, VPU, or XPU has come up as a new "v2" feature.
… The initial charter refers to "dedicated ML hardware accelerators" in its Motivation and Background, but if this is important we could be more explicit regarding NPU/VPU/XPU device type support.
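The device-type extensibility discussed above can be sketched as a small helper. This is purely illustrative: the spec at this point defines only "cpu" and "gpu", and the "npu" value and function names below are hypothetical.

```javascript
// Hypothetical sketch: validating a WebNN deviceType string, assuming
// "npu" were added in a future v2. Only "cpu" and "gpu" exist in the
// spec discussed in these minutes; everything else here is invented.
const V1_DEVICE_TYPES = ["cpu", "gpu"];
const V2_DEVICE_TYPES = [...V1_DEVICE_TYPES, "npu"];

function isKnownDeviceType(type, version = 1) {
  const known = version >= 2 ? V2_DEVICE_TYPES : V1_DEVICE_TYPES;
  return known.includes(type);
}

console.log(isKnownDeviceType("gpu"));     // true
console.log(isKnownDeviceType("npu"));     // false under v1
console.log(isKnownDeviceType("npu", 2));  // true under the sketched v2
```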

chai: I think NPU is quite important especially for mobile scenarios
… if we can make it part of the charter, it is quite important

anssik: is a unified hardware-agnostic WebNN API the WG's primary goal?

chai: I think so

ningxin_hu: talked with Chai about this unified agnostic behaviour
… some hardware platforms may not support e.g. some data types; the WebNN spec should define whether it allows feature detection so web apps can adapt to hardware differences
… also backwards compatibility and versioning considerations
… when we go to WebNN v2, how do we deal with versioning and feature detection?

zkis: to reinforce Chai, depending on the underlying platform, an op may run without an error, but this is dynamic; we cannot know it a priori
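The feature-detection concern raised here could look something like the following sketch. The shape of the capability query is entirely hypothetical and not part of the WebNN spec under discussion.

```javascript
// Hypothetical feature-detection sketch: a web app adapting to a
// device that supports only a subset of ops and data types. The
// "capabilities" object shape is invented for illustration.
const capabilities = {
  cpu: { ops: ["conv2d", "matmul", "softmax"], dataTypes: ["float32", "float16"] },
  npu: { ops: ["conv2d", "matmul"], dataTypes: ["float16"] }, // subset only
};

function supportsModel(device, requiredOps, requiredType) {
  const caps = capabilities[device];
  if (!caps) return false;
  return requiredOps.every((op) => caps.ops.includes(op)) &&
         caps.dataTypes.includes(requiredType);
}

console.log(supportsModel("npu", ["conv2d", "matmul"], "float16"));  // true
console.log(supportsModel("npu", ["conv2d", "softmax"], "float16")); // false
```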

chai: re fallback, there are two use cases: WebNN is a backend, with a framework on top
… in the 1st case the framework can handle fallback, e.g. CPU fallback
… in the 2nd use case, NPUs may get more popular but can only support a subset of ops
… there the fallback needs to happen below the WebNN API
… in the 1st use case the fallback happens above WebNN in the framework
… for example, given an "auto" device type, the fallback happens underneath the WebNN implementation
… the second use case is important to v2 I think, especially when the app says "I don't care, handle this for me"
… e.g. the app does not want to deal with the error codes
… if we make explicit which errors make WebNN fail, we have a possible fingerprinting issue
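The two fallback cases described above can be sketched as follows. The capability table, function names, and the op/device lists are all invented for illustration; only the split between framework-level and implementation-level fallback comes from the discussion.

```javascript
// Hypothetical sketch of the two fallback cases: fallback handled by
// the framework above WebNN vs. underneath the WebNN implementation.
// The capability table below is invented for illustration.
const supportedOps = {
  npu: new Set(["conv2d", "matmul"]),            // NPU: subset of ops
  cpu: new Set(["conv2d", "matmul", "softmax"]), // CPU: full set
};

// Case 1: the framework (above WebNN) partitions the graph itself,
// falling back per-op to a device it controls, e.g. a CPU path.
function frameworkPartition(ops, preferred) {
  return ops.map((op) => ({
    op,
    device: supportedOps[preferred].has(op) ? preferred : "cpu",
  }));
}

// Case 2: with an "auto"-style device type, the decision happens
// below the API; the app just gets a device that runs the whole graph.
function autoPickDevice(ops) {
  for (const device of ["npu", "cpu"]) {
    if (ops.every((op) => supportedOps[device].has(op))) return device;
  }
  return "cpu";
}

console.log(frameworkPartition(["conv2d", "softmax"], "npu"));
// conv2d stays on npu, softmax falls back to cpu
console.log(autoPickDevice(["conv2d", "matmul"])); // "npu"
console.log(autoPickDevice(["conv2d", "softmax"])); // "cpu"
```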

Set of ops supported must be more comprehensive

anssik: Chai shared that external partners are looking for a more comprehensive set of ops
… The current charter Scope enumerates a few common ones: "convolution, pooling, softmax, normalization, fully connected, activation, recurrent neural network (RNN) and long short-term memory (LSTM)".
… This is not meant to be an all-inclusive list and gives the WG the ability to adapt to changes in this landscape.
… At minimum, we should review the bullets in the Scope section, and see whether to explicitly mention some of the more recent work such as transformers.
… We want to give enough detail to give good direction without constraining the WG too much. The list of ops mentioned in the charter would be open-ended.

chai: transformer is a huge class of models
… this is a big class of emerging models, but not very informative
… natural language processing is friendlier to the audience
… NLP represents current transformer models
… in the future transformers may become less popular when the next hotter thing comes around, similarly to LSTMs in the past

ningxin_hu: I want to clarify what Chai said, do you mean the charter should focus more on usages e.g. NLP or computer vision?
… usage can change from time to time
… we use different ML techniques to address these usages
… architectures change, do you suggest we focus more on usages?

Chai: correct, given the popularity of these more recent models

Level of abstraction for neural net operations

anssik: WebNN explainer has a nice section that explains the rationale for the chosen level of abstraction for the neural network operations in WebNN API.
… Chai proposed we could integrate some of this explainer text into the next charter to provide more context on the level of abstraction. This could fit into the Scope section.
… I recall past discussion around this topic, for example Google was interested in XLA (Accelerated Linear Algebra) domain-specific compiler compatibility. XLA project seems to be moving to an open governance mode and is being decoupled from the TensorFlow project.
… I'd welcome someone from Google to talk about their plans and expectations with XLA and its input language HLO IR (High Level Operations), and how they see it being part of the WebNN implementation story.

ningxin_hu: XLA moved to open governance, OpenXLA
… Google previously proposed that; we investigated it with Chai, mapping ops to ONNX and XLA, and the gap was not so big to me
… I'd look forward to a concrete proposal from that community to understand the mapping to that abstraction and the gap
… there was another proposal, TOSA (Tensor Operator Set Architecture)
… we also investigated that; my question probably cannot be answered right now, but I'd like to understand whether we should follow up closely with one of these

chai: just quickly, I was aware of OpenXLA when they started; not sure what they intend to do with it
… it seems to be split from the TensorFlow project
… it is good for us to point out to them that they should strive for compat with WebNN

ningxin_hu: WebNN positions itself as a backend for frameworks
… another issue is graph compilers; we should make it clear what WebNN's position is in this stack regarding DL compilers
… is a compiler an implementation backend, does it use WebNN for codegen, or is it complementary to WebNN?
… Google raised an issue for MLIR to this WG in the past; that discussion did not last long, but there are some DL compilers that are actively developed

WebRTC coordination

anssik: We added an "Integration with real-time video processing" use case to the WebNN API spec based on learnings from our experimentation
… For the next charter, we could be more explicit and confident in Coordination re WebNN and WebRTC
… I made a suggestion in the issue and Dom +1'd it, so I'm thinking of tweaking the WebRTC coordination text accordingly.
… any further suggestions for perhaps even more explicit text for our WebRTC integration interests?

WebGPU interoperability

anssik: We discussed WebGPU interoperability expectations on our 6 October 2022 call and concluded working with WebGPU contributors is important for the success of the WebNN spec. I'd want us to revise the charter language around WebNN-WebGPU interoperability expectations accordingly.
… The initial charter mentions WebGPU in the context of Out of Scope and Coordination
… I think this needs to be revised to reflect our evolved thinking. For example:
… "to avoid overlap with existing work, generic primitives used by traditional machine learning algorithms such as base linear algebra operations are out of scope. The WebGL and WebGPU shaders and WebAssembly SIMD are expected to address these requirements, see the Coordination section for details."

ningxin_hu: the current charter addresses usage of custom ops; an early WebNN issue re custom ops was resolved by saying custom ops can be implemented with more generic APIs, Wasm SIMD etc.
… Rafael mentioned some use cases, e.g. super resolution, that require WebNN to interact with WebGPU via resource and buffer sharing
… this is not mentioned in the current charter; if this usage is important, I propose to make it more explicit

RafaelCintron: WebGPU interop is critical, and it should not compromise the perf of the API; details subject to discussion

Minutes manually created (not a transcript), formatted by scribe.perl version 192 (Tue Jun 28 16:55:30 2022 UTC).

Diagnostics

Maybe present: anssik, chai, RafaelCintron, zkis