WebML WG Teleconference – 30 January 2025

Meeting minutes

Repository: webmachinelearning/webnn

<tarek> o/

anssik: Happy Lunar New Year to our participants using lunar calendars!
… our PRC participants are taking time off to celebrate during this period
… welcome to Tarek Ziade from Mozilla, Mike Wasserman and Christine Hollingsworth from Google to the WebML WG
… also welcome to Stalgia Grigg from Bocoup and Mingyu Lei from Google, Yuichiro Tachibana from Hugging Face, Sushanth Rajasankar from Microsoft, Tarek Ziade from Mozilla joining the WebML CG!

Tarek: I work in the ML/AI team at Mozilla that integrates inference features into Firefox desktop
… my team is working to integrate Transformers.js into the browser extension; the first use case is PDF.js with alt text generation for images
… it runs locally via ONNX Runtime; we recently wrapped that API into a new trial API for Web Extensions, which does similar things to Transformers.js with its pipeline API
… with the Web Extensions API we additionally do caching and run inference in a separate process
… very interested in this group, learn about what is happening, hope to see everything converging to cool new stuff for web developers!

W3C Breakouts Day 2025

anssik: there's a call for breakout session proposals for W3C Breakouts Day 2025 on 26 March 2025
… the goal of the day is to foster discussion among the full W3C community about new or existing topics
… this is an opportunity to reach out to folks outside our WebML community
… breakout format is quite relaxed, can be e.g. a presentation and/or discussion
… duration max 1 hour
… deadline for breakout proposals 12 March
… no registration, anyone with a W3C account eligible, non-Members too
… proposals are submitted via new GH issues:

Propose a new breakout session

anssik: instructions linked from the issue template

anssik: you can check the earlier breakouts for inspiration:

2024 breakout proposals

2023 breakout proposals

anssik: more information available on GH:

W3C Breakouts Day 2025

anssik: please let me know if you'd be interested in suggesting a breakout session and I can help get it in
… one possible topic could be to present and discuss the new Community Group incubations

Device selection

anssik: PR #809 specifies the Proposed Minimum Viable Solution per device-selection-explainer.md

<gb> Pull Request 809 Remove MLDeviceType (by zolkis) [device selection]

anssik: and closes issues #749 and #302

<gb> Issue 302 API simplification: context types, context options, createContext() (by zolkis) [v2] [device selection]

<gb> Issue 749 MLContextOptions.deviceType seems unnecessary outside of conformance testing (by mwyrzykowski) [device selection]

device-selection-explainer.md

anssik: thank you Zoltan for the PR and Josh for review comments!
… summary of changes:
… - Remove MLDeviceType as explicit context option
… - Update MLContext so that it becomes device agnostic
… - Add algorithmic steps or notes to implementations on how to map power preference to devices
… the following change documented in the explainer is not in this PR:
… - Also, to align with GPUPowerPreference, we should remove the "default" MLPowerPreference, i.e. the lack of hints will result in creating a generic context.
… In addition, privacy considerations have been updated, reducing fingerprintable surface further
… the corresponding IDL change is the following:

-enum MLDeviceType {
-  "cpu",
-  "gpu",
-  "npu"
-};

 dictionary MLContextOptions {
-  MLDeviceType deviceType = "cpu";
   MLPowerPreference powerPreference = "default";
 };
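For calling code, the IDL change above could look roughly like the following. This is a minimal sketch, assuming the PR lands as shown: the `LegacyContextOptions` and `ContextOptions` types and the `toContextOptions` helper are hypothetical names introduced here for illustration, and the `navigator.ml.createContext()` call appears only in a comment because it needs a browser with WebNN enabled.

```typescript
// Power preference hints, mirroring the values of WebNN's MLPowerPreference enum.
type PowerPreference = "default" | "high-performance" | "low-power";

// Options as they looked before PR #809 (deviceType is what the PR removes).
interface LegacyContextOptions {
  deviceType?: "cpu" | "gpu" | "npu";
  powerPreference?: PowerPreference;
}

// Options after PR #809, following the IDL diff: only the hint remains.
interface ContextOptions {
  powerPreference?: PowerPreference;
}

// Hypothetical migration helper: drops deviceType and keeps only the hint.
function toContextOptions(legacy: LegacyContextOptions): ContextOptions {
  const options: ContextOptions = {};
  if (legacy.powerPreference !== undefined) {
    options.powerPreference = legacy.powerPreference;
  }
  return options;
}

// In a browser the result would be passed along as:
//   const context = await navigator.ml.createContext(toContextOptions(old));
```

The sketch leaves the mapping from power preference to an actual device entirely to the implementation, which is the point of making MLContext device agnostic.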

anssik: PR invites further review, I expect we are able to merge this soonish

Zoltan: didn't remove "default" because strictly speaking not part of this minimal change
… discussion on the explainer suggests we could keep it this way for now
… Josh brought up a point that the context creation needs some work, should we include it here or let it be another PR?

anssik: editors can decide on how to go about that

Zoltan: PTAL everyone

Dwayne: I will look at it today

Dom: should the explainer be reviewed by TAG and/or Privacy WG?

anssik: proposal to do that review in context of the next CRS

Operator set Wave 3

anssik: PR #805

<gb> Pull Request 805 Operator set wave 3 (by fdwr)

anssik: this sizable PR addresses a number of open issues
… thank you Dwayne for updates and Ningxin and Josh for your review comments
… first I'd like to check we're capturing all the issues this PR closes, current list:

<jsbell> (re: previous topic) To be explicit: plan to "review the change in context of the next CRS" SGTM

anssik: closes #93 - PR adds quantizeLinear and dequantizeLinear

<gb> Issue 93 Add QuantizeLinear and DequantizeLinear for mixed precision (by kpu) [opset] [feature request]

anssik: closes #467 - PR adds scatterElements, scatterND, gatherElements, gatherND (gather added earlier)

<gb> Issue 467 Where is scatter and gather op? (by muazhuda) [feature request] [operator specific]

anssik: closes #772 - PR adds MLSliceOptions

<gb> Issue 772 Support strides option for `slice` operator (by huningxin) [feature request] [operator specific]

anssik: closes #767? - we're adding scatter and gather ops, do we want "this operation can be generically emulated" box?

<gb> Issue 767 Request the decomposition for gatherElements, scatterElements and scatterND (by fujunwei) [operator specific]

anssik: closes #773 - PR adds reverse

<gb> Issue 773 Support `reverse` operator (by huningxin) [feature request] [operator specific]

anssik: closes #779 - PR adds blockwise broadcasting to quantizeLinear and dequantizeLinear

<gb> Issue 779 Support block-wise quantization (by huningxin) [operator specific]

anssik: the spec PR is annotated with "TODO:" for sections that welcome contributions
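As a reading aid for the quantizeLinear and dequantizeLinear additions, here is a minimal reference sketch in plain TypeScript. It is not the spec algorithm: it assumes a single per-tensor scale and zero point, an int8 output range, and `Math.round` for rounding (which handles exact halves differently from round-half-to-even). Block-wise broadcasting from #779 is not covered.

```typescript
// Reference emulation on plain arrays, assuming a scalar scale and zero point.
// quantize: q = clamp(round(x / scale) + zeroPoint, -128, 127)
function quantizeLinear(input: number[], scale: number, zeroPoint: number): number[] {
  return input.map((x) => {
    const q = Math.round(x / scale) + zeroPoint;
    return Math.min(127, Math.max(-128, q)); // clamp to the int8 range
  });
}

// dequantize: x = (q - zeroPoint) * scale
function dequantizeLinear(input: number[], scale: number, zeroPoint: number): number[] {
  return input.map((q) => (q - zeroPoint) * scale);
}
```

Values that fall outside the representable range saturate on quantization, so a quantize/dequantize round trip is lossy for them.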

Dwayne: thanks everyone for your feedback!

<jsbell> Thanks Dwayne!!!

Dwayne: algorithm steps for gather and scatter would welcome contributions
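To make the gather and scatter discussion concrete, the following is a rough 2-D reference sketch of gatherElements and scatterElements semantics on nested arrays. It is an illustration only, not proposed spec text: it assumes in-range, non-negative indices and an axis of 0 or 1, and the `Matrix` type is a name introduced here.

```typescript
type Matrix = number[][];

// axis 0: output[i][j] = input[indices[i][j]][j]
// axis 1: output[i][j] = input[i][indices[i][j]]
function gatherElements(input: Matrix, indices: Matrix, axis: 0 | 1): Matrix {
  return indices.map((row, i) =>
    row.map((index, j) => (axis === 0 ? input[index][j] : input[i][index]))
  );
}

// Starts from a copy of input, then writes each update at the indexed position.
function scatterElements(input: Matrix, indices: Matrix, updates: Matrix, axis: 0 | 1): Matrix {
  const output = input.map((row) => row.slice());
  indices.forEach((row, i) => {
    row.forEach((index, j) => {
      if (axis === 0) {
        output[index][j] = updates[i][j];
      } else {
        output[i][index] = updates[i][j];
      }
    });
  });
  return output;
}
```

The output of gatherElements takes the shape of `indices`, while the output of scatterElements takes the shape of `input`.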

JoshB: I did a rough pass over many of the issues, nothing more related to this

<jsbell> A reminder to use GitHub's magic keywords in PRs to link to issues: https://docs.github.com/en/issues/tracking-your-work-with-issues/using-issues/linking-a-pull-request-to-an-issue

WebNN v2 issue triage

anssik: I discussed our "v2" issues with our two-person triage team, so wanted to bring the proposals to the group:
… to recap, we've used the "v2" triage label when "issue is not considered a blocker for Proposed Recommendation"
… IOW, issues we expect to take a long time to settle, sometimes indefinitely if we choose a different design
… here are our proposals from "v2" triage:
… #714 - keep “v2”

<gb> Issue 714 Support multiple op sets / builders (by zolkis) [question] [v2] [opset]

anssik: #623 - remove “v2”, since #809 addresses the device type and #805 addresses QDQ - also noted in device-selection-explainer.md

<gb> Pull Request 805 Operator set wave 3 (by fdwr)

<gb> Issue 623 WebNN should support NPU and QDQ operations (by wchao1115) [v2] [opset] [feature request] [device selection]

<gb> Pull Request 809 Remove MLDeviceType (by zolkis) [device selection]

anssik: #375 - remove “v2”, partially addressed by #805

<gb> Issue 375 Support for transformers (by dontcallmedom) [v2] [opset]

anssik: #346 - closed per w3c/machine-learning-charter#37

<gb> CLOSED Issue 37 Core operator set, scope and coordination (by anssiko)

<gb> CLOSED Issue 346 WebNN and StableHLO opset compatibility (by anssiko) [v2] [opset]

anssik: #302 - remove “v2”, to be closed by #809

<gb> Issue 302 API simplification: context types, context options, createContext() (by zolkis) [v2] [device selection]

anssik: #6 keep “v2”

<gb> Issue 6 Custom operations (by dsmilkov) [v2] [device selection]

anssik: #1 keep “v2” — this is explored in the CG and could be relevant for future WebNN

<gb> Issue 1 Look into pre-canned models (by anssiko) [v2]

anssik: does this look good to you?
… we'll update the issue tracker accordingly

<dwayner> Sounds fine to me.

Disallow operations on scalar tensors that are no-ops

anssik: issue #794 was discussed on our previous call and we agreed to look at which ops scalars make sense for

<gb> Issue 794 Disallow operations on scalar tensors that are no-ops (by reillyeon) [operator specific]

anssik: Dwayne brought up a point that in a math sense adding scalars makes sense, but implementation complexity may suggest otherwise
… Ningxin provided Chromium implementation experience to fill in the operator <-> scalar support table
… now that the results are in, quoting Dwayne's proposal we'd next:
… "determine which backends support scalars already, which do not, and whether it's worth extra code to wrap that operator in a temporary reshape of 0D to 1D"
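The "temporary reshape of 0D to 1D" in the quoted proposal can be pictured with a small wrapper. This is a minimal sketch under stated assumptions: `Tensor`, `UnaryOp`, `reshape`, `withScalarSupport` and `backendRelu` are hypothetical stand-ins, not WebNN or backend APIs, and the stand-in op simply rejects 0-D input to mimic a backend without scalar support.

```typescript
interface Tensor {
  shape: number[]; // an empty shape means a 0-D (scalar) tensor
  data: number[];
}

type UnaryOp = (input: Tensor) => Tensor;

// Returns a view of the same data under a new shape with equal element count.
function reshape(tensor: Tensor, shape: number[]): Tensor {
  const size = shape.reduce((a, b) => a * b, 1);
  if (size !== tensor.data.length) {
    throw new Error("element count mismatch");
  }
  return { shape: shape.slice(), data: tensor.data };
}

// Wraps an op so that 0-D input is reshaped to [1] before the call
// and the result is reshaped back to 0-D afterwards.
function withScalarSupport(op: UnaryOp): UnaryOp {
  return (input) => {
    if (input.shape.length !== 0) {
      return op(input);
    }
    return reshape(op(reshape(input, [1])), []);
  };
}

// Hypothetical backend op that does not accept 0-D tensors.
const backendRelu: UnaryOp = (input) => {
  if (input.shape.length === 0) {
    throw new Error("backend does not support 0-D tensors");
  }
  return { shape: input.shape.slice(), data: input.data.map((x) => Math.max(0, x)) };
};
```

Whether the extra wrapping is worth it per operator is exactly the open question in the issue; the sketch only shows that the wrapping itself is mechanical.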

Dwayne: I will double-check Ningxin's feedback and update the table, and will check with Reilly if he has any reservations for TFLite

<jsbell> SG. I'll let Reilly and Phillis know.

Dwayne: seems like this is well on track, thanks!

anssik: any questions or comments?

Caching mechanism for MLGraph

anssik: issue #807

<gb> Issue 807 Caching mechanism for MLGraph (by anssiko) [question] [feature request]

jsbell: Reilly's IDL is not a concrete proposal, the next steps would be to work on an explainer and document use cases

jsbell: encourage the group to work on this, not a high priority for Google Chrome currently

McCool: have looked at this in the past, will see if I can contribute to the explainer

Bryan: many app developers have abandoned monolithic caches in favor of reverting to the older hash-and-cache approach. This method involves setting various pieces of state independently, hashing them all for a GPU call, and using the hash as a key in an app-managed cache
… implicit caching is still the norm for drivers
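The hash-and-cache approach Bryan describes can be sketched as follows. This is an illustration under assumptions, not anything proposed for WebNN: `PipelineState`, `stateKey` and `HashedCache` are hypothetical names, the hash is a 32-bit FNV-1a variant computed over UTF-16 code units, and a production cache would also guard against hash collisions.

```typescript
// Independently set pieces of state, flattened to simple values.
type PipelineState = Record<string, string | number | boolean>;

// 32-bit FNV-1a over the string's UTF-16 code units.
// The multiplication by the FNV prime 16777619 is written as shifts and adds.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = (hash + (hash << 1) + (hash << 4) + (hash << 7) + (hash << 8) + (hash << 24)) >>> 0;
  }
  return hash.toString(16);
}

// Hashes all state in a canonical (sorted) order so key order does not matter.
function stateKey(state: PipelineState): string {
  const canonical = Object.keys(state)
    .sort()
    .map((name) => name + "=" + String(state[name]))
    .join(";");
  return fnv1a(canonical);
}

// App-managed cache: the hash of the state is the key, the built object the value.
class HashedCache<T> {
  private entries: Record<string, T> = {};
  private build: (state: PipelineState) => T;
  builds = 0;

  constructor(build: (state: PipelineState) => T) {
    this.build = build;
  }

  get(state: PipelineState): T {
    const key = stateKey(state);
    let entry = this.entries[key];
    if (entry === undefined) {
      entry = this.build(state);
      this.entries[key] = entry;
      this.builds++;
    }
    return entry;
  }
}
```

The expensive build step runs only on a cache miss, and the application rather than the driver decides what is kept.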

Rafael: I think the proposal Reilly put forward is something to consider down the road, all comes down to what the platforms underneath do and recommend
… e.g. Core ML saves the compiled graph to disk that could be reused
… I can reach out to the web developers to provide feedback to the WG

Zoltan: it would be good to list developer use cases as code, couldn't we achieve the same with an existing graph provided to the builder?
… use MLGraph objects themselves

Expose WebNN API to service workers

anssik: issue #804 is a request from a web developer who's trying to use WebNN in a browser extension

<gb> Issue 804 Expose WebNN API to service workers (by zweack) [use case] [feature request]

anssik: the error report in the issue is from the Edge browser dev tools console
… I believe all Chromium-based browsers use a service worker for the extension's background code that stays off the main thread

https://developer.chrome.com/docs/extensions/develop/migrate/to-service-workers

anssik: WebGPU recently added support for service workers (and shared workers) to enable use cases such as the WebLLM Chrome extension

gpuweb/gpuweb#4197

<gb> CLOSED Issue 4197 API should be exposed to ServiceWorker (by MiguelsPizza) [proposal] [feature request]

https://github.com/mlc-ai/web-llm/tree/main/examples/chrome-extension-webgpu-service-worker

anssik: it looks like WebNN could similarly consider exposing the API to service workers to enable this use case for extensions?
… there are probably also other use cases for service workers?
… currently the ML interface is exposed to window and dedicated worker scope only
… a separate consideration would be whether to expose the API to shared workers, which can be accessed from several browsing contexts that share the exact same origin, e.g. windows and iframes
… also interested in use cases and possible abuse cases that we should mitigate against
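Until exposure is settled, extension code that may run in a service worker would likely need to feature-detect the API. A minimal sketch, assuming the `ml` attribute hangs off the scope's `navigator` object as it does for window and dedicated worker scopes today; `GlobalLike` and `hasWebNN` are hypothetical names, and the structural type is used so the sketch does not depend on DOM typings.

```typescript
// Minimal structural view of a global scope (window, worker, service worker).
interface GlobalLike {
  navigator?: object;
}

// Returns true when the scope's navigator object carries an `ml` attribute.
function hasWebNN(scope: GlobalLike): boolean {
  const nav = scope.navigator;
  return typeof nav === "object" && nav !== null && "ml" in nav;
}

// In a service worker this might be used as:
//   if (!hasWebNN(self)) { /* fall back, e.g. to another backend */ }
```

With the current exposure, the check would be expected to fail in a service worker, which matches the error reported in the issue.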

<dom> gpuweb/gpuweb#4197

RafaelCintron: this same issue with service workers came up with WebGPU and they said yes
… I don't know under which criteria a Web API should not be exposed to service workers

jsbell: makes sense to me, haven't looked at this deeply

<dom> When exposing a feature, please consider whether it makes sense to expose the feature to all possible environments (via the [Exposed=*] annotation or including it on all global scope interfaces). "Only purely computational features should be exposed everywhere. That is, they do not perform I/O and do not affect the state of the user agent or the user’s device."

jsbell: an API for SW has to be async, because sync APIs are not supported in that context
… for SW there's no user visible surface, cannot support surfacing UI from SW
… long compute time might be an issue, because SW are short-lived and browser can shut them down after some activity
… if we do a graph build and it takes 30 seconds we should discuss with SW whether that is OK

dom: I put some guidance from TAG on this, "only purely computational APIs should be exposed everywhere"
… no observable side-effects in WebNN, seems compliant. This suggests we may need to consider even exposing everywhere, not just in service worker

jsbell: shared worker use case: we hear that from folks building big web apps, who want to move logic to a worker so they are not taxed when multiple tabs from the same origin are open

Community Group meeting scheduling poll

anssik: the Community Group has been rechartered and the group’s scope expanded to new incubations

Community Group Incubations
… to provide another venue for participants to exchange ideas in addition to asynchronous collaboration through GitHub
… we plan to restart the Community Group meetings
… I asked interested CG participants to respond to the meeting scheduling poll by EOB 29 Jan 2025 to find a good time

<McCool> (sorry, ntd)

W3C WebML Community Group meeting poll
… the top 3 options are:

Option 1 (14 yes, 1 if need be)

Tue 4-5 pm PST / Wed 00-01 am UTC / Wed 8-9 am CST / Wed 9-10 am JST

cannot attend: Rafael, Thomas, Christian, Maxim, Christine

Option 2 (14 yes, 1 if need be)

Wed 4-5 pm PST / Thu 00-01 am UTC / Thu 8-9 am CST / Thu 9-10 am JST

cannot attend: Sushanth, Thomas, Christian, Christine

if need be: Etienne

Option 3 (12 yes, 2 if need be)

Tue 2-3 pm PST / Tue 10-11 pm UTC / Wed 6-7 am CST / Wed 7-8 am JST

cannot attend: Brad, Thomas, Ningxin, Sungpil, Domenic, Maxim

if need be: Christian, Christine

anssik: I'd like to find a time that works for the editor Domenic
… I've asked Etienne to help facilitate these meetings, so should choose the time that works for him too
… it looks like Option 1 is the best compromise
… for consideration for people who cannot attend, meeting summary will be provided and discussed in the next WG meeting
… I believe this setup will enable closer WG-CG collaboration across multiple timezones

– DRAFT –