14:54:00 RRSAgent has joined #webmachinelearning
14:54:05 logging to https://www.w3.org/2026/01/15-webmachinelearning-irc
14:54:05 RRSAgent, make logs Public
14:54:06 please title this meeting ("meeting: ..."), anssik
14:54:07 Meeting: WebML WG Teleconference – 15 January 2026
14:54:17 Chair: Anssi
14:54:18 Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2026-01-15-wg-agenda.md
14:54:34 Scribe: Anssi
14:54:47 scribeNick: anssik
14:54:58 gb, this is webmachinelearning/webnn
14:55:03 anssik, OK.
14:55:05 Present+ Anssi_Kostiainen
14:59:09 Present+ Dwayne_Robinson
14:59:57 Present+ Doug_Schepers
15:00:25 Present+ Jonathan_Ding
15:00:31 Present+ Dominique_Hazael-Massieux
15:00:38 Joshua_Lochner has joined #webmachinelearning
15:00:42 Present+ Joshua_Lochner
15:00:53 Present+ Ugur_Acar
15:01:13 Present+ Tarek_Ziade
15:01:29 Present+ Mike_Wyrzykowski
15:02:09 Present+ Fabio_Bernardon
15:02:24 RRSAgent, draft minutes
15:02:26 I have made the request to generate https://www.w3.org/2026/01/15-webmachinelearning-minutes.html anssik
15:02:38 Present+ Rafael_Cintron
15:02:48 dwayner has joined #webmachinelearning
15:02:52 Present+ Markus_Tavenrath
15:02:55 RafaelCintron has joined #webmachinelearning
15:03:37 RRSAgent, draft minutes
15:03:38 I have made the request to generate https://www.w3.org/2026/01/15-webmachinelearning-minutes.html anssik
15:04:00 Present+ Ben_Greenstein
15:04:12 Present+ Ehsan_Toreini
15:04:33 Anssi: welcome to our first meeting of the year 2026, we had a break over the holidays and now return to the usual cadence
15:04:39 Anssi: we'll start by acknowledging our latest new participant who joined the WG:
15:04:44 ... Liang Zeng from ByteDance
15:04:49 ... welcome to the group, Liang!
15:04:52 BenGreenstein has joined #webmachinelearning
15:05:05 Ehsan has joined #webmachinelearning
15:05:56 Ugur_Depixen has joined #webmachinelearning
15:05:59 Anssi: also welcome again Doug Schepers!
15:06:15 Mike_Wyrzykowski has joined #webmachinelearning
15:06:39 Doug: using classical ML for a11y improvements
15:07:13 ... Jonathan Ding from Intel joins us as a guest for this meeting to present a new proposal, discussed next
15:07:24 Topic: New proposal: Dynamic AI Offloading Protocol (DAOP)
15:07:32 gb, this is webmachinelearning/proposals
15:07:32 anssik, OK.
15:07:45 Anssi: from time to time we review new proposals submitted for consideration by the WebML community
15:07:53 Anssi: we have received a new proposal called the Dynamic AI Offloading Protocol (DAOP) #15 that could benefit from this group's feedback and suggestions
15:07:54 https://github.com/webmachinelearning/proposals/issues/15 -> Issue 15 Dynamic AI Offloading Protocol (DAOP) (by jonathanding)
15:08:23 ... as you know, our group has received feedback from developers and software vendors from time to time that they'd love to run inference tasks with WebNN, but oftentimes they're unsure if the user's device is capable enough
15:08:49 ... the diversity of models and client hardware makes it challenging to determine up front whether a given model can run on the user's device with QoS that meets the developer's requirements
15:09:18 ... and we can't expose low-level details through the Web API to avoid fingerprinting, and we also believe we shouldn't expose too much complexity through the Web API layer to remain future-proof
15:09:50 ... this easily leads to a situation where web apps either choose to use the least common denominator model, or use cloud-based inference even if the user's device could satisfy the QoS requirements
15:10:09 ningxin has joined #webmachinelearning
15:10:14 ... I have invited Jonathan Ding to share a new proposal called Dynamic AI Offloading Protocol (DAOP) to address the challenges related to offloading inference tasks from servers to client devices
15:10:43 ... Jonathan will introduce the proposal in abstract, a few example use cases, and a high-level implementation idea -- we won't go into implementation details in this session
15:10:54 ... after Jonathan's ~5-min intro we'll brainstorm a bit to get a feel for the room and inform the next steps
15:10:59 ... I will ask everyone to focus on the use cases -- do these use cases capture the key requirements?
15:11:38 Jonathan: this is about hybrid AI, the expectation is to be able to offload the inference task to the client, but offloading is not free and you have QoS expectations
15:11:51 zolkis has joined #webmachinelearning
15:12:01 ... you need to be able to decide if the device is capable of running the given model and satisfying the QoS requirements
15:12:01 present+ Zoltan_Kis
15:12:05 Jonathan: Use Case 1: Adaptive Video Conferencing Background Blur
15:12:13 ... A cloud-based video conferencing provider wants to offload background blur processing to the user's laptop to save server costs.
15:12:32 ... 1. The cloud server sends a lightweight, weightless Model Description (topology and input shape only, without the heavy weight parameters) of the blur model to the client's laptop.
15:12:47 Present+ Zoltan_Kis
15:12:57 ... 2. The laptop's browser runs a "Dry Run" simulation locally using the proposed API to estimate if it can handle the model at 30 FPS.
15:13:03 ... 3. The laptop returns a QoS guarantee to the server.
15:13:07 ... 4. If the QoS is sufficient, the server pushes the full model to the laptop; otherwise, processing remains in the cloud.
15:13:12 Jonathan: Use Case 2: Privacy-Preserving Photo Enhancement for Mobile Web
15:13:19 ... A photo editing web app wants to run complex enhancement filters using the user's mobile NPU to reduce latency.
15:13:45 ... 1. The application queries the device's capability using the standard performance estimation API, which avoids fingerprinting by returning a broad performance "bucket" rather than exact hardware specs.
15:13:52 ... 2. The device calculates its capability based on the memory bandwidth and NPU TOPS required by the filter model.
15:13:57 ... 3. Finding the device capable, the app enables the "High Quality" filter locally, ensuring the user's photos never leave the device.
15:14:03 Jonathan: there are two sub-proposals, differing in how they assign responsibility between the Caller and the Callee
15:14:07 ... Sub-proposal A: Device-Centric (Caller Responsible)
15:14:11 ... the Cloud acts as the central intelligence. It collects data from the device and makes the decision.
15:14:15 ... Sub-proposal B: Model-Centric (Callee Responsible) - Preferred
15:14:19 ... the Device acts as the domain expert. It receives a description of the work and decides if it can handle it.
15:14:48 https://github.com/webmachinelearning/proposals/issues/15 -> Issue 15 Dynamic AI Offloading Protocol (DAOP) (by jonathanding)
15:16:20 q+
15:16:21 q?
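A minimal sketch of the model-centric (sub-proposal B) dry-run handshake described above, for illustration only. The names ModelDescription, estimateQoS, and sustainedFps are hypothetical; no such API exists in WebNN or in the DAOP issue text, and the real shape would be defined during prototyping.

  // Hypothetical sketch only: these names do not exist in WebNN today.
  interface ModelDescription {
    topologyUrl: string;   // weightless topology, no weight parameters
    inputShape: number[];
    targetFps: number;     // QoS requirement, e.g. 30 FPS for background blur
  }

  async function canOffload(desc: ModelDescription): Promise<boolean> {
    // Step 2 of Use Case 1: the browser runs a local "dry run" and returns a
    // coarse QoS estimate rather than exact hardware specs (fingerprinting).
    const estimate = await (navigator as any).ml.estimateQoS(desc); // hypothetical API
    // Steps 3-4: the client reports the result; the server decides whether to
    // push the full model or keep inference in the cloud.
    return estimate.sustainedFps >= desc.targetFps;
  }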
15:16:23 ack RafaelCintron
15:17:02 Rafael: estimating the capabilities of user hardware -- can I run the model -- you also want to know if you can run the model well; this has challenges in native environments too
15:17:34 ... a model with big weights will run slower than one with the same topology and smaller weights
15:17:58 ... need to consider the impact of other applications running on the system at the same time
15:18:07 http://browserleaks.com/webgpu
15:19:02 Rafael: this site shows what information is exposed by WebGPU; WebGPU adapter information does disclose pretty detailed information that allows developers to infer some details about the GPU, something similar could in the abstract work for WebNN
15:19:03 q?
15:19:47 Rafael: this is certainly something developers are struggling with and it is worth exploring further
15:19:58 ... as models get bigger more people will struggle with this problem
15:19:59 q?
15:20:24 Jonathan: thank you for the comments, very good feedback
15:20:54 ... estimating without running the entire model aligns with what we observe from ISV discussions
15:20:55 q?
15:22:37 Anssi: I think https://github.com/rustnn/webnn-graph could be used for prototyping this proposal
15:23:24 Tarek: I also have other utils, e.g. for ONNX<->WebNN graph conversion
15:23:28 q?
15:25:09 Doug: can a person opt into or out of sharing device capability information?
15:26:20 ... the model-centric proposal does not fingerprint, but does the user have agency in making sure the device is not used for compute they don't want it to be used for?
15:27:06 q+
15:27:11 ack RafaelCintron
15:28:10 Rafael: to answer Doug, for WebGPU and WebGL, those APIs have no permission prompts, and the APIs can allocate a lot of memory and compute, the same with JS, also Storage APIs, and Chromium has lifted storage restrictions
15:29:54 Anssi: thank you Jonathan for sharing this proposal with the group
15:29:58 ... I'm hearing the group agrees these use cases are valuable
15:30:06 ... I also hear the group would like to see interested people move forward with this proposal
15:30:34 RESOLUTION: Create an explainer for Dynamic AI Offloading Protocol (DAOP) and initiate prototyping
15:30:49 Topic: Candidate Recommendation Snapshot 2026 review
15:31:01 gb, this is webmachinelearning/webnn
15:31:01 anssik, OK.
15:31:06 Anssi: PR #915
15:31:07 https://github.com/webmachinelearning/webnn/pull/915 -> Pull Request 915 Add Candidate Recommendation Snapshot for staging (by anssiko)
15:31:11 -> WebNN API spec release history https://www.w3.org/standards/history/webnn/
15:31:19 Anssi: we're ready to publish a new Candidate Recommendation Snapshot (CRS)
15:31:37 ... this milestone will be communicated widely within the W3C community and externally
15:31:42 ... our prior CRS release happened 11 April 2024, and a lot of progress has been made since:
15:31:50 ... over 100 significant changes
15:32:00 ... a third wave of operators for enhanced transformers support
15:32:04 ... the MLTensor API for buffer sharing
15:32:13 ... a new abstract device selection mechanism
15:32:17 ... the API surface has been modernized
15:32:27 ... interoperability improvements informed by implementation experience and developer feedback
15:32:31 ... improved security and privacy considerations
15:32:36 ... fingerprinting mitigations
15:32:40 ... new accessibility considerations
15:32:48 ... I staged a release in PR #915 that adds an appendix with detailed changes per category, mapped to the W3C Process-defined Classes of Changes:
15:32:52 -> https://www.w3.org/policies/process/#correction-classes
15:33:09 Anssi: the next step for us is to record the group's decision to request transition
15:33:34 ... any questions or concerns, are we ready to publish?
15:33:42 q?
15:34:22 shepazu has joined #webmachinelearning
15:34:33 Dom: this release triggers a Call for Exclusions so everything that's in the release scope gets Royalty-Free protection
15:34:47 https://github.com/webmachinelearning/webnn/pull/915 -> Pull Request 915 Add Candidate Recommendation Snapshot for staging (by anssiko)
15:35:15 RESOLUTION: Publish a new Candidate Recommendation Snapshot of the WebNN API as staged in PR #915
15:35:27 (again, sorry for the noise… I'll try to be more respectful of group meeting time)
15:35:43 Topic: Implementation experience, from the past to the future
15:36:15 Anssi: in the past the group has also worked on webnn-native, a standalone native implementation as a C/C++ library
15:36:22 -> webnn-native https://github.com/webmachinelearning/webnn-native
15:36:48 Markus: I'm interested in the webnn-native library and in the technical reasons for moving away from this library and into the current WebNN implementation that is more tightly integrated with the Chromium codebase
15:37:15 ... webnn-native is similar to Dawn, a WebGPU implementation
15:39:56 q+
15:40:01 Markus: should we revive webnn-native or use rustnn as a native interface for WebNN?
15:40:04 ack RafaelCintron
15:40:48 Rafael: I would personally be in favour of reviving webnn-native once we ship the OT
15:41:28 ... it is a lot of work to integrate a 3rd-party library into Chromium, smart pointers, bitsets and all that
15:41:54 ... webnn-native came first, and there was opposition at the time to hosting the webnn-native library outside the Chromium project
15:41:55 q?
15:42:20 Present+ Ningxin_Hu
15:43:02 q?
15:43:22 Anssi: before the break Tarek shared news about the Python and Rust implementations of the WebNN API
15:43:31 ... this work is now hosted under the newly established RustNN project along with other utils:
15:43:36 -> RustNN https://github.com/rustnn
15:43:41 Anssi: this GH org hosts a number of repos:
15:43:50 ... rustnn, the Rust implementation
15:43:55 ... pywebnn, Python bindings for rustnn
15:44:01 ... webnn-graph, a WebNN-oriented graph DSL
15:44:09 ... webnn-onnx-utils, WebNN <-> ONNX conversion
15:44:15 ... trtx-rs, TensorRT-RTX bindings
15:44:20 ... and more
15:44:50 Tarek: it is a lot of fun working on RustNN, happy to add all interested collaborators to the repo
15:45:16 ... I want to have all the WebNN demos working in Python as well, focusing on LLMs now
15:45:30 RRSAgent, draft minutes
15:45:31 I have made the request to generate https://www.w3.org/2026/01/15-webmachinelearning-minutes.html anssik
15:46:04 Tarek: I have a patch for Firefox to expose rustnn with JS bindings
15:46:04 q?
15:46:30 Topic: Accelerated context option implementation feedback
15:46:36 Anssi: issue #911
15:46:37 https://github.com/webmachinelearning/webnn/issues/911 -> Issue 911 accelerated should be prior to powerPreference for device selection (by mingmingtasd) [device selection]
15:46:45 Anssi: we received new implementation feedback from Mingming (thanks!) for the accelerated context option
15:46:51 -> https://www.w3.org/TR/webnn/#api-mlcontextoptions
15:47:21 Anssi: specifically, the feedback asks for clarification on how "accelerated" is supposed to interact with the existing power preference ("default", "high-performance", "low-power")
15:47:31 ... currently, as specified, the "accelerated" property has lower priority than "powerPreference"
15:47:39 ... per Mingming, this creates difficulty in the following scenarios:
15:48:15 { powerPreference: "low-power", accelerated: true } if no low-power device is available
15:48:33 { powerPreference: "low-power", accelerated: false } if the implementation cannot force the CPU into a low-power state
15:48:54 { powerPreference: "high-performance", accelerated: false } if the implementation cannot force the CPU into a high-performance state
15:49:07 ... Mingming's proposal is to give "accelerated" a higher priority than "powerPreference"
15:49:21 q+
15:49:22 ... Zoltan's proposal is to have "powerPreference" set the power envelope limits
15:49:38 ... I'd like to discuss how to evolve the spec to clarify this aspect
15:49:57 ... first, I'd like to establish whether we agree both "accelerated" and "powerPreference" are hints, i.e. implementers provide best-effort service given this information
15:50:24 ... second, I'd like to ask if it would be clearer to present the possible combinations as an informative truth table instead of prose
15:50:26 q?
15:50:29 ack Mike_Wyrzykowski
15:50:54 Mike: since these are hints, depending on the system the implementation can ignore them and still be spec-conformant
15:51:10 ... on macOS for example, WebGPU/GL may ignore similar hints
15:51:34 ... as for how to interpret these hints, it may not be successful to try to prescribe what implementers should do
15:51:35 q+
15:52:00 ack zolkis
15:53:07 Zoltan: my summary is that if the power preference is low-power it expresses a developer priority for lower power, otherwise accelerated would have priority; nevertheless these are all hints, and I would use an informative truth table
15:53:10 q+
15:53:37 ... I was considering Apple platform capabilities in this design
15:54:00 ... the power envelope may be set for heat management or other reasons
15:54:02 q?
15:54:06 ack RafaelCintron
15:54:32 q+
15:54:56 Rafael: what do people think about the items in the powerPreference enum?
15:55:08 ... what is available in frameworks today for implementers?
15:55:28 ... if the backend is CoreML or LiteRT, what do I do?
15:55:52 ... a fallback adapter would be one boolean
15:56:48 Rafael: suggestion, the powerPreference enum could have a new "no-acceleration" value
15:57:49 https://gpuweb.github.io/gpuweb/#dictdef-gpurequestadapteroptions
15:57:50 ... "no-acceleration" could map as of today to CPU, as in current frameworks
15:58:27 ... WebGPU has a similar problem and they solved it with powerPreference and a fallback adapter
15:58:29 q?
15:58:36 ack Mike_Wyrzykowski
15:59:19 Mike: quick comment, the proposal from Rafael sounds reasonable, though we may want to iterate on the name; "no-acceleration" should not explicitly mean run on the CPU
15:59:20 q?
16:00:15 q?
16:00:32 RRSAgent, draft minutes
16:00:33 I have made the request to generate https://www.w3.org/2026/01/15-webmachinelearning-minutes.html anssik
16:01:06 Zoltan: we had a use case for "accelerated", we need to revisit that
16:01:09 q?
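For reference, a sketch of the option combinations discussed above, written against the MLContextOptions dictionary in the current spec. powerPreference is in the spec today; "accelerated" is the option under discussion in issue #911, so its behavior here is illustrative only, and per the discussion both are hints an implementation may ignore.

  // Illustrative only: "accelerated" is a proposed option (issue #911), and
  // both options are hints a conforming implementation may ignore.
  async function createContexts() {
    const ml = (navigator as any).ml;
    // Prefer an accelerator within a low power envelope; if no low-power
    // accelerator exists, which hint wins is the ambiguity issue #911 raises.
    const lowPowerAccel = await ml.createContext({ powerPreference: "low-power", accelerated: true });
    // Ask for non-accelerated, high-performance execution; an implementation
    // that cannot force the CPU into a high-performance state does best effort.
    const cpuHighPerf = await ml.createContext({ powerPreference: "high-performance", accelerated: false });
    return { lowPowerAccel, cpuHighPerf };
  }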
16:02:30 RRSAgent, draft minutes
16:02:31 I have made the request to generate https://www.w3.org/2026/01/15-webmachinelearning-minutes.html anssik
16:09:16 shepazu has joined #webmachinelearning
16:09:16 zolkis has joined #webmachinelearning
16:09:16 Mike_Wyrzykowski has joined #webmachinelearning
16:09:16 Joshua_Lochner has joined #webmachinelearning
16:09:16 reillyg has joined #webmachinelearning
16:27:17 RRSAgent, draft minutes
16:27:19 I have made the request to generate https://www.w3.org/2026/01/15-webmachinelearning-minutes.html anssik
18:01:27 Zakim has left #webmachinelearning