15:00:31 RRSAgent has joined #webmachinelearning
15:00:35 logging to https://www.w3.org/2025/12/18-webmachinelearning-irc
15:00:35 RRSAgent, make logs Public
15:00:36 please title this meeting ("meeting: ..."), anssik
15:00:37 Meeting: WebML WG Teleconference – 18 December 2025
15:00:39 DwayneR has joined #webmachinelearning
15:00:41 Chair: Anssi
15:00:51 Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2025-12-18-wg-agenda.md
15:00:58 Scribe: Anssi
15:01:03 scribeNick: anssik
15:01:09 gb, this is webmachinelearning/webnn
15:01:09 anssik, OK.
15:01:13 Present+ Anssi_Kostiainen
15:01:18 Present+ Tarek_Ziade
15:01:24 Present+ Ningxin_Hu
15:01:24 ningxin has joined #webmachinelearning
15:01:29 Present+ Zoltan_Kis
15:01:36 Present+ Dwayne_Robinson
15:01:51 Present+ Reilly_Grant
15:01:59 RRSAgent, draft minutes
15:02:00 I have made the request to generate https://www.w3.org/2025/12/18-webmachinelearning-minutes.html anssik
15:02:12 vasilii has joined #webmachinelearning
15:02:26 Present+ Ehsan_Toreini
15:02:54 Present+ Vasilii_Trofimchuk
15:03:07 zkis has joined #webmachinelearning
15:03:10 Present+ Rafael_Cintron
15:03:15 RafaelCintron has joined #webmachinelearning
15:03:19 RRSAgent, draft minutes
15:03:21 I have made the request to generate https://www.w3.org/2025/12/18-webmachinelearning-minutes.html anssik
15:03:27 present+ Zoltan_Kis
15:03:31 Anssi: we'll start by acknowledging our new participants who joined the WG:
15:03:40 ... Victor Huang from Microsoft
15:03:46 ... JuGuang Liu from ByteDance
15:03:53 ... welcome Victor and JuGuang, we look forward to working with you!
15:04:11 Topic: Incubations
15:04:17 Anssi: I want to share two key takeaways from the Community Group meeting we had last week:
15:04:20 -> WebML CG Teleconference – 11 December 2025 https://github.com/webmachinelearning/meetings/blob/main/telcons/2025-12-11-cg-agenda.md
15:04:39 Anssi: first, Anthropic migrated the MCP development into a newly launched neutral forum, Agentic AI Foundation (AAIF), hosted as a Directed Fund under the Linux Foundation
15:05:01 ... I had discussions with Vasilii from Block, a co-founder of AAIF, as well as Dom from the W3C team
15:05:15 ... to that end, we are in the process of formalizing the W3C Community Group's coordination relationship with the newly established AAIF to enable seamless collaboration
15:05:41 ... concrete tasks include aligning our charters and ensuring our joint work mode facilitates building interoperability between MCP and WebMCP wrt common primitives where applicable
15:06:03 Anssi: second, we resolved to transition from the WebMCP explainer to a Community Group spec draft stage using the existing explainer, proposal and other supplementary documentation in the repo as the basis
15:06:12 ... we plan to complete this important transition during the first quarter of 2026
15:06:13 q?
15:06:14 handellm has joined #webmachinelearning
15:06:25 Present+ Markus_Handell
15:06:41 Topic: New implementation experience and developer feedback
15:07:05 Subtopic: Python and Rust implementation of the WebNN API aimed at Firefox
15:07:11 Anssi: I'm pleased to share with you all an early Xmas present
15:07:13 vasilii has joined #webmachinelearning
15:07:17 ... I will bring in Tarek to announce the first-ever Python and Rust implementation of the WebNN API aimed at Firefox
15:07:32 ... this new WebNN implementation improves the already high-quality WebNN API specification by providing further validation
15:07:59 ... I want to thank Tarek on behalf of the group for initiating this important implementation effort that broadens the reach of WebNN to non-Chromium browsers and to the Python ecosystem, making it possible to use WebNN outside the browser for the first time ever
15:08:12 ... this helps us establish the WebNN API as the lingua franca spoken by both web and Python developers
15:08:21 ... I will let Tarek share the exciting story
15:08:23 q?
15:08:40 Tarek: I have a few slides prepared, will share them with the grouo
15:08:45 s/grouo/group
15:09:11 Tarek: the work is at an experimental stage
15:09:28 ... rustnn is a Rust implementation of WebNN, independent library, follow WebNN spec
15:09:53 ... easy to add a Python API similar to the JS API, this allows the Python ecosystem to play with WebNN
15:10:11 ... the project was built with Firefox in mind, we want to integrate WebNN into Firefox
15:11:35 ... I looked at the Chromium implementation, also webnn-native, eventually chose Rust given it fits well with Gecko, also enables Python bindings easily
15:12:58 ... design goals: strict spec interpretation, backend independence, early error detection, testability, not just for browsers
15:13:35 ... high-level architecture, three executors: ONNX Runtime, Core ML, TensorRT
15:14:04 ... ONNX Runtime executor implements the most operators, almost all of them
15:14:38 ... the backend converts the WebNN Graph to an ONNX Graph and a CoreML Graph
15:15:08 ... simple and pluggable Rust code base
15:15:50 ... Python implementation mirrors WebNN structure, context -> builder -> graph
15:16:32 ... MobileNet demo from the WebNN sameples repo works with Python bindings, using pywebnn
15:17:01 ... Rust example, same conceptual phases as with Python
15:17:35 ... for Firefox support, another patch in Bugzilla that implements the same WebIDL API as Chromium and uses cbindgen for bridging rustnn to C++
15:17:58 ... I didn't do anything via IPC for POC purposes, the final patch will use an IPC layer and is coming soon
15:18:21 ... sharing a demo of Firefox with WebNN API
15:19:05 ... MobileNet demo loads weights from 106 layers, grabs an image and does the classification in the browser, works exactly the same as the demo that exists in Chromium
15:19:25 ... implementation status, currently 85 ops implemented, ~89% of the current WebNN API spec
15:19:44 ... implemented ops support validation, shape inference, ONNX, CoreML lowering
15:20:42 ... some gaps exist, RNN family deferred, CoreML partially implemented, ref float16 issues, Firefox patch is POC quality due to the IPC layer missing
15:20:45 q?
15:21:48 Tarek: WPT and conformance tests made implementation work way easier, 1350 tests passing now for the ONNX backend
15:22:31 ... next steps, finish WPT data convesion for remaining ops, implement more demos, finish TensorRT execution support, performance, improve docs
15:22:31 q?
15:23:02 -> rustnn GitHub repo https://github.com/tarekziade/rustnn
15:23:09 -> rustnn docs https://blog.ziade.org/rustnn/
15:23:14 -> Blog post: WebNN aimed at Firefox https://blog.ziade.org/2025/12/17/building-rustnn-webnn-implementation-rust/
15:23:20 -> pywebnn package (Python bindings for W3C WebNN API) https://pypi.org/project/pywebnn/
15:23:21 q+
15:23:24 q+
15:23:25 -> Firefox Integration Bug https://bugzil.la/2005145
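For readers unfamiliar with the context -> builder -> graph flow described above, a minimal sketch written against the WebNN JS API follows; per the presentation, the pywebnn and rustnn APIs mirror this same structure. The operator choice, shapes and weight values are illustrative only.

```
// Minimal sketch of the WebNN phases (context -> builder -> graph).
// Shapes, operator, and weight values are illustrative only.
const context = await navigator.ml.createContext({ powerPreference: 'default' });
const builder = new MLGraphBuilder(context);

// Describe the graph: one input, one constant, one operator.
const input = builder.input('input', { dataType: 'float32', shape: [1, 4] });
const weights = builder.constant(
  { dataType: 'float32', shape: [4, 2] },
  new Float32Array(8).fill(0.5));
const output = builder.matmul(input, weights);

// Compile the graph; execution then happens via the context
// (the demo discussed here passes CPU buffers).
const graph = await builder.build({ output });
```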
15:23:30 ack RafaelCintron
15:23:51 Rafael: thank you so much for this work! Great to see Rust and Python
15:24:08 ... have you though about the more advanced demos for WebNN?
15:24:32 Tarek: I started with the demos hosted under the webmachinelearning GH org
15:24:32 q+
15:25:22 Tarek: I will check those additional demos out
15:26:00 Rafael: you did all this in Rust, I was surprised you went to C++ to integrate with Firefox
15:26:30 Tarek: Gecko is a big C++ app and core is C++ so to integrate a new feature is to create a Rust lib and expose it to the C++ app
15:26:47 ... maybe one day all is in Rust, but not core is in C++, all the things in WebIDL are C++ code
15:26:48 q?
15:26:50 ack ningxin
15:27:01 Ningxin: thank you Tarek, awesome work!
15:27:23 ... I will share another link for additional demos we host on Hugging Face, it'll be great to see those run in rustnn
15:27:47 ... in your presentation the code uses compute and passes CPU buffers, do you have plans to support the MLTensor and dispatch interface?
15:28:03 Tarek: I have MLTensor already, maybe I did not surface it to this version yet
15:28:24 q?
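For context, the MLTensor and dispatch() path Ningxin asks about looks roughly like the sketch below, as opposed to passing CPU buffers directly. This is a simplified illustration: the graph, tensor names and shapes are assumed, and the descriptor members follow the MLTensor additions to the spec.

```
// Simplified sketch of MLTensor + dispatch(); 'graph' is assumed to be a
// compiled MLGraph with one input named 'input' and one output named 'output'.
const inputTensor = await context.createTensor(
  { dataType: 'float32', shape: [1, 4], writable: true });
const outputTensor = await context.createTensor(
  { dataType: 'float32', shape: [1, 2], readable: true });

// Upload input data, run the graph, then read the result back.
context.writeTensor(inputTensor, new Float32Array([1, 2, 3, 4]));
context.dispatch(graph, { input: inputTensor }, { output: outputTensor });
const resultBuffer = await context.readTensor(outputTensor);
const result = new Float32Array(resultBuffer);
```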
15:28:26 ack DwayneR
15:28:30 Dwayne: cool demo!
15:29:01 ... you converted WPT to Python test cases, what was your experience?
15:29:37 Tarek: see https://github.com/tarekziade/rustnn/tree/main/tests/wpt_data for the approach
15:29:41 q?
15:30:58 q?
15:31:40 Subtopic: New developer feedback from the developer ecosystem
15:31:49 Anssi: we see developer excitement around WebNN building
15:31:53 ... in appreciation of the developer community's contributions, we've been curating the experiments and feedback into the awesome-webnn GH repo, open to all:
15:31:57 -> https://github.com/webmachinelearning/awesome-webnn
15:32:06 Anssi: you will find pointers to community-contributed demos, tech talks at developer events, tutorials and more in this repo
15:32:16 Anssi: I'd like to share recent feedback we received from a well-known developer through our spec repo, quoting:
15:32:22 ... "Holy s**t, looks like in the right combination WebNN inference on GPU is over 5x faster than WebGPU"
15:32:36 -> https://github.com/webmachinelearning/webnn/issues/763#issuecomment-3605012091
15:32:36 https://github.com/webmachinelearning/webnn/issues/763 -> Issue 763 Request standards positions from Mozilla and WebKit (by reillyeon) [process]
15:32:46 Anssi: the right combination is the latest Chromium, ONNXRuntime backend, and ONNXRuntime Web
15:33:02 ... in this developer's case he was able to tap into CUDA kernels via WebNN instead of a generic GPU pipeline through WebGPU
15:33:06 ... this provided a significant performance boost in this use case
15:33:30 Anssi: this feedback demonstrates how WebNN as a high-level API abstraction is able to accelerate computationally expensive ops that are the building blocks of modern model architectures
15:33:47 ... this reinforces the message that the WebNN and WebGPU APIs coexist and complement each other on the web platform, thus we continue to improve efficient interop bridges
15:34:02 Topic: External weights, learnings from WebGPU
15:34:12 Anssi: WebNN issue #901
15:34:13 https://github.com/webmachinelearning/webnn/issues/901 -> Issue 901 Proposal: API to Separate Graph Building from Weight Loading to Reduce Peak Memory Usage (by mtavenrath) [feature request]
15:34:18 ... related WebGPU issue https://github.com/gpuweb/gpuweb/issues/4185
15:34:18 https://github.com/gpuweb/gpuweb/issues/4185 -> Issue 4185 Image uploading is insufficiently expressive to be optimized (by kainino0x) [api] [api-milestone-2-202502]
15:34:34 ... I'd like to continue discussing external weights for constants, a proposal we initially explored at TPAC
15:34:52 ... last week we received new information from WebGPU experts regarding data-streaming APIs, plain buffer uploads, and WebGPU's approach
15:34:57 ... Reilly asked Kai from the WebGPU land whether they have considered a design that would allow an HTTP request as the source for a resource like so:
15:35:03 ```
15:35:03 let constant = builder.constant(new Request("model.bin", {headers: {"Range": "bytes=5435435-5484329"}}));
15:35:03 ```
15:35:20 Anssi: the WebGPU group was interested in this approach, but Kai shared they haven't yet done concrete work to integrate that feature into the WebGPU API
15:35:43 Anssi: in the WebGPU issue Kai illustrated two paths for how an image file is fed into a GPUTexture:
15:35:50 ... HTMLImageElement -> createImageBitmap -> copyExternalImageToTexture
15:35:50 ... fetch -> blob -> createImageBitmap -> copyExternalImageToTexture
15:35:53 ... the WebGPU API's copyExternalImageToTexture() has since been upgraded and now accepts any of the following directly as a source:
15:36:03 ... ImageData
15:36:03 ... HTMLImageElement
15:36:03 ... HTMLVideoElement
15:36:04 ... VideoFrame
15:36:04 ... HTMLCanvasElement
15:36:04 ... OffscreenCanvas
15:36:08 -> https://www.w3.org/TR/webgpu/#gpucopyexternalimagesourceinfo
15:36:12 Anssi: so the following more optimized path should work now:
15:36:16 ... HTMLImageElement -> copyExternalImageToTexture
15:36:19 ... streaming to texture is still not supported, though
15:36:35 Reilly: I like pulling weights from an HTTP request, the maximal hands-off approach
15:37:03 ... if using image we can cut some intermediate steps off, but if we host weights somewhere on the internet can load them directly
15:37:30 ... the challenge is on the framework side, they don't do anything similar, they take a streaming approach
15:37:58 ... careful management of how much weight data can be loaded in memory before feeding the GPU
15:38:20 ... proposal from TPAC, having a constant that takes a stream might be what we want to do to meet frameworks where they are today
15:38:36 ... you can still call all constants at once
15:38:58 ... there's no backpressure in the system, to add that would need streams or a promise somewhere
15:39:21 ... next step is to talk to ONNX folks who did WebNN integration and other framework providers
15:39:22 q?
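To make the two directions discussed above concrete, here is a hypothetical sketch; neither overload exists in the WebNN API today, and the descriptor argument and shapes are assumptions for illustration only.

```
// Hypothetical overloads, for discussion only; not part of the WebNN spec.
const desc = { dataType: 'float32', shape: [4096, 4096] };

// (a) Weights pulled directly from an HTTP range request, the idea Reilly
//     floated with the WebGPU folks above.
const w1 = builder.constant(desc,
  new Request('model.bin', { headers: { Range: 'bytes=5435435-5484329' } }));

// (b) Weights fed from a ReadableStream, giving the implementation a
//     backpressure signal while the graph is being built (TPAC proposal).
const response = await fetch('model.bin');
const w2 = builder.constant(desc, response.body);
```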
15:40:03 Anssi: should we coordinate anything this with the WebGPU group?
15:40:24 Reilly: we should run our proposal through the WebGPU group to get their feedback for conceptual alignment
15:40:53 ... there are WebGPU-specific things, if they're interested in pulling directly from HTTP requests, we should make sure the way we specify things looks similar
15:41:07 ... I expect there to be differences
15:41:22 ... alignment on the general pattern is the most important
15:41:23 q?
15:42:05 Rafael: I haven't been heavily involved with this particular WebGPU feature, I wouldn't categorize this as streaming, in this case you need to wait for the download to finish before you can use it
15:42:13 q+
15:42:15 ... this is more like attaching things together with minimal steps
15:42:39 ... go straight from the first connection to WebGPU or WebNN, without conversion and memory copies
15:43:20 ... we've seen cases in WebGPU where you don't e.g want the alpha channel multiplied with the colors, you just want to get the data
15:43:28 ack reillyg
15:43:58 Reilly: I wouldn't view this as streaming but as giving the implementation visibility into when the resources are uploaded to the GPU
15:44:25 ... existing frameworks expect the weights are available at model compilation time, this is a limitation
15:44:47 ... that means certain optimizations are not possible
15:45:04 ... with the HTTP request approach, we can give frameworks control over when the resources are loaded
15:45:15 ... now the resources are loaded before build()
15:45:39 ... we could change the behaviour when frameworks improve their approach
15:45:50 q?
15:46:28 Ningxin: I will talk to Wangming and ONNX Runtime
15:47:40 Reilly: this is forward-looking feature, will talk to LiteRT framework people on this feature
15:47:50 q?
15:47:58 q?
15:48:08 Topic: New device selection hints for MLContextOptions
15:48:14 Anssi: issue #902
15:48:15 https://github.com/webmachinelearning/webnn/issues/902 -> Issue 902 Device selection criteria for usecase-driven scenarios (by fdwr) [device selection]
15:48:19 ... I wanted to check the group's latest thoughts on hints to complement MLPowerPreference
15:48:32 ... I simply translated Dwayne's table to IDL to tease out feedback:
15:48:36 -> https://github.com/webmachinelearning/webnn/issues/902#issuecomment-3612503939
15:48:37 https://github.com/webmachinelearning/webnn/issues/902 -> Issue 902 Device selection criteria for usecase-driven scenarios (by fdwr) [device selection]
15:48:49 ```
15:48:49 dictionary MLContextOptions {
15:48:49   MLPowerPreference powerPreference = "default";
15:48:49 + MLLatencyPreference latencyPreference = /* default? */
15:48:49 + MLWorkloadSizePreference workloadSizePreference = /* default? */
15:48:50 + MLContinuityPreference continuityPreference = /* default? */
15:48:50   boolean accelerated = true;
15:48:50 };
15:48:51 ```
15:49:15 Anssi: adding hints is cheaper in the sense that implementers can disregard any of them, but we still shouldn't add hints that are not backed by strong use cases, to reduce the conceptual weight of the API
15:49:45 ... I see MikeW being supportive of the new hints in general, Zoltan's comments on the proposed opSupportLimitsPerDevice(), and MikeW's confirmation that the per-device op support could be an addition on top of the hints, not conflicting
15:50:02 ... any further feedback on the set, do we see more use cases for some of the three dimensions under consideration: latency, workload size, continuity?
15:50:24 q+
15:50:28 ack RafaelCintron
15:51:23 Rafael: I'd like to see how to map these hints to the current backends to understand the implementability and hardware capabilities
15:52:28 q?
15:52:49 q+
15:52:54 ack zkis
15:54:01 Zoltan: I fully agree with Rafael, we need to map to the current backends, no need to break it down per device
15:54:42 ... I think apps are interested in knowing which model to download and what constraints are to be respected when inference is done, no need to micro-manage the implementation, but provide the best info to the implementation to select the policy
15:54:43 q?
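A hypothetical usage sketch of the hints discussed above: only powerPreference and accelerated exist per the IDL strawman, the three hint members are proposals from issue #902, and the enum value names are assumptions for illustration.

```
// Hypothetical: an app hinting at a small, latency-sensitive, sustained
// workload (e.g. real-time video effects). The three hint members are
// proposals only; implementations could disregard any of them.
const context = await navigator.ml.createContext({
  powerPreference: 'low-power',
  latencyPreference: 'low-latency',      // proposed, value name assumed
  workloadSizePreference: 'small',       // proposed, value name assumed
  continuityPreference: 'sustained',     // proposed, value name assumed
  accelerated: true
});
```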
15:55:37 Topic: Add minimum data type set and rank range for input, constant, output
15:55:42 Anssi: issue #896 and PR #910
15:55:43 https://github.com/webmachinelearning/webnn/pull/910 -> Pull Request 910 add minimum data types and rank range for operations (by BruceDai)
15:55:44 https://github.com/webmachinelearning/webnn/issues/896 -> Issue 896 Add minimum data type set and rank range for input, constant, output and each operator into Spec (by BruceDai)
15:55:47 ... related to issue #853
15:55:47 https://github.com/webmachinelearning/webnn/issues/853 -> Issue 853 The minimum data type set (by huningxin) [operator specific]
15:56:03 ... this PR adds "required data types" and "required ranks" columns to the "Tensor limits for ..." tables associated with each op in the spec
15:56:10 ... adding this information was a tedious task, thanks Bruce for the PR
15:56:19 ... with these enhanced tables, as a bonus, implementations can programmatically extract this data about the minimum data type set from the WebNN API spec; currently it is hard-coded in the implementation
15:56:25 ... the PR review is ongoing with a lot of good comments, most resolved
15:56:31 ... anything specific to discuss today for this issue or PR?
15:57:02 Anssi: feel free to merge the PR when adequate review has been received
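As a usage note, the spec's minimum requirements discussed above would complement what a given implementation reports at runtime via MLContext.opSupportLimits(); the sketch below is illustrative only, and the exact dictionary shape may differ between spec versions.

```
// Illustrative: query what the current context actually supports, which an
// app could compare against the spec's minimum required data types / ranks.
const context = await navigator.ml.createContext();
const limits = context.opSupportLimits();

// Top-level input/constant/output limits.
console.log(limits.input?.dataTypes);     // e.g. ["float32", "float16", ...]
console.log(limits.constant?.dataTypes);

// Per-operator operand limits, e.g. for gemm's first input operand.
console.log(limits.gemm?.a?.dataTypes);
```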
15:57:09 Topic: Remove conformance tests with negative scale of DQ/Q operators
15:57:16 Anssi: issue #879 and PR #906
15:57:17 https://github.com/webmachinelearning/webnn/pull/906 -> Pull Request 906 Restrict scale of dequantizeLinear and quantizeLinear to be positive (by BruceDai)
15:57:17 https://github.com/webmachinelearning/webnn/issues/879 -> Issue 879 Propose to remove WPT conformance tests with negative scale of dequantizeLinear and quantizeLinear operator (by BruceDai) [question]
15:57:31 ... per discussion in the issue, negative scales do have utility, but they are not yet supported on all backends
15:57:37 ... due to this implementation limitation, the PR adds the following constraint to the dequantizeLinear and quantizeLinear scale argument:
15:57:44 ... "Values must be positive and nonzero."
15:57:53 ... the PR has been approved by Dwayne
15:58:02 ... Ningxin, you asked how to cover both wrong results and compilation failure cases?
15:58:09 q?
15:58:55 Ningxin: I asked whether we need some text to mention the implementation-dependent behaviour, Phillis was asking about that in an earlier comment
15:59:10 q?
15:59:31 q?
15:59:45 Anssi: good to merge with adequate review
15:59:59 Topic: 2025 Reflections
16:00:10 Anssi: Thank You for an exceptional 2025!
16:00:16 ... it's been a busy year
16:00:27 ... we are ending the year strong, with broader implementation experience as the icing on the cake
16:00:39 ... the specification completed the latest wide review during 2025 with kudos
16:00:51 ... we had F2F time with horizontal groups to deepened our collaboration
16:01:12 ... we established a new joint workstream for ethical and sustainability considerations with other groups
16:01:30 ... our group grew +30% YOY, diversity increased with more ISVs and other early adopters joining us
16:01:58 ... we keep hearing positive signals from the web developer ecosystem: the WebNN API allows developers to unleash their creativity in ways not possible before in the browser
16:02:10 ... the key message I'm hearing from all fronts is: you are on the right track, keep marching ahead
16:02:22 ... I'm looking ahead to an even more exciting 2026, some milestones:
16:02:36 ... we want to publish a new WebNN API Candidate Recommendation Snapshot aligned with the implementation that gets into the hands of early adopters
16:03:16 ... I anticipate implementers will further improve the WebNN UX in 2026 informed by feedback from real users, continue work on performance optimizations, faster model compilation, reduced memory usage -- we will carefully craft WebNN API enhancements together that will improve the experience further
16:03:46 ... LLM performance is crucially important and running SLMs in the browser is becoming a real thing in 2026, a requirement for agentic workloads in the browser
16:03:58 ... to that end, we will continue to deliver important WebNN API enhancements for LLMs, such as dynamic shapes and op fusion
16:04:07 ... and of course, WebGPU interop continues to be a crucial focus area
16:04:16 ... and much more!
16:04:25 ... thank you all for your contributions during 2025
16:04:34 thank you all!
16:04:40 ... there's so much more to come in 2026 and our path is clear
16:05:03 ... our group's open standards and open source based approach continues to provide users and developers agency and choice
16:05:42 ... Thank You Everyone for your focus, dedication, contributions and friedship on this multi-year journey!
16:05:52 ... we will be back after the holiday break on 15 January 2026
16:06:10 RRSAgent, draft minutes
16:06:41 I have made the request to generate https://www.w3.org/2025/12/18-webmachinelearning-minutes.html anssik
16:08:50 s/we had last/last
16:10:28 s/follow WebNN/follows WebNN
16:11:18 s/sameples/samples
16:12:09 s/sharing/next I'll share
16:12:57 s/convesion/conversion
16:14:12 s/though about/thought about
16:14:44 s/is to/you
16:15:01 s/all is in/all this is in
16:15:13 s/not core/now core
16:15:56 s/maybe I/I
16:18:15 s/if using/by using
16:18:34 s/can load/we can load
16:19:22 s/anything this/anything wrt this
16:20:14 s/don't e.g/e.g. don't
16:20:40 s/are available/to be available
16:21:12 s/and ONNX Runtime/and ONNX Runtime folks
16:21:31 s/this is forward-looking/this is a forward-looking
16:21:43 s/people on/people about
16:23:21 vasilii_ has joined #webmachinelearning
16:23:27 s/deepened/deepen
16:25:01 RRSAgent, draft minutes
16:25:02 I have made the request to generate https://www.w3.org/2025/12/18-webmachinelearning-minutes.html anssik
16:28:09 s/friedship/friendship
16:28:11 RRSAgent, draft minutes
16:28:12 I have made the request to generate https://www.w3.org/2025/12/18-webmachinelearning-minutes.html anssik