14:57:00 RRSAgent has joined #webmachinelearning
14:57:05 logging to https://www.w3.org/2025/06/26-webmachinelearning-irc
14:57:05 Meeting: WebML WG Teleconference – 26 June 2025
14:57:05 RRSAgent, make logs Public
14:57:06 please title this meeting ("meeting: ..."), anssik
14:57:07 Chair: Anssi
14:57:14 Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2025-06-26-wg-agenda.md
14:57:18 Scribe: Anssi
14:57:27 scribeNick: anssik
14:57:27 gb, this is webmachinelearning/webnn
14:57:27 anssik, OK.
14:57:32 Present+ Anssi_Kostiainen
14:57:38 Present+ Zoltan_Kis
14:57:47 Present+ Rafael_Cintron
14:57:58 Present+ Laszlo_Gombos
15:00:02 ningxin has joined #webmachinelearning
15:00:14 Present+ Ningxin_Hu
15:00:31 lgombos has joined #webmachinelearning
15:00:39 Present+ Dwayne_Robinson
15:00:58 Present+ Mike_Wyrzykowski
15:01:03 Mike_Wyrzykowski has joined #webmachinelearning
15:01:14 Present+ Tarek_Ziade
15:01:28 Present+ Winston_Chen
15:02:03 Winston has joined #webmachinelearning
15:02:12 Present+ Ehsan_Toreini
15:02:33 Present+ Christian_Liebel
15:02:43 RRSAgent, draft minutes
15:02:45 I have made the request to generate https://www.w3.org/2025/06/26-webmachinelearning-minutes.html anssik
15:03:03 anssik: please welcome Khushal Sagar, Hannah Van Opstal, and David Bokan from Google to the WebML WG!
15:03:16 ... and please welcome Frank Li from Microsoft to the WebML CG
15:03:41 RafaelCintron has joined #webmachinelearning
15:03:47 ...
Frank recently added support for tool/function calling to the Prompt API, a prerequisite as we advance toward the exciting space of enabling agentic workflows
15:04:02 Topic: Announcements
15:04:10 Subtopic: Awesome WebNN tools
15:04:12 zkis has joined #webmachinelearning
15:04:15 DwayneR has joined #webmachinelearning
15:04:33 anssik: Awesome WebNN tools updates, new WebNN Model-to-Code conversion tools published
15:04:33 -> WebNN Tools https://github.com/webmachinelearning/awesome-webnn#tools
15:04:33 present+ Zoltan_Kis
15:04:44 ... ONNX2WebNN by Ningxin
15:06:57 Ningxin: converts an ONNX model to a WebNN JS graph topology and a weights bin file so JS can load the weights from it; this enables lightweight use of WebNN without any framework dependencies
15:07:03 Present+ Reilly_Grant
15:07:51 ... WebNN Code Generator by Belem
15:08:12 ... WebNN Utilities / OnnxConverter by the MS Edge team
15:08:45 anssik: see also a tutorial on how to generate WebNN vanilla JS for package-size sensitive deployments
15:08:55 -> Generating WebNN Vanilla JavaScript https://webnn.io/en/learn/tutorials/webnn/vanillajs
15:09:37 anssik: the team expects to deliver further improvements with new WebNN code-to-code translation tools
15:09:37 ... to allow converting existing Python-based ML code, PyTorch/TorchScript, from other frameworks to WebNN vanilla JavaScript
15:09:47 ... thanks to Ningxin, Belem, and the MS Edge team for these contributions that help developers adopt WebNN in their web apps
15:09:59 Subtopic: WebNN Documentation community preview
15:10:07 anssik: I'm pleased to launch a community preview of the new WebNN Documentation
15:10:16 -> https://github.com/webmachinelearning/webnn-docs
15:10:24 -> https://webnn.io/
15:10:34 anssik: the webnn-docs effort is very important as we enter this stage of wider developer adoption
15:10:38 ... huge thanks to Belem for pulling this off!
15:10:57 ...
we believe the vendor-neutral WebNN developer docs should ultimately live on MDN, which has the widest reach
15:11:08 ... during this preview phase, we use the dedicated site to gather feedback and plan the next steps
15:11:08 ... the GH repo is open to contributions
15:11:23 Subtopic: Web Almanac Generative AI 2025 chapter
15:11:37 anssik: Web Almanac is HTTP Archive's annual state of the web report, and Christian is leading the GenAI chapter
15:11:42 -> https://github.com/HTTPArchive/almanac.httparchive.org/issues/4104
15:11:43 https://github.com/HTTPArchive/almanac.httparchive.org/issues/4104 -> Issue 4104 Generative AI 2025 🆕 (by nrllh) [2025 chapter]
15:11:46 -> https://almanac.httparchive.org/
15:11:50 anssik: Christian gave a great overview of this effort at our CG meeting, please check out:
15:11:53 -> https://github.com/webmachinelearning/meetings/blob/main/telcons/2025-06-23-cg-minutes.md#http-archives-web-almanac
15:12:09 Present+
15:12:40 Christian: we are planning a new chapter for the Web Almanac, an annual publication that identifies web trends; we want to find out how web sites are using WebNN and the Built-in AI APIs
15:13:06 Topic: W3C TPAC 2025 group meetings
15:13:30 anssik: TPAC 2025, W3C's annual all-groups conference, will take place 10-14 November 2025 in Kobe, Japan. The venue is the Kobe International Conference Center
15:13:36 ... my expectation is the WebML WG participants prefer to meet during the TPAC week
15:13:43 ... I also expect we will have a joint meeting with the WebML CG
15:13:51 ... group meetings can happen on Monday, Tuesday, Thursday, and Friday
15:14:02 ... I have requested Monday (10 Nov) for the WG and Tuesday (11 Nov) for the CG meeting from the TPAC organizers
15:14:12 ... I expect the schedule to be confirmed next month and I'll share the details with the group when available
15:14:38 anssik: one consideration is related to timezones for possible remote participants, Japan Monday is US West Coast Sunday evening
15:14:49 ...
this may or may not work depending on how flexible you can be with your work hours on an exceptional basis
15:15:00 ... feedback is still welcome via: https://github.com/webmachinelearning/meetings/issues/32
15:15:00 https://github.com/webmachinelearning/meetings/issues/32 -> Issue 32 WebML WG/CG scheduling poll for TPAC 2025 (Kobe, Japan) (by anssiko)
15:15:04 ... questions?
15:15:47 Tarek: I'm considering coming and wanted to know if there are specific steps to take?
15:16:00 anssik: more information will be shared by early August at the latest
15:16:52 q?
15:17:06 Topic: Incubations
15:17:10 anssik: the WebML Community Group met at an EU-APAC friendly time on Mon 23 June 2025
15:17:15 -> https://github.com/webmachinelearning/meetings/blob/main/telcons/2025-06-23-cg-agenda.md
15:17:18 -> https://github.com/webmachinelearning/meetings/blob/main/telcons/2025-06-23-cg-minutes.md
15:17:22 anssik: we received an HTTP Archive Web Almanac update from Christian, check the minutes if you're interested in contributing
15:17:33 ... we reviewed a new proposal for a Fact-checking API; the initial feedback suggests implementation has risks, and the proposal is better experimented with as a web extension, similar to what WikiMedia has done
15:17:42 ... we had a Proofreader API kick-off
15:17:46 -> Proofreader API https://github.com/webmachinelearning/proofreader-api
15:17:58 anssik: now in dev trial in Chrome, an Origin Trial is planned for Chrome 139
15:18:07 ... feedback welcome
15:18:20 anssik: discussed new features and recent improvements landed in the Prompt API
15:18:32 ... structured output improvements to fix bugs found via implementation experience
15:18:48 ... assistant prefixes (aka prefills) to allow constraining responses by providing a prefix that will guide the LLM to a specific response format
15:19:01 ... support for tool/function calling landed, paving the way for agentic workflows
15:19:21 ... received updates from new Prompt(-like) API web extensions (e.g.
AiBrow, Mozilla's trial web extension API) that extend the Prompt API baseline with new features
15:19:23 anssik: we deferred the Translation API to a future meeting when we have Mozilla folks on the call
15:19:37 ... we wanted to better understand the use cases of Mozilla's Translation API proposal and see if we can converge
15:20:00 Topic: Operator specific issues
15:20:03 -> [operator specific] issues https://github.com/webmachinelearning/webnn/labels/operator%20specific
15:20:24 Subtopic: Drop support for int32/uint32 zeroPoint for quantizeLinear
15:20:29 anssik: issue #856
15:20:29 https://github.com/webmachinelearning/webnn/issues/856 -> Issue 856 Consider drop the support for int32/uint32 of zeropoint for quantizeLinear (by lisa0314) [operator specific]
15:20:39 anssik: Lisa reports "WebNN spec said, quantizeLinear zeroPoint can support uint8/int8, uint32/int32"
15:20:44 -> https://www.w3.org/TR/webnn/#dom-mlgraphbuilder-quantizelinear-input-scale-zeropoint-options-zeropoint
15:20:48 anssik: and points out the limitations in the current backends:
15:20:55 ... ORT quantizeLinear can't support int32/uint32 for zeroPoint
15:20:59 ... TFLite quantize can't support int32/uint32 for zeroPoint
15:21:04 ... per this data, Lisa's suggestion is to drop int32/uint32 zeroPoint support from quantizeLinear
15:21:10 ... comments?
15:21:17 Tarek has joined #webmachinelearning
15:21:57 q+
15:22:02 Reilly: skimming this, it doesn't seem valuable to support int32 quantization; it's technically quantization, but it does not seem like a very useful feature to me
15:22:09 q?
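As context for the zeroPoint discussion, the affine quantization math behind quantizeLinear/dequantizeLinear can be sketched in plain JS. This is an illustrative sketch only, not from the minutes; the int8 clamp range and example values are assumptions:

```javascript
// Affine quantization: q = clamp(round(x / scale) + zeroPoint, lo, hi)
// and its inverse:     x' = (q - zeroPoint) * scale

// Quantize a float to an integer type with range [lo, hi] (int8 by
// default). zeroPoint lives in the quantized type's range, which is
// why an int32/uint32 zeroPoint would only matter if the quantized
// data type itself were 32-bit -- the case the backends don't support.
function quantizeLinear(x, scale, zeroPoint, lo = -128, hi = 127) {
  const q = Math.round(x / scale) + zeroPoint;
  return Math.min(hi, Math.max(lo, q));
}

// Dequantize back to float. An int32 *input* here remains useful,
// e.g. for conv2d bias values accumulated in int32.
function dequantizeLinear(q, scale, zeroPoint) {
  return (q - zeroPoint) * scale;
}

console.log(quantizeLinear(0.5, 0.1, 3));   // 8
console.log(dequantizeLinear(8, 0.1, 3));   // ~0.5
```

This only illustrates the scalar numeric semantics under discussion; the actual WebNN quantizeLinear/dequantizeLinear are MLGraphBuilder ops that operate on tensors.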
15:22:22 q-
15:22:35 Dwayne: I don't see a compelling need for int32 in zeroPoint
15:22:47 RafaelCintron: +1 to what Dwayne said
15:23:15 ningxin: checking, Core ML does not support int32 for zeroPoint
15:23:44 Reilly: it does not make sense to quantize values to a 32-bit integer type, not useful
15:24:29 Dwayne: this is now specced so that zeroPoint is the same type as the input, we need to split the data types between those
15:25:15 Dwayne: ONNX dequantizeLinear can be int32, but not the zeroPoint
15:25:23 ningxin: the proposal here is for quantization only
15:26:32 Reilly: ONNX is the outlier in supporting int32 for the quantized input as well, I'd expand the issue accordingly
15:27:58 ... Dwayne and Ningxin, do you agree, since there's a binding in the spec from input to zeroPoint type, does it make sense to drop int32/uint32 from quantizeLinear?
15:28:18 ningxin: my understanding is quantizeLinear is bound to the output data type, from float to linear?
15:28:59 Reilly: there's a matching question on dequantize
15:29:17 ... and whether to also drop support for int32/uint32 for both input and output
15:29:39 Dwayne: I'll check the ONNX history for the reason why it is an outlier
15:29:58 Reilly: we'll do more research on the broader int32/uint32 question
15:30:17 ... will make a comment on the issue
15:30:18 q?
15:30:42 Subtopic: Add missing 64-bit integers support for some reduction operators
15:30:47 anssik: issue #694 and PR #695
15:30:48 https://github.com/webmachinelearning/webnn/pull/695 -> Pull Request 695 Bugfix: Add missing 64-bit integers support for some reduction operators (by huningxin) [operator specific]
15:30:48 https://github.com/webmachinelearning/webnn/issues/694 -> Issue 694 Consider adding int64/uint64 data type support for some reduce operators (by lisa0314) [operator specific]
15:30:55 ...
related issue #853
15:30:55 https://github.com/webmachinelearning/webnn/issues/853 -> Issue 853 The minimum data type set (by huningxin) [operator specific]
15:31:59 ningxin: IIRC MikeW asked if this is optional; I shared that this is optional and not mandatory
15:31:59 q+
15:32:02 ack Mike_Wyrzykowski
15:32:13 MikeW: I just approved the PR
15:32:29 anssik: this PR is good to merge
15:32:53 Topic: Other issues and PRs
15:32:53 If I remember correctly, int32 input of dequantizeLinear is useful for conv2d's bias
15:33:02 Subtopic: Evaluate sustainability impact
15:33:09 anssik: issue #861
15:33:09 https://github.com/webmachinelearning/webnn/issues/861 -> Issue 861 Evaluate sustainability impact (by anssiko) [tag-needs-resolution]
15:33:27 ... I want to bump this issue, opened in response to TAG review feedback:
15:33:34 > TAG: We would appreciate if the WG would evaluate the likely impacts on sustainability from introducing this API, perhaps in collaboration with the Sustainable Web IG. There are several competing likely effects, including the comparative energy efficiency of personal devices vs datacenters, the greater efficiency of WebNN over WebGPU for the same workload, increased use of neural nets as they get easier to access, faster device
15:33:34 obsolescence if older devices can't effectively run the workloads this API encourages, and likely other considerations. Any sustainability impacts might be balanced by increased utility and privacy, but it would be good to know what we're signing up for.
15:34:23 anssik: we discussed last time that purpose-built ML accelerators, aka NPUs, are generally known to be more power-efficient than GPUs
15:34:43 ... I opened an issue to solicit further input, suggestions, corrections and clarifications to inform related explainer and/or specification updates in response to this TAG feedback
15:35:04 q?
15:35:04 q+
15:35:08 ack RafaelCintron
15:35:41 q+
15:35:57 RafaelCintron: asking, did you remember sharing with the TAG that NPUs are better for sustainability?
15:36:10 anssik: I shared the issue with the TAG
15:36:22 ack reillyg
15:37:23 Reilly: a reasonable response would be that what impacts sustainability is the broader adoption of ML techniques as a whole, client vs. server side; both take energy, and local execution is only possible if the local device has enough energy and power
15:40:12 q+
15:40:17 ... there's a concern that applies across the whole space: local compute reduces the cost for the site developer and pushes it to the user, a power-privacy trade-off
15:40:51 ... I'm a little concerned about e.g. crypto miners using local compute for their own benefit
15:41:51 ... this is possible via Wasm and WebGPU already, however
15:42:11 ack RafaelCintron
15:43:02 RafaelCintron: there's a substantial benefit from JS tools that allow minimizing the amount of bits to transfer over the network
15:43:42 ... new machines are bought for new experiences
15:44:12 anssik: model caching helps with sustainability
15:44:13 q?
15:44:52 q?
15:45:02 Topic: Caching mechanism for MLGraph
15:45:05 anssik: issue #807 and PR #862
15:45:06 https://github.com/webmachinelearning/webnn/pull/862 -> MERGED Pull Request 862 Add WebNN MLGraph Cache Explainer (by anssiko)
15:45:06 https://github.com/webmachinelearning/webnn/issues/807 -> Issue 807 Caching mechanism for MLGraph (by anssiko) [question] [feature request]
15:45:10 anssik: thanks to Reilly and Ningxin for your review and comments
15:45:14 ... the first version of the explainer was merged
15:45:18 -> Explainer https://github.com/webmachinelearning/webnn/blob/main/cache-explainer.md
15:45:26 q+
15:45:27 anssik: I'd like to discuss what participants think are the reasonable next steps for the spec and implementation
15:45:33 ...
as you recall, we have a prototype Chromium implementation and have explored how to use this in a real sample app
15:45:36 -> https://github.com/shiyi9801/chromium/pull/227
15:45:37 https://github.com/shiyi9801/chromium/pull/227 -> Pull Request 227 [DO NOT SUBMIT] Model cache POC (by shiyi9801)
15:45:40 -> https://github.com/webmachinelearning/webnn-samples/compare/master...shiyi9801:webnn-samples:model_cache
15:45:46 ack RafaelCintron
15:46:13 RafaelCintron: strong proponent of a caching mechanism; the current API could be improved by combining build and buildAndSave together
15:46:32 ... there are EPs in ONNX that cannot go and do a save at any point, you need to decide at build time
15:47:17 ... I think it's unfortunate to save things to a slow disk, and I know people have said they want to do model inferencing securely
15:47:52 Zoltan: should we include build options?
15:47:53 q+
15:48:23 RafaelCintron: build options sound fine, you need to give it a name
15:48:38 Zoltan: by default it does not save, you have to be explicit and define it in options
15:49:15 q?
15:49:19 ack reillyg
15:49:33 anssik: Rafael, please comment on the issue so we remember to update the explainer
15:50:36 Reilly: I think what the Intel and Microsoft folks have been looking at is the design in ONNX and Chromium; this has a unique feature in that it is possible to make the model ready for inferencing without going through a serialization step to prepare the model to be saved
15:51:03 ... in TFLite and Core ML, the only option is to produce a serialized model and then load the serialized model into a form ready for inference
15:51:45 ... only ORT supports building a model in memory in deserialized form ready for inference; this raises the question of whether we force the model to be serialized anyway, so we can make saving the graph an optional or a mandatory step
15:52:26 ... the benefit of making this optional is the latency of the first inference; the user visits the site multiple times
15:53:06 ...
leaning towards not optimizing for the first load case so much, which leads me to say maybe saving the model becomes mandatory, the only option is "build and save" where you have to name it
15:53:33 ... that makes me concerned: if we give the developer this capability, then everyone has to deal with the question of how do I name it
15:53:49 q+
15:53:50 ... the question is, how important is this potential optimization for just one framework?
15:53:52 ack RafaelCintron
15:54:23 RafaelCintron: if we have both "build" and "build and save", do we lose performance on TFLite?
15:55:00 ... to have just "build and save", I haven't thought of that; when would be the case you wouldn't want to save?
15:55:07 ... some toy web site?
15:55:19 ... if the user visits multiple times, it always makes sense to save
15:55:36 Reilly: I feel our advice to developers is to always "build and save"
15:55:45 ... only for sample sites would "build" be reasonable
15:56:12 anssik: the common case should be easy, the less common case should be possible
15:57:42 Reilly: my only concern is implementation complexity, build without saving, if not exercised so often
15:58:14 RafaelCintron: I'd be OK with build-only taking an option and forcing people to name their model
15:58:44 ... the "do not save" use case would be to keep the model secure and not allow developers to inspect it
15:59:42 anssik: is it security by obscurity?
15:59:56 RafaelCintron: it would take more effort
16:00:47 anssik: action on Rafael to check that the explainer reflects your thinking
16:01:04 Topic: Query supported devices
16:01:10 Subtopic: Before graph compilation
16:01:14 anssik: issue #815
16:01:16 https://github.com/webmachinelearning/webnn/issues/815 -> Issue 815 Query supported devices before graph compilation (by anssiko) [device selection]
16:01:18 ... and related PR #860 by Zoltan (thanks!)
for explainer updates
16:01:21 https://github.com/webmachinelearning/webnn/pull/860 -> Pull Request 860 Update with an example HW selection guide and new use cases (by zolkis)
16:01:33 ... I'd like to check that all the product-driven use case feedback from Google Meet is translated into explainer updates
16:01:42 ... Zoltan has updated the key use cases and will talk to them, you can follow along from the staged explainer doc at:
16:01:45 -> https://github.com/zolkis/webnn/blob/device-selection-explainer-next/device-selection-explainer.md#key-use-cases-and-requirements
16:02:56 Zoltan: developer scenarios try to figure out whether the model can run on the target platform
16:03:22 ... tried to avoid solutions, document requirements only
16:03:42 ... UC 1. Pre-download capability check
16:03:46 ... UC 2. Pre-download or pre-build hints and constraints
16:03:50 ... UC 3. Post-compile query of inference details
16:04:17 RRSAgent, draft minutes
16:04:18 I have made the request to generate https://www.w3.org/2025/06/26-webmachinelearning-minutes.html anssik
16:05:25 Zoltan: the Google Meet requirement was to figure out very fast whether they can use WebNN
16:05:42 ... PTAL at the use cases and requirements section, link shared above
16:07:16 Topic: Next meeting 14 August 2025
16:07:30 Anssi: due to the upcoming holiday season in the Northern hemisphere we will skip the July meetings and will meet again 14 August 2025
16:07:44 ... thank you for your contributions during the first half of 2025, everyone!
16:07:57 ...
the community continues to grow and I'm pleased to see new people join from both big and small companies, as well as individuals
16:08:24 Present+ Sun_Shin
16:08:33 RRSAgent, draft minutes
16:08:35 I have made the request to generate https://www.w3.org/2025/06/26-webmachinelearning-minutes.html anssik
16:26:06 RRSAgent, draft minutes
16:26:07 I have made the request to generate https://www.w3.org/2025/06/26-webmachinelearning-minutes.html anssik