14:57:00 RRSAgent has joined #webmachinelearning
14:57:05 logging to https://www.w3.org/2025/06/26-webmachinelearning-irc
14:57:05 Meeting: WebML WG Teleconference – 26 June 2025
14:57:05 RRSAgent, make logs Public
14:57:06 please title this meeting ("meeting: ..."), anssik
14:57:07 Chair: Anssi
14:57:14 Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2025-06-26-wg-agenda.md
14:57:18 Scribe: Anssi
14:57:27 scribeNick: anssik
14:57:27 gb, this is webmachinelearning/webnn
14:57:27 anssik, OK.
14:57:32 Present+ Anssi_Kostiainen
14:57:38 Present+ Zoltan_Kis
14:57:47 Present+ Rafael_Cintron
14:57:58 Present+ Laszlo_Gombos
15:00:02 ningxin has joined #webmachinelearning
15:00:14 Present+ Ningxin_Hu
15:00:31 lgombos has joined #webmachinelearning
15:00:39 Present+ Dwayne_Robinson
15:00:58 Present+ Mike_Wyrzykowski
15:01:03 Mike_Wyrzykowski has joined #webmachinelearning
15:01:14 Present+ Tarek_Ziade
15:01:28 Present+ Winston_Chen
15:02:03 Winston has joined #webmachinelearning
15:02:12 Present+ Ehsan_Toreini
15:02:33 Present+ Christian_Liebel
15:02:43 RRSAgent, draft minutes
15:02:45 I have made the request to generate https://www.w3.org/2025/06/26-webmachinelearning-minutes.html anssik
15:03:03 anssik: please welcome Khushal Sagar, Hannah Van Opstal, and David Bokan from Google to the WebML WG!
15:03:16 ... and please welcome Frank Li from Microsoft to the WebML CG
15:03:41 RafaelCintron has joined #webmachinelearning
15:03:47 ...
Frank recently added support for tool/function calling to the Prompt API, a prerequisite as we advance toward the exciting space of enabling agentic workflows
15:04:02 Topic: Announcements
15:04:10 Subtopic: Awesome WebNN tools
15:04:12 zkis has joined #webmachinelearning
15:04:15 DwayneR has joined #webmachinelearning
15:04:33 anssik: Awesome WebNN tools updates, new WebNN Model-to-Code conversion tools published
15:04:33 -> WebNN Tools https://github.com/webmachinelearning/awesome-webnn#tools
15:04:33 present+ Zoltan_Kis
15:04:44 ... ONNX2WebNN by Ningxin
15:06:57 Ningxin: converts an ONNX model to a WebNN JS graph topology and a weights bin file so JS can load the weights from it; this enables lightweight use of WebNN without any framework dependencies
15:07:03 Present+ Reilly_Grant
15:07:51 ... WebNN Code Generator by Belem
15:08:12 ... WebNN Utilities / OnnxConverter by the MS Edge team
15:08:45 anssik: see also a tutorial on how to generate WebNN vanilla JS for package-size sensitive deployments
15:08:55 -> Generating WebNN Vanilla JavaScript https://webnn.io/en/learn/tutorials/webnn/vanillajs
15:09:37 anssik: the team expects to deliver further improvements with new WebNN code-to-code translation tools
15:09:37 ... to allow converting existing Python-based ML code, PyTorch/TorchScript, from other frameworks to WebNN vanilla JavaScript
15:09:47 ... thanks to Ningxin, Belem, and the MS Edge team for these contributions that help developers adopt WebNN in their web apps
15:09:59 Subtopic: WebNN Documentation community preview
15:10:07 anssik: I'm pleased to launch a community preview of the new WebNN Documentation
15:10:16 -> https://github.com/webmachinelearning/webnn-docs
15:10:24 -> https://webnn.io/
15:10:34 anssik: the webnn-docs effort is very important as we enter this stage of wider developer adoption
15:10:38 ... huge thanks to Belem for pulling this off!
15:10:57 ...
we believe the vendor-neutral WebNN developer docs should ultimately live on MDN, which has the widest reach
15:11:08 ... during this preview phase, we use the dedicated site to gather feedback and plan the next steps
15:11:08 ... the GH repo is open to contributions
15:11:23 Subtopic: Web Almanac Generative AI 2025 chapter
15:11:37 anssik: Web Almanac is HTTP Archive's annual state of the web report, and Christian is leading the GenAI chapter
15:11:42 -> https://github.com/HTTPArchive/almanac.httparchive.org/issues/4104
15:11:43 https://github.com/HTTPArchive/almanac.httparchive.org/issues/4104 -> Issue 4104 Generative AI 2025 🆕 (by nrllh) [2025 chapter]
15:11:46 -> https://almanac.httparchive.org/
15:11:50 anssik: Christian gave a great overview of this effort at our CG meeting, please check out:
15:11:53 -> https://github.com/webmachinelearning/meetings/blob/main/telcons/2025-06-23-cg-minutes.md#http-archives-web-almanac
15:12:09 Present+
15:12:40 Christian: we are planning a new chapter for the Web Almanac, an annual publication that identifies web trends; we want to find out how web sites are using WebNN and the Built-in AI APIs
15:13:06 Topic: W3C TPAC 2025 group meetings
15:13:30 anssik: TPAC 2025, W3C's annual all-groups conference, will take place 10-14 November 2025 in Kobe, Japan. The venue is the Kobe International Conference Center
15:13:36 ... my expectation is the WebML WG participants prefer to meet during the TPAC week
15:13:43 ... I also expect we will have a joint meeting with the WebML CG
15:13:51 ... group meetings can happen on Monday, Tuesday, Thursday, and Friday
15:14:02 ... I have requested Monday (10 Nov) for the WG and Tuesday (11 Nov) for the CG meeting from the TPAC organizers
15:14:12 ... I expect the schedule to be confirmed next month and I'll share the details with the group when available
15:14:38 anssik: one consideration is related to timezones for possible remote participants, Japan Monday is US West Coast Sunday evening
15:14:49 ...
this may or may not work depending on how flexible you can be with your work hours on an exceptional basis
15:15:00 ... feedback is still welcome via: https://github.com/webmachinelearning/meetings/issues/32
15:15:00 https://github.com/webmachinelearning/meetings/issues/32 -> Issue 32 WebML WG/CG scheduling poll for TPAC 2025 (Kobe, Japan) (by anssiko)
15:15:04 ... questions?
15:15:47 Tarek: I'm considering coming and wanted to know if there are specific steps to take?
15:16:00 anssik: more information will be shared by early August at the latest
15:16:52 q?
15:17:06 Topic: Incubations
15:17:10 anssik: the WebML Community Group met at an EU-APAC friendly time on Mon 23 June 2025
15:17:15 -> https://github.com/webmachinelearning/meetings/blob/main/telcons/2025-06-23-cg-agenda.md
15:17:18 -> https://github.com/webmachinelearning/meetings/blob/main/telcons/2025-06-23-cg-minutes.md
15:17:22 anssik: we received an HTTP Archive Web Almanac update from Christian, check the minutes if you're interested in contributing
15:17:33 ... we reviewed a new proposal for a Fact-checking API; the initial feedback suggests implementation has risks, and the proposal is better experimented with as a web extension, similar to what WikiMedia has done
15:17:42 ... we had a Proofreader API kick-off
15:17:46 -> Proofreader API https://github.com/webmachinelearning/proofreader-api
15:17:58 anssik: now in dev trial in Chrome, an Origin Trial is planned for Chrome 139
15:18:07 ... feedback welcome
15:18:20 anssik: discussed new features and recent improvements landed in the Prompt API
15:18:32 ... structured output improvements to fix bugs found via implementation experience
15:18:48 ... assistant prefixes (aka prefills) to allow constraining responses by providing a prefix that will guide the LLM to a specific response format
15:19:01 ... support for tool/function calling landed, paving the way for agentic workflows
15:19:21 ... received updates from new Prompt(-like) API web extensions (e.g.
AiBrow, Mozilla's trial web extension API) that extend the Prompt API baseline with new features
15:19:23 anssik: we deferred the Translation API to a future meeting when we have Mozilla folks on the call
15:19:37 ... we wanted to better understand the use cases of Mozilla's Translation API proposal and see if we can converge
15:20:00 Topic: Operator specific issues
15:20:03 -> [operator specific] issues https://github.com/webmachinelearning/webnn/labels/operator%20specific
15:20:24 Subtopic: Drop support for int32/uint32 zeroPoint for quantizeLinear
15:20:29 anssik: issue #856
15:20:29 https://github.com/webmachinelearning/webnn/issues/856 -> Issue 856 Consider drop the support for int32/uint32 of zeropoint for quantizeLinear (by lisa0314) [operator specific]
15:20:39 anssik: Lisa reports "WebNN spec said, quantizeLinear zeroPoint can support uint8/int8, uint32/int32"
15:20:44 -> https://www.w3.org/TR/webnn/#dom-mlgraphbuilder-quantizelinear-input-scale-zeropoint-options-zeropoint
15:20:48 anssik: and points out the limitations in the current backends:
15:20:55 ... ORT quantizeLinear can't support int32/uint32 for zeroPoint
15:20:59 ... TFLite quantize can't support int32/uint32 for zeroPoint
15:21:04 ... per this data, Lisa's suggestion is to drop int32/uint32 zeroPoint support from quantizeLinear
15:21:10 ... comments?
15:21:17 Tarek has joined #webmachinelearning
15:21:57 q+
15:22:02 Reilly: skimming this, it doesn't seem valuable to support int32 quantization; it's technically quantization, but it does not seem like a very useful feature to me
15:22:09 q?
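As context for the zeroPoint discussion, the affine quantization math behind quantizeLinear/dequantizeLinear can be sketched in plain JS. This is an illustrative sketch only, not from the minutes; the int8 clamp range and example values are assumptions:

```javascript
// Affine quantization: q = clamp(round(x / scale) + zeroPoint, lo, hi)
// and its inverse:     x' = (q - zeroPoint) * scale

// Quantize a float to an integer type with range [lo, hi] (int8 by
// default). zeroPoint lives in the quantized type's range, which is
// why an int32/uint32 zeroPoint would only matter if the quantized
// data type itself were 32-bit -- the case the backends don't support.
function quantizeLinear(x, scale, zeroPoint, lo = -128, hi = 127) {
  const q = Math.round(x / scale) + zeroPoint;
  return Math.min(hi, Math.max(lo, q));
}

// Dequantize back to float. An int32 *input* here remains useful,
// e.g. for conv2d bias values accumulated in int32.
function dequantizeLinear(q, scale, zeroPoint) {
  return (q - zeroPoint) * scale;
}

console.log(quantizeLinear(0.5, 0.1, 3));   // 8
console.log(dequantizeLinear(8, 0.1, 3));   // ~0.5
```

This only illustrates the scalar numeric semantics under discussion; the actual WebNN quantizeLinear/dequantizeLinear are MLGraphBuilder ops that operate on tensors.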
15:22:22 q-
15:22:35 Dwayne: I don't see a compelling need for int32 in zeroPoint
15:22:47 RafaelCintron: +1 to what Dwayne said
15:23:15 ningxin: checking, Core ML does not support int32 for zeroPoint
15:23:44 Reilly: it does not make sense to quantize values to a 32-bit integer type, not useful
15:24:29 Dwayne: this is now specced so that zeroPoint is the same type as the input, we need to split the data types between those
15:25:15 Dwayne: ONNX dequantizeLinear can be int32, but not the zeroPoint
15:25:23 ningxin: the proposal here is for quantization only
15:26:32 Reilly: ONNX is the outlier in supporting int32 for the quantized input as well, I'd expand the issue accordingly
15:27:58 ... Dwayne and Ningxin, do you agree, since there's a binding in the spec from input to zeroPoint type, does it make sense to drop int32/uint32 from quantizeLinear?
15:28:18 ningxin: my understanding is quantizeLinear is bound to the output data type, from float to linear?
15:28:59 Reilly: there's a matching question on dequantize
15:29:17 ... and whether to also drop support for int32/uint32 for both input and output
15:29:39 Dwayne: I'll check the ONNX history for the reason why it is an outlier
15:29:58 Reilly: we'll do more research on the broader int32/uint32 question
15:30:17 ... will make a comment on the issue
15:30:18 q?
15:30:42 Subtopic: Add missing 64-bit integers support for some reduction operators
15:30:47 anssik: issue #694 and PR #695
15:30:48 https://github.com/webmachinelearning/webnn/pull/695 -> Pull Request 695 Bugfix: Add missing 64-bit integers support for some reduction operators (by huningxin) [operator specific]
15:30:48 https://github.com/webmachinelearning/webnn/issues/694 -> Issue 694 Consider adding int64/uint64 data type support for some reduce operators (by lisa0314) [operator specific]
15:30:55 ...
related issue #853
15:30:55 https://github.com/webmachinelearning/webnn/issues/853 -> Issue 853 The minimum data type set (by huningxin) [operator specific]
15:31:59 ningxin: IIRC MikeW asked if this is optional; I shared that this is optional and not mandatory
15:31:59 q+
15:32:02 ack Mike_Wyrzykowski
15:32:13 MikeW: I just approved the PR
15:32:29 anssik: this PR is good to merge
15:32:53 Topic: Other issues and PRs
15:32:53 If I remember correctly, int32 input of dequantizeLinear is useful for conv2d's bias
15:33:02 Subtopic: Evaluate sustainability impact
15:33:09 anssik: issue #861
15:33:09 https://github.com/webmachinelearning/webnn/issues/861 -> Issue 861 Evaluate sustainability impact (by anssiko) [tag-needs-resolution]
15:33:27 ... I want to bump this issue, opened in response to TAG review feedback:
15:33:34 > TAG: We would appreciate if the WG would evaluate the likely impacts on sustainability from introducing this API, perhaps in collaboration with the Sustainable Web IG. There are several competing likely effects, including the comparative energy efficiency of personal devices vs datacenters, the greater efficiency of WebNN over WebGPU for the same workload, increased use of neural nets as they get easier to access, faster device
15:33:34 obsolescence if older devices can't effectively run the workloads this API encourages, and likely other considerations. Any sustainability impacts might be balanced by increased utility and privacy, but it would be good to know what we're signing up for.
15:34:23 anssik: we discussed last time that purpose-built ML accelerators, aka NPUs, are generally known to be more power-efficient than GPUs
15:34:43 ... I opened an issue to solicit further input, suggestions, corrections and clarifications to inform related explainer and/or specification updates in response to this TAG feedback
15:35:04 q?
15:35:04 q+
15:35:08 ack RafaelCintron
15:35:41 q+
15:35:57 RafaelCintron: asking, did you remember sharing with the TAG that NPUs are better for sustainability?
15:36:10 anssik: I shared the issue with the TAG
15:36:22 ack reillyg
15:37:23 Reilly: a reasonable response would be that what impacts sustainability is the broader adoption of ML techniques as a whole, client vs. server side; both take energy, and local execution is only possible if the local device has enough energy and power
15:40:12 q+
15:40:17 ... there's a concern that applies across the whole space: local compute reduces the cost for the site developer and pushes it to the user, a power-privacy trade-off
15:40:51 ... I'm a little concerned about e.g. crypto miners using local compute for their own benefit
15:41:51 ... this is possible via Wasm and WebGPU already, however
15:42:11 ack RafaelCintron
15:43:02 RafaelCintron: there's a substantial benefit from JS tools that allow minimizing the amount of bits to transfer over the network
15:43:42 ... new machines are bought for new experiences
15:44:12 anssik: model caching helps with sustainability
15:44:13 q?
15:44:52 q?
15:45:02 Topic: Caching mechanism for MLGraph
15:45:05 anssik: issue #807 and PR #862
15:45:06 https://github.com/webmachinelearning/webnn/pull/862 -> MERGED Pull Request 862 Add WebNN MLGraph Cache Explainer (by anssiko)
15:45:06 https://github.com/webmachinelearning/webnn/issues/807 -> Issue 807 Caching mechanism for MLGraph (by anssiko) [question] [feature request]
15:45:10 anssik: thanks to Reilly and Ningxin for your review and comments
15:45:14 ... the first version of the explainer was merged
15:45:18 -> Explainer https://github.com/webmachinelearning/webnn/blob/main/cache-explainer.md
15:45:26 q+
15:45:27 anssik: I'd like to discuss what participants think are the reasonable next steps for the spec and implementation
15:45:33 ...
as you recall, we have a prototype Chromium implementation and have explored how to use this in a real sample app
15:45:36 -> https://github.com/shiyi9801/chromium/pull/227
15:45:37 https://github.com/shiyi9801/chromium/pull/227 -> Pull Request 227 [DO NOT SUBMIT] Model cache POC (by shiyi9801)
15:45:40 -> https://github.com/webmachinelearning/webnn-samples/compare/master...shiyi9801:webnn-samples:model_cache
15:45:46 ack RafaelCintron
15:46:13 RafaelCintron: strong proponent of a caching mechanism; the current API could be improved by combining build and buildAndSave together
15:46:32 ... there are EPs in ONNX that cannot go and do a save at any point, you need to decide at build time
15:47:17 ... I think it's unfortunate to save things to a slow disk, and I know people have said they want to do model inferencing securely
15:47:52 Zoltan: should we include build options?
15:47:53 q+
15:48:23 RafaelCintron: build options sound fine, you need to give it a name
15:48:38 Zoltan: by default it does not save, you have to be explicit and define it in options
15:49:15 q?
15:49:19 ack reillyg
15:49:33 anssik: Rafael, please comment on the issue so we remember to update the explainer
15:50:36 Reilly: I think what the Intel and Microsoft folks have been looking at is the design in ONNX and Chromium; this has a unique feature in that it is possible to make the model ready for inferencing without going through a serialization step to prepare the model to be saved
15:51:03 ... in TFLite and Core ML, the only option is to produce a serialized model and then load the serialized model into a form ready for inference
15:51:45 ... only ORT supports building a model in memory in deserialized form ready for inference; this raises the question of whether we force the model to be serialized anyway, so we can make saving the graph an optional or a mandatory step
15:52:26 ... the benefit of making this optional is the latency of the first inference; the user visits the site multiple times
15:53:06 ...
leaning towards not optimizing for the first load case so much, which leads me to say maybe saving the model becomes mandatory, the only option is "build and save" where you have to name it
15:53:33 ... that makes me concerned: if we give the developer this capability, then everyone has to deal with the question of how do I name it
15:53:49 q+
15:53:50 ... the question is, how important is this potential optimization for just one framework?
15:53:52 ack RafaelCintron
15:54:23 RafaelCintron: if we have both "build" and "build and save", do we lose performance on TFLite?
15:55:00 ... to have just "build and save", I haven't thought of that; when would be the case you wouldn't want to save?
15:55:07 ... some toy web site?
15:55:19 ... if the user visits multiple times, it always makes sense to save
15:55:36 Reilly: I feel our advice to developers is to always "build and save"
15:55:45 ... only for sample sites would "build" be reasonable
15:56:12 anssik: the common case should be easy, the less common case should be possible
15:57:42 Reilly: my only concern is implementation complexity, build without saving, if not exercised so often
15:58:14 RafaelCintron: I'd be OK with build-only taking an option and forcing people to name their model
15:58:44 ... the "do not save" use case would be to keep the model secure and not allow developers to inspect it
15:59:42 anssik: is it security by obscurity?
15:59:56 RafaelCintron: it would take more effort
16:00:47 anssik: action on Rafael to check that the explainer reflects your thinking
16:01:04 Topic: Query supported devices
16:01:10 Subtopic: Before graph compilation
16:01:14 anssik: issue #815
16:01:16 https://github.com/webmachinelearning/webnn/issues/815 -> Issue 815 Query supported devices before graph compilation (by anssiko) [device selection]
16:01:18 ... and related PR #860 by Zoltan (thanks!)
for explainer updates
16:01:21 https://github.com/webmachinelearning/webnn/pull/860 -> Pull Request 860 Update with an example HW selection guide and new use cases (by zolkis)
16:01:33 ... I'd like to check that all the product-driven use case feedback from Google Meet is translated into explainer updates
16:01:42 ... Zoltan has updated the key use cases and will talk to them, you can follow along from the staged explainer doc at:
16:01:45 -> https://github.com/zolkis/webnn/blob/device-selection-explainer-next/device-selection-explainer.md#key-use-cases-and-requirements
16:02:56 Zoltan: developer scenarios try to figure out whether the model can run on the target platform
16:03:22 ... tried to avoid solutions, document requirements only
16:03:42 ... UC 1. Pre-download capability check
16:03:46 ... UC 2. Pre-download or pre-build hints and constraints
16:03:50 ... UC 3. Post-compile query of inference details
16:04:17 RRSAgent, draft minutes
16:04:18 I have made the request to generate https://www.w3.org/2025/06/26-webmachinelearning-minutes.html anssik
16:05:25 Zoltan: the Google Meet requirement was to figure out very fast whether they can use WebNN
16:05:42 ... PTAL at the use cases and requirements section, link shared above
16:07:16 Topic: Next meeting 14 August 2025
16:07:30 Anssi: due to the upcoming holiday season in the Northern hemisphere we will skip the July meetings and will meet again 14 August 2025
16:07:44 ... thank you for your contributions during the first half of 2025, everyone!
16:07:57 ...
the community continues to grow and I'm pleased to see new people join from both big and small companies, as well as individuals
16:08:24 Present+ Sun_Shin
16:08:33 RRSAgent, draft minutes
16:08:35 I have made the request to generate https://www.w3.org/2025/06/26-webmachinelearning-minutes.html anssik
16:26:06 RRSAgent, draft minutes
16:26:07 I have made the request to generate https://www.w3.org/2025/06/26-webmachinelearning-minutes.html anssik