14:59:42 <RRSAgent> RRSAgent has joined #webmachinelearning
14:59:47 <RRSAgent> logging to https://www.w3.org/2025/03/27-webmachinelearning-irc
14:59:50 <Zakim> RRSAgent, make logs Public
14:59:51 <Zakim> please title this meeting ("meeting: ..."), anssik
14:59:51 <anssik> Meeting: WebML WG Teleconference – 27 March 2025
14:59:52 <anssik> Chair: Anssi
14:59:56 <anssik> Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2025-03-27-wg-agenda.md
15:00:05 <anssik> Scribe: Anssi
15:00:08 <anssik> scribeNick: anssik
15:00:14 <anssik> gb, this is webmachinelearning/webnn
15:00:14 <gb> anssik, OK.
15:00:18 <anssik> Present+ Anssi_Kostiainen
15:00:24 <anssik> Present+ Joshua_Bell
15:00:29 <ningxin> ningxin has joined #webmachinelearning
15:00:30 <anssik> Present+ Dwayne_Robinson
15:00:32 <lgombos> lgombos has joined #webmachinelearning
15:00:37 <winston> Present + Winston_Chen
15:00:38 <lgombos> present+ Laszlo_Gombos
15:00:38 <anssik> Present+ Laszlo_Gombos
15:00:48 <tomayac> Present+ Thomas_Steiner
15:00:51 <jsbell> q+ for when we start (brief announcement re: blinkon)
15:00:54 <anssik> Present+ Mike_Wyrzykowski
15:00:59 <anssik> Present+ Winston_Chen
15:01:16 <anssik> Present+ Christian_Liebel
15:01:18 <McCool> McCool has joined #webmachinelearning
15:01:27 <anssik> Present+ Thomas_Steiner
15:01:36 <tarek> anssik I will have to leave 10 minutes earlier today, sorry
15:01:36 <anssik> Present+ Ningxin_Hu
15:01:48 <anssik> Present+ Michael_McCool
15:01:57 <anssik> RRSAgent, draft minutes
15:01:58 <RRSAgent> I have made the request to generate https://www.w3.org/2025/03/27-webmachinelearning-minutes.html anssik
15:02:43 <anssik> anssik: Please welcome Elena Zhelezina from ARM Limited and Matthew Atkinson & Winston Chen from Samsung Electronics to the WebML WG!
15:03:24 <anssik> ... also, please welcome Laszlo Gombos, Matthew Atkinson, Winston Chen and Raghavendra Ghatage from Samsung Electronics, and Jared Parr, a Google Cloud Partner, to the WebML CG!
15:03:31 <Mike_Wyrzykowski> Mike_Wyrzykowski has joined #webmachinelearning
15:03:44 <zkis> zkis has joined #webmachinelearning
15:03:47 <anssik> Topic: Incubations summary
15:03:56 <anssik> anssik: no summary today, our next WebML CG meeting is next Monday, 31 March 2025, 07:00–08:00 UTC:
15:04:05 <anssik> -> https://github.com/webmachinelearning/meetings/blob/main/telcons/2025-03-31-cg-agenda.md
15:04:17 <zkis> present+ Zoltan_Kis
15:04:19 <anssik> anssik: This is the EU and APAC timezone friendly WebML CG Teleconference option
15:04:22 <anssik> ... We alternate between the AMER-APAC (Tue/Wed 00:00-01:00 UTC) and this EU-APAC (Mon 07:00-08:00 UTC) option to cater to our geographically diversified participants
15:04:33 <anssik> ... Feel free to join the option that fits your schedule better
15:04:56 <anssik> Topic: W3C Breakouts Day 2025 summary
15:05:04 <anssik> anssik: W3C Breakouts Day 2025 took place 26 March 2025
15:05:10 <anssik> -> List of all proposed sessions https://github.com/w3c/breakouts-day-2025/issues
15:05:18 <anssik> ... a few of us attended the "How would AI Agents change the Web platform?" session proposed by Dom
15:05:22 <anssik> -> AI Agents session description https://github.com/w3c/breakouts-day-2025/issues/7
15:05:23 <gb> https://github.com/w3c/breakouts-day-2025/issues/7 -> Issue 7 How would AI Agents change the Web platform? (by dontcallmedom) [session]
15:05:26 <anssik> -> AI Agents presentation https://www.w3.org/2025/Talks/dhm-ai-agents/
15:05:30 <anssik> -> AI Agents minutes https://www.w3.org/2025/03/26-ai-agents-minutes.html
15:05:38 <anssik> anssik: we had an active discussion, early exploration
15:05:42 <RafaelCintron> RafaelCintron has joined #webmachinelearning
15:05:54 <anssik> ... if this group is interested, I can invite Dom to have a brainstorming session with us
15:05:57 <jsbell> q+ (re: agents)
15:06:08 <anssik> ack jsbell
15:06:08 <Zakim> jsbell, you wanted to discuss when we start (brief announcement re: blinkon)
15:06:26 <anssik> jsbell: questions about AI agents, someone asked, if we want to discuss this, what is the proper venue?
15:06:48 <anssik> McCool: Anssi proposed we have an initial discussion in this group
15:07:37 <anssik> q?
15:07:53 <anssik> anssik: another session relevant to this group was about "Web Platform Documentation"
15:07:57 <anssik> -> Web platform documentation session https://github.com/w3c/breakouts-day-2025/issues/10
15:07:58 <gb> https://github.com/w3c/breakouts-day-2025/issues/10 -> Issue 10 Web platform documentation (by Elchi3) [session] [track: browsers]
15:08:02 <anssik> -> [Draft] W3C Docs CG Charter https://docs.google.com/document/d/1rLt1wT_y7OF9VINVGFj9U5x2C_z9_3aZeOtMh-p_Qf8/
15:08:25 <anssik> anssik: this is an opportunity to collaborate with this proposed new group to improve dev docs for WebNN API and built-in AI APIs
15:08:30 <anssik> ... we're already contributing in this space as a group, via mdn/browser-compat-data, caniuse.com, webnn-samples, test frameworks, w-p-t, etc.
15:08:45 <anssik> ... I'm talking with Florian Scholz who leads this effort and will connect him with interested people
15:08:51 <anssik> ... any other updates from the breakouts day?
15:09:04 <jsbell> q- (re: agents)
15:09:12 <jsbell> q+
15:09:16 <anssik> ack jsbell
15:09:30 <jsbell> https://www.chromium.org/events/blinkon-20/
15:10:01 <anssik> jsbell: wanted to share that the Chromium community has the BlinkOn conference, schedule WIP, AI track proposed with WebNN and adjacent things
15:10:10 <anssik> ... allows remote attendance
15:10:34 <McCool> q+
15:10:41 <anssik> ack McCool
15:10:58 <anssik> McCool: is anyone working on enabling WebNN Deno?
15:11:21 <anssik> s/Deno/in Deno
15:11:31 <anssik> Topic: WebNN wide review update
15:11:56 <anssik> anssik: WebNN API is on the W3C Recommendation Track and that comes with an expectation we request review from horizontal groups for substantive changes, typically annually
15:12:02 <anssik> ... I submitted a new round of review requests to all horizontal groups last week
15:12:13 <anssik> ... please take a look at the review reviews, fine-tuning is possible and welcome because it takes a while for the reviewers to pick up
15:12:19 <anssik> ... we expect review feedback to be delivered within the next three months
15:12:26 <anssik> ... the work on the API spec continues as usual, these reviews are non-blocking
15:12:38 <anssik> -> Accessibility https://github.com/w3c/a11y-request/issues/105
15:12:39 <gb> https://github.com/w3c/a11y-request/issues/105 -> Issue 105 Web Neural Network API 2025-03-20 > 2025-06-20 (by anssiko) [CR] [REVIEW REQUESTED] [pending] [agenda+] [s:webnn]
15:12:52 <anssik> -> TAG https://github.com/w3ctag/design-reviews/issues/1072
15:12:52 <gb> https://github.com/w3ctag/design-reviews/issues/1072 -> Issue 1072 Updated review of Web Neural Network API (by anssiko) [Progress: untriaged]
15:13:17 <anssik> -> i18n https://github.com/w3c/i18n-request/issues/258
15:13:17 <gb> https://github.com/w3c/i18n-request/issues/258 -> Issue 258 Web Neural Network API 2025-03-20 > 2025-06-20 (by anssiko) [REVIEW REQUESTED] [CR]
15:13:30 <anssik> -> Privacy https://github.com/w3cping/privacy-request/issues/156
15:13:31 <gb> https://github.com/w3cping/privacy-request/issues/156 -> Issue 156 Web Neural Network API 2025-03-20 > 2025-06-20 (by anssiko) [CR] [pending] [REVIEW REQUESTED]
15:13:44 <anssik> -> Security https://github.com/w3c/security-request/issues/85
15:13:44 <gb> https://github.com/w3c/security-request/issues/85 -> Issue 85 Web Neural Network API 2025-03-20 > 2025-06-20 (by anssiko) [REVIEW REQUESTED] [pending] [CR]
15:14:12 <anssik> anssik: TAG usually takes longer due to broad scope and full pipeline
15:14:18 <anssik> ... as discussed, when we deliver u/int4 spec update, we can append that issue/PR to the TAG review request
15:14:26 <anssik> ... any comments?
15:14:38 <anssik> ... general comments and questions also welcome via the wide review tracker #239
15:14:38 <gb> https://github.com/webmachinelearning/webnn/issues/239 -> Issue 239 Wide review tracker (by anssiko) [process]
15:15:22 <anssik> Topic: Caching mechanism for MLGraph
15:15:26 <anssik> anssik: issue #807
15:15:26 <gb> https://github.com/webmachinelearning/webnn/issues/807 -> Issue 807 Caching mechanism for MLGraph (by anssiko) [question] [feature request]
15:15:47 <anssik> ... wanted to poll your interest to initiate work on an explainer, prototyping intent, identify any blockers
15:15:59 <jsbell> q+
15:16:06 <anssik> ... it is important to keep us grounded in reality, focus on designs that map to the frameworks and backends used
15:16:19 <McCool> q+
15:16:28 <anssik> ... three design considerations discussed in the issue for now:
15:16:35 <anssik> ... - reusability across context
15:17:07 <anssik> ... an example of a device-specific compiled model given was OpenVINO, while TFLite and Core ML are device-agnostic i.e. the same compiled model package works for all XPUs
15:17:14 <anssik> ... - implicit vs explicit API
15:17:38 <anssik> ... only explicit API can avoid redundant graph building step, as it allows developer to listGraphs() without the need to build the graph
15:17:56 <anssik> ... - composing a supergraph from multiple cached subgraphs
15:18:06 <anssik> ... it was noted this is not implementable with current frameworks
15:18:12 <anssik> ... is there interest to explore the caching mechanism feature in the near term?
15:18:15 <anssik> q?
15:18:18 <anssik> ack jsbell
15:18:28 <anssik> jsbell: ack McCool
15:18:58 <anssik> McCool: we should define our goals clearly, download time, compilation time, do we want to extend to x-site caching in the future, behaviour
15:19:23 <anssik> ... explainer posted tries to document these things, proposal for explicit API makes sense
15:19:47 <anssik> ... the default update should do nothing, null cache, makes it easy to implement at least, backend issues are relevant
15:20:04 <anssik> q?
15:20:11 <anssik> ack McCool
15:20:43 <anssik> jsbell: Michael thanks for the summary
15:21:03 <anssik> ... we have ideas how to do explicit cache for one site, x-site cache was a big deal, at BlinkOn we are discussing that topic
15:21:23 <anssik> ... do we want to block any work on an explicit same-site caching API on solving or addressing x-site issue?
15:21:35 <zkis> Working on the suggested explicit API is a good start.
15:21:37 <RafaelCintron> q+
15:22:08 <anssik> McCool: prefer extensible for x-site in the future
15:22:37 <anssik> ack RafaelCintron
15:23:06 <anssik> RafaelCintron: wanted to answer Josh, I don't think we should block one on another, since some models are large, I don't think we can rely on auto cache models
15:23:36 <anssik> ... if in the future we can extend to x-origin, we can explore that separately, inclined to say the key should be a friendly name string
15:23:47 <anssik> ... developer is in charge of naming
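A minimal sketch of the explicit, same-site cache discussed above, keyed by a developer-chosen name. Everything here is hypothetical: `GraphCache`, `loadGraph`, `saveGraph`, `getOrBuild` and the byte blob standing in for a compiled graph are illustration only and not part of the WebNN API; only the `listGraphs()` name comes from the discussion. `NullGraphCache` illustrates the "null cache" default mentioned by McCool.

```typescript
// Hypothetical sketch: none of these types are part of the WebNN API.
// A compiled graph is modelled as an opaque byte blob (assumption).
type CompiledGraph = Uint8Array;

interface GraphCache {
  listGraphs(): string[];
  loadGraph(name: string): CompiledGraph | undefined;
  saveGraph(name: string, graph: CompiledGraph): void;
}

// Null cache: stores nothing, so every lookup misses and the graph is rebuilt.
class NullGraphCache implements GraphCache {
  listGraphs(): string[] {
    return [];
  }
  loadGraph(_name: string): CompiledGraph | undefined {
    return undefined;
  }
  saveGraph(_name: string, _graph: CompiledGraph): void {}
}

// In-memory cache keyed by a developer-chosen friendly name.
class InMemoryGraphCache implements GraphCache {
  private graphs = new Map<string, CompiledGraph>();

  listGraphs(): string[] {
    return Array.from(this.graphs.keys());
  }
  loadGraph(name: string): CompiledGraph | undefined {
    return this.graphs.get(name);
  }
  saveGraph(name: string, graph: CompiledGraph): void {
    this.graphs.set(name, graph);
  }
}

// Explicit flow: consult the cache first and only build on a miss.
function getOrBuild(
  cache: GraphCache,
  name: string,
  build: () => CompiledGraph
): CompiledGraph {
  const cached = cache.loadGraph(name);
  if (cached !== undefined) {
    return cached;
  }
  const graph = build();
  cache.saveGraph(name, graph);
  return graph;
}
```

With the in-memory cache a second `getOrBuild` call for the same name skips the build callback; with the null cache it runs every time, which is the easy-to-implement baseline.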
15:23:48 <anssik> q?
15:23:50 <ningxin> q+
15:23:52 <McCool> q+
15:23:54 <anssik> ack ningxin
15:24:35 <McCool> ack m
15:24:42 <anssik> ningxin: want to add one more point, Reilly mentioned last time that web frameworks like ONNX Runtime Web can use MLGraph cache, we need to consider that and invite folks working on e.g. ORT Web for input
15:25:00 <anssik> ... native frameworks are already doing model cache, ONNX Runtime EP Context, has design docs
15:25:23 <anssik> ... we could see the connection to MLGraph
15:25:24 <anssik> q?
15:25:50 <ningxin> https://onnxruntime.ai/docs/execution-providers/EP-Context-Design.html
15:26:05 <anssik> anssik: anyone planning to do prototyping?
15:26:12 <anssik> McCool: would be interested
15:26:26 <ningxin> +1 for prototyping
15:27:28 <anssik> McCool: can work on the explainer further
15:29:48 <anssik> q?
15:30:15 <anssik> Topic: Remove pool2d MLRoundingType
15:30:20 <anssik> anssik: issue #324 (#374) and PR #770
15:30:20 <gb> https://github.com/webmachinelearning/webnn/issues/374 -> Issue 374 Simplify `MLPool2dOptions` by removing the `outputSizes` option (by huningxin) [operator specific]
15:30:20 <gb> https://github.com/webmachinelearning/webnn/pull/770 -> Pull Request 770 Remove pool2d MLRoundingType - Simplify the operand layout support of conv2d and pooling 2d operations (by fdwr)
15:30:20 <gb> https://github.com/webmachinelearning/webnn/issues/324 -> Issue 324 Simplify the operand layout support of conv2d and pooling 2d operations (by huningxin) [feature request] [operator specific] [interop]
15:30:30 <anssik> ... this issue proposed to remove pooling's rounding direction for the output dimensions
15:30:41 <anssik> ... this PR has been open for a while, and Josh notes this needs a redo
15:30:51 <anssik> ... do we have open questions for this change that'd be helpful to discuss today?
15:31:07 <anssik> DwayneR: no open design questions
15:31:40 <anssik> q?
15:32:21 <ningxin> q+
15:32:22 <anssik> jsbell: one of the issues with the update is the padding that is passed, seems redundant?
15:32:40 <anssik> DwayneR: the other option is to only use padding, not pass output sizes
15:32:56 <anssik> ... concerned about having two different ways to do the same thing
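To illustrate the "two ways to do the same thing" concern, here is a sketch of the usual pooling output-size arithmetic for one spatial dimension. The function and parameter names are made up for this example and are not taken from the spec or PR #770. It shows that a "ceil" rounded output size can also be reached with "floor" rounding plus extra ending padding.

```typescript
type Rounding = "floor" | "ceil";

// Output size along one spatial dimension for a pooling window.
function poolOutputSize(
  input: number,
  window: number,
  stride: number,
  padBegin: number,
  padEnd: number,
  rounding: Rounding,
  dilation: number = 1
): number {
  const effectiveWindow = (window - 1) * dilation + 1;
  const span = (input + padBegin + padEnd - effectiveWindow) / stride;
  return (rounding === "floor" ? Math.floor(span) : Math.ceil(span)) + 1;
}

// Ending padding that makes "floor" rounding produce `target` outputs.
function padEndForOutputSize(
  input: number,
  window: number,
  stride: number,
  padBegin: number,
  target: number,
  dilation: number = 1
): number {
  const effectiveWindow = (window - 1) * dilation + 1;
  const needed = (target - 1) * stride + effectiveWindow - input - padBegin;
  return Math.max(0, needed);
}

// Example: input 7, window 2, stride 2, no padding.
const ceilSize = poolOutputSize(7, 2, 2, 0, 0, "ceil");
const extraPad = padEndForOutputSize(7, 2, 2, 0, ceilSize);
const floorSizeWithPad = poolOutputSize(7, 2, 2, 0, extraPad, "floor");
console.log(ceilSize, extraPad, floorSizeWithPad);
```

In this example the padding-only form reproduces the ceil-rounded size, so an explicit rounding option and explicit output sizes would both be expressible through padding alone.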
15:33:03 <anssik> jsbell: no strong preference
15:33:11 <anssik> q?
15:33:24 <anssik> ack ningxin
15:33:50 <anssik> ningxin: my previous comment re validating the output size, with Josh's work we already have that, so we can leverage that
15:33:51 <anssik> q?
15:34:14 <anssik> Topic: Behavior when there are NaNs in argmin/max inputs
15:34:22 <anssik> anssik: issue #811
15:34:22 <gb> https://github.com/webmachinelearning/webnn/issues/811 -> Issue 811 Behavior when there are NaNs in argmin/max inputs (by philloooo) [interop]
15:34:29 <anssik> ... question from Phillis on how backends handle NaNs in the inputs
15:34:41 <anssik> ... Dwayne framed this decision as "performant but implementation defined" vs "slower but consistent"
15:34:45 <anssik> ... Dwayne's proposal shared in the issue:
15:34:49 <anssik> ... - leave NaN behavior implementation defined, and update spec to indicate that
15:34:52 <anssik> ... - add to the spec a mitigation code snippet to sanitize code before calling argMin/argMax
15:35:09 <anssik> ... - add an isNaN operator
15:35:17 <anssik> ... Ningxin's feedback was requested, Phillis gave thumbs up
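A sketch of the mitigation idea from Dwayne's proposal, written over plain arrays rather than WebNN operands. This is not the snippet proposed for the spec; `sanitizeNaNs` and the reference `argMax` are hypothetical helpers, and the choice of replacement value is an assumption left to the caller.

```typescript
// NaN is the only value that is not equal to itself, which is the check
// an isNaN operator could be built on.
function sanitizeNaNs(values: number[], replacement: number): number[] {
  return values.map((v) => (v !== v ? replacement : v));
}

// Reference argMax with one possible NaN behavior: comparisons against NaN
// are always false, so a leading NaN is never displaced.
function argMax(values: number[]): number {
  if (values.length === 0) {
    throw new RangeError("argMax of empty input");
  }
  let bestIndex = 0;
  let bestValue = NaN;
  values.forEach((v, i) => {
    if (i === 0 || v > bestValue) {
      bestIndex = i;
      bestValue = v;
    }
  });
  return bestIndex;
}

const raw = [NaN, 1, 3];
const unsanitized = argMax(raw); // depends on how NaN comparisons fall out
const sanitized = argMax(sanitizeNaNs(raw, -Infinity)); // NaN can no longer win
console.log(unsanitized, sanitized);
```

Replacing NaN with negative infinity before argMax (or positive infinity before argMin) makes the result independent of how a backend orders NaN.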
15:35:35 <anssik> q?
15:35:45 <anssik> Dwayne: we just got Ningxin's +1
15:35:48 <anssik> ningxin: +1
15:36:25 <anssik> DwayneR: with the proposed actions nothing to do for Phillis
15:37:02 <anssik> ... if you have NaNs we should have a way to have deterministic behaviour, should satisfy all concerns
15:37:52 <anssik> jsbell: WFM, did we have anything explicit in the spec that implementation can do maths "their own way", but should not crash, we mention clamping in scatter and gather ops
15:38:00 <anssik> q?
15:38:14 <anssik> Dwayne: we don't have a lot of verbiage about NaNs currently
15:38:57 <anssik> anssik: OK to advance to a PR with this issue
15:39:00 <anssik> q?
15:39:15 <anssik> Topic: (de)quantization behaviors on Core ML
15:39:19 <anssik> anssik: issue #822
15:39:20 <gb> https://github.com/webmachinelearning/webnn/issues/822 -> Issue 822 (de)quantization behaviors on CoreML (by philloooo) [interop]
15:39:33 <anssik> ... Phillis shared quantize/dequantize Core ML implementation feedback, would like to discuss it with the group
15:39:53 <anssik> ... Dwayne responded and it looks like (u)int4 being unsupported on Core ML for non-constant inputs is the remaining open question
15:40:04 <anssik> ... is it possible to work around this limitation using Dwayne's pseudocode?
15:40:11 <anssik> ... what is the performance penalty, is it reasonable?
15:40:23 <anssik> ... I also see Dwayne's suggestions for future Core ML improvements:
15:40:27 <anssik> ... - Add blockwise support to dequantize and quantize
15:40:32 <anssik> ... - Add u/int4 support to dequantize and quantize (or emulation path via u/int4 cast & bitwise ops)
15:40:44 <anssik> ... - Allow the scale and zero point to be dynamic too (not just constant)
15:41:22 <anssik> Dwayne: int4 was the only uncertain part, anything else can be emulated, int4 can also be emulated I think, it is just not pretty
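A sketch of what a u/int4 emulation path via bitwise ops could look like on the CPU side. This is not Dwayne's pseudocode from the issue. The packing order (low nibble first) is an assumption, and the helper names are hypothetical; the dequantize step uses the usual linear form (value - zeroPoint) * scale.

```typescript
// Unpack two unsigned 4-bit values per byte, low nibble first (assumed order).
function unpackUint4(packed: Uint8Array): number[] {
  const out: number[] = [];
  packed.forEach((byte) => {
    out.push(byte & 0x0f, (byte >> 4) & 0x0f);
  });
  return out;
}

// Reinterpret an unsigned nibble (0..15) as a signed int4 (-8..7).
function toInt4(nibble: number): number {
  return nibble >= 8 ? nibble - 16 : nibble;
}

// Linear dequantization with a single scale and zero point.
function dequantize(q: number[], scale: number, zeroPoint: number): number[] {
  return q.map((v) => (v - zeroPoint) * scale);
}

const packed = new Uint8Array([0xf1, 0x27]);
const unsigned = unpackUint4(packed);
const signed = unsigned.map(toInt4);
console.log(dequantize(signed, 0.5, 0));
```

The same mask-and-shift steps would have to be expressed with cast and bitwise graph ops on a backend that lacks native 4-bit types, which is where the performance question raised above comes in.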
15:41:37 <anssik> ... Phillis gave thumbs up, not sure if that's confirmation
15:41:42 <anssik> jsbell: can't speak to that
15:41:48 <anssik> DwayneR: will wait for Phillis
15:42:23 <anssik> Mike_Wyrzykowski: I will take a look after the meeting
15:42:47 <anssik> q?
15:43:41 <anssik> q?
15:43:51 <anssik> Topic: Query mechanism for supported devices
15:43:58 <anssik> anssik: issue #815
15:43:59 <gb> https://github.com/webmachinelearning/webnn/issues/815 -> Issue 815 Query mechanism for supported devices (by anssiko) [device selection]
15:44:03 <anssik> ... recent updates:
15:44:14 <anssik> ... - a question from Mike about the "capacity" concept use cases
15:44:24 <anssik> ... - a new use case for the "capacity" concept from the Google Meet team via Markus
15:44:42 <anssik> ... the question was "could you provide some insight as to why the capacity aspect should be exposed from the UA or what use case we are trying to support?"
15:44:57 <anssik> ... also note "The web app / website only knows its own workloads while the UA has access to OS system calls and other apps running on the system."
15:45:38 <anssik> anssik: Markus responded with a use case, "ML receive side audio processing is currently running on the GPU. Then the app would likely want to avoid launching a large non-realtime LLM task there to avoid unacceptable glitching."
15:46:01 <anssik> ... another use case: "There is also great value ensuring the CPU doesn't get selected [due to] the very large performance difference between "fallback"/"cpu" and "accelerated"/"gpu-like"/"npu""
15:46:41 <anssik> ... I think both are right, local (own workload) and global (all workloads on the OS) optimizations can be done and are not mutually exclusive
15:47:18 <anssik> anssik: I wonder whether we can design an API that supports both capacity and capabilities?
15:47:27 <anssik> q?
15:47:27 <zkis> Or separate APIs for them?
15:48:23 <RafaelCintron> q+
15:48:29 <anssik> Mike_Wyrzykowski: I think that there could be some value for website to be able to add a hint
15:48:33 <anssik> q?
15:48:35 <anssik> ack RafaelCintron
15:48:57 <anssik> RafaelCintron: asking Mike, at TPAC you said high-performance hint would be OK?
15:49:07 <zkis> q+
15:49:12 <anssik> Mike_Wyrzykowski: high-performance / low-power would be OK
15:50:05 <anssik> ... don't want to add hints that could be useless
15:50:27 <anssik> ... Core ML has some hints
15:50:28 <anssik> -> https://developer.apple.com/documentation/coreml/mlcomputeunits#Processing-Unit-Configurations
15:50:50 <anssik> q?
15:50:53 <anssik> ack zkis
15:51:39 <anssik> zkis: I wanted to ask Mike about Reilly's suggestion "gpu-like" and "cpu-like" loads, does that category fit well with you, not explicit CPU or GPU, but along the lines of Reilly's definitions
15:52:41 <anssik> Mike_Wyrzykowski: on Apple's platforms implementers would use Core ML, so this is not implementable there; could drop down to the frameworks below, but that would make the browser implementation more complicated, because ANE is via Core ML, with BNNS and MPS for CPU and GPU
15:53:12 <anssik> zkis: looking at the processing unit configs, there's a possible mapping, the question is whether it is useful
15:53:25 <anssik> Mike_Wyrzykowski: it seems high-performance / low-power options overlap with those
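A sketch of the kind of mapping zkis alludes to, from a power-preference style hint to the Core ML processing unit configurations linked above. The hint values mirror the high-performance / low-power options named in the discussion, and the compute unit names follow the cases listed in the Core ML documentation. The mapping itself is purely an assumption for illustration; nobody in the meeting proposed these specific pairings, and a real implementation may choose differently.

```typescript
// Hint values as discussed: high-performance / low-power, plus a default.
type PowerHint = "default" | "high-performance" | "low-power";

// Names follow the Core ML processing unit configurations.
type ComputeUnits = "cpuOnly" | "cpuAndGPU" | "cpuAndNeuralEngine" | "all";

// Hypothetical mapping, for illustration only.
function computeUnitsForHint(hint: PowerHint): ComputeUnits {
  switch (hint) {
    case "high-performance":
      return "cpuAndGPU"; // assumption: favor the GPU for throughput
    case "low-power":
      return "cpuAndNeuralEngine"; // assumption: favor the ANE for efficiency
    default:
      return "all"; // let the framework decide
  }
}

console.log(computeUnitsForHint("low-power"));
```

A hint of this kind leaves the final device choice to the user agent, which is the distinction from an explicit device query that the rest of this topic discusses.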
15:53:26 <anssik> q?
15:54:22 <anssik> zkis: I tried to make a few attempts to satisfy both sides, I think it's Google's turn to check if there's any way to satisfy the use cases using existing hints, some kind of mapping
15:54:52 <anssik> ... or if we need an explicit query API, can we do it so it can be implemented on Apple platforms in a conformant way
15:55:15 <anssik> ... capacity and capabilities together might be hard, we might want to separate the concerns
15:55:30 <anssik> ... can we have separate API surfaces for capacity and capabilities?
15:55:48 <anssik> Mike_Wyrzykowski: it'd be great to show a use case where the website has better understanding of capacity than the browser engine
15:56:00 <anssik> ... once we add this to the spec we can't remove that easily
15:56:20 <anssik> zkis: how to implement the use case from Markus? https://github.com/webmachinelearning/webnn/issues/815#issuecomment-2738252079
15:56:21 <gb> https://github.com/webmachinelearning/webnn/issues/815 -> Issue 815 Query mechanism for supported devices (by anssiko) [device selection]
15:57:26 <anssik> q?
15:58:09 <anssik> anssik: any learnings from WebGPU capabilities?
15:58:31 <anssik> Mike_Wyrzykowski: some capabilities are interdependent
15:59:03 <anssik> ... not always obvious when writing tests
15:59:28 <anssik> q?
16:01:23 <anssik> RRSAgent, draft minutes
16:01:24 <RRSAgent> I have made the request to generate https://www.w3.org/2025/03/27-webmachinelearning-minutes.html anssik
16:01:47 <Mike_Wyrzykowski> https://github.com/gpuweb/gpuweb/issues/4025 anssik I think this is the issue tracking simplifying webgpu limits
16:01:47 <gb> https://github.com/gpuweb/gpuweb/issues/4025 -> Issue 4025 Consider simplifying limits with overlapping meaning (by kainino0x) [limits] [api]
16:03:23 <anssik> s/review reviews/review requests
16:12:43 <anssik> RRSAgent, draft minutes
16:12:44 <RRSAgent> I have made the request to generate https://www.w3.org/2025/03/27-webmachinelearning-minutes.html anssik
18:04:16 <Zakim> Zakim has left #webmachinelearning