14:59:42 RRSAgent has joined #webmachinelearning
14:59:47 logging to https://www.w3.org/2025/03/27-webmachinelearning-irc
14:59:50 RRSAgent, make logs Public
14:59:51 please title this meeting ("meeting: ..."), anssik
14:59:51 Meeting: WebML WG Teleconference – 27 March 2025
14:59:52 Chair: Anssi
14:59:56 Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2025-03-27-wg-agenda.md
15:00:05 Scribe: Anssi
15:00:08 scribeNick: anssik
15:00:14 gb, this is webmachinelearning/webnn
15:00:14 anssik, OK.
15:00:18 Present+ Anssi_Kostiainen
15:00:24 Present+ Joshua_Bell
15:00:29 ningxin has joined #webmachinelearning
15:00:30 Present+ Dwayne_Robinson
15:00:32 lgombos has joined #webmachinelearning
15:00:37 Present + Winston_Chen
15:00:38 present+ Laszlo_Gombos
15:00:38 Present+ Laszlo_Gombos
15:00:48 Present+ Thomas_Steiner
15:00:51 q+ for when we start (brief announcement re: blinkon)
15:00:54 Present+ Mike_Wyrzykowski
15:00:59 Present+ Winston_Chen
15:01:16 Present+ Christian_Liebel
15:01:18 McCool has joined #webmachinelearning
15:01:27 Present+ Thomas_Steiner
15:01:36 anssik I will have to leave 10 minutes earlier today, sorry
15:01:36 Present+ Ningxin_Hu
15:01:48 Present+ Michael_McCool
15:01:57 RRSAgent, draft minutes
15:01:58 I have made the request to generate https://www.w3.org/2025/03/27-webmachinelearning-minutes.html anssik
15:02:43 anssik: Please welcome Elena Zhelezina from ARM Limited and Matthew Atkinson & Winston Chen from Samsung Electronics to the WebML WG!
15:03:24 ... also, please welcome Laszlo Gombos, Matthew Atkinson, Winston Chen and Raghavendra Ghatage from Samsung Electronics, and Jared Parr, a Google Cloud Partner, to the WebML CG!
15:03:31 Mike_Wyrzykowski has joined #webmachinelearning
15:03:44 zkis has joined #webmachinelearning
15:03:47 Topic: Incubations summary
15:03:56 anssik: no summary today, our next WebML CG meeting is next Monday, 31 March 2025, 07:00–08:00 UTC:
15:04:05 -> https://github.com/webmachinelearning/meetings/blob/main/telcons/2025-03-31-cg-agenda.md
15:04:17 present+ Zoltan_Kis
15:04:19 anssik: This is the EU and APAC timezone friendly WebML CG Teleconference option
15:04:22 ... We alternate between the AMER-APAC (Tue/Wed 00:00-01:00 UTC) and this EU-APAC (Mon 07:00-08:00 UTC) option to cater to our geographically diversified participants
15:04:33 ... Feel free to join the option that fits your schedule better
15:04:56 Topic: W3C Breakouts Day 2025 summary
15:05:04 anssik: W3C Breakouts Day 2025 took place 26 March 2025
15:05:10 -> List of all proposed sessions https://github.com/w3c/breakouts-day-2025/issues
15:05:18 ... a few of us attended the "How would AI Agents change the Web platform?" session proposed by Dom
15:05:22 -> AI Agents session description https://github.com/w3c/breakouts-day-2025/issues/7
15:05:23 https://github.com/w3c/breakouts-day-2025/issues/7 -> Issue 7 How would AI Agents change the Web platform? (by dontcallmedom) [session]
15:05:26 -> AI Agents presentation https://www.w3.org/2025/Talks/dhm-ai-agents/
15:05:30 -> AI Agents minutes https://www.w3.org/2025/03/26-ai-agents-minutes.html
15:05:38 anssik: we had an active discussion, early exploration
15:05:42 RafaelCintron has joined #webmachinelearning
15:05:54 ... if this group is interested, I can invite Dom to have a brainstorming session with us
15:05:57 q+ (re: agents)
15:06:08 ack jsbell
15:06:08 jsbell, you wanted to discuss when we start (brief announcement re: blinkon)
15:06:26 jsbell: questions about AI agents, someone asked, if we want to discuss this, what is the proper venue?
15:06:48 McCool: Anssi proposed we have an initial discussion in this group
15:07:37 q?
15:07:53 anssik: another session relevant to this group was about "Web Platform Documentation"
15:07:57 -> Web platform documentation session https://github.com/w3c/breakouts-day-2025/issues/10
15:07:58 https://github.com/w3c/breakouts-day-2025/issues/10 -> Issue 10 Web platform documentation (by Elchi3) [session] [track: browsers]
15:08:02 -> [Draft] W3C Docs CG Charter https://docs.google.com/document/d/1rLt1wT_y7OF9VINVGFj9U5x2C_z9_3aZeOtMh-p_Qf8/
15:08:25 anssik: this is an opportunity to collaborate with this proposed new group to improve dev docs for WebNN API and built-in AI APIs
15:08:30 ... we're already contributing in this space as a group, via mdn/browser-compat-data, caniuse.com, webnn-samples, test frameworks, w-p-t, etc.
15:08:45 ... I'm talking with Florian Scholz who leads this effort and will connect him with interested people
15:08:51 ... any other updates from the breakouts day?
15:09:04 q- (re: agents)
15:09:12 q+
15:09:16 ack jsbell
15:09:30 https://www.chromium.org/events/blinkon-20/
15:10:01 jsbell: wanted to share that Chromium community has BlinkOn conference, schedule WIP, AI track proposed with WebNN and adjacent things
15:10:10 ... allows remote attendance
15:10:34 q+
15:10:41 ack McCool
15:10:58 McCool: is anyone working on enabling WebNN in Deno?
15:11:21 s/Deno/in Deno
15:11:31 Topic: WebNN wide review update
15:11:56 anssik: WebNN API is on the W3C Recommendation Track and that comes with an expectation we request review from horizontal groups for substantive changes, typically annually
15:12:02 ... I submitted a new round of review requests to all horizontal groups last week
15:12:13 ... please take a look at the review requests, fine-tuning is possible and welcome because it takes a while for the reviewers to pick up
15:12:19 ... we expect review feedback to be delivered within the next three months
15:12:26 ... the work on the API spec continues as usual, these reviews are non-blocking
15:12:38 -> Accessibility https://github.com/w3c/a11y-request/issues/105
15:12:39 https://github.com/w3c/a11y-request/issues/105 -> Issue 105 Web Neural Network API 2025-03-20 > 2025-06-20 (by anssiko) [CR] [REVIEW REQUESTED] [pending] [agenda+] [s:webnn]
15:12:52 -> TAG https://github.com/w3ctag/design-reviews/issues/1072
15:12:52 https://github.com/w3ctag/design-reviews/issues/1072 -> Issue 1072 Updated review of Web Neural Network API (by anssiko) [Progress: untriaged]
15:13:17 -> i18n https://github.com/w3c/i18n-request/issues/258
15:13:17 https://github.com/w3c/i18n-request/issues/258 -> Issue 258 Web Neural Network API 2025-03-20 > 2025-06-20 (by anssiko) [REVIEW REQUESTED] [CR]
15:13:30 -> Privacy https://github.com/w3cping/privacy-request/issues/156
15:13:31 https://github.com/w3cping/privacy-request/issues/156 -> Issue 156 Web Neural Network API 2025-03-20 > 2025-06-20 (by anssiko) [CR] [pending] [REVIEW REQUESTED]
15:13:44 -> Security https://github.com/w3c/security-request/issues/85
15:13:44 https://github.com/w3c/security-request/issues/85 -> Issue 85 Web Neural Network API 2025-03-20 > 2025-06-20 (by anssiko) [REVIEW REQUESTED] [pending] [CR]
15:14:12 anssik: TAG usually takes longer due to broad scope and full pipeline
15:14:18 ... as discussed, when we deliver u/int4 spec update, we can append that issue/PR to the TAG review request
15:14:26 ... any comments?
15:14:38 ... general comments and questions also welcome via the wide review tracker #239
15:14:38 https://github.com/webmachinelearning/webnn/issues/239 -> Issue 239 Wide review tracker (by anssiko) [process]
15:15:22 Topic: Caching mechanism for MLGraph
15:15:26 anssik: issue #807
15:15:26 https://github.com/webmachinelearning/webnn/issues/807 -> Issue 807 Caching mechanism for MLGraph (by anssiko) [question] [feature request]
15:15:47 ... wanted to poll your interest to initiate work on an explainer, prototyping intent, identify any blockers
15:15:59 q+
15:16:06 ... it is important to keep us grounded in reality, focus on designs that map to the frameworks and backends used
15:16:19 q+
15:16:28 ... three design considerations discussed in the issue for now:
15:16:35 ... - reusability across context
15:17:07 ... an example of a device-specific compiled model given was OpenVINO, while TFLite and Core ML are device-agnostic i.e. the same compiled model package works for all XPUs
15:17:14 ... - implicit vs explicit API
15:17:38 ... only explicit API can avoid redundant graph building step, as it allows developer to listGraphs() without the need to build the graph
15:17:56 ... - composing a supergraph from multiple cached subgraphs
15:18:06 ... it was noted this is not implementable with current frameworks
15:18:12 ... is there interest to explore the caching mechanism feature in the near term?
15:18:15 q?
15:18:18 ack jsbell
15:18:28 jsbell: ack McCool
15:18:58 McCool: we should define our goals clearly, download time, compilation time, do we want to extend to x-site caching in the future, behaviour
15:19:23 ... explainer posted tries to document these things, proposal for explicit API makes sense
15:19:47 ... the default update should do nothing, null cache, makes it easy to implement at least, backend issues are relevant
15:20:04 q?
15:20:11 ack McCool
15:20:43 jsbell: Michael thanks for the summary
15:21:03 ... we have ideas how to do explicit cache for one site, x-site cache was a big deal, at BlinkOn we are discussing that topic
15:21:23 ... do we want to block any work on an explicit same-site caching API on solving or addressing x-site issue?
15:21:35 Working on the suggested explicit API is a good start.
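[Editorial sketch, not part of the discussion] The explicit-API idea above (check what is cached before building, with a do-nothing "null cache" as the simplest default) might look roughly like the following. This is a minimal sketch under stated assumptions: `GraphCache`, `NullGraphCache`, `MemoryGraphCache`, `getOrBuild`, `loadGraph` and `saveGraph` are hypothetical names, string keys and synchronous calls are simplifications, and nothing here is a WebNN interface. Only the `listGraphs()` name comes from the minutes.

```typescript
// Hypothetical shapes for illustration only; none of this is a WebNN interface.
interface GraphCache<G> {
  listGraphs(): string[];
  loadGraph(key: string): G | undefined;
  saveGraph(key: string, graph: G): void;
}

// "Null cache": stores nothing, so every lookup is a miss.
class NullGraphCache<G> implements GraphCache<G> {
  listGraphs(): string[] { return []; }
  loadGraph(_key: string): G | undefined { return undefined; }
  saveGraph(_key: string, _graph: G): void {}
}

// In-memory stand-in for a real per-site store.
class MemoryGraphCache<G> implements GraphCache<G> {
  private readonly store: Record<string, G> = {};
  listGraphs(): string[] { return Object.keys(this.store); }
  loadGraph(key: string): G | undefined {
    return Object.prototype.hasOwnProperty.call(this.store, key)
      ? this.store[key]
      : undefined;
  }
  saveGraph(key: string, graph: G): void { this.store[key] = graph; }
}

// Consult the cache first; only pay for a build on a miss.
function getOrBuild<G>(cache: GraphCache<G>, key: string, build: () => G): G {
  if (cache.listGraphs().indexOf(key) !== -1) {
    const cached = cache.loadGraph(key);
    if (cached !== undefined) return cached;
  }
  const graph = build();
  cache.saveGraph(key, graph);
  return graph;
}

// Usage: the second request for the same key skips the build step.
let builds = 0;
const buildStub = (): string => { builds += 1; return "compiled-graph"; };
const memoryCache = new MemoryGraphCache<string>();
getOrBuild(memoryCache, "demo-model", buildStub);
getOrBuild(memoryCache, "demo-model", buildStub);
console.log(builds); // 1
```

With `NullGraphCache` the same calls would build every time, which is the "easy to implement" baseline; a real design would also need to be asynchronous and to settle how keys are scoped.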
15:21:37 q+
15:22:08 McCool: prefer extensible for x-site in the future
15:22:37 ack RafaelCintron
15:23:06 RafaelCintron: wanted to answer Josh, I don't think we should block one on another, since some models are large, I don't think we can rely on auto cache models
15:23:36 ... if in the future we can extend to x-origin we can explore that separately, inclined to say the key should be a friendly name string
15:23:47 ... developer is in charge of naming
15:23:48 q?
15:23:50 q+
15:23:52 q+
15:23:54 ack ningxin
15:24:35 ack m
15:24:42 ningxin: want to add one more point, Reilly mentioned last time that web frameworks like ONNX Runtime Web can use MLGraph cache, we need to consider that and invite folks working on e.g. ORT Web for input
15:25:00 ... native frameworks are already doing model cache, ONNX Runtime EP Context, has design docs
15:25:23 ... we could see the connection to MLGraph
15:25:24 q?
15:25:50 https://onnxruntime.ai/docs/execution-providers/EP-Context-Design.html
15:26:05 anssik: anyone planning to do prototyping?
15:26:12 McCool: would be interested
15:26:26 +1 for prototyping
15:27:28 McCool: can work on the explainer further
15:29:48 q?
15:30:15 Topic: Remove pool2d MLRoundingType
15:30:20 anssik: issue #324 (#374) and PR #770
15:30:20 https://github.com/webmachinelearning/webnn/issues/374 -> Issue 374 Simplify `MLPool2dOptions` by removing the `outputSizes` option (by huningxin) [operator specific]
15:30:20 https://github.com/webmachinelearning/webnn/pull/770 -> Pull Request 770 Remove pool2d MLRoundingType - Simplify the operand layout support of conv2d and pooling 2d operations (by fdwr)
15:30:20 https://github.com/webmachinelearning/webnn/issues/324 -> Issue 324 Simplify the operand layout support of conv2d and pooling 2d operations (by huningxin) [feature request] [operator specific] [interop]
15:30:30 ... this issue proposed to remove pooling's rounding direction for the output dimensions
15:30:41 ... this PR has been open for a while, and Josh notes this needs a redo
15:30:51 ... do we have open questions for this change that'd be helpful to discuss today?
15:31:07 DwayneR: no open design questions
15:31:40 q?
15:32:21 q+
15:32:22 jsbell: one of the issues with the update is padding that is passed, seems redundant?
15:32:40 DwayneR: the other option is to only use padding, not pass output sizes
15:32:56 ... concerned about having two different ways to do the same thing
15:33:03 jsbell: no strong preference
15:33:11 q?
15:33:24 ack ningxin
15:33:50 ningxin: my previous comment re validating the output size, with Josh's work we already have that, so we can leverage that
15:33:51 q?
15:34:14 Topic: Behavior when there are NaNs in argmin/max inputs
15:34:22 anssik: issue #811
15:34:22 https://github.com/webmachinelearning/webnn/issues/811 -> Issue 811 Behavior when there are NaNs in argmin/max inputs (by philloooo) [interop]
15:34:29 ... question from Phillis on how backends handle NaNs in the inputs
15:34:41 ... Dwayne framed this decision as "performant but implementation defined" vs "slower but consistent"
15:34:45 ... Dwayne's proposal shared in the issue:
15:34:49 ... - leave NaN behavior implementation defined, and update spec to indicate that
15:34:52 ... - add to the spec a mitigation code snippet to sanitize code before calling argMin/argMax
15:35:09 ... - add an isNaN operator
15:35:17 ... Ningxin's feedback was requested, Phillis gave thumbs up
15:35:35 q?
15:35:45 Dwayne: we just got Ningxin's +1
15:35:48 ningxin: +1
15:36:25 DwayneR: with the proposed actions nothing to do for Phillis
15:37:02 ... if you have NaNs we should have a way to have deterministic behaviour, should satisfy all concerns
15:37:52 jsbell: WFM, did we have anything explicit in the spec that implementation can do maths "their own way", but should not crash, we mention clamping in scatter and gather ops
15:38:00 q?
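[Editorial sketch, not part of the discussion] To illustrate why NaNs make argMax results implementation defined, and what a sanitizing step could do, here is a minimal plain-array sketch in TypeScript. It does not use the WebNN API. The function names are hypothetical, and replacing NaN with -Infinity is an assumed policy for argMax (argMin would use +Infinity), not what the spec or any backend does.

```typescript
// Plain-array illustration only; this is not the WebNN API.

// Scan that replaces the best only when a strictly greater value appears.
// Every comparison with NaN is false, so a NaN candidate never wins.
function argMaxStrict(values: number[]): number {
  let best = -1;
  let bestValue = -Infinity;
  values.forEach((v, i) => {
    if (best === -1 || v > bestValue) { best = i; bestValue = v; }
  });
  return best;
}

// Scan that replaces the best unless the candidate is known to be <= the best.
// `NaN <= x` is false, so a NaN candidate always wins and then "sticks".
function argMaxNegated(values: number[]): number {
  let best = -1;
  let bestValue = -Infinity;
  values.forEach((v, i) => {
    if (best === -1 || !(v <= bestValue)) { best = i; bestValue = v; }
  });
  return best;
}

// Assumed mitigation: map NaN to a sentinel so that both scans agree.
function sanitizeForArgMax(values: number[]): number[] {
  return values.map((v) => (isNaN(v) ? -Infinity : v));
}

const input = [1, 3, NaN, 2];
console.log(argMaxStrict(input), argMaxNegated(input)); // 1 3

const clean = sanitizeForArgMax(input);
console.log(argMaxStrict(clean), argMaxNegated(clean)); // 1 1
```

Both scans are reasonable implementations of argMax for NaN-free data and agree there; they only diverge once a NaN is present, which is the "performant but implementation defined" situation described above.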
15:38:14 Dwayne: we don't have a lot of verbiage about NaNs currently
15:38:57 anssik: OK to advance to a PR with this issue
15:39:00 q?
15:39:15 Topic: (de)quantization behaviors on Core ML
15:39:19 anssik: issue #822
15:39:20 https://github.com/webmachinelearning/webnn/issues/822 -> Issue 822 (de)quantization behaviors on CoreML (by philloooo) [interop]
15:39:33 ... Phillis shared quantize/dequantize Core ML implementation feedback, would like to discuss it with the group
15:39:53 ... Dwayne responded and it looks like (u)int4 being unsupported on Core ML for non-constant inputs is the remaining open question
15:40:04 ... is it possible to work around this limitation using Dwayne's pseudocode?
15:40:11 ... what is the performance penalty, is it reasonable?
15:40:23 ... I also see Dwayne's suggestions for future Core ML improvements:
15:40:27 ... - Add blockwise support to dequantize and quantize
15:40:32 ... - Add u/int4 support to dequantize and quantize (or emulation path via u/int4 cast & bitwise ops)
15:40:44 ... - Allow the scale and zero point to be dynamic too (not just constant)
15:41:22 Dwayne: int4 was the only uncertainty, anything else can be emulated, int4 can also be emulated I think, it is just not pretty
15:41:37 ... Phillis gave thumbs up, not sure if that's confirmation
15:41:42 jsbell: can't speak to that
15:41:48 DwayneR: will wait for Phillis
15:42:23 Mike_Wyrzykowski: I will take a look after the meeting
15:42:47 q?
15:43:41 q?
15:43:51 Topic: Query mechanism for supported devices
15:43:58 anssik: issue #815
15:43:59 https://github.com/webmachinelearning/webnn/issues/815 -> Issue 815 Query mechanism for supported devices (by anssiko) [device selection]
15:44:03 ... recent updates:
15:44:14 ... - a question from Mike about the "capacity" concept use cases
15:44:24 ... - a new use case for the "capacity" concept from the Google Meet team via Markus
15:44:42 ... the question was "could you provide some insight as to why the capacity aspect should be exposed from the UA or what use case we are trying to support?"
15:44:57 ... also note "The web app / website only knows its own workloads while the UA has access to OS system calls and other apps running on the system."
15:45:38 anssik: Markus responded with a use case, "ML receive side audio processing is currently running on the GPU. Then the app would likely want to avoid launching a large non-realtime LLM task there to avoid unacceptable glitching."
15:46:01 ... another use case: "There is also great value ensuring the CPU doesn't get selected [due to] the very large performance difference between "fallback"/"cpu" and "accelerated"/"gpu-like"/"npu""
15:46:41 ... I think both are right, local (own workload) and global (all workloads on the OS) optimizations can be done and are not mutually exclusive
15:47:18 anssik: I wonder can we design an API that supports both capacity and capabilities?
15:47:27 q?
15:47:27 Or separate APIs for them?
15:48:23 q+
15:48:29 Mike_Wyrzykowski: I think that there could be some value for website to be able to add a hint
15:48:33 q?
15:48:35 ack RafaelCintron
15:48:57 RafaelCintron: asking Mike, at TPAC you said high-performance hint would be OK?
15:49:07 q+
15:49:12 Mike_Wyrzykowski: high-performance / low-power would be OK
15:50:05 ... don't want to add hints that could be useless
15:50:27 ... Core ML has some hints
15:50:28 -> https://developer.apple.com/documentation/coreml/mlcomputeunits#Processing-Unit-Configurations
15:50:50 q?
15:50:53 ack zkis
15:51:39 zkis: I wanted to ask Mike about Reilly's suggestion "gpu-like" and "cpu-like" loads, does that category fit well with you, not explicit CPU or GPU, but along the lines of Reilly's definitions
15:52:41 Mike_Wyrzykowski: on Apple's platforms implementers would use Core ML, not implementable on that due to that, could drop down to frameworks below, would make the implementation on browser more complicated, because ANE is via Core ML, BNNS and MPS for CPU and GPU
15:53:12 zkis: looking at the processing unit configs, there's a possible mapping, the question is is it useful
15:53:25 Mike_Wyrzykowski: it seems high-performance / low-power options overlap with those
15:53:26 q?
15:54:22 zkis: I tried to make a few attempts to satisfy both sides, I think it's Google's turn to check if there's any way to satisfy the use cases using existing hints, some kind of mapping
15:54:52 ... or if we need an explicit query API, can we do it so it can be implemented on Apple platforms in a conformant way
15:55:15 ... capacity and capabilities together might be hard, we might want to separate the concerns
15:55:30 ... can we have separate API surfaces for capacity and capabilities?
15:55:48 Mike_Wyrzykowski: it'd be great to show a use case where the website has better understanding of capacity than the browser engine
15:56:00 ... once we add this to the spec we can't remove that easily
15:56:20 zkis: how to implement the use case from Markus? https://github.com/webmachinelearning/webnn/issues/815#issuecomment-2738252079
15:56:21 https://github.com/webmachinelearning/webnn/issues/815 -> Issue 815 Query mechanism for supported devices (by anssiko) [device selection]
15:57:26 q?
15:58:09 anssik: any learnings from WebGPU capabilities?
15:58:31 Mike_Wyrzykowski: some capabilities are interdependent
15:59:03 ... not always obvious when writing tests
15:59:28 q?
16:01:23 RRSAgent, draft minutes
16:01:24 I have made the request to generate https://www.w3.org/2025/03/27-webmachinelearning-minutes.html anssik
16:01:47 https://github.com/gpuweb/gpuweb/issues/4025 anssik I think this is the issue tracking simplifying webgpu limits
16:01:47 https://github.com/gpuweb/gpuweb/issues/4025 -> Issue 4025 Consider simplifying limits with overlapping meaning (by kainino0x) [limits] [api]
16:03:23 s/review reviews/review requests
16:12:43 RRSAgent, draft minutes
16:12:44 I have made the request to generate https://www.w3.org/2025/03/27-webmachinelearning-minutes.html anssik
18:04:16 Zakim has left #webmachinelearning