14:48:18 RRSAgent has joined #webmachinelearning
14:48:22 logging to https://www.w3.org/2025/04/24-webmachinelearning-irc
14:48:22 RRSAgent, make logs Public
14:48:23 please title this meeting ("meeting: ..."), anssik
14:48:28 Meeting: WebML WG Teleconference – 24 April 2025
14:48:33 Chair: Anssi
14:48:38 Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2025-04-24-wg-agenda.md
14:48:43 Scribe: Anssi
14:48:48 scribeNick: anssik
14:48:58 gb, this is webmachinelearning/webnn
14:48:58 anssik, OK.
14:49:10 Present+ Anssi_Kostiainen
14:49:24 Regrets+ Reilly_Grant
14:49:28 Regrets+ Michael_McCool
14:49:39 RRSAgent, draft minutes
14:49:40 I have made the request to generate https://www.w3.org/2025/04/24-webmachinelearning-minutes.html anssik
14:56:06 jsbell has joined #webmachinelearning
14:58:18 Present+ Joshua_Bell
14:58:51 Present+ Zoltan_Kis
14:58:56 zkis has joined #webmachinelearning
14:59:58 Present+ Dwayne_Robinson
15:00:36 Mike_Wyrzykowski has joined #webmachinelearning
15:00:38 dwayner has joined #webmachinelearning
15:01:19 Joshua_Lochner has joined #webmachinelearning
15:01:25 Present+ Laszlo_Gombos
15:01:39 Present+ Christian_Liebel
15:01:45 ningxin has joined #webmachinelearning
15:01:55 Present+ Joshua_Lochner
15:02:03 Present+ Mike_Wyrzykowski
15:02:14 Present+ Ningxin_Hu
15:02:24 lgombos8 has joined #webmachinelearning
15:03:06 Present+ Eugen_Thaci
15:03:27 RRSAgent, draft minutes
15:03:28 I have made the request to generate https://www.w3.org/2025/04/24-webmachinelearning-minutes.html anssik
15:03:34 Present+ Zoltan_Kis
15:03:35 present+ Laszlo_Gombos
15:04:12 anssik: please welcome Aditya Chhabra, Drew Morris, Adnaan Nazir, Josi Rosenfeld, Eugen Thaci, and Jeffrey Phillips Freeman from CleverThis Inc. to the WebML WG and CG!
15:04:38 ... CleverThis is a tech company with an interest in open standards and in responsible and explainable AI
15:04:55 Eugen: I'm assisting Jeffrey in this WG
15:05:25 anssik: my expectation is that CleverThis participants will be interested in contributing to the Ethical Principles work we have in this WG
15:05:54 anssik: also, please welcome Kevin Petit from ARM Limited to the WebML WG
15:06:17 ... and finally, please welcome Jonathan Schneerson, representing Temporal Series AI, a startup working on AI-driven solutions for financial markets, to the WebML CG!
15:06:51 Topic: Incubations
15:07:05 anssik: our next CG meeting is at an EU and APAC friendly time on Mon 28 April:
15:07:09 -> https://github.com/webmachinelearning/meetings/blob/main/telcons/2025-04-28-cg-agenda.md
15:07:26 anssik: the Community Group is adding a new deliverable, the Proofreader API, to its scope
15:07:32 -> https://lists.w3.org/Archives/Public/public-webmachinelearning/2025Apr/0001.html
15:08:13 anssik: please join on Mon if you're interested in the new local inference web extension experiment proposed by Tarek/Mozilla for consideration as a new CG deliverable
15:08:19 Winston has joined #webmachinelearning
15:08:36 ... we'll also discuss Prompt API feature requests: multi-modal real-time capabilities and Model Context Protocol support
15:09:09 ... the built-in AI APIs received early wide review, added shared privacy and security considerations, and i18n expert feedback helped address identified i18n issues with the language detector
15:09:46 Topic: WG-CG collaboration
15:10:10 anssik: to encourage broader participation, I pushed this WG meeting forward by 1 hour
15:10:41 ... one option proposed is to reuse certain CG meetings for both WebNN and built-in AI discussions; the available slots:
15:10:47 ... AMER-APAC friendly meetings ~monthly Tue 17:00 PDT / Wed 08:00 CST
15:11:04 ... EU-APAC friendly meeting ~monthly Mon 9:00 CEST / Mon 15:00 CST
15:11:58 anssik: the standards group (WG) and the incubation group (CG) have different IPR policies, but if the active participants are in both we can make this happen; you can join at:
15:12:03 -> https://webmachinelearning.github.io/community/#join
15:12:37 anssik: finally, the intent is not to create more meetings, but to reuse the existing meeting infra more flexibly
15:12:48 ... questions, feedback, thoughts?
15:14:14 Christian: +1 for the EU friendly slot
15:14:51 Ningxin: both slots are Shanghai friendly, so I'm good with both
15:15:33 For me: neutral; whether a discussion is useful in a given meeting depends on who will be present, so knowing the agenda and predicted attendance could make it work
15:15:45 q?
15:16:23 MikeW: Tue 17:00 PDT usually works
15:16:58 Topic: BlinkOn 20 takeaways
15:17:01 anssik: BlinkOn 20 happened earlier this month
15:17:24 ... I'd like to discuss takeaways relevant to this group
15:17:45 ... we have representatives from non-Chromium engine projects in this group, so I felt it is helpful to update everyone on discussions relevant to our cross-browser spec efforts
15:17:55 ... and thanks to the BlinkOn organizers for publicly sharing videos from the sessions
15:18:04 ... I found the following two talks particularly relevant to the WebML WG and CG:
15:18:12 -> Compute Abstraction for AI: Wasm, WebGPU, and WebNN https://www.youtube.com/watch?v=IgIdayJH4_o
15:18:21 -> Expandable Built in AI: Opening the Vision with Shared AI Models https://www.youtube.com/watch?v=3zOYVlBKOHA
15:18:39 ... in the agenda I mapped a few BlinkOn discussion topics to our spec issues
15:18:45 ... I likely missed some relevant talks, so please fill me in
15:18:50 ... for WebNN:
15:19:03 ... - Expose available hardware → device selection
15:19:23 ... - Ahead-of-time compilation → MLGraph caching
15:19:48 ... - NPU differences & model portability → op support limits future work?
15:20:07 ... - Hybrid execution → best-effort buffer-sharing with tensors and constants
15:20:17 3 more discussions on the "AI track":
15:20:17 -> Exploring Challenges with Cross Origin Resource & Model Sharing https://www.youtube.com/watch?v=TbM1hfl6ZzI
15:20:17 -> Built in AI: One Year Later https://www.youtube.com/watch?v=Rojs7TuLmwI
15:20:17 -> WebGPU in Blink Discussion https://www.youtube.com/watch?v=YMrSFzv_EWI
15:20:58 jsbell: three other talks that are relevant
15:21:02 ... one on experimental cross-origin sharing
15:21:14 ... another built-in AI talk
15:21:46 ... the third one was a WebGPU talk relevant for the interop aspects
15:22:34 ... the broader audience did not know the specifics of compute APIs in general, so the talks were at a higher level of abstraction
15:22:52 q?
15:23:22 anssik: for built-in AI APIs:
15:23:35 ... - Common model format & ops → (paused) Model Loader API, core operator set
15:24:11 ... - Model sharing → built-in models exposed to low-level compute APIs?
15:24:52 ... - Built-in model transparency → Model Cards integration?
15:26:13 MikeW: the WebKit contributors meeting is in the Fall
15:26:37 Present+ Etienne_Noel
15:27:26 Etienne: my key takeaway was that AI is everywhere, including inference on the web; we got a lot of feedback on Kenji's and my talk about built-in AI APIs
15:27:56 ... the Prompt API with structured output gathered interest; there are differences between browsers, but we think developers can work around that
15:28:19 ... shared AI models have the potential to solve the same-origin cache issue
15:29:00 q?
15:29:10 Topic: Operator specific issues
15:29:39 anssik: today we'll review and discuss operator specific issues whose resolution reduces code complexity and improves maintainability
15:29:43 -> [operator specific] issues https://github.com/webmachinelearning/webnn/labels/operator%20specific
15:29:59 anssik: the `pad` operator has a few such open issues I wanted to discuss next
15:30:17 Subtopic: Clarify constraints for pad operation
15:30:21 anssik: issue #377
15:30:22 https://github.com/webmachinelearning/webnn/issues/377 -> Issue 377 Need clarify constraint of 'beginningPadding' and 'endingPadding' for pad operation in "reflection" and "symmetric" mode (by BruceDai) [operator specific] [interop]
15:30:52 ... to elaborate, the issue is about the need to clarify the constraints on 'beginningPadding' and 'endingPadding' for the pad operation in "reflection" and "symmetric" modes
15:30:59 -> https://www.w3.org/TR/webnn/#api-mlgraphbuilder-pad
15:31:16 anssik: in the issue discussion the following platform APIs and frameworks have been reviewed:
15:31:20 ... - DML_PADDING_OPERATOR_DESC
15:31:24 ... - tf.mirrorPad
15:31:29 ... - torch.nn.ReflectionPad2d behavior
15:31:57 ... and recently Phillis shared Core ML constraints for its "reflect" and "replicate" modes, which map to the MLPaddingMode values "reflection" and "edge" respectively
15:32:15 ... and per Ningxin's experiment, without limiting the padding size, TFLite gives different results than DirectML
15:32:34 ... it looks like the latest proposal is to limit the padding size
15:32:43 ... this proposal was supported by both Dwayne and Ningxin
15:32:53 Dwayne: correct
15:33:02 Ningxin: +1
15:33:15 anssik: the rationale is the following, I think:
15:33:20 ... - this is a safer option
15:33:51 ... - it allows for future extension to support extended wrapping if we identify models that require that feature to perform well
15:34:01 ... - as a bonus, this behaviour could be emulated
15:34:23 ... do we have an agreement to proceed with limiting the padding size?
15:34:52 SGTM!
15:34:58 anssik: no concerns, the editors are free to proceed with the proposed solution
15:35:14 For #739, I think the big question is: can we drop "symmetric"?
15:35:15 https://github.com/webmachinelearning/webnn/issues/739 -> Issue 739 Limited support for pad on CoreML backend (by philloooo) [operator specific] [interop]
15:35:21 Subtopic: Limited support for pad on CoreML backend
15:35:27 anssik: issue #739
15:35:46 ... this issue reports findings from the Core ML backend implementation, specifically constraints on the MLPaddingMode equivalents:
15:35:51 -> https://www.w3.org/TR/webnn/#enumdef-mlpaddingmode
15:36:00 anssik: we touched on some of this in the previous issue
15:36:06 anssik: Phillis reports from Core ML (see the sketch after this list):
15:36:14 ... 1. `symmetric` mode is not supported
15:36:26 ... 2. padding for more than the last two dimensions only supports `constant` mode
15:36:43 ... 3. if the mode is `reflect` (aka `reflection`) then the beginning and ending paddings can be at most the input size - 1
15:36:56 ... 4. if the mode is `replicate` (aka `edge`) then the beginning and ending paddings can be at most the input size
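To illustrate the modes and size limits above, here is a minimal, hypothetical JavaScript sketch using the WebNN pad() builder method; the commented outputs assume the usual mirror-padding semantics (as in tf.mirrorPad), and the descriptor field names follow the current spec draft:
```
// Pad a 1-D input [1, 2, 3] at the beginning by 2 elements in each mode.
const context = await navigator.ml.createContext();
const builder = new MLGraphBuilder(context);
const input = builder.constant(
    {dataType: 'float32', shape: [3]}, new Float32Array([1, 2, 3]));

// "edge" (Core ML "replicate"): repeat the border value.
//   => [1, 1, 1, 2, 3]   (Core ML limits this padding to the input size, 3)
const edge = builder.pad(input, [2], [0], {mode: 'edge'});

// "reflection" (Core ML "reflect"): mirror around the border, excluding it.
//   => [3, 2, 1, 2, 3]   (Core ML limits this padding to input size - 1, i.e. 2)
const reflection = builder.pad(input, [2], [0], {mode: 'reflection'});

// "symmetric": mirror around the border, including it.
//   => [2, 1, 1, 2, 3]   (not supported by Core ML)
const symmetric = builder.pad(input, [2], [0], {mode: 'symmetric'});

// A "reflection" padding of 3 would need a fourth element to mirror; beyond
// these limits backends disagree (e.g. TFLite vs DirectML), hence the proposal
// to limit the padding size in the spec.
```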
15:37:00 anssik: we discussed 3 and 4 in the context of issue #377
15:37:01 https://github.com/webmachinelearning/webnn/issues/377 -> Issue 377 Need clarify constraint of 'beginningPadding' and 'endingPadding' for pad operation in "reflection" and "symmetric" mode (by BruceDai) [operator specific] [interop]
15:37:07 ... for 1 and 2 the question is: can they be emulated?
15:37:19 ... the issue also contains proposals for Core ML pad future work
15:37:37 ... MikeW, have there been any updates to Core ML in this regard, and how can we share this feedback with your Core ML team?
15:38:09 MikeW: I can take the suggestions to the Core ML team
15:38:31 jsbell: based on research it looks like only one backend supports symmetric, so I suggest dropping it from the spec and adding it back if it is needed
15:38:51 Dwayne: I'm OK with that, I don't recall a model that used it
15:39:15 ... OK dropping symmetric; another possibility is to have opSupportLimits support for modes
15:39:29 ... if we can just drop symmetric and emulate the edge case, it is OK
15:39:34 q+
15:40:03 ningxin: if there are no well-known use cases I'm fine with dropping it to simplify implementation; if use cases are identified later we can reconsider
15:40:06 ack Joshua_Lochner
15:40:47 https://github.com/huggingface/transformers.js-benchmarking/tree/main/data
15:41:01 Joshua_Lochner: I could assist with identifying which models use that; I can provide additional information from the Transformers.js benchmarking database on models that use symmetric padding
15:41:29 ... right now it only says a model uses the pad operator; I can expand it to show which mode is used
15:42:01 Dwayne: data on this is welcome in #739
15:42:02 https://github.com/webmachinelearning/webnn/issues/739 -> Issue 739 Limited support for pad on CoreML backend (by philloooo) [operator specific] [interop]
15:42:39 q?
15:43:11 Topic: Tensors for graph constants
15:43:19 anssik: issue #760, PR #830
15:43:20 https://github.com/webmachinelearning/webnn/pull/830 -> MERGED Pull Request 830 Allow tensors for graph constants. (by bbernhar)
15:43:20 https://github.com/webmachinelearning/webnn/issues/760 -> CLOSED Issue 760 Support building graphs from `MLTensor` containing constants (by bbernhar) [feature request]
15:43:34 ... over the course of the last two weeks all the suggestions in the PR were addressed, and the PR was reviewed and merged, thank you everyone
15:43:45 ... thanks to Bryan for the PR, Bryan and Austin for prototyping, and others for reviews and help
15:43:54 ... from the IDL perspective, the change is the following:
```
interface MLContext {
+ Promise<MLTensor> createConstantTensor(
+     MLOperandDescriptor descriptor, AllowSharedBufferSource inputData);
  // ...
};

interface MLTensor {
  readonly attribute FrozenArray<unsigned long> shape;
  readonly attribute boolean readable;
  readonly attribute boolean writable;
+ readonly attribute boolean constant;
  undefined destroy();
};

interface MLGraphBuilder {
  // Create an operand for a graph constant.
  MLOperand constant(MLOperandDescriptor descriptor,
                     AllowSharedBufferSource buffer);

  // Create a scalar operand from the specified number of the specified type.
  MLOperand constant(MLOperandDataType type, MLNumber value);

+ // Create an operand from a specified constant tensor.
+ MLOperand constant(MLTensor tensor);

  // Compile the graph up to the specified output operands asynchronously.
  Promise<MLGraph> build(MLNamedOperands outputs);
};
```
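A brief usage sketch of the merged API shape, assuming the IDL above; the descriptor values, tensor contents, and the matmul graph are made up for illustration:
```
// Create a constant tensor once on the context; per the discussion below, one
// use case is several graphs on the same context sharing constants as weights.
const context = await navigator.ml.createContext();
const weights = await context.createConstantTensor(
    {dataType: 'float32', shape: [2, 2]}, new Float32Array([1, 2, 3, 4]));
console.log(weights.constant);  // true — created via createConstantTensor()

const builder = new MLGraphBuilder(context);
const input = builder.input('input', {dataType: 'float32', shape: [2, 2]});

// New overload: create a graph constant operand from the constant tensor.
const w = builder.constant(weights);
const output = builder.matmul(input, w);

const graph = await builder.build({output});
```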
15:45:15 anssik: anything specific to report about this PR?
15:45:41 Dwayne: constants give the backend an opportunity to optimize the data layout, as the data is not going to change later
15:46:09 jsbell: question, what are the plans for ORT Web to take advantage of this?
15:46:31 ningxin: we could have that in our plan, we have a use case for a merged model, with two subgraphs sharing constants as weights
15:46:42 ... I will update the group when available
15:46:43 q?
15:47:13 Present+ Winston
15:47:18 Present- Winston
15:47:24 Present+ Winston_Chen
15:47:39 Topic: Core operator set
15:47:43 anssik: #573
15:47:45 https://github.com/webmachinelearning/webnn/issues/573 -> Issue 573 Core operator set (by philloooo) [question] [opset]
15:48:19 anssik: recently Arm Ltd folks joined the WG (welcome!) so I wanted to reinvigorate this effort that aims to identify current primitive gaps by mapping compositional fundamentals (e.g. PyTorch prims, TOSA, StableHLO) to WebNN operators
15:48:23 ... Dwayne has produced a table to help with this mapping:
15:48:27 -> Machine Learning Operator Mapping - All Raw Operators https://onedrive.live.com/edit?id=EE82F5C6F06C7371!345450&resid=EE82F5C6F06C7371!345450&ithint=file%2Cxlsx&authkey=!AK8f-RDTleqlLXE&wdo=2&cid=ee82f5c6f06c7371
15:48:51 anssik: in particular I wanted to discuss whether the TOSA mappings identified as part of this exercise have any open questions attached to them that ARM participants could help address
15:49:02 ... for reference, the MLIR TOSA dialect implements the TOSA specification:
15:49:05 -> MLIR TOSA dialect https://mlir.llvm.org/docs/Dialects/TOSA/
15:49:09 -> TOSA 1.0.0 draft spec https://www.mlplatform.org/tosa/tosa_spec.html
15:49:29 anssik: I note some of the recently added Wave 3 WebNN ops have not been mapped to TOSA yet in this table:
```
transpose
triangular
where
tile
sign
scatterNd
gatherNd
gatherElements
scatterElements
cumulativeSum
quantizeLinear
dequantizeLinear
logicalAnd
logicalOr
logicalXor
notEqual
```
15:49:55 anssik: it looks like some of these map quite directly to the TOSA 1.0.0 spec, but it would help if someone working closely with TOSA could take a look and provide feedback in this issue
15:51:03 Dwayne: I remember talking with NVIDIA folks — they're planning to simplify TOSA, so I have been waiting for that
15:51:35 q+
15:51:44 ack ningxin
15:52:04 ningxin: I want to note issue #817 about a rounding op
15:52:06 https://github.com/webmachinelearning/webnn/issues/817 -> Issue 817 Rounding operators (by fdwr) [feature request] [interop]
15:52:44 Mike_Wyrzykowski has joined #webmachinelearning
15:52:45 ... the use case is ONNX Runtime Web decomposition, which is missing rounding op support; rounding should be part of the core op set
15:52:53 +1 to adding rounding (already gave the issue a thumbs up)
15:52:58 q+
15:53:05 ack Mike_Wyrzykowski
15:53:25 MikeW: with the rounding, the behaviour should be consistent across platforms and backends
15:54:04 Dwayne: Ningxin tested round on Core ML, how about different compute units?
15:54:21 ningxin: that'd be good, I need to investigate how to do that
15:54:55 jsbell: I talked with Phillis about what you requested regarding compute units
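As background on why consistent behaviour matters here: common rounding routines disagree on ties (values ending in .5) — for example, ONNX's Round rounds halves to the nearest even integer, while JavaScript's Math.round rounds halves toward positive infinity. A small JavaScript comparison (not WebNN API code):
```
// Math.round (half toward +∞) vs. round-half-to-even ("banker's rounding").
function roundHalfToEven(x) {
  const floor = Math.floor(x);
  const diff = x - floor;
  if (diff < 0.5) return floor;
  if (diff > 0.5) return floor + 1;
  return floor % 2 === 0 ? floor : floor + 1;  // tie: pick the even neighbour
}

for (const x of [0.5, 1.5, 2.5, -1.5]) {
  console.log(x, Math.round(x), roundHalfToEven(x));
}
// 0.5  →  1 vs  0
// 1.5  →  2 vs  2
// 2.5  →  3 vs  2
// -1.5 → -1 vs -2
```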
15:55:11 q+
15:55:24 ack jsbell
15:55:48 jsbell: Dwayne, one of the things you found in your analysis was expanding conv beyond 2D to make it N-dimensional
15:56:03 ... do you see models that require that?
15:56:10 whisper uses conv1d
15:56:15 Dwayne: we see conv1d cases
15:56:43 jsbell: Joshua_Lochner, perhaps you can add conv1d to your script?
15:57:14 Joshua_Lochner: can you send me a message with the details and I'll follow up; I'm updating the script so everyone can run it locally
15:57:14 q?
15:57:42 q?
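As background on the conv1d cases mentioned above: a conv1d can be expressed through WebNN's existing conv2d by inserting a singleton spatial dimension. A rough, hypothetical sketch with made-up Whisper-like shapes, reusing a `builder` as in the earlier snippets and assuming the default 'nchw' input and 'oihw' filter layouts:
```
// conv1d(input [N, C, L], filter [O, C, K]) emulated via reshape + conv2d.
const n = 1, c = 80, l = 3000;   // illustrative shapes only
const o = 384, k = 3;

const input = builder.input('x', {dataType: 'float32', shape: [n, c, l]});
const filter = builder.constant(
    {dataType: 'float32', shape: [o, c, k]},
    new Float32Array(o * c * k));  // placeholder weights

// Treat the sequence as a 1-pixel-high image: [N, C, L] -> [N, C, 1, L].
const input2d = builder.reshape(input, [n, c, 1, l]);
const filter2d = builder.reshape(filter, [o, c, 1, k]);

// padding is [beginHeight, endHeight, beginWidth, endWidth]; pad only the width.
const out2d = builder.conv2d(input2d, filter2d,
                             {strides: [1, 1], padding: [0, 0, 1, 1]});

// Drop the singleton height again: [N, O, 1, L] -> [N, O, L].
const out = builder.reshape(out2d, [n, o, l]);
```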
15:57:48 Topic: Caching mechanism for MLGraph
15:57:53 anssik: issue #807
15:57:54 https://github.com/webmachinelearning/webnn/issues/807 -> Issue 807 Caching mechanism for MLGraph (by anssiko) [question] [feature request]
15:58:06 anssik: we made a decision at our last meeting to create a new explainer that aligns with the Chromium implementation direction; Reilly is working on that
15:58:28 ... meanwhile, we've received very early Chromium implementation feedback via Shiyi (thanks!)
15:58:28 -> https://github.com/shiyi9801/chromium/pull/227
15:58:28 https://github.com/shiyi9801/chromium/pull/227 -> Pull Request 227 [DO NOT SUBMIT] Model cache POC (by shiyi9801)
15:58:37 anssik: there's also an example of how this caching feature integrates into an existing image_classification webnn-sample from Shiyi (thanks again!):
15:58:41 -> https://github.com/webmachinelearning/webnn-samples/compare/master...shiyi9801:webnn-samples:model_cache
15:59:11 anssik: basically the web developer first stores a key locally and replaces the build() line with caching logic wrapped in a try...catch block
15:59:14 ... the logic first tries loadGraph() and if that fails falls back to build() followed by saveGraph()
15:59:18 ... here's the code snippet:
```
try {
  console.log("try to load graph...");
  this.graph_ = await this.context_.loadGraph(this.modelCacheKey_);
  console.log("load graph succeed!");
} catch (e) {
  console.log("failed to load graph: ", e.message, " try to build graph...");
  this.graph_ = await this.builder_.build({'output': outputOperand});
  await this.context_.saveGraph(this.modelCacheKey_, this.graph_);
}
```
15:59:40 anssik: this example looks clear, any feedback or suggestions?
16:00:01 ... Ningxin, any feedback from ORT backend model cache experimentation, e.g. does this WebNN feature fit in with ORT Web?
16:00:45 ningxin: I have discussed this with Wanming; we would make it simple from the start and provide the key so the WebNN EP does what the sample shows above
16:01:26 ... we want to have a prototype for this; earlier we discussed whether we should bring the native ORT EPContext to the web, but that's more complex
16:01:45 ... we rethought that and want to go with the simpler solution, and will let the group know
16:01:46 q?
16:02:12 q?
16:02:57 RRSAgent, draft minutes
16:02:58 I have made the request to generate https://www.w3.org/2025/04/24-webmachinelearning-minutes.html anssik
16:17:12 RRSAgent, draft minutes
16:17:13 I have made the request to generate https://www.w3.org/2025/04/24-webmachinelearning-minutes.html anssik
18:30:32 Zakim has left #webmachinelearning
21:22:26 zkis has joined #webmachinelearning