14:53:56 RRSAgent has joined #webmachinelearning
14:54:00 logging to https://www.w3.org/2025/12/04-webmachinelearning-irc
14:54:00 inviting RRSAgent
14:54:00 RRSAgent, make logs Public
14:54:01 please title this meeting ("meeting: ..."), anssik
14:54:02 Zakim, prepare meeting
14:54:02 RRSAgent, make logs Public
14:54:03 please title this meeting ("meeting: ..."), anssik
14:54:05 Meeting: WebML WG Teleconference – 4 December 2025
14:54:10 Chair: Anssi
14:54:17 Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2025-12-04-wg-agenda.md
14:54:21 Scribe: Anssi
14:54:26 scribeNick: anssik
14:54:35 gb, this is webmachinelearning/webnn
14:54:35 anssik, OK.
14:54:41 Present+ Anssi_Kostiainen
14:59:04 RRSAgent, draft minutes
14:59:05 I have made the request to generate https://www.w3.org/2025/12/04-webmachinelearning-minutes.html anssik
14:59:33 Present+ Davis_Shaver
15:00:14 Present+ Zoltan_Kis
15:00:17 davisshaver has joined #webmachinelearning
15:00:28 Present+ Markus_Handell
15:00:36 Joshua_Lochner has joined #webmachinelearning
15:00:54 Present+ Joshua_Lochner
15:01:15 Present+ Markus_Tavenrath
15:01:22 mtavenrath has joined #webmachinelearning
15:01:31 Present+ Dominique_Hazael-Massieux
15:01:44 Present+ Ehsan_Toreini
15:02:14 Present+ Ningxin_Hu
15:02:20 Present+ Rob_Kochman
15:02:32 Present+ Dwayne_Robinson
15:02:53 ningxin has joined #webmachinelearning
15:04:01 dom has joined #webmachinelearning
15:04:18 Anssi: we'll start by acknowledging the new participants who have joined the WG:
15:04:25 zkis has joined #webmachinelearning
15:04:40 ... Simon Wijckmans from cside (Client side development Inc)
15:04:49 ... Lynne Jiang, Ben Greenstein from Google
15:04:58 ... Chris Needham from BBC
15:05:05 ... JaEun Jemma Ku from University of Illinois
15:05:17 ... Pavan Yanamadala, Siddharth Mangesh, Sharanya Chandrasekaran, Noormina Abuthahir from PayPal
15:05:21 ... Dexter Yang from ByteDance
15:05:28 ... Zoltan Kis as an Invited Expert
15:05:36 ... on behalf of the entire group, welcome to all new participants!
15:05:56 DwayneR has joined #webmachinelearning
15:06:18 Topic: F2F recap
15:06:29 -> Archived F2F agenda https://github.com/webmachinelearning/meetings/tree/main/2025-11-10-kobe
15:06:40 -> Working Group minutes https://www.w3.org/2025/11/09-webmachinelearning-minutes.html
15:06:40 -> Community Group minutes https://www.w3.org/2025/11/11-webmachinelearning-minutes.html
15:06:49 Anssi: I was not planning to recap the official agenda, but rather to summarize the progress made outside the official meeting
15:06:59 ... some highlights for me were the following:
15:07:17 ... - we were able to raise awareness of our groups' inference and agentic work via breakouts and horizontal groups with the broader W3C community
15:07:24 ... - we presented WebML WG and CG work at the very popular AI Agents and The Web breakout
15:07:29 -> WebML WG/CG at AI Agents and The Web breakout https://anssiko.github.io/ai-and-web-tpac-2025/
15:07:43 Anssi: - together with Reilly, we presented WebNN at the Security IG F2F meeting on Friday
15:07:47 -> https://github.com/w3c/securityig/blob/main/meetings/2025/2025-11-14_agenda.md
15:08:03 Anssi: - Mozilla revised its WebNN position to "support" and Tarek initiated implementation work, see #763
15:08:04 https://github.com/webmachinelearning/webnn/issues/763 -> Issue 763 Request standards positions from Mozilla and WebKit (by reillyeon) [process]
15:08:17 ... - WebKit reopened its WebNN standards position
15:08:26 Ehsan has joined #webmachinelearning
15:08:37 ... - Markus and the NVIDIA team extended their exploration into various WebNN implementation strategies and optimizations, discussed during the week on the hallway track
15:09:07 ... given that broader implementer interest is now ramping up fast, I propose we use W3C's Slack #webmachinelearning for synchronous implementation-related discussions across implementers and continue to use IRC for these bi-weekly meetings
15:09:14 ... Slack has certain benefits over IRC for this type of long-running discussion, e.g. message persistence, so I think this separation of concerns works here
15:09:24 ... Tarek already started discussions about his Rust implementation on Slack and Markus chimed in, thanks!
15:09:29 ... please join the W3C Slack #webmachinelearning to exchange ideas across implementers interested in WebNN
15:09:32 -> How to join W3C Slack https://www.w3.org/wiki/Slack
15:09:35 RobKochman has joined #webmachinelearning
15:10:46 q?
15:11:16 MarkusT: I'm looking forward to Tarek's work
15:11:33 Topic: W3C Web & AI Interest Group launched
15:11:50 Anssi: the Web & AI Interest Group is a forum to discuss the ethical, societal, and technical implications of AI-related technologies; Ethical Principles for Web Machine Learning has been established as a joint deliverable
15:11:58 -> W3C Web & AI Interest Group Charter https://www.w3.org/2025/10/webai-ig-charter.html
15:12:19 ... I had an exchange with Fabien Gandon who co-chairs the IG
15:12:35 ... unfortunately Fabien couldn't join us today, but I'm conveying his welcome and inviting anyone interested in the ethical, societal, and technical implications of AI-related technologies to join the IG
15:12:39 ... we will develop our Ethical Principles document together with this newly formed IG
15:12:44 ... to join, please follow the link in the charter document
15:13:17 Dom: thanks for the intro; the way to think about the IG is as a place for the broader picture, while this WebML WG does excellent deep work on technical specifications
15:14:34 ... the IG looks primarily at non-technical topics, higher-level considerations of how the AI & Web ecosystems can evolve harmoniously
15:14:35 q?
15:15:03 Anssi: what is the IG's work mode?
15:15:14 Dom: GH-driven, with some meetings planned
15:16:32 Anssi: is there flexibility in terms of non-normative deliverables the IG could work on?
15:17:03 Dom: yes, new non-normative deliverables would be welcome; the W3C team contact is working on a roadmap for the IG
15:17:05 q?
15:17:49 Topic: Core operator set
15:18:09 Subtopic: Expand the expand operator to support blockwise broadcasting
15:18:11 Anssi: issue #903
15:18:12 https://github.com/webmachinelearning/webnn/issues/903 -> Issue 903 Expand the expand operator to support blockwise broadcasting (by fdwr) [opset]
15:18:32 ... this is one of the sub-issues spun off from the core op set meta issue
15:20:35 Dwayne: A) the preferred proposal is to:
15:20:40 ... - move the blockwise broadcasting aspect into expand
15:20:45 ... - leave the rest of the decomposition as the respective mul/div/sub/add for Q and DQ
15:20:50 -> expand() https://webmachinelearning.github.io/webnn/#dom-mlopsupportlimits-expand
15:20:56 Dwayne: B) the alternative considered:
15:21:00 ... - extend resample with nearest neighbor to support multiple axes
15:21:04 -> resample() https://webmachinelearning.github.io/webnn/#api-mlgraphbuilder-resample2d-method
15:21:22 Anssi: the preferred proposal is motivated by better conceptual alignment
15:21:54 Anssi: any questions or concerns with the preferred proposal?
15:22:34 Rob: need Reilly's input on Google's side
15:23:19 ... will ping Reilly to provide feedback on this issue
15:23:50 q?
15:24:15 Subtopic: Extend rank support
15:24:20 Anssi: issue #904
15:24:22 https://github.com/webmachinelearning/webnn/issues/904 -> Issue 904 2D, or not 2D, that is the question (by fdwr) [opset]
15:24:24 ... the catchier name for this issue is "2D, or not 2D, that is the question" :-)
15:24:45 ... Dwayne reports: "Multiple WebNN operators still have limited ranks which was historically done for backends that might be more limited"
15:24:57 ... current backends have evolved since rank support was specified in WebNN
15:25:05 ... limited ranks have caused issues in certain popular models
15:25:09 ... Dwayne notes Whisper uses 1D conv and thus requires an extra reshape() step
15:25:22 ... the issue contains a survey of the current operator rank support for the CoreML, DML, LiteRT, and ORT backends
15:25:40 ... the proposal from Dwayne is to extend the rank support to match the intersection of the rank support of current backends
15:26:26 ... the solution can take various API shapes
15:26:30 ... Dwayne came up with the following options by studying other libraries:
15:26:37 ... - A) Bake the axis count directly into the operator name
15:26:37 ... - B) Use a single operator name, with an implicit axis count based on the input rank
15:26:37 ... - C) Pass the reduction axis count separately from the input rank
15:26:37 ... - D) Pass the explicit axes
15:26:55 ... based on the pros/cons analysis, option C or D is the most preferred
15:27:56 Anssi: we can reflect platform rank differences through MLOpSupportLimits
15:28:07 ... any axis count 1-3 would be legal to WebNN if `axis count <= input rank`, see the table in the issue
15:28:18 Anssi: Dwayne suggests avoiding a zoo of new function names:
15:28:27 ... foo1, foo2, foo3 etc.
15:28:35 ... conv2d -> conv
15:28:35 ... convTranspose2d -> convTranspose
15:28:36 ... averagePool2d -> averagePool
15:28:36 ... l2Pool2d -> l2Pool
15:28:36 ... maxPool2d -> maxPool
15:28:37 ... resample2d -> resample
15:30:13 Dwayne: for each op, I can list IDL proposals to help readers
15:30:25 q?
15:31:31 Anssi: does option C or D still allow AOT (ahead-of-time) feature detection of ranks?
15:31:35 ... a simple feature detection mechanism is to check for the existence of a method on an object
15:31:38 ... can we implement such feature detection of supported ranks entirely with MLOpSupportLimits?
15:31:53 ... an example of simple feature detection:
15:31:53 ```
15:31:53 const graphBuilder = new MLGraphBuilder(await navigator.ml.createContext());
15:31:53 if ('conv' in graphBuilder) console.log('conv() exists');
15:31:53 ```
15:32:32 Anssi: the naming change has a compatibility impact, as discussed in the context of issue #821
15:32:33 https://github.com/webmachinelearning/webnn/issues/821 -> Issue 821 Operator naming 2D vs 2d (by fdwr) [conventions]
15:33:24 ... given the Origin Trials are imminent, I think this change would land after the initial OT period?
15:34:26 Dwayne: when making a new change, we give it 4 weeks for frameworks to update themselves and leave an alias in place
15:34:27 q?
15:35:26 Subtopic: Composite operators / subgraphs
15:35:34 Anssi: issue #907
15:35:35 https://github.com/webmachinelearning/webnn/issues/907 -> Issue 907 Composite operators / subgraphs (by fdwr) [opset]
15:35:50 ... the core operator set was discussed at TPAC 2025, where we resolved to evolve the proposal for aggregate operators via subgraphs
15:35:54 -> RESOLUTION from TPAC 2025 https://www.w3.org/2025/11/09-webmachinelearning-minutes.html#ffff
15:36:05 Anssi: this builds upon the earlier exploration by Ningxin et al. on custom ops discussed at TPAC 2024
15:36:10 -> Custom ops at TPAC 2024 https://www.w3.org/2024/09/23-webmachinelearning-minutes.html#b039
15:36:25 Anssi: Dwayne opened this topic-specific issue to pursue this proposal further and shared his background research on the topic (thanks!)
15:36:34 ... see also the Case Study on WebNN Small Language Model Performance Optimization presented at TPAC 2025 for further motivation:
15:36:40 -> WebNN SLM Performance Optimization Case Study at TPAC 2025 https://lists.w3.org/Archives/Public/www-archive/2025Nov/att-0000/WebNN_SLM_Optimization_-_TPAC.pdf
15:37:06 Anssi: the high-level motivation for the proposal has been discussed in the context of the core op set meta issue and I think we have general agreement:
15:37:13 ... - 100s of potential operators across ML libraries
15:37:26 ... - adding all of them to a Web API is not feasible
15:37:42 ... - the WebNN core op set is designed to enable composability of larger aggregate ops
15:38:12 ... - if the backend has a compatible implementation of the subgraph, it can use a more efficient path vs. relying on pattern recognition by the implementation
15:38:32 ... a popular concrete example of an aggregate op is multi-head attention, a key component of the transformer architecture introduced in the original 2017 paper
15:38:56 ... Dwayne has a code snippet in the issue to demonstrate what this could look like in terms of API surface and basic steps (details, names etc. to be discussed)
15:39:43 Dwayne: a web developer defines a composite operator as a JS function using the existing WebNN built-in ops
15:39:51 ... the buildSubgraph() method returns the built subgraph
15:40:00 ... the subgraph() method returns the output given the built subgraph and input
15:40:06 q+ to ask if we maintain the semantics, e.g. that this was meant to be tanh
15:40:17 Dwayne: this is more of an example, ideas welcome
15:40:18 q?
15:40:31 MarkusT: how to handle different constants?
15:40:36 ... do we want subgraph names?
15:40:47 Dwayne: would a name be helpful for a backend to recognize?
15:41:16 MarkusT: ML is done by frameworks; when pattern matching the subgraph, one can pre-check if this is a name I expect, so the name would be a hash to know what to pattern match against
15:41:39 Dwayne: I looked at various ML libraries; would a list of candidate names be better?
15:42:25 MarkusT: if you dump the subgraph into a debugging tool, the name would help with debugging
15:42:39 ... subgraphs calling subgraphs?
15:42:46 Dwayne: seems useful for composability?
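The composition pattern Dwayne describes, a composite operator defined as a plain JS function over existing builder ops, can be sketched as below. This is illustrative only: the chosen op (Mish, which is not a WebNN built-in) and the idea of handing the function to a future buildSubgraph()-style entry point are assumptions; the builder methods used (constant, exp, add, log, tanh, mul) are existing MLGraphBuilder ops.

```javascript
// A hypothetical composite operator in the style discussed above:
// mish(x) = x * tanh(softplus(x)), where softplus(x) = ln(1 + exp(x)),
// decomposed into existing WebNN builder ops. How such a function gets
// registered as a named subgraph (e.g. via a buildSubgraph() method)
// is exactly what issue #907 is exploring.
function mish(builder, x) {
  const one = builder.constant(
      {dataType: 'float32', shape: []}, new Float32Array([1]));
  const softplus = builder.log(builder.add(builder.exp(x), one));
  return builder.mul(x, builder.tanh(softplus));
}
```

A backend that recognizes the subgraph (by name or by pattern) could then dispatch a fused kernel instead of executing the six-op decomposition.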
15:44:03 MarkusT: the input would be dynamic, the shape of the input determined by whatever output is sent to the input; subgraphs are like macros
15:44:16 Dwayne: ONNX has this concept of functions composed of multiple graphs
15:44:28 MarkusT: do we expect macro expansion by every backend?
15:45:01 Dwayne: not sure about that; each backend, or a layer below, should know the capabilities of the backend
15:45:25 q+
15:45:28 MarkusT: if the backend would support subgraphs, perhaps the WebNN native interface would unroll
15:45:30 q?
15:45:40 ack zkis
15:45:40 zkis, you wanted to ask if we maintain the semantics this was meant to be tanh
15:46:05 handellm has joined #webmachinelearning
15:46:37 Zoltan: question, do we want to maintain semantics?
15:46:49 Dwayne: MarkusT's idea of including names would help with that
15:47:09 Zoltan: we should discuss whether we need to "standardize" those names
15:47:22 ... an annotation mechanism
15:48:06 MarkusT: I'd prefer not to require any meta information; for a new operation, how long does it take for us to standardize it vs. a backend finding a name and implementing it?
15:48:07 q?
15:48:10 ack ningxin
15:48:57 Ningxin: to express an operator, the ops take optional input; some attention ops have optional input too, so how can the subgraph concept support that?
15:49:33 ... secondly, some existing WebNN ops have attributes, per WebNN conventions
15:50:11 Ningxin: static attributes, how to go about them?
15:50:24 Dwayne: will add that as a consideration
15:50:25 q?
15:50:51 MarkusT: what if attributes could override?
15:50:53 Dwayne: I suspect so
15:50:54 q?
15:51:16 q?
15:51:25 Topic: Push vs. pull architecture for constants
15:51:30 Anssi: issue #901
15:51:31 https://github.com/webmachinelearning/webnn/issues/901 -> Issue 901 Proposal: API to Separate Graph Building from Weight Loading to Reduce Peak Memory Usage (by mtavenrath) [feature request]
15:51:44 ... we discussed this proposal to reduce peak memory usage from Markus and the NVIDIA team at TPAC 2025
15:51:53 ... and resolved to explore a streaming constructor for constants
15:51:57 -> https://www.w3.org/2025/11/09-webmachinelearning-minutes.html#e940
15:52:04 Anssi: after TPAC, Markus provided further details on the benefits of the proposed pull-based model for constants in this issue:
15:52:11 ... - 1. Latency Hiding via Parallel Compilation
15:52:11 ... - 2. Direct-to-Disk Caching & I/O Alignment
15:52:11 ... - 3. Persistent Layout Optimization
15:52:11 ... - 4. Memory Architecture & UVM Efficiency
15:52:11 ... - 5. Dynamic Resource Management
15:52:36 ... and with the broader NVIDIA team looked into remote execution of neural networks, e.g. on a home server, using external weights
15:52:48 Anssi: Dwayne notes external weights are already achievable via MLTensor when combined with the MLGraphBuilder.input() method
15:53:06 ... this allows an MLGraph to be built without weights, which are written later via writeTensor()
15:53:11 ... Dwayne suggests this addresses some of the concerns raised in this issue?
15:53:20 ... what functionality do we miss with MLTensor and input()?
15:53:27 ... I guess 5. dynamic resource management?
15:53:43 q?
15:54:22 MarkusT: parsing of constants is delayed; not all backends are happy to call writeTensor()
15:54:58 ... the 5 points are based on discussion with Reilly: if we have external resources we don't need to do memory copies at all, we get the data at the time when we need it
15:55:23 ... backends can pull the resources on demand, with the responsibility on the backend implementation
15:56:13 ... we were wondering about caching; current ORT likely downloads all content, and the backend could be faster than code running in a JS process
15:56:35 ... we're currently doing work in another ML framework with similar optimizations
15:56:37 q?
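The MLTensor-plus-input() approach Dwayne mentions can be sketched as follows. This is a minimal illustration assuming the builder API shape in the current WebNN draft; the layer shapes and the tensor names ('fc_weight', 'fc_bias') are invented for the example.

```javascript
// Sketch: build the graph with weights declared as named inputs rather
// than baked-in constants, so the weight bytes can be supplied later
// (e.g. streamed into MLTensors via context.writeTensor() and passed
// in the dispatch() input map). Names and shapes are illustrative.
function buildWeightlessLayer(builder) {
  const x = builder.input('x', {dataType: 'float32', shape: [1, 256]});
  const w = builder.input('fc_weight', {dataType: 'float32', shape: [256, 256]});
  const b = builder.input('fc_bias', {dataType: 'float32', shape: [256]});
  // y = x @ W + b, with W and b bound at dispatch time, not build time
  return builder.add(builder.matmul(x, w), b);
}
```

Peak memory drops because the graph can be compiled before any weight data is resident; the open question in the issue is whether a pull-based (backend-initiated) model buys more than this push-based writeTensor() flow.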
15:57:13 MarkusT: the pass-a-URL-and-offset proposal by Reilly sounded good: pass a GGUF file to the GraphBuilder, give the input tensor names, and we're done
15:57:14 q?
15:57:30 q?
15:57:48 Dwayne: I'll think about this more
15:58:01 q?
15:58:23 Present+ Mike_Wyrzykowski
15:58:45 q?
15:58:51 Topic: Device selection
15:58:56 Subtopic: Device selection criteria for usecase-driven scenarios
15:59:03 Anssi: issue #902
15:59:04 https://github.com/webmachinelearning/webnn/issues/902 -> Issue 902 Device selection criteria for usecase-driven scenarios (by fdwr) [device selection]
15:59:07 ... we discussed this at TPAC 2025:
15:59:11 -> https://www.w3.org/2025/11/09-webmachinelearning-minutes.html#93d2
16:00:16 ... - there was consensus that hints are generally the preferred mechanism, but no decision on which hints to pursue, if any; I posted an IDL diff to tease out additional perspectives
16:00:36 ... - there was interest in supporting multiple devices of a given type
16:00:56 ... - there was agreement that prompt fatigue is an issue; the still-evolving Page Embedded Permission Control (PEPC) might be a solution to that
16:01:00 -> https://github.com/WICG/PEPC
16:01:39 Anssi: - a proposal that hints would help the UA schedule real-time vs. non-real-time workloads running in parallel
16:01:44 q?
16:03:06 MarkusH: if we have explicit (or implicitly UA-detected) Worker QoS, would there remain a use case for specifying the latency requirement? Same goes for the continuity.
16:03:35 ... perhaps Worker QoS is implicitly detectable by the UA, which could remove the low-latency preference in that case
16:03:52 ... a hint that real-time activity is going on
16:03:53 q?
16:04:14 q?
16:05:12 MarkusH: perhaps Mike has feedback on an exact interface that would work out
16:05:22 q?
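To make the hints direction concrete, here is a hypothetical mapping from a use case to context options. Only powerPreference and its values exist in current WebNN drafts; the lowLatency key is an invented placeholder for the kind of real-time scheduling hint discussed above, not a proposed name.

```javascript
// Hypothetical use-case-to-hints mapping. powerPreference and its
// 'high-performance' / 'low-power' values are in the WebNN spec today;
// the lowLatency key is a made-up stand-in for the hints under
// discussion in issue #902.
function contextOptionsFor(useCase) {
  switch (useCase) {
    case 'realtime-video':        // e.g. background blur during a call
      return {powerPreference: 'high-performance', lowLatency: true};
    case 'background-summarize':  // non-interactive batch workload
      return {powerPreference: 'low-power'};
    default:
      return {};                  // let the UA decide
  }
}
// In a page this would feed context creation, e.g.:
//   navigator.ml.createContext(contextOptionsFor('realtime-video'))
```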
16:05:29 RRSAgent, draft minutes
16:05:30 I have made the request to generate https://www.w3.org/2025/12/04-webmachinelearning-minutes.html anssik
16:17:29 RRSAgent, draft minutes
16:17:31 I have made the request to generate https://www.w3.org/2025/12/04-webmachinelearning-minutes.html anssik
18:30:29 Zakim has left #webmachinelearning