14:59:39 RRSAgent has joined #webmachinelearning
14:59:43 logging to https://www.w3.org/2025/05/22-webmachinelearning-irc
14:59:43 RRSAgent, make logs Public
14:59:44 please title this meeting ("meeting: ..."), anssik
14:59:47 Meeting: WebML WG Teleconference – 22 May 2025
14:59:54 Chair: Anssi
15:00:02 Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2025-05-22-wg-agenda.md
15:00:18 Scribe: Anssi
15:00:28 scribeNick: anssik
15:00:34 gb, this is webmachinelearning/webnn
15:00:34 anssik, OK.
15:00:44 Present+ Anssi_Kostiainen
15:00:52 RafaelCintron has joined #webmachinelearning
15:00:52 Regrets+ Christian_Liebel
15:01:07 Present+ Dwayne_Robinson
15:01:15 Present+ Joshua_Bell
15:01:23 Present+ Laszlo_Gombos
15:01:27 lgombos3 has joined #webmachinelearning
15:01:29 Present+ Rafael_Cintron
15:01:41 Present+ Laszlo_Gombos
15:02:12 Present+ Reilly_Grant
15:02:21 Present+ Ningxin_Hu
15:02:33 Present+ Zoltan_Kis
15:02:36 ningxin has joined #webmachinelearning
15:02:50 Present+ Tarek_Ziade
15:03:43 RRSAgent, draft minutes
15:03:44 I have made the request to generate https://www.w3.org/2025/05/22-webmachinelearning-minutes.html anssik
15:03:52 anssik: please welcome our new WebML WG participants:
15:04:12 ... Jonathan Schneerson from Temporal Series AI, an AI startup specializing in time-dependent data from financial transactions, sensor streams, etc.
15:04:39 ... Peter Tanski and Suraj Bisht from Capital One Financial, a financial services company and an early adopter of forward-looking web capabilities, e.g. the Web NFC API for authentication
15:05:20 anssik: as our group is growing, at the same time, with mixed emotions, I'm sharing that one esteemed participant is taking a break from work
15:06:39 Josh: Hi! I'm departing Google and will move to another country, with no future plans yet; I've truly enjoyed working with this group and will remain contactable through personal email, IRC, etc.
15:07:09 me: inexorabletash AT gmail DOT com
15:07:46 q+
15:08:14 Josh, thanks so much for your tremendous contribution to this WG and WebNN spec!
15:08:44 ack RafaelCintron
15:08:49 anssik: thank you Josh for everything!
15:09:08 q+
15:09:12 Rafael: the spec has tremendously benefited from your work, I've learned a lot from you!
15:09:14 ack ningxin
15:09:38 ningxin: I will echo Rafael and Anssi, thank you so much for your tremendous work for this group!
15:10:02 ... you've been a coach for me as an editor, I really appreciate that; thank you, and I wish you a great next chapter!
15:10:53 Thanks Josh! It started as a great run / job together on merging a lot of algorithms, which you have single-handedly improved in many iterations. I learned a lot in that process. Thank you!
15:11:28 q?
15:11:47 Topic: Incubations
15:11:52 Thanks Zoltan and all!
15:12:06 anssik: on the agenda of our upcoming Community Group EU-APAC call on Mon 26 May we have:
15:12:23 ... Prompt API implementation experience from AiBrow
15:12:48 ... New proposals: Web AI for Time Series, (recap) Local Inference Web extension
15:12:59 ... Proofreader API kick-off
15:13:12 ... Prompt API security and privacy
15:13:17 ... See the CG agenda for more references
15:13:20 -> https://github.com/webmachinelearning/meetings/blob/main/telcons/2025-05-26-cg-agenda.md
15:14:00 Topic: Google I/O and MS Build 2025 takeaways
15:14:18 anssik: both events were (unsurprisingly) AI heavy; a few observations I think are of interest to this group:
15:14:33 Subtopic: Built-in AI APIs
15:14:44 anssik: both Edge and Chrome made announcements around the Built-in AI APIs being worked on in the WebML CG
15:14:52 "Enabled by default: Prompt API for Chrome Extensions, Summarizer API, Translator API, Language Detector API; Origin trials: Writer API, Rewriter API; Early preview: Proofreader API"
15:15:19 -> Google I/O built-in AI APIs announcement https://www.youtube.com/watch?v=GjvgtwSOCao&t=2687s
15:15:31 Winstonc has joined #webmachinelearning
15:16:06 Tarek has joined #webmachinelearning
15:16:09 Josh: the other thing demonstrated was multimodal Prompt API use, processing images and audio as inputs; that's in the early preview stage
15:17:03 "The Prompt API and Writing Assistance APIs — now available as developer previews in Edge Canary and Dev channels"
15:17:21 -> Microsoft Build built-in AI APIs announcement https://blogs.windows.com/msedgedev/2025/05/19/introducing-the-prompt-and-writing-assistance-apis/
15:17:40 anssik: notably, the Prompt API in the Edge developer preview is available to web apps and pages, not just to extensions
15:18:03 q?
15:18:35 Rafael: that's well covered Anssi, nothing to add
15:19:48 Subtopic: Windows ML
15:20:00 anssik: at Build Microsoft announced Windows ML as an evolution of DirectML
15:20:05 -> Windows ML announcement https://blogs.windows.com/windowsdeveloper/2025/05/19/introducing-windows-ml-the-future-of-machine-learning-development-on-windows/
15:20:20 anssik: this is relevant to the group from the WebNN implementation perspective, an opportunity to gather further implementation experience
15:20:26 ... based on what was announced at Build:
15:20:37 ... Windows ML promises to simplify dependency management on Windows
15:20:58 ... vendor-specific execution providers are part of Windows ML and are updated by the OS
15:21:26 RafaelCintron: that is correct, it will be ONNX Runtime based
15:24:18 Josh: the question to the group is, should we update the WebNN architecture explainer?
15:25:01 anssik: - new device selection mechanism that supports hint-based, explicit, and automatic selection
15:25:08 "WinML" is a WinRT-based wrapper atop ONNX Runtime (several years old). It's a fairly thin wrapper, plus some additional support for video frames and image conversion to input tensors. It only supported CPU and DirectML EPs.
15:25:08 "WindowsML" is a Windows-specific fork of ONNX Runtime, directly calling the ORT API (with some slight renamings in the header). It supports multiple EPs.
15:26:08 Topic: Operator specific issues
15:26:13 anssik: as usual, we focus our review and discussion on operator specific issues
15:26:17 -> [operator specific] issues https://github.com/webmachinelearning/webnn/labels/operator%20specific
15:26:22 Subtopic: int64 data type
15:26:34 anssik: I wanted us to take a look at various int64 data type related issues and PRs to check we're all aligned
15:26:39 ... issue #283 fixed by PR #646
15:26:40 https://github.com/webmachinelearning/webnn/pull/646 -> MERGED Pull Request 646 Specify the operand data type constraints of operations (by inexorabletash)
15:26:40 https://github.com/webmachinelearning/webnn/issues/283 -> CLOSED Issue 283 Specify the operand data type constraints of operation (by huningxin) [question]
15:26:47 ... introduced constraints for input operands (thanks Josh!)
15:26:58 ... issue #694 has a draft PR #695
15:26:58 https://github.com/webmachinelearning/webnn/issues/694 -> Issue 694 Consider adding int64/uint64 data type support for some reduce operators (by lisa0314) [operator specific]
15:26:59 https://github.com/webmachinelearning/webnn/pull/695 -> Pull Request 695 Bugfix: Add missing 64-bit integers support for some reduction operators (by huningxin) [operator specific]
15:27:11 ... to add int64/uint64 support for reduce ops
15:27:19 ... the PR awaits Mike's response to a question: "should we also allow optional 64-bits integers support for these reduction ops?"
15:27:50 Present+ Winston_Chen
15:28:26 Ningxin: before changing this PR to a draft due to opSupportLimits, in the last meeting we heard Microsoft's feedback on sign regarding optional support for int64
15:28:39 ... I'd propose to open this PR for review
15:28:49 a+
15:28:51 q+
15:28:56 ack reillyg
15:29:34 Reilly: at TPAC we discussed a minimum data type set implementable across all Chromium backends; this might be a sign we should commence with that
15:29:55 ... this should also allow us to clean up many wpt failures; we could hard-fail ops that do not support the minimum set
15:30:08 ... and keep some ops optional
15:30:18 [ thumbs up from Dwayne ]
15:30:34 Reilly: some ops have no overlapping data types, that is an issue
15:31:12 q+
15:31:15 ... I'm not blocking re-opening PR #695
15:31:16 https://github.com/webmachinelearning/webnn/pull/695 -> Pull Request 695 Bugfix: Add missing 64-bit integers support for some reduction operators (by huningxin) [operator specific]
15:31:17 ack ningxin
15:31:34 ningxin: so for Reilly's proposal, should we have a separate issue?
15:31:48 ... we should also consider device; CPU and GPU devices may have different data type support for the same op?
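[Editor's note: Reilly's "minimum data type set" idea above amounts to intersecting the per-backend supported type lists for an op. A minimal sketch in plain JavaScript; the backend names and type lists are purely illustrative assumptions, not actual Chromium backend capabilities.]

```javascript
// Hypothetical per-backend supported data types for one reduction op.
// These lists are illustrative only, not real backend capabilities.
const backendSupport = {
  backendA: ["float32", "float16", "int32"],
  backendB: ["float32", "float16", "int32", "int64", "uint64"],
  backendC: ["float32", "int32", "int64"],
};

// The "minimum data type set" is the intersection across all backends:
// a type survives only if every backend's list contains it.
function minimumSet(support) {
  const lists = Object.values(support);
  return lists.reduce((acc, types) => acc.filter((t) => types.includes(t)));
}

console.log(minimumSet(backendSupport)); // ["float32", "int32"]
```

Types outside the intersection (here int64/uint64) would then be the per-op optional extras that feature detection via opSupportLimits() reports.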
15:31:52 q+
15:31:57 ack RRSAgent
15:32:00 ack reillyg
15:32:27 Reilly: I guess we have to do the analysis first; I expect if we include data type we'll find there are more data types not supported across all device types
15:32:42 ... the proposal is we should consider device type selection optional
15:32:59 ... and intersections should not consider data type; that should be orthogonal
15:33:25 ... there's a separate question of how we communicate to developers that a particular device type needs to use a specific data type
15:33:46 ... if we try to consider all these things at once, we can't come up with a useful op set
15:34:10 ... what ops implementations must support is the question
15:34:31 ... this might force all implementations to support CPU and GPU always and rely on feature detection to find out NPU support
15:34:52 q+
15:34:59 ... "I prefer to run on NPU, give me the available data types for that"
15:35:32 ... we should focus on compatibility first, models that will surely execute
15:35:55 ningxin: compatibility means native framework compatibility, that is separate from the device?
15:36:00 Reilly: correct
15:36:02 ack zkis
15:36:47 zkis: I think with the Windows ML announcement we can reiterate the device selection design, hints-based vs. explicit; it seems the current hints-based design is a subset of what Windows ML supports
15:38:18 anssik: issue #845 was fixed by PR #848
15:38:19 https://github.com/webmachinelearning/webnn/pull/848 -> MERGED Pull Request 848 Bugfix: Support `int64` for `abs`, `neg`, `sign`, `prelu` and `relu` (by huningxin)
15:38:19 https://github.com/webmachinelearning/webnn/issues/845 -> CLOSED Issue 845 The allowed data types of input operand for `sign` operator should also include `int64` type (by BruceDai) [operator specific]
15:39:22 I propose to open a separate issue for Reilly's proposal of a minimum data type set if there is not an existing one
15:40:21 sg
15:40:24 Reilly: Ningxin, feel free to open an issue for this proposal
15:40:36 Subtopic: triangular
15:40:39 anssik: issue #768
15:40:40 https://github.com/webmachinelearning/webnn/issues/768 -> Issue 768 Consider removing or redesigning the `triangular` operator (by a-sully) [operator specific]
15:40:52 ... JoshuaL shared new per-model Trilu op count data (thanks!) so I wanted us to discuss this as a group
15:40:56 ... Dwayne shared his observations in the issue: "Most of these models contain just one instance of trilu, but that is a substantial percentage of model"
15:41:19 anssik: does this new data support the proposal to remove the triangular op from the spec?
15:41:23 q?
15:41:36 Dwayne: the additional data encourages keeping the triangular op
15:41:51 ... but we need to consider how many backends have support for this op
15:42:27 ... I'm inclined to keep this op for now, unless we have a better understanding of the Core ML decomposition
15:42:52 ... decomposition is possible with large triangular matrices without taking a lot of memory
15:43:48 Reilly: when fuzzing the implementation, the fuzzer was able to find huge matrices with masks, not existing in practical models
15:44:08 ... more implementation work is required to do this; we could compute the mask at inference time rather than bake it into the generated model
15:44:33 Dwayne: computable at runtime, Core ML should be able to decompose this on the fly; I have details in the issue
15:44:49 ... did you encounter inputs of big size?
15:45:11 Reilly: fuzzers do generate huge inputs
15:45:19 Dwayne: from a security perspective?
15:45:28 Reilly: correct, to identify corner cases
15:45:48 Dwayne: any reservation to support this op?
15:46:07 Reilly: no, as long as it is secure and efficient to implement
15:46:36 q?
15:46:56 Subtopic: opSupportLimits level of detail for output tensor(s)
15:47:00 anssik: issue #835
15:47:01 https://github.com/webmachinelearning/webnn/issues/835 -> Issue 835 opSupportLimits: Level of detail for output tensor(s)? (by inexorabletash) [question]
15:47:31 ... Josh explains: "there are a variety of opinions about how much detail opSupportLimits() should include for the output tensors."
15:48:16 Josh: just doing the bare minimum is one approach; understandability is the burden. The question is, in all the ops it is possible to determine the shape and data type from the algorithm, do we rely on that?
15:48:53 ... the most recent comment on whether we should include this data is from Ningxin, saying let's add it; I think this is waiting for someone to write a PR
15:49:19 anssik: the proposal from Ningxin to have ranks in output got substantive support
15:49:45 Dwayne: maybe I'm biased from other specs, but I would prefer to have output and ranks for symmetry
15:50:15 anssik: Phillis reports the actual constraints from underlying ML frameworks are:
15:50:24 ... - global tensor rank constraints
15:50:30 ... - op level input rank constraints
15:50:48 ... and concludes we have two ways to expose this:
15:50:52 ... - expose rank per op output
15:51:02 ... - represent global tensor constraint via opSupportLimits
15:51:28 I can write a PR if Josh hasn't started
15:51:36 Josh: let's spec this in opSupportLimits
15:51:50 I have no open PRs, not starting anything new right now
15:51:54 q?
15:52:01 Subtopic: Rounding
15:52:06 anssik: issue #817
15:52:06 https://github.com/webmachinelearning/webnn/issues/817 -> Issue 817 Rounding operators (by fdwr) [feature request] [interop]
15:52:12 ... this issue has extensive background research by Dwayne, thanks again!
15:52:43 ... the TL;DR: add one function that is consistent with the IEEE rounding mode, and express a decomposition for the quantizeLinear operator
15:52:57 ... the remaining open question seems to be the rounding behavior on the Core ML NPU/ANE
15:53:02 ... per Ningxin's experiment the rounding behavior is inconsistent between ANE/NPU and CPU
15:53:14 ... ANE/NPU uses rounding away from zero
15:53:18 ... Dwayne asked whether it is round *half* away from zero (RHAZ)
15:53:22 -> RAZ https://en.wikipedia.org/wiki/Rounding#Rounding_away_from_zero
15:53:27 -> RHAZ https://en.wikipedia.org/wiki/Rounding#Rounding_half_away_from_zero
15:53:33 anssik: do we know which it is?
15:53:57 ... is the proposal to add the round operator with a note to implementers that they should emulate this due to the RAZ/RHAZ inconsistency between CPU and NPU?
15:54:30 Dwayne: Ningxin probably meant RHAZ too; this is low-level, and Core ML is the only one that has a potential issue
15:54:45 ... it's possible to emulate this with only 2 ops
15:55:16 anssik: hearing no concerns to add this op, with a performant emulation path
15:55:21 ... any comments?
15:55:29 Dwayne: thanks for your research!
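[Editor's note: to make the RAZ/RHAZ distinction above concrete, here is a small sketch in plain JavaScript. Note that `Math.round` itself rounds halves toward positive infinity, so RHAZ must be emulated; this only illustrates the rounding modes and is not the spec's decomposition.]

```javascript
// Round half away from zero (RHAZ): only exact halves round outward.
// 2.5 -> 3, -2.5 -> -3, while 2.1 -> 2.
function roundHalfAwayFromZero(x) {
  return Math.sign(x) * Math.round(Math.abs(x));
}

// Round away from zero (RAZ): ANY nonzero fraction rounds outward.
// 2.1 -> 3, -2.1 -> -3.
function roundAwayFromZero(x) {
  return Math.sign(x) * Math.ceil(Math.abs(x));
}

console.log(roundHalfAwayFromZero(-2.5)); // -3
console.log(Math.round(-2.5));            // -2 (half toward +Infinity)
console.log(roundAwayFromZero(2.1));      // 3
console.log(roundHalfAwayFromZero(2.1));  // 2
```

The -2.5 case is exactly where the modes diverge, which is why pinning down whether ANE/NPU does RAZ or RHAZ matters for interop.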
15:55:48 Dwayne: I'll do the PR
15:55:55 Thanks Dwayne
15:56:25 Subtopic: isNaN op proposal
15:56:29 anssik: issue #811
15:56:30 https://github.com/webmachinelearning/webnn/issues/811 -> Issue 811 Behavior when there are NaNs in argmin/max inputs (by philloooo) [interop]
15:57:01 Dwayne: several backends have this op; I guess this would benefit from a PR
15:58:06 Topic: Caching mechanism for MLGraph
15:58:12 anssik: issue #807
15:58:13 https://github.com/webmachinelearning/webnn/issues/807 -> Issue 807 Caching mechanism for MLGraph (by anssiko) [question] [feature request]
15:58:23 ... with Reilly here, I wanted to revisit the prototype implementation findings to reinvigorate work on the explainer
15:58:28 ... we have an initial implementation based on Chromium and ORT, and sample code on how this integrates into an existing sample
15:58:35 ... there's also a Chromium Design Doc, but I'm not sure if that has been shared with the group yet?
15:58:55 ... I recall Reilly commented he'd take a stab at the explainer based on the implementation
15:59:34 Reilly: I recall offering to write the explainer; I recall Mike already put something out there
15:59:44 ... I can take an action to write this explainer
16:00:06 ... as for implementation experience, there's the Chromium Design Doc; Ningxin, do you have updates on the prototype?
16:01:30 ningxin: I'll ensure the Chromium + ORT based Design Doc is public; for prototype status, we saved the compiled model using the ORT compiled model; we haven't made it work with the GPU process yet
16:01:57 ... based on the API sketch proposed by Reilly we experimented with ORT Web integration and would like to get more experience on how an AI framework can utilize this feature
16:02:13 ... we can share an early prototype of that with the group
16:02:38 ... even if not in a real model cache storage managed by the browser process, with saving to disk we got a good performance gain
16:02:52 ... we're also discussing with ORT people how to reduce memory and disk overhead
16:03:50 ... Reilly's proposal separates the build and save operations; the source model must be kept after build to allow saving the graph later, or saved to a temporary place on disk
16:04:14 ... this seems not very ideal; a new idea from Rafael could help overcome that issue
16:04:31 Rafael: to recap, once you create a session from model building, a key piece of information is not present
16:04:52 ... later you may want to save the model; Ningxin proposes to keep the information to allow saving later
16:05:04 ... or have "build and save" at the same time
16:05:20 ... to use memory efficiently
16:05:21 q?
16:06:23 Reilly: the design I made was based on how the Core ML and TFLite backends work; the model has to remain on disk
16:06:46 ... I guess the question to Rafael re the ORT implementation is, is there a change to the design that makes this more efficient?
16:07:15 Rafael: yes, it would help to force developers to decide at build time whether to save at the same time
16:07:33 ... if we get more feedback from developers, we can decide whether it is a MUST requirement
16:08:01 Reilly: the cost of saving and deleting is minimal; if the system forces doing both at the same time and deleting the file later, that would be reasonable
16:08:18 Rafael: the cost of keeping around the data that may be needed later is the question
16:08:35 Reilly: I guess the answer is no, per Ningxin's work
16:09:24 Topic: Query supported devices before graph compilation
16:09:29 anssik: issue #815
16:09:30 https://github.com/webmachinelearning/webnn/issues/815 -> Issue 815 Query supported devices before graph compilation (by anssiko) [device selection]
16:09:45 zkis: I will update the explainer and submit a PR for the group to review
16:10:00 RRSAgent, draft minutes
16:10:02 I have made the request to generate https://www.w3.org/2025/05/22-webmachinelearning-minutes.html anssik
18:23:07 Zakim has left #webmachinelearning
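[Editor's note: as an aside on the triangular discussion above, trilu-style lower-triangle retention with a diagonal offset `k` (following ONNX Trilu conventions) can be computed at runtime without materializing a separate mask tensor. A minimal plain-JS sketch, not any backend's actual decomposition.]

```javascript
// Lower-triangular trilu for a rows x cols matrix stored flat in
// row-major order. `k` is the diagonal offset: elements where
// (col - row) <= k are kept, everything else is zeroed.
function trilLower(data, rows, cols, k = 0) {
  const out = new Float32Array(rows * cols); // zero-initialized
  for (let i = 0; i < rows; i++) {
    for (let j = 0; j < cols; j++) {
      if (j - i <= k) out[i * cols + j] = data[i * cols + j];
    }
  }
  return out;
}

const m = Float32Array.from([1, 2, 3, 4, 5, 6, 7, 8, 9]);
console.log(Array.from(trilLower(m, 3, 3))); // [1, 0, 0, 4, 5, 0, 7, 8, 9]
```

Because the keep/zero decision is an index comparison, the mask never needs to exist as a tensor, which is the property that makes huge fuzzer-generated shapes tractable.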