14:59:39 RRSAgent has joined #webmachinelearning
14:59:43 logging to https://www.w3.org/2025/05/22-webmachinelearning-irc
14:59:43 RRSAgent, make logs Public
14:59:44 please title this meeting ("meeting: ..."), anssik
14:59:47 Meeting: WebML WG Teleconference – 22 May 2025
14:59:54 Chair: Anssi
15:00:02 Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2025-05-22-wg-agenda.md
15:00:18 Scribe: Anssi
15:00:28 scribeNick: anssik
15:00:34 gb, this is webmachinelearning/webnn
15:00:34 anssik, OK.
15:00:44 Present+ Anssi_Kostiainen
15:00:52 RafaelCintron has joined #webmachinelearning
15:00:52 Regrets+ Christian_Liebel
15:01:07 Present+ Dwayne_Robinson
15:01:15 Present+ Joshua_Bell
15:01:23 Present+ Laszlo_Gombos
15:01:27 lgombos3 has joined #webmachinelearning
15:01:29 Present+ Rafael_Cintron
15:01:41 Present+ Laszlo_Gombos
15:02:12 Present+ Reilly_Grant
15:02:21 Present+ Ningxin_Hu
15:02:33 Present+ Zoltan_Kis
15:02:36 ningxin has joined #webmachinelearning
15:02:50 Present+ Tarek_Ziade
15:03:43 RRSAgent, draft minutes
15:03:44 I have made the request to generate https://www.w3.org/2025/05/22-webmachinelearning-minutes.html anssik
15:03:52 anssik: please welcome our new WebML WG participants:
15:04:12 ... Jonathan Schneerson from Temporal Series AI, an AI startup specializing in time-dependent data from financial transactions, sensor streams, etc.
15:04:39 ... Peter Tanski and Suraj Bisht from Capital One Financial, a financial services company and an early adopter of forward-looking web capabilities, e.g. the Web NFC API for authentication
15:05:20 anssik: as our group is growing, at the same time, with mixed emotions, I'm sharing that one esteemed participant is taking a break from work
15:06:39 Josh: Hi! I'm departing Google and will move to another country, with no future plans yet; I've truly enjoyed working with this group and will remain contactable through personal email, IRC, etc.
15:07:09 me: inexorabletash AT gmail DOT com
15:07:46 q+
15:08:14 Josh, thanks so much for your tremendous contribution to this WG and WebNN spec!
15:08:44 ack RafaelCintron
15:08:49 anssik: thank you Josh for everything!
15:09:08 q+
15:09:12 Rafael: the spec has tremendously benefited from your work, I've learned a lot from you!
15:09:14 ack ningxin
15:09:38 ningxin: I will echo Rafael and Anssi, thank you so much for your tremendous work for this group!
15:10:02 ... you've been a coach for me as an editor, I really appreciate that; thank you, and I wish you a great next chapter!
15:10:53 Thanks Josh! It started as a great run / job together on merging a lot of algorithms, which you have single-handedly improved in many iterations. I learned a lot in that process. Thank you!
15:11:28 q?
15:11:47 Topic: Incubations
15:11:52 Thanks Zoltan and all!
15:12:06 anssik: on the agenda of our upcoming Community Group EU-APAC call on Mon 26 May we have:
15:12:23 ... Prompt API implementation experience from AiBrow
15:12:48 ... New proposals: Web AI for Time Series, (recap) Local Inference Web extension
15:12:59 ... Proofreader API kick-off
15:13:12 ... Prompt API security and privacy
15:13:17 ... See the CG agenda for more references
15:13:20 -> https://github.com/webmachinelearning/meetings/blob/main/telcons/2025-05-26-cg-agenda.md
15:14:00 Topic: Google I/O and MS Build 2025 takeaways
15:14:18 anssik: both events were (unsurprisingly) AI heavy; a few observations I think are of interest to this group:
15:14:33 Subtopic: Built-in AI APIs
15:14:44 anssik: both Edge and Chrome made announcements around the Built-in AI APIs being worked on in the WebML CG
15:14:52 "Enabled by default: Prompt API for Chrome Extensions, Summarizer API, Translator API, Language Detector API; Origin trials: Writer API, Rewriter API; Early preview: Proofreader API"
15:15:19 -> Google I/O built-in AI APIs announcement https://www.youtube.com/watch?v=GjvgtwSOCao&t=2687s
15:15:31 Winstonc has joined #webmachinelearning
15:16:06 Tarek has joined #webmachinelearning
15:16:09 Josh: the other thing demonstrated was multimodal Prompt API use, processing images and audio as inputs; that's in the early preview stage
15:17:03 "The Prompt API and Writing Assistance APIs — now available as developer previews in Edge Canary and Dev channels"
15:17:21 -> Microsoft Build built-in AI APIs announcement https://blogs.windows.com/msedgedev/2025/05/19/introducing-the-prompt-and-writing-assistance-apis/
15:17:40 anssik: notably, the Prompt API in the Edge developer preview is available to web apps and pages, not just to extensions
15:18:03 q?
15:18:35 Rafael: that's well covered Anssi, nothing to add
15:19:48 Subtopic: Windows ML
15:20:00 anssik: at Build Microsoft announced Windows ML as an evolution of DirectML
15:20:05 -> Windows ML announcement https://blogs.windows.com/windowsdeveloper/2025/05/19/introducing-windows-ml-the-future-of-machine-learning-development-on-windows/
15:20:20 anssik: this is relevant to the group from the WebNN implementation perspective, an opportunity to gather further implementation experience
15:20:26 ... based on what was announced at Build:
15:20:37 ... Windows ML promises to simplify dependency management on Windows
15:20:58 ... vendor-specific execution providers are part of Windows ML and are updated by the OS
15:21:26 RafaelCintron: that is correct, it will be ONNX Runtime based
15:24:18 Josh: the question to the group is, should we update the WebNN architecture explainer?
15:25:01 anssik: - new device selection mechanism that supports hint-based, explicit, and automatic selection
15:25:08 "WinML" is a WinRT-based wrapper atop ONNX Runtime (several years old). It's a fairly thin wrapper, plus some additional support for video frames and image conversion to input tensors. It only supported CPU and DirectML EPs.
15:25:08 "WindowsML" is a Windows-specific fork of ONNX Runtime, directly calling the ORT API (with some slight renamings in the header). It supports multiple EPs.
15:26:08 Topic: Operator specific issues
15:26:13 anssik: as usual, we focus our review and discussion on operator specific issues
15:26:17 -> [operator specific] issues https://github.com/webmachinelearning/webnn/labels/operator%20specific
15:26:22 Subtopic: int64 data type
15:26:34 anssik: I wanted us to take a look at various int64 data type related issues and PRs to check we're all aligned
15:26:39 ... issue #283 fixed by PR #646
15:26:40 https://github.com/webmachinelearning/webnn/pull/646 -> MERGED Pull Request 646 Specify the operand data type constraints of operations (by inexorabletash)
15:26:40 https://github.com/webmachinelearning/webnn/issues/283 -> CLOSED Issue 283 Specify the operand data type constraints of operation (by huningxin) [question]
15:26:47 ... introduced constraints for input operands (thanks Josh!)
15:26:58 ... issue #694 has a draft PR #695
15:26:58 https://github.com/webmachinelearning/webnn/issues/694 -> Issue 694 Consider adding int64/uint64 data type support for some reduce operators (by lisa0314) [operator specific]
15:26:59 https://github.com/webmachinelearning/webnn/pull/695 -> Pull Request 695 Bugfix: Add missing 64-bit integers support for some reduction operators (by huningxin) [operator specific]
15:27:11 ... to add int64/uint64 support for reduce ops
15:27:19 ... the PR awaits Mike's response to a question: "should we also allow optional 64-bits integers support for these reduction ops?"
15:27:50 Present+ Winston_Chen
15:28:26 Ningxin: before changing this PR to a draft due to opSupportLimits, in the last meeting we heard Microsoft's feedback on sign regarding optional support for int64
15:28:39 ... I'd propose to open this PR for review
15:28:49 a+
15:28:51 q+
15:28:56 ack reillyg
15:29:34 Reilly: at TPAC we discussed a minimum data type set implementable across all Chromium backends; this might be a sign we should commence with that
15:29:55 ... this should also allow us to clean up many wpt failures; we could hard-fail ops that do not support the minimum set
15:30:08 ... and keep some ops optional
15:30:18 [ thumbs up from Dwayne ]
15:30:34 Reilly: some ops have no overlapping data types, that is an issue
15:31:12 q+
15:31:15 ... I'm not blocking re-opening PR #695
15:31:16 https://github.com/webmachinelearning/webnn/pull/695 -> Pull Request 695 Bugfix: Add missing 64-bit integers support for some reduction operators (by huningxin) [operator specific]
15:31:17 ack ningxin
15:31:34 ningxin: so for Reilly's proposal, should we have a separate issue?
15:31:48 ... we should also consider device; CPU and GPU devices may have different data type support for the same op?
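[Editor's note: Reilly's "minimum data type set" idea above amounts to intersecting the per-backend supported type lists for an op. A minimal sketch in plain JavaScript; the backend names and type lists are purely illustrative assumptions, not actual Chromium backend capabilities.]

```javascript
// Hypothetical per-backend supported data types for one reduction op.
// These lists are illustrative only, not real backend capabilities.
const backendSupport = {
  backendA: ["float32", "float16", "int32"],
  backendB: ["float32", "float16", "int32", "int64", "uint64"],
  backendC: ["float32", "int32", "int64"],
};

// The "minimum data type set" is the intersection across all backends:
// a type survives only if every backend's list contains it.
function minimumSet(support) {
  const lists = Object.values(support);
  return lists.reduce((acc, types) => acc.filter((t) => types.includes(t)));
}

console.log(minimumSet(backendSupport)); // ["float32", "int32"]
```

Types outside the intersection (here int64/uint64) would then be the per-op optional extras that feature detection via opSupportLimits() reports.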
15:31:52 q+
15:31:57 ack RRSAgent
15:32:00 ack reillyg
15:32:27 Reilly: I guess we have to do the analysis first; I expect if we include data type we'll find there are more data types not supported across all device types
15:32:42 ... the proposal is we should consider device type selection optional
15:32:59 ... and intersections should not consider data type; that should be orthogonal
15:33:25 ... there's a separate question of how we communicate to developers that a particular device type needs to use a specific data type
15:33:46 ... if we try to consider all these things at once, we can't come up with a useful op set
15:34:10 ... what ops implementations must support is the question
15:34:31 ... this might force all implementations to support CPU and GPU always and rely on feature detection to find out NPU support
15:34:52 q+
15:34:59 ... "I prefer to run on NPU, give me the available data types for that"
15:35:32 ... we should focus on compatibility first, models that will surely execute
15:35:55 ningxin: compatibility means native framework compatibility, that is separate from the device?
15:36:00 Reilly: correct
15:36:02 ack zkis
15:36:47 zkis: I think with the Windows ML announcement we can reiterate the device selection design, hints-based vs. explicit; it seems the current hints-based design is a subset of what Windows ML supports
15:38:18 anssik: issue #845 was fixed by PR #848
15:38:19 https://github.com/webmachinelearning/webnn/pull/848 -> MERGED Pull Request 848 Bugfix: Support `int64` for `abs`, `neg`, `sign`, `prelu` and `relu` (by huningxin)
15:38:19 https://github.com/webmachinelearning/webnn/issues/845 -> CLOSED Issue 845 The allowed data types of input operand for `sign` operator should also include `int64` type (by BruceDai) [operator specific]
15:39:22 I propose to open a separate issue for Reilly's proposal of a minimum data type set if there is not an existing one
15:40:21 sg
15:40:24 Reilly: Ningxin, feel free to open an issue for this proposal
15:40:36 Subtopic: triangular
15:40:39 anssik: issue #768
15:40:40 https://github.com/webmachinelearning/webnn/issues/768 -> Issue 768 Consider removing or redesigning the `triangular` operator (by a-sully) [operator specific]
15:40:52 ... JoshuaL shared new per-model Trilu op count data (thanks!) so I wanted us to discuss this as a group
15:40:56 ... Dwayne shared his observations in the issue: "Most of these models contain just one instance of trilu, but that is a substantial percentage of model"
15:41:19 anssik: does this new data support the proposal to remove the triangular op from the spec?
15:41:23 q?
15:41:36 Dwayne: the additional data encourages keeping the triangular op
15:41:51 ... but we need to consider how many backends have support for this op
15:42:27 ... I'm inclined to keep this op for now, unless we have a better understanding of the Core ML decomposition
15:42:52 ... decomposition is possible with large triangular matrices without taking a lot of memory
15:43:48 Reilly: when fuzzing the implementation, the fuzzer was able to find huge matrices with masks, not existing in practical models
15:44:08 ... more implementation work is required to do this; we could compute the mask at inference time rather than bake it into the generated model
15:44:33 Dwayne: computable at runtime, Core ML should be able to decompose this on the fly; I have details in the issue
15:44:49 ... did you encounter inputs of big size?
15:45:11 Reilly: fuzzers do generate huge inputs
15:45:19 Dwayne: from a security perspective?
15:45:28 Reilly: correct, to identify corner cases
15:45:48 Dwayne: any reservation to support this op?
15:46:07 Reilly: no, as long as it is secure and efficient to implement
15:46:36 q?
15:46:56 Subtopic: opSupportLimits level of detail for output tensor(s)
15:47:00 anssik: issue #835
15:47:01 https://github.com/webmachinelearning/webnn/issues/835 -> Issue 835 opSupportLimits: Level of detail for output tensor(s)? (by inexorabletash) [question]
15:47:31 ... Josh explains: "there are a variety of opinions about how much detail opSupportLimits() should include for the output tensors."
15:48:16 Josh: just doing the bare minimum is one approach; understandability is the burden. The question is, in all the ops it is possible to determine the shape and data type from the algorithm, do we rely on that?
15:48:53 ... the most recent comment on whether we should include this data is from Ningxin, saying let's add it; I think this is waiting for someone to write a PR
15:49:19 anssik: the proposal from Ningxin to have ranks in output got substantive support
15:49:45 Dwayne: maybe I'm biased from other specs, but I would prefer to have output and ranks for symmetry
15:50:15 anssik: Phillis reports the actual constraints from underlying ML frameworks are:
15:50:24 ... - global tensor rank constraints
15:50:30 ... - op level input rank constraints
15:50:48 ... and concludes we have two ways to expose this:
15:50:52 ... - expose rank per op output
15:51:02 ... - represent global tensor constraint via opSupportLimits
15:51:28 I can write a PR if Josh hasn't started
15:51:36 Josh: let's spec this in opSupportLimits
15:51:50 I have no open PRs, not starting anything new right now
15:51:54 q?
15:52:01 Subtopic: Rounding
15:52:06 anssik: issue #817
15:52:06 https://github.com/webmachinelearning/webnn/issues/817 -> Issue 817 Rounding operators (by fdwr) [feature request] [interop]
15:52:12 ... this issue has extensive background research by Dwayne, thanks again!
15:52:43 ... the TL;DR: add one function that is consistent with the IEEE rounding mode, and express a decomposition for the quantizeLinear operator
15:52:57 ... the remaining open question seems to be the rounding behavior on the Core ML NPU/ANE
15:53:02 ... per Ningxin's experiment the rounding behavior is inconsistent between ANE/NPU and CPU
15:53:14 ... ANE/NPU uses rounding away from zero
15:53:18 ... Dwayne asked whether it is round *half* away from zero (RHAZ)
15:53:22 -> RAZ https://en.wikipedia.org/wiki/Rounding#Rounding_away_from_zero
15:53:27 -> RHAZ https://en.wikipedia.org/wiki/Rounding#Rounding_half_away_from_zero
15:53:33 anssik: do we know which it is?
15:53:57 ... is the proposal to add the round operator with a note to implementers that they should emulate this due to the RAZ/RHAZ inconsistency between CPU and NPU?
15:54:30 Dwayne: Ningxin probably meant RHAZ too; this is low-level, and Core ML is the only one that has a potential issue
15:54:45 ... it's possible to emulate this with only 2 ops
15:55:16 anssik: hearing no concerns to add this op, with a performant emulation path
15:55:21 ... any comments?
15:55:29 Dwayne: thanks for your research!
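[Editor's note: to make the RAZ/RHAZ distinction above concrete, here is a small sketch in plain JavaScript. Note that `Math.round` itself rounds halves toward positive infinity, so RHAZ must be emulated; this only illustrates the rounding modes and is not the spec's decomposition.]

```javascript
// Round half away from zero (RHAZ): only exact halves round outward.
// 2.5 -> 3, -2.5 -> -3, while 2.1 -> 2.
function roundHalfAwayFromZero(x) {
  return Math.sign(x) * Math.round(Math.abs(x));
}

// Round away from zero (RAZ): ANY nonzero fraction rounds outward.
// 2.1 -> 3, -2.1 -> -3.
function roundAwayFromZero(x) {
  return Math.sign(x) * Math.ceil(Math.abs(x));
}

console.log(roundHalfAwayFromZero(-2.5)); // -3
console.log(Math.round(-2.5));            // -2 (half toward +Infinity)
console.log(roundAwayFromZero(2.1));      // 3
console.log(roundHalfAwayFromZero(2.1));  // 2
```

The -2.5 case is exactly where the modes diverge, which is why pinning down whether ANE/NPU does RAZ or RHAZ matters for interop.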
15:55:48 Dwayne: I'll do the PR
15:55:55 Thanks Dwayne
15:56:25 Subtopic: isNaN op proposal
15:56:29 anssik: issue #811
15:56:30 https://github.com/webmachinelearning/webnn/issues/811 -> Issue 811 Behavior when there are NaNs in argmin/max inputs (by philloooo) [interop]
15:57:01 Dwayne: several backends have this op; I guess this would benefit from a PR
15:58:06 Topic: Caching mechanism for MLGraph
15:58:12 anssik: issue #807
15:58:13 https://github.com/webmachinelearning/webnn/issues/807 -> Issue 807 Caching mechanism for MLGraph (by anssiko) [question] [feature request]
15:58:23 ... with Reilly here, I wanted to revisit the prototype implementation findings to reinvigorate work on the explainer
15:58:28 ... we have an initial implementation based on Chromium and ORT, and sample code on how this integrates into an existing sample
15:58:35 ... there's also a Chromium Design Doc, but I'm not sure if that has been shared with the group yet?
15:58:55 ... I recall Reilly commented he'd take a stab at the explainer based on the implementation
15:59:34 Reilly: I recall offering to write the explainer; I recall Mike already put something out there
15:59:44 ... I can take an action to write this explainer
16:00:06 ... as for implementation experience, there's the Chromium Design Doc; Ningxin, do you have updates on the prototype?
16:01:30 ningxin: I'll ensure the Chromium + ORT based Design Doc is public; for prototype status, we saved the compiled model using the ORT compiled model; we haven't made it work with the GPU process yet
16:01:57 ... based on the API sketch proposed by Reilly we experimented with ORT Web integration and would like to get more experience on how an AI framework can utilize this feature
16:02:13 ... we can share an early prototype of that with the group
16:02:38 ... even if not in a real model cache storage managed by the browser process, with saving to disk we got a good performance gain
16:02:52 ... we're also discussing with ORT people how to reduce memory and disk overhead
16:03:50 ... Reilly's proposal separates the build and save operations; the source model must be kept after build to allow saving the graph later, or saved to a temporary place on disk
16:04:14 ... this seems not very ideal; a new idea from Rafael could help overcome that issue
16:04:31 Rafael: to recap, once you create a session from model building, a key piece of information is not present
16:04:52 ... later you may want to save the model; Ningxin proposes to keep the information to allow saving later
16:05:04 ... or have "build and save" at the same time
16:05:20 ... to use memory efficiently
16:05:21 q?
16:06:23 Reilly: the design I made was based on how the Core ML and TFLite backends work; the model has to remain on disk
16:06:46 ... I guess the question to Rafael re the ORT implementation is, is there a change to the design that makes this more efficient?
16:07:15 Rafael: yes, it would help to force developers to decide at build time whether to save at the same time
16:07:33 ... if we get more feedback from developers, we can decide whether it is a MUST requirement
16:08:01 Reilly: the cost of saving and deleting is minimal; if the system forces doing both at the same time and deleting the file later, that would be reasonable
16:08:18 Rafael: the cost of keeping around the data that may be needed later is the question
16:08:35 Reilly: I guess the answer is no, per Ningxin's work
16:09:24 Topic: Query supported devices before graph compilation
16:09:29 anssik: issue #815
16:09:30 https://github.com/webmachinelearning/webnn/issues/815 -> Issue 815 Query supported devices before graph compilation (by anssiko) [device selection]
16:09:45 zkis: I will update the explainer and submit a PR for the group to review
16:10:00 RRSAgent, draft minutes
16:10:02 I have made the request to generate https://www.w3.org/2025/05/22-webmachinelearning-minutes.html anssik
18:23:07 Zakim has left #webmachinelearning
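[Editor's note: as an aside on the triangular discussion above, trilu-style lower-triangle retention with a diagonal offset `k` (following ONNX Trilu conventions) can be computed at runtime without materializing a separate mask tensor. A minimal plain-JS sketch, not any backend's actual decomposition.]

```javascript
// Lower-triangular trilu for a rows x cols matrix stored flat in
// row-major order. `k` is the diagonal offset: elements where
// (col - row) <= k are kept, everything else is zeroed.
function trilLower(data, rows, cols, k = 0) {
  const out = new Float32Array(rows * cols); // zero-initialized
  for (let i = 0; i < rows; i++) {
    for (let j = 0; j < cols; j++) {
      if (j - i <= k) out[i * cols + j] = data[i * cols + j];
    }
  }
  return out;
}

const m = Float32Array.from([1, 2, 3, 4, 5, 6, 7, 8, 9]);
console.log(Array.from(trilLower(m, 3, 3))); // [1, 0, 0, 4, 5, 0, 7, 8, 9]
```

Because the keep/zero decision is an index comparison, the mask never needs to exist as a tensor, which is the property that makes huge fuzzer-generated shapes tractable.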