14:52:49 <RRSAgent> RRSAgent has joined #webmachinelearning
14:52:54 <RRSAgent> logging to https://www.w3.org/2024/11/14-webmachinelearning-irc
14:52:54 <Zakim> RRSAgent, make logs Public
14:52:55 <Zakim> please title this meeting ("meeting: ..."), anssik
14:52:59 <anssik> Meeting: WebML WG Teleconference – 14 November 2024
14:53:06 <anssik> Chair: Anssi
14:53:20 <anssik> Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2024-11-14-wg-agenda.md
14:53:27 <anssik> Scribe: Anssi
14:53:32 <anssik> scribeNick: anssik
14:55:42 <McCool> McCool has joined #webmachinelearning
14:56:02 <anssik> Present+ Anssi_Kostiainen
14:56:05 <anssik> Present+ Michael_McCool
14:56:22 <anssik> RRSAgent, draft minutes
14:56:23 <RRSAgent> I have made the request to generate https://www.w3.org/2024/11/14-webmachinelearning-minutes.html anssik
14:59:52 <anssik> Present+ Dwayne_Robinson
14:59:59 <Mike_Wyrzykowski> Mike_Wyrzykowski has joined #webmachinelearning
15:00:53 <anssik> Present+ Mike_Wyrzykowski
15:01:08 <anssik> Present+ Bryan_Bernhart
15:01:17 <anssik> Present+ Joshua_Bell
15:01:21 <jsbell> jsbell has joined #webmachinelearning
15:01:29 <anssik> Present+ Zoltan_Kis
15:01:30 <ningxin> ningxin has joined #webmachinelearning
15:01:33 <dwayner> dwayner has joined #webmachinelearning
15:01:47 <anssik> Present+ Rafael_Cintron
15:01:52 <anssik> Present+ Ningxin_Hu
15:02:13 <anssik> Present+ Christian_Liebel
15:02:36 <RafaelCintron> RafaelCintron has joined #webmachinelearning
15:02:55 <anssik> Present+ Austin_Sullivan
15:03:50 <anssik> RRSAgent, draft minutes
15:03:51 <RRSAgent> I have made the request to generate https://www.w3.org/2024/11/14-webmachinelearning-minutes.html anssik
15:03:57 <zkis> zkis has joined #webmachinelearning
15:04:24 <anssik> ... Welcome to our new participant Kaushik Satpathy from Yahoo!
15:04:43 <anssik> Topic: Announcements
15:04:51 <anssik> anssik: implementation status of WebNN has been updated, thank you all for the great progress!
15:04:56 <anssik> -> Implementation Status of WebNN https://webmachinelearning.github.io/webnn-status/
15:05:09 <anssik> -> https://github.com/webmachinelearning/webmachinelearning.github.io/pull/84
15:05:10 <gb> https://github.com/webmachinelearning/webmachinelearning.github.io/pull/84 -> MERGED Pull Request 84 November update for Impl Status (by ibelem)
15:05:28 <anssik> anssik: also Awesome WebNN, a curated list of awesome things related to the WebNN API, has received updates
15:05:31 <anssik> -> Awesome WebNN https://github.com/webmachinelearning/awesome-webnn
15:05:35 <anssik> -> https://github.com/webmachinelearning/awesome-webnn/pull/11
15:05:36 <gb> https://github.com/webmachinelearning/awesome-webnn/pull/11 -> MERGED Pull Request 11 November 2024 Update (by ibelem)
15:06:08 <anssik> anssik: please share this reference with people interested in this topic for the latest articles, demos, presentations, samples, tutorials, videos and more about WebNN and the ecosystem around it
15:06:44 <anssik> Topic: Call for review: WebML Community Group Charter update
15:06:53 <anssik> gb, this is webmachinelearning/charter
15:06:53 <gb> anssik, OK.
15:07:02 <anssik> anssik: on 2024-11-01 we initiated a call for review of the Web Machine Learning Community Group Charter update, open until 2024-12-02
15:07:08 <anssik> ... folks who are also WebML Community Group participants are encouraged to review the Charter proposal
15:07:14 <anssik> ... to be eligible to vote, you must be a CG participant
15:07:18 <anssik> -> WebML CG Charter update, vote by 2024-12-02 https://lists.w3.org/Archives/Public/public-webmachinelearning/2024Nov/0000.html
15:07:27 <anssik> -> How to join the CG: https://webmachinelearning.github.io/community/#join
15:07:49 <anssik> ... Summary of changes: refresh Goals, add Task-specific APIs and Prompt API to Deliverables, note WebNN has graduated to the WG
15:07:57 <anssik> ... for more information about the proposed task-specific APIs, please refer to the TPAC 2024 presentation:
15:08:01 <anssik> -> TPAC 2024 slides for task-specific APIs https://lists.w3.org/Archives/Public/www-archive/2024Sep/att-0008/TPAC_2024_Built-in_AI_APIs.pdf
15:08:04 <anssik> anssik: and the GH repos for the proposals:
15:08:07 <christianliebel5> christianliebel5 has joined #webmachinelearning
15:08:07 <anssik> -> https://github.com/WICG/translation-api
15:08:10 <anssik> -> https://github.com/WICG/writing-assistance-apis
15:08:16 <anssik> -> https://github.com/explainers-by-googlers/prompt-api
15:08:23 <anssik> anssik: any questions?
15:08:52 <anssik> Topic: Device selection abstractions
15:08:57 <anssik> gb, this is webmachinelearning/webnn
15:08:57 <gb> anssik, OK.
15:09:17 <anssik> anssik: Zoltan synthesized the group's current thinking and discussions into a device selection explainer, thanks!
15:09:26 <anssik> ... issue #749 and PR #784
15:09:26 <gb> https://github.com/webmachinelearning/webnn/issues/749 -> Issue 749 MLContextOptions.deviceType seems unnecessary outside of conformance testing (by mwyrzykowski) [device selection]
15:09:26 <gb> https://github.com/webmachinelearning/webnn/pull/784 -> Pull Request 784 Add device selection explainer (WiP) (by zolkis)
15:09:58 <anssik> ... it discusses intro, history, key use cases and requirements, considered alternatives, examples and design
15:10:02 <anssik> ... and open questions
15:10:43 <anssik> ... the doc is written so that we can hand this to folks outside this group for review, e.g. privacy, TAG etc.
15:11:14 <anssik> zolkis: this is a collection of thoughts from GH issues, it is WIP still
15:11:41 <anssik> ... Ningxin and Chai contributed in the early phase, Rafael, Joshua, MikeW
15:12:04 <anssik> -> Explainer https://github.com/webmachinelearning/webnn/blob/6f73ebb38a4aa4670805cdc7e88eeb6223b387fe/explainer-device-selection.md
15:12:53 <anssik> zolkis: MikeW's proposal is in considered alternatives, the fingerprinting story is the remaining concern
15:13:16 <anssik> ... we can possibly go with Mike's proposal if we give more examples
15:14:07 <anssik> anssik: considered alternatives:
15:14:11 <anssik> ... 1. Keep the current MLDeviceType as a context option, but improve the device type names
15:14:37 <Mike_Wyrzykowski> q+
15:14:39 <anssik> ... 2. Follow the proposal that MLOpSupportLimits should be opt-in, per #759
15:14:40 <gb> https://github.com/webmachinelearning/webnn/issues/759 -> Issue 759 MLOpSupportLimits should be opt-in with base functionality (by mwyrzykowski) [device selection]
15:14:48 <anssik> q?
15:14:51 <anssik> ack Mike_Wyrzykowski
15:15:13 <anssik> MikeW: great document, thanks for writing this! Option 1 is simpler, could flesh that out right now
15:16:20 <jsbell> jsbell has joined #webmachinelearning
15:16:22 <anssik> zolkis: listing of opLimits is inside the context, if we move it out then we know what the underlying platform is capable of, can match with the model to run
15:16:49 <anssik> ... we need more concrete examples for Option 2, and to decide whether to go with the full WebGPU adapter approach
15:17:15 <anssik> ... Option 2 could come after Option 1
15:17:25 <anssik> q?
15:17:25 <AramZS> AramZS has joined #webmachinelearning
15:18:36 <anssik> Dwayne: would Option 1 be relaxing the device type to be a hint?
15:19:22 <anssik> zolkis: powerPreference would provide the hint that the implementation could use to map to underlying processing unit(s), or we could rename the MLDeviceType
15:21:05 <anssik> zolkis: let's solicit more use cases to have a complete view
15:21:36 <zkis> q?
15:22:51 <anssik> anssik: for feedback, please use the PR #784
15:22:51 <gb> https://github.com/webmachinelearning/webnn/pull/784 -> Pull Request 784 Add device selection explainer (WiP) (by zolkis)
15:23:43 <anssik> Topic: MLTensor
15:23:50 <anssik> anssik: The group is gathering implementation experience on the MLTensor design to inform an upcoming specification update.
15:23:57 <anssik> ... The explainer is considered the source of truth in this prototyping phase.
15:24:11 <anssik> ... I'd like to discuss the open questions, currently 5
15:24:36 <anssik> -> https://github.com/webmachinelearning/webnn/blob/main/mltensor-explainer.md#open-questions
15:25:21 <anssik> Austin: not blocking forward progress, the first bullet has come up in the Chromium implementation, we deprecated compute() for dispatch(), and if there's an error you don't find out about it
15:25:29 <anssik> Subtopic: How will errors be surfaced?
15:25:33 <anssik> anssik: issue #477
15:25:34 <gb> https://github.com/webmachinelearning/webnn/issues/477 -> CLOSED Issue 477 API lacks handling for async ML device errors on the context (by bbernhar) [question]
15:26:58 <anssik> Bryan: want to understand how backends can surface errors midway?
15:27:33 <anssik> Austin: for many backends peak memory usage can exceed what's available, you could OOM while inferencing
15:28:02 <anssik> ... seeing failures on some backends because we have implementation gaps, a few classes of errors
15:28:47 <anssik> ... e.g. trying to allocate too much memory, or the model file is deleted from disk: you compile the model and think everything is good, but things change underneath
15:29:37 <anssik> ... or if you do scatter or gather with OOB indices
15:30:34 <jsbell> https://github.com/webmachinelearning/webnn/issues/778 has Austin's observations about types of errors
15:30:35 <gb> https://github.com/webmachinelearning/webnn/issues/778 -> Issue 778 Proposal: Report non-fatal errors from the WebNN timeline (by a-sully) [feature request]
15:31:15 <anssik> Austin: CoreML backend may have higher peak memory usage during inferencing than after compile
15:31:37 <anssik> Bryan: writeTensor() has the same issues as dispatch()
15:32:56 <anssik> (this is expanded in issue #778)
15:32:56 <gb> https://github.com/webmachinelearning/webnn/issues/778 -> Issue 778 Proposal: Report non-fatal errors from the WebNN timeline (by a-sully) [feature request]
15:33:53 <anssik> q?
15:34:37 <anssik> Topic: Core op set & MLIR Linalg mapping
15:34:49 <anssik> anssik: issue #573
15:34:50 <gb> https://github.com/webmachinelearning/webnn/issues/573 -> Issue 573 Core operator set (by philloooo) [question] [opset]
15:35:25 <anssik> ... this topic is to discuss the core op set, primitive ops informed by MLIR Linalg, PyTorch Prims IR, TOSA, StableHLO and others.
15:35:39 <anssik> ... I propose we look at the MLIR Linalg mapping today
15:35:45 <anssik> ... Dwayne contributed a preliminary analysis of op correspondence (thanks!):
15:35:50 <anssik> -> Machine Learning Operator Mapping https://onedrive.live.com/edit?id=EE82F5C6F06C7371!345450&resid=EE82F5C6F06C7371!345450&ithint=file%2cxlsx&authkey=!AK8f-RDTleqlLXE&wdo=2&cid=ee82f5c6f06c7371
15:36:20 <anssik> anssik: Dwayne notes WebNN demonstrates viability of popular models, but it lacks breadth
15:36:31 <jsbell> jsbell has joined #webmachinelearning
15:37:18 <anssik> ... implementing all the 800+ ops is untenable due to interop requirements (multiple browsers, multiple underlying platforms and backends), this is why we are investigating what makes for an appropriate set of primitive ops to allow composition
15:38:01 <anssik> Dwayne: no firm recommendations, but some categories that are absent
15:38:28 <anssik> ... WebNN backend support 1D to 3D
15:38:46 <anssik> ... modular div, rounding, bitwise
15:39:10 <anssik> ... composite ops, lego blocks for decomposition of other ops, e.g. sumPooling
15:40:17 <anssik> ... sheet legend: yellow = absent; red = not interesting, it's a named variant
15:41:15 <jsbell> q+
15:41:40 <anssik> Dwayne: yellow = worth adding to WebNN
15:41:43 <anssik> ack jsbell
15:42:00 <anssik> Joshua: have you done an analysis of which backends in Chromium support these?
15:42:11 <anssik> Dwayne: it is future work
15:42:48 <anssik> Joshua: do we look at the current backends, or look at primitive core ops on top of which everything can be constructed?
15:43:32 <anssik> q?
15:44:29 <anssik> anssik: how could the group help?
15:44:41 <anssik> Dwayne: CoreML backend and TFLite support for these would be welcome
15:45:03 <anssik> q?
15:48:02 <anssik> Topic: Support reverse operator
15:48:07 <anssik> anssik: issue #773
15:48:07 <gb> https://github.com/webmachinelearning/webnn/issues/773 -> Issue 773 Support `reverse` operator (by huningxin) [feature request] [operator specific]
15:48:11 <anssik> ... Ningxin proposed a reverse op that reverses the order of the input tensor along specified axes
15:48:15 <anssik> ... improves performance of PyTorch models
15:48:19 <anssik> ... framework support: PT, TF, ONNX (via reverse slicing with step -1)
15:48:54 <anssik> ... native APIs: DML (similarly to ONNX), CoreML, TFLite
15:49:02 <anssik> ... also in primitive opsets StableHLO, TOSA, PT Prims
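The semantics described above (reversing element order along specified axes) can be illustrated with a minimal reference sketch. This is not the WebNN API or the Chromium prototype; the function name `reverse`, its signature, and the flat row-major array layout are assumptions made purely for illustration.

```typescript
// Hypothetical reference sketch of "reverse" on a flat row-major tensor.
// Not the WebNN API: name, signature and data layout are illustrative assumptions.
function reverse(data: number[], shape: number[], axes: number[]): number[] {
  const rank = shape.length;
  // Row-major strides: strides[i] is the product of shape[i+1..rank-1].
  const strides = new Array<number>(rank).fill(1);
  for (let i = rank - 2; i >= 0; i--) {
    strides[i] = strides[i + 1] * shape[i + 1];
  }
  const flip = new Set(axes);
  const out = new Array<number>(data.length);
  for (let dst = 0; dst < data.length; dst++) {
    let rem = dst;
    let src = 0;
    for (let d = 0; d < rank; d++) {
      // Recover the coordinate along dimension d, mirror it if d is reversed.
      const idx = Math.floor(rem / strides[d]);
      rem -= idx * strides[d];
      src += (flip.has(d) ? shape[d] - 1 - idx : idx) * strides[d];
    }
    out[dst] = data[src];
  }
  return out;
}

// Usage: a 2x3 tensor reversed along axis 1 gives [3, 2, 1, 6, 5, 4].
const reversedRows = reverse([1, 2, 3, 4, 5, 6], [2, 3], [1]);
```

Reversing along every axis is equivalent to reversing the flat buffer, which is a quick sanity check for any backend implementation.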
15:49:15 <RafaelCIntron> RafaelCIntron has joined #webmachinelearning
15:50:02 <anssik> Ningxin: we have Chromium CL to prototype
15:50:57 <anssik> Subtopic: Support strides option for slice operator
15:51:08 <anssik> anssik: issue #772
15:51:09 <gb> https://github.com/webmachinelearning/webnn/issues/772 -> Issue 772 Support strides option for `slice` operator (by huningxin) [feature request] [operator specific]
15:51:35 <anssik> ... Ningxin reports the strides option for the slice operator is widely supported, but WebNN's flavour only supports a stride of 1
15:51:38 <anssik> ... real-world models with stride > 1 cause WebNN to fall back to another EP, impacting performance
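To make the difference between a stride of 1 and a stride > 1 concrete, here is a minimal reference sketch of a strided slice over a flat row-major tensor. It is not the WebNN API; the name `sliceWithStrides`, the parameter order, and the output-size rule ceil(size / stride) are assumptions for illustration and may differ from what issue #772 settles on.

```typescript
// Hypothetical reference sketch of slice with per-dimension strides.
// Assumption: the window is given by starts/sizes, and each dimension
// keeps every strides[d]-th element, so outShape[d] = ceil(sizes[d] / strides[d]).
function sliceWithStrides(
  data: number[],
  shape: number[],
  starts: number[],
  sizes: number[],
  strides: number[],
): { data: number[]; shape: number[] } {
  const rank = shape.length;
  // Row-major strides of the input tensor.
  const inStrides = new Array<number>(rank).fill(1);
  for (let i = rank - 2; i >= 0; i--) {
    inStrides[i] = inStrides[i + 1] * shape[i + 1];
  }
  const outShape = sizes.map((size, d) => Math.ceil(size / strides[d]));
  const total = outShape.reduce((a, b) => a * b, 1);
  const out = new Array<number>(total);
  const idx = new Array<number>(rank).fill(0); // current output coordinate
  for (let n = 0; n < total; n++) {
    let src = 0;
    for (let d = 0; d < rank; d++) {
      src += (starts[d] + idx[d] * strides[d]) * inStrides[d];
    }
    out[n] = data[src];
    // Advance the output coordinate like an odometer, last dimension fastest.
    for (let d = rank - 1; d >= 0; d--) {
      idx[d] += 1;
      if (idx[d] < outShape[d]) break;
      idx[d] = 0;
    }
  }
  return { data: out, shape: outShape };
}

// Usage: from a 3x4 tensor holding 0..11, take rows 0 and 2, columns 1 and 3.
const input = Array.from({ length: 12 }, (_, i) => i);
const strided = sliceWithStrides(input, [3, 4], [0, 1], [3, 3], [2, 2]);
```

With all strides set to 1 the sketch reduces to the plain starts/sizes slice that WebNN supports today.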
15:52:05 <anssik> anssik: can we link to some sample models in this issue?
15:52:49 <anssik> Ningxin: the model is a transformer-based model with some customization, possibly not shareable yet
15:53:01 <anssik> s/Subtopic: Support/Topic: Support
15:54:58 <anssik> Dwayne: it is an audio model, I can share that it is for speech recognition usage, consider it a Whisper variant or sorts
15:55:34 <anssik> Topic: Support block-wise quantization
15:55:38 <anssik> anssik: issue #779
15:55:38 <gb> https://github.com/webmachinelearning/webnn/issues/779 -> Issue 779 Support block-wise quantization (by huningxin) [operator specific]
15:55:45 <anssik> anssik: request to support block-wise quantization
15:56:08 <anssik> ... allows input tensors to be divided into smaller independently quantized blocks, used by SLMs
15:56:19 <anssik> ... benefits include faster optimization and high precision quantization
15:56:31 <anssik> ... DML and CoreML support it, it seems TFLite/LiteRT does not?
15:56:35 <anssik> ... Dwayne suggests a decomp path is viable?
15:56:48 <anssik> ... no API signature changes, only changes to the algorithm
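The idea of independently quantized blocks can be sketched as a block-wise dequantization over a 2-D tensor, where each block has its own scale and zero point. This is a minimal illustration, not the WebNN algorithm or the Chromium prototype; the name `dequantizeBlockwise` is hypothetical, and inferring the block size from the ratio of input shape to scale shape is an assumption consistent with "no API signature changes".

```typescript
// Hypothetical sketch of block-wise dequantization for a 2-D row-major tensor.
// Assumption: the block size is inferred from inputShape / scaleShape, so the
// scale and zeroPoint tensors carry one entry per block.
function dequantizeBlockwise(
  input: number[],
  inputShape: [number, number],
  scale: number[],
  zeroPoint: number[],
  scaleShape: [number, number],
): number[] {
  const [rows, cols] = inputShape;
  const [sRows, sCols] = scaleShape;
  if (rows % sRows !== 0 || cols % sCols !== 0) {
    throw new Error("scale shape must evenly divide input shape");
  }
  const blockRows = rows / sRows;
  const blockCols = cols / sCols;
  const out = new Array<number>(rows * cols);
  for (let r = 0; r < rows; r++) {
    for (let c = 0; c < cols; c++) {
      // Index of the block that element (r, c) belongs to.
      const b = Math.floor(r / blockRows) * sCols + Math.floor(c / blockCols);
      out[r * cols + c] = (input[r * cols + c] - zeroPoint[b]) * scale[b];
    }
  }
  return out;
}

// Usage: a 2x4 tensor split into two 2x2 blocks, each with its own scale and zero point.
const dequantized = dequantizeBlockwise(
  [1, 2, 3, 4, 5, 6, 7, 8], [2, 4],
  [0.5, 2], [1, 0], [1, 2],
);
```

Per-tensor quantization is the special case where the scale shape is [1, 1], which is why a decomposition path on backends without native block-wise support looks plausible.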
15:57:54 <anssik> Ningxin: we have a prototype in Chromium for DML backend and we successfully used that to enable Phi3-mini with this capability
15:58:26 <anssik> ... this prototype is successful
15:58:29 <jsbell> q+
15:58:33 <anssik> ack jsbell
15:58:56 <anssik> Joshua: would be great if you could share the performance improvements in the issue, e.g. "X times faster"
15:59:24 <anssik> Ningxin: for TF we need a composition, can file an issue for TFLite
15:59:36 <anssik> ... I can follow up on that
15:59:39 <dwayner> dwayner has joined #webmachinelearning
15:59:55 <anssik> ... for CoreML, Austin can comment
16:00:26 <anssik> Austin: more constraints than DML, block-wise quant is theoretically supportable
16:00:57 <anssik> Ningxin: Phi3-mini uses block-wise quantization and will hit the CoreML implementation
16:01:10 <dwayner> The memory savings were huge (I forget the exact numbers before/after, but IIRC 20GB's before o_o).
16:01:14 <anssik> q?
16:01:20 <jsbell> If any TPAC attendees can answer the question in https://github.com/webmachinelearning/webnn/issues/470#issuecomment-2475325615 please chime in
16:01:21 <gb> https://github.com/webmachinelearning/webnn/issues/470 -> Issue 470 Simplify `matmul` op (by huningxin) [operator specific]
16:02:19 <anssik> RRSAgent, draft minutes
16:02:21 <RRSAgent> I have made the request to generate https://www.w3.org/2024/11/14-webmachinelearning-minutes.html anssik
16:08:26 <anssik> s/CoreML backend and/investigation on CoreML backend and
16:09:33 <anssik> s/or sorts/of sorts
16:11:17 <anssik> RRSAgent, draft minutes
16:11:19 <RRSAgent> I have made the request to generate https://www.w3.org/2024/11/14-webmachinelearning-minutes.html anssik
18:10:30 <Zakim> Zakim has left #webmachinelearning
19:14:35 <gb> gb has joined #webmachinelearning