14:52:49 RRSAgent has joined #webmachinelearning
14:52:54 logging to https://www.w3.org/2024/11/14-webmachinelearning-irc
14:52:54 RRSAgent, make logs Public
14:52:55 please title this meeting ("meeting: ..."), anssik
14:52:59 Meeting: WebML WG Teleconference – 14 November 2024
14:53:06 Chair: Anssi
14:53:20 Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2024-11-14-wg-agenda.md
14:53:27 Scribe: Anssi
14:53:32 scribeNick: anssik
14:55:42 McCool has joined #webmachinelearning
14:56:02 Present+ Anssi_Kostiainen
14:56:05 Present+ Michael_McCool
14:56:22 RRSAgent, draft minutes
14:56:23 I have made the request to generate https://www.w3.org/2024/11/14-webmachinelearning-minutes.html anssik
14:59:52 Present+ Dwayne_Robinson
14:59:59 Mike_Wyrzykowski has joined #webmachinelearning
15:00:53 Present+ Mike_Wyrzykowski
15:01:08 Present+ Bryan_Bernhart
15:01:17 Present+ Joshua_Bell
15:01:21 jsbell has joined #webmachinelearning
15:01:29 Present+ Zoltan_Kis
15:01:30 ningxin has joined #webmachinelearning
15:01:33 dwayner has joined #webmachinelearning
15:01:47 Present+ Rafael_Cintron
15:01:52 Present+ Ningxin_Hu
15:02:13 Present+ Christian_Liebel
15:02:36 RafaelCintron has joined #webmachinelearning
15:02:55 Present+ Austin_Sullivan
15:03:50 RRSAgent, draft minutes
15:03:51 I have made the request to generate https://www.w3.org/2024/11/14-webmachinelearning-minutes.html anssik
15:03:57 zkis has joined #webmachinelearning
15:04:24 ... Welcome to our new participant Kaushik Satpathy from Yahoo!
15:04:43 Topic: Announcements
15:04:51 anssik: implementation status of WebNN has been updated, thank you all for the great progress!
15:04:56 -> Implementation Status of WebNN https://webmachinelearning.github.io/webnn-status/
15:05:09 -> https://github.com/webmachinelearning/webmachinelearning.github.io/pull/84
15:05:10 https://github.com/webmachinelearning/webmachinelearning.github.io/pull/84 -> MERGED Pull Request 84 November update for Impl Status (by ibelem)
15:05:28 anssik: also Awesome WebNN, a curated list of awesome things related to the WebNN API, has received updates
15:05:31 -> Awesome WebNN https://github.com/webmachinelearning/awesome-webnn
15:05:35 -> https://github.com/webmachinelearning/awesome-webnn/pull/11
15:05:36 https://github.com/webmachinelearning/awesome-webnn/pull/11 -> MERGED Pull Request 11 November 2024 Update (by ibelem)
15:06:08 anssik: please share this reference with people interested in this topic for the latest articles, demos, presentations, samples, tutorials, videos and more about WebNN and the ecosystem around it
15:06:44 Topic: Call for review: WebML Community Group Charter update
15:06:53 gb, this is webmachinelearning/charter
15:06:53 anssik, OK.
15:07:02 anssik: on 2024-11-01 we initiated a call for review of the Web Machine Learning Community Group Charter update, open until 2024-12-02
15:07:08 ... folks who are also WebML Community Group participants are encouraged to review the Charter proposal
15:07:14 ... to be eligible to vote, you must be a CG participant
15:07:18 -> WebML CG Charter update, vote by 2024-12-02 https://lists.w3.org/Archives/Public/public-webmachinelearning/2024Nov/0000.html
15:07:27 -> How to join the CG: https://webmachinelearning.github.io/community/#join
15:07:49 ... Summary of changes: refresh Goals, add Task-specific APIs and Prompt API to Deliverables, note WebNN has graduated to the WG
15:07:57 ... for more information about the proposed task-specific APIs, please refer to the TPAC 2024 presentation:
15:08:01 -> TPAC 2024 slides for task-specific APIs https://lists.w3.org/Archives/Public/www-archive/2024Sep/att-0008/TPAC_2024_Built-in_AI_APIs.pdf
15:08:04 anssik: and the GH repos for the proposals:
15:08:07 christianliebel5 has joined #webmachinelearning
15:08:07 -> https://github.com/WICG/translation-api
15:08:10 -> https://github.com/WICG/writing-assistance-apis
15:08:16 -> https://github.com/explainers-by-googlers/prompt-api
15:08:23 anssik: any questions?
15:08:52 Topic: Device selection abstractions
15:08:57 gb, this is webmachinelearning/webnn
15:08:57 anssik, OK.
15:09:17 anssik: Zoltan synthesized the group's current thinking and discussions into a device selection explainer, thanks!
15:09:26 ... issues #749 and PR #784
15:09:26 https://github.com/webmachinelearning/webnn/issues/749 -> Issue 749 MLContextOptions.deviceType seems unnecessary outside of conformance testing (by mwyrzykowski) [device selection]
15:09:26 https://github.com/webmachinelearning/webnn/pull/784 -> Pull Request 784 Add device selection explainer (WiP) (by zolkis)
15:09:58 ... it discusses intro, history, key use cases and requirements, considered alternatives, examples and design
15:10:02 ... and open questions
15:10:43 ... the doc is written so that we can hand this to folks outside this group for review, e.g. privacy, TAG etc.
15:11:14 zolkis: this is a collection of thoughts from GH issues, it is WIP still
15:11:41 ... Ningxin and Chai contributed in the early phase, Rafael, Joshua, MikeW
15:12:04 -> Explainer https://github.com/webmachinelearning/webnn/blob/6f73ebb38a4aa4670805cdc7e88eeb6223b387fe/explainer-device-selection.md
15:12:53 zolkis: MikeW's proposal is in considered alternatives, the fingerprinting story is the remaining concern
15:13:16 ... we can possibly go with Mike's proposal if we give more examples
15:14:07 anssik: considered alternatives:
15:14:11 ... 1. Keep the current MLDeviceType as a context option, but improve the device type names
15:14:37 q+
15:14:39 ... 2. Follow this proposal MLOpSupportLimits should be opt-in per #759
15:14:40 https://github.com/webmachinelearning/webnn/issues/759 -> Issue 759 MLOpSupportLimits should be opt-in with base functionality (by mwyrzykowski) [device selection]
15:14:48 q?
15:14:51 ack Mike_Wyrzykowski
15:15:13 MikeW: great document, thanks for writing this! Option 1 is simpler, could flesh that out right now
15:16:20 jsbell has joined #webmachinelearning
15:16:22 zolkis: listing of opLimits is inside the context, if we move it out then we know what the underlying platform is capable of, can match with the model to run
15:16:49 ... we need more concrete examples for Option 2, whether to go with the full WebGPU adapter approach
15:17:15 ... Option 2 could come after Option 1
15:17:25 q?
15:17:25 AramZS has joined #webmachinelearning
15:18:36 Dwayne: would Option 1 be relaxing the device type to be a hint?
15:19:22 zolkis: PowerPerformance would provide the hint that the implementation could use to map to underlying processing unit(s), or could rename the MLDeviceType
15:21:05 zolkis: let's solicit more use cases to have a complete view
15:21:36 q?
15:22:51 anssik: for feedback, please use the PR #784
15:22:51 https://github.com/webmachinelearning/webnn/pull/784 -> Pull Request 784 Add device selection explainer (WiP) (by zolkis)
15:23:43 Topic: MLTensor
15:23:50 anssik: The group is gathering implementation experience on the MLTensor design to inform an upcoming specification update.
15:23:57 ... The explainer is considered the source of truth in this prototyping phase.
15:24:11 ... I'd like to discuss the open questions, currently 5
15:24:36 -> https://github.com/webmachinelearning/webnn/blob/main/mltensor-explainer.md#open-questions
15:25:21 Austin: not blocking forward progress, the first bullet has come up in the Chromium implementation, we deprecated compute() for dispatch(), and if there's an error you don't find out about it
15:25:29 Subtopic: How will errors be surfaced?
15:25:33 anssik: issue #477
15:25:34 https://github.com/webmachinelearning/webnn/issues/477 -> CLOSED Issue 477 API lacks handling for async ML device errors on the context (by bbernhar) [question]
15:26:58 Bryan: want to understand how backends can surface errors midway?
15:27:33 Austin: for many backends peak memory usage can exceed what's available, you could OOM while inferencing
15:28:02 ... seeing failures on some backends because we have implementation gaps, a few classes of errors
15:28:47 ... e.g. trying to allocate too much memory, model file is deleted from disk, compile the model and think everything is good but things change underneath
15:29:37 ... or if you do scatter or gather with OOB indices
15:30:34 https://github.com/webmachinelearning/webnn/issues/778 has Austin's observations about types of errors
15:30:35 https://github.com/webmachinelearning/webnn/issues/778 -> Issue 778 Proposal: Report non-fatal errors from the WebNN timeline (by a-sully) [feature request]
15:31:15 Austin: CoreML backend may have higher peak memory usage during inferencing than after compile
15:31:37 Bryan: writeTensor() has the same issues as dispatch()
15:32:56 (this is expanded in issue #778)
15:32:56 https://github.com/webmachinelearning/webnn/issues/778 -> Issue 778 Proposal: Report non-fatal errors from the WebNN timeline (by a-sully) [feature request]
15:33:53 q?
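One of the error classes Austin mentions above, scatter or gather with OOB indices, can be illustrated with a minimal reference sketch. It assumes a 1-D gather along axis 0 over plain Python lists; `gather_checked` is a hypothetical helper, not a WebNN API, and how a given backend actually reacts to such indices is not stated in the discussion. The sketch only shows why this class of error tends to appear at execution time: the indices are runtime data, so a graph can compile cleanly and still fail later.

```python
from typing import List, Sequence


def gather_checked(values: Sequence[float], indices: Sequence[int]) -> List[float]:
    """Hypothetical reference gather along axis 0 that rejects out-of-bounds indices."""
    size = len(values)
    out: List[float] = []
    for position, index in enumerate(indices):
        # Indices are only known when the operation executes, so the bounds
        # check happens here rather than when the graph is built.
        if not 0 <= index < size:
            raise IndexError(
                f"index {index} at position {position} is out of bounds for size {size}"
            )
        out.append(values[index])
    return out
```

With `values = [10.0, 20.0, 30.0]`, the indices `[2, 0]` select `[30.0, 10.0]`, while an index of `3` raises `IndexError` in this sketch.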
15:34:37 Topic: Core op set & MLIR Linalg mapping
15:34:49 anssik: issue #573
15:34:50 https://github.com/webmachinelearning/webnn/issues/573 -> Issue 573 Core operator set (by philloooo) [question] [opset]
15:35:25 ... this topic is to discuss core op set, primitive ops informed by MLIR Linalg, PyTorch Prims IR, TOSA, StableHLO and others.
15:35:39 ... I propose we look at the MLIR Linalg mapping today
15:35:45 ... Dwayne contributed a preliminary analysis of op correspondence (thanks!):
15:35:50 -> Machine Learning Operator Mapping https://onedrive.live.com/edit?id=EE82F5C6F06C7371!345450&resid=EE82F5C6F06C7371!345450&ithint=file%2cxlsx&authkey=!AK8f-RDTleqlLXE&wdo=2&cid=ee82f5c6f06c7371
15:36:20 anssik: Dwayne notes WebNN demonstrates viability of popular models, but it lacks breadth
15:36:31 jsbell has joined #webmachinelearning
15:37:18 ... implementing all the 800+ ops is untenable due to interop requirements (multiple browsers, multiple underlying platforms and backends), this is why we are investigating what makes for an appropriate set of primitive ops to allow composition
15:38:01 Dwayne: no firm recommendations, but some categories that are absent
15:38:28 ... WebNN backend support 1D to 3D
15:38:46 ... modular div, rounding, bitwise
15:39:10 ... composite ops, lego blocks for decomposition of other ops, e.g. sumPooling
15:40:17 ... sheet legend: yellow = absent; red = not interesting, it's a named variant
15:41:15 q+
15:41:40 Dwayne: yellow = worth adding to WebNN
15:41:43 ack jsbell
15:42:00 Joshua: have you done analysis of what backends in Chromium support these?
15:42:11 Dwayne: it is future work
15:42:48 Joshua: do we look at the current backends, or look at primitive core ops on top of which everything can be constructed
15:43:32 q?
15:44:29 anssik: how could the group help?
15:44:41 Dwayne: CoreML backend and TFLite support for these would be welcome
15:45:03 q?
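The "lego blocks" idea Dwayne raises, composing ops such as sumPooling from more primitive ones, can be sketched as follows. This is an illustration under stated assumptions, not the group's agreed decomposition: it uses 1-D inputs without padding, and `average_pool_1d` and `sum_pool_1d` are hypothetical helper names rather than WebNN operators. It relies only on the identity that a window's sum equals its average multiplied by the window size.

```python
from typing import List, Sequence


def average_pool_1d(x: Sequence[float], window: int, stride: int) -> List[float]:
    """Average pooling over a 1-D input, no padding (the assumed primitive)."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    return [
        sum(x[start:start + window]) / window
        for start in range(0, len(x) - window + 1, stride)
    ]


def sum_pool_1d(x: Sequence[float], window: int, stride: int) -> List[float]:
    """Sum pooling composed from average pooling followed by a scalar multiply."""
    return [value * window for value in average_pool_1d(x, window, stride)]
```

For example, `sum_pool_1d([1.0, 2.0, 3.0, 4.0], 2, 2)` averages each pair to `1.5` and `3.5`, then scales by the window size to give `[3.0, 7.0]`.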
15:48:02 Topic: Support reverse operator
15:48:07 anssik: issue #773
15:48:07 https://github.com/webmachinelearning/webnn/issues/773 -> Issue 773 Support `reverse` operator (by huningxin) [feature request] [operator specific]
15:48:11 ... Ningxin proposed a reverse op that reverses the order of the input tensor along specified axes
15:48:15 ... improves performance of PyTorch models
15:48:19 ... framework support PT, TF, ONNX (with reverse slicing with step -1)
15:48:54 ... native APIs DML (similarly to ONNX), CoreML, TFLite
15:49:02 ... also in primitive opsets StableHLO, TOSA, PT Prims
15:49:15 RafaelCIntron has joined #webmachinelearning
15:50:02 Ningxin: we have Chromium CL to prototype
15:50:57 Subtopic: Support strides option for slice operator
15:51:08 anssik: issue #772
15:51:09 https://github.com/webmachinelearning/webnn/issues/772 -> Issue 772 Support strides option for `slice` operator (by huningxin) [feature request] [operator specific]
15:51:35 ... Ningxin reports stride option for the slice operator is widely supported, but WebNN's flavour only supports a stride of 1
15:51:38 ... real-world models with stride > 1 cause WebNN to fall back to another EP, impacting performance
15:52:05 anssik: can we link to some sample models in this issue?
15:52:49 Ningxin: the model is a transformer-based model with some customization, possibly not shareable yet
15:53:01 s/Subtopic: Support/Topic: Support
15:54:58 Dwayne: it is an audio model, I can share that it is for speech recognition usage, consider it a Whisper variant or sorts
15:55:34 Topic: Support block-wise quantization
15:55:38 anssik: issue #779
15:55:38 https://github.com/webmachinelearning/webnn/issues/779 -> Issue 779 Support block-wise quantization (by huningxin) [operator specific]
15:55:45 anssik: request to support block-wise quantization
15:56:08 ... allows input tensors to be divided into smaller independently quantized blocks, used by SLMs
15:56:19 ... benefits include faster optimization and high precision quantization
15:56:31 ... DML and CoreML support, it seems no TFLite/LiteRT?
15:56:35 ... Dwayne suggests a decomp path is viable?
15:56:48 ... no API signature changes, only changes to the algorithm
15:57:54 Ningxin: we have a prototype in Chromium for DML backend and we successfully used that to enable Phi3-mini with this capability
15:58:26 ... this prototype is successful
15:58:29 q+
15:58:33 ack jsbell
15:58:56 Joshua: would be great if you could share in the issue performance improvements "X times faster"
15:59:24 Ningxin: for TF we need a composition, can file an issue for TFLite
15:59:36 ... I can follow up on that
15:59:39 dwayner has joined #webmachinelearning
15:59:55 ... for CoreML, Austin can comment
16:00:26 Austin: more constraints than DML, block-wise quant is theoretically supportable
16:00:57 Ningxin: Phi3-mini uses block-wise quantization and will hit the CoreML implementation
16:01:10 The memory savings were huge (I forget the exact numbers before/after, but IIRC 20GB's before o_o).
16:01:14 q?
16:01:20 If any TPAC attendees can answer the question in https://github.com/webmachinelearning/webnn/issues/470#issuecomment-2475325615 please chime in
16:01:21 https://github.com/webmachinelearning/webnn/issues/470 -> Issue 470 Simplify `matmul` op (by huningxin) [operator specific]
16:02:19 RRSAgent, draft minutes
16:02:21 I have made the request to generate https://www.w3.org/2024/11/14-webmachinelearning-minutes.html anssik
16:08:26 s/CoreML backend and/investigation on CoreML backend and
16:09:33 s/or sorts/of sorts
16:11:17 RRSAgent, draft minutes
16:11:19 I have made the request to generate https://www.w3.org/2024/11/14-webmachinelearning-minutes.html anssik
18:10:30 Zakim has left #webmachinelearning
19:14:35 gb has joined #webmachinelearning
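For the block-wise quantization topic (issue #779), the idea of dividing an input tensor into independently quantized blocks can be sketched roughly as below. This is a minimal illustration, not the algorithm proposed in the issue: it assumes a 1-D tensor, one scale and one zero point per block, and the common affine convention `(q - zero_point) * scale`. The function and the `block_size` parameter are hypothetical names.

```python
from typing import List, Sequence


def dequantize_blockwise(
    quantized: Sequence[int],
    scales: Sequence[float],
    zero_points: Sequence[int],
    block_size: int,
) -> List[float]:
    """Dequantize a 1-D tensor whose elements share scale/zero point per block."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    num_blocks = -(-len(quantized) // block_size)  # ceiling division
    if len(scales) != num_blocks or len(zero_points) != num_blocks:
        raise ValueError("expected one scale and one zero point per block")
    out: List[float] = []
    for i, q in enumerate(quantized):
        block = i // block_size  # each block is quantized independently
        out.append((q - zero_points[block]) * scales[block])
    return out
```

Per-tensor quantization is the special case where `block_size` equals the tensor length, which may be one way to read the note that only the algorithm changes and not the API signature.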