13:57:17 RRSAgent has joined #webmachinelearning
13:57:17 logging to https://www.w3.org/2022/10/06-webmachinelearning-irc
13:57:19 RRSAgent, make logs Public
13:57:20 please title this meeting ("meeting: ..."), anssik
13:57:51 Meeting: WebML WG Teleconference – 6 October 2022
13:57:56 Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2022-10-06-wg-agenda.md
13:58:00 Scribe: Anssi
13:58:05 scribeNick: anssik
13:58:12 ghurlbot, this is webmachinelearning/webnn
13:58:12 anssik, OK
13:58:18 Present+ Anssi_Kostiainen
13:58:23 Regrets+ Dominique_Hazael-Massieux
14:01:47 Present+ Ningxin_Hu
14:01:51 Present+ Bruce_Dai
14:01:58 Present+ Rafael_Cintron
14:02:04 Present+ Dwayne_Robinson
14:02:06 ningxin_hu has joined #webmachinelearning
14:02:14 RafaelCintron has joined #webmachinelearning
14:02:47 Present+ Chai_Chaoweeraprasit
14:03:02 RRSAgent, draft minutes
14:03:02 I have made the request to generate https://www.w3.org/2022/10/06-webmachinelearning-minutes.html anssik
14:03:08 Chai has joined #webmachinelearning
14:04:04 fdwr has joined #webmachinelearning
14:12:36 Topic: WebML WG Charter 2023-2025 early heads-up
14:13:04 anssik: This is a kick-off for the WebML WG Charter 2023-2025 brainstorming discussion. Our current charter ends 2023-04-30 and we are expected to define our new charter for the next 2-year period 2023-05-01 - 2025-04-30.
14:13:47 ... And a call to start soliciting use cases and model requirements for WebNN "v2".
14:14:44 ... The WG rechartering process will formally kick off in early 2023, but we want to start now to have adequate time to converge on consensus
14:14:57 ... my expectation is we will develop a good draft WG charter during Q4.
14:15:21 ... as a reminder, W3C expects new technical proposals to be incubated in the CG prior to the WG adoption.
14:15:45 ...
the charter development happens in a separate GH repo and is open to proposals from WG participants and also the public:
14:16:00 -> WG Charter https://www.w3.org/2021/04/web-machine-learning-charter.html
14:16:00 -> WG Charter (GH repo) https://github.com/w3c/machine-learning-charter
14:16:28 ... I'll continue to request feedback from the WG on our calls from time to time and will reflect that into the WIP charter in the GH repo, feel free to chime in there or provide your feedback on these calls.
14:16:53 ... please talk to your colleagues internally to ensure you can convey the whole company's perspective -- there's a formal checkpoint for that when we enter the so-called AC review in Q1 '23 but it is better to align already in this draft state!
14:16:54 bruce_dai has joined #webmachinelearning
14:17:32 Subtopic: Current scope & deliverables
14:17:49 anssik: the high-level scope of the W3C WebML WG you're participating in is:
14:17:55 ... Deliverables:
14:18:01 ... - Web Neural Network API
14:18:05 ... - Ethical Principles for Web Machine Learning (non-normative)
14:18:20 ... Out of scope:
14:18:36 ... - "Training capabilities are out of scope due to limited availability of respective platform APIs."
14:18:58 ... - "To avoid overlap with existing work, generic primitives used by traditional machine learning algorithms such as base linear algebra operations are out of scope."
14:19:12 anssik: Tentative Deliverables:
14:19:17 ... Model Loader API is tentative until "agreement on a model format"
14:19:34 q?
14:19:49 Subtopic: Proposed removals
14:20:03 anssik: the WG has decided to drop WebGL interop, we should remove the corresponding coordination unless there are reasons to keep it?
14:20:22 [agreement]
14:20:35 anssik: any objections to keeping Model Loader API as a tentative deliverable?
14:20:58 [agreement]
14:21:07 Subtopic: Proposed new work
14:21:20 anssik: Let me introduce some topics we may want to discuss to understand what new work to add in scope (or not):
14:21:48 ... - v2 features.
Which features are substantial enough to warrant explicit mention in the charter? Examples: VPU, int8 ...
14:22:02 ... I propose we label such proposals with "v2". If no issue exists, you can file one and note it should be labeled "v2"
14:22:07 -> Proposed v2 features https://github.com/webmachinelearning/webnn/labels/v2
14:22:38 ... - WebGPU interoperability, do we want to be more explicit than in the current Coordination: "[WebGPU API] ... may be used to implement traditional machine learning algorithms efficiently" -- needs revision I think?
14:23:21 q+
14:23:34 ... - New models and their requirements. Do we want to take another look at the ML models we want to support in v2? We collected the first-wave models to inform our v1 API, we might want to pull off a similar exercise for v2.
14:23:36 q?
14:23:38 ack Chai
14:23:45 Chai: thanks Anssi!
14:24:07 ... for v2, one constant piece of feedback from our external partners when discussing WebNN for their use cases has been the ops
14:24:22 ... the set of ops supported must be more comprehensive
14:24:35 ... this needs to be a more explicit goal, this is important
14:24:53 ... related to that, use cases around transformers
14:25:09 ... getting big, an area we should probably spend more time on
14:25:22 ... also need to support NPU or VPU or xPU
14:26:57 q+
14:27:06 anssik: WebGPU interop thoughts?
14:27:11 ack RRSAgent
14:27:16 ack RafaelCintron
14:27:51 RafaelCintron: working with WebGPU folks is important for the success of the spec for that part
14:28:11 q?
14:28:20 -> The first-wave models https://github.com/webmachinelearning/webnn/blob/main/op_compatibility/first_wave_models.md
14:29:18 anssik: - Level of abstraction for neural net operations? The explainer has the rationale for the current abstraction
14:29:28 -> [Explainer] What is the right level of abstraction for the neural network operations?
https://github.com/webmachinelearning/webnn/blob/main/explainer.md
14:30:24 q+
14:30:27 ack Chai
14:30:45 Chai: on this topic, I don't recall us disagreeing with Google's position
14:31:12 ... they say we need lower-level primitives, but it is important to also have ops that are generally implemented as a single unit, we want to do both low-level and high-level ops
14:31:18 present+ Zoltan_Kis
14:31:35 ... in the spec we wrote informative sections on those ops that are high-level so that they are translatable to low-level ops
14:31:56 ... we made a point of doing more clarification in the spec to address that concern
14:32:35 q?
14:32:40 q+
14:33:29 Chai: we can summarize that explainer text and embed it in the charter, re abstraction chosen
14:33:38 ack ningxin_hu
14:33:45 https://www.w3.org/TR/webnn/#security-new-ops
14:34:02 ningxin_hu: agree with Chai, I pasted a link to the spec section that discusses how to add new ops to the spec
14:34:35 ... we say in the spec "if an operation can be decomposed to low level primitives:
14:34:35 ... - Add an informative emulation path
14:34:35 ... - Prefer primitives over new high level operations but consider performance consequences"
14:35:03 ningxin_hu: we define both and ensure high-level ops can be HW optimized
14:35:08 q?
14:35:59 Topic: WebNN API Candidate Recommendation open issues
14:36:10 -> Current CR issues https://github.com/webmachinelearning/webnn/labels/cr
14:36:14 Subtopic: Web platform tests
14:36:29 anssik: to satisfy CR requirements, we must document how adequate implementation experience will be demonstrated, and the right way to do that is to produce a cross-browser test suite to validate implementation correctness.
14:36:35 ... of course the results are only as good as the tests are
14:36:52 ... Bruce, who is working on web-platform-tests for WebNN, indicated the only blocker was Unit of Least Precision (ULP) tolerances
14:37:04 ...
to that end I'm pleased to welcome Dwayne Robinson from Microsoft to this meeting
14:37:19 ... Dwayne is a senior developer at Microsoft working on ML platform engineering
14:37:31 ... Dwayne posted his initial list of recommended ULP tolerances to the GH issue
14:37:35 -> Recommended tolerances (Dwayne) https://github.com/webmachinelearning/webnn/issues/265#issuecomment-1256242643
14:38:29 Dwayne: [presents slides "Operator Tolerance Conformance Considerations"]
14:39:18 ... Operator categories: 8, grouped roughly by complexity
14:39:36 ... Data movement
14:39:42 ... Data generation
14:39:47 ... Exact math
14:39:52 ... Simple math
14:39:57 ... Complex math
14:40:02 ... Trigonometric functions
14:40:09 ... Lossy accumulation
14:40:13 ... Very complex iterative
14:40:28 ... each of these categories has similar precision numbers or tolerance values
14:40:51 ... one question to answer, are you interested in op conformance or precision?
14:40:54 q?
14:41:49 Dwayne: you cannot test all values; asymptotes etc. are dubious areas, 0/0 issues
14:42:26 -> catastrophic cancellation https://en.wikipedia.org/wiki/Catastrophic_cancellation
14:42:52 ... Note there are numerical gotchas to beware of, including subtraction of nearly equal numbers (see catastrophic cancellation)
14:43:34 ... Precision issues and gotchas
14:44:11 ... also subtraction of nearly equal numbers
14:44:26 ... division by very small numbers
14:44:50 ... Ideal (Expected) vs. Actual Signal Behaviour
14:45:09 ... on the left, error that is roughly a constant distance from the value
14:45:19 ... center, error proportional to the value
14:46:00 ... on the right, floating point error with jagged red line due to IEEE float
14:46:07 ... Measurement methods
14:46:11 ... 3 methods
14:46:21 ... - absolute tolerance
14:46:26 ... - relative tolerance
14:46:34 ...
- unit in the last place (ULP)
14:46:49 RRSAgent, draft minutes
14:46:49 I have made the request to generate https://www.w3.org/2022/10/06-webmachinelearning-minutes.html anssik
14:48:56 Dwayne: it is sufficient in SW comparison to use ULP instead of relative error
14:49:05 q?
14:50:08 Dwayne: Contributing error factors
14:50:49 ... Compute precision and tensor data type: float16 vs float32, also non-standard types (e.g. bfloat16)
14:51:04 ... rounding modes
14:51:15 ... subnormal flushing
14:51:30 ... different NaN bit patterns
14:51:43 ... Number of calculations:
14:52:05 ... the more input elements, the greater potential for error
14:52:18 RRSAgent, draft minutes
14:52:18 I have made the request to generate https://www.w3.org/2022/10/06-webmachinelearning-minutes.html anssik
14:52:48 Dwayne: Algorithm used e.g. summation order
14:53:04 ... Fusion magnifies earlier errors
14:53:32 ... Contributing error factors - IEPOE (Input Element Per Output Element)
14:54:07 ... not used directly, but conceptually gives a degree of complexity
14:54:25 ... the inputs-per-output-element concept is not used directly, but is useful for considering the lossy math ops
14:54:39 ... the greater the number of lossy math ops, the greater the potential for error
14:55:36 ... adding ULP tolerances helps establish a sensible upper bound
14:56:47 ... beats pulling numbers from thin air
14:56:54 ... any questions?
14:56:54 q?
14:57:27 q+
14:57:43 q+
14:58:04 q?
14:58:06 ack ningxin_hu
14:58:21 ningxin_hu: thanks Dwayne!
14:58:49 ... question about sanity checks, you mention "n", can you clarify?
14:58:57 q+ to ask whether the Web ML group should work on testing/attesting implementations, or share best practices on testing in the spec or separate document?
14:58:59 Dwayne: e.g. adding 100 numbers together we have n = 100
14:59:00 q?
14:59:36 Dwayne: applies to mul(), add(); does not apply to all ops, e.g. not to exp()
14:59:37 q?
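[Editorial sketch] The three measurement methods listed above (absolute tolerance, relative tolerance, ULP) could be compared roughly as follows. This is a minimal Python illustration under stated assumptions, not code from Dwayne's slides or from the WebNN web-platform-tests: it assumes float32 results, and the helper names (`float32_ordered`, `ulp_distance`, `within_absolute`, `within_relative`, `within_ulp`) are hypothetical.

```python
import struct


def float32_ordered(x: float) -> int:
    """Map x (rounded to float32) to an integer whose order matches float order."""
    # Round to float32 and read back the raw 32-bit pattern.
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    # IEEE 754 floats are sign-magnitude: mirror negative values so that
    # integer order follows numeric order and -0.0 lands on the same point as +0.0.
    return -(bits & 0x7FFFFFFF) if bits & 0x80000000 else bits


def ulp_distance(actual: float, expected: float) -> int:
    """Number of representable float32 steps between the two values.

    NaN and infinity get no special handling in this sketch.
    """
    return abs(float32_ordered(actual) - float32_ordered(expected))


def within_absolute(actual: float, expected: float, tol: float) -> bool:
    # Error bounded by a constant distance from the expected value.
    return abs(actual - expected) <= tol


def within_relative(actual: float, expected: float, tol: float) -> bool:
    # Error bounded proportionally to the magnitude of the expected value.
    return abs(actual - expected) <= tol * abs(expected)


def within_ulp(actual: float, expected: float, max_ulp: int) -> bool:
    # Error bounded by a count of float32 steps, which scales with magnitude.
    return ulp_distance(actual, expected) <= max_ulp


if __name__ == "__main__":
    expected = 1.0
    actual = 1.0 + 2**-23  # the next float32 above 1.0
    print(ulp_distance(actual, expected))  # 1
```

Because the ULP distance counts representable steps, a single integer bound adapts to the magnitude of the value, which fits the remark that ULP can stand in for relative error in software comparisons.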
14:59:41 ack Chai
15:00:11 Chai: the proposed tolerances in the GH issue are not what we use internally in our products, they are based on expertise but are not exactly what we use
15:00:39 q?
15:01:17 ack zkis
15:01:17 zkis, you wanted to ask whether the Web ML group should work on testing/attesting implementations, or share best practices on testing in the spec or separate document?
15:01:35 q+
15:02:31 q?
15:02:35 ack bruce_dai
15:03:00 bruce_dai: I have a question about test input data and output data
15:03:35 ... we use the number type as input, double precision
15:04:30 Dwayne: cast to what the machine will see; if the input is double and you pass it as float32 into the GPU, you lose precision in translation
15:06:38 Subtopic: Add method steps and normative algorithms to operations
15:06:52 anssik: Zoltan is with us today and starting now has time to focus on this task
15:07:02 -> 22 Sep 2022 discussion https://www.w3.org/2022/09/22-webmachinelearning-minutes.html#t04
15:07:12 ... last time we collected feedback on work items for this task:
15:07:20 - internal slot definitions
15:07:39 - clarify graph building with algorithmic steps and internal slots
15:07:54 - (Chromium impl where internal slots are private members will help)
15:08:01 - MLOperand interface
15:08:05 - MLOperator interface
15:08:33 - keep the current declarative op definitions side by side with the algorithmic steps
15:08:42 anssik: Zoltan, do you have comments or questions?
15:08:54 ... thanks for your help addressing this CR blocker issue.
15:09:24 zkis: I'm relatively new to this spec so I will be asking you good questions and fixing unclear parts as I go
15:10:04 sgtm
15:10:57 anssik: thank you Zoltan!
15:11:00 q?
15:12:26 RRSAgent, draft minutes
15:12:26 I have made the request to generate https://www.w3.org/2022/10/06-webmachinelearning-minutes.html anssik
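[Editorial sketch] Dwayne's 15:04:30 answer about casting test inputs to what the machine will see can be illustrated with a short Python example. It is only a sketch under stated assumptions: the backend is assumed to compute in float32, the value 0.1 is an arbitrary test input, and `to_float32` is a hypothetical helper, not part of WebNN or the web-platform-tests.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (double precision) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


x64 = 0.1              # test input as authored, double precision
x32 = to_float32(x64)  # what a float32 backend would actually receive

# Precision lost by the cast alone, before any operator has run.
cast_error = abs(x32 - x64)

# A reference result computed from the double input differs from one
# computed from the input the backend really saw.
ref_from_double = x64 * x64
ref_from_cast = x32 * x32

if __name__ == "__main__":
    print(f"double input : {x64!r}")
    print(f"float32 input: {x32!r}")
    print(f"cast error   : {cast_error:.3e}")
    print(f"references   : {ref_from_double!r} vs {ref_from_cast!r}")
```

Computing the expected output from the already-cast input keeps the cast error out of the operator's tolerance budget, so the tolerance measures only the error the operator itself introduces.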