14:57:52 RRSAgent has joined #webmachinelearning
14:57:52 logging to https://www.w3.org/2022/03/10-webmachinelearning-irc
14:57:55 RRSAgent, make logs Public
14:57:55 please title this meeting ("meeting: ..."), anssik
14:58:01 Meeting: WebML WG Teleconference – 10 March 2022
14:58:06 Chair: Anssi
14:58:11 Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2022-03-10-wg-agenda.md
14:58:17 Scribe: Anssi
14:58:24 scribeNick: anssik
14:58:32 scribe+ dom
14:58:33 Present+ Anssi_Kostiainen
14:58:38 RRSAgent, draft minutes
14:58:38 I have made the request to generate https://www.w3.org/2022/03/10-webmachinelearning-minutes.html anssik
14:58:50 Rama has joined #webmachinelearning
14:59:59 Present+ Ganesan_Ramalingam
15:00:17 Present+ Wan_Xiaojian
15:01:13 Present+ Jonathan_Bingham
15:01:31 Present+ Rafael_Cintron
15:01:36 ningxin_hu has joined #webmachinelearning
15:01:38 Present+ Ningxin_Hu
15:02:11 Present+ Dominique_Hazael-Massieux
15:02:21 Present+ Chai_Chaoweeraprasit
15:03:28 Jonathan has joined #webmachinelearning
15:03:40 RafaelCintron has joined #webmachinelearning
15:03:50 chai has joined #webmachinelearning
15:03:54 Topic: Security considerations - last call for review
15:04:36 -> issue: General Security Questions: https://github.com/webmachinelearning/webnn/issues/241
15:04:41 -> PR: Update Security Considerations per review feedback: https://github.com/webmachinelearning/webnn/pull/251
15:04:46 -> All security-tracker issues: https://github.com/webmachinelearning/webnn/issues?q=label%3Asecurity-tracker+
15:05:44 -> Op metadata that helps avoid implementation mistakes (issue #243): https://github.com/webmachinelearning/webnn/issues/243
15:06:37 Anssi: PR #251 addresses most questions of #241, but doesn't address #243
15:06:45 ... propose we leave that for later
15:07:30 dom: happy to review PR #251
15:07:41 dom: +1 to leave #243 for later
15:08:58 q?
15:09:54 Topic: Ethical considerations update
15:11:19 RRSAgent, draft minutes
15:11:19 I have made the request to generate https://www.w3.org/2022/03/10-webmachinelearning-minutes.html anssik
15:11:42 James: we conducted an internal / external literature review, which led to the writing of a draft consultation document
15:11:51 ... not complete, but enough for people to engage with and react to
15:12:10 ... please take a look at the document and bring comments
15:12:24 ... it contains material that may or may not end up in the final WG note
15:12:40 ... including the thinking process, background on ethics and ML
15:12:52 ... still incomplete and work in progress
15:13:20 ... A summary of the process: I looked at existing principles rather than developing our own
15:13:41 Present+ James_Fletcher
15:13:43 ... we want these principles to be universal given the reach of the Web
15:13:58 ... align with W3C values & principles
15:14:14 ... which led to recommending the UNESCO values & principles - with more justification in the doc
15:14:19 RRSAgent, draft minutes
15:14:19 I have made the request to generate https://www.w3.org/2022/03/10-webmachinelearning-minutes.html anssik
15:14:53 ... UNESCO has a set of 4 values and 10 principles, developed through very wide review and approval, globally
15:15:27 ... confirmed their fitness through meta-analysis about completeness and focus
15:15:53 ... We're looking for feedback on the process, and where it has led in terms of proposed principles
15:16:09 ... this is leading to the next phase where we want to hear from experts and stakeholders
15:16:36 ... The principles are very high level - one challenge is how to turn them into practices, which the document wants to tackle
15:16:52 ... We're running group sessions early April to kickstart that process
15:17:18 ... between principles and risks/mitigations, there may be guidance that elaborates on principles with more context, with more details and more specific to the W3C context
15:17:58 ... e.g. mapping to the W3C TAG ethical principles, which include "autonomy" or "decentralization" principles that don't emerge in UNESCO
15:18:53 ... We'll look to synthesize this into a single guidance per principle, shorter than what we've extracted and specific to the W3C context
15:19:01 ... that guidance would then lead to risks & mitigations
15:19:24 ... The document also presents case studies that illustrate typical issues in ML ethics
15:19:45 ... Re risks & mitigations, the doc will contain high-level considerations, not yet at the level of individual specs
15:20:04 ... the document has an example of a possible risk & mitigation to illustrate this
15:20:51 ... We'll update this version of the document by March 21st, including guidance, feedback received incl. on issues & case studies, and high-level risks & mitigations
15:21:33 ... The week of April 4 we'll run group review & brainstorm sessions to feed risks & mitigations
15:21:45 q?
15:21:48 ... and we're targeting April 21st as the time to approve this as a WG Note
15:21:58 -> Draft consultation document https://docs.google.com/document/d/1n55liw3cAcrIdMlvRPEAdV1ANWT9QzgOZ6R0pUaSVY4/
15:22:23 dom: thanks for turning our plans into this document
15:22:24 Dom: THANKS!
15:23:14 bbcjames has joined #webmachinelearning
15:23:27 q?
15:23:40 anssi: we'll want participants from this group to be involved in the live sessions; got lots of interest from other W3C groups, e.g. "horizontal" groups
15:23:43 Topic: Graph execution methods used in different threading models: immediate, async, queued
15:24:00 -> issue #230: Should WebNN support async APIs? https://github.com/webmachinelearning/webnn/issues/230
15:24:02 anssi: we started discussing this in issue #230
15:24:14 ... Chai produced 2 alternative designs up for review
15:24:14 -> PR #255: Define graph execution methods used in different threading models https://github.com/webmachinelearning/webnn/pull/255
15:24:21 -> PR #257: Context-based graph execution methods for different threading models https://github.com/webmachinelearning/webnn/pull/257
15:24:56 Anssi: this is a substantial change to the API - I want us to get it right
15:25:00 q?
15:25:40 Chai: the two pull requests are both trying to do the same thing; I recommend starting with #255 - #257 builds on top of it
15:26:00 ... the core change in #255 is trying to add several execution methods that the user can use to execute the compiled graph
15:26:24 ... and based on the requirements we collected over the past few months, there are 3 ways people want to use WebNN
15:26:41 ... the summary in #255 describes what we want to do: we want to enable:
15:27:03 ... 1- immediate execution from the calling thread, where you wait until the result is available in the output buffer - a blocking call
15:27:15 ... this is the simplest way to execute a graph
15:27:43 ... this is needed in scenarios where it runs on the CPU
15:28:32 ... 2- the second method allows async execution with a promise, so the UI thread is not blocked
15:29:00 ... 3- the 3rd method is specific to WebGPU; with WebGPU you can buffer commands before they get executed in order
15:29:36 ... sync or async doesn't help here, you wouldn't get a deterministic execution
15:30:11 ... the key difference is that it doesn't run the graph, it records the commands in the command buffer, and leaves it to the caller to execute the buffered commands
15:30:50 ... in terms of API shape, in #255, I tried to not change too much of the existing API - 90% of the API is GraphBuilder
15:31:16 q+ to ask if this interop method suggests any changes to the WebGPU API, whether we should seek explicit WebGPU WG review
15:31:25 ... because the execution methods have a strong dependency on the kind of context they're creating, I tried to separate the various modes of execution into execution interfaces
15:31:39 ... MLExecution, MLAwaitedExecution, and MLCommandEncoder
15:31:50 ... the latter name is directly inspired by the WebGPU spec
15:32:31 ... on #255, Dom & Ningxin pointed out that having the MLContext be the thing that calls the execution makes more sense
15:32:51 ... a separate Execution interface makes it harder to see the dependency
15:32:57 ... #257 addresses that
15:33:11 q+
15:33:34 ... it builds on #255 - it no longer has a separate Execution interface, but instead these become methods on MLContext
15:33:55 ... with a compute method and a computeAsync method
15:34:15 ... the runtime dependency is still there - if you try to use compute to execute on the GPU context, it's not allowed
15:35:22 ... The caller that tries to execute the graph should know a lot about the context - we're not supporting a mode where the context is created independently from the execution
15:36:30 ... when it comes to WebGPU, I chose to use a createCommandEncoder that creates an MLCommandEncoder with an interface consistent with the WebGPU command queue
15:36:42 q?
15:37:09 ack anssik
15:37:09 anssik, you wanted to ask if this interop method suggests any changes to the WebGPU API, whether we should seek explicit WebGPU WG review
15:37:35 anssik: re MLCommandEncoder command buffer, does it require any change to WebGPU?
15:37:50 ... should we seek explicit WebGPU WG review on the proposal?
15:38:00 q+
15:38:01 chai: it doesn't require any change to WebGPU
15:38:04 q- later
15:38:46 q?
15:38:57 i|James: we:|Slideset: https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0001/0310_W3C_Ethical_Web_ML_Update.pdf
15:39:24 q?
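A minimal JavaScript sketch of the first two execution paths described above, assuming the method names proposed in PR #257 (compute, computeAsync) and an already-built MLGraph with pre-allocated input/output buffers; the exact signatures, option names, and context creation are still under review, so this is an illustration only. The third, WebGPU-specific path is sketched further below, after the command-encoder discussion.

  // Assumed shape: a CPU-preferring context; per the discussion, the
  // blocking compute() path is meant for CPU execution, e.g. in a worker.
  const context = navigator.ml.createContext({devicePreference: 'cpu'});

  // ... build `graph` with an MLGraphBuilder over this context, and
  // allocate `inputBuffer` / `outputBuffer` typed arrays ...

  // 1. Immediate execution: blocks the calling thread until the result
  //    has been written into outputBuffer.
  context.compute(graph, {'input': inputBuffer}, {'output': outputBuffer});

  // 2. Async execution: returns a promise, so the UI thread is not blocked;
  //    the promise resolves once outputBuffer has been written.
  await context.computeAsync(graph, {'input': inputBuffer}, {'output': outputBuffer});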
15:39:27 ack RafaelCintron
15:39:59 RafaelCintron: re command encoder - initialize and dispatch take a Graph; initialize should be called only once
15:40:11 ... could initialize be a constructor for an MLCommandEncoder?
15:40:32 ... what happens if someone calls dispatch with a different graph than the one used to initialize?
15:40:49 chai: this is similar to what ningxin asked on #255
15:41:09 ... initializeGraph records the commands we need to initialize the graph; it's not initializing the encoder
15:41:19 ... if you put it in the constructor, it would be misleading
15:41:35 ... in many systems that we know, before you want to process the model, you want to pre-process the weights
15:41:42 ... e.g. on GPUs and on some NPUs
15:42:05 ... at the driver level, when you have the weights, they want the opportunity to process and cache them in their driver in their layout format
15:42:24 ... passing the weights in initializeGraph, the command encoder will record a copy into the GPU buffer
15:43:12 ... that will send this down to the GPU driver with a flag that some systems would use to indicate the opportunity to initialize them at least once
15:43:38 q?
15:43:43 ... the actual commands get dispatched when the inference happens
15:43:51 RafaelCintron: what happens if you initialize multiple times?
15:43:59 Chai: wouldn't be efficient, but wouldn't fail
15:44:08 RafaelCintron: what about multiple dispatches?
15:44:24 Chai: the encoder is reusable, it doesn't carry state
15:45:13 ... compared to compute, it's a lower-level API, matching the WebGPU approach
15:45:20 ... compute could be implemented on top of it
15:45:40 RafaelCintron: what if you initialize A with some input, and then dispatch it with other input?
15:45:53 Chai: they're different inputs - only constant weights for initialize
15:46:34 ... this preprocessing step matches the approach taken by several low-level model APIs (incl. DirectML)
15:46:58 anssi: does the spec talk about these 2 inputs being different?
15:47:22 chai: feedback welcomed in the PR, which could use more explanation in places
15:47:38 anssi: maybe name them differently to help improve the ergonomics
15:47:59 chai: would also like a section with a sample showing WebGPU usage
15:48:09 q+
15:48:10 ... but probably done in a separate PR
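Pending that sample, a hedged JavaScript sketch of the WebGPU-specific path as described in the discussion above, assuming the MLCommandEncoder shape from PR #257 (createCommandEncoder, initializeGraph, dispatch, finish); how constant weights are supplied to initializeGraph and how the recorded commands are handed back to the WebGPU queue are still open points in the PR, so those details are assumptions for illustration only.

  // Assumed shape: an MLContext created from an existing WebGPU GPUDevice,
  // and an MLGraph `graph` built against that context.
  const encoder = context.createCommandEncoder();

  // Record one-time graph initialization; per the discussion, only the
  // constant weights are involved here, giving the driver a chance to
  // preprocess and cache them in its preferred layout.
  encoder.initializeGraph(graph);

  // Record an inference; nothing executes yet - the commands are only
  // buffered, and the inputs/outputs are GPU buffers owned by the caller.
  encoder.dispatch(graph, {'input': inputGPUBuffer}, {'output': outputGPUBuffer});

  // Execution order is left to the caller: the recorded work is submitted
  // via the application's own WebGPU queue (exact mechanism TBD in the PR).
  gpuDevice.queue.submit([encoder.finish()]);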
15:48:26 q?
15:48:30 ack dom
15:48:41 dom: thanks for this piece of work!
15:49:00 ... I prefer #257 over #255, the API shape is explained better in that one
15:49:16 ... not sure still about the CPU-only compute() method, but will comment on the PR
15:49:35 ... Anssi raised a question about the WebGPU intersection, we will need the WebGPU WG to chime in
15:50:06 ... we need WebGPU WG review for the intersection, there was a GH thread that pointed out some gaps, this PR might start to address those
15:50:07 https://github.com/gpuweb/gpuweb/issues/2500
15:50:36 anssi: not requiring changes to WebGPU is definitely a big +
15:51:26 dom: we have Rafael as a bridge between the WebML and WebGPU WGs
15:51:44 ... if someone can give us a reliable review on this PR from WebGPU, let's check with them
15:51:51 q?
15:51:58 ack ningxin_hu
15:52:03 Anssi: would Bryan be able to give a WebGPU-angled review of this PR?
15:52:24 Ningxin: +1 to ask Bryan, in addition to Rafael's review
15:52:37 ... Also thanks again to Chai - very significant contribution
15:53:08 ... the PR brings both sync/async, and integration with WebGPU
15:53:32 ... If we were to interact with the WebGPU people, we would want to highlight the latter - MLCommandEncoder
15:54:24 ... the discussion on constant weights reminds me of an open comment I made on the first PR
15:54:42 ... we have 2 surfaces to upload constants / weights for a context built on a WebGPU device
15:55:18 ... the MLContext via GPUBuffer; with MLCommandEncoder, this gives another path to provide the weights
15:55:42 q?
15:55:53 ... Do we need to remove the constant method for the GPUBuffer binding, and move that to the initializeGraph method?
15:56:48 ... if we do so, the graph building code gives 2 different paths for graph building based on different contexts, builder vs initializeGraph
15:57:04 q?
15:57:05 ... this isn't ideal; can we find a way to combine them?
15:57:20 chai: I understand that feedback; let's iterate on the PR
15:57:39 ... re integration with WebGPU, a lot of these ideas came from Bryan
15:58:24 q+
15:59:12 chai: my original idea was to have the ML path in WebGPU - but it creates a hard dependency on the WebGPU spec & implementation
15:59:40 q-
15:59:53 q?
16:00:02 anssi: I'm hearing #257 as the PR to continue with
16:00:24 ... thanks for the good progress
16:00:45 ... summarizing: reviews expected on #257, including looping in people from WebGPU (Bryan, RafaelCintron)
16:01:01 ... and maybe later seek review from the broader WebGPU WG (possibly after the PR has landed)
16:01:01 q?
16:01:15 RRSAgent, draft minutes
16:01:15 I have made the request to generate https://www.w3.org/2022/03/10-webmachinelearning-minutes.html dom
16:11:02 RRSAgent, draft minutes
16:11:02 I have made the request to generate https://www.w3.org/2022/03/10-webmachinelearning-minutes.html anssik
16:11:44 i|James: we conducted an internal / external literature review, which led to the writing of a draft consultation document|->UPDATE Ethical Web Machine Learning https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0002/0310_W3C_Ethical_Web_ML_Update.pdf
16:11:46 RRSAgent, draft minutes
16:11:46 I have made the request to generate https://www.w3.org/2022/03/10-webmachinelearning-minutes.html anssik
18:01:02 Zakim has left #webmachinelearning