14:58:43 RRSAgent has joined #webmachinelearning
14:58:47 logging to https://www.w3.org/2023/01/12-webmachinelearning-irc
14:58:47 RRSAgent, make logs Public
14:58:48 please title this meeting ("meeting: ..."), anssik
14:58:48 Meeting: WebML WG Teleconference – 12 January 2023
14:59:29 Chair: Anssi
14:59:34 Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2023-01-12-wg-agenda.md
14:59:47 Scribe: Anssi
14:59:47 scribeNick: anssik
14:59:53 Present+ Anssi_Kostiainen
15:00:06 Regrets+ Dominique_Hazael-Massieux
15:00:25 ningxin_hu has joined #webmachinelearning
15:00:37 Present+ Ningxin_Hu
15:01:02 Present+ Sungpil_Shin
15:01:21 Present+ Bruce_Dai
15:01:23 Sungpil_Shin__ETRI_ has joined #webmachinelearning
15:01:28 Present+ Chai_Chaoweeraprasit
15:01:42 RRSAgent, draft minutes
15:01:43 I have made the request to generate https://www.w3.org/2023/01/12-webmachinelearning-minutes.html anssik
15:02:15 bruce_dai has joined #webmachinelearning
15:03:49 chai has joined #webmachinelearning
15:03:49 anssik: Welcome to 2023!
15:03:56 Regrets- Dominique_Hazael-Massieux
15:04:01 Present+ Dominique_Hazael-Massieux
15:05:37 scribe+
15:05:51 Topic: WebNN API open PRs and issues
15:05:56 ghurlbot, this is webmachinelearning/webnn
15:05:56 anssik, OK. But note that I'm currently off. Please use: ghurlbot, on
15:06:06 ghurlbot, on
15:06:07 anssik, OK.
15:06:17 anssik: good progress was made in GH over the holiday period, so I'd like us to review the open PRs and those that landed, and discuss the issues filed.
15:06:31 ... my expectation is we'll identify and fast-track any priority changes that should get into the initial CR release train.
15:06:39 Subtopic: Add lstm and lstmCell ops, rename MLOperator to MLActivation
15:06:41 RafaelCintron has joined #webmachinelearning
15:06:48 anssik: #321
15:06:49 https://github.com/webmachinelearning/webnn/issues/321 -> Pull Request 321 [closed] Add LSTM to the operator list (wchao1115)
15:06:52 Present+ Rafael_Cintron
15:06:57 ... this PR adds lstm and lstmCell ops to the spec.
15:07:03 ... this PR received adequate review and I considered it ready to be merged.
15:07:16 ... notably, we explicitly mention the LSTM architecture in our current charter so it was great to get this done. This is a crucial building block that improves on the "classic" RNNs.
15:07:22 ... thanks Chai!
15:07:24 ... any comments from Chai? Any questions from anyone?
15:07:37 zkis has joined #webmachinelearning
15:08:01 Chai: no further comments, happy to get this in
15:08:44 q?
15:09:00 Subtopic: Simplify MLContext creation: remove MLDeviceType, remove "high-performance" from MLPowerPreference
15:09:22 anssik: #322
15:09:22 https://github.com/webmachinelearning/webnn/issues/322 -> Pull Request 322 Simplify MLContext creation (wchao1115)
15:09:32 anssik: this PR proposes to remove the MLDeviceType enum:
15:09:40 enum MLDeviceType {
15:09:40   "cpu",
15:09:40   "gpu"
15:09:40 };
15:09:57 anssik: and to remove "high-performance" from the MLPowerPreference enum:
15:09:57 enum MLPowerPreference {
15:09:57   "default",
15:09:58   "high-performance",
15:09:58   "low-power"
15:09:58 };
15:10:17 anssik: To ease the review, I added a diff of the proposed IDL changes to GH, excluding the MLOperator -> MLActivation rename changes
15:10:17 -> Diff with IDL changes in PR #322 https://github.com/webmachinelearning/webnn/pull/322#issuecomment-1379983187
15:11:06 anssik: my summary of the practical impact of these changes for the default context is:
15:11:21 ... - device type selection is an implementation detail
15:11:51 ... - ML frameworks (key customers) using the WebNN API cannot request a CPU, GPU or NPU implementation explicitly
15:12:09 ... - a GPU context can be created only from a WebGPU device
15:13:12 chai: thanks for the diff in a comment!
15:13:28 ... the PR is straightforward, there are a few conversation points to discuss
15:13:49 ... before the change, if you want to create a CPU context you set the device type to "cpu" and create a context
15:13:57 ... then everything is done on the CPU
15:14:11 ... with the change you create a context without setting any device type
15:14:29 ... the implementation will be a CPU implementation
15:14:55 ... for the GPU, before the change, you had two ways to create a context
15:15:15 ... first, set the device type to "gpu" and you'll get a GPU context; the second way is to give it a WebGPU device
15:15:20 ... so two ways to do it
15:15:30 ... after the change, we remove the first way to do it
15:15:41 ... say, go create a WebGPU device and give it to the createContext method
15:16:10 ... we remove the first way of creating a GPU implementation, for 1) simplicity, with this change we can simply think of a GPU context in terms of a WebGPU context so it maps 1:1
15:16:23 q?
15:17:09 ... 2) the two ways to create a GPU context are fundamentally different in one way: if you create with device type "gpu" the implementation will have to own the device that is created internally, so the "gpu" device type has to manage its GPU device, while a WebGPU device is not owned, but given to you
15:17:24 ... WebGPU is passed in terms of implementation, "this is my device, feel free to use it"
15:17:42 q+
15:17:53 ... eliminating the so-called internal device type "gpu", we don't need to own the device but we work with the WebGPU device
15:18:23 q+ to ask about extensibility to other *PU
15:19:12 ... creating a WebGPU-backed context is a little more complicated in a way, you have to create and select an adapter for WebGPU
15:19:12 ... advantages: 1) you can have multiple GPU adapters in one system
15:19:42 ... ~1/3 of all systems will have more than one adapter
15:19:57 ... design the API to get WebGPU to be in the biz of selecting the adapter
15:20:31 ... if you have the MLDeviceType "gpu" then such a system might pick a different adapter than WebGPU would use, e.g. discrete and integrated
15:21:16 ... each adapter is its own resource domain, you need to create a shareable resource to work across adapters
15:21:55 ... having just the WebGPU way of selecting an adapter, you just go design what you want, WebGPU has the same enum with three values, incl. "high-performance", "low-power"
15:21:59 q?
15:22:45 ack zkis
15:23:23 ack dom
15:23:24 dom, you wanted to ask about extensibility to other *PU
15:23:43 dom: thanks Chai, this makes a lot of sense, supportive of the direction
15:24:04 ... at some point I think we were thinking of CPU, GPU and extensibility for xPUs such as NPUs and such
15:24:18 ... I wonder how that has been taken into account
15:24:49 Chai: thanks Dom, this PR will not address all about NPU, that needs an additional change
15:25:03 ... my thinking around NPU is it'll most likely be more similar to how we do CPU currently than to GPU
15:25:28 ... because an NPU surfaces itself to the system as another adapter, a weird kind of adapter, it cannot render anything, it just does compute
15:25:38 ... in the future WebGPU might want to pick it up
15:26:23 q+
15:26:31 ... from the point of view of the hardware adapter itself, it is a lot more appropriate for WebNN to enumerate the NPU on behalf of the user
15:26:49 ... the WebGPU contract is around graphics operations
15:27:05 ... if you want the NPU, you want WebNN to deal with the adapter, more similar to CPU than GPU
15:27:26 dom: my question was, we're closing the simple path to say pick a CPU or pick an NPU
15:27:49 ... WebNN is a logical place to find logical NPU adapters? and we can figure out how to integrate that into the createContext method?
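The context-creation rules discussed above for PR #322 can be sketched as follows. This is a minimal illustrative model, not the real WebNN implementation: `ml`, the `MLContext` internals, and the single-argument `createContext` signature here are hypothetical stand-ins for the design being discussed.

```javascript
// Sketch (assumed, not spec text) of the post-PR-#322 rules:
// no MLDeviceType; the default context is an implementation detail
// (modeled here as CPU), and a GPU context exists only when the app
// passes in a WebGPU device that the app itself owns.
class MLContext {
  constructor(deviceKind, gpuDevice = null) {
    this.deviceKind = deviceKind; // resolved by the implementation
    this.gpuDevice = gpuDevice;   // non-null only for WebGPU-backed contexts
  }
}

const ml = {
  createContext(gpuDevice = undefined) {
    if (gpuDevice !== undefined) {
      // WebNN borrows the device; it does not own or create it.
      return new MLContext("gpu", gpuDevice);
    }
    // Default path: no device type option exists anymore.
    return new MLContext("cpu");
  },
};

// Default path: device selection is an implementation detail.
const cpuCtx = ml.createContext();

// GPU path: the app creates/selects the adapter via WebGPU and
// lends the resulting device to WebNN.
const fakeWebGPUDevice = { label: "app-owned GPUDevice" };
const gpuCtx = ml.createContext(fakeWebGPUDevice);

console.log(cpuCtx.deviceKind);                      // "cpu"
console.log(gpuCtx.gpuDevice === fakeWebGPUDevice);  // true
```

The sketch captures the ownership point Chai makes: the "gpu" device type forced the implementation to own a hidden device, whereas a passed-in WebGPU device stays owned by the app or framework.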
15:28:02 Chai: I see it not as a new device type but a new power preference
15:28:14 q+
15:28:16 Dom: we can consider that when we get there
15:28:17 q?
15:28:18 ack me
15:29:00 zkis: question to Chai, in earlier discussions we had a context type and a device type
15:29:19 ... with a script-managed context and a user-agent-managed context
15:29:37 ... a script manager context has valid use cases for ML frameworks that want to select an explicit type of adapter
15:29:43 s/manager/managed
15:29:55 Chai: your question is how does this work with frameworks?
15:30:44 ... the framework will own the WebGPU device; if the app uses both WebGPU and ML frameworks, e.g. WebGPU to render and TF.js to run ML, the app needs to manage the lifetime of the devices, so this makes it easier to manage with no hidden device
15:31:10 ... we make it WebNN's biz to manage the device, not to mention the performance impact given no control over the device
15:31:36 ... to answer your question, [the device] needs to be owned by either the app or the framework
15:32:20 ... the app can decide whether to use one adapter or more, ownership of the adapter will bubble up to the app or the framework
15:32:31 ... I agree with Ningxin this will complicate framework code
15:32:54 ... now they just set an enum and that's it, with these changes they must pass the device and/or own the device themselves and pass it to WebNN
15:32:56 q?
15:33:11 zkis: that solves for WebGPU, but how to create a CPU-specific context?
15:33:20 Chai: call createContext without parameters
15:33:32 zkis: but that can be overridden by GPU
15:33:46 Chai: no, you get a CPU if called without parameters
15:34:01 zkis: what if I want CPU+some accelerator?
15:34:09 Chai: I don't have an answer right now
15:34:21 zkis: we don't have an adapter abstraction in WebNN
15:34:47 Chai: an NPU adapter will look very different from any WebGPU adapter, it does not render, does not support shaders
15:34:52 zkis: owned by lower layers
15:35:13 ... how about adding another createContext method?
15:35:36 Chai: as of today, there is no notion of an NPU adapter, WebGPU could do that if it wanted to
15:35:49 ... but we don't know if they want to enumerate NPU adapters in the future
15:35:49 q?
15:37:16 Chai: the focus of the PR is between CPU and GPU and to simplify the GPU story
15:37:20 ack ningxin_hu
15:37:39 ningxin_hu: thanks for the discussion, to summarize:
15:38:59 ... 1) from the framework developer point of view, the device selection is part of WebGPU, the concern is on the compute side, because per our current spec only the default context can execute the computeSync method
15:39:33 ... for the framework developer, the graph execution part is also affected, the CPU and GPU execution paths would diverge
15:39:59 ... that's my concern regarding the framework developer impact
15:40:44 ... 2) if you look at the native frameworks, ONNX or others, commonly the fw allows the developer to select a device or some execution provider, a device abstraction, and the execution path stays the same
15:41:20 ... we abstract the device difference but allow offloading compute to different devices, but with this change a developer needs to adapt for that
15:41:25 s/2)//
15:42:10 ... 2) from the backend implementer point of view, e.g. Chrome or another browser engine, this change puts a hard requirement on any backend: it must support WebGPU
15:43:05 ... some OSes may not interact with WebGPU, e.g. the MLService of ChromeOS, which abstracts CPU, NPU etc. with no integration with WebGPU
15:44:29 ... 3) dependency: this adds a hard dependency on the WebGPU API, but WebNN interop is post-v1, now a v2 feature; after this change, GPU support would move to a "v2" feature for WebNN, otherwise we lose GPU support for WebNN "v1", and in the extreme case, if WebGPU interop does not materialize, we lose GPU support altogether
15:44:40 I have updated #302 with this proposal, Chai please take a look: https://github.com/webmachinelearning/webnn/issues/302#issuecomment-1380586654
15:44:40 https://github.com/webmachinelearning/webnn/issues/302 -> Issue 302 API simplification: context types, context options, createContext() (zolkis) v2
15:44:51 Chai: what you say is indexed on the idea that this change will make WebGPU a requirement for WebNN v1
15:45:11 q+ to point out synergy with #302 (V2 discussion)
15:45:14 ningxin_hu: correct
15:46:15 Chai: that is not entirely true, given the reason we introduce the command encoder interface is to allow submitting work to WebGPU and to let the WebGPU implementation manage its queue
15:46:56 ... you still own submitting to the queue, you populate the queue with a workload with MLCommandEncoder, which allows interop with the WebGPU API and expects the app to submit the work
15:47:40 ... all is behind the scenes, we are going to have our own queue, and submit just the ML workload and get the final result of the execution
15:47:56 ... the compute method for WebGPU is a wrapper on top of the work submission mechanism
15:48:15 ... it does not make WebGPU interop a requirement, it changes the implementation of submitting work to the GPU
15:49:15 ... with an internal GPU you also own your device, ask for a queue, and you have your output data in an output buffer you hand out to the caller
15:49:15 q?
15:49:40 ... ONNX RT as an example, to execute on the GPU you give it a GPU queue, because apps using ONNX RT may use a graphics API to do graphics work
15:50:00 ... for GPU submission, for ORT, you give it a queue
15:50:12 ... eventually the owner of the GPU device tends to be the app
15:50:49 ... the notion that we wrap everything works but is less flexible
15:50:49 ... it makes efficient interop much harder, because the app does not see the device
15:51:21 ... this change makes that slightly harder to implement but hopefully gives more flexibility to the app
15:52:04 q+
15:52:16 ack ningxin_hu
15:52:40 ningxin_hu: if MLContext compute can work with a WebGPU context it would make me feel better
15:53:46 ... I had a concern re throwing an operation error in compute() as commented in GH
15:54:09 Chai: I'll look into that and try to fix that
15:54:12 q?
15:54:48 zkis: a pointer to Chai that there's a connection to #302 marked for "v2", check the updated comment there, it's almost the same thing as what you proposed but makes the context type explicit
15:54:48 https://github.com/webmachinelearning/webnn/issues/302 -> Issue 302 API simplification: context types, context options, createContext() (zolkis) v2
15:55:06 ... please review
15:55:18 q?
15:55:22 ack zkis
15:55:22 zkis, you wanted to point out synergy with #302 (V2 discussion)
15:55:57 q?
15:56:40 q?
15:56:47 RRSAgent, draft minutes
15:56:48 I have made the request to generate https://www.w3.org/2023/01/12-webmachinelearning-minutes.html anssik
15:57:00 Probably we could loop in Ping on this issue for feedback
15:57:09 Subtopic: Transfer the input and output views for asynchronous execution
15:57:14 I'll check with Ping
15:57:17 anssik: issue #318
15:57:18 https://github.com/webmachinelearning/webnn/issues/318 -> Issue 318 The input and output resources race condition issue of asynchronous execution (huningxin)
15:57:24 ... Jiawei spotted this bug as part of the Chromium implementation review.
15:57:51 ... The issue summary is that for WebNN graph async execution, the main thread and the worker thread may access the input or output array buffers at the same time, which can cause a race condition.
15:58:11 ... Ningxin submitted PR #323 to fix this. Much thanks for that! Also much thanks to Domenic for his advice, large parts of which are incorporated into the PR.
15:58:11 https://github.com/webmachinelearning/webnn/issues/323 -> Pull Request 323 Transfer the input and output views for asynchronous execution (huningxin)
15:58:20 ... currently the PR is welcoming everyone's review, I'd like to see this land very soon given it fixes a critical bug.
15:58:27 ... Ningxin, anything you'd like to share regarding this change?
15:59:15 ningxin_hu: good summary, the only thing to add: after investigation, this is a well-known design consideration, noted as a warning in Web IDL, and we are using the preferred solution, "Bring Your Own Buffer"
15:59:46 ... from the Web IDL point of view, frameworks will bring their own buffers
16:00:40 ... this is solved by the Streams API and documented in the Web IDL spec and we follow that design, check it out for more details, Domenic also has a blog post about this, check the issue for these materials
16:00:40 RRSAgent, draft minutes
16:01:11 I have made the request to generate https://www.w3.org/2023/01/12-webmachinelearning-minutes.html anssik
16:01:19 q?
16:01:27 zkis: this discussion has been extremely helpful for the upcoming algorithm discussions
16:01:27 Subtopic: Improve graph execution steps
16:01:37 anssik: issue #316 documents editorial improvements to improve the steps for async and sync execution.
16:01:38 https://github.com/webmachinelearning/webnn/issues/316 -> Issue 316 Review sync vs async compute differences (zolkis)
16:01:54 ... Zoltan has a WIP PR #319 out that defines generic graph execution steps and uses them in the sync and async compute() methods.
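The race described in issue #318 is avoided by detaching (transferring) the caller's buffers for the duration of execution: once transferred, the calling thread can no longer touch the backing memory. A minimal Node sketch of the transfer semantics PR #323 builds on, not the spec's exact algorithm (`transferBuffer` is an illustrative helper):

```javascript
// Sketch of the "transfer the views" idea: standard structuredClone with a
// transfer list detaches the source ArrayBuffer and moves its memory, the
// same mechanism postMessage() uses to hand buffers to a worker.
function transferBuffer(buffer) {
  return structuredClone(buffer, { transfer: [buffer] });
}

const input = new Float32Array([1, 2, 3, 4]);
const inputBuffer = input.buffer;

// Simulate the start of async compute(): transfer the input so the caller
// cannot mutate it while the (conceptual) worker-side computation runs.
const transferred = transferBuffer(inputBuffer);

console.log(inputBuffer.byteLength);            // 0 -- detached on the caller side
console.log(new Float32Array(transferred)[2]);  // 3 -- data lives on in the clone
```

In the actual PR the implementation would transfer the views back (or hand out fresh views) when compute() resolves, which is what makes this "Bring Your Own Buffer": the framework supplies the buffers and gets them back, rather than the API allocating output storage.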
16:01:55 https://github.com/webmachinelearning/webnn/issues/319 -> Pull Request 319 WiP: Improve graph execution steps (zolkis)
16:02:07 zkis: I need to rework these when Ningxin's PR is merged
16:02:12 RRSAgent, draft minutes
16:02:13 I have made the request to generate https://www.w3.org/2023/01/12-webmachinelearning-minutes.html anssik
16:02:52 Subtopic: Simplify the operand layout support of conv2d and pooling 2d operations
16:02:52 anssik: in issue #324 Ningxin explains how in the existing WebNN spec, conv2d supports two input operand layouts defined by MLInputOperandLayout and four filter operand layouts defined by MLConv2dFilterOperandLayout.
16:02:52 https://github.com/webmachinelearning/webnn/issues/324 -> Issue 324 Simplify the operand layout support of conv2d and pooling 2d operations (huningxin)
16:03:09 ... Ningxin concludes this may make the implementation more complicated, especially if a native ML framework or OS API doesn't support some of these layouts
16:03:17 ... this feedback was also thanks to the Chromium review by Jiawei - another demonstration of why working on the spec and impl in tandem is so effective
16:03:55 ... To fix this, Ningxin proposes to reduce the supported operand layouts and keep just the default operand layout
16:03:55 ... and let the layout adaptation and graph-level optimization be handled by ML frameworks that usually already support such functionalities.
16:03:58 ... Ningxin, please feel free to fill me in
16:04:11 need more inputs from framework developers
16:04:30 q?
16:04:31 I can also @ping in that issue
16:04:31 https://github.com/ping -> @ping
16:04:49 Topic: WebNN API Candidate Recommendation readiness
16:05:38 anssik: I'd reached what I'd describe as a "W3C Process-defined expectation bar for CR readiness" == green with some yellow in the CR readiness tracker #240
16:05:38 https://github.com/webmachinelearning/webnn/issues/240 -> Issue 240 Candidate Recommendation readiness tracker (anssiko)
16:05:57 ... - we can show that the specification has met all Working Group requirements
16:06:08 ... - we have added no new normative references since FPWD
16:06:48 ... - we can document how adequate implementation experience will be demonstrated, thanks to well-advanced implementations across multiple backends and the first version of the test suite that just landed in the WPT repo
16:07:01 ... - we have strong evidence that the spec has received wide review
16:07:22 ... - we have identified WebGPU interop as a feature at risk for the initial CR and we plan to address it in CR updates
16:07:55 ... but because we're an ambitious WG, I've set our CR quality bar higher than normal so that we do not just meet the bar but exceed the quality expectations
16:08:08 ... thus I allow us some time for final polish before we ship the CR during Q1
16:08:42 q?
16:09:08 Topic: Meeting scheduling adjustment
16:09:13 anssik: Chinese New Year is Jan 22, 2023
16:09:25 ... the 2023 animal sign is the rabbit, the luckiest of all the twelve animals in the Chinese zodiac :-)
16:09:31 ... Chinese folks will get 7 days off from work from January 21st to January 27th in 2023
16:09:37 ... in consideration of this, I will cancel our next scheduled meeting that overlaps with this holiday period.
16:09:42 ... Happy New Year to our participants from China!
16:09:55 ... I'm proposing we push our bi-weekly meeting cycle forward by one week so that we'd meet:
16:09:55 ... 2 Feb, 16 Feb, 2 Mar, 16 Mar, 30 Mar etc. at the usual time 15:00-16:00 UTC
16:09:59 RRSAgent, draft minutes
16:10:00 I have made the request to generate https://www.w3.org/2023/01/12-webmachinelearning-minutes.html anssik
16:10:07 Thanks Anssi!
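Returning to the conv2d layout discussion under issue #324 above: if the spec keeps only the default operand layout, the layout adaptation would fall to ML frameworks, as Ningxin proposes. A hypothetical sketch of such a framework-side adaptation step (the function name and shapes are illustrative, not from the spec):

```javascript
// Hypothetical framework-side layout adaptation: repack a flat NHWC tensor
// into NCHW before handing it to a conv2d that only accepts the default
// ("nchw") input layout.
function nhwcToNchw(data, [n, h, w, c]) {
  const out = new Float32Array(data.length);
  for (let i = 0; i < n; i++)
    for (let y = 0; y < h; y++)
      for (let x = 0; x < w; x++)
        for (let k = 0; k < c; k++)
          // NHWC source index -> NCHW destination index
          out[((i * c + k) * h + y) * w + x] = data[((i * h + y) * w + x) * c + k];
  return out;
}

// 1x2x2x2 tensor, values laid out channel-last (NHWC).
const nhwc = Float32Array.from([0, 1, 2, 3, 4, 5, 6, 7]);
const nchw = nhwcToNchw(nhwc, [1, 2, 2, 2]);
console.log(Array.from(nchw)); // [0, 2, 4, 6, 1, 3, 5, 7]
```

Frameworks such as those mentioned in the discussion typically fold this repacking into graph-level optimization so it runs once, not per inference, which is part of the argument for keeping the spec's layout surface minimal.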
16:10:55 RRSAgent, draft minutes
16:10:57 I have made the request to generate https://www.w3.org/2023/01/12-webmachinelearning-minutes.html anssik
16:14:13 s/I'd reached/we've reached
16:14:16 RRSAgent, draft minutes
16:14:17 I have made the request to generate https://www.w3.org/2023/01/12-webmachinelearning-minutes.html anssik