14:57:22 RRSAgent has joined #webmachinelearning
14:57:22 logging to https://www.w3.org/2021/03/18-webmachinelearning-irc
14:57:24 Chair: Anssi
14:57:25 RRSAgent, make logs Public
14:57:35 Agenda: https://github.com/webmachinelearning/meetings/blob/master/telcons/2021-03-18-agenda.md
14:57:45 Meeting: WebML CG Teleconference – 18 March 2021
14:57:58 Scribe: Anssi
14:58:03 scribeNick: anssik
14:58:10 Present+ Anssi_Kostiainen
15:00:37 Present+ Ningxin_Hu
15:00:57 Present+ Chai_Chaoweeraprasit
15:03:38 Present+ Rafael_Cintron
15:04:42 Topic: webnn-native update
15:04:52 -> https://github.com/webmachinelearning/webnn-native/pull/1 webnn-native initial implementation
15:05:35 Present+ Ganesan_Ramalingam
15:05:44 anssik: "The initial implementation supports 10 first-wave ops required by LeNet example including add, averagePool2d, conv2d, matmul, maxPool2d, mul, relu, reshape, softmax and reshape."
15:05:48 anssik: Ningxin to share an update and plans
15:05:51 ... and raise any topics the group should weigh in on
15:05:59 Present+ Wonsuk_Lee
15:06:26 ningxin_hu: the Dawn project code generator is reused by webnn-native
15:07:10 ... WebNN C/C++ headers are generated by the Dawn code generator from webnn.json and generator/templates
15:07:23 ... webnn.h: the C header (generated as out/Release/gen/src/include/webnn/webnn.h after build)
15:07:29 ... webnn_cpp.h: a C++ wrapper for webnn.h (generated as out/Release/gen/src/include/webnn/webnn_cpp.h after build)
15:07:40 ... designed to keep up with spec changes more easily
15:07:58 ... infrastructure code such as base objects and interface classes is also generated
15:08:33 ... second part, backend implementations:
15:08:42 ... DirectML on Windows 10 (under src/webnn_native/dml/)
15:08:53 ... OpenVINO on Windows 10 and Linux (under src/webnn_native/openvino/)
15:09:10 ... unit and end2end tests cover the 10 first-wave ops (under src/tests/)
15:09:24 ... lastly, there's a LeNet C++ example that is equivalent to webmachinelearning/webnn-samples/lenet (under src/examples/LeNet)
15:12:19 ningxin_hu: Apache 2.0 licensed
15:12:54 ... 3rd-party dependencies: the code generator and infrastructure code of the Dawn project, and the DirectMLX and device wrapper of the DirectML project
15:13:34 ... it'd be welcome if Chai and Rafael could share their comments
15:13:47 present+
15:13:55 q?
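(ed. note: for orientation, a minimal TypeScript sketch of the kind of graph the LeNet example builds, written against an assumed web-facing WebNN API shape of the time (MLGraphBuilder, conv2d, maxPool2d, reshape, matmul, softmax, build); the stub interfaces, exact signatures, and whether build is sync or async are assumptions and may differ from the spec snapshot and from the generated webnn-native C++ wrapper.)

```ts
// Hypothetical stubs for the assumed API shape; not the spec text.
interface MLOperand {}
interface MLGraph {}
interface MLOperandDescriptor { type: 'float32'; dimensions: number[]; }
interface MLGraphBuilder {
  input(name: string, desc: MLOperandDescriptor): MLOperand;
  constant(desc: MLOperandDescriptor, data: Float32Array): MLOperand;
  conv2d(x: MLOperand, filter: MLOperand): MLOperand;
  add(a: MLOperand, b: MLOperand): MLOperand;
  relu(x: MLOperand): MLOperand;
  maxPool2d(x: MLOperand, opts: { windowDimensions: number[]; strides: number[] }): MLOperand;
  reshape(x: MLOperand, newShape: number[]): MLOperand;
  matmul(a: MLOperand, b: MLOperand): MLOperand;
  softmax(x: MLOperand): MLOperand;
  build(outputs: Record<string, MLOperand>): MLGraph;
}

// A LeNet-flavoured chain exercising several of the first-wave ops.
function buildSmallLeNet(builder: MLGraphBuilder,
                         convWeights: Float32Array,        // 6x1x5x5
                         convBias: Float32Array,           // 1x6x1x1
                         fcWeights: Float32Array): MLGraph { // 864x10
  const x = builder.input('input', { type: 'float32', dimensions: [1, 1, 28, 28] });
  const conv = builder.add(
    builder.conv2d(x, builder.constant({ type: 'float32', dimensions: [6, 1, 5, 5] }, convWeights)),
    builder.constant({ type: 'float32', dimensions: [1, 6, 1, 1] }, convBias));
  const pool = builder.maxPool2d(builder.relu(conv),
    { windowDimensions: [2, 2], strides: [2, 2] });        // -> 1x6x12x12
  const flat = builder.reshape(pool, [1, 864]);            // 6 * 12 * 12 = 864
  const logits = builder.matmul(flat,
    builder.constant({ type: 'float32', dimensions: [864, 10] }, fcWeights));
  return builder.build({ output: builder.softmax(logits) });
}
```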
15:14:59 anssik: any blockers to landing the PR?
15:15:21 ningxin_hu: for DML usage, I would like to get Chai's approval for that
15:15:30 Chai: I have been looking at this on and off, it is a lot of work
15:15:46 ... will look at this a bit more
15:15:55 Present+ Ping_Yu
15:16:10 q+
15:16:52 ningxin_hu: the implementation doesn't reflect the very latest spec, but will catch up
15:17:48 ... the webnn-native high-level goals: 1) inform the WebNN API work and the group about op compatibility and 2) provide a performance benchmark
15:17:57 q?
15:18:00 ack RafaelCintron
15:18:20 RafaelCintron: what is the long-term maintenance story?
15:19:16 q+
15:19:27 ningxin_hu: I will continue work on this project in addition to the spec work; I commit to maintaining the project, cannot speak for other contributors of course
15:19:30 q?
15:19:59 ack wonsuk__
15:20:30 wonsuk__: a question about the plan: this is using C++, mostly focused on desktop and embedded systems, right?
15:20:44 ... any plans for mobile implementations, e.g. Android or iOS?
15:21:08 ningxin_hu: from the API perspective we use C/C++ to make this standalone, so there is no runtime or app framework dependency
15:22:24 ... I don't think mobile app usage is our primary goal, but nothing prevents integration in such environments
15:22:28 q?
15:23:10 ningxin_hu: Ping asked an important question about Wasm usage
15:23:30 ... if we need to continue the Wasm usage investigation, we need some tooling for Emscripten
15:23:53 ... that is missing today, but if we want to continue the Wasm investigation this project could be used to fill in that gap
15:24:44 Ping: our thinking is that, if this can be made available, it could be used in the Wasm world similarly to the WebGPU bindings (ed. note: line was breaking, scribe struggles)
15:25:02 q?
15:25:26 https://github.com/webmachinelearning/webnn-native/
15:25:41 Topic: Operation-specific APIs
15:25:52 anssik: Ping and Jonathan shared the Operation-specific APIs proposal use cases and requirements
15:25:57 -> https://github.com/webmachinelearning/proposals/issues/2 Operation-specific APIs proposal
15:26:18 anssik: I'd like us to unpack this proposal and discuss whether the high-level requirements from the proposal could be translated into WebNN API requirements, and whether there's support for that design direction
15:26:35 ... in the agenda I enumerated some requirements derived from discussion in PR #149
15:26:48 ... Req: Direct access to the convolution op
15:26:55 ... Addressed by: https://github.com/webmachinelearning/webnn/pull/149#discussion_r591634607
15:27:34 ... correct?
15:27:35 q?
15:27:42 q+
15:27:46 ack ningxin_hu
15:28:07 ningxin_hu: my point is, if we use the WebNN graph API, developers can create a single-op graph
15:28:37 ... and thanks to Chai's PR, creating a single-op graph reduces to two steps, which could map to Ping's requirement in the Operation-specific APIs proposal
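(ed. note: a minimal TypeScript sketch of the two-step pattern described above: describe a one-op, or fused conv+relu, graph, then compile it once and let the framework reuse the compiled MLGraph like a native op call; the interface stubs are assumptions about the API shape around PR #149, not the spec text.)

```ts
// Hypothetical stubs for the assumed API shape; not the spec text.
interface MLOperand {}
interface MLGraph {}
interface MLGraphBuilder {
  input(name: string, desc: { type: 'float32'; dimensions: number[] }): MLOperand;
  constant(desc: { type: 'float32'; dimensions: number[] }, data: Float32Array): MLOperand;
  conv2d(x: MLOperand, filter: MLOperand): MLOperand;
  relu(x: MLOperand): MLOperand;
  build(outputs: Record<string, MLOperand>): MLGraph;
}

// A framework-side "convolution op": compiled once at model-load time,
// then cached and invoked like a native convolution call.
function compileConvRelu(builder: MLGraphBuilder, filterData: Float32Array): MLGraph {
  // Step 1: describe the single-op (here conv2d fused with relu) graph.
  const x = builder.input('x', { type: 'float32', dimensions: [1, 3, 224, 224] });
  const w = builder.constant({ type: 'float32', dimensions: [32, 3, 3, 3] }, filterData);
  const y = builder.relu(builder.conv2d(x, w));
  // Step 2: compile ahead of time; the framework hangs on to the returned MLGraph.
  return builder.build({ y });
}
```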
15:30:35 Ping: I think it looks fine to use a graph for this; one concern is performance
15:31:13 ... we can use both the graph API and an op API for fusion
15:31:34 ... the only concern is performance, if there's overhead with the graph API
15:32:03 q+
15:32:08 ack Chai__
15:32:59 Chai: I would expect the user of the op-specific convolution to go through the process of compilation before using it; it happens in the framework anyway, e.g. conv+relu
15:33:24 ... you'd create a subgraph for conv+relu and compile it ahead of time; this is something that should already happen in any framework when it loads the model
15:34:11 ... by the time you run them you call into the native convolution API; this happens in WinML and TF, it's a pretty common step
15:34:41 ... executing an op immediately is essentially a combination of the prior steps, including compilation, but it is not expected to happen every time
15:34:55 ... I'm not concerned about performance
15:35:03 the question is whether the caching is done at the framework level or the WebNN level?
15:35:06 ... relative to native
15:35:40 Chai: I think it should be done at the framework level, because the graph has no notion of internal caching
15:36:00 ... there's a GraphBuilder object that has the context for graph building and compilation, and it can be managed by the user
15:36:14 ... the lifetime of that builder belongs to the framework; it is only for building, not for executing
15:36:22 for example WebGL has shader caching; I think WebNN might want to create low-level caching for compiled graphs
15:36:35 ... the MLGraph, the compiled graph, is something the framework would hang on to
15:37:02 ... maybe conv+relu, or just conv; the framework should keep them, and when executing it just invokes them, similar to a native API
15:37:33 Ping: thanks for the explanation
15:37:52 ... I think WebGL has its own shader caching; we try to hide those caches
15:39:28 Chai: from the implementation standpoint, the compile step goes from the platform to the driver, interrogating what the driver supports
15:39:47 ... DML would then create the pathway to reach the hardware block
15:40:54 ... not worried about performance, because it is going to be the same as with native, unless your browser implementation does something inefficient
15:41:19 ... the spec allows for an efficient implementation, so the WebNN API spec does not constrain performance
15:41:39 ... from the API semantics point of view, this is exactly what it is designed for
15:41:45 sounds good
15:42:33 anssik: Req: Support for native tensor types, GPU buffers
15:42:41 ... Addressed by https://github.com/webmachinelearning/webnn/pull/149 for GPU buffers
15:43:26 ... what is missing? Is this a reasonable req?
15:45:28 Ping: the question is whether we should keep a device-specific graph, or a device-agnostic graph compiled to a specific device
15:45:44 ... Req: Device preference setting when compiling a graph to reduce IO between accelerators
15:45:59 q+
15:46:04 ack ningxin_hu
15:46:16 ningxin_hu: I asked this question at our last meeting
15:46:24 ... in the PR discussion Chai mentioned a use case
15:46:42 ... to create a device-specific graph that allows constants to be loaded from GPU or device buffers
15:46:49 ... in the PR discussions my question was resolved
15:47:05 ... looking at the latest spec, a device-specific graph can be created
15:47:06 q?
15:47:52 Chai: the last outstanding issue is around how the caller of this API, using it as an op API, can do resource upload and download(?)
15:48:36 ... it should already be supported the way the PR is written: if the context is created from an existing device and inputs and outputs are submitted as device resources, then by definition the caller of this API is responsible for manually downloading back to CPU memory
15:49:03 ... I haven't explained this well in the spec, but I want to improve this and add more text to amend PR #149
15:49:24 ... if the caller asks WebNN to create a context for their own device, they're responsible
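(ed. note: a rough TypeScript sketch of the division of responsibility Chai describes: the context wraps the caller's own device, inputs and outputs are device buffers, and the caller does the readback to CPU memory; the stub types, readBackToCPU(), and the compute() signature are illustrative assumptions, not the PR #149 text.)

```ts
// Hypothetical stubs; real WebNN/WebGPU interop types and names may differ.
interface GPUBufferLike { readBackToCPU(): Promise<Float32Array>; } // stand-in readback helper
interface MLGraph {
  // Assumed: when the context wraps the caller's device, compute() accepts device buffers.
  compute(inputs: Record<string, GPUBufferLike>, outputs: Record<string, GPUBufferLike>): void;
}

async function runOnCallerDevice(graph: MLGraph,
                                 input: GPUBufferLike,
                                 output: GPUBufferLike): Promise<Float32Array> {
  // WebNN executes on the caller's device and writes into the caller's output buffer...
  graph.compute({ x: input }, { y: output });
  // ...and the caller, not WebNN, downloads the result back to CPU memory when needed.
  return output.readBackToCPU();
}
```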
15:50:14 q+
15:50:17 q?
15:50:27 ack ningxin_hu
15:51:00 ningxin_hu: I have no question about the GPU interop, but a question about the use case for the op-specific API in the Wasm CPU scenario
15:51:20 ... in the latest API you can only interact with CPU buffers for Wasm
15:51:42 ... that way, if the device is CPU, the implementation will use some optimized memory layout for hardware acceleration
15:52:00 ... if the user code uses Wasm and ArrayBuffer, that'll result in every input and output being re-laid out
15:52:21 ... not optimal when executing multiple ops without access to intermediate results
15:53:42 maybe we should open an issue to track the Wasm interop scenario
15:53:44 Ping: my question is, for Wasm, whether the CPU path is supposed to always bring back the value of the op, or whether it is a handle to the tensor
15:53:59 I'd like to learn more, maybe there is something we can do to improve it further
15:55:00 anssik: I'd propose spinning the relevant high-level requirements for the WebNN API into their own issues to be discussed and resolved one by one
15:55:02 I'll create an issue
15:55:16 q+
15:55:21 ack RafaelCintron
15:55:53 RafaelCintron: wanted to say, one thing we need to clarify is at which point the web developer's weights are used in the compilation
15:56:25 ... if they give us an ArrayBuffer and change it in between, we need to specify when that works
15:56:38 ... we should be more explicit about whether compilation means we copy things
15:56:59 +1 on Rafael
15:57:09 anssik: Rafael to open an issue on that
15:57:20 anssik: Req: Optimization for single-op graph execution
15:58:47 yes, I agree the graph API can address the op-level requirement
15:59:03 anssik: Ping, are you fine with a subset of the WebNN API satisfying the reqs of the op-specific APIs?
15:59:08 Ping: SGTM
15:59:47 anssik: I propose we keep https://github.com/webmachinelearning/proposals/issues/2 open until we have addressed all the reqs in the WebNN API
16:01:48 Topic: Adjourn
16:01:51 RRSAgent, draft minutes v2
16:01:51 I have made the request to generate https://www.w3.org/2021/03/18-webmachinelearning-minutes.html anssik
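(ed. note: a small TypeScript sketch of the ambiguity Rafael's issue is about: if a framework hands weight data to constant() and then mutates the ArrayBuffer before build(), which bytes does the compiled graph see? That depends on whether constant()/build() copy eagerly or keep a reference; the stub interfaces are assumptions, reusing the shape sketched earlier in these notes.)

```ts
// Hypothetical stubs for the assumed API shape; not the spec text.
interface MLOperand {}
interface MLGraph {}
interface MLGraphBuilder {
  constant(desc: { type: 'float32'; dimensions: number[] }, data: Float32Array): MLOperand;
  relu(x: MLOperand): MLOperand;
  build(outputs: Record<string, MLOperand>): MLGraph;
}

// A degenerate graph, just to show the timing question around weight copies.
function buildWithMutatedWeights(builder: MLGraphBuilder): MLGraph {
  const weights = new Float32Array([1, 2, 3, 4]);
  const w = builder.constant({ type: 'float32', dimensions: [2, 2] }, weights);
  weights.fill(0);                              // mutate after handing the buffer to WebNN...
  return builder.build({ y: builder.relu(w) }); // ...does the graph bake in [1,2,3,4] or zeros?
}
```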