13:58:19 RRSAgent has joined #webmachinelearning
13:58:19 logging to https://www.w3.org/2020/05/14-webmachinelearning-irc
13:58:20 Zakim has joined #webmachinelearning
13:58:28 RRSAgent, make logs public
13:58:36 Meeting: WebML CG Teleconference – 14 May 2020
13:58:42 Chair: Anssi
13:59:03 Agenda: https://github.com/webmachinelearning/meetings/blob/master/telcons/2020-05-14-agenda.md
13:59:07 Scribe: Anssi
13:59:11 scribeNick: anssik
14:00:00 Present+ Anssi_Kostiainen
14:00:04 Present+ Andrew_Brown
14:00:09 Present+ Chai_Chaoweeraprasit
14:00:13 Present+ Ningxin_Hu
14:00:48 Present+ Rafael_Cintron
14:01:03 Present+ Ganesan_Ramalingam
14:01:08 rama has joined #webmachinelearning
14:01:13 Present+ Greg_Whitworth
14:01:27 Present+ Mingqiu_Sun
14:02:12 RafaelCintron has joined #webmachinelearning
14:02:23 RRSAgent, draft minutes v2
14:02:23 I have made the request to generate https://www.w3.org/2020/05/14-webmachinelearning-minutes.html anssik
14:02:27 Present+ G_Ramalingam
14:02:30 TOPIC: WebNN first wave models and ops: ONNX and XLA-HLO intersection
14:03:03 anssik: PR updated with input from Chai and Daniel to add ONNX and XLA-HLO columns
14:03:08 -> https://github.com/webmachinelearning/webnn/pull/52 Add first_wave_models.md (PR #52)
14:03:18 anssik: PR pending Daniel's review.
14:03:32 abrown has joined #webmachinelearning
14:03:54 ningxin_hu: I've integrated Daniel's and Chai's input into the table
14:03:56 https://github.com/webmachinelearning/webnn/blob/d97d2dba34b82bc2bb579c55fb5741a82f253f93/op_compatibility/first_wave_models.md
14:04:03 -> https://github.com/webmachinelearning/webnn/blob/d97d2dba34b82bc2bb579c55fb5741a82f253f93/op_compatibility/first_wave_models.md first_wave_models.md (HTML preview)
14:04:40 ningxin_hu: the latest update was to add ONNX and XLA-HLO data for the ops; for example, Daniel commented that some ops can be lowered to more primitive ops in XLA-HLO
14:04:56 ... that's incorporated into the table now, so all comments have been addressed
14:05:18 ... another update is the gemm op; that's also added to the table to reflect the recent spec changes
14:05:29 anssik: Any blockers that need to be discussed and resolved on the call?
14:06:10 PROPOSED RESOLUTION: Merge First Wave Model PR #52
14:06:18 anssik: any concerns?
14:06:22 [hearing none]
14:06:31 RESOLUTION: Merge First Wave Model PR #52
14:07:23 paul-mcdadniel-msft has joined #webmachinelearning
14:07:40 TOPIC: Element-wise add & mul, concat, reshape, gemm, and transpose ops WebNN API definitions
14:07:55 anssik: to summarize the work, 3 PRs are open and 1 has been merged already:
14:08:06 hello !
14:08:09 Present+ Paul_McDaniel
14:08:26 -> https://github.com/webmachinelearning/webnn/pull/54 element-wise add & mul (PR #54)
14:08:32 -> https://github.com/webmachinelearning/webnn/pull/55 concat (PR #55)
14:08:38 -> https://github.com/webmachinelearning/webnn/pull/57 reshape (PR #57)
14:08:43 -> https://github.com/webmachinelearning/webnn/pull/58 gemm and transpose (PR #58)
14:08:53 anssik: specifically, PRs #54, #55, and #57 are open; PR #58 has been merged.
14:09:03 ... let's take these op definitions one by one
14:09:08 TOPIC: Element-wise add & mul
14:09:21 anssik: pending Chai's review, merge conflicts now resolved.
14:09:25 ... other opens?
14:09:46 ningxin_hu: no other opens, waiting for Chai's review
14:10:13 TOPIC: Concat
14:10:19 anssik: pending Rama's review
14:10:40 ... Rama suggested the use of negative axis values to count backwards from the last axis
14:11:01 ... Ningxin shared that some OS APIs do not support negative axis values; that would need to be handled in the browser implementation. An alternative proposal is to leave it to the JS framework level
14:11:07 ... Comments?
14:11:55 ningxin_hu: waiting for Rama's opinion on whether supporting only positive axis values is OK and leaving it to JS frameworks to handle the conversion
14:12:32 rama: that's definitely fine; I guess the only case where that could be a problem is if the rank is not statically known, then you cannot do this at build time and must do it at runtime
14:13:16 ningxin, i just approved the add PR. Looks good.
14:13:49 TOPIC: Reshape
14:13:56 anssik: Pending Rama's review
14:14:11 ... it seems this PR could probably be merged, since Rama's suggestion to overload shape() could be added later; thoughts?
14:14:21 rama: that's fine
14:14:55 TOPIC: Gemm and transpose
14:15:43 anssik: General Matrix Multiply (gemm) and transpose; a lot of good design discussion in this PR
14:15:47 ... initially explored an idea to make gemm() a static method on the nn interface instead of a regular interface method, to allow the caller to choose whether to use the static helper functions or to explicitly call into regular interface operations.
14:15:51 ... then dropped the static gemm() operation and instead created a non-normative note section that explains how the behavior of this operation can be generically emulated
14:15:57 ... this convention is borrowed from the WebGL spec (thanks Rafael!)
14:16:50 Chai: Gemm goes back to the model table; Gemm is a high-level op that can be implemented by using other ops, and one of the ops that are often handled as a single op at the OS level
14:17:23 ... e.g. in DirectML it is handled as a single op; the question is how to handle high-level ops
14:17:54 ... need to understand the core set of the interface that needs to be implemented; high-level ops can be implemented with something else, so they could be considered "optional"
14:18:28 ... in the beginning explored the idea and looked at the WebIDL static method approach; based on feedback, figured out it might create more confusion than clarity
14:19:09 ... the WebGL spec tackles similar issues using non-normative notes, so following the same pattern for the gemm definition (thanks Rafael!)
14:19:55 ... this makes it clear in the spec that the op is in fact a high-level op, and the caller can use the pseudo-code defined in the note, which decomposes it into core low-level ops
14:20:23 ... consider that a fair balance in choosing between small and big ops, as debated earlier
14:20:33 ... everyone seems to have signed off on this approach
14:20:54 ... this applies to all the ops in the table, likely relu and pooling, as Daniel suggested
14:21:11 ... for big ops we should be consistent and explain how big ops can be implemented in terms of other ops
14:21:44 ... last note: gemm requires transpose, so added that in the same PR; for any big op, if it turns out we need a primitive op, we need to add that
14:22:27 q+
14:23:10 ack ningxin_hu
14:23:42 ningxin_hu: first, I'd like to mention there's an agenda item "Intersection between XLA & ONNX (Paul)"
14:24:19 let's push the XLA & ONNX ops table discussion to the next session, thanks !
14:24:21 ... second, a comment on Chai's high- and low-level primitives; my question is about models, I'd like to propose we can add activations incl. relu, leaky relu
14:25:35 high level ones: relu, leakyRelu, clip
14:25:47 low level ones: element-wise min and max
14:26:07 anssik: any feedback?
14:26:09 q?
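[Editorial illustration, not discussed verbatim on the call: a minimal sketch, assuming a JS-framework layer, of how a negative concat axis could be mapped to a non-negative one before calling the browser API. The builder variable and the concat signature below are hypothetical, not the spec's final API.]

  // Minimal sketch, assuming the rank is known at graph-build time.
  // `builder.concat` is a hypothetical stand-in for the WebNN concat op
  // under discussion, which in this proposal would accept only axis >= 0.
  function normalizeAxis(axis, rank) {
    const normalized = axis < 0 ? axis + rank : axis;
    if (normalized < 0 || normalized >= rank) {
      throw new RangeError(`axis ${axis} is out of range for rank ${rank}`);
    }
    return normalized;
  }

  // Example: concatenate two rank-4 tensors along the last axis (-1 maps to 3).
  // const output = builder.concat([a, b], normalizeAxis(-1, 4));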
14:27:07 Ping: from the TF.js perspective, I see the benefit of high-level ops; does the differentiation between low and high level reflect the system perspective, i.e. are the high-level ops the ones likely to be implemented by the OS?
14:27:11 Present+ Ping_Yu
14:27:44 q+
14:27:59 Rama: the non-normative note describes how the high-level op can be implemented in terms of low-level ops; different implementations can implement those differently
14:28:20 Ping: conceptually, does the user need to understand the difference, or is this hidden from the user?
14:28:29 Rama: this is an implementation detail, hidden from the user
14:29:09 Chai: from the user's point of view, high- and low-level ops are available and they can use one or the other; the purpose of adding a non-normative section is to say this op will probably be faster if you use it rather than implement it yourself (in JS)
14:29:37 ... the WebNN API is a browser API and gemm is treated as a single unit; on Windows, DirectML supports gemm as a single unit
14:30:17 ... so the implementation is a lot more efficient on the GPU using the high-level op than the low-level ops, which require copies between ops, especially with large tensors
14:30:39 ... we'd implement both; the goal of the spec should be to focus on ops that we already know are supported in OS APIs as single units
14:31:14 ... anything can be a big op, but the differentiation is that we only include the ones we know are implemented as a fused unit, in the interest of keeping the spec smaller
14:31:32 ... the point of defining big ops is to allow mapping to the OS layer that implements them as a single unit
14:32:09 Ping: my question comes from our colleagues from Android NN API; their design is more about the system becoming a compiler, they try to compile the graph into high-level ops against their implementation
14:33:15 ... the compilation effort is put onto the user in this case; the other way around is to leave this to the OS and let the OS compile the graph instead of the user
14:33:29 Rama: AFAICS, the spec allows both
14:35:05 Ping: for people who compile the model, is it not the best way forward?
14:35:36 Chai: for instance, we know if you have certain ops you can fuse them; that's happening at the OS level
14:35:43 ... OS-level fusion can also be more dynamic
14:36:50 ... we want to leave the option open so that if the caller wants to handle it as a single unit, it has an option to do that; that is what the spec represents, it allows both options; if the OS is able to do more complex fusion, we're able to do that with the current spec, and the spec describes what is possible
14:37:02 ... provides options for both paths
14:37:16 Ping: OS can do further optimization?
14:37:20 Chai: Correct.
14:38:25 Ping: another question, maybe related, for the compiler, should we provide a different type...?
14:39:18 Chai: I made one attempt to differentiate using a static method; the feedback was that from the browser's point of view it needs to implement both, so no benefit; it has a benefit to the caller, who can see that a high-level op is different
14:39:43 ... now the spec describes how those big ops can be implemented in terms of small ops
14:40:06 q?
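[Editorial illustration, not part of the minutes: a sketch of the kind of decomposition the non-normative gemm note describes, roughly alpha * A' * B' + beta * C built from lower-level ops. The builder object `nn` and the op names matmul, transpose, mul, add, and constant are assumptions for illustration, not the normative WebNN API.]

  // Hedged sketch of emulating a high-level gemm with lower-level ops.
  function emulatedGemm(nn, A, B, C, options = {}) {
    const { alpha = 1.0, beta = 1.0, aTranspose = false, bTranspose = false } = options;
    const a = aTranspose ? nn.transpose(A) : A;          // A' if requested
    const b = bTranspose ? nn.transpose(B) : B;          // B' if requested
    let output = nn.matmul(a, b);                        // A' * B'
    if (alpha !== 1.0) {
      output = nn.mul(output, nn.constant(alpha));       // alpha * (A' * B')
    }
    if (C !== undefined) {
      const scaledC = beta !== 1.0 ? nn.mul(C, nn.constant(beta)) : C;
      output = nn.add(output, scaledC);                  // ... + beta * C
    }
    return output;
  }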
14:40:27 q+
14:41:01 TOPIC: WebAssembly System Interface (WASI) machine learning module
14:41:38 anssik: Then something new: please welcome Mingqiu_Sun and Andrew_Brown to introduce the WASI-nn proposal
14:41:43 -> https://github.com/WebAssembly/WASI/issues/272 Proposal for WASI-nn: a machine learning module (issue #272)
14:42:00 anssik: Note that we identified this as an exploration in this group last year, so I'm pleased to see this work is now underway
14:42:06 -> https://github.com/webmachinelearning/webnn/issues/32 Web Neural Network API as-is in WASI? (issue #32)
14:42:17 anssik: WASI issue #272 is for discussing the addition of a machine learning module to WASI. It contains a very rough draft of what the API could look like, wasi_ephemeral_nn.witx
14:42:25 -> https://github.com/abrown/WASI-nn/blob/master/phases/ephemeral/witx/wasi_ephemeral_nn.witx wasi_ephemeral_nn.witx
14:42:31 anssik: loosely inspired by the WebNN API, hence the name WASI-nn
14:43:23 Mingqiu_Sun: I've prepared some slides with background and will walk you through the API after
14:43:40 ... WASI is the WebAssembly System Interface, driven by a subgroup of the Wasm CG
14:44:06 RafaelCintron_ has joined #webmachinelearning
14:44:06 ... this module is defined in terms of the witx interface format defined by Mozilla
14:44:37 ... four months ago the Bytecode Alliance was formed, focused on WASI and promoting Wasm use outside the browser
14:45:09 ... we're happily involved with many activities; we have an open source Wasm VM for constrained devices, called Wasm Micro Runtime
14:45:26 ... also worked on the Wasmtime SIMD implementation
14:45:35 ... and then this WASI-nn proposal
14:46:05 ... motivation for WASI-nn: we think after you train your model you need to deploy it to various devices with different architectures and OSs; Wasm provides the benefit of portability
14:46:23 ... we want to start simple, scoped to inferencing initially, inspired by the WebNN model loader API
14:46:37 ... I talked with Ningxin about this idea half a year ago
14:46:45 ... we want to be framework and model format agnostic
14:47:15 ... we want to take an initial simple step, and when this group figures out how to do the model loader API we can reuse concepts
14:47:47 RafaelCintron: what is the plan to adopt this for JS developers?
14:48:17 Mingqiu_Sun: the current focus is on Wasm; WebNN is to cover JS developers
14:48:44 RafaelCintron: WebNN is for graph builders, the model loader API is another one; we'd like to eventually put these together
14:49:16 ... I think making a Wasm-only API might not make sense, so we don't exclude web developers
14:49:39 Andrew_Brown: not all use cases are shared between Wasm and JS
14:49:50 ... there are runtimes where only Wasm is available
14:50:04 ... that's the key reason, to serve runtimes that do not understand JS
14:50:34 RafaelCintron: having the JS and Wasm APIs look and feel the same should be our goal
14:50:46 ... so we have a first-class Web API for the model loader
14:51:29 [Andrew sharing the witx definition of the proposal]
14:51:55 Andrew: witx is like WebIDL but for defining WASI modules
14:53:11 [reviewing the details of the proposal definition]
14:53:17 -> https://github.com/abrown/WASI-nn/blob/master/phases/ephemeral/witx/wasi_ephemeral_nn.witx wasi_ephemeral_nn.witx
14:56:35 q?
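[Editorial illustration, not part of the minutes: a rough sketch of the "load a model, then run inference" shape that both the WebNN model loader API explainer and WASI-nn are described as targeting. Every name below (navigator.ml, loadModel, compute, the input/output keys) is hypothetical and used only to illustrate the flow, not either API's actual surface.]

  // Hypothetical JS-side flow: fetch model bytes in a format-agnostic way,
  // load them through a loader entry point, then run inference.
  async function runInference(modelUrl, inputTensor) {
    const modelBytes = await (await fetch(modelUrl)).arrayBuffer(); // framework/format-agnostic bytes
    const model = await navigator.ml.loadModel(modelBytes);         // hypothetical loader entry point
    const outputs = await model.compute({ input: inputTensor });    // hypothetical inference call
    return outputs.output;
  }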
14:56:44 ack Chai
14:57:13 question from Ping
14:57:21 Chai: is it correct to assume WASI wants to be consistent with WebNN as defined for the JS audience?
14:57:46 Mingqiu_Sun: that is one goal, we want to align where possible, so we can have a uniform API between different languages
14:57:57 q+
14:58:01 Here's a more legible version of the API (auto-generated docs so beware): https://github.com/WebAssembly/wasi-nn/blob/master/phases/ephemeral/docs.md#-wasi_ephemeral_nn
14:58:05 ... but it uses Wasm-specific mechanisms, so it cannot be 100% consistent
14:58:25 Andrew_Brown: it is not exactly the same, but the intent is to be as close as possible to the JS API
14:58:39 https://webmachinelearning.github.io/model-loader/
14:58:45 https://github.com/webmachinelearning/model-loader
14:58:55 https://github.com/webmachinelearning/model-loader/blob/master/explainer.md
14:59:25 ack ningxin_hu
14:59:38 ningxin_hu: I was on the queue before this topic :)
14:59:52 ... this topic is good, and having a consistent interface between the two is a great goal
15:00:10 zkis has joined #webmachinelearning
15:00:15 Mingqiu_Sun: for any comments, please submit feedback via the GH repo
15:00:33 https://github.com/WebAssembly/WASI/issues/272
15:01:05 Andrew_Brown: all feedback should go to the tracking issue https://github.com/WebAssembly/WASI/issues/272
15:01:15 it would be great to know if WASI and WASI-nn have started exploring how to layer this on top of operating systems, like WinML and DirectML. how would they pass off the model/graph to the OS to run. thanks for the slides today !!
15:02:09 q?
15:02:25 ack RafaelCintron_
15:02:40 RafaelCintron_: how much support does WASI have across browsers?
15:03:05 Andrew: it is not aimed at browsers, it is a system interface for standalone runtimes
15:03:21 ... Node.js and Wasmtime are the two biggest implementations of WASI
15:03:35 RafaelCintron_: can you use WASI in browsers?
15:03:45 Andrew: there are experimental means to do that
15:04:05 RafaelCintron_: how does WASI handle it if someone wants to load, say, video and keep that on the GPU?
15:04:43 Mingqiu_Sun: WASI itself is in an early phase; work is ongoing on a Crypto API
15:04:53 Andrew: file I/O like in POSIX is available
15:05:31 thanks !
15:05:33 TOPIC: Adjourn
15:05:47 RRSAgent, draft minutes v2
15:05:47 I have made the request to generate https://www.w3.org/2020/05/14-webmachinelearning-minutes.html anssik
17:04:40 Zakim has left #webmachinelearning
20:59:14 zkis has joined #webmachinelearning