13:58:19 RRSAgent has joined #webmachinelearning
13:58:19 logging to https://www.w3.org/2020/05/14-webmachinelearning-irc
13:58:20 Zakim has joined #webmachinelearning
13:58:28 RRSAgent, make logs public
13:58:36 Meeting: WebML CG Teleconference – 14 May 2020
13:58:42 Chair: Anssi
13:59:03 Agenda: https://github.com/webmachinelearning/meetings/blob/master/telcons/2020-05-14-agenda.md
13:59:07 Scribe: Anssi
13:59:11 scribeNick: anssik
14:00:00 Present+ Anssi_Kostiainen
14:00:04 Present+ Andrew_Brown
14:00:09 Present+ Chai_Chaoweeraprasit
14:00:13 Present+ Ningxin_Hu
14:00:48 Present+ Rafael_Cintron
14:01:03 Present+ Ganesan_Ramalingam
14:01:08 rama has joined #webmachinelearning
14:01:13 Present+ Greg_Whitworth
14:01:27 Present+ Mingqiu_Sun
14:02:12 RafaelCintron has joined #webmachinelearning
14:02:23 RRSAgent, draft minutes v2
14:02:23 I have made the request to generate https://www.w3.org/2020/05/14-webmachinelearning-minutes.html anssik
14:02:27 Present+ G_Ramalingam
14:02:30 TOPIC: WebNN first wave models and ops: ONNX and XLA-HLO intersection
14:03:03 anssik: PR updated with input from Chai and Daniel to add ONNX and XLA-HLO columns
14:03:08 -> https://github.com/webmachinelearning/webnn/pull/52 Add first_wave_models.md (PR #52)
14:03:18 anssik: PR pending Daniel's review.
14:03:32 abrown has joined #webmachinelearning
14:03:54 ningxin_hu: I've integrated Daniel's and Chai's input into the table
14:03:56 https://github.com/webmachinelearning/webnn/blob/d97d2dba34b82bc2bb579c55fb5741a82f253f93/op_compatibility/first_wave_models.md
14:04:03 -> https://github.com/webmachinelearning/webnn/blob/d97d2dba34b82bc2bb579c55fb5741a82f253f93/op_compatibility/first_wave_models.md first_wave_models.md (HTML preview)
14:04:40 ningxin_hu: the latest update was to add ONNX and XLA-HLO data for the ops; for example, Daniel commented that some ops can be lowered to more primitive ops in XLA-HLO
14:04:56 ... that's incorporated into the table now, so all comments have been addressed
14:05:18 ... another update is the gemm op; that's also added to the table to reflect the recent spec changes
14:05:29 anssik: Any blockers that need to be discussed and resolved on the call?
14:06:10 PROPOSED RESOLUTION: Merge First Wave Model PR #52
14:06:18 anssik: any concerns?
14:06:22 [hearing none]
14:06:31 RESOLUTION: Merge First Wave Model PR #52
14:07:23 paul-mcdadniel-msft has joined #webmachinelearning
14:07:40 TOPIC: Element-wise add & mul, concat, reshape, gemm, and transpose ops WebNN API definitions
14:07:55 anssik: to summarize the work, 3 PRs are open and 1 has been merged already:
14:08:06 hello !
14:08:09 Present+ Paul_McDaniel
14:08:26 -> https://github.com/webmachinelearning/webnn/pull/54 element-wise add & mul (PR #54)
14:08:32 -> https://github.com/webmachinelearning/webnn/pull/55 concat (PR #55)
14:08:38 -> https://github.com/webmachinelearning/webnn/pull/57 reshape (PR #57)
14:08:43 -> https://github.com/webmachinelearning/webnn/pull/58 gemm and transpose (PR #58)
14:08:53 anssik: specifically, PRs #54, #55, and #57 are open; PR #58 has been merged.
14:09:03 ... let's take these op definitions one by one
14:09:08 TOPIC: Element-wise add & mul
14:09:21 anssik: pending Chai's review, merge conflicts now resolved.
14:09:25 ... other opens?
14:09:46 ningxin_hu: no other opens, waiting for Chai's review
14:10:13 TOPIC: Concat
14:10:19 anssik: pending Rama's review
14:10:40 ... Rama suggested the use of negative axis values to count backwards from the last axis
14:11:01 ... Ningxin shared that some OS APIs do not support negative axis values; that would need to be handled in the browser implementation. An alternative proposal is to leave it to the JS framework level
14:11:07 ... Comments?
14:11:55 ningxin_hu: waiting for Rama's opinion on whether supporting only positive axis values is OK and leaving it to JS frameworks to handle the conversion
14:12:32 rama: that's definitely fine; I guess the only case where that could be a problem is if the rank is not statically known, then you cannot do this at build time and must do it at runtime
14:13:16 ningxin, i just approved the add PR. Looks good.
14:13:49 TOPIC: Reshape
14:13:56 anssik: Pending Rama's review
14:14:11 ... it seems this PR could probably be merged, since Rama's suggestion to overload shape() could be added later; thoughts?
14:14:21 rama: that's fine
14:14:55 TOPIC: Gemm and transpose
14:15:43 anssik: General Matrix Multiply (gemm) and transpose; a lot of good design discussion in this PR
14:15:47 ... initially explored an idea to make gemm() a static method on the nn interface instead of a regular interface method, to allow the caller to choose whether to use the static helper functions or to explicitly call into regular interface operations.
14:15:51 ... then dropped the static gemm() operation and instead created a non-normative note section that explains how the behavior of this operation can be generically emulated
14:15:57 ... this convention is borrowed from the WebGL spec (thanks Rafael!)
14:16:50 Chai: Gemm goes back to the model table; Gemm is a high-level op that can be implemented by using other ops, and one of the ops that are often handled as a single op at the OS level
14:17:23 ... e.g. in DirectML it is handled as a single op; the question is how to handle high-level ops
14:17:54 ... need to understand the core set of the interface that needs to be implemented; high-level ops can be implemented with something else, so they could be considered "optional"
14:18:28 ... in the beginning explored the idea and looked at the WebIDL static method approach; based on feedback, figured out it might create more confusion than clarity
14:19:09 ... the WebGL spec tackles similar issues using non-normative notes, so following the same pattern for the gemm definition (thanks Rafael!)
14:19:55 ... this makes it clear in the spec that the op is in fact a high-level op, and the caller can use the pseudo-code defined in the note, which decomposes it into core low-level ops
14:20:23 ... consider that a fair balance in choosing between small and big ops, as debated earlier
14:20:33 ... everyone seems to have signed off on this approach
14:20:54 ... this applies to all the ops in the table, likely relu and pooling, as Daniel suggested
14:21:11 ... for big ops we should be consistent and explain how big ops can be implemented in terms of other ops
14:21:44 ... last note: gemm requires transpose, so added that in the same PR; for any big op, if it turns out we need a primitive op, we need to add that
14:22:27 q+
14:23:10 ack ningxin_hu
14:23:42 ningxin_hu: first, I'd like to mention there's an agenda item "Intersection between XLA & ONNX (Paul)"
14:24:19 let's push the XLA & ONNX ops table discussion to the next session, thanks !
14:24:21 ... second, a comment on Chai's high- and low-level primitives; my question is about models, I'd like to propose we can add activations incl. relu, leaky relu
14:25:35 high level ones: relu, leakyRelu, clip
14:25:47 low level ones: element-wise min and max
14:26:07 anssik: any feedback?
14:26:09 q?
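[Editorial illustration, not discussed verbatim on the call: a minimal sketch, assuming a JS-framework layer, of how a negative concat axis could be mapped to a non-negative one before calling the browser API. The builder variable and the concat signature below are hypothetical, not the spec's final API.]

  // Minimal sketch, assuming the rank is known at graph-build time.
  // `builder.concat` is a hypothetical stand-in for the WebNN concat op
  // under discussion, which in this proposal would accept only axis >= 0.
  function normalizeAxis(axis, rank) {
    const normalized = axis < 0 ? axis + rank : axis;
    if (normalized < 0 || normalized >= rank) {
      throw new RangeError(`axis ${axis} is out of range for rank ${rank}`);
    }
    return normalized;
  }

  // Example: concatenate two rank-4 tensors along the last axis (-1 maps to 3).
  // const output = builder.concat([a, b], normalizeAxis(-1, 4));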
14:27:07 Ping: from the TF.js perspective, I see the benefit of high-level ops; does the differentiation between low and high level reflect the system perspective, i.e. are the high-level ops the ones likely to be implemented by the OS?
14:27:11 Present+ Ping_Yu
14:27:44 q+
14:27:59 Rama: the non-normative note describes how the high-level op can be implemented in terms of low-level ops; different implementations can implement those differently
14:28:20 Ping: conceptually, does the user need to understand the difference, or is this hidden from the user?
14:28:29 Rama: this is an implementation detail, hidden from the user
14:29:09 Chai: from the user's point of view, high- and low-level ops are available and they can use one or the other; the purpose of adding a non-normative section is to say this op will probably be faster if you use it rather than implement it yourself (in JS)
14:29:37 ... the WebNN API is a browser API and gemm is treated as a single unit; on Windows, DirectML supports gemm as a single unit
14:30:17 ... so the implementation is a lot more efficient on the GPU using the high-level op than the low-level ops, which require copies between ops, especially with large tensors
14:30:39 ... we'd implement both; the goal of the spec should be to focus on ops that we already know are supported in OS APIs as single units
14:31:14 ... anything can be a big op, but the differentiation is that we only include the ones we know are implemented as a fused unit, in the interest of keeping the spec smaller
14:31:32 ... the point of defining big ops is to allow mapping to the OS layer that implements them as a single unit
14:32:09 Ping: my question comes from our colleagues from Android NN API; their design is more about the system becoming a compiler, they try to compile the graph into high-level ops against their implementation
14:33:15 ... the compilation effort is put onto the user in this case; the other way around is to leave this to the OS and let the OS compile the graph instead of the user
14:33:29 Rama: AFAICS, the spec allows both
14:35:05 Ping: for people who compile the model, is it not the best way forward?
14:35:36 Chai: for instance, we know if you have certain ops you can fuse them; that's happening at the OS level
14:35:43 ... OS-level fusion can also be more dynamic
14:36:50 ... we want to leave the option open so that if the caller wants to handle it as a single unit, it has an option to do that; that is what the spec represents, it allows both options; if the OS is able to do more complex fusion, we're able to do that with the current spec, and the spec describes what is possible
14:37:02 ... provides options for both paths
14:37:16 Ping: OS can do further optimization?
14:37:20 Chai: Correct.
14:38:25 Ping: another question, maybe related, for the compiler, should we provide a different type...?
14:39:18 Chai: I made one attempt to differentiate using a static method; the feedback was that from the browser's point of view it needs to implement both, so no benefit; it has a benefit to the caller, who can see that a high-level op is different
14:39:43 ... now the spec describes how those big ops can be implemented in terms of small ops
14:40:06 q?
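[Editorial illustration, not part of the minutes: a sketch of the kind of decomposition the non-normative gemm note describes, roughly alpha * A' * B' + beta * C built from lower-level ops. The builder object `nn` and the op names matmul, transpose, mul, add, and constant are assumptions for illustration, not the normative WebNN API.]

  // Hedged sketch of emulating a high-level gemm with lower-level ops.
  function emulatedGemm(nn, A, B, C, options = {}) {
    const { alpha = 1.0, beta = 1.0, aTranspose = false, bTranspose = false } = options;
    const a = aTranspose ? nn.transpose(A) : A;          // A' if requested
    const b = bTranspose ? nn.transpose(B) : B;          // B' if requested
    let output = nn.matmul(a, b);                        // A' * B'
    if (alpha !== 1.0) {
      output = nn.mul(output, nn.constant(alpha));       // alpha * (A' * B')
    }
    if (C !== undefined) {
      const scaledC = beta !== 1.0 ? nn.mul(C, nn.constant(beta)) : C;
      output = nn.add(output, scaledC);                  // ... + beta * C
    }
    return output;
  }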
14:40:27 q+
14:41:01 TOPIC: WebAssembly System Interface (WASI) machine learning module
14:41:38 anssik: Then something new: please welcome Mingqiu_Sun and Andrew_Brown to introduce the WASI-nn proposal
14:41:43 -> https://github.com/WebAssembly/WASI/issues/272 Proposal for WASI-nn: a machine learning module (issue #272)
14:42:00 anssik: Note that we identified this as an exploration in this group last year, so I'm pleased to see this work is now underway
14:42:06 -> https://github.com/webmachinelearning/webnn/issues/32 Web Neural Network API as-is in WASI? (issue #32)
14:42:17 anssik: WASI issue #272 is for discussing the addition of a machine learning module to WASI. It contains a very rough draft of what the API could look like, wasi_ephemeral_nn.witx
14:42:25 -> https://github.com/abrown/WASI-nn/blob/master/phases/ephemeral/witx/wasi_ephemeral_nn.witx wasi_ephemeral_nn.witx
14:42:31 anssik: loosely inspired by the WebNN API, hence the name WASI-nn
14:43:23 Mingqiu_Sun: I've prepared some slides with background and will walk you through the API after
14:43:40 ... WASI is the WebAssembly System Interface, driven by a subgroup of the Wasm CG
14:44:06 RafaelCintron_ has joined #webmachinelearning
14:44:06 ... this module is defined in terms of the witx interface format defined by Mozilla
14:44:37 ... four months ago the Bytecode Alliance was formed, focused on WASI and promoting Wasm use outside the browser
14:45:09 ... we're happily involved with many activities; we have an open source Wasm VM for constrained devices, called Wasm Micro Runtime
14:45:26 ... also worked on the Wasmtime SIMD implementation
14:45:35 ... and then this WASI-nn proposal
14:46:05 ... motivation for WASI-nn: we think after you train your model you need to deploy it to various devices with different architectures and OSs; Wasm provides the benefit of portability
14:46:23 ... we want to start simple, scoped to inferencing initially, inspired by the WebNN model loader API
14:46:37 ... I talked with Ningxin about this idea half a year ago
14:46:45 ... we want to be framework and model format agnostic
14:47:15 ... we want to take an initial simple step, and when this group figures out how to do the model loader API we can reuse concepts
14:47:47 RafaelCintron: what is the plan to adopt this for JS developers?
14:48:17 Mingqiu_Sun: the current focus is on Wasm; WebNN is to cover JS developers
14:48:44 RafaelCintron: WebNN is for graph builders, the model loader API is another one; we'd like to eventually put these together
14:49:16 ... I think making a Wasm-only API might not make sense, so we don't exclude web developers
14:49:39 Andrew_Brown: not all use cases are shared between Wasm and JS
14:49:50 ... there are runtimes where only Wasm is available
14:50:04 ... that's the key reason, to serve runtimes that do not understand JS
14:50:34 RafaelCintron: having the JS and Wasm APIs look and feel the same should be our goal
14:50:46 ... so we have a first-class Web API for the model loader
14:51:29 [Andrew sharing the witx definition of the proposal]
14:51:55 Andrew: witx is like WebIDL but for defining WASI modules
14:53:11 [reviewing the details of the proposal definition]
14:53:17 -> https://github.com/abrown/WASI-nn/blob/master/phases/ephemeral/witx/wasi_ephemeral_nn.witx wasi_ephemeral_nn.witx
14:56:35 q?
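[Editorial illustration, not part of the minutes: a rough sketch of the "load a model, then run inference" shape that both the WebNN model loader API explainer and WASI-nn are described as targeting. Every name below (navigator.ml, loadModel, compute, the input/output keys) is hypothetical and used only to illustrate the flow, not either API's actual surface.]

  // Hypothetical JS-side flow: fetch model bytes in a format-agnostic way,
  // load them through a loader entry point, then run inference.
  async function runInference(modelUrl, inputTensor) {
    const modelBytes = await (await fetch(modelUrl)).arrayBuffer(); // framework/format-agnostic bytes
    const model = await navigator.ml.loadModel(modelBytes);         // hypothetical loader entry point
    const outputs = await model.compute({ input: inputTensor });    // hypothetical inference call
    return outputs.output;
  }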
14:56:44 ack Chai
14:57:13 question from Ping
14:57:21 Chai: is it correct to assume WASI wants to be consistent with WebNN as defined for the JS audience?
14:57:46 Mingqiu_Sun: that is one goal, we want to align where possible, so we can have a uniform API between different languages
14:57:57 q+
14:58:01 Here's a more legible version of the API (auto-generated docs so beware): https://github.com/WebAssembly/wasi-nn/blob/master/phases/ephemeral/docs.md#-wasi_ephemeral_nn
14:58:05 ... but it uses Wasm-specific mechanisms, so it cannot be 100% consistent
14:58:25 Andrew_Brown: it is not exactly the same, but the intent is to be as close as possible to the JS API
14:58:39 https://webmachinelearning.github.io/model-loader/
14:58:45 https://github.com/webmachinelearning/model-loader
14:58:55 https://github.com/webmachinelearning/model-loader/blob/master/explainer.md
14:59:25 ack ningxin_hu
14:59:38 ningxin_hu: I was on the queue before this topic :)
14:59:52 ... this topic is good, and having a consistent interface between the two is a great goal
15:00:10 zkis has joined #webmachinelearning
15:00:15 Mingqiu_Sun: for any comments, please submit feedback via the GH repo
15:00:33 https://github.com/WebAssembly/WASI/issues/272
15:01:05 Andrew_Brown: all feedback should go to the tracking issue https://github.com/WebAssembly/WASI/issues/272
15:01:15 it would be great to know if WASI and WASI-nn have started exploring how to layer this on top of operating systems, like WinML and DirectML. how would they pass off the model/graph to the OS to run. thanks for the slides today !!
15:02:09 q?
15:02:25 ack RafaelCintron_
15:02:40 RafaelCintron_: how much support does WASI have across browsers?
15:03:05 Andrew: it is not aimed at browsers, it is a system interface for standalone runtimes
15:03:21 ... Node.js and Wasmtime are the two biggest implementations of WASI
15:03:35 RafaelCintron_: can you use WASI in browsers?
15:03:45 Andrew: there are experimental means to do that
15:04:05 RafaelCintron_: how does WASI handle it if someone wants to load, say, video and keep that on the GPU?
15:04:43 Mingqiu_Sun: WASI itself is in an early phase; work is ongoing on a Crypto API
15:04:53 Andrew: file I/O like in POSIX is available
15:05:31 thanks !
15:05:33 TOPIC: Adjourn
15:05:47 RRSAgent, draft minutes v2
15:05:47 I have made the request to generate https://www.w3.org/2020/05/14-webmachinelearning-minutes.html anssik
17:04:40 Zakim has left #webmachinelearning
20:59:14 zkis has joined #webmachinelearning