IRC log of webmachinelearning on 2019-08-08

Timestamps are in UTC.

14:00:43 [RRSAgent]
RRSAgent has joined #webmachinelearning
14:00:43 [RRSAgent]
logging to https://www.w3.org/2019/08/08-webmachinelearning-irc
14:00:44 [Zakim]
Zakim has joined #webmachinelearning
14:00:54 [anssik]
RRSAgent, make logs public
14:01:14 [anssik]
Meeting: WebML CG Teleconference – 8 August 2019
14:01:19 [anssik]
Chair: Anssi
14:01:26 [anssik]
Agenda: https://github.com/webmachinelearning/meetings/blob/master/telcons/2019-08-08-agenda.md
14:01:27 [Ningxin_Hu]
Ningxin_Hu has joined #webmachinelearning
14:01:34 [kainino]
kainino has joined #webmachinelearning
14:01:34 [anssik]
Scribe: Anssi
14:01:45 [anssik]
scribeNick: anssik
14:01:51 [anssik]
Regrets+ Thomas_Steiner
14:01:56 [anssik]
Present+ Anssi_Kostiainen
14:02:00 [anssik]
Present+ Rafael_Cintron
14:02:02 [Rafael]
Rafael has joined #webmachinelearning
14:02:02 [Ningxin_Hu]
Present+ Ningxin_Hu
14:02:04 [anssik]
Present+ Ganesan_Ramalingam
14:02:09 [Rafael]
present+
14:02:11 [anssik]
Present+ Paul_McDaniel
14:02:23 [anssik]
Present+ Gabe_Esteven
14:02:39 [anssik]
Present+ Jonathan_Bingham
14:03:11 [anssik]
Present+ Kai_Ninomiya
14:03:31 [anssik]
Present+ Greg_Whitworth
14:03:38 [anssik]
RRSAgent, draft minutes v2
14:03:38 [RRSAgent]
I have made the request to generate https://www.w3.org/2019/08/08-webmachinelearning-minutes.html anssik
14:03:53 [Nikhil]
Present+ Nikhil Thorat
14:03:56 [Nikhil]
Present+ Daniel Smilkov
14:04:04 [Nikhil]
Present+ Nikhil_Thorat
14:04:07 [Nikhil]
Present+ Daniel_Smilkov
14:04:10 [Nikhil]
:)
14:04:27 [anssik]
RRSAgent, draft minutes v2
14:04:27 [RRSAgent]
I have made the request to generate https://www.w3.org/2019/08/08-webmachinelearning-minutes.html anssik
14:04:40 [anssik]
TOPIC: Define the set of operations and their specification
14:04:51 [anssik]
-> https://github.com/webmachinelearning/webnn/issues/17 Define the set of operations and their specification #17
14:05:17 [anssik]
anssik: we had a review of the proposed resolution and received good feedback we need to resolve, let's discuss that now.
14:05:50 [anssik]
... the objective of this call is to resolve objections raised for the proposed resolution and clarify proposed resolution based on feedback where appropriate
14:06:38 [anssik]
To start, I captured the following questions from issue #17 we need to resolve:
14:07:06 [anssik]
nsthorat: "An important part of this specification will be ensuring this set of ops are compatible with the major ML JavaScript frameworks [...] it's not possible for us to move forward with this resolution without understanding compatibility."
14:07:41 [anssik]
jbingham: "what's the plan for dealing with versioning?"
14:07:55 [anssik]
jbingham: "How are custom ops defined and included in the graph?"
14:08:09 [anssik]
walrusmcd: "How many ops?"
14:08:29 [anssik]
jbingham: "Decide if a graph API is the right thing to standardize on"
14:09:13 [anssik]
anssik: To summarize, we need to choose a set of operations to be included in the API that enables adequate compatibility with the major ML frameworks
14:09:56 [anssik]
q+ to ask about something
14:10:04 [Nikhil]
q+ to talk about something
14:10:19 [anssik]
q+
14:10:22 [anssik]
q?
14:10:22 [Nikhil]
q+ to talk about onnx & tf lite compatibility doc: https://docs.google.com/document/d/1RXCkZ9mliWbqSakYvNlWhsRH4yFtnpe1YQQNFAIRZo8/edit
14:10:25 [jonathan]
jonathan has joined #webmachinelearning
14:10:37 [anssik]
ack Nikhil
14:10:37 [Zakim]
Nikhil, you wanted to talk about something and to talk about onnx & tf lite compatibility doc: https://docs.google.com/document/d/1RXCkZ9mliWbqSakYvNlWhsRH4yFtnpe1YQQNFAIRZo8/edit
14:11:04 [anssik]
Nikhil: shared a doc in the chat, please take a look
14:11:25 [anssik]
... spent time looking at compat, started with 2 basic ops, tried to understand the diffs
14:11:27 [jdarpinian]
jdarpinian has joined #webmachinelearning
14:12:16 [gabe]
gabe has joined #webmachinelearning
14:12:23 [anssik]
... starting with a low number of ops is our preference, growing that over time to understand the compat issues of each op
14:13:00 [Rafael]
q+
14:13:33 [kainino]
Present+ James_Darpinian
14:14:02 [anssik]
danielsmilkov: this is about diffing libs, looking into possible compat issues, 1) when comparing with NN API e.g. some ops allow fusing; we propose separate ops, no fused kernels, and under the hood an implementer could fuse the ops so that they run great on particular hardware
14:14:39 [anssik]
... ONNX is opinionated regarding the layout, TF Lite wants channels to come last, and depending on the hardware channels first or channels last is preferred
14:14:49 [anssik]
... which layout is better changes over time
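A minimal sketch of the fusing point above (the function names are hypothetical, not from any spec or framework): a graph can describe two separate ops, and an implementer can rewrite the pair into a single fused loop without changing the results.

```javascript
// Unfused: two passes over memory, as the graph describes (add, then relu).
function addThenRelu(x, b) {
  const t = x.map((v, i) => v + b[i]);  // add
  return t.map((v) => Math.max(0, v)); // relu
}

// Fused: one pass, the kind of under-the-hood rewrite an implementer
// could apply when it runs well on particular hardware.
function fusedAddRelu(x, b) {
  return x.map((v, i) => Math.max(0, v + b[i]));
}

// Both produce the same result, so fusing stays an implementation detail.
console.log(addThenRelu([-2, 0.5, 3], [1, 1, -1]));  // [ 0, 1.5, 2 ]
console.log(fusedAddRelu([-2, 0.5, 3], [1, 1, -1])); // [ 0, 1.5, 2 ]
```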
14:16:31 [anssik]
Nikhil: would prefer to start very small with a POC that works, and have a plan how to grow that set of ops
14:16:39 [anssik]
q?
14:17:24 [anssik]
... probably need a way to deal with custom ops, have a way in app space to describe custom ops that share memory with matmul
14:17:31 [anssik]
q?
14:17:49 [anssik]
ack Rafael
14:18:27 [anssik]
Rafael: I agree with a plan to keep this hardware agnostic
14:18:39 [Nikhil]
awesome! that would be great.
14:18:47 [Nikhil]
regarding the script to convert onnx / tensorflow
14:20:10 [anssik]
Paul: agree with everything Rafael said, basically yes, the goal is to be hw agnostic
14:21:10 [anssik]
... ONNX has done work on channel formats and hit these same issues, proposed solutions
14:21:18 [anssik]
q?
14:21:21 [Ningxin_Hu]
q+ to talk about op set & use cases
14:21:22 [anssik]
ack anssik
14:21:22 [Zakim]
anssik, you wanted to ask about something and to
14:21:29 [anssik]
ack Ningxin_Hu
14:21:29 [Zakim]
Ningxin_Hu, you wanted to talk about op set & use cases
14:22:17 [anssik]
Ningxin_Hu: thanks for the efforts of Nikhil and Daniel, great work! Agree with approach of starting with a small set of ops and validate compat with JS libs
14:23:26 [anssik]
... proposal how to grow the op set: add ops that are needed to implement identified use cases
14:23:42 [anssik]
https://webmachinelearning.github.io/webnn/#usecases
14:23:50 [Nikhil]
q+ want to talk about custom ops technical details
14:24:26 [Ningxin_Hu]
op set and use cases: https://github.com/webmachinelearning/webnn/issues/17#issuecomment-508426036
14:24:29 [anssik]
[silence, agreement]
14:24:38 [anssik]
ack Nikhil
14:25:32 [anssik]
q+ Nikhil
14:26:10 [anssik]
Paul: we took an approach where we selected the ops that benefit from hw acceleration
14:26:28 [anssik]
... a bit similar approach to CUDA
14:27:43 [anssik]
Ningxin_Hu: if we only select expensive ops that benefit from hw, that may impose a perf penalty when doing context switching
14:28:19 [anssik]
Paul: I agree, it might be worth prototyping that now, the assumption we're proposing is that this hybrid approach (w/ WebGL) is viable
14:28:25 [jonathan]
What other ML frameworks should review each op, like Daniel did for TensorFlow, and confirm compatibility before we finalize the definition?
14:30:15 [anssik]
Ningxin_Hu: agree with Paul's comments; when interleaving with Wasm in our POC, the overhead was significant
14:30:51 [anssik]
Rafael: CPU readback is slow, staying with GPU compute shaders should work pretty well
14:30:57 [anssik]
q?
14:31:29 [RRSAgent]
I have made the request to generate https://www.w3.org/2019/08/08-webmachinelearning-minutes.html anssik
14:32:02 [anssik]
q?
14:32:24 [anssik]
ack Nik
14:32:27 [anssik]
ack Nikhil
14:32:29 [anssik]
q?
14:32:30 [jdarpinian]
q+
14:32:36 [anssik]
ack jdarpinian
14:33:02 [anssik]
jdarpinian: I'm on the Chrome team and think custom ops based on WebGL can work, but will be very complex to implement
14:33:33 [Nikhil]
We think it's important to be able to have custom operations share memory with conv2d / matmul without doing a readback. for cpu-accelerators, share the buffer with WASM, for gpu-accelerators share the buffer with WebGL
14:33:52 [anssik]
... portability of custom ops between different systems, CPU and GPU, is not very good
14:33:56 [Ningxin_Hu]
q+ talk about mem layout reordering overhead between custom ops and hw accelerated ops
14:34:05 [Ningxin_Hu]
q+
14:34:14 [anssik]
q?
14:34:34 [Nikhil]
this allows us to grow the spec slowly and not have tail-end ops be bottlenecks and the webnn accelerated ops can get quick wins by accelerating the bottleneck ops (conv2d, matmul, etc)
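A toy sketch of the memory-sharing idea above (these names are illustrative stand-ins, not proposed API): a "built-in" accelerated op and a custom op operate on the same buffer with no readback or copy in between. The Float32Array stands in for a buffer shared with Wasm (CPU) or WebGL (GPU).

```javascript
// One buffer that both the built-in op and the custom op touch.
const shared = new Float32Array(4);

// "Built-in" op: 2x2 matmul writing its result directly into the shared buffer.
function matmul2x2(a, b, out) {
  out[0] = a[0] * b[0] + a[1] * b[2];
  out[1] = a[0] * b[1] + a[1] * b[3];
  out[2] = a[2] * b[0] + a[3] * b[2];
  out[3] = a[2] * b[1] + a[3] * b[3];
}

// "Custom" op: in-place scaling over the very same buffer.
function customScale(buf, s) {
  for (let i = 0; i < buf.length; i++) buf[i] *= s;
}

matmul2x2([1, 2, 3, 4], [1, 0, 0, 1], shared); // identity matmul: [1,2,3,4]
customScale(shared, 10);                        // no copy, no readback
console.log(Array.from(shared));                // [ 10, 20, 30, 40 ]
```

The point is only that the custom op never pulls the data out of the shared buffer; in a real system the same pattern would apply to a Wasm memory or a GPU buffer.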
14:34:37 [anssik]
Paul: I think Ningxin_Hu posted an architecture diagram
14:34:51 [anssik]
-> https://github.com/webmachinelearning/webnn/issues/17#issuecomment-518915131 arch diagram
14:35:28 [anssik]
Paul: frameworks will do the heavy lifting, web developer won't see the complexity
14:36:01 [anssik]
Nikhil: we think the same, but not all devices have WebGL backend so fallback to Wasm for example
14:36:07 [anssik]
q?
14:36:14 [anssik]
ack Ningxin_Hu
14:36:42 [anssik]
Ningxin_Hu: about custom ops, folks talked about memory transfer overhead
14:38:08 [anssik]
... even long SIMD instructions on CPU can require tensor memory re-layout, an expensive operation
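The re-layout cost mentioned above can be made concrete with a sketch (shapes and data are illustrative): moving a tensor from channels-last (NHWC, what TF Lite prefers) to channels-first (NCHW) touches every element, which is why crossing between ops with different layout expectations is expensive.

```javascript
// Convert an NHWC tensor to NCHW. This is a full pass over the data:
// one read and one write per element, so it scales with tensor size.
function nhwcToNchw(src, [n, h, w, c]) {
  const dst = new Float32Array(src.length);
  for (let i = 0; i < n; i++)
    for (let y = 0; y < h; y++)
      for (let x = 0; x < w; x++)
        for (let k = 0; k < c; k++)
          dst[((i * c + k) * h + y) * w + x] =
            src[((i * h + y) * w + x) * c + k];
  return dst;
}

// 1x1x2x3 example: NHWC [r0,g0,b0, r1,g1,b1] -> NCHW [r0,r1, g0,g1, b0,b1]
const nchw = nhwcToNchw(Float32Array.from([1, 2, 3, 4, 5, 6]), [1, 1, 2, 3]);
console.log(Array.from(nchw)); // [ 1, 4, 2, 5, 3, 6 ]
```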
14:38:19 [anssik]
q?
14:38:29 [Nikhil]
q+
14:39:35 [anssik]
anssik: it was asked on the issue whether a graph is the right abstraction
14:40:31 [jdarpinian]
q+
14:40:45 [anssik]
jonathan: what are the other JS frameworks we need to include in the compatibility study?
14:41:32 [anssik]
Paul: in ONNX we considered all frameworks that matter, they have a voice in ONNX project
14:42:28 [anssik]
... in ONNX we have considered PyTorch, Caffe, Intel's frameworks, Microsoft's frameworks, TensorFlow, we have ONNX to TF converter, Apple's CoreML
14:42:50 [anssik]
... CoreML was part of the opset 1 compatibility
14:42:55 [anssik]
q?
14:43:31 [anssik]
Nikhil: specifically interested in JS ML frameworks
14:43:40 [anssik]
... for compatibility
14:44:06 [anssik]
... for example, Brain.js
14:44:54 [anssik]
Paul: we don't want to have two bodies managing op schema, right?
14:45:18 [anssik]
Nikhil: we want to grow slowly, right?
14:45:40 [anssik]
... focus on web stuff to figure out an intersection of JS ML libraries, does that sound reasonable?
14:46:41 [anssik]
Paul: ONNX does have namespace and versioning concepts, so we could create our own ONNX namespace for the ops referenced by the Web NN API
14:47:22 [anssik]
Rafael: it is up to us to decide how many ops to adopt, the op definitions themselves would come from ONNX standards body
14:48:11 [anssik]
danielsmilkov: that makes sense, want to be clear: because of portability issues and JS libs as users, some changes to ONNX may be needed, e.g. memory layout
14:48:43 [anssik]
Paul: that's fairly reasonable, ONNX community would certainly welcome that
14:48:58 [anssik]
danielsmilkov: relaxing, not breaking existing ONNX behaviour
14:49:13 [anssik]
... going to custom ops
14:50:09 [anssik]
... we deal with real models every day, need to add ops to TF; interoperability is important e.g. for pre- and post-processing of media, video
14:50:22 [anssik]
q?
14:50:48 [anssik]
ack jdarpinian
14:51:21 [anssik]
jdarpinian: also need to look into the hardware we want to support, there's a lot of hardware out there and new coming up, e.g. neural engines in ARM chips
14:51:44 [kainino]
q+
14:51:58 [anssik]
Nikhil: that's a good point, e.g. for matmul would be good to do homework checking how that works across all hardware
14:55:28 [anssik]
anssik: Daniel and Nikhil could you move your doc https://docs.google.com/document/d/1RXCkZ9mliWbqSakYvNlWhsRH4yFtnpe1YQQNFAIRZo8/edit#heading=h.n1gbg8k8lggq into a GH issue
14:55:37 [anssik]
Nikhil: yes, we'll do that
14:56:29 [anssik]
danielsmilkov: GH issue #17 there's a comment where Ningxin_Hu proposed 14 ops, we could do the work to split these 14 ops into 3-4 GH issues with some logical bundling
14:57:24 [anssik]
PROPOSED RESOLUTION: The specification will reference the ONNX operations and if there are any improvements desired for ONNX the work should be there.
14:57:36 [Ningxin_Hu]
14 ops proposal: https://github.com/webmachinelearning/webnn/issues/17#issuecomment-512651711
14:58:41 [anssik]
PROPOSED RESOLUTION: The specification will reference a subset of the ONNX operations, starting small, adding more ops when compatibility with major ML JavaScript frameworks has been validated
14:58:48 [kainino]
q+ re: jdarpinian, want to point out it's important to not only understand the current and upcoming hardware, but since the browser runs in userspace we also need to run on top of the userspace apis (NNAPI, CoreML, DirectML) so we are constrained by how they expose things
14:59:31 [Nikhil]
q+ AI for custom ops
14:59:34 [anssik]
q?
14:59:43 [anssik]
ack want
14:59:43 [Zakim]
want, you wanted to talk about custom ops technical details
14:59:46 [anssik]
q?
14:59:51 [anssik]
ack AI
14:59:51 [Zakim]
AI, you wanted to discuss custom ops
14:59:56 [anssik]
ack kainino
14:59:56 [Zakim]
kainino, you wanted to discuss jdarpinian, want to point out it's important to not only understand the current and upcoming hardware, but since the browser runs in userspace we
15:00:00 [Zakim]
... also need to run on top of the userspace apis (NNAPI, CoreML, DirectML) so we are constrained by how they expose things
15:00:35 [anssik]
kainino: we want to point out it's important to not only understand the current and upcoming hardware, but since the browser runs in userspace
15:00:43 [anssik]
... we also need to run on top of the userspace apis (NNAPI, CoreML, DirectML) so we are constrained by how they expose things
15:00:49 [anssik]
q?
15:00:52 [anssik]
ack Nikhil
15:01:14 [anssik]
Nikhil: sharing memory with custom ops needs to be better understood
15:01:37 [anssik]
... can you Ningxin_Hu do that investigation?
15:01:57 [anssik]
Ningxin_Hu: with help from james or kai we could make progress with custom ops issue
15:02:25 [anssik]
Rafael: have bandwidth to help, but not time to drive
15:02:32 [anssik]
jdarpinian: the same, can help not drive
15:02:41 [anssik]
Ningxin_Hu: I can take the lead, with help from others
15:03:08 [anssik]
PROPOSED RESOLUTION: The specification will reference a subset of the ONNX operations, starting small, adding more ops when compatibility with major ML JavaScript frameworks has been validated
15:03:13 [kainino]
Ningxin_Hu: Please reach out to us as needed
15:03:23 [kainino]
oops that's supposed to be @Ningxin_Hu
15:03:39 [Ningxin_Hu]
thanks @kainino
15:05:10 [anssik]
https://www.w3.org/2019/09/TPAC/
15:06:35 [anssik]
any concerns with the amended proposed resolution?
15:06:43 [anssik]
[hearing no concerns]
15:06:48 [anssik]
RESOLUTION: The specification will reference a subset of the ONNX operations, starting small, adding more ops when compatibility with major ML JavaScript frameworks has been validated
15:06:59 [anssik]
RRSAgent, draft minutes v2
15:06:59 [RRSAgent]
I have made the request to generate https://www.w3.org/2019/08/08-webmachinelearning-minutes.html anssik
15:08:42 [anssik]
TOPIC: Adjourn
15:09:08 [anssik]
RRSAgent, draft minutes v2
15:09:08 [RRSAgent]
I have made the request to generate https://www.w3.org/2019/08/08-webmachinelearning-minutes.html anssik
17:08:27 [Zakim]
Zakim has left #webmachinelearning