IRC log of webmachinelearning on 2019-08-08
Timestamps are in UTC.
- 14:00:43 [RRSAgent]
- RRSAgent has joined #webmachinelearning
- 14:00:43 [RRSAgent]
- logging to https://www.w3.org/2019/08/08-webmachinelearning-irc
- 14:00:44 [Zakim]
- Zakim has joined #webmachinelearning
- 14:00:54 [anssik]
- RRSAgent, make logs public
- 14:01:14 [anssik]
- Meeting: WebML CG Teleconference – 8 August 2019
- 14:01:19 [anssik]
- Chair: Anssi
- 14:01:26 [anssik]
- Agenda: https://github.com/webmachinelearning/meetings/blob/master/telcons/2019-08-08-agenda.md
- 14:01:27 [Ningxin_Hu]
- Ningxin_Hu has joined #webmachinelearning
- 14:01:34 [kainino]
- kainino has joined #webmachinelearning
- 14:01:34 [anssik]
- Scribe: Anssi
- 14:01:45 [anssik]
- scribeNick: anssik
- 14:01:51 [anssik]
- Regrets+ Thomas_Steiner
- 14:01:56 [anssik]
- Present+ Anssi_Kostiainen
- 14:02:00 [anssik]
- Present+ Rafael_Cintron
- 14:02:02 [Rafael]
- Rafael has joined #webmachinelearning
- 14:02:02 [Ningxin_Hu]
- Present+ Ningxin_Hu
- 14:02:04 [anssik]
- Present+ Ganesan_Ramalingam
- 14:02:09 [Rafael]
- present+
- 14:02:11 [anssik]
- Present+ Paul_McDaniel
- 14:02:23 [anssik]
- Present+ Gabe_Esteven
- 14:02:39 [anssik]
- Present+ Jonathan_Bingham
- 14:03:11 [anssik]
- Present+ Kai_Ninomiya
- 14:03:31 [anssik]
- Present+ Greg_Whitworth
- 14:03:38 [anssik]
- RRSAgent, draft minutes v2
- 14:03:38 [RRSAgent]
- I have made the request to generate https://www.w3.org/2019/08/08-webmachinelearning-minutes.html anssik
- 14:04:04 [Nikhil]
- Present+ Nikhil_Thorat
- 14:04:07 [Nikhil]
- Present+ Daniel_Smilkov
- 14:04:10 [Nikhil]
- :)
- 14:04:27 [anssik]
- RRSAgent, draft minutes v2
- 14:04:27 [RRSAgent]
- I have made the request to generate https://www.w3.org/2019/08/08-webmachinelearning-minutes.html anssik
- 14:04:40 [anssik]
- TOPIC: Define the set of operations and their specification
- 14:04:51 [anssik]
- -> https://github.com/webmachinelearning/webnn/issues/17 Define the set of operations and their specification #17
- 14:05:17 [anssik]
- anssik: we had a review of the proposed resolution and received good feedback that we need to resolve, let's discuss that now.
- 14:05:50 [anssik]
- ... the objective of this call is to resolve objections raised for the proposed resolution and clarify proposed resolution based on feedback where appropriate
- 14:06:38 [anssik]
- To start, I captured the following questions from issue #17 that we need to resolve:
- 14:07:06 [anssik]
- nsthorat: "An important part of this specification will be ensuring this set of ops are compatible with the major ML JavaScript frameworks [...] it's not possible for us to move forward with this resolution without understanding compatibility."
- 14:07:41 [anssik]
- jbingham: "what's the plan for dealing with versioning?"
- 14:07:55 [anssik]
- jbingham: "How are custom ops defined and included in the graph?"
- 14:08:09 [anssik]
- walrusmcd: "How many ops?"
- 14:08:29 [anssik]
- jbingham: "Decide if a graph API is the right thing to standardize on"
- 14:09:13 [anssik]
- anssik: To summarize, we need to choose a set of operations to be included in the API that enables adequate compatibility with the major ML frameworks
- 14:09:56 [anssik]
- q+ to ask about something
- 14:10:04 [Nikhil]
- q+ to talk about something
- 14:10:19 [anssik]
- q+
- 14:10:22 [anssik]
- q?
- 14:10:22 [Nikhil]
- q+ to talk about onnx & tf lite compatibility doc: https://docs.google.com/document/d/1RXCkZ9mliWbqSakYvNlWhsRH4yFtnpe1YQQNFAIRZo8/edit
- 14:10:25 [jonathan]
- jonathan has joined #webmachinelearning
- 14:10:37 [anssik]
- ack Nikhil
- 14:10:37 [Zakim]
- Nikhil, you wanted to talk about something and to talk about onnx & tf lite compatibility doc: https://docs.google.com/document/d/1RXCkZ9mliWbqSakYvNlWhsRH4yFtnpe1YQQNFAIRZo8/edit
- 14:11:04 [anssik]
- Nikhil: shared a doc in the chat, please take a look
- 14:11:25 [anssik]
- ... spent time looking at compat, started with 2 basic ops, tried to understand the differences
- 14:11:27 [jdarpinian]
- jdarpinian has joined #webmachinelearning
- 14:12:16 [gabe]
- gabe has joined #webmachinelearning
- 14:12:23 [anssik]
- ... starting with a low number of ops is our preference, growing that over time to understand the compat issues of each op
- 14:13:00 [Rafael]
- q+
- 14:13:33 [kainino]
- Present+ James_Darpinian
- 14:14:02 [anssik]
- danielsmilkov: this is about diffing libs, looking into possible compat issues, 1) when comparing with NN API e.g. some ops allow fusing, we propose separate ops with no fused kernel; under the hood an implementer could fuse the ops so that it runs well on particular hardware
- 14:14:39 [anssik]
- ... ONNX is opinionated regarding the layout, TF Lite wants channels to come last, different hw depends on channels-first or channels-last
- 14:14:49 [anssik]
- ... which layout is better changes over time
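The fusing point above can be illustrated with a minimal sketch (illustrative only, plain Python lists standing in for tensors): the API can expose separate ops while a backend remains free to fuse them, as long as the observable result is identical.

```python
# Illustrative sketch, not from the minutes: separate ops vs. a fused kernel.

def matmul(a, b):
    # Naive matrix multiply over nested lists.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def relu(m):
    return [[max(0.0, v) for v in row] for row in m]

def fused_matmul_relu(a, b):
    # A backend "fused kernel": one pass, same semantics as relu(matmul(a, b)).
    return [[max(0.0, sum(x * y for x, y in zip(row, col))) for col in zip(*b)] for row in a]

a = [[1.0, -2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, -8.0]]
assert relu(matmul(a, b)) == fused_matmul_relu(a, b)
```

Because the fused kernel is an implementation detail, the spec can stay hardware agnostic while still allowing each backend to pick the fusion that suits its hardware.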
- 14:16:31 [anssik]
- Nikhil: would prefer to start very small with a POC that works, and have a plan how to grow that set of ops
- 14:16:39 [anssik]
- q?
- 14:17:24 [anssik]
- ... probably need a way to deal with custom ops, have a way in app space to describe custom ops that share memory with matmul
- 14:17:31 [anssik]
- q?
- 14:17:49 [anssik]
- ack Rafael
- 14:18:27 [anssik]
- Rafael: I agree with a plan to keep this hardware agnostic
- 14:18:39 [Nikhil]
- awesome! that would be great.
- 14:18:47 [Nikhil]
- regarding the script to convert onnx / tensorflow
- 14:20:10 [anssik]
- Paul: agree with everything Rafael said, basically yes, goal is to be hw agnostic
- 14:21:10 [anssik]
- ... ONNX has done work on channel formats and hit these same issues, proposed solutions
- 14:21:18 [anssik]
- q?
- 14:21:21 [Ningxin_Hu]
- q+ to talk about op set & use cases
- 14:21:22 [anssik]
- ack anssik
- 14:21:22 [Zakim]
- anssik, you wanted to ask about something and to
- 14:21:29 [anssik]
- ack Ningxin_Hu
- 14:21:29 [Zakim]
- Ningxin_Hu, you wanted to talk about op set & use cases
- 14:22:17 [anssik]
- Ningxin_Hu: thanks for the efforts of Nikhil and Daniel, great work! Agree with approach of starting with a small set of ops and validate compat with JS libs
- 14:23:26 [anssik]
- ... proposal how to grow the op set: add ops that are needed to implement identified use cases
- 14:23:42 [anssik]
- https://webmachinelearning.github.io/webnn/#usecases
- 14:23:50 [Nikhil]
- q+ want to talk about custom ops technical details
- 14:24:26 [Ningxin_Hu]
- op set and use cases: https://github.com/webmachinelearning/webnn/issues/17#issuecomment-508426036
- 14:24:29 [anssik]
- [silence, agreement]
- 14:24:38 [anssik]
- ack Nikhil
- 14:25:32 [anssik]
- q+ Nikhil
- 14:26:10 [anssik]
- Paul: we took an approach where we selected the ops that benefit from hw acceleration
- 14:26:28 [anssik]
- ... a bit similar approach to CUDA
- 14:27:43 [anssik]
- Ningxin_Hu: if we only select expensive ops that benefit from hw, that may impose a perf penalty when doing context switching
- 14:28:19 [anssik]
- Paul: I agree, it might be worth prototyping that now, the assumption we're proposing is that this hybrid approach (w/ WebGL) is viable
- 14:28:25 [jonathan]
- What other ML frameworks should review each op, like Daniel did for TensorFlow, and confirm compatibility before we finalize the definition?
- 14:30:15 [anssik]
- Ningxin_Hu: agree with Paul's comments, when interleaving with Wasm in our POC the overhead was significant
- 14:30:51 [anssik]
- Rafael: CPU readback is slow, staying with GPU compute shaders should work pretty well
- 14:30:57 [anssik]
- q?
- 14:31:29 [RRSAgent]
- I have made the request to generate https://www.w3.org/2019/08/08-webmachinelearning-minutes.html anssik
- 14:32:02 [anssik]
- q?
- 14:32:24 [anssik]
- ack Nik
- 14:32:27 [anssik]
- ack Nikhil
- 14:32:29 [anssik]
- q?
- 14:32:30 [jdarpinian]
- q+
- 14:32:36 [anssik]
- ack jdarpinian
- 14:33:02 [anssik]
- jdarpinian: I'm on the Chrome team and think custom ops based on WebGL can work, but will be very complex to implement
- 14:33:33 [Nikhil]
- We think it's important to be able to have custom operations share memory with conv2d / matmul without doing a readback. for cpu-accelerators, share the buffer with WASM, for gpu-accelerators share the buffer with WebGL
- 14:33:52 [anssik]
- ... portability of custom ops between different systems, CPU and GPU, is not very good
- 14:33:56 [Ningxin_Hu]
- q+ talk about mem layout reordering overhead between custom ops and hw accelerated ops
- 14:34:05 [Ningxin_Hu]
- q+
- 14:34:14 [anssik]
- q?
- 14:34:34 [Nikhil]
- this allows us to grow the spec slowly and not have tail-end ops be bottlenecks, and the WebNN accelerated ops can get quick wins by accelerating the bottleneck ops (conv2d, matmul, etc.)
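The buffer-sharing idea can be sketched as follows (illustrative only; a Python `memoryview` stands in for a shared Wasm or WebGL buffer): a custom op and a built-in op operate on the same backing memory, so no readback copy is needed between them.

```python
# Illustrative sketch, not from the minutes: a built-in accelerated op and a
# custom "app space" op writing to the same backing buffer, analogous to
# sharing a Wasm or WebGL buffer instead of reading the tensor back.
import array

buf = array.array('f', [1.0, -2.0, 3.0, -4.0])
view = memoryview(buf)  # zero-copy view, stands in for a shared tensor buffer

def builtin_scale(v, s):
    # "Accelerated" built-in op writing in place.
    for i in range(len(v)):
        v[i] = v[i] * s

def custom_relu(v):
    # Custom op touching the same memory, no readback copy.
    for i in range(len(v)):
        if v[i] < 0.0:
            v[i] = 0.0

builtin_scale(view, 2.0)
custom_relu(view)
assert list(buf) == [2.0, 0.0, 6.0, 0.0]
```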
- 14:34:37 [anssik]
- Paul: I think Ningxin_Hu posted an architecture diagram
- 14:34:51 [anssik]
- -> https://github.com/webmachinelearning/webnn/issues/17#issuecomment-518915131 arch diagram
- 14:35:28 [anssik]
- Paul: frameworks will do the heavy lifting, web developer won't see the complexity
- 14:36:01 [anssik]
- Nikhil: we think the same, but not all devices have a WebGL backend, so fall back to Wasm for example
- 14:36:07 [anssik]
- q?
- 14:36:14 [anssik]
- ack Ningxin_Hu
- 14:36:42 [anssik]
- Ningxin_Hu: about custom ops, folks talked about memory transfer overhead
- 14:38:08 [anssik]
- ... even long SIMD instructions on CPU can require tensor memory re-layout, an expensive operation
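The re-layout cost mentioned here can be sketched as follows (illustrative only): converting a tensor between channels-last (HWC) and channels-first (CHW) is a full copy of the data, which is the per-boundary overhead between custom ops and hw-accelerated ops.

```python
# Illustrative sketch, not from the minutes: layout conversion is a full
# re-copy of the tensor, paid every time data crosses a layout boundary.

def hwc_to_chw(t):
    # t is an H x W x C nested list; returns a C x H x W copy.
    H, W, C = len(t), len(t[0]), len(t[0][0])
    return [[[t[h][w][c] for w in range(W)] for h in range(H)] for c in range(C)]

hwc = [[[1, 2], [3, 4]],
       [[5, 6], [7, 8]]]          # H=2, W=2, C=2
chw = hwc_to_chw(hwc)
assert chw == [[[1, 3], [5, 7]], [[2, 4], [6, 8]]]
```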
- 14:38:19 [anssik]
- q?
- 14:38:29 [Nikhil]
- q+
- 14:39:35 [anssik]
- anssik: it was asked on the issue whether graph is the right abstraction?
- 14:40:31 [jdarpinian]
- q+
- 14:40:45 [anssik]
- jonathan: what are the other JS frameworks we need to include in the compatibility study?
- 14:41:32 [anssik]
- Paul: in ONNX we considered all frameworks that matter, they have a voice in ONNX project
- 14:42:28 [anssik]
- ... in ONNX we have considered PyTorch, Caffe, Intel's frameworks, Microsoft's frameworks, TensorFlow (we have an ONNX-to-TF converter), Apple's CoreML
- 14:42:50 [anssik]
- ... CoreML was part of the opset 1 compatibility
- 14:42:55 [anssik]
- q?
- 14:43:31 [anssik]
- Nikhil: specifically interested in JS ML frameworks
- 14:43:40 [anssik]
- ... for compatibility
- 14:44:06 [anssik]
- ... for example, Brain.js
- 14:44:54 [anssik]
- Paul: we don't want to have two bodies managing op schema, right?
- 14:45:18 [anssik]
- Nikhil: we want to grow slowly, right?
- 14:45:40 [anssik]
- ... focus on web stuff to figure out an intersection of JS ML libraries, does that sound reasonable?
- 14:46:41 [anssik]
- Paul: ONNX does have namespace and versioning concepts, so we could create our own ONNX namespace for the ops referenced by the Web NN API
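Paul's namespacing point can be sketched as follows (illustrative only; the domain string and registry shapes here are hypothetical stand-ins, not actual ONNX APIs): ONNX identifies an operator by a (domain, op_type) pair and versions each domain separately via opset imports, so a dedicated domain could hold the Web NN op subset.

```python
# Illustrative sketch, not from the minutes: per-domain op registries and
# opset imports. The "org.webmachinelearning" domain name is hypothetical.

onnx_default = {("", "Conv"), ("", "MatMul")}          # default ONNX domain
webnn_domain = {("org.webmachinelearning", "MatMul")}  # hypothetical namespace

# A model declares which version of each domain it uses.
model_opset_imports = {"": 10, "org.webmachinelearning": 1}

def resolve(domain, op_type, registry):
    # An op is valid only if its (domain, op_type) pair is registered and
    # the model imports some version of that domain.
    return (domain, op_type) in registry and domain in model_opset_imports

assert resolve("", "MatMul", onnx_default)
assert resolve("org.webmachinelearning", "MatMul", webnn_domain)
assert not resolve("", "MyCustomOp", onnx_default)
```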
- 14:47:22 [anssik]
- Rafael: it is up to us to decide how many ops to adopt, the op definitions themselves would come from ONNX standards body
- 14:48:11 [anssik]
- danielsmilkov: that makes sense, want to be clear: because of portability issues and JS libs as users, some changes to ONNX may be needed, e.g. memory layout
- 14:48:43 [anssik]
- Paul: that's fairly reasonable, ONNX community would certainly welcome that
- 14:48:58 [anssik]
- danielsmilkov: relaxing, not breaking existing ONNX behaviour
- 14:49:13 [anssik]
- ... going to custom ops
- 14:50:09 [anssik]
- ... we deal with real models every day, need to add ops to TF, interoperability important for e.g. pre- and post-processing of media, video
- 14:50:22 [anssik]
- q?
- 14:50:48 [anssik]
- ack jdarpinian
- 14:51:21 [anssik]
- jdarpinian: also need to look into the hardware we want to support, there's a lot of hardware out there and new coming up, e.g. neural engines in upcoming ARM chips
- 14:51:44 [kainino]
- q+
- 14:51:58 [anssik]
- Nikhil: that's a good point, e.g. for matmul would be good to do homework checking how that works across all hardware
- 14:55:28 [anssik]
- anssik: Daniel and Nikhil could you move your doc https://docs.google.com/document/d/1RXCkZ9mliWbqSakYvNlWhsRH4yFtnpe1YQQNFAIRZo8/edit#heading=h.n1gbg8k8lggq into a GH issue
- 14:55:37 [anssik]
- Nikhil: yes, we'll do that
- 14:56:29 [anssik]
- danielsmilkov: in GH issue #17 there's a comment where Ningxin_Hu proposed 14 ops, we could do the work to split these 14 ops into 3-4 GH issues with some logical bundling
- 14:57:24 [anssik]
- PROPOSED RESOLUTION: The specification will reference the ONNX operations and if there are any improvements desired for ONNX the work should be there.
- 14:57:36 [Ningxin_Hu]
- 14 ops proposal: https://github.com/webmachinelearning/webnn/issues/17#issuecomment-512651711
- 14:58:41 [anssik]
- PROPOSED RESOLUTION: The specification will reference a subset of the ONNX operations, starting small, adding more ops when compatibility with major ML JavaScript frameworks has been validated
- 14:58:48 [kainino]
- q+ re: jdarpinian, want to point out it's important to not only understand the current and upcoming hardware, but since the browser runs in userspace we also need to run on top of the userspace apis (NNAPI, CoreML, DirectML) so we are constrained by how they expose things
- 14:59:31 [Nikhil]
- q+ AI for custom ops
- 14:59:34 [anssik]
- q?
- 14:59:43 [anssik]
- ack want
- 14:59:43 [Zakim]
- want, you wanted to talk about custom ops technical details
- 14:59:46 [anssik]
- q?
- 14:59:51 [anssik]
- ack AI
- 14:59:51 [Zakim]
- AI, you wanted to discuss custom ops
- 14:59:56 [anssik]
- ack kainino
- 14:59:56 [Zakim]
- kainino, you wanted to discuss jdarpinian, want to point out it's important to not only understand the current and upcoming hardware, but since the browser runs in userspace we
- 15:00:00 [Zakim]
- ... also need to run on top of the userspace apis (NNAPI, CoreML, DirectML) so we are constrained by how they expose things
- 15:00:35 [anssik]
- kainino: we want to point out it's important to not only understand the current and upcoming hardware, but since the browser runs in userspace
- 15:00:43 [anssik]
- ... we also need to run on top of the userspace apis (NNAPI, CoreML, DirectML) so we are constrained by how they expose things
- 15:00:49 [anssik]
- q?
- 15:00:52 [anssik]
- ack Nikhil
- 15:01:14 [anssik]
- Nikhil: sharing memory with custom ops needs to be better understood
- 15:01:37 [anssik]
- ... can you Ningxin_Hu do that investigation?
- 15:01:57 [anssik]
- Ningxin_Hu: with help from james or kai we could make progress with custom ops issue
- 15:02:25 [anssik]
- Rafael: have bandwidth to help, but not time to drive
- 15:02:32 [anssik]
- jdarpinian: the same, can help but not drive
- 15:02:41 [anssik]
- Ningxin_Hu: I can take the lead, with help from others
- 15:03:08 [anssik]
- PROPOSED RESOLUTION: The specification will reference a subset of the ONNX operations, starting small, adding more ops when compatibility with major ML JavaScript frameworks has been validated
- 15:03:13 [kainino]
- Ningxin_Hu: Please reach out to us as needed
- 15:03:23 [kainino]
- oops that's supposed to be @Ningxin_Hu
- 15:03:39 [Ningxin_Hu]
- thanks @kainino
- 15:05:10 [anssik]
- https://www.w3.org/2019/09/TPAC/
- 15:06:35 [anssik]
- any concerns with the amended proposed resolution?
- 15:06:43 [anssik]
- [hearing no concerns]
- 15:06:48 [anssik]
- RESOLUTION: The specification will reference a subset of the ONNX operations, starting small, adding more ops when compatibility with major ML JavaScript frameworks has been validated
- 15:06:59 [anssik]
- RRSAgent, draft minutes v2
- 15:06:59 [RRSAgent]
- I have made the request to generate https://www.w3.org/2019/08/08-webmachinelearning-minutes.html anssik
- 15:08:42 [anssik]
- TOPIC: Adjourn
- 15:09:08 [anssik]
- RRSAgent, draft minutes v2
- 15:09:08 [RRSAgent]
- I have made the request to generate https://www.w3.org/2019/08/08-webmachinelearning-minutes.html anssik
- 17:08:27 [Zakim]
- Zakim has left #webmachinelearning