15:00:19 RRSAgent has joined #webmachinelearning
15:00:19 logging to https://www.w3.org/2021/11/18-webmachinelearning-irc
15:00:27 Zakim has joined #webmachinelearning
15:00:33 Zakim, prepare meeting
15:00:33 RRSAgent, make logs Public
15:00:35 please title this meeting ("meeting: ..."), anssik
15:00:38 Meeting: WebML WG Teleconference – 18 Nov 2021
15:00:43 Present+ Dom, Anssi, Ningxin, FengDai, Rafael, Chai
15:00:46 Chair: Anssi
15:00:53 Agenda: https://github.com/webmachinelearning/meetings/blob/master/telcons/2021-11-18-agenda.md
15:00:54 ningxin_hu has joined #webmachinelearning
15:01:01 Scribe: Anssi
15:01:06 scribeNick: anssik
15:01:16 RafaelCintron has joined #webmachinelearning
15:01:17 Present+ Anssi_Kostiainen
15:01:20 Chai has joined #webmachinelearning
15:01:24 Present+ Jonathan
15:01:31 Present+ Dominique_Hazael-Massieux
15:01:31 Present+ Chai_Chaoweeraprasit
15:01:32 Present+ Rachel
15:02:07 Present+ Feng_Dai
15:02:30 Present+ Jonathan_Bingham
15:02:37 Present+ Ningxin_Hu
15:02:51 Present+ Rafael_Cintron
15:02:58 RRSAgent, draft minutes
15:02:58 I have made the request to generate https://www.w3.org/2021/11/18-webmachinelearning-minutes.html anssik
15:03:10 RRSAgent, make log public
15:03:13 scribe+
15:03:19 Topic: TPAC meeting follow-up
15:03:44 anssi: we had great presentations over the 3 days of our TPAC meeting
15:04:05 ... I'm thinking of debriefing these discussions in the order we had them during TPAC
15:04:25 ... we could also add Model Loader to our agenda
15:04:30 Subtopic: Rationale/criteria for adding new ops to the WebNN API
15:04:47 -> https://lists.w3.org/Archives/Public/www-archive/2021Nov/att-0000/W3C_Adding_new_Operators.pdf ONNX adding new operators presentation
15:04:57 Rachel has joined #webmachinelearning
15:05:13 anssi: we had a guest speaker, Michal from the ONNX project
15:05:44 ... how much of these learnings resonate with our needs? which of these thoughts should we adapt to our work?
15:06:18 ... from my perspective, much of the guidance Michal showed we've been following implicitly, like decomposition into primitives, or backing operators with use cases
15:06:35 ... or tying addition to support in popular frameworks
15:07:27 q+
15:07:41 ack Chai
15:07:43 Rafael: Rama is the one most closely involved in ONNX & operators
15:08:43 chai: in general, I would say we have already taken many of the benefits and design considerations from ONNX in WebNN
15:08:59 ... I think we're already pretty aligned with what was presented
15:09:31 ... one tension point that I'm still unsure about: ONNX was initially focused fully on interoperability, as a portable machine learning format, without being specific about implementations
15:09:39 Feng_Dai has joined #webmachinelearning
15:10:08 ... ONNX is pretty "relaxed" in that it defines the semantics of an operation without being prescriptive about how it should be implemented
15:10:16 ... WebNN is almost the same: we want interop
15:10:31 ... but the difference is that WebNN is designed for performance
15:10:51 ... to give better results than e.g. WebGL-based framework implementations
15:11:04 ... direct access to platform APIs allows this additional performance
15:11:34 ... because we want it to be fast *and* cross-platform interoperable, this creates a tension point
15:11:55 Jonathan has joined #webmachinelearning
15:11:58 ... ONNX didn't have to consider hardware acceleration
15:12:25 ... for WebNN, we should try to be optimal first while providing cross-framework support
15:12:34 ... we need to pay attention to how operators will be implemented
15:12:57 +1 for implementability
15:13:00 anssi: ONNX doesn't include a reference implementation
15:13:08 ... why was that decision made?
15:13:21 ... we've been discussing implementing a proper test suite for WebNN
15:13:25 ... any learnings from ONNX?
15:13:59 ... Web APIs traditionally don't have a reference implementation
15:14:17 Chai: this decision emerges from ONNX not being implementation-specific
15:14:27 ... from a standards point of view, I agree we shouldn't care
15:14:29 q?
15:14:43 ... but for testing, we do need a reference implementation to serve as a baseline for the semantic behavior to compare against
15:14:57 ... otherwise there is no clarity on how it should work
15:15:04 ... but this shouldn't be part of the standard
15:16:03 ... for conformance testing, we have to compare results from the hardware with something
15:16:09 ... as I have been discussing with Bruce
15:16:21 q+
15:16:45 anssi: should we define a lightweight process for submitting new operators?
15:16:55 ... do we want to add some more formality to our current process?
15:17:19 rachel: +1 to developing some reference implementation
15:17:52 ... ONNX may be presenting a simpler path to ease adoption
15:19:14 q?
15:19:16 ack ningxin_hu
15:20:09 Ningxin: two examples of lessons I learned: static inputs are preferred over dynamic values
15:20:14 ONNX "Prefer static attributes over dynamic input values"
15:20:34 ... this relates to our decision to move the min / max values from dynamic to static
15:20:47 ... with static values being easier to optimize and to map to native APIs
15:21:07 -> https://github.com/webmachinelearning/webnn/issues/224 Use tensor type for the padding rather than MLOperand #224
15:21:13 ... another is issue #224, which proposes to change the pad operator to use a static array
15:21:44 ... I would like us to consider this as a guideline for adding new ops
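To make the "static over dynamic" guideline concrete, here is a minimal sketch contrasting the two shapes of pad() discussed in issue #224; the MLGraphBuilder surface is simplified and the signatures are illustrative, not the agreed API.

    // Hedged sketch: illustrates "prefer static attributes over dynamic input
    // values" for pad(), per the proposal in issue #224. The WebNN interfaces
    // are declared loosely here; the real IDL differs in detail.
    type MLOperand = object;

    interface MLGraphBuilder {
      constant(desc: { type: string; dimensions: number[] }, data: Int32Array): MLOperand;
      pad(input: MLOperand, padding: MLOperand | number[]): MLOperand;
    }

    // Dynamic form: the padding is itself an operand (a tensor), so the
    // implementation cannot know the output shape until values are bound.
    function padDynamic(builder: MLGraphBuilder, input: MLOperand): MLOperand {
      const padding = builder.constant(
          { type: 'int32', dimensions: [2, 2] }, new Int32Array([1, 1, 1, 1]));
      return builder.pad(input, padding);
    }

    // Static form (the #224 proposal): the padding is a plain array known at
    // graph-build time, so shapes can be inferred and the graph optimized.
    function padStatic(builder: MLGraphBuilder, input: MLOperand): MLOperand {
      return builder.pad(input, [1, 1, 1, 1]);
    }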
15:22:22 ... another lesson is to include the shape inference logic
15:22:26 ONNX "Shape inference logic should be included"
15:22:33 +1 on prefer static, dynamism breaks accelerator's pipelining
15:22:34 ... that's mostly true for WebNN operators
15:23:00 ... but there are gaps, e.g. the recent issue where the @@@ is missing for conv2d
15:23:31 ... we could call out the output shape calculation - that will help implementations, both the reference impl and browser implementations
15:23:40 ... some native APIs won't calculate the shape by themselves
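As an illustration of the kind of output-shape calculation that could be called out in the spec, a rough sketch for one spatial dimension of conv2d, assuming explicit padding and the commonly used formula; the parameter names are illustrative.

    // Hedged sketch: output size of conv2d along one spatial dimension, using
    // the commonly used formula with explicit padding and dilation. The spec
    // text would need to define this precisely for every layout and option.
    function conv2dOutputSize(
        inputSize: number,
        filterSize: number,
        padBegin: number,
        padEnd: number,
        stride: number,
        dilation: number = 1): number {
      const effectiveFilterSize = (filterSize - 1) * dilation + 1;
      return Math.floor(
          (inputSize - effectiveFilterSize + padBegin + padEnd) / stride) + 1;
    }

    // Example: a 224x224 input, 3x3 filter, padding 1 on each side, stride 2
    // gives an output size of 112 per dimension.
    console.assert(conv2dOutputSize(224, 3, 1, 1, 2) === 112);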
15:24:49 anssi: we could document these principles in a .md file to help us keep track of this
15:24:58 ... maybe complemented with issue/pull request templates on GitHub
15:25:11 ... any objection to adopting this? starting small
15:25:24 +1
15:25:30 RESOLUTION: Create a light-weight process to guide submitting new operator requests to WebNN
15:25:50 Subtopic: Versioning and web compatibility
15:26:09 +1
15:26:20 anssi: we had TAG participants joining us for that one; my takeaway from this session is that we should request incremental TAG reviews as the spec evolves
15:26:26 -> https://github.com/w3ctag/design-reviews/issues/570 Web Neural Network API TAG review
15:26:28 ... WebNN was reviewed some months ago
15:26:40 ... once we evolve the spec, we can ask the TAG for feedback on specific design choices we're making
15:27:19 q?
15:27:57 ... we may want to submit the CG's Model Loader API for early TAG review
15:28:12 Subtopic: Privacy and security discussion
15:28:26 -> https://www.w3.org/TR/fingerprinting-guidance/#bp-summary Fingerprinting best practices
15:28:32 -> https://www.w3.org/TR/fingerprinting-guidance/#identifying-fingerprinting-surface-and-evaluating-severity Fingerprinting severity
15:28:37 anssi: we had members from the Privacy IG (PING) joining us with good discussions
15:28:59 ... Nick Doty, author of the fingerprinting guidance, described some of the best practices in this space
15:29:11 ... we should document the fingerprinting surface of the WebNN API based on these best practices
15:29:31 ... we also discussed with the WebGPU people, based on their experience in that space
15:29:53 ... the adapter selection in WebGPU exposes information
15:29:54 -> https://github.com/webmachinelearning/webnn/issues/169 AI accelerator device selection #169
15:30:02 ... similar to our issue #169
15:30:04 q+
15:30:09 ... any thoughts on this?
15:30:21 q?
15:30:44 q?
15:30:47 ack Chai
15:31:05 Chai: this is a complicated topic; it's not just adding an enum value
15:31:29 ... you don't know where the resources come from; many APIs that run on accelerators nowadays still take resources from the CPU
15:31:52 ... that's a difference between a GPU and an arbitrary accelerator device
15:32:07 ... also, different people think about device selection differently
15:32:21 ... for WebGPU/WebGL, they'll think of it as picking one device among many
15:32:33 ... for people running a model, they want the best device for the model they have
15:32:53 ... this can require workload analysis (heavy) vs picking one of many adapters
15:33:05 ... the design is not fully settled
15:33:31 ... someone has been asking for smart selection logic with an "auto" option
15:33:54 ... it needs a lot more thought, since there is no precedent on how to do that on platforms today
15:34:00 -> https://github.com/webmachinelearning/webnn/pull/207 Adding a new use case for 'Framework Use Cases' #207
15:34:01 ... this may be premature to standardize
15:34:19 q+
15:34:23 q?
15:34:25 ack dom
15:35:07 q+
15:35:08 dom: on the device adapter aspect, I agree with Chai this is a challenging problem; I think the solution might be to identify what kind of params would help the browser pick the right device
15:36:33 ... e.g. the WebXR API is built around selection of the device that is used to run VR/XR experiences, and this can have a significant impact on the API shape
15:36:52 q?
15:36:56 ack ningxin_hu
15:37:02 anssi: we need to understand the problem before designing the solution
15:37:31 ningxin: with WebNN, we have another preference that developers can set based on the power setting - low power vs high performance
15:37:58 ... this would give implementors an opportunity to help the Web app access some AI accelerators if they fit in this power category
15:38:13 ... that would help some AI accelerators be exposed through this setting
15:38:15 q?
15:38:23 q+
15:38:28 ack Chai
15:39:11 chai: even outside the context of the Web, in a native platform API e.g. on Windows, if you want the app to use that, even at that level there is no support for it; we don't know how to do it
15:39:39 ... the current accelerators in the market come with their own ways of @@@
15:40:10 q?
15:40:12 ... there is no universal way to expose these accelerators today
15:40:25 q?
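For context, a minimal sketch of how such a power/device preference could surface at context creation time, loosely following the MLContextOptions hints discussed around issue #169; the option names and values here are assumptions, not agreed API.

    // Hedged sketch only: devicePreference / powerPreference are hints the
    // implementation may use to pick a CPU, GPU, or AI accelerator; they are
    // deliberately not a device enumeration, and an "auto"-style smart
    // selection is explicitly not settled.
    const ml = (navigator as any).ml;  // WebNN entry point, not yet in lib.dom.d.ts

    const context = ml.createContext({
      devicePreference: 'default',   // e.g. 'default' | 'gpu' | 'cpu'
      powerPreference: 'low-power',  // e.g. 'default' | 'high-performance' | 'low-power'
    });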
15:40:35 Subtopic: ML JS framework performance, focus areas for WebNN
15:40:50 -> https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0014/WebNN_ML_JS_Framework_Performance.pdf WebNN ML JS Framework Performance presentation slides
15:41:04 anssi: we got a nice presentation from Ningxin on benchmark results
15:41:13 ... any takeaways in terms of potential improvements to the API?
15:41:27 ... there are two design-related issues: sync vs async, main thread vs worker exposure
15:41:28 -> https://github.com/webmachinelearning/webnn/issues/229 Should restrict the sync APIs to only exist in Workers? #229
15:41:33 -> https://github.com/webmachinelearning/webnn/issues/230 Should WebNN support async APIs? #230
15:41:38 q?
15:41:42 ... they have performance impact
15:42:06 Ningxin: I opened these issues based on feedback from the TPAC meetings
15:42:34 ... the WebNN sync APIs have proved to work very well for WASM-based frameworks (ONNX, TF-Lite, OpenCV)
15:43:24 ... because they're written in C++ using sync primitives, using sync WebNN APIs makes it easy to compile them to WASM
15:43:36 ... but sync APIs block the main thread
15:43:50 ... the solution would be to move the APIs to the worker thread
15:44:04 ... this requires communication between main and worker to exchange the data across threads
15:44:09 ... which can have a performance impact
15:44:22 ... the impact is still unknown
15:44:33 q+
15:44:34 ... the results from my prototypes were based on main thread operations
15:45:47 ... the other issue is from the JS developer perspective, e.g. TF.js
15:46:10 ... it's used mostly in the main thread, where we would need an async API to avoid blocking
15:46:26 q?
15:46:27 ... should we also provide an async API for compute? that would help with JS adoption of WebNN
15:46:33 ack RafaelCintron
15:47:16 RafaelCintron: WebGPU and WebGL are technically async APIs because you submit work on the CPU which then queues it to the GPU
15:48:01 ... when WebNN is used on the GPU, the commands are executed in parallel from a CPU perspective, so essentially async - we should block on the CPU until the inference is done
15:48:43 ... WebNN CPU operations shouldn't be blocking - so it might be OK to have a promise callback
15:49:51 ... workers are challenging to use to manage multiple threads; there is ongoing work to make it simpler to split things across workers
15:50:21 ... we shouldn't require people to use web workers for now
15:50:23 q+
15:50:32 ack dom
15:50:32 great inputs, thanks Rafael
15:51:23 dom: a related discussion is happening in the WebRTC WG, whether to expose an API in the worker context only; no decision yet
15:52:09 ... we want to be careful not to create a situation similar to the XMLHttpRequest API, which initially was both sync and async; the sync API made it easier for developers to adopt but had to be deprecated at a high cost due to the performance penalty
15:52:25 q?
15:53:24 anssik: other APIs besides WebCodecs and mediacapture-transform with similar issues?
15:53:28 q?
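To make the two options in #229 and #230 concrete, a rough sketch of what each shape could look like to callers; MLGraph is simplified here and the compute/computeSync names are placeholders, not agreed API.

    // Hedged sketch only: contrasts the two API shapes under discussion.
    // MLGraph and the compute*/computeSync names are placeholders.
    interface MLGraph {
      computeSync(inputs: Record<string, Float32Array>,
                  outputs: Record<string, Float32Array>): void;
      compute(inputs: Record<string, Float32Array>,
              outputs: Record<string, Float32Array>): Promise<void>;
    }

    // Option A (issue #229): keep the sync call, but only in a worker, which
    // suits C++/WASM frameworks compiled with synchronous primitives.
    function runInWorker(graph: MLGraph, input: Float32Array, output: Float32Array): void {
      graph.computeSync({ x: input }, { y: output });   // blocks the worker, not the page
    }

    // Option B (issue #230): add an async variant usable on the main thread,
    // which suits JS frameworks such as TensorFlow.js.
    async function runOnMainThread(graph: MLGraph, input: Float32Array, output: Float32Array) {
      await graph.compute({ x: input }, { y: output }); // does not block the main thread
    }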
15:53:48 anssik: let's continue discussion in the issues Ningxin opened
15:54:09 Subtopic: Integrating an open-source cross-platform implementation of the Web Neural Network API into a web engine
15:54:41 -> https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0015/Integrate_WebNN-native_into_Chromium_TPAC.pdf Integrate WebNN-native into Chromium presentation slides
15:54:58 -> https://github.com/webmachinelearning/webnn/issues/223 Proposal to start documenting implementation status
15:55:04 anssik: I've opened a GitHub issue as a follow-up to document implementation status
15:55:24 https://webmachinelearning.github.io/
15:55:39 q?
15:55:47 The ChromeStatus entry for WebNN was created: https://www.chromestatus.com/feature/5738583487938560
15:56:07 q?
15:57:40 ningxin: the addition of the ChromeStatus entry is part of the Chromium feature launch process
15:57:47 ... as would be the prototyping
15:58:00 Present+ Raviraj
15:58:05 ... with help from others, we've documented the motivation, specification link and design doc
15:58:36 ... this includes information about compatibility, sync vs async, ethical considerations
15:58:46 ... this is a starting point
15:59:03 ... this is under review by the Chromium stakeholders
15:59:25 ... after that, we will send an intent to prototype to blink-dev and expect to receive broader feedback
15:59:36 ... once that's done, I'll share the link to that email here
15:59:37 great progress, Ningxin!
15:59:53 anssik: great progress, looking forward to seeing this land
15:59:57 q?
16:00:14 me too
16:00:16 RRSAgent, draft minutes
16:00:16 I have made the request to generate https://www.w3.org/2021/11/18-webmachinelearning-minutes.html dom
16:02:53 anssik: in the interest of time, we'll defer "Conformance testing of WebNN API", "Ethical issues in using Machine Learning on the Web" and "Model Loader API update" to our next call, which takes place on 2 Dec 2021
16:03:03 ... thanks for joining!
16:03:16 RRSAgent, draft minutes
16:03:16 I have made the request to generate https://www.w3.org/2021/11/18-webmachinelearning-minutes.html anssik
16:05:13 RRSAgent, draft minutes
16:05:13 I have made the request to generate https://www.w3.org/2021/11/18-webmachinelearning-minutes.html anssik
16:41:03 anssik has joined #webmachinelearning
16:41:08 iank_ has joined #webmachinelearning
16:41:11 gregwhitworth has joined #webmachinelearning
16:41:29 sangwhan has joined #webmachinelearning
18:01:15 Zakim has left #webmachinelearning