IRC log of webmachinelearning on 2021-05-27
Timestamps are in UTC.
- 13:55:24 [RRSAgent]
- RRSAgent has joined #webmachinelearning
- 13:55:24 [RRSAgent]
- logging to https://www.w3.org/2021/05/27-webmachinelearning-irc
- 13:55:27 [Zakim]
- RRSAgent, make logs Public
- 13:55:27 [Zakim]
- please title this meeting ("meeting: ..."), anssik
- 13:55:29 [anssik]
- Meeting: WebML CG Teleconference – 27 May 2021
- 13:55:34 [anssik]
- Chair: Anssi
- 13:55:39 [anssik]
- Agenda: https://github.com/webmachinelearning/meetings/blob/master/telcons/2021-05-27-agenda.md
- 13:55:45 [anssik]
- Scribe: Anssi
- 13:55:54 [anssik]
- scribeNick: anssik
- 13:56:02 [anssik]
- Present+ Anssi_Kostiainen
- 14:00:09 [Ping_Yu]
- Ping_Yu has joined #webmachinelearning
- 14:00:15 [anssik]
- Present+ Ningxin_Hu
- 14:00:33 [ningxin_hu]
- ningxin_hu has joined #webmachinelearning
- 14:00:39 [anssik]
- Present+ Ping_Yu
- 14:02:06 [anssik]
- Present+ Chai_Chaoweeraprasit
- 14:02:40 [anssik]
- Present+ Jonathan_Bingham
- 14:02:59 [anssik]
- RRSAgent, draft minutes
- 14:02:59 [RRSAgent]
- I have made the request to generate https://www.w3.org/2021/05/27-webmachinelearning-minutes.html anssik
- 14:03:25 [Jonathan]
- Jonathan has joined #webmachinelearning
- 14:03:26 [Chai]
- Chai has joined #webmachinelearning
- 14:03:35 [RafaelCintron]
- RafaelCintron has joined #webmachinelearning
- 14:03:52 [anssik]
- Present+ Rafael_Cintron
- 14:04:00 [anssik]
- Topic: Security and Privacy
- 14:04:12 [anssik]
- Subtopic: Security and Privacy Considerations
- 14:04:55 [anssik]
- anssik: First let's review and discuss initial Security and Privacy Considerations.
- 14:05:00 [zkis]
- present+ Zoltan_Kis
- 14:05:10 [anssik]
- anssik: I submitted PR #170 to address issue #122. It should be noted that we expect to evolve this initial version based on additional feedback; this is a starting point.
- 14:05:15 [anssik]
- -> https://github.com/webmachinelearning/webnn/issues/122 Security and privacy considerations (issue #122)
- 14:05:21 [anssik]
- -> https://github.com/webmachinelearning/webnn/pull/170 Add initial Security and Privacy Considerations sections (PR #170)
- 14:05:26 [anssik]
- Present+ Zoltan_Kis
- 14:05:56 [anssik]
- ... Chai LGTM'd the PR #170, pending Ningxin's LGTM. This content meets the bar for First Public Working Draft purposes.
- 14:06:52 [anssik]
- ... In the Security section we should discuss security mechanisms that protect confidentiality, preserve information integrity, or promote availability of data -- we already added Permissions Policy integration per PING feedback
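- For context, a hedged example of what the Permissions Policy integration means for callers, assuming the policy-controlled feature identifier is "webnn"; the header and allow attribute below are standard Permissions Policy machinery, not WebNN-specific:

```ts
// Sketch only: gating WebNN via Permissions Policy, assuming the feature
// identifier "webnn".
//
// Server side, the embedding page could send a header such as:
//   Permissions-Policy: webnn=(self)
//
// Client side, an embedder delegates the feature to a cross-origin iframe
// through the standard `allow` attribute:
const frame = document.createElement('iframe');
frame.src = 'https://third-party.example/model-runner.html'; // hypothetical URL
frame.allow = 'webnn'; // delegate the policy-controlled feature
document.body.appendChild(frame);
```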
- 14:07:27 [anssik]
- ... In Privacy, we discuss measures taken to protect the rights of individuals with respect to personal information, or known privacy concerns
- 14:07:32 [anssik]
- ... fingerprinting is probably the most substantial privacy concern in Web API design
- 14:07:40 [anssik]
- -> https://w3c.github.io/fingerprinting-guidance/ Mitigating Browser Fingerprinting in Web Specifications
- 14:07:58 [anssik]
- anssik: PING has written a doc about it that also proposes mitigations, we all should read it
- 14:08:14 [ningxin_hu]
- sure, I'll review
- 14:08:39 [anssik]
- We have two related [privacy-tracker] labelled issues
- 14:08:45 [anssik]
- -> https://github.com/webmachinelearning/webnn/issues/119 [privacy-tracker] Self-Review Questionnaire (issue #119)
- 14:09:14 [anssik]
- anssik: this documents our questionnaire response and serves as a record that the Privacy Interest Group has acknowledged and is happy with our initial response. We continue to work with PING to expand the privacy considerations.
- 14:09:25 [anssik]
- -> https://github.com/webmachinelearning/webnn/issues/85 [privacy-tracker] Fingerprinting via matmul (issue #85)
- 14:09:37 [anssik]
- anssik: issue #85 is about a possible fingerprinting vector we discussed earlier
- 14:09:47 [anssik]
- ... in PR #170 I incorporated the following statement to inform implementers about this possibility:
- 14:09:54 [anssik]
- ... "An execution time analysis may reveal indirectly the performance of the underlying platform's neural network hardware acceleration capabilities relative to another underlying platform."
- 14:10:55 [anssik]
- anssik: Ningxin provided comments (thanks!) from the Wasm people on how they're handling these concerns, documented in issue #85
- 14:11:00 [anssik]
- ... Ningxin, want to brief us on what you learned?
- 14:11:14 [anssik]
- Ningxin: this is input from Jonathan and Jing, who are involved with Wasm
- 14:11:58 [anssik]
- ... 1. Saturation and rounding (round-to-nearest, ties-to-even) are standardized in Wasm SIMD, so JS developers should see the same saturation behavior on different architectures.
- 14:13:01 [anssik]
- ... 2. There is an early proposal in Wasm SIMD called Relaxed SIMD, which wants to relax some strict determinism requirements of instructions to unlock near native performance on different platforms. Fingerprinting would also be considered there.
- 14:14:03 [RafaelCintron]
- q+
- 14:14:03 [anssik]
- ... it'd be useful to monitor how the Wasm CG addresses these issues
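- For reference, a minimal illustration of what the saturation and round-to-nearest ties-to-even behavior mentioned above means for results; this is an explanatory sketch, not the Wasm SIMD API itself:

```ts
// Saturating signed 8-bit add: out-of-range results clamp to [-128, 127]
// instead of wrapping, so every architecture reports the same value.
function addSatI8(a: number, b: number): number {
  return Math.min(127, Math.max(-128, a + b));
}

// Round-to-nearest, ties-to-even: halfway cases go to the even neighbour.
function roundTiesToEven(x: number): number {
  const floor = Math.floor(x);
  const diff = x - floor;
  if (diff < 0.5) return floor;
  if (diff > 0.5) return floor + 1;
  return floor % 2 === 0 ? floor : floor + 1;
}

console.log(addSatI8(100, 100));   // 127, not the wrapped value -56
console.log(roundTiesToEven(2.5)); // 2
console.log(roundTiesToEven(3.5)); // 4
```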
- 14:14:18 [anssik]
- ack RafaelCintron
- 14:14:52 [anssik]
- RafaelCintron: question re Relaxed SIMD, is this turned on by default?
- 14:15:00 [anssik]
- ningxin_hu: need to check with Wasm people
- 14:15:36 [anssik]
- anssik: if any of these mitigations apply to the WebNN API we should reuse them
- 14:16:13 [ningxin_hu]
- https://github.com/WebAssembly/design/issues/1401
- 14:16:15 [Chai]
- [need to step away briefly. be right back]
- 14:16:46 [ningxin_hu]
- https://github.com/WebAssembly/relaxed-simd
- 14:16:47 [anssik]
- ... let's keep issue #85 open to solicit further feedback
- 14:17:15 [anssik]
- Subtopic: WebGPU/GL Security and Privacy Considerations
- 14:17:28 [Chai]
- [back]
- 14:17:35 [anssik]
- anssik: wanted the group to discuss and review WebGPU/GL Security and Privacy Considerations to understand whether some of them could be repurposed in this context
- 14:17:40 [anssik]
- -> https://gpuweb.github.io/gpuweb/#security WebGPU Security
- 14:17:45 [anssik]
- -> https://gpuweb.github.io/gpuweb/#security-privacy WebGPU Privacy
- 14:17:49 [anssik]
- -> https://www.khronos.org/registry/webgl/specs/latest/1.0/#4 WebGL Security
- 14:18:03 [anssik]
- anssik: WebGPU identifies the following security considerations:
- 14:18:12 [anssik]
- ... - CPU-based undefined behavior
- 14:18:12 [anssik]
- ... - GPU-based undefined behavior
- 14:18:12 [anssik]
- ... - Uninitialized data
- 14:18:12 [anssik]
- ... - Out-of-bounds access in shaders
- 14:18:12 [anssik]
- ... - Invalid data
- 14:18:12 [anssik]
- ... - Driver bugs
- 14:18:13 [anssik]
- ... - Timing attacks
- 14:18:13 [anssik]
- ... - Row hammer attacks
- 14:18:13 [anssik]
- ... - Denial of service
- 14:18:14 [anssik]
- ... - Workload identification
- 14:18:14 [anssik]
- ... - Memory resources
- 14:18:14 [anssik]
- ... - Computation resources
- 14:18:59 [anssik]
- anssik: and these WebGPU Privacy considerations:
- 14:19:06 [anssik]
- ... - Machine-specific limits
- 14:19:06 [anssik]
- ... - Machine-specific artifacts
- 14:19:06 [anssik]
- ... - Machine-specific performance
- 14:19:14 [anssik]
- anssik: WebGL Security considerations:
- 14:19:23 [anssik]
- ... - Resource Restrictions
- 14:19:23 [anssik]
- ... - Origin Restrictions
- 14:19:23 [anssik]
- ... - Supported GLSL Constructs
- 14:19:23 [anssik]
- ... - Defense Against Denial of Service
- 14:19:24 [anssik]
- ... - Out-of-Range Array Accesses
- 14:19:39 [anssik]
- anssik: Maybe Rafael has some comments from the WebGL side?
- 14:19:54 [anssik]
- RafaelCintron: both groups take this very seriously
- 14:20:11 [anssik]
- ... in WebGL you can only use same-origin or CORS-enabled textures, to prevent people from using timing attacks
- 14:20:22 [anssik]
- ... this has been refined over many, many years based on security research feedback
- 14:20:34 [anssik]
- ... timer queries are made less precise to avoid fingerprinting
- 14:20:43 [anssik]
- ... WebGL tried to make many undefined behaviors defined
- 14:21:08 [anssik]
- ... so behaviour is consistent across browsers
- 14:21:39 [anssik]
- ... not just helping security, but also debuggability
- 14:22:54 [anssik]
- ... it has been a group effort in WebGPU/WebGL to come up with the Security and Privacy Considerations
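- As an aside on the reduced timer precision Rafael mentions, a minimal sketch of the general shape of such a mitigation; the granularity value is illustrative, not what any particular browser uses:

```ts
// Illustrative only: coarsening timestamps so timing side channels carry less
// information. Browsers apply this kind of quantization inside performance.now()
// and GPU timer queries.
function coarsen(timestampMs: number, granularityMs = 0.1): number {
  return Math.floor(timestampMs / granularityMs) * granularityMs;
}

const t0 = coarsen(performance.now());
// ... run some work ...
const t1 = coarsen(performance.now());
console.log(`elapsed (coarsened): ${t1 - t0} ms`);
```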
- 14:23:59 [anssik]
- anssik: questions, comments?
- 14:24:08 [anssik]
- Topic: Operation-specific APIs proposal
- 14:24:36 [anssik]
- anssik: Let's continue our favourite topic, and discuss design considerations and review proposed solutions to enable both efficient graph execution and imperative eager execution with a cohesive WebNN API.
- 14:24:50 [anssik]
- -> https://github.com/webmachinelearning/webnn/pull/166 Support download data asynchronously (PR #166)
- 14:25:04 [anssik]
- anssik: I'll let Chai and Ningxin update us on the status of this PR
- 14:25:38 [ningxin_hu]
- https://github.com/webmachinelearning/webnn/issues/156#issuecomment-846828170
- 14:25:40 [anssik]
- ningxin_hu: I have a prototype to better understand this issue
- 14:26:07 [anssik]
- ... this follows up on Jonathan and Ping's requirement to allow a conv2d implementation for the TF.js Wasm backend
- 14:27:37 [RafaelCintron]
- q+
- 14:27:44 [anssik]
- ... The implementation is in conv2d_impl.cc and the WebNN calls are guarded by USE_WEBNN_OP. With the prototype, I observed a good performance speedup (3X to 5X) on a tf.conv2d benchmark when offloading the compute to a native library (such as XNNPACK or oneDNN).
- 14:28:55 [anssik]
- ... the same op cache is used for later invocations; when the cache hits, a graph is fetched and compute is run on inputs and outputs allocated by TF.js
- 14:29:41 [anssik]
- ... two backends used, XNNPACK and oneDNN
- 14:30:05 [anssik]
- ... using the webnn-native project for this prototyping
- 14:30:28 [anssik]
- ... observations:
- 14:30:45 [anssik]
- ... 1) TF.js Wasm backend expects the input and output data of an op execution to be in standard layout.
- 14:31:07 [anssik]
- ... 2) TF.js Wasm backend pre-allocates input and output buffers for an op execution.
- 14:31:20 [anssik]
- ... 3) TF.js Wasm backend executes an op synchronously.
- 14:31:24 [anssik]
- q?
- 14:31:58 [anssik]
- ningxin_hu: the GraphBuilder API is implemented as a sync API to satisfy the TF.js backend requirement
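- A rough sketch of the call pattern the prototype above implies; the WebNN API shape was still in flux at this time, so the factory method, option names, dimensions, and layouts below are assumptions approximating the then-current proposal, not settled API:

```ts
// Sketch only: build a conv2d graph once, cache it keyed by the op parameters,
// then run it synchronously against buffers pre-allocated by the caller,
// which is what the TF.js Wasm backend expects.
const context = navigator.ml.createContext();   // assumed synchronous factory
const builder = new MLGraphBuilder(context);

// Caller-owned buffers in a standard layout (nchw input / oihw filter assumed);
// any native-layout conversion stays inside the implementation.
const inputBuffer  = new Float32Array(1 * 1 * 28 * 28);
const filterData   = new Float32Array(8 * 1 * 3 * 3);
const outputBuffer = new Float32Array(1 * 8 * 28 * 28);

const input  = builder.input('input', {type: 'float32', dimensions: [1, 1, 28, 28]});
const filter = builder.constant({type: 'float32', dimensions: [8, 1, 3, 3]}, filterData);
const output = builder.conv2d(input, filter, {padding: [1, 1, 1, 1]});

// The built graph is cacheable per op signature, so later conv2d calls with
// the same parameters skip the build step.
const graph = builder.build({output});

// Synchronous compute fills the pre-allocated output buffer in place.
graph.compute({input: inputBuffer}, {output: outputBuffer});
```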
- 14:32:26 [anssik]
- ack RafaelCintron
- 14:33:14 [anssik]
- RafaelCintron: the biggest finding is that using real native code is faster than using the best Wasm you can have today; this gives a lot of hope that WebNN will be faster than the alternatives
- 14:33:45 [Ping_Yu]
- q+
- 14:34:00 [anssik]
- ack Ping_Yu
- 14:34:27 [anssik]
- Ping_Yu: thanks Ningxin for making WebNN work with the TF.js Wasm backend
- 14:34:57 [anssik]
- ... I want to understand where the performance comes from; my understanding is that it is due to wider SIMD and oneDNN optimizations?
- 14:35:26 [anssik]
- ... historically the Wasm backend is in sync mode; for other backends there's no sync download
- 14:37:13 [anssik]
- ningxin_hu: according to my investigations, the performance gain comes from wider SIMD via XNNPACK, which uses 256-bit wide AVX instructions on my dev machine, while Wasm SIMD is only 128 bits wide today
- 14:37:48 [RafaelCintron]
- q+
- 14:37:56 [anssik]
- ... oneDNN uses an even more aggressive optimization strategy: it reorders not only the width but also the input and output into a layout optimized for vector instructions; this is platform specific, and different architectures may use different layouts for better performance
- 14:38:04 [anssik]
- q?
- 14:38:07 [anssik]
- ack RafaelCintron
- 14:38:32 [anssik]
- RafaelCintron: there's a difference between the API being sync and it returning objects right away
- 14:38:37 [Chai]
- q+
- 14:39:16 [Jonathan]
- Is WASM SIMD going to get wider instructions eventually (256 vs 128)?
- 14:39:25 [anssik]
- ... if they're backed by the GPU that's possible, so it's OK for me to have WebNN compute return objects right away; if you want to read back you have to do that via an async promise-returning object
- 14:40:10 [anssik]
- q?
- 14:40:28 [anssik]
- ack Chai
- 14:40:50 [anssik]
- Chai: specifically on the PR itself, I've summarized it in my more recent comment
- 14:41:04 [anssik]
- ... want to double-check what Ningxin responded lately
- 14:41:33 [anssik]
- ... are you saying that based on your prototype it doesn't matter if the API returns native format?
- 14:42:31 [anssik]
- ... if we're going to support native format, there's a fingerprinting concern, so it would be good for the group to clarify its position when it comes to native format; it overlaps a bit with readback, which can happen in standard format
- 14:43:13 [anssik]
- ... sync vs. async is a question; we should discuss the pros/cons of both and settle on one design, since if we support both it'll be harder for implementers
- 14:43:24 [anssik]
- ... if we do both it'll be more confusing for the caller
- 14:43:28 [ningxin_hu]
- q+
- 14:44:02 [anssik]
- ack ningxin_hu
- 14:44:35 [anssik]
- ningxin_hu: in this PR we discuss native format support; my latest response covers my investigation of the TF.js backend, which I think is a major use case
- 14:45:00 [anssik]
- ... it turns out TF.js expects input and output in standard layout; based on that we can leave native format to a separate issue
- 14:45:15 [anssik]
- ... to be handled in a future version
- 14:45:20 [Chai]
- +1 on standard format only in V1
- 14:46:10 [anssik]
- ningxin_hu: I can revert the MLTensor proposal that supports format conversion, because that is not needed for this V1 spec
- 14:46:48 [anssik]
- ... to close the op-specific use case, I propose we align with what Chai proposed and make the compute API only support pre-allocated output, which is good for GPU resources
- 14:46:59 [anssik]
- ... this is documented in my comment
- 14:47:00 [ningxin_hu]
- https://github.com/webmachinelearning/webnn/pull/166#discussion_r637792027
- 14:47:15 [anssik]
- ... secondly, we'll leave native format to a separate issue to be addressed in the future
- 14:47:45 [anssik]
- ... third is to add support for sync versions of the build and compute APIs, keeping the async versions too
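- A hedged sketch of the surface implied by the three points above; these are illustrative TypeScript-style signatures, not agreed WebNN IDL:

```ts
// Illustrative signatures only -- not agreed WebNN IDL.
// Point 1: compute never returns output; it fills caller pre-allocated buffers.
// Point 3: both sync and async variants of build and compute are available.
type MLNamedBuffers = Record<string, Float32Array | Int32Array | Uint8Array>;

interface MLGraphBuilderSketch {
  build(outputs: Record<string, unknown>): MLGraphSketch;                // sync
  buildAsync(outputs: Record<string, unknown>): Promise<MLGraphSketch>;  // async
}

interface MLGraphSketch {
  // Sync: writes results into the pre-allocated output buffers, returns nothing.
  compute(inputs: MLNamedBuffers, outputs: MLNamedBuffers): void;
  // Async: same contract, resolving once results have been written.
  computeAsync(inputs: MLNamedBuffers, outputs: MLNamedBuffers): Promise<void>;
}
```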
- 14:48:39 [RafaelCintron]
- q+
- 14:48:46 [anssik]
- anssik: anyone have concerns with Ningxin's/Chai's proposal?
- 14:48:51 [anssik]
- ack RafaelCintron
- 14:49:06 [anssik]
- RafaelCintron: is there agreement that we're going to have both sync and async?
- 14:49:30 [anssik]
- ... is async meant for CPU ArrayBuffer usage?
- 14:49:42 [anssik]
- ningxin_hu: I think the agreement is for the compute API to not return output
- 14:49:57 [anssik]
- ... previously we only had compute accept input and preallocated output
- 14:50:09 [anssik]
- ... this is one proposal, agreement between Chai and me
- 14:50:55 [anssik]
- Chai: the previous API implies the implementer needs to manage the output buffer
- 14:51:01 [anssik]
- ... separate issue is sync vs async
- 14:51:21 [anssik]
- ... my preference is for async, making it simpler for implementer and caller
- 14:52:01 [anssik]
- ... a WebNN context can be created off of an explicit device or a context; in the latter case the implementation can create a GPU device under the hood, fully hiding platform details from the caller
- 14:53:35 [anssik]
- ... this gives additional flexibility to the caller, but if the resource is given as an ArrayBuffer, it implies the caller wants the buffer to be on the CPU
- 14:54:45 [anssik]
- ... if we have both async and sync APIs we have to document both carefully
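- A small sketch of the two context-creation paths Chai describes; the option and method names are approximations of the proposal under discussion, not settled API:

```ts
// Sketch: two ways of obtaining a WebNN context, per the discussion above.

// 1) Explicit device: the caller hands in a WebGPU device it already manages,
//    so results naturally stay in GPU resources.
const adapter = await navigator.gpu.requestAdapter();
if (!adapter) throw new Error('WebGPU unavailable');
const gpuDevice = await adapter.requestDevice();
const gpuBackedContext = navigator.ml.createContext(gpuDevice);

// 2) Options-based: the implementation may create a GPU device under the hood,
//    fully hiding platform details from the caller.
const defaultContext = navigator.ml.createContext({powerPreference: 'low-power'});

// Passing an ArrayBuffer-backed view to compute signals that the caller wants
// the result on the CPU, which is where an async, promise-returning readback fits.
```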
- 14:55:35 [anssik]
- anssik: can you live with both sync and async API?
- 14:55:38 [RafaelCintron]
- q+
- 14:56:29 [ningxin_hu]
- sgtm
- 14:56:35 [anssik]
- ack RafaelCintron
- 14:57:43 [anssik]
- RafaelCintron: if we must have an async version, perhaps we can have it be strongly typed to only take ArrayBuffers and return them
- 14:57:54 [Chai]
- +1 on async only for CPU buffer output
- 14:58:11 [anssik]
- Topic: Model Loader API update
- 14:58:16 [anssik]
- anssik: Next, we'll hear a Model Loader API update from Jonathan. Two topics:
- 14:58:21 [anssik]
- ... 1. Chrome OS Origin Trial plans
- 14:58:28 [anssik]
- ... 2. Wasm runner for TF Lite and how it could possibly be used for WebNN benchmarking
- 14:59:00 [anssik]
- Jonathan: ChromeOS has staffing for doing work on Model Loader API
- 14:59:10 [anssik]
- ... starting with the spec in this CG
- 14:59:25 [anssik]
- ... getting to Origin Trial in 1-2 quarters
- 14:59:44 [anssik]
- ... plan to be able to run TF Lite models, understanding that TF Lite models are not suitable for a web standard
- 15:00:09 [anssik]
- ... but want to use that as a starting point to be able to benchmark with WebNN and Wasm
- 15:00:26 [anssik]
- ... at Google I/O we announced work by Ping et al. on a Wasm runner for TF Lite models
- 15:00:52 [anssik]
- ... similar to Model Loader, it takes a TF Lite model and runs it with Wasm
- 15:01:16 [anssik]
- ... there's potential for this work to become a polyfill for a graph API or the Model Loader API if we make it more generic, possibly in a future generation
- 15:01:37 [anssik]
- q?
- 15:01:48 [anssik]
- RRSAgent, draft minutes
- 15:01:48 [RRSAgent]
- I have made the request to generate https://www.w3.org/2021/05/27-webmachinelearning-minutes.html anssik
- 15:01:49 [ningxin_hu]
- q+
- 15:01:57 [anssik]
- ack ningxin_hu
- 15:02:13 [anssik]
- ningxin_hu: for the second one, you mentioned the TF Lite Wasm version; it is interesting
- 15:02:43 [anssik]
- ... for WebNN benchmarking, do you mean we would have a WebNN backend for TF Lite Wasm similar to the TF.js backend?
- 15:02:54 [anssik]
- Jonathan: that's a possibility; you'd need to talk to Ping about that
- 15:03:21 [anssik]
- ... from our side, the TF team wants to do some benchmarking; one way is the TF Lite Wasm runner, another is WebNN, parsing the model and constructing the graph
- 15:03:36 [anssik]
- ... it could be useful for this group
- 15:04:19 [anssik]
- ningxin_hu: regarding our explainer, we target framework usage and the TF Lite Wasm runner fits into that; I'm interested in investigating this and will chat with you offline
- 15:04:56 [Jonathan]
- https://www.youtube.com/watch?v=5q8BzYN4rqA
- 15:05:05 [anssik]
- q?
- 15:05:24 [anssik]
- RRSAgent, draft minutes
- 15:05:24 [RRSAgent]
- I have made the request to generate https://www.w3.org/2021/05/27-webmachinelearning-minutes.html anssik