15:05:32 RRSAgent has joined #webmachinelearning
15:05:32 logging to https://www.w3.org/2021/01/07-webmachinelearning-irc
15:05:35 RRSAgent, make logs Public
15:05:35 please title this meeting ("meeting: ..."), anssik
15:05:41 Meeting: WebML CG Teleconference – 7 January 2021
15:05:46 Chair: Anssi
15:05:50 Agenda: https://github.com/webmachinelearning/meetings/blob/master/telcons/2021-01-07-agenda.md
15:05:55 Scribe: Anssi
15:05:59 scribeNick: anssik
15:06:04 Present+ Anssi_Kostiainen
15:06:09 Present+ Rafael_Cintron
15:06:17 Present+ Bruce_Dai
15:06:24 Present+ Chai_Chaoweeraprasit
15:06:41 Present+ Geun-Hyung_Kim
15:06:46 Present+ Ningxin_Hu
15:06:47 Jonathan has joined #webmachinelearning
15:06:52 Present+ Ganesan_Ramalingam
15:07:00 Present+ Zoltan_Kis
15:07:05 RRSAgent, draft minutes v2
15:07:05 I have made the request to generate https://www.w3.org/2021/01/07-webmachinelearning-minutes.html anssik
15:07:16 TOPIC: WebNN API TAG review progress report
15:07:29 anssik: The review appears to be in progress and has been triaged. I just pinged the issue.
15:07:34 -> https://github.com/w3ctag/design-reviews/issues/570 Web Neural Network API - TAG Spec Review request
15:07:45 anssik: Relatedly, W3C announced today, 7 January 2021, the results of the W3C TAG election: Sangwhan Moon was re-elected for another term on the TAG, congrats to him! https://www.w3.org/blog/news/archives/8846
15:08:02 anssik: I'll work with the TAG to get the review conducted in a timely manner.
15:08:06 anssik: any questions?
15:08:16 TOPIC: Security and Privacy considerations
15:08:36 anssik: Review the proposed questionnaire responses and fill in the TBDs:
15:08:44 -> https://github.com/webmachinelearning/webnn/issues/119 Self-Review Questionnaire: Security and Privacy #119
15:09:03 anssik: In preparation for the TAG review it is recommended to complete the Self-Review Questionnaire.
... I took the first stab, but I'd appreciate it if the spec editors and other contributors took a look and, in particular, helped address the following questions:
15:09:12 -> https://www.w3.org/TR/security-privacy-questionnaire/#first-third-party How does this specification distinguish between behavior in first-party and third-party contexts?
15:09:30 -> https://www.w3.org/TR/security-privacy-questionnaire/#private-browsing How does this specification work in the context of a user agent's Private Browsing or "incognito" mode?
15:09:50 anssik: The answers will form the basis for the Security and Privacy considerations section of the spec; this assessment is expected when the spec advances to the WG.
15:10:36 TOPIC: Support style transfer models
15:10:46 -> https://github.com/webmachinelearning/webnn/pull/123 Support style-transfer models with changes/additions to the following operations (PR #123)
15:10:55 anssik: Reviewed and merged, thanks Chai for the work!
15:11:07 ... Chai, could you briefly explain the gist of this PR for folks who have not reviewed it yet?
15:11:48 Chai: style transfer is one of the models that has been used a lot in all the frameworks
15:12:09 ... a good exercise because it allows us to look at what the API is still lacking, specifically a few ops and options that need to be filled in
15:12:41 ... the important one is extending the conv2d operation to support transposed convolution, an essential upsampling tool for encoder-decoder models
15:12:58 ... for those new to these models, look at the first section of the spec; quite a few use cases need this
15:13:11 ... I've been looking through those use cases so that the API is able to support them
15:13:45 ... essentially these models take a content image and a style image and combine them, using the content of the first image and the stylistics of the second, to produce an output
15:14:40 anssik: any questions?
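[The transposed-convolution upsampling mentioned above can be sketched as follows; this is a minimal numpy illustration of the operation itself, not the WebNN API, and the input size, kernel size, and stride are chosen arbitrarily.]

```python
import numpy as np

def conv2d_transpose(x, w, stride=2):
    """Transposed convolution: scatter each input element, weighted by the
    kernel, into a larger output grid. This is the upsampling step used in
    encoder-decoder models such as style transfer."""
    h_in, w_in = x.shape
    kh, kw = w.shape
    # Output grows: (n - 1) * stride + kernel_size per spatial dimension.
    y = np.zeros(((h_in - 1) * stride + kh, (w_in - 1) * stride + kw))
    for i in range(h_in):
        for j in range(w_in):
            y[i * stride:i * stride + kh, j * stride:j * stride + kw] += x[i, j] * w
    return y

x = np.arange(4.0).reshape(2, 2)   # 2x2 input
w = np.ones((3, 3))                # 3x3 kernel
y = conv2d_transpose(x, w)
print(y.shape)  # (5, 5): the 2x2 input upsampled
```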
15:14:52 TOPIC: Dynamic shape inference
15:14:59 -> https://github.com/webmachinelearning/webnn/issues/124 WebNN needs to support models requiring dynamic shape inference (issue #124)
15:15:21 anssik: Let's discuss this issue opened by Chai
15:15:51 Chai: I scanned the use cases before Christmas to see what the possible gaps are
15:16:18 ... essentially, there are some models that, in the middle of processing the graph, need to get the shape of a tensor that flows in and turn it into another tensor
15:16:35 ... a bit unusual behaviour, but it is becoming popular in some models
15:16:51 ... it is easier for an ML developer to do this without having to know the shape ahead of time
15:17:02 ... not a very good technique if you're concerned about performance
15:17:29 ... if you want to construct a model with good perf, you identify the shapes ahead of time and are explicit about the sizes of the tensors that flow in
15:18:00 ... dynamic shapes are bad for the GPU: you need to flush the queue, which kills pipelining
15:18:34 ... as explained in the issue, dynamic shape inference is required in NSNet2, which "requires shape and constant-of-shape operation. Additionally, static shape inference should also be supported as it's a majority case to many models."
15:19:28 ... a WebNN API implementer can also do optimization ahead of time; in many cases in a model the shape can be identified ahead of time
15:19:44 ... this section of the spec should explain these details so that the execution can be as fast as possible
15:19:57 ... if all else fails, the implementation should do the hard work of figuring out the shape
15:21:32 anssik: comments?
15:22:11 ningxin_hu: I implemented the NSNet2 sample, so I'd like to add a comment: I agree with Chai that this is required by NSNet2
15:22:39 ... I implemented the technique Chai just mentioned using static shapes, since I know the input dimensions
15:23:35 ... I can skip the shape and constant-of-shape ops because I know them and can set them statically, so I can work around this and run NSNet2 successfully
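[In outline, the static-shape workaround described above replaces a run-time shape computation with constants; this is a pure-Python sketch, and the dimensions are hypothetical rather than NSNet2's actual ones.]

```python
# Dynamic shape inference: a "shape" op reads tensor dimensions at run
# time and feeds a downstream reshape, so tensor sizes are not known
# until the graph executes.
def dynamic_flatten(batch):
    rows, cols = len(batch[0]), len(batch[0][0])   # "shape" op at run time
    return [[sample[r][c] for r in range(rows) for c in range(cols)]
            for sample in batch]

# Static workaround: with input dimensions known ahead of time, the shape
# and constant-of-shape ops are folded into constants, so every tensor
# size is fixed before execution. Dimensions here are hypothetical.
BATCH, ROWS, COLS = 1, 4, 3

def static_flatten(batch):
    return [[batch[b][r][c] for r in range(ROWS) for c in range(COLS)]
            for b in range(BATCH)]

t = [[[float(r * COLS + c) for c in range(COLS)] for r in range(ROWS)]]
assert dynamic_flatten(t) == static_flatten(t)     # same result, fixed shapes
```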
15:23:41 ... any comments, Chai?
15:23:49 Chai: I looked at the sample, it looks great
15:24:38 TOPIC: WebNN conv2d layout parameter TensorFlow incompatibility
15:24:48 -> https://github.com/webmachinelearning/webnn/issues/125 TensorFlow conv2d expects channel_last filter layout regardless of input layout format (issue #125)
15:25:39 anssik: Chai explains: "From the WebNN conv2d spec, the same layout parameter controls both the input and filter layout (below). In TensorFlow (and presumably TFLite), regardless of the input layout, the filter layout remains in the "channel_last" format i.e. [height, width, input_channels/groups, output_channels]."
15:26:11 Chai: this is one of the issues that we discovered after we put in the conv2d definition
15:26:43 ... it turns out TF can internally transpose the input tensor; they do not necessarily have to transpose the filter
15:27:24 ... the API needs to be able to support this scenario of the input and filter layouts being specified separately
15:27:59 ... Ningxin may have insights on TF.js?
15:28:29 ningxin_hu: IIRC TF.js supports both layouts for input, but for the filter only the channel_last format
15:28:37 ... this aligns with TF, IIRC
15:29:02 ... checking TFLite, it looks like it uses channel_first for the filter
15:29:07 ... we need input from Google folks
15:30:24 Zoltan: can we encapsulate this layout in the implementation?
15:31:20 Ningxin: the filter layout is only channel_last; the input can be both
15:31:59 Chai: if the filter is channel_last, that is the problematic case; WebNN cannot represent that
15:32:55 ... to answer the encapsulation question, it depends on how the WebNN API is used; you can definitely transpose any tensor before calling any API, but normally that is a bad thing to do for performance, think of trying to convert ResNet-50 with many layers
15:33:06 ... in the conversion process you'd end up with a graph that would be suboptimal
15:34:03 ... you would need to massage the layout every now and then, and that hurts the performance
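[The mismatch can be seen by writing out the two filter layouts; a converter targeting a single tied layout parameter would have to inject transposes like the one below. This is a numpy sketch with a hypothetical 3x3 filter mapping 16 input channels to 32 output channels.]

```python
import numpy as np

# With a "channel_first" (nchw) input layout, the matching filter layout
# is OIHW: [output_channels, input_channels/groups, height, width].
# TensorFlow keeps the filter in HWIO ("channel_last") regardless of the
# input layout: [height, width, input_channels/groups, output_channels].
hwio = np.zeros((3, 3, 16, 32))          # hypothetical TF-style filter
oihw = np.transpose(hwio, (3, 2, 0, 1))  # HWIO -> OIHW: the extra transpose
print(oihw.shape)  # (32, 16, 3, 3)
```

Injecting such a transpose before every conv2d is the "massaging" that hurts performance in a deep model.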
15:34:35 ... when writing a converter between two formats, you'd like to have a very simple conversion
15:34:40 ... to answer Zoltan
15:34:49 ... it could work, it is just more tedious
15:36:31 Chai: I would love feedback on issue #125 on whether this should be supported
15:36:55 TOPIC: Super-resolution models
15:37:00 -> https://github.com/webmachinelearning/webnn/issues/127 WebNN should support super-resolution models (issue #127)
15:37:56 Chai: this is one of the use cases of the spec; super-resolution has been up and coming, it is essentially a model that upsamples an image to a higher resolution
15:38:08 ... there is a lot of utilization of this technique in imaging and processing apps etc.
15:38:38 ... a processing app like Photoshop takes a low-resolution image and uplevels the resolution of the image
15:39:05 ... there are also use cases in gaming, where Deep Learning Super Sampling is used to upsample game visuals in real time
15:40:04 ... useful for models, many variants everywhere; in the context of the web, I think when this use case is added, you can, while doing video conferencing, upsample the image quality of the video feed on the client
15:40:15 Zoltan: does this operate on the image or the time-series level?
15:40:37 Chai: on the image level; a video can be treated as images, just capture the images from the video frame by frame
15:42:00 TOPIC: int8 quantized models
15:42:07 -> https://github.com/webmachinelearning/webnn/issues/128 WebNN should support int8 quantized models (issue #128)
15:42:31 Chai: Supporting int8 quantized models is essential for mobile scenarios and in many NPU architectures. TensorFlow (Lite) and ONNX, for instance, have int8 quantization support built in, and WebNN should too. Related: https://github.com/webmachinelearning/webnn/issues/93
15:42:53 Chai: int8 quantization is not going to be just an op, so it is a pretty large work item
15:43:01 ... essential for NPUs, and more and more for GPUs too
15:43:27 ... it has been supported in TF and PyTorch etc.; for WebNN to be relevant it should support int8 quantization
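[As a reference point, the affine int8 scheme used by TFLite and ONNX can be sketched as below; the scale and zero point here are arbitrary example values, not taken from any particular model.]

```python
# Affine int8 quantization: q = clamp(round(x / scale) + zero_point, -128, 127)
# with dequantization x ~ (q - zero_point) * scale. Weights stored as int8
# take a quarter of the memory of float32.
SCALE, ZERO_POINT = 1.0 / 127, 0          # arbitrary example parameters

def quantize(x):
    q = round(x / SCALE) + ZERO_POINT
    return max(-128, min(127, q))

def dequantize(q):
    return (q - ZERO_POINT) * SCALE

xs = [-1.0, -0.25, 0.0, 0.5, 1.0]
qs = [quantize(x) for x in xs]
err = max(abs(x - dequantize(q)) for x, q in zip(xs, qs))
assert err <= SCALE / 2 + 1e-9            # error within half a quantization step
```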
15:43:51 ... this is a cross-section of new ops to add and enhancements to current ops
15:44:11 ... this is a big-ticket issue; for the API to be serious we should add this to the API
15:44:53 ... int8 will cut down the intermediate memory needed when you unpack and run the inference
15:45:17 ... more and more developers are becoming aware of this technique to get smaller models that run quicker, and that's nice!
15:45:58 (need to be away from kbd)
15:46:05 q+
15:46:10 ack ningxin_hu
15:47:02 ningxin_hu: I would like to share that in my workshop talk I actually experimented with int8 and got a good speedup, e.g. 10X better perf on CPU
15:47:30 TOPIC: Conformance testing of WebNN API
15:47:45 (back now)
15:47:45 anssik: Let's discuss conformance testing of operations and the integration of web-platform-tests with the NNAPI CTS
15:48:03 ... the proposal is to add compatibility tests for the WebNN first-wave ops by converting existing native Android Neural Networks API tests. Feedback on the general approach?
15:48:10 anssik: Specifically, the request is to review the PR for the generated NNAPI CTS:
15:48:15 https://github.com/webmachinelearning/webnn-polyfill/pull/29
15:48:33 anssik: and review the PR for the SqueezeNet model test:
15:48:33 https://github.com/webmachinelearning/webnn-polyfill/pull/32
15:48:38 anssik: Ningxin, anything specific you want to share about these tests?
15:48:50 ... noted a question re the numerical precision difference between the tf.js webgl and cpu/wasm backends
15:49:14 ningxin_hu: I think the precision is critical, and it needs the group's feedback
15:49:33 ... Chai mentioned in his workshop talk his experience with conformance testing and numerical accuracy
15:49:51 ... I want feedback from Chai on that
15:50:20 TOPIC: NSNet2 sample
15:50:35 anssik: Next, we'll discuss the sample of NSNet2, which is one of the first-wave models and is used in the explainer key scenarios
15:50:40 ... review the NSNet2 sample PR:
15:50:44 https://github.com/webmachinelearning/webnn-samples/pull/22
15:50:50 anssik: also an update to the explainer:
15:50:54 https://github.com/webmachinelearning/webnn/blob/master/explainer.md#key-scenarios
15:51:11 ningxin_hu: the explainer update is WIP
15:52:20 ... there is one remaining issue before merge
15:53:08 ... once the TF.js issue is fixed we can merge this PR; otherwise TF.js will crash Chrome
15:53:20 TOPIC: Proposals for future work
15:53:29 -> https://github.com/webmachinelearning/proposals/issues/2 Operation-specific APIs proposal by Jonathan
15:54:21 Jonathan: my main concern in submitting this proposal was that the WG Charter would support this if we chose to pursue it
15:54:38 ... the concept is: what if we take a couple of computationally intensive ops and implement those standalone
15:55:06 ... in many DL models those are the expensive ops; per the work Ningxin did earlier, we got a 90% perf boost from those ops alone
15:55:33 ... within the implementation of a JS lib, that handful of ops would use the op-specific APIs for better perf
15:55:57 ... this proposal would just accelerate the most expensive ops, so a lot of gain with a small number of ops
15:56:14 ... maybe this is a stepping stone; it could launch faster due to the smaller API surface
15:56:34 ... I brought this proposal back motivated by the charter work
15:56:53 ... I want to know if this helps us launch a useful subset faster
15:57:03 q+
15:57:24 ack RafaelCintron
15:57:49 i can't hear him either
15:57:59 me too
15:58:02 great
15:58:26 RafaelCintron: are the ops you want to target the same ones we have already specced as part of the WebNN API?
15:58:40 ... or are you expecting a new set of ops with new inputs and outputs?
15:58:50 ... is this proposal just WebNN with one node in the graph?
15:59:03 ... I expect the op definitions to be the same
15:59:19 ... as to whether to wrap this proposal into a graph or not
16:00:27 Jonathan: the motivation is to make this similar to compute shaders
16:00:48 ... a complete graph API would still have these primitive ops that JS libs could use for a perf boost
16:01:03 ... or we could commit to the graph API and not implement the primitive ops
16:01:15 RafaelCintron: I think that clarifies it
16:01:34 ... is the motivation that you think we spec too many ops? Or because you think a graph API would be too complicated?
16:02:06 Jonathan: in this group the benefits raised were the small API surface and the big perf gain; given we have spec'd some ops, we could launch faster
16:02:43 +q
16:02:45 ... maybe also the anxiety some people have with the general direction; we could get a performance boost to the web platform sooner without possible pushback from the Android folks exploring different options
16:03:08 q?
16:03:42 Jonathan: the Android NN team is exploring various options
16:04:18 q?
16:04:23 ack Chai
16:04:56 Chai: you mentioned compute platforms; for folks who already have ML frameworks implemented in terms of compute platforms, this could be useful to them
16:05:00 yes, correct
16:05:36 ... I'm wondering if you're discussing TensorFlow implemented in compute shaders; there's already a WebGPU Working Group that works toward interacting at the atomic compute primitive level
16:06:11 ... I'm hearing you want to propose to spec the compute primitive; maybe the WebGPU WG Charter provides a venue to explore that option; the WebNN Charter is a bit higher level,
16:06:17 q+ to ask could we split the API to conformance classes/namespaces to handle this?
16:06:19 ... not compute primitives, but ML primitives
16:06:23 q?
16:06:31 Chai: I get what you say
16:07:01 ... I appreciate that there might be some need when people are writing a library and want to tap into a primitive that does matrix multiply
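[The matrix-multiply primitive under discussion is essentially GEMM; an operation-specific API would expose this kind of standalone function rather than a full graph. A pure-Python sketch of the operation, with no particular API surface implied:]

```python
# GEMM (general matrix multiply): C = alpha * (A @ B) + beta * C, the
# workhorse primitive behind fully connected layers, and the kind of op
# an operation-specific API would accelerate on its own.
def gemm(a, b, c=None, alpha=1.0, beta=0.0):
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[alpha * sum(a[i][k] * b[k][j] for k in range(inner))
            for j in range(cols)] for i in range(rows)]
    if c is not None and beta != 0.0:
        out = [[out[i][j] + beta * c[i][j] for j in range(cols)]
               for i in range(rows)]
    return out

a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
print(gemm(a, b))  # [[19.0, 22.0], [43.0, 50.0]]
```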
16:07:26 ... also gemm; if you want to reach down to the compute primitive, I can see it could be useful, but it is probably more in scope of the WebGPU WG Charter
16:07:47 Jonathan: not strictly GPU, because it would need to support ML-specific hardware
16:08:14 Chai: is there any group that deals with compute platforms, not graphics specifically?
16:09:15 https://www.w3.org/2020/12/gpu-wg-charter.html
16:09:19 q+
16:09:57 https://www.w3.org/community/gpu/
16:10:00 q?
16:10:07 ack RafaelCintron
16:10:23 RafaelCintron: as for whether this proposal belongs to WebNN or WebGPU
16:10:30 +1
16:11:08 ... even if we have a graph API on the table here, we have use cases for compute-platform-targeting APIs too
16:11:28 q?
16:11:33 ack zkis
16:11:33 zkis, you wanted to ask could we split the API to conformance classes/namespaces to handle this?
16:12:15 Zoltan: would it make sense to use multiple conformance classes in the API, per Jonathan's proposal?
16:12:33 https://github.com/webmachinelearning/proposals/issues/2
16:12:59 RRSAgent, draft minutes v2
16:12:59 I have made the request to generate https://www.w3.org/2021/01/07-webmachinelearning-minutes.html anssik
16:14:06 TOPIC: Adjourn
16:14:08 RRSAgent, draft minutes v2
16:14:08 I have made the request to generate https://www.w3.org/2021/01/07-webmachinelearning-minutes.html anssik
16:14:48 RRSAgent, draft minutes v2
16:14:48 I have made the request to generate https://www.w3.org/2021/01/07-webmachinelearning-minutes.html anssik
16:49:44 zkis has joined #webmachinelearning
18:26:36 Zakim has left #webmachinelearning
19:50:18 zkis has joined #webmachinelearning
22:09:09 zkis has joined #webmachinelearning