15:05:32 RRSAgent has joined #webmachinelearning
15:05:32 logging to https://www.w3.org/2021/01/07-webmachinelearning-irc
15:05:35 RRSAgent, make logs Public
15:05:35 please title this meeting ("meeting: ..."), anssik
15:05:41 Meeting: WebML CG Teleconference – 7 January 2021
15:05:46 Chair: Anssi
15:05:50 Agenda: https://github.com/webmachinelearning/meetings/blob/master/telcons/2021-01-07-agenda.md
15:05:55 Scribe: Anssi
15:05:59 scribeNick: anssik
15:06:04 Present+ Anssi_Kostiainen
15:06:09 Present+ Rafael_Cintron
15:06:17 Present+ Bruce_Dai
15:06:24 Present+ Chai_Chaoweeraprasit
15:06:41 Present+ Geun-Hyung_Kim
15:06:46 Present+ Ningxin_Hu
15:06:47 Jonathan has joined #webmachinelearning
15:06:52 Present+ Ganesan_Ramalingam
15:07:00 Present+ Zoltan_Kis
15:07:05 RRSAgent, draft minutes v2
15:07:05 I have made the request to generate https://www.w3.org/2021/01/07-webmachinelearning-minutes.html anssik
15:07:16 TOPIC: WebNN API TAG review progress report
15:07:29 anssik: The review appears to be in progress and has been triaged. I just pinged the issue.
15:07:34 -> https://github.com/w3ctag/design-reviews/issues/570 Web Neural Network API - TAG Spec Review request
15:07:45 anssik: Relatedly, W3C announced today, 7 January 2021, the results of the W3C TAG election: Sangwhan Moon was re-elected for another term on the TAG, congrats to him! https://www.w3.org/blog/news/archives/8846
15:08:02 anssik: I'll work with the TAG to get the review conducted in a timely manner.
15:08:06 anssik: any questions?
15:08:16 TOPIC: Security and Privacy considerations
15:08:36 anssik: Review the proposed questionnaire responses and fill in the TBDs:
15:08:44 -> https://github.com/webmachinelearning/webnn/issues/119 Self-Review Questionnaire: Security and Privacy #119
15:09:03 anssik: In preparation for the TAG review it is recommended to complete the Self-Review Questionnaire.
... I took the first stab, but I'd appreciate it if the spec editors and other contributors took a look and, in particular, helped address the following questions:
15:09:12 -> https://www.w3.org/TR/security-privacy-questionnaire/#first-third-party How does this specification distinguish between behavior in first-party and third-party contexts?
15:09:30 -> https://www.w3.org/TR/security-privacy-questionnaire/#private-browsing How does this specification work in the context of a user agent's Private Browsing or "incognito" mode?
15:09:50 anssik: The answers will form the basis for the Security and Privacy considerations section of the spec; this assessment is expected when the spec advances to the WG.
15:10:36 TOPIC: Support style transfer models
15:10:46 -> https://github.com/webmachinelearning/webnn/pull/123 Support style-transfer models with changes/additions to the following operations (PR #123)
15:10:55 anssik: Reviewed and merged, thanks Chai for the work!
15:11:07 ... Chai, could you briefly explain the gist of this PR for folks who have not reviewed it yet?
15:11:48 Chai: style transfer is one of the models that has been used a lot in all the frameworks
15:12:09 ... a good exercise because it allows us to look at what the API is still lacking, specifically a few ops and options that need to be filled in
15:12:41 ... the important one is extending the conv2d operation to support transposed convolution, an essential upsampling tool for encoder-decoder models
15:12:58 ... for those new to these models, look at the first section of the spec; quite a few use cases need this
15:13:11 ... I've been looking through those use cases so that the API is able to support them
15:13:45 ... essentially these models take a content image and a style image and combine them, using the content of the first image and the stylistics of the second, to produce an output
15:14:40 anssik: any questions?
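[The transposed-convolution upsampling mentioned above can be sketched as follows; this is a minimal numpy illustration of the operation itself, not the WebNN API, and the input size, kernel size, and stride are chosen arbitrarily.]

```python
import numpy as np

def conv2d_transpose(x, w, stride=2):
    """Transposed convolution: scatter each input element, weighted by the
    kernel, into a larger output grid. This is the upsampling step used in
    encoder-decoder models such as style transfer."""
    h_in, w_in = x.shape
    kh, kw = w.shape
    # Output grows: (n - 1) * stride + kernel_size per spatial dimension.
    y = np.zeros(((h_in - 1) * stride + kh, (w_in - 1) * stride + kw))
    for i in range(h_in):
        for j in range(w_in):
            y[i * stride:i * stride + kh, j * stride:j * stride + kw] += x[i, j] * w
    return y

x = np.arange(4.0).reshape(2, 2)   # 2x2 input
w = np.ones((3, 3))                # 3x3 kernel
y = conv2d_transpose(x, w)
print(y.shape)  # (5, 5): the 2x2 input upsampled
```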
15:14:52 TOPIC: Dynamic shape inference
15:14:59 -> https://github.com/webmachinelearning/webnn/issues/124 WebNN needs to support models requiring dynamic shape inference (issue #124)
15:15:21 anssik: Let's discuss this issue opened by Chai
15:15:51 Chai: I scanned the use cases before Christmas to see what the possible gaps are
15:16:18 ... essentially, there are some models that, in the middle of processing the graph, need to get the shape of a tensor that flows in and turn it into another tensor
15:16:35 ... a bit unusual behaviour, but it is becoming popular in some models
15:16:51 ... it is easier for an ML developer to do this without having to know the shape ahead of time
15:17:02 ... not a very good technique if you're concerned about performance
15:17:29 ... if you want to construct a model with good perf, you identify the shapes ahead of time and are explicit about the sizes of the tensors that flow in
15:18:00 ... dynamic shapes are bad for the GPU: you need to flush the queue, which kills pipelining
15:18:34 ... as explained in the issue, dynamic shape inference is required in NSNet2, which "requires shape and constant-of-shape operation. Additionally, static shape inference should also be supported as it's a majority case to many models."
15:19:28 ... a WebNN API implementer can also do optimization ahead of time; in many cases in a model the shape can be identified ahead of time
15:19:44 ... this section of the spec should explain these details so that the execution can be as fast as possible
15:19:57 ... if all else fails, the implementation should do the hard work of figuring out the shape
15:21:32 anssik: comments?
15:22:11 ningxin_hu: I implemented the NSNet2 sample, so I'd like to add a comment: I agree with Chai that this is required by NSNet2
15:22:39 ... I implemented the technique Chai just mentioned using static shapes, since I know the input dimensions
15:23:35 ... I can skip the shape and constant-of-shape ops because I know them and can set them statically, so I can work around this and run NSNet2 successfully
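[In outline, the static-shape workaround described above replaces a run-time shape computation with constants; this is a pure-Python sketch, and the dimensions are hypothetical rather than NSNet2's actual ones.]

```python
# Dynamic shape inference: a "shape" op reads tensor dimensions at run
# time and feeds a downstream reshape, so tensor sizes are not known
# until the graph executes.
def dynamic_flatten(batch):
    rows, cols = len(batch[0]), len(batch[0][0])   # "shape" op at run time
    return [[sample[r][c] for r in range(rows) for c in range(cols)]
            for sample in batch]

# Static workaround: with input dimensions known ahead of time, the shape
# and constant-of-shape ops are folded into constants, so every tensor
# size is fixed before execution. Dimensions here are hypothetical.
BATCH, ROWS, COLS = 1, 4, 3

def static_flatten(batch):
    return [[batch[b][r][c] for r in range(ROWS) for c in range(COLS)]
            for b in range(BATCH)]

t = [[[float(r * COLS + c) for c in range(COLS)] for r in range(ROWS)]]
assert dynamic_flatten(t) == static_flatten(t)     # same result, fixed shapes
```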
15:23:41 ... any comments, Chai?
15:23:49 Chai: I looked at the sample, it looks great
15:24:38 TOPIC: WebNN conv2d layout parameter TensorFlow incompatibility
15:24:48 -> https://github.com/webmachinelearning/webnn/issues/125 TensorFlow conv2d expects channel_last filter layout regardless of input layout format (issue #125)
15:25:39 anssik: Chai explains: "From the WebNN conv2d spec, the same layout parameter controls both the input and filter layout (below). In TensorFlow (and presumably TFLite), regardless of the input layout, the filter layout remains in the "channel_last" format i.e. [height, width, input_channels/groups, output_channels]."
15:26:11 Chai: this is one of the issues that we discovered after we put in the conv2d definition
15:26:43 ... it turns out TF can internally transpose the input tensor; they do not necessarily have to transpose the filter
15:27:24 ... the API needs to be able to support this scenario of the input and filter layouts being specified separately
15:27:59 ... Ningxin may have insights on TF.js?
15:28:29 ningxin_hu: IIRC TF.js supports both layouts for input, but for the filter only the channel_last format
15:28:37 ... this aligns with TF, IIRC
15:29:02 ... checking TFLite, it looks like it uses channel_first for the filter
15:29:07 ... we need input from Google folks
15:30:24 Zoltan: can we encapsulate this layout in the implementation?
15:31:20 Ningxin: the filter layout is only channel_last; the input can be both
15:31:59 Chai: if the filter is channel_last, that is the problematic case; WebNN cannot represent that
15:32:55 ... to answer the encapsulation question, it depends on how the WebNN API is used; you can definitely transpose any tensor before calling any API, but normally that is a bad thing to do for performance, think of trying to convert ResNet-50 with many layers
15:33:06 ... in the conversion process you'd end up with a graph that would be suboptimal
15:34:03 ... you would need to massage the layout every now and then, and that hurts the performance
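[The mismatch can be seen by writing out the two filter layouts; a converter targeting a single tied layout parameter would have to inject transposes like the one below. This is a numpy sketch with a hypothetical 3x3 filter mapping 16 input channels to 32 output channels.]

```python
import numpy as np

# With a "channel_first" (nchw) input layout, the matching filter layout
# is OIHW: [output_channels, input_channels/groups, height, width].
# TensorFlow keeps the filter in HWIO ("channel_last") regardless of the
# input layout: [height, width, input_channels/groups, output_channels].
hwio = np.zeros((3, 3, 16, 32))          # hypothetical TF-style filter
oihw = np.transpose(hwio, (3, 2, 0, 1))  # HWIO -> OIHW: the extra transpose
print(oihw.shape)  # (32, 16, 3, 3)
```

Injecting such a transpose before every conv2d is the "massaging" that hurts performance in a deep model.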
15:34:35 ... when writing a converter between two formats, you'd like to have a very simple conversion
15:34:40 ... to answer Zoltan
15:34:49 ... it could work, it is just more tedious
15:36:31 Chai: I would love feedback on issue #125 on whether this should be supported
15:36:55 TOPIC: Super-resolution models
15:37:00 -> https://github.com/webmachinelearning/webnn/issues/127 WebNN should support super-resolution models (issue #127)
15:37:56 Chai: this is one of the use cases of the spec; super-resolution has been up and coming, it is essentially a model that upsamples an image to a higher resolution
15:38:08 ... there is a lot of utilization of this technique in imaging and processing apps etc.
15:38:38 ... a processing app like Photoshop takes a low-resolution image and uplevels the resolution of the image
15:39:05 ... there are also use cases in gaming, where Deep Learning Super Sampling is used to upsample game visuals in real time
15:40:04 ... useful for models, many variants everywhere; in the context of the web, I think when this use case is added, you can, while doing video conferencing, upsample the image quality of the video feed on the client
15:40:15 Zoltan: does this operate on the image or the time-series level?
15:40:37 Chai: on the image level; a video can be treated as images, just capture the images from the video frame by frame
15:42:00 TOPIC: int8 quantized models
15:42:07 -> https://github.com/webmachinelearning/webnn/issues/128 WebNN should support int8 quantized models (issue #128)
15:42:31 Chai: Supporting int8 quantized models is essential for mobile scenarios and in many NPU architectures. TensorFlow (Lite) and ONNX, for instance, have int8 quantization support built in, and WebNN should too. Related: https://github.com/webmachinelearning/webnn/issues/93
15:42:53 Chai: int8 quantization is not going to be just an op, so it is a pretty large work item
15:43:01 ... essential for NPUs, and more and more for GPUs too
15:43:27 ... it has been supported in TF and PyTorch etc.; for WebNN to be relevant it should support int8 quantization
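[As a reference point, the affine int8 scheme used by TFLite and ONNX can be sketched as below; the scale and zero point here are arbitrary example values, not taken from any particular model.]

```python
# Affine int8 quantization: q = clamp(round(x / scale) + zero_point, -128, 127)
# with dequantization x ~ (q - zero_point) * scale. Weights stored as int8
# take a quarter of the memory of float32.
SCALE, ZERO_POINT = 1.0 / 127, 0          # arbitrary example parameters

def quantize(x):
    q = round(x / SCALE) + ZERO_POINT
    return max(-128, min(127, q))

def dequantize(q):
    return (q - ZERO_POINT) * SCALE

xs = [-1.0, -0.25, 0.0, 0.5, 1.0]
qs = [quantize(x) for x in xs]
err = max(abs(x - dequantize(q)) for x, q in zip(xs, qs))
assert err <= SCALE / 2 + 1e-9            # error within half a quantization step
```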
15:43:51 ... this is a cross-section of new ops to add and enhancements to current ops
15:44:11 ... this is a big-ticket issue; for the API to be serious we should add this to the API
15:44:53 ... int8 will cut down the intermediate memory needed when you unpack and run the inference
15:45:17 ... more and more developers are becoming aware of this technique to get smaller models that run quicker, and that's nice!
15:45:58 (need to be away from kbd)
15:46:05 q+
15:46:10 ack ningxin_hu
15:47:02 ningxin_hu: I would like to share that in my workshop talk I actually experimented with int8 and got a good speedup, e.g. 10X better perf on CPU
15:47:30 TOPIC: Conformance testing of WebNN API
15:47:45 (back now)
15:47:45 anssik: Let's discuss conformance testing of operations and the integration of web-platform-tests with the NNAPI CTS
15:48:03 ... the proposal is to add compatibility tests for the WebNN first-wave ops by converting existing native Android Neural Networks API tests. Feedback on the general approach?
15:48:10 anssik: Specifically, the request is to review the PR for the generated NNAPI CTS:
15:48:15 https://github.com/webmachinelearning/webnn-polyfill/pull/29
15:48:33 anssik: and review the PR for the SqueezeNet model test:
15:48:33 https://github.com/webmachinelearning/webnn-polyfill/pull/32
15:48:38 anssik: Ningxin, anything specific you want to share about these tests?
15:48:50 ... noted a question re the numerical precision difference between the tf.js webgl and cpu/wasm backends
15:49:14 ningxin_hu: I think the precision is critical, and it needs the group's feedback
15:49:33 ... Chai mentioned in his workshop talk his experience with conformance testing and numerical accuracy
15:49:51 ... I want feedback from Chai on that
15:50:20 TOPIC: NSNet2 sample
15:50:35 anssik: Next, we'll discuss the sample of NSNet2, which is one of the first-wave models and is used in the explainer key scenarios
15:50:40 ... review the NSNet2 sample PR:
15:50:44 https://github.com/webmachinelearning/webnn-samples/pull/22
15:50:50 anssik: also an update to the explainer:
15:50:54 https://github.com/webmachinelearning/webnn/blob/master/explainer.md#key-scenarios
15:51:11 ningxin_hu: the explainer update is WIP
15:52:20 ... there is one remaining issue before merge
15:53:08 ... once the TF.js issue is fixed we can merge this PR; otherwise TF.js will crash Chrome
15:53:20 TOPIC: Proposals for future work
15:53:29 -> https://github.com/webmachinelearning/proposals/issues/2 Operation-specific APIs proposal by Jonathan
15:54:21 Jonathan: my main concern in submitting this proposal was that the WG Charter would support this if we chose to pursue it
15:54:38 ... the concept is: what if we take a couple of computationally intensive ops and implement those standalone
15:55:06 ... in many DL models those are the expensive ops; per the work Ningxin did earlier, we got a 90% perf boost from those ops alone
15:55:33 ... within the implementation of a JS lib, that handful of ops would use the op-specific APIs for better perf
15:55:57 ... this proposal would just accelerate the most expensive ops, so a lot of gain with a small number of ops
15:56:14 ... maybe this is a stepping stone; it could launch faster due to the smaller API surface
15:56:34 ... I brought this proposal back motivated by the charter work
15:56:53 ... I want to know if this helps us launch a useful subset faster
15:57:03 q+
15:57:24 ack RafaelCintron
15:57:49 i can't hear him either
15:57:59 me too
15:58:02 great
15:58:26 RafaelCintron: are the ops you want to target the same ones we have already specced as part of the WebNN API?
15:58:40 ... or are you expecting a new set of ops with new inputs and outputs?
15:58:50 ... is this proposal just WebNN with one node in the graph?
15:59:03 ... I expect the op definitions to be the same
15:59:19 ... as to whether to wrap this proposal into a graph or not
16:00:27 Jonathan: the motivation is to make this similar to compute shaders
16:00:48 ... a complete graph API would still have these primitive ops that JS libs could use for a perf boost
16:01:03 ... or we could commit to the graph API and not implement the primitive ops
16:01:15 RafaelCintron: I think that clarifies it
16:01:34 ... is the motivation that you think we spec too many ops? Or because you think a graph API would be too complicated?
16:02:06 Jonathan: in this group the benefits raised were the small API surface and the big perf gain; given we have spec'd some ops, we could launch faster
16:02:43 +q
16:02:45 ... maybe also the anxiety some people have with the general direction; we could get a performance boost to the web platform sooner without possible pushback from the Android folks exploring different options
16:03:08 q?
16:03:42 Jonathan: the Android NN team is exploring various options
16:04:18 q?
16:04:23 ack Chai
16:04:56 Chai: you mentioned compute platforms; for folks who already have ML frameworks implemented in terms of compute platforms, this could be useful to them
16:05:00 yes, correct
16:05:36 ... I'm wondering if you're discussing TensorFlow implemented in compute shaders; there's already a WebGPU Working Group that works toward interacting at the atomic compute primitive level
16:06:11 ... I'm hearing you want to propose to spec the compute primitive; maybe the WebGPU WG Charter provides a venue to explore that option; the WebNN Charter is a bit higher level,
16:06:17 q+ to ask could we split the API to conformance classes/namespaces to handle this?
16:06:19 ... not compute primitives, but ML primitives
16:06:23 q?
16:06:31 Chai: I get what you say
16:07:01 ... I appreciate that there might be some need when people are writing a library and want to tap into a primitive that does matrix multiply
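[The matrix-multiply primitive under discussion is essentially GEMM; an operation-specific API would expose this kind of standalone function rather than a full graph. A pure-Python sketch of the operation, with no particular API surface implied:]

```python
# GEMM (general matrix multiply): C = alpha * (A @ B) + beta * C, the
# workhorse primitive behind fully connected layers, and the kind of op
# an operation-specific API would accelerate on its own.
def gemm(a, b, c=None, alpha=1.0, beta=0.0):
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[alpha * sum(a[i][k] * b[k][j] for k in range(inner))
            for j in range(cols)] for i in range(rows)]
    if c is not None and beta != 0.0:
        out = [[out[i][j] + beta * c[i][j] for j in range(cols)]
               for i in range(rows)]
    return out

a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
print(gemm(a, b))  # [[19.0, 22.0], [43.0, 50.0]]
```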
16:07:26 ... also gemm; if you want to reach down to the compute primitive, I can see it could be useful, but it is probably more in scope of the WebGPU WG Charter
16:07:47 Jonathan: not strictly GPU, because it would need to support ML-specific hardware
16:08:14 Chai: is there any group that deals with compute platforms, not graphics specifically?
16:09:15 https://www.w3.org/2020/12/gpu-wg-charter.html
16:09:19 q+
16:09:57 https://www.w3.org/community/gpu/
16:10:00 q?
16:10:07 ack RafaelCintron
16:10:23 RafaelCintron: as for whether this proposal belongs to WebNN or WebGPU
16:10:30 +1
16:11:08 ... even if we have a graph API on the table here, we have use cases for compute-platform-targeting APIs too
16:11:28 q?
16:11:33 ack zkis
16:11:33 zkis, you wanted to ask could we split the API to conformance classes/namespaces to handle this?
16:12:15 Zoltan: would it make sense to use multiple conformance classes in the API, per Jonathan's proposal?
16:12:33 https://github.com/webmachinelearning/proposals/issues/2
16:12:59 RRSAgent, draft minutes v2
16:12:59 I have made the request to generate https://www.w3.org/2021/01/07-webmachinelearning-minutes.html anssik
16:14:06 TOPIC: Adjourn
16:14:08 RRSAgent, draft minutes v2
16:14:08 I have made the request to generate https://www.w3.org/2021/01/07-webmachinelearning-minutes.html anssik
16:14:48 RRSAgent, draft minutes v2
16:14:48 I have made the request to generate https://www.w3.org/2021/01/07-webmachinelearning-minutes.html anssik
16:49:44 zkis has joined #webmachinelearning
18:26:36 Zakim has left #webmachinelearning
19:50:18 zkis has joined #webmachinelearning
22:09:09 zkis has joined #webmachinelearning