13:51:45 RRSAgent has joined #webmachinelearning
13:51:49 logging to https://www.w3.org/2023/04/27-webmachinelearning-irc
13:51:49 RRSAgent, make logs Public
13:51:50 please title this meeting ("meeting: ..."), anssik
13:51:50 Meeting: WebML WG Teleconference – 27 April 2023
13:51:54 Chair: Anssi
13:51:59 Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2023-04-27-wg-agenda.md
13:52:06 Scribe: Anssi
13:52:07 scribeNick: anssik
13:52:18 ghurlbot, this is webmachinelearning/webnn
13:52:18 anssik, OK.
13:52:22 Present+ Anssi_Kostiainen
13:52:26 Regrets+ Dominique_Hazael-Massieux
13:52:30 RRSAgent, draft minutes
13:52:31 I have made the request to generate https://www.w3.org/2023/04/27-webmachinelearning-minutes.html anssik
13:58:21 zkis has joined #webmachinelearning
13:58:37 zkis has joined #webmachinelearning
14:00:06 zkis_ has joined #webmachinelearning
14:00:16 Present+ Rafael_Cintron
14:00:36 Present+ Zoltan_Kis
14:01:36 Present+ Ningxin_Hu
14:02:30 Present+ Chai_Chaoweeraprasit
14:02:42 chai has joined #webmachinelearning
14:03:45 Topic: Announcements
14:03:50 ningxin_hu has joined #webmachinelearning
14:03:50 Subtopic: TPAC 2023
14:03:58 anssik: The W3C TPAC 2023 website is now available. You will find the initial details regarding the event:
14:04:03 https://www.w3.org/2023/09/TPAC/
14:04:07 ... 11–15 September 2023, Seville, Spain & online
14:04:18 ... the WebML WG has traditionally not met at TPAC, is there interest in a meeting this year?
14:04:28 ... the TPAC organizers are expecting us to provide our preference, f2f or online, by early May 2023
14:04:43 ... please let me and Dom know if you're planning to attend by the end of next week
14:05:18 Topic: Contribution guidelines
14:05:28 anssik: We have new, improved contribution guidelines, thanks Chai for the PR!
14:05:38 -> webnn/CONTRIBUTING.md: https://github.com/webmachinelearning/webnn/blob/main/CONTRIBUTING.md
14:05:45 anssik: this was initially discussed in issue #231 and fixed by PR #381
14:05:46 https://github.com/webmachinelearning/webnn/issues/381 -> Pull Request 381 [closed] Additional guidance for contributions (wchao1115)
14:05:46 https://github.com/webmachinelearning/webnn/issues/231 -> Issue 231 [closed] Create a light-weight process to guide submitting new operator requests to WebNN (anssiko) process
14:06:43 chai: I hope these guidelines help streamline future contributions
14:06:59 ... and make it easier to triage and review PRs; there can be exceptions to these guidelines
14:07:13 ... we should use this call to resolve any issues with the guidelines
14:07:55 Topic: WebIDL and Infra standard conventions
14:08:04 Subtopic: The constant() method steps
14:08:13 anssik: WebIDL and Infra standard conventions meta issue #210
14:08:13 https://github.com/webmachinelearning/webnn/issues/210 -> Issue 210 Use modern WebIDL and Infra standard conventions (anssiko) enhancement, Editorial
14:08:45 ... PR #365
14:08:45 https://github.com/webmachinelearning/webnn/issues/365 -> Pull Request 365 Add the constant() method steps. (zolkis)
14:08:45 ... I believe Zoltan has followed Chai's guidelines for this PR
14:08:45 ... this is pending Chai's feedback
14:09:17 zkis_: this is one of the open PRs; it addresses all comments from Ningxin and Chai
14:10:07 ... this is one of the most well-baked PRs
14:10:39 RafaelCintron has joined #webmachinelearning
14:11:23 Chai: I looked at the recent changes, I will follow up on it this week
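A minimal sketch of the constant() call whose internal method steps PR #365 specifies, assuming the MLGraphBuilder API shape in the spec around the time of this meeting; the descriptor field names are illustrative and may change:

```js
// Build-time constant operand: the buffer contents are baked into the graph.
const context = await navigator.ml.createContext();
const builder = new MLGraphBuilder(context);

// A 2x2 float32 constant; the method steps validate that the buffer view
// matches the descriptor's element type and dimensions.
const weights = builder.constant(
  { type: 'float32', dimensions: [2, 2] },
  new Float32Array([1, 2, 3, 4])
);
```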
14:12:48 zkis_: a style change PR would be coming up after this
14:14:09 zkis_: PR #322 would be the next PR to be reviewed
14:14:09 https://github.com/webmachinelearning/webnn/issues/322 -> Pull Request 322 Simplify MLContext creation (wchao1115)
14:14:36 chai: I was planning to rebase PR #322 once the style changes have landed
14:15:25 zkis_: the style change won't affect PR #322
14:15:45 chai: the style change should not overlap with other PRs
14:18:22 ... my PR #322 can wait until the style change comes in
14:18:34 q?
14:18:44 Topic: Enhancements, editorials, questions
14:19:03 Subtopic: TC39 proposal to add Float16Array to JavaScript
14:19:14 anssik: issue #373
14:19:15 https://github.com/webmachinelearning/webnn/issues/373 -> Issue 373 heads up re: proposal to add Float16Array to JavaScript (bakkot)
14:19:22 -> TC39 proposal http://tc39.es/proposal-float16array/
14:20:05 anssik: an ECMA TC39 rep let us know there's now a proposal for Float16Array in JavaScript, which would hold IEEE binary16 floats
14:20:12 ... we have flagged this as an issue in the WebNN API spec:
14:20:18 -> WebNN: clarify the usage of ArrayBufferView for float16 https://www.w3.org/TR/webnn/#appendices-mloperandtype-arraybufferview-compatibility
14:20:54 q+
14:20:54 anssik: we should track the progress of this JS feature; current status: "The proposal is currently at stage 2, which means TC39 has not yet committed to add it to JS"
14:20:57 q?
14:21:07 ack chai
14:21:31 chai: does this proposal plan to cover float8?
14:22:01 q?
14:22:02 ... or bfloat16, even if not universally supported?
14:22:12 anssik: we should probably ask the ECMA folks about their position
14:22:56 https://github.com/tc39/proposal-float16array
14:23:47 q?
14:24:14 Subtopic: Define the algorithm of calculating the effective padding for "same-upper" and "same-lower" option
14:24:19 anssik: issue #326
14:24:20 https://github.com/webmachinelearning/webnn/issues/326 -> Issue 326 Define the algorithm of calculating the effective padding for "same-upper" and "same-lower" option (huningxin) Editorial
14:24:45 anssik: "WebNN conv2d operation allows to set MLConv2dOptions.autoPad option to "same-upper" or "same-lower" of MLAutoPad enum."
14:24:55 ... proposed fix: "The spec should define the algorithm of how the padding values are automatically computed."
14:26:11 ningxin_hu: we were allowed to continue the implementation and left a TODO in the Chromium code, to be resolved when the specification clarifies the autoPad calculation
14:26:27 ... if no concerns are heard, I can take this issue and propose a PR
14:27:17 ... this could reuse the existing 2D pooling definitions
14:28:49 ... with the current spec conventions this would be one formula describing the calculation; if we apply the new conventions with algorithmic steps it would be ~10 lines of spec text
14:29:16 ... that is, if we prefer to follow the new algorithmic style for this PR
14:30:10 anssik: this issue will wait for the algorithm conventions updates to land
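The "same-upper"/"same-lower" rule issue #326 asks the spec to define is well established in other frameworks (e.g. ONNX's SAME_UPPER/SAME_LOWER). A sketch of the per-axis computation such an algorithm would capture; computeSamePadding is a hypothetical helper, not spec text:

```js
// Conventional "SAME" auto-padding for one spatial axis: choose the total
// padding so the output size is ceil(inputSize / stride), then split it.
function computeSamePadding(inputSize, filterSize, stride, dilation, autoPad) {
  const outputSize = Math.ceil(inputSize / stride);
  const effectiveFilterSize = (filterSize - 1) * dilation + 1;
  const totalPadding = Math.max(
    (outputSize - 1) * stride + effectiveFilterSize - inputSize, 0);
  // "same-upper" puts the extra element at the end, "same-lower" at the beginning.
  const beginPadding = autoPad === 'same-upper'
    ? Math.floor(totalPadding / 2)
    : Math.ceil(totalPadding / 2);
  return [beginPadding, totalPadding - beginPadding];
}

computeSamePadding(7, 3, 2, 1, 'same-upper'); // → [1, 1]
```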
"The WebIDL 32 bit floating point type float is widely used by WebNN spec, such as for setting min and max value of clamp operator ... However, WebIDL spec has a warning of using float and the recommendation is to use double." 14:31:40 ... proposed fix: "the spec should mention reason of using float" 14:31:53 q+ 14:31:58 ack chai 14:32:36 chai: I think for property types such as attributes this should be straightforward to replace float with double, I would not adjust operands, tensor types, double tensor types would be too big 14:33:01 q+ 14:33:04 ack ningxin_hu 14:33:44 ningxin_hu: you are fine changing attribute type to double? 14:34:01 chai: but float64 tensors are problematic for all the hardware 14:34:37 ningxin_hu: this would impact the double-precision baseline implementation impact? 14:34:50 q+ 14:34:50 chai: attributes do not affect that, conformance is testing tensors 14:35:00 q? 14:35:10 q- 14:35:23 chai: to clarify, example attributes would be min and max values 14:35:24 q? 14:35:29 q+ 14:35:34 ack ningxin_hu 14:36:05 ningxin_hu: for min and max, we can define these as double, but tensors won't support double, they can be small like float16 14:36:19 q+ 14:36:55 ... should we put a note there to mention how this double attributes are applied to float32 or float16 tensors? 14:36:55 q? 14:36:55 chai: I think it is up to you if you want to add a note to clarify this 14:36:55 q? 14:37:07 chai: strong feeling for float64 tensors 14:37:11 ack zkis_ 14:37:31 zkis_: I'm wondering if symbolic definition would be better? 14:37:38 ... type related prose in one place 14:37:40 q? 14:37:54 chai: I think it is maybe obvious if we look at the operand types 14:38:19 ... maybe no need for explicit text for that 14:38:43 ... I prefer a spec that is concise, if we look at operand type it is clear we don't support float64 tensors 14:38:45 q? 14:39:46 ningxin_hu: this is from Jiawei, he mentioned fp64, I think he means the tensor but Chai does not recommend that 14:40:05 ... need to confirm with Jiawei, maybe Chai can chime in on GH issue 14:40:42 chai: I'll comment on this issue #325 14:40:43 https://github.com/webmachinelearning/webnn/issues/325 -> Issue 325 Clarify the usage of 32 bit floating point type and consider using double (huningxin) enhancement 14:40:53 Subtopic: Subclass MLGraph based on the context that creates it 14:40:57 anssik: issue #344 14:40:58 https://github.com/webmachinelearning/webnn/issues/344 -> Issue 344 Subclass MLGraph based on the context that creates it (huningxin) question 14:41:17 RafaelCintron_ has joined #webmachinelearning 14:41:21 anssik: is API ergonomics improvement the key reason for this proposed change? 14:42:39 anssik: we'll defer this issue from today's call for later 14:42:47 Topic: Support for transformers 14:43:15 anssik: We received feedback during the AC review of our charter that the WG should initiate discussion on transformer models applicable for accelerated inference via WebNN API 14:43:23 ... this feedback is publicly captured in issue #375 14:43:23 https://github.com/webmachinelearning/webnn/issues/375 -> Issue 375 Mention transformer in use cases (dontcallmedom) v2 14:43:40 ... as we know, the transformer architecture was introduced in Jun 2017 initially focusing on translation tasks 14:43:59 ... later adding more models such as GPT, BERT, GPT-2, BART, GPT-3 etc. roughly in order of appearance. 14:44:13 ... there's roughly three top-level categories for the transformer models AFAICT: 14:44:19 ... 
14:40:53 Subtopic: Subclass MLGraph based on the context that creates it
14:40:57 anssik: issue #344
14:40:58 https://github.com/webmachinelearning/webnn/issues/344 -> Issue 344 Subclass MLGraph based on the context that creates it (huningxin) question
14:41:17 RafaelCintron_ has joined #webmachinelearning
14:41:21 anssik: is API ergonomics improvement the key reason for this proposed change?
14:42:39 anssik: we'll defer this issue from today's call to a later one
14:42:47 Topic: Support for transformers
14:43:15 anssik: We received feedback during the AC review of our charter that the WG should initiate discussion on transformer models applicable for accelerated inference via the WebNN API
14:43:23 ... this feedback is publicly captured in issue #375
14:43:23 https://github.com/webmachinelearning/webnn/issues/375 -> Issue 375 Mention transformer in use cases (dontcallmedom) v2
14:43:40 ... as we know, the transformer architecture was introduced in June 2017, initially focusing on translation tasks
14:43:59 ... with more models such as GPT, BERT, GPT-2, BART, GPT-3 etc. added later, roughly in order of appearance
14:44:13 ... there are roughly three top-level categories for the transformer models AFAICT:
14:44:19 ... - auto-regressive models ("GPT-like")
14:44:25 ... - auto-encoding models ("BERT-like")
14:44:32 ... - sequence-to-sequence models ("BART-like")
14:44:52 ... there have been some early experiments in using these models in the web context, e.g. Transformers.js, which supports a number of transformer models from Hugging Face
14:44:56 -> Transformers.js https://xenova.github.io/transformers.js/
14:45:01 -> Transformers.js supported tasks and models https://xenova.github.io/transformers.js/#usage
14:45:21 ... I believe the use cases WebNN might want to target would be a subset of what Transformers.js supports today. Let me drop a list here to initiate discussion:
14:45:34 ... - text classification
14:45:34 ... - token classification
14:45:34 ... - zero-shot classification
14:45:34 ... - question answering
14:45:35 ... - language modelling
14:45:35 ... - summarization
14:45:35 ... - translation
14:45:35 ... - text generation
14:45:35 ... - automatic speech recognition
14:45:35 ... - image-to-text
14:45:36 ... - image classification
14:45:36 ... - zero-shot image classification
14:45:36 ... - image segmentation
14:45:37 ... - object detection
14:45:37 ... - embeddings
14:46:03 q+
14:46:27 ... I think it is fair to say many transformers are big models, so I'm eager to hear your thoughts on which models you feel are the best first candidates in the web context for WebNN
14:46:37 ... we know training transformer models is a HUGE task, and luckily training is out of scope for WebNN, so we don't need to worry about that
14:46:51 ... let's discuss what inference tasks on a trained transformer model would be good targets
14:47:04 ... and let's try to use our improved contribution guidelines, which suggest we should start by looking at use cases
14:47:08 -> Guidelines > Proposing and adding a new operation https://github.com/webmachinelearning/webnn/blob/main/CONTRIBUTING.md#proposing-and-adding-a-new-operation
14:47:12 q?
14:47:13 q+
14:47:15 ack chai
14:47:36 chai: I agree, transformers are becoming very hot and WebNN needs to support them
14:47:50 ... most transformers coming up recently, and popular right now, are not a single model but a pipeline
14:48:24 ... e.g. Stable Diffusion is 6 models, starting with the embedding stage; the core part is the attention network, which is, generally speaking, the part that needs processing
14:49:01 ... in fact, depending on the pipeline implementation, Python or C#, some of those languages have a library for tokenization and embedding
14:49:29 ... in most cases done on the CPU; ~99% of the runtime is the auto-encoding part, which Stable Diffusion runs in a loop
14:49:44 ... I totally support the idea of adding support for transformers; we should look at which transformers to support
14:50:06 ... for embeddings and tokenization there's not much to be done, they are outside the DNN, but the auto-encoder is super important, the crux of the whole thing
14:50:07 q?
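To make the pipeline point concrete, a sketch along the lines of the Transformers.js usage linked above; the import path and task name follow the project's documentation at the time and are illustrative only:

```js
// One pipeline() call wraps tokenization, model inference, and decoding.
// Tokenization and embedding run in JS/Wasm on the CPU; the transformer
// itself is the heavyweight stage a WebNN backend could accelerate.
import { pipeline } from '@xenova/transformers';

const classifier = await pipeline('sentiment-analysis');
const result = await classifier('WebNN makes transformers fast on the web!');
console.log(result); // e.g. [{ label: 'POSITIVE', score: 0.99 }]
```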
14:50:28 ack zkis_
14:51:22 q+
14:51:24 zkis_: I have some questions to clarify: Transformers.js uses a pipeline in its API, and it seems to use ONNX models, so you have to convert your models to ONNX; what is the core WebNN use case?
14:51:25 q?
14:51:28 ack ningxin_hu
14:51:44 ningxin_hu: I would also support transformer networks in WebNN
14:52:04 ... from my perspective, WebNN intends to be the backend for frameworks
14:52:26 ... not a single model, but multiple models need to be supported, along with the interaction between the models and how to run them in a loop
14:52:47 ... we need to understand how to partition the network with WebNN as a backend
14:53:29 ... we need to be careful how to partition the task, i.e. which model parts can be delegated or offloaded to WebNN for eventual acceleration by hardware, and which tasks are left to the framework or user code to handle
14:53:46 ... we may want to talk to web application developers to understand their usages
14:53:48 q+
14:53:57 ... Transformers.js and ONNX would be good projects for the WG to investigate
14:53:58 q?
14:54:17 ack chai
14:54:37 chai: my thinking is aligned with Ningxin's; if we look at Transformers.js, it is all implemented in JS and Wasm, the whole pipeline can be written in JS and compiled into Wasm
14:54:59 ... when you run this natively, each model is instantiated; the heavyweight one is the auto-encoder
14:55:40 ... WebNN is the backend that can be used by these frontends; most of the time is spent in the auto-encoder
14:56:12 ... the transformer is probably the best network to highlight WebNN's benefit; without proper acceleration it will be very slow and use a lot of memory
14:56:24 ... it benefits greatly from hardware acceleration
14:56:25 q?
14:57:04 q?
14:57:38 anssik: would you like to work on this in a GH issue?
15:00:18 ... sounds like we want to brainstorm use cases in issue #375
15:00:18 https://github.com/webmachinelearning/webnn/issues/375 -> Issue 375 Mention transformer in use cases (dontcallmedom) v2
15:00:45 q?
15:01:03 RRSAgent, draft minutes
15:01:04 I have made the request to generate https://www.w3.org/2023/04/27-webmachinelearning-minutes.html anssik