13:56:14 RRSAgent has joined #webmachinelearning
13:56:18 logging to https://www.w3.org/2023/06/08-webmachinelearning-irc
13:56:18 RRSAgent, make logs Public
13:56:19 please title this meeting ("meeting: ..."), anssik
13:56:19 Chair: Anssi
13:56:28 Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2023-06-08-wg-agenda.md
13:56:32 Scribe: Anssi
13:56:35 scribeNick: anssik
13:57:07 ghurlbot, this is webmachinelearning/webnn
13:57:07 anssik, OK.
13:58:24 Present+ Anssi_Kostiainen
13:58:34 Regrets+ Dominique_Hazael-Massieux
14:01:49 RafaelCintron has joined #webmachinelearning
14:01:53 ningxin_hu has joined #webmachinelearning
14:01:55 zkis has joined #webmachinelearning
14:02:50 chai has joined #webmachinelearning
14:05:15 Joshua has joined #webmachinelearning
14:05:55 Vivek has joined #webmachinelearning
14:06:55 RRSAgent, draft minutes
14:06:57 I have made the request to generate https://www.w3.org/2023/06/08-webmachinelearning-minutes.html anssik
14:07:55 Meeting: WebML WG Teleconference – 8 June 2023
14:08:01 Zakim, prepare meeting
14:08:01 RRSAgent, make logs Public
14:08:03 please title this meeting ("meeting: ..."), anssik
14:08:09 Chair: Anssi
14:08:21 Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2023-06-08-wg-agenda.md
14:08:26 Scribe: Anssi
14:08:32 scribeNick: anssik
14:08:49 RRSAgent, draft minutes
14:08:50 I have made the request to generate https://www.w3.org/2023/06/08-webmachinelearning-minutes.html anssik
14:09:18 Present+ Joshua_Lochner
14:09:26 Present+ Zoltan_Kis
14:09:30 Present+ Ningxin_Hu
14:09:35 Present+ Rafael_Cintron
14:09:42 Present+ Chai_Chaoweeraprasit
14:09:46 Present+ Vivek_Sekhar
14:09:56 RRSAgent, draft minutes
14:09:58 I have made the request to generate https://www.w3.org/2023/06/08-webmachinelearning-minutes.html anssik
14:10:07 Topic: Introductions
14:10:13 anssik: please welcome Joshua to today's call!
14:10:23 ...
Joshua Lochner (@xenova) created Transformers.js and joined HuggingFace recently
14:10:23 https://github.com/xenova -> @xenova
14:10:29 ... after a discussion with Nikhil I thought I must invite Joshua to share his findings from Transformers.js with this WG, and here he is!
14:10:39 ... we can do a super quick 15 sec intro round:
14:11:30 ... Joshua, 23-year-old SW developer, created Transformers.js and now at HuggingFace, flattered to be called an invited expert! Loves open source.
14:12:17 anssik: Anssi, chair of this WG, working at Intel, long-term web standards contributor, excited to see our WG grow and get new participants as we advance to v2
14:12:56 ... Ningxin, WebNN API spec co-editor, implementing WebNN in Chromium, working at Intel, welcome Joshua!
14:13:29 ... Chai, running a team at Msft working on the ML platform for the core OS, WebNN API spec co-editor
14:14:22 ... Rafael, developer on the Msft Edge team, low-level graphics, contributes to browsers, WG participant, DirectML focus
14:15:00 ... Zoltan, Intel, AI research background, part of the web team helping specs advance, in this WG helps with the spec algorithms
14:15:29 ... Vivek, Google, Chrome team, WebGPU, Wasm, recently joined the ML effort within Chrome
14:15:40 RRSAgent, draft minutes
14:15:42 I have made the request to generate https://www.w3.org/2023/06/08-webmachinelearning-minutes.html anssik
14:16:03 Topic: Transformers.js
14:16:15 anssik: Joshua Lochner (@xenova) will introduce Transformers.js
14:16:24 ... and share his learnings from this project
14:16:35 ... including practical use cases to help inform the WebNN v2 feature work.
14:16:55 ... We want to make WebNN the most performant and robust backend for a future version of Transformers.js.
14:17:07 ...
Joshua provided background material for this meeting in a GH comment where we discuss support for transformers:
14:17:12 https://github.com/webmachinelearning/webnn/issues/375#issuecomment-1560785639
14:17:31 anssik: in our follow-up discussion I told him that the WG's key interest is to hear, based on his Transformers.js experience:
14:17:45 ... 1) what are the feasible tasks or use cases now or in the short term in the browser
14:18:08 ... 2) what is "coming up" but not yet ready for the browser, and what is missing to make it feasible
14:18:26 ... this feedback will be great input into v2 feature discussions to help prioritize our work.
14:18:41 ... I also shared with Joshua the WG's high-level approach to adding new features into the WebNN API:
14:18:46 ... 1) identify use cases
14:18:49 brb
14:19:01 ... 2) "research" models, framework support, cross-platform support
14:19:32 ... 3) derive requirements (ops, other functional requirements, non-functional reqs such as perf, a11y, privacy, i18n, usability, responsibility & transparency)
14:19:46 ... 4) spec new features in a close feedback loop with the implementations
14:19:46 back
14:20:05 ... the WG's work mode is also captured in our contribution guidelines for new ops
14:20:12 -> https://github.com/webmachinelearning/webnn/blob/main/CONTRIBUTING.md#proposing-and-adding-a-new-operation Proposing and adding a new operation
14:20:22 ... on our preexisting "v1" use cases, the spec documents two categories:
14:20:27 -> application use cases https://www.w3.org/TR/webnn/#usecases
14:20:39 -> framework use cases https://www.w3.org/TR/webnn/#usecases-framework
14:21:05 anssik: application use cases are a mix of tasks across multiple modalities, derived from well-known classic models such as SqueezeNet, MobileNet, ResNet, TinyYOLO, RNNoise, NSNet etc.
14:21:25 ... Computer Vision: semantic segmentation, person detection, skeleton detection etc.
14:21:25 ... Text-to-text: summarization and translation
14:21:25 ...
Video-to-text: video summarization
14:21:25 ... Audio: noise suppression
14:21:25 ... etc.
14:21:50 ... framework "use cases" include requirements received from JS framework vendors, e.g. custom layer, network concatenation, perf adaptation, op-level execution, integration with real-time video processing (WebRTC) etc.
14:22:11 ... with that as an intro on behalf of the WG, I'll let Joshua share with us his feedback from Transformers.js, including an introduction to this library that uses ONNX Runtime Web (currently with the Wasm backend) under the hood!
14:22:46 [Joshua presents slides]
14:23:02 Joshua: Transformers.js runs HF Transformers in the browser
14:23:30 ... run pre-trained models in the browser, GH community growing
14:23:46 ... "What can it do?" Text, Vision, Audio, Multimodal
14:24:21 "How does it work?" 1) Convert your model to ONNX with HF Optimum, 2) Write JS code, 3) Run in the browser
14:27:33 ... Why was it created? Origins: remove spam YT comments; Current plan: support all Transformers models, tokenizers, processors, pipelines, and tasks; Ultimate goal: help bridge the gap between web dev and ML
14:27:58 RRSAgent, draft minutes
14:27:59 I have made the request to generate https://www.w3.org/2023/06/08-webmachinelearning-minutes.html anssik
14:30:02 Joshua: Applications
14:30:53 ... WebML environments: websites and PWAs; browser extensions; server-side / Electron apps
14:32:00 ... WebML's ability to use the same model across websites is a massive positive; privacy benefits of on-device inference
14:32:12 ... Feasible tasks:
14:33:08 ... Text classification (sentiment analysis, NER), Code completion (constrained text-generation problems), Text-to-text (translation, summarization)
14:33:58 s/Feasible tasks:/Feasible tasks in Text / Vision / Audio / Multimodal
14:34:50 ... Image classification (label images), Object detection (bounding boxes for objects), Segmentation
14:36:32 ... Speech-to-Text (ASR), Text-to-Speech
14:37:30 ...
Multimodal: Embeddings (semantic search, clustering, data analysis), Image-to-text (captions for images)
14:38:31 ... Limitations
14:39:31 ... Speed (CPU only now), Memory (Wasm can't address >4GB), Models (standards, distribution, interop), Browsers (unified model caching, Tensor API)
14:41:48 RRSAgent, draft minutes
14:41:49 I have made the request to generate https://www.w3.org/2023/06/08-webmachinelearning-minutes.html anssik
14:43:50 q+
14:43:57 ack chai
14:44:10 q+
14:44:24 Chai: Thanks Joshua! When using ONNX you're using ONNX Runtime Web?
14:44:28 Joshua: correct
14:44:40 Chai: using Optimum?
14:45:08 Joshua: defaults to FP32, quantized to 8-bit
14:45:20 Chai: Segment Anything running as quantized models?
14:45:32 Joshua: correct, working on some enhancements
14:45:59 Chai: thanks!
14:46:39 Joshua: would love to connect with the Msft ONNX Runtime folks
14:46:44 Chai: would love to make that connection
14:47:47 Joshua: I've debugged the WebGPU issues with ONNX Runtime Web and would love to connect with people working on that feature
14:47:49 q?
14:48:41 ack Vivek
14:49:15 Vivek: thanks, this is fantastic! Scientific computing and linear algebra and a tensor API? Can you mention a bit more what the requirements would be?
14:49:29 Joshua: I want to be able to do pre- and post-processing
14:50:11 ... so if you go through the code of utils, maths, audio, tensor in JS, it is annoying that I had to implement these ops myself in JS
14:50:42 ... I think these should have Web APIs, maybe similar to NumPy
14:51:18 ... image resizing, many ways to do this; in the browser it is done with the canvas API, but it does not allow you to select the interpolation algorithm
14:51:30 ... this has performance implications
14:52:49 q+
14:52:53 ... other lib developers might also find these scientific computing and linear algebra helpers useful, see maths.js and tensor.js in utils
14:52:54 q?
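[Editorial note: the NumPy-style helpers Joshua mentions reimplementing in plain JS (cf. the maths.js and tensor.js utilities) are of the following kind. This is a minimal illustrative sketch, not Transformers.js code; the function names are ours.]

```javascript
// Typical post-processing helpers a WebML library ends up writing itself.
// Illustrative only; names and shapes are assumptions, not a library API.

function softmax(logits) {
  // Subtract the max before exponentiating for numerical stability.
  const max = Math.max(...logits);
  const exps = logits.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((x) => x / sum);
}

function argmax(values) {
  // Index of the largest element, e.g. to pick the top class label.
  return values.reduce((best, x, i) => (x > values[best] ? i : best), 0);
}
```

A classifier's raw logits would go through `softmax` and then `argmax` to select a label; the point of the discussion is that every JS ML library currently reimplements such primitives.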
14:52:56 ack ningxin_hu
14:53:03 q+
14:53:27 ningxin_hu: questions related to the memory limitation, the Wasm heap limitation; there are some workarounds using WebGPU
14:54:34 ... what is the ideal way for big model weights to be downloaded to the client side?
14:55:02 ... Wasm provides streaming compilation, compile it while streaming it
14:55:20 Joshua: in my mind the download and the size don't exceed 4GB when using the Wasm implementation
14:55:57 ... some translation models are ~1 GB, I wouldn't worry about how it is loaded as long as it's cached
14:56:13 ... now you load the model, it is saved so you use the cached version if you use it again
14:56:53 ... I wouldn't care if it is big as long as the browser handles caching and you can share it across websites
14:56:59 q?
14:57:33 ... saving to local storage is an ideal solution
14:58:36 ... as for loading weights as you process, I probably wouldn't advise running such models; if a 4 GB model is needed, it is maybe not realistic in the browser today, maybe in the future
14:59:28 anssik: are you using a CDN for models?
15:00:17 Joshua: serving models from huggingface.co currently
15:00:45 s/huggingface.co/huggingface.co model hub
15:01:24 q?
15:01:37 ack chai
15:02:08 chai: One specific question: you mentioned the popular models are quantized, and also that you look for WebGPU support
15:02:31 ... quantized 8-bit is not great in WebGPU, the more optimized data type is FP16
15:02:43 ... I'm wondering what your thoughts are here?
15:03:26 Joshua: I spoke about the desire for quantized models here
15:04:19 ... FP16 support with ops and running that on GPU, I haven't got to that point yet, I'm a one-man show currently :-)
15:05:10 Chai: my day job is ML platforms in the core Windows OS, happy to connect with you to help out here, now focusing on GPU
15:05:31 q?
15:06:20 Chai: we're dealing with the GPU issues typical developers face writing shaders
15:06:49 Joshua: WebGPU support is the next big thing on the todo list
15:06:53 q?
15:07:28 q?
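[Editorial note: on Joshua's earlier point that canvas-based image resizing does not let you choose the interpolation algorithm, a library must hand-roll resizing when a specific algorithm is required. Below is a hypothetical nearest-neighbour resize over a flat grayscale buffer; it is illustrative only and not Transformers.js code.]

```javascript
// Nearest-neighbour resize of a flat row-major grayscale buffer.
// Sketch only: real preprocessing pipelines also handle multi-channel
// images and other interpolation modes (bilinear, bicubic, ...).
function resizeNearest(src, srcW, srcH, dstW, dstH) {
  const dst = new Float32Array(dstW * dstH);
  for (let y = 0; y < dstH; y++) {
    // Map each destination row/column back to its nearest source pixel.
    const sy = Math.min(srcH - 1, Math.floor((y * srcH) / dstH));
    for (let x = 0; x < dstW; x++) {
      const sx = Math.min(srcW - 1, Math.floor((x * srcW) / dstW));
      dst[y * dstW + x] = src[sy * srcW + sx];
    }
  }
  return dst;
}
```

The choice of interpolation matters for model accuracy, which is why an uncontrollable canvas resize is a limitation for preprocessing.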
15:09:03 Topic: WebIDL and Infra standard conventions
15:09:17 i am good
15:09:27 anssik: I discussed with Chai how to make progress with our open PRs for the WebIDL and Infra standard conventions changes
15:09:43 ... we came to the conclusion it would help if there were a fork that tracks the official spec and integrates all these changes into it
15:09:52 ... I proposed to Zoltan that he could host the rendered version of such a fork at https://zolkis.github.io/webnn/
15:10:07 ... we could then use https://services.w3.org/htmldiff to visually compare the delta between the built versions of https://www.w3.org/TR/webnn/ and https://zolkis.github.io/webnn/, which is often faster and easier than source-level diffs
15:10:33 ... the WG would review all the changes in the fork together and merge wholesale once reviewed and ready.
15:10:49 ... I also discussed this plan with Zoltan and I think we agreed on the big picture, but wanted to have this discussion to sync all of us.
15:11:28 Chai: thanks Anssi! that's a good description of what I think would work better.
15:12:00 Zoltan: this will solve the review problem of how to deal with merging thousands of LOC
15:12:15 Chai: we can agree on the types of changes, stylistic vs. normative
15:12:30 ... those changes can be staged; when we look at a PR for the entire fork it is a lot of work
15:12:31 q?
15:12:48 ... Bikeshed is not ideal for diffing
15:13:18 ... a PR may become outdated, there is no magic bullet for how to ingest big changes, we must spend the time reviewing them; I'm convinced staging this as a fork reduces work
15:13:23 Zoltan: I agree
15:13:36 ... privately I have set up such a fork
15:13:58 ... I can make a GH Action that builds the spec in an integration branch and deploys the built spec
15:14:27 ... the changes are simple, adding algorithmic steps; I have separate branches for all the methods and an integration branch that unifies everything
15:15:16 ...
moving descriptions for arguments; if there are dictionaries in the IDL I move them into their own subsections, and argument sections become main text
15:15:30 ... we will have separate sections for polymorphic functions
15:16:22 ... with a polymorphic function we have generic text and use autolinking; the last change is internal slots for algorithms, those are merged already
15:16:54 Chai: my key point is atomicity of changes
15:17:24 ... from the PoV of the editors, we want to make sure that, compared to the baseline, by the time the PR is merged it does not leave any undefined state in the spec
15:18:03 ... we should do all stylistic changes in one change to make it an atomic change
15:18:27 Zoltan: I make all the changes and then we slice them into pieces for merging, would that work?
15:18:53 Chai: style changes and content changes should be separated
15:19:17 ... that will help regulate the proposed changes going into the mainline; otherwise it will be harder to review the entire fork
15:19:35 ... atomicity is important, I hope that makes sense
15:19:52 ... no need to review incrementally, bring the fork forward in one go
15:20:10 ... I will be done with my fork next week, will notify you when it is ready to review
15:20:29 Chai: I'd stop the in-flight PRs, go over to that fork, and port over the changes when the fork is ready
15:21:25 Zoltan: I can merge into the integration branch, I will share a proposal next week?
15:21:54 Chai: maintaining it as a branch or a fork, either way should work
15:21:59 ... personally I prefer a fork, so we can pull it forward
15:22:11 ... the mechanics of this are up to you, you must just stage it somewhere
15:22:22 Zoltan: some of the PRs have been merged
15:22:39 Chai: I'm aware, I'd prefer to stop that and bring all the rest of the changes in when you are ready
15:23:04 ... async and sync changes are content changes
15:23:22 Zoltan: batchNorm, clamp and concat you want closed and moved to the fork?
15:23:29 Chai: correct
15:23:40 i need to drop
15:26:06 RRSAgent, draft minutes
15:26:07 I have made the request to generate https://www.w3.org/2023/06/08-webmachinelearning-minutes.html anssik
15:26:27 ningxin_hu: an integration branch will be fine I think
15:27:50 RRSAgent, draft minutes
15:27:51 I have made the request to generate https://www.w3.org/2023/06/08-webmachinelearning-minutes.html anssik
15:28:52 @Anssi can I email you the slides?
15:28:52 https://github.com/Anssi -> @Anssi
16:53:40 s/[Joshua presents slides]/-> Transformers.js presentation slides https://lists.w3.org/Archives/Public/www-archive/2023Jun/att-0000/Transformers_js.pdf
16:53:50 RRSAgent, draft minutes
16:53:51 I have made the request to generate https://www.w3.org/2023/06/08-webmachinelearning-minutes.html anssik
17:24:31 Zakim has left #webmachinelearning