14:50:30 RRSAgent has joined #webmachinelearning
14:50:35 logging to https://www.w3.org/2023/12/14-webmachinelearning-irc
14:50:35 RRSAgent, make logs Public
14:50:36 please title this meeting ("meeting: ..."), anssik
14:50:36 Meeting: WebML WG Teleconference – 14 December 2023
14:50:41 Chair: Anssi
14:50:47 Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2023-12-14-wg-agenda.md
14:57:50 jsbell has joined #webmachinelearning
14:58:04 Scribe: Anssi
14:58:09 scribeNick: anssik
14:58:30 gb, this is webmachinelearning/webnn
14:58:30 anssik, OK.
14:58:39 Present+ Anssi_Kostiainen
14:58:49 Present+ Joshua_Bell
14:59:00 RRSAgent, draft minutes
14:59:01 I have made the request to generate https://www.w3.org/2023/12/14-webmachinelearning-minutes.html anssik
14:59:56 Rachel has joined #webmachinelearning
15:00:09 Present+ Rachel_Yager
15:00:20 Present+ Bryan_Bernhart
15:00:36 Present+ Chai_Chaoweeraprasit
15:01:00 Ningxin_Hu has joined #webmachinelearning
15:01:29 Present+ Joshua_Lochner
15:01:56 Joshua_Lochner has joined #webmachinelearning
15:02:09 Present+ Ningxin_Hu
15:02:26 RRSAgent, draft minutes
15:02:27 I have made the request to generate https://www.w3.org/2023/12/14-webmachinelearning-minutes.html anssik
15:02:38 Topic: Welcome to new participants
15:02:49 anssik: we had a slew of new folks joining the WG over the past few weeks, let me introduce them:
15:03:07 RafaelCintron has joined #webmachinelearning
15:03:43 dom has joined #webmachinelearning
15:04:09 anssik: Tianqi Chen joined the WG as an Invited Expert, welcome! Tianqi is an Assistant Professor at CMU, creator of WebLLM, and Chief Technologist at OctoML. I've invited Tianqi to share his learnings from WebLLM and his proposed use cases for WebNN to consider, on our 11 January 2024 call, subject to his availability.
15:04:26 Present+
15:04:41 anssik: Bryan Bernhart from Intel joined as a WG rep, welcome! Bryan brings in a wealth of GPU expertise; he is an active WebGPU contributor and already works with many of our participants in that space.
15:05:09 Bryan: good intro Anssi!
15:06:13 anssik: Laszlo Gombos from Samsung also joined the WG, welcome! It has been my pleasure to work with Laszlo in many areas of the web platform. I want to note Samsung's web contributions extend beyond mobile devices. For example, Samsung Internet, Samsung's Chromium-powered browser, is available on Windows PCs too. I'm pleased to see more browser vendors join the WG with interest in the WebNN API.
15:06:50 anssik: Christos Bacharakis, Director of Engineering at eyeo, joined the WG to help drive the ML ethical guidelines forward. Welcome and thank you! The ethics effort is expected to accelerate in 2024.
15:07:10 anssik: I may have missed some new folks due to the avalanche of new people joining; my apologies if I missed you.
15:07:22 ... the strong momentum behind the WG's work has clearly been recognized in the industry and, as a consequence, the WG continues to grow.
15:07:31 ... please join me in welcoming these new people to the WG
15:07:37 ... there are opportunities for everyone to make impactful contributions.
15:08:20 zkis has joined #webmachinelearning
15:08:37 present+ Zoltan_Kis
15:08:43 Topic: WebNN v2: Transformer ops spec contributions celebration
15:09:17 chai has joined #webmachinelearning
15:09:17 anssik: issue #375 and PR #478
15:09:20 https://github.com/webmachinelearning/webnn/issues/375 -> Issue 375 Support for transformers (by dontcallmedom) [v2] [operation set]
15:09:20 https://github.com/webmachinelearning/webnn/issues/478 -> CLOSED Pull Request 478 Add support for operations needed for well-known transformers e.g. Segment Anything, Stable Diffusion, etc. (by wchao1115)
15:09:22 Subtopic: Thank You!
15:09:33 anssik: I wanted to use this meeting to celebrate the major milestone the WG hit this week by adding support for operations needed for well-known transformers.
15:09:37 ... first I want to acknowledge various individuals for their contributions:
15:10:04 ... Chai and Ningxin as the editors diligently worked on this PR, addressing in total 195 review comments! A lot of work to go through all the feedback. Chai's contribution was a big PR, so getting this done was a heavy lift.
15:10:59 ... Dwayne and Joshua L provided the initial seeds with their contributions that shaped and helped formulate the initial scope of v2 ops
15:11:28 ... your implementation experience informed the WG in significant ways and helped converge on the target models
15:12:14 ... Wanming's detailed transformer models analysis formed the basis for the WG's data-driven work mode that ensured the new ops added to the WebNN API are truly required to satisfy the requirements of the target models
15:12:44 ... we also recognize Google Chrome team's directional guidance received earlier this year and reflected in this analysis
15:12:52 ... including consideration for compatibility with other op sets, TOSA and StableHLO
15:13:34 ... Bruce, May & co worked to advance WPT tests and the webnn-baseline implementation to meet the interop demonstration expectations
15:14:01 ... Jiewei continued his careful review of the WebNN API implementation and, as you've noticed, his Chromium CL review comments have helped this WG improve the WebNN API spec in significant ways
15:14:27 ... Jiewei is demonstrating and role-modelling how a tight spec-implementation feedback loop works, identifying numerous edge cases and proposing improvements
15:14:54 ... Joshua Bell and Zoltan helped keep this latest PR and the spec in general aligned with the latest spec authoring conventions, an important area of work
15:15:27 ... Bin, Shiyi and Rafael have all provided great implementation-informed insights, and Rafael in addition has been our resident GPU expert in this WG, thank you! I expect Rafael to pair with Bryan, who joined the WG and with whom he has worked on many GPU things in the past.
15:15:33 ... and this is not a full list of people who made this possible, everyone's contributions are equally appreciated!
15:15:38 q?
15:16:35 Subtopic: Reflections and learnings
15:16:40 q?
15:16:49 q+
15:16:52 q?
15:16:55 ack Rachel
15:17:00 q+
15:17:48 Rachel: I want to say you missed yourself, Anssi; I want to acknowledge your leadership in this WG, thank you for pulling this WG together!
15:18:00 +1
15:19:01 ... I also want to ask how we could bring more people to work in this space, on computational models, not just neural nets, the other side of the ecosystem
15:19:13 q?
15:19:16 q+
15:19:16 ack Joshua_Lochner
15:20:42 Joshua_Lochner: I want to say thank you to Anssi! It is fun to be able to contribute to a project like this; from my side, with the Transformers.js library I created in March this year, I'm happy to see people are interested in this technology, with a huge community building around the project
15:21:43 ... 75 supported architectures currently with Transformers.js; as WebNN becomes more mainstream, adding new execution providers is great for giving people more options to run in the browser, a key goal for the next year
15:22:01 ... getting WebLLM on board this WG is amazing!
15:22:22 ... pushing the boundaries, great to be part of where the WebML world is going
15:22:23 q?
15:22:40 q?
15:22:45 ack Ningxin_Hu
15:23:32 Ningxin_Hu: I want to acknowledge Alex Gough from the Chrome Security Team who provided great input from a security perspective, e.g. for the gather op tightening
15:24:13 ... also want to acknowledge S. Raja and Patrick who provided great input on the spec and both helped with the Chromium implementation for the new transformer ops
15:24:14 q?
15:25:12 anssik: I also want to acknowledge Dom, our staff contact from W3C
15:25:15 q?
15:25:29 q?
15:26:26 ... careful background work in the GH issues, with a description of the problem and a careful analysis of solutions prior to the PR, helps reach consensus faster
15:26:59 ... close collaboration between the spec effort and implementation effort is a huge plus; it can help validate assumptions with running code even during the PR review
15:27:04 q+
15:27:12 ... similarly, co-developing tests together with the spec helps uncover underspecified parts
15:27:24 q?
15:27:28 ack zkis
15:28:00 zkis: just a question for the future: in order to make big changes easier to review, how do we make them more digestible?
15:28:37 ... I was wondering if we could land the next big change with an integration branch with multiple PRs delivered there, and when we are ready to make an atomic change we merge the integration branch
15:29:45 ... multiple smaller PRs may get faster reviews, though sometimes retaining the full context is useful
15:31:56 Chai: thank you Anssi, a big PR with 200 comments takes its time to converge; I'm fine with it taking long with many stakeholders involved
15:32:24 ... we started working in this space 4 years ago and this year we've really accelerated, with more and more folks and companies joining
15:32:34 ... also want to acknowledge your leadership Anssi
15:32:59 ... everyone on the call knows this year 2023 is the year of AI
15:33:20 ... I see this WG getting stronger and stronger, super excited for the future
15:33:40 ... working in this group is a dream come true, experts in this industry coming together in this group
15:34:02 ... we have a lot of issues and proposals to go through, so 2024 will be even busier than this year, thank you all!
15:34:39 q?
15:35:02 Subtopic: Next steps
15:35:16 anssik: now that we've landed this major PR I would like to discuss what our shared goals are going into 2024
15:35:33 ... I believe we want to advance the issues we spun off from the PR and I plan to bring them to our discussions on our future calls
15:35:50 ... extending beyond this v2 ops PR, I think we want to work on NPU support and WebGPU interop
15:36:06 ... before getting into W3C Process-level next steps, does anyone want to share areas of focus for 2024?
15:36:30 q?
15:37:17 anssik: I have invited Dom to share with us guidance on the expected W3C Process next steps
15:37:33 -> Initial wide review completed Mar 2023 https://github.com/webmachinelearning/webnn/issues/239
15:37:33 https://github.com/webmachinelearning/webnn/issues/239 -> CLOSED Issue 239 Wide review tracker (by anssiko) [process]
15:38:18 Dom: we reached first CR in March 2023; since we landed this major change to the spec that significantly expands the scope and incorporates rewrites, I suggested to Anssi that we want to target another CR snapshot
15:38:54 ... one motivation is it gives a greater IP grant for the whole scope; it is also an opportunity to ensure all these changes are well aligned with the rest of the platform and how Web APIs are expected to behave
15:39:14 ... when we reached CR in March we went through the wide review process
15:39:41 ... if we were to publish this CR snapshot we'd be expected to do a delta wide review for the changed and new parts
15:40:08 ... the primarily relevant horizontal group would be the TAG, for technical architecture
15:40:35 +q
15:40:47 ... any additional short-term issues worth fixing prior to that would be good to know, not necessarily WebGPU interop
15:41:44 ... any of the review groups could review any part of the spec, but the recommendation is to highlight what is new
15:41:57 ... we could point to the list of PRs rather than a raw HTML diff
15:42:30 anssik: maybe 2024 Q1 is a good time for a snapshot?
15:42:46 q?
15:42:47 ack RafaelCintron
15:42:53 Present+ Rafael_Cintron
15:43:25 RafaelCintron: the WebGPU interop portion of the spec currently has the least implementation experience; welcoming any review by the TAG or any other group
15:43:44 ... before we go ahead we should probably gather more implementation experience on the WebGPU bits
15:44:14 q?
15:46:44 Topic: Enhancements
15:46:56 Subtopic: API lacks handling for async ML device errors on the context (revisit)
15:47:01 anssik: issue #477
15:47:02 https://github.com/webmachinelearning/webnn/issues/477 -> Issue 477 API lacks handling for async ML device errors on the context (by bbernhar)
15:47:10 ... we discussed this on our 2023-11-16 call: https://www.w3.org/2023/11/16-webmachinelearning-minutes.html#t09
15:47:16 ... I wanted to revisit this now with Bryan in the group officially
15:47:36 ... to recap, Bryan is asking: "What happens if a WebNN operation dispatched through MLContext encounters some internal error which causes the GPU device to get removed?"
15:48:08 ... and his expectation was: "I would expect WebNN to provide a spec into how fatal (device) errors are handled so the WebNN developer could respond appropriately. If we want to do more with MLContext (ex. create buffers), I believe we'll need a more robust error mechanism like WebGPU"
15:48:13 -> WebGPU Errors & Debugging https://www.w3.org/TR/webgpu/#errors-and-debugging
15:48:20 q?
15:48:44 q?
15:48:46 +1
15:49:28 Bryan: we need to agree on what the spec should say for this; there have been multiple approaches, there can be a driver error, anything that causes a device loss
15:50:10 q?
15:51:37 Rafael: WebGL and WebGPU have a similar way to surface these errors: WebGL used callbacks to signal these errors, WebGPU did it in a modern way with a Promise that resolves when the context is lost
15:51:56 ... the WebGPU path seems worth following
15:52:26 ... in the future we can have multiple adapters, WebGL/WebGPU on different adapters; you lose a context but things continue to work on another adapter
15:52:44 q?
15:53:35 q?
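For illustration, a minimal sketch of the promise-based loss signal pattern referenced in the discussion above. The GPUDevice.lost promise is existing WebGPU API; everything on the MLContext side here is hypothetical, since issue #477 has not settled on a design, and the names are assumptions for illustration only.

    // Minimal TypeScript sketch of the promise-based loss signal discussed above.
    // GPUDevice.lost is existing WebGPU API; the MLContext-side interface and its
    // "lost" attribute are hypothetical and only illustrate the pattern in issue #477.

    // WebGPU today: device loss surfaces as a promise carrying a reason and message.
    async function watchGpuDevice(device: GPUDevice): Promise<void> {
      const info = await device.lost; // resolves when the device is lost
      console.warn(`GPU device lost (${info.reason}): ${info.message}`);
      // The application can request a new device and re-upload resources here.
    }

    // Hypothetical WebNN analogue: an MLContext could expose a similar signal.
    interface MLContextLossInfo { message: string; }                          // hypothetical
    interface MLContextWithLoss { readonly lost: Promise<MLContextLossInfo>; } // hypothetical

    async function watchMlContext(context: MLContextWithLoss): Promise<void> {
      const info = await context.lost; // hypothetical; not part of the WebNN spec
      console.warn(`ML context lost: ${info.message}`);
      // Recreate the context and rebuild/recompile graphs as needed.
    }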
15:53:53 Topic: New features
15:53:57 Subtopic: Support for device-based tensor storage objects
15:54:02 anssik: issue #482
15:54:03 https://github.com/webmachinelearning/webnn/issues/482 -> Issue 482 Support for device-based tensor storage objects (by bbernhar)
15:54:15 anssik: a proposal from Bryan for device-based tensor storage objects
15:54:22 ... problem statement:
15:54:37 ... - WebNN and WebGPU lack a way of sharing tensor data on-device directly with each other
15:54:52 ... - WebNN does not support chained inferences without copying everything back to the CPU
15:54:59 ... proposed solution is MLBuffer, features:
15:55:17 ... - Give the WebNN developer control of device storage to avoid round-trips to/from the CPU
15:55:34 ... - Could be extended to export/import to support WebNN interop with other web APIs
15:55:40 anssik: the GH issue proposes new interfaces for:
15:55:46 ... - MLBuffer construction/destruction
15:55:52 ... - Upload/Download of tensor data to/from an MLBuffer
15:55:58 ... - Binding an MLBuffer to an MLGraph
15:56:13 q?
15:57:04 Bryan: WebNN needs a way to share data and avoid round-tripping, which affects some models and inference performance
15:57:56 ... MLBuffer helps WebNN provide a means to source tensor data, initialize itself, and have explicit resource sharing similar to WebGPU
15:58:08 ... two birds with one stone: it covers both the interop and non-interop situations
15:58:08 q?
15:58:25 q?
15:59:00 q+
15:59:25 Chai: I do understand the motivation behind this proposal; I'm thinking about the implications of this proposal on the API side, and understand why encapsulation is beneficial for WebGPU interop and other reasons when we want the MLContext to act like a resource domain
15:59:33 ... I understand the motivation behind this
15:59:35 q?
15:59:40 ack RafaelCintron
16:00:43 RafaelCintron: near the end the proposal uses overloading for the compute method; we need to change it, probably to a new method that does not return a promise and only accepts a buffer
16:00:45 q+
16:01:00 q+
16:01:13 ... the API does not know you assign a return value to something, so the current proposal needs some improvement there
16:01:36 ... that said, I see the need for this feature, so each inference does not have to round-trip
16:01:49 ack chai
16:02:26 Chai: a likely implication of adopting MLBuffer is we may need to scrub WebGPUBuffer
16:03:01 ... we have to redesign how to move resources to the GPU; need to think about this more, otherwise we have two ways to do the same thing
16:03:07 Bryan: that matches my understanding
16:03:11 q?
16:03:14 ack Ningxin_Hu
16:05:08 Ningxin_Hu: want to add one point on chained inferences: CPU inference does not need to do re-layout, it can use the internal representation and pass it to the next inference in the chain
16:05:10 q?
16:05:20 q?
16:05:40 q?
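To make the problem statement above concrete, here is a minimal sketch of chained inference with device-resident buffers. The interface shape shown (createBuffer, writeBuffer, readBuffer, dispatch, destroy) uses hypothetical names for illustration only; issue #482 is still under discussion and the eventual API may differ.

    // Hypothetical sketch of the device-based tensor storage idea in issue #482.
    // None of these interfaces or method names are part of the WebNN spec; they
    // only illustrate how chained inference could avoid CPU round-trips.

    interface MLBufferLike { destroy(): void; } // hypothetical
    interface MLGraphLike {}                    // stand-in for a compiled graph
    interface MLContextWithBuffers {            // hypothetical context extension
      createBuffer(desc: { size: number }): MLBufferLike;
      writeBuffer(dst: MLBufferLike, src: ArrayBufferView): void;
      readBuffer(src: MLBufferLike): Promise<ArrayBuffer>;
      dispatch(graph: MLGraphLike,
               inputs: Record<string, MLBufferLike>,
               outputs: Record<string, MLBufferLike>): void;
    }

    async function chainedInference(
      context: MLContextWithBuffers,
      graphA: MLGraphLike,
      graphB: MLGraphLike,
      input: Float32Array
    ): Promise<Float32Array> {
      // Assumes, for simplicity, that all tensors have the same byte size.
      const inputBuf = context.createBuffer({ size: input.byteLength });
      const midBuf = context.createBuffer({ size: input.byteLength });
      const outBuf = context.createBuffer({ size: input.byteLength });

      context.writeBuffer(inputBuf, input);                              // upload once
      context.dispatch(graphA, { input: inputBuf }, { output: midBuf }); // stays on device
      context.dispatch(graphB, { input: midBuf }, { output: outBuf });   // no CPU round-trip
      const result = new Float32Array(await context.readBuffer(outBuf)); // download once

      inputBuf.destroy();
      midBuf.destroy();
      outBuf.destroy();
      return result;
    }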
16:05:50 Topic: Thank you for a transformative 2023!
16:06:14 anssik: Thank you for your major contributions during 2023. Some highlights and milestones from our journey this year:
16:06:19 ... - the WebNN API hit its Candidate Recommendation milestone in March 2023
16:06:34 ... - the WG delivered a substantive spec refresh in Dec 2023, transformers support with v2 ops, an early seasonal gift
16:06:49 ... - super strong progress on implementations across multiple backends, platforms and frameworks
16:07:05 ... - the WG's participation grew by 100% YoY and we merged ~80 PRs into the WebNN API spec
16:07:22 ... As this year draws to a close, we are accelerating into an exciting 2024 from a position of strength
16:07:34 ... I look forward to more great things to come from this WG in 2024, the year of the AI PC
16:07:39 ... Happy Holidays and a Prosperous New Year!
16:07:57 ... relax, recharge, and see you on our next call on 11 January 2024
16:07:59 q?
16:08:12 RRSAgent, draft minutes
16:08:13 I have made the request to generate https://www.w3.org/2023/12/14-webmachinelearning-minutes.html anssik
16:08:34 Present+ Dwayne_Robinson
16:08:36 zkis_ has joined #webmachinelearning
16:08:39 RRSAgent, draft minutes
16:08:41 I have made the request to generate https://www.w3.org/2023/12/14-webmachinelearning-minutes.html anssik
18:16:10 Zakim has left #webmachinelearning