Rafael: F2F minutes were clear, discussed with Apple at WebGPU F2F
… spoke with Myles Maxfield, who told me Apple favors an API that is not a WebGPU extension
… otherwise it would be easy for developers to misuse the API
jdarpinian: also talked to Myles, and I think he did not know whether their hardware allows sharing buffers between the GPU and ML hardware
… a WebGPU extension does not necessarily mean buffers are allocated on the GPU
… it would be good to be able to specify "I want to use this buffer for ML"
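A rough sketch of the kind of buffer hint being discussed; the ML usage flag below is hypothetical and not part of WebGPU:

```typescript
// Hypothetical sketch only: illustrates "I want to use this buffer for ML".
// GPU_BUFFER_USAGE_ML_TENSOR is NOT a real WebGPU flag; it stands in for
// whatever hint would let the implementation place the allocation where both
// the GPU and the ML accelerator can reach it.
const GPU_BUFFER_USAGE_ML_TENSOR = 0x10000; // placeholder value

const adapter = await navigator.gpu.requestAdapter();
const device = await adapter!.requestDevice();

const inputTensor = device.createBuffer({
  size: 4 * 3 * 224 * 224, // e.g. a float32 image tensor
  usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_DST | GPU_BUFFER_USAGE_ML_TENSOR,
});
```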
anssik: are there minutes from WebGPU?
jdarpinian: can look into the minutes
Paul: Microsoft also has custom hardware for ML offloading that does not share GPU buffers, so we must support the scenario of non-GPU hardware that cannot share buffers
jdarpinian: regarding buffer sharing, we'll want an API that does not require shared buffers and is not WebGPU-based; that does not necessarily mean we shouldn't investigate WebGPU-based APIs, since GPUs keep gaining ML-oriented features
… also, it still might be simpler to ship a WebGPU-based API even if it would not perform as well on every platform, e.g. on those that cannot share buffers
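To make the two directions concrete, a hypothetical sketch; none of these interface names come from a spec or the POC:

```typescript
// Hypothetical sketches of the two API shapes under discussion.

// Option A: a standalone ML API that does not assume GPU buffer sharing.
interface StandaloneMLContext {
  // Inputs/outputs travel as plain ArrayBuffers; the implementation is free
  // to target CPU, GPU, or a dedicated accelerator behind the scenes.
  compute(graph: unknown,
          inputs: Record<string, ArrayBuffer>): Promise<Record<string, ArrayBuffer>>;
}

// Option B: a WebGPU extension, where ML work is recorded alongside GPU work
// and tensors live in GPUBuffers (only viable where sharing is possible).
interface WebGPUMLExtension {
  computeWithBuffers(graph: unknown,
                     inputs: Record<string, GPUBuffer>,
                     outputs: Record<string, GPUBuffer>): void;
}
```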
Ningxin_Hu: questions re the WebGPU F2F: James mentioned a WebGPU extension; did you also discuss a WebGL extension at the WebGPU F2F?
jdarpinian: a WebGL extension was not discussed directly
… the VulkanML F2F had discussions on MLIR and TVM
… no meta-command API is going into Vulkan; instead they prefer exposing lower-level primitives that let shaders access the tensor cores of today's GPUs, so developers can write their own kernels and do kernel fusion
… not sure if that direction makes sense for us, just a data point
anssik: is anyone from VulkanML interested in participating in this group?
jdarpinian: it would be nice to get more hardware vendors, e.g. ARM and Qualcomm, as participants here
https://www.w3.org/2019/Talks/dhm-ml-workshop/standardization.html
https://www.w3.org/2019/Talks/dhm-ml-workshop/
https://github.com/webmachinelearning/webnn/blob/master/explainer.md
https://github.com/immersive-web/webxr/blob/master/explainer.md
<jdarpinian> webgpu face to face meeting minutes, ML mentioned briefly: https://docs.google.com/document/d/1CmKo59tjZwmePVrFpHpIG0W5shKR_GOrnNuMStPCEko/edit
<Ningxin_Hu> https://github.com/webmachinelearning/webnn/issues/6#issuecomment-536408448
Ningxin_Hu: after the F2F, I added details of the investigations to issue #6 on WebGPU buffer sharing
… we have an Apple MPS POC with a Metal backend; WebNN can compile a subgraph for the WebGPU device
[Ningxin recaps WebNN investigation from F2F]
Ningxin_Hu: we need to extend the WebNN API to allow computing subgraphs, to avoid moving data across devices
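A hypothetical sketch of the subgraph-compilation idea described in issue #6; the navigator.ml, createModel, compile, and compute names are placeholders, not the POC's actual API:

```typescript
// Hypothetical sketch of "compile a subgraph for the WebGPU device" from
// issue #6. The navigator.ml / createModel / compile / compute names are
// placeholders for whatever the POC exposes, not a specified API.
const adapter = await navigator.gpu.requestAdapter();
const device = await adapter!.requestDevice();

// Tensors stay in GPU-visible storage buffers, produced/consumed by shaders.
const input = device.createBuffer({ size: 1024, usage: GPUBufferUsage.STORAGE });
const output = device.createBuffer({ size: 1024, usage: GPUBufferUsage.STORAGE });

// Compile the ML subgraph against the same GPUDevice so the implementation
// can keep intermediate data on-device instead of round-tripping via the CPU.
const model = await (navigator as any).ml.createModel(/* graph description */);
const compiled = await model.compile({ device });  // placeholder
compiled.compute({ input }, { output });           // placeholder
```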
anssik: do Ningxin's POC results align with Apple's concerns re buffer sharing?
Rafael: interested in hearing Ningxin's view on the performance delta in this scenario
Ningxin_Hu: the POC investigations were on a MacBook Pro, which does not have dedicated ML hardware
… tests exercise WebGPU compute shaders and Metal compute shaders
Ningxin_Hu: is this a reasonable requirement: we want WebNN to compile to dedicated ML hardware, test WebGPU compute shaders exchanging data with it, and profile the performance of buffer sharing
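A rough sketch of what that profiling could look like; runMLSubgraph and the pipeline/bind group setup are placeholders supplied by the POC, only the timing pattern is the point:

```typescript
// Rough sketch: time a round trip where a WebGPU compute pass produces a
// tensor that the ML subgraph then consumes through a shared buffer.
async function profileSharedBufferRoundTrip(
  device: GPUDevice,
  pipeline: GPUComputePipeline,
  bindGroup: GPUBindGroup,
  runMLSubgraph: () => Promise<void>,   // placeholder for the POC's ML call
): Promise<number> {
  const t0 = performance.now();

  const encoder = device.createCommandEncoder();
  const pass = encoder.beginComputePass();
  pass.setPipeline(pipeline);           // e.g. a pre-processing shader
  pass.setBindGroup(0, bindGroup);
  pass.dispatchWorkgroups(64);
  pass.end();
  device.queue.submit([encoder.finish()]);

  await runMLSubgraph();                // consumes the shared GPUBuffer
  await device.queue.onSubmittedWorkDone();

  return performance.now() - t0;        // one data point on buffer-sharing cost
}
```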
<PaulM_> Can we use Intel ML chips as a test case?
Paul: you're looking for hardware to prove this out?
Ningxin_Hu: re future POC requirements: 1) choose dedicated ML hardware to test with, 2) decide which data points we want
Paul: I like data-driven design as proposed by Anssi
Ningxin_Hu: we have Movidius VPU in our POC via OpenVINO on Linux
… we could probably have similar setup on Windows through DirectML
Paul: that sounds awesome, let's follow up off this call
<Ningxin_Hu> POC repo: https://github.com/otcshare/chromium-src
jdarpinian: a comment on using Movidius: these are often connected over USB, which implies bandwidth constraints
… PCI Express would be better
Ningxin_Hu: the previous setup was with USB, but the current hardware is on PCI Express
Explore custom op support via a DSL
Kai: sort of interested, but not up to speed with it
Support compiling and executing ops for different devices, CPU or GPU
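A hypothetical sketch of what DSL-based custom op support might look like; registerCustomOp and the expression DSL are illustrative only, not a proposal from the group:

```typescript
// Hypothetical sketch only: one DSL definition that an implementation could
// lower to either a CPU kernel or a GPU compute shader. registerCustomOp and
// the expression syntax are placeholders, not an actual API.
declare function registerCustomOp(def: {
  name: string;
  inputs: string[];
  expression: string;   // element-wise math DSL, compiled per device
}): unknown;

const gelu = registerCustomOp({
  name: "gelu",
  inputs: ["x"],
  // Approximate GELU written in the hypothetical DSL.
  expression: "0.5 * x * (1.0 + tanh(0.7978846 * (x + 0.044715 * x*x*x)))",
});
```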
<PaulM_> Need to drop off.
Maybe present: anssik, jdarpinian, Kai, paul, Rafael