Meeting minutes
anssik: Goal for the first topic is to discuss and triage new issues; identify any reusable constructs, concepts and patterns to be shared with WebNN API.
Device Preferences
anssik: three issues around a common theme of device preferences, touched on briefly at the 6 Apr meeting
WebML CG call 6 Apr 2022: Device Preferences: GPU, power, usage, performance hints
DevicePreference: "GPU" => "GpuRequired" and "GpuPreferred"
"default" v.s. "auto" in MLDevicePreference and MLPowerPreference
anssik: from that discussion, it seems there's interest to try to share API surface between WebNN and Model Loader
… Rafael from Microsoft noted "auto" has precedent and helps mitigate fingerprinting concerns; the browser picks the backend for better or worse (i.e. selection is an implementation detail)
https://
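[Note: a minimal sketch of the two enums under discussion, written as TypeScript unions for illustration; apart from "auto"/"default" and the GpuRequired/GpuPreferred split named in the issues above, the value names and casing are assumptions, not a settled spec:]

  // Sketch only; value names beyond those in the issues above are assumptions.
  type MLDevicePreference = 'auto' | 'cpu' | 'gpu-preferred' | 'gpu-required';
  type MLPowerPreference = 'auto' | 'low-power' | 'high-performance';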
Rafael: we've been burned many times by APIs that give you a bunch of devices
… better to say, "browser, give me an ML device"
… for WebXR they go further, have required and optional hints
anssik: any experiences from WebXR, does their API work, interop issues?
Rafael: people tend to make assumptions sometimes, no major issues known with their approach
… the earlier WebVR API was worse, it required people to do string comparisons etc.
… getUserMedia and WebXR models are best for privacy
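[Note: for reference, the WebXR hint pattern Rafael describes; requestSession with requiredFeatures/optionalFeatures is the actual WebXR API, the specific feature names are just examples:]

  // Required hints reject session creation if unavailable;
  // optional hints degrade gracefully.
  const session = await navigator.xr.requestSession('immersive-vr', {
    requiredFeatures: ['local-floor'],
    optionalFeatures: ['hand-tracking'],
  });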
Honglin: maybe adapting the model to the device is not a problem in terms of privacy?
… exposing GPU specific features, would that be a problem?
… we don't want to leak device information, e.g. what type of GPU is in use
… I think it is not an issue, it is already known what the GPU is
Rafael: reasonable to expose an API that prefers GPU; WebGL supports powerPreference, but it's a wish, not a guarantee
… in Chromium, if you ask for high-power, you actually never get such a battery-sucking context
… low-power, high-power, or no preference are the options
… in an iframe, high-power does nothing
… also failIfMajorPerformanceCaveat exists, example: Google Maps
Honglin: is this the same with WebGPU?
… does this affect WebGPU backend of WebNN?
Rafael: I don't know whether we ignore that for WebNN
Chai: for WebNN, you can create WebNN context either from WebGPU device or its own device
… to interop with WebGPU, you pass the context to WebNN
… standalone WebNN usage is possible, then can provide preferences
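[Note: a sketch of the two creation paths Chai describes, loosely following the WebNN draft of the time; the exact option names and the shape of createContext() are assumptions:]

  // 1) Standalone WebNN context: preference hints apply here.
  const standalone = navigator.ml.createContext({
    devicePreference: 'gpu',      // assumed option name; a hint, not a guarantee
    powerPreference: 'low-power',
  });

  // 2) Context created from a WebGPU device, for WebGPU interop;
  //    device selection is then owned by WebGPU.
  const adapter = await navigator.gpu.requestAdapter();
  if (!adapter) throw new Error('no WebGPU adapter');
  const device = await adapter.requestDevice();
  const fromWebGPU = navigator.ml.createContext(device);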
Honglin: this reminds me of an issue: the model is running in another process, so power management is more complex, e.g. when the renderer is throttled, we probably need to also throttle the other process
Rafael: what do you do when multiple web pages ask for different things, a process per request?
… for power management, for example
<RafaelCintron> https://
Honglin: haven't gone into the details; the plan is to talk with experts
… TFLite has a callback, maybe we can sleep in between; this is being explored
Rafael: when people say throttling, it doesn't always mean running slower; e.g. timeouts are fired less frequently
… when you minimize a tab, Chrome will constrain the CPU usage of the tab
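[Note: a hypothetical sketch of the "sleep in between" idea Honglin mentions; none of these names exist in the prototype, this only illustrates backing off between inference steps while a tab is throttled:]

  interface MLModel {
    compute(input: ArrayBuffer): Promise<ArrayBuffer>;
  }

  // Back off between inference steps while the owning tab is throttled.
  async function runThrottled(
    model: MLModel,
    input: ArrayBuffer,
    isThrottled: () => boolean, // e.g. wired to the renderer's visibility state
  ): Promise<ArrayBuffer> {
    while (isThrottled()) {
      await new Promise((resolve) => setTimeout(resolve, 100));
    }
    return model.compute(input);
  }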
Rafael: TFLite in renderer process? what's limiting that?
Honglin: TFLite is not designed for web usage; adding fuzzers to TFLite is a pending task, security comes first; TFLite is also evolving quickly, so maybe not ready for the renderer
<RafaelCintron> https://
Honglin: the renderer has its own memory management, security sandbox etc.
Rafael: evolving quickly, can you clarify?
Honglin: I don't think any ML packages are stable, they are evolving quickly
Chai: in my day job we work with ISVs; in ML across hardware vendors there's significant overlap, so we can build an abstract API on top
… if we look at transformers, yes, that is an area that evolves quickly
… for the most part, e.g. Computer Vision is settled, a lot of models there
… significant overlap what the web can take advantage of
… we can add stuff and break some stuff along the way
… we should look at native; the web is very basic and has room to catch up there
Honglin: many technologies used to implement ML capabilities are evolving; the ML foundations are stable, I agree
… what we choose now is the lowest-effort approach: fail quickly and course-correct to gather developer feedback
… Chrome OS has a lot of users, so it's a good platform to gather feedback
… I'm working on GPU support for Model Loader API currently
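[Note: a hypothetical end-to-end shape of the Model Loader API under discussion; MLModelLoader, load() and compute() are assumptions based on the explainer, not a settled surface:]

  // All names below are assumptions for illustration.
  const context = navigator.ml.createContext({ devicePreference: 'auto' });
  const loader = new MLModelLoader(context);
  const modelBytes = await (await fetch('model.tflite')).arrayBuffer();
  const model = await loader.load(modelBytes);
  const outputs = await model.compute({ input: new Float32Array([/* ... */]) });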
Chai: stable vs unstable, I meant the platform API is stable, not models
… in the beginning, browsers ran on platform APIs on Mac and PC; the browser has been successful in abstracting the capabilities of the OS platforms, e.g. 2D Canvas
… 2D graphics was developed enough when 2D Canvas was brought to the web, and it became successful
… we can draw parallels to current platform APIs for ML
… with native support in the OS, the OS can do a better job in performance and hardware conformance, because OS teams work with hardware vendors, and the web should use these hardware capabilities
… for example, CoreML does conv, matmul etc. similarly to Windows, Linux, others
… conformance work done in the OS can benefit the web, portability is part of the Web API promise
… that companies continue to invest in platform APIs is a clear signal that the browser should take advantage of these platform APIs, regardless of what the next shiny object is
anssik: there's also precedent in HTML media, canPlayType() method that must return "probably" if the user agent is confident that the type represents a media resource that it can render if used in with this audio or video element; and it must return "maybe" otherwise.
… this may not be the greatest API, but it is out there and widely supported
videoElement.canPlayType() method from HTML spec
… we could evaluate whether a similar "canComputeModel()" method would work here
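[Note: the canPlayType() precedent for reference, plus the hypothetical analogue Anssi suggests; canComputeModel() and its argument are invented here for illustration, not a proposal:]

  // Existing precedent: returns '', 'maybe' or 'probably'.
  const video = document.createElement('video');
  const answer = video.canPlayType('video/mp4; codecs="avc1.42E01E"');

  // Hypothetical analogue for models (name and signature invented here):
  // navigator.ml.canComputeModel('model/tflite');  // '' | 'maybe' | 'probably'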
Rafael: my understanding is that after you pick your preference any model would work
… now if we'd target a large diversity of hardware, and some ops wouldn't work at all, I think it'd be better to have a set of ops for which you could say "yes, I can do that"
Chai: I don't have many thoughts on this yet
Versioning
anssik: we merged this issue into Standard model format issue https://
… any concerns?
Honglin: Jonathan should own this issue
Standard model format
TFLite FlatBuffers format (commit history)
anssik: I have one question, what is the process by which the TFLite FlatBuffers format is evolved?
… for example, new features landed recently to add RELU_0_TO_1 and uint16 support
Honglin: I don't know exactly, the TFLite people need to speak to that
anssik: I'm interested because of the provision we have in the WebML WG charter:
Note: The Model Loader API needs a standard format supported across browsers and devices for broad interoperability. The Working Group will only create a standard format of its own if no other standardized format aligns with the group's principles nor allows the group to control or provide direct input to shape it. The Working Group will only start working on this API when there is agreement on such a format.
https://
… "The Working Group will only create a standard format of its own if no other standardized format [...] allows the group to control or provide direct input to shape it"
… is the process by which TFLite FlatBuffers format is evolved community-driven and accepts direct input?
Rafael: not sure, but my bet is maybe no, it is a Google thing; the same goes for ONNX Runtime, where one format is the standard ONNX format, but it also supports a proprietary format specific to ONNX Runtime
… we should at least have a standard format every browser can adopt