Meeting minutes
anssik: Goal for the first topic is to discuss and triage new issues; identify any reusable constructs, concepts and patterns to be shared with WebNN API.
Device Preferences
anssik: three issues around a common theme of device preferences, touched on briefly at the 6 Apr meeting
WebML CG call 6 Apr 2022: Device Preferences: GPU, power, usage, performance hints
DevicePreference: "GPU" => "GpuRequired" and "GpuPreferred"
"default" v.s. "auto" in MLDevicePreference and MLPowerPreference
anssik: from that discussion, it seems there's interest to try to share API surface between WebNN and Model Loader
… Rafael from Microsoft noted "auto" has precedent and helps mitigate fingerprinting concerns; the browser picks the backend for better or worse (i.e. selection is an implementation detail)
https://
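[Note: a minimal sketch of the two enums under discussion, written as TypeScript unions for illustration; apart from "auto"/"default" and the GpuRequired/GpuPreferred split named in the issues above, the value names and casing are assumptions, not a settled spec:]

  // Sketch only; value names beyond those in the issues above are assumptions.
  type MLDevicePreference = 'auto' | 'cpu' | 'gpu-preferred' | 'gpu-required';
  type MLPowerPreference = 'auto' | 'low-power' | 'high-performance';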
Rafael: we've been burned many times by APIs that give you a bunch of devices
… better to say, "browser, give me an ML device"
… for WebXR they go further, have required and optional hints
anssik: any experiences from WebXR, does their API work, interop issues?
Rafael: people tend to make assumptions sometimes, no major issues known with their approach
… the earlier WebVR API was worse, it required people to do string comparisons etc.
… getUserMedia and WebXR models are best for privacy
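[Note: for reference, the WebXR hint pattern Rafael describes; requestSession with requiredFeatures/optionalFeatures is the actual WebXR API, the specific feature names are just examples:]

  // Required hints reject session creation if unavailable;
  // optional hints degrade gracefully.
  const session = await navigator.xr.requestSession('immersive-vr', {
    requiredFeatures: ['local-floor'],
    optionalFeatures: ['hand-tracking'],
  });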
Honglin: maybe adapting the model to the device is not a problem in terms of privacy?
… exposing GPU specific features, would that be a problem?
… we don't want to leak device information, e.g. what type of GPU is in use
… I think it is not an issue, it is already known what the GPU is
Rafael: reasonable to expose an API that prefers GPU; WebGL supports powerPreference, but it's a wish, not a guarantee
… in Chromium, if you ask for high-power, you actually never get such a battery-sucking context
… low-power, high-power, or no preference are the options
… in an iframe, high-power does nothing
… also failIfMajorPerformanceCaveat exists, example: Google Maps
Honglin: is this the same with WebGPU?
… does this affect WebGPU backend of WebNN?
Rafael: I don't know whether we ignore that for WebNN
Chai: for WebNN, you can create WebNN context either from WebGPU device or its own device
… to interop with WebGPU, you pass the context to WebNN
… standalone WebNN usage is possible, then can provide preferences
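[Note: a sketch of the two creation paths Chai describes, loosely following the WebNN draft of the time; the exact option names and the shape of createContext() are assumptions:]

  // 1) Standalone WebNN context: preference hints apply here.
  const standalone = navigator.ml.createContext({
    devicePreference: 'gpu',      // assumed option name; a hint, not a guarantee
    powerPreference: 'low-power',
  });

  // 2) Context created from a WebGPU device, for WebGPU interop;
  //    device selection is then owned by WebGPU.
  const adapter = await navigator.gpu.requestAdapter();
  if (!adapter) throw new Error('no WebGPU adapter');
  const device = await adapter.requestDevice();
  const fromWebGPU = navigator.ml.createContext(device);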
Honglin: this reminds me of an issue: the model is running in another process, so power management is more complex, e.g. when the renderer is throttled, we probably need to also throttle the other process
Rafael: what do you do when multiple web pages ask for different things, a process per request?
… for power management, for example
<RafaelCintron> https://
Honglin: haven't gone into the details; the plan is to talk with experts
… TFLite has a callback, maybe we can sleep in between; this is being explored
Rafael: when people say throttling, it doesn't always mean running slower; e.g. timeouts are fired less frequently
… when you minimize a tab, Chrome will constrain the CPU usage of the tab
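[Note: a hypothetical sketch of the "sleep in between" idea Honglin mentions; none of these names exist in the prototype, this only illustrates backing off between inference steps while a tab is throttled:]

  interface MLModel {
    compute(input: ArrayBuffer): Promise<ArrayBuffer>;
  }

  // Back off between inference steps while the owning tab is throttled.
  async function runThrottled(
    model: MLModel,
    input: ArrayBuffer,
    isThrottled: () => boolean, // e.g. wired to the renderer's visibility state
  ): Promise<ArrayBuffer> {
    while (isThrottled()) {
      await new Promise((resolve) => setTimeout(resolve, 100));
    }
    return model.compute(input);
  }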
Rafael: TFLite in renderer process? what's limiting that?
Honglin: TFLite is not designed for web usage; adding fuzzers to TFLite is a pending task, security comes first; TFLite is also evolving quickly, so maybe not ready for the renderer
<RafaelCintron> https://
Honglin: the renderer has its own memory management, security sandbox etc.
Rafael: evolving quickly, can you clarify?
Honglin: I don't think any ML packages are stable, they are evolving quickly
Chai: in my day job we work with ISVs; in ML across hardware vendors there's significant overlap, so we can build an abstract API on top
… if we look at transformers, yes, that is an area that evolves quickly
… for the most part, e.g. Computer Vision is settled, a lot of models there
… significant overlap what the web can take advantage of
… we can add stuff and break some stuff along the way
… we should look at native; the web is very basic and has room to catch up there
Honglin: many technologies used to implement ML capabilities are evolving; the ML foundations are stable, I agree
… what we choose now is the lowest-effort approach: fail quickly and course-correct to gather developer feedback
… Chrome OS has a lot of users, so it's a good platform to gather feedback
… I'm working on GPU support for Model Loader API currently
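[Note: a hypothetical end-to-end shape of the Model Loader API under discussion; MLModelLoader, load() and compute() are assumptions based on the explainer, not a settled surface:]

  // All names below are assumptions for illustration.
  const context = navigator.ml.createContext({ devicePreference: 'auto' });
  const loader = new MLModelLoader(context);
  const modelBytes = await (await fetch('model.tflite')).arrayBuffer();
  const model = await loader.load(modelBytes);
  const outputs = await model.compute({ input: new Float32Array([/* ... */]) });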
Chai: stable vs unstable, I meant the platform API is stable, not models
… in the beginning, browsers ran on platform APIs on Mac and PC; the browser has been successful in abstracting the capabilities of the OS platforms, e.g. 2D Canvas
… 2D graphics was developed enough when 2D Canvas was brought to the web, and it became successful
… we can draw parallels to current platform APIs for ML
… with native support in the OS, the OS can do a better job in performance and hardware conformance, because OS teams work with hardware vendors, and the web should use these hardware capabilities
… for example, CoreML does conv, matmul etc. similarly to Windows, Linux, others
… conformance work done in the OS can benefit the web, portability is part of the Web API promise
… that companies continue to invest in platform APIs is a clear signal that the browser should take advantage of these platform APIs, regardless of what the next shiny object is
anssik: there's also precedent in HTML media, canPlayType() method that must return "probably" if the user agent is confident that the type represents a media resource that it can render if used in with this audio or video element; and it must return "maybe" otherwise.
… this may not be the greatest API, but it is out there and widely supported
videoElement.canPlayType() method from HTML spec
… we could evaluate whether a similar "canComputeModel()" method would work here
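[Note: the canPlayType() precedent for reference, plus the hypothetical analogue Anssi suggests; canComputeModel() and its argument are invented here for illustration, not a proposal:]

  // Existing precedent: returns '', 'maybe' or 'probably'.
  const video = document.createElement('video');
  const answer = video.canPlayType('video/mp4; codecs="avc1.42E01E"');

  // Hypothetical analogue for models (name and signature invented here):
  // navigator.ml.canComputeModel('model/tflite');  // '' | 'maybe' | 'probably'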
Rafael: my understanding is that after you pick your preference any model would work
… now if we'd target a large diversity of hardware, and some ops wouldn't work at all, I think it'd be better to have a set of ops for which you could say "yes, I can do that"
Chai: I don't have many thoughts on this yet
Versioning
anssik: we merged this issue into Standard model format issue https://
… any concerns?
Honglin: Jonathan should own this issue
Standard model format
TFLite FlatBuffers format (commit history)
anssik: I have one question, what is the process by which the TFLite FlatBuffers format is evolved?
… for example, new features landed recently to add RELU_0_TO_1 and uint16 support
Honglin: I don't know exactly, the TFLite people need to speak to that
anssik: I'm interested because of the provision we have in the WebML WG charter:
Note: The Model Loader API needs a standard format supported across browsers and devices for broad interoperability. The Working Group will only create a standard format of its own if no other standardized format aligns with the group's principles nor allows the group to control or provide direct input to shape it. The Working Group will only start working on this API when there is agreement on such a format.
https://
… "The Working Group will only create a standard format of its own if no other standardized format [...] allows the group to control or provide direct input to shape it"
… is the process by which TFLite FlatBuffers format is evolved community-driven and accepts direct input?
Rafael: not sure, but my bet is maybe no, it is a Google thing; the same goes for ONNX Runtime, where one format is the standard ONNX format, but it also supports a proprietary format specific to ONNX Runtime
… we should at least have a standard format every browser can adopt