W3C

– DRAFT –
WebML CG Teleconference – 4 March 2021

04 March 2021

Attendees

Present
Anssi_Kostiainen, Chai_Chaoweeraprasit, Ganesan_Ramalingam, Ningxin_Hu, Ping_Yu, Rafael_Cintron, Zoltan_Kis
Regrets
-
Chair
Anssi
Scribe
Anssi, anssik

Meeting minutes

TAG review feedback - open PRs

anssik: a couple of TAG review issues have PRs in review; let's discuss those.

[tag-tracker] NamedOutput mechanism clarification #140

NamedOutput mechanism clarification #140

PR #147

anssik: PR #147 seems to be pending review from Ping and Chai. Any initial reactions?

ningxin_hu: this PR tries to use the W3C convention of describing algorithms in steps, similar to other web specs
… for the compute method, which interacts with the device, I read the WebGPU spec; many of its methods are similar and use a timeline concept, so I reused that concept here, to help with spec writing and to make it easier to read for people familiar with WebGPU spec conventions
… no feedback yet from WebGPU participants; looking forward to that feedback.
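
[Scribe note: for context, a minimal sketch of the named-inputs/named-outputs compute call whose steps PR #147 writes out; the compilation object and the exact dictionary shapes are illustrative of the draft, not normative.]

    // Hypothetical sketch of compute() with named inputs and outputs.
    // The draft describes these steps against content and device
    // timelines, reusing the timeline concept from the WebGPU spec.
    const inputs = { x: { buffer: new Float32Array([1, 2, 3, 4]) } };
    const outputs = await compilation.compute(inputs);
    console.log(outputs.y.buffer);  // results are keyed by output name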

<RafaelCintron> I do not have immediate feedback but will review this week.

<ping_yu> will do

[tag-tracker] WebGL and WebGPU interops (related issues #135, #141, #136)

[tag-tracker] Create context from external sources #135

[tag-tracker] Prefix generic interface names #141

[tag-tracker] Switch to use constructor #136

anssik: issues #135, #141, #136 are addressed by PR #149

PR #149

anssik: Chai's PR #149 comment has a summary:
… - Create context from external sources e.g. WebGLRenderingContext and WebGPU device (#135)
… - Make power preference part of context creation options.
… - Constant operands can be created from either WebGL or WebGPU buffers
… - Model inputs and outputs can be bound with WebGL or WebGPU textures
… - Prefix all types with "ML". Simplify "NeuralNetworkContext" to just "MLContext" (#141)
… - Switch to use constructor for MLModelBuilder instead of factory method (#136)

Chai: the purpose, as the name says, is to provide interop with WebGL and WebGPU, and also zero-copy interop done efficiently
… the TAG reviewer mentioned that the way we deal or do not deal with the device might be a problem
… up to this point we haven't exposed the device concept
… we can interop with a device even without such a notion; the first aspect people should look at is the way the NeuralNetworkContext is created, it is called MLContext now
… we can create it with a WebGLRenderingContext or WebGPUDevice; I chose Device over Adapter since WebGPU has a way to choose an adapter, and choosing an adapter is pretty important, e.g. when you have multiple adapters in your system it is up to either the framework or the web developer to choose the right one
… we also want to create our context the way we want; looking at the current API, where the power preference is a compilation option, we need to frontload that to when the context is created
… so I made it part of MLContextOptions; web developers now need to deal with the device earlier in the flow, and the benefit is that this is consistent with the way we interop with the device
… if we create the context ourselves, the model is otherwise inconsistent with the device
… that's a minor change introduced by this PR; the rest is just common sense, you can create a context from a device and bind to it
… the input to the model can be WebGL or WebGPU textures in many cases, due to the visual input use cases
… we can use WebGPUBuffers as input, per Ningxin's suggestion, and that sounds good to me
… about the model constants, they are limited to buffers; all the GPU vendors deal with buffers as constants, conv weights etc., and no GPU vendor supports textures as constants due to the complexity
… all the constants in the model are backed by buffers
… we used to use a factory method; we now use a constructor instead since it is simpler and aligns with the API Design Guidelines
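
[Scribe note: a rough sketch of the flow Chai describes, assuming the draft shapes in PR #149; method names such as navigator.ml.createContext are illustrative, not final.]

    // Create an MLContext from an existing WebGPU device so resources
    // can be shared with WebNN without copies.
    const adapter = await navigator.gpu.requestAdapter();
    const device = await adapter.requestDevice();
    const context = navigator.ml.createContext(device);

    // Alternatively, frontload the power preference at context creation
    // via MLContextOptions instead of at compile time:
    //   const context = navigator.ml.createContext({ powerPreference: 'low-power' });

    // MLModelBuilder is now constructed directly rather than obtained
    // from a factory method.
    const builder = new MLModelBuilder(context);

    // Constant operands (e.g. conv weights) can be created from a
    // WebGPU buffer that is already on the device.
    const weights = builder.constant(
        { type: 'float32', dimensions: [3, 3, 32, 32] }, weightsGPUBuffer);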

RafaelCintron: in general I'm supportive of Chai's PR #149
… we do not need to solve this in this PR, but we should look at how the promises in the WebNN API interact with these other APIs we take a dependency on
… while the promise is off doing its thing, if WebGL or WebGPU buffers are changed, that might introduce a race condition
… a web developer perhaps expects a more deterministic API
… we may have to modify this a bit; it may not be as seamless as we think to go back and forth between WebNN and WebGL/WebGPU buffers
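
[Scribe note: a sketch of the potential race Rafael describes; binding a WebGPU buffer directly as a model input is illustrative of the PR, not final API.]

    // While compute() is still pending, the shared WebGPU buffer that
    // backs an input is overwritten; the result may then reflect either
    // the old or the new contents, i.e. a data race.
    const pending = compilation.compute({ x: sharedGPUBuffer });
    device.queue.writeBuffer(sharedGPUBuffer, 0, newData);  // racy write
    const outputs = await pending;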

Chai: great idea, maybe create an issue for this Rafael?

Chai: PR is pretty loaded, so can address Rafael's comments separately

ningxin_hu_: according to my understanding, the MLModel is device agnostic?
… do we still need to keep the MLModel device agnostic, or leave the device for compile time?
… for example, the power preference was previously defined at compile time
… the web developer had to build a model in a device-agnostic way and compile it for different devices or power preferences

Chai: whether the model should be device agnostic is an interesting question
… when you create a model builder, which is consistent with WebGPU, the device is set up as the first thing
… in this approach, the model cannot be completely out of the context of the device
… because the device is specified at the beginning
… you can think of the compile step, already equipped with a device, as essentially deciding how to fuse operations and how to do layout assignments
… it does not separate out a device-agnostic phase from compilation, which can always be device specific

<ping_yu> model topology should be device agnostic, compilation should be device specific.

Chai: the model may not be agnostic to the device; maybe the caller can use WebNN to build a graph and then serialize it out
… you can do this for MLModel, but when frontloading the device it is hard to mentally think of objects coming later as device agnostic
… we can spec that "MLModel is device agnostic" but in practice it may not be

ningxin_hu_: do we need to put the device in the compile API instead?

Chai: the issue is you may want to construct a model by giving it device-specific constants
… imagine building a model off of a WebGLRenderingContext: you'd have to call MLContext.constant, so you are dealing with a device resource even at the time you construct the model
… if you interop with an external device, these constants are already device resources
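
[Scribe note: a sketch of the interop case Chai describes, where the weights already live on the device at graph-construction time; names follow PR #149 but are illustrative.]

    // Weights were uploaded to the GPU earlier, e.g. by a WebGL-based
    // framework. Building the graph directly from that buffer avoids
    // reading the weights back to the CPU and re-uploading them.
    const context = navigator.ml.createContext(glContext);  // WebGLRenderingContext
    const builder = new MLModelBuilder(context);
    const input = builder.input('input',
        { type: 'float32', dimensions: [1, 224, 224, 3] });
    const filter = builder.constant(
        { type: 'float32', dimensions: [3, 3, 3, 16] }, weightsWebGLBuffer);
    const output = builder.conv2d(input, filter);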

ningxin_hu_: it makes sense if, like in your PR, constants can be created from WebGL/WebGPU buffers

<ping_yu> where the constant lives should not prevent creating a device-agnostic model

ningxin_hu_: I understand that interop for executing and computing the model, interacting with device buffers, is useful
… I don't understand model creation with device resources; what's the use case for that?

<ping_yu> I have some comments

Chai: there are cases where we interop with a scenario in which the data is already on the device as a device resource; in those cases WebGL frameworks upload to the GPU and then construct the graph
… there are cases where you construct the graph when all the weights have already been uploaded to the GPU
… I agree the first thing to do is to construct the model, but there are also cases, especially for interop, where the weights are already in a GPU buffer
… and you don't want to do a round-trip

ningxin_hu_: thanks for the explanation

Ping: I like this conversation and align with Ningxin's proposal: the model should be device agnostic
… an example from TF.js: from the user's point of view, you do not know whether a constant is on the GPU
… in our framework we do the data transfer for them; we will not prevent that
… I feel the model should be device agnostic and the execution device specific
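
[Scribe note: for context, roughly how TF.js keeps model code device agnostic, per Ping's description; this uses the real TF.js API, where the backend is selected separately from the model definition.]

    import * as tf from '@tensorflow/tfjs';

    // The model definition says nothing about devices...
    const model = tf.sequential();
    model.add(tf.layers.dense({ units: 10, inputShape: [4] }));

    // ...the execution backend is chosen separately, and the framework
    // moves constants and tensors between CPU and GPU as needed.
    await tf.setBackend('webgl');  // or 'cpu', 'wasm'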

Chai: a quick comment, as I mentioned, it is a design tradeoff; if we want the model to be device agnostic, all device interop must be done later, after the model has been constructed
… whereas we also want to allow maximum interop when the model weights are already uploaded
… I'll think about this a bit; my hunch is you either support the proposed interop design or you don't, you cannot revert and do it differently

[tag-tracker] Explainer feedback

<chai> <need to be away from kbd>

[tag-tracker] Explainer update per TAG review feedback #146

PR #148

anssik: in summary, PR #148 notes use cases up front, clarifies AI & ML terminology usage, adds a high-level comparison table for explaining the positioning of WebNN API vs. Model Loader API.

<chai> <back now>

anssik: any comments, OK to merge?

TAG review feedback - issues with active discussion

<chai> yes looks good. i've signed off on it

anssik: The ask was to take a look at these issues and make sure you provide your feedback in comments. The editors will take your feedback into consideration when crafting PRs for these issues.
… let's peek into these one by one to get reactions

[tag-tracker] Isomorphic JS story, worker scope exposure?

[tag-tracker] Isomorphic JS story, worker scope exposure? #142

anssik: In response to Rafael's comment on Node and WebGL exposure, Sangwhan made the point that Foreign Function Interface (FFI) bindings to a compute backend such as CUDA would be feasible.
… Ningxin points out that the webnn-polyfill CPU backend already works in Node.js for testing purposes. Also, the Node.js binding to the proposed webnn-native should work.
… the discussion seems to be pending Sangwhan's further comments?
… any thoughts from folks on the call?

[tag-tracker] Prefix generic interface names?

[tag-tracker] Prefix generic interface names? #141

anssik: the feedback from the TAG was: "A lot of the names are very generic (Operand, Compilation) - this feels like something we might want to prefix with something or synchronize with TC39 about"
… and we actually have a PR for this issue thanks to Chai:

PR #149

anssik: Chai, want to walk us through the PR?

Chai: I simply took a cue from the ML attribute and used that as a prefix for consistency: Operand -> MLOperand, NeuralNetworkContext -> MLContext
… when I look at WebGPU specs, they use GPU as a prefix
… same with WebGL, prefixed with WebGL

anssik: any comments re naming?

[tag-tracker] Ergonomics of the JS examples

Ergonomics of the JS examples #139

anssik: TAG feedback was: "While the limitations of JavaScript probably contribute a lot to this, but the ergonomics of this API based on example code might have room for improvement."
… Sangwhan adds some more details in comments: "unless the compute graph's topology is complicated (see: Inception V3...) reading the code for a sequential model should give you a rough idea of what the model is doing, but right now this is difficult."
… I think a key design choice the group took early on was to position the API with a JS framework as its key consumer

anssik: thoughts on ergonomics?
… to address this issue, we could note in the spec that the key consumer of the API is JS frameworks
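
[Scribe note: a minimal sketch of what a small model looks like under the current builder-style API, for the ergonomics discussion; op names follow the draft spec, while w1, b1, w2, b2 are constant operands created elsewhere and createModel is illustrative.]

    // A two-layer perceptron expressed as explicit graph construction.
    // Frameworks are expected to generate code like this; hand-written,
    // it is more verbose than a layers-style API.
    const x = builder.input('x', { type: 'float32', dimensions: [1, 4] });
    const h = builder.relu(builder.add(builder.matmul(x, w1), b1));
    const y = builder.add(builder.matmul(h, w2), b2);
    const model = builder.createModel({ y });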

zkis: just a quick question: is it necessary to expose NeuralNetworkContext now that we have a constructor for creating the model?
… is there a reason for multiple contexts?

Chai: quickly, the notion of a context is basically to keep the global state
… if you think about the implementation of this API in the browser, it needs to keep the notion of a device in the GPU case
… otherwise the lifetime is not very clear
… low-level APIs come with a notion of a context

zkis: is it possible to have an internal slot that developers don't use directly? Context could be internal

Chai: for flexibility, the developer may want to say "for this specific model I want to use this specific device"; if we don't want to give people that option, then the ML attribute could be its own context; it depends how much we want people to interop between WebNN and the WebGL/WebGPU APIs

zkis: earlier we had a similar problem elsewhere, e.g. WebNFC with multiple adapters
… before, the context was owned by the model builder; now that the builder is a constructor, we could have a default context
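
[Scribe note: a sketch of the two shapes discussed, an explicit context for device interop versus a default context behind the ml attribute; both are illustrative.]

    // Option A: explicit context, chosen per model and device.
    const context = navigator.ml.createContext(gpuDevice);
    const builderA = new MLModelBuilder(context);

    // Option B: the context stays internal and the ml attribute acts as
    // a default; developers never handle a context object directly.
    const builderB = new MLModelBuilder();  // default device selection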

[tag-tracker] String enum for activations

String enum for activations #138

anssik: TAG says: "If there are layers that will be taking activations as string enums, there should simply be a string enum for activations rather than have it just in RecurrentNetworkActivation. (One may argue that hyperbolic tangent is RNN specific, but..."
… perhaps we could note this detail in the spec as an informative note for readers?
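
[Scribe note: a sketch of what a shared string enum for activations might look like, per the TAG's suggestion; the enum name MLActivation and the gru options shown are hypothetical.]

    // WebIDL-style sketch (hypothetical):
    //   enum MLActivation { "relu", "sigmoid", "tanh" };
    // Any op taking an activation would then accept the same strings:
    const out = builder.gru(input, weight, recurrentWeight, steps,
                            hiddenSize, { activations: ['sigmoid', 'tanh'] });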

anssik: comments?

[tag-tracker] Clarify which view/reshape like function are expected to copy

[tag-tracker] Clarify which view/reshape like function are expected to copy #137

anssik: TAG said: "I see quite a few view/reshape like functions, which of these are expected to copy and which are not? Probably good to note this in the spec."
… Sangwhan asks whether it wouldn't make sense to define a common term for logical tensor changes (e.g. views?) early in the document so that the concept can be reused
… this was in response to a review of PR #144, which proposes to address issue #137
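
[Scribe note: illustrating the TAG's question; which of these return views and which copy is exactly what PR #144 clarifies, so the comments below state the question, not the answer.]

    const t = builder.input('t', { type: 'float32', dimensions: [2, 3, 4] });
    const r = builder.reshape(t, [6, 4]);              // view or copy?
    const s = builder.slice(t, [0, 0, 0], [1, 3, 4]);  // view or copy?
    const p = builder.transpose(t);                    // layout change: copy?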

Chai: I did PR #144; I added some clarification on which functions copy

https://github.com/webmachinelearning/webnn/pull/144/files/44870dee0271e50ead69da93bf4ef0b9c206537e

Chai: will take a look at this comment

webnn-native update

An update on the webnn-native open-source project's progress and plans is deferred to the next call

webnn-native PR #1 initial implementation

anssik: Dawn Copyright statement comments to be addressed before landing

<ningxin_hu_> sounds good

Adjourn

Minutes manually created (not a transcript), formatted by scribe.perl version 127 (Wed Dec 30 17:39:58 2020 UTC).
