W3C

– DRAFT –
WebML CG Teleconference – 4 February 2021

04 February 2021

Attendees

Present
Anssi_Kostiainen, Chai_Chaoweeraprasit, Ganesan_Ramalingam, Geun-Hyung_Kim, Jonathan_Bingham, Ningxin_Hu, Ping_Yu, Rafael_Cintron, Zoltan_Kis
Regrets
-
Chair
Anssi
Scribe
Anssi

Meeting minutes

Proposals for future work

Operation-specific APIs by @jonathanbingham

[ flipping agenda to accommodate Ping ]

Ping: we had a discussion about this proposal; we think it could be complementary to what the current frameworks offer
… Wasm is used to accelerate at the kernel level, similar to WebNN, e.g. thinking of OpenVINO; it could further improve the SIMD capability
… the problem we see is I/O: how to map Wasm globals to the OS-level implementation, whether we need to copy data in and out, which is critical for performance

Ping: a graph API would have similar issues: how to provide I/O efficiency, whether at the Wasm or GPU level

https://github.com/webmachinelearning/proposals/issues/2#issuecomment-771159231

Chai: I looked at Ping's response on the issue; Jonathan gave a small code snippet of what the API could look like
… it still looks like we want to define convolution as a separate function, an immediate function that can work on a WebGL texture
… it's not quite clear to me: if we want to interop with WebGL textures, why can't we do it with the graph API itself?
… why do we need another set of APIs to interact with WebGL, when that's something we want to do with the graph API as well?
… another thing: if we want to see this as a layered API, I cannot see how that could happen; or is this an unrelated, totally different API?
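
[ For context, a rough sketch of the two styles under discussion. The op-specific call below is purely illustrative; neither its name nor its signature comes from the proposal. The graph-style lines loosely follow the WebNN explainer of this era, so exact names may differ. ]

  // Hypothetical op-specific "immediate" style (illustrative names only):
  // one standalone call per operation, possibly taking a WebGL texture.
  // inputTexture, filterTexture: assumed WebGL textures; ml: assumed entry point.
  const out = await ml.conv2d(inputTexture, filterTexture, {strides: [1, 1]});

  // Graph style, roughly per the WebNN explainer at the time:
  // declare the whole graph first, then compile and execute it.
  const nn = navigator.ml.getNeuralNetworkContext();
  const builder = nn.createModelBuilder();
  const input = builder.input('input', {type: 'float32', dimensions: [1, 3, 224, 224]});
  const filter = builder.constant(
      {type: 'float32', dimensions: [8, 3, 3, 3]}, filterData);  // filterData: a Float32Array
  const output = builder.conv2d(input, filter, {strides: [1, 1]});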

Ping: the rationale here is that if we do an op-level API we should bind it to current browser APIs; the reason is that for many users who are not familiar with WebNN, the graph API will take some time
… to avoid the I/O bottleneck, we have to tie our acceleration to a particular chip, e.g. the CPU sharing memory with Wasm from WebNN
… in the GPU case, not so sure if we can share memory(?)
… the reason we say this: data cannot jump around between CPU<->GPU or the benefit is not realized
… we also had an example on the Wasm side
… if we had an op-level API, the I/O bottleneck would be critical to solve

ningxin_hu: efficient I/O is important for the graph API as well
… we had a discussion on custom ops a long time ago
… discussing a memory exchange mechanism between WebGL<->WebGPU<->WebNN

Rama: it seems to me efficient I/O is important and needs to be solved; I'm not sure what other distinct motivation underlies the operation-specific API

Ping: the graph API definitely requires efficient I/O; from my POV the graph API needs a lot of tooling to map models, either manual construction or 3rd-party transformation
… existing frameworks like TF.js have their own intermediate representation; the architecture is backend-focused, routing ops to backends for acceleration
… in the future folks may agree that ops change over time, but the major-complexity and computationally heavy ops will remain
… the benefit of full graph construction is TBD from our POV

Chai: Ping's proposal reminds me of some work we did at the hardware level, so I can reflect on that
… I understand what you are saying: you are trying to solve a slightly different problem with a framework based on WebGL/Wasm, and want to provide the small acceleration this proposal would give
… the issue here is you want to define the data type, but when you deal with constructs at this level, with hardware and GPUs and whatnot, that specification of data type is not sufficient; e.g. ML accelerators deal with issues such as data alignment and layout format, and not all hardware uses standard layouts, texture formats, etc.
… that is the reason why people define the API at a certain point in the sw stack, e.g. the model loader, where innovation happens below
… with a graph API you have innovation both upstream and downstream
… WebNN seems to strike the right layer, because innovation can also happen on top of this abstraction
… at the level proposed, to be efficient and reduce data movement, the currency that flows between the interfaces is normally a lower abstraction than just a texture
… different GPUs have differences here; how do you define the requirements of this texture input?
… the level of abstraction matters a lot

Ping: totally agree
… in the doc I describe some of these issues
… this proposal is not meant to replace the graph API
… what people currently do with ML is create a pipeline
… even though you abstract into a graph API, conversion happens somewhere
… I don't see the difference whether using the graph API or an op-specific API
… model execution is one step out of multiple steps
… in my opinion we must realize what the user is doing
… this mapping needs to be done inside, otherwise there's too much complexity for the user

RafaelCintron: I just had a couple of questions
… at the Wasm level, for the version of the op API that works on the CPU, what acceleration is provided for conv2d?
… second question: TF.js is already a graph of ops, so why is the op set better, given that?

Jonathan: the motivation for the operation-specific API was that it is a smaller and simpler API that can be shipped sooner
… I'm a little concerned it takes longer
… if we think we can ship the WebNN API in the same amount of time, let's go for it

RafaelCintron: so the motivation is to ship sooner, not that it's inherently better?

Jonathan: right

Ping: re Wasm, we don't know if we can do a SIMD implementation(?)
… Intel chips can have wider SIMD (e.g. 512-bit AVX-512) than the 128-bit SIMD Wasm currently exposes

TAG review feedback dissemination

anssik: Discuss the first-pass TAG review feedback, formulate responses to key questions from the TAG.
… first, thanks to Sangwhan and Kenneth for your feedback! I propose we discuss the feedback one by one on this call, and follow up by creating a GH issue for each piece of feedback we cannot resolve right away.

Spec feedback

anssik: 1/13: "The fact that a GRU is in there really sticks out. I somehow found out why it is there, but it feels extremely inconsistent with the rest of the API which is fairly generic. (e.g. you should have a LSTM and a GRU, but not just a GRU - that's weird.)"

Chai: general question about the interaction?
… comment back on the issue?

Chai: this is because we're not complete yet; this is a "v1" API

<ningxin_hu> +1

anssik: 2/13: "In the spec, some of the activations are out in global scope (e.g. relu), some are in unary operators (sigmoid, tanh) - this doesn't look consistent."

anssik: does this need to be addressed in the spec?

Chai: agree with the observation; we can group these differently, it's stylistic
… it doesn't matter that relu is outside unary, because the API does not define the group, but in terms of the spec text we could move things into the unary group

ningxin_hu: if we look at the WebIDL level they are in the same interface, so there's no difference from the API perspective; it's a spec text organization issue

anssik: let's create an issue to track this

anssik: 3/13: "The spec mentions training in the batch normalization section - but I'm fairly convinced that there is no support for training. Is this an error?"

anssik: easy fix, right?

Chai: yes, can explain this better

anssik: create an issue to track the fix

anssik: 4/13: "getNeuralNetworkContext() and createModelBuilder() seem strange (no parameters, for one thing) - is this expected to accept parameters/configs at some point? If so, we'd like to see what is intended here."

RafaelCintron: I see 4 and 5 as very similar, and agree

anssik: 5/13: "Wouldn't it make sense to have a constructor rather than a builder pattern for createModelBuilder()? (e.g. new ModelBuilder(navigator.ml.getNNContext());"

RafaelCintron: prefer constructor

Chai: SGTM

Chai: for getNeuralNetworkContext(), eager mode could be a param going in

ningxin_hu: I believe that's the idea, per a discussion with Rama

anssik: open issues for 4 and 5

Rama: I agree, saying eager could be supported even with a constructor

ningxin_hu: I agree with a constructor for the model builder
… getNeuralNetworkContext() is similar to WebGL; it could allow an eager execution config

Zoltan: constructors are not always the best; this was discussed with Domenic Denicola
… for two-phase construction, a factory might be better
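
[ To make the two patterns concrete, a minimal sketch; the config argument in the last line is only an assumption about what a future parameter might look like, not anything in the spec. ]

  // Factory/builder pattern, as in the spec draft at the time:
  const builder = navigator.ml.getNeuralNetworkContext().createModelBuilder();

  // Constructor pattern suggested in TAG feedback 5/13:
  const builder2 = new ModelBuilder(navigator.ml.getNeuralNetworkContext());

  // Hypothetical future config parameter (assumed name, per the eager-mode discussion):
  const eagerContext = navigator.ml.getNeuralNetworkContext({executionMode: 'eager'});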

anssik: 6/13: "I see quite a few view/reshape like functions, which of these are expected to copy and which are not? Probably good to note this in the spec."

anssik: sounds like a simple clarification

anssik: open an issue to track

anssik: 7/13: "If there are layers that will be taking activations as string enums, there should simply be a string enum for activations rather than have it just in RecurrentNetworkActivation. (One may argue that hyperbolic tangent is RNN specific, but...)"

ningxin_hu: re 6, I'm not sure; are there implementation details involved?

Chai: re 6, my answer would be, "it depends"

Chai: yes, the reason for 7 is GRU, since not all activations can be supported
… will clarify in the spec

anssik: open an issue for 7

anssik: 8/13: "While the limitations of JavaScript probably contribute a lot to this, but the ergonomics of this API based on example code might have room for improvement."

ningxin_hu: this is related to an existing issue in the webnn repo

anssik: 9/13: "It feels like errors/exceptions should probably fleshed out. (e.g. what happens when you try to reduce on a non-existent axis?)"

Chai: yes, we already have an issue for exception handling and error handling

anssik: 10/13: "I don't quite understand the NamedOutput mechanism. What if what is output just a feature?"

<ningxin_hu> Chained API for the Operands: https://github.com/webmachinelearning/webnn/issues/106

Chai: I think we can answer this and give the reasoning why we landed on NamedOutput; not sure I understand the latter part of the question ("just a feature")

anssik: potentially open an issue
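
[ For context, a rough reconstruction of the NamedOutput mechanism from the explainer of this era; exact member names may have differed. ]

  // a, b: previously built Operands; bufferA: a typed array with input data.
  // Outputs are bound to string names when the model is created...
  const c = builder.matmul(a, b);
  const model = builder.createModel({c});  // shorthand for {c: c}
  const compilation = await model.compile();

  // ...and results are retrieved by those same names after compute().
  const outputs = await compilation.compute({a: {data: bufferA}});
  console.log(outputs.c.data);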

anssik: 11/13: "A lot of the names are very generic (Operand, Compilation) - this feels like something we might want to prefix with something or synchronize with TC39 about."

anssik: naming is always an interesting topic to discuss

Zoltan: we have a namespace; I don't feel we need TC39 coordination

ningxin_hu: the suggestion sounds like we'd prefix with ML or NN?

anssik: the concerns are probably due to these being exposed in the global namespace, so they could clash

anssik: could break some JS libs

Rama: can't we have namespacing?

anssik: sadly no for interfaces defined in WebIDL

anssik: 12/13: "What's the isomorphic JS story for this? Also, given that this is attached to vanilla navigator, is this not expected to work in a worker scope?"

RafaelCintron: 12 is a good issue; we should not tie this to navigator, and it's good for worker scope as well
… as for the isomorphic part, the degree to which you can use this API depends on how much Node exposes, e.g. WebGL in Node

anssik: open an issue for 12

anssik: 13/13: "Given that bootstrapping a network is a lot of work, would it make sense to have some sort of serialization/caching story here?"

Chai: at least for serialization, it is a non-goal; we say that in the explainer
… that can be our proposed response

anssik: sounds good

<ningxin_hu> SGTM

anssik: nit: "The one case I saw clamp() being used seemed to implement a relu?"

anssik: nit: "Search for "creatModelBuilder" in the explainer."

anssik: bonus comment: "feels like having a Sequential() would be nicer syntax wise."
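
[ For comparison, the kind of syntax the bonus comment alludes to, in the style of TF.js's tf.sequential(); WebNN has no such API. ]

  const model = tf.sequential();
  model.add(tf.layers.conv2d({inputShape: [224, 224, 3], filters: 8, kernelSize: 3}));
  model.add(tf.layers.activation({activation: 'relu'}));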

<ningxin_hu> regarding namespaces: https://heycam.github.io/webidl/#idl-namespaces

<ningxin_hu> sounds good

<chai> +1

Adjourn

Minutes manually created (not a transcript), formatted by scribe.perl version 127 (Wed Dec 30 17:39:58 2020 UTC).
