W3C

– DRAFT –
WebML CG Teleconference – 10 December 2020

10 December 2020

Attendees

Present
Anssi_Kostiainen, Bruce_Dai, Chai_Chaoweeraprasit, Dom, Ganesan_Ramalingam, Jonathan_Bingham, Mehmet_Oguz_Derin, Ningxin_Hu, Oguz_Derin, Rafael_Cintron, Wen_He, Zoltan_Kis
Regrets
-
Chair
Anssi
Scribe
Anssi, anssik, dom

Meeting minutes

WG Charter feedback

Anssi: the idea was to go through the feedback that's been provided for the charter proposal

https://github.com/w3c/machine-learning-charter/issues

Anssi: thank you all for the active engagement in the repo

https://github.com/w3c/machine-learning-charter/issues/8

Anssi: first I would like to discuss one issue that seemed to be pretty core to most of the discussions, #8, opened by Jonathan
… Jonathan wanted to make sure the charter gives flexibility to accommodate potential future work

https://github.com/w3c/machine-learning-charter/pull/9

Anssi: I submitted pull request #9 to address this desire with a list of two tentative specifications
… the first one is the model loader API for which we have an explainer
… the second one is a proposal "WebNN API Lower Level Instruction Set"

https://www.irccloud.com/pastebin/6T8DY8hm/

""A lower-level instruction set for the Web Neural Network API defines a complementary set of primitives out of which higher-level operations defined by the Web Neural Network API can be constructed. This instruction set is vendor-neutral and is expected to draw inspiration from efforts such as XLA HLO, Tensor Compute Primitives, TensorFlow RISC, and Tensor Operator Set Architecture.""

Anssi: the discussion in the repo shows that Google has been working on these 4 ideas, but not shipping yet
… they're basically sitting at a lower level than current WebNN
… Based on the discussion on the repo, the pull request would address this issue, and the other issues that seem to mostly derive from this core issue
… This seems acceptable to Google, and I would like to get confirmation this is acceptable to others
… For clarity's sake, these would be tentative specs, which the WG could but doesn't need to work on, and which would be based on prior incubation in the CG
… I'm particularly interested if anyone has concerns about this approach

Rafael: rechartering requires going through legal review in MS, so I would request we keep the old charter

dom: there are two charters, CG Charter and WG Charter proposal
… did you already get feedback on the WG Charter review?

RafaelCintron: I haven't reviewed the WG Charter yet
… are the CG and WG charters similar apart from the PR #9 changes?

dom: IANAL, they should be similar; the WG charter is expected to be a strict subset of the CG's

<dom> Chai: I have questions

<dom> ... reading this issue, it seems like the request is to extend the scope to support low-level instructions

<dom> ... there has been a lot of discussions on low vs high level already

<dom> ... and it has been pointed out that the "low level" set is not really lower level than what WebNN provides

<dom> ... I also don't understand the relation to subsetting WebNN and this low-vs-high level

<dom> ... I understand there was discussion on whether WebNN would be low level enough and whether a graph API is the right approach

<dom> ... I'm not sure how this addresses that question (which I think has been otherwise addressed in the discussion)

<dom> Anssi: Jonathan you had suggested shipping a subset of WebNN that would perform better - can you speak to that?

<dom> Jonathan: 3 topics: whether the operations are low-level enough - I agree with Chai that low-level operations can be addressed in a graph API, so that issue can be closed

https://github.com/w3c/machine-learning-charter/issues

<dom> Anssi: so there is an issue that we can close?

<dom> Jonathan: issue #2 can be closed

<dom> ... my question about the operation subset was in the context of discussing an alternative to a graph API: we could have a convolution 2D API on its own, independent of any graph or model - because it is a well-known expensive operation

<dom> ... The excitement has been mostly about model loader and graph API, but I was asking if we could do that?

<anssik> dom: the WG could decide to work on a subset of the WebNN API and work on it as a separate work item

<anssik> ... given consensus

anssi: indeed, a spec can be split into as many pieces as we feel is appropriate

chai: I want to point out that the way WebNN is being specified right now is as a graph builder API
… an immediately-executable stand-alone eager-mode operation would not be a strict subset of WebNN from that perspective

Jonathan: +1 on that description
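
As an illustration of the distinction Chai and Jonathan are drawing, a minimal sketch follows. All names and signatures are hypothetical placeholders, not the actual WebNN IDL or any proposed standalone API.

    // Graph-builder style (the shape of the current WebNN proposal): operations are
    // recorded into a graph, compiled once, then executed with bound inputs.
    declare const builder: {
      input(name: string, desc: { dimensions: number[] }): unknown;
      conv2d(input: unknown, filter: unknown): unknown;
      build(outputs: Record<string, unknown>): { compute(inputs: object): Promise<object> };
    };
    const x = builder.input('x', { dimensions: [1, 3, 224, 224] });
    const w = builder.input('w', { dimensions: [32, 3, 3, 3] });
    const graph = builder.build({ y: builder.conv2d(x, w) });
    // later: await graph.compute({ x: inputData, w: filterData });

    // Hypothetical standalone eager-mode operation, as discussed above: executed
    // immediately, with no graph or model object involved.
    declare function conv2dEager(
      input: Float32Array, inputDims: number[],
      filter: Float32Array, filterDims: number[]
    ): Promise<Float32Array>;
    // const y = await conv2dEager(inputData, [1, 3, 224, 224], filterData, [32, 3, 3, 3]);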

Anssi: so how strongly do you feel we need to put that in scope of the WG (assuming it isn't)

Jonathan: some of the feedback I've gotten is that if 3 individual operations give you 90% of the perf, then it's worth considering such an approach
… which we had discussed in the early days of the CG
… this made me wonder how much room the charter leaves to us for this

dom: from a formal perspective, IANAL, we could work on both approaches in the WG; the scope is broad enough to accommodate that, it's just a different way to organize the work
… alternatively, if this is still a significant open question, maybe we need to explore this in the CG before starting the WG charter review?

<jonathan> i'd rather not delay the working group for this :)

<dom> anssi: practically, exploring this further would delay the start of the WG

<jonathan> how about i file an issue in the CG and we can discuss in parallel?

<dom> Anssi: the WG will be chartered for 2 years - so no matter what, we will recalibrate our scope in 2 years

<dom> ... that's another way to pace evolutions in our understanding of the space

https://github.com/w3c/machine-learning-charter/pull/9

<dom> Anssi: would like to see if we can close issues and get consensus on what the PR should contain

<dom> ... Chai made the point that "the lower level" proposal isn't so clearly lower level

https://github.com/w3c/machine-learning-charter/issues/7

TOSA comparison https://github.com/w3c/machine-learning-charter/issues/7#issuecomment-729487005

<dom> chai: we haven't found that operations in TOSA were particularly lower level than WebNN

<dom> ... we use XLA / HLO quite a bit to ensure mappability with WebNN

<dom> ... some operators are graph/networks on their own, while others are really instruction sets

<dom> ... we cover both in WebNN

<dom> ... the process by which we ensure this is that each time we specify a network-like operator, we show how it decomposes into lower-level operators that we also define

<dom> ... as a result, this gives us confidence that we consciously design an API that includes low level operations
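
A minimal sketch of that decomposition check follows, using a hypothetical builder interface rather than WebNN's actual operator definitions: a higher-level operator is only admitted if it can be written in terms of lower-level operators the same API defines.

    // Hypothetical builder interface for illustration; not the WebNN operator set.
    interface Builder {
      constant(value: number): unknown;
      max(a: unknown, b: unknown): unknown;
      min(a: unknown, b: unknown): unknown;
      mul(a: unknown, b: unknown): unknown;
      add(a: unknown, b: unknown): unknown;
    }

    // relu(x) = max(x, 0): the convenience operator reduces to primitives.
    function relu(b: Builder, x: unknown): unknown {
      return b.max(x, b.constant(0));
    }

    // leakyRelu(x) = max(x, 0) + alpha * min(x, 0): same discipline, one step up.
    function leakyRelu(b: Builder, x: unknown, alpha: number): unknown {
      return b.add(b.max(x, b.constant(0)), b.mul(b.constant(alpha), b.min(x, b.constant(0))));
    }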

<dom> Jonathan: I agree with you that the approach seems to address the problem; the Android NN API team, in launching their 120 operations, felt and still feels that what they came up with is not low-level enough for all that ML practitioners want to do on Android

<dom> ... they're exploring a major re-architecture rather than decomposing into lower-level instructions, although I'm not sure why, but they seem convinced

<dom> ... for WebNN, since the group has been looking at all these alternative instruction sets, this seems pretty robust

<dom> ... #7 can be closed as well

<Chai> afk (need to step out for a few mins)

https://github.com/w3c/machine-learning-charter/issues/6

<Chai> (im back)

https://github.com/w3c/machine-learning-charter/issues/6#issuecomment-736004531

jonathan: the question in issue #6 is: are we doing a graph API because we think it's the best abstraction, or because it takes too long to do a model loader with an agreed-upon model format

RafaelCintron: graph API is format agnostic, there are many such formats
… maybe graph is better for future scenarios such as training
… also OK to do load model first if inferencing is our primary target

Chai: I wrote the explainer, so should explain :)
… re issue #6, do we believe the graph API is the right abstraction
… the motivation is to reflect what is already out there, we are not reinventing the graph API for WebNN; we acknowledge that everywhere you look there is a graph API
… TensorFlow, ONNX, XLA, TOSA all are reflections of what's out there
… we're just saying WebNN is a reflection of reality

<jonathan> lol

Chai: model loader is the most convenient, and can be used if you know the format you'd use in your application; the issue is we're not trying to address the question of model format because there already exist many popular model formats

<jonathan> yes, google keeps defining too many formats

Chai: WebNN provides the foundation on which all these formats can sit
… in the explainer we try to identify this topic: WebNN provides an abstraction for the browser to offer the native performance of the OS or platform underneath

jonathan: that makes a lot of sense, a lot of platforms and OSes use a graph as an abstraction
… let me ask this differently: there have been some changes to the TF leadership since 2-3 years ago
… if I could persuade TF, with its current leadership, that ONNX makes sense for interop, would we then prefer to have a model loader with ONNX as the format, or prefer to have a graph API
… if Google embraces ONNX, do we still launch the graph API?

RafaelCintron: my answer, graph API is useful for training and when we need to change the graph
… not a blocker to do model loader first, but if we do model loader, it must accept one model format across browsers
… customers do not want to ship multiple model formats, they want just one

anssik: can we close issue #6?

jonathan: from charter pov, we can close #6

jonathan: thanks RafaelCintron for your perspective, let me go to TF folks to introduce the idea
… the argument I'd make is it's the same work both ways, format or graph, you have to parse and convert either way

jonathan: all the work we're doing here continues either way, it is just whether it becomes an implementation detail of the model loader or is exposed directly
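
A minimal sketch of that layering, with hypothetical names only (parseModel, createGraphBuilder, loadModel are not from the Model Loader explainer): the same parse-and-convert work happens either way, it is just hidden behind the loader.

    // Illustrative layering only; names and signatures are assumptions for this sketch.
    interface GraphBuilder { /* operator-building methods, as in the earlier sketch */ }
    interface CompiledGraph { compute(inputs: object): Promise<object>; }

    // Format-specific front end: turns serialized model bytes into graph-building calls.
    declare function parseModel(bytes: ArrayBuffer): (b: GraphBuilder) => CompiledGraph;
    // Platform-backed graph layer, e.g. a WebNN-style builder.
    declare function createGraphBuilder(): GraphBuilder;

    async function loadModel(url: string): Promise<CompiledGraph> {
      const bytes = await (await fetch(url)).arrayBuffer(); // fetch the serialized model
      return parseModel(bytes)(createGraphBuilder());       // same parse/convert work, hidden inside
    }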

RafaelCintron: at the beginning of the CG, the reason the graph API was preferable was that a Google person told us TF folks prefer a graph API, whereas with load model you need to serialize, etc.

<jonathan> fair point, for the use case of training in the browser

https://github.com/w3c/machine-learning-charter/issues/5

ningxin_hu: I recall feedback from the workshop that people are interested in accessing purpose-built hardware, for perf and power

RafaelCintron: we have interest in the API from internal people
… if the API can be better than what I use now I'll use it

jonathan: like everyone here, I believe if we ship this API developers will use it
… before the Blink project accepts code into the project, they want to know whether there are customer requests

<jonathan> yes, exactly

<jonathan> also probably not blocking for the charter

anssik: can we close issue #5, given that the customer request question is a Blink process consideration?

jonathan: yes, we can close issue #5 since we don't need this information now, but in the future when making a shipping decision

https://github.com/w3c/machine-learning-charter/pulls

https://github.com/w3c/machine-learning-charter/pull/9

anssik: are we in agreement we merge PR #9?

RafaelCintron: yes, assuming this gets reviewed before the WG Charter is operational

anssik: yes, the W3C AC will have a 6-week review period

Happy Holidays

Adjourn

Minutes manually created (not a transcript), formatted by scribe.perl version 124 (Wed Oct 28 18:08:33 2020 UTC).

Diagnostics

Maybe present: Anssi, anssik, chai, Jonathan, Rafael, RafaelCintron