WebNN API TAG spec review submission
[DRAFT] TAG Specification Review: Web Neural Network API
anssik: Final review of the TAG spec review request before submission:
… Please provide your feedback in the issue ahead of the meeting.
anssik: Plan to submit this request tomorrow, any concerns?
Self-Review Questionnaire: Security and Privacy
anssik: Contribute your suggested responses to the questionnaire:
Self-Review Questionnaire: Security and Privacy
anssik: Any questions re the questionnaire?
Support the execution of the sub-graph scenario
anssik: Discuss and provide feedback on the preferred builder pattern:
ningxin_hu: In our last call we discussed this issue and the follow-up action was to understand the use case better
… Ping mentioned transfer learning as the key use case
… so some layers can be trained with personal data
[Ping noted "But this API does not seem to follow the conventional builder pattern"]
Sandeep: transfer learning makes sense, building into a single model, not familiar with the use case Ping had in mind
Ping: I have discussed with Ningxin about his subgraph issue, also Chai has chimed in
… there are use cases for transfer learning, the current API only considers the output part of it, input part is not as clear as output
… manual training is not typical(?)
… what the API now does can address majority of use cases, I can follow up on the issue
… happy with the solution from Ningxin
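The sub-graph scenario under discussion can be illustrated with a minimal sketch (plain Python, not the WebNN API; all names here are hypothetical): a graph is built once, and execution can target an intermediate operand, so a frozen feature extractor can be reused while a new head is trained on personal data.

```python
# Hypothetical sketch of sub-graph execution (not the WebNN API):
# compute only the nodes needed for a requested output, e.g. to reuse
# a frozen feature extractor when transfer-learning a new head.
class Node:
    def __init__(self, name, inputs=(), fn=None):
        self.name, self.inputs, self.fn = name, inputs, fn

def evaluate(node, feeds):
    """Recursively evaluate just the sub-graph behind `node`."""
    if node.name in feeds:                      # graph input
        return feeds[node.name]
    args = [evaluate(i, feeds) for i in node.inputs]
    return node.fn(*args)

# Build: x -> features (frozen layers) -> head (newly trained layers)
x        = Node("x")
features = Node("features", (x,), lambda v: [e * 2 for e in v])
head     = Node("head", (features,), lambda v: sum(v))

# Execute the full graph, or stop at the intermediate operand:
print(evaluate(head,     {"x": [1, 2, 3]}))   # 12
print(evaluate(features, {"x": [1, 2, 3]}))   # [2, 4, 6]
```

The point of the sketch is that requesting `features` as an output exercises only part of the graph, which is the capability the transfer-learning use case needs.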
Proposed new ops Mirrorpad, SquaredDifference, Pow, TransposeConv
anssik: Review low-level ops decomposition and gaps:
Chai: I looked at the original and ONNX part, they are essentially the same models
… I think this is one of the models WebNN should support, it is one of the samples for many frameworks, also ONNX
… the ops sound reasonable, Ningxin may have some questions?
… I can take an action to map out all the ops needed, there are not that many
ningxin_hu: question to Chai, there are two models using different ops to decode, TF Lite and ONNX
… do you want to support convTranspose?
Chai: it is supported in many models, should add it
ningxin_hu: we had an early discussion on conv op support, will revisit
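The gap analysis for the proposed ops can be made concrete: some of them decompose into existing lower-level element-wise ops. A hedged illustration (plain Python over lists, hypothetical helper names, not the WebNN operations themselves) for SquaredDifference:

```python
# Illustrative decomposition of SquaredDifference into element-wise
# sub and mul -- the kind of low-level op decomposition under review.
def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def mul(a, b):
    return [x * y for x, y in zip(a, b)]

def squared_difference(a, b):
    d = sub(a, b)          # (a - b)
    return mul(d, d)       # (a - b)^2

print(squared_difference([3.0, 5.0], [1.0, 2.0]))  # [4.0, 9.0]
```

Ops such as TransposeConv (convTranspose), by contrast, do not decompose this cleanly, which is why adding them directly is being discussed.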
Specify the ModelBuilder.createModel
anssik: Discuss naming and how to spec the ModelBuilder.createModel:
Ping: comment about the naming convention, duplication in the name is not preferred
Chai: suggestion for naming, createModel -> build
… that would be reasonable
<ningxin_hu> +1 rename to build
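The rename can be illustrated with a minimal builder sketch (hypothetical Python, not the actual WebNN IDL): the class name already carries "Builder", so the terminal method that finalizes the graph can simply be `build()`, avoiding the `ModelBuilder.createModel` duplication.

```python
# Hypothetical builder sketch: build() finalizes the accumulated ops
# into an immutable model, so "Model" need not repeat in the method name.
class Model:
    def __init__(self, ops):
        self.ops = tuple(ops)        # immutable once built
    def run(self, value):
        for op in self.ops:
            value = op(value)
        return value

class ModelBuilder:
    def __init__(self):
        self._ops = []
    def add(self, op):
        self._ops.append(op)
        return self                  # allow fluent use
    def build(self):                 # was: createModel
        return Model(self._ops)

model = ModelBuilder().add(lambda v: v + 1).add(lambda v: v * 2).build()
print(model.run(3))  # 8
```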
Chained API for the Operands
anssik: Follow-up on the mixin interface proposal:
Rama: my question was whether in the chaining style you omit the first operand
… was not clear how the suggested proposal would work, and Ningxin seemed to agree
Ping: TF.js chaining API is not for model building
anssik: is this a nice to have or must have feature from TF.js perspective?
Ping: nice to have feature
… that said, should think how the API should look like, we want to chain the ops here, we have to think about it
anssik: discussion welcome in GH issue #106
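Rama's question, whether the chaining style omits the first operand, can be sketched as follows (hypothetical Python, not the proposed mixin IDL): each op is also a method on the operand, so the receiver supplies the implicit first argument.

```python
# Sketch of the chained style: a.add(b).mul(c) is sugar for
# mul(add(a, b), c) -- the receiver is the implicit first operand.
class Operand:
    def __init__(self, value):
        self.value = value
    def add(self, other):
        return Operand(self.value + other.value)
    def mul(self, other):
        return Operand(self.value * other.value)

a, b, c = Operand(2), Operand(3), Operand(4)
print(a.add(b).mul(c).value)  # 20
```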
WG Charter feedback
Status check and discussion on WG Charter issues:
anssik: "Is this API likely to be a long-term solution?"
[ Jonathan talking through the points in the issue ]
Jonathan: asked the internal Google team why the existing NN API couldn't be augmented with the low-level instructions being explored at Google
… haven't gotten an answer to that question internally yet
… would like to get web standards and e.g. TAG perspective on a situation where the platform might get a new API in the future that might or might not replace the old one
Chai: my response is on the issue, don't want to read it
… to me the question is what would be the right abstraction for ops to ensure interop across platforms?
… the topic is about that abstraction
… the web stack has a browser and underneath an OS
… to be really cross-platform we need to pick an abstraction the underlying platforms can support
… ping audio, media, AI it is the same, how to layer those capabilities across multiple platforms
… I believe this abstraction WebNN API chose can stay relevant for 10 years or more
… it has always been the case that an established abstraction is maintained, and the Web is never the first for a good reason; it is a follower, building on the foundation of the underlying platforms
… we have to be careful to look across ecosystems and platforms and define the abstraction we believe is good for all platforms
… this process takes time
… WebNN API is around that corner; AI/ML has seen enough development in the past 5-10 years. Working in the Msft OS group, dealing with GPU vendors, I see the directions hardware is going and SW frameworks, TF, ONNX, CoreML, evolving, but when you take a cross-section you find there's a handful of common currency flowing through this system
… eventually you need support from the underlying platform
… the folks working on the lowest layer of the stack, which people do not often see, look up to understand the use case
… this has been happening in the past years on desktop and phone; the question is, how do you define the web stack so it can benefit from that overlap
… you can always wait, but the space moves so fast no one actually waits for you
… enough parties recognize simple things like the need to support conv, gemm, etc.
… for example, matrix multiplication is so fundamental, you'll need this abstraction anywhere
… are we ready to say this is the currency we'll start with for the Web?
… there could be a new thing coming that will invalidate everything that came before it
… the web had very basic image formats 20 years ago, much improved today, step by step over the years
… there may be obsolete ops in the future at some point, that's fine
<Jonathan> There are a couple different ways that it could work. If Web NN 2.0 supported a lower level instruction set, either...
<Jonathan> 1) Web NN could support both the currently proposed ~100 operations + the lower level instructions in a single graph, or
<Jonathan> 2) Web NN could support a choice of two mutually exclusive op sets: either developers would use the 1.0 operation set, or they would use the 2.0 operation set
<sangwhan> The proposed ~100 operations would be nice, but getting implementor traction might be a hard sell, given the complexity of the feature - I think that's my main concern. If there is a high level API that implements less but has very wide developer adoption it feels like there would be stronger motivation to push forward a more complex API.
anssik: are these 4 issues all the issues Google would like us to address?
Jonathan: there may be a couple more, but no surprises
<Jonathan> As anssik has requested: can Google write something about how the future of the NN API might work for the web platform?