Meeting minutes
Workshop presentations and GH discussions
anssik: Workshop presentations are actively being discussed on GitHub. CG participants' feedback has been appreciated; further engagement is welcome.
WebNN polyfill and samples
anssik: Discuss review feedback and suggestions for the foundational implementation
Add the foundation implementation (PR #1)
ningxin_hu: thanks for the review comments
… during the last two weeks we've been addressing the feedback and making improvements, for example per Ping's suggestions on TF.js API usage
… WebNN is a graph builder, so some clarification was needed on how to map the WebNN graph to TF operators
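The graph-builder-to-eager-kernel mapping can be pictured with a minimal sketch (hypothetical illustration only, not the polyfill's actual code; the builder functions and kernel table are invented names, with plain arrays standing in for tf.add/tf.mul):

```javascript
// Hypothetical sketch: a WebNN-style deferred graph builder. Nodes only
// record the operation; nothing is computed until the graph is executed,
// at which point each node is dispatched to an eager kernel (here plain
// array math stands in for TF.js kernels such as tf.add and tf.mul).
const eagerKernels = {
  add: (a, b) => a.map((x, i) => x + b[i]),
  mul: (a, b) => a.map((x, i) => x * b[i]),
};

function input(name) { return { op: 'input', name }; }
function add(a, b) { return { op: 'add', inputs: [a, b] }; }
function mul(a, b) { return { op: 'mul', inputs: [a, b] }; }

// Executing the output node walks the recorded graph bottom-up and
// dispatches each node to the corresponding eager kernel.
function execute(node, feeds) {
  if (node.op === 'input') return feeds[node.name];
  const args = node.inputs.map((n) => execute(n, feeds));
  return eagerKernels[node.op](...args);
}

// c = (a + b) * b — building the graph performs no computation.
const a = input('a');
const b = input('b');
const c = mul(add(a, b), b);
```

Only the call to `execute(c, feeds)` triggers evaluation, which is the mapping question raised above: each deferred WebNN graph node has to be translated into one or more eager framework operators.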
… some bug fixing remains in that area; another improvement is documentation, with JSDoc added for the public API of the polyfill
… the documentation is lightweight since we have the spec in place, so it does not duplicate spec text
ningxin_hu: the polyfill tracks the spec closely, so the JSDoc complements the spec rather than duplicating it
… also unit tests based on Mocha were added
… it would be good if Chai could review the unit tests; they are not perfect, but directional guidance is welcome
… Anssi suggested hosting the JSDoc and unit tests via GH, so those are added
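Numeric op tests generally cannot use exact equality across backends, so a tolerance-based comparison helper is the usual pattern; a minimal sketch (hypothetical, not the polyfill's actual test utility) that a Mocha `it()` body could call:

```javascript
// Hypothetical test helper, not the polyfill's actual utility: compare a
// computed tensor against expected values element-wise within a tolerance,
// since floating-point op results differ slightly between backends.
function almostEqual(actual, expected, epsilon = 1e-5) {
  if (actual.length !== expected.length) return false;
  return actual.every((v, i) => Math.abs(v - expected[i]) <= epsilon);
}
```

A Mocha test would then assert `almostEqual(output, reference)` rather than deep equality; the epsilon value is an assumption and in practice may need to vary per op and data type.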
ningxin_hu: Ping had a question re the API design itself, so we agreed to open a design issue for the spec instead of a polyfill issue; any API design issues should be discussed in the spec repo
… I will keep the first polyfill PR aligned with the current spec, so the proposal is to discuss any changes in the WebNN spec repo first and then follow up with a polyfill change that keeps up with the spec
anssik: are the Mocha tests automatically or manually generated?
ningxin_hu: manually created
anssik: interested in figuring out whether we can reuse the Mocha tests as web-platform-tests
ningxin_hu: worth opening an issue about a web-platform-tests testing plan
<Ping_Yu> It would be good to unify the op correctness tests at the platform level.
Ping_Yu: my comment is about op correctness; it would be good to unify it, since the polyfill is just one implementation and going forward there will be multiple native browser implementations that need to be tested
https://web-platform-tests.org/
anssik: Discuss and review LeNet sample for handwritten digit recognition using WebNN
ningxin_hu: I acknowledge this example has a UI issue on narrow screens; will fix that
… in this sample I display the compilation time for the graph, as well as the execution time (inference time, strictly speaking)
… this example can show the difference in performance if we run it on a native implementation
<Ping_Yu> sorry, I have not had a chance to check it out
<paul_mcdaniel_msft> sorry, have also not had a chance to review LeNet on my end either.
<Ping_Yu> one related question: what is the plan for benchmarking?
Ping_Yu: in the long run it'd be good to have a benchmark for all the implementations
ningxin_hu: I'm only aware of framework-specific benchmarks
Chai: I think a benchmark is a good idea once the API becomes more stable
… normally conformance testing and benchmarking are done when you have a release, so you can test against a stable base
<Ping_Yu> we should look into the MLPerf standard for mobile or the NNAPI benchmark
Chai: the most trustworthy benchmarks are done by third parties
<paul_mcdaniel_msft> https://mlperf.org/ is a common one
<Ping_Yu> There is a team in google working on mlperf, I can chat with them on possible web mlperf
<Chai> i'll take a look at lenet sample. took some time off last week. still catching up.
GRU and corresponding ops
anssik: Fill the operator gaps to support noise suppression first-wave models
… Adding GRU and GRUcell operators to support GRU recurrent networks. Defining a cell operator in addition to the network operator gives added customization flexibility, e.g. to support stacked-cell recurrent networks.
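For orientation, a single GRU cell step can be sketched on plain arrays as follows. This is an illustrative sketch only, not the spec's gru/gruCell definition: weight layouts, gate ordering, and the update-blend convention all vary between specs and frameworks, and the names here are invented.

```javascript
// Illustrative GRU cell step on plain arrays (not the spec's definition).
const sigmoid = (x) => 1 / (1 + Math.exp(-x));
const matvec = (M, v) => M.map((row) => row.reduce((s, w, i) => s + w * v[i], 0));
const addv = (a, b) => a.map((x, i) => x + b[i]);
const mulv = (a, b) => a.map((x, i) => x * b[i]);

// One step: x is the input vector, h the previous hidden state.
// W* weights act on x, U* on h; b* are biases.
function gruCell(x, h, { Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh }) {
  const z = addv(matvec(Wz, x), addv(matvec(Uz, h), bz)).map(sigmoid); // update gate
  const r = addv(matvec(Wr, x), addv(matvec(Ur, h), br)).map(sigmoid); // reset gate
  // Candidate state uses the reset gate to scale the previous state.
  const hCand = addv(matvec(Wh, x), addv(matvec(Uh, mulv(r, h)), bh)).map(Math.tanh);
  // New state: blend previous state and candidate by the update gate
  // (one common convention; some frameworks swap z and 1-z here).
  return addv(mulv(z, h), mulv(z.map((v) => 1 - v), hCand));
}
```

The network-level gru operator then iterates this cell over the sequence; exposing the cell separately is what enables the stacked-cell customization mentioned above.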
Review PR: https://github.com/webmachinelearning/webnn/pull/83
Proposal to close noise suppression issue with PR #83:
https://github.com/webmachinelearning/webnn/issues/66
anssik: are we fine to close issue #66 now that we have PR #83?
Chai: will take a look
<paul_mcdaniel_msft> looks great !
First-wave models and ops delta with WebNN API definition
anssik: Review the delta between the first-wave models ops and WebNN API surface:
[ ] clamp
[ ] globalAveragePool - Lowering to reduceWindow, add and div?
[x] gru
[x] sigmoid
[ ] split - Lowering to slice?
[x] squeeze
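The lowerings suggested in the checklist can be sketched on plain nested arrays; these are hypothetical illustrations of the idea only (invented function names, not polyfill code, and real implementations operate on typed tensors with axis parameters):

```javascript
// Hypothetical lowering sketches for two of the gap ops above.

// globalAveragePool lowered to a full-window reduction plus a divide:
// average every HxW plane down to a single value per channel (CHW layout).
function globalAveragePool(input /* [channels][h][w] */) {
  return input.map((plane) => {
    const flat = plane.flat();
    const sum = flat.reduce((s, v) => s + v, 0); // reduce step
    return [[sum / flat.length]];                // divide by window size
  });
}

// split lowered to a sequence of slice ops along axis 0.
function split(input, sizes) {
  const out = [];
  let offset = 0;
  for (const size of sizes) {
    out.push(input.slice(offset, offset + size)); // one slice per output
    offset += size;
  }
  return out;
}
```

Whether such lowerings are acceptable, versus adding the ops natively, is exactly the performance question raised below, since a fused native kernel can avoid materializing the intermediate results.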
anssik: Discuss which ops to add to the spec definition considering e.g. major platform support and performance implications.
Chai: globalAveragePool is already supported
… split is new
… clamp needs to be defined
… with this delta, I'd like to ask again: how do we think about adding new ops before closing on the 1st-wave models, and what are the criteria for adding new ops?
anssik: my thought is to work from use cases, to models, toward ops
<paul_mcdaniel_msft> https://github.com/webmachinelearning/webnn/blob/master/op_compatibility/first_wave_models.md
anssik: we should review our use cases to confirm they're still good
Chai: frameworks have model zoos that could be used as inspiration
… the 1st-wave op set is around 50 ops, which is about halfway, the remaining ops being fillers
… so the question is how to cut the first release
<paul_mcdaniel_msft> just FYI .. ONNX model zoo is here: https://github.com/onnx/models
<Ping_Yu> Is there a guideline on what criteria the committee follows to determine which ops get added to the spec?
RafaelCintron: no strong opinion regarding release scoping
<paul_mcdaniel_msft> again just FYI . tf.js model zoo is here (i think): https://www.tensorflow.org/js/models
paul_mcdaniel_msft: similarly, no strong opinion on release scoping
<Ping_Yu> I have some comments
Ping_Yu: if we release without proper tests, people cannot be sure of conformance
<RafaelCintron> +1 to having test suites and conformance
<Chai> +1
<ningxin_hu> +1
<paul_mcdaniel_msft> agreed. i would really like to have a test suite AND some reference implementations (backends) so that we know this really is viable before we publish our v1 api draft
anssik: we should aim to have a test suite for the opset we are about to release
<Chai> this is why we are also starting the polyfill, samples and unit tests
Chai: from an execution point of view, maybe it's good to say we're opening a proposal for adding new ops for the 1st wave, but define a timebox and process for adding new ops
PROPOSED RESOLUTION: Define criteria for adding new ops to the spec
PROPOSED RESOLUTION: Define criteria for adding support for new models
Resolution: Define criteria for adding support for new models
<Ping_Yu> I would suggest defining criteria for use cases first instead of models
RafaelCintron: want to say, maybe soonish we should request TAG review
PROPOSED RESOLUTION: Request TAG feedback for WebNN API
<paul_mcdaniel_msft> (thinking out loud) it's almost like: step 1: the first wave of models identified. step 2: tests and samples and polyfills for wave 1. step 3: proof-of-concept implementations for wave 1; backend (DirectML) and frontends (onnx.js / tf.js) to make sure our api holds for implementation. step 4: publish our findings.
Resolution: Request TAG review for WebNN API
Ping_Yu: if our releases are MVP style, then people can start implementation; I propose we minimize the models we include
… also minimize the ops involved, so people can try things out sooner, rather than expand the scope
… another comment: instead of focusing on models, focus on use cases
… models come and go, and there may be new architectures in the future
Chai: aiming for an MVP is a good idea, but we also need to balance the appeal of the first draft of the API: if the functionality is too minimal, the impact is not there; it's a delicate balance to define a reasonable set of models and ops that covers a sufficient set of use cases
… while allowing implementers to start implementation early
… I assume MVP here means the coverage we can successfully implement
… as for the first milestone for WebNN, we're already past that, since we've worked on the API for over a year now; we should set the bar higher
<Ping_Yu> My comment is at the first-wave-of-models level, not the API level
<Chai> we derive op set from the models
<paul_mcdaniel_msft> i like use cases -> model architectures -> ops.