W3C

– DRAFT –
WebML WG Teleconference – 19 Aug 2021

19 August 2021

Attendees

Present
Anssi_Kostiainen, Chai_Chaoweeraprasit, Geun-Hyung, Ningxin_Hu, Ping_Yu, Rafael_Cintron
Regrets
Dominique_Hazael-Massieux
Chair
Anssi
Scribe
Anssi, anssik

Meeting minutes

anssik: Welcome back after the break! I hope those of you who had time off had a great time. I'm aware some of you are still on vacation.
… I observed some great progress during the break and also substantial contributions from new contributors. Thank you!

Upcoming meetings

WG bi-weekly teleconference scheduling

anssik: We ran a poll to understand participants' preferences for our bi-weekly meeting
… given our global participation, I appreciate folks' flexibility; we obviously cannot find a time that is perfect for everyone, given the Earth is not flat

WebML WG teleconference scheduling poll

anssik: looking at the results of the second poll we ran in August, the current Thu 14:00-15:00 UTC+0 slot received the most votes in support
… I'm inclined to continue with the schedule we have had in place and re-run the poll from time to time to understand whether the situation has changed, e.g. with new participants joining
… I should note UTC does not shift with the seasons, so we have previously moved to 15:00-16:00 UTC+0 when DST ends (~end of Oct to end of Mar), which means 7am Pacific and 11pm Shanghai, given China Standard Time is observed all year
… any concerns with that?

<ningxin_hu> no problem to me

proposed RESOLUTION: Per scheduling poll results, stick with Thu 14:00-15:00 UTC+0 bi-weekly schedule for WebML WG teleconference during Daylight Saving Time, 15:00 UTC+0 when DST not observed

Resolution: Per scheduling poll results, stick with Thu 14:00-15:00 UTC+0 bi-weekly schedule for WebML WG teleconference during Daylight Saving Time, 15:00 UTC+0 when DST not observed

<Chai> +1

WG meetings during W3C TPAC 2021

W3C TPAC 2021

Dom's mail

anssik: W3C annual conference, TPAC, will happen in a virtual format, over the course of two weeks: October 18 to 29. Dates are split as:
… 18 - 22 October unconference breakout sessions
… 25 - 29 October WG/IG/BG/CG meetings and Joint Group Meetings
… in our case, we'd meet over 1-2 days during 25-29 Oct.
… quoting what Dom said in his email:

As a new Working Group, I believe it would be good for the Web Machine Learning Working Group to organize such a meeting during the 2nd week. In other words, such a meeting would be more an opportunity to reflect on where we are, our next steps and explain the context of the group to the broader community than it would be about solving specific technical issues.
… so the TPAC Group meetings are differently scoped from our usual bi-weekly meetings:
… - look at the progress and goals of the Group as well as the deliverables
… - look at related work (e.g in Community Groups) and what's new out there within the scope or related to the Group's mission;
… - welcome new participants, understand their interests, get their questions/feedback on the Group, and potentially mentor them on how to contribute;
… - welcome observers, understand their interests in the Group, and get them interested in joining the Group and helping;

anssik: at TPAC we take a more high-level view of the WG: its successes, challenges, and goals, and where we stand against the charter goals
… other topics:
… - state of test suite and implementations
… - past progress
… - next 12 months expectations/hopes
… - what contributions would be welcome from other/new participants
… all this in addition to our usual business that is driven by issues/PRs

anssik: I'll check back with Dom on TPAC scheduling options and cross-group coordination when he's back, and we'll also ask for your feedback.
… any questions?

WebNN API new features landed

anssik: I'd like us to discuss recently landed new features and any related design considerations behind them.

Fused activation

anssik: Fused activation relates to two issues, #138 and #185

Issue #138

Issue #185

PR #188

anssik: PR #188 by Chai, which addresses both these issues, was reviewed and merged.
… thanks to Chai, Ningxin, Rama, and others who contributed.
… the gist of these changes is to generalize the API so that an operator can be created before it is connected to the graph.

Chai: the only operators like this are activation functions; you don't want to do this broadly for all ops. With this change we can pass an activation function into ops like convolution and batchNormalization
… this also addresses a TAG review issue that flagged the enum listing a few activation functions
… we solved this without the fragile approach of extending the activation enum
… also, in some ops like GRU this would not make sense, and it would make error checking hard
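
For illustration, a minimal sketch of what the fused-activation pattern looks like from script after PR #188, assuming the MLGraphBuilder overloads of the time (the shapes and filter data here are made up):

  // Sketch only; illustrates the idea behind PR #188, not spec text.
  const builder = new MLGraphBuilder(context);
  const input = builder.input(
      'input', {type: 'float32', dimensions: [1, 3, 224, 224]});
  // filterData is a Float32Array supplied by the caller (assumption).
  const filter = builder.constant(
      {type: 'float32', dimensions: [32, 3, 5, 5]}, filterData);
  // The activation is created standalone, before it is connected to the graph...
  const relu = builder.relu();
  // ...then passed into conv2d, so an implementation can map
  // conv + relu to a single fused native op.
  const output = builder.conv2d(input, filter, {activation: relu});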

HardSwish op

Issue #181

PR #195

anssik: HardSwish is an important activation function in MobileNetV3. Both TensorFlow and ONNX support it.
… In the issue we discussed whether this op should be a composite of existing ops; a few alternative compositions were considered.
… the research done in the issue suggests a number of native ML APIs support hard-swish, and the op is used by first-wave models.
… We have usually used these criteria when evaluating whether to include a new op in the spec.
… Given this op passes that test, a PR to add the hardSwish op to the API spec was submitted, reviewed, and merged as PR #195
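
For reference, the composite discussed in the issue computes hardSwish(x) = x * min(max(x + 3, 0), 6) / 6. A sketch of the emulation path using existing element-wise builder ops, assuming the scalar constant overload of the time:

  // Emulation, for implementations without a native hard-swish:
  // hardSwish(x) = x * min(max(x + 3, 0), 6) / 6
  const hswish = builder.div(
      builder.mul(x, builder.min(builder.constant(6),
                                 builder.max(builder.constant(0),
                                             builder.add(x, builder.constant(3))))),
      builder.constant(6));
  // With PR #195 merged, this is simply builder.hardSwish(x), which an
  // implementation can map to a native op where available.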

anssik: Ningxin, you want to talk about this op?

ningxin_hu: thanks Anssi, good summary
… thanks Chai for the review and for the suggestion to be careful with terms and use more generic terminology when speccing this op
… we have the WebNN polyfill with tests; Chai's PR for fused activations has already landed in the WebNN polyfill
… the polyfill is based on TF.js, and we utilize TF.js fused ops within the polyfill; with these fused ops and the Wasm backend we get a good 5-7x speedup in MobileNetV2
… a good example of the performance benefit of adding this type of op

Chai: it is useful to state the criteria for defining new ops. If you look at the discussion between myself and Ningxin in the issue, we are saying that when there is a proposal for a new op, we should look at whether the native platforms support it; if so, we should define the op, because the goal is to accelerate using the native ops
… the flipside of the argument is: what if a certain platform cannot accelerate the op?
… the general answer is that we emulate
… in those cases, an implementation that cannot accelerate the op should decompose it into a subgraph of existing ops; that is what we were discussing in the issue

anssik: TPAC topic proposal: discuss and document the rationale/criteria for adding new ops to the spec

Chai: e.g. GRU really is a graph that can always be emulated; the question is whether it can be accelerated. We need to look for evidence an op can be accelerated; if none is found, we should not define it in the spec

<ningxin_hu> +1 to document the criteria for new ops

<Chai> New op criteria: 1) can it be emulated? 2) can it be accelerated?

Chai: Those two PRs were the new features that landed since our last meeting. We'll discuss WebNN API recent new feature requests next.

WebNN API recent new feature requests

BatchNormalization should be an optional operation

Issue #187

anssik: Ping opened an issue, quoting:
… "BatchNormalization usually are optimized away for inference (fused into conv2d op filter and bias), for example TFLite does not have this op, for compatibility purpose we shall move this op to optional section of the spec."

Ping: for BatchNorm, many of our models are optimized and fused, especially for inference purposes; e.g. TF.js does not support this op
… the question is whether this op should be optional
… there's also some discussion; Jonathan is not here today, he's on vacation, so maybe we want to postpone this to next week

anssik: Chai suggests BatchNorm is supported as a native operation in DirectML
… noting a backend implementation should be allowed to treat BatchNorm as a standalone operation, or fuse it with just the following relu, etc.

Chai: my feedback is in the issue; it relates to the logic discussed in the previous topic: BatchNorm can be emulated and accelerated, so I think we should define it
… if a platform cannot support it, it can emulate it
… technically speaking WebNN could be defined without any big ops, but we want to define them for acceleration; the point is we want WebNN to be fast
… it is supported in all recent GPU drivers
… what Ping brought up, optionality, is a related but different discussion
… if someone implements WebNN, users don't know ahead of time which ops are optional and which are required, so it is hard to rely on those ops; from the user's point of view optionality makes it hard to code against the API

Ping: thanks Chai, thinking further, optionality is bad for users, what is the standard best practice for future additions? versioning?

Chai: thanks Ping, fusion is probably one of the big topics in MLGraph
… it is complicated and open-ended; your question is really about the philosophy around the API, and versioning is a great topic for discussion
… on the topic of fusion, it is open-ended; anyone can argue that some implementation could fuse the entire graph into one thing if they control the whole SW/HW stack
… for fusing an activation into an op, I think we want to keep it scoped; every implementation already takes care of the case where a conv is followed by a relu
… so in a way it is already a special case; similarly, when batchnorm is followed by relu, the platform will in reality fuse them
… the change to support fused ops is a reaction to that trend, but a very scoped fusion
… a more generic fusion would be a graph with conv, batchnorm, and relu where you want all three fused
… do we want to go so far as to make batchnorm a fused op to be passed to conv2d?
… it may get very complicated, with circular dependencies
… how to deal with this? We already have a provision for the compile step; open-ended, hardware-specific fusions are taken care of there
… fused activations should be used in a very limited way, when we know they can be supported on modern platforms; otherwise rely on the compilation step

anssik: TPAC topic proposal: "How should WebNN deal with fusion"

BatchToSpaceND and SpaceToBatchND ops

Issue #189

anssik: this request is motivated by: "SRGAN model contains BatchToSpaceNd op, can we support it in WebNN? And SpaceToBatchND as well. In ONNX, similar ops are DepthToSpace and SpaceToDepth."
… No reactions yet on GH. Please provide your feedback on the issue and we'll bring this to our upcoming calls. Any quick comments?
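
For context, my reading of the ONNX DepthToSpace analogue mentioned in the issue: with block size b it moves data from the depth dimension into spatial blocks, so the shape transform is

  input  [N, C,         H,   W]
  output [N, C / (b*b), H*b, W*b]
  e.g. b = 2: [1, 4, 8, 8] -> [1, 1, 16, 16]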

The scales and sizes of MLResampleOptions can't be both empty

Issue #192

anssik: quoting the issue author: "scales and sizes are hosted in the MLResampleOptions, both of them could be empty. I think when neither scales nor sizes is defined, the resample operation shouldn't be created."

<ningxin_hu> it looks like a spec bug

anssik: I believe we need to spec what is returned by resample() under such conditions.
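
A minimal sketch of the validation step the spec might add (hypothetical, not spec text):

  // Reject resample() when neither scales nor sizes is given.
  function validateResampleOptions(options = {}) {
    if (options.scales === undefined && options.sizes === undefined) {
      throw new TypeError(
          'resample: either scales or sizes must be specified');
    }
  }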

Support for configuring rounding type in pooling operations

Issue #198

anssik: While implementing the OpenCV pooling layer with a WebNN backend, it was observed that OpenCV's pooling layer supports both ceil (via a ceilMode flag) and floor rounding types
… the issue is that the WebNN API spec does not support configuring the rounding type in pooling operations, so the author suggests adding support for configuring it
… preliminary investigation suggests ONNX pooling also supports both ceilMode and floorMode
… thoughts?
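
For concreteness, the rounding mode changes the pooling output size at the edges. Using the standard output-size formula (not spec text):

  function poolOutputSize(inSize, kernel, stride, padBegin, padEnd, ceilMode) {
    const n = (inSize - kernel + padBegin + padEnd) / stride;
    return (ceilMode ? Math.ceil(n) : Math.floor(n)) + 1;
  }
  // e.g. poolOutputSize(8, 3, 2, 0, 0, false) === 3  // floor
  //      poolOutputSize(8, 3, 2, 0, 0, true)  === 4  // ceil covers the edge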

ningxin_hu: I'm mentoring the GSoC student working on the OpenCV DL module
… the proposed next step is to investigate this request against the criteria we've developed: whether it can be emulated and whether native APIs can accelerate it
… I'm committed to working with the student on this issue

anssik: Thank you all for joining!
… we'll defer WebNN API TAG review issue discussion to our next call

Summary of resolutions

  1. Per scheduling poll results, stick with Thu 14:00-15:00 UTC+0 bi-weekly schedule for WebML WG teleconference during Daylight Saving Time, 15:00 UTC+0 when DST not observed
Minutes manually created (not a transcript), formatted by scribe.perl version 136 (Thu May 27 13:50:24 2021 UTC).
