anssik: Nikhil filed issues for tracking op compatibility resolution for matMul and conv2d. This is a start, more to come, I guess contributions are welcome!
Nikhil: the compat study is to verify that the API supports all of the JS libraries we care about
… the API should be compatible with the platform-provided underlying APIs
… filed two issues, bread and butter of ops, we want to start small and slowly grow
dom: TF.js is an open source project that can evolve, can it evolve to match what happens here, or the other way around?
nikhil: matMul will not change, hopefully conv2d also does not change, these are such basic primitives
… TF.js's strict requirement is that we have the same API as TF, to allow model sharing
… a breaking change to a signature is hard; supersetting TF is a possibility
dom: thanks, that helps, do you know if ONNX has similar constraints?
nikhil: I assume they are more flexible, but cannot speak on behalf of them
… worked with TF and other Google people to figure out how to find an abstraction that will be good for the next 10 years
… not sure ops is the right abstraction, 25% growth in ops YoY in TF alone
… maybe there's an abstraction below ops that could be standardized, with ops layered on top of that layer
… next, I'll introduce matMul op compat study issues findings
… matMul is easier, this signature is from numpy directly
… and much of this is from numpy docs
… two important notes: there are other arguments for this op, e.g. transpose
… maybe transposing should be a graph optimizer's task
… that would allow us to make the API surface smaller
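To make the point above concrete, here is a small sketch (using numpy, since the minutes note the matMul signature comes from numpy directly) of how an explicit transpose can be folded in by the caller or a graph optimizer, rather than matMul carrying transpose arguments; the shapes are illustrative:

```python
import numpy as np

# matMul following numpy.matmul semantics: 2-D inputs multiply as
# matrices, higher-rank inputs are treated as batches of matrices.
a = np.ones((2, 3, 4))        # batch of two 3x4 matrices
b = np.ones((2, 4, 5))        # batch of two 4x5 matrices
c = np.matmul(a, b)
assert c.shape == (2, 3, 5)

# Rather than transposeA/transposeB arguments on matMul itself, an
# explicit transpose can be expressed separately and folded in by a
# graph optimizer, keeping the op's API surface smaller.
a_t = np.swapaxes(a, -1, -2)  # shape (2, 4, 3)
c2 = np.matmul(np.swapaxes(a_t, -1, -2), b)
assert np.allclose(c, c2)
```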
dom: how confident are we that could be done?
… how's the implementation complexity from the implementers (web engine) perspective?
… the way that the graph APIs work in general, stick together a graph, computation happens only when you feed the values into it, that's TF 1.0 style actually
… doing that from the user's perspective is complicated, TF 2.0 does eager mode instead, i.e. ops run when you call them, losing graph optimizations
… the hybrid approach is better for users, get graph optimizations also in this case
… discussion ongoing how to expose these to the browser, underlying there's always a graph
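The graph-then-feed style described above can be sketched minimally; this is illustrative only (not the WebNN API): building the graph computes nothing, and computation happens only when values are fed in, TF 1.x style.

```python
import numpy as np

class Node:
    """One node of a deferred computation graph (illustrative sketch)."""
    def __init__(self, fn=None, inputs=()):
        self.fn, self.inputs = fn, inputs
    def run(self, feeds):
        if self.fn is None:            # placeholder leaf: pull from feeds
            return feeds[self]
        return self.fn(*(i.run(feeds) for i in self.inputs))

x = Node()                             # placeholders, TF 1.x style
w = Node()
y = Node(np.matmul, (x, w))            # graph built, nothing computed yet

# Computation happens only when values are fed into the graph:
out = y.run({x: np.ones((2, 3)), w: np.ones((3, 4))})
assert out.shape == (2, 4)
```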
sushraja_MSFT: does graph rewriting result in losing the ability to use the API for training?
nikhil: good feedback from Benjamin/WebKit on this issue re CoreML & MPS
… need to understand how other accelerators deal with the concept of accumulator precision
ningxin_hu: it relates to our experiment on Android NN API
dom: should this be a capability that is exposed?
nikhil: the question becomes, what accelerators do we want to support?
… conv2d precision could be different between devices, e.g. mobile vs. laptop and could lead to severely different results, this is not theory, we see this happen in TF
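A small numeric illustration of why accumulator precision can produce severely different results: summing many small float16 terms in a float16 accumulator stalls once the running sum's spacing exceeds the addend, while a float32 accumulator keeps the true total (the exact values are incidental; the divergence is the point):

```python
import numpy as np

# 10,000 small values, stored in half precision.
v = np.full(10000, 0.1, dtype=np.float16)

# Accumulate in float16: once the running sum is large enough, adding
# 0.1 rounds to no change at all, and the sum stops growing.
acc16 = np.float16(0.0)
for term in v:
    acc16 = np.float16(acc16 + term)

# Accumulate in float32: close to the true total (~1000).
acc32 = np.sum(v, dtype=np.float32)

# Same data, same op, very different results depending on the accumulator.
assert float(acc32) - float(acc16) > 100
```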
Gabe: when is the operator going to be its own variant?
ningxin_hu: question is also, do you want to open quantization issue or is precision issue enough?
dom: broadly, how do you handle capabilities, do you want to allow different code paths based on underlying capabilities
nikhil: in TF we let this happen, we don't throw at you, things just work, but I expect the model to work the same on phone and desktop/laptop
ningxin_hu: decision should be done by frameworks, API should expose the capabilities
sushraja_MSFT: something to think about is whether to expose hardware capability or have the UA automatically fall back to a slower code path
ningxin_hu: question: there's a TODO; want to know why you chose matMul over fully connected
nikhil: to get the discussion started :-)
… matMul is the simplest thing
ningxin_hu: we can contribute our input from POC to the compat study
nikhil: conv2d(x, filter, padding, strides, dilations)
… padding, everyone does this differently
… it gets fun with tensor layout: the shape of x can be represented many ways, channels can be transposed, different hardware supports different ways, e.g. CUDA has a different way to transpose
… thought about this with Daniel; from the user's POV, if the web developer does not need to think about the underlying platform we're in a better place, so the proposal is to choose just one format, even if the internal representation is different
… browsers have already chosen a particular format, channels not transposed outside
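The layout and padding points above can be sketched as follows; the shapes and the `same_pad` helper are illustrative, not part of any proposed API. The same image batch can be stored channels-last (NHWC) or channels-first (NCHW); they are just a transpose apart, so the API can standardize on one and let the implementation re-layout internally. And "everyone does padding differently": even within the common TF-style 'same' convention, frameworks differ on whether the odd extra pixel is padded before or after.

```python
import numpy as np

# One image batch, two layouts: channels-last vs channels-first.
nhwc = np.zeros((1, 224, 224, 3))      # batch, height, width, channels
nchw = nhwc.transpose(0, 3, 1, 2)      # batch, channels, height, width
assert nchw.shape == (1, 3, 224, 224)

def same_pad(in_size, kernel, stride):
    """TF-style 'same' padding for one spatial dimension (sketch)."""
    out_size = -(-in_size // stride)                   # ceil(in/stride)
    pad = max((out_size - 1) * stride + kernel - in_size, 0)
    return pad // 2, pad - pad // 2                    # (before, after)

# With an odd total pad, the extra pixel lands on one side; frameworks
# disagree on which side, hence the compat issue.
assert same_pad(224, 3, 2) == (0, 1)
assert same_pad(5, 3, 1) == (1, 1)
```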
ningxin_hu: two re-layouts can happen: with constants, and when you feed images into your network, done for every frame
dom: there seems to be no benefit in picking one over the other, it's a matter of optimization; exposing a capability is not useful here
ningxin_hu: align with media element layout by the underlying implementation
dom: one mental model is easier to map
ningxin_hu: earlier discussion, we accept ArrayBuffer, investigate WebGL buffer, texImage2D, video or canvas to be fed as input to this API
ningxin_hu: activation, per our native API experience, fused activation etc. can be done by the underlying graph optimizer; current direction is a group with a small number of ops, leaving others to custom ops; need to investigate if we can optimize more, since optimizers do not work with custom ops(?)
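A minimal sketch of the fusion idea mentioned above (function names are illustrative): a graph optimizer that can see both ops can rewrite a matmul followed by an activation into one fused kernel, avoiding a materialized intermediate; an opaque custom op in the middle would block that rewrite.

```python
import numpy as np

def matmul_op(a, b):
    return np.matmul(a, b)

def relu_op(x):
    return np.maximum(x, 0)

# Unfused graph: two separate ops, intermediate result materialized.
def run_unfused(a, b):
    return relu_op(matmul_op(a, b))

# What an optimizer could emit after recognizing the matmul->relu
# pattern: one fused pass. A custom op it cannot see into would
# prevent this rewrite.
def run_fused(a, b):
    return np.maximum(np.matmul(a, b), 0)

a, b = np.random.randn(2, 3), np.random.randn(3, 4)
assert np.allclose(run_unfused(a, b), run_fused(a, b))
```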
nikhil: we need discussion on the underlying APIs
anssik: currently our charter says: "The APIs in scope of this group will not be tied to any particular platform and will be implementable on top of existing major platform APIs, such as Android Neural Networks API, Windows DirectML, and macOS/iOS Metal Performance Shaders and Basic Neural Network Subroutines."
nikhil: we should look at each of those
anssik: ningxin_hu do you think you can help with this part?
ningxin_hu: yes, we've already done this work in a separate spreadsheet
ningxin_hu: let's look at the supported ops table we've collected
ningxin_hu: listing different op types and their compatibility across Wasm, WebGL, NNAPI, MPS, BNNS, clDNN, MKLDNN, DirectML
… NN API and MPS have good coverage, DirectML with some compat issues documented in this table
ningxin_hu: this is a little bit complex, this table tries to map the native capability, API and parameters, with compat issues marked with notes
… e.g. for MPS padding, we have 4 places that need to be padded, open question how to do the right padding
… for DirectML, we can extract static conv2d information into this table and provide it as input to the compat study under progress
… we want data on how each op is defined and which ops are supported by native platforms
… also uplevel compat study, looking at frameworks
ningxin: WebNN POC perf data for DirectML and WebGL backends
… our POC is open source, code available so you can run these benchmarks yourself
… models used are official TFLite and ONNX models
ningxin: summary, very promising performance speedup, opportunity for better perf with further optimization
ningxin: across platforms we see similarly good speedups, not just Windows
anssik: how much work was it to produce these op compat study items?
nikhil: it was some work, not trivial
anssik: Wanted to discuss two things: 1) near-term goal to produce an Explainer document that complements the spec that helps solicit early review from W3C TAG; 2) incubation to standards track transition, invited Dom of W3C Staff to talk about this.
anssik: Web specs are expected to be reviewed by W3C's Technical Architecture Group (TAG), and the best practice is to seek such TAG review earlier rather than later in the spec design process.
anssik: This is a collective group action.
anssik: we could copy with pride WebXR explainer's approach, it includes e.g. target hardware section
Alex: supporting explainer-driven spec design
Sushanth_Rajasankar: also splitting a spec into modules is one possible design approach
ningxin: what if we have alternative designs and don't yet know which to pick?
alex: explainer is the right place for those, put your alternative designs in the explainer
dom: any discussion in the TAG on the architectural decision record?
alex: my understanding is 8 months dated, not sure at this point
ningxin: what is the process to update explainer?
anssik: PR with review
anssik: Hand over to Dom
dom: W3C Standardization aka Recommendation Track
… build shared understanding when to advance
… Happens in Working Group
… Under strong Royalty-Free policy
… Following a well-defined process to enable:
… - Fairness and consensus
… - Architectural consistency with the platform
… - Proper review from a security, privacy, internationalization, accessibility perspectives (as it applies)
… Incubation in Community Group
… transition to WG when
dom: - Rough agreement on shape & scope of API
… - Some early implementation experience
… - Before it is too late to evolve based on broader input
… Find a W3C Staff to help, e.g. Dom :-)
… Draft a charter reflecting Community Group's view
… Build momentum in W3C broader community (cf workshop)
… Iterate based on reviews (from W3C and others)
… Get formal approval
dom: What about the CG then?
… Various possible approaches:
… - Keep CG to incubate new proposals (e.g. Immersive Web, WebAssembly, Web Audio)
… - Pause the CG while the standardization work happens (may relaunch afterwards)
… - DIY
anssik: thanks Dom, any questions?
nikhil: what is a typical timing?
dom: Immersive Web was 4-5 years in incubation
… Wasm incubated for 2 years
… there are no rules really, depends on where you are in your design process
nikhil: how to evaluate maturity?
dom: interest from target community, key thing is making sure whatever you produce gets adoption
… when you see that implementers are behind the rough shape of the API, it is good time to graduate
anssik: asked Dom to talk to us about Workshops and how they help in getting wider community engaged around a web spec proposal
dom: What is a W3C workshop?
… Open to anyone with relevant expertise
… Broader perspective than specific CG/WG
… Opportunity to hear from more relevant communities
… Typically, 2-days event
dom: W3C Workshop examples
… Web & Virtual Reality Workshop in 2016
… Web & Games Workshop in June 2019
dom: Why a W3C Workshop on Machine Learning?
… Lots of energy in WebML CG
… Lots of interest from many connected but not yet involved communities
… Opportunity to put WebNN in broader context of ML in browser
dom: Possible topics
… WebNN in context
… Integrate ML with browser data sources (e.g. sensors)
… Integration with WebGPU, WASM
… Relation to Speech Recognition, Shape detection
… Relation to /integration with cloud-based inference
nikhil: anyone who implements the underlying APIs would be a great participant in such a workshop
… also MLIR is an important group of people
… anyone from the hardware side would also be a very welcome participant
… what is the format of the workshop?
dom: program committee to decide, can be short talks, discussions, lightning talks
nikhil: lightning talks would work in this context, since this is so much cross-team effort
Dave_Singer: you want to look at opportunities where standards would already help build a market
… Apple would probably be interested in participating
dom: How, where and when
… Define a call for participation
… Establish a small team to do outreach, research and set agenda
… Where? Offer in Berlin - others?
… When? Q1 2020: last 2 weeks of March?
nikhil: TensorFlow Dev Summit is probably March 2020, want to avoid overlap
anssik: Thank you all for participating, see you at the W3C Workshop on Web & Machine Learning in Q1 2020 maybe? :-)
Maybe present: Alex, anssik, dom, Gabe, Nikhil, ningxin