WebML WG Teleconference – 6 October 2022

Meeting minutes

ghurlbot, this is webmachinelearning/webnn

<ghurlbot> anssik, OK

WebML WG Charter 2023-2025 early heads-up

anssik: This is a kick off for the WebML WG Charter 2023-2025 brainstorming discussion. Our current charter ends 2023-04-30 and we are expected to define our new charter for the next 2-year period 2023-05-01 - 2025-04-30.
… And a call to start soliciting use cases and model requirements for WebNN "v2".
… WG rechartering process to formally kick off early 2023, but we want to start now to have adequate time to converge on consensus
… my expectation is we will develop a good draft WG charter during Q4.
… as a reminder, W3C expects new technical proposals to be incubated in the CG prior to the WG adoption.
… the charter development happens in a separate GH repo and is open to proposals from WG participants and also public:

WG Charter

WG Charter (GH repo)
… I'll continue to request feedback from the WG on our calls from time to time and will reflect that into the WIP charter in the GH repo, feel free to chime in there or provide your feedback on these calls.
… please talk to your colleagues internally to ensure you can convey the whole company's perspective -- there's a formal checkpoint for that when we enter so-called AC review Q1 '23 but it is better to align already in this draft state!

Current scope & deliverables

anssik: the high-level scope of the W3C WebML WG you're participating in is:
… Deliverables:
… - Web Neural Network API
… - Ethical Principles for Web Machine Learning (non-normative)
… Out of scope:
… - "Training capabilities are out of scope due to limited availability of respective platform APIs."
… - "To avoid overlap with existing work, generic primitives used by traditional machine learning algorithms such as base linear algebra operations are out of scope."

anssik: Tentative Deliverables:
… Model Loader API is tentative until "agreement on a model format"

Proposed removals

anssik: WG's decided to drop WebGL interop, we should remove coordination unless there are reasons to keep it?

[agreement]

anssik: any objections to keeping Model Loader API as a tentative deliverable?

[agreement]

Proposed new work

anssik: Let me introduce some topics we may want to discuss to understand what new work to add in scope (or not):
… - v2 features. Which features are substantial enough to warrant explicit mention in the charter? Examples: VPU, int8 ...
… I propose we label such proposals with "v2". If no issue exists, you can file one and note it should be labeled "v2"

Proposed v2 features
… - WebGPU interoperability, do we want to be more explicit than in the current Coordination: "[WebGPU API] ... may be used to implement traditional machine learning algorithms efficiently" -- needs revision I think?
… - New models and their requirements. Do we want to take another look at the ML models we want to support in v2? We collected the first-wave models to inform our v1 API, we might want to pull off a similar exercise for v2.

Chai: thanks Anssi!
… for v2, one constant piece of feedback from our external partners when discussing WebNN for their use cases has been the ops
… the set of ops supported must be more comprehensive
… this needs to be more explicit goal, this is important
… related to that, use cases around transformers
… getting big, an area we should probably spend more time on
… also need to support NPU or VPU or xPU

anssik: WebGPU interop thoughts?

RafaelCintron: working with WebGPU folks is important for the success of the spec for that part

The first-wave models

anssik: - Level of abstraction for neural net operations? The explainer has the rationale for the current abstraction

[Explainer] What is the right level of abstraction for the neural network operations?

Chai: on this topic, I don't recall that we disagree with Google's position
… we need lower level primitives they say, but it is important to also have ops that are generally implemented as a single unit, we want to do both low-level and high-level ops
… in the spec we wrote informative sections on those ops that are high-level so that they are translatable to low-level ops
… we made a point of doing more clarification in the spec to address that concern

Chai: we can summarize that explainer text and embed it in the charter, re abstraction chosen

<ningxin_hu> https://www.w3.org/TR/webnn/#security-new-ops

ningxin_hu: agree with Chai, I pasted a link to the spec section that discusses how to add new ops to the spec
… we say in the spec "if an operation can be decomposed to low level primitives:

Add an informative emulation path

Prefer primitives over new high level operations but consider performance consequences"

ningxin_hu: we define both and ensure high-level ops can be HW optimized
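
To illustrate the "informative emulation path" idea discussed above, here is a minimal sketch of a high-level op expressed purely in terms of lower-level primitives. Softmax is chosen only as an example, and every function name below is hypothetical; this is plain Python on lists, not the WebNN API or the spec's own emulation text.

```python
import math

# Hypothetical low-level primitives operating on flat lists of floats.
def exp(xs):
    return [math.exp(x) for x in xs]

def reduce_max(xs):
    return max(xs)

def reduce_sum(xs):
    return math.fsum(xs)

def sub_scalar(xs, s):
    return [x - s for x in xs]

def div_scalar(xs, s):
    return [x / s for x in xs]

def softmax_emulated(xs):
    """High-level op built only from the primitives above."""
    # Subtracting the maximum keeps exp() from overflowing.
    shifted = sub_scalar(xs, reduce_max(xs))
    e = exp(shifted)
    return div_scalar(e, reduce_sum(e))

probs = softmax_emulated([1.0, 2.0, 3.0])
print(probs)
```

An implementation could run the decomposed form where no fused kernel exists and map the single high-level op to an optimized hardware path where one does, which is the trade-off between primitives and high-level ops mentioned in the discussion.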

WebNN API Candidate Recommendation open issues

Current CR issues

Web platform tests

anssik: to satisfy CR requirements, we must document how adequate implementation experience will be demonstrated, and the right way to do that is to produce a cross-browser test suite to validate implementation correctness.
… of course the results are only as good as the tests are
… Bruce working on web-platform-tests for WebNN indicated the only blocker was Unit of Least Precision (ULP) tolerances
… to that end I'm pleased to welcome Dwayne Robinson from Msft to this meeting
… Dwayne is a senior developer at Microsoft working on ML platform engineering
… Dwayne posted his initial list of recommended ULP tolerances to the GH issue

Recommended tolerances (Dwayne)

Dwayne: [presents slides "Operator Tolerance Conformance Considerations"]
… Operator categories: 8, grouped roughly by complexity
… Data movement
… Data generation
… Exact math
… Simple math
… Complex math
… Trigonometric functions
… Lossy accumulation
… Very complex iterative
… each of these categories has similar precision numbers or tolerance values
… one question to answer, are you interested in op conformance or precision?

Dwayne: you cannot test all values; asymptotes and 0/0 cases are dubious areas

catastrophic cancellation
… Note there are numerical gotchas to beware of, including subtraction of nearly equal numbers (see catastrophic cancellation)
… Precision issues and gotchas
… also subtraction of nearly equal numbers
… division by very small numbers
… Ideal (Expected) vs. Actual Signal Behaviour
… left hand side error that is roughly a constant distance from the value
… center, error proportional to the value
… on the right, floating point error with jagged red line due to IEEE float
… Measurement methods
… 3 methods
… - absolute tolerance
… - relative tolerance
… - units in the last place (ULP)

Dwayne: it is sufficient in SW comparison to use ULP instead of relative error
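
The three measurement methods listed above can be sketched as follows for float32 values. This is an illustrative sketch using only the Python standard library; the helper names are hypothetical, NaN handling is omitted, and this is not the comparison code used by the WebNN web-platform-tests.

```python
import struct

def from_bits(bits):
    """Build a float32 value from its raw 32-bit pattern."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def f32_ordinal(x):
    """Map a float32 to an integer so that adjacent floats differ by 1."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Negative floats sort in reverse bit order; mirror them below zero.
    return bits if bits < 0x80000000 else 0x80000000 - bits

def ulp_distance(actual, expected):
    return abs(f32_ordinal(actual) - f32_ordinal(expected))

def abs_error(actual, expected):
    return abs(actual - expected)

def rel_error(actual, expected):
    return abs(actual - expected) / abs(expected)

# Compare each expected value with the next representable float32 above it.
for expected_bits in (0x3F800000, 0x44800000):  # 1.0 and 1024.0
    expected = from_bits(expected_bits)
    actual = from_bits(expected_bits + 1)
    print(expected,
          abs_error(actual, expected),
          rel_error(actual, expected),
          ulp_distance(actual, expected))
```

In this sketch the absolute error of a one-step difference grows with the magnitude of the value, while the ULP distance stays at 1, which is why a ULP tolerance can stand in for a relative one when comparing results in software.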

Dwayne: Contributing error factors
… Compute precision and tensor data type: float16 vs float32, also non-standard types (e.g. bfloat16)
… rounding modes
… subnormal flushing
… different NaN bit patterns
… Number of calculations:
… the more input elements, the greater potential for error

Dwayne: Algorithm used e.g. summation order
… Fusion magnifies earlier errors
… Contributing error factors - IEPOE (Input Element Per Output Element)
… not used directly, but conceptually gives a degree of complexity
… the inputs-per-output-element concept is not used directly, but is useful for considering the lossy math ops
… the greater the number of lossy math ops, the greater the potential for error
… adding ULP tolerances helps establish a sensible upper bound
… beats pulling numbers from thin air
… any questions?

ningxin_hu: thanks Dwayne!
… question about sanity checks, you mention "n", can you clarify?

Dwayne: e.g. adding 100 numbers together we have n = 100

Dwayne: applies to mul(), add(); does not apply to all ops, e.g. not to exp()
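
The effect of "n" and of summation order on accumulated error can be sketched as below. This is an assumption-laden illustration: float32 arithmetic is emulated by rounding after every addition, the input value 0.1 and the function names are arbitrary choices, and the figures it prints are not the tolerances proposed in the GH issue.

```python
import math
import struct

def to_f32(x):
    """Round a Python float (double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def naive_sum_f32(values):
    """Left-to-right accumulation, rounding to float32 after every add."""
    total = 0.0
    for v in values:
        total = to_f32(total + v)
    return total

def pairwise_sum_f32(values):
    """Tree-shaped accumulation; expects a non-empty list."""
    if len(values) == 1:
        return values[0]
    mid = len(values) // 2
    return to_f32(pairwise_sum_f32(values[:mid]) + pairwise_sum_f32(values[mid:]))

def errors(n):
    """Return (naive error, pairwise error) against a double-precision sum."""
    values = [to_f32(0.1)] * n
    reference = math.fsum(values)
    return (abs(naive_sum_f32(values) - reference),
            abs(pairwise_sum_f32(values) - reference))

for n in (100, 10000):
    print(n, errors(n))
```

Two implementations that are both correct can therefore disagree purely because they add the same n inputs in a different order, which is one reason a tolerance for accumulating ops may need to scale with n.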

Chai: proposed tolerances in GH issue are not what we use internally in our products, it is based on expertise but not exactly what we use

<Zakim> zkis, you wanted to ask whether the Web ML group should work on testing/attesting implementations, or share best practices on testing in the spec or separate document?

bruce_dai: I have a question about test input data and output data
… we use number type as input, double precision

Dwayne: cast to what the machine will see, if input is double and you pass it as float32 into the GPU you lose precision in translation
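
The point about casting inputs first can be sketched as follows. The sample values and helper name are hypothetical; the sketch only shows that a double-precision test input may change when narrowed to float32, so an expected baseline is better computed from the narrowed values.

```python
import struct

def to_f32(x):
    """Round a Python float (double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# Test inputs as written in a test file (double precision).
inputs_f64 = [0.1, 0.5, 1.0 / 3.0]

# What a float32 device would actually receive.
inputs_f32 = [to_f32(v) for v in inputs_f64]

# Precision lost in translation, per input.
cast_loss = [abs(a - b) for a, b in zip(inputs_f32, inputs_f64)]

for original, narrowed, loss in zip(inputs_f64, inputs_f32, cast_loss):
    print(original, narrowed, loss)
```

Values such as 0.5 survive the cast unchanged, while 0.1 does not; computing the expected output from `inputs_f32` keeps that input rounding from being counted against the operator under test.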

Add method steps and normative algorithms to operations

anssik: Zoltan is with us today and starting now has time to focus on this task

22 Sep 2022 discussion
… last time we collected feedback on work items for this task:

- internal slot definitions

- clarify graph building with algorithmic steps and internal slots

- (Chromium impl where internal slots are private members will help)

- MLOperand interface

- MLOperator interface

- keep the current declarative op definitions side by side with the algorithmic steps

anssik: Zoltan, do you have comments or questions?
… thanks for your help addressing this CR blocker issue.

zkis: I'm relatively new to this spec so will be asking good questions of you and fixing unclear parts as I go

<ningxin_hu> sgtm

anssik: thank you Zoltan!

– DRAFT –

Attendees