WebML CG Teleconference – 19 March 2020

Meeting minutes

<walrusmcd> anssi - you might be dropping out audio

matMul op definition

anssik: ask for the group was to review PR:

Add matmul PR #49

Ningxin_Hu: PR is based on the matMul signature proposal from Nikhil
… the proposal is strongly numpy inspired

[Ningxin_Hu walking through the details of the matMul op definition]

Chai: might be a good idea to explain the output

Ningxin_Hu: returns section is a brief description of the output, need more details?

Chai: right, the input can be any of the cases enumerated, and each one impacts the output, e.g. if the input has more than 2 dimensions the output should match accordingly
… happy to help clarify that
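
[Editor's sketch: to illustrate how the input cases could determine the output, the following assumes the matMul op follows the numpy.matmul shape rules, which is an assumption based only on the "numpy inspired" remark above; the text of PR #49 may differ. `matMulOutputShape` is a hypothetical helper, not part of the API.]

```typescript
// Hypothetical helper: output shape of a numpy-style matmul.
// Assumes numpy.matmul semantics; not taken from the WebNN spec.
function matMulOutputShape(a: number[], b: number[]): number[] {
  if (a.length === 0 || b.length === 0) {
    throw new Error("matMul does not accept scalars");
  }
  // A 1-D operand is promoted to 2-D; the added axis is dropped at the end.
  const aWas1D = a.length === 1;
  const bWas1D = b.length === 1;
  const a2 = aWas1D ? [1, a[0]] : a;
  const b2 = bWas1D ? [b[0], 1] : b;

  // The last two dimensions are the matrix dimensions.
  const [m, k] = a2.slice(-2);
  const [k2, n] = b2.slice(-2);
  if (k !== k2) {
    throw new Error(`inner dimensions differ: ${k} vs ${k2}`);
  }

  // Leading (batch) dimensions are broadcast, aligned from the right.
  const batchA = a2.slice(0, -2);
  const batchB = b2.slice(0, -2);
  const rank = Math.max(batchA.length, batchB.length);
  const out: number[] = [];
  for (let i = 0; i < rank; i++) {
    const da = batchA[batchA.length - rank + i] ?? 1;
    const db = batchB[batchB.length - rank + i] ?? 1;
    if (da !== db && da !== 1 && db !== 1) {
      throw new Error(`batch dimensions not broadcastable: ${da} vs ${db}`);
    }
    out.push(da === 1 ? db : da);
  }
  if (!aWas1D) out.push(m);
  if (!bWas1D) out.push(n);
  return out;
}

console.log(matMulOutputShape([2, 3], [3, 4]));    // [ 2, 4 ]
console.log(matMulOutputShape([5, 2, 3], [3, 4])); // [ 5, 2, 4 ]
console.log(matMulOutputShape([3], [3, 4]));       // [ 4 ]
```

[Under these assumed rules, an input with more than 2 dimensions yields an output whose leading dimensions are the broadcast of the inputs' leading dimensions.]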

anssik: any other feedback on the matMul op definition?

Handling unsupported OperandType

Handling unsupported OperandType issue #36

How to handle a failing compilation PR #50

anssik: Chai submitted PR #50 to fix issue #36 (thanks!). This PR proposes to update CompilationPreference to follow the WebGL convention for context creation. It also updates the code sample to show how a certain compilation could fail, such as when the model uses a data type not supported by the hardware, and how to recover from the failure.

Chai: wanted to clarify, I looked at WebGL, there's also a concept of a context attribute, WebGLPowerPreference: high perf, low power, [third value not captured by scribe]
… matches what we'd probably want to have for this API as well e.g. in cases when we fallback to CPU
… if float16 not possible on GPU, then can fallback to CPU, but if low power or high perf preference, we'd try that
… another preference in WebGL allows to create a high performance context, but that could fail(?) if not supported
… power preference of WebGL is something we should try to align with

RafaelCintron: agree we need a default
… in WebGL if you ask high perf, you always get something
… never fails
… you always get something back regardless of the preference, it just sorts the list of GPUs

Chai: in this case we have a model, that might not be supported by the GPU, so how we interpret high perf is more absolute in the case of WebNN API, similar to WebGL preference + flag

RafaelCintron: in this case high perf preference is not power setting, just whether it is supported?

Chai: if we try to compile a model, and GPU does not support it, we want the preference to say "we cannot create a high perf compilation of this model"
… semantics different from those of the WebGL behavior

RafaelCintron: so you're saying, if the model requires a lot of copies then high perf pref would fail?

Chai: right

Chai: if developers do not care, they could retry compilation with CPU
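
[Editor's sketch: the retry pattern described above could look roughly like the following. All names here (`CompilationPreference` values, `CompileResult`, `compileWithFallback`, `fakeCompile`) are hypothetical and chosen for illustration; they are not the API proposed in PR #50, and the real API would likely be asynchronous.]

```typescript
// Hypothetical preference names, loosely following the discussion.
type CompilationPreference = "fast-answer" | "low-power" | "high-performance";

// Hypothetical result type: either a compilation, or a reason it failed.
type CompileResult =
  | { ok: true; preference: CompilationPreference }
  | { ok: false; reason: string };

type Compiler = (preference: CompilationPreference) => CompileResult;

// Try each preference in order and return the first successful compilation.
// If all fail, the last failure is returned so the caller can inspect it.
function compileWithFallback(
  compile: Compiler,
  preferences: CompilationPreference[]
): CompileResult {
  let last: CompileResult = { ok: false, reason: "no preference given" };
  for (const preference of preferences) {
    last = compile(preference);
    if (last.ok) {
      return last;
    }
  }
  return last;
}

// Stand-in for a device whose GPU path cannot handle the model,
// e.g. because it uses float16. Purely illustrative.
const fakeCompile: Compiler = (preference) =>
  preference === "high-performance"
    ? { ok: false, reason: "operand type float16 not supported" }
    : { ok: true, preference };

const result = compileWithFallback(fakeCompile, ["high-performance", "fast-answer"]);
console.log(result); // { ok: true, preference: 'fast-answer' }
```

[In this sketch a failed "high-performance" compilation is an explicit, recoverable outcome, which is the semantic difference from the WebGL power preference noted above.]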

RafaelCintron: is there a threshold?

Chai: you don't want to mix two devices midstream

RafaelCintron: if 6 out of 10 ops can be done, does that qualify as a high perf GPU?

Chai: hard to say how we define runtime performance
… given what we know w/o executing, we want to compile with high perf preferences, there could be some policy as long as it is not too slow, if the browser needs to walk the graph to determine which is fast and which is slow then that's too much
… the idea of high perf is for when the user is certain the scenario/use case needs high fps, e.g. computer vision usages need high perf for good UX

RafaelCintron: I see your point, if device-specific there's a fingerprinting concern, maybe there's no other way to specify this than how it's now spec'd

Chai: important topic, we should take this offline

anssik: WebGL API fingerprintable surface is way larger

Ningxin_Hu: thanks Chai for the PR
… ask for clarification, compilation preference: low power, high perf, fast answer
… in current spec not clear fast answer is the default
… fast answer is about responsiveness
… high perf is for use cases that need successive frames such as input from camera frames
… Android NN API has similar preferences
… at the previous meeting, we wanted to handle the failure similarly to WebGL, and define error codes to indicate the reason for the failure

Chai: general observation, if we know what to recommend to the caller given a specific error code, then we should define an error code (or codes) for such a failure path
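
[Editor's sketch: one way to read this observation is that each error code is paired with a recommendation for the caller. The codes and advice below are hypothetical examples invented for illustration; the group has not defined any error codes.]

```typescript
// Hypothetical error codes; none of these names come from the spec.
type CompilationErrorCode = "unsupported-operand-type" | "invalid-model";

interface Recommendation {
  retryWithOtherPreference: boolean;
  advice: string;
}

// Each code exists only because there is something concrete to recommend.
const recommendations: Record<CompilationErrorCode, Recommendation> = {
  "unsupported-operand-type": {
    retryWithOtherPreference: true,
    advice: "retry compilation with a different preference, e.g. fall back to CPU",
  },
  "invalid-model": {
    retryWithOtherPreference: false,
    advice: "fix the model; no compilation preference will help",
  },
};

function recommend(code: CompilationErrorCode): Recommendation {
  return recommendations[code];
}

console.log(recommend("unsupported-operand-type").advice);
```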

Chai: any cases when we do not want the caller to fall back to another preference, not just run the model?

RafaelCintron: in WebGL you can ask for a preference, high power or low power, before context creation; after context creation you can ask whether it can, e.g., do 16-bit float textures or not
… similarly for WebGPU

anssik: do we have extension points in the API to add error codes later on? maybe have a separate issue for error codes?

Chai: right

anssik: ack Ningxin_Hu

Ningxin_Hu: agree we want to separate error code
… can we create a separate issue for compilation preference change?

anssik: do you want to offline this topic?

Chai: sounds good

Ningxin_Hu: similarly, ok with that

Inference API to load and run a model

anssik: I reviewed the WebML CG charter to figure out what changes, if any, are needed to allow us to incubate the load and run a model API proposal alongside the current graph builder API. As a result of that review:
… proposed resolution: add the load and run a model API to the group's scope, the group does not attempt to mandate a specific ML schema or format. No changes to the charter with this clarification.

anssik: comments? concerns?

Greg: when does the charter expire?

anssik: does not expire, since this is a CG

Greg: I think we should go ahead and circle back to charter if we hit a roadblock with charter definition

<gregwhitworth> >thanks

Jonathan: would like to record a decision in the respective GH issue
… maybe someone could pull out the model loader part of WebNN into a separate spec?

anssik: how about starting with all in one spec and split later?

Jonathan: sounds okay too

anssik: proposed resolution: add the load and run a model API to the group's scope, the group does not attempt to mandate a specific ML schema or format. No changes to the charter with this clarification.

anssik: any concerns?

RafaelCintron: LGTM

Jonathan: LGTM

Ningxin_Hu: LGTM

Resolution: Add the load and run a model API to the group's scope, the group does not attempt to mandate a specific ML schema or format. No changes to the charter with this clarification.

Adjourn

anssik: Take care y'all
