W3C

- DRAFT -

SV_MEETING_TITLE

08 Dec 2020

Attendees

Present
Manishearth, cwilso, ada, bajones, cabanier, mounir, alexturn, klausw, adarose, bajones_, Laszlo_Gombos, alexturn_, rik, Brandon, Leonard, dino, yonet, jgilbert, Dylan_Fox
Regrets
Chair
SV_MEETING_CHAIR
Scribe
jgilbert

Contents


<cabanier> I can't log in to get the instructions for cisco

<Dylan_Fox> Hello! First time joining this meeting, also first time using IRC. Nice to meet y'all

<Dylan_Fox> Is there a phone call/video chat to go with this or are we pure text?

<cabanier> there are instructions mailed out on how to connect but I can't get to those since the w3c website is broken

<cabanier> @ada can you send me the connection details over email?

<Dylan_Fox> Same here please - dylan@drfoxdesign.com

<cabanier> or cwilso ?

<Dylan_Fox> yeah I found the link to the "internal mailing list" but it's giving me a redirect error

<yonet> I am having issues to login as well.

<cabanier> scribenick: cabanier

<cwilso> https://github.com/immersive-web/depth-sensing/issues/2

cwilso: this was requested by Piotr

bialpio: I want to make sure that more eyes are on this API and whether it's implementable on platforms other than Chrome on Android
... I would like other implementors to take a look
... the doc should have all the details
... one of the issues I'd like to discuss in a bit
... we're origin trialing it in Chrome. Not a lot of feedback yet, but I would like to hear from other engines whether this API makes sense for them

cwilso: does anyone have any comments?

jgilbert: the feedback is that we should avoid uploading from the CPU to the GPU

bialpio: yes.
... I've had this feedback before
... we can try both CPU and GPU access, because I don't want to offer GPU-only access
... because the data is not that big
... there are other efforts, for instance occlusion
... how can we use depth data without exposing it? So maybe GPU is more privacy preserving
... some experiences might want to use it for physics, so that would need CPU
... this is the type of feedback I'd like to hear
... so we can work out how the data is returned to the user. I want to know what is achievable

jgilbert: if you can sample it, you can read it back

bialpio: some platforms can provide this data without going through the CPU
... which we can't do on Android
... we can expose it as a texture
... right now we can't avoid a copy, but if we provide the API, it might be possible to avoid it
... we should be safe as long as there are ways to surface the data
... should it be CPU only, GPU only, or a mixed mode?
... this roundtrip is not my biggest concern
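
For illustration, a minimal sketch of the CPU-side access being discussed, following the shape in the depth-sensing explainer linked above; names such as getDepthInformation, data, width, and height are assumptions taken from that explainer and may change:

```typescript
// Minimal sketch (assumed API shape, not final) of CPU-side depth access.
function sampleDepthAtCenter(frame: any, view: any): number | null {
  const depthInfo = frame.getDepthInformation(view);
  if (!depthInfo) return null;

  // The explainer exposes the buffer as 16-bit values on the CPU.
  const buffer = new Uint16Array(depthInfo.data);
  const x = Math.floor(depthInfo.width / 2);
  const y = Math.floor(depthInfo.height / 2);

  // Raw value; how to convert it to meters is the open question
  // discussed later in these minutes (mm vs m).
  return buffer[y * depthInfo.width + x];
}
```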

bajones: for the lighting API we took a different approach
... because the underlying API returns a texture natively
... but it had the same concerns
... if I get it as a texture, would it be a depth texture, or do I want RGB for fancier effects?
... there are a lot of variables to think of
... the values that are supplied are 16-bit, which is not a supported format for non-depth textures
... you'd have to limit yourself to WebGL 2
... and given all this, it feels like just giving you the raw data would be a minimal viable API surface
... there are concerns about how efficient it is, but these are the considerations we were thinking of
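
One workaround for the 16-bit format limitation, sketched here under the assumption that the raw data arrives as a Uint16Array: pack each 16-bit value into the two 8-bit channels of a LUMINANCE_ALPHA texture, which works even in WebGL 1. The packing scheme is illustrative, not anything defined by the spec:

```typescript
// Sketch: upload a Uint16Array depth buffer by packing each 16-bit value
// into the two 8-bit channels of a LUMINANCE_ALPHA texture.
function uploadDepth(gl: WebGLRenderingContext, depth: Uint16Array,
                     width: number, height: number): WebGLTexture {
  const tex = gl.createTexture()!;
  gl.bindTexture(gl.TEXTURE_2D, tex);
  gl.pixelStorei(gl.UNPACK_ALIGNMENT, 1); // rows are width * 2 bytes
  // Reinterpret the 16-bit buffer as bytes: low byte -> luminance, high byte -> alpha.
  gl.texImage2D(gl.TEXTURE_2D, 0, gl.LUMINANCE_ALPHA, width, height, 0,
                gl.LUMINANCE_ALPHA, gl.UNSIGNED_BYTE,
                new Uint8Array(depth.buffer, depth.byteOffset, depth.byteLength));
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.NEAREST);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.NEAREST);
  return tex;
}

// In the shader, reconstruct the raw 16-bit value from the two channels.
const unpackGlsl = `
  float rawDepth(sampler2D depthTex, vec2 uv) {
    vec4 texel = texture2D(depthTex, uv);       // (L, L, L, A), normalized to [0, 1]
    return (texel.r + texel.a * 256.0) * 255.0; // low byte + high byte * 256
  }
`;
```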

Nick-8thWall: from our experience, having image-like data provided on the CPU is almost always the wrong thing
... JavaScript is not the most efficient place to process it
... you want to control the size of the buffer
... starting on the GPU is almost always the right place
... there were questions about the lifecycle of the texture
... but you could make the assertion that it's only valid for the duration of the frame
... the format concerns are addressable by texImage2D from a video element
... and so if there were a new texImage2D provider, it would allow you to specify the format you want to receive
... there would still be a copy, but that would be a GPU copy, which is less expensive
... texImage2D ergonomics are a good end-user experience
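
For reference, this is the existing texImage2D path from an HTMLVideoElement that Nick describes; a hypothetical depth "provider" overload would presumably look similar, with the XR depth source in place of the video element:

```typescript
// Existing WebGL path: the browser converts the video frame into the
// requested destination format, ideally without leaving the GPU.
function uploadVideoFrame(gl: WebGLRenderingContext, video: HTMLVideoElement): WebGLTexture {
  const tex = gl.createTexture()!;
  gl.bindTexture(gl.TEXTURE_2D, tex);
  // The caller chooses the destination format; the source's native format is hidden.
  gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, gl.RGBA, gl.UNSIGNED_BYTE, video);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.LINEAR);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.LINEAR);
  return tex;
}
```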

bajones: I wonder if ImageBitmap would be an appropriate fit here
... but when you care about this, ImageBitmap is the one you should be using

Nick-8thWall: if someone gives me a bitmap, I'd prefer that the GPU does the downsizing and not the CPU

alexturn: this is similar to some of the stuff we've done with the web camera on the device
... we exposed the texture directly on the GPU
... it did come from a sensor, where you'd render the world mesh
... so for HoloLens the mesh would be rendered to a texture
... some user agents will start with the data on the CPU, while for others it will start on the GPU
... so whether we need one copy or zero would depend on the platform

bialpio: I would go back to the initial feedback to see if we can provide a hint about where the data is going to be needed
... so we can avoid copies
... from what I've heard, an implementation can still happen
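
One hypothetical shape for such a hint, declared at session creation; the dictionary name, member names, and string values here are illustrative assumptions, not the current API:

```typescript
// Hypothetical: the application declares up front where it will consume the
// depth data, so the UA can avoid unnecessary CPU<->GPU copies.
async function startDepthSession(): Promise<any> {
  const xr = (navigator as any).xr;
  return xr.requestSession("immersive-ar", {
    requiredFeatures: ["depth-sensing"],
    depthSensing: {
      // Ordered by preference; the UA picks the first usage it can satisfy.
      usagePreference: ["gpu-optimized", "cpu-optimized"],
    },
  });
}
```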

Depth sensing: mm vs m

bialpio: the different platforms provide depth in different units
... how can we communicate it best?
... if we want to surface the data, how can we communicate what needs to happen with the data?
... I don't know the best practices

bajones: we dealt with this in the lighting estimation
... but I'm unsure how it will pan out
... through various discussions with Kip, and with the platforms being different, there was some motivation to go the sRGB8 route because it's available everywhere
... so we decided that at the time you create the light probe, you state what format you want it in
... and there's another parameter that gives the optimal format
... so iOS can report only RGB8, but Android can give you the more optimal case
... I'm not sure if it's the right answer
... if there is a format that we can use without conversions, etc., maybe we should use that as a default
... I'm not super familiar with this case, so I'm not sure what would be optimal

cabanier: is the question mm vs m?

bialpio: that is less of a concern
... depending on the exact format, a different shader would have to be used
... it would be great if someone could tell me how it's supposed to be done
... if we can communicate how to interpret the data, that would be sufficient
... and I can make progress on the API
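
One way to "communicate how to interpret the data", sketched here: expose a scale factor that maps raw buffer values to meters, so millimeter-based and meter-based platforms look the same to the application. The attribute name rawValueToMeters is an illustrative assumption, not something the spec defines at this point:

```typescript
// Hypothetical: depth info carries a scale factor so applications never need
// to know whether the platform reported millimeters or meters.
function depthInMeters(depthInfo: any, x: number, y: number): number {
  const raw = new Uint16Array(depthInfo.data)[y * depthInfo.width + x];
  return raw * depthInfo.rawValueToMeters; // e.g. 0.001 on a millimeter-based platform
}
```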

jgilbert: bajones mentioned depth texture formats; if it would work as a depth-format texture, that would be great
... because some operations can only be done on depth textures, and there is universal support for 16-bit depth textures
... how to present the data is a different question
... this does mean that we expect applications to interpret the data consistently
... also, if we have to do a copy anyway, we can do that conversion basically for free
... my instinct would be to just convert it to a single unit system on all platforms

bialpio: my worry is that if performance suffers, will we have to make breaking changes?

jgilbert: that would not be that big of a concern

<jgilbert> scribe: jgilbert

<cwilso> scribenick: jgilbert

ada: next issue

<cabanier> PR: https://github.com/immersive-web/webxr-hand-input/pull/71

manish: Context: the TAG has feedback. In the hand input API, we have constants, like the DOM Node interface does. The TAG found the current design unidiomatic, so we're changing it to enums, defining joints in a certain order. They also prefer human-readable names rather than medical ones, so the PR replaces the names.
... The TAG also proposed numbering them, but we pushed back.
... Moving constants to enums is probably non-controversial, so let's see
... Any objections?
... Another minor point: you can use numbers to iterate over the hand, but you can't do that with enums. But perhaps we can iterate over the dictionary, via for...of etc. Should we use iterable or maplike? bajones had concerns, maybe.

cabanier: We decided to keep XRHand array-like, not map-like, because the high-perf APIs have it that way. Do we really want to change that?
... People will expect the order of joints to be deterministic, so iteration order might be a concern
... We also provided a method to query a joint by joint name, but this might be unnecessary

manish: If you're just doing gesture recognition, being able to query just what you want is better.
... Dominic says we can always define the iteration order, so we can define that. for...in and for...of give you keys and values like you want. You could use XRHand.values() if needed. You'd have the names if you need them, because they're fixed

cabanier: Dominic I think said array order is fixed, but not map-like order
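
A sketch of the map-like shape under discussion. The joint name string (e.g. "index-finger-tip") and the getJointPose call are illustrative assumptions; the actual enum and methods are what the PR above settles:

```typescript
// Hypothetical map-like XRHand: iterate all joints in a defined order, or
// query a single joint by name for simple gesture checks.
function logJointPoses(frame: any, hand: any, referenceSpace: any): void {
  for (const [jointName, jointSpace] of hand) {
    const pose = frame.getJointPose(jointSpace, referenceSpace);
    if (pose) console.log(jointName, pose.transform.position);
  }

  // Gesture-style use: just ask for the one joint you care about.
  const tipSpace = hand.get("index-finger-tip");
  const tipPose = tipSpace && frame.getJointPose(tipSpace, referenceSpace);
}
```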

manish: Second part: The actual names that have been picked

cabanier: We picked names that are correct rather than friendly, but the TAG is probably right about this being harder to understand, especially for non-English speakers

alex: We went through this in the native API, and had the same concerns about requiring people to learn all these new terms. In the issue, I raised some of the ambiguities, for example "base"
... The best we could come up with was "base", "knuckle1", "knuckle2", "phalanx", but that's not great
... It was one thing when it was just us implementing it, because we can learn it for implementation, but among multiple vendors there starts to be more ambiguity, e.g. "maybe the base is part way up the hand" instead
... Apps will start to make assumptions, so we erred more on the side of clarity/specificity than lay understandability
... E.g. "is knuckle1 closer to the tip or the base?"
... Maybe we just pick one and commit to a convention. Devs will often just use "tip", which is easy and clear, but beyond that they're probably rigging the whole hand, so getting knuckles/metacarpals right gets really confusing; they could even be different joints across vendors
... If you need all those joints, you really need to get them all right, but if you're just doing "tip", e.g. for hit testing, it's much simpler.
... The meaning of lower vs upper is also ambiguous, e.g. are your hands at your sides, or held upright in front of you?

manish: Wrote a comment: One real difference is that people have different perceptions of up vs down for their hands/fingers. Because we can rotate hands all over, precision is really important. An advantage of medical terms is that they're well established, even if not known to lay people. "Base" even tripped me up, as to what it specifically means. Ultimately, like with pure numbering, it's ambiguous. With lay names, people have some intuition, but their intuition may not match what the spec means

cabanier: It does suck to have to look up these unfamiliar medical names.

manish: We do still say "tip" at least, which keeps that simple
... I remember trying to figure out which joints Oculus had, and had confusion between Oculus and WebXR. E.g. thumbs have different/confusing terminology. That said, my perspective is that of a more experienced user, so medical names don't confuse me as much as they might a more lay user

ada: Let's do one more issue

cabanier: The issue was already resolved, and I revised some wording in the explainer, but people should re-read the explainer and check that it matches their expectations

ada: Unless there's more to say, I think that's a wrap. I think we should have a meeting next week, so see you then!

Summary of Action Items

Summary of Resolutions

[End of minutes]

Minutes manually created (not a transcript), formatted by David Booth's scribe.perl version (CVS log)
$Date: 2020/12/08 20:59:46 $
