W3C

– DRAFT –
Immersive-Web WG/CG (extended) group call

14 October 2021

Attendees

Present
ada, atsushi, bajones, bialpio, cabanier, cwilso, dino, idris, Jared, klausw, yonet
Regrets
-
Chair
yonet
Scribe
ada, bajones, bialpio, cabanier, Jared

Meeting minutes

<cwilso> scribe?

<ada> https://www.w3.org/2020/05/immersive-Web-wg-charter.html

Charter progress update and Charter 3: The Chartering

We forgot to set up a scribe for a bit; we have been talking about re-chartering.

ada: We want to carry forward all the previous specs to the new charter, right?

ada: Gaze tracking? Y/N?

klausw: There was quite a bit of concern about it, and there are higher-level alternatives such as focus events.

ada: Okay, let's remove it.
… image tracking?

klausw: What's the scope? Raw camera access may be sufficient?

ada: If we drop it we can't ship an image detection spec without rechartering
… face detection?

klausw: I don't know that anyone is working on that.

ada: I think we should remove it.
… so in summary, drop gaze tracking and face detection.

<Jared> 3d favicons is a great example

ada: What topics should we look at adding?

bajones: We should consider features that aren't explicitly hardware-related, like 3d favicons.

ada: Yes, we should look at volumetric CSS, 3d favicons and <dramatic pause> the model tag.

yonet: Face detection has been used for a while for security, and gaze tracking as well. So there are security concerns.

klausw: Gaze could be provided as an XRInput device.

ada: Let's swing back around to this after we talk about new proposals/blue sky.

Depth testing across layers

<ada> https://github.com/immersive-web/layers/issues/135

Depth testing across layers

The problem with layers is that they don't intersect; they just sit on top of each other.

e.g. A projection layer on top of a cylinder layer needs to have a hole punched out,

and if you put the cylinder layer in front you cannot see the controllers.

This is a dirty hack which breaks if you move fast.

We've been looking at having depth sorting between layers so they are actually sorted by depth.

This would be great for having multiple projection layers which can be combined.

The issue comes with opacity, i.e. a piece of glass in front of a cylinder layer.

<Jared> Yes

<Jared> Super interested in that to experiment.

<Zakim> ada, you wanted to shout yes

ada: yes I think it is fine if that is a known edge case

bajones: there are two types of depth sorting. 1. The compositor tests against my depth buffer when compositing the layers.

The second type is when we have multiple projection layers, which will need a per-pixel test to determine which pixel is closer to the camera.

cabanier: we are thinking of doing the second case as it is what has been requested of us

currently you couldn't have any intersecting layers in an X shape without the second style of compositing

currently they would obscure each other in a punch-through kind of way

bajones: how do you imagine this to be enabled?

bajones: it seems you want a per-layer toggle for this

cabanier: it seems for that situation you could work around it with WebGL

Michael_Hazani: I just wanted to express our enthusiasm for this; one of our products really relies on it

one of the things we would like to do is multiple XR sessions on top of each other

<yonet> ada: when Rick described the multi session use case

<yonet> ada: it is very powerful, combining immersive AR and VR session, without the AR layer

ada: like iframes in HTML but for WebXR

Nick-8thWall: we want to use DOM Layers to attach interfaces to users; having it semi-transparent would be really valuable

ada: I think in that situation you could work around it by having the semi-transparent layers be the frontmost layers

Nick-8thWall: just wanted to ensure that the wrist mounted or HUD use case was covered

<Zakim> bajones, you wanted to say that depth testing across sessions may require a shared depth range

Jared: even an implementation with the semi-transparent limitations will be useful for us

bajones: right now with webxr you set a depth range for the whole session, that might not be the case if it gets to the point where we are extending it. I think there are more issues but this is just one which comes to mind. At least within the same XR session this should work itself out fairly naturally.

cabanier: @Nick-8thWall DOM Layers will always be opaque, at least as they are defined right now

Cylinder or quad layers can have transparency and should blend correctly.

For the record: A possible issue with transparency on geometric layers is intersecting layers with transparency on both sides of the intersection. If two quad layers are arranged in an X and both have transparency, there's no natural order to render them in.
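
For context, a rough sketch of how the per-layer opt-in discussed above might look from the application side. The `depthSortingEnabled` flag is hypothetical (it is not part of the current WebXR Layers spec); the layer factories and `updateRenderState` call are the existing API.

```ts
// Sketch only: "depthSortingEnabled" is a hypothetical option illustrating the
// per-layer toggle discussed above; it is not in the WebXR Layers spec today.
async function setUpLayers(session: XRSession, gl: WebGL2RenderingContext) {
  const binding = new XRWebGLBinding(session, gl);

  // A projection layer that supplies a depth buffer the compositor could test against.
  const projectionLayer = binding.createProjectionLayer({
    depthFormat: gl.DEPTH_COMPONENT24,
    depthSortingEnabled: true,            // hypothetical opt-in
  } as any);

  // A quad layer that should intersect correctly with the projection layer
  // instead of simply punching through it.
  const quadLayer = binding.createQuadLayer({
    space: await session.requestReferenceSpace("local"),
    viewPixelWidth: 1024,
    viewPixelHeight: 512,
    depthSortingEnabled: true,            // hypothetical opt-in
  } as any);

  session.updateRenderState({ layers: [quadLayer, projectionLayer] });
}
```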

Focus control for handheld AR

klausw: for phone ar, autofocus is off
… because it can cause issues

<atsushi> issue link

klausw: you can choose fixed focus at a far distance or a close distance
… if you want to do marker tracking, you can't detect things that are close
… should apps have a way to express this?

bialpio: what do people think about this feature?
… there hasn't been much activity on this?
… or are we ok with the way it is right now?

klausw: one problem is that marker tracking, which needs it, can't use it

bialpio: we might be more forward thinking
… the TAG wants to have focus control for raw camera access
… so maybe we should think about the focus control now

<Zakim> ada, you wanted to ask whether it would make sense to tie to hit-test

ada: I've definitely come across this
… can the user agent take care of this?
… I guess it can be hard for the UA to do so

Nick-8thWall: this is a major issue that we run into when scanning QR codes
… I had to print QR codes out on large pieces of paper
… we prefer that autofocus is always on
… on other platforms, we always set things to autofocus because it's clearly the best experience
… my preference would be to make autofocus always the default and maybe have an option to turn it off
… autofocus is clearly the better solution

klausw: I'm unsure when autofocus would make things worse
… people don't seem excited to have an API
… the app wouldn't know if a device wouldn't work well with autofocus

Nick-8thWall: on 8th wall, it's very hard to determine what the best experience is on each device
… on most devices autofocus is best but on some it doesn't work as well
… it's unworkable for us to have a per device decision

klausw: so nobody is interested in making this an API?
… so user agents are free to choose autofocus?
… or maybe it can be triggered by other signals.

0

<Nick-8thWall> 0 or +1

<idris> 0

<Jared> 0

<klausw> 0 or +1

<ada> 0.5

<atsushi> 0.25

<yonet> .5

<cwilso> i

<bialpio> 0 or +1

ada: it seems people want the user agent to make a decision based on heuristics

klausw: ok, we'll make a decision. We might turn on autofocus but then turn it off for certain devices

TPAC Discussion: Getting Hand Input to CR

<yonet> https://github.com/immersive-web/webxr-hand-input/issues/107

LachlanFord: started working on web platform testing (WPT)

<ada> ada to moan about W3C sticklers regarding implementations

Rik: Although both are Chromium-based, we do not share code.

Rik: WPT should run on Android, as soon as it's up we can write the test

bialpio: Not sure what the requirements are; the launch process we go through before launching a feature should work. It seems like it is up to us to advance. Not sure if WPTs are blocking, but they are good to have. Not sure how Oculus Browser is set up. We use a fake device implementation; we are mocking the device to only test Blink code, and can chat about it later.

<Zakim> atsushi, you wanted to discuss just comment, please complete self review checklist before requesting HRs

atsushi: For horizontal review, please complete the self-review checklist. I can do the internationalization one; there is a self-review checklist for all horizontal review areas. Please complete it first. I will post a link to the procedure.

<Zakim> ada, you wanted to moan about W3C sticklers regarding implementations

<atsushi> HR procedure

ada: We might in the long term get pushback from some folks in W3C; the claim of two independent implementations may be met with skepticism. Is this something that Apple may be able to show?

dino: We don't have devices, but we have the barebones implemented in Webkit.

ada: Is it working on Linux at Igalia?

dino: Igalia has it working. Mac had it working, on Valve headsets. Currently the code isn't open source or in a product, so it wouldn't work. Anything we do for our implementation will work for Igalia as well. The question is whether they support a device with native hand tracking.

ada: Adding it to the horizontal review checklist. The TAG may have added things to the checklist. Once everyone is satisfied with this, ping Chris or Ada to get it done. It is on our homepage. There is an editor's draft. Did we do a CFP for that?

dino: Since it is shipping in two browsers, we should move it to a public working draft.

Ada: It says WebXR hand tracking already has a working draft. The next step is putting it forward to CR. We could wait to put it forward for CR, or do it now.

Dino: I don't think we can go to CR until we can even test that the two implementations are correct. I am happy to help as much as I can, for example by writing tests.

Ada: Lachlan, you're working on tests?

Lachlan: Yes, Dino and I are working on tests.

<ada> Once WPT is done ping immersive-web-chairs@w3.org and we will put it forward for a CFP for CR

LachlanFord: I think I'm on the queue, it was about implementations. I think Mozilla is there.

Ada: So we do have two independent implementations?

LachlanFord: Yes, I believe so


Rick: Mozilla had an implementation on the prior version of the API, and it wasn't public.
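
As background for what the tests under discussion would exercise, a minimal sketch of the hand-input API surface from the WebXR Hand Input module (per-frame joint queries). Feature negotiation and error handling are elided, and it assumes the session was requested with the "hand-tracking" optional feature.

```ts
// Minimal per-frame hand-joint query, per the WebXR Hand Input editor's draft.
function onXRFrame(frame: XRFrame, refSpace: XRReferenceSpace) {
  for (const inputSource of frame.session.inputSources) {
    const hand = inputSource.hand;
    if (!hand) continue; // not a hand-tracked input source

    const tip = hand.get("index-finger-tip");
    if (!tip) continue;

    const pose = frame.getJointPose(tip, refSpace);
    if (pose) {
      // pose.transform gives the joint's position/orientation;
      // pose.radius approximates the finger thickness at that joint.
      console.log("index tip at", pose.transform.position, "radius", pose.radius);
    }
  }
}
```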

Communicate earlier that the UA doesn't need a depth texture

<ada> https://github.com/immersive-web/webxr/issues/1228

Rick: This should be a short topic. After you create your WebGL projection layer, you are told you don't need a depth texture, even though you just made one. It would be nice to know before you create it. You create color and depth; if you knew ahead of time, you would not need to create the extra texture. At the moment, if you want to do multi-sampled rendering you lose that memory for no reason. If there could be an attribute, the same attributes that sa[CUT]

Rick: yes, no, it could be avoided. Shouldn't be controversial. Any objections to a proposal to signal whether you need it or not?


bajones: I don't find anything objectionable. There may be many cases where people ignore it. If you are going to put it somewhere, it feels natural to put it on the WebGL binding. The binding seems like the place where you differentiate, and you need to create the binding anyway. Seems like a great thing to put in that interface. Seems like a great way to allow for better memory use.

<Zakim> ada, you wanted to ask about fingerprinting

ada: Is this a property for before a session is granted? This is one more bit of fingerprinting. Could be worth considering.

bajones: Assuming it is on the binding... It could be on the session itself, but I don't see a reason to have it there. There is always a possibility that you could have different pieces of hardware depending on your system, if you have multiple devices on a system.

<yonet> Sure

RafaelCintron: I don't object. This is helpful for reprojection. Knowing this allows them to know what they are opting into.

bajones: good point, we should signal that depth buffers are preferred. If this is there, we should have developers consider that they should provide it.

rick: In the case of the Quest this would be false. With three.js we will not populate the depth texture. This is why we proposed it; it would be nice if we didn't request it in the first place.
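
A sketch of what this could look like from the app's side, assuming the boolean lands on XRWebGLBinding as bajones suggests. The attribute name `usesDepthValues` and the opt-out of passing no depthFormat are illustrative; the exact shape was still under discussion.

```ts
// Sketch: avoid requesting a compositor depth texture when it won't be used.
// "usesDepthValues" is illustrative of the attribute being proposed, not final API,
// and passing no depthFormat as an opt-out is likewise an assumption.
function createLayerWithOptionalDepth(session: XRSession, gl: WebGL2RenderingContext) {
  const binding = new XRWebGLBinding(session, gl);
  const compositorWantsDepth = (binding as any).usesDepthValues === true;

  const layer = binding.createProjectionLayer(
    compositorWantsDepth
      ? { depthFormat: gl.DEPTH_COMPONENT24 } // compositor will reproject with it
      : {}                                    // save the memory; depth-test in our own renderbuffer
  );

  session.updateRenderState({ layers: [layer] });
  return layer;
}
```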

Extending WebExtensions for XR https://github.com/immersive-web/proposals/issues/43

<yonet> Zakim/choose a victim

ada: older issue, the idea has been brought up a couple times, in general the idea of combining 2 immersive sessions where neither session needs to be aware that it's embedded in another

ada: important idea is iframes, so you could drop one page inside of another - it'd be powerful to do this for webxr

ada: you could have an avatar system running on a separate domain, you could feed it locations of people and it'd populate it w/ avatars

ada: is this still something that people would like? if so, what is missing to enable it?

LachlanFord: yes, people want to explore it

Nick-8thWall: random first impressions - it's important to have for example payment integrations
… it'd be very useful

<Jared> Agreed! There is a big list of those types of apps in the Aardvark project

ada: other use cases off the top of people's heads?

LachlanFord: you get more utility if you have something that's composable

ada: earlier topic of combining layers touched upon this as it'd be needed to solve this

ada: how about the computing power required? would this be a blocker?

LachlanFord: input routing and security are major concerns
… composition of multiple full screen images is expensive e.g. on hololens
… maybe pixels aren't the thing to work with

ada: what if we had small widgets that'd be visible and bounding-box them - would that help?

ada: if you wanted to compose multiple sessions - there may be issues around z-fighting (floor plane) and skyboxes

cabanier: is the proposal that the main page enters immersive session, all the other iframes would as well?

ada: unsure, maybe a second level of layers that pull content from a 3rd party
… or maybe you listen in to some event that fires when the main page enters immersive session
… would like to brainstorm this

cabanier: may require a lot of changes, maybe not in the API but in the implementation in the UAs

Nick-8thWall: similar to iframes where you pick where they are on the page, it will be important to do something like it for where they are in the session
… potentially taking 2 clip volumes and merging them together would not make sense - how to define a logical volume that an inner experience is allowed to fill is an important thing to solve here

ada: good point, we can force the inner experiences to have different scales / do something with them
… taking a diorama of the inner experience can be handy

Jared: great to have discussions around it, it's very powerful to have multiple sessions running alongside each other
… we're experimenting w/ OpenXR and making some progress
… it's possible to experiment with chromium and openxr
… there's still some issues but it's still possible to get useful things out of it

yonet: link to a gif, something like changing the scale of the inner experience
https://github.com/Yonet/Yonet/blob/main/images/headTracking.gif

ada: probably not adding to the charter now, but maybe TPAC+1 ?

XRCapture Module https://zspace.com/

<yonet> https://github.com/immersive-web/proposals/issues/68

<end of the break>

alcooper: asks to enable capture of AR / VR experiences
… gap with native approaches (SceneViewer on Android, Hololens also has a way)
… no good solution in WebXR so far
… approach with secondary view but would miss out on DOM overlay
… the proposal is a new API to start recording a session and dump it on the disk
… privacy issues mentioned, the mitigation is to dump it on disk instead of sharing the camera feed with the page (if it doesn't ask for raw camera access explicitly)

bajones: there are 2 different outputs of the API - one is that when you use the API to record, the capture will end up on the device's storage (triggers native capture of the device), but then you get a share handle out of it which means the file that was captured can be shared with the device
… initially I thought that if a recording is not immediately shared, it'll be lost, but that is not the case

cabanier: initial proposal to use views - misunderstood the proposal
… in oculus, this is possible via an OS menu
… can this be achieved similarly?

alcooper: the way the API is proposed is to have a function that takes an enum value (screenshot vs video vs 360video vs ...)

bajones: I don't recall how it works on Oculus, I think it's appropriate to have a way for an API to display a system dialog
… it's not appropriate to just do things w/o confirmation
… it feels that the API should hint to the system dialog about what the page requested

alcooper: the only issue w/ hinting is that the page may not expect that it needs to stop recording

<Zakim> ada, you wanted to mention the web share api

alcooper: one of the items is that it's built into the native solutions, which don't show prompts, but this being the web we need a confirmation

ada: similar to Web Share, where instead of giving an URL and text to share, it's a special thing that pops up a dialog and the promise gets resolved after the dialog is dismissed


Nick-8thWall: one thing that can be tricky is providing custom audio when capturing, for example
… so getting some access to the media streams would be preferred route

<Zakim> alcooper, you wanted to talk about webshare

alcooper: why not getUserMedia() - discussed in the explainer
… this piles on the privacy & security aspect
… that's why the API is proposed as a new one
… on Android there's no implementation for some of the things
… discussed with web share folks, the .share() function is a shim to web share API that adds the captured file to their list

yonet: wanted to also talk about privacy
… useful to just capture the 3d content w/o the background

bialpio: getUserMedia() drawbacks - the immersive sessions may not show up in the pickers etc

bajones: we maybe could funnel things through getUserMedia() but it does seem like relying on an implementation detail
… re: "useful to capture content w/o 3d backgroud" - is this an argument for or against?

yonet: it's an argument for because it gives us more control over what's captured

Nick-8thWall: to clarify - I did not recommend getUserMedia() as a mechanism, I wanted to get a handle to a media stream / track
… you don't decide on the device, but you get media tracks out of a session
… permission to record a session can be used as a permission to share things with the site

<Zakim> alcooper, you wanted to talk about tracks

alcooper: understood about tracks, there is still a level of distinction between transiently getting access to camera feed
… vs the API to trigger the system-level functions
… triggering the recording does not mean the page gets the file

<Zakim> ada, you wanted to ask about returning an opaque buffer

ada: not a fan of having another API that is an existing API plus a little bit
… we already have a way of handing out opaque buffers (images w/o CORS)

<yonet> We are over 10 minutes for this topic

ada: it'd be nice to have something like this where you ask for a recording and get a URL (UUID) pointing to the resource that can be used with the share API
… it separates creation of the file from exposing access to it

Nick-8thWall: e.g. when we record 8thWall sessions, we transcode them to mp4 on the device, so it's important to be able to access the bytes of the media stream
… on headsets with recording there is no expectation that there is a camera feed at all so no privacy impact
… so we should not hamstring the API with a concern that is not always applicable

alcooper: encode the recording as something that would show up as a video file in the user's media library (so .mp4)

Nick-8thWall: other use cases like branding / watermarks
… so there are cases where access to the recording is needed

+1

<bajones> +1

<yonet> +1

<cabanier> -1

<Nick-8thWall> +1, preference for media track implementation

ada: should we work on this? +1 or -1

side note: can watermarking / etc happen post-recording? i.e. when the file is shared with the site
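
To make the shape under discussion concrete, a hypothetical sketch of the enum-taking capture call and a Web-Share-style hand-off. Every name below (`requestCapture`, the `"video"` enum value, `stop`, `share`) is invented for illustration; the proposal's point is that the page triggers a system-level capture and forwards an opaque handle without ever seeing the camera pixels.

```ts
// Hypothetical XRCapture sketch: all names below are illustrative only.
async function captureAndShare(session: XRSession) {
  // Triggers a system-level recording after a user-visible confirmation;
  // the file lands in the device's media library, not in page memory.
  const capture = await (session as any).requestCapture("video"); // or "screenshot", "360video"

  // ... the user finishes the experience, then the page stops the recording ...
  const handle = await capture.stop();

  // The page never sees the bytes; it only forwards the opaque handle,
  // e.g. into a Web Share style dialog.
  await handle.share({ title: "My WebXR session" });
}
```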

Expose a way to query a session about the supported features - need to reconsider? https://github.com/immersive-web/webxr/issues/1205

<bajones> https://docs.google.com/presentation/d/1tMTwkza_WDu5DNknrjwshECQ_OfzNToAXVpt7WXXi-0/edit?usp=sharing

bajones: we need to know what optional features were granted
… the session might discard optional ones and you have no standard way of figuring out
… in one of the scenarios, the anchors API couldn't convey that there are no tracked anchors
… this problem has come up often
… the proposed solution is an array of granted features
… (talks about the slide)
… the only hitch is that the spec specifies them as "any", but they're always DOMStrings
… for instance, DOM overlay has an elegant way to still pass a string by adding an optional dictionary in XRSessionInit
… so features should be strings going forward

bialpio: we already say that the features should have a toString method
… we might as well make them into strings
… so we don't lose anything
… for features like anchors or depth sensing, support might be transitory
… because it might come and go

alcooper: bajones and I talked about this for the XRCapture module
… you don't want to block a session on it
… with XRCapture it would be impossible to know if the feature was granted
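
A sketch of how the proposed granted-features array might be used, assuming it lands on the session as something like `enabledFeatures` (the name and location are illustrative of the proposal, not settled API):

```ts
// Sketch: check which optional features were actually granted before wiring up
// dependent code paths. "enabledFeatures" reflects the proposed shape only.
async function startSession() {
  const session = await navigator.xr!.requestSession("immersive-ar", {
    requiredFeatures: ["local"],
    optionalFeatures: ["anchors", "hit-test"],
  });

  const granted: string[] = (session as any).enabledFeatures ?? [];

  if (!granted.includes("anchors")) {
    // Fall back instead of waiting forever for anchors that will never track.
    console.warn("Anchors not granted; using unanchored placement.");
  }
  return session;
}
```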

<bajones> Slides for the next section: https://docs.google.com/presentation/d/1wXzcwB-q9y4T5VL9sKRhAWz_2nvwAWdFtTvVgk43rlk/edit?usp=sharing

Projection matrices differ between WebGL and WebGPU https://github.com/immersive-web/webxr/issues/894

bajones: the projection matrices are supposed to map coordinates into normalized device coordinates
… webgl and webgpu have different conventions
… so x,y are in the -1 to 1 range in both, but the depth range is 0 to 1 on WebGPU (vs -1 to 1 on WebGL)
… so if you take WebGL projection matrices and feed them to WebGPU, you'll get kind-of-OK results, but they won't be correct
… for viewports, the problem is that webgl has an origin in lower-left corner, +y going up
… webgpu is like other APIs, w/ origin in top-left corner, +y going down
… it's a problem that you'd like to address before the content is being built
… the idea is that things that need to change between the APIs will all go on an XRView
… have XRFrame.getViewerPose() accept a parameter that takes the API name (as an enum)
… to specify which convention to use
… other approach is to leave it to devs to do the math
… for proj matrices, there is a matrix to multiply by, similarly for viewports (not-too-hard adjustment is needed)
… but I'd prefer to hand out the data the right way
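
For reference, the do-it-yourself conversion bajones mentions is small but easy to get wrong. A sketch under the usual column-major (gl-matrix-style) convention, remapping depth from WebGL's -1..1 to WebGPU's 0..1 and flipping the viewport origin; the helper names are mine.

```ts
// Convert a WebGL-convention projection matrix (NDC z in [-1, 1]) to the
// WebGPU convention (NDC z in [0, 1]). Matrices are 16-element column-major
// arrays; only the z row changes: z' = 0.5 * z + 0.5 * w.
function glProjectionToGpu(m: Float32Array): Float32Array {
  const out = new Float32Array(m);
  for (let col = 0; col < 4; col++) {
    const z = m[col * 4 + 2];
    const w = m[col * 4 + 3];
    out[col * 4 + 2] = 0.5 * z + 0.5 * w;
  }
  return out;
}

// Convert a WebGL viewport (origin bottom-left, +y up) to a WebGPU-style
// viewport (origin top-left, +y down) within a framebuffer of the given height.
function glViewportToGpu(
  v: { x: number; y: number; width: number; height: number },
  framebufferHeight: number
) {
  return { x: v.x, y: framebufferHeight - (v.y + v.height), width: v.width, height: v.height };
}
```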

Nick-8thWall: not too familiar w/ WebGPU & how it binds to a WebXR session - right now you set up a WebGL layer and that assumes WebGL
… can't we just get the data in a correct context?

bajones: viewport comes from a specific layer so it knows if it's backed by WebGL vs WebGPU so no extra flag is needed
… but that's not the case for projection matrix, it comes from XRView

Nick-8thWall: can you have gl & gpu layers running side by side?

bajones: yes, theoretically
… there should be no blockers to mix
… we just need to make sure we give the right data

RafaelCintron_: WebGPU is only accessible by the new layers spec so can we put the matrices on the binding (same for WebGL) - this way the apps could mix & match and will have access to the right data
… it'd mean we're deprecating the existing fields

bajones: correct about deprecation

Nick-8thWall: whatever is the base layer is the default on the views, and we can have per-layer projection matrices?

bajones: having the projection matrix associated with the specific layer doesn't sound bad

<cabanier> +1 to what RafaelCintron_ said

cabanier: agreed w/ Rafael to put this info on a binding

bajones: do we want to port this to WebGL as well or leave it as is?

cabanier: yes, it should be added there as well

bajones: more WebGPU & WebXR conversations coming tomorrow, we can go back to it then

Minutes manually created (not a transcript), formatted by scribe.perl version 142 (Tue Jun 1 16:59:13 2021 UTC).