Immersive Web F2F - Day 2

Meeting minutes

Inline AR

<dom> Handheld AR use cases need more than immersive-ar #77

tangobravo: This is the web build of our AR platform

tangobravo: We overlay menus on top

tangobravo: Using WebAssembly to detect various things

tangobravo: Can capture GIFs, do marker-based image tracking

tangobravo: Also works with ARKit

tangobravo: Can pause camera tracking and unpause it

tangobravo: Valuable to support a mode of accessing WebXR tracking data without losing control of the DOM

tangobravo: With an inline AR option, experiences that don't need to do screen capture or their own tracking could do world tracking

tangobravo: Should be some way to get a bit more data into the DOM

tangobravo: Tied for us with raw camera access, since having to start a full session is a blocker

ada: Could you use DOM overlay?

tangobravo: One of the challenges is to get the right aspect ratio. For example, to avoid the notch on the phone.

ada: Can you switch to front camera?

Nick-8thWall-Niantic: Front facing camera doesn't do SLAM - not an option

Nick-8thWall-Niantic: If you swap to front-facing camera and had to leave WebXR session, would be jarring

Nick-8thWall-Niantic: Requiring exclusive session means experience is very different across ARCore and non-ARCore phones

alcooper: Shouldn't have to change the aspect ratio just because it's an inline session - we should be able to just keep it the same even if there is a mode switch

Nick-8thWall-Niantic: I strongly agree with Simon here

Nick-8thWall-Niantic: Here's a prototype of how a shopping experience can work within a web page

Nick-8thWall-Niantic: Not at all confusing to users

Nick-8thWall-Niantic: Can go full screen if desired, but it's optional

Nick-8thWall-Niantic: As I said before, it's a real challenge to build one experience across multiple backends

Nick-8thWall-Niantic: At the same time, this experience doesn't make sense on a HoloLens or a Meta Quest

Nick-8thWall-Niantic: What does it mean to have the view punched out of your browser - it would just be a tiny view and would not make sense

Nick-8thWall-Niantic: If I could wave a wand, getUserMedia would provide intrinsics and extrinsics

Nick-8thWall-Niantic: One argument before was that we are leaking data about the device like extrinsics. However, we already can build lookup tables and so there's not really extra data leaked there.

Nick-8thWall-Niantic: Is that getUserMedia annotation perhaps the right path forward here?
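
To make the request concrete, here is a minimal sketch of what a page could compute if each camera frame came annotated with intrinsics and extrinsics. No such getUserMedia annotation exists today; the `CameraIntrinsics` and `CameraExtrinsics` shapes and both function names are assumptions made for illustration. The sketch uses the standard pinhole camera model with +Z pointing out of the camera.

```typescript
type Vec3 = [number, number, number];

// Hypothetical per-frame annotation; not part of getUserMedia or WebXR.
interface CameraIntrinsics {
  fx: number; // focal length in pixels along x
  fy: number; // focal length in pixels along y
  cx: number; // principal point x, in pixels
  cy: number; // principal point y, in pixels
}

// Hypothetical world-to-camera transform: pCamera = rotation * pWorld + translation.
interface CameraExtrinsics {
  rotation: [Vec3, Vec3, Vec3]; // rows of a 3x3 rotation matrix
  translation: Vec3;
}

// Moves a world-space point into camera space using the extrinsics.
function worldToCamera(e: CameraExtrinsics, p: Vec3): Vec3 {
  const [r0, r1, r2] = e.rotation;
  const t = e.translation;
  return [
    r0[0] * p[0] + r0[1] * p[1] + r0[2] * p[2] + t[0],
    r1[0] * p[0] + r1[1] * p[1] + r1[2] * p[2] + t[1],
    r2[0] * p[0] + r2[1] * p[1] + r2[2] * p[2] + t[2],
  ];
}

// Projects a camera-space point to pixel coordinates (pinhole model, no lens distortion).
// Returns null for points at or behind the camera plane.
function projectToPixel(k: CameraIntrinsics, pCamera: Vec3): [number, number] | null {
  const [x, y, z] = pCamera;
  if (z <= 0) return null;
  return [k.fx * (x / z) + k.cx, k.fy * (y / z) + k.cy];
}

const exampleIntrinsics: CameraIntrinsics = { fx: 500, fy: 500, cx: 320, cy: 240 };
const identityPose: CameraExtrinsics = {
  rotation: [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
  translation: [0, 0, 0],
};
// A point on the optical axis lands on the principal point.
console.log(projectToPixel(exampleIntrinsics, worldToCamera(identityPose, [0, 0, 2])));
```

With this kind of data a page could align its own rendering with the camera image without starting a full session, which is the use case described above.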

bajones: I do also wonder whether these punchouts are what you'd want on headsets

bajones: I also am not sure about multiple AR sessions?

bajones: The fact that you are showing the camera feed on the page without a mode switch could freak people out

bajones: Does showing the camera by default make people believe that the page can capture the camera?

klausw: About using DOM overlay, the camera field of view would cover the whole phone screen

klausw: You could fix this with raw camera access, but then you are basically back to raw camera permissions

<Zakim> alexturn, you wanted to ask about mode switch but stay in inline, like camera permissions

alexturn: are we conflating a few things? we're talking about camera permissions, mode switch
… inline today would go through camera permissions
… we don't need to remove permissions to enable inline from the start
… upgrading the camera feed with AR tracking while keeping permissions would be good first step rather than stalling

cabanier: I am a bit worried that this is a mobile only feature

cabanier: If we want to do inline AR, need a path for headsets too

tangobravo: My view is that inline AR is specific to camera-based platforms like mobile

tangobravo: For me, layout is the big problem on mobile, let's keep the browser in charge of compositing and not ask for camera permissions

tangobravo: Because that is not device agnostic, perhaps that doesn't sit inside WebXR?

tangobravo: The camera approach ticks all the boxes, but perhaps we'd want a more privacy-sensitive approach

Nick-8thWall-Niantic: There are other interesting challenges that arise if you were to try to take a WebXR session without substantial changes and put it into a frame on a page

Nick-8thWall-Niantic: When you start an XR session today, it kills the page's animation frame loop and starts the XR loop

Nick-8thWall-Niantic: Designed as a full takeover

Nick-8thWall-Niantic: Things might expect both loops to run

klausw: Regarding mobile-only vs. headsets, yesterday we talked about an async camera feed - sounds like it could also solve this use case

klausw: This could lead to solutions that work on headsets too

klausw: Question is how we can do this to balance privacy and power

klausw: TAG may not be OK with even more power in getUserMedia

klausw: Would we want to do this basically as a wrapper around ARCore/ARKit APIs? Is that too hardware-specific?

<Zakim> klausw, you wanted to say pose-annotated camera feeds would also be useful for headsets

<ada> Closing the queue since this is getting long

alcooper: Regarding killing the page frame loop, nothing in the spec requires that

<dom> [I don't know if people here know about / have tried https://w3c.github.io/mediacapture-transform/ which provides a framework for real time video frame processing for mediastreams]

alcooper: May just be a bug

klausw: Pretty sure that window rAF keeps working

alcooper: Key privacy concern is how users know the camera isn't being captured

alcooper: Still not seeing how this is a strict blocker

<ada> Nick-8thWall-Niantic:

<klausw> window.rAF test: https://ardom1.glitch.me/

Nick-8thWall-Niantic: From a developer point of view, it's really clear that the best path forward is a getUserMedia style approach

Nick-8thWall-Niantic: Putting aside potential TAG concerns - we decided TAG is there to provide feedback and our job is to provide feedback from users

Nick-8thWall-Niantic: Is there an appetite to expand reach of ARCore substantially by exposing it through getUserMedia

ada: If getUserMedia is "best", should we just approach them?

Nick-8thWall-Niantic: I don't know the right people there

ada: Dom is a good way to find the right people

Nick-8thWall-Niantic: Us bringing it to them is not a great way to get buy-in

alexturn: maybe there is something for us to write in a table with the various privacy trade-offs
… to help build a shared understanding
… for hololens, we have people using the native equivalent of getUserMedia to do this
… we should look at what it would take to bring this to getUserMedia

Ada: TPAC might be a good time for this kind of coordination

<tangobravo> https://github.com/immersive-web/webxr-ar-module/issues/78 - was my thoughts on gUM vs a new session type. For me, new session type allows leveraging all the WebXR stuff around spaces and poses more directly

dom: I'm also the staff contact for WebRTC group where getUserMedia is discussed

dom: I doubt the WebRTC group would want to take up addition of AR things themselves, but they would care how we do it

dom: Agreeing among ourselves on how we would do this is important - once we do that, let's just have a joint call, before TPAC

dom: I don't expect WebRTC working group would want to own this, but they'd offer guidance

dom: There are more modern approaches like MediaCaptureTransform

<alcooper> When I looked into FaceDetection, my proposal (because it leveraged the front facing camera) was to leverage gUM, and I did talk with some people internally and was pointed to the InsertableStreamsAPI: https://github.com/w3c/webrtc-encoded-transform/blob/main/explainer.md

dom: Some TAG individuals expressed thoughts, but no formal pushback

alcooper, insertable streams has evolved into 2 paths: webrtc-encoded-transform probably not as relevant as https://w3c.github.io/mediacapture-transform/

also, re face detection, there have been discussions on how to integrate camera-driver face detection, which has a lot of similarities with what we're talking about here (I think)

see https://github.com/riju/faceDetection/blob/main/explainer.md (although that particular proposal hasn't been adopted by the WebRTC WG)

<alcooper> Sorry, I just grabbed the first link from my proposal; I think I was proposing something slightly different; I'll note that my proposal was also ~a year and a half ago and we never pushed it further forward either: https://github.com/alcooper91/face-mesh

<Josh_Inch> scribenick josh_inch

Are 8-bit outputs sRGB encoded?

Rik: On the web, colors are RGB coded; we write colors into an RGB buffer

<dom> Are 8-bit outputs sRGB encoded? #988

Rik: you get a double conversion which makes everything too light. Used to be done with a hack

Rik: Not optimal, you have to do a lot of things that pretend to be RGB
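
A small numeric sketch of the double conversion Rik describes, assuming the standard piecewise sRGB transfer function. The function names are illustrative and do not correspond to any WebXR or WebGL API; in practice the conversion happens in the GPU or the compositor rather than in script.

```typescript
// Encodes a linear-light value in [0, 1] with the sRGB transfer function.
function linearToSrgb(c: number): number {
  return c <= 0.0031308 ? 12.92 * c : 1.055 * Math.pow(c, 1 / 2.4) - 0.055;
}

// Decodes an sRGB-encoded value in [0, 1] back to linear light.
function srgbToLinear(s: number): number {
  return s <= 0.04045 ? s / 12.92 : Math.pow((s + 0.055) / 1.055, 2.4);
}

const linearValue = 0.2;
// Correct path: the content is encoded exactly once before display.
const encodedOnce = linearToSrgb(linearValue);
// Faulty path: already-encoded content is treated as linear and encoded again.
const encodedTwice = linearToSrgb(encodedOnce);

// encodedTwice is larger than encodedOnce, so the image comes out too light.
console.log({ linearValue, encodedOnce, encodedTwice });
```

The sRGB curve lies above the identity line for values strictly between 0 and 1, so every extra encode step pushes mid-tones further toward white.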

Rik: Anybody from Microsoft or Google know about this? They seem to have the same issues

bajones: So I'm not an expert, but this is an area that is a constant issue for the web as a platform. I'm curious: in OpenXR, is it required or just preferred to use sRGB?

Rik: It is the preference

bajones: It might be an issue, but there's no question we should do something better. When we get to WebGPU we could do the right thing. I don't know what it is though

bajones: In this case it needs to be an explicit signal, something about how to create a layer. That way we are not making differences between platforms or surprising anyone. I imagine we could do the same thing for the WebXR Layers API. It takes an explicit format.

Rik: Yes, it does

bajones: I think the big thing is that, out of necessity, we will have to convert and use the pathway to convert the RGB texture. Maybe we can define it as part of the Layers API and backport from that interface. I don't know what the right shaders are for that

Bajones: No matter what we do we are going to have to deal with a lot of content out there that is not designed for this

alexturn: This sRGB stuff is one of those topics with so many moving parts that it's hard to keep top of mind. I would love to have this conversation with the folks who wrote that code. I know there are ways you can signal OpenXR. This is painful in the OpenXR layer.

alexturn: I remember you had discussions on how to make this work directly; people found a path that worked... then stopped. There is likely a better way to do it. We should find the people who wrote the code and discuss with them

alcooper: It's almost certain that the actual implementation came from Rafael or Patrick at Microsoft

<klausw> https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/modules/canvas/htmlcanvas/canvas_context_creation_attributes_module.idl;l=41?q=colorSpace

rik: Just wanted to raise it; when we do WebGPU we can keep it in mind and fix it

<klausw> looks like there's a colorSpace attribute for 2D canvas context creation

klausw: Looks like it's a colorSpace for 2D

<klausw> anyway, don't have much context on this (no pun intended), but looks like work is being done

while klaus is figuring that out: I'm not sure how widespread the implementation is; I think in Chrome it is respected. I know there is a lot of history. It may not be reliable.

rik: That property is actually the color space. When they say sRGB... they are talking about gamma corrected. I am specifically talking about linear and camera corrected

<alcooper> git blame looks like Patrick did the implementation

<yonet_> agendum: https://github.com/immersive-web/webxr/issues/1275

ada: I'm not sure if this is an issue for discussion

Prepare for implementation report

ada: Correct me if I am wrong, but I think this is that we need to politely ask the working group to prepare an implementation report to move the spec further along.

ada: We can either do it now, or we can see if the independent implementation of Chrome is enough, or we can wait for Apple as well

ada: Does anyone have a strong opinion? Personally, I'm not pressed to do so

dom: It's down to us to determine what's needed for independent implementations; in our case we are thinking of very different underlying platforms. I don't think we should shy away from taking that approach. I don't think I would make it a key requirement for us

dom: Clearly there have been many independent implementations. There is a question about how we measure and document the implementations.

brandel: How soon do other specs move from CR to the next stage, where they have independent implementations?

bjones: Just wondering what the traditional timeline is?

dom: 6 months

dom: the only question of timeline is the question of how much of this group is prudent to write the test cases, run them, and review the results

<dom> https://wpt.fyi/results/webxr?label=master&label=experimental&aligned

jrossi: two questions on this

jrossi: Is the expectation that we are comfortable with manual tests, or do we need automated ones?

jrossi: I agree with the thesis in other specs about platform differences. One thing this group should think about intentionally is that we made the journey from webxr; we should stress test across mobile implementations and ensure it feels interoperable for other devs

dom: if we are ok to run manually there is always value in having automated tests if only for regression testing

dom: I agree with assessment. Maybe not just test cases but also a small prototype

ada: automatically verifying that an immersive experience feels "good" is difficult

<yonet_> agendum: https://github.com/immersive-web/webxr/issues/1272

XRWebGLLayer: Opaque framebuffer + antialias + blitFramebuffer conflict #1272

bajones: This is just a request from a developer of a page who does a lot of java renderings of old game models. He's good about digging into minutiae. One of the things that he called out is that because of the fuzzy language around antialiasing in the WebXR WebGL layer, it doesn't seem like devs can depend on a certain type of antialiasing or web technique that we use. Generally this would be considered invisible but there are som[CUT]

bajones: request to get some clarity around what kind of antialiasing we can get here

bajones: Is anyone aware of an implementation that would cause problems here? Suggest that antialiasing should pass a frame buffer.

bajones: would be shocking if they requested frame buffer false and they got one anyways

bajones: This issue is saying that we need to tighten up the language around antialiasing. Do you know if there is anything that would prevent Microsoft's HoloLens or other hardware from doing that?

One thing with HoloLens and HoloLens 2 is that there is hardware reprojection involved. It does what it does. If someone was insisting on antialiasing off, what are they looking to ensure by turning it off?

bajones: In this case he wants it off so that functionality such as the frame buffer is consistent and known.

ada: Is it even possible to copy out of a WebXR layer frame buffer?

bajones: It's not possible to copy out of it; you should be able to copy into it though

bajones: depends on whether the target is multi sample or not

bajones: I don't see much of a downside

alexturn: If somebody wants to turn it off, we are in business to do that

bajones: dev has said: we have said that these buffers are opaque

bajones: all of this to say that we have created an unintentional blind spot. following the suggestions from this dev would allow developers to consistently and confidently interact with buffers. I think we should put this in spec

cabanier: So in the case of multisampling, we use the multisampled render to texture extension, which has special behavior: if you render to multisample, you don't actually render to a buffer. If you try to copy out or in there might be issues, because I think one of the things was whether there are sample buffers

bajones: Yeah, so my understanding of that situation is that it's special in that sample buffers returns 1 because you are not dealing with a multisample texture.

bajones: the output is multisample

bajones: The actual buffer is single sample. You could ask for antialiasing, turn it on, ask for sample buffers, and get back 1, which means you could have a single sample buffer and it wouldn't act exactly like one. It's sort of confusing, but it is the reason why we should expose it so that devs can do that and know what behavior to expect
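
As a rough sketch of the check developers are asking to be able to rely on: query SAMPLE_BUFFERS and SAMPLES while the layer's framebuffer is bound, and only blit into it when it reports as single-sampled. This assumes the OpenGL ES 3.0 rule that blitFramebuffer fails with INVALID_OPERATION when the draw framebuffer is multisampled; whether an opaque XRWebGLLayer framebuffer reports these values consistently is exactly what the issue asks the spec to pin down. `SampleQueryContext` and `inspectDrawTarget` are hypothetical names, and the interface is a minimal slice of WebGL2RenderingContext so the sketch can run against a mock.

```typescript
// Minimal structural slice of a WebGL context; a real WebGL2RenderingContext
// has these members, and a plain object can stand in for it in tests.
interface SampleQueryContext {
  readonly SAMPLE_BUFFERS: number;
  readonly SAMPLES: number;
  getParameter(pname: number): unknown;
}

interface DrawTargetInfo {
  sampleBuffers: number; // 0 for a single-sampled target, 1 when multisampling is reported
  samples: number; // reported sample count
  blitIntoLikelyAllowed: boolean; // assumption: follows the GLES 3.0 blit rule
}

// Reads the multisample state of whatever framebuffer is currently bound.
function inspectDrawTarget(gl: SampleQueryContext): DrawTargetInfo {
  const sampleBuffers = Number(gl.getParameter(gl.SAMPLE_BUFFERS));
  const samples = Number(gl.getParameter(gl.SAMPLES));
  return { sampleBuffers, samples, blitIntoLikelyAllowed: sampleBuffers === 0 };
}

// Intended use (not run here, since it needs a live XR session):
//   gl.bindFramebuffer(gl.FRAMEBUFFER, xrWebGLLayer.framebuffer);
//   const info = inspectDrawTarget(gl);
//   if (!info.blitIntoLikelyAllowed) { /* fall back to drawing a textured quad */ }
```

The case raised by cabanier, where SAMPLE_BUFFERS reports 1 even though the underlying texture is single-sampled, is the one where this check would be conservative rather than exact.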

ada: Thank you, Brandon. Is that enough?

bajones: yes

<bajones> Navigation slides: https://docs.google.com/presentation/d/1kjAsL9NebaroqQL7thRH6DjRzeIrrTkTBccynP94OKY/edit?usp=sharing&resourcekey=0-nUlPh2G4vHRRjlNcpgkGlw

<yonet_> https://docs.google.com/presentation/d/1ewsefsmLFKIv0fRExCf1VzgvkepSJnrxn76_c8LmWRk/edit?usp=sharing

<yonet_> agendum: https://github.com/immersive-web/navigation/issues/13

Navigation update

<dom> present?

Slideset: https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0003/XR_Navigation_thoughts.pdf

Brandon: There is a difference in the context for links in VR. I'm introducing navigation contexts

[ Slide 6 ]

bajones_: On a page, if you are hovering over a linked element and the user clicks, they navigate. The same metaphor can apply to XR: in certain situations the context can imply that if you take a navigation action then you will navigate, like standing in a doorway or holding an item

navigation destination needs to be shown to the user as some form of trusted UI

no navigation happens until the user takes a trusted action i.e. device button

[ Slide 7 ]

bajones_: navigation can never be triggered by the page i.e. location.href kicks you out of VR, and needs to happen with a trusted non spoofable gesture.

[ Slide 8 ]

bajones_: in addition the page cannot observe that you are about to navigate to stop them from swapping the navigation context

[ Slide 9 ]

[ Slide 10 ]

bajones_: the transition to the new page is fade to black -> interstitial environment -> new site

[ Slide 11 ]

[ Slide 12 ]

bajones_: this could be used to show navigation related information during the interstitial such as a favicon, an equirect map or a simple model

bajones_: having a pose for letting people select a link is a vector for abuse

bajones_: adding contextual information can be really helpful

ada: mitigations for rapid switching, to avoid spam when the developer is constantly changing rapidly

either the position of the link

there probably needs to be some timeout

this could be used to trap users by making the whole page a navigation context so that pressing the navigate button to leave the page instead takes you back to your current location
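
The rules on slides 6 to 12 can be sketched as a small gate owned by the user agent. Everything here is hypothetical: `NavigationContextGate`, its methods, and the settle delay are illustrative names for the ideas discussed, not a proposed API. The sketch assumes the page may only propose a destination, the UA displays it in trusted UI, and a navigation target is returned only from a trusted action after the destination has been stable for a minimum time.

```typescript
// Hypothetical UA-side gate for XR navigation contexts.
class NavigationContextGate {
  private destination: string | null = null;
  private changedAtMs = 0;
  private readonly settleMs: number;

  constructor(settleMs: number) {
    this.settleMs = settleMs;
  }

  // Called on behalf of the page: proposes (or clears) the current destination.
  // The page gets no return value, so it cannot observe a pending navigation.
  setContext(url: string | null, nowMs: number): void {
    if (url === this.destination) return;
    this.destination = url;
    this.changedAtMs = nowMs;
  }

  // What the trusted UI would show to the user.
  displayedDestination(): string | null {
    return this.destination;
  }

  // Called only by the UA when the user performs the trusted gesture
  // (for example a reserved device button). Returns the URL to navigate to,
  // or null if there is no context or it changed too recently.
  onTrustedAction(nowMs: number): string | null {
    if (this.destination === null) return null;
    if (nowMs - this.changedAtMs < this.settleMs) return null;
    return this.destination;
  }
}

const gate = new NavigationContextGate(500);
gate.setContext("https://example.com/gallery", 0);
console.log(gate.onTrustedAction(100)); // null: the destination changed too recently
console.log(gate.onTrustedAction(600)); // the gallery URL
```

The settle delay is one possible form of the timeout mentioned above: a page that swaps the destination just before the button press only resets the timer. It does not address the trapping concern, which would need a separate way for the user to leave.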

[ Slide 13 ]

back links are tricky, but i think the mechanisms for it are already present and could be a community-standard

[ Slide 14 ]

bajones_: accepting navigation requests. Sites which are able to be XR need to signal that. We talked a little about this yesterday during the easy enter XR

since it needs to be dynamic, a declarative tag is probably not the appropriate solution

bajones: should be before the window's DOMContentLoaded Event

[ Slide 16 ]

[ Slide 17 ]

there was some discussion about using isSessionSupported as a signal. It does have an unexpected side effect: it may train users to not trust the button if it's frequently used for fingerprinting

should it be used to allow sites to swap mode for hardware and sites which support both

bajones_: should a site be able to signal how far the loading is done

cabanier: thanks for the slides it's a big topic

<Zakim> ada, you wanted to ask about xr pages using things like a model element

ada: what about sites that would be using e.g. a model tag - in general, with XR content not using WebXR
… would this be appropriate for this?

bajones: interop between declarative and imperative xr would be good
… the model content may not necessarily be a good fit for navigation, but it could lead to a gallery of sorts

jrossi2: I think, needing a dynamic hook, it should probably be another specific event rather than DOMContentLoaded. I think we would still like a declarative signal for XR since it would really help the UA know that the target site intends to be XR

jrossi2: we also want to think about people travelling together, bookmarks could be used to expose a target destination for each person

jrossi2: also declarative signal allows search engine indexing

bajones_: my initial thoughts were if some of it has to be script driven it should all be rather than a two parter. But I can see where you are coming from.

jrossi2: regarding the two tier it doesn't need to be HTML just any early reliable hint, such as headers

ada: sorry to jump in, but a <link> can be a header or a tag

laford: i'm not entirely convinced it needs to be dynamic but that is a longer discussion

laford: having to redefine what a link is feels like a smell because it's so fundamental to the web concepts

jrossi2: I agree that navigation without a button is a bad idea. As a hot take, I am wondering if this is a way we could get it script initiated, as long as there is a reserved interstitial the browser gives, such as a thing that pops up and requests that the user push the button.

bajones_: there may be a path using location.href, if we are in the position where we can trigger the reserved gesture, but it can cause the 2nd activation links to become two clicks.

tangobravo: in our product when you want to open a tab we make it a two click process

<jrossi2> Whynotboth.gif? It might be valuable to have a way to catch "legacy navigation initiations" and that has the two click experience. But then provide the nav context for devs that want to lower the friction

ada: if we start with a 2 clicks approach, this doesn't rule out doing one click as an evolution
… whereas it would be harder to walk back from one click

<jrossi2> Another scenario that we might be able to do one click: same origin nav

bajones_: one click would be updating location.href and then the user validates the navigation?

cabanier: do you think we could treat it like a permission

where once you have gone to a site we give it permission to go back

<yonet_> +1 not to have one time approval

bajones_: ubersites make this awkward because content aggregators, social media and blogging platforms would then be risky unless we were super granular

ada: going back to the previous page should probably be immediate to let users jump out of a place they do not want to be

bajones_: as long as it is a reliable signal it seems like a good signal

tangobravo: on a point to the UX as you are travelling through the void,
… a second click doesn't seem like a terrible UX

alexturn: between the risk of seeing undesirable sites and the risk of getting phished, the phishing risk is bigger because the current site can directly target

ada: before long, we'll end up with XR social media with links, which may lead to the kind of undesirable link destination I was talking about

bajones_: next steps are to gather feedback and start iterating on solutions in the navigation repo in the Immersive Web GitHub

bajones_: this will probably replace the proposal written by Diego from AFrame
… it probably won't be a total replacement; bits will be used

yonet_: and information about when implementations might happen

bajones_: Meta has been experimenting with it but a user library built on top of it could be a good way to start experimenting

cabanier: it's a good idea, of course they can't press the real system button but we can see if it works

jrossi2: a little bit disconnected from navigation, if we could get the declarative solution sooner rather than later, it would enable sites to start implementing the landing features

bajones_: it seems like a reasonably separate piece

bajones_: that can be implemented separately

dom: I was going to suggest that it might be worth speccing separately in parallel

dom: there are a few things that seem to want to tie into it so worth doing earlier

yonet_: lunch

<klausw> ... lunch for 1h, until 2pm (for anyone who missed it)

<dino> i'm talking

<ada> dino:Explainer for model tag is the only docs so far, but working on implementation and have some volunteers for speccing

<Leonard> Apple explainer forked to Immersive Web at https://github.com/immersive-web/model-element

model tag

dino: not designed to replace webxr for fully immersive content, for integrating some xr components into html

dino: why not modelviewer? situations where the page script cannot always do the rendering for security/UX reasons

dino: eg all the information you have to share about head pose etc that is not necessary to be shared

dino: another concern heard: can't align on an interoperable way to render

dino: last 2-3yrs lots of advancement on defining declaratively how rendering should work - won't be pixel perfect interop but close enough we can be happy

dino: why not discuss in whatwg: immersive web are the experts. should define as much as possible here before bringing it to whatwg more baked

dino: always best to do this incrementally. proposing a staged approach: 1- an element that points to a model, api to control camera and maybe play/pause animation built into the model

dino: 2- scripting contents of the geometry

dino: 3- joining the 3d space with the rest of the web page and give full access to the scene graph (but defining interop scene graph will be hard)

dino: ex: apple.com has watch configurator, we don't want to make a model for all permutations of the watch. just want to programmatically change the material

dino: [showing live demo]

dino: also want to discuss 'real css 3d transforms' and think these two concepts will fit in quite nicely together

tangobravo: if we had inline ar through a webxr session without exposing camera, would that replace all use cases of model element and just allow to be moved into modelviewer?

tangobravo: does it allow you to keep the 2d page going and pick models out or something?

dino: what we're going to see in mr environments is some depth to the canvas but on a 2d background, so kind of 2.5D

dino: parts could be protruding out from the page. technically has to be a point where that object extends outside the window rect. very difficult to facilitate that

dino: and then ux like plucking that element out of the page into the 3d environment. rendering environment doesn't allow for that

<dom> [using the model element to have the browser be responsible for "show me this model in my room" is an interesting idea]

tangobravo: makes sense. though imagine having protruding objects is hard to solve with scrolling etc

dino: it's actually pretty cool and creates some neat parallaxing effects

<emmett> Can I raise my hand here?

ada: want ability to control subparts of model. the way I've been doing that in aframe with gltf, I give it a child element to control a particular part of the model and then the properties of that element control the part of the model with a transform

cabanier: what happens if you apply a css to it, like skew/float/etc. if it's an iframe are there limitations?

dino: I think transformation is interesting. At the moment, we've made it such that 3d transforms propagate to the object (though haven't thought of skew and wish we hadn't added it)

dino: iframes is a good example of why you want this as an element the page isn't controlling, because you don't want the user to have to deal with permissions for the information needed for that thing

bajones_: are the ways to control the model exposed as direct attributes of the model element?

dino: currently yes

bajones: I think you'll want both options. some way to set the transform of the object but also the camera. having an override would be nice

bajones_: know we're not talking model formats just yet, but worried about extensibility into areas like this without being backed by a very consistent format

bajones_: current proposal gives ua ability to support multiple formats

dino: agree. I've done non-comprehensive research into formats, I think if we're careful we can propose doms that are similar enough between formats at a high level

dino: we could translate gltf scene graph into js api and that'd be a reasonable place to start
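
A minimal sketch of that starting point, assuming glTF 2.0 JSON, where scenes list root node indices and nodes list child indices and an optional mesh index. `SceneNode`, `buildSceneTree` and `findByName` are hypothetical names, not part of the model element explainer; a real API would also need to expose transforms and materials.

```typescript
// Subset of the glTF 2.0 JSON structure used by this sketch.
interface GltfNode {
  name?: string;
  children?: number[];
  mesh?: number;
}
interface GltfScene {
  nodes?: number[];
}
interface GltfDocument {
  scene?: number;
  scenes?: GltfScene[];
  nodes?: GltfNode[];
}

// Hypothetical script-facing view of one node in the model.
interface SceneNode {
  name: string;
  hasMesh: boolean;
  children: SceneNode[];
}

// Builds a tree for the default scene. glTF requires the node hierarchy to be
// a strict tree, so the recursion terminates for valid files.
function buildSceneTree(doc: GltfDocument): SceneNode[] {
  const nodes = doc.nodes ?? [];
  const scene = (doc.scenes ?? [])[doc.scene ?? 0];
  const build = (index: number): SceneNode => {
    const node = nodes[index];
    if (node === undefined) throw new Error(`glTF node index ${index} is out of range`);
    return {
      name: node.name ?? `node_${index}`,
      hasMesh: node.mesh !== undefined,
      children: (node.children ?? []).map((child) => build(child)),
    };
  };
  return (scene?.nodes ?? []).map((root) => build(root));
}

// Depth-first lookup of a sub-part by name, e.g. to change its material later.
function findByName(roots: SceneNode[], name: string): SceneNode | null {
  for (const root of roots) {
    if (root.name === name) return root;
    const hit = findByName(root.children, name);
    if (hit !== null) return hit;
  }
  return null;
}

const watch: GltfDocument = {
  scene: 0,
  scenes: [{ nodes: [0] }],
  nodes: [{ name: "watch", children: [1] }, { name: "strap", mesh: 0 }],
};
console.log(findByName(buildSceneTree(watch), "strap"));
```

Looking parts up by name is similar in spirit to the child-element approach ada describes for A-Frame, and to the watch configurator example where only a material changes.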

emmett: primary thing model is solving is dealing with camera permissions in AR

emmett: I'm surprised that the result of wanting to solve that is to create a dom node for all of 2d non immersive, non-ar sites

emmett: the model element is to solve the ar case, but it also does everything required of 3d on the web independently of ar. that seems like a difficult approach.

emmett: biting off an enormous amount of work to standardize what leads to effectively a game engine api when you really just need to solve this ar case

dino: would love to show demos but didn't get permission to share before this, hopefully over next weeks or months

emmett: in my experience, the ux of headset vs phone experience of this stuff has almost nothing in common

emmett: don't see content that makes sense for an ar headset and also a laptop or phone browser

dino: the examples are not adding 3d transforms to elements in an ar environment. they're interesting/exciting demos, but not groundbreaking 3d design for web sites. it's adding subtle depth to your existing page and it looks really cool in a headset

dino: think like parallax but _real_ parallax

emmett: vr experience of a panel browser is not great. still a floating window whose ux is fundamentally about text and hard to read. in ar, fov makes it hard to imagine existing content but instead floating snippets of geoloc specific content etc

emmett: haven't seen a vision doc that really sells me on how this will play out, hard to build a good web api without that

dino: also partially agree that what you want of an ar headset is these floating snippets of information

dino: also not aware of vision docs that describe what could be done. but thinking now about what things we have to start on to get to something later on

yonet: maybe we can schedule something when demos are avail

Leonard: how are you going to support all the various features of the file format in various browsers, as things are added etc

dino: we have this problem with images, eg new color spaces tagged differently

dino: often doesn't have great fallback, don't have great answers for 3d either

Leonard: not just animation, but different types of rendering like lighting

dino: already have this situation with webgl and gltf. if you didn't update frameworks, it didn't work.

dino: usd is a great example of what could happen. usd is very extensible in a manner that you can have the same file open in multiple dcc's and each one can add its own metadata to that. you could effectively have a usd viewer, could be the browser, that understands more about the usd than another one. agree huge problem. would be great if we could have a baseline set of features

dino: browsers have come to be good at moving together, sometimes naturally or coordinated

emmett: if you use threejs or modelviewer, you can choose when to update and have universal support across browsers. but then some browsers update slowly

dino: on the flip side of that, hw gets better and you might get an upgrade without the page changing

Leonard: another concern is lighting. very important to commercial retailers who care what their product looks like

Leonard: I didn't see controls for lighting in proposal

dino: great point and another reason why the browser rendering is a good thing, it may have great signals on this that the page does not

dino: one thing not in the explainer, the idea of adding your own IBL with some way to specify what scene you're going to use as a lighting model so you can experience what it might look like in different scenes

yonet: please add questions to the doc (to be shared)

<cabanier> https://docs.google.com/document/d/1u_UwbTcK8wDVLIWeRWf0Yb_n-WABKXu3G1fLKzK9aKI/edit?usp=sharing

<yonet> https://docs.google.com/document/d/1u_UwbTcK8wDVLIWeRWf0Yb_n-WABKXu3G1fLKzK9aKI/edit?usp=sharing

emmett: it's great that you get free updates from a browser update, but also a pain because of unexpected issues from breaking changes

emmett: frameworks give power to control your own versioning system

<dom> [I think this points to the articulation between model formats and browser integration, which I think is closer to the way SVG is managed than PNG or JPEG]

Ashwin: can approach the problem as adding web content to WebXR content or adding 3D content to web content

Ashwin: we did the Prismatic library, a proprietary way to position 3D models inline in a web page and pop out divs from the page, all without entering an XR session

we==MagicLeap

Ashwin: had demos from real content like NY Times

Ashwin: emmett asked for demos, would encourage looking at Prismatic

emmett: yes they're cool

Ashwin: is there some kind of accounting for models that want to go out of bounds... a massive model that extends crazy amounts in the z axis

dino: something spec has to define

dino: an example is a bow and arrow. the arrow is extremely long in one dimension and narrow in the others. how do you best show that if it was going to go through your eye? need to specify constraints

dino: use cases where you want real world size. what if you drag an elephant into a small room. maybe not something the spec should define but definitely a question of how 3d browsing experiences will work
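One possible constraint of the kind dino mentions is a uniform "fit to bounds" scale. The sketch below is only an illustration of that idea, not something from the model element explainer; the `Size3` type, the `fitScale` function, and the sizes used are hypothetical.

```typescript
// Axis-aligned size of a model, or of the volume the page offers, in metres.
// (Hypothetical type for illustration; not part of any proposal.)
interface Size3 { x: number; y: number; z: number; }

// Uniform scale that makes `model` fit inside `bounds` without distortion.
// The result is capped at 1 so a model that already fits keeps its real-world size.
function fitScale(model: Size3, bounds: Size3): number {
  const ratios = [bounds.x / model.x, bounds.y / model.y, bounds.z / model.z];
  return Math.min(1, ...ratios);
}

// An arrow: extremely long in one dimension, narrow in the others.
const arrowSize: Size3 = { x: 0.02, y: 0.02, z: 0.8 };
// Illustrative volume a UA might allow an inline model to occupy.
const pageVolume: Size3 = { x: 0.4, y: 0.4, z: 0.2 };

// The z axis is the limiting one here: 0.2 / 0.8 gives a scale of about 0.25.
const arrowScale = fitScale(arrowSize, pageVolume);
console.log(arrowScale);
```

A cap like this keeps small objects at real-world size while shrinking the elephant-in-a-small-room case; whether the spec or the UA should choose the bounds is the open question raised above.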

yonet: please post in model elements on github

tangobravo: I can't physically see how some of this works [scribe missed some context], if Apple wants to innovate on how to display this. seems like some sort of slice of Quick Look that is easier to solve than the full 3D deal

alexturn: similar to how this topic has gone before. people are frustrated by the gaps in mobile AR today, and closing those is a priority. vendors with headsets end up seeing a different part of the problem space than others

alexturn: the reality is that the 2D web isn't going anywhere. 3D will be layered in and we need to figure out the transition

alexturn: [live demo]

alexturn: [dynamics 365 demo doing 2.5d, 2d panel with meaningful tie ins to the 3d]

alexturn: today this is built by adding 2D into Unity. but there are requests to start from today's 2D thing and layer in the 3D

<jrossi2> We see similar requests from people wanting to start from their 2D content and layer in 3D, rather than the reverse

alexturn: similar example with the HoloLens Remote Assist app

alexturn: also don't want to take over the full display

emmett: I really like this. something really interesting about this is that the 3D model isn't laid out relative to the page but to the window

emmett: would like to see something that allows this and not be attached to the 2d dom

alexturn: risk or opportunity depending on viewpoint on how this evolves

alexturn: remote assist version has examples of both, like a button extending out of page

alexturn: agree with dino that this should get staged out incrementally. should align on that

emmett: reminds me of the alternative proposal, the existing standard 3D model schema

emmett: this to me looks more like that. 3D content on this page, but it's not part of the DOM

<dom> 3DModel - A Schema.org Type

emmett: can put information like where it should be placed

emmett: wonder if that's a better, simpler place to start than wedging 3D into the 2D DOM

alexturn: today, doing this in Unity ends up reinventing 2D from scratch. so they have similar complexity to what we face in enabling all the 3D-in-2D scenarios.

yonet: can we do demos in future calls and invite emmett?

dino: yes

DOM layers

<dom> WebXR Layers

cabanier: Not much has happened on DOM Layers recently

cabanier: Still some issues on how you do hit-testing

<yonet> agendum: https://github.com/immersive-web/layers/issues/280

cabanier: Every layer would be same origin

cabanier: The way you would communicate with DOM layers would be like with popup windows

cabanier: So far, feels awkward

Nick-8thWall-Niantic: On my screen is how we see this working

Nick-8thWall-Niantic: Having the DOM elements on the left just show up on the quad to the right

Nick-8thWall-Niantic: All this is doing is excluding the canvas, but it shows the rest of the page

Nick-8thWall-Niantic: I know you expressed some concerns about things like transparency - we could handle this being opaque

cabanier: Some of the limitations we talked about before have gone away - now you can mix quad layers with content

cabanier: That would let you do exactly what you see here

cabanier: Couldn't do super fancy effects, but opacity is OK

cabanier: That is something we could do with DOM Overlay

cabanier: Would need to be same origin

alexturn: being able to show 2D slates in an immersive experience is important
… in the context of Mesh / metaverse
… there are tricks to position Web content - except when using in WebXR
… there could be security restrictions that apply
… we're very interested in seeing the DOM Layers happen
… at least with CORS/same-origin
… are there security concerns with those kinds of restrictions?
… DOM Layers would provide more control than DOM Overlay
… we've been using iframe - this seems to be a good model for many of the use cases we're seeing in Mesh
… this would be a good place to start experimenting

bajones_: my #1 concern: how do we handle clicking interaction on pages in a secure manner?
… we absolutely cannot allow the JS to drive where the user is clicking on the page
… this could enable e.g. false ad engagement, or drive the user to click on links they don't intend
… you have to have some way where the actual primary interaction with the surface (if it's clickable) is driven by something that is more UA-centric
… the only way to do this is to have some sort of JS-driven mode switch where you say: I'm interacting with the page now, to give the browser control of the controllers
… while keeping e.g. a consistent rendering of the controllers
… difficult to figure out how to deal with the hand-off
… awkward but probably unavoidable for interactive Web content

scribenick alexturn

cabanier: Yea, that was one of the concerns that caused us to focus initially on same-origin

cabanier: Could be awkward with things in front of the quad

cabanier: Is there any extra stuff that could happen here in terms of stealing info vs. what the page could already do?

bajones_: Want to talk to security folks here!

Nick-8thWall-Niantic: Some of this conversation is a little bit weird to me because I'm coming in with a different mental model

Nick-8thWall-Niantic: My expectation of DOM overlay is not that the DOM comes from another web page, but that it comes from the current page

Nick-8thWall-Niantic: Ideally, the DOM overlay would actually be not seeing controllers at all, but mouse pointers, clicks, etc.

Nick-8thWall-Niantic: I don't see why having an <iframe> to a different page is problematic

Nick-8thWall-Niantic: Seems like being on a page and clicking on another page

bajones_: First off, absolutely yes - the page should not have to know it's being interacted with from an XR source

bajones_: Question is how do you reliably map that from the XR space that you're in to the DOM element without taking control away from the page

bajones_: If I'm using a painting application and I replace the controller with a paint brush, the most natural thing to do would be to point the brush at the page and interact with it

bajones_: However, the system's natural pointer direction points differently than the brush

bajones_: If the scene can override your pointer direction, could force you to click on ads and such

bajones_: Need some way to point at car and point at page and back and have it feel seamless

bajones_: I have some ideas but I'll pause for now
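The mapping bajones_ describes, from a pointer ray in XR space to a point on the page, is geometrically a ray/quad intersection. The sketch below is a minimal illustration under stated assumptions: the `Quad` shape, the `rayToPagePoint` name, and the pixel dimensions are hypothetical and do not come from the layers spec. The security point in the discussion is about who supplies the ray: the UA's own pointer pose rather than a direction chosen by the scene.

```typescript
type Vec3 = [number, number, number];

const sub = (a: Vec3, b: Vec3): Vec3 => [a[0] - b[0], a[1] - b[1], a[2] - b[2]];
const dot = (a: Vec3, b: Vec3): number => a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
const cross = (a: Vec3, b: Vec3): Vec3 => [
  a[1] * b[2] - a[2] * b[1],
  a[2] * b[0] - a[0] * b[2],
  a[0] * b[1] - a[1] * b[0],
];

// A quad showing page content (hypothetical shape for illustration).
// `right` and `up` are unit vectors; width/height are in metres.
interface Quad {
  center: Vec3; right: Vec3; up: Vec3;
  width: number; height: number;
  pxWidth: number; pxHeight: number;
}

// Returns the page pixel hit by the ray, or null if the ray misses the quad.
function rayToPagePoint(rayOrigin: Vec3, rayDir: Vec3, q: Quad): { x: number; y: number } | null {
  const normal = cross(q.right, q.up);
  const denom = dot(rayDir, normal);
  if (Math.abs(denom) < 1e-6) return null;            // ray is parallel to the quad
  const t = dot(sub(q.center, rayOrigin), normal) / denom;
  if (t < 0) return null;                             // quad is behind the ray origin
  const hit: Vec3 = [
    rayOrigin[0] + t * rayDir[0],
    rayOrigin[1] + t * rayDir[1],
    rayOrigin[2] + t * rayDir[2],
  ];
  const local = sub(hit, q.center);
  const u = dot(local, q.right) / q.width + 0.5;      // 0 at left edge, 1 at right edge
  const v = 0.5 - dot(local, q.up) / q.height;        // 0 at top edge, 1 at bottom edge
  if (u < 0 || u > 1 || v < 0 || v > 1) return null;  // outside the quad
  return { x: u * q.pxWidth, y: v * q.pxHeight };
}
```

For a 1 m by 0.5 m quad two metres in front of the viewer and rendered at 800 by 400 pixels, a ray pointing straight at its center maps to pixel (400, 200).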

Nick-8thWall-Niantic: Only questions I have there are that you can emit mouseclicks on a page, but can you emit them on an <iframe>?

bajones_: There are multiple differences in telling the page to do a click, it carries different weights and properties than regular clicks - matters for ad clicks, triggering audio, etc.

bajones_: While we could let the scene generate synthetic clicks, that would limit some things you could do

bajones_: Could reason through tradeoff of synthetic clicks vs. system clicks

<dom> DOM isTrusted event
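The difference between synthetic and system clicks is visible to content through the DOM `isTrusted` flag, which is true for events generated by the user agent and false for events created and dispatched by script. The sketch below illustrates how content might treat the two differently; the `ClickLike` interface, the `policyFor` function, and the notion of a "sensitive" action are assumptions made for illustration only.

```typescript
// Minimal shape of the one DOM Event field used here.
// Real DOM events expose `isTrusted` as a read-only boolean.
interface ClickLike { isTrusted: boolean; }

type ClickPolicy = "activate" | "ignore";

// Hypothetical policy: sensitive actions (ad clicks, starting audio) only
// honor clicks the UA generated; everything else accepts synthetic clicks too.
function policyFor(ev: ClickLike, sensitive: boolean): ClickPolicy {
  if (!sensitive) return "activate";
  return ev.isTrusted ? "activate" : "ignore";
}

// A click synthesized by the scene is not trusted.
console.log(policyFor({ isTrusted: false }, true));
// A click produced by the UA from real user input is trusted.
console.log(policyFor({ isTrusted: true }, true));
```

This is the tradeoff bajones_ mentions: letting the scene generate clicks is simple, but those clicks would not carry the weight of UA-generated ones.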

alexturn: there may be different levels of security - own DOM node vs iframe vs x-origin iframe

[demoing http://dk-css3d.glitch.me/ ]

alexturn: we're trying to work through how to bring this kind of approach to WebXR
… it's meaningful to allow same origin iframe content in that context

<Zakim> bajones_, you wanted to mention edge cases in usage, lots of small elements vs. one or two big elements.

https://dk-css3d.glitch.me/

bajones_: if this was used to render lots of HTML elements in plenty of places (e.g. for sign posts), I'm not sure this would be a good usage
… probably better suited for one or 2 interactive panes

cabanier: Folks can already abuse quad layers

cabanier: As for the <iframe> Alex was showing, the browser is still protecting you

cabanier: One thing we talked about is what happens if the <iframe> is not evil but the outer page is

cabanier: Here, you could know

cabanier: Browsers can render one thing at a time today - hard to ask browser to render two

Nick-8thWall-Niantic: If you only have one it should be fine

cabanier: Yea, having two doesn't work

cabanier: Browser layout engine can only do one thing

alexturn: let's not let the perfect be the enemy of the good
… there may be aspects that aren't as seamless - e.g. the experience may choose to use a less exotic pointer / rendered controller
… good enough is good enough
… experiences would be ready to accept these tradeoffs to benefit from using interactive web content

ada: No more questions - just want to add: not sure if it's predictable enough and we haven't seen as many devs with different use cases and have concerns with transparency

<yonet> https://github.com/immersive-web/administrivia/blob/main/F2F-April-2022/schedule.md

<klausw> FYI: 15 minutes break, continuing at 4:07pm

UA projection layers

<yonet> agendum: https://github.com/immersive-web/proposals/issues/75

ada: really important to put content without blurring
… issues with fingerprinting, privacy
… really want people to have persistent avatars without giving up on privacy
… just finishing avatar in WebXR

Alex: UA can show their own dialogs and some information has been shared for avatars
… interesting negotiations already happening in sharing this info

Ada: communication between different platforms

Alex: the question of who makes the avatar system is like the question of who makes the system for sharing items, e.g. browsers
… scope of what should be shared and how

Ada: user can upload the anchors to the cloud for sharing
… there are side effects to it, just as with sharing the avatars

Alex: limits to trust are blurred and not clear

Rik: still some time before it can be implemented

Alex: interesting if the avatar could not know about the location but still could be rendered correctly

Add support for space warp

<yonet> agendum: https://github.com/immersive-web/layers/issues/272

<cabanier> https://uploadvr.com/bonelab-120-hz-application-spacewarp/

Rik: Meta came up with a new feature for Khronos to improve performance
… you render at half the frame rate but still get the full frame rate
… there can be issues with transparency, but in general it works fine
… reduced resolution for depth and velocity buffers

<cabanier> https://github.com/immersive-web/layers/pull/273

Rik: option to opt into the space warp feature
… previous texture was having same width and height and depth but now they can use different values for depth and velocity buffers
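As a rough back-of-the-envelope sketch of the trade-off Rik describes (the app renders at half the display rate, and the velocity buffer can use a different, smaller size than the color buffer), the following compares the shading work in both modes. The eye buffer size, the 90 Hz refresh rate, and the 1/4 scale factor are illustrative assumptions, not values from the layers pull request; `TextureSize`, `scaled`, and `colorPixelsPerSecond` are hypothetical helpers.

```typescript
interface TextureSize { width: number; height: number; }

// Scale a texture size by a factor, rounding up to whole pixels.
function scaled(size: TextureSize, factor: number): TextureSize {
  return {
    width: Math.ceil(size.width * factor),
    height: Math.ceil(size.height * factor),
  };
}

// Color pixels the app has to shade per second at a given app frame rate.
function colorPixelsPerSecond(color: TextureSize, appFps: number): number {
  return color.width * color.height * appFps;
}

const eyeBuffer: TextureSize = { width: 2000, height: 2000 }; // illustrative size
const displayHz = 90;                                          // illustrative refresh rate

// Without space warp the app renders every displayed frame.
const fullRate = colorPixelsPerSecond(eyeBuffer, displayHz);
// With space warp the app renders every other frame; the system synthesizes the rest.
const halfRate = colorPixelsPerSecond(eyeBuffer, displayHz / 2);

// The velocity buffer no longer has to match the color buffer's dimensions.
const velocityBuffer = scaled(eyeBuffer, 0.25);                // illustrative factor

console.log(halfRate / fullRate);
console.log(velocityBuffer);
```

The extra cost of producing the velocity buffer is not modeled here; the point is only that halving the app frame rate halves the color shading work while the display keeps its full rate.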

Ada: Will it struggle with user motion?

Rik: No

bajones: it's an interesting feature and the potential for performance improvements is great
… different values for width and height and depth buffers is good
… depth texture is normally tied to color texture
… concerns about different depth textures in the same structure
… OpenXR does it per layer
… you can do it for one layer but not other
… you need some external tools for that
… concerns about texture not populating in some cases

Rik: Projection layers do not use space warp for now
… depth and motion do not have to be the same size

depth texture has to be same size as color texture

Ada: Can you tell the difference?

Rik & Josh: Developer is super excited about it and it's not apparent

Bajones: Some examples would be good to experience it

Josh: There are some samples we can share

– DRAFT –
Immersive Web F2F - Day 2

22 April 2022

Attendees