Immersive Web face-to-face May 2026 day 1

Meeting minutes

Open Metaverse Browser Initiative

<NeilT> Link to the Open Metaverse Browser Initiative Slide Deck: https://portal.metaverse-standards.org/document/dl/8175

Slideset: https://docs.google.com/presentation/d/172-ptw-kR3yybzZcOdbWwyRB34i63Pd4Iy9MfwlxkZ4/edit?slide=id.g2cc65423add_1_0#slide=id.g2cc65423add_1_0 and archived PDF copy

<yonet> https://webofworlds.github.io/

<yonet> item: immersive-web/marker-tracking#2

Marker-tracking repo owners

[shows screen in tabular view] - I wanted to characterize the difference between some of these things - some things _look_ like QR codes and are not, there are 1D codes
… Some that require pre-training etc. We have agreed in the past to support or start with QR, but I am curious about what the current view is about what's in scope

alcooper: looking to the future and across the OpenXR spec, it does look like it supports ArUco, QR and other tags. We should probably aim to make the Web-facing API similarly flexible

bajones: seconding alcooper's comments, and to thank m-alkalbani for taking the time to look through the issues and the state of the art properly. General thumbs-up-based consensus in the room
… We do probably need to start with an easy consensus like QR, and make sure that it works everywhere, but make the spec flexible enough to extend beyond that
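
[Editor's sketch] One way to picture "start with QR, stay extensible" is a request that names required and optional marker formats, resolved against what the device reports. Everything below is hypothetical: `MarkerFormat`, `MarkerTrackingRequest` and `negotiateMarkerFormats` are illustrative names, not part of any published marker-tracking API.

```typescript
// Hypothetical types: nothing here comes from a published spec.
// Formats are plain strings so new tag families can be added later.
type MarkerFormat = string;

interface MarkerTrackingRequest {
  required: MarkerFormat[]; // the request fails without these
  optional: MarkerFormat[]; // enabled only when the device supports them
}

interface NegotiationResult {
  ok: boolean;
  enabled: MarkerFormat[];
  missing: MarkerFormat[];
}

function negotiateMarkerFormats(
  request: MarkerTrackingRequest,
  deviceSupports: ReadonlySet<MarkerFormat>,
): NegotiationResult {
  // Required formats the device cannot track.
  const missing = request.required.filter((f) => !deviceSupports.has(f));
  // Every requested format the device can track, required ones first.
  const enabled = [...request.required, ...request.optional].filter((f) =>
    deviceSupports.has(f),
  );
  return { ok: missing.length === 0, enabled, missing };
}

// A QR-only device still satisfies a request that treats ArUco as optional.
const qrOnly = negotiateMarkerFormats(
  { required: ["qr"], optional: ["aruco"] },
  new Set<MarkerFormat>(["qr"]),
);
console.log(qrOnly.ok, qrOnly.enabled);
```

Keeping the format list open-ended is the design choice being illustrated: QR can be the only format every implementation must handle while others remain opt-in.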

cabanier: Quest now supports _native_ QR code tracking, but we have found in OpenXR testing that there are a number of non-standard trackers with more bits - and that the browser is somehow more performant than the OpenXR tracking

<yonet> ada q+

cabanier: I'm not sure that we _need_ this feature anymore, and have found that a lot of people have been kind of doing this on their own already. It seems like we will need to continue expanding the spec

bajones: I think the dino QR code _is_ a standard QR code with some of the redundant bits removed

ada: They are standard, but typically track less well because they have dropped the redundancy

rik: The QR tracking is for 'regular camera', rather than 'raw camera access'

ada: I think there are some advantages to using this as an extension

cabanier: We would suffer limitations if we were to expose this as a method rather than letting users use camera for it

alcooper: I'd support the ability to leverage this as a capability, and that there is a privacy-facing difference between camera access and QR tracking

bajones: So you get a snapshot of the position and orientation over time, but it's not a persistent thing that appears over time
… There is a privacy-preserving aspect to doing this as an API _even if_ the browser did this as an OpenXR extension. It could provide it in JS even
… ideally the OpenXR implementation could be at least as good as a JS one, but that's another story
… arguably if it's provided as a browser-level thing then we can probably do away with some aspects of permission gating
… Even if this means it doesn't turn up as a continuously-tracked element, and that requires a separate feature

API Shape Considerations and Research #3

<scribe> github issue marker-tracking #3

cabanier: There are perf overheads for approaching it in this way

alcooper: Maybe this could be sliced or gated based on HW perf categories, even if it can't be turned on everywhere

bajones: conscious of the concern that all the things shouldn't be turned on, and that it would be good to make engines apply reasonable defaults

ada: re: tracking a single point in time, because the API surface is still in flux - would you like the API to specify if/when it is single point vs. continuous?

cabanier: At this stage, I don't think we would implement this API at all. Developers currently have the ability to do this

ada: Presumably firing up the camera has an impact too. Do developers typically keep the camera up for the duration?

cabanier: They'd have to if they were doing continuous

ada: If this is done through the API, then it gives more choice

bajones: My main concern about "Go build it yourself" is, as you mentioned, it requires some device-specific metrics. That means that it'll be likely to work really well on the most popular devices and be much less well-tested on other ones.
… If we leave it to developers, they'll likely just test on the popular ones

ada: We don't support Immersive AR sessions, so we'd need to put the QR code on your cheek because the "camera" is just the "user" camera access

alcooper: Android allows the front-facing camera

ada: This is a great topic m-alkalbani! And also the lack of editorship. alcooper has suggested that he might be willing to be an editor?

alcooper: Yes - I don't know when or how I will do the work, based on my availability through the end of the year, but I am interested

ada: Great! We are all stretched very thin. And m-alkalbani, are you also willing to take this up?

m-alkalbani: Yes, I would be happy to

<m-alkalbani> Different markers (slide): https://docs.google.com/presentation/d/1CaDrz0SJrdU7N9Fi6W0txvBo4fblLP89DFSLI4-_RGI/edit?usp=sharing

<yonet> Thanks m-alkalbani

lunch

WebXR integration with HTML-in-canvas #1414

<yonet> immersive-web/webxr#1414

bajones: worth bringing up because progress is being made on the HTML-in-canvas, catching up everyone and looking at WebXR specific bits
… let's look at some demos
… HTML-in-Canvas is working pretty well, pushing to standardize it soon-ish
… (looking at a WebGL demo with selectable HTML text)
… (looking at a glass effect demo with interactive HTML in WebGL)
… check out html-in-canvas.dev
… you can use any html elements (buttons...). Elements need to be a child of the canvas element, they can get 3D transforms, events "fall through" the canvas
… so events work but you're limited to what 3D CSS transforms support
… this falls apart for WebXR, the "html on a ball" use-case
… so we want to discuss the input story for WebXR, but we should be able to adopt all the rest
… if we come up with a solution applicable outside of XR we should definitely make everything better, but we may end up with divergent solutions

ada: we have an analog with 2D quad layers
… we can raycast to layers and get 2D coordinates. It doesn't solve "html on a ball" but it could give us interactive HTML
… and let canvas people solve "html on a ball" for generic shapes
… but there's the question of how much do we trust these events (as opposed to layers where we own the geometry / events)
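
[Editor's sketch] The "raycast to a quad and get 2D coordinates" step can be illustrated with plain geometry. This is a minimal sketch, not the WebXR Layers API: `raycastQuad` is a hypothetical helper, and it assumes the ray has already been transformed into the quad's local space, where the quad is centred on the origin in the XY plane.

```typescript
type Vec3 = readonly [number, number, number];

interface QuadHit {
  u: number; // 0..1, left to right
  v: number; // 0..1, top to bottom
  x: number; // pixel column in the backing HTML content
  y: number; // pixel row in the backing HTML content
  t: number; // ray parameter at the hit (distance if direction is unit length)
}

// Intersects a ray with a width x height quad lying in the z = 0 plane.
// Returns null when the ray is parallel, points away, or misses the quad.
function raycastQuad(
  origin: Vec3,
  direction: Vec3,
  width: number,
  height: number,
  pixelWidth: number,
  pixelHeight: number,
): QuadHit | null {
  const [ox, oy, oz] = origin;
  const [dx, dy, dz] = direction;
  if (Math.abs(dz) < 1e-9) return null; // parallel to the quad plane
  const t = -oz / dz;
  if (t < 0) return null; // quad is behind the ray origin
  const hx = ox + t * dx;
  const hy = oy + t * dy;
  if (Math.abs(hx) > width / 2 || Math.abs(hy) > height / 2) return null;
  const u = hx / width + 0.5;
  const v = 0.5 - hy / height; // flip so v grows downward like CSS pixels
  return { u, v, x: u * pixelWidth, y: v * pixelHeight, t };
}

// A ray straight at the centre of a 2 m x 1 m quad backed by 800 x 400 px.
console.log(raycastQuad([0, 0, 1], [0, 0, -1], 2, 1, 800, 400));
```

The open question from the discussion is who runs this mapping: if the page does, it can handle arbitrary shapes, while a browser-resolved version for quads leaves the geometry in the user agent's hands.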

bajones: if the developer could map to x/y it solves the "ball problem" (but ada was mentioning browser-resolved x/y)
… maybe we don't need to solve the "ball problem" right away, maybe layers work

ada: the developer is responsible to use the target ray space

bajones: still need to handle teleportation, and we might want some sort of cursor analog so that the user can see where you're pointing

ada: we'd probably want this cursor to be a user-agent concern, needed for visionOS

bajones: for vision, the captured HTML would not include any of the highlights

ada: yes

ada: but we could map highlight regions, as long as it's a quad

bajones: might need to know about other environment factors (like fog between the user and the layer)

ada: another tricky thing about layers is that right now they don't occlude each other; you can't provide a depth map

cabanier: you can mimic it

bajones: we also need a way to toggle input on/off

ada: one analog: the API for dom-above-AR where you can cancel rays

bajones: that might just work with the current HTML-in-canvas approach
… just need an opportunity to preventDefault

cabanier: is your proposal for this to only work for layers?

ada: just the input part

bajones: maybe layers are not necessary, just a convenient way to constrain it to "quads in space"

cabanier: everything has to be same-origin

bajones: and everything needs to be a child of the canvas element, which can't include iframes

bajones: layers might still simplify things (already has transforms), but we could plumb that information to a new API

ada: you could even do arbitrary shapes with the UV-map
… feels hacky

bajones: pointer events do some extra stuff

ada: yes the three.js implementation is full of hacks

cabanier: we want the OS to do most of it

bajones: do we have scenarios that can't be fully captured with an x/y mapping (advanced gestures)?
… might be safer to restrict to a quad
… if we do this kind of interaction, we need the browser to synthesize events. What does the browser need? Needs more research

ada: might set up a joint meeting

cabanier: could we just adopt their proposal? have the CSS transforms represent the quads in space
… without layers
… layers could give you more information about what needs updating

bajones: with or without layers, informing the browser of the positioning lets us tightly scope the WebXR-specific bits

Mike_Wyrzykowski: for translucent layers, do we have a way to indicate that ray casting can poke through the translucency?

bajones: that's a good question, we would probably inherit a lot of the basic DOM behavior
… a reliable way to do that might be to have a transparent div at the back, and let events hitting it pass through
… it could get messy, but I think we can do it without an additional mechanism

yonet: action item is to arrange a meeting to learn more about the html-in-canvas rationale

bajones: do we agree that starting with "quad in space" makes sense?

(no complaints)
… especially for interactive content, you don't want to get too crazy with the shapes

<yonet> Zakim choose a victim

How can we expose foveation? #32

<scribe> github issue WebXR-WebGPU-Binding #32

cabanier: How can we introduce foveated rendering in WebXR?

cabanier: Under the hood we expose foveation to the texture. Hopefully for WebGPU we can make it more explicit. In Vulkan at least, OpenXR exposes a swap chain. The lower the value, the rougher it renders on the GPU

cabanier: Some implementations may want to vary the foveated rendering level

<bajones> WebGPU discussion of Variable Rate Shading: gpuweb/gpuweb#450

cabanier: The way we can expose it is that when we get the texture for color and binding, we can also expose it to the WebGPU extension

bajones: Variable rate shading was discussed with the WebGPU group. There is some discussion on their end; Metal, Vulkan and D3D each expose something different.

bajones: Like I said, the WebGPU group is enthusiastic about this and wants to see it happen. This is definitely the main reason to implement it. Question to you: would it be better to expose the foveated rendering value?

cabanier: I'm not sure how it works with WebGPU; it may need to be attached to the render pass

bajones: It will be difficult for the browser; WebGPU itself needs to patch it into the render pass

cabanier: Developer may render to the swap chain

cabanier: We need something in WebGPU

cabanier: My main ask is to expose the texture and not just the level. The foveation level updates the texture

cabanier: How long do you think it will take?

bajones: Hard to say since it has a dependency on WebGPU WG

bajones: Without this we're at a disadvantage

bajones: On the WebXR side we can expose the texture without knowing the exact details. Going through the swap chain design process will help us take it to the WebGPU folks. When talking about swap chains, we have a couple of pseudo swap chains.

cabanier: When we call view.texture we get the sub image

cabanier: I guess it won't work for foveated layer, but I'm not sure if anyone is doing that

bajones: Every time we talk about this I bring up eye tracking. If we return a detailed foveated rendering texture for highlighting content, it will leak the eye tracking data. To do high quality foveated rendering would leak privacy data. Does Apple have multiple static foveated texture maps that would help with this?

Mike_Wyrzykowski: how would you do this with linear interpolation systems

bajones: You can track in big blocks that the user was looking here and then there. Then spend the foveation effort in that area

bajones: You have to overlap the foveated rendering maps so that there's enough detail for high resolution
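
[Editor's sketch] The "track in big blocks" idea can be illustrated by snapping a gaze point to a coarse grid cell, so that nearby gaze positions produce the same output. This is only an illustration of the quantization step, not any vendor's method. `quantizeGaze`, its normalized 0..1 input and the 3x3 default are assumptions, and it does nothing about the overlap or timing side-channel concerns raised in the discussion.

```typescript
interface GazeBlock {
  index: number;   // row-major cell index, 0 .. gridSize*gridSize - 1
  centerX: number; // normalized centre of the cell
  centerY: number;
}

// Maps a normalized gaze point (0..1 on each axis) to a coarse grid cell.
// Out-of-range input is clamped to the edge cells.
function quantizeGaze(gazeX: number, gazeY: number, gridSize = 3): GazeBlock {
  const clamp01 = (value: number) => Math.min(Math.max(value, 0), 1);
  const cell = (value: number) =>
    Math.min(Math.floor(clamp01(value) * gridSize), gridSize - 1);
  const col = cell(gazeX);
  const row = cell(gazeY);
  return {
    index: row * gridSize + col,
    centerX: (col + 0.5) / gridSize,
    centerY: (row + 0.5) / gridSize,
  };
}

// Two slightly different gaze points fall into the same centre block.
console.log(quantizeGaze(0.5, 0.5).index, quantizeGaze(0.52, 0.49).index);
```

A pre-authored foveation map per block would then be selected by `index`. Each map would need its high-detail region to extend beyond its own cell, which is the overlap mentioned above.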

ada: I need to talk about it internally. I don't think we would want to do this the same way where there's no read back from the developer.

Mike_Wyrzykowski: It's possible but I'm not sure about the linear eye tracking vector

ada: It's making me ick, the performance benefits are great and maybe we're willing to make the tradeoffs. Only willing to go this route if WebGPU can guarantee that the textures cannot be read back

bajones: In WebGPU you can specify the texture type. We could invent a type, such as a variable-rate render texture, to say that it's only being used as a foveation map texture

bajones: The foveation map that comes out of this API is only used for foveation, so the API limits read back. This is not completely perfect because there are possible timing attacks on a side channel. However, in this specific use case I don't think that would be a problem because it requires the user to stare at a single location for a while

bajones: In general there would be a fuzzy gaze map, and the device needs to expend a lot of resources to do the foveated texture hack.

ada: Can you prevent the blue sky texture from being loaded in texture map and mark as not read back possible

bajones: I need to do more investigation on it

Mike_Wyrzykowski: I like that idea; it would allow us to do a foveated rendering texture. How would the 0-1 foveated rendering value be affected? Is there a performance benefit to doing it via a texture?

cabanier: Yes, there will be a performance benefit

bajones: We could add something to WebGPU to apply the 0-1 foveation level to the render path

Gareth1: We definitely want the WebGPU/WebXR feature integrated. Would foveated rendering be exposed to the application? The current WebGL implementation is totally opaque. But in the WebGPU version we could use it in our rendering in a restricted way

cabanier: Yes

Gareth1: Is it based on an OpenXR extension?

cabanier: I can look into it. You definitely have to enable Vulkan

cabanier: It's experimental on Quest and Android XR. You can use it on Vision Pro today

ada: Can the new usage limit be used

bajones: You can make the output of your rendering depend on the foveation level; for example, you can output a heat map. You end up creating a chain where anything that uses this has a restricted set of usages. You lose the clean restriction that this API only does these types of things; instead, anything that uses the API has to follow the restrictions.

ada: Can you update the textures so that when you use it on VR you get a texture or otherwise it's blank

ada: If this is anything like Canvas it would tank it

cabanier: The other thing that was proposed is that you have 9 different foveated textures and the system switches between the maps. But users will complain compared with having full foveated rendering

<Gareth1> We'd be happy to see WebGPU-WebXR released without foveation (or just the existing opaque 0-1 toggle we have at the moment), and then see eye-tracked foveation as a separate extension

bajones: Having fixed foveation is less effective than eye-tracked foveation. Having eye tracking is the cherry on top; we should focus on fixed foveated rendering

Brandel: all foveation has fixed temporal vision, on Vision Pro it's much more continuous than on Quest. It has both temporal and spatial foveated rendering. The dynamic and fixed cases have enough in common that it's worth reconciling those two

cabanier: Using a texture will future-proof the feature

Brandel: We may add onFoveationChange so that we can report foveation events. In case we did change those things

bajones: I guess what you're getting at is that the foveation rendering could change. I think it's good to have that set up for the future even though we're now starting with fixed foveation. For now it could return the same handle for fixed foveation.

ada: If you do derivatives that late in the game, you have to move foveation later. You may not want to do SSAO on the foveated content. I don't think it would be too much of a problem

Discussion topic, Untracked stereoscopic inline sessions #1348

<yonet> immersive-web/webxr#1348

cabanier: discussion regarding stereoscopic content in inline sessions

cabanier: requires user permission prompt for tracking purposes

cabanier: (demo of inline stereoscopic content to the room / call)

Brandel: is the head-tracking pose coming out of WebXR?

cabanier: no, extension on WebGL for view projection matrices

Brandel: how valuable would it be just to have the ability to do stereo canvas without head-tracking?

cabanier: agrees with bajones that it would be valuable, e.g., for stereoscopic video

ada: wouldn't require WebXR spec changes, rather request inline session with head tracking

cabanier: seems overly complicated to couple to WebXR

bajones: may require pulling in HTML canvas spec

Siyaman: can't look around, but can see the sides

bajones: just like a portal / window

ada: almost all the way to <model> on Vision Pro

cabanier: could be used to emulate <model>

ada: looking for interest in <model>

<yonet> ack Brandel

Brandel: directly analogous to <model> with a lot of solution

Brandel: challenge from untracked context: may look more wrong than correct if not staring in the correct location

ada: one could perform monoscopic adjustments based on where the inline session is located

cabanier: agrees and comments that tracked would be better

Brandel: might be better to be based on WebXR

cabanier: WebXR would pull in a lot more than what is needed

bajones: stereoscopic without head tracking -> people will try to do it anyways

bajones: example, movie theaters

bajones: head tracking should be an option, avoid case where site requests head tracking to show any stereo content

bajones: WebXR -> vetted and tested, preferable over having yet another way to expose the data, recommends something similar to inline session in WebXR

bajones: personally prefer not to re-invent the wheel even though WebXR is over-built for this use case

cabanier: don't need anything new, three.js works with it

Brandel: XR controllers could be useful

cabanier: no 6dof position from the controllers currently

Brandel: pointer events could work, they have rotation and width

bajones: separate session per canvas could provide the matrices
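
[Editor's sketch] For the untracked "portal / window" case, per-eye matrices can be derived from an assumed viewer position alone. The sketch below is an assumption-laden illustration, not the WebGL extension cabanier demoed: `stereoWindowViews` is a hypothetical helper that places the viewer centred in front of the canvas at a fixed distance and builds an off-axis (asymmetric) frustum per eye, in the column-major layout WebGL expects.

```typescript
interface EyeView {
  eyeOffsetX: number;   // eye position along x; translate the scene by -eyeOffsetX
  projection: number[]; // column-major 4x4, same layout as glFrustum
}

// Standard perspective frustum matrix, column-major.
function frustum(
  left: number, right: number, bottom: number, top: number,
  near: number, far: number,
): number[] {
  return [
    (2 * near) / (right - left), 0, 0, 0,
    0, (2 * near) / (top - bottom), 0, 0,
    (right + left) / (right - left), (top + bottom) / (top - bottom),
    -(far + near) / (far - near), -1,
    0, 0, -(2 * far * near) / (far - near), 0,
  ];
}

// The canvas is treated as a window of screenWidth x screenHeight (metres)
// viewed head-on from viewerDistance, with eyes ipd apart.
function stereoWindowViews(
  screenWidth: number, screenHeight: number, viewerDistance: number,
  ipd: number, near: number, far: number,
): { left: EyeView; right: EyeView } {
  const scale = near / viewerDistance; // project window edges onto the near plane
  const makeEye = (eyeX: number): EyeView => ({
    eyeOffsetX: eyeX,
    projection: frustum(
      (-screenWidth / 2 - eyeX) * scale,
      (screenWidth / 2 - eyeX) * scale,
      (-screenHeight / 2) * scale,
      (screenHeight / 2) * scale,
      near, far,
    ),
  });
  return { left: makeEye(-ipd / 2), right: makeEye(ipd / 2) };
}

const views = stereoWindowViews(1.0, 0.5, 0.6, 0.064, 0.1, 100);
console.log(views.left.projection[8], views.right.projection[8]);
```

Because the viewer position is assumed rather than tracked, the result is only correct from that one spot, which is the "may look more wrong than correct" concern Brandel raised.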

Brandel: could adjust spec language for more nullability

bajones: would need some changes from how canvas works right now

ada: what happens if you write too big of a buffer?

bajones: the spec says to resize the buffer to the canvas

cabanier: could bypass standards via WebXR

ada: WebXR is a standard :)

Mike_Wyrzykowski: could use layers to avoid canvas size restrictions

bajones: makes most sense to produce content with WebGL / WebGPU, feed to compositor to try out concept

alcooper: would need to understand limitations, offline

<atsushi> sorry, my zoom client application on Win crashed...

Expose where buttons and other components are located on controllers #1428

<scribe> github issue webxr #1428

ada: made a tool to help annotate the buttons and such. Put info in both a JSON file and a glTF file

ada: not showing a demo at the meeting but link is available from the GitHub issue

ada: if there are new hardware controllers that people want to add support for, they can do that/talk to ada

Brandel: are there file formats for semantically referencing transforms? Does glTF?

bajones: in glTF, you can just use nodes. You can give different nodes names, but there is not much in the way of accessibility

bajones: there are efforts in glTF for composing geometry

Leonard: glTF does support metadata at node level, which can be useful for accessibility
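
[Editor's sketch] Named nodes plus node-level metadata can be read straight from the glTF JSON. glTF 2.0 nodes do carry optional `name`, `translation` and `extras` properties; the `label` key inside `extras` and the node names used here are hypothetical and are not taken from ada's tool.

```typescript
// Minimal subset of the glTF 2.0 JSON structure needed for this sketch.
interface GltfNode {
  name?: string;
  translation?: [number, number, number];
  extras?: Record<string, unknown>;
}

interface GltfDocument {
  nodes?: GltfNode[];
}

interface ComponentAnnotation {
  nodeName: string;
  label: string; // hypothetical human-readable description stored in extras
  position: [number, number, number]; // local translation, not world space
}

// Collects every named node that carries a string "label" in its extras.
function findAnnotatedComponents(gltf: GltfDocument): ComponentAnnotation[] {
  const found: ComponentAnnotation[] = [];
  for (const node of gltf.nodes ?? []) {
    const label = node.extras?.["label"];
    if (node.name !== undefined && typeof label === "string") {
      found.push({
        nodeName: node.name,
        label,
        position: node.translation ?? [0, 0, 0], // glTF default translation
      });
    }
  }
  return found;
}

// Illustrative document: only the trigger node carries an annotation.
const sample: GltfDocument = {
  nodes: [
    { name: "trigger", translation: [0, -0.02, 0.03], extras: { label: "Index trigger" } },
    { name: "body" },
  ],
};
console.log(findAnnotatedComponents(sample));
```

Resolving a world-space position would also require walking the node hierarchy and composing parent transforms, which this sketch leaves out.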

[W3C-wide survey] Impact of AI technologies on Immersive Web Working Group's mission #241

<yonet> immersive-web/administrivia#241

atsushi: W3C is looking for the impact of AI on our group/industry. If anyone has thoughts or insights to weigh in with, they should add them to the issue

<yonet> We'll be back in 7 minutes taking Open Metaverse Browser questions

<Leonard> It appears that the room Zoom connection was closed.

<yonet> I see you on zoom Leonard

<yonet> You don't see the room?

<yonet> Ada is getting ready to screen share, we'll be back.

<Leonard> I see the individuals signed into Zoom. The screen share says Safari can't connect to the server

Site Rebuild #108

<Leonard> ... It is ada's screen. There is no audio. Everyone is muted.

ada: it's well past time to update the immersiveweb.dev website!
… instead of a table of supported devices that's impossible to support, it now shows what the user's browser supports
… and it has sections with a live webxr demo and info about the <model> element
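
[Editor's sketch] A "what does this browser support" panel can be built on `isSessionSupported()` from the WebXR Device API, which resolves to a boolean per session mode. The sketch below is not the site's actual code: `XRSystemLike`, `probeSupport` and `describeSupport` are hypothetical names, and the XR object is passed in so the logic also runs outside a browser.

```typescript
type SessionMode = "inline" | "immersive-vr" | "immersive-ar";

// Structural stand-in for navigator.xr so the sketch has no DOM dependency.
interface XRSystemLike {
  isSessionSupported(mode: SessionMode): Promise<boolean>;
}

type SupportMap = Record<SessionMode, boolean>;

// Queries each session mode; a missing XR object or a rejected query
// is reported as "not supported".
async function probeSupport(xr: XRSystemLike | undefined): Promise<SupportMap> {
  const support: SupportMap = {
    "inline": false,
    "immersive-vr": false,
    "immersive-ar": false,
  };
  if (!xr) return support;
  const system = xr;
  const modes: SessionMode[] = ["inline", "immersive-vr", "immersive-ar"];
  await Promise.all(
    modes.map(async (mode) => {
      try {
        support[mode] = await system.isSessionSupported(mode);
      } catch {
        support[mode] = false;
      }
    }),
  );
  return support;
}

// Turns the support map into a one-line summary for the page.
function describeSupport(support: SupportMap): string {
  const supported = (Object.keys(support) as SessionMode[]).filter(
    (mode) => support[mode],
  );
  return supported.length === 0
    ? "WebXR is not available in this browser"
    : `Supported session modes: ${supported.join(", ")}`;
}

// In a page this would be called as: probeSupport(navigator.xr).then(describeSupport)
```

Checking per mode, rather than only testing for the presence of `navigator.xr`, matters because a browser can expose the API while supporting only some session types.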

Leonard: please include credits/license information

ada: already included!

Leonard: could be a good idea to run it through the W3C accessibility checker

Brandel: could be useful to have a deep link to the "does your device support WebXR" section

ada: looking for input on a demo to be the WebXR example on the landing page

<yonet> https://cdn.rp1.com/decks/MSF-Open-Metaverse-Browser-Initiative-W3C.pdf

<yonet> https://share.google/Fblhot1zuHMMNJzPm