Immersive Web CG/WG F2F - Day 1

21 April 2022


ada, alcooper, alexturn, atsushi, bajones, bialpio, cabanier, Dominique_Hazael-Massieux, Josh_Inch, klausw, Laszlo_Gombos, Manishearth_, winston_chen
ada, bajones, dom, Dylan Fox, dylan-fox, laford, Manishearth_

Meeting minutes

<cabanier> Can we add https://github.com/immersive-web/layers/pull/273 to the agenda for today or tomorrow. 15-20 min max

<ada> https://docs.google.com/presentation/d/1iIWMt-jM1UToQ9Fo4KQz0g5JuqYbGdLvvENL4YBnCG0/edit?usp=sharing

<ada> winstonc:

Slideset: https://www.w3.org/2022/Talks/dhm-metaverse-workshop/

Metaverse Workshop - Dom

dom: there's been a lot of discussion about the metaverse

dom: folks have shown interest in bringing it forward. it's an online, immersive, interconnected space

dom: until a couple years ago you couldn't say the web was immersive, but that's changing now

dom: in terms of interconnection, navigation, social aspects, not as much there yet

dom: still work to be done

dom: been talking to folks in the industry to understand what's missing in this picture

dom: the notion that webxr is a very programmatic approach that gives you an entire environment creates challenges in the security model of the web

dom: this notion of providing a safe, immersive sandbox for the web has come up again and again

dom: we need to come up with good immersive navigation

dom: also interesting discussions around a11y

dom: if we really want the metaverse to be a critical alternative, all of this is important

dom: also we need strong interop

dom: web content itself also could be made more 3d without a full immersiveness

dom: repurposing 2d content in a 3d world with existing css props has also emerged

dom: likewise for scene formats, a more declarative approach

dom: if we're talking about collaborating in 3d we need 3d capture for videos etc. we have webrtc but 3d would need to be made to work in it

dom: if you're moving from one space to another you'll get a strange transition, so there may be a need to harmonize ux patterns, locomotion patterns, etc

dom: questions around identity and avatar mgmt

dom: and being able to transport assets

dom: here to suggest we run a w3c workshop to help bring many people to the table to share perspective, priorities, directions

dom: at least get a shared understanding of the direction

dom: we've done this a bunch of times before (2016, 2017, 2019)

dom: have a draft CfP

dom: contact me if you want to help

dom: oct/nov 2022 probably?

<dylan-fox> I'm assuming there's overlap in members but we should definitely include the Open Metaverse Interoperability Group https://omigroup.org/

<Zakim> ada, you wanted to ask about whether we need to change the charter

ada: we're just about to renew our charter

ada: do you think there's stuff we should add to our charter?

dom: we've got an open scope charter so we can do stuff like this already

dom: my expectation is that there will be a high chance most of the stuff will not be for the wg, more for the community group etc

toji: i know there's a bunch of ?? working on similar things, leveraging web ?? etc
… i guess we should have some process to make sure everyone is aware
… something to echo back is i've heard a lot of orgs around these metaverse concepts
… very unclear what they're actually talking about
… and when they say metaverse they actually mean nfts/etc, still not sure what we're talking about
… we should at least be clear on context and make sure we're not retreading

dom: yeah, recently we had a proposal to join the metaverse ?? group
… been a bit challenging, finding the right way to communicate things

toji: to be clear, not trying to point fingers, just that there's a lot of buzzwordy landrushy interests, very easy to overlook prior art

unknown: very good point actually
… part of that is not wanting to be slowed down

<dylan-fox> While we're waiting, just want to echo Brandon's point earlier about visibility & the difficulty of finding past W3C work. Been trying to harness the XR Semantic Web work for the ongoing Accessibility Object Model project and it's been very difficult. Feels like being an archaeologist somewhat


<ada> Here are the slides: https://docs.google.com/presentation/d/1ewsefsmLFKIv0fRExCf1VzgvkepSJnrxn76_c8LmWRk/edit#slide=id.g13c95719e2_0_0

<Zakim> atsushi, you wanted to discuss just for comment but AR and gamepads modules are pre-CR, waiting WG tasks performed...

<dylan-fox> Just requested access to slides

<cabanier> There were 2 more issues that were marked as f2f that are not on the agenda: https://github.com/immersive-web/webxr/issues/1276 and

<cabanier> https://github.com/immersive-web/anchors/issues/71

<Phu> lol, sorry for poking around the menu, @ada

<Phu> @ada, can you send out the link to your slides today?

Raw camera access API

nick: 8th wall trying to push capabilities forward. Lack of camera access has been problematic.
… for ar headsets would like to experiment with different effects that may not be provided out of the box
… things we see as important but not currently provided:
… custom visual effects. Lots of interest today. Example of using the camera feed as a kaleidoscope.
… also high quality visual reflections
… image targets: They offer things like curved image targets (for bottles, etc) that no vendor they know of is providing today.
… responsive scale: Allows you to place content instantly without having to scan the scene for planes, etc.
… responsive scale covers 90% of use cases, absolute scale (where scale is always 1:1) covers last 10%
… When code doesn't have access to camera pixels these are all more difficult.
… face effects is another area of interest.
… Niantic, who just acquired 8th wall, has its own suite of capabilities. Things like understanding the difference between sky/ground/water
… for Pokemon, naturally. :)
… Also looking into hand/foot/body tracking
… Not the only team feeling this need
… Another company (missed the name) needed it for multiplayer syncing, asked Nick to advocate for him today.
… Proposal from Chrome team is a proof of concept. 8th wall was able to use successfully.
… Question today is that given the wide range of use cases and where we see future hardware going, what are the next steps for moving this forward.
… Have it on the roadmap, but would like to make it more concrete.
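As a sketch of the proof-of-concept API Nick mentions: Chrome's raw camera access proposal exposes a per-view camera image as a WebGL texture (via an `XRWebGLBinding` and a `camera` attribute on `XRView`). The helper below captures that shape; the mock objects stand in for the real WebXR types so it can be exercised outside a browser, and the exact member names follow the proposal, which may still change.

```javascript
// Minimal sketch of the raw-camera-access flow: given the session's
// WebGL binding and an XRView, return the camera image texture plus its
// dimensions (which may differ from the viewport).
function getCameraFrame(binding, view) {
  // Not every view carries a camera (e.g. VR sessions, secondary views).
  if (!view.camera) return null;
  return {
    texture: binding.getCameraImage(view.camera),
    width: view.camera.width,
    height: view.camera.height,
  };
}

// Stand-in objects so the shape can be demonstrated without a browser:
const fakeBinding = { getCameraImage: (cam) => ({ id: "tex-" + cam.width }) };
const fakeView = { camera: { width: 1280, height: 720 } };
const cameraFrame = getCameraFrame(fakeBinding, fakeView);
```

In a real session this would run once per `XRView` inside the `requestAnimationFrame` callback, after requesting the session with the `camera-access` feature.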

(Audio issues)

Simon Taylor: Creating multiple WebAR projects, would like to use WebXR/ARCore. Lack of camera access has prevented them from making the move.
… Would like to have more control of presentation (ed: Not a session mode switch?)
… Current API is really mobile focused.
… camera is aligned to the frame. On headset it needs to be predictive and camera is not perfectly aligned.

<alexturn> Previous discussion about handheld camera API vs. HMD camera API: https://github.com/bialpio/webxr-raw-camera-access/issues/1#issuecomment-816395808

(bajones: Sorry, missed some of what was being said next.)

<Zakim> ada, you wanted to clarify the blockers

(bajones: Wondering if not having a separate immersive mode would help Safari implement the API?)

ada: Wants to give background on TAG feedback.
… it's a "giving away the farm" type API.
… Finding ways to inform users of the privacy concerns of camera access can be overwhelming.
… don't want every experience immediately jumping to camera access-based solutions.
… Users don't read dialogs
… Relates to Rik's suggestions for simplifying entry to XR sessions.
… Removing some normalization of the WebXR permissions requirements lets UAs put more emphasis on the most "scary" scenarios.
… Suggestions of fuzzing camera access to make less easy to abuse, but probably also affects usefulness too much.

piotr: Talking about TAG feedback and how to approach
… important to involve the right people in the discussions
… We can try to figure it out, but UX/privacy people aren't on those calls
… Shouldn't add normative text around permissions flow for that reason
… Need to make sure that it's something browsers can experiment with
… Maybe one browser sets a de facto standard that the others adopt as well
… Need to make sure user is informed of camera access in the same way as getUserMedia. (Icon showing access, for example)
… Wanted to ask Nick about what aspect of Chrome's implementation doesn't live up to an MVP state?
… Good to hear that it works for partial use cases, want to know what the misses are.
… How do headset-based implementations factor in? API currently says it's mobile-targeted, but don't want to close the doors on headsets

<alexturn> Previous discussion about handheld camera API vs. HMD camera API: https://github.com/bialpio/webxr-raw-camera-access/issues/1#issuecomment-816395808

alex: From pasted link, there was a comment about gaps from current state to MVP
… this is one of the places where the needs for mobile/headset are hard to normalize
… headsets have different offsets, latency, exposure, etc.
… At this "raw" layer of the API some of the backend details start to show, and maybe that's OK?
… Comment has a proposal for one way to get this information into the WebAPI
… Curious what other people think. Conclusion from last year was maybe two API shapes are needed?

nick: Responding to a couple of things. For the current API, sub-MVP claims were specifically aimed at headsets. Works well for mobile.
… think it would be a mistake to separate the API. Think there's a way to modify the flow fairly simply to work well for both environments.
… Camera frames can be on a different clock. Texture currently associated with XRFrame.
… if we could decouple that would go a long way towards addressing issue. ARCore can still get one callback per frame.
… Need a timestamp to build history of extrinsics to project into the world
… Also need camera field of view matrix. But decoupling frame timing would be most important, metadata would get us the rest of the way.
… Re feedback of giving up the camera feed, it's a problematic point of view for 8th wall.
… If they had to wait for browser implementation for everything wouldn't be able to innovate for their customers.
… Get user media provides a good existing proof where there's a lot of useful things happening with it and users are appropriately informed.
… No issues with current user flow for getUserMedia
… lots of real end-user value here that's not met by one-off approaches aimed at specific hardware.
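The decoupling Nick describes (camera frames on their own clock, correlated with pose data by timestamp) can be sketched as a small pose-history buffer: record the viewer pose each XRFrame, then when a camera image arrives, look up the pose closest to its timestamp. Everything here is illustrative; no spec defines these names.

```javascript
// Hypothetical helper: keep a short rolling history of timestamped
// poses (extrinsics) and find the one closest to a given camera-frame
// timestamp.
class PoseHistory {
  constructor(capacity = 30) {
    this.capacity = capacity;
    this.entries = []; // { t, pose } in arrival order
  }
  record(t, pose) {
    this.entries.push({ t, pose });
    if (this.entries.length > this.capacity) this.entries.shift();
  }
  closest(t) {
    let best = null;
    for (const e of this.entries) {
      if (best === null || Math.abs(e.t - t) < Math.abs(best.t - t)) best = e;
    }
    return best; // null if nothing recorded yet
  }
}
```

In use, `record()` would be fed each XRFrame's predicted display time and viewer pose, and `closest()` would be queried with the camera image's capture timestamp, approximating the extrinsics at exposure time.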

klaus: TAG/privacy reviewers are not convinced that users are making informed choices with existing APIs. Feel that getUserMedia is too powerful.

<bialpio> (FYI: I'll need to switch over to a different room in ~1 minute as I think I'll lose this one)

klaus: For marker tracking, concern is that exposing platform capabilities yields unpredictable results around what is tracked.
… One issue with the current API shape is that extending to meet other use cases is a "slippery slope".
… lots of nice properties about how things work on mobile (implicit alignment, etc)
… If the camera feed has a different crop from what's on screen that may have privacy concerns.
… If you want a more generic API that's more powerful, it could slow down the delivery of any API at all due to privacy concerns.
… Maybe don't see it as two separate APIs but the more tractable first step
… Also, we used to have an inline-ar, which was not a good API but allowed for not going fullscreen.
… don't have background on why we removed inline-ar.

ada: TAG really does view WebRTC's current capabilities as overstepping.
… They may be looking at those APIs again
… Back when the API was developed it was just for video calls. The idea of AR on top of it wasn't considered.
… I think raw camera access is essential. Extensible APIs are good!
… Having a higher-level API does help our messaging, though.
… sites should only need the "scary" API to do advanced things.

<Zakim> ada, you wanted to talk about WebRTC

simon: The problem with the privacy thing is that not having it on top of WebXR will move people to other implementations like 8th wall that have (hypothetically) larger potential privacy concerns
… Alex's proposal for decoupling frames sounds good.
… As a company we're not interested in privacy invasive use cases, obviously.

alex: Talking about knowing what people want. Seems like we know what the industry wants. Question is are we asking them to wait for something in the platform that they're not going to be able to use.
… Can't have it both ways. "Need to use this thing first, but then we won't have the funding for the next step."
… Maybe we can limit things like camera crops on mobile so what you see is what you get.
… Wondering if what makes sense is to agree on general shape and then figure out how to enhance privacy rather than weaken the power of the API.
… Defend the power that we're giving by explaining that it's not for a one-off use. Not seeing the path for how to get TAG excited.

nick: Echoing what Alex said. Heard a lot about "TAG has concerns." That's fine, it's part of their job, but our job is to push back on the push back.
… their concerns are legitimate, we need to work through them to deliver something that improves on user consent AND meets developers needs.
… if some of that compromise is having both a low-level and high-level API then we could try that.
… wouldn't want to land on a situation where we make that compromise but then only get one half of it.

<dom> [I think prototyping the higher-level API on the raw access one might be a good way to explore whether there is a value in that mid-level idea]

<ada> bajones: I think it was klausw who brought up that we used to have inline-ar but didn't know why it was removed

<ada> i want to shed light on it

<ada> we backed away from it because the privacy groups we worked with were concerned that there would be a camera feed on the page which users didn't opt into

<ada> the immersive-ar mode was designed so that it would make it clear that the camera was being used in a particular context, as well as the additional tracking data

<ada> there is also the issue of how much money and effort we can put into implementing multiple things, since Google has scaled back their implementations and we may end up with only one or two implementations

bajones: I think it's useful to listen to the TAG but I am not sure it's fair that the WebRTC mistakes should reflect negatively on what we are trying to do. The onus is on us to show that we are not making the privacy situation worse but we are making it better

dom: Pushing back on the push back is perfectly OK.
… Maybe a useful olive branch is to use raw camera access to demonstrate the desired use cases for the TAG.
… If doing a high-level/low-level approach is not feasible using the existing capabilities is good for demonstrating need.

ada: It's lunch time!

<tangobravo> See you all tomorrow... more inline-ar chat at 10am :)

<yonet> Hi everyone, we are running 10 minutes late for Depth sensing. Sorry about the technical issues and losing time.

<yonet> agendum: https://github.com/immersive-web/administrivia/blob/main/F2F-April-2022/schedule.md

Depth Sensing & Occlusion

ada: wanted to discuss merging depth sensing & occlusion
… I've only seen depth sensing used for occlusion
… any other usage anyone wants to report?
… any concern / support with merging them?

nick: in terms of depth sensing, I've seen a lot of good applications in native implementations
… which would be great to bring to the Web
… #1, physics: if you know where the surfaces are and their direction, you can have things bouncing from them
… constructing a mesh from the depth map allows for better interactivity
… another use case is for example scanning apps to make 3D models of your environment
… they tend to require a combination of depth & image API
… it provides a low cost way to generate 3D models
… so more than just occlusion

Ada: if real-world geometry were as complete as depth, would it be a better fit for these use cases?

nick: would depend on the shape of real world geometry & the details
… if it comes with a detailed enough mesh, it would probably be OK for interactions / physics
… not for scanning

ada: 3 options: separate depth & occlusion; merging depth into occlusion; vice versa

nick: if you solve occlusion with depth, that would be sufficient

ada: it's a pain to use depth for occlusion
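To illustrate why doing occlusion by hand on top of the depth-sensing API is doable but tedious: on the CPU path the site gets a low-resolution buffer of raw depth values plus a `rawValueToMeters` scale, and has to compare sensed depth against each virtual fragment itself. The uint16 buffer layout below mirrors the spec's CPU path; the helper functions themselves are illustrative, not part of any API.

```javascript
// Read the sensed depth (in meters) at integer pixel (x, y) from a
// CPU depth frame shaped like { width, height, rawValueToMeters, data }.
function sensedDepthMeters(depth, x, y) {
  return depth.data[y * depth.width + x] * depth.rawValueToMeters;
}

// A virtual fragment is occluded when the real surface at that pixel is
// closer to the viewer than the fragment.
function isOccluded(depth, x, y, virtualDepthMeters) {
  return sensedDepthMeters(depth, x, y) < virtualDepthMeters;
}

// Stand-in 2x2 depth frame with raw units of millimeters:
const depthFrame = {
  width: 2, height: 2, rawValueToMeters: 0.001,
  data: new Uint16Array([500, 1500, 2500, 3500]), // 0.5 m .. 3.5 m
};
```

A built-in occlusion feature would do this per-pixel test (ideally at higher resolution) inside the compositor, which is also why Piotr notes it could avoid exposing the depth data to the page at all.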

Piotr: ARKit implements a mesh API powered by their depth API
… ARCore has a special variant of hit testing powered by depth
… re occlusion vs depth, there are privacy aspects to this
… in chrome, the depth buffer has limited resolution


Piotr: if we had an API for occlusion that the site cannot access, we can probably provide a higher resolution API
… that may be an advantage of having both APIs

cabanier: the quest has very limited depth sensing primitives
… the planes API could be used to sense the walls, ceilings, a desk or a couch
… we could introduce it to e.g. help put content in a room

Nick: for the quest, to what extent is passthrough contemplated for WebXR?
… you could imagine an experience where, as you walk through the room, it renders your couch in the virtual space

ada: the planes API is part of real-world geometry?

Rick: no, it's its own specification

Ada: do we need to add it to the charter?

rick: if it's not, it should be added

piotr: it's available in Chromium behind a flag; but depth would give more detailed information than the Planes API
… we leverage ARCore that is limited to horizontal & vertical planes

Alex: we're interested to help with occlusion
… depth is challenging to implement in hololens

cabanier: do you think the mesh is high quality enough for occlusion?

alex: we use it in native apps
… it's more about expressing it as a depth frame vs a mesh

piotr: that reinforces my sense of keeping occlusion
… otherwise this would require double code paths for managing occlusion

Lower Friction to Enter

Josh: we see lots of different entry points used by developers, e.g. "enter VR" buttons
… we've been looking at ways of exposing this in the browser chrome to make it easier for users to identify & recognize
… once the user clicks it, they enter the WebXR screen

cabanier: a possible implementation would replace requestSession, possibly with an implicit associated consent
… this separation also helps with providing a consistent approach that doesn't depend e.g. on the viewport size

@@@: what signal is used to have the button in the chrome?

<alexturn> offerSession?

cabanier: this would be via a new API that replaces requestSession

<manishearth_> +1 offerSession

@@@: having it declared as early as possible would be useful

cabanier: one of the challenges is to tie it with assets being loaded
… this is not just a signal that the site is VR-compatible

bajones: this has overlap with navigating into a Web page
… ideally this would be the same mechanism

piotr: how do you handle rejecting the permission prompt, in case it can be rejected? will that be exposed to the web site?

<alexturn> offerSession returns a promise that completes or errors when the offer completes?

piotr: is a new API really needed? can we piggyback on requestSession being promise-based as a trigger to expose the button?

cabanier: there would be no way to reject the permission prompt if we count clicking the button as accepting permission - you could always navigate away

manishearth: what the situation will be for pages with multiple potential VR sessions? I guess they would have to use requestSession
… but this would create fragmented approaches to entering a session
… that may be fine, but it's worth thinking about it

bajones: if people start relying on it as the primary way, this may break interop if a browser doesn't implement the chrome-based ux

manishearth: this could be polyfilled though
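The polyfill idea can be sketched as a small wrapper: prefer the proposed `offerSession()` (which would let the browser draw its own chrome-level "enter VR" button), and fall back to the classic in-page button plus `requestSession()` where it doesn't exist. `offerSession` is a proposal under discussion, not a shipped API, and `xr` is injected here so the flow can be exercised without a browser.

```javascript
// Sketch: offer the session via browser chrome when available,
// otherwise show a page-provided button and request the session from
// within that user activation.
async function enterVR(xr, showFallbackButton) {
  if (typeof xr.offerSession === "function") {
    // Browser surfaces the entry UI; the promise settles when the user
    // accepts (or the offer fails).
    return xr.offerSession("immersive-vr");
  }
  // Fallback: resolve once the user clicks the in-page button, then
  // call requestSession inside that gesture.
  await showFallbackButton();
  return xr.requestSession("immersive-vr");
}
```

In a page this would be called with `navigator.xr` and a function that wires up the traditional "Enter VR" button.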

<yonet> ack Nick-8thWall-Niantic

nick: people are confused by the different idioms of entering VR
… user education is always a problem
… having it in the chrome creates different design challenges where you sometimes have to point users to parts of the chrome UI, but that isn't stable over time
… for multiple sessions, another way would be to use a VR navigation as a way to route the user on the different experiences

dylan-fox: thinking about screen reader users - you accidentally click on a button that launches a VR thing that whisks you away from where you were
… low vision people have complained about e.g. the pin used in Mozilla hubs that they can't use
… having it multimodal and undoable is important for accessibility

<Zakim> ada, you wanted to ask about if there is no browser chrome

Ada: developers using supportSession for fingerprinting would be exposed which may be interesting

<ada> dom: I like the proposal, having a consistent way to enter XR would be useful. I think it would be good to have a declarative approach. I understand it's necessary to have the assets loaded, but the page could have a declarative approach that says "I support VR/XR/AR". It also opens up additional points regarding search and other metadata use cases. I also think it ties in nicely to Brandon's navigation point.

cabanier: very few web sites have more than one VR experience - can only think of a demo site
… in terms of accessibility, I think this would be an improvement - users wouldn't have to hunt for a button, the browser provides direct access to the VR experience
… the declarative markup approach with a dimmed button until assets are loaded is worth investigating
… and +1 on exploring the questions of overlap with navigation
… overall, hearing support with exploring this

AlexC: interesting & potentially useful proposal; some concerns about removing a permission prompt - I think that should be left to the user agent
… we should also expect sessions can fail

<Zakim> bajones, you wanted to mention that accepting the session could be "higher cost" than similar actions in the same space, like setting bookmarks.

bajones: looking at some of the icons that sit alongside this in the example demo
… many of those are mostly low cost - e.g. creating a bookmark, or opening a menu
… entering VR is more disruptive
… if you do it accidentally
… This may argue for leaving some friction here
… re declarative approach - separating the question of being VR-capable and being VR-ready would also be useful in the context of navigation

<dylan-fox> Re: Icon - Interesting to reference experiences like Sketchfab https://sketchfab.com/3d-models/tilly-0e54f44e56014e079572207a29788335

cabanier: +1 on this signal being useful for navigation
… +1 on not being prescriptive on permission prompt
… we can run user studies to find the right approach

aysegul: would be really useful to share the results of these studies

<Zakim> ada, you wanted to ask about the loaded state

ada: having a "ready to show content" signal useful for this context; that signal would also be useful to launch in a PWA context

<ada> unconference page: https://docs.google.com/presentation/d/1iIWMt-jM1UToQ9Fo4KQz0g5JuqYbGdLvvENL4YBnCG0/edit#slide=id.g1256acf68be_6_0

<dylan-fox> ^just requested edit access

Does WebXR need a permission prompt?

Ada: thinking about this as a way to help get raw camera access past the TAG
… put a lot of effort into making core parts of WebXR available by default - wary of using a big hammer for a small teacup
… save the permission prompt for APIs more worth getting the user's explicit consent on
… let users distinguish between what's important and what's not so they can make more informed decisions
… if we ask for permission for everything, they won't notice that others are more invasive


Nick: what's the prompt?

Ada: would suggest that permission requests for certain modules under the WebXR umbrella be non-notifying, non-mandatory
… browsers could choose not to show permission prompts for certain things

Josh: like entering a VR space?

^can do

Ada: is making certain requests non-normative a good idea?

___: not against spirit of the thing, but against non-normative part; may make security experts mad at us

Ada: so "may" rather than forced?

Manishearth: Language of spec doesn't talk about prompts exactly - prompts are just one way to get permission. Browser can also say "you already have this permission"
… Can make that more explicit, let people that have opted in bypass it
… State of internet is already that it's opt-out

<alcooper> I think this is the section Manish is referring to: https://immersive-web.github.io/webxr/#user-consent

Manishearth: Need non-normative text saying which permissions are granted

Piotr: Consider that some of the things we want to ask for consent about won't be in advance; be very careful with idea of "implicit consent"
… API could be configured in advance in settings, and browser could view that as consent

Alex: differs based on form factor; for mobile browser it may be common to use AR features, whereas headset may be more rare/disruptive to jump into an experience
… in favor of giving ourselves more flexibility; fine with doing that in ways that are normative

Brandon: Agree that text is already in a state that allows for this; just need different measures such as explicit vs implicit consent
… Some features use is well understood to be covered by implicit consent, e.g. user clearly signals they want to enter an experience
… Text as written gives us leniency, esp when it comes to frequency of prompting
… There's value the first few times a user goes into an immersive session - give them instructions on "here's what you're about to do, here's how to get out," other onboarding
… Once that's well understood we don't need to announce every single time
… direction should be normative text only when helpful; consider on case by case basis
… Going into XR on headset vs phone vs desktop is different
… May not even realize that the headset on my shelf is lit up

Rik: way spec is written gives a lot of flexibility
… don't even need user actions, necessarily; could just go straight to VR
… OK for the browser to decide

Ada: so you're saying we're already in the situation I'm asking for. cool

Manish: to add to what Brandon said, implicit and explicit consent... spec never mandates one or the other
… concepts were there as hooks so we could discuss
… How do you do a permission or explicit consent in VR?
… We could mandate explicit in some cases but I don't think there's any situation we'd want to do so across the board
… Many different levels of trust

<dom> dylan-fox: when it comes to explicit consent in VR, we probably need to think more about whether that's a pop-up, a specific gesture, ...

Ada: next topic is scent-based peripherals

Scent based peripherals

Alex: smell input or smell output?

Ada: smell input

Ada: smell output

Ada: can't remember who put this on the docket but there are companies designing these

Brandon: don't want to be personally liable for the failure case

Nick: is there a way to use the gamepad API to implement this?

<dom> Smell-o-vision (e-nose) #74

Ada: can leave this to webUXD (sp?) or web bluetooth

MichaelB: difference between that and audio?

<dom> dylan-fox: alternative to motion controls is useful in the context of accessibility

Dylan: is there merit to discussing switch control or e.g. universal xbox accessible controller?

<mkeblx> webUXD => webUSB, also webHID (https://developer.mozilla.org/en-US/docs/Web/API/WebHID_API)

Brandon: within the gamepad API there are recommended mappings; Chrome has simple "map this to A" type features, whereas Steam lets you assign anything you want
… Not sure if you can reassign motion to button presses
… At that point browser sets motion controller like any other
… That is generally the right level for accessibility controllers to come into play
… Don't want to broadly advertise that someone is using an accessibility-focused controller - don't want to expose people to fingerprinting
… Avoid broadcasting "I'm a user with a disability"
… Let user be in control of information
… Wish those capabilities were more widespread

Ada: is that the type of thing that would work well on the Quest? E.g. plugging in a Microsoft accessible controller
… and using it as a VR controller
… In terms of functionality to allow remapping so that people with accessibility requirements can use non-quest controllers, while the scene doesn't know it's a non-quest controller

Rick: could be hypothetical because we don't track that right now

Brandon: website says it's possible to hook up a non-quest controller but might show up as regular gamepad
… showing up as generic input might be outside of capability

dylan: would be great to be able to support e.g. mouth controllers, or to have Steam VR Walkin Driver style functionality w/o notifying system of disabled status

Nick: made joystick input control that worked across desktop and mobile headsets; on desktop could use Xbox controller
… Found that on Quest and/or HoloLens, xbox controllers worked well in web browser until you entered VR, at which point they stopped working
… counterintuitive that the existing gamepad API stops working and doesn't let you use these controllers in an immersive setting
… Really fun to take an xbox controller and run a character around, using joystick to drive virtual content
… Why is old-school gamepad disabled in webxr? What would it take to get xbox controllers as xbox controllers in webXR as gamepads usable for this use case?

Brandon: not aware of anything blocking the api from working
… A normal gamepad will not show up as one of the XR session's input sources because we want to differentiate b/t inputs that are specifically tracked
… if it's dropping gamepad when you go into a VR session on any device, that sounds like a bug, and it should be filed

Nick: can't remember if it was hololens or quest or both

Brandon: hololens I could see them doing a moat switch, because it goes out of its way to normalize input across different modes
… but spec-wise, nothing should prevent that from happening

Nick: in that case, I'll sync with my team
… gamepads are working well within XR, they're fun

Manish: have had the opposite discussion, of whether XR controllers should be exposed to navigator.getGamepads()
… existing gamepads should just work; if they don't it's a bug in implementation
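The expected behavior Manish describes (ordinary gamepads keep working inside an immersive session, while XR controllers come through `session.inputSources` instead) boils down to polling `navigator.getGamepads()` each frame. The helper below is an illustrative sketch; the getter is injected so it can be run with a mocked gamepad array, and the sparse-array-with-nulls shape matches the Gamepad API.

```javascript
// Filter navigator.getGamepads()'s sparse result down to the pads that
// are actually connected, the way a per-frame poll would.
function activeGamepads(getGamepads) {
  return getGamepads().filter((pad) => pad !== null && pad.connected);
}

// Stand-in for navigator.getGamepads() with one live Xbox pad:
const mockPads = [
  null,
  { id: "Xbox Wireless Controller", connected: true },
  null,
  { id: "stale entry", connected: false },
];
```

Inside an XR render loop this would be called once per `requestAnimationFrame`, alongside (not instead of) iterating `session.inputSources` for the tracked controllers.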

Rick: didn't know xbox controllers were supported, would like to investigate
… for the VR API we hardcoded controllers/hands, but for OpenXR it should just work

Ada: nice to have a conversation around compatibility and accessibility
… now for a 20 minute break

<ada> sorry everyone it just got pointed out the unconf doc was view only

<ada> it's now editable!

<ada> massive oversight on my part

<ada> here is the link feel free to add things: https://docs.google.com/presentation/d/1iIWMt-jM1UToQ9Fo4KQz0g5JuqYbGdLvvENL4YBnCG0/edit?usp=sharing

Ada: Marker tracking, yay!
… Wanted to bring it up to get status, see if there are blockers

Marker tracking

Ada: tracking is important b/c one feature people want is shared anchors, which is a pain to implement but very important
… marker tracking would give us a version of shared anchors w/ requirement of having a physical object or at least something drawn on a physical object

Alex: Marker tracking in the sense of having some marker is key; we support QR code tracking on hololens using the head tracking cameras
… Vuforia and others will use other cameras
… Very interested in lighting this up so people can track against known qr codes
… particular feature is QR code tracking
… trying to zero in on right feature subset

Klaus: 2 things; one is that it may make sense from API perspective to distinguish QR code from other types
… Avoid surprise that you won't know if images end up being trackable
… avoid case of having universe of mutually incompatible tracker images
… may be difficult to add features like analytical surfaces unless underlying platform supports it
… concerned about launching without clear view of how it's going to be used

Ada: already have implementations for camera access; you said marker tracking will take a while?

Klaus: no, saying there may be new requirements like tracking cylindrical surfaces or other things not offered by the underlying platform
… if an underlying library exposes it, it may take a while to get that new type of marker standardized and available for wider use
… Possible if there's raw camera access you can do marker tracking by running your own javascript code
… Apart from privacy issues, could use e.g. full field of view of camera for tracking incl parts that are not currently visible on screen
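Klaus's point — that with raw camera access a page can run its own detection in JavaScript — can be sketched with the Shape Detection API's BarcodeDetector. That API is real but Chromium-only; where the frame comes from (e.g. the proposed raw camera access API) is assumed here.

```javascript
// Sketch: page-side QR detection over a camera frame, assuming the frame
// was obtained elsewhere (e.g. via the proposed raw camera access API).
// BarcodeDetector is part of the Shape Detection API (Chromium-only).
async function findQRCodes(frameBitmap) {
  if (!('BarcodeDetector' in globalThis)) return []; // feature-detect
  const detector = new BarcodeDetector({ formats: ['qr_code'] });
  const codes = await detector.detect(frameBitmap);
  return codes.map(c => ({ value: c.rawValue, box: c.boundingBox }));
}
```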

Ada: raw camera access a blocker?

Klaus: not in a technical sense, just somewhat entangled


<ada> dylan-fox: one use case I wanted to bring to people's attention: when it comes to low vision navigation, there is a group called NaviLens that uses RGB QR-style codes that are more easily visible and allow people with low vision to navigate a public space; it's used in Barcelona public transport

<ada> ...there are lots of different types

Alex Cooper: there is a disjoint between what platforms can support and what people want wrt QR code tracking
… Don't know how well ARCore would do with e.g. curves
… This is a case where raw camera access is a lot more powerful and guaranteed to span that whole set of things that people are looking for
… Features are entangled from a roadmap perspective; raw camera access could be a way to fill in the gaps that even a full implementation of marker tracking may not be able to meet
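For reference, Chrome's experimental image-tracking feature (behind a flag at the time; the shape may change) takes known images at session request time. The helper name below is mine; the dictionary shape follows the proposal.

```javascript
// Sketch of the proposed/experimental WebXR image-tracking request shape.
// buildImageTrackingInit is an illustrative helper, not a platform API.
function buildImageTrackingInit(imageBitmap, widthInMeters) {
  return {
    requiredFeatures: ['image-tracking'],
    trackedImages: [{ image: imageBitmap, widthInMeters }],
  };
}

// Usage (in a page):
// const session = await navigator.xr.requestSession(
//   'immersive-ar', buildImageTrackingInit(bitmap, 0.2));
```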

Ada: so if chrome had good support for raw camera and it could also do QR codes well, you could use image tracking using raw camera access under the hood?

Alex: That's the thing we're measuring out now
… Sounds like some of that might be available right now through raw camera access
… It's less privacy preserving, but I don't know that all the runtimes right now can support the breadth of markers we want to track
… Don't know if there's a runtime that meets developer requirements with marker tracking

Ada: Is there any tool that supports QR codes?

Alex: I know ARCore does not support QR codes, not sure about hololens; seems like Microsoft preferred them

Alex Turner: from platform perspective we have qr codes but not image tracking

Ada: they keep telling us to do something alongside raw camera access, but we can tell them there is no overlap in the platforms for the different types of images we want to support
… We can tell them that raw camera access would enable developers to fill in the gaps, enable more
… Could give us leverage to getting raw camera access done

Klaus: currently no overlap; one of us is looking at ARKit capabilities but haven't heard back from Apple
… don't see feasible path to get something to marker tracking api because it's a somewhat niche market type
… would need browser side implementations that seem quite difficult vs very doable through raw camera access
… if you don't have raw camera access there's no way you can do anything at all

Dom: if there's a way to expose tracking capabilities to developer...
… say you provide a way to run a shader on raw camera stream
… without access to raw camera stream, developer could identify things based on shader
… use case is tracking, not other kinds of processing
… perhaps we could give that kind of optimized processing on the raw camera stream without hard coding specific things
… should be up to developer to provide right tool to do the processing
… developer could offer code, then client returns results
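Dom's idea might look something like the following. Everything here is hypothetical — nothing is specced — and the hard part is making the output channel too narrow to smuggle pixels through.

```javascript
// Purely hypothetical sketch of the "shader in a box" idea; no such API
// exists. The page hands the UA a shader, the UA runs it over the
// never-exposed camera stream, and only a tiny result crosses back.
const MAX_RESULT_BYTES = 64; // forcibly low-bandwidth output channel

function clampResult(resultBytes) {
  // The UA would refuse to hand back more than the budget, so image
  // data cannot leak out through the result path.
  return resultBytes.slice(0, MAX_RESULT_BYTES);
}

// Hypothetical usage:
// session.requestShaderTracker({ shaderSource, onResult: bytes => ... });
```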

Ada: reminds me of AR.js, where you put your image in a special trainer and it generates a pattern file

<Zakim> alexturn, you wanted to talk about the polyfill idea

Ada: Take your image, put it into a tool someone develops, returns a blob of code you run and it tells you things

Alex T: thinking of something on those lines; how do you restrict output? Would need to provide forcibly low bandwidth output
… make sure that what you get out is not smuggled data exposed to outside
… need to figure out how to limit expressiveness of code stuffed in the box

Dom: That's why I'm thinking of a shader

Alex T: I know some people used shaders to do e.g. timing attacks; would take a lot of security research
… In the interim, I wonder if we could do the polyfill based on marker; e.g. polyfill that uses webcam on device could recognize multiple images
… We write a proof of concept polyfill to show how to do it without one or the other
… Warning you get is scary if you have to use the camera stuff, but it could be simpler if you don't
… but who's going to write the two halves of that polyfill?

Ada: probably bundle both polyfills with the browser

Alex T: if we had someone to write it we could open source the code. But still looking around to see who has bandwidth to do it for free

Ada: that's a big blocker, I know many of the people here are not paid to work on OpenXR

Alex T: could hook up the API once the heavy lifting has been done

Piotr: 2 comments, first one is related to how we can provide some kind of secure end for the CV algorithms to run
… Been trying to chat with people about challenges there; seems like main worry is side channels we can't fully patch up
… If there's already something like a secure claim or something that provides us that kind of capability into web platform I'd be very happy to use it
… but very concerned about how to devise this kind of mechanism; outside of my expertise
… Might not be feasible at this point but it's a topic that recurs every time we chat about this API
… Other comment is that it seems like we can try to leverage raw camera access as a way to prototype things on web platform
… Maybe then we can have ammo to justify moving some use cases into web platform, like building to browser as opposed to saying sites can do the same thing in Javascript
… QR codes, tracked images, curved images being popular could be enough to justify putting the work into doing it in the browser
… Let's see which ideas are popular and maybe for the ones that are super popular we can add to platform

Ada: Slight worry that many APIs implemented low-level version then people with passion/energy/bandwidth moved to other projects
… Might ship raw camera access then stop there
… Would be nice if there were simple alternatives
… If it's not done in 5 years I'll keep bringing it up

Klaus: most of what I wanted to say was covered; about doing this as Polyfill or browser side, there are libraries such as ___ which claims to be able to do this
… seen it working for some; an experiment on these lines could be helpful
… If you have a software limitation that can handle one type well, that could be sufficient
… One application wouldn't need to support two kinds of markers if it has one kind that just works
… If there's a lower latency or uses less power or something, that would be a reason for people to move to using the high-level API over the raw one
… doesn't mean we shouldn't be doing the low level API

Ada: never suggested not doing the low-level API, just want to make sure we don't forget the high level one

Nick: A couple of responses; first, talking about providing shader access to images
… As a data point, when we're doing image tracking, the output of the shader we run is very different than the original image, but is a 1024x1024 texture; not low bandwidth
… A lot of information we use to do subsequent processing
… The idea that you can do all your computation in a shader is not how WebGL works
… Not really sure what cool extra things you can do in other systems, but WebGL is about taking graphics and turning them into other things
… Another point was around implementations around things like polyfills and reference implementations
… Reference for polyfill would be very complicated; may include trade secrets
… Open source version is generally considered lower quality bar than ARCore or other solutions
… Not the kind of problem where all implementations are created equal
… Even for QR code tracking, there are open source libraries that do very good tracking but have nontrivial implementations
… Often take a high quality, complicated solution and use web assembly to make a version of it
… Your reference stack is a giant black box
… On topic of doing processing in GPU, when you do QR code scanning the only thing we're doing on a shader is shrinking the image before passing to QR code detector
… there are legitimate reasons to get full images out of a camera feed

Ada: thinking about low quality, that's fine; there are lots of companies that make money by providing good SLAM built on top of camera access
… Some companies may need something like 8th Wall to provide more stuff than what the higher level API would provide

Nick: becomes more concerning when "official browser level polyfill" sets expectations around this is how things are supposed to work
… could be the thing that everyone uses, even though it may not work in the way you want
… Different expectations and standards setting vs finding a library on the web and using it

Alex Cooper: wondering if there is such a clear-cut line of what different runtimes can support or polyfill
… almost like having 2 portions where image/marker tracking would make sense
… may need to be concerned about fingerprint effect, but can get some hint of "you can do high level tracking of QR codes on this platform" if there is a dividing line

Klaus: the issue is that you won't be able to query if features are available until you actually start a session
… you could potentially see if we have required features like marker and image tracking
… if you put in both as optional features then you only know at runtime
… Would be nice if people knew what's available but the way APIs are currently designed you won't know until you're already in the session
… whether your image is trackable or not
… Not meant as an "official" polyfill, agree with concerns that we don't want to platform something until it meets some quality and support bar
… Moreso thinking of proof of concept
… Should work decently but not necessarily with state-of-the-art performance
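Klaus's point about only learning what you got once the session exists can be sketched like this; 'marker-tracking' is a hypothetical descriptor, 'image-tracking' is the Chrome-experimental one, and XRSession.enabledFeatures is a later spec addition that not every UA exposes.

```javascript
// Sketch: request both tracking flavors as optional features, then see
// what the session actually granted. 'marker-tracking' is hypothetical.
async function startWithBestTracking(xr) {
  const session = await xr.requestSession('immersive-ar', {
    optionalFeatures: ['image-tracking', 'marker-tracking'],
  });
  const granted = session.enabledFeatures ?? []; // newer UAs expose this
  return { session, canTrackImages: granted.includes('image-tracking') };
}
```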

Ada: No last words? OK, I think we can wrap this up
… Not the answer I was hoping for but it's good to know the state of stuff
… Rest of the time was slated for Unconference topics
… There's a slide deck with a few topics listed

… There are three topics
… Let's do 15 minutes break, then 10 minutes per topic, then done


<laford> cabanier: Today we have input-profiles

Controller Meshes

...works ok. App responsible for going to the repo and downloading the gltf
… Some people copy them. Sometimes we want to update them.
… Implies a rename
… When you switch to OpenXR we received different joints resulting in broken hand mesh
… OpenXR could solve this
… Is it a necessary burden on the browser?
… Also strange that you have to go to a site to download arbitrary controller mesh data
… Some devs don't like it
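The repo flow cabanier describes hinges on XRInputSource.profiles being ordered from most to least specific, so an app (or the @webxr-input-profiles/motion-controllers helper it typically uses) walks that list until it finds an asset it knows. A minimal sketch of that matching, using real profile ids from the repo:

```javascript
// Sketch: pick the first advertised profile id we have an asset for.
// XRInputSource.profiles is ordered most- to least-specific per the spec.
function pickProfile(inputSourceProfiles, availableProfiles) {
  for (const id of inputSourceProfiles) {
    if (availableProfiles.has(id)) return id;
  }
  return 'generic-trigger'; // the repo's catch-all fallback
}

// e.g. a Quest 2 controller advertises roughly:
const quest2 = ['oculus-touch-v3', 'oculus-touch', 'generic-trigger'];
```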

alexturn: Currently discussing it in the working group
… If goal is removing the need for access to the repo, might need to do other things to finish the job
… It also contains data to map buttons to physical controller features
… People would still need to go to the CDN for that

cabanier: CDN won't go away, but have the option to use stuff on the device

cabanier: For WebXR, totally reasonable to return a GLTF

alexturn: From OpenXR we're giving out the same binary data that is the repo
… part of the render model discussion is how much data to give out and what the capabilities of the model would be.
… Is it "here's a model and here's how to articulate it, or something else"

bajones: Could implement an effective polyfill
… The input profile repo is never going to go away as that's the location where we register the profile info as well as the meshes
… Its an open question how much we need to solve these problems
… You can presume the models might load faster locally

<Zakim> alexturn, you wanted to talk about the OpenXR mapping

alexturn: Would love to be able to encode the mapping from WebXR input-profiles to OpenXR interaction-profiles
… Would be fairly straightforward
… Have it be data-driven

bajones: Another option is to match whatever OpenXR is doing and flow it up through the layers
… Would effectively consecrate GLTF as a web standard

3D DOM elements

alexturn: Difference between model tag (inline 3d DOM) and scene describing tags
… Maybe there is overlap and there is alignment
… Sphere is not as useful unless it is part of a suite of scene primitives

ada: Additional CSS where there is a keyword that applies transforms in 'real space'
… e.g. in real space this object is X far out and Y skewed

<dom> @media immersive { }

ada: would prefer this over new scene description stuff
… Instead we can leverage existing css3D transformational stuff
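None of the following exists; it is just a sketch of what Ada's "real space" keyword idea might look like, reusing dom's `@media immersive` token from above. The physical "m" unit is invented for illustration — CSS transforms today have no real-world units.

```css
/* Hypothetical sketch only — no UA implements any of this. */
@media immersive {
  .panel {
    /* place the element half a metre out and angled, in real space;
       the "m" unit is invented for this sketch */
    transform: translate3d(0, 0, -0.5m) rotateY(20deg);
  }
}
```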

bajones: Agree with Ada. If you were to do it you'd not pick up any CSS with 3D transforms. You'd break everything!
… People do stupid things to do what they want
… CSS already has lots of fundamental 3D capabilities

<alexturn> CSS Unflatten Module Level 1

bajones: Biggest challenge bar rewriting CSS implementation, is how to contain the scope
… e.g. how do you keep the volume of your page to something reasonable?

bajones: You'd have limits on how big volumes can be

<dom> [this sounds a lot like what CSS has to deal with for the "print" media feature]

bajones: Magic Leap has a definition for this stuff that may be leveraged

<dom> [where you can declare the size of the page your styles are targeting, e.g. A4 vs US-letter]

bajones: Content beyond z bounds clipped

<Zakim> dom, you wanted to react to bajones to recall we were suggesting to wait for tomorrow for this discussion

dom: Reminder that we wanted to wait till tomorrow

Ada: "its cool beans"

cabanier: ML has 3D CSS, went through CSS working group
… Proposal, but backburner'd

<dom> https://github.com/w3c/csswg-drafts/issues/4242

Ada: Can still work on it and push to the CSS WG when it makes sense

<dom> https://github.com/w3c/csswg-drafts/issues/2723

<yonet> https://github.com/immersive-web/model-element

<dom> 5.5. Detecting the display technology: the environment-blending feature in CSS Media Queries 5

<ada> Here is the repo for 3D dom stuff: https://github.com/immersive-web/detached-elements

Accessibility & W3C Transparency

<dylan-fox> Immersive Captions CG Final Draft https://docs.google.com/document/d/1P-T5S9pDBbcAGrlJDvbzG0QBLTV1GfrtabfkmohZP6w/edit?usp=sharing

<yonet> https://www.w3.org/community/immersive-captions/

Dylan: the W3C immersive captions community group has put out a final draft
… More about lived experience side vs technical implementation side
… Second link is project to define Accessibility Object Model
… Intended to make immersive content accessible
… e.g. alt-text for 3D objects
… Huge topic that a lot of folks are talking about
… Ensuring that people can leverage what we are working on
… How can we tie all these together?
… Should not duplicate work
… Want to make sure WebXR has the best shot it can in supporting accessibility

<dylan-fox> A11yVR Meetup - Apr 12 2022 - Building a More Accessible Social Virtual Reality World https://www.youtube.com/watch?v=yF4I263OiMs&ab_channel=A11yVR-AccessibilityVirtualReality

<dylan-fox> XR Semantics Module https://www.w3.org/WAI/APA/wiki/XR-Semantics-Module

<dylan-fox> XR Access Symposium, June 9-10th http://xraccess.org/symposium/

<dom> [the XR Semantics Module is part of of the Accessible Platform Architectures Working Group wiki FWIW]

<dylan-fox> Contact Dylan Fox, Coordination & Engagement Team Lead info@xraccess.org


<Zakim> ada, you wanted to with captiosn and video

Ada: Captions on spherical WebXR layers has an oversight. Should have been part of the platform and not something users need to implement
… Implementation doable but need to do all user experience stuff yourself
… If I'm wrong and you can just have a video element with subtitles, that should work in WebXR as you attach it to a layer, but it might not look correct
… If it doesn't look correct, then that is an issue with video elements on WebXR layers

<dylan-fox> XR Accessibility project - open source resources: https://xra.org/GitHub


<yonet> Zero Zero, 826 Folsom St, San Francisco, CA 94107

Minutes manually created (not a transcript), formatted by scribe.perl version 185 (Thu Dec 2 18:51:55 2021 UTC).

