W3C

Immersive-web WG/CG TPAC 2020

06 November 2020

Attendees

Present
ada, adarose, alexturn, alexturn_, bajones, bajones_, Brandon, cabanier, cwilso, klausw, Laszlo_Gombos, Manishearth, mounir, rik
Regrets
-
Chair
-
Scribe
ada, alexturn, Alex Turner, cabanier, klausw, yonet

Meeting minutes

Performance Improvements (@toji, @manish, @cabanier)

bajones: We talked earlier this week about what's left to be done in the performance improvements repo

bajones: We could close it out entirely or assert that we've addressed it

<bajones> https://github.com/immersive-web/performance-improvements/issues

bajones: I scrubbed through this to write down where we are on individual things

bajones: First is hidden area mesh

bajones: This masks off a certain area of your output surface that won't be displayed, so you can safely not render those pixels

bajones: Not sure if Oculus does this, Valve claimed 20% but maybe that's lower with reprojection

bajones: OpenXR has this - not sure how widely it's supported

bajones: Something we could still expose

<klausw> clarification: when using reprojection, having a larger rendered area is potentially useful since content might become visible

cabanier: Kip says fixed foveated rendering could get the same benefit

bajones: Yea - and fixed foveated rendering is just a bit to flip, vs. having to do a prepass

alexturn: Gives a benefit if your app is fill-bound

alexturn: Valve and MS at least had it in their native APIs, even with reprojection

alexturn: Oculus, Valve and MS are supporting the extension in OpenXR

bajones: Good to keep this in our pocket - nobody asking for it yet

bajones: Enable devs to select between multiple view configurations

bajones: For systems like Pimax, it would be ideal to render all 4 views

bajones: But if app isn't ready for that, it can render in a compat mode

bajones: We do have "secondary views" in the spec

bajones: If you leave it off, you get two views

bajones: If you turn it on, you can get extra views

bajones: Not just for multiple views per eye, can be for first-person observer too

bajones: Less flexible than the OpenXR approach, but it's also less fingerprintable

bajones: From my perspective, this is actually solved - waiting for further feedback

klausw: Correction, the Pimax has two screens - its compat mode is about parallel projection

klausw: We don't have a way to give apps that parallel assumption in WebXR - perhaps that's OK?

klausw: Could also want to let apps avoid double rendering if the secondary view overlaps the primary view

alexturn: Secondary views seem good for HoloLens needs

alexturn: Not sure how this works for Varjo though where the two inner views change if you opt into 4 views

bajones: The spec says primary views are required to get an experience and secondary views aren't necessarily needed

bajones: Technically meets the spec text since you'd still get something in the headset if you ignore secondary views even after enabling it
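
A minimal sketch of the opt-in described above, assuming the "secondary-views" optional feature descriptor from the core spec (support varies by UA and device); a render loop that simply iterates pose.views keeps working whether two or more views are reported:

  // Sketch: gl is an XR-compatible WebGL context created elsewhere.
  async function startSessionWithSecondaryViews(gl) {
    const session = await navigator.xr.requestSession('immersive-vr', {
      optionalFeatures: ['secondary-views'],
    });
    session.updateRenderState({ baseLayer: new XRWebGLLayer(session, gl) });
    const refSpace = await session.requestReferenceSpace('local');
    session.requestAnimationFrame(function onXRFrame(time, frame) {
      session.requestAnimationFrame(onXRFrame);
      const pose = frame.getViewerPose(refSpace);
      if (!pose) return;
      for (const view of pose.views) {
        // With the feature granted, pose.views may hold more than two entries
        // (e.g. a first-person observer view); ignoring the extras still
        // yields a valid primary-view experience, as noted above.
        const viewport = session.renderState.baseLayer.getViewport(view);
        // ... draw the scene for this view into `viewport` ...
      }
    });
  }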

bajones: Multiple viewport per eye support for lens-matched shading, etc.

bajones: I believe the layers module already covers what we intend here

bajones: Maybe foveated is good enough for now

cabanier: The layers module does support foveated rendering with just a bit you flip on

bajones: Is it a bool or a float?

cabanier: It's a float, to control the amount of foveation used by the compositor

bajones: Being able to say "ahh, I just want a bit of foveation here" is probably the right thing for the web

cabanier: Definitely seems easier

bajones: Some of the lower-level techniques get pretty arcane

bajones: Some techniques have fallen out of favor - the new technique lets you render another texture to say what shading detail to do in each block
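
A minimal sketch of the "just a float" control discussed above, assuming the Layers module's fixedFoveation attribute on XRProjectionLayer (0.0 = no foveation, 1.0 = maximum); the compositor is free to clamp or ignore the hint:

  // Sketch: assumes the session was created with the 'layers' feature and gl
  // is an XR-compatible WebGL context.
  function createFoveatedLayer(session, gl) {
    const binding = new XRWebGLBinding(session, gl);
    const layer = binding.createProjectionLayer();
    // Ask for a moderate amount of foveation; how the periphery is shaded is
    // left to the compositor rather than to app-level low-level techniques.
    layer.fixedFoveation = 0.5;
    session.updateRenderState({ layers: [layer] });
    return layer;
  }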

bajones: For supporting headsets with fields of view beyond 180 degrees, libraries could get confused and frustum cull incorrectly

klausw: Could we force some of the non-weird configurations to look like a weird config, to get people to handle the general case consistently now?

cabanier: The way three.js is doing this today is wrong for Magic Leap, because they assume it just points forward

alexturn: Windows Mixed Reality has a "jiggle mode" feature which can randomize the rotation and FOV

alexturn: Could be used through a UA to test WebXR engines for correctness here

<ada> https://github.com/immersive-web/administrivia/issues/142

ada: We'll go through the remaining perf topics on Tuesday

Anchors (@bialpio, @fordacious)

bialpio: do we want to discuss cloud anchors
… the biggest issues are already solved with regular anchors

bajones: yes

bialpio: right now there is a way to create persistent anchor to persist across sessions
… is this something we want to do because the charter says that we shouldn't focus on this
… will you anchor from your phone work in a hololens
… I don't think we can solve this because it implies some sort of format that can serialized across sdk, devices, etc
… I don't see how this group could solve this without support from another platform
… do other people have thoughts?

alexturn: this is a tough one
… in OpenXR we have the same issue under discussion
… even if we agree on the format
… it is not easy. Some vendors have a cloud
… if you're on an ARCore device, everything comes from that
… with anchors, there are Google cloud anchors; Azure Spatial Anchors work on ARCore
… it starts to feel less like a platform thing that you want to do with a browser

<Zakim> bajones, you wanted to ask Alex if there's been an OpenXR discussion about support for anchors/cloud anchors

bajones: has the topic of cloud anchors been brought up in OpenXR?

alexturn: yes, this topic was brought up there
… those concerns stalled a solution

bialpio: maybe we can build on the image tracking API
… maybe we can push it into the frameworks
… can we describe a format that is understood by all the frameworks?
… and emulate the cloud-iness. Is this something we could do
… or do we think it's something we can't? Is it feasible?

alexturn: what pieces need to be in place for Azure anchors?
… on some platforms it might be OK
… but I don't know how you would do it
… we would need some type of web API like "get anchor blob"
… and it would be the Azure spatial blob
… but how would that work on an ARCore device?
… the cloud SDK that the developer takes in could use images
… the SDK could make the right decisions, but it's not obvious how it could be done in a common way

bajones: I can't figure that out in a reasonable way either
… with Android, there's an ARCore SDK that works on iOS
… so there's nothing magical there
… but even that is not quite good enough, since Safari will be using ARKit
… the only way this works is if there is a common serialization format and we can piggyback on that
… or we do a ton of work; most systems throw their information in the cloud
… which is not going to work for us
… we'd need to come up with our own backend and then somehow push that to the system

bialpio: going back, we don't need to run our own 8th Wall-style algorithms
… maybe we can do a point cloud API
… a serialized way to describe the anchor
… there is a delivery method
… which should be independent of the cloud
… so we're not reliant on the cloud
… the challenge is whether it's possible for us to get this serialization format
… how much help will we get?
… and maybe we need to be smart
… to get a distilled version of the anchor

alexturn: maybe this will happen when the industry is ready to converge
… are there trade secrets?
… even if we're exposing it, are there vendor-specific extensions?
… the blob itself should not be vendor specific
… I'm unsure how I would have the conversation
… what should be in the blob?
… on Microsoft's cloud,
… Google can provide extra data for their anchors, but not so on iOS
… whatever signals are available (???), and then if each platform has special sauce to make them even better, I'm unsure how we'd extract that

bialpio: I agree. It might be too early to talk about this
… but this is a common request
… I think part of the use case could be solved with image tracking
… it still might be good enough
… we should have a partial answer for a more persistent way to get anchors across sessions

alexturn: for vendor stuff, it feels like an OK place
… in practice, developers grab a specific SDK
… handling that at the library layer
… I wouldn't be upset if we land there

ada: it would be great if we could have a talk
… among vendors
… this group could be the venue
… or if we need to change the charter
… then maybe they can find common ground
… it feels similar to WebAssembly
… so it would be really great, and as a chair I would like to help

cabanier: ...

bialpio: if we're living in a world where there are cloud anchors
… we can push it down the stack
… is this something we would like to do?

alexturn: the question is by what mechanism - how would Azure anchors end up on Android?
… or would this be served by the web developer?
… if privacy is impacted (???)
… how do you decide who's allowed in?
… local anchor persistence
… HoloLens has several versions of anchors
… I'm not sure if everyone agrees
… but maybe the web could abstract that

<Zakim> ada, you wanted to talk about persistent anchors, and promotability when api support

alexturn: so we don't have to give the blob to the developer
… in the WebAssembly case, the underlying thing that we're abstracting is specified
… but here the blobs are opaque and vendor specific
… and maybe that is where the stepping stones are
… eventually people might converge
… but that conversation needs to happen first
… I wonder if people will just wait and see
… I'd love to see cloud anchors happen
… but we need to have agreement on the blob format

ada: it would be great to have an intermediate step
… it would be good to have persistent anchors first
… and it could be something we could do before we do cloud anchors
… it would be great if we could have buy-in from vendors
… I'm wary about vendor specific solutions

bajones: persistent anchors sound a lot like shader caching
… on all the browsers, if you pass the same shader string, you can pull the prebuilt shader out of the cache
… but it only works if you're on the same OS, driver, etc.
… it's circumstantial, but we can experiment with it first
… the web is full of people who can figure out how to do something useful
… you can look at cloud anchor blobs
… because the platform controls the storage of the anchor on the device and in the cloud
… and I'm not convinced that the native formats will stay stable
… it would be an interesting exercise to get the vendors together to freeze their formats
… persistent anchors are a more realistic goal

alexturn: bajones covered what I was going to say

<alexturn> My favorite YouTube channel for this stuff: https://www.youtube.com/c/RetroGameMechanicsExplained

alexturn: looked into lighting up babylon native on HoloLens, porting to babylon.js

RWG + Depth Sensing, Module Breakdown (@toji)

alexturn: look into what future modules for RWU could look like
… how to split up modules that deal w/ generic topic of RWU
… lowest common denominator features - in WebXR, hit test
… following on, when doing the hit, hit is on a plane from RWG API, enumerate RWG objects
… two other topics came up since then:
… assuming there's hit test, and a RWG module for planes / meshes
… depth API came up, camera aligned or correlatable with the color image
… could we fudge something similar enough for hololens or magic leap
… where camera isn't primarily used for rendering, but could get depth from mesh
… would this be used for occlusion?
… demos: duck behind chair or table, creature can pop out behind table
… is depth API or RWG better for this?
… optimal API for mesh for analysis purposes is perhaps different from mesh for occlusion
… if occlusion doesn't quite fit into RWG, may be too slow
… maybe another bucket that does more for you - an "enable occlusion" API
… similar to hit test, tell me where the hit is, doesn't care about underlying API
… for occlusion, could be based on depth or mesh. Do it silently in background and it just happens?
… does it fill in depth buffer (privacy risks)
… could benefit from being tech neutral and more abstract

bialpio: breakdown of modules was mostly done for editorial reasons
… not really trying to merge different aspects of the same concept
… all are different aspects of RWG or detecting user's environment
… just made more sense to create a new module w/ new feature to reason about user's env
… could decide to have in same spec or keep separate, I don't have a strong opinion here
… a bit tricky from my perspective to decide if this is the best approach, don't see it mattering that much

alexturn: let me change the question
… not so much about module breakdown, more about API
… one path for depth sensing, one path for a boolean occlusion=on/off
… is image tracking / marker tracking part of RWG/RWU API?

bialpio: currently, we've been looking at use cases and finding APIs to fit them
… not so much looking at cross dependencies between modules. Could do more to connect them.
… i.e., once we have plane detection and hit test, could have hit test refer to plane
… what if plane is also part of a mesh?
… haven't been looking so much at modules as a part of RWG, more use case focused.
… can see how to make progress here
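
For reference, a minimal sketch of the hit-test flow these modules would build on (WebXR Hit Test module, with 'hit-test' requested as a session feature); connecting a result back to a detected plane, as discussed above, would be an addition on top of this:

  async function setupHitTest(session) {
    const viewerSpace = await session.requestReferenceSpace('viewer');
    // One hit-test source along the viewer ray, polled every frame.
    return session.requestHitTestSource({ space: viewerSpace });
  }

  function onHitTestFrame(frame, hitTestSource, refSpace) {
    const results = frame.getHitTestResults(hitTestSource);
    if (results.length > 0) {
      const hitPose = results[0].getPose(refSpace);
      // ... place a reticle or content at hitPose.transform ...
      // A future RWG module could additionally report which plane/mesh was hit.
    }
  }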

bajones: answer ada's Q
… write-only depth buffer possible? short answer yes
… webgl framebuffer has no mechanism to read back from framebuffer
… only if it's based on a texture
… this is different in layers module where depth is a separate texture that can be sampled
… in terms of boolean occlusion on/off
… obstacle: with depth data coming out of ARCore currently (which may not be representative),
… it may be too noisy to use for occlusion. Doesn't get written into depth buffer.
… It's put into a texture, doing a feathered fade depending on how close it is to the object
… search for "tiger" in Google Search
… uses native app to display it
… not a crisp line when hidden behind table. a bit noisy, falloff.
… tiger gets more transparent when getting closer to table.
… there's documentation for this on the native side, fairly complex change to rendering
… gaussian blurred depth output, not nearly as simple as depth buffer which would look really bad
… have doubts that planes would give a good occlusion result, tend to overextend
… like the idea in theory, but unclear how it would work in practice
… until machine learning gives better output

<Zakim> alexturn, you wanted to ask about post-processing the frame

klausw: ARCore limitation is for depth from RGB camera. It's better for phones that have a depth camera such as time of flight

alexturn: this variety is what makes me think it should be scenario focused
… apps shouldn't need to care about details, should just get suddenly better when adding hw
… is the feathering for performance? could this be done via postprocessing?
… occlusion done separately

bajones: don't know if it's been explored

bialpio: main reason why current shape of depth API doesn't treat occlusion as main use case
… the current tech on ARCore isn't good enough for occlusion from my perspective
… if we want to be scenario focused, this isn't the best API to get occlusion
… focused on getting information about environment, i.e. for physical effects such as bouncing off furniture
… may need different API for occlusion
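
A minimal sketch of the depth-sensing path being discussed, roughly following the Depth Sensing module's CPU-optimized usage (availability and formats vary; as noted above, current ARCore data may be too noisy for crisp occlusion):

  async function startDepthSensingSession() {
    return navigator.xr.requestSession('immersive-ar', {
      requiredFeatures: ['depth-sensing'],
      depthSensing: {
        usagePreference: ['cpu-optimized'],
        dataFormatPreference: ['luminance-alpha'],
      },
    });
  }

  function depthAtViewCenter(frame, view) {
    const depthInfo = frame.getDepthInformation(view);
    if (!depthInfo) return null;
    // Normalized view coordinates; returns depth in meters.
    return depthInfo.getDepthInMeters(0.5, 0.5);
  }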

<Zakim> ada, you wanted to add both would work well

ada: if we were to do the depth-write case, it would work quite well with a combination of both
… if feathering is outside of occluded content, by using depth twice
… once to block for performance reasons, once for feathering, could work
… if you think it's a thing that some platforms may not be able to do well, request as optional feature
… platforms can opt in if they think underlying HW is good enough

alexturn: this was a good discussion, seeing how to think through it
… could introduce another mode for RWG, less watertight but faster quick mesh
… page could say, if I'm on a device with quick mesh, use that, otherwise use depth sensing API
… offer multiple methods, engines can pick
… not every device can do good occlusion, return a quality setting so app can decide
… is it possible to find a good approach, i.e. postprocessing or prefill, to inherit best behavior for any given device
… what would give best results on something like ARCore?
… discourage use on older phones

klausw: depth sensors are fairly rare in Android phones at this point

bajones: use feathering on phones that don't have a depth sensor
… there aren't enough phones that have one

alexturn: you don't see many apps using feathering?

bajones: I've seen depth data used more for rough collision physics, where a 5cm offset wouldn't matter much

klausw: could feathering work as an xr compositing postprocess?

<atsushi> (sorry for interrupt. housekeeping for minutes)

bajones: might work, but would require reading back buffer which causes pipeline bubble
… what you'd be doing is taking rendered scene, force it to resolve, use depth buffer, resample into another output surface
… think it's doable, but would be awkward, and weird interactions with antialiasing
… would be pretty expensive

ada: briefly look back at core question, how do we think about real world sensing and modules
… may want to break these things out into features vs trying to deliver a module to see the world described
… i.e. here's a feature for occlusion, vs. here's a depth buffer or mesh

alexturn: not sure if we have a chosen path
… path would be proposing an occlusion api
… other option would be more data driven and more specific for headsets
… some do it with depth sensing, others with quick mesh, would happen in occlusion repo to see if it should be a module

ada: is this something people would want to do soonish?
… create a repo?

alexturn: it's important for us. would like to champion / push forward

ada: anyone want to join alexturn?

bialpio: interested in outcome, but unsure if occlusion is the goal due to ARCore constraints

ada: will create this with alexturn and bialpio as leads
… occlusion repo

<yonet> Occlusion repo: https://github.com/immersive-web/occlusion

<ada> https://xkcd.com/221/

Marker/Image Tracking (@klausw)

<klausw> Marker tracking slides: https://docs.google.com/presentation/d/1_ivZwzNLDn54Q-6wUK3fKGAZgTQ5GJtOClgbVGrh-qw/view

<Zakim> ada, you wanted to step back a little bit and to

ada: when you say moving images, do you mean video?
… or an image that is handheld or attached to a moving object, as opposed to attached to a static wall?

klausw: the current prototype works as follows: you give a list of images...
… design constraints from the privacy point of view: in order to avoid user surprise, we don't want to track literature or the environment
… potentially, if you are using barcodes, we don't want all of the barcodes to be scanned
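
A minimal sketch of the prototype flow described above: the page supplies its image list up front, so nothing else in the environment is scanned. The shape below follows the Chrome incubation behind these slides and may well change:

  async function startImageTrackingSession(imageBitmap) {
    return navigator.xr.requestSession('immersive-ar', {
      requiredFeatures: ['image-tracking'],
      // Only the images listed here are ever looked for.
      trackedImages: [{ image: imageBitmap, widthInMeters: 0.2 }],
    });
  }

  function pollTrackedImages(frame, refSpace) {
    for (const result of frame.getImageTrackingResults()) {
      // trackingState is 'tracked' or 'emulated' (assumed stationary).
      const pose = frame.getPose(result.imageSpace, refSpace);
      // ... position content for image `result.index` at pose.transform ...
    }
  }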

ada: what's the difference between a natural image and a synthetic image?

klausw: QR codes on the slides

alexturn_: Just around the privacy stuff
… on HoloLens we required the web camera.
… you can figure out where people are if there is a specific QR code.
… we might want to have specific requests for QR codes

bajones: with ARCore and ARKit it takes time to process these images, around 30 milliseconds per image.
… I am wondering if there is a reasonable approach where we can say we have a cache of these images
… you can definitely take image bitmaps and say we have seen them
… maybe we can make that a little smoother on subsequent runs

klausw: it depends on the implementation too
… if you want to process ten thousand images, this won't be a good API

klausw: it would be nice if we could preserve more privacy
… I think with some features, like camera view, if it is facing the camera it is possible to pick up things from the user's environment

alexturn_: would the idea be that we have some of these classes we can use? We could get pretty universal coverage, like QR codes, which you would need to feature detect as the developer?

klausw: if we think it is an important feature, maybe we can use computer vision.
… it would be doable but costly

alexturn_: you could put the image in the QR code, and it is small.

klausw: it is possible to make QR codes tracked by adding an image to them, with image tracking

ada: do we need different platforms to tell us, I can do markers or images?

klausw: we would benefit if we have a common api for both.

ada: would it be like a same api surface but different feature

klausw: more like features

ada: in the bit where you are passing in the image, you would say, find me the qr codes

klausw: yes
… one thing I want to mention: image tracking or marker tracking has use cases around shared anchors.
… users share an image and get the same experience because they have common entry points. I wonder if someone has explored those use cases?

bajones: how persistent are the anchors... if I want to do an experience where I start everybody from the same source marker and
… move everybody in the same direction.

klausw: we have the tracking status "emulated", with the assumption that it is stationary
… basically, establish a tracking system with a stable anchor point

bajones: now I detected a marker and I will drop an anchor

klausw: it is something we should look into

piotr: if you are assuming that your image is stationary, you can create an anchor.
… we don't offer a helper right now, but it is doable with separate APIs
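
A minimal sketch of that "separate APIs" combination: take the pose of a tracked (stationary) image and create a free-standing anchor there via the Anchors module (assumes the 'anchors' feature was granted; the image-tracking names follow the experimental shape sketched earlier):

  async function anchorAtTrackedImage(frame, imageResult, refSpace) {
    const imagePose = frame.getPose(imageResult.imageSpace, refSpace);
    if (!imagePose) return null;
    // The anchor keeps tracking even if the image later leaves the camera view.
    return frame.createAnchor(imagePose.transform, refSpace);
  }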

<Zakim> ada, you wanted to ask about cancelling

bajones: If you are only using the image to get everyone at the same point vs if you are tracking the movement...

ada: if the image tracking is expensive, would it be possible to turn it off?

klausw: it would be useful to pause and continue
… if the application can give feedback

ada: if this is the end of the issue we could move on to the next topic.
… 5 minute break

<klausw> ada: I filed https://github.com/immersive-web/marker-tracking/issues/1 for your pause/resume suggestion

AR Use Cases (50 minutes)

ada: next subject is AR use cases, we can talk about the accessibility use cases too

klausw: I think the topic is the difference between headset and handset behaviours

klausw: the q has two parts: is the API good enough to give applications the info they need
… and are we doing enough to create a pit of success, so that a phone app works on a headset and vice-versa?
… have the people at Oculus had experience with running handheld experiences on their headsets?

alexturn: we've seen a good number of experiences work. it's definitely possible to paint yourself into a corner; we've been working with engines to make sure they work. some things that affect that are using features we do not support, like DOM overlay.
… I don't have the top set of blockers for, for example, someone building for phone and getting stuck

klausw: for model viewer it wasn't an engine issue, it's that the developer hadn't tested on a headset and had a simple bug

yonet: people put the content at 0,0,0 and offset the camera, which works on mobile but breaks in the headset

alexturn: does arcore offset the origin to put it in front of the phone?

alexturn: we had defined local space 0,0,0 to be at the device origin on the floor, so if they are placing it in front it could create this kind of incompatibility
… this shouldn't break things but seems to be a policy decision which is causing issues

klausw: IIRC it could be a near plane issue, with the application placing stuff within the near plane by placing it at the origin

alexturn: I forget what we're doing for the near plane in WebXR
… @yonet, if you could create a GitHub issue in webxr for these issues with the URLs, it could be helpful.
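
For the near-plane question, a small sketch of what WebXR exposes today: the page controls clipping planes through XRRenderState, so content placed very near the origin can be kept visible by shrinking depthNear (values below are illustrative):

  function setClippingPlanes(session) {
    session.updateRenderState({
      depthNear: 0.01,  // meters; the spec default is 0.1
      depthFar: 1000.0, // meters; the spec default is 1000
    });
  }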

alexturn: in OpenXR the decision you make first is to pick the device form factor,
… which is a decision that makes more sense for native; for WebXR, making that decision is harder
… WebXR was designed to be a pit of success.

ada: could we create a polyfill to fake immersive-ar in immersive-vr to make headset ar testing easier

yonet: could this be a chrome debugger feature?

bajones: this could maybe be done in the webxr emulator
… it could also be a fairly easy thing to do in an environment like THREE.js where you plug in an AR emulation plugin
… i don't think the chrome debugger would be useful for this

alexturn: babylon is doing something similar letting you drag the camera round so this could be done at the engine layer

alexturn: you could download the hololens emulator for windows which recently has support for using it through a VR headset

alexturn: specifically requires a windows vr headset

<yonet> https://github.com/MozillaReality/WebXR-emulator-extension

ada: if that supported more headsets I would love to write about it

yonet: do we know who is maintaining the WebXR Emulator Extension?

everyone: silence

aysegul: Rik is there an Oculus Emulator?

cabanier: yes, but it doesn't work in WebXR

yonet: so how do you debug?

cabanier: you can use adb

ada: you can set it up to work wireless

bajones: you need to install the driver, then go to about:inspect and set up port forwarding
… this lets you inspect the page and use a local server

cabanier: you can add the ip address of your computer as a secure host within the browser

bajones: sometimes you have to unplug and replug a few times

cabanier: that can happen if you have an adb server running

bajones: the Oculus Developer Hub is really useful; for example, it lets you capture screenshots from your desktop

cabanier: It also lets you turn off leaving immersive when you take off the headset


Minutes manually created (not a transcript), formatted by scribe.perl version 124 (Wed Oct 28 18:08:33 2020 UTC).