immersive-web 2019/June Face-to-Face 2nd day

05 Jun 2019



Present: NellWaliczek, trevorfsmith, cwilso, joshmarinacci, bajones, alexturn, leonard, ada, alexis_menard, Manishearth, cabanier, kearwood, daoshengmu, johnpallett, JGwinner
Chairs: Ada, Chris
Scribes: joshmarinacci, cwilso, John Pallett, trevorfsmith, johnpallett, Chris Wilson, alexis_menard, Manishearth, kearwood


<The_Room> The Google Hangout for today's meeting is https://hangouts.google.com/call/2b5ml1tc3hJvili7kb02AEEF

<cwilso> Argh! The info is actually: https://meet.google.com/soa-sipr-hmj

<cwilso> Dial-in: (US) +1 318-467-6139 PIN: 607 096#

<ada> Hey y'all!

<klausw_> Is dial-in enabled? I tried cwilso's info from yesterday

<cwilso> new info:

<cwilso> https://meet.google.com/soa-sipr-hmj

<cwilso> Dial-in: (US) +1 318-467-6139 PIN: 607 096#

Real World Geometry incubation


<atsushi> scribe: joshmarinacci


first item on the agenda is John w/ real world geometry


John: there is something you can try in chrome canary
... plane detection works. in the demo we can see the boundaries of the plane, rendered from js.
... there is a reticle which uses hit testing.
... what are the use cases for plane detection? some of the feedback on hit testing is a lack of control over how the algorithm works. we are adding more options to hit testing. ex: do you want the plane to go to infinity or not.

there are ways to do some of this w/ current hit testing api, but not enough.

John: other feedback from devs is that w/ the current api it's hard to do certain calls to action. need to be able to show the user more feedback during the hit testing process
... 3rd set of use cases: fundamental things about planes. you can measure distance between surfaces. this would be good to expose
... being able to communicate to the user what the system is actually seeing makes the world and physics seem more realistic
... design decisions in current version. data is synchronous, only available in raf (request animation frame), if you want data outside that you have to copy. we do this because it avoids jitter situations where the data lags a frame behind

<blair> I thought we solved this one frame behind issue with hit testing?!

john: shows two videos, one is async and one is sync. the one on the left has a 1 frame lag and it can lead to disorientation.

brandon: explain what we are seeing

<Zakim> NellWaliczek, you wanted to ask what is this "existing hit testing api? because the one in the explainer is not async

john: the point of this is to show diff between sync and async

<blair> nell has clarified, thanks

<blair> 1+

nell: the api already accounts for this. we decided sync was great.

blair: i don't think we should get too bogged down in this. the swimmingness shouldn't be an issue.

nell: have you explored the api in the way we explored before or is this orthogonal effort

john: this is orthogonal. we can look at how this maps back into the existing hit test api

<blair> I think this is orthogonal.

john: next slide

<NellWaliczek> https://github.com/immersive-web/webxr/blob/master/hit-testing-explainer.md

john: the idea w/ planes is that there is not a unique id for each plane in current design. currently you test if it's the same object. blair, you suggested adding a timestamp.

<bajones> +q to ask what platforms design was evaluated against

john: purpose today is to give an update on what we've got and where we are going with it
... samples show both older hit test detection and new plane detection. test it today in chrome
... any questions?

brandon: slides didn't cover what the team designed.

<Zakim> bajones, you wanted to ask what platforms design was evaluated against

<blair> SPEAKUP

<blair> :)

john: i know we were looking at commonality between arkit and arcore. there's been a general review of the differences.
... this api will map cleanly onto what arkit and arcore are doing
... i don't know how this would map onto HoloLens and others

blair: i posted some comments into the repo. it would be great to get feedback on them. we implemented this in XR Viewer. i started by looking at how HoloLens handles mesh data, inspired by that.
... biggest change is instead of having separate planes and objects, to have more of an object hierarchy. everything is a mesh, some are also planes, etc.
... this way if something new comes along then the api can handle it because it could provide a mesh and you'd do it the same way
... also having flags to let you know what things have changed, like verts, normals, structure of the mesh,
... could help you manage the data bandwidth better
... the suggestions i had in the repo are on refinement. i like the idea of requesting with an options array to refine what kind of real world geo is selected.
... someone pointed out that you could choose not to get certain things if you want. ex: i just want tables, and i'd just get tables
... i like the direction of this

nell: you're talking about the repo, not something google specific, right?

john: this presentation is to share what is going on. we want to make sure all of this feedback is captured in the repo issues. ex: investigate what oculus and hololens are doing.

<blair> +1 on getting MSFT and ML folks to comment on this

nell: the real world geometry repo

john?: I know some have opinions on this. i wonder if there's a way to have a point clouds api so you could add meshes and surfaces yourself in JS

people are used to the fact that you can do meaningful computation in js. anything which can be done in js should be. allows sharable libs and algorithms across platforms

these things become value added as software packages on top of the underlying data, which presumably is point clouds

brandon: if you expose a point cloud does the platform need to provide connectivity info for that? unless you know the points are well-formed in a certain plane you can't do much w/ the data

<blair> opinion: doing everything in JS is a terrible idea

that connectivity info is being computed by arcore on the point cloud.

<cabanier> +1

<ada> wasm?

john: the thing we are trying to avoid is establishing the one type to rule them all as far as geo for real world understanding.

the ability to get planes does not preclude other types in the future

<blair> (I totally agree that data at all levels should be available)

nick: I'm saying that if we open up the point clouds then you could add other stuff w/out changing the spec

nick: summary: if you go to a website that has to calculate it out, you'd have to start from scratch instead of from a cache. yes, that is an interesting point
... good point, i think a way to provide basic-level source data so we can do computation on top in the browser itself is interesting
... i think the planes api is interesting. will there be walls and ceilings or just floors?

john: it will vary. with the planes what you are really getting access to is what the underlying platform allows
... also the resolution will vary

nick: could we do a hint?

john: are you saying filtering the results you get back?

nick: give me your best guess for a wall

john: what you are saying is a more semantic approach. which is the other direction from giving point clouds

<cwilso> (The last few have been rik, not nick)

john: a semantic understanding
... there may be opportunities. if i see a ground plane w/ legs then it's probably a table.
... there are different layers you can provide. you can use a plane and get semantics from it yourself; we don't provide these semantics from the underlying platform

rik: are you adding point clouds?

john: not right now. we may explore

blair: i think it's pretty clear that what we want in an eventual api is to get data at lots of levels. if the platform supports it then we want a way to get it w/ user permissions. also important to provide progressive meshes and planes and more semantic objects like chairs
... if os is already doing this then why not provide it. otherwise we can't compete w/ native platforms. super important that we provide both

<Zakim> klausw_, you wanted to say is it a safe assumption that point clouds are fundamental and sparse enough to handle?

klaus: a quick point which may be moot. we shouldn't assume that every impl has point clouds.

<johnpallett> Apologies. The bit.ly link in the slides didn't work. Here's the correct link: https://storage.googleapis.com/chromium-webxr-test/r663922/proposals/index.html

klaus: we should expose data based on what the platform has natively and not do it in JS unless we have to

john: I don't know if every platform provides point clouds

rik: diff between point clouds and mesh points?

is there a kind of points that you would prefer?

<johnpallett> (to clarify: consistent point cloud capability that a site could build on reliably)

klaus: if it's depth points you'd have a ton of them

<johnpallett> (and I don't know the answer to the question)

nick: i don't have an answer right now. if something can be computed in an application layer then we shouldn't be arguing about the many ways the info can be processed. if there are meaningful distinctions between types of points so that libraries are less generalizable then that would be less helpful

on hololens a lot of the work is done in AI hardware. the only option is to decompress mesh to points and the app tries to put it together

depth camera data for holo does not escape the HPU. you wouldn't be able to obtain that data in an app context on a hololens

<Zakim> alexturn, you wanted to talk about differences in fundamental data

there might be a narrow form of data that is available for everything and options above and below.

nell: this was part of the original design of hit testing to account for this

john: note that i fixed the link above.
... try out the demos


john: this is a great discussion. keep going inside the repo

nell: there are these moments in the design process where the folks deepest in it would benefit from a broader review from the community

chris: who is broader community


nell: for these repos there is one or two people deeply active in this space. i mean a broader group within this working group
... in general we will get more isolated so having a cadence for wider review is useful

trevor: we can add to the actual calls


john: other repo. the explainer for this landed yesterday

<cwilso> https://github.com/immersive-web/dom-overlays/blob/master/explainer.md

john: i'm not going to go through any solutions we've explored because we haven't. instead i'll go through the problem space and the challenges we have identified. i'm posting questions
... feedback is welcome
... if you have ideas, please capture them in the github issues
... in terms of what dom overlays are: the use case is ... [pause]

<blair> can't see the slides

john: these slides are a poor substitute for the explainer.

<blair> the hangout is showing a black screen for the speaker

DOM overlay - approaches

john: there are a lot of places where having a 2d image is useful. one is a material picker or a call to action or others.
... options for internationalization give HTML advantages over pure rendering
... generally the use case is when i want to present a div or dom element over top of a scene and provide interaction
... challenges: pixel sniffing. privacy and security, timing attacks. the webpage should not be able to sniff pixels out of the dom.

<klausw_> cswilso: the presentation isn't showing in the VC meeting

john: others: input sniffing, clickjacking, cross origin risks
... we do not have solutions for these. we've just identified them.
... the next challenge is dom structure. we assume that the dom tree is displayed in one place. is it possible to display a subtree in a second place? some interesting overlap with full screen and picture-in-picture
... some of this is impl questions.
... if we don't do a subtree, should it represent the entire underlying document? fine for a phone but not a desktop tethered VR headset where the regular doc could be visible at the same time as immersive
... challenge: input. the conversion of screen touch into click events is an area with interesting history with how new form factors are introduced.
... how do we disambiguate if a ray happens to go through the dom, whether it's transparent or interactive or not-interactive? how do we avoid duplicate events. etc.

nell: this was part of the original design of hit testing to account for this

john: also non-input events like controller poses. we don't have answers. the key is having it behave in a consistent manner across device formfactors
... there is a proposal right now but we haven't implemented it yet. allow a single rectangular dom branch. not supporting control over z or occlusion.
... not allowing the site to place the overlay; the user agent will do it. the overlay *will* accept user input. the user agent is the best equipped to place the dom and make it useful to users.
... examples: a smart phone has an overlay that covers the entire screen. on an HMD with a moderate FOV, a head-locked ui. on a vr hmd it would be a rectangle floating in space
... i want to stress this will vary from device to device and not be locked to pixels
... the other non-goal: we don't want to allow the dom to be placed into the 3d scene. it's not associated with a particular anchor or object
... you can't assume a screen coord in the overlay will map to a screen coord in the 3d space
... this is the scope of the problem.
... If I had to summarize the problem: what if we had just one thing and the user agent has control, then what happens next?

<ada> https://github.com/immersive-web/dom-overlays

alex: are these intended to be modal?

john: assumption is they could sit around while you are using it

<Zakim> alexturn, you wanted to talk about how WinMR sends input to just one app in the platform

john: ex: paint color selector

nell: we had this discussion at TPAC for three days. trying to align these will be a non-starter because of ui formfactor differences.

john: i think it's different because this is a more bounded problem, no?

nell: no, we talked through most of this at TPAC. i'm a little confused
... are you saying there is new info, or have you chosen one of the options from TPAC?

john: this is still an open issue. this area of exploration is fresh for us.

nell: it seems like you are saying this is a brand new concept but three or four companies talked about this at TPAC

chris: i think this is a deeper exploration of one of the options

leo: you mentioned it doesn't address depth. z-space or 3space?

john: both. you have control over painter's-algorithm ordering, but it will always overlay the 3d scene. always over the environment. you couldn't use dom-ordering to push into the scene

leo: in the dom area if it covers the full screen will you allow touch throughs on transparent pixels

john: open question still

ada: you mentioned a controllable output channel. although you can't do occlusion, can you stencil out the occluded bits?

john: no. you won't be able to make any assumptions about where the overlay will be and it will vary by form-factor

josh: how do you position these semantically?

<cwilso> john: the idea of that is in the explainer, but it's not detailed.

klaus: this ui is not world locked so you can move your head so it shouldn't be an issue

john: the key requirement is that it must be viewable by the user and interactable. there's a lot of open flexibility beyond that. there may be other opportunities for user agents to manage this

nick: one of the things you mentioned is a special sub-tree of the dom. it sounds like how a-frame works. as soon as you have an a-scene in the dom it's basically a canvas and everything below is handled by aframe

we could imagine embedding some sort of html type renderer in your 3d engine

nick: a lot of frameworks have core assumptions about how the dom works. this might break those. if we want to maximize our ability to use html inside of webxr, what you are talking about, a layer inside, is more like a chrome outside of the xr experience where something is attached to the body in the dom and it just happens to be rendered in a way that is platform appropriate
... another way of side stepping this issue is to have other things which are attached to the document and it's up to the platform to figure out how to position/render them

<Zakim> klausw_, you wanted to say is there a recorded summary of TPAC discussions/conclusions? Not everyone was there.

john: this is interesting. we should think about it.

klaus: is there a recording of tpac?

chris: yes

klaus: while it's great it was discussed earlier lets keep it going. esp. since they didn't reach a conclusion.
... let's not stop because it was discussed at t-pac

<blair> we also had a long discussion about this at the F2F we had at Mozilla (last year?)

nell: i want to make sure we don't re-invent the wheel. let's not ignore prior work by blazing an independent trail.
... this was recorded and a presentation was posted.

<atsushi> scribe: cwilso

josh: what happens if we don't solve this? One of Blair's research assistants did a library that kinda does 90% of this using HTML canvas - quad in space in 3D.
... if this doesn't get solved, does it need to be?
... there are workarounds.

john: this work is going on in an incubation in the CG; this isn't WG work right now. We should go through that library and talk about some of the assumptions/limitations; managing DOM [wrt security] is hard

alexturn: that library might be a good baseline, then see what else Sumerian and others need on top of that.

<blair> can I follow up on this?


scribe: Josh to provide more info on the library.

blair: I can point you at the repo. John was suggesting there's lots you can't represent thru DOMtoCanvas.

<joshmarinacci> blair: to follow up. the big realization is that john said that there is a lot that can't be rendered w/ dom to canvas, but the structure of the lib gives you a far more functional approach to doing 2d

<scribe> scribe: joshmarinacci

blair: in relation to the dom overlay stuff, the real question is whether we could render html elements to canvas (questions about security); it would give us a vastly more functional approach to getting 2d into 3d
... jared did routing events and other tricky stuff. it worked very well and is surprisingly efficient
... lots of caching. pretty cool. provides an alternative. done demos where i show 2d on the screen, enter 3d, replace 2d elements w/ this layer and it works pretty well.
... let's talk more on the repo

<blair> (oh, and we have a paper on it accepted to Web3D if anyone wants to see it :)

josh: we don't have solutions yet. as you are reading the explainer, let's look for new solutions. some of this should be useful to the discussion.

<bajones_> Since it was requested: The relevant DOM overlay minutes from TPAC, with links to the presentation

<bajones_> https://www.w3.org/2018/10/26-immersive-web-minutes.html#item04

<kearwood_> PSA - PR for lighting-estimation explainer is up: https://github.com/immersive-web/lighting-estimation/pull/4/files?short_path=358ce32#diff-358ce32b724ff39fbbfd77ff2ab6ad30

<joshmarinaacci> this is the DOM layer library repo we talked about earlier: https://github.com/aelatgt/three-web-layer

<NellWaliczek> https://github.com/immersive-web/webxr/issues/630

Detached Typed Arrays

<johnpallett> scribenick:johnpallett

<trevorfsmith> scribe: John Pallett

<scribe> scribenick: johnpallett

I think we're good for scribing

brandon: hopefully this is a quick topic w/ a strawpoll at the end

issue 630

bajones: typed arrays seemed to be the right way to approach providing data. But feedback was you could rip data out and toss it over a worker boundary... not good.

nick-8thwall: more deets please

bajones: background: There is the concept in the browser of a typed array. Two concepts:
... 1) array buffer, a contiguous chunk of bytes with no formatting
... 2) different types of arrays on top (e.g. Float32, UInt8, etc.) which are views into the underlying array buffer
... this is what's being used for matrix representation
... the reason we do this is because WebGL and WebGPU will interact with data going into a shader with typed arrays like this
... formatting it this way preps us for being able to shuttle things into the rendering system. Particularly useful for projection matrices which are the primary things we're looking at here, since we don't want people mucking with them.
... the problem is that somebody can take the array buffer that underlies the Float32 array, and detach it (e.g. send it to a different context via PostMessage)
... and that's great for efficiency (no copy required!) but it also means the original context can't access it for thread safety, which means the Float32 array points at nothing, and crashes abound.
... (OK maybe not always a crash, but it's not good)
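The detachment hazard Brandon describes can be demonstrated with plain typed arrays and no WebXR APIs at all (a minimal sketch; `structuredClone` with a transfer list detaches the buffer the same way a `postMessage` transfer would):

```javascript
// A Float32Array is a view over an underlying ArrayBuffer.
const buffer = new ArrayBuffer(16 * Float32Array.BYTES_PER_ELEMENT);
const matrix = new Float32Array(buffer);
matrix[0] = 1.0; // pretend this is one element of a projection matrix

// Transferring the buffer (as postMessage to a worker would) detaches it
// from this context; no copy is made, which is why transfer is fast.
structuredClone(buffer, { transfer: [buffer] });

// The original view now points at nothing: zero length, reads yield undefined.
console.log(matrix.length); // 0
console.log(matrix[0]);     // undefined
```

This is the "Float32 array points at nothing" failure mode: not a crash in JS, but any code still holding the view silently gets garbage.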
... places where we're using Float32 arrays:
... a) projection matrix in a View, which can change from frame to frame
... b) rigidTransform attribute which represents the transform as a matrix
... c) XRRay in the form of a matrix
... all of these are 4x4 matrices
... so, there are 4 different options here for what we do (see issue #630 for more details)

Option 1: You break it, you bought it. Don't detach the data from the context.

<blair> ASIDE: will also come up when we expose video, or real-world-geometry, as these will probably be typed arrays too

scribe: if developers do bad things, that's on the developer. This is probably reasonable if we ensure that all places where data comes from are *snapshots* - today that's the case, e.g. each View is a snapshot from the frame. We're not trying to share that data across boundaries.

<blair> ASIDE: and Anchors and results from hit Test

scribe: if in the future we needed to carry forward this pattern we'd need to be careful, probably not horrible.

<Zakim> alexturn, you wanted to ask if WebGL/WebGPU could move forward to accept DOMMatrix as well, if that's the forward-looking type

Option 2: Quietly clean up breakage for the developer. There's copy overhead here, but only when the underlying buffer gets detached. [Nell] This is an edge case until we start dealing with multi-threading (e.g. workers), and that might happen in the not-too-distant future... then this won't be an edge case.
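Option 2's "quietly repair" semantics could be sketched roughly like this (a hypothetical `ViewLike` class for illustration only; a real user agent would do this internally and repopulate the matrix with current frame data):

```javascript
// Sketch of Option 2: an accessor that returns the cached Float32Array while
// its buffer is alive, but silently re-creates the view if it was detached.
class ViewLike {
  #matrix = new Float32Array(16);
  get projectionMatrix() {
    // A detached ArrayBuffer reports byteLength 0.
    if (this.#matrix.buffer.byteLength === 0) {
      // Re-allocate; the UA would fill this with the real matrix values.
      this.#matrix = new Float32Array(16);
    }
    return this.#matrix;
  }
}

const view = new ViewLike();
const before = view.projectionMatrix;
structuredClone(before.buffer, { transfer: [before.buffer] }); // detach it
const after = view.projectionMatrix; // quietly repaired: a fresh, usable view
console.log(after.length); // 16
```

The copy/re-allocation cost only appears on the detach path, which matches the "overhead only when the buffer gets detached" point above.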

Option 3: Give developer a new object every time, always get a new Float32Array. There's overhead, particularly if you have multiple places in the code accessing the same object, but it's more predictable.

Option 4: Switch away from Float32Array, i.e. move to DOMMatrix.

scribe: for consistency could say there's a read-only DOMMatrix. This could allow a more strict definition of a matrix.
... this would solve a couple of questions of ambiguity around how the matrix is presented. Would add overhead for the typical usecase since WebGL/GPU still want the float representation so there'd be a conversion.
... with allocations.
... also, DOMMatrix requires everything to be Double attributes, meaning that's one more form of conversion.
... but this could interact nicely with CSS mechanisms. If other aspects of the web platform need matrices they'd likely use DOMMatrix

alexturn: per Klaus' point, it's really Float64, seems a little odd that we have a strong type then we can't use it for the strong purpose in other places.

bajones: this does seem like a point-in-time thing.
... memory mapping mechanisms are probably a more likely approach for GPU memory management.
... there are a lot of different GPU memory data structure types, too. Possible that we could go to GL WG and ask for a variant that takes DOMMatrix natively, this might not be a terrible idea
... but it's still going to incur some overhead because they also would have to do the Float64>Float32 casting

Nick-8thwall: All options seem feasible. DOMMatrix assumes you will always have a view matrix (not a projection matrix) with assumptions about the 3x3 being for rotation, transform, etc. Weird to use it for a projection matrix!
... Float32 is kind of nice; need to be cognizant of not optimizing for efficiency if it's not an issue (e.g. these aren't huge data types). Should probably optimize for clarity and ease of use, instead of performance.

<leonard> +1 for Nick's comment on optimization for clarity over performance

bajones: Agreed this isn't the same kind of efficiency concern as uploading a large buffer of vertices. I'd be reluctant to introduce a new type of 16 values to represent a 4x4 matrix since we'll get called out on that

ric: note that DOMMatrix methods are designed to fail
... if used incorrectly

bajones: It'd be a hard sell to TAG to create a new type. Float32Array is sellable, so is DOMMatrix.

<Zakim> Manishearth, you wanted to mention that dealing with detaching seems to mostly be the concern of API consumers

Manish: Other things using Float32 arrays check and throw errors in certain cases, maybe we can ask WebGL to do this?
... webAudio supports detached arrays, it'll complain accordingly. Can we learn from WebAudio and do Option 1 without it being a silent failure mode?

bajones: I think option 1 is valid even if all other APIs did something terrible. We aren't here to handle the other APIs.

manish: If we can get WebGL to complain properly then Option 1 becomes more appealing.

ada: WebGL already throws an error if the array is detached, so we might already be there.

cwilso: webaudio does something more like option 2 where it copies it as necessary (at least it does now. Historically not so much.)

<Zakim> NellWaliczek, you wanted to ask about DOMPoint as doubles

<NellWaliczek> https://immersive-web.github.io/webxr/#xrrigidtransform

nell: given Klaus' point about DOMMatrix being Float64>32 are we concerned about using DOMPoint and DOMPoint read-only since they have the same 64/32 conversion issue
... e.g. in XR rigid transforms
... if the expectation is that we're encouraging folks to use whatever works for them, are we risking loss of precision, particularly when converting from 32-bit points to matrices...

<kearwood> +q To talk about float64 for world scale transforms

Nick: converting 64/32 is no worse than working in float (all the math is done in doubles anyway in Javascript)

bajones: you can create a DOMMatrix from a float32 array, so if people want a DOMMatrix for whatever reason, and we're doing Float32 arrays, it's not like they're incompatible.

Ric: Original work on DOMMatrix was trying to make it work for everyone including GL. GL team didn't like it, basically it's an input but immediately converted to GL structures.

<trevorfsmith> scribe: trevorfsmith

bradon: I don't have a problem with that.

Manish: It's fine and it works.

Brandon: In chrome SameObject is basically a noop. There's a tag that does it for you, internally to the function impl you cache it.

Manish: The pointer object can change but the Object is the same.

Ada: I have concerns with options 3 and 4 around garbage collection in a tight loop.
... Where they're working with the underlying data and then spawning a lot of single use objects that trigger GC frequently. Option 2 lets them mess up and have less performance but hopefully not so much because documentation will have examples that say to clone it first.
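Ada's "clone it first" guidance might look like this in application code (a sketch; `projectionMatrix` here is just a stand-in Float32Array, not the real API attribute):

```javascript
// Stand-in for a matrix attribute handed out by the API each frame.
const projectionMatrix = new Float32Array(16);
projectionMatrix[0] = projectionMatrix[5] = 1; // fill in some identity terms
projectionMatrix[10] = projectionMatrix[15] = 1;

// slice() copies into a brand-new ArrayBuffer, so later detachment or
// per-frame reuse of the original cannot invalidate the application's copy.
const stable = projectionMatrix.slice();

projectionMatrix[0] = 99;  // original changes (or gets detached)...
console.log(stable[0]);    // 1  ...but the clone is unaffected
```

The clone costs one small allocation, which is the trade Option 2 accepts in exchange for never handing the developer a dead view.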

Nell: So you want option 1 or 2?

Ada: Leaning toward option 2.

<johnpallett> scribe: johnpallett

strawpoll starts

<ada> Straw poll for the options 1,2,3,4:

nell: don't forget cost to implementers of change. Option 1 is no change. Option 2 is minor spec surface area change. Option 3 is turning attributes into functions, that's definitely a breaking change. Option 4 is also a significant breaking change.

bajones: also cost to spec editors goes up more with options 3 and 4

<trevorfsmith> 2

<ada> 2

<alexis_menard> 2

<Nick-8thWall> 3

<NellWaliczek> 2

<joshmarinaacci> 3

<alexturn> 3

<Manishearth> 2 > 1 > 3

<cabanier> 1

<kearwood> 3

<daoshengmu> 3

question: Is there a significant performance difference between options?

bajones: Yes. #4 is the least performant, #3 better, #2 even better, #1 the best

Leonard: but is it SIGNIFICANT

bajones: probably not an overt difference
... shouldn't break ability to do VR or AR if less performant options are done

Manish: We already do 4 or 5 allocations per frame

bajones: V8 team says we shouldn't worry too much about garbage collection particularly for things that only stick around for a frame or so, this isn't the only consideration of course

<JGwinner> 3, I don't have a dog in the fight, but 3 has best advantages and seemingly moderate disadvantages, so that's my vote.

Nick: for normal use this probably doesn't have a huge impact. If there was some algorithm analyzing 1M points then we'd have a challenge.

<cabanier> 2

<leonard> either 1 or 4 -- I see no point in going 1/2-way



<trevorfsmith> π

<cwilso> e


<alexturn> https://giphy.com/gifs/funny-lol-ne3xrYlWtQFtC

<ada> 6x2 5x3 1x1

all: <conversation devolves into traditional end-of-straw-poll hilarity>

ada: six #2s, five #3s, one #1s

bajones: by a narrow margin #2 is (non-bindingly) the guidance of the group. [nell: if we find other problems we'll let everyone know. We'll post a PR...]

manish: I can do spec text on this if you want!

nell: Yes please! <assigns>


What do you want out of focus/blur events?


nell: one of the last holdovers from WebVR that we haven't figured out yet is how to handle blur and focus events

<cwilso> https://github.com/immersive-web/webxr/issues/488

nell: the idea originally was that there would be reasons why the user agent would pause the page rendering immersively, but there are also reasons for inline as well
... note 'blur' is the phrase for 'pause' in this case
... may need to blur inline sessions when you can't see them
... feedback: names are confusing! this exists in the 2D web but in XR?
... we need to figure out better naming. Is there more than Brandon and I need to be paying attention to as far as requirements or use cases as we clean it up in June?

bajones: the platform may be the one that's initiating the suspend
... e.g. on oculus or vive there are platform-level events that let you know that you're being backgrounded. The idea was that these would be exposed through 'blur' events, that hasn't changed

john-oracle: folks who think in terms of cameras might try to use 'blur'?

nell/bajones: they can't use 'blur' for that purpose and it's a good reason to change the name

nell: the thing we're naming is 'you won't get input' or 'you're running at a lower frame rate' or basically 'your session isn't over, but you're not going to get things regularly for a while'. Typically this is a great reason to pause physics engines, or game countdown timers, or... lots of things you might want to do if you aren't foregrounded or visible

bajones: the results of getting this event are recognizing that new requestAnimationFrame calls may be processed slowly (or not at all!)

<JGwinner> how about "Attention" ... "attention lost" "attention gained"

bajones: if platform shows blurred out version of content behind it
... you will not get input while blurred for privacy/security reasons (threat vector: input sniffing)

nell: assumption is that if we get consistency about trusted UX this event would fire when trusted UX shows up so the experience can pause accordingly

bajones: not something developers are required to handle, it's a quality of life change so you can pause physics, notify people you're offline, etc. etc.
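The quality-of-life handling Brandon describes might look like this in an app (a sketch; the `'blur'`/`'focus'` event names follow the issue's current naming, which the group is debating, and the session object here is just an `EventTarget` stand-in):

```javascript
// Hypothetical handling: pause expensive work while the session is blurred,
// resume when focus returns. Nothing here is required by the spec.
function attachVisibilityHandlers(session, app) {
  session.addEventListener('blur', () => {
    app.physicsPaused = true;  // stop stepping the physics engine
    app.timersPaused = true;   // freeze game countdown timers
  });
  session.addEventListener('focus', () => {
    app.physicsPaused = false;
    app.timersPaused = false;
  });
}

// Stand-in session for illustration; a real XRSession is also an EventTarget.
const session = new EventTarget();
const app = { physicsPaused: false, timersPaused: false };
attachVisibilityHandlers(session, app);
session.dispatchEvent(new Event('blur'));
console.log(app.physicsPaused); // true
```

An app that skips this still works; it just keeps simulating into frames that may never be rendered.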

nell: we currently have no signal for 'the headset is not physically on' - might be a scary privacy signal to call this out as a separate signal
... this might be useful?

bajones: WebVR did have an event for this so you could start a proximity sensor event, which was problematic. More generally signals for whether headset is on (worn) or not are not totally reliable... it's a difficult thing to utilize.
... not against reintroducing that but need to make sure WebVR history is included in consideration.

nell: nothing going forward would be structured to give a signal about whether headset is worn or not for privacy concerns

<Zakim> Manishearth, you wanted to mention Window.onfocus/blur exists for this purpose on Window, so we have precedent

alex: there's a warning with blur events about developers not confusing them with page visibility. (see comment in issue #488)

<Zakim> alexturn, you wanted to point out that we need both "visibilitychange" and "focus"/"blur"

alex: visibility vs. input blur are two different things. With blur you may need to provide new pixels, even if logically paused; that's different than actually not visible.

nell: might be the same event with different attributes, but yes those are different signals.

alex: we have one enum on our platform: Visible, Visible not Active, Not Visible (absent, present-passive, present-active)
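The three states Alex describes map fairly directly onto app behavior. A minimal sketch, assuming hypothetical state names ("visible", "visible-blurred", "hidden") and a policy shape that are illustrative only, not from the spec; in a real session this would be driven by a visibilitychange-style event:

```javascript
// Hypothetical sketch: map the three visibility states discussed above to
// what an experience should keep doing. State names and the policy object
// are assumptions for illustration, not spec-defined API.
function visibilityPolicy(state) {
  switch (state) {
    case 'visible':          // fully presented and receiving input
      return { renderFrames: true,  runPhysics: true,  expectInput: true };
    case 'visible-blurred':  // still composited (e.g. behind trusted UI), no input
      return { renderFrames: true,  runPhysics: false, expectInput: false };
    case 'hidden':           // not presented; rAF may slow down or stop entirely
      return { renderFrames: false, runPhysics: false, expectInput: false };
    default:
      throw new Error(`unknown visibility state: ${state}`);
  }
}
```

This is the "quality of life" hook Brandon mentions: on anything but 'visible', pause physics engines and countdown timers rather than simulating against a stalled clock.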

manish: WebVR had a 'presentChange' event; can we use that? It'd need an enum, maybe?

nell: we pulled that one explicitly from WebXR because the pattern changed, but there is an event on the session itself to say that the session ended

manish: I mean, can we use the name 'presentChange' ?
... with an enum indicating what it changed to?

nell: might confuse people migrating from WebVR to XR

bajones: OpenXR has XRSession state which sounds like something we could consider using, enumerates "possible session lifecycle states"
... states: unknown, ready, idle, loss-pending, exiting, max-enum, and a few others I can't type fast enough to capture.

alex: bigger change to adopt all those states

<kearwood> (Brandon's idea of plopping in the OpenXR enums is better than what I was going to propose...)

alex: there's also a synchronous way this is delivered: when you call waitFrame you get a shouldRender bool indicating whether pixels will go anywhere...
... this might be a different way.

alexis: the semantics of pageVisibility events are understood by web developers, they already know what this means

<kearwood> (Correction... Not from OpenXR enums.. Page Visibility API)

alexis: e.g. they know what to do when they get this type of event. A token like 'visibilitychange' might be viable.

Leonard: In the case of AR systems through a camera, are they still going to display the camera pixels, and in a multi-user environment would you track users in a non-active state?

<kearwood> (Correction again.. These enums: https://www.khronos.org/registry/OpenXR/specs/0.90/html/xrspec.html#XrSessionState)

nell: AR session today doesn't have the page render camera pixels, it's the user agent
... need to word it in such a way that it's flexible enough that pixels aren't displayed, or could be out of sync with the real world during composition
... we may need a firmer stance on visibility-blurred vs. input-blurred in terms of whether requestAnimationFrame is pumped

bajones: there is a chunk in the spec called 'trusted environment' which isn't a great name; the general idea is that the user agent must have a trusted, tracked environment in the compositor that it can always fall back to, showing the appropriate safety information
... assumption is that the user agent needs to provide information to keep the user safe (e.g. indication of bounds)
... switching from page to trusted representation of environment is probably needed. Will be platform specific, not everyone will have capabilities to participate in those experiences.

nell: this will get resolved by end of June. If you have ideas afterwards please add to #488 ASAP




CG topic lightning talks - Ambient Lighting

<cwilso> scribe: Chris Wilson

<cwilso> scribenick: cwilso

<kearwood> https://github.com/immersive-web/lighting-estimation/pull/4/files?short_path=358ce32#diff-358ce32b724ff39fbbfd77ff2ab6ad30

kip: ^^ is an explainer for lighting estimation; apologies for not having more visuals.

<kearwood> Precomputed Radiance Transfer: https://en.wikipedia.org/wiki/Precomputed_Radiance_Transfer

<kearwood> Spherical Harmonics: https://en.wikipedia.org/wiki/Spherical_harmonics

<kearwood> Image Based Lighting: https://en.wikipedia.org/wiki/Image-based_lighting

<kearwood> Global Illumination: https://en.wikipedia.org/wiki/Global_illumination

<kearwood> Rotating Spherical Harmonics: http://filmicworlds.com/blog/simple-and-fast-spherical-harmonic-rotation/

kip: TL;dr: took the approach of triangulating across native apis.
... there's a combination of hardware sensors and LCD of simple radiance value
... at a minimum, implementations could just provide that.
... web content should expect to be progressive WRT what's provided.
... originally considered just a simple value.
... but broke them out because it's easier to understand.
... looking at native APIs - sw-only ones by Microsoft, ARCore, ARKit, LuminCore....
... all these things are implementable on these platforms,
... these aren't just theoretical.
... looked at other APIs, but didn't see much other than ultra-simple (e.g. gamma)
... just weeks ago, three.js landed support for SH (Spherical Harmonics)
... should the browser update this once and then it persists and you update for rotations?
... or is it promise/frame aligned?
... wanted to be conscious of time-based attacks on luminance (cf. comments in explainer)

Leonard: Is XRReflectionProbe equivalent to requesting access to camera and microphone? Why microphone?

<Zakim> bajones, you wanted to ask a whole lot of things :)

Kip: Just camera - microphone is similar.

Brandon: this is cool, thanks. I assume there's significant overhead to producing these values?

Kip: on some platforms, yes, but on others it's already been calculated.

Brandon: is there a reason this is on the frame?

Kip: open to feedback on this point. Might be reusing values. Would love to hear from native platform implementers.

Brandon: would be interesting to look at lifecycle across platforms.

kip: would it make sense to request up front?

<Zakim> NellWaliczek, you wanted to ask about frame alignment

Nell: might make sense to look at how we do this with hit testing - it's similar imo

brandon: I don't know much about SH - is the 9 value thing a standard thing?

Kip: yes.
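The "9 value thing" is standard: order-2 real spherical harmonics use three bands (1 + 3 + 5 = 9 basis functions), which is widely considered sufficient for diffuse irradiance (this is the representation three.js landed, per Kip above). A sketch of evaluating a single color channel from 9 coefficients, using the standard SH basis constants; function names here are illustrative, not from the explainer:

```javascript
// Evaluate the 9 order-2 real SH basis functions for a unit direction.
// Constants are the standard normalization factors (Ramamoorthi & Hanrahan).
function shBasis9(x, y, z) {
  return [
    0.282095,                   // L00  (constant band)
    0.488603 * y,               // L1-1
    0.488603 * z,               // L10
    0.488603 * x,               // L11
    1.092548 * x * y,           // L2-2
    1.092548 * y * z,           // L2-1
    0.315392 * (3 * z * z - 1), // L20
    1.092548 * x * z,           // L21
    0.546274 * (x * x - y * y), // L22
  ];
}

// Given 9 SH coefficients for one channel, reconstruct the value along
// a unit normal: a simple dot product of coefficients with the basis.
function evalIrradiance(coeffs, x, y, z) {
  const basis = shBasis9(x, y, z);
  return basis.reduce((sum, b, i) => sum + b * coeffs[i], 0);
}
```

A perfectly uniform ambient light puts all its energy in the constant L00 term, so the reconstructed value is the same in every direction, which is why SH degrades gracefully to the "simple radiance value" minimum Kip describes.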

Nell: it would be nice to have those links up above ^^ in the explainer itself.

Brandon: something here suggested that we could only do indirect radiance? What's the privacy concern?

Kip: maybe we don't need it?

JohnPallett: sun position

kip: ah, yeah. time of day.

Nell: This doc is wonderful, btw. Explains context and background before diving into design. Helps frame things.
... and thanks for looking across the ecosystem.

<Zakim> johnpallett, you wanted to ask about interchangeability between lighting approaches, also to appreciate privacy considerations :)

JohnPallett: is this SH-lite?

Kip: no.

JohnPallett: what's the overlap between these?

Kip: an SH can reflect 3 things in one structure - but if you had a simple renderer that only did hard shadows, you'd have to compute from that.

JohnPallett: it's not assumed these are all convertible to each other?

Kip: correct.

CG topic lightning talks - Update on Spatial Favicon

ravi: update on spatial favicons.
... important point on sizes:
... what is the "size" analog for spatial icons? for 2d images, this was equated to quality/LoD
... I came up with static bounding box of asset as estimate of "size" - then scale consistently.
... in theory, these units are meters, but in practice, all platforms interpret as a unit and scale.
... there's a FB extension in glTF that uses the static bounding box as a metric for LoD, but this approach has issues - e.g. Head+Torso has larger bounding box than head-only.
... [although this is probably the opposite of intended]
... pros of bounding box approach: no changes to spec, analogous to 2D size.
... cons: non-trivial calculation, and currently no tool to calculate.
... alternatives: new attribute: "lod": "low" | "medium" | "high". but authors could interpret differently
... or use 'size' attribute with vertex count x primitive count as values - but this is a mismatch in name/value.
... MSFT_lod has the same problem, so it's not a solution.
... Other update, we opened a webapp manifest issue to address.
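The bounding-box metric Ravi describes is straightforward to compute, even though no tool exists yet: take the static axis-aligned bounding box of the asset's vertices and report its largest extent as the analog of a 2D icon's pixel size. A minimal sketch, with illustrative names and nominal meter units (not from any spec):

```javascript
// Hypothetical sketch of the proposed "size" metric for a spatial favicon:
// the largest extent of the asset's static axis-aligned bounding box.
// vertices: array of [x, y, z] positions (e.g. flattened from glTF accessors).
function boundingBoxSize(vertices) {
  let min = [Infinity, Infinity, Infinity];
  let max = [-Infinity, -Infinity, -Infinity];
  for (const [x, y, z] of vertices) {
    min = [Math.min(min[0], x), Math.min(min[1], y), Math.min(min[2], z)];
    max = [Math.max(max[0], x), Math.max(max[1], y), Math.max(max[2], z)];
  }
  // Report the largest axis extent as the single "size" value.
  return Math.max(max[0] - min[0], max[1] - min[1], max[2] - min[2]);
}
```

Note this illustrates the con Ravi raises: a head-plus-torso asset has a larger box than a head-only asset, even if the head-only model is the higher-quality icon.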

<NellWaliczek> https://github.com/KhronosGroup/glTF/tree/master/extensions/2.0/Vendor/FB_geometry_metadata

<leonard> three.js has a tool to generate the bounding box

<NellWaliczek> https://github.com/KhronosGroup/glTF/tree/master/extensions/2.0/Vendor/MSFT_lod

rik: what ravi was saying is that in the real world, sizes are used to pick LoD, not size. So maybe it would be okay.

ravi: will add link to deck.

<leonard> size or #polys is not the sole measure of quality. You can have low poly and high texture too

thanks, let's break for lunch


<ravi> @cwilso yes, i intended the other way

<ravi> Here is the issue opened on Web App manifest: https://github.com/w3c/manifest/issues/763

<ravi> Link to slide deck: https://github.com/immersive-web/spatial-favicons/blob/master/design_docs/SpatialFaviconSizes.pdf

<ravi> Link to issue: https://github.com/immersive-web/spatial-favicons/issues/5 where we can discuss about this

<ravi> @leonard yes, that is true, but again these are just "hints" and not a complete reflection of quality

<JGwinner> Note: JGwinner, John Gwinner, is an independent VR/Technology author, I don't directly work for Oracle (or Samsung).

Coordinating WebXR frameworks

<trevorfsmith> https://transmutable.com/images/wider-web-stack.png

<ada> scribenick: bajones

Trevor: Felt like it would be good to look up-stack from the implementations
... "How do I get a Unity for the web" is a common refrain
... (Intentionally messy chart) looking at exemplars of clients and services
... Want to take a moment to do a bit of coordination to make sure we're not duplicating effort.

Starting with Nell

Nell: Want closure on when the implementations were going to be in a place Sumerian could target
... Now unblocked after discussion at this meeting
... Would be smart to have Sumerian ready in roughly the same timeframe as the implementations
... Talking about AWS (Sumerian) hosts, an amalgamation of several services, such as speech processing.

Trevor: Can you give an overview of the set of things that Sumerian is doing?

Nell: Now that we know what version we're targeting we'll use the polyfill to hit all of our targets.
... Can use Sumerian in app containers, would love to get a more unified story between the UAs there.

Trevor: (Talking about Nell's gamepad-mapping repo) There's a layer above that that's common to every framework of mapping input to intents/actions.
... Would be good to have future discussions about that.

Nell: Description is pretty vague, but the Amazon team would be interested in open source collaboration in that area.

Josh: Mozilla has a content group that does Firefox reality, spoke, hubs.
... Wondering what they could do to help 2D (traditional web) designers progressively enhance with immersive content.
... Embedding 360 videos, 3D models, etc on a 2D webpage.
... Extension to see if it's useful to place content in an immersive environment.
... Looking into extensions like multiview and basis (compressed texture) support
... We are entirely outside MDN's purview

<Zakim> cwilso, you wanted to talk about MDN, Omnitone

Asked about state of WebVR, didn't know at the moment since developer recently left the company.

Chris: Working with Joe Medley to ensure everything WebXR-related that gets into Chrome is documented

<NellWaliczek> https://github.com/immersive-web/webxr-reference

Chris: He's spinning up on WebXR documentation again.
... Anyone who's interested in helping, please reach out.
... Also wanted to talk about Omnitone, which is more robust 3D positional audio (ambisonics)
... Preferred shipping as a JS library to baking it into the browser.

Josh: Can anyone talk about YouTube metadata?

John P: <cackling>

Nick: Since we have a VR complete milestone, do we have an AR complete milestone?

Nell: That's an upcoming topic!

<cwilso> Oh! Forgot to mention as an aside - three.js also has positional audio built in, btw. (I.e., if you use three.js as scene graph, it will handle the camera/pose related positional audio for you.)

<Zakim> NellWaliczek, you wanted to talk about how to track this info for developers

Nell: Also, polyfill should be listed on the chart
... We get asked how to use WebXR, and we don't recommend that most people use it directly.
... If you look at glTF repo main page, shows spec but also has sections for artists, developers, tools, etc.
... Lays out the ecosystem around the standard

Trevor: There's a lot of services that aren't specific to XR but are relevant to immersive web projects.

<NellWaliczek> https://www.khronos.org/gltf/

(Billing, messaging, matchmaking, etc)

Trevor: Does anyone want to speak to those?


Ada: Payments should largely rely on the web payment API
... Would feel better about the UA handling it than someone rolling their own.

Trevor: Big roadblock for a lot of people. "I can't get paid on the web"

Computer Vision

<ada> scribenick: NellWaliczek


Nick: Not asking for specific changes right now.
... Want to raise awareness of requirements for high performance computer vision on the web.
... and consider hardware form factors
... 8th Wall does SLAM tracking at 30fps on today's phones
... demos of minigames tied to specific images in the real world
... wants to open the door for experimentation since we don't know what's gonna be important
... 18 years in the way back machine... every operation on every pixel is costly.. came up with techniques to minimize that.
... today it's different, but this is the foundation for how to think about the problem
... how does 8th wall work...
... generate a low-res image from the high-res image
... extracts stats from the low-res image, then processes that.
... overlays the results of the rendered content over the high-res image
... *shows diagram of the control flow*

might grab high-res section based on info from previous frames

Nick: We do special stuff to improve throughput on older devices
... this includes adding latency in the drawing but are able to keep a high throughput
... summary of tech requirements
... 1. high-res camera texture available where a shader can be run and results extracted
... 2. The shader needs to be adaptive
... 3. Needs to be able to hold multiple frames
... What changes for headset? not much
... where is the camera, where are the eyes, and where are the inputs. need to know the relative locations to map it all together
... no longer need to keep a buffer of textures because we're not drawing our own overlay. Do need to update just before drawing, so it will match up
... questions?
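The first stage of the pipeline Nick describes, deriving a low-resolution image from the high-res camera frame and extracting cheap statistics from it, can be sketched as follows. This is purely illustrative (a CPU-side 2x2 box filter over a grayscale buffer); 8th Wall's actual implementation runs as adaptive GPU shaders:

```javascript
// Rough sketch of "generate a low-res image from the high-res image":
// a 2x2 box-filter downsample over a row-major grayscale pixel buffer.
// Statistics are then computed on the small image instead of touching
// every high-res pixel, which is the core cost-saving idea above.
function downsample2x(pixels, width, height) {
  const w = width >> 1, h = height >> 1;
  const out = new Float32Array(w * h);
  for (let y = 0; y < h; y++) {
    for (let x = 0; x < w; x++) {
      const i = 2 * y * width + 2 * x; // top-left of the 2x2 source block
      out[y * w + x] =
        (pixels[i] + pixels[i + 1] + pixels[i + width] + pixels[i + width + 1]) / 4;
    }
  }
  return out;
}

// Example of an extracted statistic: mean intensity of the low-res image.
function meanIntensity(pixels) {
  let sum = 0;
  for (const p of pixels) sum += p;
  return sum / pixels.length;
}
```

Repeating the downsample builds an image pyramid, and per Nick's diagram, a high-res region of interest can still be grabbed selectively based on stats from previous frames.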

<alexturn> https://docs.microsoft.com/en-us/windows/mixed-reality/locatable-camera-in-directx

<Zakim> alexturn, you wanted to talk about how HoloLens does things today

Alex: on the native layer, hololens explanation. Agree that almost everything is there
... this api annotates the camera frames with the spatial data

<alexturn> https://docs.microsoft.com/en-us/windows/mixed-reality/vuforia-development-overview

Alex: one other link: based on that API, Vuforia was able to do this

Ada: Are you asking for a computer vision worklet that takes in the shader logic and then more logic to parse the image?

<blair> can you repeat, couldn't understand what ada said

<blair> (sorry for being such a PITA)

Nick: In general I prefer to use basic building blocks like a web worker (or in the main thread as now). Building that at the application layer is preferable unless there's a strong technical reason to do it more rigidly
... prefer a way to attach a texture and play with it like that

<blair> ^^^ good one, john

Blair: WebXR had huge performance issues due to the way iOS works.
... WebXRViewer (sorry!)
... Are we wanting to build on WebRTC? How feasible would it be to make those video frames directly available on the GPU? Seems like an either-or scenario... GPU, CPU, or both could be requested
... would be great to skip all the extra copies. how feasible is it?
... if the user has already said you can have camera data, there are no privacy issues in how it's delivered
... Also want to coordinate with the anchors and hit-test repos... there will be overlap in how these are built

Nick: regarding WebRTC or not... we just want to call gl.texImage2D

Blair: ok, but if the browser has the data in texture memory... why should you even need to call this api?

Nick: That's an interesting question. It's possible that this api could already be used to do that efficiently. We haven't seen texture copy be a bottleneck in what we're doing

Blair: only so many milliseconds in a frame... let's not waste any of them

Artem: texImage2D can take up to 11ms... can cause mid-frame flushes

Nick: we timed it..not bad now in latest chrome

Artem: it really matters when you call it

Nick: we don't currently see a problem in our timing info

Josh: What changes would make computer vision easier/faster on the web? If there was a magic API that was the most capable/fastest... what would you ask for?

Nick: 2 answers... 1 is about speed of processing. 2 is about ease of use
... lots of misconceptions about what is possible. *shows architecture slide* this is doable today, it's not easy though
... SIMD in webassembly would make it faster to process vector data more efficiently

Blair: this ties back to what brandon was saying earlier about passing things to workers to make things faster. Anything we can do to smooth the transition to workers would be good. GPU is great, but there are CPU-only algorithms too. Believe there will be educational opportunities that are based on existing building blocks. OTOH, every time I talk to someone who works on mobile, they say there's a unified memory architecture
... this may not be the case for javascript typed array buffers.... and also maybe things like spectre... but anything that makes the transition from CPU to GPU and back seems like it would be good

<blair> bye bye, I gotta run and feed kiddos

Roadmap for AR in WebXR discussion

<atsushi> scribe: alexis_menard

cwilso is talking about how we can organize our work moving forward

<cwilso> 1) Should we use modularization as a way to organize our work? 2) Are we comfortable with unchaining AR & VR in that way? 3) Is first spec smaller than what we've considered, because it's only core module? 4) What exactly would be in "Core" = Spaces+VR+controllers? 5) "AR module" = mode, requestHitTest/RWG, Anchors?

<cwilso> 1) Should we use modularization as a way to organize our work? 

cwilso: inviting people to give suggestions on how we can organize our work....

<cwilso> 2) Are we comfortable with unchaining AR & VR in that way?  

option would be to follow the idea of modules like the CSS WG and work pieces in a independent fashion

<cwilso> 3) Is first spec smaller than what we've considered, because it's only core module?

<cwilso> 4) What exactly would be in "Core" = Spaces+VR+controllers?

<cwilso> 5) "AR module" = mode, requestHitTest/RWG, Anchors?

bajones is asking about how the CSS WG deprecates a module and what that looks like

brandon thinks that not all modules would work out....

cwilso is asking to clarify the deprecation bajones is asking about, basically how late in the process for e.g. CR or shipping

he is giving an example of technology changes too much and the spec doesn't work out for new hardware or technologies...

NellWaliczek is asking if the difference of deprecating a module would be the same as deprecating the entire spec for example

cwilso says that if the spec doesn't make it to REC then it's not really a spec so we can trash it :) :) :) .... room laughs...

we have to be careful about breaking the ecosystem

(e.g. feature detect, polyfill, ...)

to avoid breakage...

A good example is WebAudio, multiple implementation shipped and some of the design was somewhat dumb

a monkey patch library was created to fix and port of the code...

to the newer and better API

it's possible to deprecate a REC

rescinding is the right term for it

cwilso is hopeful that things would get cut off before the REC steps

the only worry part is shipping implementation

bajones thinks that, module or not, shipping feature after feature, the process should be the same....

cwilso notes that that's why we have an incubation process so that things can fail nicely.

NellWaliczek is struggling with the idea of modules and their scope, e.g. does lighting go together with AR....

NellWaliczek is worried about what happens if our breakdown of modules fails...

cwilso thinks that there should be a core module with all the spaces included, for example

if things can be modularized further cwilso thinks that it should be separated into its own module

leonard is asking about what the obligations or requirements are for browsers to implement modules

in addition to core

cwilso is saying that no browser is obligated to ship anything

he gives the example of CSS Level 2, which never fully shipped

cwilso reformulates the question : "if ambient light is a module, is it less likely to be implemented?"

alexis_menard mentioned that CSS modules are not a contract with UAs; they're an implementation detail of the WG to operate faster/better

<Zakim> johnpallett, you wanted to ask for clarification

johnpallett: modules are self-contained collections of features

modules can run on different schedule it seems?

what if 2 modules have a dependency on each other?

how we detect conflicts between modules?

schedule and authors are different

NellWaliczek thinks that we would need to reorganize as a WG

if the modules are very dependent maybe they should be a single module

johnpallett is asking about what happens when a module needs changes to the core module

NellWaliczek: we have to implement a mechanism to review progress and make a process to engage with core members...

NellWaliczek thinks that the module is appealing because of the flexibility.

alexturn what is a minimal spec, what are the core features?

alexturn looked at the input for e.g.

<johnpallett> one point I want to make sure gets captured if we design a module-based process is ensuring that there is implementer feedback ensuring that the requirements of two modules don't conflict in a way that makes it impossible to implement both.

<cwilso> alexis: dependencies are just a process; we'd still have to manage that. The core of the WG still meets in the same way.

NellWaliczek mentioned the explainers are split for e.g.

<cwilso> john pallett: one difference between this and CSS is the diversity of underlying platforms. I think we have more diversity under the hood?

johnpallett gives an example of 2 modules looking good but one is unimplementable on one OS/platform

NellWaliczek mentioned that CSS also has a similar problem; a CSS feature may work well with the layout system in Chrome but not in another vendor's

ada is asking if maybe we need to increase the amount of calls (weekly) to make sure that modules have time to be covered

<trevorfsmith> q10000

straw poll in the room on whether the modules are a good idea

most of the people raised their hand (2/3 think it's a good idea)

<ada> NO strong objections

NellWaliczek is asking if we feel comfortable pulling all the AR stuff out of the core?

such that AR will be absent from the REC

alexturn: except for the immersive-ar token in core, AR could be split into a module

ada thinks that it changes the way we speak to developers: "we completed the core of WebXR, it's stable, and we're making considerable progress on the AR module"

it will make developers more confident in our API

trevorfsmith is saying that it's weird that we'd have an AR module while VR doesn't have its own module and is just the core

trevorfsmith: could we present it as "here is the core, it does a bunch of VR, but there is this AR module and this VR module which do a few new things not yet quite ready for prime time"

NellWaliczek says that we have already some VR specific things that could be in a module and are currently in the spec complete milestone

maybe they could be removed and moved to a module

NellWaliczek is suggesting that, for example, for VR you could have a multiview module and a layers module

and that you can't implement the multiview module unless you have implemented the layers module

NellWaliczek immersive-ar is really the open question...

should we include it or not?

alexturn says that immersive-ar is not really defined in the current spec

NellWaliczek and cwilso agree that it's an issue we need to solve if we want to go to REC

bajones: there is a lot of work related to composition and that's very related to immersive-ar

<Zakim> johnpallett, you wanted to mention proposals #51 which has a list of AR topics worth considering

<cwilso> https://github.com/immersive-web/proposals/issues/51

johnpallett points out that #51 has interesting questions about what's needed for full AR solutions that make the AR experience useful

<johnpallett> (clarification - there may be things in #51 that we'd need to address before AR is useful)

NellWaliczek says that even if immersive-ar is shipping in Level 1 of core WebXR it doesn't matter if it's not super useful

browsers also have the choice to reject the session type...

bajones there is going to be a marketing push of these features and we want to make sure that we have a set of features that are useful otherwise people will say "ar on the web is useless"

<cwilso> alexis: now you've confused me. :) Even if the spec includes immersive-ar, and a UA decides to reject it... that seems okay.

<Zakim> JGwinner, you wanted to say It could have benefits as well, as supporting immersive VR for say the Hololens is less than useful. (probably obvious)

<Zakim> kearwood, you wanted to ask the theoretical question about what it would look like if a new mode was added. (Not AR or VR.. Perhaps passthrough mixed reality?)

NellWaliczek if we have an AR composition module this would be the right place to define the various environment and blend modes (I guess I captured that correctly)

kearwood is asking what it would look like if we add another presentation mode, given that the ar/vr modes are part of the core module...

bajones thinks that we could introduce a new mode in a module

Manishearth thinks that often you would have to modify the core spec

NellWaliczek says if we need to, we can just add a new level to the spec

Manishearth says that we should keep the presentation modes generic enough, and their references in the spec, so that it doesn't matter if we add new ones
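One way to keep modes generic, as suggested above, is for content to branch on how the platform composites rather than on "is this AR or VR?". A sketch assuming blend-mode names like WebXR's environmentBlendMode values; the policy object itself is an illustrative assumption, not spec API:

```javascript
// Hypothetical sketch: derive rendering behavior from the composition
// blend mode instead of hard-coding AR-vs-VR branches, so a future mode
// (e.g. passthrough mixed reality) fails loudly rather than silently.
function compositionPolicy(blendMode) {
  switch (blendMode) {
    case 'opaque':      // classic VR: page pixels fully replace the view
      return { clearAlpha: 1.0, renderBackground: true };
    case 'additive':    // see-through AR (e.g. HoloLens): black reads as transparent
      return { clearAlpha: 1.0, renderBackground: false };
    case 'alpha-blend': // camera-passthrough AR: alpha controls transparency
      return { clearAlpha: 0.0, renderBackground: false };
    default:            // an unanticipated mode: don't guess
      throw new Error(`unhandled blend mode: ${blendMode}`);
  }
}
```

Adding a new presentation mode then only requires defining its blend behavior, which is the kind of extensibility the proposed AR composition module could own.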

NellWaliczek do we believe we can ship VR without AR

<cwilso> Nell: there are two orthogonal questions - a) can we ship vr without ar? b) how do we break up the rest of the features?

and #2 how do we break up AR so we can ship it faster....

<cwilso> nell: on a), "yes, as long as we implement privacy first"

NellWaliczek thinks we should address the privacy and it would help...

we should discuss these options on the call

in 2 weeks after we talked to our various internal organizations

NellWaliczek: can we do a straw poll today, then draft a bit on how we could split AR, to have discussions inside respective companies

ada do we need to recharter?

cwilso not convinced

cwilso: we need to recharter eventually though, so maybe we should do it. But we don't really need to recharter if we don't ship anything new....

Manishearth was input part of the deliverables?

cwilso says that we are not trying to ship something that was not in the charter; we're actually thinking of shipping less than what was in the charter...

cwilso: WGs frequently miss what they say they will deliver in their charter, this is normal

johnpallett thinks a proposal for how AR could be organized would be great, and having an idea of the process would also help

cwilso: :)


<cwilso> State +1 if you agree we can ship vr without ar; -1 if you disagree; 0 if you are ambivalent.

(provided we adopt modules to move AR forward, not a massive L2 WebXR spec)

<cwilso> IMPORTANT CAVEAT: this is PRESUMING that we are breaking into modules, and not waiting on massive level 2 spec in order to address AR

johnpallett is curious about how we get a new level out, is it fast/slow?

<cwilso> IFF we are breaking into modules, THEN state +1 if you agree we can ship VR without AR; -1 if you disagree; 0 to abstain.


<Manishearth> +1

<samdrazin> +1

<NellWaliczek> +1

<artem> +1000000

<ada> +1

<cwilso> +1

<joshmarinacci> +1

<kearwood> +1

<bajones> +1!

<trevorfsmith> +1 IFF we use modules for AR and they're worked on in parallel as the core spec is approaching CR.

<johnpallett> +1

<alexturn> +1

<JGwinner> 0

<cwilso> DECIDED: IFF we use modules for AR features, we could ship Core+VR first as a separate spec.

<cwilso> Action item on the LT to make a proposal on what modularization looks like.

<ada> scribe: Manishearth

Trusted UI

NellWaliczek: When we were discussing the privacy doc yesterday, one of the key assumptions we made was that there was no interest in (or possibility of) a cross-UA/hardware mechanism for trusted UI
... if this were available it would have changed our conclusions.
... As there were many people invested in getting into this, I wanted us to discuss this separately here
... Alex, I was hoping you could elaborate more on the inception thing from yesterday?

alexturn: as part of the discussion on "what trusted actions exist" (e.g. you can hit the home button , etc)
... we were taking for granted you could reliably *leave* the experience without spoofing
... it depends what kind of experience it is
... so on HL2 you go to home by touching the logo. but what if you miss or something?
... how do you know you've woken up from the dream
... it's hard to spoof a home screen (based on icons/background)
... but what about spoofing a browser? it may be quite simple
... the ability to spoof on the way out is also a problem. this doesn't mean we should write off all spoofing
... does this yet further increase the requirement that we must solve this

<Zakim> johnpallett, you wanted to point at https://github.com/immersive-web/webxr/issues/424#issuecomment-492857627 for background on concerns about cross-platform trusted interfaces and

johnpallett: As framing for the concern in the privacy doc
... the concern was not "can one UA on one platform solve this problem?"
... the concern is whether variations in approaches to this problem would make it difficult to get predictable behavior
... where on some platforms they have smooth flow but on others, where they haven't tested it, they get something unexpected
... the other comment i wanted to make was that this affects things like x-origin navigation
... consent is one aspect, but also things like presenting the origin (how do i know i'm on bankofamerica.com?)

NellWaliczek: just wanted to point out I know we had a requirement in WebVR that there MUST be a way to exit the experience; we're vague about it
... thing I'm hoping to get out of this convo is "what a trusted UI should be consistent in" so there *can* be variations but we can still provide predictability

bajones: there's text in the WebXR spec for this, but it says should not must and needs to be expanded

Nick-8thWall: a q that came up several times yesterday that wasn't addressed:
... what are the specific examples of the threat vectors associated with presenting a fake permissions UI

NellWaliczek: mostly to do with the idea that it convinces people that such dialogs can be presented in immersive
... i.e. a user may feel okay with a password dialog if they see a perms dialog once

johnpallett: there are a bunch of these, we should probably assume the threat vectors exist instead of going into this
... another q: how does the UA tell the user they need to press the trusted button (or whatever)

NellWaliczek: we should also look at the worst case scenario
... e.g. if the permissions dialog says "press the home button", and if it's spoofed and you press the home button, you will end up in the home window
... i.e. set up the failure mode of the spoofed prompt to be different

joshmarinacci: i think we should allow UAs to be different as a monoculture is easier to attack

<bajones> +q

joshmarinacci: can we specify that there MUST be some form of trusted thing but provide no guidance on implementation
... i.e. we specify the level of security but not solve the problem

NellWaliczek: there's a continuum here, we can say *handwave handwave* trusted ui, or we can try to specify common elements for consistency
... e.g. if we require a trusted button we can specify a worst case fallback for the threat vector

bajones: the outcome of this discussion should NOT be to design such a UI
... we should be able to establish, on all of the environments we're aware of, whether there is a feasible way to provide such a trusted UI
... and if we can't we're in trouble and we need to take more aggressive steps to work around that

joshmarinacci: i guess the subtext of this question is "how much must we solve this before shipping?"
... i don't want to say we're not shipping till we climb this hill when we don't know how big the hill is

NellWaliczek: we should just brainstorm first, and then later figure out "practically speaking, we should do xyz"

NellWaliczek: find out if there is a place to go first

<Zakim> kearwood, you wanted to say that permissions prompts must also identify the origin in a secure way simultaneously.

kearwood: an example of what you *can* define as a requirement without dictating the impl or appearance
... when the user is making a choice, they must be simultaneously able to decide which origin they're making the choice for
... e.g. if a UA lets you hold down home to see what your origin is

<JGwinner> +1

kearwood: and they have a separate UI for permissions
... you need to be careful about the origin changing underneath the user.
... so you can require that it's clear which origin the permission is for

NellWaliczek: if one of the concerns is that people will get confused between perms and password prompts, perhaps we can require that perms look different and see if browser vendors are okay with that

alexturn: just wanted to provide a concrete example >>holds laptop up<<
... on chrome when you get a mic perms question, the little lock changes color a bit
... but if an HTTP Basic authentication prompt shows up it's just a modal that sliiiightly overlaps with the url bar

(various people party as we realize the web is broken anyway)

alexturn: if we can derive the principles for normal vs fullscreen we can also apply them here
... maybe they were just trying to be conservative, or maybe they had a strong reason

bajones: going with nell's comment on brainstorming ; wanted to comment: all the XR platforms we work with have some way to go to the home screen, out of necessity
... (also asks about HL2)

alexturn: HL2 has this thingy on your wrist that you can tap, which will be picked up by the system even if it gets drawn over

bajones: either way it's pretty bad if you as a platform don't have a way to do this
... we're in a pretty weird place as a platform building on a platform
... we can imagine for most platform how a trusted UI would work, but what edge cases exist?
... i know that Cardboard is one edge case, and the AR scenario
... Cardboard is also the "barely a VR device" system we have. are there even any advanced perms that would apply here? "do you want to give me camera access"? "no, i'm cardboard"
... idk how this translates to phone AR
... the actual flow starts to look like fullscreen, but there *are* reliable platform gestures that you can use to back out
... and if there are not that is not our problem, that's android/ios's problem

<Zakim> johnpallett, you wanted to address 'button' vs. donning as different approaches that have different UX implications. However ideas like 'totem' are more portable.

johnpallett: talking about whether or not there is a way to exit the experience is a bit of a red herring
... since there has to be one. except maybe CAVE
... q is not if there is a way to exit, q is if there's a way to provide an experience which is smooth/consistent on all platforms
... there are various approaches of consistency like a reserved UI slot, etc

NellWaliczek: i think you've misunderstood: i'm trying to ask, is it possible to consider if every system has a way to exit, can we build something based on that way to exit that gives you a trusted UI
... e.g. when you go to purchase something on your iphone you have to doubletap home to trigger faceid
... if it was spoofed that would drop you on the home screen
... my point was that there may be a path towards defining a common element that we can build this UI on

johnpallett: more trying to say: let's explore the divergence of platforms

JGwinner: agree with joshmarinacci that we shouldn't specify a particular method
... maybe we can ask UA implementors to share their approaches as an IP-free thing
... another thing: we shouldn't discount audio

bajones: if we use sigils we should make sure that the sigils can't be captured any way via APIs

<joshmarinacci> +q

JGwinner: also we should handle cases where something is in kiosk mode and wants to disallow permissions

NellWaliczek: it's fine if browsers have a flag that lets you disable things from the spec. not technically conformant but this is quite normal

<cwilso> scribe: kearwood

<cwilso> Manis: I was just reminded of Ctrl-Alt-Del.

Manish: I was reminded of Windows XP and Ctrl+Alt+Del. When you log in, it asks you to press it. Applications can ask you to hit a key combination that is unmaskable. It's just another example, a well-known one. One issue with this is that even if the system has this, the browser may not be able to piggyback on it.

Firefox reality may not be able to intercept the oculus home button for example


We shouldn't be prescriptive if it's an operating system's reserved button.

Multiple: there is a back button in browsers.


There is an app button that is not exposed by the Gamepad API

In a hololens or magic leap like environment, popping out the browser stack may be equivalent to going to the home screen.

Can function as equivalent gestures

Even if the application can't intercept the home gesture, there should be another thing you can reserve.

<scribe> scribe: Manishearth

joshmarinacci: is there a way we can write this such that the UA can pick any one of these solutions

bajones: Yes

<atsushi> scribe: Manishearth

NellWaliczek: four categories:
... 1) is trusted UI possible/required?
... 2) when can we obtain consent (creation only? during session?)
... 3) how granular do we want permission requests to be in the spec?
... 4) once permission has been granted, what are the implications around what needs to happen around that data (e.g. quantization, categorization of user-configured data vs fallback)

which we need to separate out

this way we can treat the consent stuff as separate
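As a concrete illustration of the granularity question (category 3), the WebXR Device API drafts already shape session creation around a required/optional feature split. The sketch below is illustrative only: the `'plane-detection'` feature name and the `resolveFeatures` helper are hypothetical, not settled spec surface; only the `requiredFeatures`/`optionalFeatures` shape of `XRSessionInit` comes from the drafts.

```javascript
// Hypothetical sketch, not spec text. Mirrors the XRSessionInit
// requiredFeatures/optionalFeatures shape from the WebXR Device API drafts;
// 'plane-detection' and resolveFeatures() are illustrative names.
const sessionInit = {
  requiredFeatures: ['local-floor'],     // denied => session creation fails
  optionalFeatures: ['plane-detection'], // denied => session continues without it
};

// UA-side sketch: given the set of features the user consented to,
// decide whether the session can start and which features it gets.
function resolveFeatures(init, granted) {
  const missingRequired = init.requiredFeatures.filter(f => !granted.has(f));
  if (missingRequired.length > 0) {
    return { ok: false, missingRequired };
  }
  return {
    ok: true,
    enabled: [
      ...init.requiredFeatures,
      ...init.optionalFeatures.filter(f => granted.has(f)),
    ],
  };
}
```

The point of the split is that consent can be partial: denying an optional feature degrades the session rather than blocking it, which keeps the consent prompt granular without making every denial fatal.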

NellWaliczek: action item to figure out who's responsible for each of these


Wrap up

Summary of Action Items

Summary of Resolutions

[End of minutes]

Minutes manually created (not a transcript), formatted by David Booth's scribe.perl version 1.154 (CVS log)
$Date: 2019/06/06 00:02:35 $

Agenda: https://github.com/immersive-web/administrivia/blob/master/F2F-Jun-2019/schedule.md