Immersive Web WG/CG TPAC Day 1 – 11 September 2023

Meeting minutes

Administrivia: Review big wins since last TPAC, Accessibility task force, and moving standards forward [WebXR 1333]

ada: Real World Meshing and Real World Geometry, maybe we can move from the CG to the WG

ada: WebXR WebGPU Binding? We'll have to check into that one

ada: Some of the other CG topics we have - I'm not sure, I'll do a quick list and you can q+ .... capture, computer vision, detached elements, front-facing camera, geo-alignment, marker tracking,

ada: the hololens... I'm not sure if they are supporting WebXR anymore - that was one blocker in the past, what they supported. It broke and we're not sure what the plans are with it.

ada: for now let's keep that in the CG. <model> is still active. Navigation, occlusion, raw camera access... is that out from behind a flag in Chrome?

bialpio: it is enabled by default

ada: do you think we should move it into the WG?

bialpio: i have no problems with it if we meet the criteria

bialpio: it is pretty unique in that it is mainly an OS issue for mobile, and with this flavor of the API that pretty much means some kind of Chromium currently

ada: let's keep it in the CG for now

ada: In the working group, we can look at some stuff and move it to CR, or, for things that are at CR, see what can move to REC

ada: Anchors is on Oculus Quest and Android - that would be good to move to CR

ada: Hit test is on both of those and Wolvic, we could move them

ada: is it in Wolvic?

I don't know. Is it, jfernandez?

jfernandez: no, I don't think so

ada: WebXR - there is still a little bit of work, it could be the first spec fully finished by this WG...

ada: the WebXR AR module and gamepad module seem like they are very stable, maybe we can move those to REC

ada: it would be good to move WebXR Hand Input

bialpio: We did have a couple of additions to specs that I know, for example, chrome doesn't implement. Persistence I think, hit testing with semantic labels

cabanier: it's optional

bialpio: There might be some parts of the spec that have been stable for a while but they aren't in all of the browsers yet - do we split it into modules or what?

ada: Our next charter isn't until next year - I don't mind if we add it to that or... 1) split it into levels, 2) try to get all the implementations to support them by July, or 3) put it in our charter as additional work

bialpio: as long as it is not a blocker

ada: It will go out to more people to ask for objections and so on... This group just needs to think it is ready for that

cabanier: at some point we talked about becoming living specs?

ada: does someone know the process for that?

ada: I will take an action item to look into how we do that

yonet: I will look into that

ada: it's probably not applicable to all of the specs, but certainly some of them.

ada: charles had an a11y task force issue - I thought it would be good to have a repo with all of them.. Right now we just have a label. It might get more good people involved if it wasn't just a thing that happens with labels in the background

cabanier: Did you mention the depth sensing module?

ada: it's in the WG already, but it is chrome only still so not a great one to move to CR

ada: Anyone can agenda any items they want

ada: 45 minutes on the <model> tag.. I think we have an issue

immersive-web/model-element#69

ada: introductions by dino

dino: (presents slides)... I'm Dean, there's a few other folks...

dino: we announced this product recently, but we've been working on it for years... WebXR is supported at launch, it's not quite ready... Our model for controllers is "you don't have any"... so we have some things

dino: we don't want WebXR to be the only way you work with the Web. It needs to follow the normal web principles: safe by default, interoperable, privacy

dino: we want to say "it's not a different web" - it lets us extend the existing features, and add some new features

dino: it's really important to get high quality, realistic rendering...

dino: that's why we suggested some things need to be more in control of the OS or the browser itself

dino: we think you shouldn't have to be a coder or a 3d artist - that they should be able to make minor enhancements on existing pages

dino: specifically, gaze and eye tracking - we cannot provide every web page this information. That would be horrible

dino: (shows some videos of visionOS) This is a flat page on visionOS, and you can see as we move around it is just like a floating window in space you can walk around... we can show that the illusion of 3d on the page is broken, and lots of things just want a model - you could do this with a webgl canvas, but that requires environment lighting and head tracking, etc

dino: you can see the <model> element - and it looks correctly 3d and you can see if we walk around it is kind of an illusion and you're looking into a portal... This is why we gave some feedback about why we didn't _just_ want to do webgl

dino: there are just lots of limitations too -- retained mode vs immediate mode... retained mode has been tried a few times before, and all of them have failed -- so why should it work now? We think there are just a lot more APIs out there that all sort of share a common subset of what they agree upon as a sort of model

dino: the next thing though is consistent interoperable rendering... real time 3d renderers are just different than many other things. But in the last few years the industry beyond W3C has been working on this and we think it makes it possible

<Leonard> https://materialx.org/

dino: MaterialX is something from ILM that is declarative

dino: it describes the glTF rendering model - you see you have a fresnel effect - it's a very very powerful shading system built to be used in feature films as well as real time rendering

dino: it's also sort of procedural - you can see introducing some noise...

<dontcallmeDOM_> [slide 4]

dino: next: What about file formats? Strangely, when we first made this proposal this is what everyone seemed to care about - I suggested we should handle them the way the HTML spec does.

dino: Apple and a few others in the industry recently started a standards body around USD...

<Leonard> Alliance for Open USD: https://aousd.org/

<Leonard> I do, but will save them for later

bradley: You say we can't give you head tracking or eye tracking - could we put noise into the signal?

dino: we tried this, the answer is "no".

dino: you need both to be very accurate and we run into vestibular issues. Things that are 'slightly off' are strangely way worse than things that are way off

cabanier: you showed MaterialX -- are you proposing that as part of the model standard or...

dino: to be clear, I am just trying to show there are other standards bodies working on real time physically based models -- this is something we could adopt or build on.

cwilso: I have to disagree that the format doesn't matter. You would never try to build a browser today and not support jpeg

cwilso: and that is a really simplistic case. I'm not sure that this is going to happen. If we are aiming for interoperability great - but I'm not sure that happens in a world where we don't support the same formats or there isn't at least a common format

dino: It definitely does matter, the html spec doesn't _say_ that it requires jpeg, but things work themselves out

cwilso: because of licensing issues at the time

dino: there have been others since, png, for example

dino: there are divides in which browsers support which formats

cwilso: I don't like that

myles___: I don't understand

cwilso: if we wind up in a world where apple supports X and google supports Y, but neither supports something common, I think we have failed

Brandel: It is really just about whether it is the space of the specification to say that -- not that it is not something to discuss. It's about whether it should go into the spec at this time

cabanier: the reason we all objected to USD was because there was no standard for it, but it sounds like this is being worked on and it can be developed in such a way as to run nicely in web browsers... If it can do all of these things, I don't see why we cannot support that

cabanier: maybe gltf comes later?

dino: I think you should consider usd a superset of gltf

cabanier: personally I don't care that there are so many, it's a bit of a pain. It might be a problem later, but we can also tackle them later... If we wanted to break it open... we can do it later. Maybe by then the format will be better fleshed out

bialpio: maybe a question related to open usd... dino you mentioned that it was a "subset of usd" does that mean that the work that the alliance will be doing is speccing and starting from scratch or... how would other implementers interact with that process -- is meta or google a part of that standards body? How will we be able to provide feedback? As a person who hasn't been in standardization so long, I'm not sure. I guess I will have to implement both

dino7: the alliance just started, it has only a few members. I believe that once the charter is in place it would be expanded. I'm not sure. The goal of the alliance is to make a clearer specification for what USD is. It is effectively explaining all of the things a web specification would. USD did not describe them - so doing that makes it more interoperable. Does it make sense?

Leonard: Alliance for Open USD just started so I expect it will take several years to get to a spec. From what I understand it is not starting from scratch but taking existing work which is done mainly by example. There is active work between AOUSD and ? ... It's not anticipated that glTF will be able to make a lossless loop

Leonard: there are features in each format that aren't present in the others

<Leonard> Items of present: AOUSD is just starting. It will take at least a couple of years to get a written specification

cwilso: I want to be transparent about why this is important. I worked at MS when we did <object> -- we convinced people this would be interoperable. It never was. Pragmatically we never got that to work. Object has a great fallback mechanism, but basically no one ever used it because it was so powerful and what do you fall back to?

<Leonard> ... Active work between USD (AOUSD) and Khronos (glTF) [mostly in Metaverse Standards Forum] to handle the interchange and differences

cwilso: where you need to fit this in is in HTML, so the bar there is in WHATWG

dino7: So what do you think we should do? What is the alternative?

<Leonard> ... It does not seem possible to make a lossless loop between USD and glTF. There are features of one that are not present in the other.

cwilso: start with a format that has wider adoption or stay on the track that you're on and focus on getting wider adoption. I'm not trying to paint this the wrong way -- but is there even another parser structure for USD that has been implemented more than once? How much is specified? How much are you going to chop off for 'web usd'?

<Leonard> ... glTF is actively working with OpenPBR (https://www.aswf.io/blog/academy-software-foundation-announces-openpbr-a-new-subproject-of-materialx/)

<Leonard> ... to use/incorporate their standard.

cwilso: You're not going to use the same renderers for a web page and a cinema movie, are you? This is tough because there are lots of capabilities of products...

<Leonard> ... OpenPBR is a sub-project of MaterialX and Academy Software Foundation (ASWF).

dino: I'm glad you said that - because it sounds like your main concern is about interop...

cwilso: Yeah, it's been a long and painful process for videos

dino: I should have had a slide saying that it's going to be hard and it's going to take a while. Part of the reason I used MaterialX is that it is one of the first ones where people are starting to agree like this...

dino: If I sent a 16k video to a feature phone it's not going to work - it just happens to be that with 3d we hit that more quickly

Leonard: I had a question on the video you showed of the flat screen but the model had 3d characteristics... for at least the near future actually flat screens will be the predominant mode of consuming web pages -- does this benefit them? or

dino: that is what we were trying to do - it has to be visible on a flat screen because that is almost all screens at least currently. It needs to be designed for the whole range, just like anything for the web should be. You might not get the same beautiful rendering... the same way you do with stereoscopic images maybe

ada: your comment about the hard work being done in the interop stage - would that be here or aousd?

dino: I think cwilso's comments about some of this belonging in HTML apply too. I guess that's the question I have for everyone, where should it happen. This seems like the most appropriate place in terms of interest

ada: the interested, implementer type of people are probably in this room, yeah

<Leonard> Note: Credits were missing from the 'Damaged Helmet' model. Details listed below.

<Leonard> Taken from https://github.com/KhronosGroup/glTF-Sample-Assets/blob/main/Models/DamagedHelmet/README.md

alcooper: I think there are things common to both gltf and usd that would be criticisms of both.

<Leonard> License & credits:

cabanier: I am concerned about the complexity - the place to worry about that though is probably in the formats. We tried to be clear about how complex it could be

ada: do you think the formats themselves should include information about how to handle the lower level of detail use cases?

Leonard: Is this similar to the idea we talked about several years ago when we talked about creating a favicon with lower level of detail?

cabanier: It wasn't a standard, but it was kind of similar

<ada> break over, we are starting again

ada: We can crack on with other topics from the discussion.

alcooper: I was reading over your topics, and proposed topics, and can we discuss why we're building this stuff? It might help inform discussions later

ada: I'm open to doing that

cabanier: Is this about specific topics?

ada: who raised this issue?

immersive-web/model-element#70

Leonard: There have been a lot of discussions about <model> since it was proposed 2 years ago. People have talked about why it needs to be done. I have never seen written descriptions about what we're trying to do that differs from existing things.

Leonard: In the video, that was the best I've seen about why <model> rather than model viewer. Also, privacy detections. It would be good to see this written down in the issues. I've spent a long time interacting with the issues. Why I wrote this issue is to understand why this is being proposed and its history, why existing systems are not being used, and why other systems are not sufficient

Leonard: Let's get these things discussed instead of complaining about particular cases

ada: Should we discuss this now? Or is this ongoing work?

Leonard: Without thought, it would be difficult to produce what I'm looking for during the next hour. This is important to do for further discussion.

Leonard: It would help if a number of people can go out and address this and provide specifics as to why something is or should not work, so the rest of us can address use cases

dino: I think Apple can take this action. Now that we can publicly talk about the justification. We'll handle it.

dino: I'm not joking. We will do it.

Leonard: That's great, thank you

cabanier: Dean's presentation covered a lot of what Leonard is asking for. Updating the explainer to include the info will be good

ada: Can you share the presentation?

dino: I can share a modified version of it

dino: Keynote files are OK, right? Wait, nevermind, let's not argue about formats

alcooper: I'd like to see alternatives considered fleshed out more.

dino: Sure thing.

cwilso: Off the cuff, Leonard, the original explainer for <model> probably explains that question better than anything else that's pointed to. It does talk about that a little bit. To be clear, I don't see model-viewer as a competitor to <model>. <model> is trying to satisfy use cases that are more integral to the language. model-viewer is "can we drop a model into a page and make it interoperable in many different implementations, at around ~2015". There are some lessons we collected from that.

dino: I agree, and I felt bad using model-viewer page to show that. I only did that because there were so many people saying "we shouldn't do <model> and do model-viewer instead" but we can learn from model-element. <model> doesn't replace model-viewer.

dino: The explainer did go into that. There's a random thread by weinig on Twitter (X?) that explained the rationale behind <model>. He was getting a lot of feedback.

ada: Please preserve that. Twitter might not be around forever

dino: I'll try to figure it out

alcooper: Something that was mentioned is that he's had a hard time trying to keep that API surface down. One concern for <model> is to not have every browser re-implement a game engine. It's something to keep in mind.

<ntim> https://twitter.com/samweinig/status/1445464463067398154

<Leonard> Thanks for addressing it

<dino> ^^ that was Sam's twitter thread

<dino> but we'll make an update to the explainer to include more justification

Specifying an image-based light

Brandel: In our exploration and experimentation with <model> it's become clear that even on environments where they've provided <missed> the model as it currently exists is within the context of the page. There's a strong indication that the lighting is the lighting from the page rather than the world. It's important to be able to control lighting. Marketing organizations in particular want to control this.

Brandel: In that context it would be important for us to be able to specify an environment map or image-based light. It seems pretty important that that format be an HDR format. To my knowledge, nobody has HDR at this point. There was a breakout session about HDR. I just wanted to raise it in here at this time.

Leonard: I agree with that. Is this the right working group? Or is that something that should solely be done within a 3D context?

<Leonard> Agree with need for HDR. Where is the best forum to handle it. Here if it is 3D only; other WG for general browser support

Brandel: I don't want to exclude raising it elsewhere. Are there discussions that we need to have? I thought we could pre-populate that with views about the case for it, and some of the attributes/properties of an HDR format might not be familiar with folks

bajones: hi

<Leonard> ... If it is general browser, then need to make sure it is not just 8-bit colors. There is precedent for 3D-only: KTX

<Leonard> ... does not display in a browser. It requires a GPU to decompress and display, so it is used on 3D models, and not web pages

bajones: HDR is something that has had a lot of ongoing discussions in WebGPU, and video and 2d canvas and CSS and such. I've sat through many of their presentations and I understand a little better now, but not enough to explain it to a group like this. But it does sound like there is at least some headway being made there in terms of having different image sources being able to output to the browser that can be displayed on a variety of displays and has all the attributes that people are looking for.

bajones: Please don't try to re-invent any wheels here. It's a dense topic. From a straight WebXR point of view, I think we will generally be able to rely on the output of these groups to feed into any comparative content that we create. For <model>, I think that you should generally be able to just provide some of these attributes that would otherwise go to a video or other images, and provide it as part of a model or part of the embedding tag, and piggyback on what they're doing there. I wish I was in a better position to have more details on what format that is

bajones: I think we should lean into what they're doing and provide a consistent surface for HDR across the web.

Brandel: Cool

<Leonard> +1 to Brandon's comment

Brandel: One difference about our use of HDR is we have no expectation of displaying the whole gamut range of the file. We use it for an estimation. We need to encode wildly different intensities than anyone else, because it's a precursor step to the ultimate display. But it's great to have that conversation.

bajones: Are you talking about the materials, or lighting?

Brandel: Lighting.

bajones: The materials would make the <missed> as well.

bajones: My understanding is many times when you're dealing with HDR content, the materials don't actually contain much HDR information. There's not a whole lot of value going into a particular material and saying "this is 300% red." But oftentimes, the HDRness is just being able to capture the full range of illumination that's being applied to those materials.

Brandel: <nods>

bajones: I don't think we need special HDR materials. You can still embed 10/10/10/2 or something like that. But HDR isn't necessary for materials.
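
For readers unfamiliar with the 10/10/10/2 layout mentioned here, the following is a minimal Python sketch of how three 10-bit color channels and a 2-bit alpha fit into one 32-bit word. The function names and the channel order (red in the lowest bits) are assumptions chosen for illustration and were not discussed in the meeting. Note that inputs are clamped to [0, 1], which matches the point that a material storing "300% red" gains nothing in such a format.

```python
from typing import Tuple


def pack_rgb10a2(r: float, g: float, b: float, a: float) -> int:
    """Pack normalized channels into a 32-bit 10/10/10/2 word.

    Assumed layout (illustrative): R in bits 0-9, G in 10-19,
    B in 20-29, A in 30-31.
    """
    def quantize(value: float, bits: int) -> int:
        max_code = (1 << bits) - 1
        clamped = min(max(value, 0.0), 1.0)  # values above 1.0 are lost
        return round(clamped * max_code)

    return (quantize(r, 10)
            | (quantize(g, 10) << 10)
            | (quantize(b, 10) << 20)
            | (quantize(a, 2) << 30))


def unpack_rgb10a2(word: int) -> Tuple[float, float, float, float]:
    """Inverse of pack_rgb10a2, returning normalized channels."""
    return ((word & 0x3FF) / 1023,
            ((word >> 10) & 0x3FF) / 1023,
            ((word >> 20) & 0x3FF) / 1023,
            ((word >> 30) & 0x3) / 3)
```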

cabanier: Why does the browser need to specify this global lighting image? Can it not be part of the USD file itself?

cabanier: If we're going to do this, HDR becomes a problem. Now we have to define how it is on a browser that supports HDR and one that doesn't.

cabanier: I think this might spiral out of control and make implementation, how to define it, much harder.

cabanier: Can it just be part of the format?

dino: Are you talking about IBLs (image-based lights)? You could definitely embed image-based lighting into a file format. It is definitely possible in USD. You might want different IBLs for the same model in different circumstances.

cabanier: Could you post different models?

dino: Yes, but they would be different files.

dino: You'll want it external to the file.

dino: You want to say "how does this green look when in a restaurant showing off to my friends, vs when I'm outdoors, vs when I'm in my living room"

cabanier: It's too much effort to have different models for different conditions?

ada: if it's a big file, it's duplicated.

dino: Sometimes you _do_ want to embed it in the file.

<yonet_> s/idl/image-based-lighting

cabanier: Is this such a big use case that it's worth accounting for?

dino: The case is: The 3d models we show on apple.com, they have lighting that's specifically picked by the designers, specifically for product showcase. If you took that same model, and viewed it in AR, you want the realistic lighting of the environment.

dino: You want to use the same file in different places

cabanier: You want it to look OK in a place where there is no lighting?

Brandel: Even if we were to package an IBL on a particular model, it would make sense in AR Quick Look use on a phone to use the estimated lighting that comes from the system. In a webpage, it's in a portal. It's not immediately clear that it exists in the world. The presentation context has more opinion about what kind of lighting and color it should take on

Brandel: there are 2 different things that can be done with the model, but people can probably use a custom image based lighting for an environment map in an AR view, it doesn't necessarily make sense to carry it along.

Brandel: For HDR, it's important to have an HDR image based light, simply because the sun is very bright. The illumination component that comes from the sun, it's way brighter than whatever device you're on.
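
The point about the sun can be illustrated numerically. The sketch below averages the radiance of a toy environment map twice: once with the original high-dynamic-range values and once with every texel clamped to 1.0, as an 8-bit image would effectively do. The texel values (a sky at 0.5 and a single sun texel at 5000.0) and the function name are hypothetical, picked only to show the effect; a real image-based light would also weight texels by solid angle and surface orientation.

```python
def mean_radiance(texels, clamp_to_sdr=False):
    """Average radiance over equally weighted environment-map texels.

    With clamp_to_sdr=True, each texel is capped at 1.0 first,
    mimicking a low-dynamic-range image.
    """
    total = 0.0
    for value in texels:
        total += min(value, 1.0) if clamp_to_sdr else value
    return total / len(texels)


# Hypothetical environment: 999 sky texels plus one very bright sun texel.
environment = [0.5] * 999 + [5000.0]

hdr_estimate = mean_radiance(environment)
sdr_estimate = mean_radiance(environment, clamp_to_sdr=True)

# The clamped map reports roughly a tenth of the light the scene contains,
# because nearly all of the energy came from the single sun texel.
print(hdr_estimate, sdr_estimate)
```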

Leonard: You might not get bright light from a halogen or LED. You only want to have a single model, because if there are changes to the model you have to propagate them out. In the glTF context, you don't embed lighting in the model file. The player can choose it as necessary. What you might want to use in a room differs from what you want when you take that same model and walk outside with it.

Leonard: If the browser can display an HDR image is separate. There is work in other working groups, even out of the W3C, to make it work. We shouldn't go there until they are done.

bajones: I am here. I tried to take myself off the queue. You want different lighting on the page vs an immersive environment.

ada: break for coffee. Come back in 15 minutes

Camera controls, being able to move the pivot point rather than just tumbling around the world origin.

Brandel: currently, the proposal has a camera control that consists of pitch/yaw/scale
… this is because it's generally understood that people don't like roll
… but also having full 360 controls is not appropriate for stereo displays
… but we should talk about other platforms
… the intent of the webkit implementation might need further constraints
… what is the bare minimum for model
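
A pitch/yaw/scale control with no roll can be sketched as below. This is an illustrative Python sketch, not the WebKit implementation: the function name, the axis conventions (yaw about Y applied first, then pitch about X) and the clamping of pitch to ±90 degrees are all assumptions made for the example.

```python
import math


def apply_pitch_yaw_scale(point, pitch_deg, yaw_deg, scale):
    """Rotate a point by yaw (about Y), then pitch (about X), then scale.

    There is deliberately no roll term. Pitch is clamped to [-90, 90]
    degrees (an assumption) so the model cannot be flipped upside down.
    """
    pitch = math.radians(max(-90.0, min(90.0, pitch_deg)))
    yaw = math.radians(yaw_deg)
    x, y, z = point

    # Yaw: rotation about the vertical (Y) axis.
    x1 = x * math.cos(yaw) + z * math.sin(yaw)
    z1 = -x * math.sin(yaw) + z * math.cos(yaw)

    # Pitch: rotation about the horizontal (X) axis.
    y2 = y * math.cos(pitch) - z1 * math.sin(pitch)
    z2 = y * math.sin(pitch) + z1 * math.cos(pitch)

    # Uniform scale stands in for zooming toward or away from the model.
    return (scale * x1, scale * y2, scale * z2)
```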

bajones: the pitch/yaw/scale is for orbit camera controls?
… yes, that is appropriate
… for targeting, we need to make sure that the element itself is able to predict the center
… because very often there is a mismatch
… and you never find it
… the element needs to identify the center of the object
… and the external bounds and center itself
… otherwise people will have a bad time
… it might be good to have an override
… there's an obvious thing that people want but there's a niche thing that others want
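
One way an element could derive a default orbit pivot, with the override mentioned above, is from the model's axis-aligned bounding box. The Python sketch below is illustrative only: the function name, the use of the bounding-box midpoint as the center, and the `override_center` parameter are assumptions and not part of any proposal.

```python
import math


def pivot_and_radius(vertices, override_center=None):
    """Return (center, radius) for orbiting a set of 3D vertices.

    The center defaults to the midpoint of the axis-aligned bounding box;
    the author may pass override_center instead. The radius is the largest
    distance from the center to any vertex, i.e. the external bounds.
    """
    xs, ys, zs = zip(*vertices)
    if override_center is not None:
        center = tuple(override_center)
    else:
        center = ((min(xs) + max(xs)) / 2,
                  (min(ys) + max(ys)) / 2,
                  (min(zs) + max(zs)) / 2)
    radius = max(math.dist(center, v) for v in vertices)
    return center, radius


# A model authored far from the world origin: orbiting around (0, 0, 0)
# would swing it out of view, while the computed pivot keeps it centered.
model = [(10.0, 0.0, 0.0), (12.0, 2.0, 4.0)]
center, radius = pivot_and_radius(model)
```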

Leonard: does this discussion limit the depth of field?
… like the focal distance

Brandel: the goal is to define the point of interaction
… there's no notion of depth of field
… there's no dref or focal point support on our devices
… that would be a separate aspect
… (???)

bajones: for depth of field, we don't want to apply it automatically
… it's something that the embedder will want to set automatically
… we don't want to lock the ability
… it would be a separate control
… and it would be like setting a focal plane
… this is an artistic choice so should be left to the person embedding or creating the model

<Leonard> ATM glTF does not have a facility for depth of field

bajones: it's not something that this would interfere with. It's a separate control

bialpio: how will we expose the camera control?
… do we think it's a problem that the site has to account for both?
… is it a hybrid model where we allow turning even if it's turned off?

Brandel: the current representation has controls inline on the page
… so there's a capability to rotate the model
… (??) our users like to mess with the camera
… it is possible to have camera controls on all views
… I'm interested in how people deal with those transforms

Leonard: are the camera controls on the camera or do they rotate the object?

Brandel: it's a pitch/yaw/scale on the object itself

Leonard: so it's like walking away from it

Brandel: yes, by user interacting or walking

Leonard: (???)

Brandel: yes, it's the same

<Zakim> bajones, you wanted to actually say something this time!

bajones: it feels like the camera controls are needed. depth of field would be hard in stereoscopic
… how do you determine how deep into the page the object is?
… because that might change the way you interact with it
… is it a magic window or protruding from the page?
… are there controls for that?

Brandel: currently it is inset into the page. It's reasonable to alter that but I don't have an answer for that
… it might be reasonable to specify a pivot point

bajones: I'm making an assumption that if the object protrudes from the page, it will be clipped
… it won't satisfy everyone
… (???)
… we need to work on a way to not have things pushed out of the page

Extra Camera controls? 2D browsers might want to control FOV etc

<Leonard> I disagree with Brandon, but not enough to bring it up verbally. I think you allow objects to be closer than near-clipping plane of the page...

<Leonard> ... If you have a wall-mounted sculpture, it is likely that you want the page to be wall-aligned with the sculpture coming out of the wall (aka page)

Brandel: it's hard to see what the common denominator is

bajones: I think it's one of the areas to see what the mvp is
… but I'm unsure if we need to dive into this for the first version
… maybe we can start by saying what you see in 3D, is on the page but stereoscopic
… then allow developer feedback to go from there
… the potential for feature creep is large
… my general leaning is to focus on the minimum that is still useful
… we can add a lot of capabilities that nobody is going to use
… it's a slow process but that's ok since that is how the web works

Can they have background images? BG color yes. Can the portal be transparent

Brandel: it would be nice for the portal to have the same color
… I'm unsure if the background image should be part of the spec
… it's not as good to control with an image background
… having a transparent background makes it hard because the depth
… if anyone has examples

Leonard: does transparent mean that the background is the camera?

Brandel: for this proposal, I envision that the (??)

Leonard: so you'd see the page elements behind the canvas?

Brandel: yes

bajones: yes, like canvas can be transparent and float on top of a block of text
… then either the text needs to be in stereo mode
… my inclination is that if you have a transparent background, you don't get stereoscopic
… so the default maybe should be to not be transparent
… the other thing that I want to mention is image background
… you could surround it with a sphere
… but that could be problematic if you move around
… I think you can make the argument that there's a cubemap style image
… not sure if it's part of the mvp
… maybe an environment map style image
… I see that use case but not sure if it's needed right away

<Leonard> Agree with Brandon

cabanier: would it make sense if the model punches out a hole in the page

<Leonard> ... on image (environment) background

bajones: that would make a big change between 2d and 3d
… I don't think developers would expect that
… so we'd like to have an explicit option for that

bradleyn: what if you give it a stretch property so you can push it in?

bajones: that would be very tricky to specify
… you can imagine that as you scroll there are 3d elements that scroll over text
… but in most cases it can overlay text on the page
… given your description you would have a black hole effect
… when you view this in immersive environment, you'd sink it into the space
… so the text would sink into the background

<Leonard> Call the black-hole effect!

bajones: that would be really awkward
… maybe there's a use for that type of effect but specifying transparent should not trigger this

Jesse_Jurman: is this exclusive to the model element?
… or is this something that any element can opt into?
… maybe you can do this in CSS
… so text or an image could do this

Brandel: in this context, it's around the model on the page
… we don't have a way for people to knock out pixels
… this is in terms about what we do with the concept of transparency
… what you propose is a general problem
… the model element is the first stereoscopic element on the page

Leonard: is this similar to CSS z-index

bajones: this is similar to compositing in a stereoscopic effect
… but that would not work at all because z-index expects to be drawn in 2d
… if you use this as a cue for actual depth, this won't work
… it would be great

<ntim> 2D HTML/CSS has the top layer

bajones: but you don't want to use existing cues of the page
… maybe something that's at the top of the page to make you opt in
… you want to have the developer say that they want this
… we can expect a vast number of developers to still develop in 2d

cabanier: a number of years ago, there was a proposal and an implementation from Magic Leap to have 3D transforms apply in actual 3D. I believe the WebKit team said that they would implement this, and that would satisfy your request

Do we want to expect visible transport controls, or will it all be done in custom JS?

Brandel: models can have animations
… animations can have a duration specified, so models can be understood as media elements
… how should the minimal treatment of animations be presented in the MVP?

Leonard: not all models are intended to have their animations autoplayed
… they can be played in response to events, they can have multiple tracks
… we should not spec that they should autoplay their first tracks

Brandel: ???

bajones: there is a common use case for video where you have an embedded player
… secondary uses are video that is being used as a visual element in the page

<Brandel> (I just said that a lot of people use <video> in a way where it is desirable to hide it)

bajones: default behavior is that you get the controls for video and then they can be hidden
… for model it should probably be the opposite
… but there may be cases where it's desired to step through animations

<Leonard> A few cases where it can be nice to have controls: product assembly (IKEA products), simulations

bajones: those controls can be built by the page; for video most users build their own controls
… if you want to use the model element in the context of something like Sketchfab, they'll probably build their own controls

<Zakim> ada, you wanted to ask about CSS animation

bajones: there should be programmatic control but we may not need to build the controls in

ada: it'd be nice to have scroll-based timelines
… so then a model would play the animation when the user scrolls down the page
… unsure about the best way to achieve this
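
A minimal sketch of the scroll-driven idea ada raises here, assuming the model exposes some settable animation time (nothing of the sort had been agreed at this point). The function names `scrollProgress` and `animationTimeForScroll` are hypothetical and exist only for illustration; the sketch is pure arithmetic and does not touch any real `<model>` or scroll-timeline API.

```typescript
// Hypothetical helpers: not part of any <model> proposal or CSS specification.

// Fraction of the page scrolled, clamped to [0, 1].
function scrollProgress(
  scrollTop: number,
  scrollHeight: number,
  viewportHeight: number
): number {
  const scrollable = scrollHeight - viewportHeight;
  if (scrollable <= 0) return 0; // page does not scroll
  return Math.min(1, Math.max(0, scrollTop / scrollable));
}

// Map scroll progress linearly onto an animation of the given duration.
function animationTimeForScroll(
  progress: number,
  durationSeconds: number
): number {
  return progress * durationSeconds;
}
```

In a page, the result would presumably be written to whatever time property the element ends up exposing, from a scroll event handler or a declarative scroll timeline.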

cabanier: it may not be needed for MVP to build in the controls
… may be problematic to specify the position of the controls
… maybe just specify "auto play default animation or no"

alcooper: agree that we should let pages build their own controls
… but how can this be done? the point of model was that it's the browser that does things
… so now we need extra API surface for sites to build the UI

bajones: we need to be careful how to do this
… on Magic Leap there was a car model config that got entirely disconnected from what was set on a page when it was torn out into the environment

<dino> it might be enough to start with HTMLModelElement having part of the HTMLMediaElement API. Don't worry about multiple animations. Just a simple currentTime and paused, etc.

bajones: we probably want the model to be connected to the page
… so whatever controls are left in the page can still control the model

<dino> The good news is that HTMLMediaElement has done all the hard work on API design.
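
One way to picture dino's suggestion is an interface that borrows a few members from `HTMLMediaElement` (`currentTime`, `paused`, `duration`, `play()`, `pause()`). The sketch below is only an assumption about what such a subset might look like: `ModelPlayback`, `FakeModelPlayback`, `tick` and `togglePlayback` are hypothetical names, and `play()` is simplified to return nothing, whereas the real `HTMLMediaElement.play()` returns a promise.

```typescript
// Hypothetical subset loosely modelled on HTMLMediaElement; not a proposed API.
interface ModelPlayback {
  readonly duration: number; // seconds
  readonly paused: boolean;
  currentTime: number; // seconds, settable for seeking
  play(): void;
  pause(): void;
}

// In-memory stand-in so page-built controls can be exercised without a browser.
class FakeModelPlayback implements ModelPlayback {
  readonly duration: number;
  private _paused = true;
  private _currentTime = 0;

  constructor(duration: number) {
    this.duration = duration;
  }

  get paused(): boolean {
    return this._paused;
  }
  get currentTime(): number {
    return this._currentTime;
  }
  set currentTime(t: number) {
    // Clamp seeks to the animation's range.
    this._currentTime = Math.min(this.duration, Math.max(0, t));
  }
  play(): void {
    this._paused = false;
  }
  pause(): void {
    this._paused = true;
  }
  // Advance the clock by dt seconds, as a renderer might do once per frame.
  tick(dt: number): void {
    if (!this._paused) this.currentTime = this._currentTime + dt;
  }
}

// A page-built play/pause button only needs the interface above.
function togglePlayback(m: ModelPlayback): void {
  if (m.paused) m.play();
  else m.pause();
}
```

With a surface this small, a site could build its own transport controls, which matches the "programmatic control, no built-in UI" position several speakers took.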

bajones: some of this is OS-level

Brandel: currently in model we have a model on a page that stays on a page and lives in a page
… you can extract Quick Look models out of a page
… transport controls are in the context of a page when viewed on a page

<bajones> Thanks for the clarification!

cabanier: what dino showed is a 2d surface with stereo rendering

<vicki> despite the fact that most sites implement their own video controls, I think it would be strange to provide an API to control model playback without providing built-in controls.

<dino> bajones, did you see the videos I showed earlier?

cabanier: Magic Leap would hand off the model to the system compositor and the system compositor would show it above the page
… the model is handed off to another process that renders it in a different context

dino: we get requests for being able to take the object out of the page and have it still controlled by JS

cabanier: in the case of Magic Leap they had an option for room-scale; the model would be in the room and the browser would still be visible
… this was gated by a permission prompt
… it required user intent

dino: similar to fullscreen, where content could spoof UI that the user trusts
… we thought about this and didn't come up with an acceptable solution
… the model's in a page when it is in a page, but when it's pulled out it will be taken over by something else
… if a browser can do it safely then fine

cabanier: would you allow it or would it be forbidden?

dino: I don't know, maybe we could do model.fullscreen?
… not a suggestion, just something we were thinking about
… people will want to take things out into their space
… web devs will also want that to happen
… I don't know if we need to formalize the API right away

bajones: I like dino's suggestion to use fullscreen
… it does seem like model is breaking into 2 different use cases
… games? object placement? etc
… being able to take the model into fullscreen seems like a reasonable option
… the page would stay in the background and JS will run
… the 2nd use case is someone taking the model out of a page as a widget
… that'd run in perpetuity without having to run the page
… or maybe we could have some kind of a worklet that keeps running
… I would be sad if we didn't have a way of pulling the model out while still being able to programmatically control it

dino: very useful but we should not worry about it for MVP

bajones: agreed

dino: "I'm shopping for a couch, pull it out into my environment, but I don't want to quit this mode" - that seems like the use case here
… but unsure how to do it safely and securely right now
… but not essential yet

Brandel: model can have a ground shadow - related to presentation mode in Quick Look
… helps visually understand what's happening
… in a stereoscopic context it may not make sense because we already have other visual cues
… curious if people think it's valuable - always on, always off, configurable?

bajones: it won't be appropriate for every model in every context for it to be always on
… solar system?
… you may not want to have an implied ground
… there are a lot of models that put a platform in the model itself that acts as a ground
… not too many scenarios where ground shadow is really important in the page itself
… if you have the model exist outside of the page then sure
… in a page the lack of a shadow may not be too problematic

Leonard: when you don't have a shadow, will it be bad for UX?
… if a shadow is done badly, how bad is it?
… if the model is bounded, will the shadow be clipped and how will that affect UX?

<Zakim> ada, you wanted to ask about ShadowMaterials in model formats

Leonard: may be easier for MVP to skip the shadows and add capabilities for sites to add them

ada: some engines have shadow materials
… which are only going to get colors from shadows
… they will be transparent if there are no shadows

Brandel: not a real world entity, just something that receives shadows
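
The shadow-material idea ada and Brandel describe can be sketched as a function that outputs a colour whose opacity depends only on how much shadow falls on the surface. This is an illustrative assumption about how such a material behaves, not code from any engine or from a `<model>` proposal; `shadowMaterialColor`, the `RGBA` type and the `maxOpacity` parameter are hypothetical.

```typescript
// Hypothetical sketch of a "shadow catcher" surface; not taken from any engine.
type RGBA = [number, number, number, number];

// shadowAmount: 0 = fully lit, 1 = fully in shadow.
// Returns black with an alpha proportional to the shadow, so the surface
// is fully transparent wherever no shadow falls on it.
function shadowMaterialColor(shadowAmount: number, maxOpacity = 0.5): RGBA {
  const s = Math.min(1, Math.max(0, shadowAmount));
  return [0, 0, 0, s * maxOpacity];
}
```

As Leonard notes, the result is only plausible if the light direction used to compute the shadow amount matches the rest of the scene.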

Leonard: content creator's choice, but you need to know the light source correctly

ada: you can put light source inside the model
… or the page (?) can have a light source and since the browser renders things it can do so properly

Leonard: will probably also fall under "not in MVP but maybe later"

yonet: we can change the schedule (frequency of meetings) so we can chat about model more frequently in our regular meetings

ada: good discussion, see you tomorrow!
… Zoom link in IRC for the WebApps WG meeting

Attendees