W3C

Second Screen WG/CG F2F - Day 2/2

24 May 2019

Attendees

Present
Anssi Kostiainen (Intel), Chris Needham (BBC), Eric Carlson (Apple), Francois Daoust (W3C, remote), Kasar Masood (SportTotal), Louay Bassbouss (Fraunhofer FOKUS), Mark Foltz (Google), Peter Thatcher
Chair
Anssi
Scribe
Chris, Francois

Meeting minutes

anssik: Today we'll cover more technical topics for the Open Screen Protocol, the Open Screen Protocol library, the explainer for the Open Screen Protocol, plans for wide review, group rechartering, and upcoming F2Fs

General 1.0 protocol issues

Open Screen Protocol slides (56-69)

mfoltzgoogle: A type ID is used by endpoints to decide which CBOR parsing code to use.
… There are also numeric identifiers for requests/responses.
… Third type of IDs are how endpoints correlate requests/responses. If you want to determine the availability of an endpoint, you create a "watch" with an identifier.
… I want to review how these identifiers are assigned.
… I don't think we have issues open for message type keys.
… Let's start with message type IDs. [showing type ID structure slide]
… How many bytes do we want to allocate to each type of message?
… With one or two bytes, we already have a pretty big range of message types. I don't think we need to go beyond that.
… The basic idea is to reserve the 1-byte key for messages that are most frequently sent. Hard to tell before we have implementations, but some are obvious.
… presentation-connection-message is a good example, as are audio and video frames. One change I would make is to add remote-playback-state as well.
… The majority of message types will be 2 bytes, from 64 to 8192 in the table. These are messages that should happen one or two times per session.
… Note we reserve ranges each time for extensions.
… Longer type IDs, let's say they are reserved for now; I don't think we'll need to bother with them any time soon.
… The current allocation appears in the CDDL.
… The spec is not explicit about the allocation scheme.
… The spec should just say which ranges can be used.
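
For illustration only (the real key assignments live in the spec's CDDL; the values below are placeholders), a TypeScript sketch of the size scheme just described:

    // Placeholder type IDs, not the spec's actual assignments.
    // 1-byte keys (0-63) are reserved for the most frequently sent messages;
    // most messages get 2-byte keys (64-8192); higher ranges are reserved,
    // with room set aside for extensions in each range.
    const enum TypeKey {
      PresentationConnectionMessage = 16, // hypothetical 1-byte key
      RemotePlaybackState = 17,           // hypothetical 1-byte key
      PresentationStartRequest = 104,     // hypothetical 2-byte key
    }

    // Bytes a type key occupies on the wire under this scheme.
    function typeKeySize(key: number): 1 | 2 {
      return key < 64 ? 1 : 2;
    }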

Chris: Do we need a way for extensions to be able to register ranges of numbers for their own use, e.g., some form of registry?

mfoltzgoogle: There are a couple more steps before we can have full support for extensions. We need a way for an endpoint to discover what extensions are supported.
… I think it's important to have a registry of capabilities. We don't want two endpoints to understand extensions differently.
… Once you know the capabilities of the other endpoint, hopefully you can send/receive messages for the right IDs.

Peter: The spec currently says that if you get a type key that you don't understand, you just close the connection.

Eric: So you need to recompile to support extensions

Peter: You would need new code. Whether you recompile depends on how you implemented it.

mfoltzgoogle: The capabilities have to be mutually agreed. The other side has to acknowledge that it understands the capabilities.

<cpn> Also, there may be a need to add a time synchronization protocol (e.g., similar to HbbTV companion screen) that may want to use low-range message type IDs

Action: Chris to add a reference to the DVB / HbbTV sync message definitions

Peter: The endpoint will just indicate which type keys you understand. If you don't know what the capabilities mean, then you just won't send these messages.

Peter: We could add a section that explains how to add new types of messages.

mfoltzgoogle: There's a normative aspect that says that no implementation should send messages for a type key that the other endpoint has not declared support for.

Peter: Yes. There's no way for me to tell you that I cannot receive a certain type of message other than closing the connection.

anssik: OK, we should create an issue to track it.

mfoltzgoogle: Extensions are part of v1 scope; there may already be an issue open.

Action: mfoltzgoogle Add remote-playback-state to the list of one-byte message type IDs.

<anssik> [Example of an Extensibility section]

Peter: We could reserve a range of numbers for capabilities. The sizes aren't too important, so we can reserve large numbers.

mfoltzgoogle: I think there's an assumption that all agents support certain types of messages. We could be pedantic and say that each message type maps to a capability.

Eric: There's a basic set that you have to implement.

<anssik> [Opened Add Extensibility section issue (#175)]

mfoltzgoogle: I think we should reserve something like a thousand capabilities for standard capabilities and reserve the rest for non-standard capabilities.

Resolved: Reserve 1-1000 for standard capability IDs. Extensions can pick numbers > 1000
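
A minimal sketch of the capability negotiation discussed above, with assumed message and field names (the spec's CDDL may differ):

    // Assumed shape: the field name is illustrative.
    interface AgentCapabilities {
      capabilityIds: number[]; // 1-1000 standard, >1000 extensions, per the resolution
    }

    class Endpoint {
      private remoteCapabilities = new Set<number>();

      onAgentCapabilities(msg: AgentCapabilities) {
        this.remoteCapabilities = new Set(msg.capabilityIds);
      }

      // Per the discussion: never send a message for a capability the
      // other endpoint has not declared support for.
      canSend(capabilityId: number): boolean {
        return this.remoteCapabilities.has(capabilityId);
      }
    }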

mfoltzgoogle: We talked about length prefixing having been dropped yesterday.
… We talked about an IANA registry for external (extensions) type keys. I don't know if we want to start with ours or register it at IANA.

anssik: Probably good to start small on our own to avoid overhead

mfoltzgoogle: It will probably be a file on GitHub for now.

anssik: That's fine.

mfoltzgoogle: For dictionaries, we use numeric IDs instead of keys on the wire to save bytes. In principle, we could let endpoints add additional entries to the dictionaries, but we need a way for endpoints to detail the mapping between IDs and keys.
… In principle, the message has to make sense even if the field is not processed.

Peter: If we have an extensibility section, we can explain how to do that there.
… We didn't think of saying that fields added to regular messages should be optional.

mfoltzgoogle: Can you add this to the pull request?

Peter: Sure.

anssik: Right. Whoever wants to create extensions, e.g. for HbbTV, might want a checklist of things to do.
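
To illustrate the numeric-key encoding being discussed (the key-to-name mapping below is invented for the example):

    // With string keys (readable, but more bytes on the wire):
    //   { "presentation-id": "abc", "url": "https://example.com/" }
    // With numeric keys, as used on the wire (mapping invented here):
    //   { 1: "abc", 2: "https://example.com/" }
    // An extension adding a field (say key 100) must keep it optional, so
    // an endpoint that does not know key 100 can still process the message.
    type WireMessage = Map<number, unknown>;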

mfoltzgoogle: Alright, that wraps up message type IDs and fields.
… Next category is request/response IDs.
… Concretely, when you send a message that needs a correlated response, the request message includes a request-id, which must appear in the response.
… We don't need the request-ids to be all unique across all endpoints.
… But we do want to avoid collision between multiple connections between two agents.

Peter: Two options. One is to tie requests/responses to the transport. The problem is with reconnect, which switches to another transport.
… The other option is to keep state across transports. But what happens if an endpoint loses that state?

mfoltzgoogle: One solution is that each agent keeps track of the next request ID to use for an endpoint and increments it by 1 for each message. A request-id of 1 is a "reset your counter" message.
… 1 becomes a "magic number".
… Option 2 is Peter's option.

Peter: Basically, send a message to request state if you forgot about it.
… Problem with this one is that if I forget everything, I may forget to send that reset request as well.
… so we need to make this a requirement to send the request.

mfoltzgoogle: Basically a garbage collection mechanism.

Eric: Is that likely to be in the middle of a session?
… I just wonder if this is a problem that really needs to be solved.

Peter: I can think of a restart use case.

Eric: After restarting, doesn't Chrome indicate that it is connecting for the first time?

Peter: That's basically what we're trying to specify here.
… Third option is to include a state-id.

mfoltzgoogle: It could be a timestamp, for instance.

Eric: What's the initial message sent when a connection is created for the first time?

Peter: agent info, basically.

Eric: Could you include an optional field there, perhaps?

Peter: Right, sounds like option 3, but in agent-info, not agent-status.

Resolved: For issue #139, add a random state token to agent-info to notify connecting agent to discard previous state (including ID counter).

Resolved: The additional field can be named "state-token"
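
A sketch of the resolved behavior; only the "state-token" field name comes from the resolution, the rest is assumed:

    interface AgentInfo {
      stateToken: string; // random token that changes when the agent loses its state
    }

    class RequestIdAllocator {
      private lastSeenToken: string | null = null;
      private nextRequestId = 1;

      onAgentInfo(info: AgentInfo) {
        if (info.stateToken !== this.lastSeenToken) {
          // The other agent forgot its previous state: discard ours too,
          // including the request-id counter, so stale IDs cannot collide.
          this.lastSeenToken = info.stateToken;
          this.nextRequestId = 1;
        }
      }

      allocate(): number {
        return this.nextRequestId++;
      }
    }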

mfoltzgoogle: For completeness, we had a separate issue for watch IDs, but I think we should use a combined ID counter.
… For request-ids and watch-ids.

Peter: The state-token automatically covers this.
… I don't think we need to specify anything in the spec. Implementations can pick whatever ID they want. It will work, no need to combine the ID counters.

mfoltzgoogle: Fair enough.

anssik: Can we close the issue then?

mfoltzgoogle: Yes, it can be closed with the pull request for the previous issue we discussed, solved by the state-token proposal.

Action: Close Issue #145 when the pull request lands for the previous resolution (for Issue #139).

mfoltzgoogle: I believe that's all of the ID related issues that we had.

Remote Playback API Protocol Issues

Open Screen Protocol slides (76-84)

mfoltzgoogle: Some of the remaining issues for v1. The first one is separate from the next three: we need to finish the requirements for the Remote Playback API.
… We did that for the Presentation API but not for the Remote Playback API. I prepared something in a pull request, PR #162.
… I looked at use cases to prepare the pull request. Relatively short.
… [reviewing HTML preview of PR #162]

mfoltzgoogle: If we add remoting to the scope, we may want to amend the third requirement to include media data.

Eric: What about the connectivity? LAN vs. bluetooth, etc.

mfoltzgoogle: This is more about what the messages need to be able to do

Peter: I think we do everything in the spec right now except number 6.

mfoltzgoogle: We do pass HTTP headers but I don't know if that's sufficient to meet the requirements.

Eric: No, that's not.

Peter: I think we would need to put the locale somewhere else.

anssik: How does Airplay handle locales if they are different?

Eric: I have no idea.

mfoltzgoogle: [looking at related notes in the Remote Playback API]

Peter: you wouldn't have a different locale for one remoting session and another?

Eric: Actually, for example, there may not be an encoding in your first language, but there may be in your second or third language.
… We look at the options vs. the list of languages and try to make the best choice.

[discussion on controller/receiver language references]

mfoltzgoogle: Eventually, it's up to the receiver to choose the locale.

Peter: Is it enough to have a list of locales, then?

Eric: Yes.

Peter: we could have that in agent-info.

mfoltzgoogle: That seems reasonable. I agree that this is information that is not specific to the Presentation API and Remote Playback API.

Action: Peter to add locale to agent-info and to create an agent-info event to advertise changes
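
A sketch of the receiver-side selection Eric describes, assuming agent-info carries locales as an ordered preference list of BCP 47 tags (field name assumed):

    // Receiver side: pick the first controller-preferred locale for which
    // a matching track exists; otherwise fall back to a receiver default.
    function pickTrackLocale(trackLocales: string[], preferredLocales: string[]): string | null {
      for (const want of preferredLocales) {
        if (trackLocales.includes(want)) return want;
      }
      return null; // receiver default applies
    }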

mfoltzgoogle: The next three issues are related and future-looking. We may want a mechanism by which an application can reconnect to a remote playback started somewhere else through a remote playback ID.

Eric: Where does this ID come from?

Peter: Right now, that is not included.

mfoltzgoogle: Correct. One way to do this would be to expose a remote playback ID to the JS and then to add a reconnect method to the Remote Playback API. This does not exist yet.

[Eric describes how media playback works on iOS and Mac]

Peter: The distinction here is the idea that the receiver can continue playing even if the controller goes away. For Presentation API, that's in the spec. For remoting, that has not been envisioned. Is it worth trying to introduce the idea?
… Does Airplay have this concept?

Eric: Not as far as I know.

Peter: It wouldn't work for remoting, obviously.

mfoltzgoogle: The flinging implementation could theoretically be made compatible with such a design.

Peter: Second related question on whether we want multiple controllers for remote playback.

Eric: There has to be some kind of a default behavior when the last connection drops.
… You may want to start a remoting solution where you start watching something on a device and then switch to your mobile to control it.

Peter: We could do that.

Eric: It may make sense to flag this as v2 until we have more practical experience.

mfoltzgoogle: When the controller disconnects, it's implementation specific whether media playback continues or not.

anssik: Would it be possible to start a remote connection from a mobile phone, then put the phone to sleep, and then pause the playback one hour later?

Eric: It's implementation specific for now.
… Application might get killed for various reasons. If you want to survive putting the screen off, you actually want to survive app restart.

Peter: As long as you remember the remote playback ID, then it would work at the user agent level. But not for the Web application. What happens if Safari gets backgrounded?

Eric: It depends on what the system is going to need in terms of resources.
… The web app needs to store the info.

Peter: But it does not have a way to reconnect now.
… The user agent can reconnect, but the Web app wouldn't know about it.

mfoltzgoogle: There is an issue opened. According to the discussion, the protocol supports it, but we don't have support at the API level.

Peter: If we decide that we're going to do something later, we may change the design right now.

mfoltzgoogle: Not to derail but I have slides for this afternoon from a colleague on multiple controllers.

<anssik> Reconnecting media elements to the playback (#5)

[Group discusses reconnecting ins and outs, reuse of media elements]

Peter: It's almost as if you would want to create a proxy but not reuse an existing thing.

Eric: Yes, it's a separate consideration from setting up a connection and sending full status.
… Is the operation different from getting a MediaStream from a camera?

Peter: So you want to be able to do "new Element.srcObject = [something with ID in it]"?

Eric: Yes.
… It would do everything but render the stream locally.

Peter: One discussion is bikeshed the current API for how to setup a proxy. Second is how we would want to specify the behavior. Original question is whether we want to do that at all.

mfoltzgoogle: The protocol for a second remote playback is going to be different from an initial remote playback.
… Play/Pause/Seek should be available to secondary controllers, for sure.
… What we've seen a lot from Presentation API usage is that, when controller disconnects, presentation continues.
… So we'd want that to happen with remote playback as well.
… If we want single controllers, we want to prevent the case where another controller can take over.
… If we want multiple controllers, that's the opposite, we probably want to use GUIDs for remote playback as well.
… It seems that there is support for addressing multiple controllers in the future.

Peter: If we want to do that later, we don't want to be stuck by the protocol.

mfoltzgoogle: The discussion on how to expose that at the API level would have to continue.

anssik: So, for protocol this would be v1. For the API, this would be v2.

Peter: Adding a message that allows you to reconnect to an existing one would be v2 for the protocol as well.

mfoltzgoogle: Yes, I'm trying to avoid scope creep.

Resolved: Use a string GUID for remote-playback-id, similar to Presentation IDs.
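
Per the resolution, a remote playback is identified by a string GUID, minted the same way as a Presentation ID; a minimal sketch:

    // Mint a remote-playback-id as a string GUID, as resolved.
    // crypto.randomUUID() is used for brevity; any GUID source would do.
    // A v2 reconnect feature could later reference this ID.
    function newRemotePlaybackId(): string {
      return crypto.randomUUID();
    }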

Remaining streaming issues

Peter: I presented audio-frame, video-frame, and data-frame. But then I realized that I did not need data-frame so I removed it.
… There's no text if I share a tab, right?
… For remoting, you would have text, but we already have a mechanism to send text to the other side, through text track.
… So it doesn't need to be in the form of a data-frame.

Eric: [question on remoting]

Peter: If we're trying to solve remoting, we can concentrate on solving text. But in the context of streaming, there is no need to stream text/data.
… I'm proposing to drop generic data streams as long as we don't have a use case for them.

mfoltzgoogle: The main use case for data frames is when you have interactive content, e.g. an interactive display, touch screen.

Peter: Yes, but that's the opposite direction, so you would set it up differently.

mfoltzgoogle: Why?

Peter: I don't want to try to solve a problem that we don't have for now.

Kasar: For sports stuff, we may have overlays in the stream.

Peter: If we want to put it back, we can do that.

Eric: The problem is not about sending text track. It's about sending arbitrary data.

Peter: If we have a use case, that's fine, we can put it back when needed.

mfoltzgoogle: The use cases do exist, they are just not documented yet. I think it's fine to postpone to v2. I'd like to see an action for someone who's interested in it to describe what they would like to have.
… Once we have use cases, we can decide whether we define specific messages or a generic data-frame.
… Also, how does an application extend streaming for its own data.
… A game streaming product may send map location, remote inputs, etc. along with media.
… If there are APIs for these things, it makes sense to eventually consider them.

Peter: [difference between streaming and remoting in terms of latency/buffering optimizations]

mfoltzgoogle: If there are new APIs coming, we'll have to consider them in the future.

Peter: The solution for text tracks should be more of a first-class citizen than generic data frames.
… So I'll keep data-frame

mfoltzgoogle: I'm not sure that we have to keep the message. Extensions can define their own.

Peter: I think it can be good to define a message for generic data so that extensions don't need to create new ones each time.
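
A sketch of the kind of generic data-frame Peter is arguing for, so extensions need not each invent their own framing; every field name here is assumed, pending the pull request:

    // Hypothetical shape, mirroring audio-frame/video-frame: a payload plus
    // timing, with an encoding-id saying which extension-defined format
    // the payload is in.
    interface DataFrame {
      encodingId: number;  // extension-defined payload format
      startTime: number;   // presentation timestamp (timescale assumed)
      payload: Uint8Array; // opaque bytes, interpreted per encodingId
    }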

mfoltzgoogle: Let's discuss this on the pull request.

Action: Peter to update pull request to keep data frames and discuss how they can be used to extend the streaming protocol.

Open Screen Protocol Library

Open Screen Protocol slides (100-116)

mfoltzgoogle: Goal is to provide an update since TPAC.
… Open source implementation of the Open Screen Protocol. We intend to implement everything that is or will be in the spec. We're a bit behind, because the spec moves faster.
… The feature set we support: advertise/listen using mDNS, agent info, connection to an agent with QUIC, and we support all of the presentation-* messages.
… We have a demo folder to demonstrate the presentation side of the protocol.
… There's also Peter's implementation written in Go.

anssik: Is the goal to keep evolving the two implementations in parallel?

Peter: Yes.

anssik: So two independent implementations. That's great and would be very helpful for more formal steps later on.

Peter: It's in the repo. osp/go

mfoltzgoogle: At TPAC, we were figuring out how we were going to implement these CDDL messages.
… We now use a lot of CDDL features to generate code automatically.
… Currently, by default, the library uses a copy of what's in Chromium for QUIC.
… PresentationController, PresentationReceiver, PresentationConnection APIs exposed by the library, so hopefully easy to integrate in user agents.
… Support for mDNS TXT data, to make it easy to handle the fingerprint.

Peter: QUIC is still maturing, so the code is not fully compatible with the IETF spec for now.

Eric: We have our own implementation of string_view and optional

mfoltzgoogle: Yes, we have an open question on whether to expose that through the API as well.
… I don't think we use any of those in the public API for now.
… We finished adding support for IPv6 socket and multicast.
… We finished our Mac porting layer.
… We added support for clang as well as gcc
… We decided to use std::chrono for our time library as it provides useful abstraction and access to system time in a consistent way.
… We're still discussing threading.
… That probably will require another few months.
… We're also trying to prevent duplicate dependencies with Chromium, and also thinking about reusing Chromium components instead of our own where possible (e.g. mDNS).
… [reviews high level structure]
… Clients may care about the platform layer (e.g. threads). We currently have platform implementations for Linux, Mac and POSIX.

Eric: One thing that's really painful for us is using the WebRTC libraries. Because the interfaces are in C++, it isn't possible to load them dynamically. Any client, whether it is going to use WebRTC or not, pays the cost of loading them to start with.
… There is no standard way to lazy initialize interfaces.
… If there is some way to load and initialize these things on demand, that would be a big win, and would make it much easier to adopt. That doesn't mean that the interface cannot be in C++, but it would be very helpful to be able to load it and tell it to initialize itself and find the entry points dynamically.

mfoltzgoogle: Got it.

[The group discusses possibility to include/exclude dependencies in build params]

mfoltzgoogle: [shows architecture]. It mostly replicates the folder structure.

cpn: Thinking about an HbbTV implementation where you would do discovery and authentication. These would need to happen outside of the browser engine: done at the TV layer, and then passed on to the browser layer.

mfoltzgoogle: What parts would be done where?

cpn: The browser would be responsible for actually implementing the interface exposed to the application. The TV itself would be doing the discovery part, authentication.
… We may need cross-process communication

mfoltzgoogle: If you wanted such a model, then we would need to make sure that the embedder could run the library in two places with different starting points. You'd have to figure out how to synchronize the two.
… Right now, we kind of assumed the code would be run in one place to do the whole thing.
… If we did want to split the library, it should be possible in principle.
… Knowing where those splits could be would help that process.

Chris: I suspect an integrator may not be able to just take the Chromium implementation as is; they would have to separate out some parts, e.g., the discovery and authentication

[Some discussion on network sockets and where they are supported]

mfoltzgoogle: [shows PresentationController interface in C++ code]
… We do request the embedder to provide some callback parameters

mfoltzgoogle: Depending on which protocol you support, you're going to inject different services. If you want to use the default implementation, you can just use it. Otherwise, you can inject your own implementation, and then it's up to the library to decide when these services are run.
… each of these objects has an observer that gets notified when the state changes.
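
A language-agnostic sketch of the injection-plus-observer pattern described (in TypeScript for brevity; the library's actual C++ interfaces differ):

    // Illustrative only: names do not match the library's real classes.
    interface ConnectionObserver {
      onStateChanged(state: "connecting" | "connected" | "closed"): void;
    }

    interface QuicService {
      connect(endpoint: string): void;
    }

    class Controller {
      // The embedder can inject its own QuicService or keep the default.
      constructor(private quic: QuicService, private observer: ConnectionObserver) {}

      start(endpoint: string) {
        this.observer.onStateChanged("connecting");
        this.quic.connect(endpoint); // library decides when services run
      }
    }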

Eric: Back to lazy loading, it's harder to look for these things yourself. Wrapping this functionality in factory functions makes it considerably easier.

Mark: We can generate code to parse messages
… [shows example of CDDL translated to a C++ struct]
… There's a platform API, based on C++ interfaces, high level methods to create and bind sockets, join multicast groups,
… send and receive datagrams, etc
… For the next few months, we want to implement auth messages and TLS support
… We'll add an API for remote playback, and messages; also streaming
… We're integrating with Chrome, and hope to make it available in the browser behind a flag by end of year
… You can contribute to the code. There's a CI system to build and test

Presentation and Remote Playback APIs: Proposals for enhancements

Open Screen Protocol slides (118-138)

Peter: There are three ideas for changing the APIs
… Some things we want to improve. We have two APIs, but if a web app wants to support both, you need two buttons
… Also synchronized media in the Presentation API
… Also MSE streaming in remote playback
… Looking at MSE streaming in remote playback. If the browser is doing the forwarding and what the app puts in the buffer isn't suitable for the remote playback device, you may need to transcode. We want to avoid that
… Proposal to report capabilities of the remote playback device via the API
… Add a RemotePlaybackCapabilities interface - for display resolution, codec support

Anssi: Is this Media Capabilities, or something more specific?

Peter: This would give a dynamic set of capabilities that could change at any time
… This could be problematic from a fingerprinting point of view

Mark: When would this attribute be populated?

Peter: In OSP, it's after you get the response from the remote side. So the API discussion is tied to the protocol discussion

Mark: In Media Capabilities, the application asks the platform what media it can support

Eric: The difference is that the answer tells the controller which is its preferred format, rather than indications of smooth and power-efficient playback, etc
… If we go with this kind of API, do we want the receiver to be able to change, if conditions change? For example, if the bandwidth available changes

Peter: The controller would know about the bandwidth
… A video in picture-in-picture could move to a smaller rendering size to save bandwidth

Eric: It's the size of the rendering layer that's important

Peter: We could follow the Media Capabilities model, but would need extending to give extra information
… There's nothing there about screen resolution

Mark: It's distinct from media queries

Chris: The explainer talks about the relationship between Media Capabilities and the Screen interface

Peter: You'd want to know about screen capabilities such as HDR when deciding what to request

Eric: You'd need Key system, codec type, size

Peter: Eric, you were interested in the receiver picking, to help with fingerprinting. Is there still a fingerprinting concern if we model it on Media Capabilities?

Eric: Sure, the less information we expose, the better
… If there's a good reason to make the decision on the controller side, we should discuss it. But it seems the remote device is the best place to do that, it has all the information

Peter: What would app developers actually use?

Eric: The app developer would just have to request the one selected by the receiver
… It's also more consistent, as the logic for which one to pick is the same from app to app

Anssi: So the controller presents the available options and the receiver selects one?

Eric: Yes

Anssi: Can we use the source element approach?

Eric: The controller knows more about the sources

Peter: It's MSE-using applications that will use this

Mark: In this case, it's the application that selects
… Those applications are now being directed to use Media Capabilities. To extend that to the remote playback case, it would make sense to use the MediaConfiguration

Peter: instead of querying by format one by one, you could pass in a list

Mark: Do we need to extend the protocol to support the MediaConfiguration dictionary?

Eric: Yes, an offer and an answer

Peter: The message could be a serialization of VideoConfiguration and AudioConfiguration
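
A sketch of such a serialization, pared down from the Media Capabilities VideoConfiguration/AudioConfiguration dictionaries (the wire-message shape is assumed):

    // Assumed message shape: controller -> receiver offer listing formats.
    interface RemotePlaybackOffer {
      video?: { contentType: string; width: number; height: number; framerate: number };
      audio?: { contentType: string; channels: string; samplerate: number };
    }

    const offer: RemotePlaybackOffer = {
      video: { contentType: 'video/mp4; codecs="avc1.640028"',
               width: 1920, height: 1080, framerate: 30 },
      audio: { contentType: 'audio/mp4; codecs="mp4a.40.2"',
               channels: "2", samplerate: 48000 },
    };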

Action: Eric to ask people at Apple about which of these options makes more sense: offer/answer vs exposing media capabilities

Peter: [Discussion of when the information is needed, in relation to the prompt to the user]

Peter: What happens during a prompt?

Mark: If the application has the choice of flushing the buffer, rather than playing out the content, it can decide

Peter: Option to look at what's been buffered, or look at duration

Eric: Another option could be to have a poll model where the remote indicates when it's ready to receive more data
… It could look at latency between controller and receiver, and space available in its local buffer

Peter: Perhaps we could have a RemotingState object and an onremotingstatechange event

Mark: The MSE spec describes a HAVE_ENOUGH_DATA state

Peter: This is for ensuring smoothness due to buffer underrun
… With option A, the application needs to diff the buffers to decide what to download. In option C, it's simpler

Eric: It's subtly different, there are two states: how much you have buffered and the state of the receiver
… A problem with option A is that the controller doesn't have any idea of how much the receiver wants to have buffered

Peter: What is the browser's job in remoting mode? It could copy the local buffer to the remote buffer

Eric: I think we need input from MSE people

Mark: Are the buffering algorithms going to be different between local and remote playback modes?
… Then there's the congestion control difference between controller and server, and controller and receiver

Eric: If it works as a poll mode, it makes it easier for the app on the controller, which is responsible for making sure it has data when the receiver needs it

Peter: Our Chromecast receivers are either showing a web page, Presentation API, or streaming and then discarding the data

Eric: From a low-level implementation perspective, we use AVFoundation objects. You give this a lambda and a dispatch queue, and it calls the lambda when it wants more data
… In the case of a stream coming from a camera, I feed the data into a queue, then the lambda runs on a different thread, then another thread pushes it down to the display code
… At the lowest level, it's looking at the clock and at the timestamps. This is the thing that knows how fast it needs data

Peter: This is like "has enough" in MSE. I think we need a third state, "has too much"

Eric: In MSE today, it's the responsibility of the app to do that, using a heuristic that doesn't cause the browser to run out of memory
… So we need a mechanism to deal with the back pressure
… Could lead to throwing data away

Peter: So it's not just networking, it's the network and the receiver buffer

Eric: Yes, the receiver is in a place to adjust the frequency at which it makes requests

Peter: So options A and B don't work, if we want to allow remote receivers to push back

Eric: There may be other options

Mark: Could use the readyState, e.g., to add "have too much"

Eric: Or does "have enough" mean also "don't send any more"?
… I think option C is a good starting point

Action: Peter to make the buffer and ready state proposal more concrete, and update the protocol to match

Mark: I think there are open questions about the API. When would the application produce the offer?

Eric: We should think about how this works for a MediaStream as well
… For a local stream I could make an offer, get a reply, then push frames

Peter: This is like the streaming case

Eric: The simpler we can make the API, the more successful it'll be

Peter: We could provide an API where you give a stream and the browser decides what to do (e.g, encoding)

Eric: Hopefully that would be as similar to this as possible.

Peter: When connected to a stream, the only part of the media element that's used is the rendering
… Could simplify further by not requiring a media element

Eric: For a MediaStream, the app doesn't have access to the frames, so you just need a mechanism to negotiate the format
… We have a similar situation as with MSE, where the conditions at the controller (optimal size and frame rate) aren't the same as at the receiver
… You could configure this on the MediaStreamTrack; it's most efficient to do that as it's being captured, so you want the optimum configuration information

Mark: If you want to base the capture constraints on the receiver capabilities, need to do that when initiating capture

Peter: This is a cool idea, supporting streaming, maybe for v2

Mark: This could be a new separate API, or an extension to the Presentation API, to be determined

Revisit adding support for Remoting in OSP

PR #173: Add support for remoting

Peter: This is modelled on the receiver advertising capabilities, but now looking to move to offer-based
… Also add information on buffer state

Mark: we can use readyState

Peter: With an additional "have too much data" value

Resolved: Use the HTMLMediaElement's readyState with an additional HAVE_TOO_MUCH state
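
A sketch of the resolved signal; values 0-4 are HTMLMediaElement's readyState constants, and the numeric value of the added state is assumed:

    const enum RemoteReadyState {
      HAVE_NOTHING = 0,
      HAVE_METADATA = 1,
      HAVE_CURRENT_DATA = 2,
      HAVE_FUTURE_DATA = 3,
      HAVE_ENOUGH_DATA = 4,
      HAVE_TOO_MUCH = 5, // added state: receiver asks controller to stop sending
    }

    // Controller-side back pressure: keep feeding until the receiver
    // reports it has enough buffered.
    function shouldSendMore(state: RemoteReadyState): boolean {
      return state < RemoteReadyState.HAVE_ENOUGH_DATA;
    }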

Local Presentation Mode

Local Presentation Mode explainer

Mark: There's an explainer. Today, the presentation created is in its own separate context
… If it needs access to cookies, or want to use parts of the original page (copy DOM content), would need to serialize to a message
… People want interactive elements across controller and receiver pages
… For applications like presentations, visualisations, this makes the presentation API harder to use
… Proposal to create a mode where you create a presentation more like a child of the controlling page, per window.open()
… That child can then be mirrored to another display
… This would work for applications like slides, with speaker notes and controls

Eric: Mirroring as in the bits, or as in the live document?

Mark: The bits

Mark: Added a feature to the Presentation API to request this mode
… Feedback was that adding a new mode was preferable to adding options

Eric: How does the application know how big to make the window?

Mark: Currently it doesn't; we'd want to size the window according to the size of the target display
… It adds a dictionary to PresentationRequest with an isLocal flag
… The request has an attribute that reflects the options set on the request
… The start promise returns a connection or a Window object
… Developers are asking us for this, may want to run an origin trial to get feedback
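
A hypothetical usage sketch of the proposed mode; the isLocal option comes from the explainer, but the exact API shape is still under discussion:

    // Hypothetical: lib.dom does not know this overload yet.
    const request = new (PresentationRequest as any)(["slides.html"], { isLocal: true });
    request.start().then((result: any) => {
      // In this mode, start() would resolve with a Window-like object for
      // the local child page instead of a PresentationConnection.
      (result as Window).document.title = "Speaker view";
    });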

Peter: So from a developer perspective it's like window.open() with a display picker

One API for starting sessions

Proposal 1: One method to invoke both APIs

Mark: We currently have two APIs, difficult for applications
… An idea we had was to allow a PresentationRequest to take a sourceMedia attribute to allow selection between a presentation and remote playback
… The user would see a combined list of devices
… The start() would resolve with a presentation connection or null for remote playback
… The remote playback API would work in the same way, the app would get remote playback events, just as if they'd chosen prompt()
… [shows example sender code]
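
The sender code itself was not captured in the minutes; a hypothetical reconstruction of the shape described above (attribute name per the minutes, details assumed):

    // One picker covering both presentation and remote playback.
    const video = document.querySelector("video")!;
    const request = new PresentationRequest(["player.html"]);
    (request as any).sourceMedia = video; // proposed attribute

    request.start().then((connection: any) => {
      if (connection === null) {
        // User picked a remote playback device: remote playback events now
        // fire on video.remote, as if the app had called video.remote.prompt().
      } else {
        // User picked a presentation display: use the connection as usual.
        connection.send("hello");
      }
    });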

Eric: In this case, if I pick something that goes to remote playback, what happens with the URL in the PresentationRequest?

Peter: It's not used

Eric: I don't like this, seems we're trying to retrofit onto the existing API
… Something we were talking about last week was setSinkId. As written in the current spec, enumerateDevices is supposed to include both input and output devices. By default, with no prompt, the user can get UUIDs for all devices. It's an information leak. We need the capability for WebRTC on the phone where a script can request that the output from a MediaStream goes to a headset instead of the speaker, depending on user choice.
… We looked at PresentationRequest to choose how to route the request without exposing information. Seems like the same problem, maybe there's a more general purpose API that could be used to do both things

Peter: [Shows example code with remoteDevice prompt]

Eric: If we try to make it also work for picking an audio output device, we wouldn't pass an array, as those are different from presentation devices

Anssi: This seems to need more thought

Mark: Maybe there's a generic device selection API? Worth talking with Device and Sensors people

Peter: [Shows example with capability based selection]
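
The example shown was not transcribed; a purely hypothetical sketch of what capability-based selection might look like (every name below is invented):

    // Invented API: the page states the capabilities it needs and the
    // user agent shows one combined picker.
    const device = await (navigator as any).requestDevice({
      capabilities: ["presentation", "remote-playback", "audio-output"],
    });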

Resolved: Explore a more generic device selection mechanism

OSP Explainer

Explainer for Open Screen Protocol

Mark: The explainer talks about the goals, to encourage wider adoption of an open protocol
… It describes alternatives, such as existing protocols like HbbTV and Cast, and why they may not be the ideal starting points
… It's not an API, so how do I convey this to the TAG?
… I was planning to show an example mapping to the API, without too much detail

Anssi: Could invite TAG members at TPAC

Mark: What's the TAG's turnaround time?

Anssi: Depends on who is assigned

<anssik> The web platform needs a service discovery mechanism

Mark: Also related to service discovery

Eric: Tess is interested in that

Mark: For the breakout, we should ask people to familiarise themselves with our work so we can have a productive discussion
… For planning, should we include streaming and remoting in scope for V1?
… I'm happy to include it

Resolved: Add streaming and remoting in scope of v1 OSP, conditional on CG charter update

Anssi: We should have a draft charter ready for TPAC
… Then in October, have the charter go for approval

Mark: Two things: standards process for OSP, and what are the main API features we want to include in scope of next charter

Wrap up

Anssi: Thank you everyone for attending

Mark: We'll update and share the slides

Summary of action items

  1. Chris to add a reference to the DVB / HbbTV sync message definitions
  2. mfoltzgoogle Add remote-playback-state to the list of one-byte message type IDs.
  3. Close Issue #145 when the pull request lands for the previous resolution (for Issue #139).
  4. Peter to add locale to agent-info and to create an agent-info event to advertise changes
  5. Peter to update pull request to keep data frames and discuss how they can be used to extend the streaming protocol.
  6. Eric to ask people at Apple about which of these options makes more sense: offer/answer vs exposing media capabilities
  7. Peter to make the buffer and ready state proposal more concrete, and update the protocol to match

Summary of resolutions

  1. Reserve 1-1000 for standard capability IDs. Extensions can pick numbers > 1000
  2. For issue #139, add a random state token to agent-info to notify connecting agent to discard previous state (including ID counter).
  3. The additional field can be named "state-token"
  4. Use a string GUID for remote-playback-id, similar to Presentation IDs.
  5. Use the HTMLMediaElement's readyState with an additional HAVE_TOO_MUCH state
  6. Explore a more generic device selection mechanism
  7. Add streaming and remoting in scope of v1 OSP, conditional on CG charter update
Minutes manually created (not a transcript), formatted by Bert Bos's scribe.perl version Mon Apr 15 13:11:59 2019 UTC, a reimplementation of David Booth's scribe.perl. See history.