anssik: We'll start with the application control, then show some demos, coffee break, HbbTV/ATSC discussion, lunch, open screen protocol library, other v1 issues, future use cases and APIs (local presentation window)
… No leftovers from day 1 as far as I can tell.
cpn: Got some feedback on #82 [QUIC] Find out timeline for TLS 1.3
cpn: At the IESG Review stage right now. Open question as to how long it's going to take until the RFC status but implementations are being deployed. Middlebox issue that was mentioned was resolved.
mfoltzgoogle: Question was the feature set in 1.3. Seems like the spec is stable now.
… Also status of implementations, and it seems that is happening in at least two shipping clients.
… Once we have a stable release, we can depend on it.
cpn: I've also been given some information on QUIC itself
… The quick summary of QUIC is that they're targeting the end of the year for completion. Everyone is focused on the transport right now, and not really looking at the HTTP mapping
tidoust: And we don't need the HTTP mappings
anssik: Two implementations on their way, in Chrome and Firefox. Not sure about the status in other browsers.
… It's easier to convince browsers to implement the APIs if they have the right architecture in place already.
… I'm wondering what "v1" means.
mfoltzgoogle: Probably meant to convey some particular scoping
anssik: OK, thanks Chris for the background work.
btolsch: Some material on control protocol and serialization. We currently use a custom binary serialization, and are considering CBOR and Protocol Buffers as alternatives. We'll discuss and recommend something for serialization.
… Protocol messages are of 4 types: command, request, response, event.
… Command is a one-way message that tells the other end to perform an action. It does not expect a response.
… Request expects a Response.
… Event is a way to tell the other end that something has happened. It does not expect a response either.
… There are also different Presentation API message types, from display availability to application messages and receiver status.
… [showing the current structure of header of messages]
… Version is attached to each message (change of version is possible mid-session)
… The message type gives the flavor, and then for each flavor the type and subtype, as just reviewed.
… For a response, the sequence ID is that of the request it's responding to.
… Two implementations can talk if they talk the same major version.
… Haven't addressed that yet, but version will be negotiated during discovery and connection establishment.
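The header fields just described (version, message flavor, type/subtype, sequence ID) could be sketched as follows; the field widths and ordering here are guesses for illustration, not the actual draft layout:

```python
import struct

# Hypothetical header layout based on the fields described in the minutes:
# major/minor version, message flavor (command/request/response/event),
# type, and a sequence ID. Real widths and ordering may differ.
HEADER = struct.Struct("!BBBBI")  # big-endian: maj, min, flavor, type, seq

def pack_header(major, minor, flavor, msg_type, seq_id):
    return HEADER.pack(major, minor, flavor, msg_type, seq_id)

def unpack_header(data):
    return HEADER.unpack(data[:HEADER.size])

hdr = pack_header(1, 0, 2, 7, 42)   # e.g. a "response" to request seq 42
assert unpack_header(hdr) == (1, 0, 2, 7, 42)
```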
… [looking at the custom binary format]
… CBOR is RFC7049. It's relatively new.
mfoltzgoogle: When we first proposed the binary protocol, Sangwhan from TAG took a look and pointed out that custom binary protocols are usually not a good practice.
… I agreed and took an action to investigate alternatives more deeply.
… CBOR is used in the Web Packaging effort.
btolsch: The primary design goals of CBOR are not necessarily small binary size, but small code size and fairly small message size.
… There are many many implementations of this already. It is based on the JSON data model.
… A possible downside is that the serialization is not as strictly specified as with Protocol Buffers: there's no canonical way to serialize CBOR.
… [showing CBOR samples]
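For reference, the CBOR encoding (RFC 7049) is simple enough to sketch: each item starts with a head byte carrying a 3-bit major type and a 5-bit length/value. This toy encoder covers only unsigned integers, byte strings, text strings, and maps; a real implementation (e.g. tinycbor) would be used in practice:

```python
def _head(major, n):
    # CBOR head byte: 3-bit major type + 5-bit additional info (RFC 7049).
    if n < 24:
        return bytes([(major << 5) | n])
    if n < 0x100:
        return bytes([(major << 5) | 24, n])
    if n < 0x10000:
        return bytes([(major << 5) | 25]) + n.to_bytes(2, "big")
    raise ValueError("larger lengths omitted for brevity")

def encode(obj):
    # A tiny subset: unsigned ints, byte strings, text strings, maps.
    if isinstance(obj, int) and obj >= 0:
        return _head(0, obj)
    if isinstance(obj, bytes):
        return _head(2, len(obj)) + obj
    if isinstance(obj, str):
        data = obj.encode("utf-8")
        return _head(3, len(data)) + data
    if isinstance(obj, dict):
        out = _head(5, len(obj))
        for k, v in obj.items():
            out += encode(k) + encode(v)
        return out
    raise TypeError(type(obj))

assert encode(10) == b"\x0a"            # small ints fit in the head byte
assert encode("a") == b"\x61a"          # text string of length 1
assert encode({1: "a"}) == b"\xa1\x01\x61a"  # 1-pair map
```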
mfoltzgoogle: What is the largest bit-length integer that can be represented?
btolsch: I believe it's 64.
mfoltzgoogle: OK, because for J-PAKE we need to represent integers around 1000 bits long
btolsch: That would have to be done as byte strings
mfoltzgoogle: We'll have to look at how it's done, e.g. in TLS.
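The byte-string approach suggested above could look like this; the helper names are made up for illustration:

```python
# J-PAKE group elements are large integers (~1000+ bits). In CBOR, absent
# a bignum tag, one option (as suggested in the discussion) is a big-endian
# byte string, similar to how TLS serializes large integers.
def int_to_bytes(n):
    return n.to_bytes((n.bit_length() + 7) // 8 or 1, "big")

def bytes_to_int(b):
    return int.from_bytes(b, "big")

big = 2**1000 + 12345
assert bytes_to_int(int_to_bytes(big)) == big
assert len(int_to_bytes(big)) == 126   # ~1001 bits -> 126 bytes
```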
btolsch: [going through CBOR details on optional fields]
… Omission only works with a single optional value, otherwise it's ambiguous. There's no problem with map-style encoding, since each key is explicit.
mfoltzgoogle: If we go with CBOR, we should definitely have rules about what receivers should do with values they don't expect, for extensibility purpose.
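The extensibility rule being suggested could be sketched like this; the field names are hypothetical:

```python
# Hypothetical extensibility rule: when decoding a map-keyed message,
# ignore keys you don't recognize and apply defaults for absent optional
# fields. Field names here are made up for illustration.
KNOWN_FIELDS = {"url": None, "timeout_ms": 5000}

def parse_message(decoded_map):
    msg = {}
    for field, default in KNOWN_FIELDS.items():
        msg[field] = decoded_map.get(field, default)
    # Unknown keys (e.g. added by a newer sender) are silently ignored.
    return msg

newer = {"url": "https://example.com/recv", "new_v2_field": True}
assert parse_message(newer) == {"url": "https://example.com/recv",
                                "timeout_ms": 5000}
```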
<sangwhan> If you implement a custom tag, I believe it is possible to do BigInts as well. Alternatively you can put it in uint8 arrays, and roll your own.
<anssik> [group discussing sangwhan's feedback]
cpn: Sangwhan made a few comments on IRC. First is that it is possible to implement a custom tag for BigInts, or use uint8 arrays. Second is that he does not have any stake in CBOR.
btolsch: Switching to Protocol Buffer now. It's Google's serialization scheme. Open source implementations are also available in many languages.
… Between versions, new fields can be added without confusing older versions. Optional fields can also be removed.
mfoltzgoogle: One of the differences between CBOR and Protocol Buffers is that instead of having a key-value pair, each field has a tag ID, so both ends need to have the same definition.
… The order does not matter much, although fields get serialized in order.
… You need to have the schema at hand to make the message readable.
btolsch: I have some benchmark data. We tried to keep it close to the binary format that's on GitHub.
… Done with a C++ implementation, same compilation flags (-O2).
… [looking at figures]
… For Protocol Buffers, even with the light version, there's a pretty good difference in size compared with the CBOR library.
… A benchmark with 10000 messages shows relatively similar figures in both cases for reads and writes.
mfoltzgoogle: If larger messages were used, do you know what that would give?
btolsch: That should remain similar.
cpn: Code footprint and compatibility with Web Packaging would tend to favor CBOR.
… You can do it with Protocol Buffers but I don't think there's an open-source library for that
btolsch: Right. As we've just discussed, performance is similar but CBOR is more size-efficient (both for code and messages)
<sangwhan> Also, standards. PB isn’t a standard. Also isn’t very JS friendly, although CBOR has a couple JS-interop pitfalls too. (e.g. integer keys in objects..)
btolsch: Also CBOR has decent tooling support, including for debugging.
mfoltzgoogle: On the debuggability point, we'll probably have to define a CDDL format.
btolsch: For debugging purposes, yes
btolsch: Our recommendation at this time is to use CBOR for serialization
mfoltzgoogle: Does CBOR have a notion of framing messages? Or do you just provide a stream?
btolsch: Forgot to mention that. CBOR is easier with streams as well. With Protocol Buffers, you have to process one message at a time.
… With CBOR, it's up to you, but you can use the number of fields to detect the end of the message
… With Protocol Buffer, the parser continues to read.
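The self-delimiting property can be demonstrated with a toy decoder: every CBOR item carries its own length, so concatenated items can be split apart without extra framing. This sketch handles only short unsigned integers and text strings:

```python
def decode_one(buf, pos=0):
    # Decodes one CBOR item (unsigned int or text string subset) and returns
    # (value, next_position): each item carries its own length, so a byte
    # stream of back-to-back items needs no additional framing.
    head = buf[pos]
    major, info = head >> 5, head & 0x1f
    if info < 24:
        n, pos = info, pos + 1
    elif info == 24:
        n, pos = buf[pos + 1], pos + 2
    else:
        raise ValueError("longer lengths omitted for brevity")
    if major == 0:
        return n, pos
    if major == 3:
        return buf[pos:pos + n].decode("utf-8"), pos + n
    raise ValueError("only uints and text strings in this sketch")

stream = b"\x18\x64" + b"\x62hi"   # 100 followed by "hi", back to back
v1, p = decode_one(stream, 0)
v2, p = decode_one(stream, p)
assert (v1, v2) == (100, "hi")
```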
mfoltzgoogle: This is related to the question of whether we use QUIC frames, or do that at the application protocol. It would seem possible to just use CBOR without having to define a specific framing.
… OK, there are a few things to work out, driven by transport decisions.
… The data we have suggests to use CBOR
<anssik> [ The tinycbor library was used in benchmarking. ]
cpn: Have you given consideration to future use cases with media streaming?
mfoltzgoogle: I'm not sure that we'd use CBOR for media streaming. Media streaming would be mapped onto different QUIC connections/streams.
… WebRTC tries to do this by allowing two or three different packet formats on a connection (RTP, RTCP, etc.)
Louay: So CBOR is not currently supported in browsers?
btolsch: I don't think there's an implementation in Chrome
mfoltzgoogle: It's probably tied to Web Packaging and I suspect there will be code landing in Chrome at some point.
… I can follow up with the Editor of the Web Packaging work to get more visibility.
<tomoyuki> [ Chromium seems to have implementation of CBOR for WebAuthn ]
<anssik> Web Packaging Format spec repo
tidoust: Work on Web Packaging happens in the WICG although it's a few IETF specs in practice.
mfoltzgoogle: Happy to put results of the benchmark in the GitHub issue
anssik: Yes, that's a good idea
PROPOSED: For the application protocol, use CBOR for serialization
Resolved: For the application protocol, use CBOR for serialization
mfoltzgoogle: I don't have a concrete technical proposal. We talk about client/server, one side initiating a connection.
… At the control protocol level, an important property is that it be symmetric. Either side could assume either role, regardless of who initiated the connection
… example of a TV showing broadcast content on an app, wants to start an experience on the companion device
… regardless of who discovered whom
… I want to enable this case from the open screen protocol
… technically, there needs to be advertisement of the roles each side want to play
… UX, one could act as the controller, give a prompt to initiate, receiver has a prompt to accept a connection
… let's not always assume that one side initiates the connection, it wants to be the controller
… looking for feedback or suggestions or other requirements
anssik: when we started this work we had the reverse idea of what is the companion screen
tidoust: we had the opposite use case, where we have something running on the device that wants to become a presentation
… do you envision changing the role during the presentation?
mfoltzgoogle: a more media-centric use case: the controller first launches a movie on the receiver, an actor appears on the screen, and the receiver wants to push info about the actor to the controller. it may want to launch a new piece of content on a controller or other nearby device
louay: in HbbTV, this part is not specified. the discovery of the companion screen is not as easy as discovery of the TV
… on mobile devices there are restrictions. e.g, on iOS you can't run services in the background
… to do this, we used push notifications. An HbbTV use case is advertisements, e.g. to buy the product seen on TV
… the protocol for this isn't specified, up to the manufacturer
… manufacturer remote control apps could use this
mfoltzgoogle: it would be useful to think about how we can use the Push API from a receiving context
… eg, a server push to a controlling web app
… we might be able to use mDNS to advertise that the receiver could assume the controller role, may be more compatible with mobile devices
louay: with this, this is a gap with HbbTV. how to discover the companion screen isn't specified
igarashi: is HbbTV companion screen deployed in the market?
louay: companion screen is a mandatory feature in HbbTV 2.0
[discussion of current state of HbbTV availability]
louay: HbbTV has an API for the TV to discover companion screens. can show the friendly name and launch an application
… could launch an HTML app in a web view or give more information about the content
… how the TV does the discovery isn't defined
mfoltzgoogle: do we want the mobile device to be supporting mDNS so it can respond. issues with power consumption
… mobile devices can already consume server push
igarashi: in japan, broadcasters think of TV first scenarios, TV discover the mobile
… more recently looking at mobile discovering the TV
… using a TV remote isn't easy, mobile device can have a better UI
louay: for the advertisement use case, the broadcaster wants to have more personalised ads
… is this a way to notify the user?
anssik: i can see there are use cases for both directions
cpn: My understanding of what Louay said about HbbTV is that protocols were left intentionally out of scope. Maybe a question as to whether people will be willing to converge on the Open Screen Protocol.
mfoltzgoogle: to move forward we need to understand the use cases well. would be helpful to have a few specific scenarios
… then we could ensure the protocols accommodate that use case
… could have an impact on advertisement, transport, and pairing, as well as the application protocol itself
[Coffee break with demos]
Mark demoes a photowall application using the Presentation API with attached displays in Chrome. The display name cannot be found on Windows without admin privileges, so Chrome simply enumerates displays in the pop-up window.
One question is whether this already works with Miracast devices on Windows. Francois will check when he's back at home.
Chris wonders about Picture-in-Picture spec, and whether this can extend to other types of content.
Presentation slides: HbbTV Companion Screen Sync
cpn: Same presentation as in TPAC last year. Won't go into details.
… HbbTV defines an application environment alongside broadcast channel.
… HbbTV v1 has pretty wide adoption across Europe (and elsewhere).
… v2 introduces the companion screen and so on, first deployments now.
… For the browser environment that we use for broadcast-related applications. Typically, user presses the Red button on her TV remote to launch the application.
… A lot of the capabilities are now being addressed by CTA WAVE, so we hope that future versions of the spec will just reference that work. Currently, that's not the case. Manufacturers include a version of some open source browser codebase, and extend it as needed.
… What HbbTV has also done is specify media playback of MPEG-DASH videos. A Type 1 player is when native code is responsible for playback; Type 3 is when the application does it through MSE. HbbTV mandates Type 1.
… More specifically on companion screen. From a broadcaster perspective, our interest is for a mobile device to discover and launch our services on the TV.
… The reverse is also possible, but that's more in a proprietary way for the moment.
… The companion feature is similar to the Presentation API: discovery and launch, then message exchange over a WebSocket connection. The receiving page opens a WebSocket connection to a local WebSocket server running on the TV. The controlling page does the same thing.
… That is what HbbTV calls the "App to App" communication.
… That is not the only feature that we have. We're also interested in the synchronization of content. A web application running on the TV can synchronize content across multiple devices. E.g. multiple camera views of a particular scene.
… Or content replacement in some cases: linear TV channel, and switch the content in a particular segment, e.g. for local news.
mfoltzgoogle: When you say content, do you mean linear content?
mfoltzgoogle: The synchronization is not necessarily about having other content then.
cpn: I'll come back to that in a moment.
… Example use case is to render a programme on the TV screen from our iPlayer mobile app.
… Also, simple interactive games such as a quiz that plays along with the TV show. Depending on what you're trying to do, there are different levels of synchronization that are required. Multiple camera angle is the most demanding.
… [Showing some details about the application lifecycle in HbbTV, broadcast-independent and broadcast-related applications]
… Broadcast-independent applications do not have access to the broadcast signal. Transition to a broadcast-related application is possible when application ID matches the one that comes with the broadcast signal.
… We only want to allow certain content providers to access the broadcast signal.
cpn: Right but it's more about being able to switch to another linear channel, as the HbbTV application is already rendered as an overlay on top of the rendered video.
anssik: Do you implement the programme guide as a companion app?
cpn: Yes, the iPlayer app has the programme. We typically use that to launch on-demand videos
… There are also PVR features that the app on the TV can access.
igarashi: ATSC 3.0 defines control of the broadcast signal from the companion screen. In Japan, there's something similar to the TV Control API that we tried to define in the W3C.
cpn: So, some additional feature that do not exist in HbbTV.
… [details about protocols]
… One of the important things from a Presentation API perspective is that there is a difference in lifecycle. We would want to allow browsers on the network to establish a communication with an app running on the TV.
… The application environment could be torn down for other reasons, e.g. when the user switches to another channel. I guess we have that already in the Presentation API.
mfoltzgoogle: For the latter, there is a way for the receiver to close a presentation connection. That use case is accommodated already.
… For the receiving page accepting connections, I don't think there's anything in the spec that prohibits it. But the controller has to know the URL and ID of the presentation that's running.
… It's up to the receiver to advertise that in the application control protocol.
cpn: So we'd have to pass that through some out-of-band mechanism.
Louay: Maybe similar to connecting to Wifi. Extend the protocol so that when a new controller detects a receiving page, maybe already connected controllers could be notified and asked whether they accept the connection or not.
mfoltzgoogle: If the receiver decides to publish the URL and ID of the presentation, it's up to the device to decide what it wants to do with that. The controller could use that to connect to the receiver. It may also advertise the URL of the controlling page that could be run on new controlling devices to take control of the presentation.
… We may need an API change to allow the receiver to tell the controller what URL to publish
… When the TV wants to launch a companion screen application on a device, does the user receive a notification?
cpn: Yes, the user agrees with launching the application.
… I don't think that there is a notion of session identifier in HbbTV.
Louay: [going into workflow details for connecting to an application on a TV]
mfoltzgoogle: So there's a hard-coded channel ID to connect.
Louay: Yes, there is no security embedded.
Stephan: But they can change the ID from day to day. Broadcast and application come from the same authority.
igarashi: This case is very similar to DIAL. Mobile device will launch TV application.
… It applies to second screen, not difficult.
… [talks about hbbtv: URLs]
mfoltzgoogle: For DIAL applications, we support a mode where application can fetch additional information from the running application and decide whether to rendez-vous with the application or to launch a new one.
Louay: Communication is not part of DIAL. App-to-App communication needs to be defined.
igarashi: Scenario will work with a specific HbbTV URL
mfoltzgoogle: Theoretically, if we did allow connections using Presentation API to HbbTV apps, we would either have to do some custom signaling to implement some custom protocol to discover running applications, or to pass the session ID that can be used to connect to the running application. Otherwise Presentation API will launch a new app.
… Presentation IDs are supposed to be unique and hard to guess.
Presentation slides: Open Screen Protocol / HbbTV Companion Screen Compatibility
cpn: Spec on companion screen has been stable since HbbTV 2.0 was published (2015)
… Device implementation of companion screen features. We expect this to be in products but it's not there yet.
… In terms of broadcaster services that make use of companion screen features, we are still in the prototyping stages.
… There's a chicken and egg problem that needs to be solved with all of this.
… We feel we have some nice compelling use cases
… What are our goals with regards to HbbTV within this group?
… Enable adoption by more devices. We like the Open Screen Protocol because it makes the ecosystem more open. More specifically, we, BBC, would like to enable broadcasters to open HbbTV broadcast-related web applications on compatible receivers.
… Also, perhaps more for device manufacturers, but we'd like to enable users to launch general web content on TV devices. Other views may be possible. E.g. Cast has a registration scheme.
mfoltzgoogle: Right, Cast is no different from other devices here, in that they want some control over what runs over the device.
cpn: Open questions are around capability requirements: TV devices may have multiple browser engines. All of this is device manufacturer specific. So one question is whether we can signal a desire to use HbbTV broadcast-related features through the presentation request, such that the receiver can then choose which browser engine it will select.
mfoltzgoogle: Do you have a specific example?
cpn: It's mainly the tuner to change the channel.
btolsch: Would it be sufficient to define a scheme for that?
cpn: There's an issue against the Presentation API about extending the presentation request to pass additional parameters. The conclusion there was that, instead of passing parameters, a different URL scheme would be better.
mfoltzgoogle: That is what we did for Cast, with cast: URLs. Parameters are passed within that string.
… If there are features that are specific to a particular environment, I think defining a URL format for that is one way to go.
tidoust: Would new schemes be supported out of the box by implementations of the Presentation API? Cast URLs switch to a different protocol. In this case, the protocol would be the same; it's just the environment created on the receiver that is different.
mfoltzgoogle: Right. No matter how it's done, the controller will have to parse the URL to determine whether it can work. hbbtv: URLs could be known as compatible with the Open Screen Protocol.
[More discussion about the scheme idea and the contents of the URL]
cpn: I think that there is some limitation on what an HbbTV URL may contain. It may not accept arbitrary parameters for the moment. I need to figure out the details.
Louay: We need to look at HbbTV when the Open Screen Protocol gets defined to see what's the best way.
mfoltzgoogle: The problem with using https URL that expect some specific environment is that it would be perfectly reasonable for a controller to launch the application in 1-UA mode, and the application won't work there since the environment is just a regular one.
cpn: Right, I'm suggesting that we just take the regular https URL and use an hbbtv scheme.
anssik: I think we're converging on something. I propose that you bake it further, Chris.
cpn: That sounds good.
igarashi: The goal of discussing HbbTV is not clear. You mention using the Open Screen Protocol. Is that a goal?
igarashi: From my perspective, I'd like to let the browser on the receiver become a presentation receiver for the Presentation API. More focused at the application level.
cpn: Browser support is essential. We want interoperability.
… The proposal that I have today is to develop one single protocol, instead of layering the Open Screen Protocol on top of the HbbTV Companion Screen and Synchronization protocols.
… The latter triggers needs that are not really in scope of what we're trying to achieve here.
… If the output of that work is an Open Screen Protocol that can support the Web API and that could also support the companion screen stuff, then these features could be developed possibly outside of W3C.
… This is probably a hard sell, as the TV companies put a lot of efforts in the current mechanism. But our selling point is that we need to address security.
… Addressing the security at the transport level is a must. HbbTV recognized that it was not well-equipped to address that on their own.
… The question becomes: what is the migration path from companion screen mechanism to Open Screen Protocol?
… Looking at the synchronization mechanism, timeline plus wall-clock synchronization can be used.
… Can groups extend the Open Screen Protocol to add these domain specific features?
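The timeline-plus-wall-clock idea mentioned above reduces to simple arithmetic; this sketch assumes devices already share a synchronized wall clock (as in DVB CSS) and a correlation pairing one wall-clock instant with one media-timeline instant:

```python
# Given a correlation (wall_ref, media_ref) and a playback rate, any device
# sharing the wall clock can compute where the media timeline should be now.
def expected_media_time(wall_now, wall_ref, media_ref, rate=1.0):
    return media_ref + rate * (wall_now - wall_ref)

# Correlation: at wall-clock 1000.0 s the content was at media time 42.0 s.
assert expected_media_time(1002.5, 1000.0, 42.0) == 44.5
```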
anssik: I think it's good to create protocols that are extensible. What needs to be done is to get the complete list of domain specific features so that we can determine whether it's feasible with the Open Screen Protocol.
mfoltzgoogle: I think that this is a good direction. The work done in this group is focusing on using modern protocols. There will always be a challenge supporting older protocols, especially if security is not enforced. Shipping support for non-secure WebSockets sounds very hard in that regard.
… For extensions, if some can be implemented as JS APIs on top of the existing APIs, that would be great.
anssik: Who would like to take an action to list domain specific features?
cpn: I can do that.
… With Louay's help.
… Note that I'd rank launching companion apps from the TV as low priority, because that is more vendor-specific for now.
mfoltzgoogle: Again, the intention is to make the control protocol as symmetric as possible.
cpn: Back to synchronization, note there's already a proposal in W3C called Timing Object if we want that at the Web level. That's future work, though.
cpn: How can W3C and HbbTV relate to each other?
… HbbTV runs as a project. Requirements about next version. HbbTV will not give a perspective about something that is not a scope of their current project.
… Sending liaison letters about the Presentation API won't work for now.
… HbbTV is going through requirements gathering now for a possible next version (circa 2021).
tidoust: Main issue we've had is that the timeline approach is not the same. W3C does development iteratively. HbbTV is more waterfall, with spec work done early on after requirements gathering, which makes it hard to integrate specs that are still under development.
cpn: We can still make some input to requirements.
anssik: Can you also take an action to inform this group about the requirements timeline for HbbTV?
Action: cpn to list domain specific features for adopting Open Screen Protocol in HbbTV
Action: cpn to inform the Second Screen WG about requirements timeline in HbbTV
cpn: HbbTV are mostly interested in adopting completed W3C specs, it has more impact to get in touch with HbbTV members individually than to HbbTV as a whole.
… CTA WAVE may be more influential.
igarashi: From a TV manufacturer perspective, we do not expect TV-specific browsers. We would prefer all features to be implemented in "regular" browsers.
anssik: The point is that, if it's a domain-specific feature, it's unfair to expect browsers to support that.
mfoltzgoogle: Generally speaking, we're looking for convergence across protocols and features. I would prefer that domain-specific features become part of browsers or that they can be implemented as JS on top of lower-level APIs.
tidoust: Pursuing the standardization of "domain-specific" features is a valuable effort that can be done in parallel and is, in any case, orthogonal to the effort we're trying to achieve here. We failed with the TV Control API, but we could try again.
igarashi: Note this is not restricted to HbbTV, also applies elsewhere.
cpn: That's a good point.
btolsch: we want a complete and independent library that goes from the network protocol to presentation API and remote playback API like interface
… we want services such as mDNS and QUIC to be replaceable
… [architecture diagram]
… at the bottom are the replaceable network services
… we also have the platform API (the porting layer) - not shown in the diagram
… [embedder API example]
… we've used observer interfaces to implement callbacks
… you can register a screen watcher, which the browser uses to find compatible devices
… also availability
… the observer receives a list of screen IDs, it could present those in the UI
… the StartPresentation function takes callbacks for success/failure and to receive the connection on success
mfoltzgoogle: you can see that the implementation details of the API are abstracted away
… most of these calls will result in messages in the protocol being generated
… the APIs we're showing are for providing a more straightforward implementation
anssik: could be useful to other browsers for their implementations
mfoltzgoogle: our goal is to make it independently reusable, not tied to Chrome
anssik: webrtc library is similar, shared between firefox and chrome
btolsch: we plan to expose the protocol connection directly to library users, to enable protocol extensions without modifying the library
… may not be part of the initial library though
mfoltzgoogle: when we've better defined the protocol extension points, we'll be in a better position to expose it
btolsch: [embedder API example]
… [platform api]
… easily transplanted between platforms, similar to Chromium base/ and WebRTC rtc_base/
… to include sockets and threading, etc
… not designed yet, these are just examples
mfoltzgoogle: we do expose some of the services running in the library, such as the mDNS responder, allowing the embedder to start/stop them
… some of the platform APIs will be used to observe state, know when to allow new connections, etc
… [get the source]
… we use some submodules for extra dependencies
… we build with gn, chromium's build tool
… there are instructions in the ReadMe
… not really looking for contributions right now, but will eventually
mfoltzgoogle: we're focused on API design now, waiting on outcomes from this meeting
… you'll need a chromium.org account to get access, happy to discuss how to collaborate
anssik: please let us know when it's ready for wider review
… by august, we'll have the platform APIs and control protocol.
anssik: would be great to have a demo at TPAC, to show to mozilla people
mfoltzgoogle: we think it should be possible to make a stand-alone demo as a command line app
anssik: good from a community engagement point of view
cpn: we'd like to use this to experiment with the extensibility points, such as the synchronisation stuff we've talked about for HbbTV
… message latency measurements
mfoltzgoogle: we want to allow embedders to extend
… we've been working on this since january/february
btolsch: mark and i are the main developers
anssik: so you're the main points of contact for feedback?
mfoltzgoogle: it's a bit too early to know what kinds of contributions we'd want
louay: could use the sender part of this library to experiment with HbbTV application launch
mfoltzgoogle: yes. the thing to be careful about is enforcing the security requirements (e.g, entering of passwords)
anssik: what's the license?
mfoltzgoogle: currently the chromium license, similar to webrtc
mfoltzgoogle: One of the requirements we have for the Presentation API is to set a language tag along messages. Question is how to encode this language tag on the wire? You can have short tags or long tags.
… It wasn't clear if there's a recommended length for language tags or if it should be left variable-length.
tomoyuki: At first, 64 characters were assumed. Now, with CBOR, the length can be variable.
mfoltzgoogle: Yes, I think variable-length is fine. Consistent with HTTP headers as well. Since CBOR can do variable-length strings, this is a non-issue.
tidoust: We can certainly ask the i18n group.
anssik: Tomoyuki, is this an issue that you would like to investigate further?
tomoyuki: I think it is not necessary now, since variable-length strings are possible
mfoltzgoogle: Right, I agree.
tomoyuki: Chinese, Japanese and Korean people have issues with font rendering. But in this case, we need to distinguish between regions and we only need 2 characters (such as "zh")
PROPOSED: Close #54 and use variable-length string for language tags
Resolved: Close #54 and use variable-length string for language tags
mfoltzgoogle: Some carry-over from TPAC with some actions that were not carried on.
… I'm happy to take the actions and add some pull requests to address them
… If other TV people have inputs there, that would be great
cpn: Device manufacturers do not always disclose support information
anssik: is Sample Device Specifications up to date with the latest thinking?
mfoltzgoogle: The level of details in that page may not be available publicly from device manufacturers of smart TVs.
… If it looks like we're shooting too high for our hardware requirements, we may want to target a different platform for our benchmarking.
… The Raspberry Pi is a typical device that we're using within Google. It would be useful to know where that sits in relation to devices that are available out there.
cpn: Igarashi-san, could you review the device hardware requirements that we have in this page to see if that roughly matches the hardware of available devices?
igarashi: As Chris said earlier, most manufacturers do not expose that level of detail.
… But some information is available publicly on the Internet.
anssik: Yes, we can look at these specs.
mfoltzgoogle: The Raspberry Pi 2 is a good device, but it is dual-core. If there are TVs that are single-core, there may be an impact on performance.
… Probably four vendors of media chipset, e.g. Marvell, Mediatek. It should be easy to look at specs. Some overlap with smartphones. ARM-based chipsets.
… I'll do some further research
cpn: I can help a bit to investigate
mfoltzgoogle: This came up a while ago when we developed the Presentation API spec, and it came up again from Web developers internally.
… Basically, when we specified how the Presentation API worked, we wanted it to be consistent whether it ran in 1-UA or 2-UA mode.
… It also preserves privacy.
… We specified a browsing context that worked a bit like an incognito or private browsing context.
… The profile lives as long as the presentation is running and as soon as the presentation is closed, the profile gets trashed.
… Talking about using the API with internal developers, we came upon a series of use cases where this scenario does not work very well, especially for the 1-UA mode.
… Suppose your presentation has resources that require a cookie; there's no simple way to get the cookie from the controller, so the user cannot authenticate on the receiver side.
… You can do things using OAuth2 and so on, but that's complicated.
… Second scenario is sites that have offline features. To replicate the application state, the application would have to read it all, send it over as messages to the receiver, which could take time.
… Third use case is WYSIWYG editing. Again, you need to capture the state, send it over as messages and reconstruct it on the receiver side.
… Finally, some cases where you want the user on the controller to use the "mouse" on the receiver. If you want that to happen, you have to track all events and pass that over as messages.
… It's hard to get right.
… Each one of these use cases can be solved, but taken together they show that there is an underlying issue.
… I'm proposing a new mode that looks more like window.open where the 2 documents have access to each other's document context. Same storage, you can copy part of the DOM across the controller and the receiver, etc.
… We don't need to return a presentation connection in that case, but rather a Window object. You would still call PresentationRequest.start but get a promise that resolves to the Window object of the presentation.
… Essentially, the idea is to add a flag to the PresentationRequest constructor to advertise that you need this special mode.
anssik: You can check with and without the flag as a fallback to switch to 2-UA mode.
… We would add an isLocal attribute to the PresentationRequest. How the implementation works is very similar to the 1-UA mode: the user would see a list of devices and select one.
… We would never do 2-UA for that.
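A hypothetical sketch of the proposed local mode. The constructor option name, the Window-like return value, and the stub class below are all illustrative; nothing here is specified. The stub only stands in for the browser API so the intended control flow can be read end to end.

```javascript
// Stub of PresentationRequest for non-browser environments, shaped after the
// proposal above (flag on the constructor; start() resolves to a Window-like
// object in local mode, a PresentationConnection-like object otherwise).
class PresentationRequest {
  constructor(url, options = {}) {
    this.url = url;
    this.isLocal = Boolean(options.local); // proposed flag (hypothetical name)
  }
  async start() {
    // Local mode: shared browsing context, so the controller gets direct
    // access to the presentation's document, storage, etc.
    return this.isLocal
      ? { kind: 'window', document: { title: '' } }
      : { kind: 'connection', send() {} };
  }
}

async function present(url, wantLocal) {
  const request = new PresentationRequest(url, { local: wantLocal });
  return request.start();
}
```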
… I'm happy to take feedback privately on that. The WG is not scoped to work on new features.
tidoust: But the CG is.
mfoltzgoogle: We should confirm that. The CG charter does not necessarily allow that incubation for now.
… Two options would be to adjust our charter or to use WICG process. It seems easier to do that in the CG.
Action: anssik to propose amendments to the CG charter to make incubation possible.
Anssi: I will propose some text so that we can work on it in the CG
mfoltzgoogle: OK, when that is done, we'll want to work internally to make sure that this addresses their use cases.
Louay: One question. Let's say I'm connected using 1-UA mode. In this case, you can still have the local mode. What is the difference between this mode and the 1-UA mode?
mfoltzgoogle: The main difference from a Web developer perspective is in the 1-UA mode, you start from a fresh browsing context. In the local mode, you share the context with the controller.
Louay: In the case of AirPlay, do we also need the 1-UA mode to support all the features that are supported in the local mode?
mfoltzgoogle: Mozilla's feedback at TPAC was that, if it's a better fit, then it may replace the 1-UA mode.
Louay: Is there a security issue in the 1-UA mode if we share cookies and the like?
mfoltzgoogle: There is a potential issue if the video stream is not transmitted securely, but that exists today.
… We trust the platform-provided displays are secure enough.
anssik: This is a "glorified" version of window.open!
tidoust: Note TAG's design principle document says that dictionary parameters are better than boolean (see TAG Client-side API Design Principles: 2.6. Prefer dictionary parameters over boolean parameters or other unreadable parameters).
mfoltzgoogle: When we launched the Open Screen Protocol effort, we said we'd look into requirements for the Remote Playback API. It would be useful to look into it if someone can help. Otherwise it will be a bit lower priority for us right now.
anssik: We probably want to end up with requirements that can be satisfied by Presentation API requirements.
mfoltzgoogle: Correct. I think many requirements are similar.
… Control messages will be different. For remote playback, we need a way to send media commands remotely. That's still work to be done if we want to provide a path for implementers.
… I know that there was a lot of discussions in the group about which commands are supported and which are not.
anssik: Maybe Apple will ship something sometime and you may hear feedback. I think Mounir and Anton did some research on support in different browsers.
anssik: What do you think about the latency requirements? In the Remote Playback API case, you would want commands to be reflected on the receiving side.
mfoltzgoogle: I think that's similar to the Presentation API, you can use Presentation API for media playback as well.
anssik: fastSeek may be a challenge
mfoltzgoogle: Right, knowing whether we need to support fastSeek would be a valuable thing to know.
anssik: We have 3 implementations of a similar feature: Safari, Edge, Chrome.
mfoltzgoogle: With a demo app that allows sending all the video commands, we could see how the implementations behave.
… We'd probably end up with play/pause, seek. And then some properties where there would be some variance.
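The command/request/response message types described earlier could carry these playback controls. Below is a hypothetical sketch of what such messages might look like; all field names are illustrative and not taken from any protocol draft.

```javascript
// Hypothetical remote-playback control messages, following the split between
// one-way commands (no response) and request/response pairs.
let nextRequestId = 0;

function makeSeekRequest(seconds) {
  // Seek expects a response confirming the new position.
  return { type: 'request', requestId: ++nextRequestId, command: 'seek', time: seconds };
}

function makePlayPauseCommand(paused) {
  // One-way command: tells the receiver to act, expects no response.
  return { type: 'command', command: paused ? 'pause' : 'play' };
}

function makeResponse(request, ok, currentTime) {
  return { type: 'response', requestId: request.requestId, ok, currentTime };
}
```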
tidoust: I believe plh had such a page with all video commands in the past. I'll try to find a pointer to it.
Action: tidoust to find link to video testing page that can be run to test media remoting capabilities for #3
mfoltzgoogle: We just discussed that. Knowing the feature set will help specify that.
[Conclusion is that #12 depends on #3. We need to figure out the requirements before]
cpn: I have some information on this. We've written a paper on the HbbTV work in particular. It's not about message latency; it's about synchronization of the timeline between devices. At the tightest level, we're talking about synchronization at the audio-sample level, which might be needed for e.g. stereo speakers.
… And then we have things where you want to synchronize to the video frames. Few tens of milliseconds.
… Alternate language is a use case there.
… From 20ms to 200ms: audio description alongside the video, and subtitles/captions.
… There's some detail in the paper. For lip sync, having audio arrive early is problematic; a bit late is OK.
… Another use case is providing contextual information about a piece of video content. 1-2s is kind of fine for that.
… I can link to the paper that we published.
… This is different from message latency though.
mfoltzgoogle: Right, messages would be used to synchronize the timeline between devices.
… Message latency does not seem crucial, here.
cpn: A basic implementation of the Remote Playback API would update currentTime when it receives the message from the receiver, but then you'd add the latency of the message delivery.
… In the remote playback case, do we envision as a use case playing multiple streams on the controller and the receiver and expect them to be closely synchronized?
Louay: For media timeline synchronization, the latency between the sender and receiver is not important. What's important is to know the difference between the clocks so that you can do some projection.
… The thing where we have some issue is when you have to rely on currentTime.
… Using playbackRate for the synchronization gives us precise control over the relative positions of two video elements.
[Discussion on the synchronization, the Timing Object proposal. In the case of the Remote Playback API, either the browser needs to correct the currentTime it receives from the receiver with the latency introduced by the network, or the application needs to know the latency introduced by the network so that it can correct the currentTime itself.]
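The currentTime correction just discussed amounts to a one-line adjustment. A minimal sketch, assuming the controller has measured the round-trip time and the receiver reports its currentTime (names are illustrative):

```javascript
// By the time the receiver's currentTime report reaches the controller, the
// receiver has played roughly half a round trip further. Correct for that,
// scaled by the playback rate.
function correctedCurrentTime(reportedTime, rttSeconds, playbackRate = 1) {
  return reportedTime + (rttSeconds / 2) * playbackRate;
}
```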
[Wall clock synchronization is not really affected by network latency but is needed to project the playback position across devices.]
tidoust: there could be an event stream in the media, we may want to send these back to the controller to trigger actions at the right time at the controller
… maybe such events should be reported before they're triggered
mfoltzgoogle: what's the difference between reporting and triggering?
tidoust: they'll be exposed as cues on the controlling side. if you want them to be triggered at the right time, you need to know about them slightly in advance
mfoltzgoogle: are the events tied to playback time or wall clock time?
tidoust: playback time
mfoltzgoogle: if the controller has a representation of the remote playback time, it could trigger at the right time
… extract the events, tie to a future playback time?
louay: i've done an experiment with 9 screens, can share some results
… a video wall using BBC content, using a video object on 9 screens
… each video shows the time of each frame
… one is a controller and the others are slaves
… [shows graph]
… average is zero frames difference. spike in difference when the user seeks, due to buffering
… most other times, the sync is frame accurate. we use playbackrate adjustment
igarashi: does it work on TV devices?
louay: only in HbbTV 2.0, but also in desktop devices
igarashi: could be ok in devices with a software decoder, but not with hardware decoder
louay: these results include chromecast devices
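The playbackRate adjustment Louay describes can be sketched as a small control loop that nudges the follower's rate until its position converges on the master's. The gain and clamp values below are illustrative, not taken from his implementation.

```javascript
// Return the playback rate to apply on a follower video until the next
// update, given the master's and follower's current positions (seconds).
function followerPlaybackRate(masterTime, followerTime, { gain = 0.5, maxDelta = 0.25 } = {}) {
  const error = masterTime - followerTime;          // positive: follower is behind
  const delta = Math.max(-maxDelta, Math.min(maxDelta, gain * error));
  return 1 + delta;                                  // clamped so playback stays watchable
}
```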
tidoust: in summary, we don't have a strong requirement for message latency. but there are implications on the protocol
mfoltzgoogle: also on the remote playback API to incorporate round trip time, to estimate the current playback time, either internally or exposed through a web API
… and eventing, to ensure events are triggered at the right time on the controller side
… do you have to accommodate wall clock skew between devices to implement the synchronisation?
louay: we measure specific time periods and the round trip times. all videos are connected to the same server. can measure difference between local and server wall clocks
… we use websockets, not UDP
louay: that's right
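Measuring the clock difference over a request/response exchange is the classic NTP-style calculation, matching Louay's point that the clock offset, not the one-way latency, is what matters. A sketch (timestamps in a common unit, e.g. milliseconds):

```javascript
// t0 = client send, t1 = server receive, t2 = server send, t3 = client receive.
// Offset of the server clock relative to the client, assuming symmetric
// network delay in both directions.
function clockOffset(t0, t1, t2, t3) {
  return ((t1 - t0) + (t2 - t3)) / 2;
}

// Round-trip delay, excluding the server's processing time.
function roundTripDelay(t0, t1, t2, t3) {
  return (t3 - t0) - (t2 - t1);
}
```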
igarashi: in terms of video decoding, you use the playbackrate only, not the decoder clock?
igarashi: in general, it's not possible to adjust the decoder clock
… or adjust the playback rate in hardware
cpn: I'm slightly reluctant to take an action on this. We've been involved in media synchronization in HbbTV. I can investigate internally to see what we can achieve in practice using your open source library. That's a bit bigger a piece of work than I can commit to right now, but in principle, I'd love to do that.
mfoltzgoogle: Most of the work we touched upon to support ICE would address most of that.
… To be added in the next major version of the specification.
anssik: OK, we'll keep the issue as-is, then.
… Moving on to enhancement issues.
mfoltzgoogle: Some of them we already touched upon.
mfoltzgoogle: First is to investigate measurement techniques for network power consumption on mobile devices.
anssik: For the 1-UA mode, impact on power consumption is obviously high.
mfoltzgoogle: Right, we know it's going to be a high-power consumption activity.
anssik: Focus on discovery, connection establishment, exchange of messages
mfoltzgoogle: The last two are probably the ones worth discussing a little bit.
mfoltzgoogle: The display control protocol is about exposing more display-specific commands, for instance CEC commands for HDMI displays.
Louay: I don't think it has come up in discussions in HbbTV. There's been discussions on TV connected through HDMI to the set-top box, where we lose all DVB signaling.
… But this is not related to this.
mfoltzgoogle: The main use case we come across is for remote control. To change the volume through the media element, you can only do attenuation. You need to turn up the volume on the device to make it louder.
… Having the ability to do that while remoting the media or presentation would be useful from a Cast perspective. We want to hear back from device manufacturers.
mfoltzgoogle: The other enhancement that might be worth looking at in the future is wake-on-lan.
… For devices that have switched to a low-power mode, it could be useful to have a way to wake the device up over the network.
Louay: I think this is already mentioned in newer versions of DIAL
mfoltzgoogle: Yes. Maybe something we want to add more explicit support for in the Open Screen Protocol.
anssik: If it's in DIAL, proven to work, then expectation of users will be better matched with this feature.
… So this could be v1.
Louay: Maybe related to HDMI as well. If the TV is off, launching an application on a Cast device wakes the TV.
mfoltzgoogle: Right, this is a bit different as it is not wake-on-lan; it's done through CEC commands.
anssik: I'll remove the "enhancement" label, then.
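For reference, the wake-on-lan "magic packet" mentioned above has a fixed, well-known layout: 6 bytes of 0xFF followed by the target MAC address repeated 16 times, typically sent as a UDP broadcast on port 9. Building the packet is protocol-defined; how the Open Screen Protocol would trigger or carry it is the open question. A sketch:

```javascript
// Build a 102-byte wake-on-lan magic packet for the given MAC address
// (colon- or hyphen-separated hex, e.g. "aa:bb:cc:dd:ee:ff").
function buildMagicPacket(mac) {
  const macBytes = Buffer.from(mac.split(/[:-]/).map(h => parseInt(h, 16)));
  if (macBytes.length !== 6) throw new Error('MAC must be 6 bytes');
  const packet = Buffer.alloc(6 + 16 * 6, 0xff);  // first 6 bytes are 0xFF
  for (let i = 0; i < 16; i++) macBytes.copy(packet, 6 + i * 6);
  return packet;
}
```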
mfoltzgoogle: About streams: generally, to support 1-UA mode and remote playback, we need media streaming. I was looking at how the WebRTC WG establishes media streaming sessions.
… The reason is that WebRTC has good browser support, so we would not need to invent a new media streaming protocol.
… One open question is what control do we get? Right now, it's behind the scenes from an application perspective. That's good because it allows the devices to negotiate the best parameters for them.
… That's an interesting area to explore in the future
Action: mfoltzgoogle to open an issue on how to map WebRTC on the control protocol for media streaming
anssik: Brandon, you had nice milestones for the implementation of the library.
… Is it reasonable to expect the spec timeline to look like the implementation timeline?
mfoltzgoogle: Based on the outcome of this meeting, I was hoping to take the resolutions into a consolidated Open Screen Protocol and move some of the older stuff to an archive folder, so that we can point out progress to other people.
… From the spec point of view, assembling things together in one place is our main objective by TPAC.
tidoust: what would the output be? e.g. draft specs?
… can others in the group help with the spec work?
mfoltzgoogle: i wasn't planning to start a formal spec yet, just consolidate the existing information
tidoust: i'm envisaging a mapping from the API to the protocol, without going into detail on the protocol itself
… such as a sequence diagram for each step: discovery, establishing transport, message exchange for each step in the API
mfoltzgoogle: i'd like to have something we can all refer to, so that sounds good
… the goal for editing a more cohesive spec is to bring others into the group
… hopefully we can get some interest before TPAC
… i don't expect to be able to show full interop at TPAC, but maybe interop between our own implementations
anssik: next F2F is in Lyon in October. I requested Thursday 25 and Friday 26. Also requested joint meeting with WebRTC, who meet Monday/Tuesday
… we could join the WebRTC meeting, also could have ad-hoc meetings during the week
mfoltzgoogle: would be good to get their feedback and collaboration
anssik: we could have a teleconference if needed. let us know if you want to do this, otherwise we continue via github
mfoltzgoogle: let's see how it goes
anssik: we could invite other participants. sangwhan said he'd like to be invited
anssik: any other technical matters?
… thanks to mark and brandon for preparing everything
… thanks to the whole group for participating
… and the scribes, francois and chris
… and our hosts, fraunhofer fokus
… it's been good to co-locate with other meetings such as MWS and DASH-IF