W3C

Second Screen WG F2F - Day 2/2

23 Sep 2016

See also: IRC log

Attendees

Present
Anton_Vayvod(Google), Mark_Foltz(Google), Anssi_Kostiainen(Intel), Shih-Chiang_Chien(Mozilla), Mounir_Lamouri(Google), Chris_Needham(BBC), Tomoyuki_Shimizu(KDDI), Louay_Bassbouss(Fraunhofer), Francois_Daoust(W3C), Hyojin_Song(LGE), Tatsuya_Igarashi(Sony), Takeshi_Kanai(Sony), Hiroki_Endo(NHK)
Chair
Anssi
Scribe
Mark, Francois, Chris_Needham

Contents

See also the minutes of day 1.


[Shih-Chiang demos a 1-UA implementation of the Presentation API in Firefox using a Chromecast device]

Shih-Chiang: Mirroring starts the presentation.
... Without a connection, mirroring to the Chromecast. The presentation is triggered via session.start().
... Patches are still landing. May be able to use on Firefox nightly soon.

Anton: Media Router is part of the device. The Remote Display API used to initiate is part of Play Services.

Second Screen WG re-chartering

-> 2016 charter proposal

-> Diff from previous charter

Anssi: Let's look at the Working Group charter.
... Presentation API final Recommendation is targeted for Q2 2017.

Francois: Keep changes to a minimum.

Francois: End date is October 2017. Scope hasn't changed.
... Dropped the term "Presentation" from the WG name. In practice no one uses it.
... The level 2 item allows the WG to work on a spec. Does not commit the group.

Mark_Foltz: Will Level 2 go back to first base (ED)? Francois/Anssi: Yes. If minor changes, can go quickly to FPWD.
... Remote Playback API was not in the first charter. Now it's in a separate document.
... Expected completion of Q3 2017 may be optimistic. However implementations already exist.
... Updated dependencies, working group names.
... Licensing had custom text. The rule that allows us to publish with software + document license is now the default.
... Another change is that we are going to switch the date. Implementations are focusing on the 1-UA mode. We can relax the exit criteria for Candidate Recommendation if necessary.

Anssi: For 2-UA mode, protocols are being incubated. May delay until protocols are mature.

Francois: End of next week, this goes to AC review, which takes 4-5 weeks.

Mark_Foltz: What about changes that need to be upstreamed?
... Like sandboxing flag.

Francois: Needs test coverage. Will be upstreamed into HTML 5.2.
... Prefer to remove it from our spec to avoid duplicate content.

Anssi: Normative reference policy, enforced in principle. Make it easier to publish specs for normative referencing.

Remote Playback API Issue #41

-> Guidance for HTMLMediaElement, HTMLAudioElement, HTMLVideoElement behaviors during remoting (#41)

Anton: Looking at whether we need to spec how the Media Element API works in a connected state.
... Philip J suggested we list what we have.
... Listed all properties. Grouped into areas.
... If too restrictive: there could be restrictions on seeking in the original element, but then they may not comply with the media element spec.
... Want to be more specific about what must be supported. Don't want to copy the entire media element spec into the Remote Playback API.
... The question is how to avoid enumerating everything in the Remote Playback spec.

Francois: MediaStream copied the element spec and changed behavior of features.
... Lists all attributes that are specific to media streams.

Mark: Enumerate behaviors that must be different for all implementations.
... And behaviors that may be difficult to implement. Change MUST to SHOULD or MAY.
... Capture a diff from the default media element spec.

Anton: Want a table of what should behave differently.
... And of what might not work. Francois: Not normative; application developers should not expect this to work.

Anssi: Unless there is a good use case, don't extend semantics.
... (for MediaError)
... Could flex the definition of a network error.

Anton: Changing states will be the hardest. Do we want to spec that in some way?

Francois: Changing state transitions is hard. As a rule of thumb can't change algorithms.

Anssi: Minimize patches to existing spec. Think about the long term plan to upstream these changes.
... Custom elements are slowly getting upstreamed into HTML. Could get pushback if we keep patching.
... In upstream spec, define extension points for changing behaviors.

The group discusses high-level media features: seeking (and fast seeking), and how playbackRate could work or break in a remote context. Mounir suggests just leaving some features as "best effort". The problem is that this may break content, e.g. apps that use playbackRate to synchronize content.

Mounir suggests that all the operations that apply to the remote playback should be sent to the remote side and, conversely, that all the operations and changes happening on the remote side be sent to the local side.

Mark points out that it may be difficult to get remote players to support all features properly. Hence the need for SHOULD in some cases. For example, Philipp mentioned WebVTT as a challenge. Going from MUST to SHOULD is problematic though, since that changes the expected behavior.

Proposed plan is to find the features that are challenging to implement, and for those that need to switch to a SHOULD, figure out how the changes will impact developers.

Mark_Foltz: For features that change from MUST to SHOULD, we should define the expected behavior if the feature is not implemented.

[End of WG meeting / Start of CG meeting]

Renewed Second Screen CG

-> Second Screen CG charter

Anssi: The CG is 3 years old and transitioned its work to the WG, but improved interoperability for the Presentation API presupposes agreement on underlying protocols. The CG re-chartered recently to work on that.

Mark: Initially, the CG was chartered with protocols out of scope, to work on features first. That poses a challenge for interoperability though. Cross-vendor support is hard to achieve without agreement on standard protocols.
... Now that the work on the Presentation API is stable, we decided to re-charter the CG to work on protocols.
... Goals are to 1/ enable interoperability, 2/ encourage implementations of the Presentation API thanks to the existence of a protocol, 3/ establish complementary specs for the Presentation API.
... In particular, discuss network details.
... This CG is for incubation. We'll want to transition the deliverables to the appropriate venue. In parallel, we may want to do some implementation work to provide feedback.
... The goal is to move to a standards-track somewhere, whether that means W3C or IETF is an open question.
... Looking at the scope of work, the scope was initially larger, we pruned some of the features and that's good to better focus it. Other interesting use cases could be explored independently.
... We're focusing on the common scenario: control of a second screen in the LAN. We also want to make sure that we can implement all of the Presentation API and the Remote Playback API. The former one is mostly stable now, the latter is still work in progress but we're starting to see needs there.
... The priority focus is on the 2-UA mode. But the 1-UA mode is interesting too, hopefully the discovery and connection establishment can be the same in both cases, with control message communications replaced by WebRTC-like offer exchanges and stream passing.
... We'll want to consider extension points for the protocols.
... I think we want to leverage existing standards as much as practical. We should consider the work we're doing as an application of these standards.
... We're not going to define video codecs, work on the 1-UA mode right away, or discuss MSE/EME support.
... We're also going to leave non-IPv4/IPv6 protocols out of scope to start with. And we're not going to develop specs that use proprietary protocols for the Presentation API.
... Deliverables: 1/ discovery and communication establishment. 2/ control channel for the Presentation API. 3/ control channel for the Remote Playback API.
... If there is interest in expanding the scope, participants can request to change the charter.

Anssi: Note we may publish non-normative reports as well, such as use cases, requirements and other notes.

Francois: Two main differences with WGs. License agreement: in a WG, participants agree that the final spec is licensed under a royalty-free patent policy.

... In a CG, participants agree to license their contributions under a royalty-free patent policy.
... Companies are encouraged to license the final spec, but it is not mandatory.
... A CG can publish specs as it wants. No AC review, no draft/report process, no intermediate steps.
... CGs are open to all and free to join; non-W3C-members are free to join too.

Anssi: I'm happy to chair this CG. There's a process to change the chair if needed.

Tatsuya_Igarashi: Do we assume that the local network is always connected to the Internet?

Mark: I don't think the charter specifies one way or the other.
... My opinion is that the Web should work in "disconnected" networks in principle. It would be a nice property to keep it this way. But it may not be entirely feasible.
... So I'd say it is in scope but I don't think we can guarantee that this will be supported at this stage.

Anssi: Goal of today is to gather initial inputs, and discuss who could be interested to lead some of this work.

Mark: I have some thoughts about protocols and potential options. I want to take a broader approach here.

Anssi: We won't rubberstamp anything here.

Mark: Right. For different areas, we'll want to identify who might want to investigate them.

Anssi: Are we expecting new participants from the implementers' side?

Mark: Maybe, some expertise (security, privacy, etc.) might be useful.

Shih-Chiang: I need to double-check with other Mozilla people to see if this is on our priority lists.

Anssi: OK, let's put ideas on the table and see how we split the work, as well as who could lead each part.

WebAPI/PresentationAPI protocol draft

-> Mozilla's WebAPI/Presentation API protocol

Shih-Chiang: To start with, we need some way to discover and establish the communication. Then we also need some way to share capabilities so we can figure out what the device can do, and typically whether the device can fulfil the presentation request in the queue.
... That's the first step. Once discovered and selected by the user, the next step is to launch the application. This requires passing the URL around as well as the presentation session information. We need to support control messages to connect, disconnect, close, and terminate presentations.
... We also need some way to pass some user settings to the receiver device, such as the user's locale.
... After application launch, using control messages we defined, we need to define what protocol gets used for the communication channel between the controlling page and the receiving page.
... In parallel, we need to enforce security measures: device authentication, data encryption and data integrity.
... These are the basic requirements I have in mind.
... The rest of the Wiki page is proposed protocols to address these requirements.

Anssi: Do these requirements match your thoughts?

Mark: It's pretty close. Two comments I would have: capabilities are not exposed through the API, so that may not be a strong requirement. It's fine as an extension.
... It also depends on the methods by which the user agent might want to determine the availability for a given URL.
... The second comment is that I think the security requirements depend on type and number of network transports in use between devices and that's not determined yet.
... Otherwise, that seems good.
... I have similar requirements in my presentation.

Francois: For the communication channel, both "one communication channel per presentation session" and "one communication channel that multiplexes presentation exchanges" are possible, right?

Mark: Yes, that's two possible options.

Open Screen Protocol

-> Mark's Open Screen Protocol requirements and proposed options

[Mark presents slides]

Mark: Starting with requirements. We need discovery of receivers on a shared LAN.
... Some authentication mechanism and reliable communication channel is needed.
... Some non-functional aspects include usability.
... Also, we should be careful that we're not leaking information that would compromise privacy.
... A common use case is mobile devices, both sides may be constrained in terms of memory and battery life. That may have implications on protocol operations.
... We want a broad range of implementations. It would be interesting to target a platform or device that might be difficult to implement this on, like a Chromecast device or a Raspberry Pi.
... It would be good to understand the limitations of such devices.
... Finally, we need to have ways to extend the protocols to add new features (remote playback, Presentation API v2, etc.) and retain backward compatibility.
... In particular, crypto protocols evolve over time; some way to update cipher suites will be needed.
... That's all for requirements. 4 main parts: discovery, transport, application protocol and security.

Open Screen Protocol - Discovery

Mark: For discovery, the work we've done in Chrome is that we've used 2 different methods. One was to use SSDP for DIAL support.
... Controller continuously transmits multicast queries via UDP and the receiver that is interested responds to the queries.
... Our experience is that it is very simple to implement. Open a socket, 3 queries per 2 minutes and that's pretty reliable.
... Downsides: it requires continuous polling. Caching rules are unclear. There is an extension for IPv6 but it's not an official part of DIAL.
... Where things fall down is when we need to fetch a document; there are proxy issues.
... Finally, XML parsing requires separate sandboxing in Chrome for security.
... The second method we use is based on mDNS/DNS-SD (aka Bonjour)
... The receiver publishes DNS records for discovery on the .local domain.
... Our experience with this method is that listeners are built into three common platforms including iOS and Android.
... It is on standards track and defines caching rules and supports IPv6.
... We have found some issues when the listener is not supported on the platform; reliability suffers then.
... We think this is due to firewall software that conflicts over the port used by mDNS.
... For battery, it depends a bit on how much caching you use. Finally, the client may think that the receiver is still there when it has actually gone away.
... Flushing the cache may be possible.
... In general, we consider DNS-SD the preferred option, due in particular to its standards status.
... We think we should propose a service, instance name, a TXT record format.
... We should also gather data on reliability.

Shih-Chiang: mDNS/DNS-SD can also support discovery over TCP, not only UDP, so it is more reliable than SSDP.

Mark: Another area we'd like to investigate is around ports and multicasting on port 5353.

Shih-Chiang: Only one port for exchanging messages, so only one platform listener in theory.

Mark: Can be worked around in practice.
... One action item is to gather reliability data from our implementation so that we can get a baseline on how many packets we exchange.
... I'll see how I can share this information.
... With DIAL, most of my concerns are around XML.
... Fetching an XML document has a number of issues, because it goes through the usual HTTP stack, possibly through proxies. Second, parsing XML is generally considered unsafe.
... We may be able to remove dependency on the XML parsing.

Louay: Why do you need polling for SSDP?

Mark: The DIAL implementation of SSDP is query only, but it's true that SSDP is not, so we may want to consider it in isolation.

Shih-Chiang: The issue I have with SSDP vs. DNS-SD is that different implementations differ in the XML they produce and don't always follow the right format.

Louay: You may not need an XML description in SSDP. SSDP is used in TV devices, which could be receiver devices.

[Lunch break]

<Louay> Note: SSDP proposal for Physical Web

Open Screen Protocol - Transport

Mark presents transport requirements.

Mark: Some sort of flow control would be useful. A more optional requirement would be some compression mechanism.
... The trade-off is with CPU/battery usage.
... Whether we want to multiplex communication channels is an open question. There are pros and cons to both approaches.
... You want to be extra careful about identifying connections if you multiplex. It also makes the application protocol a bit more complex.
... One thing we found in our implementation is that, since it may be difficult to tell whether a device is still connected, some keep-alive mechanism is needed. TCP keep-alive does not quite seem to work.
... Finally, some non-functional requirement is that we don't want both ends to be in a disconnected state for too long when they re-establish connections.
... We'll want to minimize the latency there.

Shih-Chiang: Compression and latency are less of a problem in a local network environment. I would not take them as mandatory requirements.

Mark: Right, if we don't multiplex, we need to go through handshakes for each connection.

Shih-Chiang: What about prioritization of messages on the control channel? For instance, keep-alive messages might have higher priority.

Mark: That's a good point.
... I don't have a specific transport protocol in mind at this time. I listed 5 options, and suggest we investigate further.
... One option is to use TCP + HTTP/2, since HTTP/2 allows for bi-directional protocols. It can handle all of the things we need for the API. It has headers, supports compression and other requirements. I do not know about prioritization though.

Shih-Chiang: It might have a fairness mechanism.

Mark: The one that is most similar is UDP + QUIC. Google is primarily developing that protocol. QUIC is designed to minimize the number of round-trip exchanges, so it has minimal latency.
... It will also support new versions of TLS.
... It would be good to understand the pros and cons of each of them.
... Another option is to use plain TCP + WebSockets. We could reuse a lot of that for the application messaging because the API is very similar.
... It may not support multiplexing out of the box though.
... Another option is UDP + SCTP as done for RTCDataChannel.
... I'm not deeply familiar with that approach.
... I'm also not familiar with how you can establish an RTCDataChannel outside of WebRTC, but I suppose that's easy.
... Finally, we could do TCP + Custom protocol. I think that can be an option if none of the other options look reasonable enough.
... I'm hopeful one of the other protocols will be suitable.
... Note HTTP/2 and QUIC are relatively new protocols.
... If possible, it would be interesting to do some benchmarking.
... Some non-functional requirements are that the application-level protocol should be easy to implement on top of the transport protocol, and security should be easy to address as well. The standards need to be mature enough, with implementations available and easy to integrate.

Anssi: Can we filter out some options based on existing data?

Mark: We publish some expectations about how responsive applications should be, which places some restrictions on what these protocols can be.
... We try to be efficient in the code so that network operations are not much shorter than their visible outcome for the application developer

Tatsuya_Igarashi: What about media transport?

Mark: This is not included here. I would think it will be what WebRTC uses for media transport.

Tatsuya_Igarashi: Do you think another protocol should be defined for media playback?

Mark: In the 2-UA "media flinging" case, the media is not streamed, the connection is used to exchange state information. Media streaming is currently out of scope for the CG.

Francois: We have a signaling channel and a communication channel. Does it make sense to envision different transport protocols for both channels?

Mark: It depends on how much multiplexing we end up doing.
... Things should be compatible with the WebRTC case in the end.

Louay: I can study TCP + WebSockets. Also interested in discovery.

Anssi: You did some background study on interoperability, I think.

Louay: Yes
... Thinking about HbbTV case, where they chose DIAL for discovery, WebSocket for communication. They also have synchronization endpoints, which is not part of our discussions here.

Mark: QUIC is basically a way to multiplex multiple reliable streams over UDP.
... It's optimized for HTTP, to fetch resources in parallel.
... Normally, over TCP, if a packet gets dropped, you have to request the packet another time. With UDP, it's easier to reconstruct the missing packet afterwards.

Shih-Chiang: Another advantage is having better TCP-like features in user space. It is easier to share libraries to update QUIC support instead of relying on platform updates.

Mark: Yes. It would be good to understand the roadmap.

Open Screen Protocol - Application protocol

Mark: I think there's a lot of freedom here to do things differently
... Looking into it, something a little more webby and HTTP-oriented felt like the right fit.
... My proposal is very REST-like.
... If we use HTTP/2, then it makes sense. We may want to adjust if we chose a different transport.
... To find more information about a receiver, you will issue a GET on the receiver to get its name as well as on-going presentations.
... We don't have a feature in the Presentation API right now to tell whether a presentation is public or private.
... That could be something to add at the protocol level then in v2.
... I'm using JSON here, that can be discussed. I'm not a fan of XML.

Shih-Chiang: Some of this information can be baked into e.g. the DNS record during the discovery phase.
... It could make sense to collect that information immediately to speed up screen detection.

Mark: There's a layering approach. If you're connecting to the device through other means, then you may need that information. I'm trying to design the protocol independent of the discovery protocol. That does not prevent optimizations.
... To start a presentation, we could send a POST request that gives you a path to the presentation and connection resources.
... A terminate would be a DELETE, a reconnect a POST request on the presentation resource, and a close another DELETE on the connection resource.
... These requests are all coming from the controller at this point.
... For messaging, we need to use bi-directional features of the transport protocol.
... We need to figure out what headers to send, what caching could happen, etc.

Francois: REST "purity" would use PUT to create resources (returned in a Location HTTP header), although in our case, we create both a presentation resource and a connection resource.

Mark: OK.
... For messaging, you would send a POST request to send messages. To listen for messages, we would use the subscription mechanism of the transport protocol, which would simulate a GET request.

Shih-Chiang: OK, we would still need to frame each of the messages.

Mark: If we choose a transport that is not based on HTTP, I think we will want to design our own command protocol that mimics HTTP verbs.

Shih-Chiang: Because the control message is bi-directional, I was not thinking about HTTP.

Mark: We should evaluate bi-directional communication in HTTP/2 to decide whether it's a good fit.
... This proposal would be a good fit for HTTP/2 or QUIC, but it needs to be revisited if we choose a different transport protocol.
... One of the things that made DIAL challenging is that it uses HTTP/1 which does not have bi-directional support.
... Now we can find a transport protocol that addresses both client-server and bi-directional needs.

Open Screen Protocol - Security

Mark: Stepping back on security aspects, the focus in my current thinking is that it's more important to authenticate the receiving UA to the controlling UA than vice versa.
... The presentation must be made aware that there is no guarantee and that it needs to apply application logic to validate the authentication of the application that connects to it.
... As such, I'm putting app to app authentication, as well as authentication of the controlling UA to the receiving UA out of scope.

Shih-Chiang: The certificate may not need to be signed by a trusted third-party. In our current experimentation, we use anonymous authentication exchanges to make sure the certificate is from the other peer.

Mark: So you do mutual authentication?

Shih-Chiang: Yes.

Mark: That's a useful property to have. If the handshake has mutual authentication property, that's fine. In my Chromecast case, authentication gets based on my Google ID.
... Of course, one goal is to maintain confidentiality of data exchanged between peers.
... Since we're targeting 2-UA, we're not worried about authentication issues with media streams.

Tatsuya_Igarashi: How is the authentication established?

Mark: The user will have to be involved. At least, in the proposal I have, the user is involved. There may be ways to bypass the user. That seems more vendor-specific though.

The group discusses the impossibility for apps to authenticate the app running on the other end if non secure origins get used. This links back to previous discussions on restricting the Presentation API to secure origins.

In some environments, such as HbbTV, the communication channel is currently non-secure. Is it OK for communications between non-secure origin apps? Probably not! The group will prefer secure protocols no matter what.

Mark: Continuing with authentication considerations. Private/public key approaches look good. No passwords!
... My current reading of the Web security community is that they are trying to invent the next version of TLS, which is TLS 1.3, and to get rid of crypto algorithms that are no longer secure.
... I believe it is still in the standardisation process.

Shih-Chiang: I'm following progress. The editor of TLS 1.3 is from Mozilla and he keeps updating the draft. I haven't seen a clear consensus on the spec so far. I can check internally about status.

Mark: Thanks. Based on discussions both at Google and in other places, there are some possible solutions that try to leverage public DNS to solve private LAN authentication, and that seems to complicate things and leak private information.
... My preference is to avoid this.
... Two possible approaches: a key-based approach where the receiver has a long-lived public/private keypair which can be used to sign short lived certificates. The controller will have to check that the long-lived public key belongs to that device.
... The user would authorize the receiver on first use and verify the hostname. The public key hash would be displayed with a friendly name.
... Some open questions e.g. on how to revoke keys, etc.

Shih-Chiang: The approach we have right now is to exchange certificates that are self-signed. We do carry the hash of the certificate in the DNS TXT record.
... When we receive the certificate, we can check whether that certificate received through the anonymous procedure matches the one that is registered in the DNS TXT record.
... If it matches, then we assume it's good to use.
... The assumption is that the DNS TXT record for the fingerprint of the certificate is correct.

Mark: That probably is not an assumption that would make it through our security folks.
... We need something that the user needs to verify. It can be a PIN code, or a hash.
... The second approach that I've been thinking about is to ask the user to enter a short code. PAKE then gets used to generate a shared secret.

Shih-Chiang: After the authentication is passed, you will be able to encrypt with a shared secret key. We exchange the certificate through that channel for later connections.
... I will share some documentation about our design with PAKE procedures.

Mark: From a usability point of view, a one-time pairing is important.
... Under certain circumstances, e.g. change of device friendly name, change of location, we may need to re-pair.
... I think we're pretty aligned on the overall security procedures.
... I'm going to review this with security people at Google.

Wrap-up session

Anssi: I created a new GitHub Open Screen Protocol repository for the CG.
... Let's create markdown files in the GitHub repo, one per area
... The four areas: discovery, transport, protocol, security
... Looking for contributors

Mark: I'll try to get some performance data
... Can we create a meta-issue for each work item?

Chris: I am also interested in contributing

Anssi: We're on track with testing
... We may produce another CR and adjust the exit criteria, but let's see
... The protocol work may motivate some V2 features

Mark: We want to allow documents to announce themselves as presentations

Anssi: The Presentation API is in good shape, thanks to contributions
... With the Remote Playback API, the guidance for HTMLMediaElement behaviours is the most important issue
... We have a plan for next steps
... We didn't discuss wide review for the working draft, should be towards the end of this year
... There are some changes to milestones to make in the WG charter
... The CG kick-off was successful, we have a good list of requirements, identified potential technologies that we need to evaluate
... That work will take a while, there's background work to do
... I don't want to set timelines for this

Mark: I can do some of the work but will also need to consult internally
... Our priority is to get the Presentation API to CR
... If we create separate documents with our findings, will we then make a more formal document? We'll need an editor

Anssi: The people doing the work are good candidates to edit
... [some discussion of the relative merits of ReSpec and Bikeshed]
... Next F2F meeting? We'll meet again at TPAC

Francois: It's in Burlingame, 6-10 November 2017

Anssi: We don't need to decide now, but maybe around May?

Louay: We could offer to host it, as part of the Media Web Symposium

<mfoltzgoogle> Mid-April to mid-June is a good window

<Louay> Fraunhofer FOKUS Media Web Symposium, May 16-18, 2017, Berlin

Francois: We should also check Anton's availability

Anssi: Let's decide where and when to meet early next year
... This has been a great meeting, we're making good progress, we've had good discussions
... Thanks for coming

[ adjourned ]

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.144 (CVS log)
$Date: 2016/09/26 10:48:07 $