Anssi: Welcome! This is a Working Group (WG) plus Community Group (CG) meeting, but the WG has delivered more or less what it was supposed to, so focus will be more on the CG part. Mark and Brandon prepared an agenda with a number of issues to go through.
… Those topics were touched during our last F2F at TPAC last year.
… [going through the agenda]
anssik: Working for Intel. Chairing the Second Screen CG/WG. Also the Devices and Sensors WG. Working with our Chromium contributors. We want to write specs and work on implementations.
mfoltzgoogle: Work on Chrome, for Google. 3 years in the group. Editor of the Presentation API. My team, including Brandon and others, have been implementing this API in Chrome, liaising with other teams to make the API better for them.
btolsch: Working for Google. First time in W3C. Implementing the specs, integration with Cast.
Louay: Working for Fraunhofer FOKUS. Been in the group since day 1. Also doing multi-screen more generically, and other streaming use cases, such as 360 videos.
hyojin: Representing LGE. Involved in various groups at W3C. LG devices use Chromium, we don't have cast-enabled function for now, so interested in the open screen protocol
igarashi: Work for Sony. AC representative. Also involved in other interactive TV SDOs, including Hybridcast, ATSC 3.0.
… Not a member of the group, but interested to understand how this activity relates to companion screen features in other standards and whether they can align.
… This kind of API could be useful for broadcast standards. Also, personally, I was involved in home network technology such as DLNA. Involved in the HTTPS in Local Network CG.
… Securing the local network is a missing technology for now.
cpn_: Work for BBC. Look at new technologies for audio and videos. New formats, new experiences. I co-chair the Media & Entertainment Interest Group.
… We have interest from two points of view here: for our iplayer to launch video to a cast device, but also in aligning with HbbTV.
Francois: I work for W3C as the Media & Entertainment champion, looking at what we do and could be doing at W3C
stepsteg: Work for Fraunhofer FOKUS, like Louay. Happy to host the meeting of the group here!
anssik: [going through topics]. For discovery, we discussed 3 different methods last time, suggesting both mDNS and SSDP be supported.
mfoltzgoogle: Mostly, what I have prepared is data about our dual discovery implementation in Chrome and use that data to inform the discussion
anssik: are people happy to collect data from implementations and base design decisions on that data?
anssik: For transport, we raised QUIC last time
mfoltzgoogle: Yes, we agreed to take a deeper dive at what it would take to use the QUIC data channel.
… To support that on top of WebRTC-based transport. Direct UDP, or using ICE. We've looked at the different scenarios.
… Also looked at leveraging the QUIC protocol to support the application command protocol that we need for the API.
… TLS1.3 is still being worked upon. But it's close to being interoperable, I believe.
cpn_: I think there was some recent announcement at IETF about TLS 1.3
… I can take the issue that was assigned to Shih-Chiang
mfoltzgoogle: I'll talk a bit about the bootstrapping mechanism.
anssik: First one for today is transport security and authentication. I think you have a proposal and Tomoyuki has some slides about J-PAKE.
mfoltzgoogle: Have some background material. Part of the proposal is to use J-PAKE indeed. Then we can talk about next steps.
cpn_: Is there something we can report back from the HTTPS in Local Network CG that could inform the discussion?
anssik: Right, Igarashi-san and Tomoyuki can jump in discussions as needed.
mfoltzgoogle: [presenting slides]
… People here should be somewhat familiar with the history of the API, but I'll go quickly through it. We went through agenda bashing, we'll go into a bit more of details in technical aspects.
… On day 2, Brandon will lead the discussion on the control protocol. We have some data to share for that serialization and recommendation on how to move forward. Also HbbTV/ATSC compatibility.
… Also, my team is working on an open source implementation of the Open Screen Protocol, discussion on next steps.
… Starting with some background material
… [Showing timeline]. Started in Nov 2013, I think.
anssik: Yes, it started in a breakout session at TPAC there. Some browser vendors showed interest at the time, and we quickly started incubation
mfoltzgoogle: The API was incubated in the CG, and then transitioned to a WG. At that point, the CG no longer had much to do, until it appeared clearly that interoperability was a major issue to look at.
… CG rechartered to work on interoperability.
anssik: We should credit the TAG. They submitted their feedback that they wanted to see interoperability at the protocol layer. So we did our homework, and here we are. Mozilla was also strongly pushing for that.
mfoltzgoogle: Making progress on the Open Screen Protocol since then. The WG was rechartered, and progress on the APIs is more tied up to progress on the Open Screen Protocol.
… To publish the Presentation API as a Rec, we'll need to demonstrate interoperability based on the Open Screen Protocol across browsers and devices (see Presentation API Candidate Recommendation exit criteria).
mfoltzgoogle: At a high level, for the Presentation API, a controlling page (in a browser) wants to present content on a second screen (connected or not to the first one).
… The page requests presentation. The browser lists compatible displays and prompts the user to choose one. If successful, the browser creates a communication channel with the receiver, and loads the presentation there.
… The controlling page can then exchange with the receiving page.
igarashi: What happens when one end closes the connection?
mfoltzgoogle: Both ends can close the connection at any time. This does not kill the page on either end.
hyojin: Are multiple connections possible?
mfoltzgoogle: Yes. Two tabs on the same controlling browser can connect to the same presentation.
… Or different pages running on different devices, potentially.
Louay: Also, one controller can launch multiple presentations.
mfoltzgoogle: Correct. With the same URL, if the user chooses a different device for the second request, this creates a separate connection.
… I don't think we have sample code for one controller to multiple devices case.
… To implement this API, the controlling user-agent can do two different things: in the 2-UA mode, the browser sends the URL of the presentation to the receiving device, which renders it.
… In this case, the user-agent rendering the controlling page is distinct from the user-agent rendering the receiving page.
… They only share commands through the application channel
… The other scenario is when the devices are connected by wire/wireless media
… In that case, the controlling browser also renders the receiving page locally, and passes the rendered content to the receiving display.
… It can be wired displays, we recently added support for that in Chrome: if you connect a second display, it can be the target of the Presentation API.
anssik: Useful to point out that this is not in scope of the Open Screen Protocol.
mfoltzgoogle: The second API that we developed is the Remote Playback API, where a controlling page can request playback of a media element to a second screen.
… Media commands get forwarded to the remote playback device. The concepts are similar to the Presentation API, but you can only exchange media commands.
anssik: It's interesting to note that all other browsers implement that feature on their chrome for now: Edge, Apple, Firefox for Android.
Louay: Any requirement for synchronization? The currentTime needs to be synchronized with the current time in the receiving side.
mfoltzgoogle: We have not yet defined the requirements for the Open Screen Protocol for the Remote Playback API. Latency should be one of them.
anssik: Suggest that you mention that in the related issue, Louay!
mfoltzgoogle: I can show some demos over coffee break.
… We have shipped support for these APIs. Presentation controller (2-UA) shipped in May 2016. We also shipped support for cloud browser to launch a presentation on another instance of Chrome.
… The Presentation receiver part (1-UA) was launched in June 2017. It allows sending a presentation to a Cast device.
mfoltzgoogle: Remote Playback also shipped on Android in February 2017. On desktop, we don't have full support for the API right now, but we're shipping a remote media feature soon, so hopefully we'll have full support later on.
anssik: If people have contacts at Mozilla, it would be good to have their explicit feedback on implementation of the APIs.
… The Presentation API is not currently mentioned on Edge Status.
… As for Apple, I think that they are very interested in the Remote Playback API.
igarashi: Will Android applications support that protocol?
mfoltzgoogle: Chrome tries to provide the same functionality across platforms. In this case, there are differences between desktop and mobile platforms that make it more challenging, so I cannot really state concrete plans here.
igarashi: Android Chromecast application on mobile devices. Will they support Open Screen Protocol?
mfoltzgoogle: I can't really say. Part of the implementation is in Chrome, part is at the platform level.
igarashi: If the Android OS platform supports the Open Screen Protocol, it would be easy to enable it for all kinds of applications.
mfoltzgoogle: Right, the lower the level at which it is supported, the more widely it can be used. I just cannot speak more concretely as it touches on various teams.
mfoltzgoogle: CG rechartering was done to address main feedback from the TAG around interoperability. Also in scope is extension ability for future use cases.
igarashi: Last TPAC, I asked whether the API would support connection to existing devices. Web application is already running on the TV, and user connects to the page running on the TV.
mfoltzgoogle: We have an open issue in the Open Screen Protocol for the case where the presentation display discovers a controller.
igarashi: The application may be already running on the device.
anssik: Let's discuss this tomorrow during the HbbTV topic.
mfoltzgoogle: I don't think there's anything in the protocol that prevents that, but we may need to adjust the API for that use case.
mfoltzgoogle: Out of scope of the CG: media codecs. Also streaming use cases (we may want to address that later on). Also network traversal, guest mode, but we should still think about it.
… And finally direct interoperability with proprietary protocols (DLNA, Google Cast, etc.), but we'll still see how to enable interoperability with some of them.
igarashi: Audio playback is also in scope of Remote Playback API?
mfoltzgoogle: Yes, Remote Playback API applies to both. Things like Web Audio are out of scope for now.
… Applies to HTMLMediaElement.
… I don't think we've looked into remote playback when backgrounded.
… I believe that, as a general rule, whatever the user agent decides to do should be reflected on the remote side.
[discussion on what happens to media when a tab switches to background and whether we should say something about that in the Remote Playback API. For instance, for local playback, user agents typically stop rendering of the video, but not playback of audio. Probably no requirement in either direction, it should be left up to implementations. Also see related note in Remote Playback API specification.]
anssik: Igarashi-san, please open an issue on GitHub to describe your use case.
mfoltzgoogle: Essentially, what we're trying to achieve is to enable second screen experiences across devices and browsers.
… At a higher level, the web developer should not have to worry about internals of the discovery and of the remote device, and just be able to use it.
… We're targeting a wide variety of devices and platforms, including HDMI dongles.
… [going through list of functional requirements]
… The channel for messages is a reliable and in-order channel, sort of TCP-like abstraction.
… We want to do the same thing for the Remote Playback API but haven't looked at detailed requirements for now.
[discussion on Remote Playback API requirements, a polyfill of the Remote Playback API could be done on top of the Presentation API]
mfoltzgoogle: Also some non-functional requirements: usability, protect user privacy from other users on the network, resource efficiency. A mobile device may not have a lot of battery, memory.
… Generally, goal is to keep the implementation complexity down.
… The Open Screen Protocol is a stack of 4 levels: Discovery. Then Authentication to verify the identity of the discovered device. Then Transport to create a communication channel with the discovered device. And then Application Protocol to exchange messages over the transport.
… I will go into details for all of these.
… Among possible choices, we've done a deeper dive on mDNS and SSDP for discovery, TLS 1.3 (via QUIC) and J-PAKE for authentication, QUIC DataChannel for Transport, and we have semantics for the Application Protocol but need to focus on serialization, possibly with CBOR and Protocol Messages.
… There's also some investigation on using WebSockets for Transport, in which case discovery would probably be based on DIAL, and Application Protocol would use JSON. To be discussed tomorrow.
… For each of these things that we're going to dive on, we look at data we have around performance, and we collect data. We haven't done a lot of data collection yet, most of the data comes from real implementation of the protocols for now.
… I have a few Raspberry Pis on my desk now to investigate exactly how much network traffic is received. Raspberry Pi devices are a good proxy for a low-end device where we might want to have the protocols run.
… What we've done this far is that we listed requirements for the Presentation API, some evaluations on mDNS, SSDP/DIAL, QUIC, RTCDataChannel, more recently J-PAKE. Also some benchmarking and the control protocol.
… Major items remaining:
… - which discovery mechanisms to require (we have some proposal)
… - QUIC DataChannel
… - For the control protocol, consensus on serialization
… - For authentication mechanisms, integrate J-PAKE and support PKI-based authentication
anssik: Thanks, that's a good summary!
mfoltzgoogle: start with some background. we've looked at 2 mechanisms so far
… not too much detail on DIAL, similar to SSDP
… we have feedback from our Chrome implementation on how it discovers Chromecast devices
… we've reimplemented all our discovery mechanisms from JS in another process to native code in the browser
… this has improved reliability
… the data we have now reflects the underlying protocols and not weird implementation details
… shipped around Q4 last year
… the key requirement is for two openscreen devices on the same network to be able to discover each other
… this could be controllers discovering receivers, but could be the other way round, with negotiation over roles
… the goal of discovery is to publish enough data to bootstrap the transport
… could include a friendly name, capabilities, device model name
… as the information is broadcast on the network, there's an issue of privacy
… there may be other data needed as we refine the protocol stack
… we want to be responsive to when devices are added or removed, 30 second requirement
… we want it to be power efficient, also as the number of devices increases we don't want huge increases in network traffic
… advertising network services opens the possibility for malicious devices
… mDNS has two parts to it: the multicast DNS part and the DNS service discovery, but they're mostly discussed together
… there's a listener that multicasts a DNS query with a specific query type in the local domain
… it sets DNS flags for this kind of query, uses a well known port
… services listen for this query and return a list of DNS records that specify an instance of that service
… friendly name, protocol name, e.g., for an openscreen QUIC it could be openscreen.quic
… the SRV record advertises a port where the service is available, PTR record, the TXT allows additional data
… when records are returned, the listener can cache them. there's a time to live
… implementations are supposed to requery to refresh the cache when nearing the end of the TTL
… if a device leaves the network, it can advertise records with zero TTL, which should flush caches
… if a device is unplugged it may not have the opportunity to do that
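The caching behaviour described above can be sketched as follows. This is a toy listener-side cache, not a real mDNS implementation: the class and method names are hypothetical, and the 80% refresh point is an assumption chosen for illustration. It shows the three cases from the discussion: a zero-TTL record flushes the entry, an entry nearing the end of its TTL is due for a requery, and an entry whose TTL has elapsed is dropped even if the device never said goodbye.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CachedRecord:
    name: str           # service instance name, e.g. a friendly name
    ttl: float          # time to live in seconds, as advertised
    received_at: float  # local timestamp when the record arrived


class RecordCache:
    """Toy listener-side cache (hypothetical, for illustration only)."""

    REFRESH_FRACTION = 0.8  # assumption: requery once 80% of the TTL has passed

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._records: Dict[str, CachedRecord] = {}

    def on_record(self, name: str, ttl: float) -> None:
        if ttl == 0:
            # Zero TTL: the device announced that it is leaving, flush it now.
            self._records.pop(name, None)
        else:
            self._records[name] = CachedRecord(name, ttl, self._clock())

    def expire(self) -> None:
        # Covers unplugged devices that never sent a zero-TTL record.
        now = self._clock()
        self._records = {n: r for n, r in self._records.items()
                         if now - r.received_at < r.ttl}

    def due_for_refresh(self) -> List[str]:
        # Entries that are close enough to expiry to warrant a new query.
        now = self._clock()
        return [n for n, r in self._records.items()
                if now - r.received_at >= r.ttl * self.REFRESH_FRACTION]

    def names(self) -> List[str]:
        return sorted(self._records)
```

Passing the clock in as a parameter keeps the sketch testable without waiting for real TTLs to elapse.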
anssik: port conflicts?
mfoltzgoogle: for platforms that don't provide their own listener implementation, there could be multiple running
… if the port is opened in exclusive mode, they won't be able to send queries
… SSDP is an alternative to mDNS. it's similar in some ways, different in others
… UPnP. devices like routers use this to advertise services. here, we're not using it to advertise services, just for discovery
… the root device advertises services, the control point wants to find services to control
… when the root device comes online it advertises using headers, sent to a well known multicast port that control points can listen on
… there's a UUID that the control point can use to identify a device, caching information. i don't believe there's a cache invalidation part to SSDP
… so you have to continually do multicast to discover that devices are no longer available
… for our purposes, we'll probably add some specific metadata for OpenScreen
… such as a friendly name, which would have to be base64 encoded as SSDP doesn't specify a character encoding
… you can add custom headers
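A minimal sketch of carrying a friendly name in a custom SSDP header, assuming the name is UTF-8 text that gets base64 encoded as suggested above. The header name is hypothetical; the discussion does not define one.

```python
import base64

# Hypothetical header name, used only for this illustration.
FRIENDLY_NAME_HEADER = "FRIENDLY-NAME.openscreen.org"


def encode_friendly_name(name: str) -> str:
    """Encode a UTF-8 friendly name so it is safe in an ASCII header value."""
    return base64.b64encode(name.encode("utf-8")).decode("ascii")


def decode_friendly_name(value: str) -> str:
    """Reverse of encode_friendly_name; rejects non-base64 input."""
    return base64.b64decode(value, validate=True).decode("utf-8")


def friendly_name_header(name: str) -> str:
    """Format a complete header line (without the trailing CRLF)."""
    return f"{FRIENDLY_NAME_HEADER}: {encode_friendly_name(name)}"
```

For example, `friendly_name_header("TV")` yields `FRIENDLY-NAME.openscreen.org: VFY=`, and names with non-ASCII characters round-trip through the same pair of functions.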
louay: you could advertise the URL of a device manifest where you put the name
francois: could the name be used in other contexts?
mfoltzgoogle: no, it's custom to our protocol
… control point could send a target search request, then devices respond unicast to the control point that sent the query
louay: when the device sends the search request, it can specify a time window, sending a multicast message could cause replies from many devices
… this may have implications for responsiveness in device discovery
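The search exchange discussed here can be sketched as below. The message layout follows the usual SSDP M-SEARCH form; the search target string is hypothetical, and the MX value of 2 is only illustrative. The second function shows why the time window affects responsiveness: each device waits a random delay within the window before replying, which spreads out the unicast responses.

```python
import random

SSDP_ADDR = "239.255.255.250"  # well-known SSDP multicast address
SSDP_PORT = 1900               # well-known SSDP port
# Hypothetical search target for Open Screen receivers.
SEARCH_TARGET = "urn:example-openscreen:service:receiver:1"


def build_msearch(search_target: str = SEARCH_TARGET, mx: int = 2) -> bytes:
    """Build the multicast search request sent by the control point."""
    lines = [
        "M-SEARCH * HTTP/1.1",
        f"HOST: {SSDP_ADDR}:{SSDP_PORT}",
        'MAN: "ssdp:discover"',
        f"MX: {mx}",             # maximum wait, in seconds, before devices reply
        f"ST: {search_target}",
        "",
        "",                      # blank line terminates the header block
    ]
    return "\r\n".join(lines).encode("ascii")


def response_delay(mx: int, rng: random.Random) -> float:
    """Delay a device waits before its unicast reply, spread over the MX window."""
    return rng.uniform(0, mx)
```

A larger MX reduces bursts of replies on busy networks, at the cost of a longer worst-case wait before the controller has the full list of devices.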
mfoltzgoogle: for plain SSDP, devices can advertise disconnection
… respond by removing the device from caches
… DIAL discovery just uses the search request and response part of SSDP, then requests an XML document
… last time we looked at it, we decided we don't need an XML document for OpenScreen
… all of these things face challenges when deployed in the field
… we've supported mDNS and DIAL in Chrome for some time
… reimplemented to fix some issues
… number one complaint is that the browser can't find their cast device or TV
… on certain OSs, users can configure firewalls, Windows has settings that can block multicast
anssik: what do the default configurations do?
mfoltzgoogle: the defaults seem to work fine
… but if you set up your LAN as a non-trusted network, less so
btolsch: the default behaviour on windows is to show a prompt on first attempt to access a multicast port
… in Chrome we try to delay that until there's a specific user action that needs it
anssik: do you know the user denied access?
btolsch: no, we don't know. we could still probe the firewall
anssik: what does the prompt look like?
btolsch: it's not like the UAC prompt, it's a window that opens that you can ignore
… it's a strange prompt compared to others the user may be used to, so they may say no. another reason why we delay the prompt
anssik: you're looking at windows and Mac OS
mfoltzgoogle: on Mac OS, signed software is generally allowed access, but there are settings that could block access
mfoltzgoogle: peer to peer communication may not be allowed on a guest network
… on Windows, if there are multiple applications that want to bind to the port, that could prevent us from binding for multicast
… problematic with mDNS, which uses a fixed port
… Chrome has an enterprise policy to disable all cast functionality, concerns over network traffic
… they could have more restrictive firewalls blocking multicast
… we don't always get enough feedback on the causes of problems
anssik: there's a wide range of devices with different capabilities. how can you capture say 80% of those devices?
mfoltzgoogle: we don't try to test against a wide variety of devices. we have to test on a case by case basis in our lab
… given this context, what Chrome does to maximize chance of discovery, we advertise the same UUID through both mechanisms
… if we discover one through SSDP, we don't get a port number. may not be a cast device
… could be another generic DIAL device
… we have data on how this performs to share
… 69% is found first through mDNS, then through both mechanisms
… 10% is reconnection attempts to devices previously found
… 8% is mDNS only, 8% is DIAL only, 3% is DIAL first, then mDNS
… 0.32% is by looking at our network cache
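The categories in this breakdown follow from matching the same UUID across both mechanisms. A minimal sketch of that bookkeeping, assuming each sighting carries a UUID and the mechanism that produced it; the type and label names are hypothetical and this is not Chrome's implementation.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Sighting:
    uuid: str       # identifier advertised through both mechanisms
    mechanism: str  # "mDNS" or "DIAL"


def classify(sightings: List[Sighting]) -> Dict[str, str]:
    """Label each device by which mechanism found it, in arrival order."""
    order: Dict[str, List[str]] = {}
    for s in sightings:
        seen = order.setdefault(s.uuid, [])
        if s.mechanism not in seen:
            seen.append(s.mechanism)  # ignore repeat sightings
    labels: Dict[str, str] = {}
    for uuid, seen in order.items():
        if len(seen) == 1:
            labels[uuid] = f"{seen[0]} only"
        else:
            labels[uuid] = f"{seen[0]} first, then {seen[1]}"
    return labels
```

Feeding it sightings in arrival order gives labels such as "mDNS first, then DIAL" or "DIAL only", which can then be counted per platform.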
mfoltzgoogle: on Mac, there's a bigger difference
… the majority we find through both
btolsch: what was the data volume for Mac, does that cause the variance?
mfoltzgoogle: i'd have to check
… for ChromeOS, we see the majority of devices can be discovered through both
… how many would you find if you find 100 by dual discovery?
… in general mDNS is more likely to find a given device. 96% seems to be a ceiling, there's about 5% failure due to network issues
… failure rate on Windows is 10% for both mDNS and DIAL
… adding DIAL increases reliability by 5-10%
… what i recommend for our implementation is to focus on mDNS as the mandatory mechanism. it's more reliable across the board, and has better platform support
… eg, as it's built in on Mac
… we should specify SSDP as an alternative, but not make it a core protocol
… we could evaluate other mechanisms such as NFC and QR codes
anssik: this seems reasonable based on the data
… we previously discussed dual-mode, but then decided to make a data driven decision
… comments on the recommendation?
tidoust: about the data, what are the devices you tested? can you identify specific devices that are problematic?
mfoltzgoogle: the best information we have is the client platform. we don't collect a lot of detailed data
… when users give us feedback, we gather additional data
btolsch: this data is only really for Chromecast devices
mfoltzgoogle: we also do TVs, but we don't have data on those
anssik: the recommendation prioritises mDNS
mfoltzgoogle: mDNS has parameters such as number of retries that can be played with
cpn_: One comment is that HbbTV only uses SSDP (DIAL). To be revisited when we discuss that tomorrow.
mfoltzgoogle: Right, this recommendation is only for the modern stack of the protocol.
louay: discussed with chris yesterday, HbbTV are looking to adopt existing W3C specs. we want to focus on the openscreen protocol
… i think this is a good proposal to have mDNS. if HbbTV adopt OpenScreen, we still have SSDP as the alternative
anssik: so we can make a resolution
PROPOSED: make mDNS mandatory for controllers and receivers. Make SSDP a non-mandatory alternative
Resolved: make mDNS mandatory for controllers and receivers. Make SSDP a non-mandatory alternative
mfoltzgoogle: we have some open GitHub issues
… related to discovery, in addition to this resolution, starting with #81 [SSDP] Update implementation information
louay: I just submitted a pull request on implementations. it adds libupnp and peer-ssdp.
francois: any more to add?
anssik: that can close the issue
… when mark adds the MX value
mfoltzgoogle: the MX is 2
anssik: next is the amplification attack issue: #57 - [SSDP] Update proposed use of SSDP to specifically prevent SSDP amplification attacks
mfoltzgoogle: some router firewalls may prevent this. i believe the correct thing to do is ensure only local IP targets
… we've previously added info on security mitigations for SSDP
[mfoltzgoogle self-assigned #57 and will propose some text]
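One way to read "only local IP targets" is that a device answers a search request only when the source address is private or link-local. The sketch below illustrates that reading with Python's standard `ipaddress` module; the function name is hypothetical and the text to be proposed for #57 may differ.

```python
import ipaddress


def is_local_target(address: str) -> bool:
    """Return True if a reply to this address would stay on a local network."""
    ip = ipaddress.ip_address(address)
    # Private ranges (e.g. 10.0.0.0/8, 192.168.0.0/16) and link-local addresses.
    return ip.is_private or ip.is_link_local
```

A responder applying this check would silently drop search requests whose source address is publicly routable, so it cannot be used to direct amplified traffic at an outside victim.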
anssik: next issue is prefiltering: #21 [SSDP] Investigate mechanisms to pre-filter devices by Presentation URL
mfoltzgoogle: i wanted to get feedback on how useful this would be. we haven't figured out how to apply this to mDNS yet
mfoltzgoogle: we could adopt a simpler version for mDNS or omit it entirely and require all the checks to be done over the transport protocol
… i see value in limiting the number of connections, to see if it improves UX and performance of the protocol
louay: there are two aspects. one is performance. could be faster to do discovery, but we're sharing all the URLs during discovery. privacy issue
… at the app protocol level, can do caching, it keeps discovery simple, just share IP address, port, friendly name, maybe protocol version to allow for future versions
mfoltzgoogle: we have an open item for that, particularly for the connection establishment stage
mfoltzgoogle: for the TXT record, i would want to have room to advertise protocol version, capabilities, public key
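A rough sketch of how such data could be packed into a DNS-SD TXT record, where each entry is a `key=value` string preceded by a one-byte length. The key names used in the usage note are hypothetical; nothing here is an agreed Open Screen format. The size check matters because every entry is limited to 255 bytes and the whole response has to fit in a UDP packet.

```python
from typing import Dict

MAX_TXT_STRING = 255  # each TXT string carries a one-byte length prefix


def encode_txt(entries: Dict[str, str]) -> bytes:
    """Encode key/value pairs as length-prefixed "key=value" strings."""
    out = bytearray()
    for key, value in entries.items():
        item = f"{key}={value}".encode("utf-8")
        if len(item) > MAX_TXT_STRING:
            raise ValueError(f"TXT entry for {key!r} exceeds 255 bytes")
        out.append(len(item))  # length byte
        out += item            # followed by the entry itself
    return bytes(out)
```

For example, `encode_txt({"ver": "1", "caps": "video"})` produces two short entries, while a long public key would more plausibly be advertised as a fingerprint to stay within the limit.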
anssik: there's a TAG finding on data minimisation (see Data Minimization in Web APIs TAG Finding), also applies to protocols
… we may get push-back during privacy review if we disclose too much
louay: also an issue, UDP packets have a limited size. what happens if we have 10 URLs?
mfoltzgoogle: if we want to advertise receiver URL schemes, make this a general capability advertisement
louay: also relates to HbbTV, want to launch HbbTV application
mfoltzgoogle: if they have a specific URL scheme, otherwise require the controller to ask for a specific URL when they want to determine compatibility
… it's more privacy preserving, simplifies discovery. potential downside is requiring user to pair the device, e.g., J-PAKE. could be a usability reason for doing this in future
anssik: what are the other use cases where this could be an issue?
mfoltzgoogle: go to your friend's house, not used before
tidoust: are only HTTPS urls supported?
… in SSDP, the proposal is to have a protocols header that can be used to advertise supported URL schemes
… it's a list of URL schemes other than HTTPS, so you can't advertise that you don't support HTTPS
mfoltzgoogle: if we need to support cast only and/or hbbtv-only, we could make it a full list
… this may be a moot point, as we're discussing not having that data at all
cpn_: Anything related to discovery of media capabilities?
mfoltzgoogle: Not for now.
cpn_: There may be a number of reasons why a presentation request may be denied. From a UX perspective, it would be better to show the device as discovered, but not compatible.
btolsch: You're asking about list of discovered devices vs. list of compatible devices.
mfoltzgoogle: That's definitely a good point. In the end of the day, it's probably a UI decision for the browser.
… The bigger question is, when it comes to the authentication piece, if we require a transport before we detect compatibility, that would require the user-agent to go through that process before it can show useful information to the user.
anssik: how to resolve this issue?
PROPOSED: have a protocols header that can be used to advertise supported URL schemes
mfoltzgoogle: i'm fine with the schemes-only header. the main issue we'd have is how it fits with data minimization
anssik: even the URL schemes discloses something
mfoltzgoogle: we could call this out in the reviews
anssik: is it a reasonable compromise for receivers to advertise supported URL schemes?
louay: the application knows how many receivers I have (not exposed to the web app)
mfoltzgoogle: if the receiver must advertise the schemes for the controller to make a decision, the receiver has no control over whether to advertise or not
… the most conservative thing is to move scheme advertisement to the control protocol
PROPOSED: For SSDP, make advertisement of supported schemes truly optional, in that controllers will still connect to the receiving device in the absence of the PROTOCOLS info. We'll adopt a similar mechanism for mDNS.
Resolved: For SSDP, make advertisement of supported schemes truly optional, in that controllers will still connect to the receiving device in the absence of the PROTOCOLS info. We'll adopt a similar mechanism for mDNS.
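The controller-side consequence of this resolution can be sketched as a small decision function. It assumes, as discussed earlier in the session, that the advertised list names schemes other than HTTPS; the function name and the example URLs are hypothetical.

```python
from typing import Optional, Sequence
from urllib.parse import urlsplit


def may_connect(presentation_url: str,
                advertised: Optional[Sequence[str]]) -> bool:
    """Decide whether a controller should try a receiver for this URL."""
    scheme = urlsplit(presentation_url).scheme.lower()
    if advertised is None:
        # No PROTOCOLS info: still connect, and check compatibility
        # over the control protocol instead.
        return True
    if scheme == "https":
        # HTTPS is implied; the list only names additional schemes.
        return True
    return scheme in {s.lower() for s in advertised}
```

With this logic, a receiver that omits the header is never filtered out during discovery, and a receiver that advertises `["hbbtv"]` is skipped only for URLs with some other non-HTTPS scheme.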
mfoltzgoogle: To set the background, transport is primarily about establishing network channel between 2 devices that can be used to send protocol commands (launch a new presentation), media commands and application commands
… At our last F2F, we said we'd go deeper on QUIC.
… I'll talk a bit about that, also about bootstrapping it.
… We'll look at some of the challenges. E.g. ORTC API.
… Then GitHub issues, proposals and next step.
… Starting with an overview of QUIC: transport protocol designed primarily to allow greater parallelism between a given pair.
… QUIC ensures that one stream won't block another.
… You can use it for message based (fixed-length exchanges) scenarios or streaming.
… The most fully specified way to do authentication is based on TLS 1.3
… It also supports different congestion control. QUIC enables a different mechanism that tries to maximize the bandwidth per protocol.
… You can do a 0-RTT session resumption, meaning you can resume and exchange messages in the same packet.
igarashi: Encryption of the payload, is it defined in the spec?
mfoltzgoogle: Key-exchange and encryption is defined in TLS 1.3.
… In theory, you can roll out your own authentication, but I don't think there has been any other attempt than just using TLS 1.3
… QUIC has a 2-layer definition. Top layer is a QUIC connection: encrypted channel between a controller and a receiver.
… In principle, the spec allows multiple connections to use the same IP and port.
… Every QUIC packet has a connection ID and each point uses that to distinguish the connection.
… Whether port sharing is or is not in v1 seems to remain an open issue, there's a long discussion right now on GitHub (see #714 [QUIC] Multiple connections on the same port).
… Might be important for us.
igarashi: Network congestion control is separate per connection, or per port sharing set of connections. Flow control.
mfoltzgoogle: That's a good question. Believe network congestion is per port sharing set of connections.
… Within a connection, the exposed API allows sending data on a stream. The receiver receives the bits in order. If there are multiple streams, the QUIC protocol will try to share the bandwidth equally.
… There are different ways to map our control protocol to QUIC connections, and there are pros and cons to each of them. Don't have a strong preference for now.
… Both protocols allow multiple connections between a controller and a receiver.
… We could use stream IDs to label individual messages sent between end points. But then, how do you order these commands?
… Similarly, for each presentation connection, we could establish a separate QUIC connection.
… Streams could be used to frame the messages between the end points and we wouldn't need a framing algorithm on top of it.
… One disadvantage is that each connection requires a separate crypto handshake, so additional setup time to start with.
… The alternative is to use a single QUIC connection for all presentations between controlling and receiving ends.
… QUIC stream for each channel command/response.
… The advantage here is that we don't have to deal with different connections, but each side has to do some book keeping on streams to assign them correctly.
… Some feedback from QUIC folks would be useful to decide which of these 2 approaches is preferable.
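The bookkeeping needed for the single-connection approach can be sketched as a table from stream IDs to presentation connections. The class is hypothetical and the sequential stream numbering is an assumption made for readability; a real QUIC stack allocates stream IDs according to the QUIC specification.

```python
from typing import Dict, List, Optional


class StreamBook:
    """Tracks which presentation connection each QUIC stream belongs to."""

    def __init__(self) -> None:
        self._next_stream_id = 0
        self._owner: Dict[int, str] = {}

    def open_stream(self, connection_id: str) -> int:
        # Hypothetical allocation; real stream IDs come from the QUIC stack.
        stream_id = self._next_stream_id
        self._next_stream_id += 1
        self._owner[stream_id] = connection_id
        return stream_id

    def owner_of(self, stream_id: int) -> Optional[str]:
        # Used when data arrives to route it to the right presentation connection.
        return self._owner.get(stream_id)

    def close_stream(self, stream_id: int) -> None:
        self._owner.pop(stream_id, None)

    def streams_for(self, connection_id: str) -> List[int]:
        return sorted(s for s, c in self._owner.items() if c == connection_id)
```

With one QUIC connection per presentation connection, this table disappears, at the cost of the extra handshakes mentioned above.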
… Congestion control: good article on blog.apnic.net. Congestion control is about figuring out the rate at which it may send packets without having to resend packets.
… CUBIC and Reno have been used in the past. They ramp up until packets get lost and start again from a lower level. BBR is a new method with a different approach. Initial data suggests that it works well.
… It's interesting and useful: if you have multiple devices that are streaming in your home.
… We might have some data to report on that based on our tab mirroring later on.
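The "ramp up until packets get lost and start again from a lower level" behaviour can be sketched with a simplified additive-increase, multiplicative-decrease loop. This is a toy model and not the actual Reno or CUBIC algorithm; `capacity` is a hypothetical stand-in for the point at which the path starts dropping packets, and BBR's different approach is not modelled.

```python
from typing import List


def aimd_trace(capacity: int, rounds: int, start: int = 1) -> List[int]:
    """Return the congestion window used in each round of a toy AIMD sender."""
    window = start
    trace = []
    for _ in range(rounds):
        trace.append(window)
        if window > capacity:
            # Loss detected: restart from a lower level (halve the window).
            window = max(1, window // 2)
        else:
            # No loss: keep ramping up by one packet per round.
            window += 1
    return trace


print(aimd_trace(capacity=4, rounds=8))  # [1, 2, 3, 4, 5, 2, 3, 4]
```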
… About the handshake, the client introduces some preliminary information. Then key exchange, with certificates.
… Not an expert in TLS, so you'll have to refer to the spec for details.
… Part of the outcome of that handshake is a hash of approval that the handshake was successful.
… If the server caches that and the client includes that hash later on, you don't have to redo another round-trip to resume a session later.
… This mechanism is a bit more sensitive to replay attacks though.
… QUIC DataChannel. QUIC is UDP-based. At some point last year, the WebRTC WG looked at combining QUIC with the mechanism to agree on an IP address and port between peers.
… That mechanism is called ICE. So instead of sending UDP packets directly, ICE gets used, which typically uses a third-party server to detect configuration parameters.
… I'll briefly go through ICE. Basically, through a signaling channel, ICE starts sending specific kinds of packets. Peers usually select the best pair of candidates.
… STUN is a way for a network endpoint to find its public IP address, which can be used to offer an IP/port pair to the other end.
… In order for this to work for a QUIC DataChannel on the LAN, there are a couple of things we can do.
… We could use host candidates (candidates that are directly usable)
… The controller can add a host candidate that it discovers through mDNS discovery.
… The host candidate for the receiver would have to be an extension of our protocol. We would have to devise some mechanism by which the receiving end can advertise a point where it can be given the host candidate from the controller.
… To get ICE started at all, you need to set up STUN parameters, but we don't need STUN servers.
… This is something that we can do and that would be compatible with what is being done in WebRTC. However, that is a bit complex for our use cases.
… My perspective is that we should start without ICE, and then consider adding ICE later on.
… I'm discussing with WebRTC folks to know whether we can do ICE without a signaling channel when we do not need it.
anssik: I note this is not yet part of the WebRTC WG. In incubation for now.
[Discussion about status of ORTC]
[Discussion about bootstrapping the QUIC DataChannel without ICE and on how ports get assigned]
cpn: Wondering about the scope of what we're trying to achieve here.
mfoltzgoogle: We don't need NAT traversal here, ICE would provide a path. But it's out of scope for now. It's more to provide a clean upgrade path to v2.
… I hope to get more information about the use of ICE on local networks, but my current inclination is not to focus on that.
Louay: Same port could be used for multiple connections so one port may be enough
tidoust: I was more thinking of cases where multiple controllers try to connect to the same receiver. Different ports would be needed then, even if we multiplex connections on the same port.
mfoltzgoogle: Actually, QUIC allows reusing the same port for connections with multiple endpoints.
mfoltzgoogle: Because DataChannels are fundamentally peer-to-peer, authentication is currently certificate based.
… Each endpoint obtains or generates a certificate and then passes a fingerprint to the other party by secure signaling.
… The fingerprint is passed into the data channel after ICE connects the initial state.
… No direct support for J-PAKE
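A minimal sketch of the fingerprint check described above, assuming the fingerprint is a SHA-256 digest of the DER-encoded certificate written as colon-separated hex (the form commonly seen in WebRTC signaling). The function names are hypothetical, and the byte string in the usage lines is a placeholder, not a real certificate.

```python
import hashlib
import hmac


def fingerprint(cert_der: bytes) -> str:
    """SHA-256 digest of the certificate bytes as colon-separated upper-case hex."""
    digest = hashlib.sha256(cert_der).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def matches(cert_der: bytes, signaled: str) -> bool:
    """Compare the certificate seen on the transport with the signaled fingerprint."""
    return hmac.compare_digest(fingerprint(cert_der), signaled.upper())


placeholder_cert = b"not a real certificate"
signaled = fingerprint(placeholder_cert)
print(matches(placeholder_cert, signaled))       # True
print(matches(b"another certificate", signaled)) # False
```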
… ORTC exposes several layers of transport that can be used for peer-to-peer communications, and enable use of QUIC.
… ORTC uses ICE parameters under the hood as explained before.
… [looking at ORTC sample code to establish a QUIC connection]
<anssik> the proposed new WebRTC WG charter currently under AC review states: "In recognition of this interest, the Working Group intends to start work on a new set of object-oriented APIs [ORTC] for real-time communication." https://www.w3.org/2018/04/webrtc-charter.html
mfoltzgoogle: What's interesting here is that browsers that don't support the Presentation API could plug-in QUIC directly and use that as a supporting library to provide that support.
… Also rumors that ORTC is also willing to expose mDNS.
… For prototyping this might be good.
… From an implementation status perspective, there is a framework implemented in Chrome called QUARTC. It supports BBR. Some crypto stuff is stubbed out for now. There's still a little bit of work to do there to support ICE. Also that work may be tied to ORTC work in Chrome.
… That also explains why I propose to move forward with QUIC without ICE as a first step.
… [showing proposals]
[Discussion on subnets. Proposal won't work across subnets. But discovery won't work either. Multicast-based discovery easily breaks across boundaries.]
mfoltzgoogle: If we want to extend the scope to network traversal, I'm not opposed to it, but it seems wise to solve the direct connection case to start with.
… I think I managed to convince myself that we should not work on the ICE with host candidates mode to start with. Something like for a v1.5.
… We just do want to support that in the future.
igarashi: Why not use ICE directly?
mfoltzgoogle: I don't have a good solution on how to advertise the ports in mDNS. Happy to get additional feedback from ORTC group or others.
igarashi: Wondering in enterprise how user can select among a very large number of receivers?
mfoltzgoogle: It's potentially a problem. In general, if you have a very large network and doing multicast discovery, then that can create a large amount of network traffic. That's why many companies restrict multicast.
… Enterprise policy may disable these features in Chrome, actually.
cpn: Specific question about QUIC. One of the things that we might want to do for HbbTV is to implement some round-trip time measurement to help with synchronization. Is there something that prevents that in QUIC?
mfoltzgoogle: No. I think we have something like that already.
tidoust: At the QUIC level? Why not just send packets over the QUIC connection to measure round trip time?
cpn: App-to-app communication in HbbTV. If we're considering moving the HbbTV protocols over to this set of protocols, we would want to do the same things.
… Close synchronization is one of the features in HbbTV. The synchronization mechanism is based on round-trip time measurement of single UDP packets.
… Wondering if the same thing could be possible with QUIC.
mfoltzgoogle: One of the inputs to BBR is the round-trip time.
cpn: OK, that sounds like something that we could investigate for HbbTV.
Louay: Could be done at the application layer
mfoltzgoogle: Right, if the use case is shared in different scenarios, then we could look at it at a lower level
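An application-layer measurement of the kind Louay suggests could use the familiar NTP-style four-timestamp exchange. The sketch below is an assumption about how such a measurement might look, not the HbbTV mechanism itself; it also assumes the network delay is the same in both directions.

```python
from typing import Tuple


def rtt_and_offset(t1: float, t2: float, t3: float, t4: float) -> Tuple[float, float]:
    """Estimate round-trip time and clock offset from one request/response pair.

    t1: request sent (controller clock)    t2: request received (receiver clock)
    t3: response sent (receiver clock)     t4: response received (controller clock)
    """
    rtt = (t4 - t1) - (t3 - t2)            # time on the wire, excluding processing
    offset = ((t2 - t1) + (t3 - t4)) / 2   # receiver clock minus controller clock
    return rtt, offset


# Receiver clock 40 units ahead, 10 units of delay each way, 2 units of processing.
print(rtt_and_offset(100, 150, 152, 122))  # (20, 40.0)
```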
PROPOSED: For transport, mandate QUIC DataChannel as transport, and mandate DataChannel over UDP mode (without ICE) in v1. Plan is to allow use of ICE with host candidates in v1.5. And integrate ICE + STUN / TURN for network traversal in v2.
Resolved: For transport, mandate QUIC DataChannel as transport, and mandate DataChannel over UDP mode (without ICE) in v1. Plan is to allow use of ICE with host candidates in v1.5. And integrate ICE + STUN / TURN for network traversal in v2.
mfoltzgoogle: [going through some work items with WebRTC]
… [and with QUIC]
… I'll scribe them as actions
tidoust: is BBR mandated?
mfoltzgoogle: it's more to make a recommendation
mfoltzgoogle: open issues
mfoltzgoogle: this explains about using QUIC with ICE as well as UDP
… the idea is to allow DataChannel semantics that could be migrated to a PeerConnection
… there are slides there that cover this
… our proposed resolution here could be a way to close this issue
mfoltzgoogle: this is an implementation question. until today, the only way to use the base transport is to pull in all of WebRTC
… there will be a way to use QUIC DataChannel without requiring all of WebRTC
… i'll follow up with the developers
mfoltzgoogle: we already covered this in part. for an RTCDataChannel you need a way to signal messaging between two sides already established. WebRTC uses SDP
… there'd have to be a bootstrapping channel together with authn for that channel
… the goal of our current work is to not require this bootstrapping channel until it's required for network traversal
… we can revisit in V2, when we're ready to include network traversal into the protocol stack
anssik: if there are no concerns, we can label it as V2
mfoltzgoogle: timeline for deployment of TLS 1.3
cpn: I can get information on that for us, as well as QUIC standardisation
mfoltzgoogle: we want to have an integrity of display section, so the browser can guarantee that's the intended display
… important that messages are routed correctly, so not redirected elsewhere
… private data, URLs, ID, any application data is exchanged through confidential channels
… the high level categories of threats to consider in our security evaluation are: passive network observers (on or off LAN, on the internet), active network attackers (traffic injection), side channels
… timing attacks, spectre and meltdown
… people looking at your display from far away. things to keep in mind
… also insecure or malicious content. presentation connections are inherently cross-origin
… if you want to prove the other side belongs to an origin, needs key exchange
… UX for guidance to users for what content is being displayed
anssik: are there learnings from full-screen mode?
mfoltzgoogle: the Cast for EDU has a popup showing the origin
… we don't always have full control over the UX on the receiver side
mfoltzgoogle: ISP misconfiguration, example of an ISP putting two customers on the same subnet, can see each other's cast devices
… somehow they were routing multicast packets between the two
… we added some protections in Chrome to prevent connections to public IPs by default
… also, potential malicious software running on the presentation display or the UA itself. the platform should provide the best protection itself
… devices such as STBs and dongles can change ownership or be stolen
… it's worth us documenting these scenarios more thoroughly, does the work we've done so far mitigate these?
… it's important to know what we're protecting. what information do we consider private and should be protected?
… i can start that, with contributions from others. we could get the WebAppSec group to also look at it
tomoyuki: i wrote a document about J-PAKE
… it has two purposes. one is to authenticate both devices, the other is to exchange encryption keys
tomoyuki: J-PAKE is standardised in IETF RFC8236. there are two variants of the protocol
… J-PAKE over Finite Field, J-PAKE over Elliptic Curve
… elliptic curve is more popular
… J-PAKE has two rounds. Both parties exchange two keypairs with a shared secret
… J-PAKE includes two RTTs. there is also a three-pass variant proposed
… we need to consider which variant is most suitable for the OpenScreen protocol
… the purpose of J-PAKE is key exchange and authn
… we need an authn mechanism for the open screen protocol. J-PAKE provides a common key generation mechanism. should this be used?
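To make the two rounds concrete, here is a sketch of the finite-field J-PAKE arithmetic. It is illustrative only and rests on loud assumptions: the group (p = 2039, q = 1019, g = 4) is a toy far too small for real use, the Schnorr zero-knowledge proofs that RFC 8236 requires with each public value are omitted, and the mapping from passcode to the secret exponent is a hypothetical choice. The class and function names are invented for this example.

```python
import hashlib
import secrets

# Toy group for illustration: P = 2*Q + 1 with Q prime, G generates the order-Q subgroup.
P, Q, G = 2039, 1019, 4


def passcode_to_secret(passcode: str) -> int:
    """Hypothetical mapping from a passcode to a non-zero exponent in [1, Q-1]."""
    digest = hashlib.sha256(passcode.encode()).digest()
    return int.from_bytes(digest, "big") % (Q - 1) + 1


class Party:
    def __init__(self, passcode, xa=None, xb=None):
        self.s = passcode_to_secret(passcode)
        # Two ephemeral private exponents (x1, x2 for one side; x3, x4 for the other).
        self.xa = secrets.randbelow(Q) if xa is None else xa
        self.xb = 1 + secrets.randbelow(Q - 1) if xb is None else xb

    def round1(self):
        # Round 1: send two public values (zero-knowledge proofs omitted here).
        return pow(G, self.xa, P), pow(G, self.xb, P)

    def round2(self, peer_round1):
        # Round 2: combine own and peer values, blinded by xb * s.
        own_a, _ = self.round1()
        peer_a, peer_b = peer_round1
        base = (own_a * peer_a * peer_b) % P
        return pow(base, (self.xb * self.s) % Q, P)

    def session_key(self, peer_round1, peer_round2):
        # Remove the peer's blinding factor, then raise to own xb.
        _, peer_b = peer_round1
        blind = pow(peer_b, (self.xb * self.s) % Q, P)
        k = pow((peer_round2 * pow(blind, -1, P)) % P, self.xb, P)
        return hashlib.sha256(str(k).encode()).digest()


controller, receiver = Party("4821"), Party("4821")
c1, r1 = controller.round1(), receiver.round1()
c2, r2 = controller.round2(r1), receiver.round2(c1)
print(controller.session_key(r1, r2) == receiver.session_key(c1, c2))  # True
```

Both sides end up with the same key only if they used the same passcode, which is what gives authentication and key exchange in one protocol.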
anssik: from a UX perspective, does this spec impose any requirements on the length of the passcode to be typed by the user?
tomoyuki: i'm not sure of that detail. the two round passcode isn't user friendly
mfoltzgoogle: i need to look at other products and what length is used, and find out how they decide that
… the main issue i see is brute force attacks. you want to prevent an attacker from succeeding by enumerating passcodes
… you could have a delay scheme to prevent that
… the other question I have is about the uniqueness requirement, also to prevent brute forcing. is that important?
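One possible shape for the delay scheme mentioned above is an exponential back-off on failed pairing attempts. The class name, the 1 second base delay and the 300 second cap are all assumptions made for illustration; the discussion did not settle on any values.

```python
from dataclasses import dataclass


@dataclass
class PairingThrottle:
    """Doubles the lock-out after every failed passcode attempt (illustrative)."""
    base_delay: float = 1.0     # seconds, hypothetical value
    max_delay: float = 300.0    # seconds, hypothetical cap
    failures: int = 0
    locked_until: float = 0.0

    def allowed(self, now: float) -> bool:
        # A new attempt is accepted only once the lock-out has elapsed.
        return now >= self.locked_until

    def record(self, success: bool, now: float) -> None:
        if success:
            self.failures = 0
            self.locked_until = 0.0
            return
        self.failures += 1
        delay = min(self.max_delay, self.base_delay * 2 ** (self.failures - 1))
        self.locked_until = now + delay
```

With these example values, an attacker enumerating passcodes is soon limited to one guess every five minutes, which is why the passcode length and the delay policy need to be considered together.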
tomoyuki: this is a general problem, not specific to the open screen protocol
mfoltzgoogle: could the screen have a permanent passcode? what are the security implications of these options?
mfoltzgoogle: also, there's an issue if you transfer ownership of the device with a permanent passcode
tomoyuki: incorporating J-PAKE into the open screen protocol
… there are a couple of possible schemes. server-signed certificate. this doesn't look like a good idea
… also TLS integration with J-PAKE
anssik: where is that specified, and what's the status?
… can we reuse existing work, or would we need to specify it as part of open-screen protocol?
tomoyuki: we have two options to reuse work (open-source implementations). Mbed TLS provides an implementation based on TLS 1.2, extended to use J-PAKE authn and key generation
igarashi: is the first one compliant to the draft that was obsoleted?
tomoyuki: it's already obsoleted, but they intend to make an updated internet draft, but i haven't found it yet
anssik: how does spec expiry work at IETF?
igarashi: you have to update every 6 months to prevent it being expired. so an expired status doesn't necessarily mean it's bad
tomoyuki: there are some comments on the IETF mailing list.
… the first is that it's based on TLS1.2, and TLS 1.3 hasn't been considered yet
… the next is about the two message vs 3 or 4 message handshake
… TLS 1.3 exchanges public keys in parallel with the negotiation. different between TLS 1.2 and 1.3
mfoltzgoogle: it seems like the way it was proposed is incompatible with cipher suite negotiation, making it difficult to deploy
… unless you're in a closed system where everyone is on the same version
… unless there's a way to make it compatible with normal client/server negotiation, it seems like it would be a non-starter for integration into TLS
anssik: many thanks, it's very useful
mfoltzgoogle: if we implement J-PAKE, it would have to be done at the control protocol level, not the TLS level
anssik: any UX implications?
mfoltzgoogle: it has implications on how we make a new transport connection. it's not successful until we've authenticated either through J-PAKE or other means
… similar if part of TLS or not. it still requires a passcode
tidoust: i want to understand when the different steps happen. presentation request, discovery, then when does the J-PAKE step happen, ie, the user is prompted for the passcode? or is there another exchange first?
mfoltzgoogle: there are two states a device can be in: discovered but not authenticated, then discovered but not connected to.
… for discovered but not authenticated, the UA offers a way to pair. this leads to the pairing code, then we move to the connected state
… if it's implemented as part of TLS, presumably that would use the port advertised for QUIC, for the crypto handshake
… if it's done at the application level, we'd use self signed certificates at the transport layer, then J-PAKE on top of that
… we need a separate step to verify that the TLS session is the one that did the handshake
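One reading of the device states described above is a small state machine: a receiver is discovered, becomes authenticated once pairing succeeds, and only then can be connected to. The state and event names below are hypothetical, and the minutes leave the exact set of states open.

```python
from enum import Enum, auto


class ReceiverState(Enum):
    DISCOVERED = auto()      # seen on the network, not yet paired
    AUTHENTICATED = auto()   # pairing (e.g. passcode entry) succeeded
    CONNECTED = auto()       # an authenticated transport connection is open


# Allowed transitions; anything else is rejected.
TRANSITIONS = {
    (ReceiverState.DISCOVERED, "pairing_succeeded"): ReceiverState.AUTHENTICATED,
    (ReceiverState.AUTHENTICATED, "connection_opened"): ReceiverState.CONNECTED,
    (ReceiverState.CONNECTED, "connection_closed"): ReceiverState.AUTHENTICATED,
}


def step(state: ReceiverState, event: str) -> ReceiverState:
    """Apply an event, refusing e.g. a connection before authentication."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} not allowed in state {state.name}") from None
```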
tidoust: what about reconnection later? is a new session needed?
anssik: tomoyuki, any other comments?
tomoyuki: if we use self-signed certificates, does that mean the verification of the certificate can be omitted? if J-PAKE is used for authentication
mfoltzgoogle: J-PAKE provides mutual authn. with fingerprints, the user would have to look in two different places. fingerprints are long, so they'd have to be readable. i think it's worth looking at as an option if J-PAKE is not suitable
[side discussion on J-PAKE exchange over TLS]
mfoltzgoogle: Thanks for the presentation, Tomoyuki. On my slides, there's also a link to a blog that has some more material.
… I transcribed the implementation of J-PAKE in Python 3. Each packet is around 1Kb in size.
… It's very feasible doing this handshake using control messaging.
… I think we covered some of the next steps for this.
… We need to understand the requirements around the passcode.
… Recommended UI for the user. Understand the key derivation function to use.
… Assuming we're doing it at the control protocol level, we need to define J-PAKE key exchange messages.
… And we need to determine whether J-PAKE can be used for recurring authentications.
… For an initial connection, this suggests creating a TLS 1.3 connection with self-signed certificates, then use J-PAKE to derive a shared secret.
… Then extract keying info from QUIC connection and verify with shared secret. Similar to what is done in Cast devices.
… That verifies that the handshake on the QUIC connection corresponds to the one that the J-PAKE handshake was done on.
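The binding step can be sketched as a keyed MAC: each side computes an HMAC over keying material taken from its transport session, keyed with the J-PAKE shared secret, and checks the peer's tag. This is an assumption about how the check might be built, not the Cast mechanism; how keying material is extracted from a QUIC connection is left open, so plain byte strings stand in for it here, and the function names are hypothetical.

```python
import hashlib
import hmac


def binding_tag(jpake_key: bytes, session_keying_material: bytes, role: bytes) -> bytes:
    """MAC that ties the J-PAKE secret to one specific transport session."""
    message = role + b"|" + session_keying_material
    return hmac.new(jpake_key, message, hashlib.sha256).digest()


def verify_binding(jpake_key: bytes, session_keying_material: bytes,
                   role: bytes, received_tag: bytes) -> bool:
    expected = binding_tag(jpake_key, session_keying_material, role)
    return hmac.compare_digest(expected, received_tag)


shared = b"k" * 32                       # placeholder for the J-PAKE derived key
controller_view = b"session-A"           # placeholder keying material
tag = binding_tag(shared, controller_view, b"controller")

# Same session on both ends: the tag verifies.
print(verify_binding(shared, b"session-A", b"controller", tag))  # True
# A man in the middle terminates two sessions, so the material differs.
print(verify_binding(shared, b"session-B", b"controller", tag))  # False
```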
igarashi: Does this ensure that the QUIC connection is secure?
mfoltzgoogle: It prevents man-in-the-middle attacks.
… Once the initial connection is complete, one proposal I want to develop more fully is to use that to derive new certificates to reuse in further connections.
… The presentation display may generate a long lived signing certificate. Server sends public key to client. Client generates a long lived signing certificate.
igarashi: Your assumption is that you cannot exchange long lived signing certificate through J-PAKE?
mfoltzgoogle: No, we don't want to require a J-PAKE for every use, we need to use a certificate to record that we completed J-PAKE at some point and trust the device from now on.
… Long-lived certificates may no longer be usable in TLS 1.3. Shorter-lived certificates can be created out of the long-lived certificates (signed by the long-lived certificate).
… [looking at certificate signature diagram]
… The client certificate provides 2 important properties: it restricts the set of devices that are allowed to connect to receivers to ones that have successfully managed to authenticate before.
… Secondly, it may provide improved privacy. We can create different certificates for different profiles.
… There are some advantages to certificates. We can record some metadata about the device in them, such as the serial number and the device model.
… We could think about generating new certificate for each connection, but that seems expensive.
… We may want separate certificates for private browsing. Also we need a way to revoke certificates, e.g. when requirements change.
… Some work to do, but it's probably necessary in the long term.
igarashi: I don't follow the details of the discussions, but if the short-term certificate is expired, new ones are generated, and re-authentication is not needed?
mfoltzgoogle: Right, as long as they are signed by long-lived certificates, then that's fine.
… Some work to do for key-based steps. Full proposal on key exchange, full proposal on certificate structure & scope. Obviously, there are other efforts in IoT/WoT.
… It's important to develop representative UI for both J-PAKE and PKI based auth because we want the users to understand what happens.
… Internal team may provide feedback on that as well.
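The trust decision discussed above (short-lived certificates are accepted without a new J-PAKE exchange as long as a paired long-lived certificate signed them) can be modelled roughly as follows. This is a toy model with hypothetical names: signature verification is deliberately left out, and `issuer_fingerprint` is assumed to have been checked cryptographically elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShortLivedCert:
    issuer_fingerprint: str  # long-lived certificate that signed this one
    not_after: float         # expiry time, seconds since epoch


class PairedDevices:
    """Remembers long-lived certificates recorded after a successful J-PAKE."""

    def __init__(self) -> None:
        self._trusted = set()

    def record_pairing(self, long_lived_fingerprint: str) -> None:
        self._trusted.add(long_lived_fingerprint)

    def revoke(self, long_lived_fingerprint: str) -> None:
        # E.g. when the passcode is rotated or the device changes hands.
        self._trusted.discard(long_lived_fingerprint)

    def accepts(self, cert: ShortLivedCert, now: float) -> bool:
        # Accept only unexpired certificates issued by a paired device.
        return cert.issuer_fingerprint in self._trusted and now <= cert.not_after
```

An expired short-lived certificate is rejected, but a fresh one from the same issuer is accepted without re-pairing, until the long-lived certificate is revoked.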
igarashi: Not familiar with J-PAKE, but passcode may be unique, right? E.g. printed on the device?
mfoltzgoogle: Problem with permanent passcodes is that there is no way to revoke them.
… Rotating passcodes should be done at the same time as revoking certificates.
igarashi: The assumption is that the device has a display. What about devices without display?
mfoltzgoogle: For devices without displays, it will have the permanent passcode.
… There will need to be trade-off. Maybe for audio devices, the security implications are slightly lower.
igarashi: Also thinking about IoT devices. Does not need to be visual.
anssik: Need to formalize these proposals into issues and documents
mfoltzgoogle: Right. At least, we have something to start with, the J-PAKE implication. I would be comfortable to specify the J-PAKE implementation.
… We'll probably have to do an application level implementation on top of QUIC connection for now.
tidoust: does the result of the J-PAKE exchange influence the receiver, or only the controller?
mfoltzgoogle: after the J-PAKE is successful, both sides generate certificates, and they exchange the public parts
… the certificates are stored. in the future, to create a new QUIC connection, the controller checks the server's public key in the TLS exchange using PKI
tidoust: the receiver creates a new certificate, to be used for all connections. how does it know which certificate to use for each connection?
mfoltzgoogle: it has a single certificate to be shared with all clients
[side discussion on the detail of certificate generation and J-PAKE handshakes]
mfoltzgoogle: an area for exploration, with hardware backed certificates, show device number as trusted information in the UI. we could eventually find a way for trusted manufacturer public keys to be incorporated
… another idea that could be explored is generating new certificates when changing the friendly name of the device. it's about the integrity of the information we display to the user
tidoust: is there a way for a controller and receiver to agree on the mechanism to use for passcode exchange (e.g, QR code, NFC, or other mechanisms)?
… it could enable the receiver to choose which UI to present
mfoltzgoogle: we haven't talked about the UI aspects yet. as part of that process, a list of mechanisms could be proposed, triggering the device to show a QR code or whatever
igarashi: if the receiver browser can't read a QR code, it wouldn't be interoperable
mfoltzgoogle: we need a mandatory option, e.g., typing the code
mfoltzgoogle: to create the full branding, we'd need UX standards as well
mfoltzgoogle: chromecast has a guest mode with a short PIN or an audio beacon, not sure if still supported
igarashi: I'd like to share what we're doing in the HTTPS in the Local Network group
… the web application is from a secure origin, connecting to a device, how do we trust the device?
… we don't have a clear solution
… my preferred solution uses PKI, others propose private certificates
… the user grants the trust, so is the trust anchor
… if the web application doesn't trust the user, the mechanism fails
mfoltzgoogle: if there were a web API for accessing manufacturer certificates, you could use it to set up a WebRTC DataChannel
… verify the fingerprint, take a look at the ORTC APIs
… it's up to the application to decide what certificates to accept in that scenario
igarashi: Presentation API assumes the device is trusted. device could be compromised or impersonated on the network
… we need to verify the identity before exchanging information
mfoltzgoogle: we'd verify the identity information with a trusted source (model name and number)
… this is a new infrastructure for public key stuff
… i agree it's important to have trusted identity information
… is the fingerprint permanent?
mfoltzgoogle: it's associated with the long lived certificate. it helps the controller identify which key exchange was done
igarashi: can be done offline?
mfoltzgoogle: it could be done on first boot, for example. as long as it doesn't change hands or identity. it may need to be regenerated from time to time, when we expect the user to have to re-pair with the device
igarashi: this mechanism could address some of the issues we have, such as mixed content
mfoltzgoogle: the issue is shared with a lot of web APIs right now, e.g, Web Bluetooth, Web USB. could use some more scrutiny in general, scenarios where things could go wrong
… the goal is to have sufficient security guarantees for exchange of private data. it's opt-in for web applications, they need to be aware. it's based on a security mechanism with different properties to the public internet
… we wouldn't want to give out long term credentials, as a mitigation
… that's just good web security design in general
igarashi: a compromised receiver could be a big issue. the controller might want to know about the origin of the receiver - e.g, a sony device, verified by a sony server
mfoltzgoogle: adding encryption on top of this is another option. there's no inherent origin to origin security model here. we could think about adding it in future
anssik: issues to create coming out of this?
mfoltzgoogle: identity aspects of receiver certificates, origin to origin. this is a new workstream
… the priority for Brandon and myself is to complete J-PAKE, and see if we can derive a key exchange mechanism based on that
anssik: thank you everyone for a productive day