W3C

Second Screen WG - TPAC F2F - Day 2/2

26 October 2018

Meeting minutes

Going through the agenda: we'll talk about HbbTV, library implementation (quick run-through), security implications, ICE support for cross-LAN scenarios, J-PAKE, the possibility of putting streaming in scope, and the ability to use a polyfill implementation

HbbTV and Second Screen investigation

Louay: More a discussion about using other schemes than HTTP/HTTPS. HbbTV is one candidate for this. I don't think we should consider HbbTV as a specific case, more as an example.
… It's similar to mobile applications where you can start other applications using a specific scheme.
… In the example I'm showing, I pass an "hbbtv:" URL to the PresentationRequest constructor, with a fallback on HTTPS and cast.
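
[For illustration: a minimal TypeScript sketch of the fallback pattern described above. The PresentationRequest constructor does accept an array of URLs; the "hbbtv:" URL syntax shown here, carrying DIAL-style launch parameters, is an assumption for illustration only.]

    // The controller lists URLs in order of preference; the user agent
    // matches them against the schemes each discovered receiver supports.
    const request = new PresentationRequest([
      // Hypothetical hbbtv: URL carrying DIAL-style launch parameters.
      "hbbtv:?appId=123&orgId=456&appUrl=https%3A%2F%2Fexample.com%2Ftv",
      // HTTPS fallback for receivers that only support https/cast.
      "https://example.com/receiver.html",
    ]);
    request.start().then((connection) => {
      connection.send("hello from the controller");
    });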

<anssik> Schemes and Open Screen Protocol

<anssik> [showing Schemes and Open Screen Protocol document on the screen]

Louay: The user-agent will pick up the right URL, although we need to discuss the order and priority.
… Scrolling down to the diagram, here's an example with a controller and two receivers. The controller supports hbbtv, cast.
… The first receiver has two UAs, one supporting cast, the other hbbtv.
… What should we display to the user?
… I don't think the user cares about the technology, she just wants to select a device, not the device/scheme pair.
… The controller device will detect that the URLs are supported and by whom and display the right device accordingly.
… The filtering is important. Supposing we want to pass a URL with an "hbbtv:" scheme, the controller could send the targeted scheme in the discovery request. Or the receiver could advertise supported schemes in the response. A third option is to do the discovery without this info and exchange the data during subsequent phases.
… For the hbbtv scheme, it's important that we pass all the parameters required to launch the application.
… For now, in HbbTV devices, DIAL is used, with an XML request message that contains a few parameters.
… Important ones are the "appId", "orgId", "appName", "appUrl". I don't think that's up to the Open Screen Protocol to specify that.

MarkFoltz: There are a number of things to unpack here.
… Going back to the diagram, what is the user experience today with Android TV?
… Today, the way it's implemented, the user would see two different displays that correspond to the same physical device.

Louay: I discussed that with Chris. We could add a Cast logo or an HbbTV logo, but that would be confusing for users as they don't know anything about these technologies.

Francois: I note the spec imposes which URL gets selected, using the array order.

MarkFoltz: It really comes down to whether the controller sees the Cast receiver or the HbbTV receiver as same or separate screens.
… If the device platform is advertising OSP and accepts a single connection, there shouldn't be an issue.

Francois: That supposes that there is a box, missing here, that connects to the two internal user-agents in the first receiver.

Louay: Yes.
… On one device, you have two or three different receiving agents.
… We should consider ways to merge them into one physical device, perhaps using the IP address.

MarkFoltz: There may be solutions available, but it's important to think about impersonation issues, so there would need to be some keys shared by all the receivers at the platform level.
… Different ways to have multiple listening endpoints on the same device, one is using mDNS subtypes.
… One mDNS request should be able to trigger multiple mDNS subtypes responses.
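
[A rough sketch, in TypeScript for consistency, of the DNS-SD subtype naming idea. The base service name "_openscreen._udp" and the subtype labels are assumptions; the actual names were still to be specified.]

    // A receiver supporting several schemes registers its service instance
    // under the base type plus one subtype per scheme, for example
    // "_hbbtv._sub._openscreen._udp.local". A controller holding an
    // "hbbtv:" URL can then browse only the matching subtype, so one
    // request can yield responses for multiple subtypes.
    const baseService = "_openscreen._udp.local";
    const supportedSchemes = ["https", "hbbtv", "cast"];
    const subtypeNames = supportedSchemes.map(
      (scheme) => `_${scheme}._sub.${baseService}`
    );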

Louay: The solution designed here won't work for existing HbbTV devices because DIAL is used. But that's ok, that's future-looking. We just want to make it attractive for HbbTV to be able to adopt this.

MarkWatson: It's been a long road to get DIAL into TVs, not only for HbbTV. Something new will take a long time as well.

MarkFoltz: Allowing a device that supports multiple schemes seems a good candidate for mDNS subtypes.

Louay: Then you get the info right away and don't need to connect to these devices with a scheme they do not support.
… This is more specification work, design decisions for the Open Screen Protocol.

MarkFoltz: The proposed resolution here, I think, would be to discuss alternative ways to advertise schemes and come up with proposals that can become pull requests.

<anssik> [Meta] Add note to schemes.md about custom schemes and interop #93

The group discusses arbitrary URL schemes. Browser vendors are unlikely to support arbitrary ones. There are trade-offs to consider, and different perspectives. If arbitrary schemes are supported, interoperability is unlikely to be achieved, in the sense that receivers won't converge on common schemes.

Some trade-offs need to be considered; that is the purpose of #93.

Some questions arise about how this affects user experience. Clearly, the user will think about it in terms of how often it works and how often it does not.

If you have only one way to do things, then interoperability is great. If not, then there is a combinatorial issue.

MarkWatson: If you let vendors decide which schemes they can support, then obviously they will go with their own, and interoperability is an issue. But then, if you stick to HTTPS, you're pushing the problem into the domain name, where receivers may detect e.g. "netflix.com" and launch a specific application for such URLs.

MarkFoltz: Two levels, device interop, and content interop. We need to address both, and we're more focusing on the first one for now.
… We definitely have transitions across schemes, e.g. Web Intents, native app triggering. The goal here is to document the pros and cons.

anssik: What are the pros and cons of the custom schemes? That's a good investigation to make.

MarkFoltz: That's something we may want to bring to the TAG for comments at some point.

PROPOSED RESOLUTION: Explore and document the pros and cons of specific schemes in a format that the TAG would fancy reviewing (e.g. including thumbs-up cats)

Resolved: Explore and document the pros and cons of specific schemes in a format that the TAG would fancy reviewing (e.g. including thumbs-up cats)

MarkFoltz: The other resolution I wanted to capture is, given the reality of existing schemes, we need a way to advertise supported schemes during discovery, e.g. using mDNS subtypes.

PROPOSED RESOLUTION: Investigate and include some proposed solution to advertise custom scheme in mDNS

Resolved: Investigate and include some solution to advertise custom scheme in mDNS

Progress on open screen library

Open Screen Library slides (63-75)

MarkFoltz: Will go quickly over the status and progress in the library

<anssik> Open Screen Library

MarkFoltz: Objectives are to make it easy for people to add open screen protocol support.
… It's self-contained and embeddable, and there is also an OS abstraction layer if you want to integrate it at that level.
… We're trying to keep the footprint small, although we're integrating a bit of WebRTC, which increases it.
… We're implementing things one after the other. We obviously want to support controller/receiver parts of the protocol for the Presentation API.
… And local/remote for the Remote Playback API.
… The library does not do rendering of HTML5 and media obviously, that's up to the embedding user agent.
… Extension points are being considered.
… Overview of the architecture: a top API that exposes two different interfaces. One is an API that looks like the spec. The other is an API to expose some control over the protocol, e.g. turning QUIC connections on/off, controlling authentication flows
… At the protocol layer, the CBOR control protocol and QUIC connections are the main ones.
… For service implementations, we include some parts of the Chromium QUIC and integrate that in the library. Chromium QUIC is moving towards a more embeddable codebase, but for now that takes some size.
… At the platform API, this is where things like ways to control sockets go.

MarkWatson: That's just one C++ library?

MarkFoltz: Yes.

MarkWatson: These may live in different processes on the TV, e.g. mDNS discovery. Is the QUIC stack loaded everywhere?

MarkFoltz: We cannot do that mapping ourselves. Some boxes in the diagram can be swapped for platform implementations. It will become more modular as we integrate with different platforms.
… As a starting point, you'll get a library that can do everything.

MarkWatson: A different approach would have been to expose CBOR and implement the rest in JavaScript.

MarkFoltz: We expose the interfaces in such a way that embedders can supply their own implementations in some cases.
… You would like to be able to update the core protocol at a faster pace?

MarkWatson: I'm wondering how you see that. JS is easy to update on a daily basis.

MarkFoltz: The CBOR messages would always be in C++ because we don't want a JS interpreter dependency there.
… Extension points are one of the design goals.

MarkWatson: Experience with CE devices is that the C++ update lifecycle does not even exist, unless there are big security issues.
… That will progress, but that's a huge overhead.

MarkFoltz: I think it's certainly doable to implement some of that in the JS runtime. I don't think we'll do that as part of that library.
… If for your use case, you want to draw a horizontal line in the middle and expose a QUIC connection to the upper runtime, then you can

Francois: It could be worth exploring how to compile the upper layers to WebAssembly for instance.

MarkWatson: Modularization would be key here. That would help with integration and maintenance in CE devices.

Mark Foltz presents a video screencast of the library running on the command line.

MarkFoltz: Obviously, we want to hook this up to something that can render the results for a more interesting demo.
… We recently landed a way to generate CBOR serialization and parsing from CDDL. This links to the discussion we had yesterday.
… We started doing this by hand for one message, and decided to automate that for 40 messages.
… This does not yet use the comment convention we discussed yesterday.
… The Platform API, we have an implementation that works for Linux. Other platforms are possible.
… Looking at the roadmap, we started in February. Today, we have most of the embedder APIs done, a part of the control protocol. The next task is to develop the authentication part.
… Then benchmarking, end-to-end features.
… We're getting closer to enabling external contributions. Louay left the room, but we'd be happy to use some of his resources.
… Our code system is a bit "interesting" but it is documented.

Privacy and Security

Privacy & Security slides (76-85)

Mark: This is a two-part discussion. I prepared material to talk about the considerations for thinking about a security architecture for OSP
… What information should we protect, threat models, high-level techniques to address them
… Peter will present on some of the mechanisms
… W3C has a security and privacy questionnaire that covers some of the high level material
… Used that to frame some of this discussion
… Separately, I think we should use the questionnaire in more detail
… What's important to protect? Personally identifiable information, high value data
… For our protocol and APIs there are few things: the content of URLs and IDs used by the Presentation API
… Presentation messages generated by the application are considered private data. They could contain passwords or tokens
… Remote playback contains URLs, also streaming would be in scope
… We'll look at mechanisms to implement security, such as private keys and J-PAKE passwords
… There could be more things to add to the list, so the questionnaire will be important to work through more systematically
… In the case where you have a public presentation, e.g., digital signage
… You may want to advertise the URL, so it's no longer completely protected data
… Should consider who should be allowed access to it
… The second category of data I considered is device-specific data
… Not about content or users, more about the device itself
… Could be discovered through network advertisement, could include GUIDs like device IDs, device capabilities
… Asking whether a device can play a URL divulges information about the device
… We should consider how much to protect this information
… There's other information available such as MAC addresses
… How valuable is the data? Could it be aggregated for different purposes?
… Moving the friendly name to the post-authentication part of the protocol may address some concern
… There are different threats, unauthorized actors who want to access this info
… Three threat models in the questionnaire. The first is a passive attacker on the local network: all they can do is read data being exchanged, not modify it
… Learn what's advertised through mDNS
… They can see who's connecting to whom, but can't see the content of messages
… We want to minimise the data exposed before establishing an encrypted transport
… A few things to follow up on. Even with TLS, passive attacks are still possible, so we should look at the history of those kinds of vulnerabilities
… One mitigation is to rotate keys, could help address these concerns
… The second kind of attack is where an attacker can inject or modify traffic in the network
… There are ways to do that, by gaining access to the LAN through a compromised device or router
… Have a potential for a much wider range of exploitable vulnerabilities
… Three scenarios we should address in the design
… One is impersonation, a man in the middle: a device that intercepts the data intended for a TV without the user being aware of it
… Another is impersonation of a controller. Want to understand the vulnerabilities this may introduce
… The third is denial of service.
… Impersonation of a receiver. Want a way to validate the QUIC connection to the intended device
… One mitigation is the J-PAKE proposal, to create a shared secret that's passed out of band
… A password or QR code displayed on the TV, creates a shared secret
… Something to use in combination with the out-of-band secret is a third party that can sign a public key, so the device can use the key as a trusted identity
… Want a distinction in the UI for trusted vs untrusted devices. Analogous to the TLS connection indication in Chrome
… Also in the advertisement process, look for collisions between MAC addresses or friendly names, and flag these to the user
… If the user changes the name, it may imply that trust should be broken with that device, and have to go through the authentication process again
… The second analogous case is a device acting as a controller that impersonates another device
… Important, as we don't want to allow devices to access presentations without there being a relationship between those two devices
… When the user verifies the pairing process, create a client certificate
… Another thing we could do is look at the presentation ID itself, used for reconnecting to presentations, and harden it by tying it to the authentication mechanism
… Want to prevent brute force attacks
… Don't have a proposal for that yet
… Finally, in our protocol design, we shouldn't expose PII data without a valid client certificate and valid presentation ID

Peter: If I initiate a remote playback, could someone else connect and control that remote playback?

Mark: Currently not, but if in the future you want a remote media session where a device could have remote media controls
… Want to look at which controls could be PII, e.g., don't expose URLs to other controllers
… The third high level issue is denial of service
… A malicious device could poison mDNS caches, publish fake records
… Exhausting resources on the receiver
… Don't have great ideas for how to mitigate
… Maybe DNSSEC for filtering of mDNS responses, throttling of connection attempts
… These may just be implementation concerns
… If you have access to the LAN you can do things to prevent any network protocol from functioning, e.g., blocking packets
… May not be worth spending a lot of effort in this area
… We've been discussing protocol designs for authentication
… Looking at the security of the entire software stack, there are things outside the protocol design
… Security of the underlying platform and O/S, hardware backed crypto
… Verified boot
… Another attack vector is untrusted content that tries to exploit vulnerabilities in the platform
… Sandboxing and origin isolation
… Software updates are key, particularly in the TV ecosystem
… Educating developers on how to use the APIs, and UI guidelines for how to display specific data
… Auditing and managing the lifetime of keys
… These are broad areas of interest, not necessarily in scope
… What I think we should do as a group is look at the W3C security and privacy questionnaire
… We should research best practices for authentication
… TLS 1.3 uses ciphers more secure than before
… In J-PAKE, the length of passcodes and how long they remain valid
… We'll talk to people internally and get review from other W3C groups
… And get feedback from application developers, as they're the ones who will generate the high value data

J-PAKE

JPAKE slides (88-96)

Peter: [recaps state of progress from yesterday]
… Two parts to this: how to use this with TLS, and what the CBOR messages are
… Three options: JPAKE after TLS, JPAKE replacing TLS, or do something else
… With JPAKE after TLS, it would have a number of round trips
… Certificates from both sides, each side knows the other's
… Each side authenticates to the other using the shared secret
… JPAKE requires 3 RTT, but can be done with 2 RTT using the response to trigger the next step
… With round 1, the initiator provides some information and zero-knowledge proofs
… The JPAKE spec goes into the maths
… The initiator gives two sets of these, and the responder responds with two sets
… In the second round trip, completing round 2 and initiating round 3
… The third round is not actually part of JPAKE specifically
… The result of the first two rounds is a key known to both sides, can be used as a replacement in TLS
… If you want to verify that both sides have the same key, you need the third round
… Pass a double hash
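
[A sketch of the round-trip messages as TypeScript shapes standing in for the CBOR messages under discussion. Field names and encodings are assumptions; the Schnorr NIZK proof fields follow RFC 8235, and whether big integers go out as bytes or bignums was still open at this point.]

    // Schnorr non-interactive ZK proof (RFC 8235): V = g^v and
    // r = v - x*c mod q, proving knowledge of x without revealing it.
    interface ZkProof {
      signerId: Uint8Array; // e.g., the TLS certificate fingerprint
      gv: Uint8Array;       // V = g^v
      r: Uint8Array;        // r = v - x*c mod q
    }

    // Round 1: each side sends two key shares, each with a proof.
    interface JPakeRound1 {
      gx1: Uint8Array; proof1: ZkProof;
      gx2: Uint8Array; proof2: ZkProof;
    }

    // Round 2: one combined value, again with a proof of its exponent.
    interface JPakeRound2 {
      a: Uint8Array; proof: ZkProof;
    }

    // "Round 3" (key confirmation, not part of J-PAKE itself): each side
    // shows it derived the same key, e.g., by passing a double hash of it.
    interface JPakeRound3 {
      keyConfirmation: Uint8Array; // e.g., H(H(sharedKey))
    }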

Mark: [discussion of bigint, bytes, bignum as the CBOR type]

MarkW: What is the zero knowledge proof?

Peter: To start, you have g, p, q
… You provide g^v and r - it's all in RFC 8235 section 2.2
… The signer ID is picked by each side; we could put the fingerprint from the TLS connection there

MarkW: Is it something passed from one side to the other, or is there challenge-response?

Peter: The other side verifies it's correct
… Both sides provide a shared secret, then we do all the messages, the result being some keys

MarkW: It's passed out of band
… Are we proposing to standardise that?

Mark: We have to provide constraints to ensure a given level of security
… e.g., printing the value on the device, want to have constraints for passcodes of a certain length, entropy, rotation

MarkW: Which way do these go?

Mark: Easier to input on the browser than on the TV

Peter: It's a one time code, then thrown away
… Option B is that instead of doing TLS then JPAKE over the encrypted connection, we could use JPAKE to replace the handshake
… There's an IETF draft
… Not clear when this is going to happen, the draft is 2 years old, has some issues
… If we did this, it would work by using JPAKE as the handshake on the first connection, then swap certificates
… and use TLS certificates on subsequent connections

MarkW: TLS has cipher suite negotiation, JPAKE is elliptic curve, so different message formats?

Peter: If there were another form of JPAKE, would have to be different CBOR messages

MarkW: How do we review the security of this?

Peter: Let's come back to it, good question
… With TLS plus a challenge/response, you can go from a shared secret to mutual authentication
… A message that includes a challenge; the response includes a hash over the challenge and the password
… Using the two certificate fingerprints avoids MITM attacks
… With those two certs and the challenge given
… I believe this would be secure, and simple
… Comparing options
… Having JPAKE replacing TLS is hardest to spec, easy to get wrong, but uses 2 RTT on the first connection
… JPAKE after TLS is easier to spec and implement, so harder to get wrong, but use 3 RTT on the first connection
… Third option, challenge response after TLS is easy to implement, and hard to get wrong - but we'd need security review. It has 2 RTT on the first connection
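
[A minimal TypeScript sketch of the third option, under stated assumptions: SHA-256, a short passcode conveyed out of band, and the self-signed certificate fingerprints from the already-established TLS connection. All names are illustrative, not part of any spec.]

    import { createHash, randomBytes } from "node:crypto";

    // One side sends a random challenge; the other answers with a hash
    // over the challenge, the out-of-band passcode, and both certificate
    // fingerprints. Binding the fingerprints into the hash ties the
    // response to this TLS connection and defeats a man in the middle,
    // who has to present different certificates to each side.
    function computeResponse(
      challenge: Buffer,
      passcode: string,
      controllerFingerprint: Buffer,
      receiverFingerprint: Buffer
    ): Buffer {
      return createHash("sha256")
        .update(challenge)
        .update(passcode)
        .update(controllerFingerprint)
        .update(receiverFingerprint)
        .digest();
    }

    const challenge = randomBytes(32); // sent in the first message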

Mark: It seems there must be some precedent with the third option with WebRTC. Do you do the challenge response in a separate layer?

Peter: No, they're signalled out of band. You'd somehow convey the entire secret between the devices, show on the TV
… The out of band communication channel is trusted

MarkN: We do something similar at Intel. Do you have caching?

Peter: Yes, you wouldn't need to go through the process on subsequent connections
… You'd cache their certificate

MarkN: Does JPAKE account for key rotation?

Peter: New keys set on each QUIC connection

Nigel: Where you do the JPAKE and accept and store a cert, this implies you have trust from the JPAKE, so you'd have to independently verify the certs anyway
… It only works if you have independent scrutiny of the certs, so it seems something's missing

Peter: With JPAKE after TLS, the cert is self signed, and JPAKE is used to trust it

MarkW: You trust it because the user told you
… Would require collusion of the user

Peter: I recommend we don't do the JPAKE replacement of TLS, and consider dropping JPAKE for challenge/response
… If we don't want to drop JPAKE, we should do JPAKE after TLS

Mark: Tradeoffs with usability, length of code to input
… The challenge response seems simpler, reason to favour it for implementers
… We could research some of the existing solutions from Google and Intel
… A security review will be mandatory

MarkW: So we should put these in order of preference to focus the review

Peter: Anyone have a problem with dropping JPAKE?

PROPOSED RESOLUTION: Choose challenge/response model, ask for security review, and adjust based on feedback

Resolved: Choose challenge/response model, ask for security review, and adjust based on feedback

Support for off-LAN devices using ICE

ICE slides (97-99)

Peter: ICE converts a messaging or signalling channel to a data channel, not necessarily fast or high throughput
… Use that to bootstrap to a direct connection
… It needs an ID and password for the session
… ICE candidates are IP+port combinations
… Need to pass these to attempt the direct connection
… STUN is a mechanism where, if you're behind a NAT, a server conveys what it sees as your IP and port
… TURN is a protocol for going through a relay server
… ICE doesn't always result in a direct connection; it could end up going through a relay server
… The two sides can use separate STUN servers that don't need to know about each other; the same goes for TURN servers
… Each side can initiate ICE. There's a complication if both sides do it at the same time, but this can be worked out
… Two ways that ICE can be used. One is independently of OSP, then it's done over the discovered connection
… The second way is to put something about ICE into OSP for bootstrapping a connection
… For example, connecting from a browser to a TV on a LAN. Could use that to send the ICE session
… If we did that, when ICE is initiated we'd have an ICE start request
… [presentation of ICE messages]
… These would only be useful for bootstrapping from an existing connection to an ICE based connection
… For ICE to work well, you need a signaling channel the whole time
… It might have some value depending on the use cases we want to cover
… In the web context, we need some way to do the initial bootstrapping, exchange the candidates
… One way we could do that is use RTCIceTransport, and pass to the presentation API
… Or the Presentation API could create the ICE transport
… Alternatively, use RTCQuicTransport
… Might be useful where someone wants to do a presentation or remote playback to a server, which is then sent out to other systems
… Need some object that acts as a transport to pass to the APIs, the browser takes it from there
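
[A hypothetical sketch of the "inject a transport" idea. RTCIceTransport and RTCQuicTransport follow the ORTC-style drafts mentioned above, and startWithTransport() is an invented PresentationRequest extension; none of this is a shipped API, so every call here should be read as an assumption.]

    // Declarations standing in for draft / hypothetical APIs:
    declare class RTCIceTransport {}
    declare class RTCQuicTransport {
      constructor(transport: RTCIceTransport);
    }
    declare const signaling: { send(msg: string): void }; // app-provided

    const ice = new RTCIceTransport();
    const quic = new RTCQuicTransport(ice);

    // The application does its own bootstrapping: it exchanges ICE
    // parameters and candidates over a channel both devices know about
    // (e.g., a cloud service), then hands the resulting transport to the
    // browser to run the protocol over.
    // signaling.send(JSON.stringify(localIceParametersAndCandidates));

    const request = new PresentationRequest("https://example.com/player.html");
    // const connection = await request.startWithTransport(quic); // hypothetical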

Mark: We need to explore further whether we drive the ICE functionality more from the browser side
… In which case we need to expose a messaging channel for doing the signalling back to the application
… Or we allow the application to drive ICE and present that to the API
… We'd need to figure out where the authentication handshake fits in

Mark: [discussion of API shape]

Peter: Key thing here is we need another discovery mechanism besides mDNS, e.g., a service that the client knows to go to

Mark: Register the device as a presentation screen, or other mechanisms such as bluetooth
… Other APIs in development could help, Web Bluetooth, Shape detection API
… Could use bluetooth or high frequency audio to bootstrap a URL

Mark: Use case is a projector on another network than your device, need a way to bootstrap

Anssi: Next step for this topic?

Peter: Three things we could do: one is to add support for ICE to the OSP library, second is to make an extension spec for the Presentation and Remote Playback APIs that allow injection of the transport, third is to add OSP messages

Mark: We'll explore in our own implementation initially, will help inform future spec work in this area

Anssi: This is currently out of scope of the CG, let us know if we want to bring it into scope

Peter: There's an RFC for SIP, but we don't want to do that

Mark: It's not OSP specific

Peter: There's a SIP way, and XMPP way, and if we do it, also an OSP way

<anssik> [breaking for lunch, back at 13:00]

Streaming

Streaming slides (100-109)

Anssi: The purpose of this session is to decide whether we want to add streaming to the CG's scope

Peter: There are 5 sets of messages that would work for streaming. The most important is sending of audio and video data
… It's possible to send audio frames in separate QUIC streams so they can be delivered out of order, and build on top of the CBOR we already have
… Some messaging is needed to determine capabilities
… The most important parts are sending the media, and key frame requests to the controller
… [Audio frame example]
… A video frame can be quite large, so the overhead of the keys is minor; with audio, the relative overhead would be larger
… Codec-specific payload
… The start-time could have a numerator and denominator, but these can be inferred from the encoding-id and the start-time
… Can infer the end time from the start time + sample count
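
[A sketch of the audio frame message as a TypeScript shape, reconstructed from the discussion; the real CBOR keys and numeric encodings were still open. Optional fields default to values implied by the encoding-id.]

    interface AudioFrame {
      encodingId: number;      // implies codec, clock rate, channel layout
      startTime: number;       // in the encoding's clock units
      payload: Uint8Array;     // codec-specific bytes, e.g. one Opus packet
      duration?: number;       // only when it differs from the default
      sequenceNumber?: number; // optional, e.g. for a jitter buffer
      mediaTime?: number;      // occasional cross-encoding sync point
    }
    // End time is inferred as startTime plus the payload's sample count.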

MarkW: Other specs would call this the sample duration
… AAC uses a 24kHz clock but you can get either 24,000 or 48,000 samples out depending on the encoding
… The term "sample" is ambiguous - its meaning differs between MP4 samples and PCM samples

Mark: If you change the encoding id, how do you specify that this is a new version of a previous id? Is there a 'track' concept?

Peter: I'll get to that
… Regarding encoding, you can put anything in the payload and CBOR can tag it
… You could put it as bytes, and put the codec info in the encoding-id, or you could have an opus specific payload

MarkW: But that indicates the codec in every frame

Mark: One option is to have a generic payload of bytes, or we have some tagged format with Opus bytes or AAC bytes, and then the receiver can send those bytes to the codec

Peter: If we're trying to squeeze bits, then we should have the payload be bytes and make the codec implicit
… Things that change infrequently can go into the optional fields

Mark: So all new values would be optional
… The thing to take away is making sure this is extensible without making the parser too complex

Peter: In the optional area, the duration may change. Opus is typically sent at 20ms. If you detect you need to send more bits, can set to 120ms
… In WebRTC, the sequence number is used by the jitter buffer. I added it as an optional field here, if we think it's worth having

Mark: Since it's optional until we need it, we can mark it as potential for the future. We'd have to think about how receivers would behave if it were missing

Peter: It's nice to have a mechanism for synchronizing clocks. Not on every frame; every few seconds, include a value that can be compared between frames of different encodings, e.g., audio and video
… Not sure whether to use media-time field here

MarkW: Do you require the media time to advance at the same rate as the start time of the frame?

Peter: No, there may be clock drift. This is about synchronizing audio and video together

Mark: Does it need to be the common denominator between the audio and video clocks?

Peter: No
… Video frames are much bigger, so relative overhead of metadata is less
… The encoding id gives an implicit clock duration, rate and codec
… The video frame has a frame id. You don't always know the duration of the video frame. You need to be able to refer to previous frames
… It's more common to have codec specific headers or metadata with video than audio

Mark: What would be inside the payload here? May be interesting to come up with some examples

Peter: The other thing common with video frames is dependencies between frames, unlike for audio
… So we have an array of frame ids that the current frame depends on
… For a keyframe, this would be empty
… It gets more advanced with SVC

Mark: This has time dependencies and resolution dependencies
… Can we capture those dependencies with this?

Peter: We'll find out when we implement it

MarkW: If I'm receiving a fragmented MP4 video stream and copy it to an outgoing stream in this format, I wouldn't know which frames this frame depends on; MP4 doesn't tell me. I'd have to dig into the codec
… Parsing the container format won't be enough

Peter: The question is whether to have a field here that something in the future will depend on, and whether we are duplicating information

MarkW: Random access is a use case - how to know when to start without decoding the bitstream

MarkW: Would you send frames in presentation order or decode order?

Peter: They can be reordered over the network

MarkW: So the start times are not necessarily monotonic by frame id

Peter: Next is duration, optional and defaults to the encoding's value
… There's a video rotation, defaulting to the encoding's value. It's common to have a 90, 180, or 270 degree rotation

Francois: Also horizontal or vertical flip?

Peter: Only rotation is used in WebRTC
… I don't see a use case for that

Peter: The media time is for synchronisation, either between audio and video, could also be between two videos or two audios
… Lastly, there are data frames
… The most obvious case is for text tracks, but could be any data to be synchronised with the audio and video
… All the real info will be in the payload
… Again, we can discuss if this should be bytes or an arbitrary CBOR object
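
[Companion sketches for the video and data frame messages, again reconstructed from the discussion rather than from a published schema; names are illustrative.]

    interface VideoFrame {
      encodingId: number;            // implies clock rate, codec, defaults
      frameId: number;
      dependsOn?: number[];          // empty array => key frame
      startTime: number;
      duration?: number;             // not always known at send time
      rotation?: 0 | 90 | 180 | 270; // defaults to the encoding's value
      mediaTime?: number;            // audio/video synchronisation point
      payload: Uint8Array;           // may include codec-specific headers
    }

    interface DataFrame {
      encodingId: number;  // e.g., a particular encoding of text cues
      startTime: number;   // synchronised with the audio/video timeline
      payload: Uint8Array; // or an arbitrary CBOR object, still debated
    }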

Mark: We're already going to have to define schemas for text tracks in remote playback

Peter: We can define things that we'd expect to go in there
… The encoding-id describes the sequence of frames, e.g., a particular encoding of text cues
… We could think about typical data sent along with video or from a capture device
… e.g., input devices such as a mouse pointer

Alexandre_Gouaillard: Are you thinking about AR and VR use cases?
… There's the problem of sync there

Mark_Arita: The Immersive Web WG was discussing having multiple layers of video

Peter: We can support that
… I didn't add anything regarding instructions to the receiver for what to do with them
… You could send more than one video stream
… We have nothing yet about post-processing

Alexandre_Gouaillard: Do you assume the same transport?

Peter: This shares the same transport, and you should be able to synchronize

Mark: It's a good starting point, could bikeshed the structure details
… The video frame looks much like an audio frame with a few things added, could set up an 'extends' relationship

<takio> does it need a key-frame flag for video frames?

Peter: can have a keyframe flag by having an empty dependencies array

Alexandre_Gouaillard: The payload may have info about whether it's a keyframe

Mark: But the dependencies field is optional

Alexandre_Gouaillard: Is there a use case to justify leaving it unknown? Do we have a problem making it optional?

MarkW: [discussion of how to handle keyframes]

Peter: This information is really for the jitter buffer so it can decide when to feed it to the decoder, as frames can arrive out of order

Mark: You may need to wait for re-transmits to complete a frame

MarkW: This structure looks good to me

Anssi: The key thing is to decide whether this should be in scope

MarkFoltz: Of course, we can take a resolution to explore further before we commit to extend the charter.
… The amendment process is written in the current charter

MarkFoltz: I don't recall if anyone objected specifically to streaming when we rechartered the CG. I think it was mostly a question of setting the scope.
… Streaming would be important to add to the charter and to Open Screen at some point, because the 1-UA mode has streaming needs that will have to be addressed.
… But we have other priorities to deal with.
… So, mostly a timeline question.

Francois: No specific requirement from a W3C perspective. The CG process is whatever the current charter says it is. It sets rules for amendments. So that's it. It's up to you.

anssik: Maybe we can add it with a note that the priority is the current scope.
… Streaming is a phase 2 deliverable.

PROPOSED RESOLUTION: Start the process to add streaming in scope of the Second Screen Community Group as a phase 2 deliverable

Resolved: Start the process to add streaming in scope of the Second Screen Community Group as a phase 2 deliverable

Action: Anssi to start the process for charter amendment to add streaming in scope

Peter: If we start working on it, we'll need to think about media streaming capabilities, key frame requests and a few other things.

MarkWatson: You might want to add some HDR color identification in the capabilities because that's not in the codec.

RTC polyfill

RTC polyfill slides (110-113)

Peter: The idea here is "what can we do if the browsers do not implement the Open Screen Protocol?"
… Assuming they do not but are implementing some RTC technologies, there are things they may do.
… One example is Edge implementing ORTC specs and looking into RTCIceTransport and RTCQuicTransport.
… There are things you can do, but mDNS discovery and a mechanism to trigger the browser's receiver selection dialog are not among them.
… We could use ICE instead of mDNS and implement the Open Screen Protocol in JS, but the key here is that you wouldn't be able to use mDNS for discovery.
… If you did have this mechanism, the selection of the device would be an in-page mechanism.
… Actually, that's not really ICE for discovery. It's ICE to establish the connection following an out-of-band discovery mechanism.

MarkFoltz: Think about a list of friends on a social app, and you click on one.
… There are overlaps between this use case and our ICE discussion earlier.

Peter: The mechanism for that would be entirely driven by the Web app.

Louay: Another mechanism would be using audio fingerprinting, à la Shazam.

MarkFoltz: Anything that's going to require a sensor is probably going to require permission.

Peter: User experience would be that they would select the receiver in the page. Web-app specific thing. There will be no way for a device to find devices connected on the LAN unless there's another mechanism to do that.
… To stream out, you'd need some way to stream the content: getUserMedia, getDisplayMedia to stream a tab.
… The last piece is that you'd need an encoder. In some cases, that's not strictly needed. Right now, the encoder is embedded in RTCRtpSender, the decoder in the receiver.
… The WebRTC WG is looking at ways to making that more accessible.
… By far, the biggest missing gap is the lack of mDNS discovery.
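
[A small TypeScript sketch of the capture piece mentioned above. getDisplayMedia() is a real API; everything after capture (encoder access, the app-bootstrapped transport) is the hypothetical polyfill part.]

    // Capture a tab/screen (and audio, where supported) for streaming out.
    async function captureForPolyfill(): Promise<MediaStream> {
      const stream = await navigator.mediaDevices.getDisplayMedia({
        video: true,
        audio: true,
      });
      // In the polyfill scenario, these tracks would be fed to an encoder
      // (today that means going through an RTCRtpSender) and sent over the
      // app-bootstrapped connection instead of a native 1-UA path.
      return stream;
    }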

MarkFoltz: [mentions the Network Discovery API and security issues triggered at the time]. It's a matter of finding a workaround for this.
… Or to use a cloud server.
… For app-to-app communication, the second option is probably doable.

The group discusses layering of technologies, ways to split between natively supported features and features that could be provided at the JS level by some third-party library.

MarkFoltz: What I was trying to say earlier is that we'd be pushing the work from browser vendors to app implementers, which may or may not be an acceptable trade-off.
… Generally, as there are security implications, that's something that we'd prefer to do natively.
… The other rationale for doing more things in the browser is to avoid sharing too much detail with the app about what's available on the LAN, but we could perhaps think of an mDNS variant specific to the Open Screen Protocol that only exposes the information that's absolutely needed, possibly on a per-origin basis.

cpn: Right, all the concerns around mDNS that had been raised previously still exist today.

MarkFoltz: Yes.

Francois: But an mDNS spec dedicated to open screen would still leak a lot of information on the LAN to the requesting Web app.

Peter: Right, but most of it is already known with WebRTC

Louay: We did some experimentation using an iframe from another domain as an intermediary.
… The iframe uses cookies to keep track of things.
… Each device would have a connection to the server.

MarkFoltz: If we go down this road, it would be interesting to understand whether browser vendors would converge on a registry of libraries that can initiate connections.

Goals for 2019

<anssik> [Meta] Open Screen Protocol 1.0 Specification

<anssik> Open Screen Protocol Editor’s Draft

MarkFoltz: The remaining work is to take the resolutions from Berlin and from this meeting to GitHub and into the spec document, with an appendix describing all these CBOR messages
… and to complete the privacy and security questionnaire. If we resolve the issues currently open, I believe we can consider that the main deliverable of this community group.

anssik: Wondering when would be the right time to start a TAG review.

DavidBaron: The TAG has been reviewing specs at an earlier stage
… It's hard to review specs without good ways to understand what the spec is about. Code examples are good. Pulling things out of several documents is doable, but if you have something that lets reviewers make sense of the spec, that's a good moment to ask for a review.

MarkFoltz: We can build up an explainer from the API and the WG/CG charters.

DavidBaron: Something that doesn't require following 15 links to figure out what's going on is good.

anssik: Part of this work has already been reviewed by TAG. We chose CBOR based on TAG's guidance.
… OK, I'm hearing that we don't need a fully written document

MarkFoltz: Still not entirely clear whether we need an explainer document separate from the spec

DavidBaron: as long as design questions are available somewhere, a full explainer may not be needed. We have an explainer's explainer

<dbaron> TAG's document on explainers ("explainer explainer"): https://github.com/w3ctag/w3ctag.github.io/blob/master/explainers.md

MarkFoltz: OK, that seems a priority. The spec itself will be fairly detailed at the end of the day. I'm not sure to what extent the TAG is willing to dive into implementation details.

anssik: We identified a few cases where we'd like TAG feedback

MarkFoltz: The explainer's explainer and the security and privacy questionnaire are the two things I'd like to focus on next.

anssik: That's probably even for this year. What are the goals for next year?

MarkFoltz: The goals of this group are completing TAG review(s) and finishing a very detailed spec, hopefully by the end of the year, and considering that as a 2019 outcome for the group.

anssik: Question about the formal standardization for this work afterwards. W3C, IETF.

MarkFoltz: Yes, assuming that we want to put it on the standards track, we need to figure this out. My initial thoughts are that most of the spec should go to IETF.

Peter: Certainly that would mirror the WebRTC model.
… Are there other examples? What happened to WebSockets?

anssik: Same thing, the protocol went to IETF, but there was not much engagement from the editor with IETF, if I remember correctly.

Peter: Any example of a protocol that was done in W3C but not in IETF?

sangwhan: Linked Data Platform

MarkFoltz: WebDriver

anssik: Any feedback from experience in WebRTC?

Peter: Definitely some overhead, switching hats. The mismatch between TPAC and IETF meeting dates makes things awkward from time to time.
… The advantage would come if there were significant overlap with other work at IETF.
… If we had significant overlap with the CBOR group or the mDNS group, then that would make sense.
… I don't see any significant need for now.

MarkFoltz: QUIC and CBOR are the two immediate "new" needs that we have. We can simply wait until these things are done and use them, and engage with those groups if we need to.

anssik: Can this work happen at W3C?

Francois: Provided there's enough support here and no objection from IETF, I don't see why not. In particular, it's not a pure protocol in the sense of a low-level protocol; it's an application-level protocol.

anssik: A bit too early to make decisions, apparently.

Peter: After we finish the first milestone and we look at streaming, balance may change.

<anssik> Second Screen Working Group Charter

MarkFoltz: Out of scope of this group, we will want to push our implementation of the open screen library.

anssik: We'll need to decide on transition(s) by the end of next year when the WG charter expires.

<dbaron> I think I misremembered the charter graph at https://‌w3c.github.io/‌charters-dashboard/

Conclusion

anssik: I'm happy with the discussions here. Thanks a lot to Mark and Peter for the thorough work. We managed to attract a few other participants here, including Mark Arita with a security background.
… We're making progress.
… Thanks to our scribes!
… No decision to meet before next TPAC yet, but we can plan that later on if needed. We don't need much. Room, projector, coffee, chairs, nice weather.

MarkFoltz: Minimizing total travel by aligning with other events people would be going to would be good.
… FOMS next year on the East Coast. Perhaps the gaming workshop.

<dbaron> https://wiki.csswg.org/planning/hosting is some notes on things the CSS WG has learned in the past about things that people have done wrong when hosting a meeting :-/

MarkFoltz: We can look at possibilities.

anssik: Yes, some time around May.

MarkFoltz: I also don't mind teleconferences as long as we don't do that too often.
… We can talk offline.

Summary of action items

  1. Anssi to start the process for charter amendment to add streaming in scope

Summary of resolutions

  1. Explore and document the pros and cons of specific schemes in a format that the TAG would fancy reviewing (e.g. including thumbs-up cats)
  2. Investigate and include some solution to advertise custom scheme in mDNS
  3. Choose challenge/response model, ask for security review, and adjust based on feedback
  4. Start the process to add streaming in scope of the Second Screen Community Group as a phase 2 deliverable
Minutes manually created (not a transcript), formatted by Bert Bos's scribe.perl version 2.49 (2018/09/19 15:29:32), a reimplementation of David Booth's scribe.perl. See CVS log.