See also: IRC log
See also the minutes of day 1.
[Shih-Chiang demos a 1-UA implementation of the Presentation API in Firefox using a Chromecast device]
Shih-Chiang: Mirroring starts the presentation.
... Without a connection, mirroring to the Chromecast. The presentation is triggered via session.start().
... Patches are still landing. May be usable on Firefox Nightly soon.
Anton: Media Router is part of the device. Remote Display, used to initiate, is part of Play Services.
Anssi: Let's look at the working group charter.
... Presentation API final recommendation to Q2 2017.
Francois: Keep changes to a minimum.
Francois: End date is October
2017. Scope hasn't changed.
... Dropped the term "Presentation" from the WG name. In practice no one uses it.
... The level 2 item
allows the WG to work on a spec. Does not commit the group.
Mark_Foltz: Will Level 2 go back to first base (ED)?
Francois/Anssi: Yes. If the changes are minor, it can go quickly to FPWD.
... Remote Playback API was not in the first charter. Now it's
in a separate document.
... Expected completion of Q3 2017 may be optimistic. However
implementations already exist.
... Updated dependencies, working group names.
... Licensing had custom text. The rule that allows us to
publish with software + document license is now the
default.
... Another change is that we are going to shift the dates. Implementations are focusing on 1-UA mode. We can relax the exit criteria for Candidate Recommendation if necessary.
Anssi: For 2-UA mode, protocols are being incubated. May delay until protocols are mature.
Francois: At the end of next week, the charter goes to AC review, which lasts 4-5 weeks.
Mark_Foltz: What about changes that need to be upstreamed?
... Like the sandboxing flag.
Francois: Needs test coverage.
Will be upstreamed into HTML 5.2.
... Prefer to remove it from our spec to avoid duplicate
content.
Anssi: The normative reference policy is enforced in principle. We should make it easier to publish specs for normative referencing.
-> Guidance for HTMLMediaElement, HTMLAudioElement, HTMLVideoElement behaviors during remoting (#41)
Anton: Looking at whether we need
to spec how the Media Element API works in a connected
state.
... Philip J suggested we list what we have.
... Listed all properties. Grouped into areas.
... If too restrictive, there could be restrictions on seeking in the original element, but then they may not comply with the element spec.
... Would like to be more specific about what must be supported. Don't want to copy the entire media element spec into the Remote Playback API.
... How to avoid enumerating all of this in the Remote Playback spec?
Francois: MediaStream copied the
element spec and changed behavior of features.
... Lists all attributes that are specific to media
streams.
Mark: Enumerate behaviors that
must be different for all implementations.
... And behaviors that may be difficult to implement. Change
MUST to SHOULD or MAY.
... Capture a diff from the default media element spec.
Anton: Want a table of behaviors that should behave differently.
... And those that might not work.
Francois: That part would not be normative; application developers should not expect these to work.
Anssi: Unless there is a good use
case, don't extend semantics.
... (for MediaError)
... Could flex the definition of a network error.
Anton: Changing states will be the hardest. Do we want to spec that in some way?
Francois: Changing state transitions is hard. As a rule of thumb can't change algorithms.
Anssi: Minimize patches to
existing spec. Think about the long term plan to upstream these
changes.
... Custom elements are slowly getting upstreamed into HTML.
Could get pushback if keep patching.
... In upstream spec, define extension points for changing
behaviors.
The group discusses high-level media features: seeking (and fast seeking), and how playbackRate could work or break in a remote context. Mounir suggests just leaving some features as "best effort". The problem is that this may break content, e.g. apps that use playbackRate to synchronize content.
Mounir suggests that all the operations that should apply to the remote playback be sent to the remote side and, conversely, that all the operations and changes happening on the remote side be sent to the local side.
Mark points out that it may be difficult to get remote players to support all features properly. Hence the need for SHOULD in some cases. For example, Philipp mentioned WebVTT as a challenge. Going from MUST to SHOULD is problematic though, since that changes the expected behavior.
Proposed plan is to find the features that are challenging to implement, and for those that need to switch to a SHOULD, figure out how the changes will impact developers.
Mark_Foltz: For features that change from MUST to SHOULD, should define expected behavior if the feature is not implemented.
[End of WG meeting / Start of CG meeting]
Anssi: The CG is 3 years old and transitioned to the WG, but improved interoperability for the Presentation API supposes that there is agreement on the underlying protocols. The CG re-chartered recently to work on that.
Mark: Initially, the CG was
chartered with protocols out of scope, to work on features
first. That poses a challenge for interoperability though.
Cross-vendor support is hard to achieve without agreement on
standard protocols.
... Now that the work on the Presentation API is stable, we
decided to re-charter the CG to work on protocols.
... Goals are to 1/ enable interoperability, 2/ encourage
implementations of the Presentation API thanks to the existence
of a protocol, 3/ establish complementary specs for the
Presentation API.
... In particular, discuss network details.
... This CG is for incubation. We'll want to transition the
deliverables to the appropriate venue. In parallel, we may want
to do some implementation work to provide feedback.
... The goal is to move to a standards-track somewhere, whether
that means W3C or IETF is an open question.
... Looking at the scope of work, the scope was initially
larger, we pruned some of the features and that's good to
better focus it. Other interesting use cases could be explored
independently.
... We're focusing on the common scenario: control of a second
screen in the LAN. We also want to make sure that we can
implement all of the Presentation API and the Remote Playback
API. The former one is mostly stable now, the latter is still
work in progress but we're starting to see needs there.
... The priority focus is on the 2-UA mode. But the 1-UA mode
is interesting too, hopefully the discovery and connection
establishment can be the same in both cases, with control
message communications replaced by WebRTC-like offer exchanges
and stream passing.
... We'll want to consider extension points for the
protocols.
... I think we want to leverage existing standards as much as
practical. We should consider the work we're doing as an
application of these standards.
... We're not going to define video codecs, work on 1-UA mode
right away, discuss MSE/EME support.
... We're also going to leave non-IPv4/IPv6 protocols out of scope to start with. And we're not going to develop specs that use proprietary protocols for the Presentation API.
... Deliverables: 1/ discovery and communication establishment.
2/ control channel for the Presentation API. 3/ control channel
for the Remote Playback API.
... If there is interest in expanding the scope, participants
can request to change the charter.
Anssi: Note we may publish non-normative reports as well, such as use cases, requirements and other notes.
Tidoust: Two main differences between WGs and CGs. License agreement: in a WG, participants agree that the final spec is licensed under a royalty-free patent policy.
In a CG, participants agree to license their contributions under a royalty-free patent policy.
... Companies are encouraged to license the final spec too, but it is not mandatory.
... A CG can publish specs as it wants. No AC review, no draft/report process, no intermediate steps.
... CGs are open to all and free to join; non-W3C members can join too.
Anssi: I'm happy to chair this CG. There's a process to change the chair if needed.
Tatsuya_Igarashi: Do we assume that the local network is always connected to the Internet?
Mark: I don't think the charter
specifies one way or the other.
... My opinion is that the Web should work in "disconnected"
networks in principle. It would be a nice property to keep it
this way. But it may not be entirely feasible.
... So I'd say it is in scope but I don't think we can
guarantee that this will be supported at this stage.
Anssi: Goal of today is to gather initial inputs, and discuss who could be interested to lead some of this work.
Mark: I have some thoughts about protocols and potential options. I want to take a broader approach here.
Anssi: We won't rubberstamp anything here.
Mark: Right. For different areas, we'll want to identify who might want to investigate them.
Anssi: Are we expecting new participants for implementers side?
Mark: Maybe, some expertise (security, privacy, etc.) might be useful.
Shih-Chiang: I need to double-check with other Mozilla people to see if this is on our priority lists.
Anssi: OK, let's put ideas on the table and see how we split the work, as well as who could lead each part.
-> Mozilla's WebAPI/Presentation API protocol
Shih-Chiang: To start with, we need some way to discover devices and establish the communication. Then we also need some way to share capabilities so we can figure out what the device can do, and typically whether the device can fulfil the presentation request in question.
... That's the first step. Once discovered and selected by the
user, the next step is to launch the application. This requires
passing the URL around as well as the presentation session
information. We need to support controlling messages to
connect, disconnect, close, terminate presentations.
... We also need some way to pass some user settings to the
receiver device, such as the user's locale.
... After application launch, using control messages we
defined, we need to define what protocol gets used for the
communication channel between the controlling page and the
receiving page.
... In parallel, we need to enforce security measures: device
authentication, data encryption and data integrity.
... These are the basic requirements I have in mind.
... The rest of the Wiki page is proposed protocols to address
these requirements.
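The control messages Shih-Chiang lists (launch with a URL and session information, plus user settings such as locale) can be sketched as a hypothetical JSON message. All field names here are illustrative assumptions; no message format has been agreed yet.

```python
import json

def make_launch_message(url, session_id, locale="en-US"):
    """Build a hypothetical 'launch presentation' control message.
    Field names are illustrative only; nothing here is an agreed protocol."""
    return json.dumps({
        "type": "launch",
        "url": url,                      # presentation URL to open on the receiver
        "sessionId": session_id,         # identifies the presentation session
        "settings": {"locale": locale},  # user settings passed to the receiver
    })

msg = make_launch_message("https://example.com/slides", "abc123")
decoded = json.loads(msg)
```

Similar messages would cover connect, disconnect, close, and terminate; the point is only that each requirement maps to a small, self-describing message.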
Anssi: Do these requirements match your thoughts?
Mark: It's pretty close. Two
comments I would have: capabilities is not exposed to the API,
so that may not be a strong requirement. It's fine as an
extension.
... It also depends on the methods by which the user agent
might want to determine the availability for a given URL.
... The second comment is that I think the security
requirements depend on type and number of network transports in
use between devices and that's not determined yet.
... Otherwise, that seems good.
... I have similar requirements in my presentation.
Francois: For the communication channel, both "one communication channel per presentation session" and "one communication channel that multiplexes presentation exchanges" are possible, right?
Mark: Yes, that's two possible options.
-> Mark's Open Screen Protocol requirements and proposed options
[Mark presents slides]
Mark: Starting with requirements.
We need discovery of receivers on a shared LAN.
... Some authentication mechanism and reliable communication
channel is needed.
... Some non-functional aspects include usability.
... Also, we should be careful that we're not leaking
information that would compromise privacy.
... A common use case is mobile devices, both sides may be
constrained in terms of memory and battery life. That may have
implications on protocol operations.
... We want a broad range of implementations. It would be
interesting to target a platform or a device that might be
difficult to implement this on, like a Chromecast device, a
Raspberry PI.
... It would be good to understand the limitations of such
devices.
... Finally, we need to have ways to extend the protocols to
add new features (remote playback, Presentation API v2, etc.)
and retain backward compatibility.
... In particular, crypto protocols evolve over time, some way
to update cypher-suites will be needed.
... That's all for requirements. 4 main parts: discovery,
transport, application protocol and security.
Mark: For discovery, in the work we've done in Chrome we've used 2 different methods. One was to use SSDP for DIAL support.
... Controller continuously transmits multicast queries via UDP
and the receiver that is interested responds to the
queries.
... Our experience is that it is very simple to implement. Open
a socket, 3 queries per 2 minutes and that's pretty
reliable.
... Downsides: it requires continuous polling. Caching rules
are unclear. There is an extension for IPv6 but it's not an
official part of DIAL.
... Where things fall down is when we need to fetch the device description document; there can be proxy issues.
... Finally, XML parsing requires a separate sandboxing in
Chrome for security.
... The second method we use is based on mDNS/DNS-SD (aka
Bonjour)
... The receiver publishes DNS records for discovery on the .local domain.
... Our experience with this method is that listeners are built
into three common platforms including iOS and Android.
... It is on standards track and defines caching rules and
supports IPv6.
... We have found some issues when the listener is not supported on the platform; then reliability suffers.
... We think this is due to firewall software which conflicts
with the port used in mDNS.
... For battery, it depends a bit on how much caching you use.
Finally, the client may think that the receiver is still there
when it actually went away.
... Flushing the cache may be possible.
... In general, we consider DNS-SD a preferred option due to
the standards state in particular.
... We think we should propose a service name, an instance name, and a TXT record format.
... We should also gather data on reliability.
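The TXT record format Mark proposes to define would follow the DNS-SD convention of length-prefixed key=value strings (RFC 6763). A sketch of encoding and decoding such a record; the service concept and the key names (`fn` for friendly name, `fp` for certificate fingerprint) are assumptions for illustration, not agreed values:

```python
def encode_txt(pairs):
    """Encode key=value pairs as DNS TXT RDATA: each string length-prefixed,
    as DNS-SD (RFC 6763) prescribes."""
    out = b""
    for k, v in pairs.items():
        s = f"{k}={v}".encode("ascii")
        if len(s) > 255:
            raise ValueError("TXT string too long")
        out += bytes([len(s)]) + s
    return out

def decode_txt(data):
    """Parse a TXT RDATA blob back into a dict of key=value pairs."""
    pairs, i = {}, 0
    while i < len(data):
        n = data[i]
        k, _, v = data[i + 1 : i + 1 + n].decode("ascii").partition("=")
        pairs[k] = v
        i += 1 + n
    return pairs

# Hypothetical record for a second-screen receiver; key names are assumptions.
txt = encode_txt({"fn": "Living Room TV", "fp": "ab12cd34"})
parsed = decode_txt(txt)
```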
Shih-Chiang: mDNS/DNS-SD can also support discovery over TCP, not only UDP, so it is more reliable than SSDP.
Mark: Another area we'd like to investigate is around ports and multicasting on port 5353.
Shih-Chiang: Only one port for exchanging messages, so only one platform listener in theory.
Mark: Can be worked around in
practice.
... One action item is to gather reliability data from our
implementation so that we can get a baseline on how many
packets we exchange.
... I'll see how I can share this information.
... With DIAL, most of my concerns are around XML.
... Fetching an XML document has a number of issues, because it
goes through the usual HTTP stack, possibly through proxies.
Two, parsing XML is generally considered unsafe.
... We may be able to remove dependency on the XML parsing.
Louay: Why do you need polling for SSDP?
Mark: The DIAL implementation of SSDP is query only, but it's true that SSDP is not, so we may want to consider it in isolation.
Shih-Chiang: The issue I have with SSDP vs. DNS-SD is that implementations differ in the XML they produce and don't always follow the right format.
Louay: You may not need an XML description in SSDP. SSDP is used in TV devices, which could be receiver devices.
[Lunch break]
<Louay> Note: SSDP proposal for Physical Web
Mark presents transport requirements.
Mark: Some sort of flow control
would be useful. More optional would be some compression
mechanism.
... The trade-off is with CPU/battery usage.
... Whether we want to multiplex communication channels is an
open question. There are pros and cons to both
approaches.
... You want to be extra careful about identifying connections
if you multiplex. It also makes the application protocol a bit
more complex.
... One thing we found in our implementation is that, since it
may be difficult to tell whether a device is still connected,
some keep-alive mechanism is needed. TCP keep-alive does not
quite seem to work.
... Finally, some non-functional requirement is that we don't
want both ends to be in a disconnected state for too long when
they re-establish connections.
... We'll want to minimize the latency there.
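The keep-alive requirement Mark describes (TCP keep-alive not being sufficient, and wanting to detect a vanished peer quickly) can be sketched as a small application-level ping/pong tracker. The interval and timeout values are illustrative assumptions, not agreed figures:

```python
import time

class KeepAlive:
    """Minimal application-level keep-alive tracker: send a ping every
    `interval` seconds and declare the peer gone after `timeout` seconds
    of silence. Values are illustrative; a clock is injectable for testing."""

    def __init__(self, interval=5.0, timeout=15.0, clock=time.monotonic):
        self.interval, self.timeout, self.clock = interval, timeout, clock
        now = clock()
        self.last_sent = now   # when we last sent a ping
        self.last_heard = now  # when we last heard from the peer

    def should_ping(self):
        return self.clock() - self.last_sent >= self.interval

    def on_ping_sent(self):
        self.last_sent = self.clock()

    def on_pong(self):
        self.last_heard = self.clock()

    def peer_alive(self):
        return self.clock() - self.last_heard < self.timeout
```

This also illustrates Shih-Chiang's prioritization point: in a multiplexed design, these pings would want to jump ahead of queued application messages.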
Shih-Chiang: Compression and latency are less of a problem in a local network environment. I would not take them as mandatory requirements.
Mark: Right, if we don't multiplex, we need to go through hand-shakes for each connection.
Shih-Chiang: What about prioritization of messages on the controlling channel? For instance, keep-alive messages might have higher priority
Mark: That's a good point.
... I don't have specific transport protocol at this time. I
listed 5 options, and suggest we investigate further.
... One option is to use TCP + HTTP/2, since HTTP/2 allows for
bi-directional protocols. It can handle all of the things we
need for the API. It has headers, supports compression and
other requirements. I do not know about prioritization
though.
Shih-Chiang: It might have a fairness mechanism.
Mark: The one that is most
similar is UDP + QUIC. Google is primarily developing that
protocol. QUIC is designed to minimize a number of round-trip
exchanges so has minimal latency.
... It will also support new versions of TLS.
... It would be good to understand the pros and cons of each of
them.
... Another option is to use plain TCP + WebSockets. We could
reuse a lot of that for the application messaging because the
API is very similar.
... It may not support multiplexing out of the box
though.
... Another option is UDP + SCTP as done for RTCDataChannel.
... I'm not deeply familiar with that approach.
... I'm also not familiar with how you can establish an RTCDataChannel outside of WebRTC, but I suppose that's easy.
... Finally, we could do TCP + Custom protocol. I think that
can be an option if none of the other options look reasonable
enough.
... I'm hopeful one of the other protocols will be
suitable.
... Note HTTP/2 and QUIC are relatively new protocols.
... If possible, it would be interesting to do some
benchmarking.
... Some non-functional requirements is that the application
level protocol should be easy to implement on top of the
transport protocol. Security should be easy to address as well.
Standards need to be mature enough and implementations
available as well as easy to integrate.
Anssi: Can we filter out some options based on existing data?
Mark: We publish some
expectations about how responsive applications should be, which
places some restrictions on what these protocols can be.
... We try to be efficient in the code so that network
operations are not much shorter than their visible outcome for
the application developer
Tatsuya_Igarashi: What about media transport?
Mark: This is not included here. I would think it will be what WebRTC uses for media transport.
Tatsuya_Igarashi: Do you think another protocol should be defined for media playback?
Mark: In the 2-UA "media flinging" case, the media is not streamed, the connection is used to exchange state information. Media streaming is currently out of scope for the CG.
Francois: We have a signaling channel and a communication channel. Does it make sense to envision different transport protocols for both channels?
Mark: It depends on how much
multiplexing we end up doing.
... Things should be compatible with the WebRTC case in the
end.
Louay: I can study TCP + WebSockets. Also interested on discovery.
Anssi: You did some background study on interoperability, I think.
Louay: Yes
... Thinking about HbbTV case, where they chose DIAL for
discovery, WebSocket for communication. They also have
synchronization endpoints, which is not part of our discussions
here.
Mark: QUIC is basically a way to
multiplex multiple reliable streams over UDP.
... It's optimized for HTTP, to fetch resources in
parallel.
... Normally, over TCP, if a packet gets dropped, you have to request it again. With UDP, it's easier to reconstruct the missing packet afterwards.
Shih-Chiang: Another advantage is to have better TCP features in user space. Easier to share libraries to update QUIC support instead of relying on platform update.
Mark: Yes. It would be good to understand the roadmap.
Mark: I think there's a lot of freedom here to do things differently.
... Looking into it, this felt a little more webby, a more HTTP-oriented protocol.
... My proposal is very REST-like.
... If we use HTTP/2, then it makes sense. We may want to
adjust if we chose a different transport.
... To find more information about a receiver, you will issue a
GET on the receiver to get its name as well as on-going
presentations.
... We don't have a feature in the Presentation API right now
to tell whether a presentation is public or private.
... That could be something to add at the protocol level then
in v2.
... I'm using JSON here, that can be discussed. I'm not a fan
of XML.
Shih-Chiang: Some of this
information can be baked into e.g. the DNS record during the
discovery phase.
... It could make sense to collect that information immediately
to speed up screen detection.
Mark: There's a layering
approach. If you're connecting to the device through other
means, then you may need that information. I'm trying to design
the protocol independent of the discovery protocol. That does
not prevent optimizations.
... To start a presentation, we could send a POST request that
gives you a path to the presentation and connection
resources.
... A terminate would be a DELETE, a reconnect a POST request
on the presentation resource, and a close another DELETE on the
connection resource.
... These requests are all coming from the controller at this
point.
... For messaging, we need to use bi-directional features of
the transport protocol.
... We need to figure out what headers to send, what caching
could happen, etc.
Francois: REST "purity" would use PUT to create resources (returned in a Location HTTP header), although in our case, we create both a presentation resource and a connection resource.
Mark: OK.
... For messaging, you would send a POST request to send messages. To listen for messages, we would use the subscription mechanism of the transport protocol, which would simulate a GET request.
Shih-Chiang: OK, we would still need to frame each of the messages.
Mark: If we choose a transport that is not based on HTTP, I think we will want to design our own command protocol that mimics HTTP verbs.
Shih-Chiang: Because the control message is bi-directional, I was not thinking about HTTP.
Mark: We should evaluate
bi-directional communication in HTTP/2 to decide whether it's a
good fit.
... This proposal would be a good fit for HTTP/2 or QUIC, but
it needs to be revisited if we choose a different transport
protocol.
... One of the things that made DIAL challenging is that it
uses HTTP/1 which does not have bi-directional support.
... Now we can find a transport protocol that addresses both
client-server and bi-directional needs.
Mark: Stepping back on security
aspects, the focus in my current thinking is that it's more
important to authenticate the receiving UA to the controlling
UA than vice versa.
... The presentation must be made aware that there is no
guarantee and that it needs to apply application logic to
validate the authentication of the application that connects to
it.
... As such, I'm putting app to app authentication, as well as
authentication of the controlling UA to the receiving UA out of
scope.
Shih-Chiang: The certificate may not need to be signed by a trusted third-party. In our current experimentation, we use anonymous authentication exchanges to make sure the certificate is from the other peer.
Mark: So you do mutual authentication?
Shih-Chiang: Yes.
Mark: That's a useful property to
have. If the handshake has mutual authentication property,
that's fine. In my Chromecast case, authentication gets based
on my Google ID.
... Of course, one goal is to maintain confidentiality of data
exchanged between peers.
... Since we're targeting 2-UA, we're not worried about
authentication issues with media streams.
Tatsuya_Igarashi: How is the authentication established?
Mark: The user will have to be involved. At least, in the proposal I have, the user is involved. There may be ways to bypass the user. That seems more vendor-specific though.
The group discusses the impossibility for apps to authenticate the app running on the other end if non-secure origins get used. This links back to previous discussions on restricting the Presentation API to secure origins.
In some environments, such as HbbTV, the communication channel is currently non-secure. Is it OK for communications between non-secure origin apps? Probably not! The group will prefer secure protocols no matter what.
Mark: Continuing with
authentication consideration. Private/Public key approaches
look good. No password!
... My current reading of the Web security community is that
they are trying to invent the next version of TLS which is TLS
1.3, also getting rid of crypto algorithms that are no longer
secure.
... I believe it is still in the standardisation process.
Shih-Chiang: I'm following progress. The editor of TLS 1.3 is from Mozilla and he keeps updating the draft. I haven't seen a clear consensus on the spec so far. I can check internally about status.
Mark: Thanks. Based on
discussions both at Google and in other places, there are some
possible solutions that try to leverage public DNS to solve
private LAN authentication, and that seems to complicate things
and leak private information.
... My preference is to avoid this.
... Two possible approaches: a key-based approach where the
receiver has a long-lived public/private keypair which can be
used to sign short lived certificates. The controller will have
to check that the long-lived public key belongs to that
device.
... The user would authorize the receiver on first use and
verify the hostname. The public key hash would be displayed
with a friendly name.
... Some open questions e.g. on how to revoke keys, etc.
Shih-Chiang: The approach we have
right now is to exchange certificates that are self-signed. We
do carry the hash of the certificate in the DNS TXT
record.
... When we receive the certificate, we can check whether that
certificate received through the anonymous procedure matches
the one that is registered in the DNS TXT record.
... If it matches, then we assume it's good to use.
... The assumption is that the DNS TXT record for the
fingerprint of the certificate is correct.
Mark: That probably is not an
assumption that would make it through our security folks.
... We need something that the user needs to verify. It can be
a PIN code, or a hash.
... The second approach that I've been thinking about is to ask
the user to enter a short code. PAKE then gets used to generate
a shared secret.
Shih-Chiang: Once authentication has passed, you are able to encrypt with a shared secret key. We exchange the certificate through that channel for later connections.
... I will share some documentation about our design with PAKE
procedures.
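The short-code pairing Mark and Shih-Chiang discuss ultimately yields a shared secret on both sides. The following is a toy illustration only: it stretches a user-entered code into a session key with PBKDF2 and checks that both ends derive the same key. A real design would use an actual PAKE (e.g. SPAKE2 or J-PAKE), which unlike this sketch never exposes the code to offline guessing:

```python
import hashlib
import hmac

def derive_pairing_key(code: str, salt: bytes) -> bytes:
    """Toy key derivation from a short pairing code (NOT a real PAKE):
    PBKDF2-HMAC-SHA256 with an illustrative iteration count."""
    return hashlib.pbkdf2_hmac("sha256", code.encode("utf-8"), salt, 100_000)

salt = b"\x00" * 16  # in practice a random nonce exchanged during pairing
k_controller = derive_pairing_key("481 263", salt)  # code typed on controller
k_receiver = derive_pairing_key("481 263", salt)    # code shown on receiver
same = hmac.compare_digest(k_controller, k_receiver)
```

As Shih-Chiang notes, once this channel exists, certificates can be exchanged over it so later connections skip the pairing step.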
Mark: From a usability point of
view, a one-time pairing is important.
... Under certain circumstances, e.g. change of device friendly
name, change of location, we may need to re-pair.
... I think we're pretty aligned on the overall security
procedures.
... I'm going to review this with security people at
Google.
Anssi: I created a new GitHub Open Screen Protocol repository for the CG.
... Let's create markdown
files in the GitHub repo, one per area
... The four areas: discovery, transport, protocol,
security
... Looking for contributors
Mark: I'll try to get some
performance data
... Can we create a meta-issue for each work item?
Chris: I am also interested in contributing
Anssi: We're on track with
testing
... We may produce another CR and adjust the exit criteria, but
let's see
... The protocol work may motivate some V2 features
Mark: We want to allow documents to announce themselves as presentations
Anssi: The Presentation API is in
good shape, thanks to contributions
... With the Remote Playback API, the guidance for
HTMLMediaElement behaviours is the most important issue
... We have a plan for next steps
... We didn't discuss wide review for the working draft, should
be towards the end of this year
... There are some changes to milestones to make in the WG
charter
... The CG kick-off was successful, we have a good list of
requirements, identified potential technologies that we need to
evaluate
... That work will take a while, there's background work to
do
... I don't want to set timelines for this
Mark: I can do some of the work
but will also need to consult internally
... Our priority is to get the Presentation API to CR
... If we create separate documents with our findings, will we
then make a more formal document? We'll need an editor
Anssi: The people doing the work
are good candidates to edit
... [some discussion of the relative merits of ReSpec and Bikeshed]
... Next F2F meeting? We'll meet again at TPAC
Francois: It's at Burlingame, 6-10 November 2017
Anssi: We don't need to decide now, but maybe around May?
Louay: We could offer to host it, as part of the Media Web Symposium
<mfoltzgoogle> Mid-April to mid-June is a good window
<Louay> Fraunhofer FOKUS Media Web Symposium, May 16-18, 2017, Berlin
Francois: We should also check Anton's availability
Anssi: Let's decide where and
when to meet early next year
... This has been a great meeting, we're making good progress,
we've had good discussions
... Thanks for coming
[ adjourned ]