Re: Mapping of media streams to RTP (Re: Query: What does "context" mean in the context (sic) of requirement A15? [ACTION-6]) from Harald Alvestrand on 2011-08-25 (public-webrtc@w3.org from August 2011)

From: Harald Alvestrand <harald@alvestrand.no>
Date: Thu, 25 Aug 2011 10:21:00 +0200
To: "Elwell, John" <john.elwell@siemens-enterprise.com>
CC: Stefan Håkansson LK <stefan.lk.hakansson@ericsson.com>, Cullen Jennings <fluffy@cisco.com>, Christer Holmberg <christer.holmberg@ericsson.com>, "public-webrtc@w3.org" <public-webrtc@w3.org>
Message-ID: <4E5605EC.4050208@alvestrand.no>
On 08/25/2011 09:47 AM, Elwell, John wrote:
> See below. By the way, shouldn't this discussion take place on the IETF list rather than the W3C list?
I asked the WG chairs of the IETF WG about it, and they said the mapping 
from API concepts to RTP/SDP concepts was an API matter and should be 
discussed in the W3C.

A lot of people are on both lists....

One more comment at the end; apart from that, I think we understand each 
other well.

>> -----Original Message-----
>> From: Harald Alvestrand [mailto:harald@alvestrand.no]
>> Sent: 24 August 2011 22:51
>> To: Elwell, John
>> Cc: Stefan Håkansson LK; Cullen Jennings; Christer Holmberg;
>> public-webrtc@w3.org
>> Subject: Re: Mapping of media streams to RTP (Re: Query: What
>> does "context" mean in the context (sic) of requirement A15?
>> [ACTION-6])
>>
>> On 08/24/2011 05:47 PM, Elwell, John wrote:
>>> Harald,
>>>
>>>> -----Original Message-----
>>>> From: Harald Alvestrand [mailto:harald@alvestrand.no]
>>>> Sent: 24 August 2011 14:48
>>>> To: Elwell, John
>>>> Cc: Stefan Håkansson LK; Cullen Jennings; Christer Holmberg;
>>>> public-webrtc@w3.org
>>>> Subject: Re: Mapping of media streams to RTP (Re: Query: What
>>>> does "context" mean in the context (sic) of requirement A15?
>>>> [ACTION-6])
>>>>
>>>> On 08/24/11 14:41, Elwell, John wrote:
>>>>> We need to consider how this will be used. I am not sure I
>>>>> understand the requirements, but I think the requirement is
>>>>> that a single identifier be shared between a JS running on
>>>>> one browser and a JS running on another browser. Is this
>>>>> correct, and if so, exactly why is it needed? What breaks if
>>>>> we don't have it?
>>>> The canonical case for me is the situation where we have two incoming
>>>> video streams, from the same source, displayed in two windows - the
>>>> "hockey game" case from the use cases docs is an example of this.
>>>>
>>>> We need to display the "game" stream in a big window and the "agent"
>>>> stream in a small window. This requires knowing which one is which,
>>>> in a deterministic manner. The sender knows which is which. The
>>>> recipient needs to be told.
>>> [JRE] The SDP a=content attribute (RFC 4796) is designed
>>> for this. The currently registered values are insufficient,
>>> but new values can be registered (specification required).
>>> Whatever mechanism we use, some sort of registration of
>>> "game" and "agent" would be needed presumably.
>> a=content has the same issue as a=label: it attaches to an
>> RTP session.
>>
>> Query: What is the semantic difference between a=content and a=label?
>> Closed (content) versus open (label) namespace? Anything else?
> [JRE] Content says what type of content it is; label is a name or identifier and says nothing about the type of content. Label is fine if you just want to reference a media description from somewhere else. Examples are referencing a floor-controlled video stream from a floor control protocol, or referencing a recorded stream from recording metadata. Often label and content are both used at the same time, for their respective purposes.
>
>>>>> Assuming this to be correct, there is a need for the side
>>>>> that assigns the identifier to convey that identifier to the
>>>>> other side. Basically there are 4 ways this could be done:
>>>>> SDP, RTCP, RTP and STUN. I don't think RTP and STUN have any
>>>>> means to do this, and RTCP suffers a delay, so I would
>>>>> imagine we want to do it in SDP. Correct so far?
>>>> Actually, we need to communicate the identifier twice:
>>>> - Once in the application's communication stream, to convey
>>>> the semantics
>>>> - Once alongside the media stream, to identify the media stream
>>>> It's the latter that is within the scope of this discussion.
>>> [JRE] Normally a=content is conveyed in an SDP media
>>> description, which in turn is bound to an RTP stream by means
>>> of the IP addresses and ports contained in SDP (or indirectly
>>> through ICE negotiation based on candidates and parameters in
>>> SDP). So there is no need to convey the equivalent of
>>> a=content in RTP/RTCP. However, if we multiplex several
>>> streams onto a single session, I agree we would need
>>> something in SDP to map a given media description to a given
>>> stream within the single RTP session, and that could be based
>>> on CNAME.
>> Yup, I was being imprecise - using SDP as one example of
>> "alongside the media stream", without being explicit about it.
>>>> RTCP doesn't have a delay if a SR is the first thing sent on
>>>> a session, as recommended in draft-perkins-avt-rapid-rtp-sync-03
>>>> section 2.1.1 - this is permitted by RFC 3550 too, it notes.
>>> [JRE] OK, so CNAME can be used to map an RTP stream to a
>>> given media description in SDP. CNAME alone does not indicate
>>> the purpose of an RTP stream - it would still need something
>>> else to map it to "game" or "agent", and that is where
>>> a=content comes in.
>> I'm beginning to detest the fact that SDP "media streams" are
>> mapped 1:1 to RTP "sessions", and that RTP "sessions" are mapped
>> 1:1 to 5-tuples......
>>
>> Remember that the hockey game viewer is not general comms
>> infrastructure - it's a specialized application.
>> The app can use HTTP POST/GET or Websockets via a Web server to send
>> "hey, the game is on 1234@567" in whatever (proprietary) syntax it
>> wants to. I wouldn't want to be in a state where we have to do an
>> IANA registration for "hockey game".
> [JRE] Yes, I feared that might be a concern. One possibility would be to reuse existing registered values, such as "speaker" and "alt".
> I think an important consideration is whether we are trying to solve this for the case where the two guys involved in the hockey game scenario are both using the same application or the case where they are using different intercommunicating applications. In the first case I would think label is sufficient, but in the second case it seems content might be needed. Even then, I am not sure that this would be sufficient to tell the other application that the game is to be shown in a big window and the agent in a small window, but at least it gives a hint as to how the two streams differ.
The scenario in the scenarios document reads to me as if it is of the 
"one application" type, and I think it's appropriate for such a 
specialized application.

If there are scenarios that people want included where "content=" is 
appropriate / important, we need to make those scenarios explicit enough 
to show why that is.
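
To make the "one application" case concrete, here is a rough sketch (not a proposal) of how a receiver might combine the two pieces of information discussed in this thread: the per-media-description attributes from SDP (a=label, a=content) and an application-private mapping sent over the app's own channel. The SDP fragment, the port numbers, the APP_ROLES table and both function names are made up for illustration; only the a=label and a=content attribute names and the "main"/"alt" values come from the RFCs.

```python
# Illustrative sketch only: the SDP fragment and all names below are hypothetical.
SDP = """\
v=0
o=- 0 0 IN IP4 192.0.2.1
s=-
t=0 0
m=video 49170 RTP/AVP 96
a=label:1
a=content:main
m=video 49172 RTP/AVP 96
a=label:2
a=content:alt
"""


def media_attributes(sdp):
    """Return one dict of a= attributes per m= section, in order of appearance."""
    sections = []
    for line in sdp.splitlines():
        if line.startswith("m="):
            # Start a new media description; keep the raw m= value.
            sections.append({"m": line[2:]})
        elif line.startswith("a=") and sections:
            # Media-level attribute: "name:value" (value may be empty).
            name, _, value = line[2:].partition(":")
            sections[-1][name] = value
    return sections


# Assumption: the application sends this over its own channel
# (HTTP, WebSocket, ...) in whatever syntax it likes.
APP_ROLES = {"1": "game", "2": "agent"}


def roles_by_port(sdp, app_roles):
    """Map each media description's port to the application-defined role."""
    result = {}
    for section in media_attributes(sdp):
        port = int(section["m"].split()[1])
        result[port] = app_roles.get(section.get("label"))
    return result


print(roles_by_port(SDP, APP_ROLES))
```

Note that the role names never appear in SDP here, so nothing needs to be registered; a=content is carried along only as a hint for a different application that does not know the private mapping.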
Received on Thursday, 25 August 2011 08:21:32 UTC