Minutes (unless otherwise noted): Bert Bos, W3C
Chair: Philipp Hoschka, INRIA-W3C
Q: Is it possible to negotiate a Quality of Service (QoS) ?
A: Negotiation is being worked on. Different levels of QoS are available. Negotiation will be transparent for the user.
Speaker: Jonathan Grayson
Q: On what platforms does your software run?
A: PC, MAC, SGI, Sun
Q: What audio compression system do you use?
A: A proprietary scheme, based on asymmetric perceptual compression. It gives almost FM quality on a 14.4K modem.
Q: What is the format for the multiple resolution images?
A: It's called XRes and it's proprietary. It loads parts only on demand.
Q: How about using/creating some standards?
A: Maybe something for discussions over these two days...
Q: Can you combine parts of Macromedia software with parts of other vendors' software?
A: We have an open architecture; for example, RealAudio has a plug-in.
Q: Can you start a Shockwave presentation in the middle?
A: Yes, but it requires some in-depth knowledge of the scripting language.
Q: Is the scripting language published?
A: It's called Lingo and there are books about it. It's a text-based programming language.
Q: Can other vendors create programs that interpret Lingo?
A: People can add to Lingo for specific purposes.
Q: What protocols do you use?
A: Only standard HTTP. (In fact, we use Netscape as a back-end.)
Q: Do you use server extensions, such as DSMCC?
Q: What is IML? Are there licenses for it?
A: Idealized Machine Level; it's an API. Netscape will incorporate it. It depends on Netscape whether everybody will be able to use it and under what conditions. It contains such things as routines for bitblt for use by Java.
Chair: Patrick Soquet, HAVAS
Speaker: Klaus Hofrichter
Q: Is there overlap, or integration, between MHEG and HTML?
A: HTML is normally put inside the MHEG hypertext object. A different mapping is maybe a subject for a break-out session.
Q: Instead of performing all of the operations on the server, why not do only the operations that are to be billed, and do the rest on the client?
A: We don't yet know which operations will be billed. This is an experimental set-up, partly to investigate exactly that.
Q: Why do you use CORBA?
A: We need a system that not only works on the Internet.
Q: Does MHEG5 allow scripting?
A: MHEG3 defines a virtual machine for scripting.
Q: How successful is this scripting?
A: There is an MHEG6 under development now, which uses the Java VM instead of the MHEG3 virtual machine (minus AWT, plus MHEG5)
Q: Does that mean the Java VM will be an ISO standard?
A: Yes, Sun is helping with that.
Q: Which version of HTML do you use for text objects?
A: We can use any registered MIME type.
Q: Why do you use MPEG for all network protocols?
A: MPEG was designed for all networks, not just IP.
Q: MPEG seems to have a large overhead.
A: I don't think so.
Q: The MHEG approach seems to *require* specific authoring tools, since the document format is impossible (binary representation) or hard (text-based representation) to edit manually. If the format were text-based, authoring tools would not be essential. See Lingo, HTML.
A: I was never very excited by the ASN.1 notation myself.
Q: What is DAVIC?
A: A consortium of companies doing standards for interactive TV, such as set-top boxes.
Q: When there are to be synchronized streams, don't you need to create both streams at the same time?
A: No, one or both may already exist. Synchronizing is done at a separate level.
Comment from Peter Hoddie (Apple): We used statistical feedback in order to do pre-fetching, like you suggested, and it turned out to help a lot for our CD-ROM based multimedia, even more than we expected.
Chair: Patrick Soquet, HAVAS
Q: Is Quicktime public?
A: Yes, apart from some of the external formats (such as Cinepak) that it supports. Apple doesn't plan to make money selling Quicktime, except maybe from implementations of Quicktime.
Q: Can one convert between an MPEG2 stream and Quicktime?
A: We have a prototype for it.
Q: Can you say something about Quicktime vs HyTime vs MHEG?
A: Quicktime can do everything they do and more. Plus, Quicktime is already working.
Q: Why do we just see Quicktime movies, and no other types of presentations?
A: Because our Windows implementation of Quicktime used to be very bad.
Q: Should Macromedia be afraid ?
Comment from Jonathan Grayson (Macromedia): We are talking to Apple about integrating our formats into Quicktime.
Hoddie: They complement each other.
Q: Is there authoring support?
A: Not as good as we would like, but there is a variety of tools just arriving. We are working on tools for interactive Quicktime.
Q: Is there any dynamic adaptation to bandwidth?
A: With interactive Quicktime, that is possible.
Q: Is there support for scripting?
A: Scripting is sometimes necessary, but we try to minimize its use. The support is there.
Comment from the audience: Different authors need different tools. Some can do scripting, others can't.
Q: Is there dynamic bandwidth negotiation?
A: The schedule is dynamically adjusted to early or late arrivals, within the given constraints, but there are no alternative tracks.
Q: There apparently is no scripting and there are no numbers either.
A: All scheduling can be done by the system, by using a constraint solver.
Comment from audience: A declarative format is a bonus, as is a text-format, despite its larger size. Text formats allow different tools, and even no tools.
Comment from audience: Scripting also precludes conversion to other formats.
Chair: Klaus Hofrichter, GMD
Minutes: Roy Platon, RAL
The discussion group identified several areas of interest:
First discussion centered on 'what is an active object?'. It was agreed that these were objects which had feedback mechanisms, so that a browser view could be changed in a more dynamic way than with links.
Some ideas from MHEG could be used to identify OBJECTS, so that the browser can apply applicable methods.
Should objects be allowed to change the browser, e.g. go outside their screen area or create new objects? This was an unanswered question.
How are objects represented in HTML? Here there seemed to be a general ignorance of the OBJECT tag and its applicability. There was a requirement to represent objects in HTML with a general interface, which provided the structuring tools and enough common details, so that objects were not a 'black hole' to the browser.
Not all browsers/platforms could handle everything in an object. There was a need for profiles and negotiation mechanisms.
An object needs to be able to define a minimum set of requirements to run. This would also provide guidelines for content providers to work to. There was a 'BIG Danger' in proprietary solutions. The foundation set should be a minimal set of functions to provide functionality. More complex functions could be left to Java, Shockwave etc.
There was an interesting approach by Microsoft to event handling. But using Visual Basic to control events was not an acceptable solution.
The key features of Active Objects are:
W3C could help in defining the main behaviour for classes of objects.
Chair: John Buford, University of Massachusetts, Lowell
Minutes (incomplete): Philipp Hoschka, W3C
Screen-dumps from MBone transmission
Are commercial open formats possible ?
Marc Kaufman (Adobe): Sure: a standard defines how you compress audio/video, *not* how you implement it. You sell the latter. We nearly lost the font market due to this.
John Buford: Can we stay with plugin architecture ?
Marc Kaufman (Adobe): Not important. What counts is what goes over the wire. Standards are important, especially for content providers: they want to distribute their content to everyone.
?? (Oracle): Is it ok to publish a technology, but patent it ?
Marc Kaufman (Adobe): This is a question of how you plan to make revenue. Think of LZW. There are standards bodies that allow patents, but the patents must be licensed at a reasonable fee.
Buford: What do we need to change with URLs, html and http ?
Can we use html ? Does it need to be extended ?
Buford: time-based, using schedules - no streaming required - could be done. However, will be hard to pick a content model. Solution will be neither MHEG nor HyTime. It will be hard to decide between QuickTime and Macromedia.
Kimmo Djupsjobacka (Nokia): Html should make it possible to *control* video download or Macromedia download.
??, (General Instruments): Html should be evolved slowly. We can only support html 1.0 on settop boxes. URLs should enable *actions* when you click on them, like "buy movie now". This needs a new URL scheme.
Jonathan Grayson (Macromedia): Multimedia is not just a video playing in a window. It needs interactivity as well.
Discussion on how to "gracefully extend" html in the face of low-end hardware, and browsers that cannot deal with extended html
??, (General Instruments): Please keep low-end terminals in mind. Keep it simple ! Browsing the web using a terminal should be like using cars of different cost for looking at a scenery.
Jonathan Grayson (Macromedia): We need a very scalable language !
Kaufman (Adobe): We need a resource conditional.
Chair: Dick Bulterman, CWI
Minutes: Philipp Hoschka, W3C (from chair's report to whole group)
W3C should not be the first one to use MHEG - interest in MHEG should be collected in a MHEG consortium
Chair: Chris Lilley, INRIA-W3C
Q: You say rightly that we need vector graphics, but you say it cannot be done with mark-up. How about Postscript, isn't that mark-up?
A: I meant, it cannot be done with HTML. You'll need something that draws on a canvas. Maybe "mark-up" is not the right word. You'll need something else than the structure-oriented and text-oriented mark-up that HTML represents.
Q: How about VRML?
A: It's too heavy. We only need 2D.
Comment from the audience: HTML isn't just structure.
Q: Tcl seems to be similar to Å.
A: Yes, but we started before Tcl.
Q: Do you transmit Å source to a client?
A: At first yes, but we are starting to use the Java VM now.
Q: What is the coordinate system?
A: Pixels; the programmer has to provide for resizing in the program itself.
Q: What does Å look like?
A: The syntax is a bit like C.
Q: Is it true that to create any picture, you need to write a program?
Comment from audience: There are better audio tools available now, that can deal with packet loss better.
Q: What is the incentive for ISP's to install an Edge server?
A: Added quality for their customers.
Comment from audience: Also, it may mean a lower cost to the ISP itself in terms of the lines it has to rent.
Q: What protocol is used between the plug-in and the Edge server?
A: We don't really care (?)
Q: How does a client find the nearest Edge server?
A: That problem hasn't been solved yet. It may be pre-installed in the software you get from your ISP.
Q: Will a single Edge server be large enough? How does its content get
updated? How about copyright and charging for movies?
A: Distribution is via multi-cast from the movie maker to all Edge servers.
Q: Will content providers have to buy space on an Edge server?
A: No, the Edge server uses a simple LRU algorithm.
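The "simple LRU algorithm" mentioned above can be sketched as follows; this is an illustrative least-recently-used cache, not Oracle's actual Edge server code, and the class and method names are assumptions:

```python
from collections import OrderedDict

class LRUCache:
    """Minimal least-recently-used cache, as an Edge server might use
    to decide which movies to keep. Capacity is a number of entries."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)   # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used entry
```

With such a policy, content providers indeed need not buy space: whatever is requested most recently stays cached.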
Q: How much video is there out there, really?
Answer from audience: Terabytes per day, if you include the video that is not currently in digital form. It will be very difficult to decide what to cache.
Q: Why would an ISP buy an Edge server, instead of just buying more bandwidth?
A: There will never be enough bandwidth.
Q: Why not a direct line from a content provider to the ISP?
A: There are too many producers.
Comment from audience: The bandwidth will be available eventually, pricing is a question of market-structure, which means that the cost will go down eventually as well.
Q: How about accounting?
A: The Edge server will record everything.
Q: Why can't you use an ordinary HTTP proxy?
A: The Edge server can deliver a guaranteed QoS. Managing thousands of connections per second is hard, Oracle has developed hardware and software to do that.
Q: We tried video between all our users, but nobody ever uses it. Live
video is not a technical problem, but a social one.
A: For some situations it is better than for others.
Comment from audience: For conferences over a long distance it has more use.
Comment from audience: But in that case there is the time-zone problem.
Q: Audio can't be integrated into a browser window anyway, so why isn't
it good enough to use an (existing) external application for that?
A: I don't want people to have to download plug-ins. They should be able to see my stuff immediately when they reach my page.
Q: But there will always be old browsers without built-in audio/video.
A: People need an incentive to upgrade. Java may be (have been) such an incentive, but it may well be the only one.
Comment from audience: Video-phones appear to be of little use, but video-conferencing may have good uses. It really depends on the task. But video on the Internet may not yet be ready for widespread use.
Comment from audience: The problem of multiple formats will work itself out over time.
Comment from audience: A single standard for doing product upgrades would be nice.
Comment from audience: How many plug-ins are there now? Can't we define a single format now?
Minutes (unless otherwise noted): Anselm Baird-Smith, W3C
Chair: Jean Bolot, INRIA
TCP suffers from several problems when it comes to transporting real time data:
In contrast RTP offers:
RTP is split into:
Q.: Is this RTCP using the same transport layer as RTP?
A.: Yes, because RTP consists of both protocols: the RTP data protocol and RTCP. RTP/RTCP is deliberately integrated within the application. RTP is also mainly used on top of UDP.
Q.: So, why not split the two parts and put RTP over UDP and RTCP over TCP?
A.: But RTCP is for real-time feedback.
Q.: Is it a requirement?
A.: You can use the RTP data protocol alone, but RTCP is helpful for feedback and adaptive applications.
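To make the RTP data protocol concrete, here is a sketch of packing and parsing the 12-byte fixed RTP header (version 2, no CSRC list), following the field layout in RFC 1889; the function names are illustrative:

```python
import struct

def build_rtp_header(seq, timestamp, ssrc, payload_type=0, marker=0):
    """Pack a minimal 12-byte RTP fixed header (version 2, padding,
    extension and CSRC count all zero)."""
    byte0 = 2 << 6                                  # version = 2
    byte1 = (marker << 7) | (payload_type & 0x7F)   # marker bit + 7-bit PT
    return struct.pack("!BBHII", byte0, byte1, seq, timestamp, ssrc)

def parse_rtp_header(data):
    """Unpack the fixed header fields back into a dict."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", data[:12])
    return {
        "version": byte0 >> 6,
        "marker": byte1 >> 7,
        "payload_type": byte1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }
```

The sequence number and timestamp are what receivers use to detect loss and reconstruct timing, which is exactly the feedback RTCP then reports back to the sender.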
In the adaptive real-time multimedia application we implemented, we handled congestion control through packet-loss detection (with the usual assumption, as in TCP, that packet loss indicates congestion).
A Netscape client and a sender get in touch through a central HTTP server (usually via CGI scripts); they then talk directly to each other through RTP, and the client is not aware that Netscape is not running RTP directly. HTTP is used only at start-up.
In our adaptive implementation, the transmission rate adapts to the current state of the network using RTCP-provided information. This has the effect of changing the bandwidth used.
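The loss-driven adaptation described above can be sketched as an additive-increase / multiplicative-decrease rule driven by the loss fraction reported in RTCP receiver reports. All thresholds and rates below are illustrative assumptions, not the values of the implementation presented at the workshop:

```python
def adapt_rate(current_rate, loss_fraction,
               min_rate=8_000, max_rate=256_000,
               increase=4_000, decrease_factor=0.5,
               loss_threshold=0.05):
    """Sender-side rate adaptation: treat reported loss above the
    threshold as congestion and halve the rate; otherwise probe for
    more bandwidth by a small additive increase. Rates in bits/s."""
    if loss_fraction > loss_threshold:
        new_rate = current_rate * decrease_factor
    else:
        new_rate = current_rate + increase
    return max(min_rate, min(max_rate, new_rate))
```

Called once per RTCP report interval, this converges toward the available bandwidth while backing off quickly under congestion, mirroring TCP's behaviour without TCP's retransmission delays.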
Q.: In the multicast case, this can lead to problems if subscribers are in different bandwidth ranges. What happens if one subscriber experiences much more packet loss than the others?
A.: (One solution, suggested by J. Bolot, would be to split the stream into several multicast groups, to which clients subscribe depending on their bandwidth: the more groups you subscribe to, the better the quality you get.)
TV quality broadcast on Intranets. Made possible because bandwidth is available (and is not a problem). Explicitly not targeted to the home user.
This domain has suffered from a chicken & egg problem: no bandwidth available means no applications to use it, no applications means the bandwidth is not made available.
Clients connect directly via ATM to video codecs for receiving real-time video/audio streams. A separate server controls the devices and clients communicate with this server via RPC over IP.
Speaker: Stephen Jacobs
Dynamic Rate Shaping is a way to transform a video stream to make it fit the available bandwidth.
Available bandwidth and general network data are collected by reimplementing TCP flow control on top of UDP.
DRS is a transformation of an MPEG stream that can be done on the fly by eliminating DCT coefficients.
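The core of the transformation can be illustrated very crudely: given an 8x8 DCT block in zig-zag order, zero out the trailing (high-frequency) coefficients, which cost the most bits and matter least perceptually. This sketch is an illustration of the principle only, not the DRS algorithm itself:

```python
def shape_block(coefficients, keep):
    """Rate-shape one 8x8 DCT block (64 coefficients, zig-zag order)
    by zeroing all but the first `keep` coefficients. The DC term and
    low frequencies survive; high-frequency detail is discarded."""
    return coefficients[:keep] + [0] * (len(coefficients) - keep)
```

Because runs of zeros compress to almost nothing in MPEG's entropy coding, reducing `keep` directly lowers the bitrate of the re-emitted stream without a full decode/re-encode cycle.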
User side concerns: is the user willing to accept a degraded video stream ?
Henning Schulzrinne: I am conducting human factors experiments to check this.
MPEG stream transformation: some concerns were expressed as to whether the technique is in fact usable (eliminating DCT coefficients might just not be enough).
Chair: Philipp Hoschka, INRIA-W3C
Goal: multicasted presentations for today's needs.
Includes audio/video and some presentation material.
Previous work in that domain has always relied on modifying a browser (in general Mosaic), and the typical scenario is to broadcast URLs every N seconds.
The goal of mWeb is to distribute the material itself (e.g. HTML pages) and synchronization information. mWeb is built on top of /TMP, which includes:
Request for CCI (Common Client Interface): should W3C take any action to standardize such an API?
HTML is enough for the purpose of distributed presentations (even if more limited than, say, PowerPoint).
Something must be done about distributed learning, needs are present and no tools cover them yet (as of today).
History of multimedia on Internet.
Most documents are available from:
This includes a number of protocol descriptions (SDP, the Session Description Protocol; SAP, the Session Announcement Protocol; etc.)
Typical usage is to GET a document of a special MIME type that describes the session to join in order to get the actual data.
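An SDP description is a plain-text list of type=value lines, so the GET-then-join step can be sketched with a minimal parser. This is an illustrative reading of the SDP line format, handling only a few of the defined line types:

```python
def parse_sdp(text):
    """Extract the session name, connection address, and media streams
    from an SDP session description (one type=value pair per line)."""
    session = {"media": []}
    for line in text.strip().splitlines():
        key, _, value = line.partition("=")
        if key == "s":                      # session name
            session["name"] = value
        elif key == "c":                    # connection data (e.g. multicast address)
            session["connection"] = value
        elif key == "m":                    # media line: type, port, transport, format
            media, port, proto, fmt = value.split(None, 3)
            session["media"].append({"type": media, "port": int(port),
                                     "proto": proto, "format": fmt})
    return session
```

A client would fetch such a document over HTTP, parse it, and then join the advertised multicast groups to receive the actual audio/video data.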
Ability to invite a recorder or a player inside a session (to record it or to display some multimedia streams to all participants). This can be done through the invite protocol.
HTML extensions may be needed to handle multimedia (speaker wasn't aware of OBJECT tag. It was briefly presented by Vincent Quint).
Speaker: Jorgen Rosengren
Ability to predict what's going to happen is fundamental to the users: end users don't expect their TV to shut down with a "core dump" message (i.e., as Netscape does today).
Security concerns: Applets should be run in a real sandbox (including CPU and memory limitations). Downloading an applet should not cause any damage to the browser itself, whatever the applet does.
The intersection between the digital TV delivery infrastructure and the Internet is not empty. IP frames can be encoded in standard MPEG system streams (kind of strange, but it does work). This will happen.
There is a standard in the TV world for video server control (DSM-CC) based on CORBA (laughs).
Chair and Minutes: Henning Schulzrinne (Columbia University)
Minutes: Philipp Hoschka (derived from this report by Mark Handley)
Henning and Mark presented the need for a protocol like RTSP to fit into the larger picture of both realtime conferencing and web playback.
It was initially thought that the combination of S(C)IP/HTTP/SDP for initiation and a trimmed RTSP (but capable of both TCP and UDP transport) would fit the requirements pretty well. Then, more hypermedia-style scenarios came up, which RTSP in its current form might be able to support, but that initial viewpoint on it might have difficulty in supporting.
Such scenarios include examples such as might be provided by more sophisticated frameworks such as MHEG, where once a presentation has been started, different media start and stop as appropriate according to *both* pre-programmed events from the server and user events from the client. This sort of scenario doesn't fit cleanly into a typical VCR-control type protocol.
From a technical point of view, we would seem to have a set of priorities for revision of RTSP and also a few problems.
This last modification is not difficult in itself, but seems to be where the (technical) problems begin. If we do this, RTSP either cannot initiate streams itself, or duplicates S(C)IP/HTTP functionality when it does so. If we don't do this, it becomes difficult to integrate RTSP into live multimedia sessions where it is natural to "invite" the server to participate in an existing session.
Integrating RTSP into SCIP (which has a concept of a session, unlike SIP) is a technical possibility, but probably technically undesirable (we end up with too many cross-dependencies between different scenarios where we want to use these protocols), and almost certainly politically infeasible. It would however provide a more unified solution to this last problem.
Using SIP to initiate new media flows (events really) within a RTSP session is also a possibility, but seems a little ungainly given the session nature of RTSP and the pre-session intention of SIP.
The problem here really is that RTSP needs S(C)IP functionality, but we don't really want to integrate RTSP into S(C)IP, and getting RTSP to utilise S(C)IP (as either SIP or SCIP stands right now) isn't really very elegant. With time we could probably get this cross-reference between these protocols right, but given the timescales that PN/NSCP seem to need to move in, Mark does not think we can revise SIP/SCIP sufficiently to satisfy the requirement.
Chair and Minutes: Philipp Hoschka, W3C
(discussion was hampered by the fact that most participants knowledgeable about digital TV/DAVIC had already left)
Philipp Hoschka (W3C): Why do Audio/Video over IP *and* digital television networks? Why not use IP directly?
There is an installed base of MPEG-based distribution systems.
You cannot do pay-per-view over Internet today.
Do you really *want* to do video over IP ? Low quality.
Dick Bulterman (CWI): Both digital television network and Internet can co-exist. Digital television for high-quality, pay-per-view. Internet for low quality, best effort.
Philipp Hoschka (W3C): Let's try to design an IP-based digital television network !
Marc Kaufman (Adobe): OC-* will never have enough bandwidth to carry a large number of wide-area continuous TV-quality video streams
(??): On the other hand, most traffic will be local. You don't want to watch a TV station local to San Francisco in the South of France.
But isn't this the great thing about the Web, that you can access non-local resources ? Plus, there are many expatriates in the South of France who only get a satellite dish to be able to watch their local TV station.
Martin Chapman (IONA): Here is the architecture used by DAVIC's DSM-CC (draws a client-server system). It uses CORBA IDL. The functionality of the protocol is similar to a video recorder (pause etc.). Can we gateway http into this? The system was designed for a distance of 10 km between client and server; the server stores all videos the client wants to access.
Internet telephony: only people that don't own facilities are worried about this.
Patrick Soquet (Havas):
Peter Hoddie (Apple):
Chair and Minutes: Philipp Hoschka, W3C
A number of possible work items were collected for further review by W3C management: