Notes (unless noted otherwise): Janne Saarela, W3C
Chair: Philipp Hoschka
Notes: Allen Mornington-West, ITV
Daniel Tagg (BBC, UK) - Does the imposition of standards work against the introduction of innovative services from small service providers and in the interests of monoliths like the BBC? [To which there did not seem to be a specific answer.]
Klaus Hofrichter (GMD, DE) - We could start with the reference decoder model, one which has a basic layer that does not require a hard disk and large amounts of RAM.
Philipp Hoschka - but the "minimal device supported to be cost effective" is a moving target: in a year, the price for something "too expensive" today will be "cost effective"
Rob Koenen (KPN, NL): An alternative is to think of the most complex model that is desired to be supported and see how far we can achieve this. It is likely that the very low audio and video quality which are features of the current implementations of Internet media delivery will evaporate as their performance approaches that of current broadcast.
Aninda Dasgupta (Philips, USA) - ATSC reckons that starting small is the only way to jump start the market, and that means starting with what we have now and trying to define an evolutionary path rather than a bunch of painful explosions. Hence the adoption of the Java VM. DVB says that the lowest profile of receiver will have a 2M+2M memory size, whilst ATSC thinks that memory limitations are a distraction and has set higher limits such as 4M+4M, which enables the Java VM, Java APIs and some form of content handler. Thus as time progresses TV receivers will be Pentium class machines with a large number of decoders and protocol handlers.
Aninda Dasgupta (Philips, USA) - The ATSC solution will most likely be based on the Java VM with some extensions to handle TV tuning and service awareness. More content handlers can be added - RealVideo and RealAudio are examples. ATSC are proposing that a content decoder would require that the object model should be exposed to the Java API.
Glenn Adams (Spyglass, USA) - What are the real limitations of the DVB approach in setting a baseline of 2M+2M? Dasgupta thinks that this means that the DVB MHP would only handle some parts of HTML - tables, for example, would not be needed in broadcast TV. Would JavaScript not be needed in addition to a full Java API? He reckons that both are valid.
Tommi Riikonen (Nokia, FI) - proposes that W3C should work on some 3 or 4 decoder profiles for HTML for use in the TV. Of course DAVIC has formulated a suite of contours for implementing MHEG.
Dan (USA) - This focuses the need to fix level one, as no sub-functionality should be expected to work below this level. However, this means that there is a barrier to simpler devices. Further, the 2M+2M model is actually being assumed as a baseline for apparatus such as Web TV, but an apples-to-oranges comparison might be dangerous.
Philipp Hoschka - one of the aims must be to browse the content that has already been published somewhere on the WWW - this makes any limitations to HTML difficult
Glenn Adams (Spyglass, USA) - However trade-offs are probably inevitable and HTML content may be used for purposes other than browsing arbitrary content, such as annotating a TV program, or writing a user interface for TV-based content. These classes of applications don't need full HTML support - not all boxes will allow general web browsing.
Jan van der Meer (Philips, NL) - perhaps two classes are being discussed one is interactive broadcast and the other is WWW browsing. The broadcast requirement requires a controlled set of functionality and this is one of the factors which can control the cost and complexity. Interactive broadcast applications need, however, to detail the methods for accessing media delivery specific features (CA, tuning &c). APIs are needed for these.
Dan Zigmond (Microsoft, USA) - notes that a WWW oriented viewpoint might simply require simpler access to, say, browsers, and that not a lot of innovation is required to bring such a product to market. For example, a sports broadcaster may simply need a way to deliver sports related data rather than provide a full function browser. Not all applications need to use all of the available APIs; for example, an advertiser wishing to identify the viewer does not need the tuning API.
Jan van der Meer (Philips, NL) - notes that the USA's ATSC and Europe's DVB may not be interchangeable. Use of a common technology has a lot to do with the different local drivers.
Someone observed that the total product offering which is made manifest in an STB is part of the attraction to a user. Of course the STBs are in no way all the same.
Aninda Dasgupta (Philips) - noted that ATSC asked potential service providers to list the kind of services that they thought of running. It turned out that, apart from the simplest watching of TV, which might be said to require no API, a wide range of APIs could be identified. NBC, for example, stressed the need to be able to implement tuning.
Tommi Riikonen (Nokia, FI) - would like to think that one of the outputs of this workshop would be the determination of the baseline of API requirements.
John Du (Intel, USA) - notes that tuning is a fairly basic need and this might be a useful extension of HTML.
Dan Zigmond (Microsoft, USA) - The challenge seems to be to set a profile 1 level 1 functionality which will satisfy the primary requirements of program providers and STB makers. This should bring together some of the key players to agree on how to use HTML to incorporate media items as there had been evidence of divergence.
Philipp Hoschka: How did you come up with the proposal?
Dan Zigmond (Microsoft, USA): Based on agreement between numerous companies over the last couple of months.
Glenn Adams (Spyglass, USA) - Three areas: [a] hardware, memory and processing [b] content and protocol technologies [c] content applications development. Each of these areas can be described in terms of layers and, within W3C, there is probably sufficient grouping of exponents of each of these categories to engender a vigorous discussion. The challenge will then be to reconcile these views into a cohesive model. The model could be used to test assumptions being made in the various application and market domains which exist.
Jan van der Meer (Philips, NL) - focuses on the fact that W3C is designing solutions for the WWW whilst we need to recognise the difference between this delivery and the delivery of broadcast. This requires understanding of these differences. For example, data elements may be required by a broadcast application (think of an EPG) and this data may come from WWW sources. But a key feature is that the broadcast product is characterised by long term product lifetime and product stability. It does not make sense to define broadcast technology in terms of the WWW even though increasing amounts of WWW technology are being incorporated within conventional TV broadcasting.
Dan Zigmond (Microsoft, USA) - notes that if all content has a properly ascribed URL it ought to be possible to access it via a WWW oriented device. Further, it does not seem necessary to accept that DVB will define the services in a unique manner. It really should be adequate to start with HTML enhanced TV broadcast, and to do this a standard should be created.
Glenn Adams (Spyglass, USA) - notes that this implies that once you have accepted some element of WWW browsing that the move into a fully browsing based approach becomes inevitable. This will probably drive the decisions of the technology to deploy.
Aninda Dasgupta (Philips, USA) - but if Dan's proposals are implemented you will have the problem that once you try to apply a STB according to Dan to other sources and application areas then there will not be the adequate functionality around. The ATSC view is to make it possible to have any of the necessary APIs installed so that the required functionality can be achieved. This approach had been achieved through the classic approach of requesting a list of requirements and then producing solutions to match. This approach is something which DAVIC, DVB and ATSC have adopted successfully but W3C appears not to have adopted.
Aninda Dasgupta (Philips, USA) - stresses that the TV standards groups (DVB, ATSC) are actually turning to the W3C for assistance to determine the fresh combination of HTML related protocols (SMIL, XML &c) which could be formulated into a stable suite of offerings which could then be adopted into the TV world. Some of the protocols which Dan has proposed have been developed quite out of context with the broadcast standards bodies (DVB, ATSC). For example, SDP duplicates announcement functionality already available in digital TV.
John Du (Intel, USA) - disagrees - SDP is completely different and independent of existing TV announcement functionality.
Dan Zigmond (Microsoft, USA) - notes that at least the proposal he is making could be adopted without fear of sectarian ownership if handled by W3C. The advantage of W3C is that it is outside of the traditional regional standards bodies for TV (DVB, ATSC etc.)
Glenn Adams (Spyglass, USA) - The core technologies are about XML and HTML but if these are not acceptable to broadcasters then the focus of the meeting should be elsewhere as W3C can not tell DVB or ATSC what they must do. It has to be co-operative in its approach.
The discussion was wide ranging. What should each STB have in it and how do you know what resources to provide as the media delivered is not known in advance? The ATSC approach focuses on the provision of Java (but not necessarily including JavaScript). A proposal to formulate profiles for HTML for supporting TV broadcast was voiced but the exact focus for this work is not clear at the moment.
It then appeared that there were three classic application areas: [a] HTML enhanced TV broadcast (such as proposed by Dan Zigmond), for which Dan does not think that tuning is that relevant but others in the assembly disagree [b] a certain level of interactivity [c] a STB which will run all of the protocols and applications that there are, or will be, out there.
We discussed the areas in which the established broadcasting bodies were involved in setting standards; on analysis it seems that W3C should focus on the provision of HTML and its derivatives for use within the TV environment.
maintained during the discussion
Chair: Hakon Lie, W3C
Notes: Ann Navarro, HTML Writers Guild
Name, affiliation, interest
Server Side Processing
Caching today: with the same URL serving different content, you might get the wrong stuff from the cache, based on who made the last request. The HTTP WG is thinking along the same lines of improving caching; the user agent field won't be sufficient. Designers want more control over what gets delivered; the user agent field again isn't sufficient. What kind of things would a proxy server have to/must/can't do? Some devices may be low power and can't take a large document/image/style sheet; preprocessing allows better delivery.
Do memory limits of set-top boxes impact how this is done? Much memory is dedicated to specific functions such as MPEG decoding, with little left over for other processing.
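[Editorial illustration, not part of the discussion.] The wrong-variant problem can be sketched as a toy cache that keys entries on the URL plus whichever request headers the server named in an HTTP `Vary` response header. The class `VaryAwareCache`, the example URL and the user agent strings are all hypothetical; `User-Agent` is used only because the notes mention it, and the same mechanism applies to any other request header a server chooses to list.

```python
from typing import Dict, Optional, Tuple


class VaryAwareCache:
    """Toy cache keyed on URL plus the request headers a response varies on."""

    def __init__(self) -> None:
        # url -> lower-cased names of the request headers that select a variant
        self._vary: Dict[str, Tuple[str, ...]] = {}
        # (url, values of the selecting headers) -> stored body
        self._store: Dict[Tuple[str, Tuple[str, ...]], str] = {}

    def _key(self, url: str, request_headers: Dict[str, str]) -> Tuple[str, Tuple[str, ...]]:
        names = self._vary.get(url, ())
        lowered = {k.lower(): v for k, v in request_headers.items()}
        return url, tuple(lowered.get(name, "") for name in names)

    def put(self, url: str, request_headers: Dict[str, str], vary: str, body: str) -> None:
        """Store a response; `vary` is the value of its Vary header."""
        names = sorted(h.strip().lower() for h in vary.split(",") if h.strip())
        self._vary[url] = tuple(names)
        self._store[self._key(url, request_headers)] = body

    def get(self, url: str, request_headers: Dict[str, str]) -> Optional[str]:
        """Return the stored variant matching this request, or None on a miss."""
        if url not in self._vary:
            return None
        return self._store.get(self._key(url, request_headers))


cache = VaryAwareCache()
url = "http://example.org/news"            # hypothetical URL
tv = {"User-Agent": "ExampleTV/1.0"}       # hypothetical set-top box
pc = {"User-Agent": "ExampleDesktop/1.0"}  # hypothetical desktop browser

cache.put(url, tv, "User-Agent", "page tailored for TV")
tv_hit = cache.get(url, tv)   # the TV variant is served from the cache
pc_hit = cache.get(url, pc)   # a miss, so the desktop never receives the TV page
```

A cache keyed on the URL alone would have returned the TV page to the desktop request, which is the failure described above.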
Fear, authoring would become hard with preprocessing. Declarative processing would obviate that concern.
Universal style sheet could parse required section and send on to the browser.
Can media for delivery vary by device?
Liquid pages, presentation by HotWired
Education very important
Ann asks, would you deliver video to TV instead of text/images for the web, based on device id? User situations can be changed based on device, server side or client-side. Office needs are different than couch potato needs. Automatic processing of content type.
If done purely on server-side, if I'm an IA that doesn't fit a profile, how will the content be delivered?
Unidentified agent would get high end rather than low end. Identification brings cut-back.
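[Editorial illustration, not part of the discussion.] A minimal sketch of the fallback rule just stated, assuming a server that holds a table of known device profiles: an agent it cannot identify receives the richest variant, and only a positive identification cuts the content back. The device names, tiers and variant descriptions are invented for the example.

```python
from typing import Optional

# Hypothetical content variants, richest first.
VARIANTS = {
    "high": "video + images + text",
    "medium": "images + text",
    "low": "text only",
}

# Hypothetical table of identified devices and the tier each can handle.
KNOWN_DEVICES = {
    "example-settop": "medium",
    "example-phone": "low",
}


def choose_tier(device_id: Optional[str]) -> str:
    """Return the tier for a device; unidentified agents get the high end."""
    if device_id is None:
        return "high"
    return KNOWN_DEVICES.get(device_id, "high")


def choose_variant(device_id: Optional[str]) -> str:
    """Return the content variant that would be delivered to this device."""
    return VARIANTS[choose_tier(device_id)]
```

Here `choose_variant("example-phone")` yields the text-only variant, while `choose_variant(None)` or any unrecognised id yields the full variant.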
Meta-data about what's available on the site, then the IA could then decide what it wanted to access.
Addressing Norbert's concern, too many devices to design for? Is it really a problem?
Devices could help determine when style sheets are needed and when base info should be delivered.
Accessibility: need to provide alternatives to video/sound, etc. You could defer on how it's presented, though.
Switch construct in test attributes in SMIL recommendations. Data isn't switched, the construct is. Processing occurs on the client first, then redirected based on the requirements of that processing.
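[Editorial illustration, not part of the discussion.] The client-side selection mentioned here can be sketched as follows: the client walks the children of a SMIL `switch` element in order and takes the first one whose test attributes it satisfies. The sketch is a simplification that checks only `system-bitrate` and `system-language` (by exact match) and ignores the other test attributes; the file names and client settings are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional

# Hypothetical SMIL fragment: three alternatives, most demanding first.
SMIL_SWITCH = """
<switch>
  <video src="clip-high.rm" system-bitrate="56000"/>
  <audio src="clip-audio.rm" system-bitrate="14400"/>
  <text src="clip-transcript.txt"/>
</switch>
"""


def acceptable(element: ET.Element, client: Dict[str, object]) -> bool:
    """Check the two test attributes this sketch understands."""
    bitrate = element.get("system-bitrate")
    if bitrate is not None and int(client.get("bitrate", 0)) < int(bitrate):
        return False
    languages = element.get("system-language")
    if languages is not None:
        wanted = [lang.strip() for lang in languages.split(",")]
        if client.get("language") not in wanted:
            return False
    return True


def choose(switch_xml: str, client: Dict[str, object]) -> Optional[ET.Element]:
    """Return the first acceptable child of the switch, or None if none fits."""
    switch = ET.fromstring(switch_xml.strip())
    for child in switch:
        if acceptable(child, client):
            return child
    return None


modem_choice = choose(SMIL_SWITCH, {"bitrate": 28800})
```

With a 28800 bit/s client the video alternative is rejected and the audio alternative is chosen; the untested `text` element acts as the fallback for slower clients.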
Object and container models help deliver options.
Meta-data on the content containers.
Aural example: pre-record information which you can then play as one long sequence; you might then want to navigate it as a hierarchy, and need additional information to find bits and pieces in the middle - seek, etc. Meta-data to describe alternative presentations has been fed into RDF from previous HTML work.
storage/identification, composition, and presentation: separated out in regard to authoring.
composition = SMIL, or template series.
presentation = CSS
storage = HTML/XML/RDF
addressing schemes need attention wrt: virtual storage (live broadcast doesn't reside permanently, etc)
navigation/user control/behavior, where does that fit in model? (between comp/pres)
Apply to live model, you have a feed >> to a news desk, that adds presentation, behavior, templates
David: Live streaming doesn't necessarily need to be that complex. When reviewing stored clips, perhaps this much interaction would be good.
Tools are rather crude right now, no support for additional expense to enhance these options.
Designer most definitely now needs to have control over the presentation on the end device. Something that was intended for display on something large like a PC screen, when displayed on a small device, is frightening. Why ask the client to have to deal with this overwhelming amount of information?
New service opportunities for conversions (letterbox to tv screen size, etc).
Reuters produces focused television, could separate it out into just the audio track for phones, etc.
Astrid: Trying to tailor for specific devices in rich media can lead to trying to design for everyone, which leads to designing for no one.
Editorial control: high res. display where appropriate, then perhaps just deliver the sound bite in other circumstances.
Find thresholds for scalability across devices: What makes sense on what?
Meta-data, add information to the feed and make decisions later on, and combine it with a threshold.
Accessibility issues combined in threshold stoppage, when to say when and how?
Profiles - conformance rules that can be identified for a browser or device.
How do we get vendors to support specific profiles?
Astrid: Adjust triangle model
V - control
presentation (time, space, behavior)
Author-once approach has limits, editorial control can allow you to get around those limits by applying different situations.