W3C

– DRAFT –
Media and Entertainment IG vF2F meeting - Day 1

25 October 2021

Attendees

Present
Adam_Dawidziuk, Amy_Huang, Andreas_Tai, Barbara_Hochgesang__Intel, Benjamin_De_Kosnik, Calvaris, Chris_Lorenzo__Comcast, Chris_Needham__BBC, Dong-Young_Lee__LGE, Dr._Rachel_Yager, Eero_Hakkinen, Francois_Daoust__W3C, Frode_Hernes, Gary_Katsevman__Brightcove, Geun_Hyung_Kim__Gooroomee, Giuseppe_Pascale, Hiroshi_Fujisawa__NHK, Hiroshi_Ota__Yahoo!_Japan, Hyojin_Song__LGE, Igarashi_Tatsuya__Sony, Jaroslaw_Kubiec__XPERI, Jeff_Jaffe__W3C, John_Riviello, Jon_Piesing__TP_Vision, Judy_Brewer__W3C, Karen_Myers__W3C, Kazuhiro_Hoya__JBA, Kaz_Ashimura__W3C, Kinji_Matsumura__NHK, Mark_Corl, Mark_Lomas__BBC, Martin_Wonsiewicz, Michael_Bergman__CTA_WAVE, Michael_Dolan__ATSC, Paul_Hearty__Samsung,_WAVE,_ATSC, Phillip_Maness__Xperi, riju, Rob_Wilson, Shannon_Janus, Takio_Yamaoka__Yahoo!_Japan, Tatsuya_Sato__NHK, Tomoaki_Mizushima__IRI, Will_Law__Akamai, Wojciech_Mycek, Wouter_van_Boesschoten, Yasushi_Minoya__Sony, Zachary_Cava
Regrets
-
Chair
Chris_Lorenzo, Chris_Needham, Tatsuya_Igarashi
Scribe
cpn, kaz

Meeting minutes

Introduction and Logistics

https://docs.google.com/presentation/d/1UDtBUzAHau_T1mhIt-V5rS3j_AgW4J0AdnCtacSgdp4/edit <- ChrisN's slides

ChrisN: introduces the group to the participants
… list of resources here
… Agenda for today: industry updates from HbbTV and CTA WAVE, then web app performance
… This meeting is planned for 2 hours
… We will take minutes on IRC, and they will be made public
… MEIG operates under the W3C Code of Conduct
… There will be 2 upcoming meetings,
… vF2F day 2 on Wednesday
… And our next monthly call in November
… The first hour today is industry updates

Kaz: We're taking notes on IRC, so please join http://irc.w3.org/?channels=#me
… please put your first name, family name and affiliation name for your zoom name

HbbTV Update

https://www.w3.org/2011/webtv/wiki/images/f/f1/HbbTV_Update_for_W3C_Media_and_Entertainment_Interest_Group-2021-10-25.pdf <-- Jon's slides

Jon: How does HbbTV fit with web technologies?
… Looking at apps, some HbbTV apps integrate with broadcast TV. They may be program independent, such as program guides etc, or program related
… Some people do subtitles and captions in a web app rather than rely on underlying OS technology
… Broadcaster catch-up services typically use the native DASH player rather than MSE
… HTML video element is used, replacing older object interface
… Information services with no broadcast integration. How does HbbTV relate to the web? Information apps are just standard web stuff
… From a TV set device point of view, let's look at the specs. HbbTV 2.0.1/2 are based on 2013 web specs
… 2.0.3 is based on 2018 CTA WAVE snapshot
… Implementations are different. New products may be based on a previous year's browser
… Evolutions of products may or may not update the HTML UA
… Some content providers say it's frustrating when a new TV has a 4 year old UA
… There's a cost to moving to a new UA version, either outsourced or in-house
… No business model for funding costs to the TV manufacturer
… Security bug fix costs
… How does HbbTV relate to the web? It's similar, yet different
… The HbbTV timeline starts with requirements capture. 2 years before products enter the market
… Call for technologies, voting to get consensus. We draft a spec, then once the spec is done we create test assertions
… HbbTV gates spec publication on tests
… Then the test suite is created, funded from membership fees
… Then product is deployed based on the spec
… Manufacturers do what they think is appropriate for the market, the cycle could be longer or shorter than 2 years depending on broadcaster needs
… We have multiple spec versions in development at a time
… While test assertions are being developed, we'll work on requirements for the next version
… Today, we're working on HbbTV 2.0.4, adding integration with TV OS accessibility features
… Also voice assistant integration, such as Alexa and Google Assistant
… And DVB-I for live linear services
… Please come to the HbbTV symposium, coming soon

Jeff: Terrific presentation. In my view, any presentation that highlights real issues is good, gives us things to work on
… What comes out of W3C that has issues when it gets into HbbTV, e.g., release cycles, security issues
… Seems like a lot of work to do. How can we help you?

Jon: A lot of it comes to implementations rather than specs. What do browsers integrate and when?
… Specs are the easy bit

Jeff: We've tried overlays, some benefit and mixed results. How to strengthen those areas? Timelines for implementation

Jon: From HbbTV point of view, we've given up. Particularly with the move to WHATWG
… We just have to accept it is what it is, and work as well as we can with it

MikeDolan: If you replace HbbTV with ATSC, the slides would still be accurate
… Manufacturers are building consumer products, without necessarily coordination between them
… We don't have a continuous update model, which risks bad things happening during updates
… Higher-end TVs are connected and interactive. Different to the web world. Appreciate the difficulties involved
… We experience the same things as HbbTV

<BarbaraH> +1

ChrisN: I've heard that HbbTV prefers to make use of existing specs where possible. Where there are requirements, can W3C help with development of new features?

Jon: With voice assistants, it's nothing to do with TV, so it's in some sense strange for us to define our own API
… We're defining this, but it would also be strange to define an API that wouldn't relate to TVs
… That may not be the best example, as it involves proprietary technology integration
… Architecturally we use a webserver with JSON over websockets to avoid browser integration
… Accessibility could be easier, as the basic OS features aren't hugely different. Some things in W3C such as high-contrast UI
… Yes, W3C could perhaps have done something. But it comes down to implementations, and there's no guarantee browser vendors would be interested
… Would be interesting things for collaboration, but uncertainty makes that difficult

Kaz: Regarding voice agents, there was discussion in the AC meeting. I ran two breakout sessions on speech interfaces
… Many requirements from stakeholders, including accessibility. I'm organising a workshop on voice agents
… W3C should be able to help you all with improved user interface. We're trying to involve browser vendors as well

ChrisN: Perhaps HbbTV could be interested to present at the workshop

voice breakout minutes (day 1)

(day 2)

Jon: Most of the work was done by BBC, so your BBC colleagues might be better placed than HbbTV to contribute

Barbara: Do you have any latency or end-to-end issues?
… As you do something like voice or transcription, any challenges with latency in doing end to end solutions?

Jon: We haven't built this yet. BBC have been working on it, but latency hasn't been mentioned

ChrisN: I can ask colleagues for a view on that

CTA WAVE update

<kaz> https://www.w3.org/2011/webtv/wiki/images/a/ac/WAVE-update-MEIG-TPAC-2021.pdf <-- Will's slides

Will: I chair the CTA WAVE project
… Many WAVE members have joined the call today
… It's a project run by the CTA, the same people who run the CES trade show, produce specs including HDMI, etc
… Our goal is to improve interop for OTT streaming. How to make content so that it works across devices?
… We publish standards, and make test suites, and now we coordinate across SDOs
… There's a side-group, CMAF Industry Forum
… WAVE bridges media and web standards. HTML5, MSE, EME, CSS. Also MPEG, ETSI, WHATWG, JavaScript
… Core specs are: a content spec, media API spec, device playback spec
… We also make test suites. Update on recent activities. The web is constantly changing, so hard to say a device should support a particular set of APIs
… We take a modern laptop and test it against the 4 browser codebases. We use that as a baseline
… It draws a line in the sand, and we test devices against it
… New specs: Common Media Client Data, to help improve CDN performance
… It's getting some publicity and usage today
… DASH/HLS interop spec: how to make one set of content usable by both
… Common Media Server Data: Data from an origin to a server to a client
… Common Token Format: Every CDN has a complex method for dealing with tokens. We're coming up with a standardised token format that can be used by every CDN and content distributor
… Test suites, credit to Louay at Fraunhofer Fokus
… WAVE extension to WPT, based on a test runner which was designed for desktop/mobile
… We test the Web API snapshot. Also our Device Playback Capabilities test suite. It's all free to use and open source
… Web Platform Tests is a cross browser test suite, but it's hard to run on embedded devices
… WAVE built a test runner that extends WPT for embedded devices. You configure it on a remote device, then execute on the embedded device
… Custom wrapper executes tests in a single window, and it has a REST API. You can get HTML or JSON format results
… You can use a completed test session as a basis for the next test
… The WAVE extension has been merged in to WPT, and that gets updated under contract from CTA
… Test suites are built on the web platform. We take the major browser code bases and we look at which features are well supported on modern hardware
… Can validate on embedded devices. HbbTV 2.0.3 uses the 2018 Web Media API snapshot
… Test runner is available on GitHub, and it's available in a Docker container to make it easy for you to run locally
… It takes several hours to run all the tests, so if you want to experiment please just run a few tests, e.g., MSE
… The Device Playback Capabilities Test Suite tests how well playback is plumbed into the device
… Jon Piesing has been leading on this
… For this we need known good mezzanine content. Then we synthesize test content to exercise certain features
… An observation framework with a camera automates gathering test results, to avoid manual testing
… The test content, when annotated, has visual markers, QR codes and audio codes: are you displaying all the frames, how accurate are you seeking?
… The Joint Content Conformance Project - lots of SDOs are doing work to check if content is good
… DASH-IF has a software tool, but the code base is hard to use and difficult to maintain and extend. There's a movement towards CMAF encapsulation of content
… Our goal is to make a common conformance suite, to lower costs
… This project is coordinated by WAVE, credit to Mike Bergman. Coordination with people from DVB, HbbTV, Apple
… We have a signed letter of intent. There's an RFP to build the conformance suite. Please respond if anyone here is interested
… We'll get an ISOBMFF segment validator
… CMAF Industry Forum is for industry outreach. The group ran a survey on CMAF, and got responses from major TV brands
… Some of the questions we asked: Usage plans for CMAF, which codec profiles do you use, etc? The report will be published on the CMAF-IF website
… Are people aware of the relationship of CMAF to HTML media extensions? Most people said yes, but many said no, which is interesting
… More than half are using it today, 19% not planning to use it. Worry about those not using it
… Many are doing both MP4 and MPEG2-TS, which could be to support older set-top boxes
… A lot of people intend to continue using both. That should be a smaller number as MP4 CMAF use increases
… What's blocking people from using HTML5? Are you using MSE and EME?
… Interesting answers here: "Implementations are poor or half-baked", "Unreliable playback performance"
… Some are using app-based playback, but we've seen competing data
… The web is constantly evolving, HTML5 is a stepping stone to the next version.
… Open to questions

Jeff: Thanks Will. You mentioned linkage with the W3C technologies, MSE, EME, HTML, CSS
… There's a bunch of new specs in the Media WG, WebCodecs, Media Capabilities, Media Playback Quality, etc
… Hope this is to the benefit of the media industry. Are these on your radar?

Will: Absolutely. We're looking at Media Capabilities, we've input requirements to the Media WG
… It's hard to introspect the device for its playback capabilities. Media Capabilities is included in our snapshot as it gets adopted by browsers
… WebCodecs doesn't have broad implementation, but we'd be moving towards it
… Evolve the test suite
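
As an illustration of the introspection problem Will describes, here is a minimal sketch of how an app might use Media Capabilities results to choose a rendition. The shape of the `decodingInfo` call follows the Media Capabilities API; the `Rendition` type, the `pickRendition` and `chooseRendition` helpers, and the idea of a bitrate ladder are assumptions made for this sketch and were not presented in the meeting. The query function is injected so that the selection logic can be exercised outside a browser.

```typescript
// Minimal local types mirroring the shape used by the Media Capabilities API.
interface VideoConfig {
  contentType: string;
  width: number;
  height: number;
  bitrate: number; // bits per second
  framerate: number;
}
interface DecodingConfig { type: "media-source" | "file"; video: VideoConfig; }
interface DecodingInfo { supported: boolean; smooth: boolean; powerEfficient: boolean; }
type DecodingInfoFn = (config: DecodingConfig) => Promise<DecodingInfo>;

// Hypothetical rendition description used only by this sketch.
interface Rendition { name: string; video: VideoConfig; }

// Pure selection step: the highest-bitrate rendition that the device
// reports as both supported and smooth, or undefined if there is none.
function pickRendition(
  results: Array<{ rendition: Rendition; info: DecodingInfo }>
): Rendition | undefined {
  const playable = results
    .filter(r => r.info.supported && r.info.smooth)
    .map(r => r.rendition)
    .sort((a, b) => b.video.bitrate - a.video.bitrate);
  return playable[0];
}

// Query step. In a browser the second argument would be
//   (c) => navigator.mediaCapabilities.decodingInfo(c)
async function chooseRendition(
  ladder: Rendition[],
  decodingInfo: DecodingInfoFn
): Promise<Rendition | undefined> {
  const results = await Promise.all(
    ladder.map(async rendition => ({
      rendition,
      info: await decodingInfo({ type: "media-source", video: rendition.video }),
    }))
  );
  return pickRendition(results);
}
```

Whether `smooth` reflects real playback quality on a given TV depends on the implementation, which is the point raised in the discussion.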

<JohnRiv> Yes, I can confirm Media Capabilities is in the Snapshot

Jeff: Input from media companies helps move things forward at W3C

Will: It's not necessarily the media APIs that are the blocker
… Our stated goal for WAVE was web-based playback across devices
… We're running into headwinds, things not broadly supported
… It's a key question for MEIG

ChrisN: There has been discussion with the Media WG around CMAF and the potential need for a CMAF byte stream format spec. This is something we should follow up on

Will: Yes, we produced a proposal

<ChrisLorenzo> Hi - Survey for my talk if everyone could take the time to provide some valuable information - https://forms.gle/jfvudBgq87YA44z68

Web App Performance

<cpn> https://docs.google.com/presentation/d/1TQe-JeDM_8F7pBdX_ovixhF4Nok3Zk1dSFfbX4_gvZ8/edit <-- ChrisL's slides

ChrisL: My talk focuses on web apps on TV devices, not media playback performance
… I've been at Comcast for 14 years. I've had a great experience developing web apps in JavaScript
… I'll talk about where we are today and where we can go in future
… PWA performance optimisation
… We're building web content on native apps. The hardware wasn't great at rendering
… The latest iPhone is almost as fast as a desktop processor. You can build a PWA and make it fast on phones
… I joined the Flex team, building UIs in a browser
… We're pushing the limits of the hardware. The browser is WPE, WebKit for embedded devices
… It's stripped down to run on low-end hardware
… We started trying to build UIs on these boxes using HTML+CSS
… Designers would come up with beautiful images and animations. But when we built it we found the performance meant you couldn't use animations and other features
… We couldn't make a great UI because of the performance limitations
… HTML and CSS have a render cycle on the main thread, which is time-consuming and slow
… Two years ago, when I started working on TV devices, I looked at the Lightning framework
… This uses WebGL and <canvas>, not HTML and CSS
… I thought canvas was for drawing graphics or for games, why use it for TV UI?
… As a web developer, I love HTML+CSS, I needed to research this
… HTML+CSS lives on the DOM, whereas canvas is designed for complex bitmap operations
… DOM tries to do many things, makes it hard to do one thing, in our case render graphics
… The web platform has evolved. There's WebGL, WebGPU as next evolution for faster graphics on the web
… and WASM as a binary format. There's more flexibility now for how to create web content
… As we develop TV apps and want to make a great TV user experience, what do we need for the next generation of TV UIs?
… We don't know what the future holds. We need feedback on how the process is working. How to build UIs on TVs?
… I want better threading capabilities to offload from the main thread
… Memory debugging is also critical, devices have 500MB to 1GB of memory, so we need to be able to track down memory usage issues
… What are some best practices? What tools to use? HTML+CSS or canvas rendering? WASM?
… Upcoming in the agenda is MiniApps, could we have a search and discovery layout in a MiniApp?
… [Demo of Flex app]
… I can develop the TV UI on my computer's browser
… I could do something similar in HTML+CSS, but not sure how to make certain features, such as the highlight around the thumbnail image
… It's very performant. I've been trying to optimise the experience. I have some dropped frames
… It uses lazy loading. It's hard to keep to 60 fps, even 30 fps is difficult
… As we navigate the UI, we lazy load rows. Loading a row takes 50 ms on my Macbook Pro
… If you put that on a set-top box, it will take 200ms and be janky
… How to lazy load content? Be smarter with threads? Building a highly performant UI is challenging
… What can W3C do to offer new technologies for people building web UIs for TV?
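
To put the row-loading figures above in terms of frames: at 60 fps each frame has a budget of about 16.7 ms, and at 30 fps about 33.3 ms. The sketch below works out how many frame intervals a blocking task occupies and shows one possible way to split work into per-frame batches. The function names and the per-item cost estimates are hypothetical. This is not how the Flex app or Lightning schedules work.

```typescript
// Number of frame intervals a blocking main-thread task occupies at a given
// frame rate. Computed as taskMs * fps / 1000 to avoid rounding error from
// dividing by a fractional frame budget.
function frameIntervalsBlocked(taskMs: number, fps: number): number {
  return Math.ceil((taskMs * fps) / 1000);
}

// Split work items into batches whose estimated cost fits within one frame.
// Costs are estimates that would have to be measured on the target device.
function batchByFrameBudget<T>(
  items: T[],
  costMs: (item: T) => number,
  fps: number
): T[][] {
  const budgetMs = 1000 / fps;
  const batches: T[][] = [];
  let current: T[] = [];
  let used = 0;
  for (const item of items) {
    const cost = costMs(item);
    // Start a new batch when this item would overrun the budget. An item
    // that is larger than the budget on its own still gets its own batch.
    if (current.length > 0 && used + cost > budgetMs) {
      batches.push(current);
      current = [];
      used = 0;
    }
    current.push(item);
    used += cost;
  }
  if (current.length > 0) batches.push(current);
  return batches;
}

// Figures quoted in the talk: 50 ms per row on a laptop, 200 ms on a set-top box.
const laptopIntervals = frameIntervalsBlocked(50, 60);     // 3
const setTopBoxIntervals = frameIntervalsBlocked(200, 60); // 12
```

In a browser each batch could be run from a `requestAnimationFrame` callback, so that no single callback holds the main thread for longer than one frame.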

<kaz> ChrisN: Any questions?

MarkLomas: Interesting presentation. When we discuss at the BBC, we get asked about accessibility
… How do you go about that?

ChrisL: You don't have the same accessibility needs on a TV. With voice guidance for navigation we have text strings that read out what actions are available
… It's important to define best practices for accessibility, but it can be done.

Dong_Young: I'm from LGE. My team is working on TV apps written in JavaScript. We use React + Redux
… What I find is that in our app time is used to execute JavaScript code more so than DOM rendering. We don't find DOM is the bottleneck
… I think it's more convenient, when you want to write text, if we use WebGL, it can be as powerful as DOM for text rendering
… There are issues with marquee text in HTML
… Also, we find the web engine uses a lot of memory for the image cache
… TV apps use a lot of thumbnails that occupy the cache
… It can take a lot of memory. Would be good if there were a mechanism to control the image cache

ChrisL: We've also built with React+Redux. Depending on your application needs, HTML+CSS is fine
… It's only when you get to fancy animations that the bottlenecks appear
… Also handling AJAX responses. Need to figure out how to get better with web workers
… Marquee text is also an issue. How to make it fit a certain width and scroll?
… Images also take the majority of memory. How to have the highest quality image with lowest memory?

MarkLomas: About the Device Playback Spec from Will's presentation, would having a performance baseline test be useful?

Will: I'm not 100% sure, but we don't have extensive tests for canvas. We were very media-centric initially. The canvas side should be extended with other aspects of visual rendering
… We need experts in the WAVE group, we have people who know about ISOBMFF, but not so much about how to use canvas

ChrisL: Devices designed to play 4K video. Some devices have problems playing video if there's too much memory being used

Will: Has anyone built a test suite to see how well devices work?

ChrisL: Lightning has a test suite called Strike, it runs through all the graphical operations and gives a performance report

Will: Does that correlate with user experience?

ChrisL: It's pretty real world. We'll open source it at some point

Wouter: It does give good results. We've added CPU results. It runs on anything, good score on latest iPhones
… Identifying the boundary between where performance is good and where it's not is where it gets interesting from a performance point of view

John: I work with ChrisL and Wouter. I'm co-editor of the WAVE HTML snapshot
… Would that, or some other, test suite, be useful in industry? Testing that the API exists isn't enough
… What would people want to see from that test suite? Should we push for a standard on what playback means on the web?

ChrisL: One thing I'd suggest for the test framework is to have a web page that plays video but increases memory usage at the same time
… Then you could see how much memory could be used before playback stutters
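
A minimal sketch of the analysis step for such a test page, assuming the page records cumulative frame counters at increasing memory allocation levels. In a browser the counters could be read from `HTMLVideoElement.getVideoPlaybackQuality()`. The `PlaybackSample` type, the `findStutterThresholdMB` helper, the allocation measure and the 5% tolerance in the usage example are all hypothetical.

```typescript
// One observation taken while the page plays video and holds extra memory.
interface PlaybackSample {
  allocatedMB: number;        // extra memory held by the test page
  totalVideoFrames: number;   // cumulative count
  droppedVideoFrames: number; // cumulative count
}

// Returns the allocation level at which the dropped-frame ratio between two
// consecutive samples first exceeds maxDropRatio, or undefined if it never does.
function findStutterThresholdMB(
  samples: PlaybackSample[],
  maxDropRatio: number
): number | undefined {
  let prev: PlaybackSample | undefined;
  for (const cur of samples) {
    if (prev !== undefined) {
      const frames = cur.totalVideoFrames - prev.totalVideoFrames;
      const dropped = cur.droppedVideoFrames - prev.droppedVideoFrames;
      if (frames > 0 && dropped / frames > maxDropRatio) {
        return cur.allocatedMB;
      }
    }
    prev = cur;
  }
  return undefined;
}

// Illustrative data only, not measurements from any real device.
const exampleSamples: PlaybackSample[] = [
  { allocatedMB: 0, totalVideoFrames: 0, droppedVideoFrames: 0 },
  { allocatedMB: 100, totalVideoFrames: 300, droppedVideoFrames: 0 },
  { allocatedMB: 200, totalVideoFrames: 600, droppedVideoFrames: 3 },
  { allocatedMB: 300, totalVideoFrames: 900, droppedVideoFrames: 60 },
];
const thresholdMB = findStutterThresholdMB(exampleSamples, 0.05); // 300
```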

<kaz> https://www.w3.org/2021/Talks/1018-voice-dd-ka/20211018-voice-breakout-dd-ka.pdf

Kaz: Two comments: Regarding the voice user interface, we're organising a workshop, so we can help from that viewpoint
… Talking with people at Tokyo University, they're working on software-defined media, making a virtual device. This approach could also be of interest. They're interested in Web of Things for that purpose
… Joint discussion with WoT group would be helpful

<kaz> Software Defined Media (SDM) project

Benjamin: If we want to start baseline conformance testing, what device to use, what would a useful reference device be to use from a browser vendor point of view?

ChrisL: Great question, don't have an answer. The latest Google or Apple phone is much faster than the typical mobile devices used in some countries
… Don't know how often people get new TVs, but many may be using devices several years old

Jon: The thing about performance testing on TVs is that the product lifetime is a very long time
… We'd love to see a performance measurement that we can pass to suppliers

<jpiesing> TV manufacturers will only meet performance criteria if they (we) are forced to by content providers as a condition to get access to their content.

<jpiesing> A relatively objective test suite with a clear definition of a pass would be good for TV manufacturers who outsource the UA to help focus their suppliers :)

ChrisL: I'm reminded of an article on benchmarking mobile devices from Addy Osmani
… If you compare a top of the line phone to an average phone, for parsing JSON, it's a 1.5 difference in parse time
… We could use that to develop a baseline. Test on different TVs to debug on real world devices

Will: I'm interested in Strike and Lightning. Who runs these projects?

Wouter: It's primarily Comcast open source. It's for the Lightning community. There are lots of variables to see if a device will run it, e.g., WebGL performance
… We're in the process of open sourcing

Will: What this meeting is making clear is that UI rendering is a blocker. It would be an interesting fit to extend our test runner
… We'd want to make it not Lightning focused though

Wouter: That makes sense. WebGL is a great rendering engine, as it gets you close to the GPU
… Thanks to Netflix for setting the bar for rendering performance
… Building the UI in WebGL is expensive and needs specialist skills. We want to allow JS developers to build performant UIs
… Happy to discuss further

Jeff: Will, you mentioned that a key thing to fix is UI rendering
… There's lots of pieces to that. Some may be in WebGPU, some elsewhere
… Are we on a path to fixing it? What else should we be doing?

Will: I don't think we're on a path to fixing it for TVs. TVs may have the weakest processors in the devices you typically use each day
… From a WAVE point of view, we want to measure the problem, hence the point about testing and measuring
… Having a spec to aim for. WAVE is interested in enabling that route
… What's currently happening isn't working

Kaz: When I talked with NHK, they're wondering about performance problems with switching from TV tuner to browser-based streaming and vice-versa
… Is there also a performance problem there?

ChrisN: I don't have data, but TV devices have limited hardware video decoders so you have to reinitialise the decoder pipeline, make sure the content you're switching to is buffered, etc

ChrisL: In response to Wouter, canvas and WebGL is a good solution, but there aren't enough specialists
… Is WebGL a good solution? Is JavaScript on top of that a good solution? Do we need a different rendering that's more performant?
… We need people who've built TV apps to chime in

ChrisN: There are kind of different approaches. W3C provides many of the low level building blocks: WASM, WebGL, WebGPU, Workers
… Is that enough, and leave the rest to apps? Or should we standardize a layer on top of that?
… Web specs define functionality but not performance characteristics. Would it help to define those?
… What's the most useful direction, and how to move forward?

DongYoung: To Jeff's question, my biggest pain is memory optimisation
… A typical TV app uses lots of memory, and the web app doesn't have a way to optimise memory use
… If there's a hint for image caching that could help. Any mechanism that a web app could use to control the cache would be helpful

JohnRiv: What would performance spec language look like?
… What performance tooling could we create?

ChrisL: That depends on how the code is written. The Chrome tooling is there, maybe needs porting to webkit

ChrisN: Web specs run across a range of devices, so it may not make sense to write into the specs themselves
… The approach we take is to fingerprint the make/model of the device, then adapt our content based on our own performance measurements
… We measure which features work well or not, and the app switches those on or off
… I guess it's a kind of progressive enhancement based on device performance, which features are usable and performant
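
The approach ChrisN describes can be sketched as a lookup from device make/model to the app's own measurements, which then drives feature switches. This is a minimal sketch under stated assumptions: the device names, the measured values, the thresholds and the feature names are all hypothetical and do not describe the BBC's actual implementation.

```typescript
// Feature switches the app can toggle (hypothetical names).
interface FeatureFlags {
  animations: boolean;
  backgroundVideo: boolean;
  imageQuality: "low" | "high";
}

// Measurements gathered by the app developer's own testing (illustrative).
interface DeviceProfile { avgFps: number; memoryMB: number; }

// Fallback for devices that have not been measured.
const conservative: FeatureFlags = {
  animations: false,
  backgroundVideo: false,
  imageQuality: "low",
};

// Derive feature switches from measurements. The thresholds are assumptions.
function flagsFor(profile: DeviceProfile | undefined): FeatureFlags {
  if (profile === undefined) return conservative;
  return {
    animations: profile.avgFps >= 30,
    backgroundVideo: profile.avgFps >= 50 && profile.memoryMB >= 1024,
    imageQuality: profile.memoryMB >= 1024 ? "high" : "low",
  };
}

// Hypothetical measurement table keyed by "make/model".
const measured = new Map<string, DeviceProfile>([
  ["ExampleBrand/Model-A", { avgFps: 58, memoryMB: 1024 }],
  ["ExampleBrand/Model-B", { avgFps: 24, memoryMB: 512 }],
]);

function flagsForDevice(makeModel: string): FeatureFlags {
  return flagsFor(measured.get(makeModel));
}
```

Defaulting unknown devices to the conservative set keeps the app usable on hardware that has not been measured yet, at the cost of a plainer UI there.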

Jon: That point is important. Having a target is important, it's something you can aim for
… A set of benchmarks is great, but if there's no target for each benchmark, there's nothing you can use to leverage with suppliers

MarkLomas: On performance metrics, if you want a solid 60 fps or higher, you need to commit to not dropping frames
… We might have good tools in Chrome, but you don't have control over background tasks such as GC
… You also need to understand the complete system as much as the app
… Is there something that can be provided in that?
… For example, working with a game console (PS4) there was a guarantee from the manufacturer on the amount of CPU that they'd use
… Something like that would be valuable

Wrap-up

Jeff: On how to follow up, you could run a TF to investigate and come up with recommendations
… On the discussion on performance, it's something we don't usually standardise. But it's important, so we should study how to do that
… Before we didn't know how to standardise accessibility, but now we do, and we're now starting that process for sustainability
… Performance seems like a graspable project

Kaz: I suggest we create a dedicated TF, to clarify which part can be handled by MEIG, and which needs help from external orgs

ChrisN: Yes, this seems to me like it should be a joint discussion between MEIG and WAVE
… Some parts may better fit W3C, some parts WAVE
… In a practical sense, the next meeting will be held on Wednesday to see the update on Hybridcast
… and then joint discussion with MiniApps on November 2
… We'll be very much interested and happy to host further discussion

Judy: I'm happy to help you get to the right people on accessibility

ChrisN: Definitely, thanks!
… We'll publish our minutes publicly, and send a summary of what we've discussed to the mailing list
… Thank you so much for your contributions, and to all the speakers!

[adjourned]

Minutes manually created (not a transcript), formatted by scribe.perl version 147 (Thu Jun 24 22:21:39 2021 UTC).