13:41:16 RRSAgent has joined #web-networks
13:41:16 logging to https://www.w3.org/2020/02/05-web-networks-irc
13:41:19 Zakim has joined #web-networks
13:41:37 Meeting: Web & Networks IG: Lessons from Network Information API WICG
13:42:03 Agenda: https://lists.w3.org/Archives/Public/public-networks-ig/2020Jan/0003.html
13:42:40 -> https://lists.w3.org/Archives/Public/public-networks-ig/2020Jan/att-0003/Network_Quality_Estimation_in_Chrome.pdf Slides: Network Quality Estimation In Chrome
13:43:44 Chair: Sudeep, DanD
13:43:50 RRSAgent, draft minutes v2
13:43:50 I have made the request to generate https://www.w3.org/2020/02/05-web-networks-minutes.html dom
13:43:54 RRSAgent, make log public
13:49:25 sudeep has joined #web-networks
13:52:28 cpn has joined #web-networks
13:56:47 Present+ sudeep, cpn, dom
13:56:56 Chair+ Song
13:58:06 Present+ Piers_O_Hanlon
14:00:02 Present+ Jordi_Gimenez, Dario_Sabella, Tarun_Bansal, Eric_Siow
14:01:22 Present+ Dan_Druta
14:03:35 Sudeep: today's session is an important one for the IG
14:03:45 ... in the past, we've covered a lot about MEC, CDN, network prediction
14:04:00 ... today we have folks from Google's Chrome team who implemented some APIs around networking
14:04:41 ... we're glad to have our guest speaker Tarun Bansal from the Chrome Team to give us insights about the APIs implemented in the networking space: how they are used, how useful they are, what lessons to draw from them
14:05:00 Tarun: I work on the Google Team and will talk about network quality estimation in Chrome
14:05:21 ... the talk is divided into 2 parts: use cases, and then technical details about how it works
14:05:38 ... my focus in the Chrome team is on networking and web page loading
14:05:50 ... I focus on the tail end of performance, very slow connections e.g. 3G
14:06:13 ... about 20% of page loads happen on 3G-like connections - which feels very slow, e.g. 20s before first content
14:06:25 ... videos would also take a lot of buffering time in these circumstances
14:06:51 ... the 3G share varies from market to market; e.g. 5% in the US, but up to 40% in e.g. developing countries
14:07:14 ... We have a service that provides continuous estimates of network quality, covering RTT and bandwidth
14:07:30 ... we estimate network quality across all the paths, not specific to a single web server
14:07:43 ... this focuses on the common hop from browser to network carrier
14:07:59 ... [this work got help from lots of folk, esp. Ben, Ilya, Yoav]
14:08:21 ... Before looking at the use cases, we need to understand how browsers load Web pages and why Web page loads are slow on slow connections
14:08:43 ... First, it is very challenging to optimize performance of Web pages - takes a lot of resources
14:09:04 ... Web pages typically load plenty of resources before showing any content (e.g. css, js, images, ...)
14:09:28 ... Not all of these resources are equally important - some have no UX impact (e.g. tracking, below-the-fold content)
14:09:47 ... loading everything in parallel works fine on fast connections, but on slow connections, it slows everything down
14:09:51 Chunming has joined #web-networks
14:10:27 piers has joined #web-networks
14:10:32 ... an optimal web page load should keep the network pipe full, and a lower-priority resource should not slow down a higher-priority resource
14:10:55 ... e.g. loading a below-the-fold image should not slow down what's needed to show the top of the page
14:11:04 ... or a JS-for-ad shouldn't slow the core content of the page
14:11:23 ... this means a browser needs to understand the network capacity to optimize loading of resources
14:11:43 ... this is what led to the creation of this network quality estimation service
14:12:30 ... Other uses include so called "browser interventions" which are meant to improve the overall quality of the Web by deviating from standard behavior in specific circumstances
14:12:38 ... in our case, e.g. when a network is very slow
14:13:05 ... another use case is to feed back to the network stack - e.g. using network timeouts
14:13:31 ... in the future, this could also be used to set an initial timeout in a smarter way (e.g. higher timeout in poor connection contexts)
14:13:53 ... lots of use cases for the browser vendor - what use would Web devs make of it?
14:14:26 ... We've exposed a subset of these values to the developers: an RTT estimate, a bandwidth estimate, and a rough categorization of network quality (in 4 values)
14:14:33 ... This was released in 2016
14:14:49 ... and is being used in around 20% of web pages across all Chrome platforms
14:14:54 ... examples of usage:
14:15:38 ... the Shaka player (an open source video player) uses the network quality API to adjust the buffer; Facebook does this as well
14:16:02 ... some developers use it to inform the user that the slow connection will impact the time needed to complete an action
14:16:22 ... Now looking at the details of the implementation
14:16:35 Present+ Jonas_Svennebring
14:16:42 Present+ Doug_Eng
14:17:05 ... The first thing we look at is the kind of connection (e.g. wifi)
14:17:21 ... but that's not enough: there can be slow connections even on Wifi or 4G
14:17:49 ... a challenge in implementing this API is being able to make it work on all the different platforms, which expose very different sets of APIs
14:18:21 ... We also need to make it work on devices as they are, with often very limited access to the network layer
14:18:47 zkis has joined #web-networks
14:18:49 ... Typically, network quality is estimated by sending echo traffic to a server (e.g. speedtest)
14:19:18 ... but this isn't going to work for Chrome: privacy (don't want to send data to a server without user intent)
14:19:32 ... also don't want to maintain a server for this
14:19:47 ... we also want to make the measurement available to other Chromium-based browsers
14:19:55 ... so we're using passive estimation
14:20:07 ... for RTT, we use 3 sources of information based on the platform
14:20:19 ... the first is the HTTP layer which Chrome controls completely
14:20:32 ... the 2nd is the transport layer (TCP) for which some platforms provide information
14:20:46 ... the 3rd is the SPDY/HTTP2 and QUIC/HTTP3 layers
14:21:25 ... for HTTP, you measure the RTT by the time difference between request and response - this is available on all platforms, completely within the Chrome codebase
14:21:46 ... there are limitations: the server processing time is included in the measurement
14:22:33 ... for H2 and QUIC connections, the requests are serialized on the same TCP or UDP connection, which means the HTTP request can be queued behind other requests
14:22:42 ... which may inflate the measured RTT
14:23:01 ... it is mostly useful as an upport bound
14:23:05 s/upport/upper/
14:23:31 Louay has joined #web-networks
14:23:34 ... for the TCP layer, we look at all the TCP sockets the browser has opened, and ask the kernel what RTT it has computed for these sockets
14:23:42 ... then we take a median
14:23:46 present+ Louay_Bassbouss
14:23:57 ... this is less noisy, but it still has its own limitations
14:24:12 ... it doesn't take into account packet loss; it doesn't deal with UDP sockets (e.g. if using QUIC)
14:24:25 ... and it's only available on some platforms - we can't do this on Windows or MacOS
14:24:32 ... this provides a lower bound RTT estimate
14:24:46 ... The 3rd source is the QUIC/HTTP2 Ping
14:25:08 ... Servers are expected to respond immediately to HTTP2 PING
14:25:23 ... this is available in Chrome, and it removes some of the limitations we discussed earlier
14:25:43 ... but not all servers support QUIC/H2, esp in some countries
14:26:01 ... not all servers that support QUIC/H2 support PING despite the spec requirement
14:26:11 ... and it can still be queued behind other packets
14:26:43 ... So we have these 3 sources of RTT; for each source we take all the samples, and we aggregate them with a weighted median
14:27:17 ... we give more weight to the recent samples; compared to TCP which uses weighted average, we use weighted median to eliminate outliers
14:27:44 ... once we have these 3 values, we combine them using heuristics to a single value
14:27:53 ... these heuristics will vary from platform to platform
14:27:59 ... Is that RTT enough?
14:28:25 ... We have found that to estimate the real capacity, we need to estimate the bandwidth
14:28:38 ... there has been a lot of research on this, but none of the approaches worked well for our use case
14:28:58 ... we do not want to check a server; we want a passive estimate
14:29:26 ... What are the challenges in estimating bandwidth? The first one is that we don't have cooperation from the server-side
14:29:48 ... e.g. we don't know what TCP flavor the server is using, we don't know their packet loss rates
14:30:49 ... so we use a simple approach: we measure how many bytes we get in a given time window with well defined properties (e.g. >128KB large, 5+ active requests)
14:31:00 ... the goal being to ensure the network is not under-utilized
14:31:24 ... with all these estimates, how do they quickly adapt to changing network conditions?
14:31:46 ... e.g. entering a parking garage will slow down a 4G connection
14:32:14 ... we use the strength of the wireless signals
14:32:27 ... we also store information on well-known networks
14:32:48 ... To summarize, there are lots of use cases for knowing network quality - not just for browsers, also for Web developers
14:33:08 ... but there are lots of technical challenges in doing that from the app layer without access to the kernel layer
14:33:59 Piers: (BBC) I heard Yoav mention at the IETF that the netinfo RTT exposure might go away for privacy reasons
14:34:07 ... that was back at the last IETF meeting last year
14:34:38 Tarun: it's not clear if we should expose a continuous distribution of RTT - a more granular exposure could work
14:34:55 Chunming has joined #web-networks
14:34:56 Piers: so this is an ongoing discussion - can you say more about the privacy concerns?
14:35:03 Tarun: 2 concerns: one is fingerprinting
14:35:24 ... we round and add noise to the values to reduce fingerprinting
14:35:45 ... another concern is that a lot of Web developers may not know how to consume continuous values
14:35:52 ... simplifying it makes it easier to consume
14:36:52 ... we provide this in the Effective Connection Type - which can be easier to use to e.g. pick which image to load
14:37:22 Piers: we have ongoing work on TransportInfo in IETF that is trying to help with this
14:37:45 Tarun: if the server can identify the network quality and send it back to the browser, the browser could use it more broadly
14:38:52 https://github.com/bbc/draft-ohanlon-transport-info-header/blob/master/draft-ohanlon-transport-info-header.md
14:39:29 Piers: one of the use cases is adaptive video streaming; could also be useful for small object transports (which are hard to estimate in JS)
14:41:19 Tarun: is it mostly for short bursts of traffic?
14:41:29 Piers: it's also for media as well
14:42:32 Tarun: so would the server keep data on typical quality from a given IP address?
14:42:44 Piers: it would be sent with a response header (e.g. along with the media)
14:43:24 DanD: (AT&T) for IETF QUIC, are you considering using the spin bit that is being specified?
14:43:36 Tarun: we're not using it, and I don't think there are plans to use it at the moment
14:43:55 ... QUIC itself maintains an RTT estimate which we're using
14:45:22 Dom: has there been work around network quality prediction - we had a presentation from an Intel team on the topic back in Sep
14:45:46 Tarun: not at the moment - we're relying on what the OS provides
14:46:17 Jonas: what we're doing for network prediction is to use info coming from the network itself (e.g. load shifting across cells)
14:46:25 ... we use this to do forward-looking prediction
14:46:38 Tarun: the challenge is that this isn't available at the application layer
14:47:08 ... e.g. they wouldn't be exposed to the Android APIs
14:47:53 ... an app wouldn't know the tower location - you can know which carrier it is, but not more than that
14:48:04 ... there is also a lot of variation across Android flavors
14:48:24 ... the common set is mostly signal strength and carrier identifier
14:48:48 Sudeep: would it be interesting for the browser to talk to interfaces to the carrier network (e.g. via MEC)?
14:49:05 ... The carrier/operator networks may have more info about the channel conditions
14:49:13 Tarun: definitely yes
14:49:19 ... Android has an API which exposes this information
14:49:30 ... but it never took off, and most device manufacturers don't support it
14:49:47 ... there is a way to expose this in Android
14:50:06 ... I'm not sure what the practical concerns were, but it never took off
14:50:13 ... it would be super-useful if it was available
14:50:36 Sudeep: you spoke about RTT, bandwidth that got defined in W3C
14:51:02 ... but implementations can vary from one browser to another - is there any standardization about how these would be measured, or would this be UA dependent?
14:51:30 Tarun: it's spec'd as a "best-effort estimate" from the browser, so it's mostly up to the browser
14:51:41 ... right now it's only available in Chromium-based browsers
14:52:24 ... even Chromium-based implementations will vary from platform to platform
14:52:36 Dom: can you say more about the fact that it is not available in other browsers?
14:53:01 Tarun: I think it's a question of priority - we have a lot of users in developing markets which helped drive some of the priority for us
14:53:22 Song: (China Mobile) I'm interested in the accuracy of the network quality monitoring
14:53:34 ... you mention aggregating data from 3 sources: HTTP, TCP and QUIC
14:53:51 ... are the weights for these 3 sources fixed, or do they vary based on the scenario?
14:54:16 Tarun: it's very hard to measure accuracy
14:54:43 ... in lab studies (with controlled network conditions), the algorithm does quite well
14:55:01 ... we also do A/B studies, but it's hard given we don't really know the ground truth
14:55:49 ... so we measure the behavior of the consumer of the API, e.g. on the overall page load performance
14:56:11 ... we've seen 10-15% improvements when tuning the algorithm the right way
14:56:32 Song: when you measure the data from these 3 sources, are they exposed to the Web Dev? or only the aggregated value?
14:56:53 ... is there any chance to make the raw source data available to Web browsers?
14:57:01 Tarun: we only provide aggregated values
14:57:08 Piers: how often do you update the value?
14:57:21 Tarun: internally, every time we send or receive a packet
14:57:40 ... we throttle it on the Web API - when the values have changed by more than 10%
14:57:51 Piers: that's a pretty large margin for adaptation
14:58:06 Tarun: most of the developers don't care about very precise estimates
14:58:25 ... it's pretty hard to write pages that take into account that kind of continuous change
14:58:35 Piers: for media, more details are useful
14:58:46 Tarun: even then, you usually only have 2 or 3 resolutions to adapt to
14:59:04 Piers: but the timing of the adaptation might be sensitive
14:59:14 Piers: Any plans to provide more network info?
14:59:19 Tarun: no other plans as of now
14:59:34 ... we're open to it if there are other useful bits to expose
15:00:00 Sudeep: that's one of the topics the group is aiming to build on
15:00:14 ... are there other APIs in this space that you think would be useful to Web developers?
15:00:29 Tarun: I think most developers care about a few different values
15:00:47 ... it's not clear they would use very detailed info
15:01:06 ... another challenge we see is around caching (e.g. different network resources for different network quality)
15:01:52 ... you might be loading new resources because you're on a different network quality, which if it is of low quality is counter-productive
15:02:52 ... In general, server-side estimates are likely more accurate
15:03:06 Sudeep: Thank you Tarun for a very good presentation!
15:03:46 ... Going forward, we want to look at how these APIs can and need to be improved based on Web developers' needs
15:04:07 ... we'll follow up with a discussion
15:04:41 ... Next week we have a presentation by Michael McCool on Edge computing - how to offload computing from a browser to the edge using Web Workers et al
15:04:47 ... call info will be sent to the list
15:06:17 RRSAgent, draft minutes v2
15:06:18 I have made the request to generate https://www.w3.org/2020/02/05-web-networks-minutes.html dom
16:30:17 Zakim has left #web-networks
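
Illustrative sketch (not part of the meeting log): the aggregation Tarun describes at 14:26-14:27 (a weighted median over RTT samples, with more weight on recent samples) and the update throttle mentioned at 14:57 (notify only when the value has changed by more than 10%) might look roughly like the following. This is a minimal sketch under stated assumptions, not Chrome's implementation: the names `Sample`, `weighted_median_rtt` and `should_notify`, the exponential decay of weights, and the 60-second half-life are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Sample:
    """One RTT observation (hypothetical structure, for illustration only)."""
    rtt_ms: float
    age_s: float  # seconds elapsed since the sample was taken


def weighted_median_rtt(samples: Sequence[Sample],
                        half_life_s: float = 60.0) -> Optional[float]:
    """Return the weighted median RTT, weighting recent samples more heavily.

    Assumption: a sample's weight halves every `half_life_s` seconds.
    The log only says recent samples get more weight, not how.
    """
    if not samples:
        return None
    # Pair each RTT with its recency weight and sort by RTT value.
    weighted = sorted((s.rtt_ms, 0.5 ** (s.age_s / half_life_s)) for s in samples)
    total = sum(weight for _, weight in weighted)
    running = 0.0
    for rtt, weight in weighted:
        running += weight
        # The weighted median is the first value at which the cumulative
        # weight reaches half of the total weight.
        if running >= total / 2:
            return rtt
    return weighted[-1][0]


def should_notify(previous: float, current: float,
                  threshold: float = 0.10) -> bool:
    """True when the estimate moved by more than `threshold` (10% by default)."""
    if previous == 0:
        return current != 0
    return abs(current - previous) > threshold * previous


if __name__ == "__main__":
    fresh = [Sample(100, 0), Sample(110, 0), Sample(5000, 0)]
    print(weighted_median_rtt(fresh))
    print(should_notify(100, 105), should_notify(100, 120))
```

With three equally recent samples of 100, 110 and 5000 ms, the weighted median is 110 ms, whereas a plain mean would be about 1737 ms; this is the outlier resistance given as the reason for preferring a median over the weighted average used by TCP.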