IRC log of web-networks on 2020-02-05

Timestamps are in UTC.

13:41:16 [RRSAgent]
RRSAgent has joined #web-networks
13:41:16 [RRSAgent]
logging to
13:41:19 [Zakim]
Zakim has joined #web-networks
13:41:37 [dom]
Meeting: Web & Networks IG: Lessons from Network Information API WICG
13:42:40 [dom]
-> Slides: Network Quality Estimation In Chrome
13:43:44 [dom]
Chair: Sudeep, DanD
13:43:50 [dom]
RRSAgent, draft minutes v2
13:43:50 [RRSAgent]
I have made the request to generate dom
13:43:54 [dom]
RRSAgent, make log public
13:49:25 [sudeep]
sudeep has joined #web-networks
13:52:28 [cpn]
cpn has joined #web-networks
13:56:47 [dom]
Present+ sudeep, cpn, dom
13:56:56 [dom]
Chair+ Song
13:58:06 [dom]
Present+ Piers_O_Hanlon
14:00:02 [dom]
Present+ Jordi_Gimenez, Dario_Sabella, Tarun_Bansal, Eric_Siow
14:01:22 [dom]
Present+ Dan_Druta
14:03:35 [dom]
Sudeep: today's session is an important one for the IG
14:03:45 [dom]
... in the past, we've covered a lot about MEC, CDN, network prediction
14:04:00 [dom]
... today we have folks from Google's Chrome team who implemented some APIs around networking
14:04:41 [dom]
... we're glad to have our guest speaker Tarun Bansal from the Chrome Team to give us insights about the APIs implemented in the networking space; how it is used, how useful it is, what lessons to draw from it
14:05:00 [dom]
Tarun: I work on the Google Team and will talk about network quality estimation in Chrome
14:05:21 [dom]
... the talk is divided into 2 parts: use cases, and then technical details about how it works
14:05:38 [dom]
... my focus in the Chrome team is on networking and web page loading
14:05:50 [dom]
... I focus on the tail end of performance, very slow connections e.g. 3G
14:06:13 [dom]
... about 20% of page loads happen on 3G-like connections - which feels very slow, e.g. 20s before first content
14:06:25 [dom]
... videos would also take a lot of buffering time in these circumstances
14:06:51 [dom]
... the 3G share varies from market to market; e.g. 5% in the US, but up to 40% in e.g. developing countries
14:07:14 [dom]
... We have a service that provides continuous estimates of network quality, covering RTT and bandwidth
14:07:30 [dom]
... we estimate network quality across all the paths, not specific to a single web server
14:07:43 [dom]
... this focuses on the common hop from browser to network carrier
14:07:59 [dom]
... [this work got help from lots of folk, esp. Ben, Ilya, Yoav]
14:08:21 [dom]
... Before looking at the use cases, we need to understand how browsers load Web pages and why Web page loads are slow on slow connections
14:08:43 [dom]
... First, it is very challenging to optimize performance of Web pages - takes a lot of resources
14:09:04 [dom]
... Web pages typically load plenty of resources before showing any content (e.g. css, js, images, ...)
14:09:28 [dom]
... Not all of these resources are equally important - some have no UX impact (e.g. tracking, below-the-fold content)
14:09:47 [dom]
... loading everything in parallel works fine on fast connections, but on slow connections, it slows everything down
14:09:51 [Chunming]
Chunming has joined #web-networks
14:10:27 [piers]
piers has joined #web-networks
14:10:32 [dom]
... an optimal web page load should keep the network pipe full, and a lower-priority resource should not slow down a higher-priority resource
14:10:55 [dom]
... e.g. loading a below-the-fold image should not slow down what's needed to show the top of the page
14:11:04 [dom]
... or a JS-for-ad shouldn't slow the core content of the page
14:11:23 [dom]
... this means a browser needs to understand the network capacity to optimize loading of resources
14:11:43 [dom]
... this is what led to the creation of this network quality estimation service
14:12:30 [dom]
... Other uses include so called "browser interventions" which are meant to improve the overall quality of the Web by deviating from standard behavior in specific circumstances
14:12:38 [dom]
... in our case, e.g. when a network is very slow
14:13:05 [dom]
... another use case is to feed back to the network stack - e.g. using network timeouts
14:13:31 [dom]
... in the future, this could also be used to set an initial timeout in a smarter way (e.g. higher timeout in poor connection contexts)
14:13:53 [dom]
... lots of use cases for the browser vendor - what use would Web dev make of it?
14:14:26 [dom]
... We've exposed a subset of these values to developers: an RTT estimate, a bandwidth estimate, and a rough categorization of network quality (in 4 values)
14:14:33 [dom]
... This was released in 2016
14:14:49 [dom]
... and is being used in around 20% of web pages across all Chrome platforms
14:14:54 [dom]
... examples of usage:
14:15:38 [dom]
... the Shaka player (an open source video player) uses the network quality API to adjust the buffer; Facebook does this as well
14:16:02 [dom]
... some developers use it to inform the user that the slow connection will impact the time needed to complete an action
14:16:22 [dom]
... Now looking at the details of the implementation
14:16:35 [dom]
Present+ Jonas_Svennebring
14:16:42 [dom]
Present+ Doug_Eng
14:17:05 [dom]
... The first thing we look at is the kind of connection (e.g. wifi)
14:17:21 [dom]
... but that's not enough: there can be slow connections even on Wifi or 4G
14:17:49 [dom]
... a challenge in implementing this API is being able to make it work on all the different platforms, which expose very different sets of APIs
14:18:21 [dom]
... We also need to make it work on devices as they are, with often very limited access to the network layer
14:18:47 [zkis]
zkis has joined #web-networks
14:18:49 [dom]
... Typically, network quality is estimated by sending echo traffic to a server (e.g. speedtest)
14:19:18 [dom]
... but this isn't going to work for Chrome: privacy (don't want to send data to a server without user intent)
14:19:32 [dom]
... also don't want to maintain a server for this
14:19:47 [dom]
... we also want to make the measurement available to other Chromium-based browsers
14:19:55 [dom]
... so we're using passive estimation
14:20:07 [dom]
... for RTT, we use 3 sources of information based on the platform
14:20:19 [dom]
... the first is the HTTP layer which Chrome controls completely
14:20:32 [dom]
... the 2nd is the transport layer (TCP) for which some platforms provide information
14:20:46 [dom]
... the 3rd is the SPDY/HTTP2 and QUIC/HTTP3 layers
14:21:25 [dom]
... for HTTP, you measure the RTT by the time difference between request and response - this is available on all platforms, completely within the Chrome codebase
14:21:46 [dom]
... there are limitations: the server processing time is included in the measurement
14:22:33 [dom]
... for H2 and QUIC connections, the requests are serialized on the same TCP or UDP connection, which means the HTTP request can be queued behind other requests
14:22:42 [dom]
... which may inflate the measured RTT
14:23:01 [dom]
... it is mostly useful as an upper bound
14:23:31 [Louay]
Louay has joined #web-networks
14:23:34 [dom]
... for the TCP layer, we look at all the TCP sockets the browser has opened, and ask the kernel what RTT it has computed for these sockets
14:23:42 [dom]
... then we take a median
14:23:46 [Louay]
present+ Louay_Bassbouss
14:23:57 [dom]
... this is less noisy, but it still has its own limitations
14:24:12 [dom]
... it doesn't take into account packet loss; it doesn't deal with UDP sockets (e.g. if using QUIC)
14:24:25 [dom]
... and it's only available on some platforms - we can't do this on Windows or macOS
14:24:32 [dom]
... this provides a lower bound RTT estimate
14:24:46 [dom]
... The 3rd source is the QUIC/HTTP2 Ping
14:25:08 [dom]
... Servers are expected to respond immediately to HTTP2 PING
14:25:23 [dom]
... this is available in Chrome, and it removes some of the limitations we discussed earlier
14:25:43 [dom]
... but not all servers support QUIC/H2, esp in some countries
14:26:01 [dom]
... not all servers that support QUIC/H2 support PING despite the spec requirement
14:26:11 [dom]
... and it can still be queued behind other packets
14:26:43 [dom]
... So we have these 3 sources of RTT; for each source we take all the samples, and we aggregate them with a weighted median
14:27:17 [dom]
... we give more weight to the recent samples; compared to TCP which uses weighted average, we use weighted median to eliminate outliers
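A minimal sketch of a recency-weighted median over RTT samples, as described above. The exponential decay and the `half_life_s` parameter are assumptions made for illustration; the talk does not say how Chrome actually weights samples.

```python
def weighted_median(samples, now_s, half_life_s=60.0):
    """Return the recency-weighted median RTT in ms, or None if no samples.

    samples: list of (timestamp_s, rtt_ms) pairs.
    A sample's weight halves every half_life_s seconds of age
    (hypothetical weighting, chosen for illustration only).
    """
    if not samples:
        return None
    # Pair each RTT with its weight and sort by RTT value.
    weighted = sorted(
        (rtt_ms, 0.5 ** ((now_s - ts) / half_life_s)) for ts, rtt_ms in samples
    )
    total = sum(weight for _, weight in weighted)
    running = 0.0
    for rtt_ms, weight in weighted:
        running += weight
        # The median is the first value at which half the total weight is reached.
        if running >= total / 2:
            return rtt_ms
    return weighted[-1][0]


# One very recent outlier (5000 ms) does not drag the estimate up,
# which a weighted average would not guarantee.
samples = [(0.0, 100), (50.0, 110), (55.0, 120), (58.0, 5000)]
estimate = weighted_median(samples, now_s=60.0)
```

With these inputs the estimate stays at one of the ordinary samples rather than moving toward the outlier, which is the property the median is chosen for.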
14:27:44 [dom]
... once we have these 3 values, we combine them using heuristics to a single value
14:27:53 [dom]
... these heuristics will vary from platform to platform
14:27:59 [dom]
... Is that RTT enough?
14:28:25 [dom]
... We have found that to estimate the real capacity, we need to estimate the bandwidth
14:28:38 [dom]
... there has been a lot of research on this, but none of it worked well for our use case
14:28:58 [dom]
... we do not want to check a server; we want a passive estimate
14:29:26 [dom]
... What are the challenges in estimating bandwidth? The first one is that we don't have cooperation from the server-side
14:29:48 [dom]
... e.g. we don't know what TCP flavor the server is using, we don't know their packet loss rates
14:30:49 [dom]
... so we use a simple approach: we measure how many bytes we get in a given time window with well defined properties (e.g. >128KB large, 5+ active requests)
14:31:00 [dom]
... the goal being to ensure the network is not under-utilized
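A rough sketch of the passive throughput measurement described above: count the bytes received in an observation window and only accept the window if it is large and busy enough to suggest the link was saturated. The two thresholds come from the talk; the function name, units and return convention are assumptions.

```python
MIN_WINDOW_BYTES = 128 * 1024   # window must carry more than 128 KB
MIN_ACTIVE_REQUESTS = 5         # and have at least 5 requests in flight


def throughput_kbps(bytes_received, window_s, active_requests):
    """Return observed throughput in kilobits per second, or None.

    None means the window does not qualify: the network may have been
    under-utilized, so the observation says little about its capacity.
    """
    if window_s <= 0:
        return None
    if bytes_received <= MIN_WINDOW_BYTES:
        return None
    if active_requests < MIN_ACTIVE_REQUESTS:
        return None
    return bytes_received * 8 / 1000 / window_s


# 256 KB received over 2 seconds with 6 requests in flight qualifies.
sample = throughput_kbps(256 * 1024, 2.0, 6)
```

Rejected windows are simply skipped, so the estimate is built only from periods when the pipe was plausibly full.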
14:31:24 [dom]
... with all these estimates, how do they quickly adapt to changing network conditions?
14:31:46 [dom]
... e.g. entering a parking garage will slow down a 4G connection
14:32:14 [dom]
... we use the strength of the wireless signals
14:32:27 [dom]
... we also store information on well-known networks
14:32:48 [dom]
... To summarize, there are lots of use cases for knowing network quality - not just for browsers, also for Web developers
14:33:08 [dom]
... but there are lots of technical challenges in doing that from the app layer without access to the kernel layer
14:33:59 [dom]
Piers: (BBC) I heard Yoav mention in the IETF that the netinfo RTT exposure might go away for privacy reasons
14:34:07 [dom]
... that was back at the last IETF meeting last year
14:34:38 [dom]
Tarun: it's not clear if we should expose a continuous distribution of RTT - a more granular exposure could work
14:34:55 [Chunming]
Chunming has joined #web-networks
14:34:56 [dom]
Piers: so this is an ongoing discussion - can you say more about the privacy concerns?
14:35:03 [dom]
Tarun: 2 concerns: one is fingerprinting
14:35:24 [dom]
... we round and add noise to the values to reduce fingerprinting
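A small sketch of the rounding-plus-noise idea mentioned above. The step size, the noise range and the function name are illustrative assumptions; the talk does not give the values Chrome uses.

```python
import random


def coarsen(value, step=25, noise_fraction=0.1, rng=random):
    """Perturb value by up to +/- noise_fraction, then round to a multiple of step.

    step and noise_fraction are hypothetical parameters; rng is any object
    with a uniform(a, b) method (the random module or a random.Random).
    """
    noisy = value * (1 + rng.uniform(-noise_fraction, noise_fraction))
    return max(0, round(noisy / step) * step)


# A seeded generator makes the example repeatable.
exposed_rtt_ms = coarsen(137, rng=random.Random(0))
```

Many distinct internal estimates collapse onto the same exposed value, which leaves fewer bits available to a fingerprinting script.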
14:35:45 [dom]
... another concern is that a lot of Web developers may not know how to consume continuous values
14:35:52 [dom]
... simplifying it makes it easier to consume
14:36:52 [dom]
... we provide this in the Effective Connection Type - which can be easier to use to e.g. pick which image to load
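To illustrate how a coarse 4-value category can be easier to consume than raw numbers, here is a sketch that buckets an RTT and bandwidth estimate and then picks an image variant. The thresholds, the way the two inputs are combined, and the file names are all assumptions for illustration, not values given in the talk.

```python
def effective_connection_type(rtt_ms, downlink_kbps):
    """Map RTT and downlink estimates to one of 4 coarse categories.

    Thresholds are illustrative placeholders.
    """
    if rtt_ms >= 2000 or downlink_kbps <= 50:
        return "slow-2g"
    if rtt_ms >= 1400 or downlink_kbps <= 70:
        return "2g"
    if rtt_ms >= 270 or downlink_kbps <= 700:
        return "3g"
    return "4g"


# Hypothetical image variants, from lightest to heaviest.
IMAGE_BY_CATEGORY = {
    "slow-2g": "hero-placeholder.svg",
    "2g": "hero-small.jpg",
    "3g": "hero-medium.jpg",
    "4g": "hero-large.jpg",
}


def image_for(rtt_ms, downlink_kbps):
    """Pick which image to load based on the coarse category only."""
    return IMAGE_BY_CATEGORY[effective_connection_type(rtt_ms, downlink_kbps)]
```

The page logic depends on four cases instead of a continuous range, which is the simplification Tarun describes.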
14:37:22 [dom]
Piers: we have ongoing work on TransportInfo in IETF that is trying to help with this
14:37:45 [dom]
Tarun: if the server can identify the network quality and send it back to the browser, the browser could use it more broadly
14:39:29 [dom]
Piers: one of the use cases is adaptive video streaming; could also be useful for small object transports (which are hard to estimate in JS)
14:41:19 [dom]
Tarun: is it mostly for short bursts of traffic?
14:41:29 [dom]
Piers: it's also for media as well
14:42:32 [dom]
Tarun: so would the server keep data on typical quality from a given IP address?
14:42:44 [dom]
Piers: it would be sent with a response header (e.g. along with the media)
14:43:24 [dom]
DanD: (AT&T) for IETF QUIC, are you considering using the spin bit that is being specified?
14:43:36 [dom]
Tarun: we're not using it, and I don't think there are plans to use it at the moment
14:43:55 [dom]
... QUIC itself maintains an RTT estimate which we're using
14:45:22 [dom]
Dom: has there been work around network quality prediction? We had a presentation from an Intel team on the topic back in Sep
14:45:46 [dom]
Tarun: not at the moment - we're relying on what the OS provides
14:46:17 [dom]
Jonas: what we're doing for network prediction is to use info coming from the network itself (e.g. load shifting across cells)
14:46:25 [dom]
... we use this to do forward-looking prediction
14:46:38 [dom]
Tarun: the challenge is that this isn't available at the application layer
14:47:08 [dom]
... e.g. they wouldn't be exposed to the Android APIs
14:47:53 [dom]
... an app wouldn't know the tower location - you can know which carrier it is, but not more than that
14:48:04 [dom]
... there is also a lot of variation across Android flavors
14:48:24 [dom]
... the common set is mostly signal strength and carrier identifier
14:48:48 [dom]
Sudeep: would it be interesting for the browser to talk to interfaces to the carrier network (e.g. via MEC)?
14:49:05 [dom]
... The carrier/operating networks may have more info about the channel conditions
14:49:13 [dom]
Tarun: definitely yes
14:49:19 [dom]
... Android has an API which exposes this information
14:49:30 [dom]
... but it never took off, and most device manufacturers don't support it
14:49:47 [dom]
... there is a way to expose this in Android
14:50:06 [dom]
... I'm not sure what the practical concerns were, but it never took off
14:50:13 [dom]
... it would be super-useful if it was available
14:50:36 [dom]
Sudeep: you spoke about RTT, bandwidth that got defined in W3C
14:51:02 [dom]
... but implementations can vary from one browser to another - is there any standardization about how these would be measured, or would this be UA dependent?
14:51:30 [dom]
Tarun: it's specified as a "best-effort estimate" from the browser, so it's mostly up to the browser
14:51:41 [dom]
... right now it's only available in Chromium-based browsers
14:52:24 [dom]
... even Chromium-based implementations will vary from platform to platform
14:52:36 [dom]
Dom: can you say more about the fact that it is not available in other browsers?
14:53:01 [dom]
Tarun: I think it's a question of priority - we have a lot of users in developing markets which helped drive some of the priority for us
14:53:22 [dom]
Song: (China Mobile) I'm interested in the accuracy of the network quality monitoring
14:53:34 [dom]
... you mention aggregating data from 3 sources: HTTP, TCP and QUIC
14:53:51 [dom]
... are the weights for these 3 sources fixed, or do they vary based on the scenario?
14:54:16 [dom]
Tarun: it's very hard to measure accuracy
14:54:43 [dom]
... in lab studies (with controlled network conditions), the algorithm does quite well
14:55:01 [dom]
... we also do A/B studies, but it's hard given we don't really know the ground truth
14:55:49 [dom]
... so we measure the behavior of the consumer of the API, e.g. on the overall page load performance
14:56:11 [dom]
... we've seen 10-15% improvements when tuning the algorithm the right way
14:56:32 [dom]
Song: when you measure the data from these 3 sources, are they exposed to the Web Dev? or only the aggregated value?
14:56:53 [dom]
... is there any chance to make the raw source data available to Web browsers?
14:57:01 [dom]
Tarun: we only provide aggregated values
14:57:08 [dom]
Piers: how often do you update the value?
14:57:21 [dom]
Tarun: internally, every time we send or receive a packet
14:57:40 [dom]
... we throttle it on the Web API - it only updates when the values have changed by more than 10%
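A minimal sketch of this kind of change-based throttling: the internal estimate can update on every packet, but a new value is only published when it differs from the last published one by more than 10%. The class and method names are hypothetical.

```python
class ThrottledEstimate:
    """Publish a new value only when it moves more than `threshold` (relative)."""

    def __init__(self, threshold=0.10):
        self.threshold = threshold
        self.published = None  # last value exposed to consumers

    def update(self, value):
        """Feed a fresh internal estimate; return True if it was published."""
        if self.published is None:
            self.published = value
            return True
        if abs(value - self.published) > self.threshold * self.published:
            self.published = value
            return True
        return False


rtt = ThrottledEstimate()
events = [rtt.update(v) for v in (100, 105, 111, 110)]
```

Here only the first value and the jump to 111 are published; the small moves to 105 and 110 are absorbed.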
14:57:51 [dom]
Piers: that's a pretty large margin for adaptation
14:58:06 [dom]
Tarun: most of the developers don't care about very precise estimates
14:58:25 [dom]
... it's pretty hard to write pages that take into account that kind of continuous change
14:58:35 [dom]
Piers: for media, more details are useful
14:58:46 [dom]
Tarun: even then, you usually only have 2 or 3 resolutions to adapt to
14:59:04 [dom]
Piers: but the timing of the adaptation might be sensitive
14:59:14 [dom]
Piers: Any plans to provide more network info?
14:59:19 [dom]
Tarun: no other plans as of now
14:59:34 [dom]
... we're open to it if there are other useful bits to expose
15:00:00 [dom]
Sudeep: that's one of the topics the group is aiming to build on
15:00:14 [dom]
... are there other APIs in this space that you think would be useful to Web developers?
15:00:29 [dom]
Tarun: I think most developers care about a few different values
15:00:47 [dom]
... it's not clear they would use very detailed info
15:01:06 [dom]
... another challenge we see is around caching (e.g. different network resources for different network quality)
15:01:52 [dom]
... you might be loading new resources because you're on a different network quality, which if it is of low quality is counter-productive
15:02:52 [dom]
... In general, server-side estimates are likely more accurate
15:03:06 [dom]
Sudeep: Thank you Tarun for a very good presentation!
15:03:46 [dom]
... Going forward, we want to look at how these APIs can and need to be improved based on Web developers' needs
15:04:07 [dom]
... we'll follow up with a discussion
15:04:41 [dom]
... Next week we have a presentation by Michael McCool on Edge computing - how to offload computing from a browser to the edge using Web Workers et al
15:04:47 [dom]
... call info will be sent to the list
15:06:17 [dom]
RRSAgent, draft minutes v2
15:06:18 [RRSAgent]
I have made the request to generate dom
16:30:17 [Zakim]
Zakim has left #web-networks