Tuesday, June 26, 2007

HTTP Client Handshake Characterization

Continuing in the "what's the latency on my DSL connection" theme (see outbound DNS and incoming SMTP posts), we finally get to looking at outbound HTTP connection latency.

I expected this sample to be the best of the lot for two reasons. First, these servers are self-selected by members of my household and therefore have some kind of inherent locality to me. Second, web hosting implies a certain amount of infrastructure and expenditure that the other samples would not necessarily exhibit.

Let's face it - there is more than one Internet delivery system, and if you pay more you get a higher class of service (whether that be uncongested links, content distribution, etc.). Web hosting correlates with folks paying that tariff in a way that SMTP clients do not.

My expectations held up. The numbers are actually even better than I expected.

The sample:

  • 13,282 handshakes
  • 708 unique servers
  • 750 MB of HTTP data
  • 12.5 days

Here is the data: 25ms at best, an impressive median of 48ms, and a worst case of 1.5 minutes. TCP's exponential backoff, driven by hardcoded multi-second timers, becomes really obvious around the 99th percentile and results in some extreme outliers.


percentile - handshake time (ms)
best - 25
10 - 35
20 - 38
30 - 40
40 - 43
50 - 48
60 - 52
70 - 61
80 - 101
90 - 116
worst - 93112 (1.5 mins)
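
As a rough sketch of how a table like the one above can be produced from raw handshake timings: the snippet below uses the nearest-rank percentile definition, which is an assumption (the post does not say which definition was used), and the function names and the sample values in the usage lines are made up for illustration.

```python
import statistics


def percentile(sorted_ms, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are at or below it. p is an integer 1..100."""
    if not sorted_ms:
        raise ValueError("no samples")
    # Integer ceiling of p * n / 100, which avoids float rounding surprises.
    rank = max(1, (p * len(sorted_ms) + 99) // 100)
    return sorted_ms[rank - 1]


def summarize(samples_ms):
    """Return (rows, mean), where rows mirrors the best/10..90/worst layout."""
    data = sorted(samples_ms)
    rows = [("best", data[0])]
    rows += [(str(p), percentile(data, p)) for p in range(10, 100, 10)]
    rows.append(("worst", data[-1]))
    return rows, statistics.mean(data)


def fraction_at_or_below(samples_ms, limit_ms):
    """Share of samples that completed in limit_ms or less."""
    return sum(1 for s in samples_ms if s <= limit_ms) / len(samples_ms)


if __name__ == "__main__":
    # Hypothetical timings in milliseconds, NOT the data from this post.
    samples = [31, 44, 39, 52, 47, 3041, 60, 36, 41, 58]
    rows, mean = summarize(samples)
    for label, value in rows:
        print(f"{label} - {value}")
    print("mean:", mean)
    print("share at or below 100ms:", fraction_at_or_below(samples, 100))
```

A single multi-second outlier in the made-up list is enough to drag the mean far above the median, which is the same effect the real sample shows.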

The mean is 143ms, thanks to some really big outliers caused by exponential backoff of hardcoded 3 second timers, so the median is much more representative here. A full 79 percent of handshakes completed with an RTT of 100ms or less. More impressively, 70 percent took 61ms or less - which is certainly fast enough for most applications.
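
The backoff arithmetic is easy to work through. The sketch below assumes an initial SYN retransmission timer of 3 seconds that doubles after every lost SYN; the retry count of 5 and the function name are assumptions chosen for illustration, not something measured in this sample.

```python
def syn_retransmit_offsets(initial_rto_s=3.0, retries=5):
    """Return the cumulative seconds after the first SYN at which each
    retransmitted SYN goes out, doubling the timer after every loss."""
    offsets = []
    elapsed = 0.0
    rto = initial_rto_s
    for _ in range(retries):
        elapsed += rto          # wait out the current timer
        offsets.append(elapsed)
        rto *= 2                # exponential backoff
    return offsets


if __name__ == "__main__":
    offsets = syn_retransmit_offsets()
    print(offsets)  # [3.0, 9.0, 21.0, 45.0, 93.0]

    # Worst case from the table above, in milliseconds.
    worst_ms = 93112
    print(worst_ms - offsets[-1] * 1000)  # 112.0
```

Under those assumptions the fifth retransmission leaves at 93 seconds, so the 93112ms worst case is at least consistent with five lost SYNs followed by a handshake that took about 112ms. That is an inference from the arithmetic, not something the capture proves.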

These positive results, combined with the slower DNS and SMTP client numbers, show us that the client/server model of the web is provisioned much more effectively than any given link in a real peer-to-peer setup. This certainly shouldn't be a surprise, but it does give the lie to any diagram of the 'net that uses unweighted edges.

This is my last post on latency from my little spot on the grid. I promise.

On a related thought, Mark Nottingham has a great post dealing with support for various aspects of HTTP in network intermediaries. I like characterization studies so much because they provide real data about what to optimize for; Mark's post provides real data about what to worry about in implementations.