Latency, Bandwidth, and Station Wagons focused primarily on the limitations of network bandwidth, and the time required to transmit massive data volumes. While that is an interesting topic, and one that produces some surprising results (like the fact that FedEx is still faster than the Internet), it is not particularly relevant to the subject of Web performance, which depends on the time required to transmit many small files.
My post highlighted It's Still The Latency, Stupid, by William (Bill) Dougherty in edgeblog. Bill's title pays homage to a famous 1996 article by Stuart Cheshire about bandwidth and latency in ISP links, It's the Latency, Stupid.
Over a decade later, Bill points out, Cheshire's writings are still relevant: "One concept that continues to elude many IT managers is the impact of latency on network design ... Latency, not bandwidth, is often the key to network speed, or lack thereof." This is especially true when it comes to the download speeds (or response times) of Web pages and Web-based applications. In this post I explain why, providing some references and examples to support my argument.
The edgeblog debate
Supporting Bill's point about the lack of understanding of this fact, Simon Howard (posting as fragglet) responded with It's the bandwidth, stupid, in which he disagreed strongly with Bill's post. A lively discussion ensued in the comments on edgeblog, beginning here.
Bill then followed his first post with Part 2, which discussed four possible tuning actions to reduce the impacts of network latency:
- Tweak the host TCP settings
- Change the protocol
- Move the service closer to the user
- Use a network accelerator
This post was also disputed by fragglet, prompting another response from Bill in the edgeblog comments, here. These discussions are an excellent illustration of the kinds of misunderstandings that exist about the role of network latency as a determinant of performance. Perhaps the most revealing exchange is this one:
Fragglet: You are saying that latency causes network problems and that by improving latency you can improve your network. I assert that this is false. If you have latency problems, they are a symptom of network congestion. If your network is suffering from serious congestion, it probably needs more bandwidth.
Bill: Wow. It is impressive how someone can miss the point so completely so many times. While network congestion will add to latency, latency is in and of itself a problem. In a network with zero congestion, latency will still be a problem. The problem is distance. More bandwidth cannot improve upon the speed of light. Sorry. This is the whole point of my article. Latency does cause issues unrelated to bandwidth or congestion. Those issues can be reduced with planning.
Indeed! I will now explain why fragglet is wrong and Bill is right. As promised (at the end of my station wagons post) I am going to base my explanation on the 2001 article by Alberto Savoia, Web Page Response Time 101 [**], which I introduced previously. That's because even though his examples are a bit dated (how many people are still using a 28.8Kbps modem today?), Alberto's article provides a concise and readable explanation of the technical principles involved.
[** Warning: even though this reprint is just 6 pages, it's a 2.5MB PDF, so wait until you're on a fast connection.]
The page response time formula
The Complete Formula
R = 2(D+L+C)+(D+C/2)((T-2)/M)
- B = Minimum line speed (bits per second)
- C = Cc + Cs
- Cc = Client processing time (seconds)
- Cs = Server processing time (seconds)
- D = Round trip delay (seconds)
- L = Packet loss (fraction)
- M = Multiplexing factor
- OHD = Overhead (fraction)
- P = Payload (bytes)
- R = Response time (seconds)
- T = Application turns (count)
- W = Window size (bytes)
The key to this issue is a simple formula for Web page download time. Credit for the original research goes to Peter Sevcik and John Bartlett of NetForecast Inc. Their 2001 research report, Understanding Web Performance, was published in Business Communication Review (BCR) in October 2001, and can also be downloaded as a PDF. It explains Web page response time using "The Complete Formula" shown here.
Alberto's contribution was in simplifying this formula to the version shown in his article's Figure 1, reproduced below. He notes that:
This formula makes several generalizations and assumptions, and its accuracy varies through the possible range of values (it tends to overestimate below eight seconds and underestimate over eight seconds).
Actually, even that explanation hides some further assumptions. His reference to eight seconds involves assumptions about typical connection latency and bandwidth, typical Web page sizes, and the typical number of elements (separately downloadable files) that make up a Web page.
But, as he says, for the purposes of this article it will do just fine, since it introduces the key variables that impact page response time and shows you how they relate to each other, without introducing excessive complexity. Alberto describes the six key variables as follows:
The six parameters of Web response time
- Page size: Page size is measured in Kbytes, and on the surface, the impact of this variable is pretty obvious: the larger the page, the longer it takes to download. When estimating page size, however, many people fail to consider all the components that contribute to page size—all images, Java and other applets, banners from third sources, etc.—so make sure you don’t overlook anything.
- Minimum bandwidth: Minimum bandwidth is defined as the bandwidth of the smallest pipe between your content and the end user. Just as the strength of a chain is determined by its weakest link, the effective bandwidth between two end points is determined by the smallest bandwidth between them. Typically the limiting bandwidth is between the users and their ISPs.
- Round trip time: In the context of Web page response time, round-trip time (RTT) indicates the latency, or time lag, between the sending of a request from the user’s browser to the Web server and the receipt of the first few bytes of data from the Web server to the user’s computer. RTT is important because every request/response pair (even for a trivially small file) has to pay this minimum performance penalty. As we shall see in the next section, the typical Web page requires several request/response cycles.
- Turns: A typical Web page consists of a base page [or index page] and several additional objects such as graphics or applets. These objects are not transmitted along with the base page; instead, the base page HTML contains instructions for locating and fetching them. Unfortunately for end-user performance, fetching each of these objects requires a fair number of additional communication cycles between the user’s system and the Web site server—each of which is subject to the RTT delay I just mentioned.
- Server processing time: The last factor in the response time formula is the processing time required by the server and the client to put together [i.e. generate and render] the required page so it can be viewed by the requester. This can vary dramatically for different types of Web pages. On the server side, pages with static content require minimal processing time and will cause negligible additional delay. Dynamically created pages (e.g., personalized home pages like my.yahoo.com) require a bit more server effort and computing time, and will introduce some delay. Finally, pages that involve complex transactions (e.g., credit card verification) may require very significant processing time and might introduce delays of several seconds.
- Client processing time: On the client side, the processing time may be trivial (for a basic text-only page) to moderate (for a page with complex forms and tables) to extreme. If the page contains a Java applet, for example, the client’s browser will have to load and run the Java interpreter, which can take several seconds.
--Alberto Savoia, Web Page Response Time 101, STQE, July/August 2001
[Minor clarifications added]
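Savoia's simplified formula combines these six variables as R ≈ Payload/Bandwidth + (Turns × RTT) + Cs + Cc. As a rough sketch, it can be coded directly; the function below and its example turn count and RTT are my own illustrations, not figures from the article (only the 120 KB page at an effective 4 KB/sec comes from Alberto's dial-up example):

```python
def response_time(payload_kbytes, bandwidth_kbytes_per_sec,
                  turns, rtt_secs, server_secs=0.0, client_secs=0.0):
    """Estimate page response time with Savoia's simplified formula:
    R ~= Payload/Bandwidth + Turns * RTT + Cs + Cc."""
    transfer = payload_kbytes / bandwidth_kbytes_per_sec  # serialization time
    latency = turns * rtt_secs                            # round-trip penalties
    return transfer + latency + server_secs + client_secs

# Alberto's dial-up example (120 KB page, 4 KB/sec effective bandwidth),
# plus an assumed 40 turns at 150 ms RTT:
print(round(response_time(120, 4, 40, 0.150), 1))  # 36.0 seconds
```

Note that the transfer term and the latency term are simply added: even a generous bandwidth upgrade leaves the Turns × RTT term untouched.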
There is no question that these six variables do account for Web page response times, and do operate in the directions prescribed in the formula. As one confirmation, consider the authoritative textbook on the quantitative aspects of computing, Computer Architecture by John L. Hennessy and David A. Patterson ("H&P"). In the 3rd Edition (2003), Chapter 8 covers networks, and page 798 gives this simple formula for the total latency of a message:
Total latency =
Sender overhead + Time of flight + Message size/Bandwidth + Receiver overhead
The Web uses TCP, which transmits data as message segments comprising one or more packets, with each segment being acknowledged. So we can view a Web page download as a succession of H&P's message transmissions. Because each TCP segment is followed by an acknowledgment, H&P's time of flight variable corresponds to Alberto's round-trip time. And so summing N instances of H&P's formula produces Alberto's formula, with Turns having a value of N.
This demonstrates that the formula is correct, but using it to predict page download time involves some complications in selecting the right values to plug into those variables.
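The correspondence can also be checked numerically: summing N copies of H&P's per-message latency gives the same total as Alberto's two-factor form with Turns = N. A small sketch, with all the values assumed for illustration:

```python
def message_latency(overhead_secs, rtt_secs, size_kbytes, bw_kbytes_per_sec):
    # H&P: sender overhead + time of flight + size/bandwidth + receiver overhead.
    # Time of flight is taken as one full round trip, because each TCP
    # segment waits for an acknowledgment. Overheads are lumped into one term.
    return overhead_secs + rtt_secs + size_kbytes / bw_kbytes_per_sec

# A page fetched as N request/response cycles ("turns") -- assumed values:
turns, rtt, bw = 10, 0.1, 150.0        # 10 turns, 100 ms RTT, 150 KB/sec
sizes = [30.0] * turns                 # ten objects of 30 KB each
page_time = sum(message_latency(0.0, rtt, s, bw) for s in sizes)

# Alberto's form of the same total: Payload/Bandwidth + Turns * RTT
assert abs(page_time - (sum(sizes) / bw + turns * rtt)) < 1e-9
```

The assertion holds for any split of the payload across turns, which is exactly why the page-level formula can ignore individual object sizes.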
Turn, turn, turn ...
Alberto does explain how to use the formula. But in practice, I would expect those directions to produce a worst-case estimate for response time, mainly because the value of the Turns variable should probably be lower than Alberto's method suggests. Three factors affect the estimation of turn counts:
- Alberto assumes HTTP 1.0, but HTTP 1.1 is now the predominant protocol. HTTP 1.1 introduced persistent TCP connections, so browsers no longer close and reopen a TCP connection for every file, and (for large files) more data can be sent per round trip, because TCP slow-start has already enlarged the congestion window on a reused connection. So most browsers will now download a Web page using fewer turns.
- Browsers can open up to two parallel connections for each distinct server domain, removing some turns from the synchronous response time path.
- Most users of most Web pages already have some page elements in their browser caches, eliminating the turns that would be needed to fetch them. On the other hand, recent research at Yahoo suggests that browser caching benefits may be less than imagined -- see Performance Research, Part 2: Browser Cache Usage - Exposed! by Tenni Theurer.
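A back-of-the-envelope way to discount a raw turn count for these three effects (this adjustment model is my own simplification, not something from Alberto's or NetForecast's papers):

```python
import math

def effective_turns(raw_turns, parallel_connections=2, cache_hit_fraction=0.0):
    """Roughly discount a raw turn count for parallel downloads and
    browser caching. A back-of-the-envelope model, not measured
    browser behavior: cached objects cost no turn, and parallel
    fetches are assumed to overlap perfectly."""
    uncached = raw_turns * (1.0 - cache_hit_fraction)
    return math.ceil(uncached / parallel_connections)

# 60 raw turns, two connections per domain, 25% of objects cached:
print(effective_turns(60, 2, 0.25))  # 23
```

Even this optimistic adjustment leaves dozens of synchronous round trips on the response-time path.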
Fortunately, such complications do not prevent us from using the formula to explain the relative importance of bandwidth and latency.
Implications of the formula
To demonstrate this, consider a newer version of the formula shown in Figure 2 below. This comes from a September 2006 NetForecast report, Field Guide to Application Delivery Systems. The only difference between this and Alberto's version is in its use of curly equals, signifying "is approximately equal to":
The NetForecast paper goes on to discuss how each of the six factors affects response times, using Figure 3 (below) to summarize its points.
In addition to the issues highlighted by NetForecast, we can evaluate the relative contributions of Bandwidth and Latency to overall response time by simply comparing the two factors [Payload/Bandwidth] and [Turns x RTT] in typical Web environments.
- [Payload/Bandwidth]: Alberto's paper focuses on dial-up performance, using the example of effective bandwidth being 4Kbytes/sec. In this situation, a 120K page may take 30 seconds to transfer, before we even consider any latency effects. But as bandwidth increases, this factor becomes progressively smaller. A broadband connection of 1.5Mbps can transfer about 150Kbytes/sec, reducing this factor to 0.8 secs for a 120K Web page, or 2 secs for a 300K page.
- [Turns x RTT]: According to NetForecast, the average Keynote Business 40 home page requires 60 turns and 300 kilobytes to load. If RTT (or "ping time") is in the range of 100-200ms (typical for many consumers), 60 turns will add 6-12 secs of network latency. Ping times will certainly be faster for broadband connections than for dial-up, because of the extra analog-digital conversion delays imposed by dial-up technology. But at a certain point, the speed of light and the latencies of the carriers' network devices (hubs, routers) limit further improvement.
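Comparing the two factors side by side makes the point concrete. A sketch using the figures quoted above (300 KB payload, 60 turns) and an assumed 150 ms RTT, across a range of bandwidths:

```python
def factors(payload_kbytes, bw_kbps, turns, rtt_secs):
    """Return (transfer, latency) in seconds: the two terms of the formula."""
    transfer = payload_kbytes * 8 / bw_kbps  # KB -> kilobits, over kbps
    latency = turns * rtt_secs
    return transfer, latency

# Keynote Business 40 averages (300 KB, 60 turns) at an assumed 150 ms RTT:
for bw in (33.6, 1500.0, 10000.0, 100000.0):   # dial-up modem up to 100 Mbps
    transfer, latency = factors(300.0, bw, 60, 0.150)
    print(f"{bw:>9.1f} kbps: transfer {transfer:6.2f}s  latency {latency:.2f}s")
```

The transfer term shrinks from roughly 71 seconds at modem speed to a fraction of a second at 100 Mbps, while the 9-second latency term never moves.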
The bottom line ...
In the analysis above, I showed that as connection bandwidth increases, the effect of the first factor in the response time formula approaches zero. No such scaling effect exists for the second factor -- neither latency nor turns can ever be made to approach zero. In fact, as Web sites and applications keep growing in sophistication, turns are increasing. The NetForecast paper states that over the past decade turn counts for the Keynote Business 40 Web sites have grown 12% per year and payload has grown 20% per year.
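Compounding NetForecast's 12%-per-year turn growth shows how the latency term swells over time. The 60-turn, 150 ms baseline below is assumed for illustration:

```python
def projected_turn_delay(base_turns, rtt_secs, years, growth=0.12):
    """Latency term (Turns x RTT) after compounding NetForecast's
    reported 12%/year growth in turn counts."""
    return base_turns * (1 + growth) ** years * rtt_secs

# Assumed baseline: 60 turns at 150 ms RTT
for years in (0, 5, 10):
    delay = projected_turn_delay(60, 0.150, years)
    print(f"after {years:2} years: {delay:5.1f}s of turn delay")
```

At that growth rate the latency penalty roughly triples in a decade, while RTT itself has no comparable room to fall.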
Even the newest AJAX technology may increase, not reduce, turns, as designers attempt to replace large monolithic file downloads with multiple smaller requests. Unless those smaller requests can also be designed to happen asynchronously, while the user is doing something else, they will only add to the delay due to network latency.
This is why I agree with Bill Dougherty that It's Still The Latency, Stupid! And until someone finds a way to move bits faster than the speed of light, that's not going to change.
Tags: William Dougherty, edgeblog, Simon Howard, fragglet, performance, bandwidth, throughput, latency, ping time, round-trip time, Stuart Cheshire, Web performance, Web application, download time, Alberto Savoia, Peter Sevcik, John Bartlett, NetForecast, Fedex, Royal Pingdom, station wagon, Performance Matters