Latency, Bandwidth, and Station Wagons
One concept that continues to elude many IT managers is the impact of latency on network design. 11 years ago, Stuart Cheshire wrote a detailed analysis of the difference between bandwidth and latency in ISP links [It's the Latency Stupid]. Over a decade later, his writings are still relevant. Latency, not bandwidth, is often the key to network speed, or lack thereof.
That's from It's Still The Latency, Stupid by William (Bill) Dougherty, writing in edgeblog on May 31, 2007. Bill follows that opening paragraph with a very readable explanation of the vital importance of latency (round-trip time) as a factor affecting performance in TCP networking. He uses what he calls the Sandbag Problem to illustrate his points:
Shifting a Heap of Sand
Let’s say the two of us are trying to fill sandbags. My job is to scoop sand into a container and hand the full container to you (data). Your job is to empty the container into a sandbag and hand the empty container (ACK) back to me. Occasionally you drop the container so I have to fill it again (Retransmit). If we were standing next to each other, the time it takes for me to hand the container to you, have you empty it, and hand it back to me (latency) would be very small. Now imagine there is a 6′ wall between us, and I need to hand the container over to you.
The wall changes several aspects of our filling operation. First, the size of the container must be smaller because I cannot lift the same weight over my head that I can lift at waist level. Second, the time to complete one cycle would increase because it takes longer to lift the container 6′ than it does 3′. Third, you would drop more containers so retransmissions would increase. As the wall gets taller, the problem gets worse. If the wall were 10′ tall, we would be throwing containers instead of lifting them, so they would need to be even smaller. The containers would be traveling 20′ round trip instead of 12′ so the delay would increase 75%. And we would need to send a lot more containers to move the same amount of sand.
--William (Bill) Dougherty, It's Still The Latency, Stupid, edgeblog, May 31, 2007
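To put rough numbers on the sandbag picture, here is a minimal Python sketch of the usual rule of thumb: if a sender can have only one window of data in flight per round trip, its throughput cannot exceed window size divided by round-trip time, no matter how fast the link is. The function name, the 65,535-byte window (the largest TCP window without window scaling), and the sample round-trip times are my own illustrative assumptions, not figures from Bill's post.

```python
def max_tcp_throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on throughput (Mbit/s) when one window is sent per round trip."""
    bits_per_round_trip = window_bytes * 8
    rtt_seconds = rtt_ms / 1000.0
    return bits_per_round_trip / rtt_seconds / 1_000_000


WINDOW_BYTES = 65_535  # assumption: largest TCP window without window scaling

for rtt_ms in (1, 20, 100):  # assumption: illustrative round-trip times
    limit = max_tcp_throughput_mbps(WINDOW_BYTES, rtt_ms)
    print(f"RTT {rtt_ms:>3} ms -> at most {limit:.1f} Mbit/s")
```

The taller the wall (the longer the round trip), the lower the ceiling, even though the container size and the link itself have not changed.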
This is a topic that anyone who cares about the performance of Web-based applications needs to understand, because latency is the key to most performance optimization and tuning initiatives. Both Bill's post and the ensuing discussion in the readers' comments are very educational, so I decided immediately to write something linking to them. But because I wanted to tie in some other ideas, and to discuss Web Page Response Time 101, a paper written in 2001 by Alberto Savoia (see Four Laws of Web Site Performance), I saved a reminder in a draft post until I had time to focus on it. That was a month ago.
Well, once you drop an idea in your mental in-tray, it starts to search for its proper place in the files. Connections pop up all over the place. Sure enough, once I had made a mental note to write about that edgeblog post, I began to notice other blog posts on the same topic. A couple of weeks later, I saw a post about Latency vs. Throughput by Steve Harris in his DSO Guy blog. Instead of people moving sandbags, Steve uses trains and planes shipping coal supplies in his example:
Shipping a Load of Coal
Imagine you have to move a bunch of coal across the country and deliver it to a coal processor. Now say that on the west coast, the receiver of the coal can process 100 units of coal an hour. You have 1 train that can haul 10,000 units of coal and takes 48 hours to get to its destination. You have 1 plane that can deliver 100 units of coal in 12 hours.
If the most important thing was to have the coal soon, then the plane is faster (lower latency). But, if the most important thing is to have the coal-processing pipeline filled on the west coast over time then train is faster (higher throughput).
--Steve Harris, Latency vs. Throughput in DSO Guy, June 14, 2007
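Steve's numbers are easy to check with a short Python sketch. The function name is hypothetical, and I am assuming back-to-back departures with return trips ignored, which his example does not spell out.

```python
def units_per_hour(load_units: float, trip_hours: float) -> float:
    """Average delivery rate, assuming back-to-back trips and ignoring returns."""
    return load_units / trip_hours


PROCESSOR_RATE = 100  # units of coal the receiver can process per hour (from the example)

# (load in units, trip time in hours), taken from the example above
vehicles = {"train": (10_000, 48), "plane": (100, 12)}

for name, (load, hours) in vehicles.items():
    rate = units_per_hour(load, hours)
    keeps_up = rate >= PROCESSOR_RATE
    print(f"{name}: first coal after {hours} h, {rate:.1f} units/h, "
          f"keeps the processor busy: {keeps_up}")
```

The plane wins on latency (12 hours against 48), but under these assumptions only the train delivers more than the 100 units an hour the processor can consume.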
Then last week I ran across this amusing item in the Royal Pingdom blog. Shunning such analogies as sandbags or coal sacks, it gets straight to the point:
FedEx still faster than the Internet
Imagine a company with two offices in different cities, perhaps even in different countries. Each office has a 100 megabit internet connection. If the company needs to send a large amount of data from one office to the other, theoretically a 100 megabit connection can muster about 45 gigabyte in one hour if there are no bottlenecks on the way. This ends up being just over one terabyte of data in 24 hours.
In other words, for anything larger than one terabyte, it would be faster for this company to just send the data on disks for over-night delivery.
--Royal Pingdom, April 11, 2007
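The arithmetic behind that claim can be sketched in a few lines of Python. The function name is my own, and I am assuming decimal units (1 gigabyte = 10^9 bytes) and a link that runs at its full rated speed the whole time, which is the best case the quote describes.

```python
def gigabytes_transferred(link_mbps: int, hours: int) -> float:
    """Data moved over a fully utilised link, in decimal gigabytes."""
    bits = link_mbps * 1_000_000 * hours * 3600
    return bits / 8 / 1_000_000_000


LINK_MBPS = 100  # from the example: a 100 megabit connection

print(f"1 hour:   {gigabytes_transferred(LINK_MBPS, 1):.0f} GB")
print(f"24 hours: {gigabytes_transferred(LINK_MBPS, 24) / 1000:.2f} TB")
```

One hour comes to 45 GB and a full day to 1.08 TB, which is where the "just over one terabyte" threshold for overnight delivery comes from.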
And once I started actually looking for connections, I found this post by Jonathan Schwartz, CEO and President of Sun:
Moving A Petabyte of Data to Hong Kong by Sailboat
I made a speech last week at which I asserted it was faster to send a petabyte of data from San Francisco to Hong Kong by sailboat, than by the internet.
I got quite a few "how can that possibly be true?" kinds of questions, so here's the math. (Full disclosure, I am a mathematician by training, which guarantees me a lifetime of small "off by one" errors in all subsequent calculations - so if I get something wrong, be gentle).
A petabyte is a thousand terabytes, which is a million gigabytes, or a billion megabytes. Or 8 billion megabits. With me so far?
So if you had a half megabit per second internet connection, which is relatively high in the US (relatively low compared to residential bandwidth available in, say, Korea), it'd take you 16 billion seconds, or 266 million minutes, or 507 years to transmit the data. Can you sail to Hong Kong faster than that? At a full megabit, just divide the time in half. Even at a hundred megabits (about the highest, generally available, of any carrier I've seen), it's a few years.
--Jonathan Schwartz, Moving A Petabyte of Data, Jonathan's Blog, Mar 12, 2007
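Here is a quick Python check of Jonathan's math, using his own figure of 8 billion megabits to the petabyte. The function name is hypothetical, and I am assuming 365-day years and a connection that never drops below its rated speed.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: 365-day years

PETABYTE_MEGABITS = 8_000_000_000  # from the post: 1 PB = 8 billion megabits


def transfer_years(data_megabits: float, rate_mbps: float) -> float:
    """Years needed to push the data through a link at a constant rate."""
    seconds = data_megabits / rate_mbps
    return seconds / SECONDS_PER_YEAR


for rate_mbps in (0.5, 1, 100):
    years = transfer_years(PETABYTE_MEGABITS, rate_mbps)
    print(f"{rate_mbps:>5} Mbit/s -> {years:,.1f} years")
```

At half a megabit per second the result is about 507 years, matching the post; at a hundred megabits it is still roughly two and a half years.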
If you've been following my links and reading more than my excerpts alone, you've probably spotted a common thread running through these articles and the comments. I already knew about the station wagon analogy, and so I was looking for it.
Jumping on the band(width station)wagon
All these writers are following an already well-trodden path, and their conclusions echo a well-known observation usually credited to Andrew S. Tanenbaum, from his book Computer Networks:
Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway
Many writers quote this, for example Tony Dye in Station Wagon Bandwidth. Some then go into more detail; a humorous post on Everything.com calculates the bandwidth of a station wagon at 13 petabytes/second. Theo Lagendijk in Theo's Blog has the full context of the saying:
An industry standard Ultrium tape can hold 200 gigabytes. A box 60 x 60 x 60 cm can hold about 1000 of these tapes, for a total capacity of 200 terabytes, or 1600 terabits (1.6 petabits). A box of tapes can be delivered anywhere in the United States in 24 hours by Federal Express and other companies. The effective bandwidth of this transmission is 1600 terabits/86,400 sec, or 19Gbps. If the destination is only an hour away by road, the bandwidth is increased to over 400Gbps. No computer network can even approach this.
For a bank with many gigabytes of data to be backed up daily on a second machine (so the bank can continue to function even in the face of a major flood or earthquake), it is likely that no other transmission technology can even begin to approach magnetic tape for performance. Of course, networks are getting faster, but tape densities are increasing, too.
If we now look at cost, we get a similar picture. The cost of an Ultrium tape is around $40 when bought in bulk. A tape can be reused at least ten times, so the tape cost is maybe $4000 per box per usage. Add to this another $1000 for shipping (probably much less), and we have a cost of roughly $5000 to ship 200TB. This amounts to shipping a gigabyte for under 3 cents. No network can beat that. The moral of the story is:
Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway.
--Andrew S. Tanenbaum, Computer Networks. Prentice-Hall, 1996
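Tanenbaum's box-of-tapes figures can be reproduced with a short Python sketch. All the inputs come from the passage above; the function and constant names are my own, and I am assuming decimal units (1 terabyte = 10^12 bytes = 1000 gigabytes).

```python
TAPES_PER_BOX = 1000
TAPE_CAPACITY_GB = 200
TAPE_COST_DOLLARS = 40
TAPE_REUSES = 10
SHIPPING_DOLLARS = 1000

BOX_CAPACITY_GB = TAPES_PER_BOX * TAPE_CAPACITY_GB  # 200,000 GB = 200 TB


def effective_gbps(capacity_gb: float, transit_seconds: float) -> float:
    """Effective bandwidth of a shipment: gigabits delivered per second in transit."""
    return capacity_gb * 8 / transit_seconds


def cost_per_gb(capacity_gb: float) -> float:
    """Amortised tape cost plus shipping, per gigabyte shipped."""
    tape_cost_per_use = TAPES_PER_BOX * TAPE_COST_DOLLARS / TAPE_REUSES
    return (tape_cost_per_use + SHIPPING_DOLLARS) / capacity_gb


print(f"24-hour delivery: {effective_gbps(BOX_CAPACITY_GB, 86_400):.1f} Gbit/s")
print(f"1-hour drive:     {effective_gbps(BOX_CAPACITY_GB, 3_600):.0f} Gbit/s")
print(f"Cost per GB:      ${cost_per_gb(BOX_CAPACITY_GB):.3f}")
```

The results (about 18.5 Gbit/s overnight, about 444 Gbit/s for the one-hour drive, and 2.5 cents per gigabyte) agree with the rounded figures in the quotation.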
For more detailed discussions of the price/performance of Sneakernets, an article by Jeff Atwood (plus many reader comments) on the blog Coding Horror reviews some contributions by Jim Gray to The Economics of Bandwidth.
Who said that (first)?
Naturally, a Google search produces hundreds of citations about the bandwidth of station wagons; I have included a few interesting ones I found while writing this. But as with many familiar sayings, it seems that the station wagon analogy has evolved over 25 or more years, and its exact origins are now hard to pin down. A BBC UK article highlights some of this uncertainty. An article at SysAdmin humor cites Dennis Ritchie as a possible source, but then admits ignorance. The Wikipedia article (today) on Sneakernets does attribute both the current and an earlier version of the saying to Tanenbaum, but then descends into a stew of "alleged" references and possibilities.
The original version of this quotation came much earlier; the very first problem in Tanenbaum's 1981 textbook Computer Networks asks the student to calculate the throughput of a St. Bernard carrying floppy disks (which are said to hold 250 kilobytes of data). The first USENET citation is July 16, 1985, and it was widely considered a chestnut already, possibly dating from the 1970s. Other alleged speakers included Tom Reidel, Warren Jackson, or Bob Sutterfield. The station wagon and mag tapes were the canonical version, but variants using trucks or Boeing 747s and later storage technologies such as CD-ROMs would frequently appear.
--Wikipedia article on Sneakernet [July 12, 2007]
I will start a thread in my Who said that? discussion forum, just in case anyone wants to add something more concrete!
Next ...
In my next post, I will continue my review of Alberto Savoia's 2001 paper [**], and explain the relevance of today's digression.
[** Warning: even though it's just 6 pages, it's a 2.5 MB file, so wait until you're on a fast connection].



Reader Comments