<?xml version="1.0" encoding="UTF-8"?>
<!--Generated by Squarespace Site Server v5.9.2 (http://www.squarespace.com/) on Sat, 13 Mar 2010 06:37:16 GMT--><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0"><channel><title>Web Performance Matters</title><link>http://www.webperformancematters.com/journal/</link><description>Web Performance Matters</description><lastBuildDate>Fri, 21 Aug 2009 22:54:51 +0000</lastBuildDate><copyright>Copyright © 2007 UpRight Marketing</copyright><language>en-US</language><generator>Squarespace Site Server v5.9.2 (http://www.squarespace.com/)</generator><item><title>jQuery Library Performance Alert</title><category>Performance in the News</category><category>Rich Internet Applications</category><category>Slowness</category><category>Software products</category><category>jQuery</category><dc:creator>Chris Loosley</dc:creator><pubDate>Fri, 21 Aug 2009 22:51:53 +0000</pubDate><link>http://www.webperformancematters.com/journal/2009/8/21/jquery-library-performance-alert.html</link><guid isPermaLink="false">115864:1113404:4965463</guid><description><![CDATA[<h2 class="Metadata">jQuery libraries moved to Google AJAX Libraries API</h2>

<span class="PageIllustration" style="background-color: #0F1923;"><img title="jQuery Logo" alt="Illustration: jQuery logo" src="http://www.webperformancematters.com/storage/post-graphics/jQuery_logo_215x53.gif" /> </span>

<p>Recently I've been doing a lot of Web design and development work, using the <a href="http://www.squarespace.com/" class="offsite-link-inline">Squarespace</a> platform, "A fully hosted, completely managed environment for creating and maintaining a website, blog or portfolio." I like Squarespace because it is XHTML/CSS based and lets me focus on a site's content and appearance. I get great performance and never have to deal with installing and managing any Web server software. Normally ...</p>

<p>Last night was the exception. I was working on some site updates, and every time I refreshed a page, there was an interminable delay while the page loaded. Coincidentally, I've been experiencing intermittent short outages in my AT&T U-verse service lately, so I put this down to another AT&T problem. But it was already after midnight anyway, so after trying a few things, I finally gave up and went to bed.  </p>

<p>This morning, all was explained, thanks to a post about a recent <a href="http://developers.squarespace.com/design-coding/post/869888" class="offsite-link-inline">jQuery change</a> on the Squarespace Community Forum, by <a href="http://www.mrhobday.com/" class="offsite-link-inline">Stuart Hobday</a>, a Web designer in the UK who (naturally) also uses Squarespace. </p>

<p><a href="http://jquery.com/" class="offsite-link-inline">jQuery</a> is a lightweight JavaScript library that many people find extremely useful for coding site UI behaviors, because its syntax lets you focus on the relationships between JavaScript, HTML, and CSS. That is why it is now very <a href="http://en.wikipedia.org/wiki/List_of_Ajax_frameworks" class="offsite-link-inline">popular</a> among Web developers who rely on JavaScript or <a href="http://en.wikipedia.org/wiki/Ajax_framework" class="offsite-link-inline">Ajax frameworks</a> to animate their sites.</p>

<p>When a Web page contains JavaScript code that uses a library, the library code must first be downloaded to the client. jQuery has always offered Web developers two ways to do this: download the code and serve it (along with the rest of your content) from your own server, or let the page download it directly from a jQuery server located at <em>code.jquery.com</em>.</p> 

<p>The first approach is ideal once your site is in production and everything is working, but while you're in development, it's convenient to be able to grab the latest library code and be sure that you'll automatically get the benefit of any recent bug fixes or enhancements. So I think a lot of developers take the second approach. And I imagine that there's quite a lot of jQuery code out there, in everyday use, still relying on a download from <em>code.jquery.com</em>.</p>

<p>And here's where the performance problem arises. Yesterday, that library was relocated to <em>ajax.googleapis.com</em>:</p>

<blockquote> 
<h4>code.jquery.com Redirected to Google Ajax APIs</h4>

<small>Posted August 20th, 2009 by Mike Hostetler</small>

<p>Starting at 10PM MT on August 20th, <a href="http://code.jquery.com/">code.jquery.com</a> will start redirecting (301) to <a href="http://ajax.googleapis.com/">ajax.googleapis.com</a> [<a href="http://code.google.com/apis/ajaxlibs/documentation/index.html#jquery">http://code.google.com/apis/ajaxlibs/documentation/index.html#jquery</a>].</p>
<p><strong>Immediate Impact:  </strong></p>

<ul>
<li>None</li>
<li>Redirection will occur using <a href="http://en.wikipedia.org/wiki/URL_redirection#HTTP_status_codes_3xx">301 “Permanent Moved”</a></li>
<li>Packed version will be replaced with minified version</li>
</ul>
<p><strong>  Long Term:  </strong></p>
<ul>
<li>Migrate any sites using <a href="http://code.jquery.com/">code.jquery.com</a> to <a href="http://code.google.com/apis/ajaxlibs/documentation/index.html#jquery">Google’s AJAX Libraries API</a></li>
</ul>

<p>Full documentation of Google’s Ajax API are available at <a href="http://code.google.com/apis/ajaxlibs/documentation/index.html#jquery">http://code.google.com/apis/ajaxlibs/documentation/index.html#jquery</a>. <a id="more-259"></a> For your convenience here are the old URLs on <a href="http://code.jquery.com/">code.jquery.com</a> and their new Google Ajax API counterpart:</p>
<dl>
<dt>jquery-latest.js</dt>
<dd><a href="http://ajax.googleapis.com/ajax/libs/jquery/1/jquery.js">http://ajax.googleapis.com/ajax/libs/jquery/1/jquery.js</a></dd>
<dt>jquery-latest.pack.js</dt>

<dd><a href="http://ajax.googleapis.com/ajax/libs/jquery/1/jquery.min.js">http://ajax.googleapis.com/ajax/libs/jquery/1/jquery.min.js</a></dd>
<dt>jquery-latest.min.js</dt>
<dd><a href="http://ajax.googleapis.com/ajax/libs/jquery/1/jquery.min.js">http://ajax.googleapis.com/ajax/libs/jquery/1/jquery.min.js</a></dd>
<dt>jquery.js</dt>
<dd><a href="http://ajax.googleapis.com/ajax/libs/jquery/1/jquery.js">http://ajax.googleapis.com/ajax/libs/jquery/1/jquery.js</a></dd>
<dt>jquery-1.3.2.min.js</dt>
<dd><a href="http://ajax.googleapis.com/ajax/libs/jquery/1.3.2/jquery.min.js">http://ajax.googleapis.com/ajax/libs/jquery/1.3.2/jquery.min.js</a></dd>
</dl>

<p> ... </p>

<p class="QuoteSource">--for the full list, see the <a href="http://blog.jquery.com/2009/08/20/codejquerycom-redirected-to-google-ajax-apis">jQuery blog</a></p>
</blockquote> 

<p></p>

<p>Although the old downloads still work, it's been taking <em>code.jquery.com</em> a long time to issue those 301 redirects to the Google library. In the meantime, your browser will sit there (probably with a message at the bottom saying "waiting for code.jquery.com" or something similar). If you see that, look in the page's source code, usually in the &lt;head&gt; section, for a jQuery download that may be causing the problem. If you control the code, point it at the new URL. Otherwise, contact tech support for the site and tell them what to do.
</p>
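<p>If you have many pages to fix, the rewrite can be scripted. Here is a minimal sketch in Python, assuming only the five file names shown in the excerpt above. The names <code>migrate_jquery_url</code> and <code>OLD_TO_NEW</code> are hypothetical, chosen for illustration; any URL that is not in the table is returned unchanged so that you can check it against the full list on the jQuery blog.</p>

```python
from urllib.parse import urlsplit

# Base URL of the Google-hosted copies, as given in the quoted announcement.
GOOGLE_BASE = "http://ajax.googleapis.com/ajax/libs/jquery"

# Old file name on code.jquery.com -> path under GOOGLE_BASE.
# Only the entries quoted in the excerpt are listed (illustrative, not complete).
OLD_TO_NEW = {
    "jquery-latest.js": "1/jquery.js",
    "jquery-latest.pack.js": "1/jquery.min.js",
    "jquery-latest.min.js": "1/jquery.min.js",
    "jquery.js": "1/jquery.js",
    "jquery-1.3.2.min.js": "1.3.2/jquery.min.js",
}


def migrate_jquery_url(url):
    """Return the Google-hosted equivalent of a code.jquery.com URL.

    URLs on other hosts, or file names missing from OLD_TO_NEW,
    are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.hostname != "code.jquery.com":
        return url
    filename = parts.path.rsplit("/", 1)[-1]
    new_path = OLD_TO_NEW.get(filename)
    if new_path is None:
        return url  # not in the excerpt: look it up in the full list
    return f"{GOOGLE_BASE}/{new_path}"


if __name__ == "__main__":
    print(migrate_jquery_url("http://code.jquery.com/jquery-latest.min.js"))
```

<p>Requesting the new URL directly avoids the slow redirect altogether, which is the whole point of the change.</p>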
 
<p class="Footnote"><strong>Tags:</strong>
<a href="http://technorati.com/tag/jquery" rel="tag">jQuery</a>,  
<a href="http://technorati.com/tag/performance" rel="tag">performance</a>,
<a href="http://technorati.com/tag/ajax" rel="tag">ajax</a>,
<a href="http://technorati.com/tag/google" rel="tag">Google</a>,
<a href="http://technorati.com/tag/download+time" rel="tag">download time</a>,
<a href="http://technorati.com/tag/performance+matters" rel="tag">Performance Matters</a>,
<a href="http://technorati.com/tag/web+performance" rel="tag">Web performance</a>
</p>]]></description><wfw:commentRss>http://www.webperformancematters.com/journal/rss-comments-entry-4965463.xml</wfw:commentRss></item><item><title>Where Performance Meets Availability</title><category>Availability Management</category><category>Performance and Usability</category><category>Reporting on Performance</category><category>Slowness</category><category>Standards (Apdex, ITIL, etc.)</category><dc:creator>Chris Loosley</dc:creator><pubDate>Mon, 09 Feb 2009 08:01:00 +0000</pubDate><link>http://www.webperformancematters.com/journal/2009/2/9/where-performance-meets-availability.html</link><guid isPermaLink="false">115864:1113404:1278241</guid><description><![CDATA[<h2 class="Metadata">Response Time Standards for Web Sites and Web Applications</h2>

<span class="PageIllustration" ><img title="Stopwatch" alt="Illustration: Stopwatch" src="http://www.webperformancematters.com/storage/post-graphics/Stopwatch2-JPG.JPG" /> </span>

<p>Earlier posts about <a href="http://www.webperformancematters.com/journal/2007/7/10/acceptable-response-times.html" class="PMref">Acceptable Response Times</a> have discussed how a Web site or application's responsiveness can <a href="http://www.webperformancematters.com/journal/2005/10/24/delight-satisfy-or-frustrate.html" class="PMref">Delight, Satisfy, or Frustrate</a> customers. </p>

<p>Availability, on the other hand, is a measure of a system's stability. It is not a performance metric; it is a software (or hardware) quality metric. </p>

<p>So, technically speaking, performance and availability are orthogonal issues. You would not call a London taxicab a high performance vehicle just because it runs for 300,000 miles without ever breaking down. Conversely, for broken software, or a car that will not start, performance measures are irrelevant.</p>

<p>Practically speaking, however, <em>availability</em> and <em>responsiveness</em> are interconnected concepts. How slow does a user interface have to be before the application becomes unusable, and might as well be completely <em>unavailable</em>?  Slowness matters; how long will you wait before you click away? Maybe you'll come back later if you really need to see that page. If not, how often do you ever revisit a page that didn't load when you were interested in reading it?</p> 

<p>Posing the question in this way reveals that <em>performance</em> and <em>availability</em> have a complicated relationship, one that involves business objectives, the user's experience, overall workload volumes, system scalability, and cost per transaction processed:</p>

<blockquote> 
<h4>Availability</h4>

<p>While availability is not actually a measure of performance, it certainly has an effect on performance:</p>
<ul>
<li>Unstable systems often have specific effects on performance in addition to the other consequences of system failure. For example, instability affects the average processing time of large batch workloads, because of the possibility of failures during processing.</li>
<li>Additional processing is inevitable as users try to catch up with their work following a period of down time, thereby creating artificial peaks in the distribution of work, leading to higher system utilization levels and longer response times.</li>
<li>Highly available systems ... usually invest in redundant hardware and software and impose a need for careful design that often benefits performance. In this case availability (and any improved performance) is being traded for cost.</li>
</ul>

<p class="QuoteSource">--High-Performance Client/Server</p>
</blockquote> 

<p>This is one reason why I particularly like the analysis methodology underlying the <a href="http://www.webperformancematters.com/journal/2005/10/19/apdex-application-performance-index.html" class="PMref"><strong>Apdex</strong></a> [Application Performance Index] standard. Once you have defined a response-time target, an Apdex-compliant reporting tool allocates every measurement into one of just three zones: <em>Satisfied</em>, <em>Tolerating</em>, and <em>Frustrated</em>. Measurements in the <em>Satisfied</em> zone score a full point, those in the <em>Tolerating</em> zone score half a point, and those in the <em>Frustrated</em> zone score <strong>zero</strong>. Nothing. Zilch.</p>
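<p>To make the scoring concrete, here is a minimal sketch in Python. It assumes the zone boundaries defined in the Apdex specification, where the <em>Tolerating</em> zone runs from the target T up to 4T. The function name <code>apdex_score</code> and the sample measurements are hypothetical, not taken from any particular Apdex reporting tool.</p>

```python
def apdex_score(response_times, target):
    """Return the Apdex score (0.0 to 1.0) for a list of response times.

    response_times: measurements in seconds.
    target: the response-time target T, in seconds.
    Satisfied measurements (at most T) score 1 point, Tolerating
    measurements (above T, at most 4T) score half a point, and
    Frustrated measurements (above 4T) score nothing.
    """
    if not response_times:
        raise ValueError("need at least one measurement")
    frustrated_threshold = 4 * target  # assumption: F = 4T, per the Apdex spec
    satisfied = sum(1 for t in response_times if t <= target)
    tolerating = sum(
        1 for t in response_times if target < t <= frustrated_threshold
    )
    return (satisfied + tolerating / 2) / len(response_times)


if __name__ == "__main__":
    samples = [0.4, 1.2, 2.5, 3.9, 9.0, 0.8]  # hypothetical measurements
    print(apdex_score(samples, target=1.0))
```

<p>Because a <em>Frustrated</em> measurement scores the same whether the page took ten seconds or never arrived at all, a single number can reflect both slowness and unavailability.</p>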

<p>For more about Apdex, see the <a href="http://www.apdex.org/index.html" class="offsite-link-inline" >Apdex Alliance</a> Web site. To receive occasional updates about Apdex-related activities, you can sign up to become a <a href="http://www.apdex.org/supporting.aspx" class="offsite-link-inline" >supporting member</a>. It's still free. 
</p>
 
<p class="Footnote"><strong>This post</strong> contains material first published in <a class="offsite-link-inline" href="http://www.amazon.com/exec/obidos/tg/detail/-/0471162698/002-1562714-5063248?v=glance" >High-Performance Client/Server</a>, Chapter 3: Performance Fundamentals, p61.
<strong>Tags:</strong>
<a href="http://technorati.com/tag/Apdex" rel="tag">Apdex</a>,  
<a href="http://technorati.com/tag/performance" rel="tag">performance</a>,
<a href="http://technorati.com/tag/availability" rel="tag">availability</a>,
<a href="http://technorati.com/tag/response+time" rel="tag">response time</a>,
<a href="http://technorati.com/tag/download+time" rel="tag">download time</a>,
<a href="http://technorati.com/tag/performance+matters" rel="tag">Performance Matters</a>,
<a href="http://technorati.com/tag/web+performance" rel="tag">Web performance</a>
</p>]]></description><wfw:commentRss>http://www.webperformancematters.com/journal/rss-comments-entry-1278241.xml</wfw:commentRss></item><item><title>Why Technorati is Not Usable</title><category>About this site</category><category>Blogs and Publications</category><category>Performance and Usability</category><dc:creator>Chris Loosley</dc:creator><pubDate>Wed, 26 Sep 2007 10:30:00 +0000</pubDate><link>http://www.webperformancematters.com/journal/2007/9/26/why-technorati-is-not-usable.html</link><guid isPermaLink="false">115864:1113404:1278824</guid><description><![CDATA[<span><img class="PageIllustration" title="Usability Model" alt="Illustration: Four dimensions of usability" src="http://www.webperformancematters.com/storage/post-graphics/Usability Model.jpg" />

<p>I was going to write about performance and availability today, but this was not the post I had in mind. Technorati sidetracked me. So I'm going to write about Usability instead. Because Technorati provides a good counter-example -- how <em>not</em> to build a usable Web application that satisfies and retains customers. </p>

<p>In <a href="http://www.webperformancematters.com/journal/2005/10/17/web-usability-a-simple-framework.html" class="PMref">Web Usability: A Simple Framework</a>, I described a way to think about Web site or Web application usability.</p>

<p>In a second post, <a href="http://www.webperformancematters.com/journal/2005/11/9/the-dimensions-of-usability.html" class="PMref">The Dimensions of Usability</a>, I presented the graphic shown here, and discussed the four dimensions in a bit more detail. </p>

<p>These four dimensions are not alternative functional goals, to be weighed against one another and prioritized. Web application effectiveness is a four-step challenge:</p>

<blockquote>
<p>To satisfy customers, a Web site must fulfill four distinct needs: </p>

<ul>
<li><strong>Availability:</strong> A site that's unreachable, for any reason, is useless.</li>
<li><strong>Responsiveness:</strong> Having reached the site, pages that download slowly are likely to drive customers to try an alternate site.</li>
<li><strong>Clarity:</strong> If the site is sufficiently responsive to keep the customer's attention, other design qualities come into play. It must be simple and natural to use – easy to learn, predictable, and consistent.</li>
<li><strong>Utility:</strong> Last comes utility -- does the site actually deliver the information or service the customer was looking for in the first place?</li>
</ul>
<p class="QuoteSource">--<a href="http://www.webperformancematters.com/journal/2005/10/17/web-usability-a-simple-framework.html" class="PMref">Web Usability: A Simple Framework</a>, October 17, 2005</p>
</blockquote>

<p>As in a quiz show, to win the grand prize -- satisfied customers -- you have to get it right at every stage. Fail at any one and you <em>will</em> lose customers. Fail consistently at any one, and you will be out of business.</p>

<p><strong>As I experienced their service today, Technorati seemed to be failing on all four fronts</strong>.</p>

<h3>Availability</h3>

<p>First I noticed that my browser was replacing the thumbnail portrait that usually appears (near the bottom of my sidebar) under <em>technorati links</em> with alternative text. Next I tried Technorati's link to <em>Blogs that link here</em> and, eventually, was rewarded with:</p>

<p><span class="SectionIllustrationInline"><img src="http://www.webperformancematters.com/storage/post-graphics/Firefox%20Connection%20Reset.JPG" alt="Illustration: Firefox Connection Reset screen" title="Firefox Connection Reset screen"/></span></p>

<p>It's not that I desperately need to see my own picture there at all times, or that I think my readers are dying to see the inbound links (or <em>blog reactions</em>, as Technorati calls them). I know that widget in my sidebar has marginal utility -- a few people may use it occasionally. That's why I put it near the bottom, where it doesn't interfere with anyone's ability to browse the site. If it works, it does no harm. On the other hand, if it's broken, it becomes a distinct liability. <strong>Broken links lower the quality of the whole site.</strong></p>

<p>In this case, the <em>connection reset</em> error means that the browser connected to the Web server, but the connection was dropped before a response came back. While this may not qualify as a <em>broken link</em>, it had the same effect: <strong>the requested page was unavailable</strong>. </p>

<p>When I checked back a few hours later, the sidebar link was working. Again, not a surprise. Intermittent outages like this have been characteristic of Technorati for a long time -- see my post on <a href="http://www.webperformancematters.com/journal/2007/6/6/taming-the-technorati-monster.html" class="PMref">Taming the Technorati Monster</a>.</p>

<h3>Performance</h3>

<p>I'm not going to dwell on this, because if you've tried to find things on Technorati lately, you already know how the service performs. For me, it typically ranges from slow to glacial, for a search engine. Maybe it's just my particular interests -- <em>Web Performance</em> and <em>Application Responsiveness</em> are not especially hot topics. Perhaps other people who are interested in more popular topics are scoring cache hits and are actually getting good Web performance. That would be ironic!</p>

<h3>Clarity</h3>

<p>Returning to my problems with the <em>blog reactions</em> widget, normally I would let this incident pass without comment. But today I actually noticed the problem while I was doing a blog search about problems with Technorati's tag indexing and search functions, because my blog seems to have fallen off their radar lately. </p>

<p>In the past, tags used in a post would be indexed and returned in a search within the hour, often within minutes. Today, Technorati's search function was sure I had not published anything for the last 27 days. But when I navigated manually to their page for my blog, they displayed excerpts of all my more recent posts.</p>

<p>Some more digging revealed that even though Technorati's site was up, and even though I could use it to navigate manually to a page listing blog reactions, the sidebar link to that same information, which Technorati's widget was generating, did not work. <strong>In any Web site or application, these kinds of internal inconsistencies are hugely frustrating. They make me doubt the accuracy and completeness of any information the application returns.</strong></p>

<p> And not surprisingly, searching for help on Technorati itself does nothing to reassure me that they will be fixing these problems anytime soon. Quite the contrary, it confirms their problems, as in this amusing response:</p>

<span class="SectionIllustrationInline" style="margin:0 0 10px 0; padding:5px;"><img src="http://www.webperformancematters.com/storage/post-graphics/Technorati%20Search%20Problems.JPG" alt="Illustration: Technorati Search Problems" title="Technorati Search Problems"/></span>

<p>(Is Technorati <a href="http://www.uprightmatters.com/blog-home/2007/9/25/lets-stop-beatin-round-the-bush.html" class="offsite-link-inline">beatin' 'round the bush</a> here? Can a search engine actually lie about what it <em>really</em> knows? :)</p>

<h3>Utility</h3> 

<p>Google's <em>Blog Search</em> feature, however, did know something. It pointed me to this sensible post by <a href="http://www.blogger.com/profile/17388497877158577422" class="offsite-link-inline">ChristineMM</a>:</p>

<blockquote>
<h4>Trying Out Blog Widgets and Tools</h4>

<p>I would like a search box on my blog that lets my readers (and me) search for content from within the pages of my blog. The Google one that I used to use was not working right, I’d search for a word that was right in front of me or even in a blog post title and it would say there were no matches, so I dumped that function.</p>

<p>I then for a long time used a Technorati box. At some point I realized it was not working well either. Again I’d search for a keyword that was in a blog post title and it would say there were no matches. So this week I deleted it from my sidebar. What is the point of having a blog reader search for a topic on my blog, be told I never blogged on it, when in reality, I actually did?</p>

<p>...</p>

<p>One last thing I’ll mention is that I get a ton, and I mean a ton of blog readers through Google primarily and also some other Internet search engines. I feel that my regular use of Technorati tags helps my blog posts be found by Google and the other search engines. This drives traffic to my blog. So if you want to drive traffic to your blog, use Technorati tags in every blog post of substance.</p>

<p class="QuoteSource">--<a href="http://thethinkingmother.blogspot.com/2007/09/trying-out-blog-widgets-and-tools.html" class="offsite-link-inline">The Thinking Mother</a>, September 23, 2007</p>
</blockquote>

<p>Well said, Christine! If an application does not deliver the service you need, it's useless. I dropped Technorati's search box some time ago, for similar reasons. The integrated Squarespace search box is 1000 times more useful for searching the blog, and Google has the Web covered far more effectively than Technorati, in my opinion.</p>

<h3>The bottom line</h3>

<p>I think Christine may have homed in on the essence of the matter, sad though it may be. Unless Technorati can recover its original sense of purpose and fix its technical problems, it's not going to survive as an independent, useful service. Perhaps its most significant contribution will be its promotion of a standard tagging format that is easily recognized and reused <em>by other search engines</em>.</p>

<p>So I'm not giving up on my Technorati tags yet, but I'm not counting on getting much value from their blog indexing or searching tools either. I've already removed their <em>blog search</em> and <em>tag cloud</em> functions from my sidebar, and their <em>blog reactions</em> widget is now on probation. Any more problems and it will be the next to go.</p>

<p class="Footnote"><strong>Tags:</strong> 
<a href="http://technorati.com/tag/technorati" rel="tag">Technorati</a>,
<a href="http://technorati.com/tag/tagging" rel="tag">tagging</a>,
<a href="http://technorati.com/tag/blog+reactions" rel="tag">blog reactions</a>,  
<a href="http://technorati.com/tag/problems" rel="tag">problems</a>,
<a href="http://technorati.com/tag/usability" rel="tag">usability</a>,
<a href="http://technorati.com/tag/availability" rel="tag">availability</a>,
<a href="http://technorati.com/tag/consistency" rel="tag">consistency</a>,
<a href="http://technorati.com/tag/clarity" rel="tag">clarity</a>,
<a href="http://technorati.com/tag/utility" rel="tag">utility</a>,
<a href="http://technorati.com/tag/Christinemm" rel="tag">Christinemm</a>,
<a href="http://technorati.com/tag/Thinking+Mother" rel="tag">Thinking Mother</a>,
<a href="http://technorati.com/tag/web+performance" rel="tag">Web performance</a>,
<a href="http://technorati.com/tag/performance+matters" rel="tag">Performance Matters</a>
</p>]]></description><wfw:commentRss>http://www.webperformancematters.com/journal/rss-comments-entry-1278824.xml</wfw:commentRss></item><item><title>Human Factors and Blog Design</title><category>About this site</category><category>Blogs and Publications</category><dc:creator>Chris Loosley</dc:creator><pubDate>Sat, 22 Sep 2007 07:30:00 +0000</pubDate><link>http://www.webperformancematters.com/journal/2007/9/22/human-factors-and-blog-design.html</link><guid isPermaLink="false">115864:1113404:1269083</guid><description><![CDATA[<div class="PageIllustration"><img src="http://www.webperformancematters.com/storage/post-graphics/coding-horror-official-logo-small.png" alt="Illustration: Coding Horror logo" title="Coding Horror logo"/>
<br /><span class="PictureCaption"><a href="http://www.codinghorror.com/blog/" class="offsite-link-inline">Coding Horror</a></span></div>

<p>The best products are designed with <a href="http://en.wikipedia.org/wiki/Human_factors" class="offsite-link-inline">Human Factors</a> in mind. That's why <a href="http://www.webperformancematters.com/display/ShowJournal?moduleId=1113404&categoryId=95637" class="PMref">Web design and usability</a> is a frequent topic of my <em>Web Performance Matters</em> blog.</p>

<p>Jeff Atwood's blog -- <em>Coding Horror</em> -- focuses on <em>programming and human factors</em>. And according to a recent <a href="http://www.dailyblogtips.com/interview-with-jeff-atwood-from-coding-horror/" class="offsite-link-inline">interview</a> with Jeff on the site <em>Daily Blog Tips</em>, "the blog is attracting over 500,000 unique visitors every month, and it also counts 60,000 RSS readers, meaning that Jeff probably knows what he is talking about".</p>

<p>The <em>Coding Horror</em> logo was originally created to mark examples of dangerous code in the programming classic <a href="http://www.amazon.com/exec/obidos/ASIN/0735619670/" class="offsite-link-inline">Code Complete</a> by <a href="http://www.stevemcconnell.com/" class="offsite-link-inline">Steve McConnell</a>, which <a href="http://www.codinghorror.com/blog/archives/000021.html" class="offsite-link-inline">Jeff rates</a> as his "all-time favorite programming book."</p>

<p>I have <a href="http://www.amazon.com/gp/reader/0735619670/ref=sib_books_pg/105-5052943-5499615?ie=UTF8&keywords=Chris%20Loosley&p=S002&checkSum=X870d5rEZ6o3p7%252FpTRIE33KEyKo8%252FsKXBj4Qz4k3Ob0%253D" class="offsite-link-inline">recommended <em>Code Complete</em></a> myself.  <a href="http://www.codinghorror.com/blog/archives/000020.html" class="offsite-link-inline">Jeff's favorite books</a> are on my shelf too. So I respect his judgment and recommend his blog, which I have added to the blogroll on <em>Web Performance Matters</em>.
</p>

<h3>Thirteen blog clichés</h3>

<p>Jeff recently published <a href="http://www.codinghorror.com/blog/archives/000834.html" class="offsite-link-inline"><em>Thirteen Blog Clichés</em></a>, a post summarizing his "opinions about what makes blogs work well, and what makes blogs sometimes not work so well." These are presented as a list of common mistakes to avoid (or <a href="http://en.wikipedia.org/wiki/Anti-pattern" class="offsite-link-inline">anti-patterns</a>). If you have a blog, or are designing one, you've probably read similar articles before. Even so, Jeff's checklist is worth a look. All such lists tend to contain a core set of common guidelines to follow and/or pitfalls to avoid, but some of Jeff's opinions step outside the conventional wisdom.</p>

<p>Because I maintain two blogs -- <em>Web Performance Matters</em> and <a href="http://www.uprightmatters.com/" class="offsite-link-inline"><em>UpRight Matters</em></a> -- I decided to rate both blogs against Jeff's criteria. Here are edited versions of his recommendations, and my responses. To read Jeff's full discussions of each guideline, see the original. And for the full story, see the many responses posted by Jeff's readers in the comments section of his blog. </p>

<blockquote class="highlight">
<span class="full-image-float-right"><img src="http://www.webperformancematters.com/storage/post-graphics/blog-calendar_opt.jpg" title="Blog Cliche -- Calendar" alt="Illustration: Blog Cliche -- Calendar" /></span>

<h4>1. The Useless Calendar Widget</h4>

<p>I can't think of a <em>single</em> time I have ever found the blog calendar widget helpful. My computer already has a calendar function, so it's not like I need another calendar displayed in my web browser.</p>
<p>Every post carries an obvious datestamp, so I can easily discern when it was published. But knowing whether someone posted an entry on the third Tuesday of the month? Utterly useless. </p>

<p class="QuoteSource">--Jeff Atwood, <a href="http://www.codinghorror.com/blog/archives/000834.html" class="offsite-link-inline"><em>Thirteen Blog Clichés</em></a> [edited]</p>
</blockquote>

<p>I agree! Someone reading a blog like <a href="http://www.dailykos.com/" class="offsite-link-inline">Daily Kos</a>, which publishes daily about politics or current affairs, might find a calendar useful. But a calendar isn't appropriate for our content, so we've never thought of including one. Even if we had, the Squarespace publishing platform we use (see bottom of sidebar) doesn't offer a blog calendar widget -- another sign that it's not in great demand.</p> 

<blockquote class="highlight">
<h4>2. Random Images Arbitrarily Inserted In Text</h4>

<p>One of the cardinal rules of <a href="http://www.useit.com/papers/webwriting/" class="offsite-link-inline">web writing</a> is to <em>avoid large blocks of text</em>. There are plenty of <a href="http://www.useit.com/alertbox/9703b.html" class="offsite-link-inline">excellent web writing guides</a> that exhort you to break up your text, using bullets, numbered lists, quotes, paragraph breaks, images -- anything to avoid creating an intimidating wall of dense, impenetrable text. </p>
<p>But like all good advice, (this) can be taken too far. For example, when you find yourself inserting random pictures into your writing for the sole purpose of breaking up the text. As the old adage goes, <em>a picture is worth a thousand words</em>. <strong>But you should no more insert a random image into your writing than you would insert a thousand random words into your writing.</strong></p>
<p>Images are <em>not</em> glorified paragraph breaks. Images should contribute to the content and meaning of the article in a substantive way. And if they don't, they should be cut. Mercilessly.</p>
<p class="QuoteSource">--Jeff Atwood, <a href="http://www.codinghorror.com/blog/archives/000834.html" class="offsite-link-inline"><em>Thirteen Blog Clichés</em></a> [edited]</p>
</blockquote>

<p>I know I am sometimes guilty of writing long posts. But I won't write a thousand words unless I have something worthwhile (I hope :-) to explain, and I try to keep all my posts interesting by breaking up the text using <a href="http://www.webperformancematters.com/journal/2006/3/25/managing-rias-6-measurement-challenges.html" class="PMref">headings</a> or <a href="http://www.webperformancematters.com/journal/2007/8/21/asynchronous-architectures-4.html" class="PMref">images</a>. And I promise that we will <em>never</em> include an image that bears no relationship to the subject matter!</p>

<blockquote class="highlight">
<h4>3. No Information on the Author </h4>

<p>Every time a reader encounters a blog with no name in the byline, no background on the author, and no simple way to click through to find out <em>anything</em> about the author, it devalues not only the author's writing, but the credibility of blogging in general.</p>

<p>Maintaining a blog of any kind takes quite a bit of effort. It's irrational to expend that kind of effort without putting your name on it so you can benefit from it. And so we can too. It's a win-win scenario for you, Mr. Anonymous.</p>

<p class="QuoteSource">--Jeff Atwood, <a href="http://www.codinghorror.com/blog/archives/000834.html" class="offsite-link-inline"><em>Thirteen Blog Clichés</em></a> [edited]</p>
</blockquote>

<p>I agree! That's why we provide <a href="http://www.webperformancematters.com/objectives/" class="PMref">brief</a> <a href="http://www.uprightmatters.com/author-184886/" class="UMref">introductions</a> and <a href="http://www.uprightmarketing.com/principals/" class="UMref">longer</a> author pages.</p>

<blockquote class="highlight">
<h4>4. Excess Flair</h4>

<span class="full-image-float-none"><img src="http://www.webperformancematters.com/storage/post-graphics/social-bookmarks.png" alt="Illustration: Social bookmark icons" title="Social bookmark icons"/></span>

<p>Blogs work because they're simple. When we clutter up our blogs with a zillion widgets, features, and add-ons, we're destroying an essential part of what makes blogs worthwhile. Examples include "crazy" JavaScript image loading techniques, annoying pop-up image previews of links, and pictures of the last 10 visitors to your blog.</p>
    
<p>Before adding any new "feature" to your blog, consider whether its value outweighs the additional complexity it introduces. </p>

<p class="QuoteSource">--Jeff Atwood, <a href="http://www.codinghorror.com/blog/archives/000834.html" class="offsite-link-inline"><em>Thirteen Blog Clichés</em></a> [edited]</p>
</blockquote>

<p>This recommendation can be controversial -- see the comments on <a href="http://www.codinghorror.com/blog/archives/000587.html" class="offsite-link-inline">Jeff's original post</a> on this topic. But I agree with Jeff, and I do try to <em>reduce</em> the clutter whenever possible. See my decision on item 6 below, for example. </p>

<blockquote class="highlight">
<h4>5. The Giant Blogroll</h4>

<span class="full-image-float-none"><img src="http://www.webperformancematters.com/storage/post-graphics/giant-blogroll-2.JPG" alt="Illustration: Giant Blogroll" title="Giant Blogroll"/></span>

<p>Citing your references and influences is a great and necessary thing, but obsessively listing every single blog you read is just noise. If you're really reading this many blogs, you should be linking to them organically in your blog posts, in a sort of natural quid pro quo. Wearing a giant blogroll on your sleeve is an empty gesture that feels artificial and insincere.</p>

<p class="QuoteSource">--Jeff Atwood, <a href="http://www.codinghorror.com/blog/archives/000834.html" class="offsite-link-inline"><em>Thirteen Blog Clichés</em></a> [edited]</p>
</blockquote>

<p>Agreed! On <em>Web Performance Matters</em>, I aim to keep <a href="http://www.webperformancematters.com/journal/2007/5/20/a-web-performance-blogroll.html" class="PMref">my blogroll</a> focused, and group the links into categories. We have not added a blogroll on <em>UpRight Matters</em> yet, but we plan to adopt the same approach.</p>

<blockquote class="highlight">
<h4>6. The Nebulous Tag Cloud</h4>

<span class="full-image-float-right"><img src="http://www.webperformancematters.com/storage/post-graphics/tagcloud.png" alt="Illustration: Tag cloud" title="Tag cloud"/></span>
<p>Tagging content easily beats organizing everything into <a href="http://www.codinghorror.com/blog/archives/000246.html" class="offsite-link-inline">hierarchical folders</a>, and tag categories on blogs are moderately useful, particularly for bloggers who tend to bounce around among many different topics. What I've <em>never</em> found useful, however, is the stereotypical tag cloud visualization, where the size of the tag word varies with its frequency. </p>

<p>The perception is that tag cloud visualizations are cool, like badges of honor for the tagging club. The reality is that tag cloud visualizations are chaotic, noisy, and unusable. Keep the tagging, lose the cloud. A simple sorted list of tags, along with the number of posts associated with each tag, is much more effective.</p>

<p class="QuoteSource">--Jeff Atwood, <a href="http://www.codinghorror.com/blog/archives/000834.html" class="offsite-link-inline"><em>Thirteen Blog Clichés</em></a> [edited]</p>
</blockquote>

<p>Content tagging and indexing is a complex subject, and one I have given much thought while developing our blogs. [I even read <a href="http://www.everythingismiscellaneous.com/" class="offsite-link-inline">Everything is Miscellaneous</a>, and started to write a post about it until I realized that I was just adding to the <em>echo-chamber</em> on that topic. See item 11 below.]</p>

<p>I believe that tagging with keywords has value, but the resulting <a href="http://en.wikipedia.org/wiki/Folksonomy" class="offsite-link-inline"><em>folksonomy</em></a> is most useful as a supplement to, not a replacement for, a carefully designed and consistently applied classification scheme or <a href="http://en.wikipedia.org/wiki/Information_architecture" class="offsite-link-inline">information architecture</a>. Therefore we will continue to index our content using both methods. </p>

<p>However, despite investing a lot of time implementing a <a href="http://www.webperformancematters.com/journal/2007/5/27/customizing-the-technorati-tag-cloud.html" class="offsite-link-inline">Technorati tag cloud</a> for <em>Web Performance Matters</em>, which has been sitting in my sidebar for 4 months, I have come to the same conclusion as Jeff -- it takes up space without adding any value. So I've now removed it.</p>

<p>I see this as an example of the dynamic nature of blogging. It's agile publishing: you don't have to get everything right the first time. You can create something, try it out for a while, refine it, or remove it altogether. In this vein, revising your blog's layout is greatly simplified if your publishing platform is CSS-based, like <a href="http://www.squarespace.com/?partnerTag=cj&planTag=blg" class="offsite-link-inline">Squarespace</a>.</p> 

<blockquote class="highlight">
<h4>7. Excessive Advertisements</h4>

<p>Advertising is a fact of life, but your blog is not <a href="http://www.flickr.com/photos/stuckincustoms/440698504/" class="offsite-link-inline">Times Square</a>. Does every square inch of whitespace <i>have</i> to be filled with paid links, Google AdSense, and ad banners? </p>

<p>Here's a related article on <a href="http://www.sitepronews.com/archives/2007/jan/15.html" class="offsite-link-inline">blog usability</a> that's a perfect -- even ironic -- example of how you can hurt your usability with excessive, obnoxious advertising. It's everywhere.</p>

<p>It is almost <i>never</i> in the reader's interest to see advertisements, so tread very lightly, and be respectful of your audience. If you take the time to advertise responsibly, you may find that readers appreciate you for it.</p>

<p>Well, probably not, but it can't hurt to try.</p>

<p class="QuoteSource">--Jeff Atwood, <a href="http://www.codinghorror.com/blog/archives/000834.html" class="offsite-link-inline"><em>Thirteen Blog Clichés</em></a> [edited]</p>
</blockquote>

<p>We do have a few ads. I try to organize them tastefully, so that they don't interfere with the content.</p>

<blockquote class="highlight">
<h4>8. This Ain't Your Diary</h4>

<p>Let's be perfectly clear: readers aren't coming to your blog <a href="http://www.codinghorror.com/blog/archives/000536.html" class="offsite-link-inline">to read about you</a>. They're coming to find out <a href="http://headrush.typepad.com/creating_passionate_users/2005/01/users_shouldnt_.html" class="offsite-link-inline">what it can do for them</a>.</p>

<span class="full-image-float-right"><img src="http://www.webperformancematters.com/storage/post-graphics/diary.jpg" alt="Illustration: Diary" title="Diary"/></span>

<p>That said, blogs are a place for writers to find an interested audience, and a place for readers to find a helpful peer and a unique voice. It's OK to <a href="http://software.ericsink.com/entries/Goodbye_Sadie.html" class="offsite-link-inline">be yourself</a>; at some level, it is a cult of personality: people are reading not only because your content is useful to them, but because they like you. </p>

<p>It's normal to inject a regular dose of yourself into the conversation. But like Tabasco sauce and other powerful seasonings, a little YOU goes a long way. A <i>really</i> long way.  Write accordingly.</p>

<p class="QuoteSource">--Jeff Atwood, <a href="http://www.codinghorror.com/blog/archives/000834.html" class="offsite-link-inline"><em>Thirteen Blog Clichés</em></a> [edited]</p>
</blockquote>

<p>Agreed! I won't be writing about my experiences remodeling my house, unless I see a connection worth exploring.</p>

<blockquote class="highlight">
<h4>9. Sorry I Haven't Written in a While</h4>

<p>If you haven't posted anything new to your blog in a while, don't waste our time with apologies. Just write! The best apology is new and improved content. Maybe with a wee bit more consistency this time, though:</p>

<ul class="grouptight">
<li>Pick a schedule you can live with, and stick to it</li>
<li>Don't produce <a href="http://www.codinghorror.com/blog/archives/000910.html" class="offsite-link-inline">substandard</a> posts, just to keep to a schedule</li>
<li>Talent is far less important than <a href="http://www.codinghorror.com/blog/archives/000187.html" class="offsite-link-inline">enthusiasm</a></li>
</ul>
 
<p>And the best way to demonstrate your enthusiasm -- and to improve -- is to get out there and <i>write</i>. Regularly.</p>

<p class="QuoteSource">--Jeff Atwood, <a href="http://www.codinghorror.com/blog/archives/000834.html" class="offsite-link-inline"><em>Thirteen Blog Clichés</em></a> [edited]</p>
</blockquote>

<p>My <a href="http://www.webperformancematters.com/objectives/" class="PMref">objectives</a> for <em>Web Performance Matters</em> are much the same as when I started the blog two years ago -- <em>to contribute <strong>an organizing framework</strong> and <strong>a regular supply of ideas</strong></em>.  I have to admit, I've had a few long gaps in my writing. I've also apologized and promised to do better! But after reading Jeff's advice, I'm in a bind. Should I apologize for apologizing? I guess not. I'll just keep writing.</p>

<blockquote class="highlight">
<h4>10. Blogging About Blogging</h4>

<p>I find meta-blogging -- blogging about blogging -- <i>incredibly</i> boring. I said as much in a recent <a href="http://www.dailyblogtips.com/interview-with-jeff-atwood-from-coding-horror/" class="offsite-link-inline">interview</a> on a site that's all about blogging (hence the title, Daily Blog Tips). If you accept the premise that most of your readers are <i>not</i> bloggers, then it's highly likely they won't be amused, entertained, or informed by a continual stream of blog entries on the art of blogging.</p>
        
<p>Meta-blogging is like masturbating. Everyone does it, and there's nothing wrong with it. But writers who regularly get out a little to explore other topics will be healthier, happier, and ultimately more interesting to be around -- regardless of audience.</p>

<p class="QuoteSource">--Jeff Atwood, <a href="http://www.codinghorror.com/blog/archives/000834.html" class="offsite-link-inline"><em>Thirteen Blog Clichés</em></a> [edited]</p>
</blockquote>

<p>Of course, Jeff's post and this one <em>are</em> about blogging. But the reason we care enough to research and write about these ideas comes back to the <em>Human Factors</em> dimension. As we work to improve our ability to serve and communicate more clearly, we want to share what we learn to help you connect with your own readers and communities.</p>

<p><a href="http://www.uprightmatters.com/author-184886" class="UMref">Cynthia</a> writes:</p>


<div class="InlineTextBox">
<p>Since my <a href="http://www.webperformancematters.com/objectives/" class="UMref">blogging objective</a> is to make a competitive difference in the world, I took special note of Jeff’s post on <em>Thirteen Blog Clichés</em>, and of his focus on human factors.  Why?  Because I want to have conversations that matter -- presumably with humans! </p>

<p>While Chris and I stay focused on our respective blogging objectives, we work to apply the right technology to enhance understanding and to make the experience of conversing valuable for readers. Jeff’s opinions about effective technologies and techniques resonated with my blogging experience and my blogging <strong><em>intentions</em></strong>. If you have similar intentions, we think you will find them useful too.</p>
</div>

<blockquote class="highlight">
<h4>11. Mindless Link Propagation</h4>

<p>One of the most pernicious problems in blogging is the <a href="http://chris.pirillo.com/2006/08/18/10-ways-to-eliminate-the-echo-chamber/" class="offsite-link-inline"> echo chamber effect</a>. Most blog entries merely regurgitate what other people have said or add vapid commentary on top of news articles and press releases. Only the tiniest fraction of blog entries are original content, and only a tiny fraction of that fraction is worth your time.</p>
     
<p>If everyone knows about it, what value does that information have? My advice here is almost contrarian: if everyone else is talking about it, that means you should <em>avoid</em> talking about it. Switch things up. Seek out uncommon sites with unique information. If all you can find to talk about is what's already popular, you're not trying hard enough. Form your own opinion. Do your own research. Go out of your way to blaze a new trail and create something we haven't already seen hundreds of times before.</p>

<p class="QuoteSource">--Jeff Atwood, <a href="http://www.codinghorror.com/blog/archives/000834.html" class="offsite-link-inline"><em>Thirteen Blog Clichés</em></a> [edited]</p>
</blockquote>

<p>That is so true! This issue is exactly what stopped me from completing my review of David Weinberger's book, <em>Everything is Miscellaneous</em>, even though I had read the book and dozens of pages of reviews by others. But then I found myself summarizing other readers' feedback, which amounted to a <em>folksonomy</em> about the topic of <em>folksonomies</em> -- that is, a <em>meta-folksonomy</em>. I began to wonder whether, if I were to review other similar discussions, and add some Technorati tags to my review, would I then be contributing to a <em>meta-meta-folksonomy</em>?!</p>

<p>People criticize <em>Web 2.0</em>, and the <em>blogosphere</em> in general, claiming that it's just a giant echo-chamber in which uninformed opinion is amplified, and true expertise is drowned out by uneducated bleating, like the sheep in <a href="http://en.wikipedia.org/wiki/Animal_Farm" class="offsite-link-inline">Animal Farm</a>. While that criticism may sometimes be true, it is not the whole story, and it does not do justice to the educational power of the Web.</p>

<p>In this case, I concluded that the world did not really need me to summarize what everyone else was saying about David Weinberger's opinions about tagging, folksonomies, and the wisdom of crowds. In fact, to write such a post would just fuel the critics' argument (and by the way, it was turning into a very long post). So I went back to writing about Web performance!</p>

<blockquote class="highlight">
<h4>12. Top (n) Lists</h4>

<span class="full-image-float-right"><img src="http://www.webperformancematters.com/storage/post-graphics/Following%20Instructions%20for%20Dummies.JPG" alt="Illustration: Following Instructions for Dummies" title="Following Instructions for Dummies"/></span>

<p>Yes, exactly like this one.</p>

<p>Lists are a great convention. They make sense, people understand them, and they're a logical way to structure your writing. But <a href="http://www.codinghorror.com/blog/archives/000932.html" class="offsite-link-inline"> don't let lists become a crutch</a>. I'm always taken aback when I see the "most popular" posts on a blog dominated by Top (n) Lists. Shortcuts are only meaningful if you know what it is, exactly, you're cutting.</p>
    
<p>If you find that the Top (n) List convention is a go-to tool in your writing toolkit, consider re-balancing your writing portfolio with longer, more in-depth pieces as well. Not everything should be a sprint; throw a few small marathons in there somewhere to complement your short distance skills.</p>

<p class="QuoteSource">--Jeff Atwood, <a href="http://www.codinghorror.com/blog/archives/000834.html" class="offsite-link-inline"><em>Thirteen Blog Clichés</em></a> [edited]</p>
</blockquote>

<p>I agree that it's good to aim for a balance of short and longer posts. Having written technical articles for years before blogs existed, I'm actually more likely to write long essays -- see my comment on item 2 above. So to offset those marathon posts, I deliberately try to compose shorter posts structured as lists of guidelines or principles. My series on <a href="http://www.webperformancematters.com/display/ShowJournal?moduleId=1113404&categoryId=98044" class="PMref">Performance Wisdom</a> contains several examples of this approach.</p>

<blockquote class="highlight">
<h4>13. No Comments Allowed</h4>

<p><a href="http://www.codinghorror.com/blog/archives/000538.html" class="offsite-link-inline">A blog without comments is not a blog</a>. Yes, there are exceptions for massively popular blogs where <a href="http://many.corante.com/archives/2007/07/20/spolsky_on_blog_comments_scale_matters.php" class="offsite-link-inline">comments don't scale</a>. But until that applies, the value of the two-way conversation far outweighs any minor inconvenience on your part. It's an open secret in the blogging community that <b>the comments are often better than the original blog entry itself</b>. Would you browse Amazon without the user reviews?</p>

<p>Don't be afraid of comments. Embrace them. Moderate them. The community will respect you for it, and your blog will be better for it as well.</p>

<p class="QuoteSource">--Jeff Atwood, <a href="http://www.codinghorror.com/blog/archives/000834.html" class="offsite-link-inline"><em>Thirteen Blog Clichés</em></a> [edited]</p>
</blockquote>
</ol>

<p>We want comments! One of the primary reasons for blogging is to have conversations about the things that matter. It's not about us. Well, most of the time, anyway -- so we won't apologize if our writing occasionally lapses into introspection or vanity.</p>

<p class="Footnote"><strong>Note:</strong> This is cross-posted on <a href="http://www.webperformancematters.com/journal/2007/9/21/human-factors-and-blog-design.html" class="PMref"><em>Web Performance Matters</em></a> and <a href="http://www.uprightmatters.com/blog-home/2007/9/22/human-factors-and-blog-design.html" class="UMref"><em>UpRight Matters</em></a>. </p>

<p class="Footnote"><strong>Tags:</strong>
<a href="http://technorati.com/tag/human+factors" rel="tag">human factors</a>,
<a href="http://technorati.com/tag/Jeff+Atwood" rel="tag">Jeff Atwood</a>,
<a href="http://technorati.com/tag/Coding+Horror" rel="tag">Coding Horror</a>,
<a href="http://technorati.com/tag/blog+clichés" rel="tag">blog clichés</a>,
<a href="http://technorati.com/tag/calendar+widget" rel="tag">calendar widget</a>,
<a href="http://technorati.com/tag/random+images" rel="tag">random images</a>,
<a href="http://technorati.com/tag/social+bookmarks" rel="tag">social bookmarks</a>,
<a href="http://technorati.com/tag/blogroll" rel="tag">blogroll</a>,
<a href="http://technorati.com/tag/tagging" rel="tag">tagging</a>,
<a href="http://technorati.com/tag/folksonomy" rel="tag">folksonomy</a>,
<a href="http://technorati.com/tag/Web20" rel="tag">Web 2.0</a>,
<a href="http://technorati.com/tag/tag+cloud" rel="tag">tag cloud</a>,
<a href="http://technorati.com/tag/blog+advertising" rel="tag">blog advertising</a>,
<a href="http://technorati.com/tag/meta-blogging" rel="tag">meta-blogging</a>,
<a href="http://technorati.com/tag/echo+chamber" rel="tag">echo chamber</a>,
<a href="http://technorati.com/tag/David+Weinberger" rel="tag">David Weinberger</a>,
<a href="http://technorati.com/tag/Everything+is+Miscellaneous" rel="tag">Everything is Miscellaneous</a>,
<a href="http://technorati.com/tag/Animal+Farm" rel="tag">Animal Farm</a>,
<a href="http://technorati.com/tag/blog+comments" rel="tag">blog comments</a>,
<a href="http://technorati.com/tag/blog+design" rel="tag">blog design</a>,
<a href="http://technorati.com/tag/Web+Performance+Matters" rel="tag">Web Performance Matters</a>,
<a href="http://technorati.com/tag/UpRight+Matters" rel="tag">UpRight Matters</a>
</p>]]></description><wfw:commentRss>http://www.webperformancematters.com/journal/rss-comments-entry-1269083.xml</wfw:commentRss></item><item><title>If I Had A Hammer ...</title><category>Life, The Universe, and Everything</category><category>Optimization and Tuning</category><category>Software Engineering</category><dc:creator>Chris Loosley</dc:creator><pubDate>Mon, 10 Sep 2007 21:45:00 +0000</pubDate><link>http://www.webperformancematters.com/journal/2007/9/10/if-i-had-a-hammer.html</link><guid isPermaLink="false">115864:1113404:1075611</guid><description><![CDATA[<span class="PageIllustration"><img src="http://www.webperformancematters.com/storage/post-graphics/Hammer.JPG" alt="Illustration: Ultimate Geeks Multi-Tool Hammer" title="Ultimate Geeks Multi-tool Hammer"/></span>

<p><em>If I had a hammer<br />
I'd hammer in the morning<br />
I'd hammer in the evening<br />
All over this land<br />
I'd hammer out danger<br />
I'd hammer out a warning<br />
I'd hammer out love between my brothers and my sisters<br />
All over this land</em></p>
<p class="QuoteSource">--Pete Seeger and Lee Hays, 1949 [<a href="http://en.wikipedia.org/wiki/If_I_Had_a_Hammer" class="offsite-link-inline">Wikipedia</a>]</p>

<p>In May 2007, after I wrote about <a href="http://www.webperformancematters.com/journal/2007/5/15/controlling-what-you-cant-measure.html" class="PMref"><em>Controlling What You Can't Measure</em></a>, I had a conversation with Ben Simo (see the comments) about metrics and tools, during which I wrote:</p>

<blockquote>
<p>Hammers and chisels can be very dangerous, but carpenters use them every day, and we don't brand them as "bad tools" just because some people don't know how to use them properly. Human nature being what it is, there will always be some incompetent fools who try to use a hammer and chisel to drive in a screw or open a bottle of beer, just because they have those tools handy. </p>
</blockquote> 

<p>A few days later, while reading a series of articles on <a href="http://www.webperformancematters.com/journal/2007/5/22/performance-engineering.html" class="PMref">Performance Engineering</a> written by Scott Barber, I noticed the following quotation:</p>

<blockquote>
<p>All parts should go together without forcing. You must remember that the parts you are reassembling were disassembled by you. Therefore, if you can't get them together again, there must be a reason. <strong>By all means, do not use a hammer.</strong></p>
<p class="QuoteSource">--IBM maintenance manual, 1925 [emphasis added]</p>

</blockquote>

<p>This priceless piece of advice is quoted by Scott in <a href="http://www-128.ibm.com/developerworks/rational/library/4266.html" class="offsite-link-inline">part 9</a> of his 14-part series. At the time I made a note to write something about this, but after that it just sat in the "ideas for blog posts" folder for the next 3 months.</p>

<p>Until today, when I happened across the <a href="http://nexus404.com/Blog/2007/04/15/clever-multi-tool-hammer/" class="offsite-link-inline">Ultimate Geeks Multi-Tool Hammer</a>. Now, if I had <em>this</em> hammer, it turns out that I actually <em>could</em> use it to drive in a screw or open a bottle of beer, without being branded as an incompetent fool!</p>

<p class="Footnote"><strong>Tags:</strong>
<a href="http://technorati.com/tag/Scott+Barber" rel="tag">Scott Barber</a>,
<a href="http://technorati.com/tag/Ben+Simo" rel="tag">Ben Simo</a>,
<a href="http://technorati.com/tag/multi-tool" rel="tag">multi-tool</a>,
<a href="http://technorati.com/tag/hammer" rel="tag">hammer</a>,
<a href="http://technorati.com/tag/Performance+Matters" rel="tag">Performance Matters</a>
</p>]]></description><wfw:commentRss>http://www.webperformancematters.com/journal/rss-comments-entry-1075611.xml</wfw:commentRss></item><item><title>Scalability is Not Optional</title><category>Architecture</category><category>Blogs and Publications</category><category>Software Engineering</category><dc:creator>Chris Loosley</dc:creator><pubDate>Fri, 07 Sep 2007 22:00:00 +0000</pubDate><link>http://www.webperformancematters.com/journal/2007/9/7/scalability-is-not-optional.html</link><guid isPermaLink="false">115864:1113404:1216908</guid><description><![CDATA[<span class="PageIllustration"><img src="http://www.webperformancematters.com/storage/post-graphics/Kent%20Langley.jpg" alt="Illustration: Kent Langley" title="Kent Langley"/></span>

<p>My recent post, <a href="http://www.webperformancematters.com/journal/2007/8/21/asynchronous-architectures-4.html" class="PMref">Asynchronous Architectures [4]</a>, summarized a presentation by Werner Vogels at the 2007 <a href="http://qcon.infoq.com/london-2007/conference/" class="offsite-link-inline">QCON</a> conference in London.</p>

<p>A subsequent post by <a href="http://www.productionscale.com/kent/" class="offsite-link-inline">Kent Langley</a> in his new <a href="http://www.productionscale.com/" class="offsite-link-inline">ProductionScale</a> blog -- entitled <a href="http://www.productionscale.com/home/2007/8/11/getting-rid-of-the-relational-database.html" class="offsite-link-inline"><em>Getting Rid of the Relational Database</em></a> -- supports the arguments advanced by Vogels.</p>

<p>Describing the relational database model as "the proverbial ball and chain in the relationship between scalable applications and the underlying infrastructure," Kent writes:</p>

<blockquote>
<p>The quest for seamless linear growth for technology applications is being hindered by the “elephant database.”</p>

<p>What would Amazon do? In a recent talk2 at QCON London Werner Vogel, the CTO of Amazon.com clearly noted that the relational database model is a essentially outdated for the needs of modern applications as a primary data storage medium. In other words, it is simply to slow and cumbersome.</p>

<p>Additionally, Mr. Vogel makes a critical point that in many, many cases relational databases are simply not necessary. Simple key/value pairs (hashes) are all you need.</p>

<p class="QuoteSource">--Joseph Kent Langley, <a href="http://www.productionscale.com/home/2007/8/11/getting-rid-of-the-relational-database.html" class="offsite-link-inline"><em>Getting Rid of the Relational Database</em></a>, August 11, 2007</p>
</blockquote>

<p>Kent goes on to describe why he believes "you should break out of a one-size-fits-all way of thinking when it comes to databases, data storage, and scalable systems. Vertical scaling by throwing hardware at it is no longer sufficient for modern web scale applications".</p>

<p>Kent also points to <a href="http://future.gigaom.com/2007/08/10/data-20-how-the-web-disrupts-our-relational-database-world/" class="offsite-link-inline">Data 2.0: How the Web disrupts our relational database world</a>, which he admits he has not read. Had he read the article, he might have omitted this link.</p>

<p>Although the author of that article, Nitin Borwankar, supports Vogels's general conclusions, his style is to make sweeping pronouncements. He advances no technical arguments that lead up to his conclusion that <em>"The days of Data 1.0 are past. The days of Data 2.0 are dawning, and it promises to be very disruptive for mainstream database architectures on the Web"</em>.</p>

<h3>Other posts about scalability</h3>

<p>I recommend Kent's new blog, and I'm adding it to my blogroll. It looks as if it will contain regular discussions of performance topics. For example, since launching the blog in early August, Kent has already written about:</p>

<ul class="group">
<li><a href="http://www.productionscale.com/home/2007/8/5/scalable-lamp-caching.html" class="offsite-link-inline">Scalable LAMP: Caching</a></li>
<li><a href="http://www.productionscale.com/home/2007/8/5/varnish-a-web-accelerator.html" class="offsite-link-inline">Varnish - A Web Accelerator</a> [a fast reverse proxy caching system]</li>
<li><a href="http://www.productionscale.com/home/2007/8/21/the-power-of-mod_deflate.html" class="offsite-link-inline">The Power of mod_deflate</a> [about content compression]</li>
<li><a href="http://www.productionscale.com//home/2007/8/22/scalability-and-performance-a-few-resources.html" class="offsite-link-inline">Scalability and Performance: A Few Resources</a></li>
<li><a href="http://www.productionscale.com/home/testing-with-wbox.html" class="offsite-link-inline">Testing with WBox</a> [about scalability testing]</li>
</ul>

<h3>Proofread before publishing</h3>

<p>I do have one small complaint. As a blogger, I know the feeling of writing down my thoughts and wanting to get them published -- right now! The short publishing cycle is one of the attractions of a blog. But if Kent could just curb his enthusiasm for long enough to proofread his posts once carefully before hitting that <em>Publish</em> button, his thoughts would be a lot easier to follow:</p>

<blockquote>
<h4>Huh?</h4>
<ul class="group">
<li>Afterwards, I will follow up with a brief analysis or executive summary if you will of this infobit might mean for businesses. <span class="aside">Needs commas around "if you will," and "of" should be "of what".</span></li>
<li>By way if example using the techniques of bond arbitrage Stonebraker notes quite earnestly that it is a “latency arms race.” <span class="aside">Needs commas around "using ... arbitrage," and "if" should be "of".</span></li>
<li>So, is this inconclusive evidence of the pending death of the Relational Database? Of course not. <span class="aside">"Inconclusive" should be "conclusive".</span></li>
<li>But, it is trend spotting in that people are again noticing that there are other ways and that those other ways just might quite faster with modern applications. <span class="aside">"might quite" should be "might be".</span></li>
</ul>
<p class="QuoteSource">--Joseph Kent Langley, <a href="http://www.productionscale.com/home/2007/8/11/getting-rid-of-the-relational-database.html" class="offsite-link-inline"><em>Getting Rid of the Relational Database</em></a>, August 11, 2007</p>
</blockquote>

<p>Although I can infer what Kent is trying to say here, these glitches spoil the overall effect by forcing me to re-read his sentences to get the point. In elementary school, I learned the old English proverb: <a href="http://www.usingenglish.com/reference/idioms/spoil+the+ship+for+a+ha%27pworth+of+tar.html" class="offsite-link-inline"><em>Don't spoil the ship for a ha'pworth of tar</em></a>. I believe bloggers should consider readability to be as important as their message, if they want to build a faithful following. </p>

<p class="Footnote"><strong>Tags:</strong>
<a href="http://technorati.com/tag/scalability" rel="tag">scalability</a>,
<a href="http://technorati.com/tag/Kent+Langley" rel="tag">Kent Langley</a>,
<a href="http://technorati.com/tag/ProductionScale" rel="tag">ProductionScale</a>,
<a href="http://technorati.com/tag/Werner+Vogels" rel="tag">Werner Vogels</a>,
<a href="http://technorati.com/tag/QCON" rel="tag">QCON</a>,
<a href="http://technorati.com/tag/relational+database" rel="tag">relational database</a>,
<a href="http://technorati.com/tag/Web+applications" rel="tag">Web applications</a>,
<a href="http://technorati.com/tag/Performance+Matters" rel="tag">Performance Matters</a>
</p>]]></description><wfw:commentRss>http://www.webperformancematters.com/journal/rss-comments-entry-1216908.xml</wfw:commentRss></item><item><title>Managing for Business Effectiveness</title><category>Articles and White Papers</category><category>Business Perspectives</category><category>Management Wisdom</category><dc:creator>Chris Loosley</dc:creator><pubDate>Wed, 29 Aug 2007 07:01:00 +0000</pubDate><link>http://www.webperformancematters.com/journal/2007/8/29/managing-for-business-effectiveness.html</link><guid isPermaLink="false">115864:1113404:964217</guid><description><![CDATA[<div class="PageWisdomWrapper">

<div class="WisdomTitle" >
<h3>Drucker on Effectiveness vs. Efficiency</h3>
<p class="WisdomClass" ><a href=" /display/ShowJournal?moduleId=1113404&categoryId=109667">Management Wisdom</a>: 3</p>
</div>

<p class="WisdomQuote">There is surely nothing quite so useless as doing with great efficiency what should not be done at all</p>

<div class="WisdomText">
</div>

<p class="QuoteSource">-- Peter Drucker, 1963</p>
</div>

<p><a href="http://en.wikipedia.org/wiki/Peter_Drucker" class="offsite-link-inline">Peter Drucker</a> is often called "the father of modern management". Many books and Web sites are devoted to his insights, some of which I have <a href="http://www.webperformancematters.com/journal/2006/3/14/deep-thoughts-on-management.html" class="offsite-link-inline">written about</a> previously.</p>

<p>This post highlights his incisive observation about the difference between <em>effectiveness</em> and <em>efficiency</em>. I have always found it to be especially memorable, and quoted it (twice) when discussing priorities and choices in my book about software performance. Unfortunately I got the source wrong, but thanks to Google I can now correct my mistake. </p>

<p>It appeared in <em>Managing for Business Effectiveness</em>, an article in the May/June 1963 edition of Harvard Business Review ("HBR"). You can also find it in a February 2006 HBR article -- <a href="http://harvardbusinessonline.hbsp.harvard.edu/b02/en/common/item_detail.jhtml?id=R0602J" class="offsite-link-inline">What Executives Should Remember</a> -- a collection of excerpts drawn from HBR articles by Drucker published between 1963 and 2004.</p>

<p>Because Drucker's remarks are equally relevant to technical performance management and business leadership, I am cross-posting this here on <em>Web Performance Matters</em>, and on our new <a href="http://www.uprightmatters.com/" class="UMref"><em>UpRight Matters</em></a> blog. </p>

<h3>Three key questions</h3>

<p>In his 1963 essay, Drucker states that there is no magic formula, checklist, or procedure that will substitute for the hard, demanding, risk-taking work of management. But he claims that "we know how to organize the job of managing for economic effectiveness and how to do it with both direction and results. The answers to the [following] three key questions ... are known, and have been known for such a long time that they should not surprise anyone."</p>
 
<blockquote>
<p><strong>1. What is the manager's job?</strong> It is to direct the resources and efforts of the business toward opportunities for economically significant results. This sounds trite—and it is. But every analysis of actual allocation of resources and efforts in business that I have ever seen or made showed clearly that <em>the bulk of time, work, attention, and money first goes to "problems" rather than to opportunities, and, secondly, to areas where even extraordinarily successful performance will have minimum impact on results.</em></p>
 
<p><strong>2. What is the major problem?</strong> It is fundamentally the confusion between effectiveness and efficiency that stands between doing the right things and doing things right. <em>There is surely nothing quite so useless as doing with great efficiency what should not be done at all.</em> Yet our tools—especially our accounting concepts and data—all focus on efficiency. What we need is (1) a way to identify the areas of effectiveness (of possible significant results), and (2) a method for concentrating on them.</p>

<p><strong>3. What is the principle?</strong> That, too, is well-known—at least as a general proposition. Business enterprise is not a phenomenon of nature but one of society. In a social situation, however, events are not distributed according to the "normal distribution" of a natural universe (that is, they are not distributed according to the U-shaped Gaussian curve). <em>In a social situation a very small number of events—10 percent to 20 percent at most—account for 90 percent of all results, whereas the great majority of events account for 10 percent or less of the results.</em></p>

<p class="QuoteSource">-- Peter Drucker, <em>Managing for Business Effectiveness</em>, Harvard Business Review, May/June 1963</p>
</blockquote>

<h3>A principled foundation</h3>

<p>Although Drucker writes about management effectiveness in the context of business performance, specialists in software or systems performance must ask the same questions and apply the same principles. In my book on performance management [<a href="http://www.amazon.com/exec/obidos/tg/detail/-/0471162698/002-1562714-5063248?v=glance" class="offsite-link-inline">Amazon</a>], I described these principles as follows: </p>

<ul>
<li>The Centering Principle: Focus on the most performance-critical components.</li>
<li>The Efficiency Principle: Maximize the ratio of useful work to overhead.</li>
<li>The Pareto Principle: Prioritize the 20% of the problem that will return 80% of the benefits.</li>
</ul>

<p>Drucker concludes that the most crucial requirement for effective management is having ... </p>

<p> ... <em>the courage to go through with logical decisions -- despite all pleas to give this or that product another chance, and despite all such specious alibis as the accountant's "it absorbs overhead" or the sales manager's "we need a full product line."</em></p>

<p>This is one small example of the characteristic I find so appealing in Drucker's writing. His advice starts from an assumption that there <strong>are</strong> relevant principles, and that you can make decisions by reasoning logically from those foundations. As a mathematician, this way of looking at the world appeals to my sense of order and logic, rather than presenting me with a collection of unsupported assertions and beliefs. The HBR introduction to its review, <em>What Executives Should Remember</em>, sums up Drucker's appeal as follows:</p>

<blockquote>
<p>Executives had come to think they knew how to run companies, and Drucker took it upon himself to poke holes in their beliefs, lest organizations become stale. But he did so in a sympathetic way. He assumed that his readers were intelligent, rational, hardworking people of goodwill. If their organizations struggled, he believed it was usually because of outdated ideas, a narrow conception of a problem, or internal misunderstandings. His insights were ... practical idea-based essays for executives, and his clear-eyed humanistic writing enhanced the magazine time and again. He helped us all to think broadly and deeply.</p>

<p class="QuoteSource">-- What Executives Should Remember, Harvard Business Review, February 2006</p>
</blockquote>

<p>This is why so many people have enjoyed, and like <a href="http://www.marketingheadhunter.com/about.html" class="offsite-link-inline">Harry Joiner</a>, continue to <a href="http://www.marketingheadhunter.com/executive_search/2005/11/peter_drucker.html" class="offsite-link-inline">enjoy daily</a>, the wisdom of Peter Drucker.</p>

<p class="Footnote"><strong>Tags:</strong>
<a href="http://technorati.com/tag/management" rel="tag">management</a>,
<a href="http://technorati.com/tag/management+principles" rel="tag">management principles</a>,
<a href="http://technorati.com/tag/management+wisdom" rel="tag">management wisdom</a>,
<a href="http://technorati.com/tag/Peter+Drucker" rel="tag">Peter Drucker</a>,
<a href="http://technorati.com/tag/effectiveness" rel="tag">effectiveness</a>,
<a href="http://technorati.com/tag/efficiency" rel="tag">efficiency</a>,
<a href="http://technorati.com/tag/Pareto+Principle" rel="tag">Pareto Principle</a>,
<a href="http://technorati.com/tag/80-20+rule" rel="tag">80-20 rule</a>,
<a href="http://technorati.com/tag/Harry+Joiner" rel="tag">Harry Joiner</a>,
<a href="http://technorati.com/tag/UpRight+Matters" rel="tag">UpRight Matters</a>
</p>]]></description><wfw:commentRss>http://www.webperformancematters.com/journal/rss-comments-entry-964217.xml</wfw:commentRss></item><item><title>Asynchronous Architectures [4]</title><category>Architecture</category><category>Events</category><category>Foundations of Performance</category><category>Software Engineering</category><dc:creator>Chris Loosley</dc:creator><pubDate>Tue, 21 Aug 2007 07:01:00 +0000</pubDate><link>http://www.webperformancematters.com/journal/2007/8/21/asynchronous-architectures-4.html</link><guid isPermaLink="false">115864:1113404:1206878</guid><description><![CDATA[<span class="PageIllustration"><img src="http://www.webperformancematters.com/storage/post-graphics/Werner%20Vogels.jpg" alt="Illustration: Werner Vogels" title="Werner Vogels"/></span>

<p><em>This is the fourth in a series of posts presenting arguments for <strong>asynchronous architectures</strong> as the optimal way to build high-performance, scalable systems for a distributed environment.</em></p>

<p>In a <a href="http://qcon.infoq.com/qcon-london-2007/conference/" class="offsite-link-inline">QCon conference</a> presentation on <em><strong>Availability and Consistency</strong> or how the CAP theorem ruins it all</em>, <a href="http://www.infoq.com/presentations/availability-consistency" class="offsite-link-inline">Werner Vogels</a>, Amazon CTO, examines the tension between availability & consistency in large-scale distributed systems, and presents a model for reasoning about the trade-offs between different solutions.</p>

<p>I recommend you find time to watch the entire 52-minute video. The Flash streaming technology that InfoQ uses is subject to buffering hiccups, and you may have to restart it a few times. So in case you want to jump to a specific section, I've assembled copies of Werner's slides, with short timestamped notes on the content of each section. Werner did not present his slides in their numbered order, so in my notes I identify slides using the numbers printed on them, not their presentation order.</p>

<h3>Introduction</h3>

<p><strong>0:50:</strong> CTOs must match business with technology. Most really big IT shops <em>must</em> push the edge of what commercial technology can do. Technology has a very long adoption cycle -- it takes about 10 to 15 years for new technology to mature and be effective. For leading companies like Amazon, that's too slow; the scalability challenges are so great that they demand advanced solutions. So shops are forced (in effect) to do their own research and take advanced steps, just to succeed in a competitive marketplace. </p>

<p><strong>2:15:</strong> Werner noted that his viewpoint disagreed strongly with that of <a href="http://qcon.infoq.com/qcon-london-2007/speakers/show_speaker.jsp?oid=137" class="offsite-link-inline">Cameron Purdy</a>, CEO of Tangosol, who was an advocate of database technology.</p>

<p><strong>3:00:</strong> He introduced Eric Brewer's CAP theorem -- more later. 
[See the end of my previous post in this series, <a href="http://www.webperformancematters.com/journal/2007/8/15/asynchronous-architectures-3.html" class="PMref">Asynchronous Architectures [3]</a>. The CAP theorem was first propounded in a 1998 presentation -- <a href="http://www.ccs.neu.edu/groups/IEEE/ind-acad/brewer/" class="offsite-link-inline">Lessons from Internet Services: ACID vs. BASE</a> -- by Dr. Eric Brewer of Inktomi, now a <a href="http://www.cs.berkeley.edu/~brewer/" class="offsite-link-inline">professor</a> at UC Berkeley].</p>

<h3>3:45: What is Scalability? [slide 2]</h3>

<span class="full-image-float-none"><img src="http://www.webperformancematters.com/storage/post-graphics/Vogels%20CAP%20Theorem%202.JPG" alt="Vogels%20CAP%20Theorem%202.JPG" title="Vogels%20CAP%20Theorem%202.JPG"/></span>

<p><strong>3:45:</strong> The meat of Werner's QCon presentation really begins here. <em>Proportional</em> is the key word in these definitions. Adding resources should deliver increased capacity <em>proportional</em> to the added resources. Or if the intent was to deliver better performance, the gains should be <em>proportional</em> to the added resources. Performance here is not just about response; it could mean transferring more data or larger datasets. </p>

<p><strong>4:40:</strong> Another reason for needing scalability is to achieve fault-tolerance. Adding resources to achieve redundancy should not hurt your performance. Traditional technologies (like databases) won't give you this kind of scalability, because overheads increase as you scale up. These are subjects I discussed at length when explaining <em>The Parallelism Principle</em> in <em>High-Performance Client/Server</em>: </p>

<blockquote>
<h4>13.1  The Parallelism Principle: Exploit parallel processing</h4>
	
<p>Processing related pieces of work in parallel typically introduces additional synchronization overheads, and often introduces contention among the parallel streams. Use parallelism to overcome bottlenecks, provided the processing speedup offsets the additional costs introduced.</p>

<h4>13.2  Scalability and speed up</h4> 

<p><em>Scalability</em> refers to the capacity of the system to perform more total work in the same elapsed time, when its processing power is increased. </p>
<p><em>Speed up</em> refers to the capacity of the system to perform a particular task in a shorter time, when its processing power is increased.</p>
<p>In a system with linear scalability and speed up, any increase in processing power generates a proportional improvement in throughput or response time. </p>

<p class="QuoteSource">--<a href="http://www.amazon.com/exec/obidos/tg/detail/-/0471162698/002-1562714-5063248?v=glance" >High-Performance Client/Server</a>, Chapter 13, pp383-385.</p>
</blockquote>
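
<p>To make <em>proportional</em> concrete, here is a minimal Python sketch of the two measures defined above. The function names and the idea of an "efficiency" ratio (observed gain divided by added resources) are my own illustrative assumptions; they are not taken from Werner's slides or from the book.</p>

```python
def speed_up(time_before: float, time_after: float) -> float:
    """How many times faster one task completes after adding resources."""
    return time_before / time_after


def scale_up(throughput_before: float, throughput_after: float) -> float:
    """How many times more total work is done in the same elapsed time."""
    return throughput_after / throughput_before


def scaling_efficiency(resource_factor: float, gain_factor: float) -> float:
    """Observed gain divided by added resources.

    1.0 means linear (proportional) scaling; lower values mean that
    overheads are consuming part of the added capacity.
    """
    return gain_factor / resource_factor
```

<p>For example, if doubling the processing power halves the elapsed time, <code>speed_up</code> returns 2.0 and the efficiency is 1.0. If four times the resources deliver only three times the throughput, the efficiency drops to 0.75.</p>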

<h3><strong>8:15:</strong> Scalability for Real Systems [slide 3]</h3>

<span class="full-image-float-none"><img src="http://www.webperformancematters.com/storage/post-graphics/Vogels%20CAP%20Theorem%203.JPG" alt="Vogels%20CAP%20Theorem%203.JPG" title="Vogels%20CAP%20Theorem%203.JPG"/></span>

<p><strong>7:00:</strong> Slide 3 is the conclusion of a cost discussion that begins before the slide is shown. The biggest threat to availability is bugs, which are a cost factor introduced by humans. So operating costs must not grow as you scale up.</p>

<h3><strong>8:45:</strong> But ... [slide 4]</h3>

<span class="full-image-float-none"><img src="http://www.webperformancematters.com/storage/Vogels%20CAP%20Theorem%204.JPG" alt="Vogels%20CAP%20Theorem%204.JPG" title="Vogels%20CAP%20Theorem%204.JPG"/></span>

<p><strong>8:30:</strong> Traditional technologies such as databases and two-phase commit may work for 2-4 nodes, but they will not scale to hundreds of nodes (let alone 10,000). You may not have 10,000 nodes like Amazon, but you will run into these scalability challenges at 50-100 nodes. </p>

<h3>10:05: Principles for Scalable Service Design [slide 13]</h3>

<span class="full-image-float-none"><img src="http://www.webperformancematters.com/storage/Vogels%20CAP%20Theorem%2013.JPG" alt="Vogels%20CAP%20Theorem%2013.JPG" title="Vogels%20CAP%20Theorem%2013.JPG"/></span>

<p><strong>10:05:</strong> Guidelines for services design at Amazon -- a checklist of lessons learned through hard experience:</p>

<ul>
<li><strong>10:05:</strong> Decentralize. Any algorithm that requires agreement will eventually become a bottleneck. Two-phase commit is in effect an <em>unavailability</em> algorithm; it is guaranteed to fail as you scale up the number of participating services.</li>
<li><strong>10:50:</strong> Asynchrony. Make progress under all circumstances, even if the world is burning around you. Even if fulfillment services are burning down, you want people to be able to place orders. So work locally, don't worry about the rest of the system.</li>
<li><strong>11:40:</strong> Autonomy: Each node should be able to make decisions based only on local state. If you need to reach agreement based on global conditions at high load, you are lost. Nodes may be failing, coming up, going down all the time. Probabilistic techniques work well in these circumstances.</li>
<li><strong>12:35:</strong> Controlled concurrency. Reduce concurrency as much as possible, so that you do not need to use locking.</li>
<li><strong>13:15:</strong> Controlled parallelism. Control traffic going to each node using careful load balancing; nodes must have spare CPU and I/O capacity so that they can do other tasks (like load re-balancing) in the background.</li>
<li><strong>14:10:</strong> Symmetry. Things work really well if all nodes do exactly the same thing. It is easy to add more nodes if nodes do not have to be identified as a <em>directory node</em>, a <em>data-storage node</em>, etc. Ideally, you just install the software and run it, and it responds to any client request and does whatever task is needed, or maybe forwards the request if necessary. This is the logical way to address a requirement that I first documented in a 1993 paper, and later included in Chapter 16, Architecture for High Performance, of my book: </li>
</ul>

<blockquote>
<p>For a large organization, moving to an enterprise client/server system represents a major shift from monolithic systems with fixed distribution to dynamic, heterogeneous, pervasively networked environments. The next generation of systems will be an order of magnitude more dynamic--always running, always changing--with thousands of machines in huge networks.</p>
<p>In such an environment, content components (service providers like DBMSs) and service consumers (e.g. GUIs) must be continually added and removed.</p>
<p>The key to doing this is the middle tier, the hub of the three tier architecture. In the first place, this central layer acts in a connecting role to let individual clients access multiple content servers, and (of course) servers support multiple clients. A separate central tier can also:</p>
<ul>
<li>Provide a set of services that can be dynamically invoked by both the consumer and content layers</li>
<li>Allow new services to be added without major reconfiguration</li>
<li>Allow services to be removed dynamically without affecting any participant not using those services</li>
<li>Allow one service provider to be replaced by another</li>
</ul>	
<p>These are all vital characteristics in a distributed computing environment.</p>
<p class="QuoteSource">--<a href="http://www.amazon.com/exec/obidos/tg/detail/-/0471162698/002-1562714-5063248?v=glance" >High-Performance Client/Server</a>, Chapter 16, pp514-515.</p>
</blockquote>

<p><strong>15:10:</strong> Algorithms that force you to obtain agreement will become a bottleneck. So avoid using two-phase commit, maybe by denormalization to make sure your transaction always runs within a single node. Or split your task into multi-transaction workflows. You have to take an end-to-end look at the business transaction and decide. <span class="aside">[I have always advocated this approach -- see the conclusions of the second <a href="http://www.webperformancematters.com/journal/2007/8/14/asynchronous-architectures-2.html" class="PMref">post</a> in this series].</span></p>
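
<p>A back-of-the-envelope calculation shows why an algorithm that needs every participant to agree behaves like an <em>unavailability</em> algorithm at scale. The Python sketch below is only an illustration under two assumptions of mine: each node is independently reachable with the same probability, and the 0.999 figure is hypothetical, not a number from the talk.</p>

```python
def all_participants_available(per_node_availability: float, nodes: int) -> float:
    """Probability that every one of `nodes` participants is reachable.

    Assumes independent failures and identical per-node availability,
    which is a simplification made for illustration only.
    """
    return per_node_availability ** nodes


if __name__ == "__main__":
    # Hypothetical per-node availability of 99.9%.
    for n in (2, 4, 100, 10_000):
        chance = all_participants_available(0.999, n)
        print(f"{n:>6} nodes: {chance:.4%} chance that all can take part")
```

<p>With these assumed numbers, agreement among a handful of nodes almost always succeeds, but the chance that all of 10,000 nodes can take part at once is close to zero. That is the argument for keeping each transaction within a single node, or splitting the work into multi-transaction workflows.</p>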

<p><strong>17:40:</strong> You can reuse some of those principles in building teams. Small teams are best, so that each team is responsible for a well-understood piece. Team effectiveness is just as important as architectural consistency. </p> 

<p><strong>19:00:</strong> We call this the two pizza rule -- <em>If you can't feed a team with two pizzas, it's too big</em>. OK, hungry just-out-of-college students do eat more, but they work harder too! As soon as you need more than 8 people, it's hard to understand what everyone is doing. Bigger teams, of 12 or more, must have meetings, and must spend a much larger percentage of their time communicating. This discussion harks back to the famous observation by Fred Brooks: </p>

<blockquote>
<h4>The Mythical Man-Month</h4>
<p>Men and months are interchangeable commodities only when a task can be partitioned among many workers <em>with no communication among them</em>... </p>
<p>In tasks that can be partitioned but which require communication among the subtasks, the effort of communication must be added to the amount of work to be done... </p>
<p>The added burden of communication is made up of two parts, training and intercommunication... </p>
<p>If each part of the task must be separately coordinated with each other part, the effort increases as n(n-1)/2. Three workers require three times as much pairwise intercommunication as two, four require six times as much as two. If, moreover, there need to be conferences among three, four, etc., workers to resolve things jointly, matters get worse yet. The added effort of communicating may fully counteract the division of the original task.</p>
<p class="QuoteSource">--The Mythical Man-Month, Frederick P. Brooks, 20th Anniversary Edition pp17-18, Addison Wesley, 1995.</p>   
</blockquote>
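
<p>Brooks's formula is easy to check for the team sizes Werner mentions. This short Python sketch simply evaluates n(n-1)/2; the function name is my own.</p>

```python
def pairwise_channels(team_size: int) -> int:
    """Number of distinct pairs in a team: n(n-1)/2, per Brooks."""
    return team_size * (team_size - 1) // 2


if __name__ == "__main__":
    for size in (2, 3, 4, 8, 12):
        print(f"{size:>2} people: {pairwise_channels(size):>2} communication paths")
```

<p>A two-pizza team of 8 has 28 possible pairings, while a team of 12 has 66, so growing the team by half more than doubles the potential communication paths.</p>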

<p><strong>21:20:</strong> At Amazon, not all the services (1000 or more) support Web interactions at Amazon.com. Many are in the back-end systems such as supply-chain, fulfillment, enterprise services, handling feeds from 3rd-party suppliers, item management, recommendations, personalization.</p>

<p><strong>22:15:</strong> Example of <em>statistically improbable phrases</em> (SIPs), an interesting digression about text analysis being implemented as yet another service. (Too long. The details are a plug for Amazon, but take time away from the main thread of the presentation). </p>

<p><strong>24:30:</strong> Conclusion of the SIP discussion -- you need dependency management, and contracts. Servers can give an SLA based on workload conditions, that clients must honour. Automatic dependency discovery -- Amazon has home-grown tools that can show where dependencies exist, and the effects of failures in a network of nodes.</p>

<h3><strong>26:00:</strong> Scalability Through Smart System Engineering [slide 12]</h3>

<span class="full-image-float-none"><img src="http://www.webperformancematters.com/storage/post-graphics/Vogels%20CAP%20Theorem%2012.JPG" alt="Vogels%20CAP%20Theorem%2012.JPG" title="Vogels%20CAP%20Theorem%2012.JPG"/></span>

<p><strong>26:00:</strong> Use scalable primitives. For example, RPC is <em>not</em> scalable. Don't conceal heterogeneity. We can pretend that systems don't fail, but in practice they do. That's the problem with RPC: it pretends to be a procedure call but it isn't. Transparency does not really work; failures <strong>do</strong> happen, and performance differences <strong>do</strong> exist. So don't conceal these differences.</p>

<p><strong>28:00:</strong> Configuration management. If you have 1000 services, configuration becomes really important. If your applications involve strong consistency properties, they can create problems when people leave your team.</p>

<p><strong>29:00:</strong> Repair and recovery. Check out the work on <a href="http://roc.cs.berkeley.edu/roc_overview.html" class="offsite-link-inline">Recovery Oriented Computing</a> (at Stanford and Berkeley) by <a href="http://www.cs.berkeley.edu/~pattrsn/" class="offsite-link-inline">Dave Patterson</a> and others. Can you design services to restart fast, maybe by keeping log information? If that really works well, then you can design your entire system around the principles of recovery and restart. If you don't like the behavior or performance of a service, you can kill it and let it restart. If it can be functioning again in a minute, this systems design approach can be very effective.</p>

<h3>30:40: CAP Conjecture [slide 8]</h3>

<span class="full-image-float-none"><img src="http://www.webperformancematters.com/storage/post-graphics/Vogels%20CAP%20Theorem%208.JPG" alt="Vogels%20CAP%20Theorem%208.JPG" title="Vogels%20CAP%20Theorem%208.JPG"/></span>

<p><strong>30:40:</strong> At Amazon, all data applications are dominated by this theorem. Traditional data applications assumed that if you stored something in a database it would never go away <span class="aside">[<strong>D</strong>urability, the "D" in the <a href="http://en.wikipedia.org/wiki/ACID" class="offsite-link-inline">ACID properties</a>]</span>. In reality, because of redundancy, many nodes can be working in parallel and storing information, and then bad things can happen.</p>

<h3>32:00: A Clash of Cultures [slide 5]</h3>

<span class="full-image-float-none"><img src="http://www.webperformancematters.com/storage/post-graphics/Vogels%20CAP%20Theorem%205.JPG" alt="Vogels%20CAP%20Theorem%205.JPG" title="Vogels%20CAP%20Theorem%205.JPG"/></span>

<p><strong>32:00:</strong> There's nothing wrong with transactions; they create a nice clean programming paradigm, and are good for programmers. But you must design for failure cases, because <em>transactions can fail</em>. So the ACID properties are great if you can get them, but getting these guarantees is costly. It may be fine if only a single node accesses the data, but not if tens or hundreds of machines need access.</p>

<p><strong>33:30:</strong> In that case another, more fuzzy, approach may be better -- BASE, in which data is <em>basically available, more or less</em>. Applications maintain a <em>soft state</em>, in which data is eventually consistent.</p>

<h3>34:05: ACID vs BASE [unnumbered]</h3>

<span class="full-image-float-none"><img src="http://www.webperformancematters.com/storage/post-graphics/Vogels%20CAP%20Theorem%205a.JPG" alt="Vogels%20CAP%20Theorem%205a.JPG" title="Vogels%20CAP%20Theorem%205a.JPG"/></span>

<p><strong>34:05:</strong> ACID behaves pessimistically: it will fail if it cannot reach the guarantees that you want. Availability is less important than <em>consistency</em>. For BASE systems, <em>availability</em> is the most important, and you are willing to sacrifice something to ensure it. So, for example, in a Web application you will design an application to accept and store shopping-cart input from a customer, and deal with minor problems in the data later. It's a weaker level of consistency, but you never want to tell the customer you can't accept their input.</p>
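
<p>As a rough illustration of "accept now, reconcile later", the Python sketch below merges two copies of a shopping cart that were updated independently. The merge rule (keep every item seen by either copy, with the larger quantity) and all the names are assumptions of mine; Werner did not describe how Amazon actually reconciles carts.</p>

```python
def merge_carts(replica_a: dict[str, int], replica_b: dict[str, int]) -> dict[str, int]:
    """Reconcile two divergent cart copies without rejecting any input.

    Hypothetical rule: keep the union of items, and for items present
    in both copies keep the larger quantity.
    """
    merged = dict(replica_a)
    for item, quantity in replica_b.items():
        merged[item] = max(merged.get(item, 0), quantity)
    return merged
```

<p>A rule like this never loses an item the customer added, which suits the availability-first goal. The price is weaker consistency: an item removed on one copy can reappear after the merge, which is the kind of minor problem the application must be prepared to deal with later.</p>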

<h3>35:50: Why the Divide? [Slide 7]</h3>

<span class="full-image-float-none"><img src="http://www.webperformancematters.com/storage/post-graphics/Vogels%20CAP%20Theorem%207.JPG" alt="Vogels%20CAP%20Theorem%207.JPG" title="Vogels%20CAP%20Theorem%207.JPG"/></span>

<p><strong>35:50:</strong> CAP stands for <strong>C</strong>onsistency, <strong>A</strong>vailability, and <strong>P</strong>artition tolerance. Eric Brewer came up with the conjecture that <em>systems can only possess two of these three characteristics</em>, which was subsequently proved to be true. That means systems designers must make choices: they must decide how to handle data reads and writes. If you insist on always enforcing consistency, your system may have to reject data interactions, making the system (in effect) unavailable at certain times. If you value availability and want to always accept user interactions, your application must then deal with the fact that some of its responses may later turn out to have been inconsistent.</p>

<p><strong>38:45:</strong> Sometimes applications can deal with this. In the Web environment, the common technique of customer stickiness may help you to operate with lower levels of consistency. Once they have begun a session, customers are typically redirected to the same server cluster, or data center. So a local level of data consistency is sufficient; global consistency is unnecessary. </p>
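
<p>Customer stickiness can be sketched in a few lines of Python. This is only one possible approach, assuming a fixed list of clusters and a session identifier; the cluster names and the choice of a SHA-256 hash are hypothetical, not details from the talk.</p>

```python
import hashlib

# Hypothetical cluster names, for illustration only.
CLUSTERS = ["cluster-a", "cluster-b", "cluster-c"]


def route(session_id: str, clusters: list[str] = CLUSTERS) -> str:
    """Map a session to the same cluster on every request.

    A stable hash is used (not Python's built-in hash(), which varies
    between processes) so that every router makes the same choice.
    """
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(clusters)
    return clusters[index]
```

<p>Because every request in a session lands on the same cluster, that cluster's local view of the customer's data is the only one the customer sees during the session. Note that this simple modulo scheme remaps many sessions whenever the cluster list changes.</p>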

<h3><strong>39:30:</strong> Consistency and Availability [slide 9]</h3>

<span class="full-image-float-none"><img src="http://www.webperformancematters.com/storage/post-graphics/Vogels%20CAP%20Theorem%209.JPG" alt="Vogels%20CAP%20Theorem%209.JPG" title="Vogels%20CAP%20Theorem%209.JPG"/></span>

<p><strong>39:30:</strong> Many applications have a workflow behavior. First the customer interacts with the shopping cart. At this time, availability is the most important. After that you do all kinds of things with that data, and during those activities, consistency is the most important. Now, because you are not interacting with the customer directly, if you can't obtain consistency for one data item, you can move on to process a different data item and come back later. Then you get to the shipment and delivery phase, and at this time the database is mostly read only. A data architecture that forces you to use the same powerful tool -- like a big relational database -- for all these different activities is not ideal. If you select data storage solutions that are appropriate for each phase, it's easier to scale your solution.</p>

<h3><strong>44:00:</strong> Partition-Tolerance and Availability [slide 10]</h3>

<span class="full-image-float-none"><img src="http://www.webperformancematters.com/storage/post-graphics/Vogels%20CAP%20Theorem%2010.JPG" alt="Vogels%20CAP%20Theorem%2010.JPG" title="Vogels%20CAP%20Theorem%2010.JPG"/></span>

<p><strong>44:00:</strong> It's hard to program for weaker levels of consistency. Amazon has developed some API's, but Werner had no time to discuss these solutions. Slide 10 lists some examples, which he discussed briefly. The core design approach involves guaranteeing the durability of data inputs while relaxing consistency enforcement, then returning later to deal with any inconsistencies. </p>

<h3><strong>45:40:</strong> Techniques [slide 11]</h3>

<span class="full-image-float-none"><img src="http://www.webperformancematters.com/storage/post-graphics/Vogels%20CAP%20Theorem%2011.JPG" alt="Vogels%20CAP%20Theorem%2011.JPG" title="Vogels%20CAP%20Theorem%2011.JPG"/></span>

<p><strong>45:40:</strong> Read the slide, because Werner does not talk about it!</p>

<p>We used to use a lot of DB technology at Amazon. It works really well, especially if most applications manipulate single data records using their primary keys. You can still create accessors that iterate over the entire database, but these should be relegated to a lower priority, background, status. The primary interfaces should offer only simple get/put accesses. In a production database that supports transactions, you don't need to also support queries, especially if the data is just XML text anyway. If engineers know what's inside those XML records, they may start coding against it! </p>
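
<p>The shape of such an interface might look like the following Python sketch. It is a hypothetical in-memory stand-in, not Amazon's API: the class and method names are mine, and values are treated as opaque bytes so that callers cannot query or code against their contents.</p>

```python
from typing import Iterator, Optional


class KeyValueStore:
    """Primary interface: get and put by primary key only."""

    def __init__(self) -> None:
        self._records: dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        """Store an opaque record under its primary key."""
        self._records[key] = value

    def get(self, key: str) -> Optional[bytes]:
        """Return the record for this key, or None if it is absent."""
        return self._records.get(key)

    def scan(self) -> Iterator[tuple[str, bytes]]:
        """Iterate over all records; intended for background jobs only.

        Works from a snapshot so that concurrent puts do not disturb
        the iteration.
        """
        yield from list(self._records.items())
```

<p>Keeping <code>scan</code> separate from <code>get</code> and <code>put</code> makes it easy to run full iterations at a lower priority, while the foreground path stays limited to single-record accesses.</p>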

<p><strong>48:00:</strong> Whatever DM software you are running, databases that require high performance need specialists to configure, operate, and manage them. Engineers can't do it effectively, you need DBAs, even for very simple access patterns. <span class="aside">[<strong>Guideline 19.5</strong> in High-Performance Client/Server: Although DBMSs may offer similar features, implementations usually differ. Never assume that a design rule of thumb learned for one DBMS can be applied unchanged to another.]</span> </p>

<h3><strong>50:30:</strong> What does this mean for the data architecture?</h3>

<span class="full-image-float-none"><img src="http://www.webperformancematters.com/storage/post-graphics/Vogels%20CAP%20Theorem%2014.JPG" alt="Vogels%20CAP%20Theorem%2014.JPG" title="Vogels%20CAP%20Theorem%2014.JPG"/></span>

<p><strong>50:30:</strong> Again, read the slide, because in the edited presentation stream, Werner appears to be speaking to a different slide altogether. And then he ran out of time, so ...</p>

<p>... that's it. A really insightful and informative talk by Werner Vogels. No doubt his presentation could have been improved, given more time, or even just better use of the time available. But all the same, it is very stimulating (I think) and well worth several listens, until you grasp the central points -- all of which I agree with. In fact, Werner's conclusions circle back to the conclusion of my book, and my opening statement in the <a href="http://www.webperformancematters.com/journal/2007/8/13/asynchronous-architectures-1.html" class="offsite-link-inline">first post</a> in this series: </p>

<div class="InlineTextBox">
<p><em>Decoupled processes and multi-transaction workflows are the optimal starting point for the design of high-performance (distributed) systems.</em></p>
</div>

<p>Documenting this talk has been both educational and satisfying. But -- since I can't type nearly as quickly as Werner can talk -- I may have misquoted him somewhere. If you spot a mistake please let me know, and I'll correct it.</p> 


<p class="Footnote"><strong>This series of posts</strong> contains some material first published in <a href="http://www.amazon.com/exec/obidos/tg/detail/-/0471162698/002-1562714-5063248?v=glance" >High-Performance Client/Server</a>. My 1998 book is out of print now, and contains some outdated examples and references. But most of the discussions of performance principles are timeless, and you can pick up a used copy for about $3.00 at Amazon.</p>

<p class="Footnote"><strong>Tags:</strong>
<a href="http://technorati.com/tag/Werner+Vogels" rel="tag">Werner Vogels</a>,
<a href="http://technorati.com/tag/Amazon" rel="tag">Amazon</a>,
<a href="http://technorati.com/tag/QCon" rel="tag">QCon</a>, 
<a href="http://technorati.com/tag/distributed+systems" rel="tag">distributed systems</a>, 
<a href="http://technorati.com/tag/asynchronous+architecture" rel="tag">asynchronous architecture</a>,
<a href="http://technorati.com/tag/Web+services" rel="tag">Web services</a>,
<a href="http://technorati.com/tag/SOA" rel="tag">SOA</a>,
<a href="http://technorati.com/tag/performance" rel="tag">performance</a>,
<a href="http://technorati.com/tag/scalability" rel="tag">scalability</a>,
<a href="http://technorati.com/tag/synchronization" rel="tag">synchronization</a>, 
<a href="http://technorati.com/tag/autonomy" rel="tag">autonomy</a>, 
<a href="http://technorati.com/tag/multi-transaction" rel="tag">multi-transaction</a>,
<a href="http://technorati.com/tag/workflow" rel="tag">workflow</a>, 
<a href="http://technorati.com/tag/David+Patterson" rel="tag">David Patterson</a>,
<a href="http://technorati.com/tag/Recovery+Oriented+Computing" rel="tag">Recovery Oriented Computing</a>,
<a href="http://technorati.com/tag/Fred+Brooks" rel="tag">Fred Brooks</a>,
<a href="http://technorati.com/tag/Mythical+Man+Month" rel="tag">Mythical Man Month</a>,  
<a href="http://technorati.com/tag/Eric+Brewer" rel="tag">Eric Brewer</a>, 
<a href="http://technorati.com/tag/ACID+properties" rel="tag">ACID properties</a>,
<a href="http://technorati.com/tag/two-phase+commit" rel="tag">two-phase commit</a>, 
<a href="http://technorati.com/tag/BASE" rel="tag">BASE</a>, 
<a href="http://technorati.com/tag/CAP+theorem" rel="tag">CAP theorem</a>, 
<a href="http://technorati.com/tag/performance+matters" rel="tag">performance matters</a> 
</p>
]]></description><wfw:commentRss>http://www.webperformancematters.com/journal/rss-comments-entry-1206878.xml</wfw:commentRss></item><item><title>Asynchronous Architectures [3]</title><category>Architecture</category><category>Foundations of Performance</category><category>Performance Wisdom</category><category>Software Engineering</category><dc:creator>Chris Loosley</dc:creator><pubDate>Wed, 15 Aug 2007 07:01:00 +0000</pubDate><link>http://www.webperformancematters.com/journal/2007/8/15/asynchronous-architectures-3.html</link><guid isPermaLink="false">115864:1113404:1203680</guid><description><![CDATA[<div class="PageWisdomWrapper">

<div class="WisdomTitle" >
<h3>Dan Pritchett's Design Principle</h3>
<p class="WisdomClass" ><a href="http://www.webperformancematters.com/display/ShowJournal?moduleId=1113404&categoryId=98044">Performance Wisdom</a>: 13</p>
</div>

<p class="WisdomQuote">
Always assume high latency, not low latency
</p>

<div class="WisdomText">
<p>One of the underlying principles is assuming high latency, not low latency. An architecture that is tolerant of high latency will operate perfectly well with low latency, but the opposite is never true.</p>
<p class="QuoteSource">-- Dan Pritchett, <a href="http://www.infoq.com/articles/pritchett-latency" class="offsite-link-inline">The Challenges of Latency</a>, May 2, 2007</p>
</div>
</div>

<p><em>This is the third in a series of posts presenting arguments for <strong>asynchronous architectures</strong> as the optimal way to build high-performance, scalable systems for a distributed environment.</em></p>

<p>The first reviewed the case for asynchronous communication among interdependent components or services, and <a href="http://www.webperformancematters.com/journal/2007/8/13/asynchronous-architectures-1.html" class="PMref">Bell's Law of Waiting</a>. The second highlighted <a href="http://www.webperformancematters.com/journal/2007/8/14/asynchronous-architectures-2.html" class="PMref">The Fallacies of Distributed Computing</a>, and discussed the importance of reflecting the business process in distributed systems design.</p>

<p>This post reviews <a href="http://www.infoq.com/articles/pritchett-latency" class="offsite-link-inline">The Challenges of Latency</a>, an article about how asynchronous architectures can improve the quality of Web applications, published on the <a href="http://www.infoq.com/" class="offsite-link-inline">InfoQ</a> site by eBay architect Dan Pritchett in May 2007. Dan's article is especially relevant today, given the high level of interest in adopting Web services and SOA approaches.</p>

<p>Dan explains why global, large-scale architectures need to address latency, and what architectural patterns can be applied to deal with it. He begins by invoking the second fallacy of distributed computing:</p>

<blockquote>
<p><strong>Latency.</strong></p>

<p>The time it takes packets to flow from one part of the world to another.  Everyone knows it exists. The second fallacy of distributed computing is &quot;Latency is zero&quot;.  Yet so many designs attempt to work around latency instead of embracing it.  This is unfortunate and in fact doesn't work for large-scale systems. Why?</p>

<p>In any large-scale system, there are a few inescapable facts:</p>
<ol>
    <li> A broad customer base will demand reasonably consistent performance across the globe.</li>
    <li> Business continuity will demand geographic diversity in your deployments.</li>
    <li> The speed of light isn't going to change.</li>
</ol>
<p>The last point can't be emphasized enough. The speed of light dictates that even if we can route packets at the speed of light, seems unlikely, it will take 30ms for a packet to traverse the Atlantic.</p>

<p class="QuoteSource">-- Dan Pritchett, <a href="http://www.infoq.com/articles/pritchett-latency" class="offsite-link-inline">The Challenges of Latency</a>, May 2, 2007 [emphasis added]</p>
</blockquote>
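
<p>The 30ms figure can be sanity-checked with a little arithmetic. The sketch below is only a rough estimate: the 5,600 km path length (roughly New York to London) and the fiber refractive index of 1.47 are my assumptions, not numbers taken from Dan's article.</p>

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458  # speed of light in a vacuum
ATLANTIC_PATH_KM = 5_600.0             # assumed path, roughly New York to London
FIBER_REFRACTIVE_INDEX = 1.47          # assumed typical value for optical fiber


def one_way_ms(distance_km: float, refractive_index: float = 1.0) -> float:
    """Best-case one-way propagation delay in milliseconds."""
    speed_km_per_s = SPEED_OF_LIGHT_KM_PER_S / refractive_index
    return distance_km / speed_km_per_s * 1000.0


vacuum_ms = one_way_ms(ATLANTIC_PATH_KM)
fiber_ms = one_way_ms(ATLANTIC_PATH_KM, FIBER_REFRACTIVE_INDEX)

if __name__ == "__main__":
    print(f"one way, vacuum: {vacuum_ms:.1f} ms")
    print(f"one way, fiber:  {fiber_ms:.1f} ms")
    print(f"round trip, fiber: {2 * fiber_ms:.1f} ms")
```

<p>Under these assumptions the one-way delay is a little under 19ms in a vacuum and a little over 27ms in fiber, before any routing or queuing delay is added, so a figure of about 30ms is the right order of magnitude.</p>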

<h3>Latency hurts customer service</h3>

<p>He emphasises the connection between Internet latency and customer service:</p>

<blockquote>
<p><strong>The Internet is a part of foundation of the global economy.</strong></p>

<p>Companies need to reliably reach their customers regardless of where they may be located. Architectures that force close geographic proximity of the components limit the quality of service provided to geographically distributed customers. Response time will obviously degrade the further customers are from the servers, but so will reliability. Despite the tremendous increase in the reliability of traffic routing on the Internet, the further you are from a service, the more often that service will be effectively unavailable to you.</p>

<p class="QuoteSource">-- Dan Pritchett, <a href="http://www.infoq.com/articles/pritchett-latency" class="offsite-link-inline">The Challenges of Latency</a>, May 2, 2007 [emphasis added]</p>
</blockquote>

<h3>Latency tolerance</h3>

<p>After spelling out the principle that I have highlighted above as today's <em>Performance Wisdom</em>, he goes on to make the case for introducing asynchronous interactions as the way to achieve latency tolerance.</p>

<blockquote>
<p>The web has created an interaction style that is very problematic for building asynchronous systems. The web has trained the world to expect request/response interactions, with very low latency between the request and response. These expectations have driven architectures that are request/response oriented that lead to synchronous interactions from the browser to the data. This pattern does not lend itself to high latency connections.</p>

<p><strong>Latency tolerance can only be achieved by introducing asynchronous interactions to your architecture.</strong></p>

<p>The challenge becomes determining the components that can be decoupled and integrated via asynchronous interactions. An asynchronous architecture is far more than simply changing the request/response from a call to a series of messages though. The client is still expecting a response in a deterministic time. Asynchronous architectures shift from deterministic response time to probabilistic response time. Removing the determinism is uncomfortable for users and probably for your business units, but is critical to achieving true asynchronous interactions.</p>

<p class="QuoteSource">-- Dan Pritchett, <a href="http://www.infoq.com/articles/pritchett-latency" class="offsite-link-inline">The Challenges of Latency</a>, May 2, 2007 [emphasis added]</p>
</blockquote>
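
<p>To make the shift concrete, here is a minimal sketch using Python's standard <code>asyncio</code> library. The function names, the request payloads, and the 10ms sleep that stands in for remote work are all hypothetical. The caller gets an immediate acknowledgement that the request was accepted, and the result turns up at some later, non-deterministic time.</p>

```python
import asyncio
from typing import Dict, Tuple


async def worker(queue: "asyncio.Queue[Tuple[int, str]]", results: Dict[int, str]) -> None:
    # Drains the queue in the background, decoupled from the caller.
    while True:
        request_id, payload = await queue.get()
        await asyncio.sleep(0.01)  # stands in for slow, remote processing
        results[request_id] = payload.upper()
        queue.task_done()


async def submit(queue: "asyncio.Queue[Tuple[int, str]]", request_id: int, payload: str) -> str:
    # Returns as soon as the request is queued; the outcome arrives later.
    await queue.put((request_id, payload))
    return "accepted"


async def main() -> Dict[int, str]:
    queue: "asyncio.Queue[Tuple[int, str]]" = asyncio.Queue()
    results: Dict[int, str] = {}
    task = asyncio.create_task(worker(queue, results))
    for request_id, payload in enumerate(["bid", "pay"]):
        ack = await submit(queue, request_id, payload)
        print(request_id, ack)  # acknowledged before any work is done
    await queue.join()          # wait here only so the demo can show results
    task.cancel()
    await asyncio.gather(task, return_exceptions=True)
    return results


if __name__ == "__main__":
    print(asyncio.run(main()))
```

<p>A real client would not block on <code>queue.join()</code>; it would be notified, or would poll, when the work completes. That is the uncomfortable loss of determinism Dan describes.</p>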

<p><a href="http://www.thbs.com/pdfs/sync_or_async.pdf" class="offsite-link-inline">Web Services in SOA - Synchronous or Asynchronous?</a>, a paper by Torry Harris Business Solutions, offers another introduction to the pros and cons of synchronous and asynchronous architectures.</p>

<p>Dan accepts that not all Web application components can be designed to function asynchronously, but argues that designers can identify those use cases that do support asynchronous interactions. These arguments confirm my <a href="http://www.webperformancematters.com/journal/2007/8/14/asynchronous-architectures-2.html" class="PMref">earlier</a> conclusion that synchronous solutions must be combined with <em>asynchronous designs in which the user must accept that unconfirmed changes will be reflected in the enterprise database(s) at a later time</em>.</p>

<h3>Data partitioning</h3>

<p>He makes a very good point about the importance of designing data distribution from the outset:</p>

<blockquote>
<p>You can decompose your applications into a collection of loosely coupled components; expose your services using asynchronous interfaces, and yet still leave yourself parked in one data center with little hope of escape. You have to tackle your persistence model early in your architecture and require that data can be split along both functional and scale vectors or you will not be able to distribute your architecture across geographies. </p>

<p><strong>I recently read an article where the recommendation was to delay horizontal data spreading until you reach vertical scaling limits. I can think of few pieces of worse advice for an architect. Splitting data is more complex than splitting applications.</strong></p>

<p>But if you don't do it at the beginning, applications will ultimately take short cuts that rely on a monolithic schema. These dependencies will be extremely difficult to break in the future.</p>

<p class="QuoteSource">-- Dan Pritchett, <a href="http://www.infoq.com/articles/pritchett-latency" class="offsite-link-inline">The Challenges of Latency</a>, May 2, 2007 [emphasis added]</p>
</blockquote>
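
<p>As an illustration of splitting data along both vectors, the sketch below routes each record first by functional area and then by a stable hash of its key. The shard names and the two functional areas are hypothetical, and simple modulo hashing is my assumption for brevity; it is not something the article prescribes.</p>

```python
import hashlib
from typing import Dict, List

# Hypothetical layout: each functional area owns its own set of databases.
SHARDS: Dict[str, List[str]] = {
    "users": ["users-db-0", "users-db-1"],
    "orders": ["orders-db-0", "orders-db-1", "orders-db-2"],
}


def shard_for(area: str, key: str) -> str:
    """Pick a database by functional area, then by a stable hash of the key."""
    shards = SHARDS[area]  # functional split
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)  # scale split
    return shards[index]


if __name__ == "__main__":
    print(shard_for("users", "alice"))
    print(shard_for("orders", "order-1001"))
```

<p>Because every access goes through <code>shard_for</code>, application code cannot quietly depend on a monolithic schema. Note that modulo hashing makes adding shards expensive, which is one more reason to settle the partitioning scheme early.</p>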

<h3>ACID or BASE?</h3>

<p>Noting that the traditional way to maintain database consistency across partitioned data requires <a href="http://en.wikipedia.org/wiki/ACID" class="offsite-link-inline">ACID-compliant</a> distributed transactions and <a href="http://en.wikipedia.org/wiki/Two-phase_commit_protocol" class="offsite-link-inline">two-phase commit protocols</a>, Dan advocates a (cleverly-named) alternative to the <strong>ACID</strong> properties, the BASE approach to database consistency:</p>

<blockquote>
<p>The problem with distributed transactions is they create synchronous couplings across the databases. Synchronous couplings are the antithesis of latency tolerant designs. The alternative to <strong>ACID</strong> is <strong>BASE</strong>:</p>

<p><strong>B</strong>asically <strong>A</strong>vailable
<br /><strong>S</strong>oft state
<br /><strong>E</strong>ventually consistent
</p>

<p>BASE frees the model from the need for synchronous couplings. Once you accept that state will not always be perfect and consistency occurs asynchronous to the initiating operation, you have a model that can tolerate latency.</p> 
<p class="QuoteSource">-- Dan Pritchett, <a href="http://www.infoq.com/articles/pritchett-latency" class="offsite-link-inline">The Challenges of Latency</a>, May 2, 2007</p>
</blockquote>
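
<p>A toy model shows what "eventually consistent" means in practice. The classes below are hypothetical and deliberately simplified: a write is applied to one replica at once and queued for the others, with a version number used for last-writer-wins. Real systems use far more careful mechanisms.</p>

```python
from collections import deque
from typing import Deque, Dict, List, Optional, Tuple


class Replica:
    def __init__(self, name: str) -> None:
        self.name = name
        self.data: Dict[str, Tuple[int, int]] = {}  # key -> (version, value)

    def apply(self, key: str, version: int, value: int) -> None:
        # Last-writer-wins: ignore updates older than what is already stored.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key: str) -> Optional[int]:
        entry = self.data.get(key)
        return None if entry is None else entry[1]


class EventuallyConsistentStore:
    def __init__(self, replica_names: List[str]) -> None:
        self.replicas = [Replica(name) for name in replica_names]
        self.pending: Deque[Tuple[Replica, str, int, int]] = deque()
        self.version = 0

    def write(self, key: str, value: int) -> None:
        # Apply locally at once; other replicas are updated asynchronously.
        self.version += 1
        first, *others = self.replicas
        first.apply(key, self.version, value)
        for replica in others:
            self.pending.append((replica, key, self.version, value))

    def propagate(self) -> None:
        # Stands in for background replication catching up.
        while self.pending:
            replica, key, version, value = self.pending.popleft()
            replica.apply(key, version, value)


if __name__ == "__main__":
    store = EventuallyConsistentStore(["us", "eu"])
    store.write("price", 10)
    print([r.read("price") for r in store.replicas])  # replicas disagree
    store.propagate()
    print([r.read("price") for r in store.replicas])  # replicas agree
```

<p>Between <code>write</code> and <code>propagate</code> the two replicas return different answers. That window is the "soft state" an application built on BASE has to tolerate.</p>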

<p>Another article worth reading, <a href="http://xml.sys-con.com/read/43755.htm" class="offsite-link-inline">Web-Services Transactions</a> by Doug Kaye, advances similar arguments without using the <em>BASE</em> terminology.</p>

<h3>References: business-driven or event-driven architectures</h3>

<p>While these articles present, at a high-level, a convincing case for asynchronous architectures, many others have elaborated on the implementation details. Here are five examples of more detailed treatments, in approximately descending order of generality:</p>

<ul class="group">

<li>The Wikipedia article on <a href="http://en.wikipedia.org/wiki/Event_Driven_Architecture" class="offsite-link-inline">Event-driven architecture</a>.</li>

<li><a href="http://elementallinks.typepad.com/bmichelson/2006/02/eventdriven_arc.html" class="offsite-link-inline">Event-Driven Architecture Overview</a>, a very detailed post by Brenda Michelson in her blog, <a href="http://elementallinks.typepad.com/bmichelson/" class="offsite-link-inline">Elemental Links</a>. 

<span class="aside">[A blog <a href="http://dougmcclure.net/blog/?p=41" class="offsite-link-inline">response</a> from <a href="http://dougmcclure.net/blog/about/" class="offsite-link-inline">Doug McClure</a> exemplifies the gulf between the concerns of business/application architects and those of IT/network management, despite all the talk of "alignment" these days. As I wrote in my book, "Middleware is the reason why network specialists and application programmers cannot communicate!" (Guideline 15.3, p469).]</span></li>

<li><a href="http://www.javaworld.com/javaworld/jw-01-2005/jw-0131-soa.html" class="offsite-link-inline">Event-driven services in SOA</a> by Jeff Hanson, JavaWorld.com, January 31, 2005.</li>

<li><a href="http://msdn2.microsoft.com/en-us/library/ms706253.aspx" class="offsite-link-inline">Message Queuing Applications</a>, Microsoft Developer Network (MSDN).</li>

<li><a href="http://developers.sun.com/jsenterprise/reference/techart/jse7/asynch.html" class="offsite-link-inline">Developing Asynchronous Web Services with Java Message Service</a> by Rico Cruz and Marina Sum, Sun Developer Network.</li>

</ul>

<h3>Next: the CAP theorem</h3>

<p>Dan points out that adopting the BASE approach to consistency forces you to understand a very important principle, known as <em>The CAP Theorem</em>. This theorem was first propounded in a 1998 presentation -- <a href="http://www.ccs.neu.edu/groups/IEEE/ind-acad/brewer/" class="offsite-link-inline">Lessons from Internet Services: ACID vs. BASE</a> -- by Dr. Eric Brewer of Inktomi, now a <a href="http://www.cs.berkeley.edu/~brewer/" class="offsite-link-inline">professor</a> at UC Berkeley.</p>

<blockquote>
<p>Of course there are situations where data needs to be consistent at the end of an operation. The CAP Theorem is a useful tool for determining what data to partition and what data must conform to ACID.</p>

<p><strong>The CAP Theorem states that when designing databases you consider three properties, Consistency, Availability, and Partitioning. You can have at most two of the three for any data model.</strong></p>

<p>Organizing your data model around CAP allows you to make the appropriate decisions with regards to consistency and latency.</p>

<p class="QuoteSource">-- Dan Pritchett, <a href="http://www.infoq.com/articles/pritchett-latency" class="offsite-link-inline">The Challenges of Latency</a>, May 2, 2007 [emphasis added]</p>
</blockquote>

<p>Because <em>The CAP Theorem</em> plays a crucial role in the design of large scalable systems using asynchronous architectures, I will be devoting my next post to it.</p>

<p class="Footnote"><strong>This series of posts</strong> contains some material first published in <a href="http://www.amazon.com/exec/obidos/tg/detail/-/0471162698/002-1562714-5063248?v=glance" >High-Performance Client/Server</a>. My 1998 book is out of print now, and contains some outdated examples and references. But most of the discussions of performance principles are timeless, and you can pick up a used copy for about $3.00 at Amazon.</p>

<p class="Footnote"><strong>Tags:</strong> 
<a href="http://technorati.com/tag/distributed+systems" rel="tag">distributed systems</a>, 
<a href="http://technorati.com/tag/asynchronous+architecture" rel="tag">asynchronous architecture</a>,
<a href="http://technorati.com/tag/Web+services" rel="tag">Web services</a>,
<a href="http://technorati.com/tag/SOA" rel="tag">SOA</a>,
<a href="http://technorati.com/tag/middleware" rel="tag">middleware</a>,
<a href="http://technorati.com/tag/serialization" rel="tag">serialization</a>,
<a href="http://technorati.com/tag/synchronization" rel="tag">synchronization</a>, 
<a href="http://technorati.com/tag/queues" rel="tag">queues</a>, 
<a href="http://technorati.com/tag/decoupled+processes" rel="tag">decoupled processes</a>, 
<a href="http://technorati.com/tag/multi-transaction" rel="tag">multi-transaction</a>,
<a href="http://technorati.com/tag/workflow" rel="tag">workflow</a>, 
<a href="http://technorati.com/tag/distributed+computing" rel="tag">distributed computing</a>,
<a href="http://technorati.com/tag/Dan+Pritchett" rel="tag">Dan Pritchett</a>,
<a href="http://technorati.com/tag/eBay" rel="tag">eBay</a>,
<a href="http://technorati.com/tag/Brenda+Michelson" rel="tag">Brenda Michelson</a>,
<a href="http://technorati.com/tag/Elemental+Links" rel="tag">Elemental Links</a>,
<a href="http://technorati.com/tag/Doug+McClure" rel="tag">Doug McClure</a>,
<a href="http://technorati.com/tag/Eric+Brewer" rel="tag">Eric Brewer</a>, 
<a href="http://technorati.com/tag/Inktomi" rel="tag">Inktomi</a>,  
<a href="http://technorati.com/tag/ACID+properties" rel="tag">ACID properties</a>,
<a href="http://technorati.com/tag/two-phase+commit" rel="tag">two-phase commit</a>, 
<a href="http://technorati.com/tag/BASE" rel="tag">BASE</a>, 
<a href="http://technorati.com/tag/CAP+theorem" rel="tag">CAP theorem</a>, 
<a href="http://technorati.com/tag/application+performance" rel="tag">application performance</a>,
<a href="http://technorati.com/tag/scalability" rel="tag">scalability</a>,
<a href="http://technorati.com/tag/performance+wisdom" rel="tag">performance wisdom</a>,  
<a href="http://technorati.com/tag/performance+matters" rel="tag">performance matters</a> 
</p>]]></description><wfw:commentRss>http://www.webperformancematters.com/journal/rss-comments-entry-1203680.xml</wfw:commentRss></item><item><title>Asynchronous Architectures [2]</title><category>Architecture</category><category>Foundations of Performance</category><category>Performance Wisdom</category><category>Software Engineering</category><dc:creator>Chris Loosley</dc:creator><pubDate>Tue, 14 Aug 2007 07:01:00 +0000</pubDate><link>http://www.webperformancematters.com/journal/2007/8/14/asynchronous-architectures-2.html</link><guid isPermaLink="false">115864:1113404:1200536</guid><description><![CDATA[<div class="PageWisdomWrapper">

<div class="WisdomTitle" >
<h3>The Fallacies of Distributed Computing</h3>
<p class="WisdomClass" ><a href="http://www.webperformancematters.com/display/ShowJournal?moduleId=1113404&categoryId=98044">Performance Wisdom</a>: 12</p>
</div>

<div class="WisdomQuoteLong">
<ol>
<li>The network is reliable</li>
<li>Latency is zero</li>
<li>Bandwidth is infinite</li>
<li>The network is secure</li>
<li>Topology doesn't change</li>
<li>There is one administrator</li>
<li>Transport cost is zero</li>
<li>The network is homogeneous</li>
</ol>
</div>

<div class="WisdomText">
<p class="QuoteSource">-- <a href="http://java.sys-con.com/read/38665.htm" class="offsite-link-inline">Peter Deutsch</a>, James Gosling, Bill Joy, Tom Lyon</p>
</div>
</div>

<p><em>This post is the second in a series presenting arguments for <strong>asynchronous architectures</strong> as the optimal way to build high-performance, scalable systems for a distributed environment. The first post reviewed the general case for asynchronous communication among interdependent components or services, and highlighted <a href="http://www.webperformancematters.com/journal/2007/8/13/asynchronous-architectures-1.html" class="PMref">Bell's Law of Waiting</a>.</em></p>

<p>In this post I discuss how the design of distributed systems should draw on that of manual business systems. Of course, distributed computing can shorten the timescales of some business operations enormously. But drawing analogies with the way manual systems work will help us to design efficient and scalable distributed systems.</p>

<p>In <a href="http://www.webperformancematters.com/journal/2007/8/10/five-scalability-principles.html" class="PMref">Five Scalability Principles</a>, I reviewed an <a href="http://www.mysql.com/why-mysql/scaleout/booking.html" class="offsite-link-inline">article</a> published by MySQL about the five performance principles that apply to all application scaling efforts. When discussing the first principle -- Don't think synchronously -- I stated that:</p>

<p><em>Decoupled processes and multi-transaction workflows are the optimal starting point for the design of high-performance (distributed) systems</em>.</p>

<h3>Starting from the wrong place</h3>
	
<p>Designing a computer system based on multi-transaction workflows is not a particularly revolutionary proposal. Indeed, if we had somehow been able to skip the first 40 years of the computer age and start automating our business systems using today’s technology, we would probably not have thought it the least bit unusual, because all manual systems work this way.</p> 
	
<p>But like our computers, our reasoning can be so logical that it lacks real thought. Occasionally, we need to balance our linear thinking with a small dose of simple wit like that ascribed to the Irish farmer, who, when asked by strangers for directions to a distant town, began his answer by saying “Well, I wouldn’t start from here”!</p>
 
<p>Most of our troubles stem from the fact that, when trying to reach the destination of distributed systems, we keep starting from the wrong place, namely the application designs and systems software of centralized computing:</p>
<ul>
<li>Design discussions dwell on how best to <a href="http://dev.mysql.com/tech-resources/articles/application_partitioning_wp.pdf" class="offsite-link-inline">partition</a> applications for the distributed environment. Manual systems, however, are already partitioned naturally -- the supposedly monolithic application that is being “partitioned” would not exist in the first place if it had not been conceived as “the right solution” by a designer with a centralized computing mindset.</li>
<li>The reason computer science devoted so much attention to <a href="http://en.wikipedia.org/wiki/Distributed_database" class="offsite-link-inline">distributed databases</a> and <a href="http://msdn2.microsoft.com/en-us/library/ms681205.aspx" class="offsite-link-inline">distributed transaction</a> management is that these concepts are extensions of the core mechanisms of centralized information processing -- shared databases and transaction monitors.</li>
</ul>

<p>But <a href="http://www.rgoarchitects.com/Files/fallacies.pdf" class="offsite-link-inline">The Fallacies of Distributed Computing</a> -- assembled during the 1990s by architects at Sun and the subject of today's <em>Performance Wisdom</em> (above) -- highlight crucial differences between centralized and distributed computing. Adding network components to an application introduces many potential problems that a centralized solution does not have to consider. So rather than trying to force the centralized mechanisms to work in a distributed environment, we should adopt mechanisms that are more appropriate.</p>

<h3>Starting from the right place</h3>
	
<p>In fact, we should start from the design of manual business systems. All large-scale human systems are inherently distributed and asynchronous in nature. Even the participants in close-knit team efforts operate asynchronously. We find chorus lines, cheerleaders, marching bands, and synchronized swimmers so interesting because they are such an aberration. So, before the advent of the centralized mainframe, the idea of recording an entire business transaction with a single synchronized set of human actions did not arise because it is so absurdly impossible.</p>

<p>Traditionally, the business process is divided into its natural components (or phases), according to the roles of the various human processors (or workers). Work flows through the phases, and information is recorded as necessary along the way. If anything goes wrong along the way, the appropriate set of compensating actions must be taken to undo whatever partial progress has been made. And the whole operation is designed to ensure that no irrevocable actions are taken too early in the process--usually meaning before the money is in the bank. Companies that mail out the diamonds before cashing the checks soon learn how to design a more effective multi-phase business process.</p>
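
<p>The same multi-phase pattern can be sketched in code. In this hypothetical Python example, each phase pairs an action with a compensating action; if a phase fails, the phases already completed are undone in reverse order. The function and step names are illustrative only.</p>

```python
from typing import Callable, List, Tuple

# Each step is (name, action, compensating action).
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_workflow(steps: List[Step], log: List[str]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[Step] = []
    for step in steps:
        name, action, _compensate = step
        try:
            action()
        except Exception as exc:
            log.append(f"failed {name}: {exc}")
            for done_name, _action, compensate in reversed(completed):
                compensate()
                log.append(f"undid {done_name}")
            return False
        completed.append(step)
        log.append(f"did {name}")
    return True


if __name__ == "__main__":
    state = {"reserved": False, "shipped": False}

    def reserve() -> None:
        state["reserved"] = True

    def unreserve() -> None:
        state["reserved"] = False

    def cash_check() -> None:
        raise RuntimeError("check bounced")

    def ship() -> None:
        state["shipped"] = True  # irrevocable, so it goes last

    log: List[str] = []
    ok = run_workflow(
        [
            ("reserve", reserve, unreserve),
            ("cash check", cash_check, lambda: None),
            ("ship", ship, lambda: None),
        ],
        log,
    )
    print(ok, state, log)
```

<p>Placing the irrevocable step last means a failed payment never has to undo a shipment, which is the lesson of the diamonds and the checks.</p>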
	
<h3>Asynchronous architectures may reflect manual systems</h3>

<p>Manual systems always permit asynchronous operation of their separate components, because no other mode of operation is possible. Only computers make synchronous changes even possible. To a degree, centralized computing could deliver synchronous changes to related databases because the same computer managed all the system resources. Peaks in the workload could cause contention and delays, but provided the machine kept running, a congested machine acted as its own governor.</p>
	
<p>But processor technology does not allow the centralized computing model to scale up without limits. When the workload surpasses the capabilities of the largest centralized processor, the only way to keep growing is to divide and conquer -- to create a network of computers. Networked computers demand a different approach to application design. Pursuing the vision of enterprise-wide synchronization of information through networked computers can cause delays and inefficiencies when any part of the whole system operates below par. This is the Achilles heel of interdependent systems--the whole is no stronger than its weakest part.</p>
	
<p>Therefore, the rigid concept of synchronous transactions must be replaced by a wider range of possible application designs: </p>
<ul class="group">
<li>The older, synchronous methods are still appropriate for changes that are within the scope of a single processor, or even for occasional communication between controlled components separated by a very carefully controlled, high-speed network;</li>
<li>These must be combined with asynchronous designs in which the user must accept that unconfirmed changes will be reflected in the enterprise database(s) at a later time.</li>
</ul> 

<h3>Business process design</h3>

<p>When we do a good job of distributed systems design, it becomes an integral part of business process design. Rather than bending the business process to meet the needs of a centralized computer, we must blend the power of distributed computers into the business process.</p>
	
<p>Ironically, these changes bring us almost full circle back to the days of manual systems. In a manual system, it is normal for changes to be recorded quickly in one location, but for those changes to take a few days to percolate through the system, with the total processing time being somewhat uncertain. Distributed computing shortens the timescales, but applying many design principles that made manual systems work efficiently can help us to create effective and scalable distributed systems, even when we do not control the performance characteristics of all the components of those systems.</p>

<p>In my next post I will review some recent thinking about the intersection of asynchronous architectures and the world of Web services and SOA.</p>

<p class="Footnote"><strong>This post</strong> contains material first published in <a href="http://www.amazon.com/exec/obidos/tg/detail/-/0471162698/002-1562714-5063248?v=glance" >High-Performance Client/Server</a>, Chapter 16: Architecture for High Performance, pp509-510 and 526-527. My 1998 book is out of print now, and contains some outdated examples and references. But most of the discussions of performance principles are timeless, and you can pick up a used copy for about $3.00 at Amazon.</p>

<p class="Footnote"><strong>Tags:</strong> 
<a href="http://technorati.com/tag/distributed+systems" rel="tag">distributed systems</a>, 
<a href="http://technorati.com/tag/asynchronous+architecture" rel="tag">asynchronous architecture</a>,
<a href="http://technorati.com/tag/serialization" rel="tag">serialization</a>, 
<a href="http://technorati.com/tag/queues" rel="tag">queues</a>, 
<a href="http://technorati.com/tag/decoupled+processes" rel="tag">decoupled processes</a>, 
<a href="http://technorati.com/tag/multi-transaction" rel="tag">multi-transaction</a>,
<a href="http://technorati.com/tag/workflow" rel="tag">workflow</a>, 
<a href="http://technorati.com/tag/performance+wisdom" rel="tag">performance wisdom</a>,
<a href="http://technorati.com/tag/distributed+computing" rel="tag">distributed computing</a>,
<a href="http://technorati.com/tag/fallacies" rel="tag">fallacies</a>,
<a href="http://technorati.com/tag/Peter+Deutsch" rel="tag">Peter Deutsch</a>, 
<a href="http://technorati.com/tag/James+Gosling" rel="tag">James Gosling</a>, 
<a href="http://technorati.com/tag/Bill+Joy" rel="tag">Bill Joy</a>, 
<a href="http://technorati.com/tag/Tom+Lyon" rel="tag">Tom Lyon</a>, 
<a href="http://technorati.com/tag/Sun" rel="tag">Sun</a>, 
<a href="http://technorati.com/tag/software+performance" rel="tag">software performance</a>,
<a href="http://technorati.com/tag/scalability" rel="tag">scalability</a>,  
<a href="http://technorati.com/tag/performance+matters" rel="tag">performance matters</a> 
</p>]]></description><wfw:commentRss>http://www.webperformancematters.com/journal/rss-comments-entry-1200536.xml</wfw:commentRss></item><item><title>Asynchronous Architectures [1]</title><category>Architecture</category><category>Foundations of Performance</category><category>Performance Wisdom</category><category>Software Engineering</category><dc:creator>Chris Loosley</dc:creator><pubDate>Mon, 13 Aug 2007 07:01:00 +0000</pubDate><link>http://www.webperformancematters.com/journal/2007/8/13/asynchronous-architectures-1.html</link><guid isPermaLink="false">115864:1113404:1200201</guid><description><![CDATA[<div class="PageWisdomWrapper">

<div class="WisdomTitle" >
<h3>Bell's Law of Waiting</h3>
<p class="WisdomClass" ><a href="http://www.webperformancematters.com/display/ShowJournal?moduleId=1113404&categoryId=98044">Performance Wisdom</a>: 11</p>
</div>

<p class="WisdomQuote">All computers wait at the same speed</p>

<div class="WisdomText">
<p class="QuoteSource">-- Dr. Thomas E. Bell, Performance of Distributed Systems, Presentation to ICCM Capacity Management Forum 7, October 1993, San Francisco</p>
</div>
</div>

<p>In <a href="http://www.webperformancematters.com/journal/2007/8/10/five-scalability-principles.html" class="PMref">Five Scalability Principles</a>, I reviewed an <a href="http://www.mysql.com/why-mysql/scaleout/booking.html" class="offsite-link-inline">article</a> published by MySQL about the five performance principles that apply to all application scaling efforts. When discussing the first principle -- Don't think synchronously -- I stated that <em>Decoupled processes and multi-transaction workflows are the optimal starting point for the design of high-performance (distributed) systems</em>.</p>

<p>That's a quote from <a class="offsite-link-inline" href="http://www.amazon.com/exec/obidos/tg/detail/-/0471162698/002-1562714-5063248?v=glance" >High-Performance Client/Server</a>, from a section on <em>Abandoning the Single Synchronous Transaction Paradigm</em>, in Chapter 16, <em>Architecture for High Performance</em>. My 1998 book is out of print now, and contains some outdated examples and references. But most of the discussions of performance principles are timeless, and you can pick up a used copy for about $3.00 at Amazon.</p>

<p>So I am planning some more posts built around excerpts from the manuscript. I'll be updating and generalizing the terminology as necessary for today's environments, and adding some guidelines in my <a href="http://www.webperformancematters.com/display/ShowJournal?moduleId=1113404&categoryId=98044" class="PMref">Performance Wisdom</a> series.</p>

<h3>Asynchronous architectures are more scalable</h3>

<p>The first posts will elaborate on the arguments for <strong>asynchronous architectures</strong> as the optimal way to build high-performance, scalable systems for a distributed environment. I begin by reviewing the general case for asynchronous communication among interdependent components or services. </p>

<p>In the typical distributed enterprise, there will inevitably be fluctuations in the distribution of work to be done, as business volumes rise and fall, and fluctuations in the availability of network and processing resources.</p>
	
<p>Even if we have designed our systems to accommodate peak processing volumes, it is normal for some servers, some application components, or some part of a large network to be out of action some of the time. Therefore, if we design applications that require all resources to be available before we can complete any useful work, we reduce the availability of the whole system to the level of its most error-prone component.</p>
	
<p>For optimal performance, we should design applications to accommodate unexpected peaks in the workload, server outages, and resource unavailability. This means application and system design must:</p>
<ul class="group">
<li>Emphasize <strong>concurrent operation</strong> in preference to workload serialization</li>
<li>Prefer <strong>asynchronous connections</strong> to synchronous ones between clients and servers</li>
<li>Place requests for service in <strong>queues</strong> and continue processing, rather than waiting for a response</li>
<li>Create opportunities for <strong>parallel processing</strong> of workload components</li>
<li>Distribute work to <strong>overflow servers</strong> to accommodate peak volumes</li>
<li>Provide <strong>redundant servers</strong> to take over critical workload components during peaks and outages</li>
</ul>
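<p>As a rough illustration of the queueing guideline above, here is a minimal Python sketch. It is not taken from the book: <code>run_demo</code>, the <code>server</code> worker, and the doubling "service" are hypothetical stand-ins, and a real system would use a message queue between machines rather than an in-process one. The point is only that the submitting loop places each request in a queue and moves on, instead of blocking until that request has been answered.</p>

```python
import queue
import threading

def run_demo(requests):
    """Enqueue every request without waiting, then collect replies later."""
    work = queue.Queue()      # requests waiting for service
    replies = queue.Queue()   # completed responses

    def server():
        # Hypothetical worker standing in for a remote service.
        while True:
            item = work.get()
            if item is None:              # sentinel: no more work
                break
            replies.put((item, item * 2))

    worker = threading.Thread(target=server)
    worker.start()

    submitted = []
    for r in requests:
        work.put(r)            # place the request in the queue ...
        submitted.append(r)    # ... and continue processing immediately
    work.put(None)
    worker.join()

    # Responses are picked up afterwards, decoupled from submission.
    results = {}
    while not replies.empty():
        key, value = replies.get()
        results[key] = value
    return submitted, results
```

<p>Calling <code>run_demo([1, 2, 3])</code> submits all three requests before any reply is read. The submitter never sits idle on a single outstanding request.</p>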

<h3>Design applications that don’t wait</h3>
	
<p>Each of these topics is too large to review in any detail in this post, but their central theme can be summed up as: <em>Design applications that don’t wait</em>. Note an important distinction between the behavior of individual transactions or units of work, and the behavior of the system as a whole. Individual transactions may indeed have to wait until they can obtain the processing resources they need. But the application as a whole should continue processing, with a minimal allocation of resources to any transactions flowing through the system.</p>

<p>That way, the scarce computing resources of one server do not sit idle waiting for delayed transactions to receive responses from other services or components. For example, here's some advice from BEA, taken from <a href="http://edocs.bea.com/wls/docs92/jms/design_best_practices.html#wp1058694" class="offsite-link-inline">Best Practices for Application Design</a> when programming the WebLogic Java Message Service (JMS):</p>
<blockquote>
<h4>Asynchronous vs. Synchronous Consumers</h4>
<p>In general, asynchronous (onMessage) consumers perform and scale better than synchronous consumers:</p>

<ul>
<li>Asynchronous consumers create less network traffic. Messages are pushed unidirectionally, and are pipelined to the message listener. Pipelining supports the aggregation of multiple messages into a single network call. </li>

<li>Asynchronous consumers use fewer threads. An asynchronous consumer does not use a thread while it is inactive. A synchronous consumer consumes a thread for the duration of its receive call. As a result, a thread can remain idle for long periods, especially if the call specifies a blocking timeout. </li>

<li>For application code that runs on a server, it is almost always best to use asynchronous consumers (which) prevents the application code from doing a blocking operation on the server. A blocking operation, in turn, idles a server-side thread; it can even cause deadlocks. Deadlocks occur when blocking operations consume all threads. When no threads remain to handle the operations required to unblock the blocking operation itself, that operation never stops blocking. </li>
</ul>
</blockquote>
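<p>The thread-usage point in BEA's advice can be sketched outside of JMS. The example below is an analogy in Python's <code>asyncio</code>, not the WebLogic API: the names <code>consume</code>, <code>main</code>, and <code>run</code> are made up for illustration. It starts one hundred consumers that each wait for messages, and records which thread handled each message. Because an idle consumer is merely suspended at an <code>await</code>, all of them can share a single thread.</p>

```python
import asyncio
import threading

async def consume(name, inbox, seen):
    # An idle consumer is suspended at 'await' and holds no thread.
    while True:
        msg = await inbox.get()
        if msg is None:                   # sentinel: stop consuming
            break
        seen.append((name, msg, threading.get_ident()))

async def main(consumer_count=100, messages_each=2):
    seen = []
    inboxes = [asyncio.Queue() for _ in range(consumer_count)]
    tasks = [asyncio.create_task(consume(i, q, seen))
             for i, q in enumerate(inboxes)]
    for q in inboxes:
        for m in range(messages_each):
            q.put_nowait(m)
        q.put_nowait(None)
    await asyncio.gather(*tasks)
    return seen

def run():
    """Return (messages handled, distinct threads used)."""
    seen = asyncio.run(main())
    thread_ids = {tid for _, _, tid in seen}
    return len(seen), len(thread_ids)
```

<p>A synchronous design with one blocking receive per consumer would need a thread for each of the hundred consumers, most of them idle most of the time.</p>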

<h3>Bell's Law</h3>

<p>In conclusion, when designing distributed systems, we should always recall Tom Bell's humorous observation, <em>All computers wait at the same speed</em>. A computing resource that is waiting to be used, especially a processor, is a wasted resource.</p>

<p class="Footnote"><strong>This post</strong> contains material first published in <a class="offsite-link-inline" href="http://www.amazon.com/exec/obidos/tg/detail/-/0471162698/002-1562714-5063248?v=glance" >High-Performance Client/Server</a>, Chapter 11: The Sharing Principle, p360. 
<strong>Tags:</strong> 
<a href="http://technorati.com/tag/distributed+systems" rel="tag">distributed systems</a>, 
<a href="http://technorati.com/tag/asynchronous+architecture" rel="tag">asynchronous architecture</a>,
<a href="http://technorati.com/tag/serialization" rel="tag">serialization</a>, 
<a href="http://technorati.com/tag/queues" rel="tag">queues</a>, 
<a href="http://technorati.com/tag/Bell's+Law" rel="tag">Bell's Law</a>, 
<a href="http://technorati.com/tag/Tom+Bell" rel="tag">Tom Bell</a>,
<a href="http://technorati.com/tag/performance+wisdom" rel="tag">performance wisdom</a>, 
<a href="http://technorati.com/tag/software+performance" rel="tag">software performance</a>,
<a href="http://technorati.com/tag/scalability" rel="tag">scalability</a>,  
<a href="http://technorati.com/tag/performance+matters" rel="tag">performance matters</a> 
</p>]]></description><wfw:commentRss>http://www.webperformancematters.com/journal/rss-comments-entry-1200201.xml</wfw:commentRss></item><item><title>Five Scalability Principles</title><category>Articles and White Papers</category><category>Foundations of Performance</category><category>Optimization and Tuning</category><category>Performance Wisdom</category><dc:creator>Chris Loosley</dc:creator><pubDate>Fri, 10 Aug 2007 10:12:00 +0000</pubDate><link>http://www.webperformancematters.com/journal/2007/8/10/five-scalability-principles.html</link><guid isPermaLink="false">115864:1113404:1134876</guid><description><![CDATA[<div class="PageWisdomWrapper">
<div class="WisdomTitle" >
<h3>Five Scalability Principles</h3>
<p class="WisdomClass" ><a href="http://www.webperformancematters.com/display/ShowJournal?moduleId=1113404&categoryId=98044">Performance Wisdom</a>: 10</p>
</div>

<p class="WisdomQuote">Don’t think synchronously, ...</p>

<div class="WisdomText">
<p>... don&#8217;t think vertically, don&#8217;t mix transactions with business intelligence, avoid mixing hot and cold data, and don&#8217;t forget the power of memory.</p>
 
<p class="QuoteSource">-- MySQL site, 2007</p>
</div>
</div>

<p><a href="http://www.mysql.com/why-mysql/scaleout/booking.html" class="offsite-link-inline">The 12 Days of Scale-Out</a> is a section of the <em>MySQL</em> site. It consists of a series of twelve articles, eleven of which are case studies describing large-scale MySQL implementations. But <em>Day Six</em> is a bit different -- it spells out five fundamental performance principles that apply to all application scaling efforts.</p>

<p>This subject is vitally important to MySQL, whose <em>server replication and high availability features ... allow high-traffic sites to horizontally 'Scale-Out' their applications, using multiple commodity machines to form one logical database -- as opposed to 'Scaling Up', starting over with more expensive and complex hardware and database technology</em>. </p>

<p>I know from first-hand experience that these claims are valid. At Keynote, my team used MySQL as the foundation for the <a href="http://www.keynote.com/products/web_performance/performance_measurement/performance_scoreboard.html" class="offsite-link-inline">Performance Scoreboard</a>. In this <a href="http://en.wikipedia.org/wiki/Data_mart" class="offsite-link-inline">data mart</a> application, MySQL supports the continuous insertion of new measurements at the rate of several million per day, plus hourly aggregation into summary tables, plus the queries needed to support continually updated dashboard displays for every customer, plus any ad hoc queries generated by customers doing diagnostic investigations.</p>

<h3>Learning the hard way</h3>

<p>According to the article, MySQL's <a href="http://www.mysql.com/news-and-events/press-release/release_2006_13.html" class="offsite-link-inline">database experts</a> ... <em>have seen many companies fall into a few common traps when they first design their systems, only to run into performance issues once the explosive growth hits</em>. So, adopting the <a href="http://www.webperformancematters.com/journal/2007/7/11/distributing-java-applications.html" class="PMref"><em>anti-pattern</em> approach</a> to providing guidance, the article presents the principles as the <a href="http://www.mysql.com/why-mysql/scaleout/scaleout_pitfalls.html" class="offsite-link-inline">Top Five Scale-Out
Pitfalls to Avoid</a>. These are:</p>

<ul class="grouptight">
<li>Don&#8217;t think synchronously</li>
<li>Don&#8217;t think vertically</li>
<li>Don&#8217;t mix transactions with business intelligence</li>
<li>Avoid mixing hot and cold data</li>
<li>Don&#8217;t forget the power of memory</li>
</ul>

<p>It is common for people to get smarter about performance when they have to find and fix problems. I learned about database performance first-hand between 1970 and 1995, while designing and tuning IBM database systems -- first IMS, and then DB2. In the process I discovered -- often the hard way -- that all large computer systems are subject to the same principles. And over the years I ran across other authors and teachers who (not surprisingly) had discovered the same things. Some had even invented memorable ways of expressing their insights as "laws" or "rules".</p>

<p>When I wrote <a class="offsite-link-inline" href="http://www.amazon.com/exec/obidos/tg/detail/-/0471162698/002-1562714-5063248?v=glance" >High-Performance Client/Server</a>, I tried to capture these clever sayings as numbered guidelines. So in this post, I'm going to reprint each of MySQL's five scale-out pitfalls, followed by the corresponding guidelines and some related excerpts from the manuscript of that book (marked as "HPCL"). </p>

<h3>1. Don&#8217;t think synchronously</h3>

<blockquote>
<p>Thinking synchronous is the single biggest mistake in architecting a Scale-Out design. Generally, when load is added to an already-loaded system, some part of the system will become a bottleneck -- and response times will increase. In scale-out, with a large system consisting of multiple machines, thinking synchronously will add a lot of wait time and hurt performance. Any truly large scale-out design will have to introduce asynchronous communication, parallelization, and strategies to deal with approximate or slightly outdated data.</p>
<p class="QuoteSource">--<a href="http://www.mysql.com/why-mysql/scaleout/scaleout_pitfalls.html" class="offsite-link-inline">Top Five Scale-Out Pitfalls to Avoid</a>, The 12 Days of Scale-Out, Day 6</p>
</blockquote>

<h4>Abandoning the Single Synchronous Transaction Paradigm</h4>

<p><strong>HPCL</strong>: The ... concept of a heterogeneous distributed database with synchronized updates is a vision of utopia that swims against the tide of computing technology. The tight controls over application processing that are possible on a mainframe are incompatible with many aspects of the move to widespread distributed processing... Decoupled processes and multi-transaction workflows are the optimal starting point for the design of high-performance enterprise client/server systems:</p>

<ul>
<li><strong>Decoupled processes.</strong> Decoupling occurs when we can separate the different parts of a distributed system so that no one process ever needs to stop processing to wait for the other(s). The driving force behind this recommendation is Bell’s law of waiting: <em>All CPUs wait at the same speed</em> (Guideline 11.17)</li>
<li><strong>Multi-transaction workflows.</strong> Often, we can split up the business transaction into a series of separate computer transactions. We call the result a multi-transaction workflow. The motivation for this recommendation is the Locality Principle, <em>Group components based on their usage</em> (Guideline 10.1), and (one corollary of) the Parallelism Principle, <em>Workflow parallelism can lower perceived response times</em> (Guideline 13.7)</li>
</ul>

<p class="QuoteSource">--<a class="offsite-link-inline" href="http://www.amazon.com/exec/obidos/tg/detail/-/0471162698/002-1562714-5063248?v=glance" >High-Performance Client/Server</a>, Chapter 16: Architecture for High Performance, pp506-507</p>

<h3>2. Don&#8217;t think vertically</h3>

<blockquote>
<p>It's a mistake to think that a system can be grown by scaling vertically, that is, by buying bigger machines with more CPUs. Throwing more power at an existing implementation -- which is probably synchronous and most likely already suffering from lock waits -- is only going to make it 'wait faster'. By planning for horizontal scale-out, almost from the start, a business is already planning in the direction of distributed, asynchronous systems, which will make it easier to add more capacity later on.</p>
<p class="QuoteSource">--<a href="http://www.mysql.com/why-mysql/scaleout/scaleout_pitfalls.html" class="offsite-link-inline">Top Five Scale-Out Pitfalls to Avoid</a>, The 12 Days of Scale-Out, Day 6</p>
</blockquote>

<p><strong>HPCL</strong>: In Chapters 11 and 12 we discussed the performance of shared resources, and how to minimize the delays caused by bottlenecks. When there is excessive demand for a single shared resource, one way to break the logjam is to divide and conquer. Dennis Shasha’s database design principle, <em>Partitioning breaks bottlenecks</em> [1], concisely expresses this motivation for processing work in parallel.</p>

<span class="SectionIllustrationInline"><img src="http://www.webperformancematters.com/storage/post-graphics/Dennis%20Shasha.JPG" alt="Illustration: Dennis Shasha" title="Dennis Shasha" /></span>

<p class="footnote">[1] <a href="http://cs.nyu.edu/shasha/" class="offsite-link-inline">Dennis E. Shasha</a> in <a href="http://www.informatik.uni-trier.de/~ley/db/journals/dr/Atzeni00a.html" class="offsite-link-inline">Database Tuning: A Principled Approach</a>, Prentice-Hall, 1992, p3</p>

<p>The idea behind parallelism is simple: take several items of work and process them at the same time. Naturally, this is faster than processing the same work serially. But while it may be an obvious and attractive design technique, processing in parallel does introduce new problems of its own. Not all workloads can be easily subdivided, and not all software is designed to work in parallel. Processing related pieces of work in parallel typically introduces additional synchronization overheads, and in many situations produces new kinds of contention among the parallel streams. Connie Smith’s version of the Parallelism Principle recognizes these complications: <em>Execute processing in parallel (only) when the processing speedup offsets communication overhead and resource contention delays</em> [2].</p>

<p class="footnote">[2] <a href="http://www.perfeng.com/instruct.htm" class="offsite-link-inline">Connie U. Smith</a>, Performance Engineering of Software Systems, Addison Wesley, 1990, p55</p>

<h4>The Parallelism Principle</h4>
	
<p>To sum up, Smith’s and Shasha’s two views nicely highlight the dilemma facing every designer when considering parallelism.  Combining them gives us The Parallelism Principle (Guideline 13.1), <em>Exploit parallel processing</em>:</p>

<p><em>Processing related pieces of work in parallel typically introduces additional synchronization overheads, and often introduces contention among the parallel streams. Use parallelism to overcome bottlenecks, provided the processing speedup offsets the additional costs introduced</em>.</p> 

<p class="QuoteSource">--<a class="offsite-link-inline" href="http://www.amazon.com/exec/obidos/tg/detail/-/0471162698/002-1562714-5063248?v=glance" >High-Performance Client/Server</a>, Chapter 13: The Parallelism Principle, pp382-383</p>
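<p>The trade-off in the Parallelism Principle can be made concrete with a toy cost model. The formula below is an assumption chosen for illustration, not one from the book: the work divides evenly across the parallel streams, and every stream beyond the first adds a fixed amount of synchronization and contention cost. The function names are hypothetical.</p>

```python
def elapsed(work_seconds, streams, overhead_per_stream):
    """Toy model: work divides evenly; each extra stream adds a fixed cost."""
    return work_seconds / streams + overhead_per_stream * (streams - 1)

def best_stream_count(work_seconds, overhead_per_stream, max_streams):
    """Return the stream count with the lowest modeled elapsed time."""
    return min(range(1, max_streams + 1),
               key=lambda n: elapsed(work_seconds, n, overhead_per_stream))
```

<p>Under this model, 100 seconds of work with 4 seconds of overhead per extra stream is fastest with 5 streams (36 seconds), and adding more streams beyond that makes things slower. If the overhead per stream were 200 seconds, serial execution would win outright, which is the "(only) when the speedup offsets the overhead" half of the principle.</p>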

<h3>3. Don&#8217;t mix transactions with business intelligence</h3>

<blockquote>
<p>Many large systems are OLTP systems that do not have a data export phase inside the application. Therefore, they oftentimes contain vast amounts of business intelligence data. If an OLTP system, for the same number of users/articles/orders, grows over time, then it likely has a data warehouse or data mart struggling to get out. Separating the data onto different databases and/or servers will go a long way in improving performance for both the transactional application and analytic operations.</p>
<p class="QuoteSource">--<a href="http://www.mysql.com/why-mysql/scaleout/scaleout_pitfalls.html" class="offsite-link-inline">Top Five Scale-Out Pitfalls to Avoid</a>, The 12 Days of Scale-Out, Day 6</p>
</blockquote>

<p><strong>HPCL</strong>: How are we to reconcile an immediate need to process large decision-support queries with an ongoing transaction workload that demands high throughput and short response times? The more resources consumed by queries, the fewer remain to process transactions, and the larger the impact of the query workload on transaction throughput.</p>

<p>On the other hand, if we try to maintain overall throughput by artificially restricting the resources allocated to queries, queries take much longer. This alone can cause political problems, unless expectations have been set properly. But there is a potential side effect that is even more damaging. If the DBMS holds any database locks for a long running query, subsets of the transaction workload may have to wait until the query completes, causing erratic transaction response times.</p>

<h4>Inmon’s rule</h4>

<span class="SectionIllustrationInline"><img src="http://www.webperformancematters.com/storage/post-graphics/Bill%20Inmon.JPG" alt="Illustration: Bill Inmon" title="Bill Inmon" /></span>

<p>The usual resolution of this dilemma is to <em>avoid mixing short transactions with long-running queries</em> in the first place. I refer to this as <em>Inmon’s rule</em> because the well-known speaker and author <a href="http://www.inmoncif.com/about/" class="offsite-link-inline">Bill Inmon</a> spent many years during the 1980s evangelizing this concept, making the rounds of the database user groups, talking about the importance of separating <em>operational</em> from <em>informational</em> processing.</p>

<p><em>Don’t mix short transactions with long-running queries. When we have high-performance operational workloads, we should keep them on a separate processor, separate from ad hoc or unknown queries, which may have massive processing requirements.</em> (Guideline 10.20)</p>

<p class="QuoteSource">--<a class="offsite-link-inline" href="http://www.amazon.com/exec/obidos/tg/detail/-/0471162698/002-1562714-5063248?v=glance" >High-Performance Client/Server</a>, Chapter 10: The Locality Principle, pp310-311</p>

<h3>4. Avoid mixing hot and cold data</h3>

<blockquote>
<p>Similar to #3 is mixing hot (frequently-changed) and cold (more static) data, especially when it comes to write activity. Since database writes are more difficult and expensive to scale, it is advisable to keep this type of data away from data that does not change that often. Again, separating the data onto different databases and/or servers can significantly enhance your application's performance.</p>
<p class="QuoteSource">--<a href="http://www.mysql.com/why-mysql/scaleout/scaleout_pitfalls.html" class="offsite-link-inline">Top Five Scale-Out Pitfalls to Avoid</a>, The 12 Days of Scale-Out, Day 6</p>
</blockquote>

<p>This one is interesting. In Chapter 12 on Database Locking, I discussed database hot spots:</p>

<p><strong>HPCL</strong>: Although locking is essential for maintaining data integrity, excessive contention can occur when too many concurrent applications need to lock the same data item, or the same small set of data items. Database designers call this type of locking bottleneck a <em>hot spot</em>. Because several applications are reading and writing to the same portion of a database, it is quite common for deadlocks to be caused by hot spots. A hot spot can arise for one of three reasons:</p>

<ul>
<li><strong>Natural hot spots exist in the business data.</strong> It is very rare for activity to be evenly distributed across all areas of the business--recall the Centering Principle (<em>Guideline 4.9. Think globally, focus locally.  This guideline is related to the <a href="http://en.wikipedia.org/wiki/Pareto_principle" class="offsite-link-inline">Pareto Principle</a>, or 80-20 rule.</em>). Sometimes, thanks to a highly skewed distribution of work, a large percentage of database updates apply to a small fraction of the business data. ... Automating these types of business applications tends to create database hot spots naturally, unless we are careful to design the databases and the associated processing to eliminate them.</li>

<li><strong>The application’s design creates artificial hot spots.</strong> When applications are designed to maintain the current value of a derived statistic like a sequence number, a total, or an average, instant hot spots are created in databases because every instance of a program must read and update the same data item. </li>

<li><strong>Locking protocols against physical data structures create artificial hot spots.</strong> Even though applications are not manipulating the same data, they may well be reading and writing to the same physical data or index pages, which will cause contention if page level locking is being used. Common examples of this type of problem occur when all the rows in a table fit on a small number of pages, or when data is inserted sequentially on the last page of a table based on a time or a sequence number.</li>
</ul>

<p class="QuoteSource">--<a class="offsite-link-inline" href="http://www.amazon.com/exec/obidos/tg/detail/-/0471162698/002-1562714-5063248?v=glance" >High-Performance Client/Server</a>, Chapter 12: Database Locking, pp373-374</p> 

<p>All my recommendations (pp 374-380) involved ways to reduce database contention by <em>spreading out</em> the data elements that comprise the hot spot. In contrast, the MySQL suggestion is to <em>separate</em> hot (frequently-changed) and cold (more static) data altogether, and then to focus on special tuning to improve the performance of the hot data. While I did not propose this approach, I believe it does not conflict with my own advice. As I noted when discussing Inmon's rule, <em>Inmon’s rule is not just for transactions and queries. It can be applied to any mix of workloads that have different performance characteristics.</em> </p>
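<p>To picture what <em>spreading out</em> a hot spot means, consider the running total described in the excerpt above, which every program instance must update. The sketch below is a generic in-memory illustration in Python, not one of the specific techniques from the book and not a database feature: <code>ShardedCounter</code> and its methods are hypothetical names. Instead of one item behind one lock, the total is kept in several slots, each with its own lock, so concurrent writers seldom collide.</p>

```python
import random
import threading

class ShardedCounter:
    """A running total spread over several slots, each with its own lock."""

    def __init__(self, shards=8):
        self._locks = [threading.Lock() for _ in range(shards)]
        self._values = [0] * shards

    def add(self, amount=1):
        # Pick a slot at random so writers rarely contend for the same lock.
        i = random.randrange(len(self._values))
        with self._locks[i]:
            self._values[i] += amount

    def total(self):
        # Readers pay the cost: they must visit every slot.
        return sum(self._values)
```

<p>The design choice is a trade: updates get cheaper and less contended, while reading the total becomes more expensive and, if read while writers are active, only approximately current.</p>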

<h3>5. Don&#8217;t forget the power of memory</h3>

<blockquote>
<p>Data accessed in memory produces infinitely better response times than the same data accessed on disk. Once the most-often referenced/accessed data -- oftentimes called the "working set" -- exceeds the amount of available memory, a system runs the risk of becoming disk-bound, with the end result being poorer performance. When designing a Scale-Out architecture, care must be taken not to exhaust a single server's memory allocations so that it becomes disk bound. Instead, an application's working set must be smartly divided among the servers participating in a scale-out design so that data is always accessible in RAM.</p>
<p class="QuoteSource">--<a href="http://www.mysql.com/why-mysql/scaleout/scaleout_pitfalls.html" class="offsite-link-inline">Top Five Scale-Out Pitfalls to Avoid</a>, The 12 Days of Scale-Out, Day 6</p>
</blockquote>

<h4>Substitute faster devices for slower ones</h4>
	
<p><strong>HPCL</strong>: The table (below) shows the typical hierarchy of computing resources found in an enterprise client/server environment, organized by relative speed with the fastest at the top. To improve an application’s performance, we must look for design changes that will make its resource usage pattern migrate upwards in the hierarchy.</p>

<p>
<table border=0 cellspacing=2 cellpadding=0>
 <tr>
  <td><b>Device Type</b></td>
  <td><b>Typical Service Time &nbsp;</b></td>
  <td><b>Scaled so that 10 ns = 1 second</b></td>
 </tr>

 <tr>
  <td><i>High Speed Processor Buffer</i></td>
  <td>10 nanoseconds</td>
  <td>1 second</td>
 </tr>

 <tr>
  <td><i>Random Access Memory</i></td>
  <td>60 nanoseconds</td>
  <td>~6 seconds</td>
 </tr>

 <tr>
  <td><i>Expanded Memory</i></td>
  <td>25 microseconds</td>
  <td>~1 hour</td>
 </tr>

 <tr>
  <td><i>Solid State Disk Storage</i></td>
  <td>1 millisecond</td>
  <td>~1 day</td>
 </tr>

 <tr>
  <td><i>Cached Disk Storage</i></td>
  <td>10 milliseconds</td>
  <td>~12 days</td>
 </tr>

 <tr>
  <td><i>Magnetic Disk Storage</i></td>
  <td>25 milliseconds</td>
  <td>~4 weeks</td>
 </tr>

 <tr>
  <td><i>Disk via MAN/High Speed LAN Server &nbsp;</i></td>
  <td>27 milliseconds</td>
  <td>~1 month</td>
 </tr>

 <tr>
  <td><i>Disk via Typical LAN Server</i></td>
  <td>35-50 milliseconds</td>
  <td>~6-8 weeks</td>
 </tr>

 <tr>
  <td><i>Disk via Typical WAN Server</i></td>
  <td>1-2 seconds</td>
  <td>~3-6 years</td>
 </tr>

 <tr>
  <td><i>Mountable Disk/Tape Storage</i></td>
  <td>3-15 seconds</td>
  <td>~10-50 years</td>
 </tr>
</table>
&nbsp;
</p>
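<p>The third column of the table is simple arithmetic: every service time is multiplied by the factor that turns 10 nanoseconds into 1 second. A short Python sketch of that conversion follows. The function names are made up for illustration, and the year is taken as 365 days.</p>

```python
SCALE = 1e8   # 1 second divided by 10 nanoseconds

def scaled_seconds(service_time_seconds):
    """Stretch a real service time so that 10 ns becomes 1 second."""
    return service_time_seconds * SCALE

def describe(seconds):
    """Express a duration in the largest convenient unit."""
    units = [("years", 365 * 86400), ("days", 86400),
             ("hours", 3600), ("minutes", 60)]
    for name, size in units:
        if seconds >= size:
            return "%.1f %s" % (seconds / size, name)
    return "%.1f seconds" % seconds
```

<p>For example, <code>describe(scaled_seconds(25e-3))</code> gives about 29 days for a 25 millisecond magnetic disk access, in line with the table's "~4 weeks", and a 1 second WAN access scales to a little over 3 years.</p>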

<h4>Substitute memory and processor cycles for disk I/O</h4> 
	
<p>Database caching or buffering reduces the cost of re-reading frequently reused portions of a database by retaining them in memory, improving responsiveness by trading off memory for processor resources and disk I/O. Of course, the costs of searching for data in cache are all wasted overhead when the data isn’t actually there. This situation is termed a cache miss. The relatively small overhead of cache misses must be weighed against the larger savings we get whenever there is a cache hit. Typically, no matter how large the cache, cache hits consume fewer processor cycles and a lot less time than would the corresponding disk I/O.</p>

<p class="QuoteSource">--<a class="offsite-link-inline" href="http://www.amazon.com/exec/obidos/tg/detail/-/0471162698/002-1562714-5063248?v=glance" >High-Performance Client/Server</a>, Chapter 14: The Trade-off Principle, pp432-433</p> 
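<p>The weighing of cache misses against cache hits can be written as a weighted average. The sketch below is a standard back-of-the-envelope model rather than a formula from the book, and the parameter names are hypothetical: a hit costs one memory access, while a miss pays for the failed cache search plus the disk I/O.</p>

```python
def effective_access_time(hit_ratio, cache_time, lookup_overhead, disk_time):
    """Average time per read when a fraction hit_ratio is served from cache."""
    hit_cost = cache_time
    miss_cost = lookup_overhead + disk_time   # a miss pays for the failed search too
    return hit_ratio * hit_cost + (1.0 - hit_ratio) * miss_cost
```

<p>Plugging in figures from the table above (60 nanoseconds for memory, 25 milliseconds for magnetic disk) and assuming a failed lookup also costs 60 nanoseconds, a 90% hit ratio brings the average read down to roughly 2.5 milliseconds. With a 0% hit ratio the average is slightly worse than having no cache at all, which is the wasted overhead the excerpt mentions.</p>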

<p><strong>HPCL</strong>: Parallel processing can be done on machines with SMP, MPP, or various hybrid architectures ... Regardless of the hardware architecture, the objective is to increase processing power by adding more processors. But depending upon both the workload and how the processors and other devices are interconnected, other hardware constraints, for example, the memory or I/O bus, can prevent the full exploitation of the additional processors. If some other component is the bottleneck, we obtain no benefit from adding more processors--in fact, we may even make the bottleneck worse.</p> 

<p>One solution is to remove the constraint by giving each processor its own memory, and/or its own disk subsystem. This solution, however, complicates system software development. For example, database software must coordinate the contents of in-memory data caches across the different processors if each processor has its own dedicated memory. Removing hardware constraints may simply move the problem to the software, unless the software has been specially written to overcome these problems.</p>

<p class="QuoteSource">--<a class="offsite-link-inline" href="http://www.amazon.com/exec/obidos/tg/detail/-/0471162698/002-1562714-5063248?v=glance" >High-Performance Client/Server</a>, Chapter 13: The Parallelism Principle, pp417-418</p> 

<h3>Your contributions ...</h3>

<p>I've included just a few extracts from my book that came to mind as I was compiling this post. If you know of other informative discussions of the same principles, in a book or on the Web, please post a comment below sharing your knowledge. Of course, questions or observations of your own are also welcome.</p>

<p class="Footnote"><strong>Tags:</strong>
<a href="http://technorati.com/tag/scalability" rel="tag">scalability</a>,
<a href="http://technorati.com/tag/MySQL" rel="tag">MySQL</a>,
<a href="http://technorati.com/tag/database" rel="tag">database</a>,
<a href="http://technorati.com/tag/performance" rel="tag">performance</a>,
<a href="http://technorati.com/tag/performance+wisdom" rel="tag">performance wisdom</a>,
<a href="http://technorati.com/tag/data+mart" rel="tag">Data mart</a>,
<a href="http://technorati.com/tag/Dennis+Shasha" rel="tag">Dennis Shasha</a>,
<a href="http://technorati.com/tag/Connie+Smith" rel="tag">Connie Smith</a>, 
<a href="http://technorati.com/tag/parallelism" rel="tag">Parallelism</a>,
<a href="http://technorati.com/tag/hot+spot" rel="tag">hot spot</a>,
<a href="http://technorati.com/tag/Bill+Inmon" rel="tag">Bill Inmon</a>, 
<a href="http://technorati.com/tag/Pareto+Principle" rel="tag">Pareto Principle</a>, 
<a href="http://technorati.com/tag/80-20+rule" rel="tag">80-20 rule</a>,
<a href="http://technorati.com/tag/working+set" rel="tag">working set</a>,
<a href="http://technorati.com/tag/cache" rel="tag">cache</a>,
<a href="http://technorati.com/tag/response+time" rel="tag">response time</a>,
<a href="http://technorati.com/tag/computer+memory" rel="tag">computer memory</a>,
<a href="http://technorati.com/tag/performance+matters" rel="tag">Performance Matters</a>,
<a href="http://technorati.com/tag/application+performance" rel="tag">application performance</a>
</p> 

]]></description><wfw:commentRss>http://www.webperformancematters.com/journal/rss-comments-entry-1134876.xml</wfw:commentRss></item><item><title>Latency, Bandwidth, and Response Times</title><category>Articles and White Papers</category><category>Blogs and Publications</category><category>Foundations of Performance</category><category>Optimization and Tuning</category><dc:creator>Chris Loosley</dc:creator><pubDate>Tue, 24 Jul 2007 10:00:00 +0000</pubDate><link>http://www.webperformancematters.com/journal/2007/7/24/latency-bandwidth-and-response-times.html</link><guid isPermaLink="false">115864:1113404:1082660</guid><description><![CDATA[<span class="PageIllustration"><img src="http://www.webperformancematters.com/storage/post-graphics/Alberto%20Response%20Time%20101.jpg" alt="Illustration: Web Page Response Time 101" title="Web Page Response Time 101"/></span>

<p><a href="http://www.webperformancematters.com/journal/2007/7/13/latency-bandwidth-and-station-wagons.html" class="PMref"><em>Latency, Bandwidth, and Station Wagons</em></a> focused primarily on the limitations of network bandwidth, and <strong>the time required to transmit massive data volumes</strong>. While that is an interesting topic, and one that produces some surprising results (like the fact that <a href="http://royal.pingdom.com/?p=119" class="offsite-link-inline"><em>FedEx is still faster than the Internet</em></a>), it is not particularly relevant to the subject of <em>Web</em> performance, which depends on <strong>the time required to transmit many small files</strong>.</p>

<p>My post highlighted <a href="http://www.edgeblog.net/2007/its-still-the-latency-stupid/" class="offsite-link-inline">It's Still The Latency, Stupid</a>, by William (Bill) Dougherty in <a href="http://www.edgeblog.net/" class="offsite-link-inline">edgeblog</a>. Bill's title pays homage to a famous 1996 article by <a href="http://www.stuartcheshire.org/" class="offsite-link-inline">Stuart Cheshire</a> about bandwidth and latency in ISP links, <a class="offsite-link-inline" href="http://www.stuartcheshire.org/rants/Latency.html">It's the Latency Stupid</a>.</p>

<p>Over a decade later, Bill points out, Cheshire's writings are still relevant: <em>One concept that continues to elude many IT managers is the impact of latency on network design ... Latency, not bandwidth, is often the key to network speed, or lack thereof.</em> This is especially true when it comes to the download speeds (or response times) of Web pages and Web-based applications. In this post I explain why, with references and examples to support my argument.</p>

<h3>The edgeblog debate</h3>

<p>Supporting Bill's point about the lack of understanding of this fact, <a href="http://www.soulsphere.org/about.html" class="offsite-link-inline">Simon Howard</a> (posting as <em>fragglet</em>) responded with <a href="http://fragglet.livejournal.com/11924.html" class="offsite-link-inline">It's the bandwidth, stupid</a>, in which he disagreed strongly with Bill's post. A lively discussion ensued in the comments on edgeblog, beginning <a href="http://www.edgeblog.net/2007/its-still-the-latency-stupid/#comment-14051" class="offsite-link-inline">here</a>.</p>

<p>Bill then followed his first post with <a href="http://www.edgeblog.net/2007/its-still-the-latency-stupid-pt2/" class="offsite-link-inline">Part 2</a>, which discussed four possible tuning actions to reduce the impacts of network latency:</p>

<ol class="grouptight">
<li>Tweak the host TCP settings</li>
<li>Change the protocol</li>
<li>Move the service closer to the user</li>
<li>Use a network accelerator</li>
</ol>

<p>This post was also <a href="http://fragglet.livejournal.com/12153.html" class="offsite-link-inline">disputed</a> by fragglet, prompting another response from Bill in the edgeblog comments, <a href="http://www.edgeblog.net/2007/its-still-the-latency-stupid-pt2/#comment-14617" class="offsite-link-inline">here</a>. These discussions are an excellent illustration of the kinds of misunderstandings that exist about the role of network latency as a determinant of performance. Perhaps the most revealing exchange is <a href="http://www.edgeblog.net/2007/its-still-the-latency-stupid/#comment-14151" class="offsite-link-inline">this one</a>:</p>

<blockquote>
<p><strong>Fragglet:</strong> You are saying that latency causes network problems and that by improving latency you can improve your network. I assert that this is false. If you have latency problems, they are a symptom of network congestion. If your network is suffering from serious congestion, it probably needs more bandwidth.</p>

<p><strong>Bill:</strong> Wow. It is impressive how someone can miss the point so completely so many times. While network congestion will add to latency, latency is in and of itself a problem. In a network with zero congestion, latency will still be a problem. The problem is distance. More bandwidth cannot improve upon the speed of light. Sorry. This is the whole point of my article. Latency does cause issues unrelated to bandwidth or congestion. Those issues can be reduced with planning.</p>
</blockquote>

<p>Indeed! I will now explain why fragglet is wrong and Bill is right. As promised (at the end of my <a href="http://www.webperformancematters.com/journal/2007/7/13/latency-bandwidth-and-station-wagons.html" class="PMref">station wagons post</a>) I am going to base my explanation on the 2001 article by Alberto Savoia, <a href=" /papers-and-talks/performance-management/Web%20page%20response%20time%20101%20Savoia%20STQE%202001%20.pdf" class="PMfile">Web Page Response Time 101</a> [**], which I <a href="http://www.webperformancematters.com/journal/2007/7/12/four-laws-of-web-site-performance.html" class="PMref">introduced</a> previously. That's because even though his examples are a bit dated (how many people are still using a 28.8Kbps modem today?), Alberto's article provides a concise and readable explanation of the technical principles involved.</p>

<p class="aside">[<strong>** Warning:</strong> even though this reprint is just 6 pages, it's a 2.5MB PDF, so wait until you're on a fast connection].</p>

<h3>The page response time formula</h3>

<div class="SectionIllustrationInline">
<h4>The Complete Formula</h4>

<p><em>R = 2(D+L+C)+(D+C/2)((T-2)/M)
<br />+Dln((T-2)/M+1)+max(8P(1+OHD)/B,
<br />DP/W)/(1-sqrt(L))</em></p>

<p class="group">
B = Min line speed (bits per second)
<br />C = Cc + Cs
<br />Cc = Client processing time (seconds)
<br />Cs = Server processing time (seconds)
<br />D = Round trip delay (seconds)
<br />L = Packet loss (fraction)
<br />M = multiplexing factor
<br />OHD = Overhead (fraction)
<br />P = Payload (bytes)
<br />R = Response Time (seconds)
<br />T = application turns (count)
<br />W = Window size (bytes)</p>

<p class="QuoteSource">©NetForecast Inc.</p>
</div>

<p>The key to this issue is a simple formula for Web page download time. Credit for the original research goes to Peter Sevcik and John Bartlett of <a href="http://www.netforecast.com/" class="offsite-link-inline">NetForecast Inc.</a> Their 2001 research report, <a href="http://www.bcr.com/carriers/internet_infrastructure/understand_web_performance_20011019390.htm" class="offsite-link-inline">Understanding Web Performance</a>, was published in Business Communications Review (<a href="http://www.bcr.com/" class="offsite-link-inline">BCR</a>) in October 2001, and can also be downloaded as a <a href="http://www.netforecast.com/Reports/NFR%205055%20Understanding%20Web%20Performance.pdf" class="offsite-link-inline">PDF</a>. It explains Web page response time using "The Complete Formula" shown here.</p>
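<p>For readers who prefer code to algebra, here is a minimal Python sketch that simply transcribes the Complete Formula as it is printed in the box. The function name, the intermediate variable names, and the sample values are my own assumptions for illustration; they do not come from the NetForecast report.</p>

<pre>
```python
import math


def complete_response_time(B, Cc, Cs, D, L, M, OHD, P, T, W):
    """Evaluate "The Complete Formula" term by term, as printed in the box.

    B: minimum line speed (bits/s), Cc and Cs: client and server
    processing time (s), D: round-trip delay (s), L: packet loss
    (fraction), M: multiplexing factor, OHD: overhead (fraction),
    P: payload (bytes), T: application turns, W: window size (bytes).
    Returns R, the response time in seconds.
    """
    C = Cc + Cs
    turns_per_stream = (T - 2) / M  # the (T-2)/M group used twice below

    term1 = 2 * (D + L + C)
    term2 = (D + C / 2) * turns_per_stream
    term3 = D * math.log(turns_per_stream + 1)  # natural log, "ln" in the box
    term4 = max(8 * P * (1 + OHD) / B, D * P / W) / (1 - math.sqrt(L))
    return term1 + term2 + term3 + term4


# Hypothetical inputs, chosen only to exercise the function.
r = complete_response_time(
    B=1_500_000, Cc=0.5, Cs=0.5, D=0.1, L=0.01,
    M=2, OHD=0.5, P=300_000, T=60, W=65_535,
)
print(f"Estimated response time: {r:.2f} s")
```
</pre>

<p>Writing the formula out this way makes it easy to see that the round-trip delay D appears in every term except the bandwidth part of the last one.</p>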

<p>Alberto's contribution was in simplifying this formula to the version shown in his article's Figure 1, reproduced below. He notes that:</p>

<p><em>This formula makes several generalizations and assumptions, and its accuracy varies through the possible range of values (it tends to overestimate below eight seconds and underestimate over eight seconds).</em> </p>

<p>Actually, even that explanation hides some further assumptions. His reference to eight seconds involves assumptions about typical connection latency and bandwidth, typical Web page sizes, and the typical number of elements (separately downloadable files) that make up a Web page. </p>

<p><span class="SectionIllustrationInline"><img src="http://www.webperformancematters.com/storage/post-graphics/Response%20Time%20Figure%201%20by%20Savoia.JPG" alt="Illustration: The Response Time Formula by Alberto Savoia" title="The Response Time Formula by Alberto Savoia"/></span>
</p>

<p class="clearLeft">But, as he says, <em>for the purposes of this article, however, it will do just fine, since it introduces the key variables that impact page response time and shows you how they relate to each other, without introducing excessive complexity</em>. Alberto describes the six key variables as follows:</p>

<blockquote>
<h4>The six parameters of Web response time</h4>
<ul class="group">
<li><strong>Page size</strong>: Page size is measured in Kbytes, and on the surface, the impact of this variable is pretty obvious: the larger the page, the longer it takes to download. When estimating page size, however, many people fail to consider all the
components that contribute to page size—all images, Java and other applets, banners from third sources, etc.—so make sure you don’t overlook anything.</li>

<li><strong>Minimum bandwidth</strong>: Minimum bandwidth is defined as the bandwidth of the <em>smallest pipe</em> between your content and the end user. Just as the strength of a chain is determined by its weakest link, the effective bandwidth between two end points is determined by the smallest bandwidth between them. Typically the limiting bandwidth
is between the users and their ISPs.</li>

<li><strong>Round trip time</strong>: In the context of Web page response time, round-trip time (RTT) indicates the latency, or time lag, between the sending of a request from the user’s browser to the Web server and the receipt of the first few bytes of data from the Web server to the user’s computer. RTT is important because every request/response pair (even for a trivially small file) has to pay this minimum performance penalty. As we shall see in the next section, the typical Web page requires several request/response cycles.</li>

<li><strong>Turns</strong>: A typical Web page consists of a <em>base page</em> [or index page] and several additional objects such as graphics or applets. These objects are not transmitted along with the base page; instead, the base page HTML contains instructions for locating and fetching them. Unfortunately for end-user performance, fetching each of these objects requires a fair number of additional communication cycles between the user’s system and the Web site server—each of which is subject to the RTT delay I just mentioned.</li>

<li><strong>Server processing time</strong>: The last factor in the response time formula is the processing time required by the server and the client to <em>put together</em> [i.e. generate and render] the required page so it can be viewed by the requester. This can vary dramatically for different types of Web pages. On the server side, pages with static content require minimal processing time and will cause negligible additional delay. Dynamically created pages (e.g., personalized home pages like my.yahoo.com) require a bit more server effort and computing time, and will introduce some delay. Finally, pages that involve complex transactions (e.g., credit card verification) may require very significant processing time and might introduce delays of several seconds.</li>

<li><strong>Client processing time</strong>: On the client side, the processing time may be trivial (for a basic text-only page) to moderate (for a page with complex forms and tables) to extreme. If the page contains a Java applet, for example, the client’s browser will have to load and run the Java interpreter, which can take several seconds.</li>
</ul>
<p class="QuoteSource">--Alberto Savoia, Web Page Response Time 101, STQE, July/August 2001<br />[Minor clarifications added]</p>
</blockquote>

<p>There is no question that these six variables do account for Web page response times, and do operate in the directions prescribed in the formula. As one confirmation, consider the authoritative textbook on the quantitative aspects of computing, <a href="http://books.elsevier.com/us//mk/us/subindex.asp?maintarget=&isbn=9781558605961" class="offsite-link-inline">Computer Architecture</a> by John L. Hennessy and David A. Patterson ("H&P"). In the 3rd Edition (2003), Chapter 8 covers networks, and page 798 gives this simple formula for <em>the total latency of a message</em>: </p>

<div class="InlineTextBox">
<p class="code">Total latency = <br />Sender overhead + Time of flight + Message size/Bandwidth + Receiver overhead</p>
</div>

<p>The Web uses TCP, which transmits data as message segments comprising one or more packets, with each segment being acknowledged. So we can view a Web page download as a succession of H&P's message transmissions. Because each TCP segment is followed by an acknowledgment, H&P's <em>time of flight</em> variable corresponds to Alberto's <em>round-trip time</em>. And so summing N instances of H&P's formula produces Alberto's formula, with <em>Turns</em> having a value of N. </p>
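<p>To make that summation concrete, here is a small Python sketch. It assumes every message sees the same time of flight, bandwidth, and overheads; the function names and the sample sizes are hypothetical, chosen only for illustration.</p>

<pre>
```python
def message_latency(sender_overhead, time_of_flight, message_bytes,
                    bandwidth_bytes_per_sec, receiver_overhead):
    """H&P's total latency of a single message, in seconds."""
    return (sender_overhead + time_of_flight
            + message_bytes / bandwidth_bytes_per_sec + receiver_overhead)


def page_latency(message_sizes, time_of_flight, bandwidth_bytes_per_sec,
                 sender_overhead=0.0, receiver_overhead=0.0):
    """Sum H&P's formula over N messages sent one after another."""
    return sum(
        message_latency(sender_overhead, time_of_flight, size,
                        bandwidth_bytes_per_sec, receiver_overhead)
        for size in message_sizes
    )


# Hypothetical page: three messages of 1000, 2000 and 3000 bytes.
sizes = [1000, 2000, 3000]
total = page_latency(sizes, time_of_flight=0.1, bandwidth_bytes_per_sec=6000)

# The same total, regrouped as (N x time of flight) + (payload / bandwidth).
regrouped = len(sizes) * 0.1 + sum(sizes) / 6000
print(f"summed: {total:.3f} s, regrouped: {regrouped:.3f} s")
```
</pre>

<p>The regrouped form is the point: the sum splits into a latency term that grows with the number of messages and a transfer term that depends only on total payload and bandwidth.</p>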

<p>This demonstrates that the formula is correct, but if we wanted to use it to <em>predict</em> page download time, the task of selecting the right values to plug into those variables involves some complications.</p>

<h3>Turn, turn, turn ...</h3>

<p>Alberto does explain how to use the formula. But in practice, I would expect those directions to produce a worst-case estimate for response time, mainly because the value of the <em>Turns</em> variable should probably be lower than Alberto's method suggests. Three factors affect the estimation of turn counts: </p>
<ol class="group">
<li>Alberto assumes HTTP 1.0, but HTTP 1.1 is now the predominant version of the protocol. HTTP 1.1 implemented persistent TCP connections, so that browsers do not repeatedly close and reopen TCP connections with a server, and (for large files) TCP segments will be larger, because of the way TCP <a href="http://en.wikipedia.org/wiki/Slow-start" class="offsite-link-inline">slow-start</a> operates. So most browsers will now download a Web page using fewer turns.</li> 
<li>Browsers can open up to two <em>parallel</em> connections for each distinct server domain, removing some turns from the synchronous response time path.</li>
<li>Most users of most Web pages already have some page elements in their browser caches, eliminating the turns that would be needed to fetch them. On the other hand, recent research at Yahoo suggests that browser caching benefits may be less than imagined -- see Performance Research, Part 2: <a href="http://yuiblog.com/blog/2007/01/04/performance-research-part-2/" class="offsite-link-inline">Browser Cache Usage - Exposed!</a> by Tenni Theurer.</li> 
</ol>
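<p>As a very rough illustration of how the second and third factors might lower the turn count, here is a toy Python estimate. The model itself is my assumption, not Alberto's or NetForecast's method: it treats cached elements as costing no turns and assumes the remaining turns divide evenly across the parallel connections, which real pages rarely do. The function and parameter names are hypothetical.</p>

<pre>
```python
import math


def estimated_serial_turns(total_turns, cached_fraction=0.0,
                           parallel_connections=2):
    """Toy estimate of the turns left on the synchronous response path.

    Assumes cached elements need no turns at all, and that the
    remaining turns spread evenly over the parallel connections.
    """
    uncached_turns = total_turns * (1.0 - cached_fraction)
    return math.ceil(uncached_turns / parallel_connections)


# Hypothetical page needing 60 turns with an empty cache.
print(estimated_serial_turns(60))                       # empty cache
print(estimated_serial_turns(60, cached_fraction=0.5))  # half the elements cached
```
</pre>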

<p>Fortunately, such complications do not prevent us from using the formula to explain the relative importance of bandwidth and latency. </p>

<h3>Implications of the formula</h3>

<p>To demonstrate this, consider a newer version of the formula shown in Figure 2 below. This comes from a September 2006 NetForecast report, <a href="http://www.netforcast.com/Reports/NFR5085%20Field%20Guide%20to%20Application%20Delivery%20Systems.pdf" class="offsite-link-inline">Field Guide to Application Delivery Systems</a>. The only difference between this and Alberto's version is in its use of curly equals, signifying "is approximately equal to": </p>

<div class="SectionIllustrationInline"><img src="http://www.webperformancematters.com/storage/post-graphics/Response%20Time%20Formula%20by%20NF%201.JPG" alt="Response Time Formula by NetForecast" title="Response Time Formula by NetForecast"/>
<p class="caption"><strong>Figure 2. Response Time Formula by NetForecast</strong></p>
</div>

<p>The NetForecast paper goes on to discuss how each of the six factors affects response times, using Figure 3 (below) to summarize its points.</p>

<div class="SectionIllustrationInline"><img src="http://www.webperformancematters.com/storage/post-graphics/Response%20Time%20Formula%20by%20NF%202.JPG" alt="Response Time Causes and Effects, by NetForecast" title="Response Time Causes and Effects, by NetForecast"/>
<p class="caption"><strong>Figure 3. Response Time Causes and Effects</strong></p>
</div>

<p>In addition to the issues highlighted by NetForecast, we can evaluate the relative contributions of Bandwidth and Latency to overall response time by simply comparing the two factors [Payload/Bandwidth] and [Turns x RTT] in typical Web environments. </p>

<ul>
<li><strong>[Payload/Bandwidth]:</strong> Alberto's paper focuses on dial-up performance, using the example of effective bandwidth being 4Kbytes/sec. In this situation, a 120K page takes about 30 seconds to transfer, before we even consider any latency effects. But as bandwidth increases, this factor becomes progressively smaller. A broadband connection of 1.5Mbps can transfer about 150Kbytes/sec, reducing this factor to 0.8 secs for a 120K Web page, or 2 secs for a 300K page. </li>

<li><strong>[Turns x RTT]:</strong> According to NetForecast, the average Keynote Business 40 home page requires 60 turns and 300 kilobytes to load. If round-trip times (or "ping times") are in the range of 100-200ms (typical for many consumers), 60 turns will add 6-12 secs of network latency. Ping times will certainly be faster for broadband connections than they are for dial-up, because of all the extra analog-digital conversion delays imposed by the dial-up technology. But at a certain point, the speed of light and the latencies of the carriers' network devices (hubs, routers) limit further improvement.</li>
</ul>
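<p>The arithmetic behind those two bullets is easy to check. This short Python sketch recomputes the figures quoted above; the function names are mine, and the units (Kbytes, Kbytes/sec, milliseconds) are chosen to match the text.</p>

<pre>
```python
def transfer_seconds(payload_kbytes, bandwidth_kbytes_per_sec):
    """The [Payload/Bandwidth] factor, in seconds."""
    return payload_kbytes / bandwidth_kbytes_per_sec


def latency_seconds(turns, rtt_ms):
    """The [Turns x RTT] factor, in seconds."""
    return turns * rtt_ms / 1000


# [Payload/Bandwidth]
print(f"120K over dial-up (4 KB/s):     {transfer_seconds(120, 4):.1f} s")
print(f"120K over broadband (150 KB/s): {transfer_seconds(120, 150):.1f} s")
print(f"300K over broadband (150 KB/s): {transfer_seconds(300, 150):.1f} s")

# [Turns x RTT]
print(f"60 turns at 100 ms RTT: {latency_seconds(60, 100):.1f} s")
print(f"60 turns at 200 ms RTT: {latency_seconds(60, 200):.1f} s")
```
</pre>

<p>On the broadband connection, the latency factor for a 60-turn page is already three to six times the 2-second transfer factor for 300K.</p>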

<h3>The bottom line ...</h3>

<p>In the analysis above, I showed that as connection bandwidth increases, the effect of the first factor in the response time formula approaches zero. No such scaling effect exists for the second factor -- neither latency nor turns can ever be made to approach zero. In fact, as Web sites and applications keep growing in sophistication, <strong>turns are increasing</strong>. The NetForecast paper states that over the past decade turn counts for the Keynote Business 40 Web sites have grown 12% per year and payload has grown 20% per year. </p>
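<p>Here is a minimal Python sketch of that scaling argument. It holds the page fixed at 300 Kbytes and 60 turns with a 100ms round trip, and varies only the bandwidth. The bandwidth values in the loop are hypothetical, picked to span dial-up to very fast links, and the function names are my own.</p>

<pre>
```python
def response_terms(payload_kbytes, bandwidth_kbytes_per_sec, turns, rtt_ms):
    """Return the (transfer, latency) factors of the formula, in seconds."""
    transfer = payload_kbytes / bandwidth_kbytes_per_sec
    latency = turns * rtt_ms / 1000
    return transfer, latency


def compound_growth(rate_per_year, years):
    """Growth multiple after compounding a yearly rate."""
    return (1 + rate_per_year) ** years


# Same page and same latency throughout; only the bandwidth changes.
for bandwidth in (4, 150, 1500, 15000):
    transfer, latency = response_terms(300, bandwidth, 60, 100)
    print(f"{bandwidth:6d} KB/s: transfer {transfer:6.2f} s, "
          f"latency {latency:.1f} s")

# NetForecast's yearly growth rates, compounded over a decade.
print(f"turns:   x{compound_growth(0.12, 10):.1f}")
print(f"payload: x{compound_growth(0.20, 10):.1f}")
```
</pre>

<p>The transfer column shrinks by a factor of ten with each tenfold bandwidth increase, while the latency column never moves. Compounded over ten years, the growth rates quoted above work out to roughly three times as many turns and six times as much payload.</p>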

<p>Even the newest AJAX technology may increase, not reduce, turns, as designers attempt to replace large monolithic file downloads with multiple smaller requests. Unless those smaller requests can also be designed to happen asynchronously, while the user is doing something else, they will only add to the delay due to network latency. </p>

<p>This is why I agree with Bill Dougherty that <strong>It's Still The Latency, Stupid!</strong> And until someone finds a way to move bits faster than the speed of light, that's not going to change.</p> 

<p class="Footnote"><strong>Tags:</strong>
<a href="http://technorati.com/tag/William+Dougherty" rel="tag">William Dougherty</a>,
<a href="http://technorati.com/tag/edgeblog" rel="tag">edgeblog</a>,
<a href="http://technorati.com/tag/Simon+Howard" rel="tag">Simon Howard</a>,
<a href="http://technorati.com/tag/fragglet" rel="tag">fragglet</a>,
<a href="http://technorati.com/tag/performance" rel="tag">performance</a>,
<a href="http://technorati.com/tag/bandwidth" rel="tag">bandwidth</a>,
<a href="http://technorati.com/tag/throughput" rel="tag">throughput</a>, 
<a href="http://technorati.com/tag/latency" rel="tag">latency</a>,
<a href="http://technorati.com/tag/ping+time" rel="tag">ping time</a>,
<a href="http://technorati.com/tag/round-trip+time" rel="tag">round-trip time</a>,
<a href="http://technorati.com/tag/Stuart+Cheshire" rel="tag">Stuart Cheshire</a>,
<a href="http://technorati.com/tag/Web+performance" rel="tag">Web performance</a>, 
<a href="http://technorati.com/tag/Web+application" rel="tag">Web application</a>,
<a href="http://technorati.com/tag/download+time" rel="tag">download time</a>,
<a href="http://technorati.com/tag/Alberto+Savoia" rel="tag">Alberto Savoia</a>,
<a href="http://technorati.com/tag/Peter+Sevcik" rel="tag">Peter Sevcik</a>,
<a href="http://technorati.com/tag/John+Bartlett" rel="tag">John Bartlett</a>,
<a href="http://technorati.com/tag/NetForecast" rel="tag">NetForecast</a>,
<a href="http://technorati.com/tag/Fedex" rel="tag">Fedex</a>,
<a href="http://technorati.com/tag/Roral+Pingdom" rel="tag">Royal Pingdom</a>,  
<a href="http://technorati.com/tag/station+wagon" rel="tag">station wagon</a>,
<a href="http://technorati.com/tag/performance+matters" rel="tag">Performance Matters</a>
</p>
]]></description><wfw:commentRss>http://www.webperformancematters.com/journal/rss-comments-entry-1082660.xml</wfw:commentRss></item><item><title>Java Performance Optimization</title><category>Book Reviews</category><category>Foundations of Performance</category><category>Optimization and Tuning</category><category>Performance Management</category><category>Software Engineering</category><dc:creator>Chris Loosley</dc:creator><pubDate>Fri, 20 Jul 2007 05:55:00 +0000</pubDate><link>http://www.webperformancematters.com/journal/2007/7/20/java-performance-optimization.html</link><guid isPermaLink="false">115864:1113404:1157555</guid><description><![CDATA[<span class="PageIllustration"><img src="http://www.webperformancematters.com/storage/post-graphics/Java%20Performance%20Management%20and%20Optimization%20cover.JPG" alt="Illustration: Pro Java EE5 Performance Management and Optimization (cover)" title="Pro Java EE5 Performance Management and Optimization"/></span>

<p>Do you subscribe to email newsletters? If you're like me, you get lots of them. New ones appear in my inbox every morning. They pile up, demanding to be read. In fact, they seem to breed like rabbits, producing new offspring -- when did I express an interest in <em>Enterprise VOIP Security Architecture</em> issues?  Sometimes in a housekeeping splurge I delete a few dozen at once, suffering a momentary twinge of anxiety at having perhaps missed something important. So usually I skim them before hitting the delete button. </p>

<p>TechTarget's <a href="http://searchsoftwarequality.techtarget.com/" class="offsite-link-inline">Search Software Quality</a> service seems to be especially prolific, but is also a regular source of interesting references -- like <a href="http://www.theserverside.com/tt/knowledgecenter/" class="offsite-link-inline">TheServerSide.com</a>, the subject of a recent note. According to the site's home page:</p>

<blockquote>
<h4>Java Performance Management for Large-Scale Systems</h4>
<p>There are many classes of enterprise applications that have stringent performance and scalability requirements. TheServerSide.com has assembled a collection of resources to help you better design, develop, test and manage high performance, large-scale systems - learn new and innovative approaches for performance tuning, memory management, concurrent programming, JVM clustering and more.</p>
</blockquote>

<h3>Pro Java EE 5 Performance Management and Optimization</h3>

<p>One of the articles listed on <em>TheServerSide.com</em> was a <a href="http://www.theserverside.com/tt/knowledgecenter/knowledgecenter.tss?l=ProJavaEE_Ch06" class="offsite-link-inline">book review</a> of <em>Pro Java EE 5 Performance Management and Optimization</em> by Steven Haines:</p>

<blockquote>
<p><em>Pro Java EE 5 Performance Management and Optimization</em> features proven methodology to guarantee top-performing Java EE 5 applications and explains how to measure performance in your specific environment. The book details performance integration points throughout the development and deployment lifecycles. For QA and preproduction stages, author Steven Haines guides the reader through testing and deploying Java EE 5 applications with a focus on assessing capacity and discovering saturation points. Haines also defines the concept and application of wait-based tuning.</p>

<p>In addition, the book explains assessing and improving the health of applications upon deployment. The topics covered include trending, forecasting, and capacity assessing and planning. Haines also walks through the creation of a formal Java EE 5 Performance Management Plan, customized to an environment to help interpret and react to changing trends in usage patterns.</p>
<p class="QuoteSource">Published by <a href="http://apress.com/book/bookDisplay.html?bID=10073">Apress</a>, May 2006. ISBN: 1-59059-610-2</p>
</blockquote>

<p>The review includes the option to download the full text of two chapters:</p>

<blockquote>
<h4>Chapter 6: Performance Tuning Methodology</h4>
<p>... focuses on setting up a proper testing environment and explores the concept of wait-based tuning. Haines explains the steps necessary to implement a formal performance tuning methodology and guides the reader through a complete tuning example -- [<a href="http://www.theserverside.com/tt/articles/content/ProJavaEE/HainesChapter6.pdf" class="offsite-link-inline">pdf</a>].</p>

<h4>Chapter 9: Performance and Scalability Testing</h4>
<p>... discusses the difference between the concepts of performance and scalability, and outlines the strategy of ensuring performance before testing for scalability. Haines also leads a detailed exploration into the ultimate scalability test - the capacity assessment - and explains how to assemble a formal Capacity Assessment Report -- [<a href="http://www.theserverside.com/tt/articles/content/ProJavaEE/HainesChapter9.pdf" class="offsite-link-inline">pdf</a>].</p>
</blockquote>

<h3>Steven Haines</h3>

<p><a href="http://www.quest.com/newsroom/Steven-Haines.aspx" class="offsite-link-inline">Steven Haines</a> appears to be <a href="http://www.informit.com/guides/guide.asp?g=java&rl=1" class="offsite-link-inline">well qualified</a> to write on this subject. If you'd like more background, there's an <a href="http://www.javaperformancetuning.com/news/interview033.shtml" class="offsite-link-inline">interview</a> on the <a href="http://www.javaperformancetuning.com/index.shtml" class="offsite-link-inline">JavaPerformanceTuning.com</a> site. And here's a publisher's bio:</p>

<blockquote>
<p>Steven Haines is the author of three Java books: <i>The Java Reference Guide</i> (InformIT/Pearson, 2005), <i>Java 2 Primer Plus</i> (SAMS, 2002), and <i>Java 2 From Scratch</i> (QUE, 1999). In addition to contributing chapters and coauthoring other books, as well as technically editing software publications, he is the Java Host on InformIT.com. As an educator, he has taught all aspects of Java at Learning Tree University as well as at the University of California, Irvine. By day he works as a Java EE 5 Performance Architect at Quest Software, defining performance tuning and monitoring software as well as managing and performing Java EE 5 performance tuning engagements for large-scale Java EE 5 deployments, including those of several Fortune 500 companies.</p>
</blockquote>

<p>Also, SAMS Publishing has a three-part article he wrote in 2003 about <em>J2EE performance tuning</em> -- <a href="http://www.samspublishing.com/articles/article.asp?p=31274" class="offsite-link-inline">part 1</a>, <a href="http://www.samspublishing.com/articles/article.asp?p=31353" class="offsite-link-inline">part 2</a>, and <a href="http://www.samspublishing.com/articles/article.asp?p=31441" class="offsite-link-inline">part 3</a>.</p>

<h3>Other excerpts ...</h3>

<p>For more samples of the book's content, an article published on the <a href="http://www.javaworld.com/javaworld/jw-06-2006/jw-0619-tuning.html" class="offsite-link-inline">JavaWorld</a> site seems to contain most or all of Chapter 14, <em>Solving Common Java EE Performance Problems</em>. While Chapter 6 addresses the principles of performance tuning, this long article (13 parts) goes into a lot more technical detail. It concludes:</p>

<blockquote>
<h4>Solving common Java EE performance problems</h4>
<p>While each application and each environment is different, a common set of issues tends to plague most environments. This article focused not on application code issues, but on the following environmental issues that can manifest poor performance:</p>
<ul class="grouptight">
<li>Out-of-memory errors</li>
<li>Thread pool sizes</li>
<li>JDBC connection pool sizes</li>
<li>JDBC prepared statement cache sizes</li>
<li>Cache sizes</li>
<li>Pool sizes</li>
<li>Excessive transaction rollbacks</li>
</ul>
<p>In order to effectively diagnose performance problems, you need to understand how problem symptoms map the root cause of the underlying problem. If you can triage the problem to application code, then you need to forward the problem to the application support delegate, but if the problem is in the environment, then resolving it is within your control.</p>
<p class="QuoteSource">--Steven Haines, <a href="http://www.javaworld.com/javaworld/jw-06-2006/jw-0619-tuning.html" class="offsite-link-inline">JavaWorld.com</a>, June 19, 2006</p>
</blockquote>

<p>Finally, the <a href="http://books.google.com/books?id=YC3rEo8ze8oC&pg=PP1&ots=Dn6kwiaIU0&dq=Pro+Java+EE+5+Performance+Management+and+Optimization" class="offsite-link-inline">Google Books</a> site also contains about 25 shorter excerpts, each 3 or 4 pages, that will give you a really good picture of what the book covers and how Steven approaches his subject matter.</p>

<p>I'd be rather surprised if, after browsing all these informative and well-written samples, you don't decide that you need to order a complete copy for yourself. I have!</p>


<p class="Footnote"><strong>Tags:</strong>
<a href="http://technorati.com/tag/Steven+Haines" rel="tag">Steven Haines</a>,
<a href="http://technorati.com/tag/Quest" rel="tag">Quest</a>,
<a href="http://technorati.com/tag/Apress" rel="tag">Apress</a>,
<a href="http://technorati.com/tag/TechTarget" rel="tag">TechTarget</a>,
<a href="http://technorati.com/tag/Java" rel="tag">Java</a>, 
<a href="http://technorati.com/tag/JavaWorld" rel="tag">JavaWorld</a>,
<a href="http://technorati.com/tag/TheServerSide" rel="tag">TheServerSide</a>,
<br />
<a href="http://technorati.com/tag/performance+tuning" rel="tag">performance tuning</a>,
<a href="http://technorati.com/tag/performance" rel="tag">performance</a>,
<a href="http://technorati.com/tag/tuning" rel="tag">tuning</a>,
<a href="http://technorati.com/tag/scalability" rel="tag">scalability</a>,
 <a href="http://technorati.com/tag/performance+matters" rel="tag">Performance matters</a>,
<a href="http://technorati.com/tag/Web+performance" rel="tag">Web performance</a> 
</p>]]></description><wfw:commentRss>http://www.webperformancematters.com/journal/rss-comments-entry-1157555.xml</wfw:commentRss></item><item><title>Latency, Bandwidth, and Station Wagons</title><category>Blogs and Publications</category><category>Foundations of Performance</category><category>Slowness</category><dc:creator>Chris Loosley</dc:creator><pubDate>Fri, 13 Jul 2007 19:05:00 +0000</pubDate><link>http://www.webperformancematters.com/journal/2007/7/13/latency-bandwidth-and-station-wagons.html</link><guid isPermaLink="false">115864:1113404:1146030</guid><description><![CDATA[<span class="PageIllustration"><img src="http://www.webperformancematters.com/storage/post-graphics/Station%20Wagon.jpg" alt="Illustration: Station Wagon" title="Station Wagon"/></span>

<p><em>One concept that continues to elude many IT managers is the impact of latency on network design. 11 years ago, Stuart Cheshire wrote a detailed analysis of the difference between bandwidth and latency in ISP links [<a class="offsite-link-inline" href="http://www.stuartcheshire.org/rants/Latency.html">It's the Latency Stupid</a>]. Over a decade later, his writings are still relevant. Latency, not bandwidth, is often the key to network speed, or lack thereof.</em></p>

<p class="clearLeft">That's from <a href="http://www.edgeblog.net/2007/its-still-the-latency-stupid/" class="offsite-link-inline">It's Still The Latency, Stupid</a> by <a href="http://www.blogforayear.com/profiles/william-dougherty" class="offsite-link-inline">William (Bill) Dougherty</a>, writing in edgeblog on May 31, 2007. Bill follows that opening paragraph with a very readable explanation of the vital importance of latency (round-trip time) as a factor affecting performance in TCP networking. He uses what he calls the <em>Sandbag Problem</em> to illustrate his points:</p>

<blockquote>
<h4>Shifting a Heap of Sand</h4>
<p>Let’s say the two of us are trying to fill sandbags. My job is to scoop sand into a container and hand the full container to you (data). Your job is to empty the container into a sandbag and hand the empty container (ACK) back to me. Occasionally you drop the container so I have to fill it again (Retransmit). If we were standing next to each other, the time it takes for me to hand the container to you, have you empty it, and hand it back to me (latency) would be very small. Now imagine there is a 6′ wall between us, and I need to hand the container over to you.</p>

<p>The wall changes several aspects of our filling operation. First, the size of the container must be smaller because I cannot lift the same weight over my head that I can lift at waist level. Second, the time to complete one cycle would increase because it takes longer to lift the container 6′ than it does 3′. Third, you would drop more containers so retransmissions would increase. As the wall gets taller, the problem gets worse. If the wall were 10′ tall, we would be throwing containers instead of lifting them, so they would need to be even smaller. The containers would be traveling 20′ round trip instead of 12′ so the delay would increase 75%. And we would need to send a lot more containers to move the same amount of sand.</p>
<p class="QuoteSource">--William (Bill) Dougherty, <a href="http://www.edgeblog.net/2007/its-still-the-latency-stupid/" class="offsite-link-inline">It's Still The Latency, Stupid</a>, edgeblog, May 31, 2007</p>
</blockquote>
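
<p>To put rough numbers on Bill's wall: a single TCP connection can have at most one window of unacknowledged data in flight per round trip, so its throughput is capped at window size divided by round-trip time. The sketch below is only an illustration and is not taken from Bill's post; the 64 KB window and the sample round-trip times are assumed values, chosen to show how quickly the ceiling drops as latency grows.</p>

```python
# Window-limited ceiling on single-connection TCP throughput:
# at most one window of data can be delivered per round trip.
# The window size and round-trip times below are assumed, illustrative values.

def max_throughput_mbps(window_bytes, rtt_ms):
    """Return the throughput ceiling in megabits per second."""
    bits_per_round_trip = window_bytes * 8
    round_trips_per_second = 1000.0 / rtt_ms
    return bits_per_round_trip * round_trips_per_second / 1_000_000

WINDOW_BYTES = 64 * 1024  # assumed: a 64 KB window

for rtt_ms in (1, 10, 50, 100, 200):  # assumed sample round-trip times
    ceiling = max_throughput_mbps(WINDOW_BYTES, rtt_ms)
    print(f"RTT {rtt_ms:3d} ms: at most {ceiling:7.2f} Mbps")
```

<p>Under these assumptions the ceiling falls in direct proportion to the round-trip time, no matter how much bandwidth the link itself offers.</p>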

<p>This is a topic that anyone who cares about the performance of Web-based applications needs to understand, because it is the key to most performance optimization and tuning initiatives. And since both Bill's post and the ensuing discussion in the readers' comments are very educational, I decided immediately to write something linking to it. But because I wanted to tie in some other ideas, and discuss <a class="offsite-link-inline" href="http://www.stickyminds.com/sitewide.asp?ObjectId=5030&amp;Function=edetail">Web Page Response Time 101</a>,  a paper written in 2001 by Alberto Savoia (see <a href="http://www.webperformancematters.com/journal/2007/7/12/four-laws-of-web-site-performance.html" class="PMref">Four Laws of Web Site Performance</a>), I saved a reminder in a draft post until I had time to focus on it. That was a month ago.</p>

<p>Well, once you drop an idea in your mental in-tray, it starts to search for its proper place in the files. Connections pop up all over the place. Sure enough, once I had made a mental note to write about that edgeblog post, I began to notice other blog posts on the same topic. A couple of weeks later, I saw a post about <a href="http://dsoguy.blogspot.com/2007/06/latency-v-throughput.html" class="offsite-link-inline">Latency vs. Throughput</a> by Steve Harris in his <em>DSO Guy</em> blog. Instead of people moving sandbags, Steve uses trains and planes shipping coal supplies in his example:</p>

<blockquote>
<h4>Shipping a Load of Coal</h4>
<p>Imagine you have to move a bunch of coal across the country and deliver it to a coal processor. Now say that on the west coast, the receiver of the coal can process 100 units of coal an hour. You have 1 train that can haul 10,000 units of coal and takes 48 hours to get to its destination. You have 1 plane that can deliver 100 units of coal in 12 hours.</p>

<p>If the most important thing was to have the coal soon, then the plane is faster (lower latency). But, if the most important thing is to have the coal-processing pipeline filled on the west coast over time then train is faster (higher throughput).</p>
<p class="QuoteSource">--Steve Harris, <a href="http://dsoguy.blogspot.com/2007/06/latency-v-throughput.html" class="offsite-link-inline">Latency vs. Throughput</a> in DSO Guy, June 14, 2007</p>
</blockquote>
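
<p>Steve's figures are easy to check. The sketch below uses only the numbers in his example; the one added assumption is that each carrier makes back-to-back trips and that return legs are ignored, which keeps the arithmetic simple but is not something Steve states.</p>

```python
# Latency versus throughput for the two carriers in Steve's example.
# Assumption (not in the quote): trips run back to back, return legs ignored.

PROCESSOR_RATE = 100  # units of coal the receiver can process per hour

def throughput(units_per_trip, hours_per_trip):
    """Average delivery rate in units of coal per hour."""
    return units_per_trip / hours_per_trip

carriers = {
    "train": (10_000, 48),  # (units per trip, hours per trip)
    "plane": (100, 12),
}

for name, (units, hours) in carriers.items():
    rate = throughput(units, hours)
    # The processor can never run faster than coal arrives.
    sustained = min(rate, PROCESSOR_RATE)
    print(f"{name}: first coal after {hours} h, "
          f"delivers {rate:.1f} units/h, "
          f"keeps the processor at {sustained:.1f} units/h")
```

<p>The plane wins on latency (12 hours against 48), but only the train delivers coal faster than the processor can consume it.</p>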

<p>Then last week I ran across this amusing item in the <a href="http://royal.pingdom.com/?p=119" class="offsite-link-inline">Royal Pingdom</a> blog. Shunning such analogies as sandbags or coal sacks, it gets straight to the point:</p>

<blockquote>
<h4>FedEx still faster than the Internet</h4>
<p>Imagine a company with two offices in different cities, perhaps even in different countries. Each office has a 100 megabit internet connection. If the company needs to send a large amount of data from one office to the other, theoretically a 100 megabit connection can muster about 45 gigabyte in one hour if there are no bottlenecks on the way. This ends up being just over one terabyte of data in 24 hours.</p>

<p>In other words, for anything larger than one terabyte, it would be faster for this company to just send the data on disks for over-night delivery.</p>
<p class="QuoteSource">--Royal Pingdom, April 11, 2007</p>
</blockquote>
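
<p>The arithmetic behind those figures can be sketched as follows. It assumes decimal units (1 gigabyte = 10<sup>9</sup> bytes) and a link that runs at its full rated speed with no protocol overhead, which is the same best case the quote describes.</p>

```python
# Best-case volume moved over a 100 megabit/s link, in decimal units.
# Assumes the link runs at full rated speed with no overhead.

LINK_MBPS = 100  # link speed from the quoted post

bytes_per_second = LINK_MBPS * 1_000_000 / 8
gb_per_hour = bytes_per_second * 3600 / 1e9
tb_per_day = gb_per_hour * 24 / 1000

print(f"{gb_per_hour:.0f} GB per hour")
print(f"{tb_per_day:.2f} TB per 24 hours")
```

<p>Any payload larger than the 24-hour figure arrives sooner by overnight courier, provided the delivery really does take a day or less.</p>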

<p>And once I started actually <em>looking</em> for connections, I found this post by <a href="http://www.sun.com/aboutsun/executives/schwartz/bio.jsp" class="offsite-link-inline">Jonathan Schwartz</a>, CEO and President of Sun:</p>

<blockquote>
<h4>Moving A Petabyte of Data to Hong Kong by Sailboat</h4>
<p>I made a speech last week at which I asserted it was faster to send a petabyte of data from San Francisco to Hong Kong by sailboat, than by the internet.</p>

<p>I got quite a few "how can that possibly be true?" kinds of questions, so here's the math. (Full disclosure, I am a mathematician by training, which guarantees me a lifetime of small "off by one" errors in all subsequent calculations - so if I get something wrong, be gentle).</p>

<p>A petabyte is a thousand terabytes, which is a million gigabytes, or a billion megabytes. Or 8 billion megabits. With me so far?</p>

<p>So if you had a half megabit per second internet connection, which is relatively high in the US (relatively low compared to residential bandwidth available in, say, Korea), it'd take you 16 billion seconds, or 266 million minutes, or 507 years to transmit the data. Can you sail to Hong Kong faster than that? At a full megabit, just divide the time in half. Even at a hundred megabits (about the highest, generally available, of any carrier I've seen), it's a few years.</p>
<p class="QuoteSource">--Jonathan Schwartz, <a href="http://blogs.sun.com/jonathan/entry/moving_a_petabyte_of_data" class="offsite-link-inline">Moving A Petabyte of Data</a>, Jonathan's Blog, Mar 12, 2007</p>
</blockquote>
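
<p>Jonathan's math holds up. Here is a minimal sketch of the same calculation, using his decimal definition of a petabyte and assuming a 365-day year and a link that sustains its rated speed for the whole transfer.</p>

```python
# Time to send one petabyte at the link speeds Jonathan mentions.
# Assumes decimal units, a 365-day year, and a fully sustained link speed.

PETABYTE_MEGABITS = 8_000_000_000  # 1 PB = 10**9 megabytes = 8 * 10**9 megabits
SECONDS_PER_YEAR = 365 * 24 * 3600

def transfer_years(link_mbps):
    """Return the transfer time in years at the given speed in megabits/s."""
    seconds = PETABYTE_MEGABITS / link_mbps
    return seconds / SECONDS_PER_YEAR

for link_mbps in (0.5, 1, 100):
    print(f"{link_mbps:5.1f} Mbps: {transfer_years(link_mbps):6.1f} years")
```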

<p>If you've been following my links and reading more than my excerpts alone, you've probably spotted a common thread running through these articles and the comments. I already knew about the <em>station wagon</em> analogy, and so I was looking for it.</p>

<h3>Jumping on the band(width station)wagon</h3>

<p>All these writers are following an already well-trodden path, and their conclusions echo a well-known observation usually credited to <a href="http://en.wikipedia.org/wiki/Andrew_S._Tanenbaum" class="offsite-link-inline">Andrew S. Tanenbaum</a>, from his book Computer Networks:</p>

<blockquote>
<p><strong>Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway</strong></p>
</blockquote>

<p>Many writers quote this, for example Tony Dye in <a href="http://tonydye.typepad.com/main/2007/05/station_wagon_b.html" class="offsite-link-inline">Station Wagon Bandwidth</a>. Some then go into more detail; a humorous post on <a href="http://www.everything2.com/index.pl?node_id=507783" class="offsite-link-inline">Everything2</a> calculates the bandwidth of a station wagon at 13 petabytes/second. <a href="http://theolagendijk.wordpress.com/about/" class="offsite-link-inline">Theo Lagendijk</a> in <a href="http://theolagendijk.wordpress.com/2007/03/12/never-underestimate-the-bandwidth-of-a-station-wagon-full-of-tapes-hurtling-down-the-highway/" class="offsite-link-inline">Theo's Blog</a> has the full context of the saying:</p>

<blockquote>
<p>An industry standard Ultrium tape can hold 200 gigabytes. A box 60 x 60 x 60 cm can hold about 1000 of these tapes, for a total capacity of 200 terabytes, or 1600 terabits (1.6 petabits). A box of tapes can be delivered anywhere in the United States in 24 hours by Federal Express and other companies. The effective bandwidth of this transmission is 1600 terabits/86,400 sec, or 19Gbps. If the destination is only an hour away by road, the bandwidth is increased to over 400Gbps. No computer network can even approach this.</p>

<p>For a bank with many gigabytes of data to be backed up daily on a second machine (so the bank can continue to function even in the face of a major flood or earthquake), it is likely that no other transmission technology can even begin to approach magnetic tape for performance. Of course, networks are getting faster, but tape densities are increasing, too.</p>

<p>If we now look at cost, we get a similar picture. The cost of an Ultrium tape is around $40 when bought in bulk. A tape can be reused at least ten times, so the tape cost is maybe $4000 per box per usage. Add to this another $1000 for shipping (probably much less), and we have a cost of roughly $5000 to ship 200TB. This amounts to shipping a gigabyte for under 3 cents. No network can beat that. The moral of the story is:</p>

<p><em>Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway.</em></p>
<p class="QuoteSource">--Andrew S. Tanenbaum, Computer Networks, 4th ed. Prentice Hall, 2003</p>
</blockquote>
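
<p>Tanenbaum's bandwidth and cost figures can be reproduced in a few lines. The sketch below uses only the numbers in the passage and assumes decimal units (1 gigabyte = 10<sup>9</sup> bytes).</p>

```python
# Effective bandwidth and cost of shipping one box of tapes,
# using the figures in the quoted passage (decimal units assumed).

TAPE_GB = 200
TAPES_PER_BOX = 1000
TAPE_PRICE_USD = 40
REUSES_PER_TAPE = 10
SHIPPING_USD = 1000

box_gb = TAPE_GB * TAPES_PER_BOX
box_bits = box_gb * 1e9 * 8

def effective_gbps(transit_seconds):
    """Bandwidth in gigabits/s when the whole box arrives after transit_seconds."""
    return box_bits / transit_seconds / 1e9

tape_cost_per_use = TAPE_PRICE_USD * TAPES_PER_BOX / REUSES_PER_TAPE
cents_per_gb = (tape_cost_per_use + SHIPPING_USD) / box_gb * 100

print(f"24-hour delivery: {effective_gbps(24 * 3600):.1f} Gbps")
print(f"1-hour drive:     {effective_gbps(3600):.1f} Gbps")
print(f"Cost per gigabyte: {cents_per_gb:.1f} cents")
```

<p>Note that none of this says anything about latency: the first bit still takes a full day, or a full hour, to arrive.</p>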

<p>For a more detailed discussion of the price/performance of <a href="http://en.wikipedia.org/wiki/Sneakernet" class="offsite-link-inline">Sneakernets</a>, see Jeff Atwood's article <a href="http://www.codinghorror.com/blog/archives/000783.html" class="offsite-link-inline">The Economics of Bandwidth</a> on his blog <a href="http://www.codinghorror.com/blog/" class="offsite-link-inline">Coding Horror</a>. The article, and its many reader comments, review some of Jim Gray's contributions on the subject.</p>

<h3>Who said that (first)?</h3>

<p>Naturally, a Google search produces hundreds of citations about the bandwidth of station wagons; I have included a few interesting ones I found while writing this. But as with many familiar sayings, it seems that the station wagon analogy has evolved over 25 or more years, and its exact origins are now hard to pin down. A <a href="http://www.bbc.co.uk/dna/h2g2/A678576" class="offsite-link-inline">BBC UK article</a> highlights some of this uncertainty. An article at <a href="http://www.bpfh.net/sysadmin/never-underestimate-bandwidth.html" class="offsite-link-inline">SysAdmin humor</a> cites Dennis Ritchie as a possible source, but then admits ignorance. The Wikipedia article (today) on Sneakernets does attribute both the current and an earlier version of the saying to Tanenbaum, but then descends into a stew of "alleged" references and possibilities.</p>

<blockquote>
<p>The original version of this quotation came much earlier; the very first problem in Tanenbaum's 1981 textbook Computer Networks asks the student to calculate the throughput of a St. Bernard carrying floppy disks (which are said to hold 250 kilobytes of data). The first USENET citation is July 16, 1985, and it was widely considered a chestnut already, possibly dating from the 1970s. Other alleged speakers included Tom Reidel, Warren Jackson, or Bob Sutterfield. The station wagon and mag tapes were the canonical version, but variants using trucks or Boeing 747s and later storage technologies such as CD-ROMs would frequently appear.</p>
<p class="QuoteSource">--Wikipedia article on Sneakernet [July 12, 2007]</p>
</blockquote>

<p>I will start a thread in my <a href="http://www.webperformancematters.com/performance-forum/" class="PMref" >Who said that?</a> discussion forum, just in case anyone wants to add something more concrete!</p> 

<h3>Next ...</h3>
<p>In my next post, I will continue my review of <a href="http://www.webperformancematters.com/journal/2007/7/12/four-laws-of-web-site-performance.html" class="PMref">Alberto Savoia's 2001 paper</a> [**], and explain the relevance of today's digression.</p>

<p class="aside">[<strong>** Warning:</strong> even though it's just 6 pages, it's a 2.5MB file, so wait until you're on a fast connection].</p>

<p class="Footnote"><strong>Tags:</strong>
<a href="http://technorati.com/tag/William+Dougherty" rel="tag">William Dougherty</a>,
<a href="http://technorati.com/tag/edgeblog" rel="tag">edgeblog</a>,
<a href="http://technorati.com/tag/performance" rel="tag">performance</a>,
<a href="http://technorati.com/tag/bandwidth" rel="tag">bandwidth</a>,
<a href="http://technorati.com/tag/throughput" rel="tag">throughput</a>, 
<a href="http://technorati.com/tag/latency" rel="tag">latency</a>,
<a href="http://technorati.com/tag/round-trip+time" rel="tag">round-trip time</a>,
<a href="http://technorati.com/tag/Stuart+Cheshire" rel="tag">Stuart Cheshire</a>,
<a href="http://technorati.com/tag/Web+performance" rel="tag">Web performance</a>, 
<a href="http://technorati.com/tag/Web+application" rel="tag">Web application</a>,
<a href="http://technorati.com/tag/download+time" rel="tag">download time</a>,
<a href="http://technorati.com/tag/Alberto+Savoia" rel="tag">Alberto Savoia</a>,
<a href="http://technorati.com/tag/Steve+Harris" rel="tag">Steve Harris</a>,
<a href="http://technorati.com/tag/DSO+guy" rel="tag">DSO Guy</a>,  
<a href="http://technorati.com/tag/Fedex" rel="tag">Fedex</a>,
<a href="http://technorati.com/tag/Royal+Pingdom" rel="tag">Royal Pingdom</a>,  
<a href="http://technorati.com/tag/Jonathan+Schwartz" rel="tag">Jonathan Schwartz</a>, 
<a href="http://technorati.com/tag/sailboat" rel="tag">sailboat</a>,
<a href="http://technorati.com/tag/Hong+Kong" rel="tag">Hong Kong</a>,
<a href="http://technorati.com/tag/station+wagon" rel="tag">station wagon</a>,
<a href="http://technorati.com/tag/Andrew+Tanenbaum" rel="tag">Andrew Tanenbaum</a>,
<a href="http://technorati.com/tag/Jeff+Atwood" rel="tag">Jeff Atwood</a>,
<a href="http://technorati.com/tag/Coding+Horror" rel="tag">Coding Horror</a>,
<a href="http://technorati.com/tag/Jim+Gray" rel="tag">Jim Gray</a>,
<a href="http://technorati.com/tag/sneakernets" rel="tag">sneakernets</a>,
<a href="http://technorati.com/tag/performance+matters" rel="tag">Performance Matters</a>
</p>]]></description><wfw:commentRss>http://www.webperformancematters.com/journal/rss-comments-entry-1146030.xml</wfw:commentRss></item></channel></rss>