<?xml version="1.0" encoding="UTF-8"?>
<!--Generated by Squarespace Site Server v5.0.0 (http://www.squarespace.com/) on Wed, 20 Aug 2008 16:51:40 GMT--><feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/"><title>Web Performance Matters</title><subtitle>Journal</subtitle><id>http://www.webperformancematters.com/journal/</id><link rel="alternate" type="application/xhtml+xml" href="http://www.webperformancematters.com/journal/"/><link rel="self" type="application/atom+xml" href="http://www.webperformancematters.com/journal/atom.xml"/><updated>2008-08-14T10:13:56Z</updated><generator uri="http://www.squarespace.com/" version="Squarespace Site Server v5.0.0 (http://www.squarespace.com/)">Squarespace</generator><entry><title>Why Technorati is Not Usable</title><category>Performance and Usability</category><category>Blogs and Publications</category><category>About this site</category><id>http://www.webperformancematters.com/journal/2007/9/26/why-technorati-is-not-usable.html</id><link rel="alternate" type="text/html" href="http://www.webperformancematters.com/journal/2007/9/26/why-technorati-is-not-usable.html"/><author><name>Chris Loosley</name></author><published>2007-09-26T10:30:00Z</published><updated>2007-09-26T10:30:00Z</updated><content type="html" xml:lang="en-US"><![CDATA[<span><img class="PageIllustration" title="Usability Model" alt="Illustration: Four dimensions of usability" src="http://www.webperformancematters.com/storage/post-graphics/Usability Model.jpg" />

<p>I was going to write about performance and availability today, but this was not the post I had in mind. Technorati sidetracked me. So I'm going to write about Usability instead. Because Technorati provides a good counter-example -- how <em>not</em> to build a usable Web application that satisfies and retains customers. </p>

<p>In <a href="http://www.webperformancematters.com/journal/2005/10/17/web-usability-a-simple-framework.html" class="PMref">Web Usability: A Simple Framework</a>, I described a way to think about Web site or Web application usability.</p>

<p>In a second post, <a href="http://www.webperformancematters.com/journal/2005/11/9/the-dimensions-of-usability.html" class="PMref">The Dimensions of Usability</a>, I presented the graphic shown here, and discussed the four dimensions in a bit more detail. </p>

<p>These four dimensions are not alternative functional goals, to be weighed against one another and prioritized. Web application effectiveness is a four-step challenge:</p>

<blockquote>
<p>To satisfy customers, a Web site must fulfill four distinct needs: </p>

<ul>
<li><strong>Availability:</strong> A site that's unreachable, for any reason, is useless.</li>
<li><strong>Responsiveness:</strong> Having reached the site, pages that download slowly are likely to drive customers to try an alternate site.</li>
<li><strong>Clarity:</strong> If the site is sufficiently responsive to keep the customer's attention, other design qualities come into play. It must be simple and natural to use – easy to learn, predictable, and consistent.</li>
<li><strong>Utility:</strong> Last comes utility -- does the site actually deliver the information or service the customer was looking for in the first place?</li>
</ul>
<p class="QuoteSource">--<a href="http://www.webperformancematters.com/journal/2005/10/17/web-usability-a-simple-framework.html" class="PMref">Web Usability: A Simple Framework</a>, October 17, 2005</p>
</blockquote>

<p>As in a quiz show, to win the grand prize -- satisfied customers -- you have to get it right at every stage. Fail at any one and you <em>will</em> lose customers. Fail consistently at any one, and you will be out of business.</p>

<p><strong>As I experienced their service today, Technorati seemed to be failing on all four fronts</strong>.</p>

<h3>Availability</h3>

<p>First I noticed that my browser was replacing the thumbnail portrait that usually appears (near the bottom of my sidebar) under <em>technorati links</em> with alternative text. Next I tried Technorati's link to <em>Blogs that link here</em> and, eventually, was rewarded with:</p>

<p><span class="SectionIllustrationInline"><img src="http://www.webperformancematters.com/storage/post-graphics/Firefox%20Connection%20Reset.JPG" alt="Illustration: Firefox Connection Reset screen" title="Firefox Connection Reset screen"/></span></p>

<p>It's not that I desperately need to see my own picture there at all times, or that I think my readers are dying to see the inbound links (or <em>blog reactions</em>, as Technorati calls them). I know that widget in my sidebar has marginal utility -- a few people may use it occasionally. That's why I put it near the bottom, where it doesn't interfere with anyone's ability to browse the site. If it works, it does no harm. On the other hand, if it's broken, it becomes a distinct liability. <strong>Broken links lower the quality of the whole site.</strong></p>

<p>In this case, the <em>connection reset</em> error typically means that a connection to the Web server was opened, but was then dropped by the server (or by something in between) before the browser received a complete response. While this may not qualify as a <em>broken link</em>, it had the same effect: <strong>the requested page was unavailable</strong>. </p>
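
<p>To a monitoring script, failures like this one surface as distinct exceptions, and it can help to record which kind occurred. Here is a minimal sketch in Python, using only the standard library. The function names <em>classify_failure</em> and <em>probe</em>, the label strings, and the 10-second default timeout are my own illustrative assumptions, not part of any Technorati or browser API.</p>

```python
import socket
import urllib.error
import urllib.request


def classify_failure(exc: BaseException) -> str:
    """Map a network exception to a coarse availability label (labels are illustrative)."""
    if isinstance(exc, ConnectionResetError):
        return "connection reset"      # the peer dropped the connection mid-exchange
    if isinstance(exc, ConnectionRefusedError):
        return "connection refused"    # nothing accepted the connection
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "timed out"             # no answer within the allowed time
    if isinstance(exc, socket.gaierror):
        return "dns failure"           # the host name could not be resolved
    if isinstance(exc, OSError):
        return "other network error"
    return "unknown"


def probe(url: str, timeout: float = 10.0) -> str:
    """Fetch a URL once and report 'available' or a failure label (hypothetical helper)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
        return "available"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return f"http error {exc.code}"
    except urllib.error.URLError as exc:
        # urllib usually wraps the underlying socket error in .reason.
        reason = exc.reason if isinstance(exc.reason, BaseException) else exc
        return classify_failure(reason)
    except OSError as exc:
        return classify_failure(exc)
```

<p>Whatever the label, from the visitor's point of view each of these outcomes means the same thing: the page was not available.</p>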

<p>Upon checking back a few hours later, the sidebar link was working. Again, not a surprise. Intermittent outages like this have been characteristic of Technorati for a long time -- see my post on <a href="http://www.webperformancematters.com/journal/2007/6/6/taming-the-technorati-monster.html" class="PMref">Taming the Technorati Monster</a>.</p>

<h3>Performance</h3>

<p>I'm not going to dwell on this, because if you've tried to find things on Technorati lately, you already know how the service performs. For a search engine, it typically ranges from slow to glacial in my experience. Maybe it's just my particular interests -- <em>Web Performance</em> and <em>Application Responsiveness</em> are not especially hot topics. Perhaps other people who are interested in more popular topics are scoring cache hits and are actually getting good Web performance. That would be ironic!</p>

<h3>Clarity</h3>

<p>Returning to my problems with the <em>blog reactions</em> widget, normally I would let this incident pass without comment. But today I actually noticed the problem while I was doing a blog search about problems with Technorati's tag indexing and search functions, because my blog seems to have fallen off their radar lately. </p>

<p>In the past, tags used in a post would be indexed and returned in a search within the hour, often within minutes. Today, Technorati's search function was sure I had not published anything for the last 27 days. But when I navigated manually to their page for my blog, they displayed excerpts of all my more recent posts.</p>

<p>Some more digging revealed that even though Technorati's site was up, and even though I could use it to navigate manually to a page listing blog reactions, the sidebar link to that same information, which Technorati's widget was generating, did not work. <strong>In any Web site or application, these kinds of internal inconsistencies are hugely frustrating. They make me doubt the accuracy and completeness of any information the application returns.</strong></p>

<p> And not surprisingly, searching for help on Technorati itself does nothing to reassure me that they will be fixing these problems anytime soon. Quite the contrary, it confirms their problems, as in this amusing response:</p>

<span class="SectionIllustrationInline" style="margin:0 0 10px 0; padding:5px;"><img src="http://www.webperformancematters.com/storage/post-graphics/Technorati%20Search%20Problems.JPG" alt="Illustration: Technorati Search Problems" title="Technorati Search Problems"/></span>

<p>(Is Technorati <a href="http://www.uprightmatters.com/blog-home/2007/9/25/lets-stop-beatin-round-the-bush.html" class="offsite-link-inline">beatin' 'round the bush</a> here? Can a search engine actually lie about what it <em>really</em> knows? :)</p>

<h3>Utility</h3> 

<p>Google's <em>Blog Search</em> feature, however, did know something. It pointed me to this sensible post by <a href="http://www.blogger.com/profile/17388497877158577422" class="offsite-link-inline">ChristineMM</a>:</p>

<blockquote>
<h4>Trying Out Blog Widgets and Tools</h4>

<p>I would like a search box on my blog that lets my readers (and me) search for content from within the pages of my blog. The Google one that I used to use was not working right, I’d search for a word that was right in front of me or even in a blog post title and it would say there were no matches, so I dumped that function.</p>

<p>I then for a long time used a Technorati box. At some point I realized it was not working well either. Again I’d search for a keyword that was in a blog post title and it would say there were no matches. So this week I deleted it from my sidebar. What is the point of having a blog reader search for a topic on my blog, be told I never blogged on it, when in reality, I actually did?</p>

<p>...</p>

<p>One last thing I’ll mention is that I get a ton, and I mean a ton of blog readers through Google primarily and also some other Internet search engines. I feel that my regular use of Technorati tags helps my blog posts be found by Google and the other search engines. This drives traffic to my blog. So if you want to drive traffic to your blog, use Technorati tags in every blog post of substance.</p>

<p class="QuoteSource">--<a href="http://thethinkingmother.blogspot.com/2007/09/trying-out-blog-widgets-and-tools.html" class="offsite-link-inline">The Thinking Mother</a>, September 23, 2007</p>
</blockquote>

<p>Well said, Christine! If an application does not deliver the service you need, it's useless. I dropped Technorati's search box some time ago, for similar reasons. The integrated Squarespace search box is 1000 times more useful for searching the blog, and Google has the Web covered far more effectively than Technorati, in my opinion.</p>

<h3>The bottom line</h3>

<p>I think Christine may have homed in on the essence of the matter, sad though it may be. Unless Technorati can recover its original sense of purpose and fix its technical problems, it's not going to survive as an independent, useful service. Perhaps its most significant contribution will be its promotion of a standard tagging format that is easily recognized and reused <em>by other search engines</em>.</p>

<p>So I'm not giving up on my Technorati tags yet, but I'm not counting on getting much value from their blog indexing or searching tools either. I've already removed their <em>blog search</em> and <em>tag cloud</em> functions from my sidebar, and their <em>blog reactions</em> widget is now on probation. Any more problems and it will be the next to go.</p>

<p class="Footnote"><strong>Tags:</strong> 
<a href="http://technorati.com/tag/technorati" rel="tag">Technorati</a>,
<a href="http://technorati.com/tag/tagging" rel="tag">tagging</a>,
<a href="http://technorati.com/tag/blog+reactions" rel="tag">blog reactions</a>,  
<a href="http://technorati.com/tag/problems" rel="tag">problems</a>,
<a href="http://technorati.com/tag/usability" rel="tag">usability</a>,
<a href="http://technorati.com/tag/availability" rel="tag">availability</a>,
<a href="http://technorati.com/tag/consistency" rel="tag">consistency</a>,
<a href="http://technorati.com/tag/clarity" rel="tag">clarity</a>,
<a href="http://technorati.com/tag/utility" rel="tag">utility</a>,
<a href="http://technorati.com/tag/Christinemm" rel="tag">Christinemm</a>,
<a href="http://technorati.com/tag/Thinking+Mother" rel="tag">Thinking Mother</a>,
<a href="http://technorati.com/tag/web+performance" rel="tag">Web performance</a>,
<a href="http://technorati.com/tag/performance+matters" rel="tag">Performance Matters</a>
</p>]]></content></entry><entry><title>Human Factors and Blog Design</title><category>Blogs and Publications</category><category>About this site</category><id>http://www.webperformancematters.com/journal/2007/9/22/human-factors-and-blog-design.html</id><link rel="alternate" type="text/html" href="http://www.webperformancematters.com/journal/2007/9/22/human-factors-and-blog-design.html"/><author><name>Chris Loosley</name></author><published>2007-09-22T07:30:00Z</published><updated>2007-09-22T07:30:00Z</updated><content type="html" xml:lang="en-US"><![CDATA[<div class="PageIllustration"><img src="http://www.webperformancematters.com/storage/post-graphics/coding-horror-official-logo-small.png" alt="Illustration: Coding Horror logo" title="Coding Horror logo"/>
<br /><span class="PictureCaption"><a href="http://www.codinghorror.com/blog/" class="offsite-link-inline">Coding Horror</a></span></div>

<p>The best products are designed with <a href="http://en.wikipedia.org/wiki/Human_factors" class="offsite-link-inline">Human Factors</a> in mind. That's why <a href="http://www.webperformancematters.com/display/ShowJournal?moduleId=1113404&categoryId=95637" class="PMref">Web design and usability</a> is a frequent topic of my <em>Web Performance Matters</em> blog.</p>

<p>Jeff Atwood's blog -- <em>Coding Horror</em> -- focuses on <em>programming and human factors</em>. And according to a recent <a href="http://www.dailyblogtips.com/interview-with-jeff-atwood-from-coding-horror/" class="offsite-link-inline">interview</a> with Jeff on the site <em>Daily Blog Tips</em>, "the blog is attracting over 500,000 unique visitors every month, and it also counts 60,000 RSS readers, meaning that Jeff probably knows what he is talking about".</p>

<p>The <em>Coding Horror</em> logo was originally created to mark examples of dangerous code in the programming classic <a href="http://www.amazon.com/exec/obidos/ASIN/0735619670/" class="offsite-link-inline">Code Complete</a> by <a href="http://www.stevemcconnell.com/" class="offsite-link-inline">Steve McConnell</a>, which <a href="http://www.codinghorror.com/blog/archives/000021.html" class="offsite-link-inline">Jeff rates</a> as his "all-time favorite programming book."</p>

<p>I have <a href="http://www.amazon.com/gp/reader/0735619670/ref=sib_books_pg/105-5052943-5499615?ie=UTF8&keywords=Chris%20Loosley&p=S002&checkSum=X870d5rEZ6o3p7%252FpTRIE33KEyKo8%252FsKXBj4Qz4k3Ob0%253D" class="offsite-link-inline">recommended <em>Code Complete</em></a> myself.  <a href="http://www.codinghorror.com/blog/archives/000020.html" class="offsite-link-inline">Jeff's favorite books</a> are on my shelf too. So I respect his judgment and recommend his blog, which I have added to the blogroll on <em>Web Performance Matters</em>.
</p>

<h3>Thirteen blog clichés</h3>

<p>Jeff recently published <a href="http://www.codinghorror.com/blog/archives/000834.html" class="offsite-link-inline"><em>Thirteen Blog Clichés</em></a>, a post summarizing his "opinions about what makes blogs work well, and what makes blogs sometimes not work so well." These are presented as a list of common mistakes to avoid (or <a href="http://en.wikipedia.org/wiki/Anti-pattern" class="offsite-link-inline">anti-patterns</a>). If you have a blog, or are designing one, you've probably read similar articles before. Even so, Jeff's checklist is worth a look. All such lists tend to contain a core set of common guidelines to follow and/or pitfalls to avoid, but some of Jeff's opinions step outside the conventional wisdom.</p>

<p>Because I maintain two blogs -- <em>Web Performance Matters</em> and <a href="http://www.uprightmatters.com/" class="offsite-link-inline"><em>UpRight Matters</em></a> -- I decided to rate both blogs against Jeff's criteria. Here are edited versions of his recommendations, and my responses. To read Jeff's full discussions of each guideline, see the original. And for the full story, see the many responses posted by Jeff's readers in the comments section of his blog. </p>

<blockquote class="highlight">
<span class="full-image-float-right"><img src="http://www.webperformancematters.com/storage/post-graphics/blog-calendar_opt.jpg" title="Blog Cliche -- Calendar" alt="Illustration: Blog Cliche -- Calendar" /></span>

<h4>1. The Useless Calendar Widget</h4>

<p>I can't think of a <em>single</em> time I have ever found the blog calendar widget helpful. My computer already has a calendar function, so it's not like I need another calendar displayed in my web browser.</p>
<p>Every post carries an obvious datestamp, so I can easily discern when it was published. But knowing whether someone posted an entry on the third Tuesday of the month? Utterly useless. </p>

<p class="QuoteSource">--Jeff Atwood, <a href="http://www.codinghorror.com/blog/archives/000834.html" class="offsite-link-inline"><em>Thirteen Blog Clichés</em></a> [edited]</p>
</blockquote>

<p>I agree! Someone reading a blog like <a href="http://www.dailykos.com/" class="offsite-link-inline">Daily Kos</a>, which publishes daily about politics or current affairs, might find a calendar useful. But a calendar isn't appropriate for our content, so we've never thought of including one. Even if we had, the Squarespace publishing platform we use (see bottom of sidebar) doesn't offer such a blog calendar widget -- another sign that it's not in great demand.</p> 

<blockquote class="highlight">
<h4>2. Random Images Arbitrarily Inserted In Text</h4>

<p>One of the cardinal rules of <a href="http://www.useit.com/papers/webwriting/" class="offsite-link-inline">web writing</a> is to <em>avoid large blocks of text</em>. There are plenty of <a href="http://www.useit.com/alertbox/9703b.html" class="offsite-link-inline">excellent web writing guides</a> that exhort you to break up your text, using bullets, numbered lists, quotes, paragraph breaks, images -- anything to avoid creating an intimidating wall of dense, impenetrable text. </p>
<p>But like all good advice, (this) can be taken too far. For example, when you find yourself inserting random pictures into your writing for the sole purpose of breaking up the text. As the old adage goes, <em>a picture is worth a thousand words</em>. <strong>But you should no more insert a random image into your writing than you would insert a thousand random words into your writing.</strong></p>
<p>Images are <em>not</em> glorified paragraph breaks. Images should contribute to the content and meaning of the article in a substantive way. And if they don't, they should be cut. Mercilessly.</p>
<p class="QuoteSource">--Jeff Atwood, <a href="http://www.codinghorror.com/blog/archives/000834.html" class="offsite-link-inline"><em>Thirteen Blog Clichés</em></a> [edited]</p>
</blockquote>

<p>I know I am sometimes guilty of writing long posts. But I won't write a thousand words unless I have something worthwhile (I hope :-) to explain, and I try to keep all my posts interesting by breaking up the text using <a href="http://www.webperformancematters.com/journal/2006/3/25/managing-rias-6-measurement-challenges.html" class="PMref">headings</a> or <a href="http://www.webperformancematters.com/journal/2007/8/21/asynchronous-architectures-4.html" class="PMref">images</a>. And I promise that we will <em>never</em> include an image that bears no relationship to the subject matter!</p>

<blockquote class="highlight">
<h4>3. No Information on the Author </h4>

<p>Every time a reader encounters a blog with no name in the byline, no background on the author, and no simple way to click through to find out <em>anything</em> about the author, it devalues not only the author's writing, but the credibility of blogging in general.</p>

<p>Maintaining a blog of any kind takes quite a bit of effort. It's irrational to expend that kind of effort without putting your name on it so you can benefit from it. And so we can too. It's a win-win scenario for you, Mr. Anonymous.</p>

<p class="QuoteSource">--Jeff Atwood, <a href="http://www.codinghorror.com/blog/archives/000834.html" class="offsite-link-inline"><em>Thirteen Blog Clichés</em></a> [edited]</p>
</blockquote>

<p>I agree! That's why we provide <a href="http://www.webperformancematters.com/objectives/" class="PMref">brief</a> <a href="http://www.uprightmatters.com/author-184886/" class="UMref">introductions</a> and <a href="http://www.uprightmarketing.com/principals/" class="UMref">longer</a> author pages.</p>

<blockquote class="highlight">
<h4>4. Excess Flair</h4>

<span class="full-image-float-none"><img src="http://www.webperformancematters.com/storage/post-graphics/social-bookmarks.png" alt="Illustration: Social bookmark icons" title="Social bookmark icons"/></span>

<p>Blogs work because they're simple. When we clutter up our blogs with a zillion widgets, features, and add-ons, we're destroying an essential part of what makes blogs worthwhile. Examples include "crazy" JavaScript image loading techniques, annoying pop-up image previews of links, and pictures of the last 10 visitors to your blog.</p>
    
<p>Before adding any new "feature" to your blog, consider whether its value outweighs the additional complexity it introduces. </p>

<p class="QuoteSource">--Jeff Atwood, <a href="http://www.codinghorror.com/blog/archives/000834.html" class="offsite-link-inline"><em>Thirteen Blog Clichés</em></a> [edited]</p>
</blockquote>

<p>This recommendation can be controversial -- see the comments on <a href="http://www.codinghorror.com/blog/archives/000587.html" class="offsite-link-inline">Jeff's original post</a> on this topic. But I agree with Jeff, and I do try to <em>reduce</em> the clutter whenever possible. See my decision on item 6 below, for example. </p>

<blockquote class="highlight">
<h4>5. The Giant Blogroll</h4>

<span class="full-image-float-none"><img src="http://www.webperformancematters.com/storage/post-graphics/giant-blogroll-2.JPG" alt="Illustration: Giant Blogroll" title="Giant Blogroll"/></span>

<p>Citing your references and influences is a great and necessary thing, but obsessively listing every single blog you read is just noise. If you're really reading this many blogs, you should be linking to them organically in your blog posts, in a sort of natural quid pro quo. Wearing a giant blogroll on your sleeve is an empty gesture that feels artificial and insincere.</p>

<p class="QuoteSource">--Jeff Atwood, <a href="http://www.codinghorror.com/blog/archives/000834.html" class="offsite-link-inline"><em>Thirteen Blog Clichés</em></a> [edited]</p>
</blockquote>

<p>Agreed! On <em>Web Performance Matters</em>, I aim to keep <a href="http://www.webperformancematters.com/journal/2007/5/20/a-web-performance-blogroll.html" class="PMref">my blogroll</a> focused, and group the links into categories. We have not added a blogroll on <em>UpRight Matters</em> yet, but we plan to adopt the same approach.</p>

<blockquote class="highlight">
<h4>6. The Nebulous Tag Cloud</h4>

<span class="full-image-float-right"><img src="http://www.webperformancematters.com/storage/post-graphics/tagcloud.png" alt="tagcloud.png" title="tagcloud.png"/></span>
<p>Tagging content easily beats organizing everything into <a href="http://www.codinghorror.com/blog/archives/000246.html" class="offsite-link-inline">hierarchical folders</a>, and tag categories on blogs are moderately useful, particularly for bloggers who tend to bounce around among many different topics. What I've <em>never</em> found useful, however, is the stereotypical tag cloud visualization, where the size of the tag word varies with its frequency. </p>

<p>The perception is that tag cloud visualizations are cool, like badges of honor for the tagging club. The reality is that tag cloud visualizations are chaotic, noisy, and unusable. Keep the tagging, lose the cloud. A simple sorted list of tags, along with the number of posts associated with each tag, is much more effective.</p>

<p class="QuoteSource">--Jeff Atwood, <a href="http://www.codinghorror.com/blog/archives/000834.html" class="offsite-link-inline"><em>Thirteen Blog Clichés</em></a> [edited]</p>
</blockquote>

<p>Content tagging and indexing is a complex subject, and one I have given much thought to while developing our blogs. [I even read <a href="http://www.everythingismiscellaneous.com/" class="offsite-link-inline">Everything is Miscellaneous</a>, and started to write a post about it until I realized that I was just adding to the <em>echo-chamber</em> on that topic. See item 11 below.]</p>

<p>I believe that tagging with keywords has value, but the resulting <a href="http://en.wikipedia.org/wiki/Folksonomy" class="offsite-link-inline"><em>folksonomy</em></a> is most useful as a supplement to, not a replacement for, a carefully designed and consistently applied classification scheme or <a href="http://en.wikipedia.org/wiki/Information_architecture" class="offsite-link-inline">information architecture</a>. Therefore we will continue to index our content using both methods. </p>

<p>However, despite investing a lot of time implementing a <a href="http://www.webperformancematters.com/journal/2007/5/27/customizing-the-technorati-tag-cloud.html" class="offsite-link-inline">Technorati tag cloud</a> for <em>Web Performance Matters</em>, which has been sitting in my sidebar for 4 months, I have come to the same conclusion as Jeff -- it takes up space without adding any value. So I've now removed it.</p>

<p>I see this as an example of the dynamic nature of blogging. It's agile publishing: you don't have to get everything right the first time. You can create something, try it out for a while, refine it, or remove it altogether. In this vein, revising your blog's layout is greatly simplified if your publishing platform is CSS-based, like <a href="http://www.squarespace.com/?partnerTag=cj&planTag=blg" class="offsite-link-inline">Squarespace</a>.</p> 
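
<p>For anyone who wants to try the alternative Jeff suggests in item 6 -- a simple sorted list of tags with post counts -- here is a minimal sketch in Python. The function name <em>tag_list</em>, the sample post slugs, and the choice of case-insensitive alphabetical order are my own assumptions for illustration; it is not how Squarespace or Technorati build their tag lists.</p>

```python
from collections import Counter


def tag_list(posts):
    """Return 'tag (n)' strings, sorted alphabetically, where n is the number of posts using the tag.

    `posts` maps a post identifier to the list of tags on that post (hypothetical input shape).
    """
    counts = Counter()
    for tags in posts.values():
        counts.update(set(tags))  # count a tag once per post, even if it is repeated
    ordered = sorted(counts.items(), key=lambda item: item[0].lower())
    return [f"{tag} ({n})" for tag, n in ordered]


# Made-up sample data: three posts and their tags.
sample_posts = {
    "post-a": ["usability", "Technorati"],
    "post-b": ["usability", "tagging"],
    "post-c": ["Technorati", "usability", "usability"],
}

for line in tag_list(sample_posts):
    print(line)
```

<p>Sorting by count instead of by name would be a one-line change to the sort key. Either way, the reader gets the same information as a tag cloud without having to compare font sizes.</p>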

<blockquote class="highlight">
<h4>7. Excessive Advertisements</h4>

<p>Advertising is a fact of life, but your blog is not <a href="http://www.flickr.com/photos/stuckincustoms/440698504/" class="offsite-link-inline">Times Square</a>. Does every square inch of whitespace <i>have</i> to be filled with paid links, Google AdSense, and ad banners? </p>

<p>Here's a related article on <a href="http://www.sitepronews.com/archives/2007/jan/15.html" class="offsite-link-inline">blog usability</a> that's a perfect -- even ironic -- example of how you can hurt your usability with excessive, obnoxious advertising. It's everywhere.</p>

<p>It is almost <i>never</i> in the reader's interest to see advertisements, so tread very lightly, and be respectful of your audience. If you take the time to advertise responsibly, you may find that readers appreciate you for it.</p>

<p>Well, probably not, but it can't hurt to try.</p>

<p class="QuoteSource">--Jeff Atwood, <a href="http://www.codinghorror.com/blog/archives/000834.html" class="offsite-link-inline"><em>Thirteen Blog Clichés</em></a> [edited]</p>
</blockquote>

<p>We do have a few ads. I try to organize them tastefully, so that they don't interfere with the content.</p>

<blockquote class="highlight">
<h4>8. This Ain't Your Diary</h4>

<p>Let's be perfectly clear: readers aren't coming to your blog <a href="http://www.codinghorror.com/blog/archives/000536.html" class="offsite-link-inline">to read about you</a>. They're coming to find out <a href="http://headrush.typepad.com/creating_passionate_users/2005/01/users_shouldnt_.html" class="offsite-link-inline">what it can do for them</a>.</p>

<span class="full-image-float-right"><img src="http://www.webperformancematters.com/storage/post-graphics/diary.jpg" alt="Illustration: Diary" title="Diary"/></span>

<p>That said, blogs are a place for writers to find an interested audience, and a place for readers to find a helpful peer and a unique voice. It's OK to <a href="http://software.ericsink.com/entries/Goodbye_Sadie.html" class="offsite-link-inline">be yourself</a>; at some level, it is a cult of personality: people are reading not only because your content is useful to them, but because they like you. </p>

<p>It's normal to inject a regular dose of yourself into the conversation. But like Tabasco sauce and other powerful seasonings, a little YOU goes a long way. A <i>really</i> long way.  Write accordingly.</p>

<p class="QuoteSource">--Jeff Atwood, <a href="http://www.codinghorror.com/blog/archives/000834.html" class="offsite-link-inline"><em>Thirteen Blog Clichés</em></a> [edited]</p>
</blockquote>

<p>Agreed! I won't be writing about my experiences remodeling my house, unless I see a connection worth exploring.</p>

<blockquote class="highlight">
<h4>9. Sorry I Haven't Written in a While</h4>

<p>If you haven't posted anything new to your blog in a while, don't waste our time with apologies. Just write! The best apology is new and improved content. Maybe with a wee bit more consistency this time, though:</p>

<ul class="grouptight">
<li>Pick a schedule you can live with, and stick to it</li>
<li>Don't produce <a href="http://www.codinghorror.com/blog/archives/000910.html" class="offsite-link-inline">substandard</a> posts, just to keep to a schedule</li>
<li>Talent is far less important than <a href="http://www.codinghorror.com/blog/archives/000187.html" class="offsite-link-inline">enthusiasm</a></li>
</ul>
 
<p>And the best way to demonstrate your enthusiasm -- and to improve -- is to get out there and <i>write</i>. Regularly.</p>

<p class="QuoteSource">--Jeff Atwood, <a href="http://www.codinghorror.com/blog/archives/000834.html" class="offsite-link-inline"><em>Thirteen Blog Clichés</em></a> [edited]</p>
</blockquote>

<p>My <a href="http://www.webperformancematters.com/objectives/" class="PMref">objectives</a> for <em>Web Performance Matters</em> are much the same as when I started the blog two years ago -- <em>to contribute <strong>an organizing framework</strong> and <strong>a regular supply of ideas</strong></em>.  I have to admit, I've had a few long gaps in my writing. I've also apologized and promised to do better! But after reading Jeff's advice, I'm in a bind. Should I apologize for apologizing? I guess not. I'll just keep writing.</p>

<blockquote class="highlight">
<h4>10. Blogging About Blogging</h4>

<p>I find meta-blogging -- blogging about blogging -- <i>incredibly</i> boring. I said as much in a recent <a href="http://www.dailyblogtips.com/interview-with-jeff-atwood-from-coding-horror/" class="offsite-link-inline">interview</a> on a site that's all about blogging (hence the title, Daily Blog Tips). If you accept the premise that most of your readers are <i>not</i> bloggers, then it's highly likely they won't be amused, entertained, or informed by a continual stream of blog entries on the art of blogging.</p>
        
<p>Meta-blogging is like masturbating. Everyone does it, and there's nothing wrong with it. But writers who regularly get out a little to explore other topics will be healthier, happier, and ultimately more interesting to be around -- regardless of audience.</p>

<p class="QuoteSource">--Jeff Atwood, <a href="http://www.codinghorror.com/blog/archives/000834.html" class="offsite-link-inline"><em>Thirteen Blog Clichés</em></a> [edited]</p>
</blockquote>

<p>Of course, Jeff's post and this one <em>are</em> about blogging. But the reason we care enough to research and write about these ideas comes back to the <em>Human Factors</em> dimension. As we work to improve our ability to serve and communicate more clearly, we want to share what we learn to help you connect with your own readers and communities.</p>

<p><a href="http://www.uprightmatters.com/author-184886" class="UMref">Cynthia</a> writes:</p>


<div class="InlineTextBox">
<p>Since my <a href="http://www.webperformancematters.com/objectives/" class="UMref">blogging objective</a> is to make a competitive difference in the world, I took special note of Jeff’s post on <em>Thirteen Blog Clichés</em>, and of his focus on human factors.  Why?  Because I want to have conversations that matter -- presumably with humans! </p>

<p>While Chris and I stay focused on our respective blogging objectives, we work to apply the right technology to enhance understanding and to make the experience of conversing valuable for readers. Jeff’s opinions about effective technologies and techniques resonated with my blogging experience and my blogging <strong><em>intentions</em></strong>. If you have similar intentions, we think you will find them useful too.</p>
</div>

<blockquote class="highlight">
<h4>11. Mindless Link Propagation</h4>

<p>One of the most pernicious problems in blogging is the <a href="http://chris.pirillo.com/2006/08/18/10-ways-to-eliminate-the-echo-chamber/" class="offsite-link-inline"> echo chamber effect</a>. Most blog entries merely regurgitate what other people have said or add vapid commentary on top of news articles and press releases. Only the tiniest fraction of blog entries are original content, and only a tiny fraction of that fraction is worth your time.</p>
     
<p>If everyone knows about it, what value does that information have? My advice here is almost contrarian: if everyone else is talking about it, that means you should <em>avoid</em> talking about it. Switch things up. Seek out uncommon sites with unique information. If all you can find to talk about is what's already popular, you're not trying hard enough. Form your own opinion. Do your own research. Go out of your way to blaze a new trail and create something we haven't already seen hundreds of times before.</p>

<p class="QuoteSource">--Jeff Atwood, <a href="http://www.codinghorror.com/blog/archives/000834.html" class="offsite-link-inline"><em>Thirteen Blog Clichés</em></a> [edited]</p>
</blockquote>

<p>That is so true! This issue is exactly what stopped me from completing my review of David Weinberger's book, <em>Everything is Miscellaneous</em>, even though I had read the book and dozens of pages of reviews by others. But then I found myself summarizing other readers' feedback, which amounted to a <em>folksonomy</em> about the topic of <em>folksonomies</em> -- that is, a <em>meta-folksonomy</em>. I began to wonder: if I were to review other similar discussions and add some Technorati tags to my review, would I then be contributing to a <em>meta-meta-folksonomy</em>?!</p>

<p>People criticize <em>Web 2.0</em>, and the <em>blogosphere</em> in general, claiming that it's just a giant echo-chamber in which uninformed opinion is amplified, and true expertise is drowned out by uneducated bleating, like the sheep in <a href="http://en.wikipedia.org/wiki/Animal_Farm" class="offsite-link-inline">Animal Farm</a>. While that criticism may sometimes be true, it is not the whole story, and it does not do justice to the educational power of the Web.</p>

<p>In this case, I concluded that the world did not really need me to summarize what everyone else was saying about David Weinberger's opinions about tagging, folksonomies, and the wisdom of crowds. In fact, to write such a post would just fuel the critics' argument (and by the way, it was turning into a very long post). So I went back to writing about Web performance!</p>

<blockquote class="highlight">
<h4>12. Top (n) Lists</h4>

<span class="full-image-float-right"><img src="http://www.webperformancematters.com/storage/post-graphics/Following%20Instructions%20for%20Dummies.JPG" alt="Illustration: Following Instructions for Dummies" title="Following Instructions for Dummies"/></span>

<p>Yes, exactly like this one.</p>

<p>Lists are a great convention. They make sense, people understand them, and they're a logical way to structure your writing. But <a href="http://www.codinghorror.com/blog/archives/000932.html" class="offsite-link-inline"> don't let lists become a crutch</a>. I'm always taken aback when I see the "most popular" posts on a blog dominated by Top (n) Lists. Shortcuts are only meaningful if you know what it is, exactly, you're cutting.</p>
    
<p>If you find that the Top (n) List convention is a go-to tool in your writing toolkit, consider re-balancing your writing portfolio with longer, more in-depth pieces as well. Not everything should be a sprint; throw a few small marathons in there somewhere to complement your short distance skills.</p>

<p class="QuoteSource">--Jeff Atwood, <a href="http://www.codinghorror.com/blog/archives/000834.html" class="offsite-link-inline"><em>Thirteen Blog Clichés</em></a> [edited]</p>
</blockquote>

<p>I agree that it's good to aim for a balance of short and longer posts. Having written technical articles for years before blogs existed, I'm actually more likely to write long essays -- see my comment on item 2 above. To balance those marathon posts, I deliberately compose some shorter posts structured as a list of guidelines or principles. My series on <a href="http://www.webperformancematters.com/display/ShowJournal?moduleId=1113404&categoryId=98044" class="PMref">Performance Wisdom</a> contains several examples of this approach.</p>

<blockquote class="highlight">
<h4>13. No Comments Allowed</h4>

<p><a href="http://www.codinghorror.com/blog/archives/000538.html" class="offsite-link-inline">A blog without comments is not a blog</a>. Yes, there are exceptions for massively popular blogs where <a href="http://many.corante.com/archives/2007/07/20/spolsky_on_blog_comments_scale_matters.php" class="offsite-link-inline">comments don't scale</a>. But until that applies, the value of the two-way conversation far outweighs any minor inconvenience on your part. It's an open secret in the blogging community that <b>the comments are often better than the original blog entry itself</b>. Would you browse Amazon without the user reviews?</p>

<p>Don't be afraid of comments. Embrace them. Moderate them. The community will respect you for it, and your blog will be better for it as well.</p>

<p class="QuoteSource">--Jeff Atwood, <a href="http://www.codinghorror.com/blog/archives/000834.html" class="offsite-link-inline"><em>Thirteen Blog Clichés</em></a> [edited]</p>
</blockquote>
</ol>

<p>We want comments! One of the primary reasons for blogging is to have conversations about the things that matter. It's not about us. Well, most of the time, anyway -- so we won't apologize if our writing occasionally lapses into introspection or vanity.</p>

<p class="Footnote"><strong>Note:</strong> This is cross-posted on <a href="http://www.webperformancematters.com/journal/2007/9/21/human-factors-and-blog-design.html" class="PMref"><em>Web Performance Matters</em></a> and <a href="http://www.uprightmatters.com/blog-home/2007/9/22/human-factors-and-blog-design.html" class="UMref"><em>UpRight Matters</em></a>. </p>

<p class="Footnote"><strong>Tags:</strong>
<a href="http://technorati.com/tag/human+factors" rel="tag">human factors</a>,
<a href="http://technorati.com/tag/Jeff+Atwood" rel="tag">Jeff Atwood</a>,
<a href="http://technorati.com/tag/Coding+Horror" rel="tag">Coding Horror</a>,
<a href="http://technorati.com/tag/blog+clichés" rel="tag">blog clichés</a>,
<a href="http://technorati.com/tag/calendar+widget" rel="tag">calendar widget</a>,
<a href="http://technorati.com/tag/random+images" rel="tag">random images</a>,
<a href="http://technorati.com/tag/social+bookmarks" rel="tag">social bookmarks</a>,
<a href="http://technorati.com/tag/blogroll" rel="tag">blogroll</a>,
<a href="http://technorati.com/tag/tagging" rel="tag">tagging</a>,
<a href="http://technorati.com/tag/folksonomy" rel="tag">folksonomy</a>,
<a href="http://technorati.com/tag/Web20" rel="tag">Web 2.0</a>,
<a href="http://technorati.com/tag/tag+cloud" rel="tag">tag cloud</a>,
<a href="http://technorati.com/tag/blog+advertising" rel="tag">blog advertising</a>,
<a href="http://technorati.com/tag/meta-blogging" rel="tag">meta-blogging</a>,
<a href="http://technorati.com/tag/echo+chamber" rel="tag">echo chamber</a>,
<a href="http://technorati.com/tag/David+Weinberger" rel="tag">David Weinberger</a>,
<a href="http://technorati.com/tag/Everything+is+Miscellaneous" rel="tag">Everything is Miscellaneous</a>,
<a href="http://technorati.com/tag/Animal+Farm" rel="tag">Animal Farm</a>,
<a href="http://technorati.com/tag/blog+comments" rel="tag">blog comments</a>,
<a href="http://technorati.com/tag/blog+design" rel="tag">blog design</a>,
<a href="http://technorati.com/tag/Web+Performance+Matters" rel="tag">Web Performance Matters</a>,
<a href="http://technorati.com/tag/UpRight+Matters" rel="tag">UpRight Matters</a>
</p>]]></content></entry><entry><title>If I Had A Hammer ...</title><category>Software Engineering</category><category>Optimization and Tuning</category><category>Life, The Universe, and Everything</category><id>http://www.webperformancematters.com/journal/2007/9/10/if-i-had-a-hammer.html</id><link rel="alternate" type="text/html" href="http://www.webperformancematters.com/journal/2007/9/10/if-i-had-a-hammer.html"/><author><name>Chris Loosley</name></author><published>2007-09-10T21:45:00Z</published><updated>2007-09-10T21:45:00Z</updated><content type="html" xml:lang="en-US"><![CDATA[<span class="PageIllustration"><img src="http://www.webperformancematters.com/storage/post-graphics/Hammer.JPG" alt="Illustration: Ultimate Geeks Multi-Tool Hammer" title="Ultimate Geeks Multi-tool Hammer"/></span>

<p><em>If I had a hammer<br />
I'd hammer in the morning<br />
I'd hammer in the evening<br />
All over this land<br />
I'd hammer out danger<br />
I'd hammer out a warning<br />
I'd hammer out love between my brothers and my sisters<br />
All over this land</em></p>
<p class="QuoteSource">--Pete Seeger and Lee Hays, 1949 [<a href="http://en.wikipedia.org/wiki/If_I_Had_a_Hammer" class="offsite-link-inline">Wikipedia</a>]</p>

<p>In May 2007, after I wrote about <a href="http://www.webperformancematters.com/journal/2007/5/15/controlling-what-you-cant-measure.html" class="PMref"><em>Controlling What You Can't Measure</em></a>, I had a conversation with Ben Simo (see the comments) about metrics and tools, during which I wrote:</p>

<blockquote>
<p>Hammers and chisels can be very dangerous, but carpenters use them every day, and we don't brand them as "bad tools" just because some people don't know how to use them properly. Human nature being what it is, there will always be some incompetent fools who try to use a hammer and chisel to drive in a screw or open a bottle of beer, just because they have those tools handy. </p>
</blockquote> 

<p>A few days later, while reading a series of articles on <a href="http://www.webperformancematters.com/journal/2007/5/22/performance-engineering.html" class="PMref">Performance Engineering</a> written by Scott Barber, I noticed the following quotation:</p>

<blockquote>
<p>All parts should go together without forcing. You must remember that the parts you are reassembling were disassembled by you. Therefore, if you can't get them together again, there must be a reason. <strong>By all means, do not use a hammer.</strong></p>
<p class="QuoteSource">--IBM maintenance manual, 1925 [emphasis added]</p>

</blockquote>

<p>This priceless piece of advice is quoted by Scott in <a href="http://www-128.ibm.com/developerworks/rational/library/4266.html" class="offsite-link-inline">part 9</a> of his 14-part series. At the time I made a note to write something about this, but after that it just sat in the "ideas for blog posts" folder for the next 3 months.</p>

<p>Until today, when I happened across the <a href="http://nexus404.com/Blog/2007/04/15/clever-multi-tool-hammer/" class="offsite-link-inline">Ultimate Geeks Multi-Tool Hammer</a>. Now, if I had <em>this</em> hammer, it turns out that I actually <em>could</em> use it to drive in a screw or open a bottle of beer, without being branded as an incompetent fool!</p>

<p class="Footnote"><strong>Tags:</strong>
<a href="http://technorati.com/tag/Scott+Barber" rel="tag">Scott Barber</a>,
<a href="http://technorati.com/tag/Ben+Simo" rel="tag">Ben Simo</a>,
<a href="http://technorati.com/tag/multi-tool" rel="tag">multi-tool</a>,
<a href="http://technorati.com/tag/hammer" rel="tag">hammer</a>,
<a href="http://technorati.com/tag/Performance+Matters" rel="tag">Performance Matters</a>
</p>]]></content></entry><entry><title>Scalability is Not Optional</title><category>Software Engineering</category><category>Blogs and Publications</category><category>Architecture</category><id>http://www.webperformancematters.com/journal/2007/9/7/scalability-is-not-optional.html</id><link rel="alternate" type="text/html" href="http://www.webperformancematters.com/journal/2007/9/7/scalability-is-not-optional.html"/><author><name>Chris Loosley</name></author><published>2007-09-07T22:00:00Z</published><updated>2007-09-07T22:00:00Z</updated><content type="html" xml:lang="en-US"><![CDATA[<span class="PageIllustration"><img src="http://www.webperformancematters.com/storage/post-graphics/Kent%20Langley.jpg" alt="Illustration: Kent Langley" title="Kent Langley"/></span>

<p>My recent post, <a href="http://www.webperformancematters.com/journal/2007/8/21/asynchronous-architectures-4.html" class="PMref">Asynchronous Architectures [4]</a>, summarized a presentation by Werner Vogels at the 2007 <a href="http://qcon.infoq.com/london-2007/conference/" class="offsite-link-inline">QCON</a> conference in London.</p>

<p>A subsequent post by <a href="http://www.productionscale.com/kent/" class="offsite-link-inline">Kent Langley</a> in his new <a href="http://www.productionscale.com/" class="offsite-link-inline">ProductionScale</a> blog -- entitled <a href="http://www.productionscale.com/home/2007/8/11/getting-rid-of-the-relational-database.html" class="offsite-link-inline"><em>Getting Rid of the Relational Database</em></a> -- supports the arguments advanced by Vogels.</p>

<p>Describing the relational database model as "the proverbial ball and chain in the relationship between scalable applications and the underlying infrastructure," Kent writes:</p>

<blockquote>
<p>The quest for seamless linear growth for technology applications is being hindered by the “elephant database.”</p>

<p>What would Amazon do? In a recent talk2 at QCON London Werner Vogel, the CTO of Amazon.com clearly noted that the relational database model is a essentially outdated for the needs of modern applications as a primary data storage medium. In other words, it is simply to slow and cumbersome.</p>

<p>Additionally, Mr. Vogel makes a critical point that in many, many cases relational databases are simply not necessary. Simple key/value pairs (hashes) are all you need.</p>

<p class="QuoteSource">--Joseph Kent Langley, <a href="http://www.productionscale.com/home/2007/8/11/getting-rid-of-the-relational-database.html" class="offsite-link-inline"><em>Getting Rid of the Relational Database</em></a>, August 11, 2007</p>
</blockquote>

<p>Kent goes on to describe why he believes "you should break out of a one-size-fits-all way of thinking when it comes to databases, data storage, and scalable systems. Vertical scaling by throwing hardware at it is no longer sufficient for modern web scale applications".</p>

<p>Kent also points to <a href="http://future.gigaom.com/2007/08/10/data-20-how-the-web-disrupts-our-relational-database-world/" class="offsite-link-inline">Data 2.0: How the Web disrupts our relational database world</a>, which he admits he has not read. Maybe if he had read the article he might have omitted this link.</p>

<p>Although the author of that article, Nitin Borwankar, supports Vogels' general conclusions, his style is to make sweeping pronouncements. He advances no technical arguments that lead up to his conclusion that <em>"The days of Data 1.0 are past. The days of Data 2.0 are dawning, and it promises to be very disruptive for mainstream database architectures on the Web"</em>.</p>

<h3>Other posts about scalability</h3>

<p>I recommend Kent's new blog, and I'm adding it to my blogroll. It looks as if it will contain regular discussions of performance topics. For example, since launching the blog in early August, Kent has already written about:</p>

<ul class="group">
<li><a href="http://www.productionscale.com/home/2007/8/5/scalable-lamp-caching.html" class="offsite-link-inline">Scalable LAMP: Caching</a></li>
<li><a href="http://www.productionscale.com/home/2007/8/5/varnish-a-web-accelerator.html" class="offsite-link-inline">Varnish - A Web Accelerator</a> [a fast reverse proxy caching system]</li>
<li><a href="http://www.productionscale.com/home/2007/8/21/the-power-of-mod_deflate.html" class="offsite-link-inline">The Power of mod_deflate</a> [about content compression]</li>
<li><a href="http://www.productionscale.com//home/2007/8/22/scalability-and-performance-a-few-resources.html" class="offsite-link-inline">Scalability and Performance: A Few Resources</a></li>
<li><a href="http://www.productionscale.com/home/testing-with-wbox.html" class="offsite-link-inline">Testing with WBox</a> [about scalability testing]</li>
</ul>

<h3>Proofread before publishing</h3>

<p>I do have one small complaint. As a blogger, I know the feeling of writing down my thoughts and wanting to get them published -- right now! The short publishing cycle is one of the attractions of a blog. But if Kent could just curb his enthusiasm for long enough to proofread his posts once carefully before hitting that <em>Publish</em> button, his thoughts would be a lot easier to follow:</p>

<blockquote>
<h4>Huh?</h4>
<ul class="group">
<li>Afterwards, I will follow up with a brief analysis or executive summary if you will of this infobit might mean for businesses. <span class="aside">Needs commas around "if you will," and "of" should be "of what".</span></li>
<li>By way if example using the techniques of bond arbitrage Stonebraker notes quite earnestly that it is a “latency arms race.” <span class="aside">Needs commas around "using ... arbitrage," and "if" should be "of".</span></li>
<li>So, is this inconclusive evidence of the pending death of the Relational Database? Of course not. <span class="aside">"Inconclusive" should be "conclusive".</span></li>
<li>But, it is trend spotting in that people are again noticing that there are other ways and that those other ways just might quite faster with modern applications. <span class="aside">"might quite" should be "might be".</span></li>
</ul>
<p class="QuoteSource">--Joseph Kent Langley, <a href="http://www.productionscale.com/home/2007/8/11/getting-rid-of-the-relational-database.html" class="offsite-link-inline"><em>Getting Rid of the Relational Database</em></a>, August 11, 2007</p>
</blockquote>

<p>Although I can infer what Kent is trying to say here, these glitches spoil the overall effect by forcing me to re-read his sentences to get the point. In elementary school, I learned the old English proverb: <a href="http://www.usingenglish.com/reference/idioms/spoil+the+ship+for+a+ha%27pworth+of+tar.html" class="offsite-link-inline"><em>Don't spoil the ship for a ha'pworth of tar</em></a>. I believe bloggers should consider readability to be as important as their message, if they want to build a faithful following. </p>

<p class="Footnote"><strong>Tags:</strong>
<a href="http://technorati.com/tag/scalability" rel="tag">scalability</a>,
<a href="http://technorati.com/tag/Kent+Langley" rel="tag">Kent Langley</a>,
<a href="http://technorati.com/tag/ProductionScale" rel="tag">ProductionScale</a>,
<a href="http://technorati.com/tag/Werner+Vogels" rel="tag">Werner Vogels</a>,
<a href="http://technorati.com/tag/QCON" rel="tag">QCON</a>,
<a href="http://technorati.com/tag/relational+database" rel="tag">relational database</a>,
<a href="http://technorati.com/tag/Web+applications" rel="tag">Web applications</a>,
<a href="http://technorati.com/tag/Performance+Matters" rel="tag">Performance Matters</a>
</p>]]></content></entry><entry><title>Managing for Business Effectiveness</title><category>Articles and White Papers</category><category>Business Perspectives</category><category>Management Wisdom</category><id>http://www.webperformancematters.com/journal/2007/8/29/managing-for-business-effectiveness.html</id><link rel="alternate" type="text/html" href="http://www.webperformancematters.com/journal/2007/8/29/managing-for-business-effectiveness.html"/><author><name>Chris Loosley</name></author><published>2007-08-29T07:01:00Z</published><updated>2007-08-29T07:01:00Z</updated><content type="html" xml:lang="en-US"><![CDATA[<div class="PageWisdomWrapper">

<div class="WisdomTitle" >
<h3>Drucker on Effectiveness vs. Efficiency</h3>
<p class="WisdomClass" ><a href=" /display/ShowJournal?moduleId=1113404&categoryId=109667">Management Wisdom</a>: 3</p>
</div>

<p class="WisdomQuote">There is surely nothing quite so useless as doing with great efficiency what should not be done at all</p>

<div class="WisdomText">
</div>

<p class="QuoteSource">-- Peter Drucker, 1963</p>
</div>

<p><a href="http://en.wikipedia.org/wiki/Peter_Drucker" class="offsite-link-inline">Peter Drucker</a> is often called "the father of modern management". Many books and Web sites are devoted to his insights, some of which I have <a href="http://www.webperformancematters.com/journal/2006/3/14/deep-thoughts-on-management.html" class="offsite-link-inline">written about</a> previously.</p>

<p>This post highlights his incisive observation about the difference between <em>effectiveness</em> and <em>efficiency</em>. I have always found it to be especially memorable, and quoted it (twice) when discussing priorities and choices in my book about software performance. Unfortunately I got the source wrong, but thanks to Google I can now correct my mistake. </p>

<p>It appeared in <em>Managing for Business Effectiveness</em>, an article in the May/June 1963 edition of Harvard Business Review ("HBR"). You can also find it in a February 2006 HBR article -- <a href="http://harvardbusinessonline.hbsp.harvard.edu/b02/en/common/item_detail.jhtml?id=R0602J" class="offsite-link-inline">What Executives Should Remember</a> -- a collection of excerpts drawn from HBR articles by Drucker published between 1963 and 2004.</p>

<p>Because Drucker's remarks are equally relevant to technical performance management and business leadership, I am cross-posting this here on <em>Web Performance Matters</em>, and on our new <a href="http://www.uprightmatters.com/" class="UMref"><em>UpRight Matters</em></a> blog. </p>

<h3>Three key questions</h3>

<p>In his 1963 essay, Drucker states that there is no magic formula, checklist, or procedure that will substitute for the hard, demanding, risk-taking work of management. But he claims that "we know how to organize the job of managing for economic effectiveness and how to do it with both direction and results. The answers to the [following] three key questions ... are known, and have been known for such a long time that they should not surprise anyone."</p>
 
<blockquote>
<p><strong>1. What is the manager's job?</strong> It is to direct the resources and efforts of the business toward opportunities for economically significant results. This sounds trite—and it is. But every analysis of actual allocation of resources and efforts in business that I have ever seen or made showed clearly that <em>the bulk of time, work, attention, and money first goes to "problems" rather than to opportunities, and, secondly, to areas where even extraordinarily successful performance will have minimum impact on results.</em></p>
 
<p><strong>2. What is the major problem?</strong> It is fundamentally the confusion between effectiveness and efficiency that stands between doing the right things and doing things right. <em>There is surely nothing quite so useless as doing with great efficiency what should not be done at all.</em> Yet our tools—especially our accounting concepts and data—all focus on efficiency. What we need is (1) a way to identify the areas of effectiveness (of possible significant results), and (2) a method for concentrating on them.</p>

<p><strong>3. What is the principle?</strong> That, too, is well-known—at least as a general proposition. Business enterprise is not a phenomenon of nature but one of society. In a social situation, however, events are not distributed according to the "normal distribution" of a natural universe (that is, they are not distributed according to the U-shaped Gaussian curve). <em>In a social situation a very small number of events—10 percent to 20 percent at most—account for 90 percent of all results, whereas the great majority of events account for 10 percent or less of the results.</em></p>

<p class="QuoteSource">-- Peter Drucker, <em>Managing for Business Effectiveness</em>, Harvard Business Review, May/June 1963</p>
</blockquote>

<h3>A principled foundation</h3>

<p>Although Drucker writes about management effectiveness in the context of business performance, specialists in software or systems performance must ask the same questions and apply the same principles. In my book on performance management [<a href="http://www.amazon.com/exec/obidos/tg/detail/-/0471162698/002-1562714-5063248?v=glance" class="offsite-link-inline">Amazon</a>], I described these principles as follows: </p>

<ul>
<li>The Centering Principle: Focus on the most performance-critical components.</li>
<li>The Efficiency Principle: Maximize the ratio of useful work to overhead.</li>
<li>The Pareto Principle: Prioritize the 20% of the problem that will return 80% of the benefits.</li>
</ul>
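
<p>As a rough illustration of the Pareto Principle in a performance setting, the sketch below ranks components by measured cost and returns the smallest leading group that accounts for a chosen share of the total. The function name <code>pareto_targets</code>, the component names, and the timings are all hypothetical, made up for this example rather than taken from the book.</p>

```python
def pareto_targets(costs, share=0.8):
    """Return the costliest components that together reach `share` of total cost."""
    total = sum(costs.values())
    # Rank components from most to least expensive.
    ranked = sorted(costs.items(), key=lambda item: item[1], reverse=True)
    chosen, covered = [], 0.0
    for name, cost in ranked:
        if covered / total >= share:
            break  # the components chosen so far already cover the target share
        chosen.append(name)
        covered += cost
    return chosen


# Hypothetical profile: milliseconds spent per component for one page request.
profile = {"database": 620, "rendering": 210, "network": 90, "logging": 50, "auth": 30}
targets = pareto_targets(profile)
print(targets)
```

<p>With these made-up numbers, two of the five components cover more than 80% of the elapsed time, so that is where tuning effort would be concentrated first.</p>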

<p>Drucker concludes that the most crucial requirement for effective management is having ... </p>

<p> ... <em>the courage to go through with logical decisions -- despite all pleas to give this or that product another chance, and despite all such specious alibis as the accountant's "it absorbs overhead" or the sales manager's "we need a full product line."</em></p>

<p>This is one small example of the characteristic I find so appealing in Drucker's writing. His advice starts from an assumption that there <strong>are</strong> relevant principles, and that you can make decisions by reasoning logically from those foundations. As a mathematician, I find that this way of looking at the world appeals to my sense of order and logic; it does not present me with a collection of unsupported assertions and beliefs. The HBR introduction to its review, <em>What Executives Should Remember</em>, sums up Drucker's appeal as follows:</p>

<blockquote>
<p>Executives had come to think they knew how to run companies, and Drucker took it upon himself to poke holes in their beliefs, lest organizations become stale. But he did so in a sympathetic way. He assumed that his readers were intelligent, rational, hardworking people of goodwill. If their organizations struggled, he believed it was usually because of outdated ideas, a narrow conception of a problem, or internal misunderstandings. His insights were ... practical idea-based essays for executives, and his clear-eyed humanistic writing enhanced the magazine time and again. He helped us all to think broadly and deeply.</p>

<p class="QuoteSource">-- What Executives Should Remember, Harvard Business Review, February 2006</p>
</blockquote>

<p>This is why so many people have enjoyed and, like <a href="http://www.marketingheadhunter.com/about.html" class="offsite-link-inline">Harry Joiner</a>, continue to <a href="http://www.marketingheadhunter.com/executive_search/2005/11/peter_drucker.html" class="offsite-link-inline">enjoy daily</a> the wisdom of Peter Drucker.</p> 

<p class="Footnote"><strong>Tags:</strong>
<a href="http://technorati.com/tag/management" rel="tag">management</a>,
<a href="http://technorati.com/tag/management+principles" rel="tag">management principles</a>,
<a href="http://technorati.com/tag/management+wisdom" rel="tag">management wisdom</a>,
<a href="http://technorati.com/tag/Peter+Drucker" rel="tag">Peter Drucker</a>,
<a href="http://technorati.com/tag/effectiveness" rel="tag">effectiveness</a>,
<a href="http://technorati.com/tag/efficiency" rel="tag">efficiency</a>,
<a href="http://technorati.com/tag/Pareto+Principle" rel="tag">Pareto Principle</a>,
<a href="http://technorati.com/tag/80-20+rule" rel="tag">80-20 rule</a>,
<a href="http://technorati.com/tag/Harry+Joiner" rel="tag">Harry Joiner</a>,
<a href="http://technorati.com/tag/UpRight+Matters" rel="tag">UpRight Matters</a>
</p>]]></content></entry><entry><title>Asynchronous Architectures [4]</title><category>Foundations of Performance</category><category>Software Engineering</category><category>Events</category><category>Architecture</category><id>http://www.webperformancematters.com/journal/2007/8/21/asynchronous-architectures-4.html</id><link rel="alternate" type="text/html" href="http://www.webperformancematters.com/journal/2007/8/21/asynchronous-architectures-4.html"/><author><name>Chris Loosley</name></author><published>2007-08-21T07:01:00Z</published><updated>2007-08-21T07:01:00Z</updated><content type="html" xml:lang="en-US"><![CDATA[<span class="PageIllustration"><img src="http://www.webperformancematters.com/storage/post-graphics/Werner%20Vogels.jpg" alt="Illustration: Werner Vogels" title="Werner Vogels"/></span>

<p><em>This is the fourth in a series of posts presenting arguments for <strong>asynchronous architectures</strong> as the optimal way to build high-performance, scalable systems for a distributed environment.</em></p>

<p>In a <a href="http://qcon.infoq.com/qcon-london-2007/conference/" class="offsite-link-inline">QCon conference</a> presentation on <em><strong>Availability and Consistency</strong> or how the CAP theorem ruins it all</em>, <a href="http://www.infoq.com/presentations/availability-consistency" class="offsite-link-inline">Werner Vogels</a>, Amazon CTO, examines the tension between availability and consistency in large-scale distributed systems, and presents a model for reasoning about the trade-offs between different solutions.</p>

<p>I recommend you find time to watch the entire 52-minute video. The Flash streaming technology that InfoQ uses is subject to buffering hiccups, and you may have to restart it a few times. So in case you want to jump to a specific section, I've assembled copies of Werner's slides, with short timestamped notes on the content of each section. Werner did not present his slides in their numbered order, so in my notes I identify slides using the numbers printed on them, not their presentation order.</p>

<h3>Introduction</h3>

<p><strong>0:50:</strong> CTOs must match business with technology. Most really big IT shops <em>must</em> push the edge of what commercial technology can do. Technology has a very long adoption cycle -- it takes about 10 to 15 years for new technology to mature and be effective. For leading companies like Amazon, that's too slow; the scalability challenges are so great that they demand advanced solutions. So shops are forced (in effect) to do their own research and take advanced steps, just to succeed in a competitive marketplace. </p>

<p><strong>2:15:</strong> Werner noted that his viewpoint disagreed strongly with that of <a href="http://qcon.infoq.com/qcon-london-2007/speakers/show_speaker.jsp?oid=137" class="offsite-link-inline">Cameron Purdy</a>, CEO of Tangosol, who was an advocate of database technology.</p>

<p><strong>3:00:</strong> He introduced Eric Brewer's CAP theorem -- more later. 
[See the end of my previous post in this series, <a href="http://www.webperformancematters.com/journal/2007/8/15/asynchronous-architectures-3.html" class="PMref">Asynchronous Architectures [3]</a>. The CAP theorem was first propounded in a 1998 presentation -- <a href="http://www.ccs.neu.edu/groups/IEEE/ind-acad/brewer/" class="offsite-link-inline">Lessons from Internet Services: ACID vs. BASE</a> -- by Dr. Eric Brewer of Inktomi, now a <a href="http://www.cs.berkeley.edu/~brewer/" class="offsite-link-inline">professor</a> at UC Berkeley].</p>

<h3>3:45: What is Scalability? [slide 2]</h3>

<span class="full-image-float-none"><img src="http://www.webperformancematters.com/storage/post-graphics/Vogels%20CAP%20Theorem%202.JPG" alt="Vogels%20CAP%20Theorem%202.JPG" title="Vogels%20CAP%20Theorem%202.JPG"/></span>

<p><strong>3:45:</strong> The meat of Werner's QCon presentation really begins here. <em>Proportional</em> is the key word in these definitions. Adding resources should deliver increased capacity <em>proportional</em> to the added resources. Or if the intent was to deliver better performance, the gains should be <em>proportional</em> to the added resources. Performance here is not just about response time; it could mean transferring more data or larger datasets. </p>

<p><strong>4:40:</strong> Another reason for needing scalability is to achieve fault-tolerance. Adding resources to achieve redundancy should not hurt your performance. Traditional technologies (like databases) won't give you this kind of scalability, because overheads increase as you scale up. These are subjects I discussed at length when explaining <em>The Parallelism Principle</em> in <em>High-Performance Client/Server</em>: </p>

<blockquote>
<h4>13.1  The Parallelism Principle: Exploit parallel processing</h4>
	
<p>Processing related pieces of work in parallel typically introduces additional synchronization overheads, and often introduces contention among the parallel streams. Use parallelism to overcome bottlenecks, provided the processing speedup offsets the additional costs introduced.</p>

<h4>13.2  Scalability and speed up</h4> 

<p><em>Scalability</em> refers to the capacity of the system to perform more total work in the same elapsed time, when its processing power is increased. </p>
<p><em>Speed up</em> refers to the capacity of the system to perform a particular task in a shorter time, when its processing power is increased.</p>
<p>In a system with linear scalability and speed up, any increase in processing power generates a proportional improvement in throughput or response time. </p>

<p class="QuoteSource">--<a href="http://www.amazon.com/exec/obidos/tg/detail/-/0471162698/002-1562714-5063248?v=glance" >High-Performance Client/Server</a>, Chapter 13, pp383-385.</p>
</blockquote>
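
<p>To make <em>proportional</em> concrete, here is a small illustrative calculation of speed up and scaling efficiency. It is a sketch of the definitions quoted above, not something shown in the talk; the function names and all the measurements are made up for the example.</p>

```python
def speedup(time_before, time_after):
    """Ratio of elapsed times for the same task; values above 1 mean faster."""
    return time_before / time_after


def scaling_efficiency(gain, resource_factor):
    """Fraction of the ideal (linear) gain actually delivered.

    gain is a speed up or a throughput ratio; resource_factor is how many
    times more processing power was added. 1.0 means perfectly linear.
    """
    return gain / resource_factor


# Hypothetical measurements: a job takes 120 s on 1 node and 40 s on 4 nodes.
job_speedup = speedup(120.0, 40.0)
job_efficiency = scaling_efficiency(job_speedup, 4)

# Hypothetical throughput: 500 requests/s on 2 nodes, 900 requests/s on 4 nodes.
throughput_gain = 900 / 500
throughput_efficiency = scaling_efficiency(throughput_gain, 4 / 2)

print(f"speed up {job_speedup:.1f}x, efficiency {job_efficiency:.0%}")
print(f"throughput gain {throughput_gain:.1f}x, efficiency {throughput_efficiency:.0%}")
```

<p>An efficiency that stays near 1.0 as nodes are added is what linear scalability means; a figure that keeps dropping is the growing overhead Werner is warning about.</p>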

<h3><strong>8:15:</strong> Scalability for Real Systems [slide 3]</h3>

<span class="full-image-float-none"><img src="http://www.webperformancematters.com/storage/post-graphics/Vogels%20CAP%20Theorem%203.JPG" alt="Vogels%20CAP%20Theorem%203.JPG" title="Vogels%20CAP%20Theorem%203.JPG"/></span>

<p><strong>7:00:</strong> Slide 3 is the conclusion of a cost discussion that begins before the slide is shown. The biggest threat to availability is bugs, which are a cost factor introduced by humans. So operating costs must not grow as you scale up.</p>

<h3><strong>8:45:</strong> But ... [slide 4]</h3>

<span class="full-image-float-none"><img src="http://www.webperformancematters.com/storage/Vogels%20CAP%20Theorem%204.JPG" alt="Vogels%20CAP%20Theorem%204.JPG" title="Vogels%20CAP%20Theorem%204.JPG"/></span>

<p><strong>8:30:</strong> Traditional technologies, databases, and two-phase commit may work for 2-4 nodes, but they will not scale to hundreds (let alone 10,000) of nodes. You may not have 10,000 nodes like Amazon, but you will run into these scalability challenges at 50-100 nodes. </p>

<h3>10:05: Principles for Scalable Service Design [slide 13]</h3>

<span class="full-image-float-none"><img src="http://www.webperformancematters.com/storage/Vogels%20CAP%20Theorem%2013.JPG" alt="Vogels%20CAP%20Theorem%2013.JPG" title="Vogels%20CAP%20Theorem%2013.JPG"/></span>

<p><strong>10:05:</strong> Guidelines for services design at Amazon -- a checklist of lessons learned through hard experience:</p>

<ul>
<li><strong>10:05:</strong> Decentralize. Any algorithm that requires agreement will eventually become a bottleneck. Two-phase commit is in effect an <em>unavailability</em> algorithm; it is guaranteed to fail as you scale up the number of participating services.</li>
<li><strong>10:50:</strong> Asynchrony. Make progress under all circumstances, even if the world is burning around you. Even if fulfillment services are burning down, you want people to be able to place orders. So work locally, don't worry about the rest of the system.</li>
<li><strong>11:40:</strong> Autonomy: Each node should be able to make decisions based only on local state. If you need to reach agreement based on global conditions at high load, you are lost. Nodes may be failing, coming up, going down all the time. Probabilistic techniques work well in these circumstances.</li>
<li><strong>12:35:</strong> Controlled concurrency. Reduce concurrency as much as possible, so that you do not need to use locking.</li>
<li><strong>13:15:</strong> Controlled parallelism. Control traffic going to each node using careful load balancing; nodes must have spare CPU and I/O capacity so that they can do other tasks (like load re-balancing) in the background.</li>
<li><strong>14:10:</strong> Symmetry. Things work really well if all nodes do exactly the same thing. It is easy to add more nodes if nodes do not have to be identified as a <em>directory node</em>, a <em>data-storage node</em>, etc. Ideally, you just install the software and run it, and it responds to any client request and does whatever task is needed, or maybe forwards the request if necessary. This is the logical way to address a requirement that I first documented in a 1993 paper, and later included in Chapter 16, Architecture for High Performance, of my book: </li>
</ul>

<blockquote>
<p>For a large organization, moving to an enterprise client/server system represents a major shift from monolithic systems with fixed distribution to dynamic, heterogeneous, pervasively networked environments. The next generation of systems will be an order of magnitude more dynamic--always running, always changing--with thousands of machines in huge networks.</p>
<p>In such an environment, content components (service providers like DBMSs) and service consumers (e.g. GUIs) must be continually added and removed.</p>
<p>The key to doing this is the middle tier, the hub of the three tier architecture. In the first place, this central layer acts in a connecting role to let individual clients access multiple content servers, and (of course) servers support multiple clients. A separate central tier can also:</p>
<ul>
<li>Provide a set of services that can be dynamically invoked by both the consumer and content layers</li>
<li>Allow new services to be added without major reconfiguration</li>
<li>Allow services to be removed dynamically without affecting any participant not using those services</li>
<li>Allow one service provider to be replaced by another</li>
</ul>	
<p>These are all vital characteristics in a distributed computing environment.</p>
<p class="QuoteSource">--<a href="http://www.amazon.com/exec/obidos/tg/detail/-/0471162698/002-1562714-5063248?v=glance" >High-Performance Client/Server</a>, Chapter 16, pp514-515.</p>
</blockquote>

<p><strong>15:10:</strong> Algorithms that force you to obtain agreement will become a bottleneck. So avoid using two-phase commit, maybe by denormalization to make sure your transaction always runs within a single node. Or split your task into multi-transaction workflows. You have to take an end-to-end look at the business transaction and decide. <span class="aside">[I have always advocated this approach -- see the conclusions of the second <a href="http://www.webperformancematters.com/journal/2007/8/14/asynchronous-architectures-2.html" class="PMref">post</a> in this series].</span></p>
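
<p>A rough way to see why an agreement protocol becomes an <em>unavailability</em> algorithm: if a transaction can commit only when every participant is reachable, its availability is the product of theirs. The sketch below assumes independent failures and an invented per-node availability of 99.9%; neither figure comes from the talk.</p>

```python
def all_available(per_node_availability, participants):
    """Probability that every participant is up at the same moment.

    Assumes node failures are independent, which is a simplification.
    """
    return per_node_availability ** participants


# 0.999 is a hypothetical per-node availability chosen for illustration.
for n in (2, 4, 50, 100, 1000):
    print(f"{n:5d} participants: {all_available(0.999, n):.3f}")
```

<p>Under these assumptions the chance of finding everyone available falls to roughly 0.90 with 100 participants and to about 0.37 with 1000, which is why keeping a transaction inside a single node is so attractive.</p>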

<p><strong>17:40:</strong> You can reuse some of those principles in building teams. Small teams are best, so that each team is responsible for a well-understood piece. Team effectiveness is just as important as architectural consistency. </p> 

<p><strong>19:00:</strong> We call this the two pizza rule -- <em>If you can't feed a team with two pizzas, it's too big</em>. OK, hungry just-out-of-college students do eat more, but they work harder too! As soon as you need more than 8 people, it's hard to understand what everyone is doing. Bigger teams, of 12 or more, must have meetings, and must spend a much larger percentage of their time communicating. This discussion harks back to the famous observation by Fred Brooks: </p>

<blockquote>
<h4>The Mythical Man-Month</h4>
<p>Men and months are interchangeable commodities only when a task can be partitioned among many workers <em>with no communication among them</em>... </p>
<p>In tasks that can be partitioned but which require communication among the subtasks, the effort of communication must be added to the amount of work to be done... </p>
<p>The added burden of communication is made up of two parts, training and intercommunication... </p>
<p>If each part of the task must be separately coordinated with each other part, the effort increases as n(n-1)/2. Three workers require three times as much pairwise intercommunication as two, four require six times as much as two. If, moreover, there need to be conferences among three, four, etc., workers to resolve things jointly, matters get worse yet. The added effort of communicating may fully counteract the division of the original task.</p>
<p class="QuoteSource">--The Mythical Man-Month, Frederick P. Brooks, 20th Anniversary Edition pp17-18, Addison Wesley, 1995.</p>   
</blockquote>
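
<p>Brooks's n(n-1)/2 formula is easy to tabulate for the team sizes mentioned above. This is just a worked example of his arithmetic; the function name is mine.</p>

```python
def pairwise_channels(team_size):
    """Number of distinct pairs of people in a team: n(n-1)/2."""
    return team_size * (team_size - 1) // 2


for size in (2, 3, 4, 8, 12):
    print(f"team of {size:2d}: {pairwise_channels(size):2d} pairwise channels")
```

<p>A two-pizza team of 8 has 28 possible pairings to keep in sync; a team of 12 has 66, more than twice as many for only half as many extra people.</p>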

<p><strong>21:20:</strong> At Amazon, not all the services (1000 or more) support Web interactions at Amazon.com. Many are in the back-end systems such as supply-chain, fulfillment, enterprise services, handling feeds from 3rd-party suppliers, item management, recommendations, personalization.</p>

<p><strong>22:15:</strong> Example of <em>statistically improbable phrases</em> (SIPs), an interesting digression about text analysis being implemented as yet another service. (Too long. The details are a plug for Amazon, but take time away from the main thread of the presentation). </p>

<p><strong>24:30:</strong> Conclusion of the SIP discussion -- you need dependency management, and contracts. Servers can give an SLA based on workload conditions, that clients must honour. Automatic dependency discovery -- Amazon has home-grown tools that can show where dependencies exist, and the effects of failures in a network of nodes.</p>

<h3><strong>26:00:</strong> Scalability Through Smart System Engineering [slide 12]</h3>

<span class="full-image-float-none"><img src="http://www.webperformancematters.com/storage/post-graphics/Vogels%20CAP%20Theorem%2012.JPG" alt="Vogels%20CAP%20Theorem%2012.JPG" title="Vogels%20CAP%20Theorem%2012.JPG"/></span>

<p><strong>26:00:</strong> Use scalable primitives. For example, RPC is <em>not</em> scalable. Don't conceal heterogeneity. We can pretend that systems don't fail, but in practice they do. That's the problem with RPC: it pretends to be a procedure call but it isn't, so transparency does not really work. Failures <strong>do</strong> happen, and performance differences <strong>do</strong> exist. So don't conceal these differences.</p>
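
<p>One way to avoid concealing those differences is to make failure and elapsed time part of what a remote call returns, so the caller has to deal with them. The sketch below simulates this locally; <code>Reply</code> and <code>call_remote</code> are hypothetical names, and the latency range is invented for the example.</p>

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reply:
    """Outcome of a remote call, with failure and latency made explicit."""
    ok: bool
    value: Optional[str] = None
    error: Optional[str] = None
    elapsed_ms: float = 0.0


def call_remote(request, timeout_ms, rng):
    """Hypothetical remote call, simulated locally with a random latency."""
    latency_ms = rng.uniform(5, 400)   # assumed latency range, illustration only
    if latency_ms > timeout_ms:
        # The caller gave up waiting; report that instead of pretending success.
        return Reply(ok=False, error="timeout", elapsed_ms=timeout_ms)
    return Reply(ok=True, value=request.upper(), elapsed_ms=latency_ms)


rng = random.Random(7)
reply = call_remote("price-check", timeout_ms=200, rng=rng)
if reply.ok:
    print("got", reply.value, "after", round(reply.elapsed_ms), "ms")
else:
    print("degraded path:", reply.error)
```

<p>The point of the design is that the calling code cannot be written as if the call were local: every call site has a visible failure branch.</p>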

<p><strong>28:00:</strong> Configuration management. If you have 1000 services, configuration becomes really important. If your applications involve strong consistency properties, they can create problems when people leave your team.</p>

<p><strong>29:00:</strong> Repair and recovery. Check out the work on <a href="http://roc.cs.berkeley.edu/roc_overview.html" class="offsite-link-inline">Recovery Oriented Computing</a> (at Stanford and Berkeley) by <a href="http://www.cs.berkeley.edu/~pattrsn/" class="offsite-link-inline">Dave Patterson</a> and others. Can you design services to restart fast, maybe by keeping log information? If that really works well, then you can design your entire system around the principles of recovery and restart. If you don't like the behavior or performance of a service, you can kill it and let it restart. If it can be functioning again in a minute, this systems design approach can be very effective.</p>
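
<p>As an illustration of restart-by-log, here is a minimal sketch of a service that writes each operation to an append-only log before applying it, and rebuilds its state by replaying that log on startup. <code>CounterService</code> and its log format are hypothetical; they are not taken from the talk or from the Recovery Oriented Computing project.</p>

```python
import json
import os
import tempfile


class CounterService:
    """Hypothetical service whose only state is a dict of counters."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.counters = {}
        self._replay()

    def _replay(self):
        # Rebuild in-memory state by re-applying every logged operation.
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path) as log:
            for line in log:
                self._apply(json.loads(line))

    def _apply(self, op):
        self.counters[op["key"]] = self.counters.get(op["key"], 0) + op["delta"]

    def increment(self, key, delta=1):
        op = {"key": key, "delta": delta}
        # Log first, then apply, so a crash between the two steps loses nothing.
        with open(self.log_path, "a") as log:
            log.write(json.dumps(op) + "\n")
            log.flush()
            os.fsync(log.fileno())
        self._apply(op)


log_path = os.path.join(tempfile.mkdtemp(), "counter.log")
service = CounterService(log_path)
service.increment("orders")
service.increment("orders", 2)
del service                               # stand-in for killing the process
restarted = CounterService(log_path)      # restart: state comes back from the log
print(restarted.counters)
```

<p>If replay is fast enough, killing and restarting a misbehaving service becomes a routine operation rather than an emergency.</p>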

<h3>30:40: CAP Conjecture [slide 8]</h3>

<span class="full-image-float-none"><img src="http://www.webperformancematters.com/storage/post-graphics/Vogels%20CAP%20Theorem%208.JPG" alt="Vogels%20CAP%20Theorem%208.JPG" title="Vogels%20CAP%20Theorem%208.JPG"/></span>

<p><strong>30:40:</strong> At Amazon, all data applications are dominated by this theorem. Traditional data applications assumed that if you stored something in a database it would never go away <span class="aside">[<strong>D</strong>urability, the "D" in the <a href="http://en.wikipedia.org/wiki/ACID" class="offsite-link-inline">ACID properties</a>]</span>. In reality, because of redundancy, many nodes can be working in parallel and storing information, and then bad things can happen.</p>

<h3>32:00: A Clash of Cultures [slide 5]</h3>

<span class="full-image-float-none"><img src="http://www.webperformancematters.com/storage/post-graphics/Vogels%20CAP%20Theorem%205.JPG" alt="Vogels%20CAP%20Theorem%205.JPG" title="Vogels%20CAP%20Theorem%205.JPG"/></span>

<p><strong>32:00:</strong> There's nothing wrong with transactions; they create a nice clean programming paradigm, and are good for programmers. But you must design for failure cases, because <em>transactions can fail</em>. So the ACID properties are great if you can get them, but getting these guarantees is costly. It may be fine if only a single node accesses the data, but not if tens or hundreds of machines need access.</p>

<p><strong>33:30:</strong> In that case another, more fuzzy, approach may be better -- BASE, in which data is <em>basically available, more or less</em>. Applications maintain a <em>soft state</em>, in which data is eventually consistent.</p>

<h3>34:05: ACID vs BASE [unnumbered]</h3>

<span class="full-image-float-none"><img src="http://www.webperformancematters.com/storage/post-graphics/Vogels%20CAP%20Theorem%205a.JPG" alt="Vogels%20CAP%20Theorem%205a.JPG" title="Vogels%20CAP%20Theorem%205a.JPG"/></span>

<p><strong>34:05:</strong> ACID is pessimistic: it will fail if it cannot deliver the guarantees that you want. Availability is less important than <em>consistency</em>. For BASE systems, <em>availability</em> is the most important, and you are willing to sacrifice something to ensure it. So, for example, in a Web application you will design an application to accept and store shopping-cart input from a customer, and deal with minor problems in the data later. It's a weaker level of consistency, but you never want to tell the customer you can't accept their input.</p>

<h3>35:50: Why the Divide? [Slide 7]</h3>

<span class="full-image-float-none"><img src="http://www.webperformancematters.com/storage/post-graphics/Vogels%20CAP%20Theorem%207.JPG" alt="Vogels%20CAP%20Theorem%207.JPG" title="Vogels%20CAP%20Theorem%207.JPG"/></span>

<p><strong>35:50:</strong> CAP stands for <strong>C</strong>onsistency, <strong>A</strong>vailability, and <strong>P</strong>artition tolerance. Eric Brewer came up with the conjecture that <em>systems can only possess two of these three characteristics</em>, which was subsequently proved to be true. That means systems designers must make choices: they must decide how to handle data reads and writes. If you insist on always enforcing consistency, your system may have to reject data interactions, making the system (in effect) unavailable at certain times. If you value availability and want to always accept user interactions, your application must then deal with the fact that some of its responses may later turn out to have been inconsistent.</p>
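
<p>The choice can be sketched as a policy on a single replica that has lost contact with its peers. This is a toy model under my own assumptions (the <code>Replica</code> class and its methods are hypothetical), intended only to show the two behaviours side by side.</p>

```python
class Replica:
    """Toy replica that must choose a policy for writes during a partition."""

    def __init__(self, prefer_consistency):
        self.prefer_consistency = prefer_consistency
        self.data = {}
        self.pending = []          # writes accepted while cut off from peers
        self.partitioned = False

    def write(self, key, value):
        if self.partitioned and self.prefer_consistency:
            return False           # consistency first: refuse, i.e. be unavailable
        self.data[key] = value
        if self.partitioned:
            # Availability first: accept now, reconcile with peers later.
            self.pending.append((key, value))
        return True

    def heal(self):
        """Partition ends: hand back the writes that still need reconciling."""
        self.partitioned = False
        to_reconcile, self.pending = self.pending, []
        return to_reconcile


consistent = Replica(prefer_consistency=True)
available = Replica(prefer_consistency=False)
consistent.partitioned = available.partitioned = True

print(consistent.write("cart:42", "book"))   # rejected during the partition
print(available.write("cart:42", "book"))    # accepted, queued for later
print(available.heal())
```

<p>Neither policy is wrong; the first gives up availability during the partition, and the second gives up the guarantee that what it accepted agrees with the other replicas.</p>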

<p><strong>38:45:</strong> Sometimes applications can deal with this. In the Web environment, the common technique of customer stickiness may help you to operate with lower levels of consistency. Once they have begun a session, customers are typically redirected to the same server cluster, or data center. So a local level of data consistency is sufficient; global consistency is unnecessary. </p>
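
<p>Stickiness can be as simple as remembering which cluster first served a session. The sketch below assumes a round-robin choice for new sessions and an in-memory session table; <code>StickyRouter</code> and the cluster names are hypothetical, and real load balancers offer many other ways to do this.</p>

```python
import itertools


class StickyRouter:
    """Send each session to one cluster and keep sending it there."""

    def __init__(self, clusters):
        self._next_cluster = itertools.cycle(clusters)   # round-robin for new sessions
        self._sessions = {}                               # session id to cluster

    def route(self, session_id):
        if session_id not in self._sessions:
            self._sessions[session_id] = next(self._next_cluster)
        return self._sessions[session_id]


router = StickyRouter(["cluster-a", "cluster-b"])   # hypothetical cluster names
print(router.route("session-1"))
print(router.route("session-2"))
print(router.route("session-1"))   # same cluster as the first call
```

<p>Because a customer keeps reading and writing in one place, that cluster only has to be consistent with itself for the session to look correct.</p>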

<h3><strong>39:30:</strong> Consistency and Availability [slide 9]</h3>

<span class="full-image-float-none"><img src="http://www.webperformancematters.com/storage/post-graphics/Vogels%20CAP%20Theorem%209.JPG" alt="Vogels%20CAP%20Theorem%209.JPG" title="Vogels%20CAP%20Theorem%209.JPG"/></span>

<p><strong>39:30:</strong> Many applications have a workflow behavior. First the customer interacts with the shopping cart. At this time, availability is the most important. After that you do all kinds of things with that data, and during those activities, consistency is the most important. Now, because you are not interacting with the customer directly, if you can't obtain consistency for one data item, you can move on to process a different data item and come back later. Then you get to the shipment and delivery phase, and at this time the database is mostly read only. A data architecture that forces you to use the same powerful tool -- like a big relational database -- for all these different activities is not ideal. If you select data storage solutions that are appropriate for each phase, it's easier to scale your solution.</p>

<h3><strong>44:00:</strong> Partition-Tolerance and Availability [slide 10]</h3>

<span class="full-image-float-none"><img src="http://www.webperformancematters.com/storage/post-graphics/Vogels%20CAP%20Theorem%2010.JPG" alt="Vogels%20CAP%20Theorem%2010.JPG" title="Vogels%20CAP%20Theorem%2010.JPG"/></span>

<p><strong>44:00:</strong> It's hard to program for weaker levels of consistency. Amazon has developed some APIs, but Werner had no time to discuss these solutions. Slide 10 lists some examples, which he discussed briefly. The core design approach involves guaranteeing the durability of data inputs while relaxing consistency enforcement, then returning later to deal with any inconsistencies. </p>
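
<p>As an illustration of "accept now, reconcile later", suppose two nodes each durably stored a different version of the same shopping cart. One possible reconciliation rule, which I have picked for the example and which is not described in the talk, is to keep the largest quantity seen for each item.</p>

```python
def merge_carts(*versions):
    """Reconcile divergent cart versions, keeping the largest quantity per item."""
    merged = {}
    for cart in versions:
        for item, quantity in cart.items():
            merged[item] = max(merged.get(item, 0), quantity)
    return merged


# Hypothetical versions of one customer's cart, stored on two different nodes.
version_a = {"book": 1, "dvd": 2}
version_b = {"book": 3, "pen": 1}
print(merge_carts(version_a, version_b))
```

<p>The rule never loses an addition, but an item deleted in one version and still present in the other will reappear after the merge. That is the kind of minor problem the application, or the customer at checkout, has to tolerate in exchange for never refusing input.</p>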

<h3><strong>45:40:</strong> Techniques [slide 11]</h3>

<span class="full-image-float-none"><img src="http://www.webperformancematters.com/storage/post-graphics/Vogels%20CAP%20Theorem%2011.JPG" alt="Vogels%20CAP%20Theorem%2011.JPG" title="Vogels%20CAP%20Theorem%2011.JPG"/></span>

<p><strong>45:40:</strong> Read the slide, because Werner does not talk about it!</p>

<p>We used to use a lot of DB technology at Amazon. It works really well, especially if most applications manipulate single data records using their primary keys. You can still create accessors that iterate over the entire database, but these should be relegated to a lower priority, background, status. The primary interfaces should offer only simple get/put accesses. In a production database that supports transactions, you don't need to also support queries, especially if the data is just XML text anyway. If engineers know what's inside those XML records, they may start coding against it! </p>
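
<p>A store with that shape might look like the sketch below: get and put by primary key as the main interface, records treated as opaque text, and a batched scan meant for background jobs. The class and method names are hypothetical and the storage is just an in-memory dict.</p>

```python
class KeyValueStore:
    """Primary-key store: get/put up front, full scans kept in the background."""

    def __init__(self):
        self._records = {}

    def put(self, key, record):
        # The record is stored as opaque text; the store never looks inside it.
        self._records[key] = record

    def get(self, key):
        return self._records.get(key)

    def scan(self, batch_size=100):
        """Yield (key, record) batches; intended for low-priority background use."""
        keys = sorted(self._records)
        for start in range(0, len(keys), batch_size):
            batch = keys[start:start + batch_size]
            # Skip keys that were deleted after the scan began.
            yield [(k, self._records[k]) for k in batch if k in self._records]


store = KeyValueStore()
for n in (1, 2, 3):
    store.put(f"order:{n}", f"opaque record {n}")

print(store.get("order:2"))
for batch in store.scan(batch_size=2):
    print(batch)
```

<p>Yielding small batches gives a scheduler the chance to pause the scan whenever foreground get/put traffic needs the capacity.</p>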

<p><strong>48:00:</strong> Whatever DM software you are running, databases that require high performance need specialists to configure, operate, and manage them. Engineers can't do it effectively, you need DBAs, even for very simple access patterns. <span class="aside">[<strong>Guideline 19.5</strong> in High-Performance Client/Server: Although DBMSs may offer similar features, implementations usually differ. Never assume that a design rule of thumb learned for one DBMS can be applied unchanged to another.]</span> </p>

<h3><strong>50:30:</strong> What does this mean for the data architecture?</h3>

<span class="full-image-float-none"><img src="http://www.webperformancematters.com/storage/post-graphics/Vogels%20CAP%20Theorem%2014.JPG" alt="Vogels%20CAP%20Theorem%2014.JPG" title="Vogels%20CAP%20Theorem%2014.JPG"/></span>

<p><strong>50:30:</strong> Again, read the slide, because in the edited presentation stream, Werner appears to be speaking to a different slide altogether. And then he ran out of time, so ...</p>

<p>... that's it. A really insightful and informative talk by Werner Vogels. No doubt his presentation could have been improved, given more time, or even just better use of the time available. But all the same, it is very stimulating (I think) and well worth several listens, until you grasp the central points -- all of which I agree with. In fact, Werner's conclusions circle back to the conclusion of my book, and my opening statement in the <a href="http://www.webperformancematters.com/journal/2007/8/13/asynchronous-architectures-1.html" class="offsite-link-inline">first post</a> in this series: </p>

<div class="InlineTextBox">
<p><em>Decoupled processes and multi-transaction workflows are the optimal starting point for the design of high-performance (distributed) systems.</em></p>
</div>

<p>Documenting this talk has been both educational and satisfying. But -- since I can't type nearly as quickly as Werner can talk -- I may have misquoted him somewhere. If you spot a mistake please let me know, and I'll correct it.</p> 


<p class="Footnote"><strong>This series of posts</strong> contains some material first published in <a href="http://www.amazon.com/exec/obidos/tg/detail/-/0471162698/002-1562714-5063248?v=glance" >High-Performance Client/Server</a>. My 1998 book is out of print now, and contains some outdated examples and references. But most of the discussions of performance principles are timeless, and you can pick up a used copy for about $3.00 at Amazon.</p>

<p class="Footnote"><strong>Tags:</strong>
<a href="http://technorati.com/tag/Werner+Vogels" rel="tag">Werner Vogels</a>,
<a href="http://technorati.com/tag/Amazon" rel="tag">Amazon</a>,
<a href="http://technorati.com/tag/QCon" rel="tag">QCon</a>, 
<a href="http://technorati.com/tag/distributed+systems" rel="tag">distributed systems</a>, 
<a href="http://technorati.com/tag/asynchronous+architecture" rel="tag">asynchronous architecture</a>,
<a href="http://technorati.com/tag/Web+services" rel="tag">Web services</a>,
<a href="http://technorati.com/tag/SOA" rel="tag">SOA</a>,
<a href="http://technorati.com/tag/performance" rel="tag">performance</a>,
<a href="http://technorati.com/tag/scalability" rel="tag">scalability</a>,
<a href="http://technorati.com/tag/synchronization" rel="tag">synchronization</a>, 
<a href="http://technorati.com/tag/autonomy" rel="tag">autonomy</a>, 
<a href="http://technorati.com/tag/multi-transaction" rel="tag">multi-transaction</a>,
<a href="http://technorati.com/tag/workflow" rel="tag">workflow</a>, 
<a href="http://technorati.com/tag/David+Patterson" rel="tag">David Patterson</a>,
<a href="http://technorati.com/tag/Recovery+Oriented+Computing" rel="tag">Recovery Oriented Computing</a>,
<a href="http://technorati.com/tag/Fred+Brooks" rel="tag">Fred Brooks</a>,
<a href="http://technorati.com/tag/Mythical+Man+Month" rel="tag">Mythical Man Month</a>,  
<a href="http://technorati.com/tag/Eric+Brewer" rel="tag">Eric Brewer</a>, 
<a href="http://technorati.com/tag/ACID+properties" rel="tag">ACID properties</a>,
<a href="http://technorati.com/tag/two-phase+commit" rel="tag">two-phase commit</a>, 
<a href="http://technorati.com/tag/BASE" rel="tag">BASE</a>, 
<a href="http://technorati.com/tag/CAP+theorem" rel="tag">CAP theorem</a>, 
<a href="http://technorati.com/tag/performance+matters" rel="tag">performance matters</a> 
</p>
]]></content></entry><entry><title>Asynchronous Architectures [3]</title><category>Foundations of Performance</category><category>Software Engineering</category><category>Performance Wisdom</category><category>Architecture</category><id>http://www.webperformancematters.com/journal/2007/8/15/asynchronous-architectures-3.html</id><link rel="alternate" type="text/html" href="http://www.webperformancematters.com/journal/2007/8/15/asynchronous-architectures-3.html"/><author><name>Chris Loosley</name></author><published>2007-08-15T07:01:00Z</published><updated>2007-08-15T07:01:00Z</updated><content type="html" xml:lang="en-US"><![CDATA[<div class="PageWisdomWrapper">

<div class="WisdomTitle" >
<h3>Dan Pritchett's Design Principle</h3>
<p class="WisdomClass" ><a href="http://www.webperformancematters.com/display/ShowJournal?moduleId=1113404&categoryId=98044">Performance Wisdom</a>: 13</p>
</div>

<p class="WisdomQuote">
Always assume high latency, not low latency
</p>

<div class="WisdomText">
<p>One of the underlying principles is assuming high latency, not low latency. An architecture that is tolerant of high latency will operate perfectly well with low latency, but the opposite is never true.</p>
<p class="QuoteSource">-- Dan Pritchett, <a href="http://www.infoq.com/articles/pritchett-latency" class="offsite-link-inline">The Challenges of Latency</a>, May 2, 2007</p>
</div>
</div>

<p><em>This is the third in a series of posts presenting arguments for <strong>asynchronous architectures</strong> as the optimal way to build high-performance, scalable systems for a distributed environment.</em></p>

<p>The first reviewed the case for asynchronous communication among interdependent components or services, and <a href="http://www.webperformancematters.com/journal/2007/8/13/asynchronous-architectures-1.html" class="PMref">Bell's Law of Waiting</a>. The second highlighted <a href="http://www.webperformancematters.com/journal/2007/8/14/asynchronous-architectures-2.html" class="PMref">The Fallacies of Distributed Computing</a>, and discussed the importance of reflecting the business process in distributed systems design.</p>

<p>This post reviews <a href="http://www.infoq.com/articles/pritchett-latency" class="offsite-link-inline">The Challenges of Latency</a>, an article about how asynchronous architectures can improve the quality of Web applications, published on the <a href="http://www.infoq.com/" class="offsite-link-inline">InfoQ</a> site by eBay architect Dan Pritchett in May 2007. Dan's article is especially relevant today, given the high level of interest in adopting Web services and SOA approaches.</p>

<p>Dan explains why global, large-scale architectures need to address latency, and what architectural patterns can be applied to deal with it. He begins by invoking the second fallacy of distributed computing:</p>

<blockquote>
<p><strong>Latency.</strong></p>

<p>The time it takes packets to flow from one part of the world to another.  Everyone knows it exists. The second fallacy of distributed computing is &quot;Latency is zero&quot;.  Yet so many designs attempt to work around latency instead of embracing it.  This is unfortunate and in fact doesn't work for large-scale systems. Why?</p>

<p>In any large-scale system, there are a few inescapable facts:</p>
<ol>
    <li> A broad customer base will demand reasonably consistent performance across the globe.</li>
    <li> Business continuity will demand geographic diversity in your deployments.</li>
    <li> The speed of light isn't going to change.</li>
</ol>
<p>The last point can't be emphasized enough. The speed of light dictates that even if we can route packets at the speed of light, seems unlikely, it will take 30ms for a packet to traverse the Atlantic.</p>

<p class="QuoteSource">-- Dan Pritchett, <a href="http://www.infoq.com/articles/pritchett-latency" class="offsite-link-inline">The Challenges of Latency</a>, May 2, 2007 [emphasis added]</p>
</blockquote>
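
<p>Dan's 30ms figure is easy to sanity-check. The calculation below uses the speed of light in a vacuum, plus two assumptions of my own: a New York to London distance of about 5,600 km, and light travelling through optical fiber at roughly two-thirds of its vacuum speed.</p>

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458   # in a vacuum
FIBER_FRACTION = 2 / 3                  # assumed: light in glass is about 2/3 as fast
ATLANTIC_KM = 5_600                     # assumed: approximate New York to London distance


def one_way_ms(distance_km, fraction_of_c=1.0):
    """One-way propagation delay in milliseconds, ignoring routing and queuing."""
    return distance_km / (SPEED_OF_LIGHT_KM_PER_S * fraction_of_c) * 1000


vacuum_ms = one_way_ms(ATLANTIC_KM)
fiber_ms = one_way_ms(ATLANTIC_KM, FIBER_FRACTION)
print(f"vacuum: {vacuum_ms:.1f} ms one way")
print(f"fiber:  {fiber_ms:.1f} ms one way, {2 * fiber_ms:.1f} ms round trip")
```

<p>Under these assumptions the one-way delay in fiber comes out in the high twenties of milliseconds, close to Dan's figure, and that is before any router, server or retransmission adds its share.</p>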

<h3>Latency hurts customer service</h3>

<p>He emphasises the connection between Internet latency and customer service:</p>

<blockquote>
<p><strong>The Internet is a part of foundation of the global economy.</strong></p>

<p>Companies need to reliably reach their customers regardless of where they may be located. Architectures that force close geographic proximity of the components limit the quality of service provided to geographically distributed customers. Response time will obviously degrade the further customers are from the servers, but so will reliability. Despite the tremendous increase in the reliability of traffic routing on the Internet, the further you are from a service, the more often that service will be effectively unavailable to you.</p>

<p class="QuoteSource">-- Dan Pritchett, <a href="http://www.infoq.com/articles/pritchett-latency" class="offsite-link-inline">The Challenges of Latency</a>, May 2, 2007 [emphasis added]</p>
</blockquote>

<h3>Latency tolerance</h3>

<p>After spelling out the principle that I have highlighted above as today's <em>Performance Wisdom</em>, he goes on to make the case for introducing asynchronous interactions as the way to achieve latency tolerance.</p>

<blockquote>
<p>The web has created an interaction style that is very problematic for building asynchronous systems. The web has trained the world to expect request/response interactions, with very low latency between the request and response. These expectations have driven architectures that are request/response oriented that lead to synchronous interactions from the browser to the data. This pattern does not lend itself to high latency connections.</p>

<p><strong>Latency tolerance can only be achieved by introducing asynchronous interactions to your architecture.</strong></p>

<p>The challenge becomes determining the components that can be decoupled and integrated via asynchronous interactions. An asynchronous architecture is far more than simply changing the request/response from a call to a series of messages though. The client is still expecting a response in a deterministic time. Asynchronous architectures shift from deterministic response time to probabilistic response time. Removing the determinism is uncomfortable for users and probably for your business units, but is critical to achieving true asynchronous interactions.</p>

<p class="QuoteSource">-- Dan Pritchett, <a href="http://www.infoq.com/articles/pritchett-latency" class="offsite-link-inline">The Challenges of Latency</a>, May 2, 2007 [emphasis added]</p>
</blockquote>
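
<p>The shift Dan describes can be sketched as a service that acknowledges a request immediately and confirms it later. This is a single-process illustration using Python's standard <code>queue</code> module; <code>OrderDesk</code> and its status values are hypothetical, and in a real system the worker would run elsewhere.</p>

```python
import queue
import uuid


class OrderDesk:
    """Accept requests at once; confirm them whenever the back end gets to them."""

    def __init__(self):
        self._inbox = queue.Queue()
        self._status = {}

    def submit(self, order):
        # Return a ticket straight away instead of waiting for the work to finish.
        ticket = str(uuid.uuid4())
        self._status[ticket] = "accepted"
        self._inbox.put((ticket, order))
        return ticket

    def status(self, ticket):
        return self._status[ticket]

    def process_one(self):
        """Run by a background worker; how soon it runs is not guaranteed."""
        ticket, _order = self._inbox.get_nowait()
        self._status[ticket] = "confirmed"


desk = OrderDesk()
ticket = desk.submit({"item": "book", "quantity": 1})
print(desk.status(ticket))    # accepted, but not yet confirmed
desk.process_one()            # later, whenever the worker catches up
print(desk.status(ticket))
```

<p>The user sees "accepted" in a predictable time; the moment of "confirmed" is the probabilistic part that the business has to be comfortable with.</p>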

<p><a href="http://www.thbs.com/pdfs/sync_or_async.pdf" class="offsite-link-inline">Web Services in SOA - Synchronous or Asynchronous?</a>, a paper by Torry Harris Business Solutions, offers another introduction to the pros and cons of synchronous and asynchronous architectures.</p>

<p>Dan accepts that not all Web application components can be designed to function asynchronously, but argues that designers can identify those use cases that do support asynchronous interactions. These arguments confirm my <a href="http://www.webperformancematters.com/journal/2007/8/14/asynchronous-architectures-2.html" class="PMref">earlier</a> conclusion that synchronous solutions must be combined with <em>asynchronous designs in which the user must accept that unconfirmed changes will be reflected in the enterprise database(s) at a later time</em>.</p>

<h3>Data partitioning</h3>

<p>He makes a very good point about the importance of designing data distribution from the outset:</p>

<blockquote>
<p>You can decompose your applications into a collection of loosely coupled components; expose your services using asynchronous interfaces, and yet still leave yourself parked in one data center with little hope of escape. You have to tackle your persistence model early in your architecture and require that data can be split along both functional and scale vectors or you will not be able to distribute your architecture across geographies. </p>

<p><strong>I recently read an article where the recommendation was to delay horizontal data spreading until you reach vertical scaling limits. I can think of few pieces of worse advice for an architect. Splitting data is more complex than splitting applications.</strong></p>

<p>But if you don't do it at the beginning, applications will ultimately take short cuts that rely on a monolithic schema. These dependencies will be extremely difficult to break in the future.</p>

<p class="QuoteSource">-- Dan Pritchett, <a href="http://www.infoq.com/articles/pritchett-latency" class="offsite-link-inline">The Challenges of Latency</a>, May 2, 2007 [emphasis added]</p>
</blockquote>

<h3>ACID or BASE?</h3>

<p>Noting that the traditional way to maintain database consistency across partitioned data requires <a href="http://en.wikipedia.org/wiki/ACID" class="offsite-link-inline">ACID-compliant</a> distributed transactions and <a href="http://en.wikipedia.org/wiki/Two-phase_commit_protocol" class="offsite-link-inline">two-phase commit protocols</a>, Dan advocates a (cleverly-named) alternative to the <strong>ACID</strong> properties, the BASE approach to database consistency:</p>

<blockquote>
<p>The problem with distributed transactions is they create synchronous couplings across the databases. Synchronous couplings are the antithesis of latency tolerant designs. The alternative to <strong>ACID</strong> is <strong>BASE</strong>:</p>

<p><strong>B</strong>asically <strong>A</strong>vailable
<br /><strong>S</strong>oft state
<br /><strong>E</strong>ventually consistent
</p>

<p>BASE frees the model from the need for synchronous couplings. Once you accept that state will not always be perfect and consistency occurs asynchronous to the initiating operation, you have a model that can tolerate latency.</p> 
<p class="QuoteSource">-- Dan Pritchett, <a href="http://www.infoq.com/articles/pritchett-latency" class="offsite-link-inline">The Challenges of Latency</a>, May 2, 2007</p>
</blockquote>
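
<p>To make the idea of "soft state" concrete, here is a minimal sketch in Python. It is my own illustration, not code from Dan's article: the class name <code>EventuallyConsistentStore</code>, the two in-memory dictionaries, and the key <code>bid:42</code> are all hypothetical stand-ins for real databases and data. A write is accepted at once on one copy and reaches the second copy only when a later propagation step runs.</p>

```python
from collections import deque


class EventuallyConsistentStore:
    """Toy model: a write lands on a primary copy at once and reaches a replica later."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.pending = deque()  # changes accepted but not yet propagated

    def write(self, key, value):
        # Accept the change locally and return without waiting for the replica.
        self.primary[key] = value
        self.pending.append((key, value))

    def propagate(self):
        # Runs later, asynchronously to the initiating write.
        while self.pending:
            key, value = self.pending.popleft()
            self.replica[key] = value


store = EventuallyConsistentStore()
store.write("bid:42", 100)
stale_read = store.replica.get("bid:42")  # None: the replica still lags
store.propagate()
fresh_read = store.replica.get("bid:42")  # 100: the copies have converged
```

<p>Between the write and the propagation step the two copies disagree, which is exactly the imperfect state that BASE asks the design (and the user) to tolerate.</p>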

<p>Another article worth reading, <a href="http://xml.sys-con.com/read/43755.htm" class="offsite-link-inline">Web-Services Transactions</a> by Doug Kaye, advances similar arguments without using the <em>BASE</em> terminology.</p>

<h3>References: business-driven or event-driven architectures</h3>

<p>While these articles present, at a high-level, a convincing case for asynchronous architectures, many others have elaborated on the implementation details. Here are five examples of more detailed treatments, in approximately descending order of generality:</p>

<ul class="group">

<li>The Wikipedia article on <a href="http://en.wikipedia.org/wiki/Event_Driven_Architecture" class="offsite-link-inline">Event-driven architecture</a>.</li>

<li><a href="http://elementallinks.typepad.com/bmichelson/2006/02/eventdriven_arc.html" class="offsite-link-inline">Event-Driven Architecture Overview</a>, a very detailed post by Brenda Michelson in her blog, <a href="http://elementallinks.typepad.com/bmichelson/" class="offsite-link-inline">Elemental Links</a>. 

<span class="aside">[A blog <a href="http://dougmcclure.net/blog/?p=41" class="offsite-link-inline">response</a> from <a href="http://dougmcclure.net/blog/about/" class="offsite-link-inline">Doug McClure</a> exemplifies the gulf between the concerns of business/application architects and those of IT/network management, despite all the talk of "alignment" these days. As I wrote in my book, "Middleware is the reason why network specialists and application programmers cannot communicate!" (Guideline 15.3, p469).]</span></li>

<li><a href="http://www.javaworld.com/javaworld/jw-01-2005/jw-0131-soa.html" class="offsite-link-inline">Event-driven services in SOA</a> by Jeff Hanson, JavaWorld.com, January 31, 2005.</li>

<li><a href="http://msdn2.microsoft.com/en-us/library/ms706253.aspx" class="offsite-link-inline">Message Queuing Applications</a>, Microsoft Developer Network (MSDN).</li>

<li><a href="http://developers.sun.com/jsenterprise/reference/techart/jse7/asynch.html" class="offsite-link-inline">Developing Asynchronous Web Services with Java Message Service</a> by Rico Cruz and Marina Sum, Sun Developer Network.</li>

</ul>

<h3>Next: the CAP theorem</h3>

<p>Dan points out that adopting the BASE approach to consistency forces you to understand a very important principle, known as <em>The CAP Theorem</em>. This theorem was first propounded in a 1998 presentation -- <a href="http://www.ccs.neu.edu/groups/IEEE/ind-acad/brewer/" class="offsite-link-inline">Lessons from Internet Services: ACID vs. BASE</a> -- by Dr. Eric Brewer of Inktomi, now a <a href="http://www.cs.berkeley.edu/~brewer/" class="offsite-link-inline">professor</a> at UC Berkeley.</p>

<blockquote>
<p>Of course there are situations where data needs to be consistent at the end of an operation. The CAP Theorem is a useful tool for determining what data to partition and what data must conform to ACID.</p>

<p><strong>The CAP Theorem states that when designing databases you consider three properties, Consistency, Availability, and Partitioning. You can have at most two of the three for any data model.</strong></p>

<p>Organizing your data model around CAP allows you to make the appropriate decisions with regards to consistency and latency.</p>

<p class="QuoteSource">-- Dan Pritchett, <a href="http://www.infoq.com/articles/pritchett-latency" class="offsite-link-inline">The Challenges of Latency</a>, May 2, 2007 [emphasis added]</p>
</blockquote>

<p>Because <em>The CAP Theorem</em> plays a crucial role in the design of large scalable systems using asynchronous architectures, I will be devoting my next post to it.</p>

<p class="Footnote"><strong>This series of posts</strong> contains some material first published in <a href="http://www.amazon.com/exec/obidos/tg/detail/-/0471162698/002-1562714-5063248?v=glance" >High-Performance Client/Server</a>. My 1998 book is out of print now, and contains some outdated examples and references. But most of the discussions of performance principles are timeless, and you can pick up a used copy for about $3.00 at Amazon.</p>

<p class="Footnote"><strong>Tags:</strong> 
<a href="http://technorati.com/tag/distributed+systems" rel="tag">distributed systems</a>, 
<a href="http://technorati.com/tag/asynchronous+architecture" rel="tag">asynchronous architecture</a>,
<a href="http://technorati.com/tag/Web+services" rel="tag">Web services</a>,
<a href="http://technorati.com/tag/SOA" rel="tag">SOA</a>,
<a href="http://technorati.com/tag/middleware" rel="tag">middleware</a>,
<a href="http://technorati.com/tag/serialization" rel="tag">serialization</a>,
<a href="http://technorati.com/tag/synchronization" rel="tag">synchronization</a>, 
<a href="http://technorati.com/tag/queues" rel="tag">queues</a>, 
<a href="http://technorati.com/tag/decoupled+processes" rel="tag">decoupled processes</a>, 
<a href="http://technorati.com/tag/multi-transaction" rel="tag">multi-transaction</a>,
<a href="http://technorati.com/tag/workflow" rel="tag">workflow</a>, 
<a href="http://technorati.com/tag/distributed+computing" rel="tag">distributed computing</a>,
<a href="http://technorati.com/tag/Dan+Pritchett" rel="tag">Dan Pritchett</a>,
<a href="http://technorati.com/tag/eBay" rel="tag">eBay</a>,
<a href="http://technorati.com/tag/Brenda+Michelson" rel="tag">Brenda Michelson</a>,
<a href="http://technorati.com/tag/Elemental+Links" rel="tag">Elemental Links</a>,
<a href="http://technorati.com/tag/Doug+McClure" rel="tag">Doug McClure</a>,
<a href="http://technorati.com/tag/Eric+Brewer" rel="tag">Eric Brewer</a>, 
<a href="http://technorati.com/tag/Inktomi" rel="tag">Inktomi</a>,  
<a href="http://technorati.com/tag/ACID+properties" rel="tag">ACID properties</a>,
<a href="http://technorati.com/tag/two-phase+commit" rel="tag">two-phase commit</a>, 
<a href="http://technorati.com/tag/BASE" rel="tag">BASE</a>, 
<a href="http://technorati.com/tag/CAP+theorem" rel="tag">CAP theorem</a>, 
<a href="http://technorati.com/tag/application+performance" rel="tag">application performance</a>,
<a href="http://technorati.com/tag/scalability" rel="tag">scalability</a>,
<a href="http://technorati.com/tag/performance+wisdom" rel="tag">performance wisdom</a>,  
<a href="http://technorati.com/tag/performance+matters" rel="tag">performance matters</a> 
</p>]]></content></entry><entry><title>Asynchronous Architectures [2]</title><category>Foundations of Performance</category><category>Software Engineering</category><category>Performance Wisdom</category><category>Architecture</category><id>http://www.webperformancematters.com/journal/2007/8/14/asynchronous-architectures-2.html</id><link rel="alternate" type="text/html" href="http://www.webperformancematters.com/journal/2007/8/14/asynchronous-architectures-2.html"/><author><name>Chris Loosley</name></author><published>2007-08-14T07:01:00Z</published><updated>2007-08-14T07:01:00Z</updated><content type="html" xml:lang="en-US"><![CDATA[<div class="PageWisdomWrapper">

<div class="WisdomTitle" >
<h3>The Fallacies of Distributed Computing</h3>
<p class="WisdomClass" ><a href="http://www.webperformancematters.com/display/ShowJournal?moduleId=1113404&categoryId=98044">Performance Wisdom</a>: 12</p>
</div>

<div class="WisdomQuoteLong">
<ol>
<li>The network is reliable</li>
<li>Latency is zero</li>
<li>Bandwidth is infinite</li>
<li>The network is secure</li>
<li>Topology doesn't change</li>
<li>There is one administrator</li>
<li>Transport cost is zero</li>
<li>The network is homogeneous</li>
</ol>
</div>

<div class="WisdomText">
<p class="QuoteSource">-- <a href="http://java.sys-con.com/read/38665.htm" class="offsite-link-inline">Peter Deutsch</a>, James Gosling, Bill Joy, Tom Lyon</p>
</div>
</div>

<p><em>This post is the second in a series presenting arguments for <strong>asynchronous architectures</strong> as the optimal way to build high-performance, scalable systems for a distributed environment. The first post reviewed the general case for asynchronous communication among interdependent components or services, and highlighted <a href="http://www.webperformancematters.com/journal/2007/8/13/asynchronous-architectures-1.html" class="PMref">Bell's Law of Waiting</a>.</em></p>

<p>In this post I discuss how the design of distributed systems should draw on that of manual business systems. Of course, distributed computing can shorten the timescales of some business operations enormously. But drawing analogies with the way manual systems work will help us to design efficient and scalable distributed systems.</p>

<p>In <a href="http://www.webperformancematters.com/journal/2007/8/10/five-scalability-principles.html" class="PMref">Five Scalability Principles</a>, I reviewed an <a href="http://www.mysql.com/why-mysql/scaleout/booking.html" class="offsite-link-inline">article</a> published by MySQL about the five performance principles that apply to all application scaling efforts. When discussing the first principle -- Don't think synchronously -- I stated that:</p>

<p><em>Decoupled processes and multi-transaction workflows are the optimal starting point for the design of high-performance (distributed) systems</em>.</p>

<h3>Starting from the wrong place</h3>
	
<p>Designing a computer system based on multi-transaction workflows is not a particularly revolutionary proposal. Indeed, if we had somehow been able to skip the first 40 years of the computer age and start automating our business systems using today’s technology, we would probably not have thought it the least bit unusual, because all manual systems work this way.</p> 
	
<p>But like our computers, our reasoning can be so logical that it lacks real thought. Occasionally, we need to balance our linear thinking with a small dose of simple wit like that ascribed to the Irish farmer, who, when asked by strangers for directions to a distant town, began his answer by saying “Well, I wouldn’t start from here”!</p>
 
<p>Most of our troubles stem from the fact that, when trying to reach the destination of distributed systems, we keep starting from the wrong place, namely the application designs and systems software of centralized computing:</p>
<ul>
<li>Design discussions dwell on how best to <a href="http://dev.mysql.com/tech-resources/articles/application_partitioning_wp.pdf" class="offsite-link-inline">partition</a> applications for the distributed environment. Manual systems, however, are already partitioned naturally -- the supposedly monolithic application that is being “partitioned” would not exist in the first place if it had not been conceived as “the right solution” by a designer with a centralized computing mindset.</li>
<li>The reason computer science devoted so much attention to <a href="http://en.wikipedia.org/wiki/Distributed_database" class="offsite-link-inline">distributed databases</a> and <a href="http://msdn2.microsoft.com/en-us/library/ms681205.aspx" class="offsite-link-inline">distributed transaction</a> management is that these concepts are extensions of the core mechanisms of centralized information processing -- shared databases and transaction monitors.</li>
</ul>

<p>But <a href="http://www.rgoarchitects.com/Files/fallacies.pdf" class="offsite-link-inline">The Fallacies of Distributed Computing</a> -- assembled during the 1990s by architects at Sun and the subject of today's <em>Performance Wisdom</em> (above) -- highlight crucial differences between centralized and distributed computing. Adding network components to an application introduces many potential problems that a centralized solution does not have to consider. So rather than trying to force the centralized mechanisms to work in a distributed environment, we should adopt mechanisms that are more appropriate.</p>

<h3>Starting from the right place</h3>
	
<p>In fact, we should start from the design of manual business systems. All large-scale human systems are inherently distributed and asynchronous in nature. Even the participants in close-knit team efforts operate asynchronously. We find chorus lines, cheerleaders, marching bands, and synchronized swimmers so interesting because they are such an aberration. So, before the advent of the centralized mainframe, the idea of recording an entire business transaction with a single synchronized set of human actions did not arise because it is so absurdly impossible.</p>

<p>Traditionally, the business process is divided into its natural components (or phases), according to the roles of the various human processors (or workers). Work flows through the phases, and information is recorded as necessary along the way. If anything goes wrong along the way, the appropriate set of compensating actions must be taken to undo whatever partial progress has been made. And the whole operation is designed to ensure that no irrevocable actions are taken too early in the process--usually meaning before the money is in the bank. Companies that mail out the diamonds before cashing the checks soon learn how to design a more effective multi-phase business process.</p>
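
<p>A multi-phase process with compensating actions can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function <code>run_workflow</code> and the phase names (<code>reserve_stock</code>, <code>cash_check</code>, <code>ship_diamonds</code>, and their undo counterparts) are hypothetical, and a real system would persist its progress rather than hold it in memory.</p>

```python
def run_workflow(phases):
    """Run (action, compensation) pairs in order; undo completed phases on failure."""
    completed = []
    for action, compensate in phases:
        try:
            action()
        except Exception:
            # Something went wrong: undo the partial progress, newest first.
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True


log = []


def reserve_stock():
    log.append("reserve stock")


def release_stock():
    log.append("release stock")


def cash_check():
    raise RuntimeError("check bounced")


def refund():
    log.append("refund")


def ship_diamonds():
    log.append("ship diamonds")


def no_undo():
    pass


ok = run_workflow([
    (reserve_stock, release_stock),
    (cash_check, refund),
    (ship_diamonds, no_undo),  # the irrevocable step is placed last
])
```

<p>Because the check fails before the shipping phase is reached, the only compensation needed is to release the reserved stock. Ordering the phases so that the irrevocable one comes last is the design choice that matters here.</p>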
	
<h3>Asynchronous architectures may reflect manual systems</h3>

<p>Manual systems always permit asynchronous operation of their separate components, because no other mode of operation is possible. Only computers make synchronous changes even possible. To a degree, centralized computing could deliver synchronous changes to related databases because the same computer managed all the system resources. Peaks in the workload could cause contention and delays, but provided the machine kept running, a congested machine acted as its own governor.</p>
	
<p>But processor technology does not allow the centralized computing model to scale up without limits. When the workload surpasses the capabilities of the largest centralized processor, the only way to keep growing is to divide and conquer -- to create a network of computers. Networked computers demand a different approach to application design. Pursuing the vision of enterprise-wide synchronization of information through networked computers can cause delays and inefficiencies when any part of the whole system operates below par. This is the Achilles heel of interdependent systems--the whole is no stronger than its weakest part.</p>
	
<p>Therefore, the rigid concept of synchronous transactions must be replaced by a wider range of possible application designs: </p>
<ul class="group">
<li>The older, synchronous methods are still appropriate for changes that are within the scope of a single processor, or even for occasional communication between controlled components separated by a very carefully controlled, high speed network;</li>
<li>These must be combined with asynchronous designs in which the user must accept that unconfirmed changes will be reflected in the enterprise database(s) at a later time.</li>
</ul> 

<h3>Business process design</h3>

<p>When we do a good job of distributed systems design, it becomes an integral part of business process design. Rather than bending the business process to meet the needs of a centralized computer, we must blend the power of distributed computers into the business process.</p>
	
<p>Ironically, these changes bring us almost full circle back to the days of manual systems. In a manual system, it is normal for changes to be recorded quickly in one location, but for those changes to take a few days to percolate through the system, with the total processing time being somewhat uncertain. Distributed computing shortens the timescales, but applying many design principles that made manual systems work efficiently can help us to create effective and scalable distributed systems, even when we do not control the performance characteristics of all the components of those systems.</p>

<p>In my next post I will review some recent thinking about the intersection of asynchronous architectures and the world of Web services and SOA.</p>

<p class="Footnote"><strong>This post</strong> contains material first published in <a href="http://www.amazon.com/exec/obidos/tg/detail/-/0471162698/002-1562714-5063248?v=glance" >High-Performance Client/Server</a>, Chapter 16: Architecture for High Performance, pp509-510 and 526-527. My 1998 book is out of print now, and contains some outdated examples and references. But most of the discussions of performance principles are timeless, and you can pick up a used copy for about $3.00 at Amazon.</p>

<p class="Footnote"><strong>Tags:</strong> 
<a href="http://technorati.com/tag/distributed+systems" rel="tag">distributed systems</a>, 
<a href="http://technorati.com/tag/asynchronous+architecture" rel="tag">asynchronous architecture</a>,
<a href="http://technorati.com/tag/serialization" rel="tag">serialization</a>, 
<a href="http://technorati.com/tag/queues" rel="tag">queues</a>, 
<a href="http://technorati.com/tag/decoupled+processes" rel="tag">decoupled processes</a>, 
<a href="http://technorati.com/tag/multi-transaction" rel="tag">multi-transaction</a>,
<a href="http://technorati.com/tag/workflow" rel="tag">workflow</a>, 
<a href="http://technorati.com/tag/performance+wisdom" rel="tag">performance wisdom</a>,
<a href="http://technorati.com/tag/distributed+computing" rel="tag">distributed computing</a>,
<a href="http://technorati.com/tag/fallacies" rel="tag">fallacies</a>,
<a href="http://technorati.com/tag/Peter+Deutsch" rel="tag">Peter Deutsch</a>, 
<a href="http://technorati.com/tag/James+Gosling" rel="tag">James Gosling</a>, 
<a href="http://technorati.com/tag/Bill+Joy" rel="tag">Bill Joy</a>, 
<a href="http://technorati.com/tag/Tom+Lyon" rel="tag">Tom Lyon</a>, 
<a href="http://technorati.com/tag/Sun" rel="tag">Sun</a>, 
<a href="http://technorati.com/tag/software+performance" rel="tag">software performance</a>,
<a href="http://technorati.com/tag/scalability" rel="tag">scalability</a>,  
<a href="http://technorati.com/tag/performance+matters" rel="tag">performance matters</a> 
</p>]]></content></entry><entry><title>Asynchronous Architectures [1]</title><category>Foundations of Performance</category><category>Software Engineering</category><category>Performance Wisdom</category><category>Architecture</category><id>http://www.webperformancematters.com/journal/2007/8/13/asynchronous-architectures-1.html</id><link rel="alternate" type="text/html" href="http://www.webperformancematters.com/journal/2007/8/13/asynchronous-architectures-1.html"/><author><name>Chris Loosley</name></author><published>2007-08-13T07:01:00Z</published><updated>2007-08-13T07:01:00Z</updated><content type="html" xml:lang="en-US"><![CDATA[<div class="PageWisdomWrapper">

<div class="WisdomTitle" >
<h3>Bell's Law of Waiting</h3>
<p class="WisdomClass" ><a href="http://www.webperformancematters.com/display/ShowJournal?moduleId=1113404&categoryId=98044">Performance Wisdom</a>: 11</p>
</div>

<p class="WisdomQuote">All computers wait at the same speed</p>

<div class="WisdomText">
<p class="QuoteSource">-- Dr. Thomas E. Bell, Performance of Distributed Systems, Presentation to ICCM Capacity Management Forum 7, October 1993, San Francisco</p>
</div>
</div>

<p>In <a href="http://www.webperformancematters.com/journal/2007/8/10/five-scalability-principles.html" class="PMref">Five Scalability Principles</a>, I reviewed an <a href="http://www.mysql.com/why-mysql/scaleout/booking.html" class="offsite-link-inline">article</a> published by MySQL about the five performance principles that apply to all application scaling efforts. When discussing the first principle -- Don't think synchronously -- I stated that <em>Decoupled processes and multi-transaction workflows are the optimal starting point for the design of high-performance (distributed) systems</em>.</p>

<p>That's a quote from <a class="offsite-link-inline" href="http://www.amazon.com/exec/obidos/tg/detail/-/0471162698/002-1562714-5063248?v=glance" >High-Performance Client/Server</a>, from a section on <em>Abandoning the Single Synchronous Transaction Paradigm</em>, in Chapter 16, <em>Architecture for High Performance</em>. My 1998 book is out of print now, and contains some outdated examples and references. But most of the discussions of performance principles are timeless, and you can pick up a used copy for about $3.00 at Amazon.</p>

<p>So I am planning some more posts built around excerpts from the manuscript. I'll be updating and generalizing the terminology as necessary for today's environments, and adding some guidelines in my <a href="http://www.webperformancematters.com/display/ShowJournal?moduleId=1113404&categoryId=98044" class="PMref">Performance Wisdom</a> series.</p>

<h3>Asynchronous architectures are more scalable</h3>

<p>The first posts will elaborate on the arguments for <strong>asynchronous architectures</strong> as the optimal way to build high-performance, scalable systems for a distributed environment. I begin by reviewing the general case for asynchronous communication among interdependent components or services. </p>

<p>In the typical distributed enterprise, there will inevitably be fluctuations in the distribution of work to be done, as business volumes rise and fall, and fluctuations in the availability of network and processing resources.</p>
	
<p>Even if we have designed our systems to accommodate peak processing volumes, it is normal for some servers, some application components, or some part of a large network to be out of action some of the time. Therefore, if we design applications that require all resources to be available before we can complete any useful work, we reduce the availability of the whole system to the level of its most error-prone component.</p>
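
<p>A little arithmetic shows how quickly this bites. The sketch below assumes that component failures are independent, so that the availability of a chain in which every component must be up is the product of the individual availabilities. The three figures are illustrative values of my own, not measurements.</p>

```python
from math import prod


def serial_availability(availabilities):
    """Availability of a chain in which every component must be up at once.

    Assumes failures are independent, so the probabilities multiply.
    """
    return prod(availabilities)


# Hypothetical figures for a database, an application server, and a network link.
components = [0.999, 0.995, 0.99]
combined = serial_availability(components)  # about 0.984
weakest = min(components)                   # 0.99
```

<p>Under that independence assumption the combined figure is lower than that of the weakest component, so a design that needs everything at once can do no better than its least reliable part.</p>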
	
<p>For optimal performance, we should design applications to accommodate unexpected peaks in the workload, server outages, and resource unavailability. This means application and system design must:</p>
<ul class="group">
<li>Emphasize <strong>concurrent operation</strong> in preference to workload serialization</li>
<li>Prefer <strong>asynchronous connections</strong> to synchronous ones between clients and servers</li>
<li>Place requests for service in <strong>queues</strong> and continue processing, rather than waiting for a response</li>
<li>Create opportunities for <strong>parallel processing</strong> of workload components</li>
<li>Distribute work to <strong>overflow servers</strong> to accommodate peak volumes</li>
<li>Provide <strong>redundant servers</strong> to take over critical workload components during peaks and outages</li>
</ul>
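
<p>The queueing guideline in the list above can be sketched with Python's standard <code>queue</code> and <code>threading</code> modules. This is a toy, single-process illustration: the names <code>requests</code>, <code>results</code>, and <code>worker</code> are hypothetical, and doubling a number stands in for real service work.</p>

```python
import queue
import threading

requests = queue.Queue()
results = []


def worker():
    # Server side: drain the queue at its own pace.
    while True:
        item = requests.get()
        if item is None:  # sentinel: no more work
            break
        results.append(item * 2)  # stand-in for real service work


server = threading.Thread(target=worker)
server.start()

# Client side: enqueue each request and move on without waiting for a reply.
for n in range(5):
    requests.put(n)

requests.put(None)  # signal shutdown
server.join()       # only this demo waits here, so it can inspect the results
```

<p>The client never blocks on an individual response. In a real deployment the queue would be a durable messaging service, so that requests survive an outage of the server that processes them.</p>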

<h3>Design applications that don’t wait</h3>
	
<p>Each of these topics is too large to review in any detail in this post, but their central theme can be summed up as: <em>Design applications that don’t wait</em>. Note an important distinction between the behavior of individual transactions or units of work, and the behavior of the system as a whole. Individual transactions may indeed have to wait until they can obtain the processing resources they need. But the application as a whole should continue processing, with a minimal allocation of resources to any transactions flowing through the system.</p>

<p>That way, the scarce computing resources of one server do not sit idle waiting for delayed transactions to receive responses from other services or components. For example, here's some advice from BEA, taken from <a href="http://edocs.bea.com/wls/docs92/jms/design_best_practices.html#wp1058694" class="offsite-link-inline">Best Practices for Application Design</a> when programming the WebLogic Java Message Service (JMS):</p>
<blockquote>
<h4>Asynchronous vs. Synchronous Consumers</h4>
<p>In general, asynchronous (onMessage) consumers perform and scale better than synchronous consumers:</p>

<ul>
<li>Asynchronous consumers create less network traffic. Messages are pushed unidirectionally, and are pipelined to the message listener. Pipelining supports the aggregation of multiple messages into a single network call. </li>

<li>Asynchronous consumers use fewer threads. An asynchronous consumer does not use a thread while it is inactive. A synchronous consumer consumes a thread for the duration of its receive call. As a result, a thread can remain idle for long periods, especially if the call specifies a blocking timeout. </li>

<li>For application code that runs on a server, it is almost always best to use asynchronous consumers (which) prevents the application code from doing a blocking operation on the server. A blocking operation, in turn, idles a server-side thread; it can even cause deadlocks. Deadlocks occur when blocking operations consume all threads. When no threads remain to handle the operations required to unblock the blocking operation itself, that operation never stops blocking. </li>
</ul>
</blockquote>
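
<p>The distinction between a transaction that waits and an application that does not can be shown with a short Python sketch. It uses the standard <code>asyncio</code> module as an analogue only; it is not the WebLogic JMS API, and the function <code>call_service</code>, the service names, and the delays are assumptions of mine.</p>

```python
import asyncio

events = []


async def call_service(name, delay):
    events.append(f"{name} sent")
    await asyncio.sleep(delay)  # while this call waits, the thread does other work
    events.append(f"{name} replied")


async def main():
    # Both calls are in flight together on a single thread.
    await asyncio.gather(
        call_service("slow", 0.05),
        call_service("fast", 0.01),
    )


asyncio.run(main())
```

<p>Each call still waits for its own reply, but the slow call does not hold up the fast one: the recorded order is slow sent, fast sent, fast replied, slow replied. No thread sits idle in a blocking receive.</p>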

<h3>Bell's Law</h3>

<p>In conclusion, when designing distributed systems, we should always recall Tom Bell's humorous observation, <em>All computers wait at the same speed</em>. A computing resource that is waiting to be used, especially a processor, is a wasted resource.</p>

<p class="Footnote"><strong>This post</strong> contains material first published in <a class="offsite-link-inline" href="http://www.amazon.com/exec/obidos/tg/detail/-/0471162698/002-1562714-5063248?v=glance" >High-Performance Client/Server</a>, Chapter 11: The Sharing Principle, p360. 
<strong>Tags:</strong> 
<a href="http://technorati.com/tag/distributed+systems" rel="tag">distributed systems</a>, 
<a href="http://technorati.com/tag/asynchronous+architecture" rel="tag">asynchronous architecture</a>,
<a href="http://technorati.com/tag/serialization" rel="tag">serialization</a>, 
<a href="http://technorati.com/tag/queues" rel="tag">queues</a>, 
<a href="http://technorati.com/tag/Bell's+Law" rel="tag">Bell's Law</a>, 
<a href="http://technorati.com/tag/Tom+Bell" rel="tag">Tom Bell</a>,
<a href="http://technorati.com/tag/performance+wisdom" rel="tag">performance wisdom</a>, 
<a href="http://technorati.com/tag/software+performance" rel="tag">software performance</a>,
<a href="http://technorati.com/tag/scalability" rel="tag">scalability</a>,  
<a href="http://technorati.com/tag/performance+matters" rel="tag">performance matters</a> 
</p>]]></content></entry><entry><title>Five Scalability Principles</title><category>Articles and White Papers</category><category>Foundations of Performance</category><category>Optimization and Tuning</category><category>Performance Wisdom</category><id>http://www.webperformancematters.com/journal/2007/8/10/five-scalability-principles.html</id><link rel="alternate" type="text/html" href="http://www.webperformancematters.com/journal/2007/8/10/five-scalability-principles.html"/><author><name>Chris Loosley</name></author><published>2007-08-10T10:12:00Z</published><updated>2007-08-10T10:12:00Z</updated><content type="html" xml:lang="en-US"><![CDATA[<div class="PageWisdomWrapper">
<div class="WisdomTitle" >
<h3>Five Scalability Principles</h3>
<p class="WisdomClass" ><a href="http://www.webperformancematters.com/display/ShowJournal?moduleId=1113404&categoryId=98044">Performance Wisdom</a>: 10</p>
</div>

<p class="WisdomQuote">Don’t think synchronously, ...</p>

<div class="WisdomText">
<p>... don&#8217;t think vertically, don&#8217;t mix transactions with business intelligence, avoid mixing hot and cold data, and don&#8217;t forget the power of memory.</p>
 
<p class="QuoteSource">-- MySQL site, 2007</p>
</div>
</div>

<p><a href="http://www.mysql.com/why-mysql/scaleout/booking.html" class="offsite-link-inline">The 12 Days of Scale-Out</a> is a section of the <em>MySQL</em> site. It consists of a series of twelve articles, eleven of which are case studies describing large-scale MySQL implementations. But <em>Day Six</em> is a bit different -- it spells out five fundamental performance principles that apply to all application scaling efforts.</p>

<p>This subject is vitally important to MySQL, whose <em>server replication and high availability features ... allow high-traffic sites to horizontally 'Scale-Out' their applications, using multiple commodity machines to form one logical database -- as opposed to 'Scaling Up', starting over with more expensive and complex hardware and database technology</em>. </p>

<p>I know from first-hand experience that these claims are valid. At Keynote, my team used MySQL as the foundation for the <a href="http://www.keynote.com/products/web_performance/performance_measurement/performance_scoreboard.html" class="offsite-link-inline">Performance Scoreboard</a>. In this <a href="http://en.wikipedia.org/wiki/Data_mart" class="offsite-link-inline">data mart</a> application, MySQL supports the continuous insertion of new measurements at the rate of several million per day, plus hourly aggregation into summary tables, plus the queries needed to support continually updated dashboard displays for every customer, plus any ad hoc queries generated by customers doing diagnostic investigations.</p>

<h3>Learning the hard way</h3>

<p>According to the article, MySQL's <a href="http://www.mysql.com/news-and-events/press-release/release_2006_13.html" class="offsite-link-inline">database experts</a> ... <em>have seen many companies fall into a few common traps when they first design their systems, only to run into performance issues once the explosive growth hits</em>. So, adopting the <a href="http://www.webperformancematters.com/journal/2007/7/11/distributing-java-applications.html" class="PMref"><em>anti-pattern</em> approach</a> to providing guidance, the article presents the principles as the <a href="http://www.mysql.com/why-mysql/scaleout/scaleout_pitfalls.html" class="offsite-link-inline">Top Five Scale-Out
Pitfalls to Avoid</a>. These are:</p>

<ul class="grouptight">
<li>Don&#8217;t think synchronously</li>
<li>Don&#8217;t think vertically</li>
<li>Don&#8217;t mix transactions with business intelligence</li>
<li>Avoid mixing hot and cold data</li>
<li>Don&#8217;t forget the power of memory</li>
</ul>

<p>It is common for people to get smarter about performance when they have to find and fix problems. I learned about database performance first-hand between 1970 and 1995, while designing and tuning IBM database systems -- first IMS, and then DB2. In the process I discovered -- often the hard way -- that all large computer systems are subject to the same principles. And over the years I ran across other authors and teachers who (not surprisingly) had discovered the same things. Some had even invented memorable and insightful ways of describing their insights as "laws" or "rules".</p>

<p>When I wrote <a class="offsite-link-inline" href="http://www.amazon.com/exec/obidos/tg/detail/-/0471162698/002-1562714-5063248?v=glance" >High-Performance Client/Server</a>, I tried to capture these clever sayings as numbered guidelines. So in this post, I'm going to reprint each of MySQL's five scale-out pitfalls, followed by the corresponding guidelines and some related excerpts from the manuscript of that book (marked as "HPCL"). </p>

<h3>1. Don&#8217;t think synchronously</h3>

<blockquote>
<p>Thinking synchronous is the single biggest mistake in architecting a Scale-Out design. Generally, when load is added to an already-loaded system, some part of the system will become a bottleneck -- and response times will increase. In scale-out, with a large system consisting of multiple machines, thinking synchronously will add a lot of wait time and hurt performance. Any truly large scale-out design will have to introduce asynchronous communication, parallelization, and strategies to deal with approximate or slightly outdated data.</p>
<p class="QuoteSource">--<a href="http://www.mysql.com/why-mysql/scaleout/scaleout_pitfalls.html" class="offsite-link-inline">Top Five Scale-Out Pitfalls to Avoid</a>, The 12 Days of Scale-Out, Day 6</p>
</blockquote>
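
<p>The wait-time argument in that quotation can be illustrated with a small sketch. It assumes three hypothetical machines ("shards") that each take 50 milliseconds to answer, and it simulates the remote calls with <code>asyncio.sleep</code> rather than real network traffic. The shard names, latencies, and function names are all invented for the example; none of this is MySQL's code.</p>

```python
import asyncio
import time

# Hypothetical per-shard latencies in seconds; the values are illustrative only.
SHARD_LATENCIES = {"shard_a": 0.05, "shard_b": 0.05, "shard_c": 0.05}


async def query_shard(name, latency):
    """Stand-in for a remote call: wait for the latency, then return a dummy value."""
    await asyncio.sleep(latency)
    return name, len(name)


async def query_one_at_a_time():
    """Synchronous style: each call waits for the previous one to finish."""
    results = []
    for name, latency in SHARD_LATENCIES.items():
        results.append(await query_shard(name, latency))
    return dict(results)


async def query_concurrently():
    """Asynchronous style: all calls are in flight at the same time."""
    pairs = await asyncio.gather(
        *(query_shard(name, latency) for name, latency in SHARD_LATENCIES.items())
    )
    return dict(pairs)


def timed(coro_fn):
    """Run a coroutine function to completion and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro_fn())
    return result, time.perf_counter() - start


if __name__ == "__main__":
    _, sequential = timed(query_one_at_a_time)
    _, concurrent = timed(query_concurrently)
    print(f"one at a time: {sequential:.2f}s, concurrent: {concurrent:.2f}s")
```

<p>Under these assumptions, the one-at-a-time version pays roughly the sum of the three latencies, while the concurrent version pays roughly the longest single latency. Both return the same data; only the waiting differs.</p>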

<h4>Abandoning the Single Synchronous Transaction Paradigm</h4>

<p><strong>HPCL</strong>: The ... concept of a heterogeneous distributed database with synchronized updates is a vision of utopia that swims against the tide of computing technology. The tight controls over application processing that are possible on a mainframe are incompatible with many aspects of the move to widespread distributed processing... Decoupled processes and multi-transaction workflows are the optimal starting point for the design of high-performance enterprise client/server systems:</p>

<ul>
<li><strong>Decoupled processes.</strong> Decoupling occurs when we can separate the different parts of a distributed system so that no one process ever needs to stop processing to wait for the other(s). The driving force 