Collected thoughts about software and site performance ...
Web performance matters. Responsive sites can make the online experience effective, even enjoyable; a slow site can be unusable. This site is about online performance: how to achieve and maintain it, and its impact on user experience and, ultimately, on site effectiveness.
Entries about Foundations of Performance (19), in reverse date order:
This is the fourth in a series of posts presenting arguments for asynchronous architectures as the optimal way to build high-performance, scalable systems for a distributed environment.
In a QCon conference presentation on Availability and Consistency or how the CAP theorem ruins it all, Werner Vogels, Amazon CTO, examines the tension between availability & consistency in large-scale distributed systems, and presents a model for reasoning about the trade-offs between different solutions. I recommend you find time to watch the entire 52-minute video.
Dan Pritchett's Design Rule
Always assume high latency, not low latency
This post is the third in a series presenting arguments for asynchronous architectures as the optimal way to build high-performance, scalable systems for a distributed environment.
The first reviewed the case for asynchronous communication among interdependent components or services, and Bell's Law of Waiting. The second highlighted The Fallacies of Distributed Computing, and discussed the importance of reflecting the business process in distributed systems design.
This post reviews The Challenges of Latency, an article about how asynchronous architectures can improve the quality of Web applications, published on the InfoQ site by eBay architect Dan Pritchett in May 2007. Dan's article is especially relevant today, given the high level of interest in adopting Web services and SOA approaches.
The Fallacies of Distributed Computing
1. The network is reliable
2. Latency is zero
3. Bandwidth is infinite
This post is the second in a series presenting arguments for asynchronous architectures as the optimal way to build high-performance, scalable systems for a distributed environment. The first post reviewed the general case for asynchronous communication among interdependent components or services, and highlighted Bell's Law of Waiting.
The Fallacies of Distributed Computing highlight crucial differences between centralized and distributed computing. Network components introduce potential problems that a centralized solution does not have to consider.
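The first fallacy alone changes how client code must be written: a call across the network can fail or stall, so a robust client bounds each attempt and retries transient errors. Here is a minimal sketch of that pattern in Python; the retry counts and backoff values are illustrative assumptions, not taken from any of the posts discussed here.

```python
import time

def call_with_retries(operation, retries=3, backoff=0.1,
                      transient=(OSError, TimeoutError)):
    """Run a remote call, assuming the network is NOT reliable:
    retry transient failures a bounded number of times, waiting
    with exponential backoff between attempts."""
    last_error = None
    for attempt in range(retries):
        try:
            return operation()
        except transient as exc:
            last_error = exc
            time.sleep(backoff * (2 ** attempt))
    raise ConnectionError(
        f"remote call failed after {retries} attempts") from last_error
```

In practice `operation` would be something like `lambda: urllib.request.urlopen(url, timeout=2).read()` — the key point being that the timeout and the retry budget are explicit decisions, not afterthoughts.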
In this post I discuss how the design of distributed systems should draw on that of manual business systems. Of course, distributed computing can shorten the timescales of some business operations enormously. But drawing analogies with the way manual systems work will help us to design efficient and scalable distributed systems.
Bell's Law of Waiting
All computers wait at the same speed
In Five Scalability Principles, I reviewed an article published by MySQL about the five performance principles that apply to all application scaling efforts. When discussing the first principle -- Don't think synchronously -- I stated that Decoupled processes and multi-transaction workflows are the optimal starting point for the design of high-performance (distributed) systems.
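To make the decoupling principle concrete, here is a minimal sketch (my own illustration, not from the book or the MySQL article) of a multi-transaction workflow in Python: the front-end transaction only records the request on a queue and returns immediately, while a separate worker completes the slow part of the workflow asynchronously.

```python
import queue
import threading

work_queue = queue.Queue()
results = []

def submit(order):
    """Front-end transaction: accept the request, enqueue it,
    and return to the caller at once -- no synchronous waiting."""
    work_queue.put(order)
    return "accepted"

def worker():
    """Decoupled back-end process: drains the queue and completes
    each workflow on its own schedule."""
    while True:
        order = work_queue.get()
        if order is None:  # shutdown signal
            break
        results.append(f"processed {order}")
        work_queue.task_done()

t = threading.Thread(target=worker)
t.start()
for o in ("A-1", "A-2"):
    submit(o)
work_queue.put(None)
t.join()
```

The caller's response time now depends only on the cost of the enqueue, not on the cost of the whole workflow — which is exactly why decoupled designs scale.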
That's a quote from High-Performance Client/Server, from the section Abandoning the Single Synchronous Transaction Paradigm in Chapter 15, Architecture for High Performance. My 1998 book is out of print now, and contains some outdated examples and references. But most of the discussions of performance principles are timeless, and you can pick up a used copy for about $3.00 at Amazon.
So I am planning some more posts built around excerpts from the manuscript. I'll be updating and generalizing the terminology as necessary for today's environments, and adding some guidelines in my Performance Wisdom series.
Five Scalability Principles
Don’t think synchronously, ...
The 12 Days of Scale-Out is a section of the MySQL site. It consists of a series of twelve articles, eleven of which are case studies describing large-scale MySQL implementations. But Day Six is a bit different -- it spells out five fundamental performance principles that apply to all application scaling efforts.
This subject is vitally important to MySQL, whose server replication and high availability features ... allow high-traffic sites to horizontally 'Scale-Out' their applications, using multiple commodity machines to form one logical database -- as opposed to 'Scaling Up', starting over with more expensive and complex hardware and database technology.
I know from first-hand experience that these claims are valid. At Keynote, my team used MySQL as the foundation for the Performance Scoreboard. In this data mart application, MySQL supports the continuous insertion of new measurements at the rate of several million per day, plus hourly aggregation into summary tables, plus the queries needed to support continually updated dashboard displays for every customer, plus any ad hoc queries generated by customers doing diagnostic investigations.
Latency, Bandwidth, and Station Wagons focused primarily on the limitations of network bandwidth, and the time required to transmit massive data volumes. While that is an interesting topic, and one that produces some surprising results (like the fact that FedEx is still faster than the Internet), it is not particularly relevant to the subject of Web performance, which depends on the time required to transmit many small files.
My post highlighted It's Still The Latency, Stupid, by William (Bill) Dougherty in edgeblog. Bill's title pays homage to a famous 1996 article by Stuart Cheshire about bandwidth and latency in ISP links, It's the Latency, Stupid.
Over a decade later, Bill points out, Cheshire's writings are still relevant: One concept that continues to elude many IT managers is the impact of latency on network design ... Latency, not bandwidth, is often the key to network speed, or lack thereof. This is especially true when it comes to the download speeds (or response times) of Web pages and Web-based applications. In this post I explain why, with references and examples to support my argument.
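A back-of-envelope calculation shows why round trips, not bandwidth, usually dominate the download time of a page built from many small files. The numbers below are illustrative assumptions (30 objects of 10 KB, an 80 ms round trip, a 10 Mbps link, and two round trips of protocol overhead per object fetched serially), not figures from Bill's post:

```python
def page_load_estimate(objects=30, obj_kb=10, rtt_ms=80,
                       bandwidth_mbps=10, rtts_per_object=2):
    """Crude model of serial fetches: each object costs a few
    round trips (connection setup, request/response) plus its
    transfer time on the wire. Returns (latency_ms, transfer_ms)."""
    latency_ms = objects * rtts_per_object * rtt_ms
    # 1 Mbps moves 1 kilobit per millisecond
    transfer_ms = objects * obj_kb * 8 / bandwidth_mbps
    return latency_ms, transfer_ms

lat, xfer = page_load_estimate()
# latency cost: 4800 ms; transfer cost: only 240 ms
```

Under these assumptions the round-trip cost is twenty times the transmission cost — so doubling the bandwidth barely helps, while halving the round trips (or the RTT) nearly halves the page load time.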
Do you subscribe to email newsletters? If you're like me, you get lots of them. New ones appear in my inbox every morning. They pile up, demanding to be read. In fact, they seem to breed like rabbits, producing new offspring -- when did I express an interest in Enterprise VOIP Security Architecture issues? Sometimes in a housekeeping splurge I delete a few dozen at once, suffering a momentary twinge of anxiety at having perhaps missed something important. So usually I skim them before hitting the delete button.
TechTarget's Search Software Quality service seems to be especially prolific, but is also a regular source of interesting references -- like TheServerSide.com, the subject of a recent note. According to the site's home page:
Java Performance Management for Large-Scale Systems
There are many classes of enterprise applications that have stringent performance and scalability requirements. TheServerSide.com has assembled a collection of resources to help you better design, develop, test and manage high performance, large-scale systems - learn new and innovative approaches for performance tuning, memory management, concurrent programming, JVM clustering and more.
One concept that continues to elude many IT managers is the impact of latency on network design. 11 years ago, Stuart Cheshire wrote a detailed analysis of the difference between bandwidth and latency in ISP links [It's the Latency Stupid]. Over a decade later, his writings are still relevant. Latency, not bandwidth, is often the key to network speed, or lack thereof.
That's from It's Still The Latency, Stupid by William (Bill) Dougherty, writing in edgeblog on May 31, 2007. Bill follows that opening paragraph with a very readable explanation of the vital importance of latency (round-trip time) as a factor affecting performance in TCP networking. He uses what he calls the Sandbag Problem to illustrate his points:
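As I read it, the mechanism behind the sandbag analogy is TCP's windowing: a sender may have at most one window of unacknowledged data in flight, so throughput is capped at roughly window size divided by round-trip time, no matter how fat the pipe is. A quick illustration, assuming the classic 64 KB window without window scaling:

```python
def tcp_max_throughput_mbps(window_bytes=65535, rtt_ms=80):
    """Upper bound on TCP throughput: at most one full window
    of data can be delivered per round trip."""
    bits_per_second = window_bytes * 8 / (rtt_ms / 1000.0)
    return bits_per_second / 1_000_000

# A 64 KB window over an 80 ms cross-country path caps out
# near 6.6 Mbps, even on a gigabit link.
```

Double the round-trip time and the ceiling halves — which is why latency, not bandwidth, so often determines what users experience.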
Jonathan Kohl is a consultant, author, and speaker who specializes in software testing. His blog, Collaborative Software Testing, includes many discussions of frameworks, heuristics, and mnemonics that serve as guides for different aspects of testing.
In particular, a November 2006 post on Modeling Web Applications presented a 9-part framework for testing Web applications and the associated mnemonic, FP DICTUMM.
The Law of Measurements
The result of any measurement will depend upon what is measured, how the measurement is done, and how the results are computed
In the last of these, I concluded that Gilb's observation (Anything you need to quantify can be measured in some way that is superior to not measuring it at all) gets across the value of measurements without making any claims that are too far-reaching or contentious.
A follow-up comment and the ensuing conversation with Ben Simo -- author of Quality Frog, a blog about software testing and software quality -- reminded me of this post, which I'd been meaning to complete and publish for a while. I'll explain the reasons for the delay below.