
Asynchronous Architectures [1]

Bell's Law of Waiting

Performance Wisdom: 11

All computers wait at the same speed

-- Dr. Thomas E. Bell, Performance of Distributed Systems, Presentation to ICCM Capacity Management Forum 7, October 1993, San Francisco

In Five Scalability Principles, I reviewed an article published by MySQL about the five performance principles that apply to all application scaling efforts. When discussing the first principle -- Don't think synchronously -- I stated that "decoupled processes and multi-transaction workflows are the optimal starting point for the design of high-performance (distributed) systems."

That's a quote from High-Performance Client/Server, from a section on Abandoning the Single Synchronous Transaction Paradigm, in Chapter 15, Architecture for High Performance. My 1998 book is out of print now, and contains some outdated examples and references. But most of the discussions of performance principles are timeless, and you can pick up a used copy for about $3.00 at Amazon.

So I am planning some more posts built around excerpts from the manuscript. I'll be updating and generalizing the terminology as necessary for today's environments, and adding some guidelines in my Performance Wisdom series.

Asynchronous architectures are more scalable

The first posts will elaborate on the arguments for asynchronous architectures as the optimal way to build high-performance, scalable systems for a distributed environment. I begin by reviewing the general case for asynchronous communication among interdependent components or services.

In the typical distributed enterprise, there will inevitably be fluctuations in the distribution of work to be done, as business volumes rise and fall, and fluctuations in the availability of network and processing resources.

Even if we have designed our systems to accommodate peak processing volumes, it is normal for some servers, some application components, or some part of a large network to be out of action some of the time. Therefore, if we design applications that require all resources to be available before any useful work can be completed, we reduce the availability of the whole system to the level of its most error-prone component.
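To see why, consider the arithmetic of serial availability: a request that must traverse several components in sequence succeeds only if every one of them is up, so end-to-end availability is the product of the individual availabilities. A minimal sketch (the class and method names here are illustrative, not from the post):

```java
public class AvailabilityDemo {
    // Availability of a chain of n components, each independently
    // available with probability a: the request succeeds only if all are up.
    public static double serialAvailability(double a, int n) {
        return Math.pow(a, n);
    }

    public static void main(String[] args) {
        // Five components at 99% availability each yield only ~95% overall
        System.out.printf("Serial availability: %.3f%n", serialAvailability(0.99, 5));
    }
}
```

Five "three nines minus" components drag the whole synchronous chain down to roughly 95%, which is the quantitative case for designs that can make progress even when one component is unavailable.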

For optimal performance, we should design applications to accommodate unexpected peaks in the workload, server outages, and resource unavailability. This means application and system design must:

  • Emphasize concurrent operation in preference to workload serialization
  • Prefer asynchronous connections to synchronous ones between clients and servers
  • Place requests for service in queues and continue processing, rather than waiting for a response
  • Create opportunities for parallel processing of workload components
  • Distribute work to overflow servers to accommodate peak volumes
  • Provide redundant servers to take over critical workload components during peaks and outages
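The third and fourth guidelines above -- queue requests and keep going, and process workload components in parallel -- can be sketched with plain `java.util.concurrent` primitives. This is a minimal illustration under my own assumptions (the `QueueSketch` class and `processAll` method are invented for the example, not part of any product discussed here):

```java
import java.util.concurrent.*;

public class QueueSketch {
    // Enqueue n requests and let a small worker pool drain them concurrently;
    // returns the number of requests processed.
    public static int processAll(int n) throws InterruptedException {
        BlockingQueue<String> requests = new LinkedBlockingQueue<>();
        ExecutorService workers = Executors.newFixedThreadPool(2);
        CountDownLatch done = new CountDownLatch(n);

        // Workers block on the queue; the rest of the system does not
        for (int i = 0; i < 2; i++) {
            workers.submit(() -> {
                try {
                    while (true) {
                        requests.take();   // waiting is confined to the worker
                        done.countDown();  // ... real work would happen here ...
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        // The "client" enqueues each request and immediately moves on
        for (int i = 1; i <= n; i++) {
            requests.put("request-" + i);
        }

        done.await();          // demo only: wait so we can report completion
        workers.shutdownNow();
        return n;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("processed: " + processAll(4));
    }
}
```

The design point is that only the worker threads ever block; the client hands off work and continues, which is exactly the behavior the next section argues for.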

Design applications that don’t wait

Each of these topics is too large to review in any detail in this post, but their central theme can be summed up as: Design applications that don’t wait. Note an important distinction between the behavior of individual transactions or units of work, and the behavior of the system as a whole. Individual transactions may indeed have to wait until they can obtain the processing resources they need. But the application as a whole should continue processing, with a minimal allocation of resources to any transactions flowing through the system.

That way, the scarce computing resources of one server do not sit idle waiting for delayed transactions to receive responses from other services or components. For example, here's some advice from BEA, taken from Best Practices for Application Design when programming the WebLogic Java Message Service (JMS):

Asynchronous vs. Synchronous Consumers

In general, asynchronous (onMessage) consumers perform and scale better than synchronous consumers:

  • Asynchronous consumers create less network traffic. Messages are pushed unidirectionally, and are pipelined to the message listener. Pipelining supports the aggregation of multiple messages into a single network call.
  • Asynchronous consumers use fewer threads. An asynchronous consumer does not use a thread while it is inactive. A synchronous consumer consumes a thread for the duration of its receive call. As a result, a thread can remain idle for long periods, especially if the call specifies a blocking timeout.
  • For application code that runs on a server, it is almost always best to use asynchronous consumers, which prevents the application code from doing a blocking operation on the server. A blocking operation, in turn, idles a server-side thread; it can even cause deadlocks. Deadlocks occur when blocking operations consume all threads. When no threads remain to handle the operations required to unblock the blocking operation itself, that operation never stops blocking.
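The push-style onMessage pattern BEA describes can be sketched without a JMS provider, using `java.util.concurrent` in its place. Everything here is an illustrative stand-in (`MessageSink`, `deliver`, and `consume` are invented names, not WebLogic APIs): no consumer thread sits blocked in a receive call; the callback runs only when a message actually arrives.

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

public class AsyncConsumerSketch {
    interface MessageListener { void onMessage(String msg); }

    // Stand-in for a messaging provider that pushes messages to a listener
    static class MessageSink {
        private final ExecutorService dispatcher = Executors.newSingleThreadExecutor();
        private final MessageListener listener;
        MessageSink(MessageListener listener) { this.listener = listener; }
        // Push delivery: the consumer never polls and never blocks in receive()
        void deliver(String msg) { dispatcher.submit(() -> listener.onMessage(msg)); }
        void close() { dispatcher.shutdown(); }
    }

    // Deliver n messages to an asynchronous listener; returns how many it handled
    public static int consume(int n) throws InterruptedException {
        AtomicInteger handled = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(n);
        MessageSink sink = new MessageSink(msg -> {
            handled.incrementAndGet();  // ... real message handling here ...
            done.countDown();
        });
        for (int i = 1; i <= n; i++) sink.deliver("message-" + i);
        done.await();
        sink.close();
        return handled.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("handled: " + consume(3));
    }
}
```

Between deliveries, no application thread is tied up waiting, which is the thread-economy argument from the second bullet above.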

Bell's Law

In conclusion, when designing distributed systems, we should always recall Tom Bell's humorous observation: "All computers wait at the same speed." A computing resource that is waiting to be used, especially a processor, is a wasted resource.

This post contains material first published in High-Performance Client/Server, Chapter 11: The Sharing Principle, p360.
