Wednesday, November 14, 2007

Are synchronous systems doomed?

Asynchronous communication is a basic principle of many distributed systems. Have you ever wondered why? Why do developers go to so much trouble to handle things asynchronously when synchronous communication is so much simpler?

If you look around, you will find many examples of asynchronism. Most organizations exploit this principle, engineers exploit it, and even nature has built neural systems to be asynchronous.

So why? The point is the implicit parallelism and reliability that such systems introduce.

Let's imagine a chain of components that call one another and are connected by relatively long "links". By "relatively long" I mean that the information propagation time is comparable to the processing time, or longer. In a synchronous system the first component in the chain waits at least until the next component receives the message. In the worst case it waits for the result of the message processing. So the transmission delays add up and increase the idle time of the first component.
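To put rough numbers on that, here is a minimal sketch in Python. The function name `sync_idle_time_ms` and the delay figures are made up for illustration, and the model assumes the worst case described above: every call in the chain blocks until its result comes back.

```python
def sync_idle_time_ms(link_delays_ms, processing_times_ms):
    """Idle time of the first component in a fully blocking chain.

    Hypothetical model: hop i costs one round trip over link i plus the
    processing time of the component at the far end of that link.
    """
    if len(link_delays_ms) != len(processing_times_ms):
        raise ValueError("one processing time is needed per link")
    return sum(2 * d + p for d, p in zip(link_delays_ms, processing_times_ms))


# Three hops, 50 ms one way, 10 ms of work per component (assumed figures).
idle = sync_idle_time_ms([50, 50, 50], [10, 10, 10])
print(idle)  # 330: only 30 ms is real work, the rest is waiting on links
```

With these assumed figures the first component sits idle roughly ten times longer than the whole chain actually works.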

Link failure in that case is a disaster. The only general way to detect it is a timeout, which increases the request handling time (read: the idle time of the first component). Moreover, there is no way to learn whether the request was processed or not.
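The ambiguity can be shown with a toy simulation. This is only a sketch: `sync_call` and its two flags are hypothetical, and a real timeout is replaced by a returned string so the example stays deterministic.

```python
def sync_call(drop_request, drop_reply):
    """Simulate one blocking request over an unreliable link.

    Returns (what the caller observes, what the server actually did).
    Both flags are hypothetical knobs for this sketch.
    """
    server_log = []
    if drop_request:
        return "timeout", server_log      # request never arrived
    server_log.append("processed")        # server did the work
    if drop_reply:
        return "timeout", server_log      # work done, answer lost
    return "ok", server_log


lost_request = sync_call(drop_request=True, drop_reply=False)
lost_reply = sync_call(drop_request=False, drop_reply=True)
print(lost_request)  # ('timeout', [])
print(lost_reply)    # ('timeout', ['processed'])
```

The caller sees the same "timeout" in both cases, although the server state differs, so it cannot safely decide whether to retry.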

Obviously, synchronous requests are hard to persist, because recovering distributed state (which in a synchronous system is usually represented by the execution stack or by the system components) is a formidable technical and administrative challenge.

Asynchronous systems are something different. Components work in 'fire-and-forget' mode, so no time (or other resources) is spent waiting. Components do not idle and may issue additional requests, process data, and so on. So the system is much more parallel.
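A minimal fire-and-forget sketch using only the Python standard library might look like this. The names `run_fire_and_forget` and `link_delay` are invented for the example, an in-process `queue.Queue` stands in for the link, and upper-casing a string stands in for real processing.

```python
import queue
import threading
import time


def run_fire_and_forget(messages, link_delay=0.01):
    """Send messages without waiting for results; return what was processed
    and how long the sender was busy (hypothetical helper for this sketch)."""
    inbox = queue.Queue()
    processed = []

    def receiver():
        while True:
            msg = inbox.get()
            if msg is None:              # sentinel: no more messages
                break
            time.sleep(link_delay)       # simulated propagation and processing
            processed.append(msg.upper())

    worker = threading.Thread(target=receiver)
    worker.start()

    start = time.perf_counter()
    for msg in messages:
        inbox.put(msg)                   # returns at once, no reply awaited
    sender_busy = time.perf_counter() - start

    inbox.put(None)
    worker.join()                        # only so the demo can show results
    return processed, sender_busy


processed, sender_busy = run_fire_and_forget(["a", "b", "c"])
print(processed)
print(f"sender was busy for {sender_busy:.6f} s")
```

The sender finishes its loop without waiting for any of the simulated delays; the final `join` is there only so the demo can print the outcome.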

Moreover, it is easy to design the system so that all state is encapsulated inside the messages. By persisting them, the system can achieve significant resilience to link and component failures.
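As a rough illustration, assume each message is a small JSON document that carries everything needed to process it, and that messages are appended to a journal file before being handled. The journal format, the field names, and the functions `append_message`, `load_messages`, and `handle` are all assumptions of this sketch, not a particular library's API.

```python
import json
import os
import tempfile


def append_message(journal_path, message):
    """Persist a message before anyone tries to process it."""
    with open(journal_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(message) + "\n")
        f.flush()
        os.fsync(f.fileno())             # make sure it reached the disk


def load_messages(journal_path):
    """Read back every persisted message, e.g. after a restart."""
    with open(journal_path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def handle(message):
    """Pure function of the message: no state lives outside of it."""
    return dict(message,
                total=message["total"] + message["fee"],
                step=message["step"] + 1)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        journal = os.path.join(tmp, "journal.jsonl")
        append_message(journal, {"order": 1, "step": 0, "total": 100, "fee": 5})
        append_message(journal, {"order": 2, "step": 0, "total": 40, "fee": 2})
        # Pretend the component crashed here: nothing is kept in memory,
        # yet the work can be resumed from the journal alone.
        return [handle(m) for m in load_messages(journal)]


print(demo())
```

Because `handle` depends only on the message, a restarted component (or a different one) can pick up the journal and continue.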

Now take a look at modern hardware design. In a common computer (and in supercomputers too, though there may be exceptions) CPUs are faster than communication channels. This has little in common with multicomponent architecture. But the trend now is to increase the number of CPUs linked by _communication channels_, both at the single-computer level and at the distributed-cluster level.

eBay builds its systems in an asynchronous way. Most payment transactions are processed asynchronously.

Don't you think it is time to learn message passing libraries and designs?