Tuesday, October 20, 2009

Armstrong Thesis (Chapter 2)

Over the next couple of weeks we will be reading Joe Armstrong's thesis, Making reliable distributed systems in the presence of software errors, presented to the Royal Institute of Technology in Stockholm, Sweden in December 2003. I expect the topics and issues it raises to be intriguing as we work through each chapter.

This chapter describes an architecture for building the fault-tolerant systems that telecommunications switching networks require. If you think about it, you expect even an aging rotary dial phone to give you a dial tone whenever you pick up the receiver and to complete a call without interruption.

Telecommunications systems have some of the most stringent technical requirements. Because they serve a vast number of customers, switching systems are inherently concurrent and are distributed across multiple locations, some close together and some far apart. They are also designed for many years of continuous operation, so software and hardware updates must be performed without stopping the system.

The thesis advocates fault isolation and forbids the use of shared data, which eliminates the need for locks or mutexes. A software error in one concurrent process should not affect the other processes in the system. The isolation is taken further by allowing processes to communicate only by message passing. I find this concept intriguing because I have worked on systems that implement this type of isolation and have seen how well it works when one process has an error. Unfortunately, if certain processes error out or crash, the rest of the system may stop receiving the message traffic it needs to continue processing.
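To make the idea concrete, here is a minimal sketch of the message-passing discipline in Python. It is not Armstrong's code and it is not Erlang: Python threads share one address space, so this only imitates the isolation that the thesis requires. The names worker, run_demo, inbox and supervisor are made up for the illustration. Each worker keeps its state private, receives work only through its own queue, and turns a failure into a message rather than letting it disturb anyone else.

```python
import queue
import threading


def worker(name, inbox, supervisor):
    """Hypothetical worker: private state, communication only via queues."""
    handled = 0  # local to this worker, never shared
    try:
        while True:
            msg = inbox.get()
            if msg == "stop":
                break
            if msg == "bad":
                raise ValueError("cannot handle message")
            handled += 1
        supervisor.put((name, "exit", handled))
    except Exception as exc:
        # The failure is reported as a message; other workers are untouched.
        supervisor.put((name, "crash", repr(exc)))


def run_demo():
    supervisor = queue.Queue()
    inboxes = {"a": queue.Queue(), "b": queue.Queue()}
    threads = [
        threading.Thread(target=worker, args=(name, box, supervisor))
        for name, box in inboxes.items()
    ]
    for t in threads:
        t.start()

    inboxes["a"].put("bad")  # worker "a" fails on its first message
    for msg in ("work", "work", "stop"):
        inboxes["b"].put(msg)  # worker "b" carries on regardless

    reports = {}
    for _ in threads:
        # A timeout guards against waiting forever on a silent worker.
        name, status, detail = supervisor.get(timeout=1.0)
        reports[name] = (status, detail)

    for t in threads:
        t.join()
    return reports


if __name__ == "__main__":
    print(run_demo())
```

Worker "a" crashes while worker "b" still finishes its two pieces of work. The timeout on the supervisor's receive is one simple way to handle the drawback mentioned above, where a crashed process leaves the rest of the system waiting for messages that never arrive.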

Armstrong suggests the Erlang programming language as a way to satisfy the requirements set forth in the thesis. Erlang was designed at Ericsson to support distributed, fault-tolerant, soft real-time, non-stop applications. The first version was developed by Joe Armstrong himself, 17 years before the thesis.

I look forward to reading the remaining chapters to see what insight Armstrong can provide about telecommunications systems and the implementation of the architectural model.
