Tuesday, March 12, 2013

Erlang and "Let it Crash" Programming

Although Erlang is designed to encourage a massively parallel programming style, its approach to error handling may be even more noteworthy.

  1. Error handling in Erlang is very different from error handling in conventional programming languages. The key observation is that the error-handling mechanisms were designed for building fault-tolerant systems, and not merely for protecting against program exceptions. You cannot build a fault-tolerant system if you only have one computer. The minimal configuration for a fault-tolerant system has two computers. These must be configured so that each observes the other. If one of the computers crashes, then the other computer must take over whatever the first computer was doing. This means that the model for error handling is based on the idea of two computers that observe each other.
  2. Links in Erlang are provided to control the paths along which errors propagate between processes. An Erlang process will die if it evaluates illegal code, so, for example, if a process tries to divide by zero it will die. The basic model of error handling is to assume that some other process in the system will observe the death of the process and take appropriate corrective actions. But which process in the system should do this? If there are several thousand processes in the system, then how do we know which process to inform when an error occurs? The answer is the linked process. If some process A evaluates the primitive link(B), then it becomes linked to B. If A dies then B is informed. If B dies then A is informed.
    Using links, we can create sets of processes that are linked together. If these are normal processes, they will die immediately if they are linked to a process that dies with an error. The idea here is to create sets of processes such that if any process in the set dies, then they will all die. This mechanism provides the invariant that either all the processes in the set are alive or none of them are. This is very useful for programming error-recovery strategies in complex situations. As far as I know, no other programming language has anything remotely like this.
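
To make the two points about links concrete, here is a minimal sketch in Erlang. The module name link_demo and the functions observe_crash/0, linked_set/0, divide/2 and wait/0 are made up for illustration and are not from any library. The sketch assumes that a process which wants to be informed of a death, rather than die with it, sets the trap_exit flag, so that the exit signal arrives as an ordinary {'EXIT', Pid, Reason} message. It uses a monitor in linked_set/0 only to wait until each process is really gone before checking.

```erlang
-module(link_demo).
-export([observe_crash/0, linked_set/0]).

%% The caller plays the role of process A and observes the death of B.
%% With trap_exit set, the exit signal from a linked process is delivered
%% as a message, so A is informed instead of dying together with B.
observe_crash() ->
    Old = process_flag(trap_exit, true),
    %% spawn_link/1 creates B and links it to the caller in one step.
    B = spawn_link(fun() -> divide(1, 0) end),
    Result =
        receive
            {'EXIT', B, Reason} -> {b_died, Reason}
        after 1000 ->
            timeout
        end,
    %% Restore the caller's previous setting.
    process_flag(trap_exit, Old),
    Result.

%% Dividing by zero is "illegal code": the process evaluating it dies.
divide(X, Y) -> X / Y.

%% Builds a set of three normal (non-trapping) processes: P1 is linked to
%% P2 and to P3. One member is then told to crash, and the function
%% reports whether each member is still alive.
linked_set() ->
    Caller = self(),
    spawn(fun() ->
        P2 = spawn_link(fun wait/0),
        P3 = spawn_link(fun wait/0),
        Caller ! {pids, [self(), P2, P3]},
        wait()
    end),
    Pids = receive {pids, L} -> L after 1000 -> exit(no_pids) end,
    %% Monitors let the caller wait for each death without being linked.
    Refs = [erlang:monitor(process, P) || P <- Pids],
    lists:last(Pids) ! crash,
    [receive {'DOWN', R, process, _, _} -> ok after 1000 -> timeout end
     || R <- Refs],
    [is_process_alive(P) || P <- Pids].

%% Idle until told to crash, then die with an abnormal exit reason.
wait() ->
    receive
        crash -> exit(boom)
    end.
```

After compiling the module, link_demo:observe_crash() should return a tuple of the form {b_died, {badarith, Stacktrace}}, and link_demo:linked_set() should return [false,false,false]: crashing P3 takes P1 down through their link, and P1 in turn takes P2 down, which is the "all alive or none alive" invariant described above.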

