Robin Milner’s book Communication and Concurrency involves a take on state machines that is fundamentally incorrect.
“Now in standard automata theory, an automaton is interpreted as a language i.e. as a set of strings over the alphabet.“
That’s not at all correct, but let’s accept the claim for now and follow the argument. Consider two state machines A and B with an alphabet of events {a,b,c,d} and A has states {A,A1,A2, A3} and B has states {B,B’,B1,B2,B3}. The state machine transitions can be given by ordered triplets (state1,input, state2) that show the input label on a directed edge between state1 and state2. For Milner’s example:
state machine A has transitions { (A,a,A1), (A1,b,A2), (A1,c,A3), (A3,d,A) },
state machine B has transitions: { (B,a,B1) (B,a,B’), (B1,b, B2), (B’,c,B3), (B3,d,B)}.
B is non-deterministic because there are 2 “a” transitions from state B. Milner points out that if we consider A2 and B2 to be accept states, both machines accept the same language (acd)*ab. So far so good. At this point Milner asks us to think of {a,b,c,d} as “ports” or maybe buttons that can be pushed. The button “is unlocked if the agent can perform the associated action and can be depressed to make it do the action, otherwise the button is locked and cannot be depressed.” Then: “after the a-button is depressed a difference emerges between A and B. For A – which is deterministic – b and c will be unlocked, while for B – which is non-deterministic – sometimes only b will be unlocked and sometimes only c will be unlocked.” If you don’t look carefully, you’ll miss a subtle change of conditions that has significant ramifications.
An experimenter or external agent trying to push these buttons will discover a difference between the two machines eventually because sometimes after an initial “a” input on the second state machine a “b” is possible and sometimes not, although on the first state machine after an “a” the “b” is always possible. But how does the external agent determine that B will not perform a “b” action sometimes? The external agent “attempt[s] to depress b” and fails – the locked/unlocked state of each button is visible to the external agent. So Milner has changed definitions in the middle of the argument. At the start, finite state machines were language recognizers with, as the classical text on automata theory explains: “output limited to a binary signal: ‘accept/don’t accept’ ” [Hopcroft and Ullman]. Those automata will not tell us anything else about a word other than that binary condition: is it in the language or not. But Milner’s button state machines tell us also what buttons are locked and what are unlocked in the terminal state reached by the word. So Milner’s state machines distinguish words that a recognizer state machine does not. In fact, these Milner state machines have 5 binary outputs in each state – indicating the locked/unlocked status of each button plus accept/don’t accept. State machines with more than a binary output alphabet are called Moore or Mealy machines in poor old standard automata theory.
Standard automata theory does not “interpret” state machines “as a language” but there is a theorem that the class of languages recognized by those finite state binary output deterministic state machines is the same as the class of languages recognized by finite state non-deterministic state machines. Two machines that recognize the same language may be distinct in many other ways (there is a whole theory of state machine minimization important in digital circuit design). And state machines that have additional outputs (sometimes called “transducers”) are essentially descriptions of maps from input strings to output strings or from input strings to output value in the terminal state \(M:E^*\to X^*\). Standard automata theory would say Milner’s two machines accept the same language of strings, but produce different languages of strings.
Standard automata theory, as far as I know, has never really considered the case of non-deterministic Moore machines but the extension is trivial. Milner’s transition systems are just labeled directed graphs with a root vertex – state machines. Each vertex is a state. The output of a state is a the set of labels that label edges starting at that state plus a binary accept value. Associate each finite sequence of labels with a set of possible vertexes that can be reached from root vertex following the sequence of labels and construct a set of pairs (s,Q) for each sequence s so that Q is that set of possible destinations. As far as I can tell on a quick look, Milner’s bisimulation is simply equality of those sets of pairs between two of these graphs.
State machines represent composition and communication in terms of automata products
Another common argument in the process algebra literature is that automata cannot express interaction. Here’s a relatively complete argument.
The simplest model of behaviour is to see behaviour as an input/output function. A value or input is given at the beginning of the process, and at some moment there is a(nother) value as outcome or output. This model was used to advantage as the simplest model of the behaviour of a computer program in computer science, from the start of the subject in the middle of the twentieth century. It was instrumental in the development of (finite state) automata theory. In automata theory, a process is modeled as an automaton. An automaton has a number of states and a number of transitions, going from one state to a(nother) state. A transition denotes the execution of an (elementary) action, the basic unit of behaviour. Besides, there is an initial state (sometimes, more than one) and a number of final states.A behaviour is a run, i.e. a path from initial state to final state. Important from the beginning is when to consider two automata equal, expressed by a notion of equivalence.
On automata, the basic notion of equivalence is language equivalence: a behaviour is characterized by the set of executions from the initial state to a final state. An algebra that allows equational reasoning about automata is the algebra of regular expressions (see e.g. [58]). Later on, this model was found to be lacking in several situations. Basically, what is missing is the notion of interaction: during the execution from initial state to final state, a system may interact with another system. This is needed in order to describe parallel or distributed systems, or so-called reactive systems. When dealing with interacting systems, we say we are doing concurrency theory, so concurrency theory is the theory of interacting, parallel and/or distributed systems.[BAETTEN]
We see the same argument about language equivalence discussed above and a second argument about interaction that is actually more difficult to understand. Here is a product of automata in which each automaton can communicate with (interact with) each other automaton from a paper published in 1962.
Definition 1. Let {M1, M2.. ., Mn} be a set of Moore type machines in which the outputs of any machine Mi i = 1, 2,…, n, may be used as inputs to other machines. We shall say that this set of interconnected machines is concurrently operating if the next state (state at time t + 1) of each machine Mi depends on the present state of Mi, the present outputs (which depend only on the present states) of the machines to which it is connected, and the present external input. The ordered n-tuple (or “configuration”) of the present states of the machines M1 M2 . . ., Mn will be referred to as the state of the interconnected machine.[LOOP]
This product is certainly flexible enough to describe any type of communication. In fact, it can easily accommodate the type of synchronous message passing considered fundamental in the process algebras. Suppose we have Moore type state machines that output requests to send and receive messages on channels. Put n of them together in a product of the type Hartmanis defines and add another state machine to act as a scheduler. The input to the scheduler is the set of requests of the other machines. The output of the scheduler is a prioritization of the other factor machines. The highest priority machine then can act by either using the external input or the output of a communicating factor machine – which will also make a transition.
[MILNER] A calculus of communicating systems, Robin Milner. Springer (LNCS 92), 1980. ISBN 3-540-10235-3
[MILNER2] The Polyadic pi-Calculus: a Tutorial (1991)
[RABINSCOTT] M. O. Rabin and D. Scott, “Finite Automata and their Decision Problems”, IBM Journal of Research and Development, 3:2 (1959) pp.115-125.
[HARTMANIS] Hartmanis, J.; and Stearns, R. E. (1966), Algebraic Structure Theory of Sequential Machines (Prentice Hall)
[LOOP] Juris Hartmanis: Loop-Free Structure of Sequential Machines Information and Control 5(1): 25-43 (1962)
[HALGEBRA] Hoare, C. A. 1993. Algebra and models. In Proceedings of the 1st ACM SIGSOFT Symposium on Foundations of Software Engineering (Los Angeles, California, United States, December 08 – 10, 1993). D. Notkin, Ed. SIGSOFT ’93. ACM, New York, NY, 1-8. DOI= http://doi.acm.org/10.1145/256428.167053
[HENNESSY] Hennessy, M. and Milner, R. 1985. Algebraic laws for nondeterminism and concurrency. J. ACM 32, 1 (Jan. 1985), 137-161. DOI= http://doi.acm.org/10.1145/2455.2460
[BAETEN] Baeten, J. C. 2005. A brief history of process algebra. Theor. Comput. Sci. 335, 2-3 (May. 2005), 131-146. DOI= http://dx.doi.org/10.1016/j.tcs.2004.07.036