automata 071 Despite some deep results, algebraic automata theory has fallen out of favor in theoretical computer science. Reasons include the disciplines failings such as a love of over-generality, weak mathematical background of people working on “formal methods”, and gap between theoreticians and engineers. But perhaps the key reason is that traditional state machine presentations in terms of state sets and transition maps are awkward and carry too much irrelevant information.

The standard presentation of a Moore machine is  M=(A,X,S,st,δ,γ)  where  “A” is the alphabet of events, “X” is the set of outputs, “st” is the start state,  δ:S x A→ S is the transition function and γ:S → X is the output function.  But  we really don’t care about the state set in all but the simplest state machines. Interesting systems have ridiculously large numbers of states.  Names of states don’t matter.  Extra states don’t matter. Suppose “s” is an element of “S” that is unreachable – no sequence of inputs will drive M to “s” from “st”. In that case what difference does removing “s” from “S” make? Or suppose s and s’ are both reachable but not only is γ(s)=γ(s’) but for any sequence of inputs that can drive the machine from s to a new state, the output of that new state is identical to the output of the state reached by following the same sequence of inputs from s’.  In that case, we could simplify S and δ by removing one of the duplicative states and merging paths.

What’s more interesting is to consider what output a Moore machine will produce from a sequence of inputs: thinking of the Moore machine as a map from input strings to outputs. This is  a function that can be constructed by primitive recursion on strings.  First, extend δ to strings to define a new function Δ. The   empty string that has zero length doesn’t cause any state change Δ(s,empty)=s   and if “w” is a string and “wa” is the string obtained by appending “a” to “w”, then Δ(s,wa)= δ(Δ(s,w),a).  Define  M*(w) = γ(Δ(st,w)). The idea is that M*(w) is the output produced by M in the state reached by following “w” from the initial state.  If you look at M* you see it contains all the interesting information in M -  because the names of states and unreachable states and duplicate states are not interesting. In fact in the 1950s Myhill and Nerode described how to generate a state machine (not necessarily a finite state one) from any function on the set of strings over an alphabet.milenr 001

These functions are much more useful in describing complex state systems than are Moore machines – not least because they can be partially specified when we don’t know or care about all the possible behaviors of the system and because they can be composed in a way to represent any interconnection.

In the next post, I’m going to discuss composition of message passing processes and compare to the “formal methods” approach which is based on the incorrect assumption that automata need to be “enhanced” in some way and that replaces automata with things that have very little mathematical structure at all.

Recursion and state
Tagged on: