[Edited 9/10] There’s an easy connection between data structures and basic abstract algebra that may or may not mean anything, but keeps getting rediscovered. I’ve never seen it clearly explained (anyone with a reference, or anyone who notices an error in my rusty algebra, please send me a note! A discrete note would be nice :-) ).

What keeps getting rediscovered is that objects with invertible and commutative methods would make things very convenient, especially for transactions and for concurrent access. What I have not seen explained is that this discovery reflects the distinction between groups and monoids.

A monoid is one of the simplest algebraic structures and is usually given as a set M of elements and a binary operation \(\circ\) on elements of M with the following properties:

  • Closure: for every a and b in M, \(a \circ b\in M \).
  • Associative: \(a \circ (b \circ c) = (a \circ b) \circ c\) for every a, b, and c in M.
  • Identity: M contains an element I so that for every \( a \in M, I \circ a = a \circ I =a \).

That’s it. A pair \((M,\circ)\) and three conditions: closure, associativity, and identity. The natural numbers 0,1,2,… form a monoid with ordinary integer addition as the operation and 0 as the identity. The positive integers form a monoid with 1 as the identity and multiplication as the operation. More intriguingly for our purposes, the methods of an abstract data type or object generate a monoid.
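To make the definition concrete, here is a small Python spot-check (my own sketch, not from the original) of the two example monoids. Closure can’t be exhaustively tested on an infinite set, so only associativity and identity are sampled:

```python
from itertools import product

def check_monoid_laws(sample, op, identity):
    """Spot-check associativity and identity on a finite sample of elements."""
    for a, b, c in product(sample, repeat=3):
        assert op(op(a, b), c) == op(a, op(b, c))       # associativity
    for a in sample:
        assert op(identity, a) == a == op(a, identity)  # identity

# The natural numbers under addition, identity 0:
check_monoid_laws(range(10), lambda a, b: a + b, 0)
# The positive integers under multiplication, identity 1:
check_monoid_laws(range(1, 10), lambda a, b: a * b, 1)
```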

A monoid \( (M, \circ)\) is “generated” from a set X if and only if X is a subset of M and every element of M can be constructed from elements of X using the operation. For example, the monoid of non-negative integers with addition as the operation can be generated from \(\{0,1\}\) since \(2= 1+1, 3 = (1+1)+1\) and so on. One really simple type of monoid is the set \(E^*\) of all finite strings that can be generated from an alphabet E, with string concatenation as the operator and the empty string as the identity.
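Generation is easy to mechanize. Here is a sketch (my code, with names I made up) that enumerates the elements reachable from a set of generators, applied to the string monoid over \(\{a, b\}\):

```python
def generate(generators, op, identity, depth):
    """Every element reachable from the generators in at most `depth` applications of op."""
    elements, frontier = {identity}, {identity}
    for _ in range(depth):
        frontier = {op(x, g) for x in frontier for g in generators}
        elements |= frontier
    return elements

# The string monoid over the alphabet {'a', 'b'}, truncated at length 3:
print(sorted(generate({'a', 'b'}, lambda x, y: x + y, '', 3)))
# prints '', plus every string over {a, b} of length at most 3
```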

We can use the methods of an object as the generators of a monoid as follows. Let’s define an object by a set V of possible values and a set F of methods that can act on the object. A method is really just a function from V to V, so that if the object has value v and we apply method f, the object value becomes f(v). The binary associative operation of “function composition” can be used to construct “programs” from methods: \((f \circ (g \circ h))(v) = f(g(h(v)))\). The monoid generated by the methods, then, is the set of programs that can be constructed using function composition on the methods. All we have to do further is insist that the null method \(Null(v)= v\) is in F.
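In Python this construction is a one-liner; the toy methods below are my own examples, not anything from the post:

```python
def compose(f, g):
    """The monoid operation on methods: (f . g)(v) = f(g(v))."""
    return lambda v: f(g(v))

null = lambda v: v            # the identity element Null

# Two toy methods on an integer-valued object:
inc = lambda v: v + 1
double = lambda v: v * 2

program = compose(inc, compose(double, inc))    # inc after double after inc
assert program(3) == 9                          # ((3 + 1) * 2) + 1
assert compose(null, inc)(3) == compose(inc, null)(3) == inc(3)
```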

For example, if our data structure is a stack and our methods are \(push_v\) for every v in some set of values V, plus pop, add (pop the top two values, add them, and push the result), sub, dup, swap, and \(null\), we can build a Forth Monoid.
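Here is one possible rendering of those methods as functions from stacks to stacks, which is exactly the V-to-V view from the previous paragraph (the representation, tuples with top of stack on the right, is my choice):

```python
def push(v):
    return lambda s: s + (v,)

def pop(s):   return s[:-1]
def dup(s):   return s + (s[-1],)
def swap(s):  return s[:-2] + (s[-1], s[-2])
def add(s):   return s[:-2] + (s[-2] + s[-1],)
def sub(s):   return s[:-2] + (s[-2] - s[-1],)
def null(s):  return s

s = (100, 200)
assert add(s) == (300,)      # (0, 300) maps to (300,) too: information is lost
assert pop(dup(s)) == s      # pop undoes dup
assert swap(swap(s)) == s    # swap is its own inverse
```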

If a is an element of a monoid and there is a second element b so that \(a \circ b = I\), then b is called the “inverse” of a. If there is an inverse for every element of the monoid, then the monoid is called a “group”. (Note that in a group, if b is the inverse of a, then a is the inverse of b, because of associativity.) If the group operation commutes, the group is called Abelian or commutative.

Most monoids are not groups. For example, our Forth Monoid has an inverse for “dup” (“pop”), but pop does not have an inverse for a very interesting reason: pop loses information. Add has no inverse for the same reason: if the stack is (100,200) and an add method produces (300), then there is no operation that will deduce magically that 100 and 200 were on the stack before, and not (0,300) or (1,299) or …  Monoids that are not groups lose information when they compute programs. If an object is associated with a group, then no matter what the starting value of the object, if we know the program that was applied to the object, we can produce (in theory) an anti-program that returns us to the initial state. An example of an object that generates a group is an incrementer with the single method “add1” with values over the set \(0\) to \(2^{32}-1\). The inverse of add1 is sub1, which can be generated by \(2^{32}-1\) add1’s (note that the words “efficient” or “practical” did not appear anywhere above).
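A sketch of that incrementer group (my code), with the “sub1 is many add1’s” claim verified on a small model so it actually runs quickly:

```python
M = 2 ** 32

def add1(v): return (v + 1) % M
def sub1(v): return (v - 1) % M    # equals add1 composed with itself 2**32 - 1 times

assert sub1(add1(41)) == 41
assert add1(M - 1) == 0            # wraparound is what makes this a group

# Verify the generated inverse on a small model: 2**4 instead of 2**32.
m, v = 2 ** 4, 5
for _ in range(m - 1):             # apply add1 (mod m) m - 1 times
    v = (v + 1) % m
assert v == 4                      # the same answer sub1 gives mod 16
```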

It has been noticed more than once that data structures whose monoids are groups would be easier to parallelize or to use for transactions. For example, if programs can be inverted, so that for every program p there is a \(-p\) with \(-p \circ p = I\), then we can roll back to earlier states. That is, if we apply f, g, h to the data structure and then realize that the transaction cannot be completed, we know that \(-h(-g(-f(f(g(h(x)))))) = x\). We do not have to know what “x” was to undo the transactions. If the programs “commute”, life is even easier, since undoing a single method application can be done at any time, without undoing other method applications. With a little thought you can see that a commutative group data structure is convenient for parallel computation, because the order of operations doesn’t matter (less synchronization required) and because every operation is reversible. So if task T starts modifying the data structure and more important task K interrupts, T’s partially completed changes to the data structure can be rolled back and K can do what it needs.
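One way this rollback idea might look in code (a sketch under my own naming, not anything from Herlihy or the post): record each applied method together with its inverse, and on abort replay the inverses in reverse order.

```python
class Transaction:
    """Apply methods while recording their inverses; abort by replaying them backwards."""
    def __init__(self, value):
        self.value = value
        self.undo_log = []                       # inverses of applied methods

    def apply(self, f, f_inv):
        self.value = f(self.value)
        self.undo_log.append(f_inv)

    def rollback(self):
        while self.undo_log:                     # undo in reverse order of application
            self.value = self.undo_log.pop()(self.value)

t = Transaction(10)
t.apply(lambda v: v + 5, lambda v: v - 5)        # f and -f
t.apply(lambda v: v * 2, lambda v: v // 2)       # g and -g
assert t.value == 30
t.rollback()
assert t.value == 10                             # back to x without knowing x
```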

I’m not completely convinced that stumbling into group theory tells us anything useful about data structures, except that computer science is hard and is mostly concerned with horribly irregular structures that do not possess convenient properties. But groups and monoids keep popping up in computer science, even though theoretical CS people seem not to be interested anymore. Famously, state machines determine monoids (or semigroups), and there is a body of mathematics called Krohn-Rhodes theory which is concerned with factoring monoids into cascades of groups and residual monoids, corresponding to splitting state machines into pipelines that have no feedback. There is also a nice relationship between logging and group structure: most data structures cannot be practically converted into group-like structures, even though every monoid can be made into a group by adding a log.

Suppose we make our Forth object have values (data-stack, log-stack), where the log-stack is essentially a stack of data removed from the data stack: “pop” removes the top element of the data stack and pushes it onto the log stack, and “invert-pop” pops the log-stack and pushes the value back onto the data stack. I wonder if “bounded invertibility” would be an interesting property; the interesting Forth objects have a bounded stack anyway. Suppose we add a smaller log-stack so we can invert some programs: the methods are invertible, and programs constructed by composing at most K methods are invertible.
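A sketch of that (data-stack, log-stack) idea; the bound K on the log is my rendering of the “bounded invertibility” speculation, not anything pinned down in the post:

```python
K = 8   # hypothetical bound: programs of at most K methods stay invertible

def pop(state):
    data, log = state
    if len(log) >= K:
        log = log[1:]                # oldest log entry falls off: that pop is no longer invertible
    return data[:-1], log + (data[-1],)

def invert_pop(state):
    data, log = state
    return data + (log[-1],), log[:-1]

s = ((100, 200), ())
assert invert_pop(pop(s)) == s       # pop is invertible while its log entry survives
```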

Anyways, there is something worth digging out in all this: think of filesystems designed to be K-deep group-like.

(I was inspired or provoked to produce this note after attending a talk by Maurice Herlihy at UT. I can’t say I understood much of Professor Herlihy’s talk (my fault, not his) but he was discussing the utility of converting data structure monoids to groups without mentioning either groups or monoids. I also found out that distributed shared memory is still around, how depressing.)
