Originally Nov 30, 2017 (revised 2021, 2023).
Russell introduced what is often viewed as his most original contribution to mathematical logic: his theory of types—which in essence tries to distinguish between sets, sets of sets, etc. by considering them to be of different “types”, and then restricts how they can be combined. I must say that I consider types to be something of a hack. And indeed I have always felt that the related idea of “data types” has very much served to hold up the long-term development of programming languages. (Mathematica, for example, gets great flexibility precisely from avoiding the use of types—even if internally it does use something like them to achieve various practical efficiencies.) – Wolfram.
There are two different and incompatible reasons for being interested in types. The first comes out of metamathematics, via Russell. The second is to prevent type errors and to facilitate efficient compilation and execution in programming languages. Programming type errors mostly involve combining elements that have incompatible semantics but compatible representations. For example, \(x+y\) should probably fail at compile time if \(x\) is an int and \(y\) is a string. And the Mars Climate Orbiter was famously lost because quantities in metric units and English units were mixed. If \(x\) is an integer representing centimeters and \(y\) is an integer representing inches, \(x+y\) should either provoke a type conversion (which may be impossible to get right) or an error, preferably in the compilation stage, not in the combustion stage. Programming language types can also facilitate efficient computation, by making it possible to compile, for example, an addition of C unsigned ints to a single machine instruction, or to SIMD instructions. Applying Russell's and other metamathematical type systems (such as the typed lambda calculus) to programming languages as a way of addressing programming-language types is common in CS research, but I think that effort is mistaken.
- In mathematics, types are not sets and subtypes are not subsets. The confusion may come from loose usage in mathematical practice, where we often see references to the integers as a subset of the reals. But in mathematical usage people generally understand that while the integers are a subset of the real numbers, "the integers" usually means "the algebraic system Z of all integers with the usual binary operations of addition and multiplication" (Mac Lane, Birkhoff). This is why it is no surprise that \(nx = 1\) generally has no solution in "the integers" although it has one in the rationals whenever \(n \neq 0\), and that \(m + x = n\) may have no solution in the natural numbers but does in the integers.
- In computer programming, types are definitely not sets, nor are subtypes subsets. Barbara Liskov observed that Square objects cannot behave as a subtype of Rectangle objects, even though squares are a subset of rectangles: there is a well-defined operation that changes the length of one pair of edges of a Rectangle, leaving the other pair unchanged, but that operation violates Square semantics. The ISO C standard has created a vast muddle by attempting to impose set-theoretic semantics that are simultaneously baroque and too simple. In both cases the nature of set theory – deliberately having the most limited, most general semantics possible – is well suited to metamathematics but awkward for the complex types found in programming. The simple semantics of sets means that richer objects need more and more axioms. Consider what a set-theoretic specification of floating point numbers would look like.
- Hierarchies. Mathematical types are not organized in a tree structure. You can start with the lambda calculus or primitive recursion on monomials and build out a theory of the real numbers, or start with the system of reals and restrict yourself to Peano Arithmetic, or to finite arithmetic for that matter. The group \(Z_5\) is not a child of the group \(Z\), even though you can map the elements of \(Z_5\) into \(Z\). Relationships between mathematical systems are complex, and structural relationships have to be discovered before they can be axiomatized.
- Implicit Promotion – FORTRAN may have been the original sinner, automatically promoting "integer" fixed point numbers to floating point in expressions like i*r. This leads to confusion; something like float(i)*r should have been required in such expressions. Implicit promotion is a major weakness in C, for example, where there are paradoxical interactions between promotion and types.
Unfortunately, the standard alternative is too ad hoc. For example, typedefs in C are apparently not to be taken seriously. The following program compiles without a single error or warning.
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
  typedef int metric_t;
  typedef int english_t;
  metric_t x;
  english_t y;

  if (argc < 3) {
    printf("test inches centimeters\n");
    exit(0);
  }
  y = atoi(argv[1]);
  x = atoi(argv[2]);
  printf("x= %d y=%d x+y = %d\n", x, y, x + y);
  return 0;
}
The way to get around this in C (I wish I could remember where I read this, to credit the author) is to use structures.
What would be nice is some mathematical representation theory that lets us reason about concrete types and the abstract ones they may represent. Digital computers operate on one type of data: sequences of bits. We have various, often partial, mappings from bit sequences to mathematical objects and to other kinds of objects (e.g. images, videos, rules, executable programs). For example, a sequence of n bits \(s=(s_1,\dots,s_n)\) can map to a non-negative integer by
\[ \mu s = \sum_{i=1}^n 2^{i-1} s_i \]
In most digital computers, addition and multiplication are supported on sequences of those lengths so that diagrams commute, in the sense that for the computer operation "add", \(\mu\,\mathrm{add}(s,u) = (\mu s + \mu u) \bmod 2^n\). Thus \(\mu\) is not just a map \(\mu:\{0,1\}^n\to \mathcal{N}\) but a map into the system \((\mathcal{N}_k,+,\times)\) where \(k = 2^n\) for word length \(n\). The map is both more and less than that because, for example, \(\mu\,\mathrm{div}(s,u) = \lfloor \mu s / \mu u \rfloor\), and there are additional operations like bitwise arithmetic and shifting. And this is the simplest of these kinds of mappings. The floating point mappings from bit sequences to reals are immensely complex because of rounding and the peculiar non-uniform distribution, so that, e.g., if \(\phi\) is the mapping from bit sequences of length 64 to reals using floating point representation, then \(\phi u - \phi s = \phi u\) does not necessarily imply that \(\phi s = 0\).