The Liar paradox is an old chestnut: consider “This sentence is false”. The naive analysis runs thus: assume it is true; then it must be as it says, that is, false, a contradiction. So assume it is false; then it really is as it says, that is, true, also a contradiction! How to solve this? The classical mathematical approach is simply to ban such self-referential sentences. But that throws the baby out with the bathwater: there’s nothing paradoxical about “This sentence has five words”, or even about “This sentence has one hundred words”.
The Liar is about an approach to solving the paradox, rather than simply banning it. The usual way to analyse such cases is to build a mathematical model in set theory, then use it to define and analyse the truth values of constructs modelled this way. In the case of the Liar paradox, this involves modelling self-referential propositions. But classical (ZF) set theory is formulated to avoid self-referential sets: the Axiom of Foundation is there to prevent it. Sets have members: these members can themselves be sets, with members of their own. What the Axiom of Foundation says is that if you follow this membership relation downwards, you always eventually get to the “bottom”: atomic members that are not sets (or are the empty set, which has no members, sets or otherwise). This means there are no infinitely descending chains of membership (it isn’t “turtles all the way down”), and there are no circular membership relations (a set cannot be a member of itself, which is what makes it hard to model self-reference).
So, when your mathematics isn’t up to the job – use different mathematics! In this case, the authors use Aczel’s brand of nonwellfounded set theory as a basis for building their models (despite what it might sound like from its name, it is a perfectly well-defined and consistent mathematical theory). In chapter 3, the authors summarise this theory in enough detail to understand how it is being used in their subsequent modelling of the paradox. The approach has a visual representation, in modelling sets as graphs (of the membership relation): wellfounded sets must have acyclic graphs; nonwellfounded sets can have cycles in their graphs. This gives a nice intuition for what’s going on, and the explanations have a good mix of English text and mathematical rigour. It can sometimes be a bit confusing, however. For example:
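The graph picture is easy to play with in code. Here is a minimal sketch of my own (not from the book): represent the membership relation as an adjacency dict, so that a wellfounded picture is exactly an acyclic one, and a hyperset such as Ω = {Ω} shows up as a cycle.

```python
# Membership graphs: each node maps to the list of its "members".
# Wellfounded sets have acyclic pictures; the hyperset Omega = {Omega}
# appears as a self-loop.

def has_cycle(graph, node, path=()):
    # Depth-first search along the membership relation, tracking the
    # current path to spot a node that can reach itself.
    if node in path:
        return True
    return any(has_cycle(graph, child, path + (node,))
               for child in graph.get(node, []))

von_neumann_2 = {'2': ['0', '1'], '1': ['0'], '0': []}   # 2 = {0, 1} = {{}, {{}}}
omega = {'omega': ['omega']}                             # Omega = {Omega}

print(has_cycle(von_neumann_2, '2'))  # False: wellfounded
print(has_cycle(omega, 'omega'))      # True: nonwellfounded
```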
Here “Aczel’s conception of a set” refers to these nonwellfounded sets (or hypersets), now defined in terms of graphs. A graph is “a set of nodes”. What kind of set are these nodes? Wellfounded? Nonwellfounded (relying on a circular definition)? Does it make a difference?
(A very minor problem with the exposition is due to the example atomic members chosen. On p40 we get the equation a = {Max, a}. I had a moment’s confusion, trying to work out what was being maximised, before I remembered that the atoms in the example language include the authors’ children’s names Max and Claire. Moral of this tale: if you are a logician, do not name your children after mathematical functions!)
Now, these new hypersets look strange. In fact, they initially look so counter-intuitive that they must be “wrong”. But that’s because we have been brought up on wellfounded set theory, with its assumptions now bedded into our intuition. We have “got used to it”. But we can get used to nonwellfounded sets, too:
So, once they have a mathematical toolkit up to the job, the authors go ahead with a traditional approach: use this set theory to model the various self-referential sentences, statements and propositions; give this a semantics or two; analyse the resulting systems. They analyse the system in two different ways, which they call “Russellian” and “Austinian”. (They emphasise that these are not actually the approaches that Russell and Austin advocated, but that they are in the spirit of their approaches.) The analyses give different answers. (What, you wanted the answer? But why are you surprised that the answer depends on how you formulate the question?)
Summarising brutally, and inevitably misleadingly, the analyses run as follows.
The Russellian analysis rests on a subtle distinction between denial and negation. Negation is a “positive” statement: it states that there are facts of the world that make proposition p false. Denial is a “negative” statement: it denies that there are facts that make p true. And these are not (in this formulation) the same thing. (p79: the fact of it being false is not a fact of the world.) The analysis shows that the naive formulation conflates these two.
The Austinian analysis rests on the approach that propositions are made in the context of situations, and can have different truth values in different situations. The analysis shows that the naive formulation confuses different situations. It takes the form of a “diagonalisation” argument: assume you know all the facts of the world, then construct a new fact that is true, but is not in your original set.
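The diagonal move here has the same shape as Cantor’s classic argument. A tiny sketch of my own (an analogy, not the book’s construction): given any finite list of 0/1 “fact rows”, flip the k-th entry of the k-th row; the result differs from every listed row, so it cannot be in the original list.

```python
def diagonalise(rows):
    # Cantor-style diagonal: flip the k-th entry of the k-th row,
    # producing a row that differs from every row in the list.
    return [1 - rows[k][k] for k in range(len(rows))]

rows = [[0, 1, 0],
        [1, 1, 1],
        [0, 0, 0]]
new = diagonalise(rows)
print(new)            # [1, 0, 1]
print(new in rows)    # False: not in the original list
```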
This is all fascinating and informative. As usual, naive analysis breaks down at extreme or “edge” cases. What we have here is a thorough analysis of these edges that does not shirk the problem by simply banning it, but takes it seriously, and applies a mathematical approach that fits the problem, thereby shedding light, and uncovering assumptions.
Interesting though it is, I didn’t initially set out to read this to discover more about the Liar paradox: I read it to find out more about nonwellfounded set theory. That is because a delegate’s throwaway remark during a conference made me wonder if it might be useful for thinking about emergent properties. I hadn’t previously come across it (despite the fact that ideas like bisimulation, and some hairier branches of computer science, are apparently based on it, or equivalent to it). So I could have stopped reading after chapter 3, but it was too interesting! Anyway, I was heartened to see the pictorial approach, and even more heartened to see that much of my existing intuition would probably be fine:
But that comment about induction is interesting. I’m glad I carried on reading, because in chapter 4 we get:
The hairs on the back of my neck rose when I read that. It seems to imply that the whole basis of reductionism is wellfoundedness. (It might be true that the axiom of foundation has played almost no role in mathematics outside of set theory itself, but set theory has had an enormous impact on the way scientists model the world.) In this view, we start at the bottom, with the “atoms”, and construct things inductively. Everything not required (constructible this way) is forbidden. Because wellfoundedness is so strongly entrenched, this seems like the natural, maybe the only, way to do it. But nonwellfoundedness isn’t like this. There are nonwellfounded things that just cannot be built this way: there are things with no “bottom” or “beginning” (it can be turtles all the way down, or all the way back in time, in this model!); there are things that are intrinsically circular, self-referential, chicken and egg, strange loops even. Now everything not forbidden is compulsory; now there is so much more room to have the kind of things we need. But, what would it mean to have, even to “construct”, material things with such nonwellfounded structure?
So, I’m off to read more about this, to see if it might be a better way to model emergent and self-organising systems than classical wellfounded set theory.
The authors explore the properties of diagrams, charts, and maps, and their use in problem solving and teaching basic reasoning skills. As computers make visual representations more commonplace, it is important for professionals, researchers and students in computer science, philosophy, and logic to develop an understanding of these tools; this book clarifies the relationship between visuals and information.
This is the book I came to after reading The Liar, in the hope of finding out more about non-wellfounded set theory, as the notions of circular reference seem important in a theory of emergence. Circularity has a bad press in parts of mathematics, where everything is based on (well-founded) set theory, which bans circular reference. But the new theory removes that restriction:
One area of circularity is self-application, a thing computer scientists are fond of doing. Here there is some discussion of partial evaluation, compiler compilers, and the relationship between them, resulting in self-applicative formulae like [mix](mix,mix). My question is, can this be extended to physical processes, like life, and get round some of the problems Rosen has modelling life using (well-founded) set theory? For modelling autocatalytic sets of chemicals? For getting around the other cyclic "chicken-and-egg" descriptions of the origin of life? I didn't get an answer from this book, but I got more to think about.
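Self-application need not be exotic. A tiny illustration of my own (nothing to do with the book’s actual partial-evaluation machinery): a higher-order function applied to itself is perfectly meaningful, and even useful.

```python
def twice(f):
    # Return the function that applies f two times.
    return lambda x: f(f(x))

inc = lambda x: x + 1
add2 = twice(inc)          # ordinary use: apply inc twice
add4 = twice(twice)(inc)   # self-application: twice applied to itself

print(add2(0))  # 2
print(add4(0))  # 4
```

Unwinding `twice(twice)` gives “apply `twice` twice”, i.e. apply the argument function four times, which is why `add4` adds four.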
Circularity isn't only interesting for deep questions about life and compiler compilers. It is helpful when defining something in terms of itself is the most natural approach. The idea is to give such definitions a meaning. There are some nice little examples here, which don't need to go anywhere near the full machinery developed. One particularly cute one, on p53, asks, of a sequence of fair coin flips: what is the probability that the first time we get heads it will be on an even-numbered flip? It shows, using circularity (going through an argument and returning to the original situation, so defining something in terms of itself), that the answer is 1/3 (and hence, incidentally, that it is possible to use a fair coin to make a three-way choice).
Another intriguing little sidebar is the relationship to game semantics.
(Ehrenfeucht-Fraïssé games were first cast in this form in 1961.) This passage struck a chord for me, from way back in the late 1970s, when I was doing my undergraduate degree. In the calculus term of the "mathematics for physicists" part of the course, many of the results were posed as Epsilon-Delta definitions. Our lecturer explicitly called this style of definition a game: "If I give you an \(\epsilon\) [ie, \(\forall \epsilon\) ], you can always find a \(\delta\) [ie, \(\exists \delta\) ] such that the property in question holds."
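In standard notation, the game is exactly the \(\forall\epsilon\,\exists\delta\) quantifier alternation in, for example, the limit definition:

```latex
% Game reading: whatever \epsilon the opponent plays, I can answer
% with a \delta that keeps f within \epsilon of L.
\lim_{x \to a} f(x) = L
  \quad\iff\quad
  \forall \epsilon > 0 \;\; \exists \delta > 0 \;\; \forall x \,
  \bigl( 0 < |x - a| < \delta \implies |f(x) - L| < \epsilon \bigr)
```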
There are also helpful comments on the status of modelling (whether or not using set theory), that need to be remembered:
The authors go on to discuss the Liar paradox from this perspective: Say this same Jones also models the Liar sentences, and discovers an identity that leads to the paradox. But the fact that [these models of the Liar sentences] are identical is evidence of an inadequacy of his modelling scheme. [This identity] shouldn't be regarded as a discovery about the meaning of [the Liar sentences].
The crucial difference between wellfounded and nonwellfounded sets (or, less pejoratively, hypersets) is:
That is, any binary relationship graph is isomorphic to a (hyper)set membership graph. With classical wellfounded sets, the binary relations have to be acyclic (sets cannot contain themselves) and have "bottom" elements (the set membership relation has to have initial "atoms" that are not themselves sets). With nonwellfounded sets, neither has to be true. The binary relation R can have cycles (a R a) and can have chains with no "beginning" (think of the successor relation over the integers, rather than just the natural numbers). So they can define the hyperset of (infinite) binary trees without the need to define any leaves, for example.
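Deciding whether two such graphs depict the same hyperset is done by bisimulation. A minimal sketch of my own (not the book's algorithm): compute the greatest relation under which related nodes can match each other's children, by starting from all pairs and discarding pairs that fail.

```python
# Nodes are dict keys; values are lists of child nodes (the "members").

def bisimilar(g1, a, g2, b):
    # Greatest-fixpoint computation: relate every pair of nodes, then
    # repeatedly discard pairs whose children cannot be matched up
    # under the current relation.
    rel = {(x, y) for x in g1 for y in g2}
    changed = True
    while changed:
        changed = False
        for (x, y) in list(rel):
            ok = (all(any((c, d) in rel for d in g2[y]) for c in g1[x])
                  and all(any((c, d) in rel for c in g1[x]) for d in g2[y]))
            if not ok:
                rel.discard((x, y))
                changed = True
    return (a, b) in rel

omega1 = {'a': ['a']}                 # one node with a self-loop...
omega2 = {'p': ['q'], 'q': ['p']}     # ...and a two-node cycle:
print(bisimilar(omega1, 'a', omega2, 'p'))  # True: both depict Omega = {Omega}
```

Two differently shaped graphs can thus be pictures of the same hyperset, which is why hyperset identity is bisimilarity of graphs rather than graph isomorphism.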
A remark towards the end throws an interesting light on mathematicians and logicians:
Who would have thought that mathematicians were worried about modelling real phenomena, as opposed to building elegant abstract theories? Indeed, much of this book is written in a dense, pure mathematical style: "Definition: Γ is smooth iff Γ is monotone, proper, and if almost all sets of urelements are very new for Γ", and so on. All the coalgebras and other mathematics are beyond my level of mathematical knowledge, so I had to skim those parts, trying to extract the message, if not the details: always dangerous with mathematics! (And all the talk of smooth operators reminded me irresistibly of that dastardly character Curly Pi.) A few pages later, they admit to not having written the book I wanted (not their fault, of course!):
I wanted more on applications, how to use those hypersets to do something interesting computationally. But the reference to Milner's work, and also to the importance of the technique of bisimulation for deciding if two hypersets are the same, at least points me in the right direction next: I should look more at CCS.
So, I got a lot out of this book. But nowhere near as much as it has in it! If I had, I would have certainly given it a significantly higher rating.
The authors, observing that information flow is possible only within a connected distribution system, provide a mathematically rigorous, philosophically sound foundation for a science of information. They illustrate their theory by applying it to a wide range of phenomena, from file transfer to DNA, from quantum mechanics to speech act theory. An important interdisciplinary text, this book is ideal for graduate students and researchers in mathematics, computer science, philosophy, linguistics, logic, and cognitive science.