- *Mathematics and Humour*. 1980
- *Innumeracy: mathematical illiteracy and its consequences*. 1988
- *Beyond Numeracy: an uncommon dictionary of mathematics*. 1991
- *A Mathematician Reads the Newspaper*. 1995
- *Once Upon a Number*. 1998

- The self is a conceptual chimera. 2006. (In *What is Your Dangerous Idea?*)

Paulos here investigates the differences, and similarities, between everyday and mathematical reasoning. His main investigation is of statistical arguments, with smaller forays into logic and complexity. He contrasts the anecdotes, stories, and myths that we use in everyday life with the mathematical counterparts that have grown and been refined from them. His main thrust is that these give us two connected but different ways of understanding the world, and that we should be alert to these differences. It is important that we don't go about:

p5.
mistaking anecdotes for statistical
evidence or ... taking averages to be descriptive of individual cases

It's not that one way is right, and one is wrong. They offer two possible ways of understanding the world.

p26.
Stories and statistics offer us
complementary choices of knowing a lot about a few people or knowing a
little about many people.

Each has its problems, and each its advantages. For example, anecdotal reasoning can lead us to make incorrect generalisations, because the very richness of the story that makes it so compelling gives us so much information that there are accidental correlations.

pp26-27.
If the number of traits considered is
large compared to the number of people being surveyed, there will
appear to be more of a relationship among the traits than actually
obtains. ...

You can find perfect correlations that mean nothing for any *N* people and *N*
characteristics. ...

Just as stories are sometimes a corrective to the excessive abstraction of statistics, statistics are sometimes a corrective to the misleading richness of stories.
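
The effect Paulos describes is easy to reproduce with coin flips. The following is only a rough sketch, not anything from the book: the function names, the sizes (5 people with 200 traits, then 200 people with 5 traits) and the fixed seed are all illustrative assumptions. Every "trait" is pure noise, yet with few people and many traits some pairs of traits line up perfectly.

```python
import random
from itertools import combinations

def random_traits(n_people, n_traits, rng):
    """One list of 0/1 answers per trait; every answer is an independent coin flip."""
    return [[rng.randint(0, 1) for _ in range(n_people)] for _ in range(n_traits)]

def perfectly_related_pairs(traits):
    """Count pairs of non-constant traits that agree, or disagree, on every person."""
    # Traits that everyone (or no one) has carry no information, so skip them.
    varying = [t for t in traits if 0 < sum(t) < len(t)]
    count = 0
    for a, b in combinations(varying, 2):
        same = all(x == y for x, y in zip(a, b))
        opposite = all(x != y for x, y in zip(a, b))
        if same or opposite:
            count += 1
    return count

rng = random.Random(0)  # fixed seed (an arbitrary choice) so the run is repeatable
few_people = perfectly_related_pairs(random_traits(5, 200, rng))
many_people = perfectly_related_pairs(random_traits(200, 5, rng))
print(few_people, many_people)
```

With only 5 people there are just 32 possible answer patterns, so 200 traits cannot avoid repeating some of them; with 200 people and 5 traits a perfect match by chance is vanishingly unlikely.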

The mathematical approach can help us see when these errors might occur. He illustrates this point with several standard statistical "paradoxes", none the worse for being repeated. One very good point he makes, in the case of "digging the dirt" on celebrities and others, is that we should weigh the evidence presented against the effort taken to produce that evidence:

p55.
the ratio of the amount of dirt
unearthed to the time and resources spent digging for it (or for
something that can pass for dirt). ... I don't think I have a
particularly disreputable group of friends and acquaintances, but few
could withstand a 30-million-dollar investigation into their private
lives.

He has lots of good little anecdotes (deliberately used to be memorable!), although occasionally they don't give enough background or explanation. (Maybe he didn't want to clutter the main text with too much detail, but that's what appendices are for.) For example, he describes a paradox about the odds of poker hands with wildcards, but doesn't explain the details:

p85.
The less probable the hand, the higher
its rank. Three of a kind is less probable (and hence of a higher
rank) than two pair, ... the introduction of wild cards and the
discretion that they allow players can jumble the order of the various
possible hands.

With two wild cards it becomes *more*
likely that you will be dealt three of a kind than two pair. (Any pair
combined with a wild card is three of a kind.) Since in this situation
you are less likely to obtain two pair, such a hand should beat three
of a kind. Suppose you change the rules and declare this by fiat, so
that players choosing between two pair and three of a kind will now
choose two pair. Under these altered rules it again becomes *more
likely* that they will be dealt two pair rather than three of a
kind.

On the face of it, this looks weird. How can changing the rules
change the probability of the hands? What I *think* is going on
is that, given a set of rules, you decide which of two (or more)
possible hands you have, given wildcards, and choose the "better"
hand. If the better hand is three of a kind, you will choose your
wildcards to make three of a kind even if they could also make two
pair, and three of a kind becomes more likely under this choice.
Conversely, if two pair is deemed to be the better hand, then,
*dealt exactly the same hand*, you would choose to use any
wildcards to make two pair, which then, because of this different
choice, becomes the more likely hand. (But I may be wrong, since I
don't know poker!) The point, however, is that, given choices
(wildcards), things become complicated and difficult to order, and
that life is full of such choices, and that the choices we make can
affect the value of what we have chosen.
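
A rough count seems to bear this reading out. The sketch below rests on assumptions that are not in the book: the two wild cards are taken to be two jokers added to a standard 52-card deck, and only hands holding exactly one joker plus a natural pair are treated as "ambiguous" (the joker can either join the pair or match a kicker). Hands holding both jokers are left out, so these are partial counts, and the names are made up for the illustration.

```python
from math import comb

# Natural (joker-free) five-card hands from a standard 52-card deck.
NATURAL_TRIPS = 13 * comb(4, 3) * comb(12, 2) * 4 * 4    # three of a kind
NATURAL_TWO_PAIR = comb(13, 2) * comb(4, 2) ** 2 * 44    # two pair

# Hands with exactly one of the two jokers (assumed wild cards) plus four
# natural cards containing exactly one pair. The joker can make three of a
# kind or two pair, so the player decides which hand to declare.
JOKERS = 2
AMBIGUOUS = JOKERS * 13 * comb(4, 2) * comb(12, 2) * 4 * 4

def declared_counts(two_pair_outranks_trips):
    """Partial counts of declared hands; hands with both jokers are ignored."""
    trips, two_pair = NATURAL_TRIPS, NATURAL_TWO_PAIR
    if two_pair_outranks_trips:
        two_pair += AMBIGUOUS   # players declare the higher-ranked two pair
    else:
        trips += AMBIGUOUS      # players declare the higher-ranked three of a kind
    return {"three_of_a_kind": trips, "two_pair": two_pair}

standard = declared_counts(two_pair_outranks_trips=False)
altered = declared_counts(two_pair_outranks_trips=True)
print(standard)
print(altered)
```

Whichever of the two hands is ranked higher collects all the ambiguous deals, and in this count that is enough to make it the more common hand, which is the inconsistency Paulos points to.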

There is another place where he doesn't explain everything that's
needed. That's in the section on extensional versus intensional
definitions. He explains what an *extensional* definition is ...

p87.
Standard scientific and mathematical
logic is termed *extensional* since objects and sets are
determined by their extensions (i.e., by their members). That is,
entities are the same if they have the same members, even if they are
referred to differently. In everyday *intensional* ... logic,
this isn't so. Entities that are equal in extensional logic can't
always be interchanged in intensional logic.

... but not what an intensional definition is. The difference is that
one defines a set by its membership, {Alice, Bob, Eve}, and the other
by a property, {everyone over 5'6" tall in this room}. It might so
happen that two definitions refer to the same members (Alice, Bob, Eve
happen to be the only people of the specified height in the room at
the moment), but that could change. One common step in mathematical
proof is to "substitute equals for equals": if I see "2+2",
I can substitute "4". But Paulos points out that we cannot
substitute an extensional definition for an intensional one (or vice
versa), because, what if Alice leaves the room? The important point is
that we use intensional definitions *all the time*, and can
easily find ourselves in difficulties, silliness, or even tragedy, if
the "equals for equals" rule is followed *blindly*.
The point is that mathematics is being used to model the real world,
and the fit isn't perfect, so we need to think about what we are
doing:

p92.
any statistical study on a structured
entity---a game, a welfare system, marriages, a historical era---is
likely to be fatally flawed if it fails to take the structure into
account, say by mindlessly substituting extensionally equivalent
entities for one another within the study.
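
The room example can be written out as a small sketch. The names Alice, Bob and Eve come from the example above; the heights, the extra person Dan, and the function name are invented purely for illustration.

```python
# Hypothetical room: heights are in inches and are made up for this example.
heights = {"Alice": 68, "Bob": 70, "Eve": 67, "Dan": 64}

# Extensional definition: the set is fixed by listing its members.
extensional = {"Alice", "Bob", "Eve"}

def over_5ft6(room):
    """Intensional definition: whoever in the room is over 5'6" (66 inches)."""
    return {name for name, inches in room.items() if inches > 66}

room = dict(heights)
same_now = over_5ft6(room) == extensional     # the two definitions coincide

del room["Alice"]                             # Alice leaves the room
same_later = over_5ft6(room) == extensional   # they no longer coincide

print(same_now, same_later)
```

Substituting one definition for the other is harmless only while the two happen to pick out the same members.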

Converting well-understood English statements into mathematics is non-trivial for a wide range of reasons.

pp100-101.
... interpretation sometimes depending
on verb tense, for example. The following two arguments are not
equivalent, despite having the same form.

A cat needs water to survive.

Therefore my cat Puffin needs water to survive.

A dog is barking in the backyard.

Therefore my dog Ginger is barking in the backyard.

He then makes a very interesting suggestion. Recently, "situational
logic" has tried to take context into account. (I've been reading
a bit about this lately. *The
Liar*, for example, uses it to analyse certain logical
paradoxes. I've also recently read *Vicious
Circles*, again by Barwise, for reasons that can be found in
my review. I read *Once Upon a Number* in order to have a break,
and read something a bit different. So I was surprised partway through
when Paulos says that Barwise was his thesis advisor! Fortunately I'm
already inured to these kinds of coincidences, so I didn't mind that,
coincidentally, Paulos also has an explanation of how this sort of
thing happens all the time.) Anyway, Paulos suggests that there needs
to be a similar situational approach to other areas of mathematical
modelling:

p105.
Just as situation semantics attempts to
accommodate more of the richness of everyday understanding in an
extended formal logical calculus, so a "situation statistics"
should be developed that builds in some of the checks on wayward
probabilities that commonsense narrative suggests.

There is also a section on complexity. In another intriguing suggestion, Paulos speculates about the underlying mathematical cause of self-organisation in Kauffman's Random Boolean Networks:

p165.
What happens, however, is that one
observes "order for free"---more or less stable cycles of
light configurations, different ones for different initial conditions.
As far as I know, the result is only empirical, but I suspect it may
be a consequence of a Ramsey-type theorem too difficult to prove.
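
For readers who haven't met these networks, here is a minimal sketch of the kind of experiment being described. It assumes the usual textbook setup (each node reads a fixed number of randomly chosen nodes and applies a random truth table, with all nodes updated together); the sizes, the seed and the function names are illustrative choices, not taken from the book.

```python
import random

def make_network(n_nodes, k_inputs, rng):
    """Give each node k random input nodes and a random truth table."""
    inputs = [[rng.randrange(n_nodes) for _ in range(k_inputs)] for _ in range(n_nodes)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k_inputs)] for _ in range(n_nodes)]
    return inputs, tables

def step(state, inputs, tables):
    """Update every node synchronously from the current values of its inputs."""
    new_state = []
    for node_inputs, table in zip(inputs, tables):
        index = 0
        for source in node_inputs:
            index = index * 2 + state[source]   # read the inputs as a binary number
        new_state.append(table[index])
    return tuple(new_state)

def find_cycle(state, inputs, tables):
    """Iterate until a state repeats; return (steps before the cycle, cycle length)."""
    seen = {}
    t = 0
    while state not in seen:
        seen[state] = t
        state = step(state, inputs, tables)
        t += 1
    return seen[state], t - seen[state]

rng = random.Random(1)       # arbitrary seed, for repeatability
N_NODES, K_INPUTS = 16, 2    # illustrative sizes
inputs, tables = make_network(N_NODES, K_INPUTS, rng)
results = []
for _ in range(5):
    start = tuple(rng.randint(0, 1) for _ in range(N_NODES))
    results.append(find_cycle(start, inputs, tables))
print(results)               # (transient length, cycle length) per initial condition
```

A finite deterministic network must eventually repeat a state, so some cycle is guaranteed; the empirical surprise is how short and stable the cycles tend to be compared with the 2^16 possible states.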

He also gives a solution to the "problem of induction" that I hadn't come across before (the usual problem is that justifying "the future will be like the past" by using the fact that, in the past, the future has been like the past, is circular). I rather like it, since it fits in with ideas of complexity and emergence rather well:

p167.
Charles Sanders Peirce and
Hans Reichenbach advanced a different
pragmatic justification of induction. It amounts roughly to this:
Maybe induction does not work, but if anything does, induction will.
Maybe there is no persisting order in the universe, but if there is
any on any level of abstraction, induction will eventually find it (or
its absence) on the next higher level.

This idea that induction works across levels is interesting. Levels are important in complexity and emergence (an emergent property usually being described at a higher level than the properties it emerges from). I'd never thought of stream-of-consciousness literature as an example of emergence:

p170.
the stream-of-consciousness novels of
James Joyce and Virginia Woolf in the early part of the twentieth century
can be seen as the beginning of an attempt to discern pattern on one
level by simply describing the most mundane actions and thoughts on a
lower level.

He then goes on to describe a measure of complexity.

p171.
Zurek
defined physical entropy to be the sum of Claude Shannon's information
content, measuring the improbability or surprise inherent in a yet to
be fully revealed entity, and Chaitin's
complexity, measuring the algorithmic information content of what's
already been revealed.

His speculation that follows is fascinating:

pp171-172.
Imagine two readers encountering a new
short story or novel. One is a very sophisticated litterateur, while
the other is quite naive. For the first reader the story holds few
surprises, its twists and tropes being old hat. The second reader,
however, is amazed at the plot, the characters, the verbal artistry.
The question is, How much information is in the story? ... The first
reader's mind already has encoded within it large portions of the
story's complexity; the second reader's mind is relatively
unencumbered by such complexity. The Shannon information content of
the story---its improbability or the surprise it engenders---is thus
less for the first reader whose mind's complexity is, in this regard,
greater, whereas the opposite holds for the second reader. As they
read the story both readers' judgments of improbability or surprise
dwindle, albeit at different rates, and their minds' complexity rises,
again, differentially. The sum of these two---the physical
entropy---remains constant and is a measure of the information content
of the mind-story system.

It is fascinating (to me, at least!) because of
problems I have with some definitions of emergence that require the
observer to experience "surprise" at the emergent property.
So, the second time the same thing happens, it's no longer emergent,
then? But taking this definition into account, the second time it
happens, the complexity of the *observer* has changed. So maybe
there's something more going on here. Hmm.

He finishes off with a nice little example that not all equally probable things are equally probable (depending on how we look at them).

pp182-183.
I was told that in some states
hand-picked lottery numbers won more often than did machine-generated
numbers. ... the claim is not necessarily nonsense. In fact, it nicely
illustrates one way in which personal wishes can sometimes seem to
affect large, impersonal phenomena.

Consider the following simplified lottery. In a comically small town, the mayor draws a number from a fishbowl every Saturday night. Balls numbered from 1 to 10 are in the bowl, and only two townspeople bet each week. George picks a number between 1 and 10 at random. Martha, on the other hand, always picks 9, her lucky number. Although George and Martha are equally likely to win, the number 9 will win more frequently than will any other number. The reason is that two conditions must be met for a number to win: it must be drawn on Saturday night by the mayor and it must be chosen by one of the participants. Since Martha always picks 9, the second condition in her case is always met, so whenever the mayor draws a 9 from the bowl, 9 wins. This is not the case with, say, 4. The mayor may draw a 4 from the bowl, but chances are George will not have picked a 4 so it will not often be a winning number. George and Martha have equal chances of winning, and each ball has an equal chance of being chosen by the mayor, but not all numbers have the same chance of winning.
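
The arithmetic behind the fishbowl example can be checked exactly. This is just a sketch of the calculation as described in the passage above, with constant and function names chosen for the illustration.

```python
from fractions import Fraction

NUMBERS = range(1, 11)
P_DRAW = Fraction(1, 10)           # each ball is equally likely to be drawn
P_GEORGE_PICKS = Fraction(1, 10)   # George picks each number with equal chance
MARTHA_PICK = 9                    # Martha always picks 9

def p_number_wins(n):
    """Probability that n is drawn and at least one bettor has picked it."""
    p_someone_holds = Fraction(1) if n == MARTHA_PICK else P_GEORGE_PICKS
    return P_DRAW * p_someone_holds

# George wins when his random pick matches the draw; Martha wins when 9 is drawn.
p_george_wins = sum(P_DRAW * P_GEORGE_PICKS for _ in NUMBERS)
p_martha_wins = P_DRAW

for n in NUMBERS:
    print(n, p_number_wins(n))
print(p_george_wins, p_martha_wins)
```

Both bettors win one week in ten, but 9 is a winning number ten times as often as any other number.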

So, all in all, some great food for thought.