Books : reviews

William Goldbloom Bloch.
The Unimaginable Mathematics of Borges' Library of Babel.
OUP. 2008

rating : 3.5 : worth reading
review : 3 January 2011

In Borges' famous short story, The Library of Babel is a Vast labyrinth of hexagonal chambers, containing all possible books of a particular format. This book (which is, of course, somewhere in the Library, albeit in a "translated" format, because of the need to spell out all the numbers) discusses the mathematics behind various aspects of the Library. The crucial information is as follows:

p3. Twenty bookshelves, five to each side, line four of the hexagon's six sides ...
p4. each bookshelf holds thirty-two books identical in format; each book contains four hundred ten pages; each page, forty lines; each line, approximately eighty black letters. There are also letters on the front cover of each book; those letters neither indicate nor prefigure what the pages inside will say.
p5. There are twenty-five orthographic symbols
p6. In all the Library, there are no two identical books.

The 25 symbols are 22 letters of the alphabet, the comma, the period, and the space.

In chapter 1, Bloch calculates the number of books in the Library. He ignores the "approximately" in the phrase "approximately eighty black letters", and implicitly takes "letter" to be synonymous with "orthographic symbol" (despite apparently contradictory evidence in the story's footnote 1, which defines the symbols to be 22 letters plus punctuation and spaces, and in a later line that mentions the twenty-two orthographic symbols; however, making this simplifying assumption does allow Bloch to prove that there must be at least one hexagon that isn't completely full, since the total number of books is not divisible by 640, the number stored in a full hexagon). Assuming exactly 80 symbols per line, drawn from a 25-symbol alphabet, there are 80 × 40 × 410 = 1 312 000 symbols per book, and so exactly 25^(1 312 000), or approximately 10^(1 834 097) (or even more approximately 10^(10^6)), different books in the Library.
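The arithmetic is easy to check. Here is a minimal Python sketch, assuming (as Bloch does) exactly 80 symbols per line; the variable names are illustrative only.

```python
import math

SYMBOLS = 25
CHARS_PER_BOOK = 80 * 40 * 410      # symbols per line x lines per page x pages
BOOKS_PER_HEXAGON = 4 * 5 * 32      # walls with shelves x shelves per wall x books per shelf

# Size of the Library as a power of ten, via logarithms.
log10_books = CHARS_PER_BOOK * math.log10(SYMBOLS)

# Remainder of 25**CHARS_PER_BOOK on division by 640, using modular
# exponentiation so the full number never has to be built.
remainder = pow(SYMBOLS, CHARS_PER_BOOK, BOOKS_PER_HEXAGON)

print(CHARS_PER_BOOK)               # 1312000
print(math.floor(log10_books))      # 1834097
print(remainder != 0)               # True: the books cannot fill every hexagon exactly
```

The remainder is nonzero for a simple reason: any power of 25 is odd, while 640 is even.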

This Vast result (despite being much smaller than a googolplex) justifies the use of "unimaginable" in the book title; although we can calculate and write down this number, we cannot imagine it, cannot understand its size. Bloch neatly demonstrates the unimaginability by considering particular subsets of the Library, for example, all the books where every character is the letter g, except for a mere 16 occurrences of h. There are more than 10^84 of these books, more than enough to fill the universe; astronomically many. Yet this huge subset is a Vanishingly small part of the overall Library. He points out that even if every book were the size of a proton (10^-15 m across) and the whole observable universe (10^27 m across) were filled with them, that would still be "only" 10^126 books. This too is a Vanishingly small proportion of the total books in the Library, and Bloch gives up trying to imagine them at this point. (The reason I am spelling "Vast" and "Vanishingly small" with a capital "V" here, is that Dennett uses the same Library to help build intuitions of these sizes, and then uses Vast and Vanishingly small as a convenient shorthand to recall the sizes later.)
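Both of these figures can be reproduced in a few lines. This is a rough sketch, assuming the round sizes quoted above for the proton and the observable universe, and counting the g-and-h books as the number of ways of choosing 16 positions out of 1 312 000.

```python
import math

CHARS_PER_BOOK = 1_312_000

# Books that are all 'g' except for exactly 16 'h's: choose where the h's go.
gh_books = math.comb(CHARS_PER_BOOK, 16)
gh_exponent = len(str(gh_books)) - 1    # power of ten of the leading digit

# Proton-sized books packed into the observable universe.
linear_exponent = 27 - (-15)            # universe / proton, linear ratio: 10^42
universe_exponent = 3 * linear_exponent # volume ratio: 10^126

print(gh_exponent)        # 84
print(universe_exponent)  # 126
```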

Let's see if we can go a few steps further in our imaginations. Call a universe full of proton-sized books a Level 1 annex. Consider now shrinking this annex itself down to the size of a proton. We can now fill the universe with 10^126 shrunken Level 1 annexes; call this a Level 2 annex. It contains 10^126 × 10^126 = 10^252 books -- not really that many more, compared to the full Library. But let's keep repeating this process: shrink the Level 2 annex down to the size of a proton, and fill the universe with 10^126 shrunken Level 2 annexes to produce a Level 3 annex with 10^126 × 10^252 = 10^378 books. Keep doing this; eventually the Level 14 557 annex will have 10^(14 557 × 126) = 10^(1 834 182) books, or more than the entire Library. This at least brings the individual numbers down to conceivable levels (although imagining shrinking the universe by a linear factor of 10^42, and repeating that more than 14 thousand times, is even more mind-blowing than watching some of the Mandelbrot fractal deep-zoom videos). This helps to demonstrate that, although "astronomical" is used as an adjective to connote mind-blowingly huge values, the size of space is peanuts compared to the Vastness of combinatoric values.
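The level at which the annex overtakes the Library follows from comparing exponents. This is a small sketch, assuming each level multiplies the book count by the same factor of 10^126.

```python
import math

LIBRARY_EXPONENT = 1_312_000 * math.log10(25)  # the Library holds 25^(1 312 000) books
ANNEX_EXPONENT = 126                           # one universe of proton-sized books

# A Level k annex holds 10^(126 * k) books; find the first k that beats the Library.
level = math.ceil(LIBRARY_EXPONENT / ANNEX_EXPONENT)

print(level)                         # 14557
print(level * ANNEX_EXPONENT)        # 1834182
print((level - 1) * ANNEX_EXPONENT)  # 1834056, still short of the Library
```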

In chapter 2, now that we have a feel for the unimaginably Vast size of the Library, Bloch turns to the problem of cataloguing the Library (the books being shelved in no discernable order), and here the problems with the Library really start. A catalogue entry needs two pieces of information: about the book, and about its location. First, the book: how to identify it? The title on the cover is no use, for there are not enough titles to go round. What do I mean by this? Bloch wonders if, with spinal titles of 19 letters, this increases the number of books by a factor of 25^19, or about 3 × 10^26: one copy of each book for each title. But I think a more interesting point can be made by assuming the books are unique, and that the titles are just some kind of (random) annotation to them. Then, assuming the full range of titles is uniformly distributed, even with more than 10^26 titles available, there must be about 10^(1 834 071) distinct books with any given title. Hardly a useful discriminating feature. This demonstrates the problem of finding a "short" description of the book to put in the catalogue: there are not enough short descriptions. For the Vast majority of the books in the Library, the shortest description (that distinguishes it from other books) is the book itself. Most books cannot be "compressed" to a short description. Or, as Bloch puts it, the Library is its own catalogue.
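The shortage of short descriptions is a plain counting fact, and a toy Library makes it concrete. The sketch below assumes books of only 4 characters (to keep the numbers printable) and counts every string over the same 25 symbols that is shorter than a book; the function name is illustrative.

```python
def strings_shorter_than(length: int, symbols: int = 25) -> int:
    """Count all strings over `symbols` symbols with fewer than `length` characters."""
    return sum(symbols ** k for k in range(length))

BOOK_LENGTH = 4                       # toy value; the real Library uses 1 312 000
books = 25 ** BOOK_LENGTH
short_descriptions = strings_shorter_than(BOOK_LENGTH)

print(books)                          # 390625
print(short_descriptions)             # 16276
print(short_descriptions / books)     # a little under 1/24
```

Even if every shorter string described a different book, fewer than one book in 24 could have a description shorter than itself. The same bound holds for any book length, since the count of shorter strings is (25^n - 1)/24.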

Now let's consider the location part of the information in the catalogue. It will need to note the hexagon, the shelf, and the position along the shelf. There are 32 positions per shelf, 20 shelves per hexagon, and ... 10^(1 834 094) hexagons. We don't want to number the hexagons using digits, as there are no digits in the language: but we can label them using the 25 symbols (ie, "number" them in base 25), so that this label can be written in the language of a catalogue book. However, written in base 25, the label for a hexagon requires about 1 311 997 characters, or very nearly an entire book's worth (importantly, this is not a coincidence: even with different sized books or different sized alphabets, the conclusion is the same). Thus the catalogue entry for a book is the size of two books: one to identify the book, one to identify its location. But since the books are shelved in no discernable order, there is no way to link these two books: given the "identity" book, how to find the "location" book? (One solution forbidden to us here is to shelve the books in alphabetical order; then each book encodes its own location.)
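The reason this is not a coincidence shows up when the label length is written as a logarithm. This is a sketch, assuming 640 books per hexagon throughout; the function and the alternative book formats are hypothetical illustrations.

```python
import math

def label_length(symbols: int, chars_per_book: int, books_per_hexagon: int = 640) -> float:
    """Logarithm, to base `symbols`, of the number of hexagons.

    The Library holds symbols**chars_per_book books, so the hexagon count is
    that number divided by books_per_hexagon. A label written in the same
    alphabet needs this many characters, to within one character.
    """
    return chars_per_book - math.log(books_per_hexagon, symbols)

print(label_length(25, 1_312_000))   # about two characters short of a full book
print(label_length(2, 1_000))        # binary "books" of 1000 bits: about 9 bits short
print(label_length(100, 500_000))    # 100 symbols, 500 000 characters: under 2 short
```

Whatever the alphabet and book length, the label falls short of a whole book only by the logarithm of 640, which is tiny compared with the book itself.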

This last point leads on to a problem I've not seen addressed (although it must have been by some author). Bloch allows books longer than 410 pages:

p36. We are also willing to include books longer than 410 pages, so long as the title page includes reference to an appropriate volume number.

I assume that the idea is that if one volume says "Moby Dick, volume one" on its title page, then the next is to say "Moby Dick, volume two". But there is a Vast number of books that say "Moby Dick, volume two": which is the "right" one? There is no way to know, as all possibilities exist, and the reference to the right one is the length of a whole book. Dennett also claims that books can be strung together, and even used more than once, to get larger chunks of information. But this leads to a reductio ad absurdum, which he notes, but does not discuss:

[Dennett, p109, footnote 4.] Some books may get used more than once. The most profligate case is the easiest to understand: since there are volumes that each contain a single character and are otherwise blank, repeated use of these 100 volumes [Dennett's example uses a 100 character alphabet] will create any text of any length. As Quine points out in his informative and amusing essay "Universal Library" (in Quine 1987), if you avail yourself of this strategy of re-using volumes, and translate everything into the ASCII code your word-processor uses, you can store the whole Library of Babel in two extremely slender volumes, in one of which is printed a 0 and in the other of which appears a 1! (Quine also points out that the psychologist Theodor Fechner propounded the fantasy of the universal library long before Borges.).

Of course, the reason this does not work is that the information linking the volumes together in the right order is being ignored. Where is that stored? Consider using the 0 and 1 volumes in this way to capture the binary string "101010". First we need the information that the first volume to use is "1", then that the next volume is "0", then that the next is "1", and so on. So the order we need to use the volumes is "101010". Oh look, that's the string we wanted to encode in the first place! And it is longer than either of the one-character "volumes" used. Stitching together 410-page volumes is exactly the same, but using a base 10^(1 834 097) encoding instead of binary: we still need to store the order somewhere. (I made the same comment in my criticism of the 'dust' in Permutation City.) So, in the Library, there are no multi-volume books, and therefore no information content in the books larger than 1 312 000 base-25 character strings, because there is no way of linking to the "next" volume.
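The same point can be made with volumes longer than one character. The sketch below is a toy, assuming a two-symbol alphabet, two-character volumes, and volumes addressed by their position in alphabetical order; all the function names are illustrative.

```python
from itertools import product

def all_volumes(alphabet: str, length: int) -> list[str]:
    """Every possible volume of `length` characters, in alphabetical order."""
    return ["".join(chars) for chars in product(alphabet, repeat=length)]

def reading_order(text: str, volumes: list[str], length: int) -> list[int]:
    """Indices of the volumes to read, in sequence, to rebuild `text`."""
    return [volumes.index(text[i:i + length]) for i in range(0, len(text), length)]

def address(index: int, alphabet: str, length: int) -> str:
    """Write a volume index in base len(alphabet), padded to `length` symbols."""
    digits = []
    for _ in range(length):
        index, digit = divmod(index, len(alphabet))
        digits.append(alphabet[digit])
    return "".join(reversed(digits))

alphabet, length, text = "01", 2, "101010"
volumes = all_volumes(alphabet, length)          # ['00', '01', '10', '11']
order = reading_order(text, volumes, length)     # [2, 2, 2]
written_order = "".join(address(i, alphabet, length) for i in order)

print(order)
print(written_order == text)   # True: the reading order is the text itself
```

Written out in the Library's own alphabet, the reading order takes exactly as many characters as the text it is supposed to reconstruct. A different numbering of the volumes would change the addresses, but not their length.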

One way to look at the information content in the books and the Library is to think of the "Kolmogorov Complexity", or how short a computer program can be to output a particular string. As noted earlier, most books are "incompressible": there are not enough short descriptions, not even a program shorter than PRINT(...the particular 1 312 000-character string that is the book's contents...). However, the contents of the entire Library is extremely compressible: a relatively short program could be written to output every single book in turn, although it would take a Vast length of time to execute. It would, though, output the books in some kind of alphabetical order, and we know the books on the shelves are not ordered. So we see that the librarians are looking in the wrong place: the information is not in the books themselves, it is in the order the books are on the shelves!
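Such a program really is short. Here is one possible sketch in Python, assuming a stand-in alphabet of the letters a to v plus comma, period and space (the story does not say which 22 letters are used), with the book length as a parameter so that a toy Library can actually be run.

```python
from itertools import islice, product

# Hypothetical stand-in for the 25 orthographic symbols.
ALPHABET = "abcdefghijklmnopqrstuv,. "

def library(chars_per_book: int = 1_312_000):
    """Yield every possible book exactly once, in the order of ALPHABET."""
    for chars in product(ALPHABET, repeat=chars_per_book):
        yield "".join(chars)

# A toy Library of 2-character books is small enough to enumerate fully.
toy = list(library(2))
print(len(toy))                       # 625
print(list(islice(library(3), 3)))    # ['aaa', 'aab', 'aac']
```

The whole generator is a handful of lines, against the 1 312 000 characters needed to print a single typical book. Calling it with the default length would start with the book consisting entirely of the letter a and take a Vast time to finish.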

In chapter 3, Bloch examines the meaning of the story's final footnote:

p10. ... all that is required is a single volume, ... that would consist of an infinite number of infinitely thin pages.

Of course, the finite Library discussed above (precisely one copy of each possible book) would not require a volume with an infinite number of pages. But let's assume an infinite series of copies of the Library. Under this assumption, Bloch gives three explanations of what this volume might be like, including what the resulting thickness of the entire volume would be. He touches on infinite series, sets of measure zero, and non-standard analysis (all at a very high level!). Given this sophistication, the chapter's "aftermath" about logarithms is a bit of an anti-climax. I was also surprised that he did not consider encoding the Library in a single real number (by concatenating all the books, and interpreting it as one base-25 "decimal" with a Vast number of digits).

In chapter 4, Bloch moves away from the size of the Library to its topology, again from the information given in the story:

p4. The Library is a sphere whose exact center is any hexagon and whose circumference is unattainable. ...
p6. The Library is unlimited but periodic.

From this it is clear that the Library is a 3-sphere. Bloch provides a nice little excursion into 3-spheres, 3-tori, and non-orientable 3-Klein bottles, with this time the "aftermath" going into some nice details about visualising higher dimensional spaces.

By chapter 5, Bloch is investigating more obscure, and possibly more interesting, properties of the Library. He points out an ambiguity in Borges' description of how the hexagons are arranged, and demonstrates that the "obvious" reading leads to a serpentine labyrinthine Library where spatially adjacent hexagons may be essentially unreachable along the available paths.

In chapter 6, Bloch looks at different interpretations of making an Order from multiple copies of the Library. Instead of an infinite Library made of identical copies of the single Library (since this is indistinguishable from a 3-torus), he investigates copies where the books are in different orders, then copies of this where the copies are in different orders, and so on. After asking us to contemplate the number of orderings of orderings of orderings of the books, of which there are (((25^(1 312 000))!)!)!, (which is even more mind-blowingly inconceivably combinatorically Vast than a single Library, even Vaster by far than a googolplex, but still nowhere near the size of Graham's number) he goes on to discuss the fact that for a given spatial structure of a "unit Library" (called a libit), the overall ordering of orderings of ... is unique. (This, of course, destroys any information content there might have been in the books' order on the shelves.)

In chapter 7, Bloch draws an amusing analogy between the actions of a librarian moving through the hexagons, consulting and annotating volumes, and a Turing Machine. This chapter is rather unnecessarily entitled "A Homomorphism", since:

p121. I stress that I am using the term "homomorphism" metaphorically. (In fact, I'm using "homomorphism" as a synonym for "metaphor.")

In chapter 8, Bloch discusses some other writers' analyses of The Library of Babel, showing some mathematical errors they make, and pointing out further interesting observations. And finally, he discusses how much Borges might have known of Cantor's and Russell's work, from analysing the annotations Borges made in the books in his own, much smaller, and much more useful, library.

If you are interested in teasing out the implications of what might appear to be throw-away lines in short stories, and if you need further convincing that Borges was a genius, this is a fun way to start.