The story of the pioneers of artificial life
I've known for a while that Google offers spelling corrections (my spelling is none too hot, and my typing a little erratic, so I often see the "Showing results for <correctly spelled words>; search instead for <what I actually typed>" result). What I did not know, until I read this history of Google, was how the corrections are implemented. If I had thought about it, I might have assumed some sort of dictionary checker, although on reflection, the suggestions are usually so much better than anything any other spell checker comes up with. What Google actually uses is user data, gathered from analysing user corrections.
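To make the idea concrete for myself, here is a toy sketch of learning corrections from what users retype, rather than from a dictionary. It is emphatically not Google's implementation: the function name `learn_corrections`, the `min_count` threshold, and the tiny log of (typed, retyped) pairs are all my own assumptions for illustration.

```python
from collections import Counter, defaultdict


def learn_corrections(sessions, min_count=2):
    """Learn likely corrections from (typed, retyped) query pairs.

    `sessions` is a hypothetical log: each pair is a query a user typed,
    followed by the query they replaced it with. `min_count` is an assumed
    threshold to ignore one-off rewrites.
    """
    counts = defaultdict(Counter)
    for typed, retyped in sessions:
        if typed != retyped:
            counts[typed][retyped] += 1

    corrections = {}
    for typed, followups in counts.items():
        # Take the most frequent rewrite for this query.
        best, n = followups.most_common(1)[0]
        if n >= min_count:
            corrections[typed] = best
    return corrections


# Made-up example data, purely illustrative.
log = [
    ("recieve", "receive"),
    ("recieve", "receive"),
    ("recieve", "relieve"),
    ("teh", "the"),
]
corrections = learn_corrections(log)
```

With this made-up log, only "recieve" gets a suggestion; "teh" was rewritten just once, so it falls below the threshold. The point is that no dictionary appears anywhere: the "correct" spelling is simply whatever enough users typed next.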
Google applies this "insane amounts of user data plus extremely clever algorithms to exploit that data in novel ways, in order to perform the task at hand" approach to the wide range of things it does. I loved learning about all these different applications of the approach. Levy details this technique in Google's approach to language translation:
Back in the day, AI was mainly about clever algorithms. Then it got rebadged as "Machine Learning", using similar algorithms but now applied to messy real-world data, rather than artificial or sanitised data. Now Google are taking the next step in AI: qualitative improvements in functionality from quantitative changes in data volumes, plus the clever algorithms:
Going from data to meaning via compression algorithms!
This insistence on being data driven, of backing up arguments with statistics and evidence, permeates everything they do, including performance management.
They are not always driven by data, however. They do have some prejudices:
One of the key messages from this story is how old intuitions about what is expensive in computer systems no longer hold. An appreciation of current constraints leads to new kinds of solutions. This applies to Gmail, for example, where you no longer need to throw anything away, unlike more traditional email applications that require constant scrimping and archiving:
I first became consciously aware of how storage abundance and faster algorithms, particularly search, can change things qualitatively when reading Cochrane's Tips for Time Travellers a decade ago, then had it reinforced by Weinberger's Everything is Miscellaneous. Here the concept is made explicit in relation to Chrome:
More storage, more speed, more data: more is different.
The other place where this different mind-set is clear is in Google's approach to server farms. Rather than requiring high accuracy and good quality of all their systems, they recognise that maybe 10% of their servers will fail, and have infrastructure in place to work around this. This makes their systems scalable, and, boy, have they scaled! While the rest of the world has been agonising over how to do parallelism "properly", these kids have just ploughed ahead and done it. As a result, we now have sophisticated software support for highly parallel cloud computing.
This combination of scale, sophisticated infrastructure, and appreciation of where the real costs lie allows some startling solutions:
Power consumption is the expensive part; when you are big enough to have server farms located in different parts of the planet, just don't use those servers where it's hot today!
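As a toy illustration of that idea (my own sketch, not anything described in the book), choosing where to run work could be as simple as filtering sites by today's temperature. The site names, the readings, and the `limit_c` threshold are all hypothetical.

```python
def usable_sites(temperatures_c, limit_c=27.0):
    """Return the sites cool enough to use today, in alphabetical order.

    `temperatures_c` maps a (hypothetical) site name to today's outside
    temperature in Celsius; `limit_c` is an assumed cut-off above which
    cooling is taken to be too expensive.
    """
    return sorted(
        name for name, temp in temperatures_c.items() if temp <= limit_c
    )


# Made-up readings, purely illustrative.
today = {"belgium": 31.0, "oregon": 18.5, "finland": 12.0}
sites = usable_sites(today)
```

Here the hot site simply drops out of the pool for the day. This only works if the software already copes with servers vanishing, which is exactly the failure-tolerant infrastructure described above.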
When facing technological problems, Google is supreme at solving them. When facing people issues, maybe not so great. Their problems in China, with copyright, with social networking applications, with public perception of their actions -- these all seem to stem from the fact that not only don't they seem to get people who think differently from them, they don't seem to get that there are people who think differently. They think that all they need is the evidence, and people will fall in line. All the evidence and reasoning in the world doesn't help if people are starting from different axioms, however.
However, as I too am more interested in technology than in people, I get them. This also meant I wasn't as enamoured of those parts of the book that are about the people, rather than the technology, but, hey. It does have rather less than usual of that irritating pop science style of introducing characters with little illustrative snippets. However, it suffers from the occasional impenetrably parochial cultural reference (here provided by that very introductory style I dislike):
I read this as "who looked a bit like <a character I've never heard of> in <a show I've never heard of> but loomed over Silicon Valley like <a person I've never heard of> in <a sports team (I assume!) I've never heard of>'s glory years". I assume it means something like: "John Doerr looked like a nerd but nevertheless had a lot of influence", which is a bit insulting (to nerds), really.
Despite these relatively minor niggles, I enjoyed reading this, and learned a lot, about Google, about big data, and about a possible future of computing.