Against the Gods: The Remarkable Story of Risk

  In 1784, the King made Laplace an examiner of the Royal Artillery, a post that paid a handsome salary. But under the Republic, Laplace lost no time in proclaiming his "inextinguishable hatred to royalty."5 Almost immediately after Napoleon came to power, Laplace announced his enthusiastic support for the new leader, who gave him the portfolio of the Interior and the title of Count; having France's most respected scientist on the staff added respectability to Napoleon's fledgling government. But Napoleon, having decided to give Laplace's job to his own brother, fired Laplace after only six weeks, observing, "He was a worse than mediocre administrator who searched everywhere for subtleties, and brought into the affairs of government the spirit of the infinitely small."6 So much for academics who approach too close to the seats of power!

  Later on, Laplace got his revenge. He had dedicated the 1812 edition of his magisterial Théorie analytique des probabilités to "Napoleon the Great," but he deleted that dedication from the 1814 edition. Instead, he linked the shift in the political winds to the subject matter of his treatise: "The fall of empires which aspired to universal dominion," he wrote, "could be predicted with very high probability by one versed in the calculus of chance."7 Louis XVIII took appropriate note when he assumed the throne: Laplace became a Marquis.

  Unlike Laplace, Gauss was reclusive and obsessively secretive. He refrained from publishing a vast quantity of important mathematical research-so much, in fact, that other mathematicians had to rediscover work that he had already completed. Moreover, his published work emphasized results rather than his methodology, often obliging mathematicians to search for the path to his conclusions. Eric Temple Bell, one of Gauss's biographers, believes that mathematics might have been fifty years further along if Gauss had been more forthcoming; "Things buried for years or decades in [his] diary would have made half a dozen great reputations had they been published promptly."8

  Fame and secretiveness combined to make Gauss an incurable intellectual snob. Although his primary achievement was in the theory of numbers, the same area that had fascinated Fermat, he had little use for Fermat's pioneering work. He brushed off Fermat's Last Theorem, which had stood as a fascinating challenge to mathematicians for over a hundred years, as "An isolated proposition with very little interest for me, because I could easily lay down a multitude of such propositions, which one could neither prove nor dispose of."9

  This was not an empty boast. In 1801, at the age of 24, Gauss had published Disquisitiones Arithmeticae, written in elegant Latin, a trailblazing, historic work in the theory of numbers. Much of the book is obscure to a non-mathematician, but what he wrote was beautiful music to himself.10 He found "a magical charm" in number theory and enjoyed discovering and then proving the generality of relationships such as this:

  1 = 1²; 1 + 3 = 2²; 1 + 3 + 5 = 3²; 1 + 3 + 5 + 7 = 4²; and so on.

  Or, in general, that the sum of the first n successive odd numbers is n². This would make the sum of the first 100 odd numbers, from 1 to 199, equal to 100², or 10,000; and the sum of the first 500 odd numbers, from 1 to 999, would be equal to 500², or 250,000.
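  The relationship is easy to check numerically. A quick sketch in Python (the values chosen are arbitrary illustrations):

```python
# Verify Gauss's observation: the sum of the first n odd numbers is n^2.
for n in (1, 2, 3, 10, 100, 500):
    odds = range(1, 2 * n, 2)      # the first n odd numbers: 1, 3, ..., 2n-1
    assert sum(odds) == n ** 2

# The first 100 odd numbers run from 1 to 199 and sum to 100^2 = 10,000;
# the first 500 run from 1 to 999 and sum to 500^2 = 250,000.
print(sum(range(1, 200, 2)))   # 10000
print(sum(range(1, 1000, 2)))  # 250000
```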

  Gauss did deign to demonstrate that his theoretical work had important applications. In 1800, an Italian astronomer discovered a small new planet-technically, an asteroid-that he named Ceres. A year later Gauss set out to calculate its orbit; he had already calculated lunar tables that enabled people to figure out the date of Easter in any year. Gauss was motivated in large part by his desire to win a public reputation. But he also wanted to join his distinguished mathematical ancestors-from Ptolemy to Galileo and Newton-in research into celestial mechanics, quite aside from wishing to outdo the astronomical work of his contemporary and benefactor, Laplace. In any event, this particular problem was enticing in itself, given the paucity of relevant data and the speed with which Ceres revolved around the sun.

  After a spell of feverish calculation, he came up with a precisely correct solution and was able to predict the exact location of Ceres at any moment. In the process he had developed enough skill in celestial mechanics to be able to calculate the orbit of a comet in just an hour or two, a task that took other scientists three or four days.

  Gauss took special pride in his achievements in astronomy, feeling that he was following in the footsteps of Newton, his great hero. Given his admiration for Newton's discoveries, he grew apoplectic at any reference to the story that the fall of an apple on Newton's head had been the inspiration for discovering the law of gravity. Gauss characterized this fable as:

  Silly! A stupid, officious man asked Newton how he discovered the law of gravitation. Seeing that he had to deal with a child intellect, and wanting to get rid of the bore, Newton answered that an apple fell and hit him on the nose. The man went away fully satisfied and completely enlightened.11

  Gauss took a dim view of humanity in general, deplored the growing popularity of nationalist sentiments and the glories of war, and regarded foreign conquest as "incomprehensible madness." His misanthropic attitudes may have been the reason why he stuck so close to home for so much of his life.12

  Gauss had no particular interest in risk management as such. But he was attracted to the theoretical issues raised by the work in probability, large numbers, and sampling that Jacob Bernoulli had initiated and that had been carried forward by de Moivre and Bayes. Despite his lack of interest in risk management, his achievements in these areas are at the heart of modern techniques of risk control.

  Gauss's earliest attempts to deal with probability appeared in a book titled Theoria Motus (Theory of Motion), published in 1809, on the motion of heavenly bodies. In the book Gauss explained how to estimate an orbit based on the path that appeared most frequently over many separate observations. When Theoria Motus came to Laplace's attention in 1810, he seized upon it with enthusiasm and set about clarifying most of the ambiguities that Gauss had failed to elucidate.

  Gauss's most valuable contribution to probability would come about as the result of work in a totally unrelated area, geodesic measurement, the use of the curvature of the earth to improve the accuracy of geographic measurements. Because the earth is round, the distance between two points on the surface differs from the distance between those two points as the crow flies. This difference is negligible for distances of a few miles, but it becomes significant for distances greater than about ten miles.
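  The size of the gap is easy to see on a sphere: the straight "as the crow flies" line between two surface points is a chord, always shorter than the arc along the surface above it, and the gap grows roughly with the cube of the distance. A sketch, assuming for illustration a spherical earth of radius 3,959 miles (the function name and radius are mine, not the book's):

```python
import math

EARTH_RADIUS_MILES = 3959.0  # mean radius; an illustrative assumption

def arc_vs_chord(surface_miles: float) -> float:
    """Difference between the distance measured along the earth's surface
    and the straight chord beneath it, for a spherical earth."""
    theta = surface_miles / EARTH_RADIUS_MILES        # central angle, radians
    chord = 2 * EARTH_RADIUS_MILES * math.sin(theta / 2)
    return surface_miles - chord

for d in (1, 10, 100, 1000):
    print(f"{d:5d} surface miles: arc exceeds chord by {arc_vs_chord(d):.6f} miles")
```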

  In 1816, Gauss was invited to conduct a geodesic survey of Bavaria and to link it to measurements already completed by others for Denmark and northern Germany. This task was probably little fun for an academic stick-in-the-mud like Gauss. He had to work outdoors on rugged terrain, trying to communicate with civil servants and others he considered beneath him intellectually-including fellow scientists. In the end, the study stretched into 1848 and filled sixteen volumes when the results were published.

  Since it is impossible to measure every square inch of the earth's surface, geodesic measurement consists of making estimates based on sample distances within the area under study. As Gauss analyzed the distribution of these estimates, he observed that they varied widely, but, as the estimates increased in number, they seemed to cluster around a central point. That central point was the mean-statistical language for the average-of all the observations; the observations also distributed themselves into a symmetrical array on either side of the mean. The more measurements Gauss took, the clearer the picture became and the more it resembled the bell curve that de Moivre had come up with 83 years earlier.

  The linkage between risk and measuring the curvature of the earth is closer than it might appear. Day after day Gauss took one geodesic measurement after another around the hills of Bavaria in an effort to estimate the curvature of the earth, until he had accumulated a great many measurements indeed. Just as we review past experience in making a judgment about the probability that matters will resolve themselves in the future in one direction rather than another, Gauss had to examine the patterns formed by his observations and make a judgment about how the curvature of the earth affected the distances between various points in Bavaria. He was able to determine the accuracy of his observations by seeing how they distributed themselves around the average of the total number of observations.

  The questions he tried to answer were just variations on the kinds of question we ask when we are making a risky decision. On the average, how many showers can we expect in New York in April, and what are the odds that we can safely leave our raincoat at home if we go to New York for a week's vacation? If we are going to drive across the country, what is the risk of having an automobile accident in the course of the 3,000-mile trip? What is the risk that the stock market will decline by more than 10% next year?

  The structure Gauss developed for answering such questions is now so familiar to us that we seldom stop to consider where it came from. But without that structure, we would have no systematic method for deciding whether or not to take a certain risk or for evaluating the risks we face. We would be unable to determine the accuracy of the information in hand. We would have no way of estimating the probability that an event will occur-rain, the death of a man of 85, a 20% decline in the stock market, a Russian victory in the Davis Cup matches, a Democratic Congress, the failure of seatbelts, or the discovery of an oil well by a wildcatting firm.

  The process begins with the bell curve, the main purpose of which is to indicate not accuracy but error. If every estimate we made were a precisely correct measurement of what we were measuring, that would be the end of the story. If every human being, elephant, orchid, and razor-billed auk were precisely like all the others of its species, life on this earth would be very different from what it is. But life is a collection of similarities rather than identities; no single observation is a perfect example of generality. By revealing the normal distribution, the bell curve transforms this jumble into order. Francis Galton, whom we will meet in the next chapter, rhapsodized over the normal distribution:

  [T]he "Law Of Frequency Of Error"... reigns with serenity and in complete self-effacement amidst the wildest confusion. The huger the mob ... the more perfect is its sway. It is the supreme law of Unreason. Whenever a large sample of chaotic elements are taken in hand ... an unsuspected and most beautiful form of regularity proves to have been latent all along.13

  Most of us first encountered the bell curve during our schooldays. The teacher would mark papers "on the curve" instead of grading them on an absolute basis-this is an A paper, this is a C+ paper. Average students would receive an average grade, such as B or C+ or 80%. Poorer and better students would receive grades distributed symmetrically around the average grade. Even if all the papers were excellent or all were terrible, the best of the lot would receive an A and the worst a D, with most grades falling in between.

  Many natural phenomena, such as the heights of a group of people or the lengths of their middle fingers, fall into a normal distribution. As Galton suggested, two conditions are necessary for observations to be distributed normally, or symmetrically, around their average. First, there must be as large a number of observations as possible. Second, the observations must be independent, like rolls of the dice. Order is impossible to find unless disorder is there first.

  People can make serious mistakes by sampling data that are not independent. In 1936, a now-defunct magazine called the Literary Digest took a straw vote to predict the outcome of the forthcoming presidential election between Franklin Roosevelt and Alfred Landon. The magazine sent about ten million ballots in the form of returnable postcards to names selected from telephone directories and automobile registrations. A high proportion of the ballots were returned, with 59% favoring Landon and 41% favoring Roosevelt. On Election Day, Landon won 39% of the vote and Roosevelt won 61%. People who had telephones and drove automobiles in the mid-1930s hardly constituted a random sample of American voters: their voting preferences were all conditioned by an environment that the mass of people at that time could not afford.

  Observations that are truly independent provide a great deal of useful information about probabilities. Take rolls of the dice as an example.

  Each of the six sides of a die has an equal chance of coming up. If we plotted a graph showing the probability that each number would appear on a single toss of a die, we would have a horizontal line set at one-sixth for each of the six sides. That graph would bear absolutely no resemblance to a normal curve, nor would a sample of one throw tell us anything about the die except that it had a particular number imprinted on it. We would be like one of the blind men feeling the elephant.

  Now let us throw the die six times and see what happens. (I asked my computer to do this for me, to be certain that the numbers were random.) The first trial of six throws produced four 5s, one 6, and one 4, for an average of exactly 5.0. The second was another hodgepodge, with three 6s, two 4s, and one 2, for an average of 4.7. Not much information there.

  After ten trials of six throws each, the averages of the six throws began to cluster around 3.5, which happens to be the average of 1+2+3+4+5+6, or the six faces of the die-and precisely half of the mathematical expectation of throwing two dice. Six of my averages were below 3.5 and four were above. A second set of ten trials was a mixed bag: three of them averaged below 3.0 and four averaged above 4.0; there was one reading each above 4.5 and below 2.5.

  The next step in the experiment was to figure the averages of the first ten trials of six throws each. Although each of those ten trials had an unusual distribution, the average of the averages came to 3.48! The average was reassuring, but the standard deviation, at 0.82, was wider than I would have liked.* In other words, seven of the ten trials fell between 3.48 + 0.82 and 3.48 - 0.82, or between 4.30 and 2.66; the rest were further away from the average.

  Now I commanded the computer to simulate 256 trials of six throws each. The first 256 trials generated an average almost on target, at 3.49; with the standard deviation now down to 0.69, two-thirds of the trials were between 4.18 and 2.80. Only 10% of the trials averaged below 2.5 or above 4.5, while more than half landed between 3.0 and 4.0.

  The computer still whirling, the 256 trials were repeated ten times. When those ten samples of 256 trials each were averaged, the grand average came out to 3.499. (I carry out the result to three decimal places to demonstrate how close I came to exactly 3.5.) But the impressive change was the reduction of the standard deviation to only 0.044. Thus, seven of the ten samples of 256 trials fell between the narrow range of 3.455 and 3.543. Five were below 3.5 and five were above. Close to perfection.
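  The experiment is easy to re-create. A minimal Python version follows (the seed and variable names are mine; a different seed produces different digits, but the same pattern: the averages hug 3.5 ever more tightly as the samples grow):

```python
import random
import statistics

random.seed(0)  # any seed will do; the pattern, not the digits, is the point

def trial_average(throws: int = 6) -> float:
    """Average of six throws of a fair die."""
    return sum(random.randint(1, 6) for _ in range(throws)) / throws

# Ten trials of six throws: the averages scatter widely around 3.5.
ten = [trial_average() for _ in range(10)]
print(statistics.mean(ten), statistics.pstdev(ten))

# 256 trials: the spread of the trial averages settles toward the
# theoretical value, 1.708 / sqrt(6), or about 0.70.
many = [trial_average() for _ in range(256)]
print(statistics.mean(many), statistics.pstdev(many))

# Ten samples of 256 trials each: the grand averages cluster very
# tightly around 3.5 (theoretical spread about 0.70 / 16, or 0.044).
grand = [statistics.mean(trial_average() for _ in range(256)) for _ in range(10)]
print(statistics.mean(grand), statistics.pstdev(grand))
```

  Averaging larger and larger samples, and then averaging those averages, is exactly the progression Bernstein describes: the dispersion shrinks at each stage.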

  Quantity matters, as Jacob Bernoulli had discovered. This particular version of his insight-the discovery that averages of averages miraculously reduce the dispersion around the grand average-is known as the central limit theorem. This theorem was first set forth by Laplace in 1809, in a work he had completed and published just before he came upon Gauss's Theoria Motus in 1810.

  Averages of averages reveal something even more interesting. We began the experiment I have just described with a die with, as usual, six faces, each of which had an equal chance of coming up when we threw it. The distribution then was flat, bearing no likeness at all to a normal distribution. As the computer threw the die over and over and over, accumulating a growing number of samples, we gleaned more and more information about the die's characteristics.

  Very few of the averages of six throws came out near one or six; many of them fell between two and three or between four and five. This structure is precisely what Cardano worked out for his gambling friends, some 250 years earlier, as he groped his way toward the laws of chance. Many throws of a single die will average out at 3.5. Therefore, many throws of two dice will average out at double 3.5, or 7.0. As Cardano demonstrated, the numbers on either side of 7 will appear with uniformly diminishing frequency as we move away from 7 to the limits of 2 and 12.
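  Cardano's result can be tabulated directly: two dice can fall in 36 equally likely ways, and the number of ways to make each sum rises to a peak at 7 and falls off symmetrically on either side. A short enumeration:

```python
from collections import Counter
from itertools import product

# Count the ways two fair dice can produce each sum from 2 to 12.
ways = Counter(a + b for a, b in product(range(1, 7), repeat=2))
for total in range(2, 13):
    print(f"{total:2d}: {ways[total]} of 36  {'*' * ways[total]}")
# The counts run 1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1 -- peaking at 7,
# with uniformly diminishing frequency out to 2 and 12, as Cardano found.
```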

  The normal distribution forms the core of most systems of risk management. The normal distribution is what the insurance business is all about, because a fire in Chicago will not be caused by a fire in Atlanta, and the death of one individual at one moment in one place has no relationship to the death of another individual at another moment in a different place. As insurance companies sample the experience of millions of individuals of different ages and of each gender, life expectancies begin to distribute themselves into a normal curve. Consequently, life insurance companies can come up with reliable estimates of life expectancies for each group. They can estimate not only average life expectancies but also the ranges within which actual experience is likely to vary from year to year. By refining those estimates with additional data, such as medical histories, smoking habits, domicile, and occupational activities, the companies can establish even more accurate estimates of life expectancies.*

  On occasion, the normal distribution provides even more important information than just a measure of the reliability of samples. A normal distribution is most unlikely, although not impossible, when the observations are dependent upon one another-that is, when the probability of one event is determined by a preceding event. The observations will fail to distribute themselves symmetrically around the mean.

  In such cases, we can profitably reason backwards. If independence is the necessary condition for a normal distribution, we can assume that evidence that distributes itself into a bell curve comes from observations that are independent of one another. Now we can begin to ask some interesting questions.

  How closely do changes in the prices of stocks resemble a normal distribution? Some authorities on market behavior insist that stock prices follow a random walk-that they resemble the aimless and unplanned lurches of a drunk trying to grab hold of a lamppost. They believe that stock prices have no more memory than a roulette wheel or a pair of dice, and that each observation is independent of the preceding observation. Today's price move will be whatever it is going to be, regardless of what happened a minute ago, yesterday, the day before, or the day before that.
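  The random-walk claim can at least be illustrated in miniature: simulate a price series whose daily moves are independent random steps, then check that each change carries no memory of the one before it. A sketch (step size, series length, and seed are arbitrary choices of mine):

```python
import random
import statistics

random.seed(1)

# A 'drunkard's walk' price series: each day's move is an independent
# random step, with no memory of any previous move.
price = 100.0
prices = [price]
for _ in range(1000):
    price += random.gauss(0, 1)   # independent daily change
    prices.append(price)

changes = [b - a for a, b in zip(prices, prices[1:])]

# Independence test: today's change should tell us nothing about
# tomorrow's, so the lag-1 autocorrelation of the changes sits near zero.
mean = statistics.mean(changes)
num = sum((x - mean) * (y - mean) for x, y in zip(changes, changes[1:]))
den = sum((x - mean) ** 2 for x in changes)
print("lag-1 autocorrelation:", num / den)
```

  A near-zero autocorrelation is what the random-walk school expects of real price changes; whether actual markets oblige is the question the chapter goes on to raise.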