Our Methodology

What do we mean when we talk about odds? When we say, “My doctor says the odds are one in ten that the test will be positive,” we’re expressing probability. In mathematical terms, statements like these put fractions into words. When we say, “the odds are one in ten,” think of a fraction, with the first, smaller number as the numerator, or top number in the fraction, and the second, larger number as the denominator, or bottom number. So, “one in ten” literally means one tenth, or a ten-percent chance. Each Odds Statement in Book of Odds expresses the probability that a specific occurrence will take place, given the number of situations in which that occurrence might take place. Because past experience provides the basis for expecting what will take place, Odds Statements are based entirely on past counts.
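
The arithmetic behind “one in ten” can be sketched in a few lines of Python. This is only an illustration of reading an odds phrase as a fraction; the function names odds_to_probability and as_percent are invented for this example and are not part of any Book of Odds system.

```python
from fractions import Fraction


def odds_to_probability(occurrences: int, situations: int) -> Fraction:
    """Read "occurrences in situations" as the fraction occurrences/situations."""
    if situations <= 0 or not 0 <= occurrences <= situations:
        raise ValueError("expected situations > 0 and 0 <= occurrences <= situations")
    return Fraction(occurrences, situations)


def as_percent(probability: Fraction) -> float:
    """Express a probability as a percentage."""
    # Multiply while still an exact fraction, then convert to float.
    return float(probability * 100)


one_in_ten = odds_to_probability(1, 10)
print(one_in_ten)              # 1/10
print(as_percent(one_in_ten))  # 10.0
```

Using exact fractions rather than floating-point numbers keeps the numerator and denominator visible, which mirrors how the statement is worded.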

Each statement in Book of Odds contains certain required components. Consider the example, “The odds a person will be injured by lightning in a year are 1 in 973,000 (US, 1995 - 2007).” First, we have to know what will happen, in this case, an injury by lightning. Second, we have to know to whom it will happen: a person, any person. As we narrow that definition (a farmer, a golfer) the odds will change. Next, the statement tells us the parameters, or limitations, of the calculation. In this case, there are parameters of time (a single year) and of place (US), along with the span over which the underlying data was collected (1995-2007). Any change to these parameters may change the odds. Some Odds Statements, such as those about the ideal fair coin toss coming up heads or tails, have no such parameters, and are considered true everywhere and at any time because they are defined that way.
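
One way to picture these components is as fields of a small record. The sketch below is a hypothetical Python data structure, not the actual Book of Odds database schema; the class name, field names, and render method are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class OddsStatement:
    """Hypothetical record holding the components described above."""
    subject: str                                 # to whom it will happen
    event: str                                   # what will happen
    one_in: int                                  # the N in "1 in N"
    time_frame: Optional[str] = None             # parameter of time
    place: Optional[str] = None                  # parameter of place
    data_span: Optional[Tuple[int, int]] = None  # years the data covers

    def render(self) -> str:
        """Assemble the components into a sentence."""
        sentence = f"The odds {self.subject} will {self.event}"
        if self.time_frame:
            sentence += f" {self.time_frame}"
        sentence += f" are 1 in {self.one_in:,}"
        qualifiers = []
        if self.place:
            qualifiers.append(self.place)
        if self.data_span:
            qualifiers.append(f"{self.data_span[0]}-{self.data_span[1]}")
        if qualifiers:
            sentence += f" ({', '.join(qualifiers)})"
        return sentence + "."


lightning = OddsStatement(
    subject="a person",
    event="be injured by lightning",
    one_in=973_000,
    time_frame="in a year",
    place="US",
    data_span=(1995, 2007),
)
print(lightning.render())

# A fair coin toss carries no parameters of time, place, or data span.
coin = OddsStatement(subject="a fair coin toss", event="come up heads", one_in=2)
print(coin.render())
```

Making the parameters optional reflects the point that some statements, like the coin toss, are true by definition and need no qualifiers.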

What Makes a Good Odds Statement?

First and foremost, we choose to include an Odds Statement for its interest and informational value. The ways the facts and figures available to us may be combined and expressed are effectively infinite. The mission of Book of Odds is to hunt through this vastness and assemble Odds Statements that are entertaining, useful, and relevant to people’s lives. They won’t all be relevant or interesting to everyone, but every Odds Statement will be relevant or interesting to someone. We aim to present data and information objectively and without bias, but acknowledge that decisions about what to include are made by human beings. We are fallible. We have biases we are not aware of. We must work with the terms the data collectors choose to use. Our principles of selection, however, are not knowingly biased to support one position or another. When we address controversial subjects, we seek to maintain a neutral perspective, shedding light, but not heat, on politically charged issues.

Every Odds Statement in Book of Odds must measure up to some specific standards. Among the factors we consider are these:

  • Transparency – The process of arriving at a statement must be clear and the sources explicitly named, so that anyone could duplicate it using the same sources.
  • Clarity – Readers must be able to understand all Odds Statements without specialized technical or professional knowledge.
  • Upgradability – Book of Odds is a “living” document based on a database that continuously updates as new data become available and new insights enable us to improve the quality of Odds Statements.
  • Confidence – The best data in the world is still only approximate. We work to make each Odds Statement as accurate as we can through careful selection and evaluation of credible sources. When we create an Odds Statement we assess our confidence in it on a five-point scale, with a sixth rating for “speculative” odds, those which are assumption-based.

Confidence and Sources

The confidence rating of each Odds Statement is rooted in the validity and credibility of the research material from which the statement is derived. The sources employed by Book of Odds are enormously diverse, but each is evaluated according to a strict set of standards. For all sources we ask: who collected information from whom, in what manner, and for what purpose?

For survey data and experimental trials, we evaluate the underlying hypothesis or research questions, study design, sample frame and size (and whether the sample accurately reflects the population under study), methodology of analysis, fairness of presentation of the data, explanations of variables and limitations, reproducibility of results, and quality of peer review. Further, we examine the sponsoring body and those executing the study, to see whether they have a vision, mission, or mandate that might influence the study’s agenda. We don’t dismiss any source with a point of view out of hand, but we are cautious.

We are aware that there is a considerable body of research literature related to surveys, trials, and experimental protocols, and supplement any observation by monitors and watchdog agencies with our own knowledge and judgment.

In addition, we seek independent external reviews of our sources, consulting book reviews, commentary and reviews in academic journals, and contacting relevant and appropriate specialists, including authors of related academic work, industry or research specialists, editors of and contributors to relevant journals, and any and all credible experts uncovered in our own investigations.

Calibration Values or Ranking Odds Statements

To facilitate search for the most interesting and useful Odds Statements we have developed algorithms to rate their calibration value, or the likelihood that they will be meaningful to a user. Traditional keyword relevance is augmented by several factors. One factor which will grow in importance as our traffic increases will be what our users tell us is most interesting and useful. As a foundation, however, we have created a judgment-based ranking provided by our researchers. They take into account how easily a person can relate to the Odds Statement because of its human perspective, or familiarity. They also take into account how broadly interesting or surprising the Odds Statement is.

We do it this way: We imagine a reference set of only 10,000 Odds Statements in a “book.” Only the most irresistible Odds Statements, the highest rated, would belong in this set. We then lower the bar to the general topic area, allowing a larger number into the set, and then to the specific topic area, allowing an even larger number. This method has proved very useful and, although it will be augmented by our users’ feedback, it will always provide an element of our ranking scheme.
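
The idea of progressively lowering the bar can be sketched as a tier lookup. Everything numeric here is invented for illustration: the text does not say how researchers score a statement or where the cut-offs fall, so the 0-100 score, the threshold values, and the function name broadest_tier are all assumptions.

```python
from typing import Optional

# Hypothetical cut-offs on a hypothetical 0-100 judgment score.
# The highest bar admits the fewest statements; each lower bar admits more.
TIERS = [
    ("book", 90),            # the imagined 10,000-statement reference set
    ("general topic", 70),   # bar lowered to the general topic area
    ("specific topic", 50),  # bar lowered again to the specific topic area
]


def broadest_tier(score: int) -> Optional[str]:
    """Return the most selective set whose bar the score clears, if any."""
    for name, minimum in TIERS:
        if score >= minimum:
            return name
    return None


print(broadest_tier(95))  # book
print(broadest_tier(75))  # general topic
print(broadest_tier(55))  # specific topic
print(broadest_tier(10))  # None
```

A statement that clears the highest bar also clears the lower ones, so each successive set contains the one before it.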

Tense Conventions

The invention of a reference work is really the invention of a set of conventions followed by their application with relentless consistency. The most subtle and important of our conventions relate to tenses. Odds Statements naming past dates or historical events such as wars are in the past tense, for example. Odds Statements describing an outer or inner state of being or using the predicate nominative use the present tense. Most Odds Statements use the future tense, however, despite being based on past counts. This practice has the advantage of placing our readers and users into the condition we experience at all times, that of being about to learn what the future holds. Our internal methods document explains it this way:

“We assume in virtually all of our Odds Statements that we are viewing the events and actions to be described from the time before their count began. From this perspective what is in the sentence is what a perfectly prescient forecast would have yielded. This we term the ‘future implicative.’ From this perspective, the sentence becomes lively. It invites the reader to imagine standing poised at the beginning of the reference period, wondering perhaps what will happen next.”

There are perhaps 20 such tense rules, and as we encounter exceptions or extensions we modify them by case law, as it were. The practical benefit to our researchers is that they know how to choose the correct tense according to our conventions, although almost any Odds Statement might be expressed in more than one. The practical benefit to our users is that they will quickly pick up an intuitive sense of how to understand the Odds Statements.

How We Use Odds Statements

Odds Statements may be seen as the atomic unit or building blocks of Book of Odds. They may be organized in many fascinating ways. Some of these include:

  • Calibration pairs – When an Odds Statement about something one has experienced helps to make sense of an Odds Statement about something one has not experienced.
  • Comparison pairs – Odds Statements which vary by one or more of the standard components.
  • Topical lists – Lists of Odds Statements based on theme.
  • Time series – The same Odds Statement shown changing over time.
  • Odds of Me – A list of Odds Statements a person has identified as applying to him- or herself.
  • Interesting Odds – A list of Odds Statements a person has identified as memorable, interesting, and likely to be personally calibrational.
  • Geographic distributions – A view of the Odds Statements allocated to a map.
  • Streams – A view of Odds Statements in probability order, making it possible to calibrate and discover amazing juxtapositions.
  • Odds Threads – A series of related Odds Statements, each contingent on the prior, which tells a story of the narrowing of the odds in a vivid visualization.
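
Two of these arrangements lend themselves to a short sketch: Streams, which order statements by probability, and Odds Threads, where each statement is contingent on the one before it. The Python below is a minimal illustration under stated assumptions: the labels and probabilities are placeholders rather than real Odds Statements, the most-likely-first ordering is a guess at what “probability order” means, and the function names stream and thread are invented.

```python
from fractions import Fraction
from itertools import accumulate
from operator import mul


def stream(statements):
    """Order (label, probability) pairs from most to least likely."""
    return sorted(statements, key=lambda pair: pair[1], reverse=True)


def thread(conditional_steps):
    """Running product of probabilities, each contingent on the prior step."""
    return list(accumulate(conditional_steps, mul))


# Placeholder statements, not real data.
placeholders = [
    ("statement A", Fraction(1, 10)),
    ("statement B", Fraction(1, 2)),
    ("statement C", Fraction(1, 1000)),
]
for label, probability in stream(placeholders):
    print(label, probability)

# A thread of three contingent steps: the cumulative odds narrow at each step.
for cumulative in thread([Fraction(1, 2), Fraction(1, 10), Fraction(1, 5)]):
    print(cumulative)  # 1/2, then 1/20, then 1/100
```

Multiplying along the thread is what produces the narrowing: each step can only keep the cumulative probability the same or shrink it.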

Caveat: Not for Calculating Personal Probabilities

Odds Statements are based on recorded past occurrences among a large group of people. They do not pretend to describe the specific risk to a particular individual, and as such cannot be used to make personal predictions. For example, if a person learns that there is a quantifiable probability of a cure for a specific disease, those statistics cannot take into account this person’s genetic disposition or medical history, unique environmental factors, the experience of the treating physician, the accuracy of tests performed, the development of new treatments, and so on. However, statements in Book of Odds can further one’s understanding of risk by providing context and calibration.