Notes on “Why Foxes Are Better Forecasters Than Hedgehogs”, Phil Tetlock

The lecture is here.

Good political (note: or investing) judgment is much more controversial than good medical or engineering judgment.

  • It appears too subjective.

It is possible to construct benchmarks for good judgment.

Knowledge and meta-knowledge are both important:

  • Meta-knowledge: do you know what you know and don’t know?
    • To find this out, you have to “keep score” of your own mental accuracy.

Tetlock, a professor, conducted an experiment that asked subjects to evaluate whether a statement was true or false and to indicate their confidence in their answers.

  • For example, “France is larger in area than Spain.”

In doing this, he measured two things:

  • Calibration, the ability to assign subjective probabilities to outcomes that correspond to their objective probabilities.
    • E.g., if you say there’s a 70% chance of something happening, it happens about 70% of the time.
  • Discrimination, the ability to assign sharply higher probabilities to outcomes that occur than to those that do not.
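One way to make these two measures concrete is sketched below. The notes do not say which formulas Tetlock used, so this assumes a standard decomposition of the quadratic (Brier) probability score: the calibration index is the weighted squared gap between each stated probability and the observed frequency for that probability (lower is better), and the discrimination index is the weighted squared gap between those observed frequencies and the overall base rate (higher is better). The function name and the sample data are hypothetical.

```python
from collections import defaultdict


def calibration_and_discrimination(forecasts, outcomes):
    """Return (calibration_index, discrimination_index) for binary events.

    forecasts: stated probabilities in [0, 1]
    outcomes:  1 if the event happened, 0 if it did not
    Assumption: forecasts are grouped by their exact stated value.
    """
    n = len(forecasts)
    base_rate = sum(outcomes) / n

    # Group the outcomes by the probability that was stated for them.
    groups = defaultdict(list)
    for p, o in zip(forecasts, outcomes):
        groups[p].append(o)

    calibration = 0.0
    discrimination = 0.0
    for p, outs in groups.items():
        hit_rate = sum(outs) / len(outs)  # how often events given p occurred
        weight = len(outs) / n
        calibration += weight * (p - hit_rate) ** 2
        discrimination += weight * (hit_rate - base_rate) ** 2
    return calibration, discrimination


# Hypothetical forecaster who always says 50% (10 of 20 events occur).
hedger = calibration_and_discrimination([0.5] * 20, [1] * 10 + [0] * 10)

# Hypothetical forecaster who says 90% or 10% and is right at those rates.
sharp = calibration_and_discrimination(
    [0.9] * 10 + [0.1] * 10,
    [1] * 9 + [0] + [1] + [0] * 9,
)

print("always 50%:", [round(x, 3) for x in hedger])
print("90% / 10%: ", [round(x, 3) for x in sharp])
```

In this made-up data both forecasters are perfectly calibrated, but only the second one discriminates: saying 50% about everything is safe and uninformative, which is why the two measures are tracked separately.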

Tetlock’s most famous study assessed 28,000 predictions from 284 experts for calibration and discrimination over an 18-year period.

  • The predictions had shorter ranges (1-3 yrs.) for faster-moving forecasts like stock market predictions and longer ones (10-25 yrs.) for things like border changes.

Tetlock received tenure in 1984, when the Reagan-era Soviet scare was at its peak.

  • Liberals were frightened that Reagan was pushing the US to the brink, while conservatives agreed with Reagan but thought the best-case scenario would be containment leading to a Soviet retreat to neo-Stalinism.
  • The dominant view was that nothing good would come of the heightened tensions.
  • Very few predicted a Gorbachev-esque liberalization.

Then came Gorbachev.

  • This was an “outcome-irrelevant learning situation”: each party justified the move through its own lens.
    • Liberals deemed the Reagan arms buildup a waste of money, while conservatives claimed credit.
      • Interestingly, many conservatives, including Robert Gates, identified Gorbachev as a neo-Stalinist until late in the game.

The attitudes of those holding positions on the Iraq War were strikingly similar to the reactions to the Soviet collapse.

  • If a forecaster’s view of the president (W. Bush or Reagan) and the US’s policy outcome matched (good/good or bad/bad), the forecaster felt vindicated in either case.
  • When the view and the policy result were mismatched (good/bad or bad/good), forecasters engaged in belief-system defenses.
    • E.g., “we took the safe option by invading Iraq” or “things could have been just as good or better if we hadn’t wasted the money on nuclear defense”
  • These ex-post justifications lead to an intellectual stalemate, since it’s very difficult to assess what other states of the world might have been.

Conclusion: partisans across the opinion spectrum are vulnerable to occasional bouts of ideologically induced insanity.

How, then, do you judge judgment?

  • He mentions this in his book, Expert Political Judgment: you have to lay out standards in advance that cannot be denounced later.

Takeaways:

  • Experts’ ideological leanings (what they thought) didn’t matter as much as how they thought.
  • Hedgehogs are experts who relate everything to a central vision or set of first principles.
    • E.g., Marxism, libertarianism, idealism, etc.
    • Hedgehogs are deductive—in a sense, they take a very scientific approach.
    • Hedgehogs extend their theories into many domains and are more confident in their ability to predict.
  • Foxes don’t cling to one overarching explanation, instead using multiple, often incongruous, frameworks.
    • Foxes are skeptical of grand theories and tend to be diffident about their ability to forecast.
    • Foxes have a somewhat contrarian, self-critical cognitive style.
  • Hedgehogs making long-term predictions within their domains of expertise were the most inaccurate, while foxes making short-term predictions within their domains were the most accurate.
    • The foxes were not only better calibrated, as suggested above, but also better at discrimination.
  • Hedgehogs tend to assign higher probabilities to big changes that do materialize.
    • For example, ethnic nationalist fundamentalists who had been predicting the Soviet Union’s demise since the 1960s were eventually right…
    • However, this comes at the expense of a lot of false positives.
      • E.g., a Great American Depression in the ’90s or Dow 36,000.
    • Hedgehogs push big ideas as far as reasonable—and often beyond.
  • Foxes pick bits and pieces of big hedgehog theories to create mish-mashes with more predictive power than the original ideas.
    • In fact, a small group of foxes correctly predicted both Gorbachev and the USSR’s fall by integrating liberal assessments of fractures within the Soviet leadership with the conservative view that the Soviet system could not endure liberalization.
    • Foxes thus tend to annoy people across the political spectrum.
  • Hedgehogs are 2.5x to 3x more willing than foxes to assign 0% or 100% probabilities to events, which hurts them when their forecasts are scored against outcomes.
  • “Mathematically, the subjective probability scale has linear intervals: moving from 0 to .1 should be the same as moving from .1 to .2 or .2 to .3.”
    • “For human beings, that’s not the way things work.”
      • Amos Tversky was fond of describing 3 subjective probabilities: “impossible, certain, and maybe.”
    • “When you assign a probability of 0 or 1, people really listen.”
  • Experts did beat Berkeley undergraduates who had received only a short briefing on the same issues.
  • “My data is more consistent with the Andy Warhol 15 Minutes of Fame hypothesis.”
    • The foxes tend to be right more often, but only slightly so.
  • Prediction markets tend to introduce somewhat more self-critical predictions, since experts are explicitly held accountable.
  • Aggregating expert opinions helps hedgehogs: their average predictions become much better, while foxes experience less of a benefit.
  • “What are people consuming when they’re consuming expert judgment? Are they consuming truth claims, or are they consuming interesting stories?”
    • The Saudi regime’s disintegration has been wrongly predicted for decades, but the stories painted by the apocalyptic side are vastly more “mediagenic.”
    • “There is an inverse relationship between what makes people attractive as public presenters and what makes them accurate forecasters.”
      • “The more ‘howevers’, ‘buts’, and ‘althoughs’ you hear in an expert’s statements… the more people’s eyes are likely to glaze over, but the more likely it is that the expert’s subjective probabilities are likely to translate into something reasonably accurate.”
  • Experts are more likely to increase their confidence in their prior position when they’re right than to decrease their confidence when they’re wrong.
    • It doesn’t feel like dogmatism to them: things that weren’t reasonably foreseeable simply happened.
      • E.g., following the close Quebec secession vote, “Canada almost did come apart.”
    • Foxes change their minds more quickly.
  • Tetlock’s experience suggests that good thinking can be generalized, but it’s very hard.
  • Among the population he studied, at the tails, ~20% were either foxes or hedgehogs, but he supposes that ~60% of the population is “ambidextrous”, exhibiting traits of both.
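Two of the takeaways above (the cost of assigning 0% or 100%, and the larger benefit hedgehogs get from aggregation) can be illustrated with a small scoring sketch. It assumes a quadratic (Brier) score, where the penalty is the squared gap between the stated probability and the 0/1 outcome; the notes do not specify the scoring rule, and the function names and numbers below are hypothetical.

```python
def brier(p, outcome):
    """Squared error between a stated probability and a 0/1 outcome (lower is better)."""
    return (p - outcome) ** 2


def mean_brier(forecasts, outcome):
    """Average penalty when each forecaster is scored separately."""
    return sum(brier(p, outcome) for p in forecasts) / len(forecasts)


def pooled_brier(forecasts, outcome):
    """Penalty for the single forecast obtained by averaging the group."""
    pooled = sum(forecasts) / len(forecasts)
    return brier(pooled, outcome)


def aggregation_gain(forecasts, outcome):
    """How much averaging the forecasts improves on the average individual score."""
    return mean_brier(forecasts, outcome) - pooled_brier(forecasts, outcome)


# Cost of certainty: moving from 90% to 100% saves little when right
# and costs much more when wrong.
for p in (0.9, 1.0):
    print(f"say {p:.0%}: penalty if it happens = {brier(p, 1):.2f}, "
          f"if it does not = {brier(p, 0):.2f}")

# Aggregation: two hypothetical hedgehogs with opposite certainties versus
# two hypothetical foxes with mild disagreements; the event occurs.
print("hedgehog gain:", round(aggregation_gain([1.0, 0.0], 1), 2))
print("fox gain:     ", round(aggregation_gain([0.6, 0.4], 1), 2))
```

Under this scoring rule the averaged forecast can never score worse than the average individual, and the gain grows with how far apart the individual forecasts are, which is consistent with extreme, mutually contradictory hedgehog forecasts benefiting more from pooling than hedged fox forecasts.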
