Friday, March 5, 2010

Simple versus Complicated

When you're modeling a phenomenon, you face a tradeoff between accuracy and ease of use. If you build a model with extremely accurate predictions, it's probably very complex and difficult to use. Conversely, if you build a simple model that's easy to use, you won't be able to capture as much of the behavior of the phenomenon you're modeling.

Here are two examples: long-run growth models and orbital dynamics.

Long-run growth models. Your typical long-run growth model deals in four variables: A, Y, K, and N - technology, GDP, capital, and employment, respectively. You can crudely model long-term growth trends by knocking these four variables together. However, it's very difficult to capture any details; for example, not all forms of capital are the same - some is used for long-term projects, some for short-term projects, a highway here might be more efficient than a highway there, etc. To make the predictions correct, you generally have to use A as a "fudge factor"; the model still predicts a particular pattern in the data, but it's not good enough to get the pattern from more basic principles.

By contrast, a more complicated growth model would require knowledge of the time-distribution and geographic distribution of capital, population and inclination distributions of workers, various forms of technology, and so forth. It would be significantly more difficult to use.
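Here's a minimal sketch of how A ends up as the fudge factor in the simple model. The post doesn't pick a functional form, so this assumes the common Cobb-Douglas shape Y = A * K^alpha * N^(1 - alpha) with alpha = 0.3; the function names and the numbers are made up for illustration.

```python
# Assumption: Cobb-Douglas production, Y = A * K**alpha * N**(1 - alpha).
# The functional form, alpha, and all numbers here are illustrative only.

def predicted_output(tech, capital, labor, alpha=0.3):
    """Output Y implied by technology A, capital K, and employment N."""
    return tech * capital ** alpha * labor ** (1 - alpha)


def solow_residual(output, capital, labor, alpha=0.3):
    """Back out A as whatever is needed to make the model match observed Y."""
    return output / (capital ** alpha * labor ** (1 - alpha))


# Hypothetical observations for one country-year.
Y, K, N = 100.0, 300.0, 50.0

A = solow_residual(Y, K, N)
print("implied A:", A)
print("model output with that A:", predicted_output(A, K, N))
```

Since A is computed from the data, the model matches observed output by construction. All the detail the model leaves out gets absorbed into A.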

Orbital dynamics. Let's think about the Sun and the Earth. Newton's laws can be solved explicitly for a two-body problem, giving the motion of each body about the center of mass of the system. However, the solution is rather complicated. Since the Sun is much, much, much more massive than the Earth, we may as well assume the Sun is stationary. The problem, instead of dealing with the motion of two bodies, only has to describe the motion of the Earth in a central force field -- much easier.
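To get a feel for how little the stationary-Sun shortcut costs, here's a rough sketch comparing the two pictures. The constants are standard rounded values, not numbers from this post, and the orbit is characterized only by its semi-major axis.

```python
import math

# Rounded reference values (assumptions, not taken from the post).
G = 6.674e-11        # gravitational constant, N m^2 kg^-2
M_SUN = 1.989e30     # kg
M_EARTH = 5.972e24   # kg
A_ORBIT = 1.496e11   # Earth-Sun semi-major axis, m

# In the full two-body solution the Sun circles the center of mass.
# This is how far the Sun's center sits from that point.
sun_offset_m = A_ORBIT * M_EARTH / (M_SUN + M_EARTH)


def period_seconds(total_mass):
    """Kepler's third law for the given gravitating mass."""
    return 2 * math.pi * math.sqrt(A_ORBIT ** 3 / (G * total_mass))


t_two_body = period_seconds(M_SUN + M_EARTH)  # both bodies move
t_fixed_sun = period_seconds(M_SUN)           # Sun nailed in place
rel_diff = (t_fixed_sun - t_two_body) / t_two_body

print("Sun's offset from the center of mass (km):", sun_offset_m / 1e3)
print("orbital period, two-body (days):", t_two_body / 86400)
print("relative change in period from fixing the Sun:", rel_diff)
```

The Sun's wobble comes out at a few hundred kilometers, far less than its own radius, and the period shifts by about a part in a million. That's the sense in which the simpler central-force model gives up very little here.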

Thursday, March 4, 2010

How not to do science

Check out this (warning: .pdf) study. It claims that African inequality is dropping rapidly. You don't have to go too far for big red WARNING signs to pop up:
The World Bank concurs: “In 1990, 28.3 percent of the people in low and middle‐income countries lived on less than $1 a day. By 1999 the share had fallen to 21.6 percent, driven mainly by strong growth in China and India (…) In Sub‐Saharan, where the GDP per capita fell by 5 percent, the extreme poverty rate rose from 47.4 percent in 1990 to 49 percent in 1999. The numbers are believed to be still rising” (World Bank (2004).) The U.N. Millenium Campaign Deputy Director for Africa says: “Poverty continues to intensify due to the exclusion of groups of people on the basis of class, caste, gender, disability, age, race, religion and other status,” (UN Millenium Campaign (2009).) This conventional wisdom is further documented and critically reviewed in Easterly (2009).
In this paper, we use the methodology of Pinkovskiy and Sala‐i‐Martin (2009) to estimate income distributions for African countries, and compute their poverty rates, and inequality and welfare indices for the period 1970‐2006. Our results show that the conventional wisdom that Africa is not reducing poverty is wrong. In fact, since 1995, African poverty has been falling steadily.

This should be setting off low-level bullshit detectors. But it gets better. The method the authors used is very simple: curve-fitting a lognormal distribution to PPP GDP and Gini coefficients. The lognormal is entirely characterized by two numbers, the mean and the variance. Any curve-fitting to data is going to result in errors in mean and variance -- errors which, in the name of honesty, should be quoted. They don't show up anywhere.
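To see why the missing error bars matter, here's a rough sketch of a lognormal poverty estimate. It is not the paper's actual procedure: it just assumes incomes are exactly lognormal, pins the two parameters down from mean income and the Gini coefficient (for a lognormal, Gini = 2*Phi(sigma/sqrt(2)) - 1), and reads off the share below a poverty line. The income, poverty line, and Gini values are made up.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, for Phi and its inverse


def lognormal_params(mean_income, gini):
    """Recover (mu, sigma) of log-income from mean income and Gini."""
    # Invert Gini = 2 * Phi(sigma / sqrt(2)) - 1.
    sigma = math.sqrt(2) * STD_NORMAL.inv_cdf((gini + 1) / 2)
    # Invert mean = exp(mu + sigma**2 / 2).
    mu = math.log(mean_income) - sigma ** 2 / 2
    return mu, sigma


def poverty_rate(mean_income, gini, poverty_line):
    """Share of the population with income below poverty_line."""
    mu, sigma = lognormal_params(mean_income, gini)
    return STD_NORMAL.cdf((math.log(poverty_line) - mu) / sigma)


# Hypothetical country: same mean income, slightly different Gini inputs.
mean_income = 1500.0   # per year, illustrative
poverty_line = 365.0   # roughly $1 a day
for gini in (0.40, 0.45, 0.50):
    rate = poverty_rate(mean_income, gini, poverty_line)
    print(f"Gini {gini:.2f} -> poverty rate {rate:.3f}")
```

Even in this toy version, nudging the Gini input moves the estimated poverty rate a lot. So an uncertainty in the fitted parameters translates directly into an uncertainty in the headline number, which is why it ought to be reported.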

So what are the authors doing in this paper? They are applying a model of income distributions to one set of data, refusing to specify uncertainties in the model's predictions, and using the results of the model to contradict existing data. To say the least, this is completely backwards: disagreement between the model's predictions and World Bank data should be evidence against the use of lognormal distributions and the fitting methods used in the paper, not evidence against the World Bank data.

This, kids, is how not to do science.

Wednesday, March 3, 2010

"What is truth?"

Have you ever seen a baby crawling around learning things? The next time you're around a baby who's old enough to move (i.e., >4 months or so), sit down and have some playtime. It's quite a joy to watch: checking out new things, going back to old ones - shaking, pounding, tasting, smelling, touching, feeling, listening, using every sense to try to figure out how it works.

What's going on in their heads? We can't really know for sure, but we can guess. They're trying to figure out how the world works. There's a world around them, and they don't know what's going on -- so, characteristically, they launch into figuring it out. You might characterize this as building expectations for how things should behave. In other words, they're developing models for how the world works. Their expectations are based on predictions the model makes.

Babies are natural scientists. It's informal, sure, but the informal scientific method is there: poke at it, watch its response, and take it into account in your mental model of how it works.

It's a shame we adults have to relearn this -- a lot of us never actually do. Some lucky people never lose it, but for most of us, after a couple of decades, the old mental shortcuts kick in, and we forget what it's like to actively learn. We forget how to engage life, how to approach new ideas skeptically, how to be unsure, and in fact how to be sure of just how unsure we are. (To be fair, the "adult" version of the scientific method is a lot more rigorous - just the other day, I measured G to be (5.3 +/- 1.2) x 10^-11 N m^2 kg^-2*; that sort of quantification is far, far beyond my daughter. But the basic idea of the scientific method is the same, and when you apply it to life in general, you can't always specifically quantify your uncertainty; you just have to get a feel for how far off your best guess might be and go with it.)
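For the curious, here's a quick sketch of what "being sure of how unsure you are" looks like for that G measurement. The measured value and its uncertainty are the ones quoted above; the accepted value is the standard rounded figure, which isn't from this post.

```python
# Measurement quoted in the post.
G_MEASURED = 5.3e-11   # N m^2 kg^-2
G_SIGMA = 1.2e-11      # quoted one-sigma uncertainty

# Accepted value, rounded (an outside reference number, not from the post).
G_ACCEPTED = 6.674e-11

# How big the error bar is relative to the measurement itself.
relative_uncertainty = G_SIGMA / G_MEASURED

# How far the measurement landed from the accepted value.
fractional_offset = abs(G_MEASURED - G_ACCEPTED) / G_ACCEPTED

# The same gap, in units of the quoted uncertainty.
sigmas_away = abs(G_MEASURED - G_ACCEPTED) / G_SIGMA

print("relative uncertainty:", relative_uncertainty)
print("fractional offset from accepted value:", fractional_offset)
print("distance from accepted value, in sigmas:", sigmas_away)
```

The measurement sits a bit more than one error bar from the accepted value, which is about what an honest error bar should look like.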

Part of the reason for this forgetfulness is the mental shortcuts we figure out as we go through life. When you're figuring out something new, like the fact that things fall when you drop them (wow!), you're very tentative. As you experience it more, your uncertainty decreases. And after a while, your uncertainty gets so low that it's not worth your time to think about it anymore: you just assume that when you drop something, it will automatically fall - end of story.
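Here's one way to put numbers on that shrinking uncertainty. It's a toy sketch, not a claim about how anyone's head actually works: start with no opinion about whether dropped things fall (a uniform prior on the probability), then update after each drop using the standard Beta-Bernoulli rule. The function name is made up.

```python
import math


def belief_after_drops(n_fell, n_did_not_fall=0):
    """Posterior mean and spread for P(dropped thing falls).

    Assumes a uniform Beta(1, 1) prior, updated with the observed counts.
    """
    a = 1 + n_fell
    b = 1 + n_did_not_fall
    mean = a / (a + b)
    std = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    return mean, std


for n in (0, 1, 10, 100, 1000):
    mean, std = belief_after_drops(n)
    print(f"after {n:4d} drops: P(falls) ~ {mean:.4f} +/- {std:.4f}")
```

The spread keeps shrinking with every drop but never actually hits zero. "End of story" is the shortcut of rounding that small leftover uncertainty down to nothing.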

But that kind of black-and-white thinking, while convenient, is harmful. As adults, we face a world full of uncertainty -- and if we've gotten sloppy by not engaging life, we have simply forgotten what it's like to be uncertain! That's how black-and-white thinking and other symptoms of intellectual sloth take root.

So, moral of the day? To be skeptical, be more like babies -- get a sense of your uncertainty, and realize that the things you "know" are really just things you have very high confidence in, because you have a lot of (experiential) evidence** for them.

* Which is cool in and of itself; how many people can say they've measured Newton's gravitational constant to within about 20%?

** Mind, it's often good to go back and re-question these things too once you have the adult version of the scientific method, but all too many people don't even have the baby version. Once you get the adult version, you'll sort of automatically question them anyway.