What’s the range of uncertainty regarding the population of the Americas in 1492?

A few days ago we discussed a post by historian Sean Manning who, in the context of a review of a book by economist Brad DeLong, wrote:

I [Manning] don’t see much value in estimating the population of the world in 6000 BCE when we can’t agree on the population of the Americas in 1492 within a factor of 20, and took decades to agree on the population of the Roman empire under Augustus within a factor of two.

I pointed this out to someone by email, who disagreed with the claim of a factor-of-20 uncertainty:

The lowest estimate I am aware of was A.L. Kroeber’s 8 million estimate in 1938, based on backward extrapolation from 1800s populations without any recognition of the effects of plagues on a greenfield population. Dobyns, reacting against Kroeber in 1966, pulled a 20:1 “depopulation ratio” from the first century of contact out of thin air. People who think hard about this guess at around 40 million, with some anchoring out of an unwillingness to be a weirdo, but not much.

If the best estimate is 40 million, then a factor-of-20 uncertainty might correspond to a range of 40*c(1/sqrt(20),sqrt(20)), that is, 9 million to 180 million.
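The arithmetic above treats "a factor of 20" as the ratio of the upper to the lower endpoint, with the best estimate sitting at the geometric midpoint. Here is a minimal Python sketch of that convention; the function name factor_range is made up for illustration, and this is only one way to turn "a factor of X" into an interval.

```python
import math

def factor_range(center, factor):
    """Return (low, high) such that high / low == factor and
    the geometric mean of low and high equals center."""
    half = math.sqrt(factor)
    return center / half, center * half

# Best guess of 40 million with a factor-of-20 uncertainty
low, high = factor_range(40, 20)
print(f"{low:.1f} to {high:.1f} million")  # 8.9 to 178.9 million
```

Splitting the factor symmetrically on the log scale is a choice, not a necessity: a range of 8 to 160 million is also "a factor of 20."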

This motivated Manning to look back at the literature:

It has been a while since I read Henige’s Numbers from Nowhere and if the 20:1 figure is out of date I will correct it. Some recent research that I have to hand which shows the magnitude of the problem:

David Henige, “Recent Work and Prospects in American Indian Contact Population,” History Compass 6/1 (2008) pp. 183–206, doi: 10.1111/j.1478-0542.2007.00490.x

In a recent canvass of hemispheric depopulation the Italian historical demographer Massimo Livi-Bacci is cautious in his numbers. For example, he thinks that Hispaniola’s contact population was ‘several hundred thousands’ rather than the 2 million to 8 million estimates proffered by various High Counters.

John C. Caldwell and Thomas Schindlmayr, “Historical Population Estimates: Unraveling the Consensus,” Population and Development Review, Vol. 28, No. 2 (June 2002), pp. 183-204 especially page 201 https://www.jstor.org/stable/3092809

It might be possible to work out homeostatic maximum carrying capacity but this has proved impossible even in pre-contact Australia, an island with a stable hunting and gathering system. Had it been possible to estimate Aboriginal population at first contact, something could have been done, but the modern estimates for 1788 vary by an order of five, from 300,000 to 1,500,000 (see Caldwell, Missingham, and Marck 2001). Pre-1492 estimates of Amerindian populations vary by at least the same multiple.

It is debated whether the population of Easter Island on European contact was around 15,000 or 3,000 which is also 5:1 http://dx.doi.org/10.1126/sciadv.ado1459

A recent takedown of the historical population estimates published by people like McEvedy and Jones is:

Timothy W. Guinnane, “We Do Not Know the Population of Every Country in the World For the Past Two Thousand Years,” The Journal of Economic History, Vol. 83, No. 3 (2023) pp. 912–938 https://doi.org/10.1017/S0022050723000293 Preprint at https://www.repository.cam.ac.uk/items/6c7f7b04-c2c4-4e34-a3b1-bbe740ddb4df/full

Just looking at Manning’s reply, not looking up the references, I see the claim by Caldwell and Schindlmayr that “pre-1492 estimates of Amerindian populations vary by at least the same multiple,” with that “same multiple” referring to the “order of five” mentioned in the previous sentence.

So I edited my post and changed Manning’s “within a factor of 20” to “within a factor of 5.”

In comments, John Mashey wrote:

There’s been much analysis since Caldwell & Schindlmayr (2002) and Henige (2008). Koch, Brierley, Maslin, Lewis wrote “Earth system impacts of the European arrival and Great Dying in the Americas after 1492” (2018), https://www.sciencedirect.com/science/article/pii/S0277379118307261, 24p, including ~17 pages of analysis of the various population estimates, concluding:
“Our estimate of the number of people living in the Americas in 1492 CE is 60.5 million, with an interquartile range (IQR) of 44.8-78.2 million…”

Assuming for simplicity a normal distribution (on the log scale, since we are working with ratios), the interquartile range is +/- 0.67 sd and the 95% range is +/- 2 sd, about three times as wide. If we consider the upper and lower 95% points as a reasonable range of uncertainty, this would give a factor of (78.2/44.8)^3 = 5.3. So “a factor of 5” still seemed about right to me, if you trust that source. We can also look at the endpoints of this interval, which would be sqrt(44.8*78.2)*c(1/sqrt(5.3),sqrt(5.3)), which is roughly 25 to 135 million.
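To check how rough the "cube the IQR ratio" shortcut is, here is a short Python sketch that does the same calculation with exact normal quantiles. It assumes the log of the population is normally distributed, which is my simplification and not something the Koch et al. paper claims; the variable names are mine.

```python
import math
from statistics import NormalDist

q1, q3 = 44.8, 78.2  # interquartile range quoted above, in millions

z_iqr = NormalDist().inv_cdf(0.75)   # about 0.674
z_95 = NormalDist().inv_cdf(0.975)   # about 1.960

# Assumption: log(population) is normal, so the IQR pins down the log-scale sd.
sigma = (math.log(q3) - math.log(q1)) / (2 * z_iqr)
center = math.sqrt(q1 * q3)  # geometric midpoint of the IQR

factor_exact = math.exp(2 * z_95 * sigma)  # ratio of 97.5% point to 2.5% point
factor_cubed = (q3 / q1) ** 3              # the shortcut: 2 sd is about 3 x 0.67 sd

low = center / math.sqrt(factor_cubed)
high = center * math.sqrt(factor_cubed)
print(f"exact factor {factor_exact:.2f}, shortcut factor {factor_cubed:.2f}")
print(f"shortcut interval: {low:.1f} to {high:.1f} million")
```

The exponent implied by the exact quantiles is 1.96/0.674, a bit under 3, so the shortcut slightly overstates the factor. Either way the answer lands near 5.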

Manning then released a new post returning to the question:

The Population of the Americas in 1492 is Disputed . . .

Colin McEvedy and Richard Jones . . . in their 1978 Atlas of World Population History . . . acknowledge disputes about the pre-Columbian population of the USA and Canada within a factor of 20, and disputes about the population of Mexico within a factor of 6. Their arguments for one end of the range are no more sophisticated than “it seems to be generally accepted” and that if the population of Mexico had been as high as 30 million, then the rate of decline which this implies would be an “improbability.” . . .

But 1978 is a long time ago, so if you prefer you can check a more recent survey. . . .

The introduction of European diseases caused much of North America to revert from fields and parkland to secondary forest, because the people who had been burning the brush to encourage deer or clearing forest to grow maize died or fled. [In their study published in 2019], Alexander Koch and colleagues wanted to guess how this impacted the world system, so they tried to estimate the population of the Americas in 1492. They began by surveying the literature and found a wide range of estimates:

• 2.2 to 52 million for Central Mexico (factor of 24)
• 2.3 million to 13 million for Yucatan (factor of 6)
• 1 to 20 million in Amazonia (factor of 20)
• 0.9 to 18 million in the USA and Canada (factor of 20)

They could not see any way to decide who was correct with the resources available to them and decided to take their omnium gatherum, assume the guesses reflected a statistical distribution, and find the middle of that distribution (“We included all the prior studies and did not make any judgement on their relative quality.”) . . . So both quick-and-dirty researchers in the 1970s, and a literature review in 2019, find estimates that differ by a factor of 6 to a factor of 20. . . .

As David Henige pointed out in 1998, most of these estimates start with numbers in the writings of early travellers or the records of colonial and post-colonial states. There are many reasons to question these numbers, such as that sometimes a number in one story is based on a number in another story not a count, or that by the time any settler made a count many of the natives had been killed by disease, forced labour, expulsion from their homelands into wasteland, and plain old murder. Researchers then add, multiply, divide, and subtract the numbers in their sources until they feel right. . . . There has been no great improvement in these methods during my or my parents’ lifetime, although certainly sometimes a new source is discovered or archaeological work affects the arithmetic.

Given enough money and labour and equipment, archaeologists can survey large regions for evidence of settlements before European contact. This kind of research only covers a few areas, it works best for people with durable houses and pottery, and converting counts of potsherds or postholes to people is very difficult. It can be hard to tell the difference between a very large village, and a village where it was customary to rebuild your house in a new location every time the old one decayed. . . .

It is well known that when you ask people to give you a number for something, and there is a plausible minimum but no maximum, the numbers will be skewed upwards by people with big imaginations. Counts of First Nations and American Indian populations in the early 20th century generally provide a minimum for the pre-contact population of North America, although even then there are difficulties because the reintroduction of the horse and the spread of native and Old World crops let people live in places where few of them had lived before. . . . But it is easier to guess arbitrarily high numbers than arbitrarily low numbers, and so any attempt to take an average or a median will skew high. . . .

In ancient Afro-Eurasia, the only societies where we have good data are the Greco-Roman world, Egypt, and Han China. We can estimate the population of those societies within say a factor of 3 using contemporary censuses and very detailed archaeology. Evidence for those societies is no help in estimating the population of Hispaniola (Haiti and the Dominican Republic) or Newfoundland in 1492, because the ways they lived and were organized were totally different. . . .

The abstract of Alexander Koch’s study emphasizes the middle range of their population estimate not the wide range in their sources (and Koch and colleagues explain why it’s hard to know historical populations, but you have to read the whole article to see their explanation). . . .

OK, so what to do about all this? I’m not sure! I see four big issues here:

1. It’s not clear what we should mean when we say that we know a positive number within a factor of X, even in the ideal situation when the uncertainty about that number can be described by a known probability distribution. I’m taking X to be the ratio of the upper and lower bounds of the 95% central uncertainty interval, but that’s just one way to define it.

2. The population of the Americas in 1492 is the sum of the population in 1492 of the land that is currently Canada, plus the population in 1492 of what is now the United States, plus the population in 1492 of what is now Mexico, plus the population in 1492 of what is now Hispaniola, plus the population in 1492 of what is now Cuba, etc etc etc. Our uncertainties in these numbers are, presumably, positively correlated, but not 100% correlated.

Koch et al. (2019) obtain their uncertainty about the total population by dividing the hemisphere into seven regions:

• Caribbean: “Most estimates are between 300,000 and 500,000 people”
• Mexico: “estimates for central Mexico and Yucatan combined, which is considered representative for all of Mexico, range from less than 3 million to over 52 million with many at around 20 million”
• Central America: “Estimates range from 0.8 million . . . to 10.8–13.5 million . . . Most estimates range between 4.75 million and 6 million”
• Inca Territory: “estimates range from 4.1 million to 43.8 million with a likely population of around 20 million, based on the sum of the most widely accepted figures for each of the regions”
• Amazonia: “Estimates include 1.5-2 million based on an average of present-day densities . . . 3.2 million based on tribe-by-tribe counts . . . 5.5 million extrapolated from eastern Ecuador . . . and from 5.1 to 20 million . . . Recent findings . . . indicate larger populations, with most recent estimates ranging between 8 and 20 million people”
• North America: “The lower range . . . lies between 900,000 and 2.4 million . . . The highest estimate of 18 million . . . has been criticized . . . More recent estimates derived from geospatial interpolation of archaeological sites range between 2.8 million and 5.7 million”
• Rest of the Americas: “Venezuela with 600,000–1.5 million, Uruguay and Paraguay, estimated together as 285,000–1.1 million, and Argentina with 300,000–500,000 people . . . The total estimate for the remainder of the Americas is between 1.2 and 3.1 million”

Then they put these numbers together . . . I’m not 100% sure what they do here. Here’s what they say:

National estimates within a region are cross-combined and their sums form a regional estimate for each of the seven regions. . . . Next, cross-combining and taking the sums of these regional estimates (combinations) gives a hemisphere-wide population frequency distribution, with the higher occurrence rate of similar results reflecting higher frequencies in the distribution.

I think this means that they’re doing a kind of bootstrap, where they’re taking the different estimates they’ve collected and using these to represent an uncertainty distribution for each region, and then they’re assuming independence of the uncertainties.
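Here is my reading of "cross-combining" as a Python sketch: take one estimate per region, in every possible combination, and add them up. The numbers are just values I picked out of the regional ranges quoted above (in millions). They are not the actual per-country inputs of Koch et al., so the output is an illustration of the procedure and not a reproduction of their result.

```python
from itertools import product
from statistics import quantiles

# Illustrative values (millions) picked from the quoted ranges;
# NOT the actual inputs used by Koch et al.
regional_estimates = {
    "Caribbean": [0.3, 0.5],
    "Mexico": [3, 20, 52],
    "Central America": [0.8, 4.75, 6, 13.5],
    "Inca Territory": [4.1, 20, 43.8],
    "Amazonia": [1.5, 3.2, 5.5, 8, 20],
    "North America": [0.9, 2.4, 2.8, 5.7, 18],
    "Rest of the Americas": [1.2, 3.1],
}

# Cross-combine: one estimate per region, every possible combination, summed.
totals = sorted(sum(combo) for combo in product(*regional_estimates.values()))

q1, median, q3 = quantiles(totals, n=4)
print(f"{len(totals)} combinations")
print(f"min {totals[0]:.1f}, max {totals[-1]:.1f} million")
print(f"median {median:.1f}, IQR {q1:.1f} to {q3:.1f} million")
```

Giving every combination equal weight is what amounts to an independence assumption: a low estimate for Mexico is paired with a high estimate for Amazonia just as often as with a low one.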

As noted above, assuming independence doesn’t seem right, and this makes me think they’re understating their total uncertainty.
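To get a feel for how much the independence assumption matters, here is a small simulation sketch in Python. Everything in it is hypothetical: the regional medians, the common log-scale sd of 0.8, and the single shared error term with correlation rho are made up for illustration and are not from any of the papers discussed here.

```python
import math
import random
from statistics import quantiles

# Hypothetical regional medians (millions) and a common log-scale sd;
# chosen for illustration only.
medians = [0.4, 20, 5, 20, 8, 4, 2]
log_sd = 0.8

def interval_ratio(rho, n_sims=20000, seed=1):
    """Ratio of the 97.5% point to the 2.5% point of the simulated total,
    when regional log-errors share a pairwise correlation rho."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_sims):
        shared = rng.gauss(0, 1)  # error common to all regions
        total = 0.0
        for m in medians:
            own = rng.gauss(0, 1)  # region-specific error
            z = math.sqrt(rho) * shared + math.sqrt(1 - rho) * own
            total += m * math.exp(log_sd * z)
        totals.append(total)
    cuts = quantiles(totals, n=40)  # cut points at 2.5%, 5%, ..., 97.5%
    return cuts[-1] / cuts[0]

for rho in (0.0, 0.5, 1.0):
    print(f"rho = {rho}: 95% interval spans a factor of {interval_ratio(rho):.1f}")
```

With rho = 1 the total is a constant times a single lognormal draw, so the factor should come out near exp(2 × 1.96 × 0.8), about 23. With rho = 0 the regional errors partly cancel and the factor for the total is much smaller. The truth is presumably somewhere in between, which is why assuming independence will tend to understate the uncertainty in the sum.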

3. The estimates are constructed by adding together numbers that come from rough extrapolations from crude models. I don’t know of any good alternatives here (given that this all happened in the distant past, “rough extrapolations from crude models” is pretty much the only game in town), but it’s still an issue. To stick with that Koch et al. paper, their distributions include some estimates that they themselves (Koch et al.) don’t seem to think are reasonable. On the other hand, all of these estimates are rough extrapolations so maybe the uncertainty ranges are too narrow. Just for example, can they really estimate the 1492 population of Argentina as “300,000–500,000”? Just intuitively this sounds way too precise. Their citations for that area come from papers from 1954 and 1976. OK, this is not a big deal: Argentina isn’t where all the people were living. The point is that it’s hard to know what to make of this mix of implausibly precise and implausibly extreme estimates for different regions.

4. There will always be a demand for precise numbers. This does not mean we should laugh at the estimates we have, just that we sometimes need to push against our desire for a quick number. And there are pressures in the other direction: as Manning discussed in his posts, there are various political reasons for people to want to give low or high numbers.

A factor of 20 (for example, 9 million to 180 million) seems soooo wide to me. I’m more comfortable with a factor of 5 (for example, 25 million to 125 million). But that might just come from my comfort with modern governmental statistics. Maybe 9 million and 180 million both are legitimate possibilities. I don’t have a good sense of what’s known here, let alone what we don’t know we don’t know.

To say the population of the hemisphere in 1492 could be known to within a factor of 2, that indeed doesn’t seem reasonable given the difficulty of estimating the population at that time in any given region.

What’s it all for?

At this point, the question arises: What do we plan to do with this number? Or, to put it more baldly, who cares what was the population of the Americas in 1492?

For sure, it makes sense for historians, epidemiologists, political scientists, economists, etc., to want to know what was the population in particular cities and regions of the Americas, to get a sense of what happened after contact with the Europeans.

But what are you supposed to do with the total population of the Americas? One thing you can do is add it to your estimates of the total population of every other region of the world to get an estimate of global population. A couple of my books have these data; here’s the relevant bit from Active Statistics:

I think I typed these numbers in from . . . the aforementioned Atlas of World Population History, which sits on my bookshelf—I saw it in a bookstore and bought it many years ago, just because I was curious about the topic. I’ve always been interested in statistics and was very excited upon discovering the Statistical Abstract of the United States as a teenager. So that’s one answer to who cares: stats nerds!

Notice, though, that the numbers in my table above, which were taken from that atlas, show no uncertainty. Kind of embarrassing for a statistics book, no?

Another way to say this is, if a number is particularly difficult to estimate, maybe you should question your need to estimate it. The world population in 1492, or even the population of the Americas at that time . . . what does it mean? There was so little interaction between different parts of the world, or even between different regions in the Americas, that these total population numbers have no particular meaning in themselves.

I mean, sure, there was a number, which we’ll never know or even be able to accurately estimate, of the total number of humans living in the Americas at midnight GMT on 1 Jan 1492 or whatever, but there’s nothing you can really do with it. It would be like, oh, I dunno, what if you do a census of someone’s house and find that it contains 240 books, 35 DVDs, 5 magazines, 2 newspapers, 3 old vinyl records, and an 8-track tape? You could add these up and say that there are 286 news and entertainment items in the house, but this “286” isn’t really the answer to any question.

Getting back to the population of the hemisphere, this all came up because Manning used uncertainty in population to cast doubt on DeLong’s claims about historical rates of growth. This led us down the rabbit hole of trying to summarize uncertainty in this one particular population summary as a sort of proxy battle about the level of precision that can be expected from historical estimates of population, health, and economic production and consumption. Before the modern globalized era, the relevant numbers here are local and regional, not continental and global. I recommend going back and reading Manning’s post and also probably DeLong’s book, about which Manning has much to praise.
