Don't forget about long data
Historically, among societies with no possibility of technological contact, those with larger initial populations have had faster technological change and population growth. We know this not from big data analysis, but from something Samuel Arbesman--writing in Wired this week--says we need more of: long data.
Arbesman said big data as it is currently applied often looks at data sets that, while large, still represent only a snapshot in time. He says "we need to stop getting stuck only on big data and start thinking about long data."
The conclusion above regarding the correlation between population size and technology innovation came from such a study by Michael Kremer in The Quarterly Journal of Economics: "Population Growth and Technological Change: One Million B.C. to 1990." That's a long data set.
He also cited as an example of long data a study called "Four Thousand Years of Urban Growth," which contains datasets of city populations over millennia. While immediate business needs may drive big data applications today, Arbesman said long data is needed because "We're a species that evolves over ages--not just short hype cycles--so we can't ignore datasets of long timescale. They offer us much more information than traditional datasets of big data that only span several years or even shorter time periods."
Also, when big data takes the equivalent of snapshots in time on short timescales, results can be skewed by a shifting baseline: each new snapshot resets the starting point against which later research is compared, so long-term trends get measured against an already-changed reference.
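A toy sketch can make the shifting-baseline effect concrete. The numbers and the declining "stock" series below are entirely hypothetical (not from Arbesman or the cited studies); the point is only that short-window comparisons each reset the baseline and so understate a long-run trend.

```python
# Illustrative sketch: a steady 50-year decline measured two ways.
# All data here is made up for demonstration purposes.

def percent_change(series):
    """Percent change from the first to the last value of a series."""
    return 100.0 * (series[-1] - series[0]) / series[0]

# Hypothetical index declining steadily over 50 years.
stock = [100 - 1.5 * year for year in range(50)]

# Long data: compare today against the full historical baseline.
long_view = percent_change(stock)

# Short snapshots: each 10-year study uses its own decade as the baseline.
short_views = [percent_change(stock[i:i + 10]) for i in range(0, 50, 10)]

print(f"50-year change: {long_view:.1f}%")
for i, change in enumerate(short_views):
    print(f"decade {i + 1} change: {change:.1f}%")
```

Each decade-long study reports only a modest decline because its baseline has already shifted downward, while the full series shows the loss is far larger.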
Arbesman said we need to add long data to our big data toolkit not just for analyzing slow changes over time, but for fast changes as well. "Big data puts slices of knowledge in context. But to really understand the big picture, we need to place a phenomenon in its longer, more historical context."
For more on long data:
- see the Wired article