Can scientific misconduct derail big data?


Scientists have a tough enough job trying to make a superstitious and paranoid public face reality.

And over the last 10 years, the number of retracted scientific papers has increased tenfold, which won't help their cause when they try to get the world to believe what big data analytics are telling them.

This would be fine if the papers were retracted because better data or better processes proved them incorrect or inconclusive. Unfortunately, most were retracted because they were either purposely fraudulent or plagiarized, according to a New York Times article in October citing research from the journal Nature.

Another study, published in the Proceedings of the National Academy of Sciences, found that of 2,047 retracted papers in the biomedical and life sciences fields, three-quarters were retracted for misconduct.

Is it any wonder that climate change deniers find a receptive audience for their claims that the science is "unsettled" and there is no consensus on anthropogenic climate change? What does that portend for data scientists?

In The Atlantic this month, Edward Tenner, a historian of technology and culture, says big-data-powered science has an Achilles heel: software defects.

These defects could undermine the credibility of data science as much as fraud has. In fact, they too have caused published scientific papers to be retracted. Tenner cited an example from the Department of Molecular Biology at the Scripps Research Institute, in which a hand-written program flipped two columns of data and inverted an electron-density map, causing several papers to be retracted from the journal Science.

He also cites sources who say that research conducted with corrupted software is a common occurrence.
