Doing evil with data: a beginner's guide
The concept of evil has been co-opted by spiritualists and makers of horror films to represent something otherworldly, an amoral force impressing its will from beyond. But evil is often simply a choice. It is a choice among humans deciding how they want to wield a new-found power or advantage. Big data presents such an advantage and there will be those who choose to use it for public and private benefit, and those who purposely choose to apply it in ways that harm others and benefit only themselves.
Since choosing evil is often easier and, in the opinion of data scientists Duncan Ross and Francine Bennett, sometimes more fun, the two last week offered a beginner's guide to using data for evil.
Why choose evil? It is not only much more fun, it also gets you paid a lot more, they said, leading into their spoof and cautionary tale on becoming an evil overlord of data without really trying.
Duncan Ross has been a data miner since the mid-1990s. He now leads the international data science team at Teradata. Francine Bennett is a data scientist and CEO and co-founder of Mastodon C, which sells an open source technology platform and skills for big data analytics.
Bennett says it doesn't take much for data scientists to be evil. "A talented data scientist can be evil and it doesn't have to be about big things. We all have the ability to make the world a tiny bit worse by our behaviors," she said.
Ross says to do evil, one has to think a little bit about the role of data in the world at the moment and the view that the media takes of data. He accuses the media of having a built-in scare factor when it comes to data. "One of the easiest ways you can be evil ... is simply by not doing good. You are encouraging the media to think about data and databases in terms of 1984," he said.
Ross warned potential evildoers, who don't even have to be hackers, that it is easy to make the world a slightly better place by building better predictive models. So he and Bennett offered tips for steering them away from common mistakes that might inadvertently result in doing good. The first mistake people make when trying to do evil, Bennett said, is thinking too hard about the impact of what they do. One way to avoid this is never to measure the results of what you do and never to think about the human beings behind the data.
"Analysis can have quite a lot of effect on people's behavior, so by using models and data in a very smart way ... we can make people do things they normally wouldn't want to do," Bennett said.
Another lesson for evildoers, according to Ross, is that if you want to do maximum evil, make sure data scientists are kept as far away from the business end of things as possible, and that they don't care about the differential impact on others.
Another mistake would-be evildoers make is anonymizing data really well. You can be properly evil and reveal people's dirty secrets pretty well with data they have already freely shared with you. You can find out people's health status, sexuality and even bank account information without doing any hacking, Bennett said.
They gave the example of a telco that used frequently called numbers and location data to surmise, from calling patterns and time-of-day analysis, when people fell in love, got married or moved in together and, ultimately, started extramarital affairs.
Bennett said it is really hard to guarantee anonymity: the more data you link together, the easier it is to work backwards to information about individuals that is worth its weight in gold.
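Bennett's point about linked data can be illustrated with a small sketch. It is not taken from the talk: the records, column names and helper functions below are hypothetical, and the measure used (the size of the smallest group of people sharing the same attribute values, often called k in k-anonymity) is just one common way to gauge how exposed a dataset is. The idea is that each extra attribute joined onto a record shrinks the crowd a person can hide in.

```python
from collections import Counter

# Hypothetical toy dataset: no names, only "harmless" attributes.
records = [
    {"zip": "10001", "birth_year": 1980, "sex": "F"},
    {"zip": "10001", "birth_year": 1980, "sex": "M"},
    {"zip": "10001", "birth_year": 1975, "sex": "F"},
    {"zip": "10001", "birth_year": 1975, "sex": "M"},
    {"zip": "10002", "birth_year": 1980, "sex": "F"},
    {"zip": "10002", "birth_year": 1980, "sex": "M"},
    {"zip": "10002", "birth_year": 1975, "sex": "F"},
    {"zip": "10002", "birth_year": 1975, "sex": "M"},
]


def group_sizes(rows, columns):
    """Count how many rows share each combination of values in `columns`."""
    return Counter(tuple(row[c] for c in columns) for row in rows)


def smallest_group(rows, columns):
    """Size of the smallest group: the k in k-anonymity for these columns."""
    return min(group_sizes(rows, columns).values())


def unique_fraction(rows, columns):
    """Fraction of rows that are the only one with their combination of values."""
    sizes = group_sizes(rows, columns)
    return sum(1 for n in sizes.values() if n == 1) / len(rows)


# Each additional linked attribute shrinks the smallest group.
for cols in (["zip"], ["zip", "birth_year"], ["zip", "birth_year", "sex"]):
    print(cols, smallest_group(records, cols), unique_fraction(records, cols))
```

In this toy data, postcode alone leaves everyone in a group of four, adding birth year cuts that to two, and adding sex makes every record unique. Anyone responsible for releasing data could run the same check in reverse, to see which combinations of columns need to be generalized or dropped before publication.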
- see O'Reilly Strata Webcast