Data scientists create code of professional conduct


Calling what's been happening lately in big data "data science malpractice" is a bit of an understatement. Given the extent of the disregard for individual privacy, many would call it closer to criminal than to simple malpractice. But in any case, and much to its credit, a group of data scientists has joined forces to create a code of professional conduct to combat the dark side of big data.

The group I am referring to is the Data Science Association, a Denver-based nonprofit started just two months ago. Already it has 500 members.

"Things were really getting out of control in terms of the definition of 'data science,'" said Michael Walker, president of the Association, in InformationWeek. "A lot of people who really weren't data scientists started calling themselves data scientists. And I saw a lot of data science malpractice in the companies, or clients, that we work with."

But policing the integrity of data projects and job-title claims is not the full extent of the association's work.

"A lot of vendors are making outlandish claims," continued Walker in that same article. "Spend hundreds of thousands, or millions, of dollars on our new technology, feed it with data, push a couple of buttons and--voilà!--you're going to get predictive analytics and a competitive advantage." The din is growing louder, he added, as more big data tools hit the market. "I can tell you, it's just malarkey. It doesn't work that way," said Walker. "It's actually very difficult to analyze data, especially large data sets, and use the scientific method in the right way to get valuable, actionable intelligence to help your company, or to help [government] policymakers make better policy."

Vendors may not like where that statement could lead, such as vendor integrity rankings down the road, but everyday people will likely cheer the new Data Science Code of Professional Conduct, particularly passages such as this:

"If a data scientist reasonably believes a client is misusing data science to communicate a false reality or promote an illusion of understanding, the data scientist shall take reasonable remedial measures, including disclosure to the client, and including, if necessary, disclosure to the proper authorities. The data scientist shall take reasonable measures to persuade the client to use data science appropriately."

Three cheers for the return of integrity and the data scientists who are willing to stand firm for the cause.  

For more:
- see the Data Science Association's website
- see the Data Science Code of Professional Conduct page
- see the InformationWeek article
