“With great power, comes great responsibility.”
Comic book fans will quickly recognize this quote as the words inspiring Peter Parker to become Spider-Man. Others will note that a version of it is often attributed to Voltaire. Nevertheless, as important as this quote is to history or to Spidey’s future, I think it has even greater relevance for technologists in our data-driven world.
The amount of personal data collected by organizations is staggering. As Facebook and Cambridge Analytica taught us, the opportunity to abuse data is overwhelming. For these reasons, we need to embrace the concept of data ethics.
What is Data Ethics?
It’s a thin, thin line between proper use and abuse of data. As data science and related technologies evolve, so does the “art of the possible”. While data analysis is not new, we now have the ability to quickly process large amounts of data and make correlations and predictions using disparate data sets. The ease of these efforts creates numerous issues related to privacy, confidentiality, transparency, and identity.
As a result, data ethics emerged as a new branch of ethics that focuses on the moral problems associated with:
- How data is generated, recorded, and shared;
- The way algorithms for machine learning and artificial intelligence use data;
- The data practices embraced by the public and private sectors.
Data ethics highlights the complexity of the ethical challenges posed by data science and big data analytics. Gartner previously predicted that one-half of business ethics violations would result from the improper use of big data. In short, our current ethical frameworks no longer apply to data and we must now think differently.
Privacy in Practice
Although generally slow to respond, various governments recently enacted laws to protect consumers. The General Data Protection Regulation (GDPR) in the EU has been described as “privacy by default”, giving citizens strict control of their data. Earlier this year, California passed the California Consumer Privacy Act (CCPA) to protect online privacy and personally identifiable information (PII). Now, the Federal Data Strategy purports to make ethical governance one of its core principles. Major corporations are also lining up as privacy advocates in hopes of shaping future legislation in the US.
Privacy may be making a comeback, but we still need data ethics to guide us towards “privacy by design”.
The UK is Leading the Way
Such is the case in the United Kingdom.
The UK created a Data Ethics Framework as part of its National Digital Strategy. The framework sets clear guidelines for acceptable uses of government data, building in transparency and accountability. The audience is anyone who interacts with government data, from statisticians to policymakers to IT staff and beyond.
As Matt Hancock, the previous UK Secretary of State for Digital, Culture, Media and Sport, stated: “If we fail to preserve the values we care about in our new digital society, then our big data capabilities risk abandoning these values for the sake of innovation and expediency.” Essentially, the UK felt it necessary to document its societal values so that they continue to hold in the new economy.
I concur with this sentiment and believe that it is time for an international code of data ethics.
Code of Data Ethics
The following tenets, based on the UK framework, examples from professional organizations, and my own experience as a CIO, form guidelines for the acceptable use of data as we fully engage in digital transformation.
- Behind the data is a person. Respect the individual when interacting with their data. Watch out for disparate impact based on blind spots and inherent biases.
- Clearly state what you plan to do with an individual’s data. Never attempt to trick them. Make it easy to understand your intentions and give consent.
- Don’t use data in ways that were not originally intended. Make additional disclosures if intentions change.
- Be transparent. Open your data to inspire trust.
- Maintain an audit trail for a dataset’s lineage. This way, anyone who interacts with it can know its history, including accuracy and quality, the context for its collection, and any related manipulations. This also supports ethics reviews and minimizes risk across the data supply chain.
- Consult experts if there’s any doubt that you may be in violation of laws or regulations. Also, remember that the law often lags technology and is the minimum standard, not all you should do to protect confidentiality and privacy.
- Use as little data as necessary to meet your need. Less data equals less risk.
- Use data insights responsibly. There are limits to the decisions we should make based solely on data without human involvement.
- Take a risk-based approach when securing data. Protect PII as if it’s your own. You can’t secure what you don’t know exists, so make information visible by finding hidden datasets.
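Two of these tenets, keeping an audit trail and using as little data as necessary, translate fairly directly into code. The following is only a minimal Python sketch of how they might fit together, not a reference implementation. The `Dataset` class, the `minimize` function, and the sample record are all hypothetical names and values invented for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Dataset:
    """Hypothetical container pairing records with their lineage log."""
    records: list[dict]
    purpose: str  # the use that was disclosed to the individuals
    lineage: list[dict] = field(default_factory=list)

    def log(self, action: str, detail: str) -> None:
        # Append-only audit entry: what was done, with what context, and when (UTC).
        self.lineage.append({
            "action": action,
            "detail": detail,
            "at": datetime.now(timezone.utc).isoformat(),
        })


def minimize(dataset: Dataset, needed_fields: set[str]) -> Dataset:
    """Return a copy that keeps only the fields the stated purpose needs."""
    trimmed = [
        {key: value for key, value in row.items() if key in needed_fields}
        for row in dataset.records
    ]
    # Carry the earlier history forward so the lineage stays complete.
    result = Dataset(trimmed, dataset.purpose, list(dataset.lineage))
    result.log("minimize", f"kept fields: {sorted(needed_fields)}")
    return result


# Illustrative, made-up record; no real person or dataset is implied.
raw = Dataset(
    records=[{"name": "A. Person", "email": "a@example.org", "zip": "00000", "age": 41}],
    purpose="regional age statistics",
)
raw.log("collect", "survey responses, consent given for the stated purpose")
slim = minimize(raw, {"zip", "age"})
```

In this sketch, `slim` carries no names or email addresses, and its `lineage` list records both the collection step and the minimization step, so a later reviewer can see where the data came from and what was done to it.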
These ideas are a starting point. A healthy balance between data technologies and privacy rights is possible, but requires an ongoing, international conversation.
Despite concerns, data ethics will not stifle innovation.
Innovation comes from a marriage of collaboration and empathy. Empathy born from thinking about how we use data and its impact on society, combined with a collaborative, open dialogue with citizens and customers around data ethics, will lead to innovation. Further, under current conditions, ethics can be a competitive advantage, much like “green” technology for environmentally minded companies.
There’s a historic mistrust of institutions, especially those that abuse the great power of information. Consequently, we all have a great responsibility to embrace data ethics. The future of our digital economy hangs in the balance.