
A clash between European Union bureaucracy and artificial intelligence is a plot worthy of a cyberpunk thriller. It will take place in real life in 2018, once some European data protection laws, passed earlier this year, go into effect.

And, though we might instinctively be tempted to endorse progress over regulation, the EU is on the side of the angels in this battle.

The EU’s General Data Protection Regulation and a separate directive contain provisions to protect people against decisions made automatically by algorithms. “The data subject shall have the right not to be subject to a decision based solely on automated processing, including profiling, which produces legal effects concerning him or her or similarly significantly affects him or her,” is how the regulation puts it. The directive, which will regulate police and intelligence work, is even tougher, prohibiting the use of data on ethnicity, religion, political views, sexual orientation or union membership in any automated decisions.

The idea is that, as the regulation’s preamble says, the processing of personal data should be “subject to suitable safeguards”. In real-life terms, this means that if a bank denies a loan based on the algorithmic processing of a person’s data, if an insurance company sets a high premium, or if a person is singled out for special police attention as a result of some blanket data-gathering operation like those Edward Snowden revealed at the National Security Agency, people should be able to challenge these decisions and have a human look into them. They should also be able to demand an explanation of why the decision was made.

“Wired” magazine, drawing in part on a recent paper by two Oxford scientists, Bryce Goodman and Seth Flaxman, suggested that the new rules could affect the algorithms at the heart of Google and Facebook, which use artificial intelligence to target ads, provide relevant search results or shape a user’s news feed. That probably won’t be the case: Automated decisions will be allowed with a user’s explicit consent or where they are “necessary for the entering or performance of a contract between the data subject and a controller.”

That makes the Googles and Facebooks of this world immune as long as they don’t forget to require a user to approve a terms of use document that nobody ever reads.

Yet the EU is on the right track. It and other regulators should watch closely how machine learning technology is being used, and they should give citizens more power in their potentially losing battle with its corporate and government applications.

Artificial intelligence can only be as impartial as the data sets on which it is based. Microsoft researcher Solon Barocas and Yale’s Andrew Selbst recently published a seminal paper on how “big data” can incorporate bias. For example, the graduates of certain schools may be found to perform best in a certain job, but giving heavy weight to that criterion will keep out qualified minority candidates, because few of them attend those schools.

Even keeping obviously discriminatory variables — like the ones mentioned in the EU directive — out of the analysis may not straighten out the bias, because they are often correlated with seemingly innocuous variables like the area where a person lives or her shopping habits.
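
To make the proxy problem concrete, here is a minimal sketch in Python. The data is entirely synthetic and the scenario (a hiring model that uses a postcode as a feature) is my own illustration, not an example from the Barocas and Selbst paper: the protected attribute is excluded from training, yet the model reproduces the bias because a correlated stand-in remains.

```python
# Illustrative sketch with made-up synthetic data: dropping a protected attribute
# does not remove bias when a correlated proxy (here, a postcode) stays in the model.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000

group = rng.integers(0, 2, n)                     # protected attribute (e.g. ethnicity)
postcode = np.where(rng.random(n) < 0.8, group,   # proxy: agrees with group 80% of the time
                    1 - group)
skill = rng.normal(0, 1, n)                       # genuinely job-relevant signal

# Historical hiring outcomes are biased against group 1, independent of skill.
hired = (skill + 0.8 * (1 - group) + rng.normal(0, 1, n)) > 0.5

# "Fair" model: the protected attribute is excluded, but the proxy is kept.
X = np.column_stack([skill, postcode])
model = LogisticRegression().fit(X, hired)

scores = model.predict_proba(X)[:, 1]
print("mean predicted hire probability, group 0:", scores[group == 0].mean())
print("mean predicted hire probability, group 1:", scores[group == 1].mean())
# The gap persists: the model reconstructs group membership from the postcode.
```

The point of the sketch is only that excluding the sensitive column changes little when the model can reconstruct it from what remains.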

One could argue that humans are even worse than algorithms at making unbiased decisions. Unlike a piece of software, a human credit officer can pretend to be blind to an applicant’s race but remain a racist. Even people free of the most egregious biases may be worse at analysing their experience than an artificial neural network is at dissecting a data set.

Yet human decisions can at least be challenged, and an explanation can be demanded. It’s harder to do with a machine.

Neural networks, systems modelled on the human brain, find regularities in the data that no human has told them to look for: In that sense, they function like early proxies for a digital human brain. Yet, unlike the brain, they are black boxes: It’s difficult to ask them what affects their decisions.

“Neural networks, especially with the rise of deep learning, pose perhaps the biggest challenge,” Goodman and Flaxman wrote. “What hope is there of explaining the weights learnt in a multilayer neural net with a complex architecture?”

There are ways to provide explanations and rectify biases. Algorithms can be developed to quantify the influence of various inputs in systems that process personal data and to generate transparency reports. For now, though, these techniques don’t work with deep neural networks.
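
One family of such techniques measures how much a model’s predictions degrade when a single input is scrambled, which is roughly the idea behind permutation importance and the quantitative-input-influence work that transparency reports build on. The sketch below is a hypothetical illustration of that general approach, not any specific published tool; the function name and the feature list are my own.

```python
# Hypothetical sketch of a per-feature "transparency report": shuffle one input at a
# time and record how much the model's accuracy drops (permutation importance).
import numpy as np

def influence_report(model, X, y, feature_names, n_repeats=10, seed=0):
    rng = np.random.default_rng(seed)
    baseline = (model.predict(X) == y).mean()
    report = {}
    for j, name in enumerate(feature_names):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            rng.shuffle(X_perm[:, j])             # break the link between feature j and y
            drops.append(baseline - (model.predict(X_perm) == y).mean())
        report[name] = float(np.mean(drops))      # bigger drop = more influential input
    return report

# For the hiring sketch above: influence_report(model, X, hired, ["skill", "postcode"])
```

A report like this works for any model that exposes a predict function, but it only says which inputs matter, not how they interact inside something like a deep neural network.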

It is the job of regulators to push researchers and software developers toward improving these techniques and making them as ubiquitous as artificial intelligence is becoming. I would even argue that they should go further than that: People should have the right to determine exactly which bits of data they allow any system to collect and process, and they should be able to opt out of certain algorithmic uses of their data.

Facebook, for example, keeps tweaking its news feed algorithm; most recently, it stressed putting the user in touch with people who are most like that user — a terrible concept for news organisations that use Facebook as a distribution platform, since it only serves to reinforce readers’ confirmation biases, dividing them into silos where they don’t come into contact with views other than their own. I would like to be able to opt out of submitting certain data that feeds this algorithm.

That would require an explanation from Facebook of what exactly its algorithm does, and an ability to act on that explanation.

Unfortunately, the new EU laws don’t go that far, only requiring algorithm-using organisations to account for decisions that produce “legal effects”. That’s something, especially if the EU hasn’t lost its appetite for enforcing the rules by the time they go into effect.