How a data breach can profile and track you down

‘Stravagate’ shows that the confidentiality of information is a public good that cannot be regulated by millions of individual choices

Last updated: November 05, 2018 | 15:10

By Zeynep Tufekci

Did you make a New Year’s resolution to exercise more? Perhaps you downloaded a fitness app to help track your workouts, maybe one that allows you to share that data online with your exercise buddies?

If so, you probably checked a box to accept the app’s privacy policy. For most apps, the default setting is to share data with at least the company; for many apps the default is to share data with the public.

But you probably didn’t even notice or care. After all, what do you have to hide?

GPS data points

For users of the exercise app Strava, the answer turns out to be a lot more than they realised.

Since November, Strava has featured a global “heat map” showing where its users jogged or walked or otherwise travelled while the app was on.

The map includes some three trillion GPS data points, covering more than 5 per cent of the earth. Over the weekend, a number of security analysts showed that because many American military service members are Strava users, the map inadvertently reveals the locations of military bases and the movements of their personnel.

Perhaps more alarming for the military, similar patterns of movement appear to identify stations or airstrips in locations where the United States is not known to have such operations, as well as their supply and logistics routes.

Analysts noted that with Strava’s interface, it is relatively easy to identify the movements of individual soldiers not just abroad, but also when they are back at home, especially if combined with other public or social media data.

Apart from chastening the cybersecurity experts in the Pentagon, the Strava debacle underscores a crucial misconception at the heart of the system of privacy protection in the US.

The privacy of data cannot be managed person-by-person through a system of individualised informed consent. Data privacy is not like a consumer good, where you click “I accept” and all is well.

Collective response

Data privacy is more like air quality or safe drinking water, a public good that cannot be effectively regulated by trusting in the wisdom of millions of individual choices.

A more collective response is needed.

Part of the problem with the ideal of individualised informed consent is that it assumes companies have the ability to inform us about the risks we are consenting to.

They don’t. Strava surely did not intend to reveal the GPS coordinates of a possible Central Intelligence Agency annex in Mogadishu, Somalia — but it may have done just that.

Even if all technology companies meant well and acted in good faith, they would not be in a position to let you know what exactly you were signing up for.

Another part of the problem is the increasingly powerful set of computational methods called machine learning, which can take seemingly inconsequential data about you and, by combining it with other data, discover facts about you that you never intended to reveal.

For example, research shows that data as minor as your Facebook “likes” can be used to infer your sexual orientation, whether you use addictive substances, your race and your views on many political issues.

This kind of computational statistical inference is not 100 per cent accurate, but it can be fairly close — certainly close enough to be used to profile you for a variety of purposes.

A challenging feature of machine learning is that exactly how a given system works is opaque.

'Informed consent'

Nobody — not even those who have access to the code and data — can tell what piece of data came together with what other piece of data to result in the finding the programme made.

This further undermines the notion of informed consent, as we do not know which data results in what privacy consequences. What we do know is that these algorithms work better the more data they have.

This creates an incentive for companies to collect and store as much data as possible, and to bury the privacy ramifications, either in legalese or by playing dumb and being vague.

What can be done?

There must be strict controls and regulations concerning how all the data about us — not just the obviously sensitive bits — is collected, stored and sold. With the implications of our current data practices unknown, and with future uses of our data unknowable, data storage must move from being the default procedure to a step that is taken only when it is of demonstrable benefit to the user, with explicit consent and with clear warnings about what the company does and does not know.

And there should also be significant penalties for data breaches, especially ones that result from underinvestment in secure data practices, as many now do.

Companies often argue that privacy is what we sacrifice for the supercomputers in our pockets and their highly personalised services. This is not true.

While a perfect system with no trade-offs may not exist, there are technological avenues that remain underexplored, or even actively resisted by big companies, that could allow many of the advantages of the digital world without this kind of senseless assault on our privacy.

With luck, stricter regulations and a true consumer backlash will force our technological overlords to take this issue seriously and let us take back what should be ours: true and meaningful informed consent, and the right to be let alone.

— New York Times News Service

Zeynep Tufekci is an associate professor at the School of Information and Library Science at the University of North Carolina.
