Using big data to improve health might seem like a great idea. The way private insurance works, though, it could end up making sick people — or even those perceived as likely to become sick — a lot poorer.

Suppose a company offers you an insurance discount and a free Fitbit if you agree to share your data and submit to a yearly physical. You’re assured that the data will be used only in aggregate, never tied back to specific identities.

If that makes you feel safe, it shouldn’t. The way machine learning works, data can be used against individuals without being connected directly to names.

Remember that study about how likes on Facebook can indicate sexual orientation? The researchers first looked at a pile of data from people whose orientation was known, to find patterns in liking behaviour. They then built a model that would go the other way, inferring anyone’s orientation from the stuff they happened to like, with a pretty high level of precision.

The same can be done with health. If an algorithm has enough data — on attributes such as weight, height, pre-existing conditions and exercise habits, as well as longer-term health outcomes — it can make predictions about the prospects of any given individual.
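
To see how this can work without a single name on file, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the records, the attribute bands, the dollar figures and the function names are invented, and a real insurer would use far richer data and a far more sophisticated model. The point is only that averages learned from anonymous records can be applied to whoever walks in next.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical, made-up training records with no names attached:
# (weight_band, exercises_regularly, has_preexisting_condition, annual_cost)
ANONYMOUS_RECORDS = [
    ("normal", True, False, 1200),
    ("normal", True, False, 1000),
    ("high", False, True, 9000),
    ("high", False, True, 11000),
    ("high", True, False, 3000),
    ("normal", False, False, 2000),
]


def fit_profile_costs(records):
    """Average the observed cost for each attribute profile."""
    costs = defaultdict(list)
    for *profile, cost in records:
        costs[tuple(profile)].append(cost)
    return {profile: mean(values) for profile, values in costs.items()}


def predict_cost(model, profile, default):
    """Look up the expected cost for a profile; no identity is needed."""
    return model.get(tuple(profile), default)


model = fit_profile_costs(ANONYMOUS_RECORDS)
overall_average = mean(record[-1] for record in ANONYMOUS_RECORDS)

# A new customer who never appeared in the training data.
new_customer = ("high", False, True)
predicted = predict_cost(model, new_customer, overall_average)
print(predicted)
```

The new customer is priced purely on resemblance to strangers in the data set, which is why a promise of aggregate-only use offers less protection than it appears to.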

Wait a second, you might be thinking. Aren’t these tracking devices built to help me stay healthy? Well, some have challenged the idea that Fitbit devices can reliably track sleep and heart rate.

Such devices seem better at figuring out whether you’ve been skipping your morning runs than they are at coaxing you to do them. Even IBM’s Watson has suffered a setback in its efforts to help doctors choose cancer treatments.

In the short term, there’s more money in profiling people as high-risk or low-risk than there is in solving their actual health problems. Granted, the information people provide to insurance companies might never be used against them personally. But it could ultimately be used against people like them.

Say, for example, left-handed people with vegetarian diets prove more likely to require expensive medical treatments. Insurance companies might then start charging higher premiums to people with similar profiles — that is, to those the algorithm has tagged as potentially costly.

The scope for such data-driven price discrimination will be even greater if insurers are once again allowed to charge extra for pre-existing conditions, an idea currently being debated in Congress.

Think about what that means for insurance. It’s meant to be a mechanism to pool risk — that is, to equalise the cost of protecting against unforeseen health problems.

But once the big data departments of insurance companies have enough information, including data on online purchases and habits, they can build a finely detailed profile of each and every person’s current and future health.

They can then steer “healthy” people to cheaper plans, while leaving people who have higher-risk profiles — often due to circumstances beyond their control — to pay increasingly unaffordable rates.
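
The arithmetic behind that shift is simple, and a rough sketch makes it concrete. The numbers below are invented for illustration (a ten-person pool in which two people are expected to need costly care), and the sketch ignores overhead, profit and regulation. It compares a single pooled premium with premiums set by individual risk profile.

```python
# Hypothetical expected annual costs for a ten-person pool (made-up numbers):
# eight people expected to cost 1,000 each and two expected to cost 16,000.
expected_costs = [1000] * 8 + [16000] * 2

# Risk pooling: everyone pays the same premium, the pool's average cost.
pooled_premium = sum(expected_costs) / len(expected_costs)

# Profiling: each person is charged their own predicted cost instead.
segmented_premiums = list(expected_costs)

print(pooled_premium)
print(min(segmented_premiums), max(segmented_premiums))
```

Under pooling, all ten pay 4,000. Under profiling, the eight "healthy" members pay 1,000 and the other two pay 16,000, four times the pooled rate, even though the total collected is the same.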

If we’re not careful, pretty soon it’ll be almost like there’s no insurance at all.

— Bloomberg