Haunted by Data

A very on-point lecture on the limitations and dangers of our digital mentality: “collect everything and maybe we will use it later”.

One example the speaker touches on is the pharmaceutical industry and how, because of the “big data” philosophy it bought into, it is now at a point of diminishing financial returns in new drug development:

This has been a bitter pill to swallow for the pharmacological industry. They bought in to the idea of big data very early on. The growing fear is that the data-driven approach is inherently a poor fit for life science. In the world of computers, we learn to avoid certain kinds of complexity, because they make our systems impossible to reason about.

Note that the speaker suggests that the diminishing returns result from the fact that computers need unambiguous rules in order to make sense of things. Thus, in order to create data models and make sense of the world, programmers have to throw out “certain kinds of complexity” which are inherently and naturally found in the realm of biological science. As the speaker states later on, “Nature is full of self-modifying, interlocking systems, with interdependent variables you can't isolate.”

Ultimately, what we are dealing with in our computer systems are humans, and humans adapt. That means when you create a data model around a person, the model inevitably goes out the window, because people change not just naturally over time but also in reaction to the model that is imposed on them.

An example of how humans adapt to numerical requirements, as drawn from this transcript, is found in the anecdotal story of a nail factory. Once there was a nail factory. In the first year of its five-year plan, the factory’s management evaluated employees by how many nails they could produce. As a result, employees produced hundreds of millions of uselessly tiny nails. Seeing its mistake, management changed the production goals to measure nail weight rather than nail quantity. As a result, employees produced a single giant nail.

Perhaps this story seems unreal, but the speaker provides a real-world example of how humans adapt to the systems imposed on them and how that adaptation ultimately renders the collected data useless:

[An] example is electronic logging devices on trucks. These are intended to limit the hours people drive, but what do you do if you're caught ten miles from a motel? The device logs only once a minute, so if you accelerate to 45 mph, and then make sure to slow down under the 10 mph threshold right at the minute mark, you can go as far as you want. So we have these tired truckers staring at their phones, bunny-hopping down the freeway late at night. Of course there's an obvious technical countermeasure. You can start measuring once a second. Notice what you're doing, though. Now you're in an adversarial arms race with another human being that has nothing to do with measurement. It's become an issue of control, agency and power. You thought observing the driver’s behavior would get you closer to reality, but instead you've put another layer between you and what's really going on. These kinds of arms races are a symptom of data disease. We've seen them reach the point of absurdity in the online advertising industry, which unfortunately is also the economic cornerstone of the web. Advertisers have built a huge surveillance apparatus in the dream of perfect knowledge, only to find themselves in a hall of mirrors, where they can't tell who is real and who is fake.
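The sampling loophole the speaker describes can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: the 45 mph cruising speed and the 10 mph threshold come from the talk, but the driver profile (crawling at 5 mph for the five seconds around each minute mark) and the function names are hypothetical, and a real logging device surely works differently in its details.

```python
THRESHOLD_MPH = 10  # below this speed the device counts the truck as not driving (figure from the talk)


def speed_at(second: int) -> float:
    """Hypothetical driver: 5 mph for the 5 seconds around each minute mark, 45 mph otherwise."""
    offset = second % 60
    return 5.0 if offset < 3 or offset >= 58 else 45.0


def logged_driving_seconds(duration_s: int, sample_interval_s: int) -> int:
    """Seconds the device records as driving when it samples every sample_interval_s seconds."""
    driving = 0
    for t in range(0, duration_s, sample_interval_s):
        if speed_at(t) >= THRESHOLD_MPH:
            # Each sample stands in for the whole interval that follows it.
            driving += sample_interval_s
    return driving


def miles_travelled(duration_s: int) -> float:
    """Actual distance covered, summing the speed second by second."""
    return sum(speed_at(t) for t in range(duration_s)) / 3600.0


one_hour = 3600
print(logged_driving_seconds(one_hour, 60))  # 0: every sample lands on a slow second
print(logged_driving_seconds(one_hour, 1))   # 3300: per-second sampling sees the driving
print(round(miles_travelled(one_hour), 1))   # 41.7 miles covered either way
```

With once-a-minute sampling the log shows an hour of rest while the truck covers more than 41 miles. Sampling every second closes this particular gap, which is exactly the countermeasure the speaker warns will only open the next round of the arms race.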