At The Echo Nest we collect pretty much every bit of data about music we can find. We crawl millions of web pages about music every day, we keep track of the listening habits of millions of streaming-music-service users, and we analyze the actual audio of hundreds of millions of songs by millions of artists. Then we try to make some sense of it all.
One of the many ways we try to organize all this information is by genre. We want to know what kinds of music there are in the world, which artists are making which kinds, and how the genres relate to each other. Sometimes this is useful in itself (want to hear some Finnish hip hop? We can do that). Sometimes it's a way of cross-checking other data (if we think somebody is making Finnish hip hop, but we also think they are from Thailand and were active from 1952 to 1961, at least one of those things is probably wrong).
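To make the cross-checking idea concrete, here is a minimal sketch in Python. It is not The Echo Nest's actual system: the `GENRE_FACTS` table, the `ArtistRecord` fields, and the `find_conflicts` helper are all hypothetical names, and the earliest-year value is an illustrative assumption. The point is only that a genre label implies constraints that other fields can be tested against.

```python
from dataclasses import dataclass

# Hypothetical reference data (an assumption for illustration): for each genre,
# the earliest year it plausibly existed and the countries its name implies.
GENRE_FACTS = {
    "finnish hip hop": {"earliest_year": 1973, "countries": {"Finland"}},
}


@dataclass
class ArtistRecord:
    name: str
    genre: str
    country: str
    active_start: int
    active_end: int


def find_conflicts(artist):
    """Return human-readable inconsistencies between genre and other fields."""
    facts = GENRE_FACTS.get(artist.genre)
    if facts is None:
        return []  # no reference data, so nothing to cross-check

    conflicts = []
    if artist.country not in facts["countries"]:
        conflicts.append(
            f"country {artist.country!r} is unexpected for genre {artist.genre!r}"
        )
    if artist.active_end < facts["earliest_year"]:
        conflicts.append(
            f"active years {artist.active_start}-{artist.active_end} "
            f"end before {artist.genre!r} plausibly existed"
        )
    return conflicts


# The example from the text: Finnish hip hop, Thailand, active 1952 to 1961.
suspect = ArtistRecord("Example Artist", "finnish hip hop", "Thailand", 1952, 1961)
for problem in find_conflicts(suspect):
    print(problem)
```

A check like this cannot say which field is wrong, only that the fields disagree, which matches the "at least one of those things is probably wrong" framing.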
I overheard a conversation with a man representing a Silicon Valley company the other day. He was bragging about how they apparently had millions of phone conversations recorded. This bounty was the outcome of the allegedly ‘free’ service they had been providing. A vast trove of data, perhaps waiting to be merged with more data, or sold to a company that figures out how to monetize it. And the people using the service, let’s call them ‘data mines’, had received something without understanding the cost.
In my film Terms and Conditions May Apply, I show how companies have used contracts of adhesion to legally sweep up as much data as possible without meaningful consent. There is a notion that companies have a right to any data they can accrue, no matter how personal, and no matter whether the person surrendering the information is even aware it’s happening.