Modern advertising rests on a contradiction. The industry depends on finding consumers who are interested in a product so that companies can put that product in front of them more effectively, but it has to do so without personally identifying those consumers. This comes down to a fundamental contradiction in human nature – people don’t like feeling as though Big Brother is looking over their shoulder, but they are more than happy to give away personal data for a reasonable incentive. If you doubt it, check your wallet – do you have a loyalty card for Safeway, Lucky’s, etc.? I do.
The information about buying habits collected with such tools is valuable because – despite all protests that people don’t like to be targeted – targeted advertising works. As a society, the compromise we’ve reached between these two seemingly irreconcilable facts is that collecting data for ad targeting is allowed, but if there’s any personally identifiable information (PII), it can only be collected and used with the end user’s explicit consent.
That’s fine for the small subset of companies that have that explicit permission, but for everyone else, the ability to target users without personally identifying them is critical. Modern programmatic advertising relies heavily on this data to be effective. On the web, these data sets are built by tying usage information to cookies.
Cookies are small text files stored in your web browser that websites can modify and add data to over time. A site uses one, for instance, to tell that you’ve already logged into your web-based email so you don’t have to log in again every time you open a new message. They’re a critical part of how the web functions, and without them, modern web apps would be impossible. They also allow advertisers to collect data about your spending and browsing patterns over time.
Of course, you can go into your browser preferences and delete them all at any time to give yourself a clean slate or even set your browser to delete them after each session automatically, but most people don’t bother. It’s a pretty good system; the users who care the most about privacy can easily take steps to preserve it, and companies get the data they need to make sure everyone else sees ads that are relevant to their interests. Those ads pay for the web services that people use for free – everything from Google to Facebook – and everyone goes home happy.
Mobile is a different story. Thanks to a variety of causes (see my article from last year on Walled Gardens), most mobile activity takes place inside apps and not in the mobile web browser. No browser means no cookies. This is why Apple and Google introduced the Identifier for Advertisers (IDFA) and the Google Advertising ID (GAID), respectively. These privacy-compliant identifiers can be reset at any time by the user (like a cookie, though fewer people know how to do it) and do not correspond to any PII. They sit at the center of mobile data models, and companies can attach all sorts of information to them – what types of ads you click on, what you do in apps, what you buy and so on. Providing these IDs means that app developers don’t have to rely on hardware IDs that cannot be reset, and in fact, Apple and Google both now enforce bans on apps that use those more permanent identifiers. So far so good.
Now if someone were to tie those device identifiers to a personally identifiable data point – a phone number or social security number, for instance – the ability to wipe the slate clean would be nullified, and all that private online behavior could be irrevocably tied to real-world identity. Since targeting data is a commodity and many companies sell it (though not RadiumOne, the company where I work), anyone with deep pockets can get their hands on it. Picture employers retaliating against employees who donate to political candidates they dislike or a similar dystopian scenario.
To protect user privacy, those two data sets should never be merged, and if they are merged, the utmost care must be taken to ensure they don’t leak. That works pretty well when there are only two actors – the user and the service – but the more players are added, the more complex the relationships become and the bigger the fallout if anyone makes a mistake.
In addition to end users and app developers, there are a host of analytics solutions – tools that app developers use to handle and make sense of the mountain of data they are collecting on their users. There are also plenty of advertisers who are now realizing that the data stored in the various analytics tools is potentially incredibly valuable, given that it’s the last untapped source of targeting information available in the mobile ecosystem. Companies have an incentive to share those data sets with their advertisers to run more effective ads, acquire new users for less money and sell more stuff. The big problem is that most analytics tools were never designed for their data to be used by advertisers – they were built to help product managers and engineers make better apps – and they do not handle PII carefully enough.
For example, one common mobile analytics SDK aggregator (a tool that allows developers to send usage data to multiple tools without needing to implement multiple data collection libraries in their apps) does not have a way to turn off the flow of personally identifiable information when you integrate with it! It was built on the assumption that every analytics solution would want that personal info if users are willing to provide it, and its makers don’t think it’s worth their time to provide a way to strip it out. This is a problem. Clearly, there is a lot of education left to do for folks in the analytics space.
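To make the missing feature concrete, here is a minimal sketch of the kind of filter an aggregator could run before forwarding events downstream. Everything in it is an assumption for illustration: the key names in PII_FIELDS, the shape of the event and the strip_pii function are hypothetical and are not taken from any real SDK.

```python
from typing import Any

# Hypothetical set of keys treated as PII; a real deployment would define its own.
PII_FIELDS = {"name", "email", "phone", "address", "ssn"}


def strip_pii(event: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the event with PII keys removed, including inside nested dicts."""
    cleaned = {}
    for key, value in event.items():
        if key.lower() in PII_FIELDS:
            continue  # drop the field entirely rather than forwarding it
        if isinstance(value, dict):
            value = strip_pii(value)  # recurse into nested dictionaries
        cleaned[key] = value
    return cleaned


# Hypothetical event as an app might send it to the aggregator.
event = {
    "advertising_id": "example-resettable-id",
    "action": "purchase",
    "user": {"email": "jane@example.com", "plan": "premium"},
}
safe_event = strip_pii(event)
```

A key-based filter like this only catches PII that sits under a known key. It would not catch an email address typed into a free-text field, so it is a floor and not a complete answer.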
Many data management and visualization companies seem to take the attitude that they needn’t be concerned with whether they are storing personally identifiable information because they don’t use it themselves. They store it on behalf of the brands they work with and send it wherever the brand tells them to. To me, this feels like dodging responsibility. It’s only a matter of time before someone gets hacked and millions of consumers find their information available to anyone who wants it. As an industry, we need not only to make every effort to prevent security breaches but also to make sure that if and when breaches happen, the damage they can cause is limited. That means either not collecting PII at all or, at minimum, making sure user behavior data and personally identifying information are not stored in the same databases.
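As a rough sketch of that minimum bar, the example below keeps behavior data and identity data in two separate databases that share no column. It uses Python's built-in sqlite3 module with in-memory databases purely for illustration; the table names, column names and helper functions are hypothetical, and in practice these would be separate systems with separate access controls.

```python
import sqlite3

# Two independent databases: a breach of one does not expose the other.
behavior_db = sqlite3.connect(":memory:")
identity_db = sqlite3.connect(":memory:")

# Behavior is keyed only by the resettable advertising ID.
behavior_db.execute("CREATE TABLE events (advertising_id TEXT, action TEXT)")
# Identity is keyed only by an account ID; no advertising ID is stored here.
identity_db.execute("CREATE TABLE accounts (account_id TEXT, email TEXT)")


def record_event(advertising_id: str, action: str) -> None:
    """Store a behavioral event with no identifying fields."""
    behavior_db.execute(
        "INSERT INTO events VALUES (?, ?)", (advertising_id, action)
    )


def record_account(account_id: str, email: str) -> None:
    """Store identifying data with no behavioral fields."""
    identity_db.execute(
        "INSERT INTO accounts VALUES (?, ?)", (account_id, email)
    )


record_event("example-resettable-id", "viewed_shoes")
record_account("acct-1", "jane@example.com")

# The behavior store's schema contains nothing that points at a person.
behavior_columns = [row[1] for row in behavior_db.execute("PRAGMA table_info(events)")]
```

The point of the design is that no join key exists between the two stores, so someone who obtains the events table alone cannot tie it back to a name or email address.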
It’s particularly ironic because advertising companies don’t need PII, and most of them don’t want it. You don’t build campaigns around individuals; you build them around groups. It’s entirely possible to target groups of users who are interested in blue suede shoes with ads for those shoes without knowing their names or home addresses – the information simply isn’t relevant. So there should be no cost in efficiency from respecting user privacy. By tying cookies to privacy-compliant device identifiers like the IDFA and GAID, it’s possible to build comprehensive, cross-device interest profiles without ever personally identifying the individuals.
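A small sketch shows how little is needed to build such a group. The event data, the identifiers and the build_segments function below are all made up for illustration; the only inputs are a resettable advertising ID and an interest signal, with no name or address anywhere.

```python
from collections import defaultdict

# Hypothetical (advertising_id, interest) pairs collected from app or web activity.
events = [
    ("id-aaa", "blue suede shoes"),
    ("id-bbb", "blue suede shoes"),
    ("id-aaa", "running socks"),
]


def build_segments(events):
    """Group advertising IDs by interest so a campaign can target the whole group."""
    segments = defaultdict(set)
    for advertising_id, interest in events:
        segments[interest].add(advertising_id)
    return dict(segments)


segments = build_segments(events)
shoe_audience = segments["blue suede shoes"]  # the set of IDs to show shoe ads to
```

If a user resets their advertising ID, they simply drop out of the segment, which is exactly the clean slate the identifier was designed to provide.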
Consumers don’t seem to mind sharing their data with an app or service if they get something in return, but they don’t expect that their personally identifiable data will end up in some anonymous third party’s data warehouse as a result. And they definitely expect that data to be treated with respect. In the mobile analytics industry, we need to do a better job here.