Head of Research & Insight
At time of writing, we are one day away from the long-awaited GDPR “enforcement date” of May 25th. This is the date upon which European governments, and their appointed regulators, can begin to act against companies or individuals that they believe are not compliant with this new privacy legislation.
If you’re reading this post from a far-distant future in which all UK companies have been fined into oblivion by an unexpectedly litigious ICO, and the only viable trade remaining is as a data / privacy consultant, then this discussion may seem quaint to you (but thanks for reading the blog!). However, if you’re reading this at-or-near time of publication, you may be knee-deep in preparing your organisation for GDPR compliance.
A question that has come up repeatedly is whether Google Analytics data is considered “personal” data, and so falls under the remit of GDPR, or whether it is considered fully anonymous. If the former, what are the obligations for a data controller, and what are the options for ensuring compliance? There is certainly some interpretation required here, as the legislation is far from prescriptive. Here are one man’s thoughts.
The question of whether Google Analytics data counts as “personal data” is fraught with confusion, so let’s clear that up first.
Most people with any Google Analytics experience know that capturing personally identifiable information (PII) in the platform is not permitted. Some people might not be aware that this is not a legal issue, but simply related to Google’s terms of service. To complicate matters further, Google’s definition of PII does not match the EU’s definition of “personal data” under GDPR.
You can read more about how Google define PII here, but in a nutshell, it includes any data that could be used on its own to directly identify, contact, or precisely locate an individual. This differs from the European Union’s GDPR definition of personal data in one crucial regard, namely that the latter’s definition includes any data that could be used to indirectly identify an individual in combination with other data. Here’s the relevant passage:
"The GDPR applies to ‘personal data’ meaning any information relating to an identifiable person who can be directly or indirectly identified in particular by reference to an identifier.
This definition provides for a wide range of personal identifiers to constitute personal data, including name, identification number, location data or online identifier, reflecting changes in technology and the way organisations collect information about people."
The exact wording matters, because all Google Analytics data is captured with a corresponding “Client ID”, which allows GA to understand the behaviour of users across multiple sessions. You may not see it in all your reports, but it’s there in Google’s database. This Client ID would not be defined as PII by Google, but would it fall under the GDPR definition of “personal data”? The full answer is as frustrating as it is unhelpful: it depends.
To break it down further, the question at the heart of this problem is whether this data can be used in any way to identify an individual. It’s possible that if you run a website without any customised analytics, and without any ecommerce tracking, you might be able to legitimately claim that this Client ID is a totally anonymous means of differentiating between users.
But if you are tracking something like ecommerce transactions, the likelihood is that the data from your CRM, payment gateway or other sales databases could be cross-referenced with the details of the purchase in Google Analytics to identify this customer. The possibility of this process (whether it is carried out or not) would make all of the associated Google Analytics browsing data “personal data” for the purposes of GDPR. This is likely true for most Google Analytics properties, and is just one example – combining data sources will allow the identification of users in a lot of different circumstances.
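To make the cross-referencing risk concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `GaTransaction` and `CrmOrder` records, their field names and the sample values are invented for illustration, and it assumes only that the same transaction ID appears in both your analytics data and your sales database.

```python
from dataclasses import dataclass


@dataclass
class GaTransaction:
    """Hypothetical row exported from analytics: no name, no email."""
    client_id: str
    transaction_id: str


@dataclass
class CrmOrder:
    """Hypothetical row from a CRM or payment system."""
    transaction_id: str
    customer_email: str


def link_client_ids_to_customers(ga_rows, crm_orders):
    """Map each Client ID to a customer email via shared transaction IDs."""
    email_by_transaction = {
        order.transaction_id: order.customer_email for order in crm_orders
    }
    linked = {}
    for row in ga_rows:
        email = email_by_transaction.get(row.transaction_id)
        if email is not None:
            # One matching purchase ties the "anonymous" ID to a person.
            linked[row.client_id] = email
    return linked


ga_rows = [
    GaTransaction(client_id="111.222", transaction_id="T-1001"),
    GaTransaction(client_id="333.444", transaction_id="T-9999"),
]
crm_orders = [CrmOrder(transaction_id="T-1001", customer_email="jo@example.com")]

print(link_client_ids_to_customers(ga_rows, crm_orders))
```

Once a single purchase links a Client ID to a customer, every other hit recorded against that Client ID can be attributed to the same person, which is why the mere possibility of this join matters.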
Advanced users of Google Analytics might also be making use of the User ID function, which swaps the Client ID for a custom identifier, most often from another platform like a CRM, as a way of tying up customers’ identities across multiple systems. If you’ve read this far, it should come as no surprise that this data would absolutely be considered personal data, and fully subject to GDPR.
It’s worth consulting with a legal expert on exactly how you should be handling your customers’ personal data in a post-GDPR world, including Google Analytics (if you haven’t already, what have you been doing?!). But Google have recently released a couple of features specifically designed to help users of Google Analytics achieve GDPR compliance.
The first is the new Data Retention policy. This allows data controllers / property administrators to set limits on how long Google Analytics will store user data for, and defaults to 26 months for free Google Analytics properties. GDPR emphasises data minimisation, and the indefinite storage of user data is very much frowned upon (if not outright illegal), which makes the ability to set a time limit on storing user data a very useful feature.
After data has passed the retention limits, the data that powers custom reporting, user segmentation and so on will be permanently deleted. Some data will remain in pre-aggregated tables (for very standard reports), but underlying user and event data will be gone forever. For obvious reasons, be extremely careful when setting this policy!
Next up is the User Deletion feature, currently only accessible through the management API. This does exactly what you’d expect based on the name, as it allows an admin to permanently delete user and event data associated with a certain Client ID or User ID. Again, very useful given the so-called “right to be forgotten”, but with a couple of significant limitations.
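As a rough illustration of what a deletion request involves, the sketch below builds (but does not send) a request body in Python. The endpoint URL and field names reflect my reading of Google’s User Deletion API reference at the time of writing and should be checked against the current documentation before use. The helper name `build_user_deletion_request` and the property ID are made up for this example, and the OAuth authorisation that a real call needs is not shown.

```python
import json

# Endpoint as I understand it from Google's reference docs; verify before use.
USER_DELETION_URL = (
    "https://www.googleapis.com/analytics/v3/"
    "userDeletion/userDeletionRequests:upsert"
)

# Only the two identifier types discussed in this post.
ALLOWED_ID_TYPES = {"CLIENT_ID", "USER_ID"}


def build_user_deletion_request(web_property_id, identifier, id_type="CLIENT_ID"):
    """Return the JSON-serialisable body for one deletion request."""
    if id_type not in ALLOWED_ID_TYPES:
        raise ValueError(f"Unsupported identifier type: {id_type}")
    return {
        "kind": "analytics#userDeletionRequest",
        "id": {"type": id_type, "userId": identifier},
        "webPropertyId": web_property_id,
    }


# "UA-12345-1" and the Client ID below are placeholder values.
body = build_user_deletion_request("UA-12345-1", "1234567890.1512345678")
print(USER_DELETION_URL)
print(json.dumps(body, indent=2))
```

Keeping the payload construction separate from the HTTP call makes it easy to log or review exactly which identifiers you are about to delete, which is worth doing given that deletion is permanent.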
The first is that a user could request that their data be deleted, but finding out their exact Client ID may be time-consuming and potentially imprecise. This is obviously made significantly easier by setting the custom User ID referenced earlier, as it isn’t reasonably practical to ask a layman to provide the Client ID from the cookie on their device. If it is impossible to identify this Client ID, you’re back to questioning whether you are really dealing with personal data or not.
Secondly, users that are deleted via this mechanism will be tracked afresh if they ever return to the same website. Even worse, if the user’s original cookies still exist on their devices, they may not be served a “cookie consent” message, if that is usually served only on the first visit. Unfortunately, there are no simple workarounds to this, and this feature may cause as many problems as it solves.
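For a sense of what retrieving a Client ID from a cookie looks like, here is a small Python sketch. It assumes the commonly observed format of the `_ga` cookie, in which a value such as `GA1.2.1234567890.1512345678` carries the Client ID in its last two dot-separated fields. That format is an assumption rather than a guarantee, and `client_id_from_ga_cookie` is a hypothetical helper, not part of any Google library.

```python
def client_id_from_ga_cookie(cookie_value):
    """Extract the Client ID from a _ga cookie value, or return None.

    Assumes the value looks like "GA1.2.<random>.<timestamp>", with the
    Client ID being the final two fields joined by a dot.
    """
    parts = cookie_value.strip().split(".")
    if len(parts) < 4 or not parts[0].startswith("GA"):
        return None
    return ".".join(parts[-2:])


print(client_id_from_ga_cookie("GA1.2.1234567890.1512345678"))
# prints 1234567890.1512345678
```

Even with a helper like this, someone still has to get the cookie value off the user’s device, and a user with several browsers or devices will have several Client IDs.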
The information given here is a very high-level summary of how GDPR will impact users of Google Analytics (and – important caveat – not written by a legal professional. Seek advice!). But you can already see how GDPR could dramatically increase the complexity of managing a Google Analytics property.
As with the “cookie law”, introduced in 2012, we are seeing various implementations and interpretations of GDPR by brands. In time, the industry will form a consensus and optimal solutions will emerge, for the issues mentioned above, and for other concerns such as gathering user consent. Until then, it’s vital to understand the issues at play, and take reasonable steps to move towards compliance.