Documenting user consent according to GDPR
2020-10-04 Andreas Dewes
Article 7(1) of the GDPR states that if processing is based on consent, a data controller must be able to prove that this consent was actually given by the data subject. This sounds easy but can be tricky when asking for consent in a context where the data subject is not easily identifiable: For example, if we ask visitors of our website if we can collect and process their personal data, how can we prove that they have actually consented?
In order to answer that, we need to think about how we might identify visitors on our website. In general, people who open our website for the first time can be considered anonymous to us, as we don't have any way to identify them or know if they have visited our website before (unless we resort to unethical and illegal techniques like browser fingerprinting, which of course we won't). Now, in order to measure how well our website performs, we might want to place an identifier in our visitors' browsers so that we can re-identify them as they navigate through our website or visit it again at a later time. There are several ways to accomplish this, the most popular being to set a browser cookie: Our webserver or any script that runs on our website can instruct our visitors' browsers to store such a cookie and then let us retrieve it again the next time those visitors open a page on our website. Third-party services like Google Analytics, for example, use such cookies to store persistent user IDs that they send to their backend along with event data from our visitors, which then enables them to link page visits to specific users.
Now, on a typical website, visitors might have their personal data processed by several third parties, each using their own cookie-based identifiers. To prove that consent was correctly obtained, the GDPR requires that we keep the visitors' consent records robustly linked to all of these processing activities. So let's analyze how we might do that!
Consent Records For Short-Lived Data Processing
If the only identifying information that we or our data processors have about our website visitors is the set of identifiers stored in their browsers' cookies, we might simply store information about their consent in a cookie (or a similar storage mechanism) as well. In that way, the consent records are robustly stored together with the other identifiers that we and our data processors have stored in the visitors' browsers. This fulfills the requirement of the GDPR that consent records must be linked to the processing activities they govern. However, since we store the records on the client-side, we also need to ensure they cannot be easily manipulated. This can be done, for example, by cryptographically signing the records via an API before storing them.
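To make the signing idea concrete, here is a minimal sketch of how a backend API might sign a consent record before it is stored in a cookie, and how it might verify the record later. It assumes an HMAC-SHA256 signature with a secret that never leaves the server; the names SECRET_KEY, sign_consent and verify_consent, the record layout and the service name in the usage example are all hypothetical and only serve as an illustration.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

# Hypothetical secret: it must only be known to the signing API, never to the browser.
SECRET_KEY = b"replace-with-a-long-random-secret"


def sign_consent(consents: dict, issued_at: Optional[int] = None) -> str:
    """Return a cookie-safe 'payload.signature' string for the given consent choices."""
    record = {
        "consents": consents,
        "issued_at": int(time.time()) if issued_at is None else issued_at,
    }
    # Serialize deterministically, then encode so the value is safe to put in a cookie.
    payload = base64.urlsafe_b64encode(
        json.dumps(record, sort_keys=True).encode()
    ).decode()
    signature = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{signature}"


def verify_consent(cookie_value: str) -> Optional[dict]:
    """Return the decoded record if the signature is valid, otherwise None."""
    payload, separator, signature = cookie_value.rpartition(".")
    if not separator:
        return None
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison on bytes avoids timing leaks and non-ASCII errors.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))


# Usage: the API signs the record, the browser stores the string, the API checks it later.
cookie = sign_consent({"googleAnalytics": True}, issued_at=1601769600)
record = verify_consent(cookie)
```

Because the browser never sees the secret, a visitor can delete the cookie but cannot alter the recorded choices or the timestamp without invalidating the signature.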
Client-Side Storage Vs. Server-Side Storage
Most other CMPs always store consent records in a backend (which incurs significant cost and effort and is not very privacy-friendly). We don't believe that this is necessary for most short-lived data processing activities, as storing consent records directly on the client-side in those cases provides the same assurances as storing them in a backend. Here's why:
- For short-lived data processing activities like website analytics, we can only identify visitors via the information that is stored in their browsers.
- If we store a consent record for these activities in a backend, we still need to link it with the visitors and their identifiers that are stored in their browsers. We can do this by e.g. assigning a unique ID to the record and storing that ID in a browser cookie.
- If visitors delete their browser cookies (including the one that contains our consent record ID), we will have no way to plausibly associate the record with any particular visitor or data processing activity anymore, which effectively renders it useless.
- Since the consent record is only useful in conjunction with the data stored in the browser, we might as well store it directly there.
- We can make the client-side consent record resistant to tampering by cryptographically signing it, which offers the same assurances that we'd have when storing it in a backend.
- Neither the client-side nor the backend-based storage of consent records protects us against visitors that intentionally delete these records or record identifiers.
As you can see, for short-lived data processing activities, storing cryptographically signed consent records directly in the browser provides the same assurances that we'd get when storing them in a backend. The only scenario where backend storage could offer an advantage is if we linked additional (quasi-)identifiers to the records. Some CMPs, for example, store the IP addresses of visitors in the records, hoping this might provide an additional way to link them with other processing activities. We think this is a doubtful practice, as storing IP addresses is not privacy-friendly and linking them to third-party processing activities would require those services to store visitors' IP addresses as well. Doing that would be highly problematic from a privacy point of view, as it could create permanently identifiable data that cannot be effectively controlled. Furthermore, services like Google Analytics "anonymize" IP addresses by removing the last 8 (or more) bits from them, so even if we stored the full IP address in a consent record, we wouldn't be able to effectively associate it with most third-party processing data. Likewise, storing "anonymized" IP addresses in our consent records would be equally useless, as it wouldn't provide us with a robust way of linking our consent records to other processing activities. The only way to robustly associate backend-stored consent records with third-party processing data would be to store the third-party identifiers (like the Google Analytics user ID) in the records as well. You might consider doing this, but we advise against it for short-lived data processing, as in our opinion the privacy risks that such a centralized storage of identifiers incurs strongly outweigh the benefits.
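The following sketch illustrates why a truncated IP address cannot serve as a robust link between a consent record and third-party data. It assumes IPv4 addresses and zeroes the last 8 bits, as described above; the function name anonymize_ipv4 and the sample addresses (taken from a documentation-only range) are made up for this example, and the code is not meant to reproduce any specific service's implementation.

```python
import ipaddress


def anonymize_ipv4(address: str, dropped_bits: int = 8) -> str:
    """Zero the last `dropped_bits` bits of an IPv4 address."""
    network = ipaddress.ip_network(f"{address}/{32 - dropped_bits}", strict=False)
    return str(network.network_address)


# Three different visitors from the same /24 network (documentation range).
visitors = ["203.0.113.7", "203.0.113.42", "203.0.113.250"]
truncated = {anonymize_ipv4(ip) for ip in visitors}

# All three collapse to the same value, so the truncated address cannot
# tell us which visitor a given consent record belongs to.
print(truncated)
```

With 8 bits removed, up to 256 distinct addresses map to the same truncated value, and a full address stored in a consent record has no exact counterpart in the third-party data.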
That's why we recommend storing consent records for short-lived processing activities as cryptographically signed data directly on the client-side: It is privacy-friendly, keeps the records closely linked with the processing activities that they govern and offers the same assurances as a backend-based storage solution. If you absolutely have to store an additional copy of your records in a backend, you may of course do so; just remember that those records are only useful if they can be linked back to the processing activities they govern.
Consent Records For Long-Lived Data Processing
On the other hand, if we identify visitors using long-lived identifiers - like a permanent user ID or e-mail address - and make those available to third-party services, we need to ensure that our consent records are properly linked to these identifiers as well. For example, let's assume we want to analyze the behavior of logged-in users of our web app: To do that, tools like Segment or Google Analytics enable us to link their internal identifiers to our own. We could, for example, associate the Google Analytics data of logged-in users with unique IDs that we store in our backend for those users. If we do that, the analytics data of those users, which was pseudonymous before, will be permanently associated with those IDs. Hence, to properly document consent in this scenario, we also need to permanently associate the consent records with our internal user IDs, and keep them at least as long as we or our data processors keep the user-associated data. Storing the consent record on the client-side will no longer be sufficient in that case, as users might delete records there while their data is still being processed, making it impossible for us to prove that they have consented to that data processing. Therefore, records associated with long-lived identifiers should always be kept in a backend infrastructure and should be properly linked to all long-lived identifiers. Most consent management platforms (CMPs) don't implement this correctly, since they often don't allow associating consent records with user identifiers in a way that properly links the records to processing activities.
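The retention rule described above (keep the consent record at least as long as the data it governs) can be expressed as a simple check. The sketch below is only an illustration under assumed names: ConsentRecord, ProcessingData and may_delete_consent are hypothetical, and a real system would read these entries from its own database and from its data processors' retention settings.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable


@dataclass(frozen=True)
class ConsentRecord:
    user_id: str          # internal, long-lived identifier (hypothetical)
    purpose: str          # processing activity the consent covers
    granted: bool
    recorded_at: datetime


@dataclass(frozen=True)
class ProcessingData:
    user_id: str
    purpose: str
    retained_until: datetime  # when we or our processors delete the data


def may_delete_consent(
    record: ConsentRecord, data: Iterable[ProcessingData], now: datetime
) -> bool:
    """A consent record may only be purged once no data it governs is still retained."""
    return not any(
        item.user_id == record.user_id
        and item.purpose == record.purpose
        and item.retained_until > now
        for item in data
    )


now = datetime(2020, 10, 4, tzinfo=timezone.utc)
record = ConsentRecord("user-123", "analytics", True, now - timedelta(days=30))
still_retained = [ProcessingData("user-123", "analytics", now + timedelta(days=60))]
print(may_delete_consent(record, still_retained, now))
```

In this example the analytics data is kept for another 60 days, so the check refuses to purge the consent record.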
When developing the consent records functionality for Klaro!, we made sure to take all GDPR requirements into account and go beyond what most other CMPs offer. Our approach works as follows:
- For short-lived data processing activities that rely on pseudonymous identifiers generated on the client (e.g. random user IDs stored in browser cookies), and which produce data that effectively becomes anonymous after these pseudonyms are deleted, we store consent records together with these identifiers in the client. To ensure no tampering can occur, we cryptographically sign them with a timestamp and an authenticated hash via our API. For customers that really want to keep these records in a backend (which we think isn't necessary), we offer that option as well.
- For long-lived data processing activities that rely on permanent identifiers such as user IDs stored in your own backend, e-mail addresses or any other kind of permanent personal information, we enable you to robustly link consent records to those identifiers via a token mechanism: First, when a user logs into your web app or website, you create a user record via our API, where you store information you need to associate that user with all relevant identifiers (this could e.g. be an internal, opaque ID that links a consent record to a user entry in your own backend database). Our API then provides you with a cryptographic token that links to the created user data, and which you can safely forward to your frontend. When your users grant or decline consent via Klaro! in the frontend, this token gets passed to the Klaro! API again, which will use it to link the newly created consent record to the user data that you created before. You can then look up the consent record using the securely provided user data, and ensure that it is robustly linked to all internal identifiers you use, without having to provide any sensitive user data to us.
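The token flow for long-lived identifiers can be sketched roughly as follows. This is an in-memory stand-in written purely for illustration and is not the Klaro! API: the class ConsentService, its method names and the use of a random, unguessable token are all assumptions made for this example.

```python
import secrets
from typing import Dict, List


class ConsentService:
    """In-memory stand-in for a consent API (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._opaque_id_by_token: Dict[str, str] = {}
        self._records_by_opaque_id: Dict[str, List[dict]] = {}

    def create_user_record(self, opaque_id: str) -> str:
        """Step 1 (your backend): register an opaque user ID and receive a token."""
        token = secrets.token_urlsafe(32)  # unguessable, reveals nothing about the user
        self._opaque_id_by_token[token] = opaque_id
        return token

    def submit_consent(self, token: str, consents: dict) -> None:
        """Step 2 (frontend): store a consent record linked via the token."""
        opaque_id = self._opaque_id_by_token.get(token)
        if opaque_id is None:
            raise KeyError("unknown token")
        self._records_by_opaque_id.setdefault(opaque_id, []).append(dict(consents))

    def lookup(self, opaque_id: str) -> List[dict]:
        """Step 3 (your backend): retrieve all consent records for an opaque user ID."""
        return list(self._records_by_opaque_id.get(opaque_id, []))


service = ConsentService()
token = service.create_user_record("user-8f3a")            # at login, backend to API
service.submit_consent(token, {"googleAnalytics": True})   # in the browser
print(service.lookup("user-8f3a"))
```

The frontend only ever handles the token, while the mapping from the opaque ID to the actual user stays in your own backend database, so no sensitive user data has to be shared with the consent service.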
To summarize, when storing consent records we need to ensure that they are robustly linked to the data subjects and the processing activities that we carry out. As we have seen, this does not automatically imply the storage of these records in a backend infrastructure: If we collect short-lived data that is only tied to pseudonymous identifiers stored in client devices, like browsers, and if we don't have any means of identifying individuals in this data except using these same identifiers, it is more privacy-friendly to store the consent records directly in the client devices as well. If, however, we use more permanent identifiers in our data processing, we need to make sure that the consent records are also long-lived and robustly linked to these identifiers, which often makes it necessary to store them in a backend infrastructure. Klaro! supports both of these requirements with a consent record mechanism that ensures compliance and is designed with privacy as a default.