Deleting personal data

Deleting data in a database is often not easy.
And yet, this is what’s asked of us by the GDPR when the data is no longer needed or if a user requests it.

Since I regularly get objections about this obligation, let’s take a tour of our options.

Soft delete is not the real thing

Can we just continue with ‘soft delete’, that is, flagging a record as ‘deleted’?
I get the argument: since deleting a file on a hard drive really just dereferences it, why not do the same in the database?
Well, it’s not really the same.
The data on the hard drive may still be there, but you need specialized tools and skilled people to retrieve it.
On the other hand, a simple flag on a record means that anyone in your organization who has access to the database can still read the record.

Let me list the concerns with that practice:

  1. You need to closely manage who has access to what data in your DB. Which you do, right?
  2. The record is still there, meaning it’s a liability to your organization. You need to provide for its security.
  3. You’ve got data which is no longer “maintained,” and which will go stale and become obsolete.
  4. Your database loses its coherence and its data quality degrades.
  5. Should your data be leaked (which I don’t wish on you), this “deleted record” will be leaked too.
    And how are you going to tell this user, whom you are no longer allowed to contact, that their data has leaked?

Think about users’ rights

Now, how are you going to handle it when your user or customer exercises their right to erasure?
Are you going to say, “yes, of course, we will delete your record,” and just flag it? Then you’re simply lying.
Or are you going to state very clearly in your ToS that, due to some technicality, you will not erase a customer’s records should they ask for it?
That goes against the GDPR requirements.

That said, I see a valid case for “soft delete”:

This is when your legal ground for processing is legitimate interest,
and your user is either objecting to the processing or asking to restrict it.
In that case, you are permitted to keep the data, but you are no longer allowed to process it.
That means no one in your organization is allowed to access or read it. If it’s a profile page, it should not return any data.
By the way, you really want to check with a privacy lawyer that you can use this legal ground.
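
To make the “profile page should not return any data” point concrete, here is a minimal sketch in Python with SQLite. The table layout, the processing_restricted column and the get_profile helper are all my own assumptions for illustration, not a prescribed design. The idea is simply that the read path itself refuses to return a restricted record.

```python
import sqlite3

# Hypothetical schema: a flag marks records whose processing is restricted.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " email TEXT NOT NULL,"
    " processing_restricted INTEGER NOT NULL DEFAULT 0)"
)

def restrict_processing(conn, user_id):
    """Mark a record as restricted: it is kept, but must no longer be read."""
    with conn:
        conn.execute(
            "UPDATE users SET processing_restricted = 1 WHERE id = ?",
            (user_id,),
        )

def get_profile(conn, user_id):
    """Return profile data, or None if the user is unknown or restricted."""
    row = conn.execute(
        "SELECT name, email FROM users"
        " WHERE id = ? AND processing_restricted = 0",
        (user_id,),
    ).fetchone()
    return None if row is None else {"name": row[0], "email": row[1]}

with conn:
    conn.execute(
        "INSERT INTO users (id, name, email) VALUES (1, 'Ada', 'ada@example.com')"
    )
restrict_processing(conn, 1)
profile = get_profile(conn, 1)  # None: the record exists but is not served
```

Filtering in the query only covers the application. Anyone with direct database access can still read the row, so access rights on the table need the same attention.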

Make sure the data is deletable

But there’s more to this than making sure you can reliably delete a user’s record.
Not all personal data is “deletable”.
It depends on the legal ground for processing. You may need to retain the data for 10 years, due to a legal obligation, before it can be erased.
This is why mapping out your data, and the business processes tied to it, is so important.
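
One way to put such a data map to work is a small retention table that tells you whether a record may be erased yet. The sketch below is only an illustration: the category names, the retention periods and the choice of start date are assumptions of mine, and the real values have to come from your own legal analysis.

```python
from datetime import date

# Hypothetical data map: category -> retention period in years.
RETENTION_YEARS = {
    "invoice": 10,     # assumed: kept under a legal obligation
    "newsletter": 0,   # assumed: no mandatory retention, erasable on request
}

def erasable_from(category, retention_start):
    """First day on which a record of this category may be erased."""
    target_year = retention_start.year + RETENTION_YEARS[category]
    try:
        return retention_start.replace(year=target_year)
    except ValueError:
        # 29 February has no counterpart in a non-leap year: use 28 February.
        return retention_start.replace(year=target_year, day=28)

def can_erase(category, retention_start, today):
    """True once the retention period for this record has run out."""
    return today >= erasable_from(category, retention_start)

# An invoice whose retention started in 2020 cannot be erased in 2025,
# while a newsletter subscription from the same day can.
invoice_ok = can_erase("invoice", date(2020, 1, 15), date(2025, 1, 15))
newsletter_ok = can_erase("newsletter", date(2020, 1, 15), date(2025, 1, 15))
```

An erasure request then becomes a per-category decision: erase what can go now, and schedule the rest for the date the obligation ends.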

The nitty-gritty of erasure

On the technical details of data erasure in a database:

  1. You want to maintain a separate table of deleted record IDs so that, if the DB is restored from a backup, you can replay the erasure.
  2. No, you don’t have to go into your backups to erase the data. But you must have a retention policy for backups, such as 30 days.
    That way you can say that the right to erasure is fully effective within 30 days.
  3. For tables linked by a foreign key, you should permit nullable keys or rely on cascade delete.
    But in most cases, you’ll want a more robust workflow to delete your records.
    I highly recommend that you watch “Actually deleting data, not just pretending to” from DHH’s Basecamp.
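
Points 1 and 3 can be sketched together in Python with SQLite. Everything named here (the tables, erase_user, replay_erasures) is hypothetical, and I am assuming the erasure log lives in a separate database so that restoring the main one does not roll the log back too. Profiles use cascade delete, while orders keep their row and get a null user key.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
CREATE TABLE profiles (
    user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    bio TEXT
);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    user_id INTEGER REFERENCES users(id) ON DELETE SET NULL,
    total_cents INTEGER NOT NULL
);
"""

def open_main_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
    conn.executescript(SCHEMA)
    return conn

def seed(conn):
    with conn:
        conn.execute("INSERT INTO users (id, email) VALUES (1, 'ada@example.com')")
        conn.execute("INSERT INTO profiles (user_id, bio) VALUES (1, 'hello')")
        conn.execute("INSERT INTO orders (id, user_id, total_cents) VALUES (10, 1, 4200)")

def erase_user(conn, log, user_id):
    """Record the erasure first, then delete; FK actions handle linked rows."""
    with log:
        log.execute(
            "INSERT OR IGNORE INTO erased_users (user_id) VALUES (?)", (user_id,)
        )
    with conn:
        conn.execute("DELETE FROM users WHERE id = ?", (user_id,))

def replay_erasures(conn, log):
    """After a restore, delete again every user listed in the erasure log."""
    ids = [row[0] for row in log.execute("SELECT user_id FROM erased_users")]
    with conn:
        conn.executemany("DELETE FROM users WHERE id = ?", [(i,) for i in ids])
    return len(ids)

# The erasure log is kept apart from the main database (an assumption).
log = sqlite3.connect(":memory:")
log.execute("CREATE TABLE erased_users (user_id INTEGER PRIMARY KEY)")

live = open_main_db()
seed(live)
erase_user(live, log, 1)

# Stand-in for a restore from backup: user 1 is back until the log is replayed.
restored = open_main_db()
seed(restored)
replayed = replay_erasures(restored, log)
```

Writing to the log before deleting means a failed delete is still caught by the next replay. A real workflow would also have to cover data held outside the database, such as files, caches and third-party services.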

I hope this will give you enough knowledge to make an informed decision about this particular case.
