AWS Service Sector Industry Solutions has developed a feature that enables customers to efficiently locate and delete personal data upon request, helping them meet GDPR compliance requirements.
The GDPR requires organizations to obtain explicit consent before collecting personal data and gives individuals the right to erasure.
The company needed a scalable, cost-effective solution to handle GDPR erasure requests.
The application manages extensive profile data across various services, so the team had to overcome significant challenges in data storage, retrieval, and deletion while minimizing disruption to customers’ operations.
One of the primary design challenges was efficiently locating and purging profile data stored in Amazon S3, especially considering the terabytes of data involved.
For GDPR erasure of profile data in Amazon S3, the team built a custom solution written predominantly in Go, together with a Lambda function that uses the AWS SDK for pandas in Python.
Because Parquet is a columnar storage format, Athena can read only the columns a query needs instead of scanning entire rows, as it must with CSV files.
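To illustrate the column-projection point, here is a minimal sketch of a query builder that names only the columns it needs, so Athena can skip the rest of a Parquet file. The table and column names are hypothetical and not taken from the original solution. The `"$path"` pseudo-column is Athena's way of reporting which S3 object a row came from, and the `?` placeholder assumes the value is bound through Athena's parameterized queries at execution time.

```python
import re

# Conservative whitelist for SQL identifiers (an assumption of this sketch).
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_locate_query(table: str, id_column: str) -> str:
    """Return an Athena SQL statement listing the S3 objects that hold a profile.

    Only the ID column and the "$path" pseudo-column are referenced, so a
    columnar format such as Parquet lets Athena avoid reading other columns.
    """
    for name in (table, id_column):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return (
        'SELECT DISTINCT "$path" AS s3_object '
        f'FROM "{table}" '
        f'WHERE "{id_column}" = ?'
    )


if __name__ == "__main__":
    # Hypothetical table and column names, for illustration only.
    print(build_locate_query("profiles", "profile_id"))
```

The resulting object list could then drive the purge step; how the original solution actually locates objects is not described here.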
To implement a distributed mutex on DynamoDB, the team used a custom mutex client.
The mutex client uses a DynamoDB ConditionExpression to ensure that the record version number (RVN) has not changed from the value that was previously stored.
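The RVN check can be sketched as a compare-and-set on a lock record. The example below is an assumption-laden illustration, not the team's client: `InMemoryLockTable` is a hypothetical stand-in for a DynamoDB table so the code runs without AWS access, and its condition mirrors what a PutItem with a ConditionExpression along the lines of "item does not exist OR rvn equals the expected value" would enforce. Lease expiry and heartbeats are omitted.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional


class ConditionalCheckFailed(Exception):
    """Raised when the stored RVN differs from the one the caller expected."""


@dataclass
class InMemoryLockTable:
    """Hypothetical stand-in for a DynamoDB table: lock name -> current RVN."""

    items: Dict[str, str] = field(default_factory=dict)

    def put_if_rvn_matches(
        self, name: str, new_rvn: str, expected_rvn: Optional[str]
    ) -> None:
        # Succeeds only if the lock is absent or its RVN is unchanged.
        current = self.items.get(name)
        if current is not None and current != expected_rvn:
            raise ConditionalCheckFailed(name)
        self.items[name] = new_rvn

    def delete_if_rvn_matches(self, name: str, expected_rvn: str) -> None:
        # Only the holder of the current RVN may remove the lock.
        current = self.items.get(name)
        if current is None or current != expected_rvn:
            raise ConditionalCheckFailed(name)
        del self.items[name]


def try_acquire(
    table: InMemoryLockTable, name: str, observed_rvn: Optional[str] = None
) -> Optional[str]:
    """Try to take the lock; return the new RVN on success, None otherwise.

    Pass observed_rvn to take over a lock whose RVN was read earlier and has
    not changed since (for example, after its lease is considered expired).
    """
    new_rvn = str(uuid.uuid4())
    try:
        table.put_if_rvn_matches(name, new_rvn, observed_rvn)
    except ConditionalCheckFailed:
        return None
    return new_rvn


def release(table: InMemoryLockTable, name: str, rvn: str) -> bool:
    """Release the lock if the caller still holds the current RVN."""
    try:
        table.delete_if_rvn_matches(name, rvn)
    except ConditionalCheckFailed:
        return False
    return True
```

A second caller that invokes `try_acquire` while the lock is held gets `None`, and a caller whose RVN has been superseded cannot release the lock. With a real table, the two conditional methods would be replaced by conditional writes against DynamoDB.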
This solution can be adapted for other use cases requiring secure, distributed locking mechanisms or efficient data management across large datasets.