GDPR compliance workflows for headless content teams
A GDPR erasure request in a headless stack must propagate across the CMS database, CDN edge caches, static build artifacts, and downstream services — and any one of them can keep serving PII after the others have deleted it. This page builds an idempotent, auditable pipeline for deletion (Article 17, Right to Erasure), export (Article 20), and consent updates that won’t leave orphaned references or stale hydration states.
Why erasure leaks PII
Three default behaviors break deterministic deletion:
- Soft deletes. Most platforms soft-delete to preserve editorial history, leaving deleted or status: archived flags that still surface in GraphQL/REST responses.
- Edge caching. CDNs cache JSON by Cache-Control and stale-while-revalidate, so pre-rendered pages serve PII long after CMS-side deletion.
- No transactional cascades. Federated schemas rarely enforce cascades across content types, so removing a user leaves dangling references in author, commenter, or billing_contact that break hydration.
PII embedded directly in rich text or custom JSON blocks bypasses schema validation entirely, making automated extraction and redaction expensive and error-prone — which is why Headless CMS Architecture & Platform Selection matters for regulated environments.
Resolution
- Isolate PII at the schema level. Move PII into a dedicated UserPII or ConsentRecord type referenced relationally, with strict typing and field-level access controls; never store it as inline text.
- Orchestrate from webhooks. Subscribe to entry.delete, entry.update, and asset.delete, routing events to a stateful service that checks an idempotency key before acting.
- Cascade, then hard-delete. Two phases: mark the record pending_erasure and nullify relational fields via GraphQL, then hard-delete once all dependents resolve.
- Invalidate by tag. Purge edge caches with tag-based invalidation (purge-by-tag: pii-user-{id}) and regenerate only affected routes.
- Log to tamper-evident storage. Record webhook receipt, mutation, purge, and build completion with retention aligned to legal requirements, per Enterprise CMS Governance & Compliance.
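The two-phase rule in the list above can be sketched as a small state check. This is a minimal illustration only: ErasureRecord, openReferences, markPendingErasure, resolveReference, and canHardDelete are hypothetical names, not part of any CMS SDK.

```typescript
// Hypothetical shapes for illustration; not part of any CMS SDK.
type ErasureState = 'active' | 'pending_erasure' | 'erased';

interface ErasureRecord {
  id: string;
  state: ErasureState;
  // IDs of entries that still hold a relation to this record
  openReferences: string[];
}

// Phase 1: flag the record so the pipeline knows erasure has started.
function markPendingErasure(record: ErasureRecord): ErasureRecord {
  return { ...record, state: 'pending_erasure' };
}

// Called once per dependent after its relation has been nullified.
function resolveReference(record: ErasureRecord, dependentId: string): ErasureRecord {
  return {
    ...record,
    openReferences: record.openReferences.filter((id) => id !== dependentId)
  };
}

// Phase 2 gate: hard-delete only when flagged and no dependents remain.
function canHardDelete(record: ErasureRecord): boolean {
  return record.state === 'pending_erasure' && record.openReferences.length === 0;
}
```

Keeping the gate as a pure function makes it easy to unit-test separately from the CMS calls that perform the actual nullification and deletion.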
The erasure request fans out across every layer that can retain PII, gated by an idempotency lock and an append-only audit log:
sequenceDiagram
participant CMS as CMS webhook
participant Svc as Orchestrator
participant Redis as Redis lock
participant GQL as GraphQL API
participant CDN as CDN + ISR
participant Log as Audit log
CMS->>Svc: entry.delete (signed)
Svc->>Redis: SET lock NX (24h)
Redis-->>Svc: acquired
Svc->>GQL: cascade nullify relations
Svc->>Log: relations_nullified
Svc->>CDN: purge-by-tag + revalidate routes
Svc->>Log: cache_purged
Svc->>GQL: hard delete entry
Svc->>Log: hard_delete_executed
Svc-->>CMS: 202 Accepted
Implementation
The orchestration service below assumes Node.js 18 or later (for the global Request and Response classes), Redis via ioredis for idempotency tracking, and fetch from undici.
1. Webhook ingestion and idempotency guard
Webhooks retry on network timeouts; an idempotency key stops duplicate cascades.
// src/gdpr/webhook-handler.ts
import { createHmac, timingSafeEqual } from 'node:crypto';
import { Redis } from 'ioredis';
import { fetch } from 'undici';
const redis = new Redis(process.env.REDIS_URL!);
const CMS_WEBHOOK_SECRET = process.env.CMS_WEBHOOK_SECRET!;
export async function handleCMSWebhook(req: Request): Promise<Response> {
  const signature = req.headers.get('x-cms-signature');
  const payload = await req.text();
  // Verify webhook authenticity against the raw body
  const hmac = createHmac('sha256', CMS_WEBHOOK_SECRET).update(payload);
  const expected = `sha256=${hmac.digest('hex')}`;
  const sigBuf = Buffer.from(signature || '');
  const expectedBuf = Buffer.from(expected);
  // timingSafeEqual throws on length mismatch, so guard length first
  if (!signature || sigBuf.length !== expectedBuf.length || !timingSafeEqual(sigBuf, expectedBuf)) {
    return new Response('Invalid signature', { status: 401 });
  }
  const { type, entryId, idempotencyKey } = JSON.parse(payload);
  // Only deletions start erasure; an entry.update must never trigger a hard delete
  if (type !== 'entry.delete') {
    return new Response('Ignored', { status: 200 });
  }
  // Without this guard every keyless event would share the lock gdpr:lock:undefined
  if (!entryId || !idempotencyKey) {
    return new Response('Missing entryId or idempotencyKey', { status: 400 });
  }
  // Idempotency lock (24h TTL): SET ... NX returns 'OK' only for the first caller
  const acquired = await redis.set(`gdpr:lock:${idempotencyKey}`, '1', 'EX', 86400, 'NX');
  if (!acquired) return new Response('Already processed', { status: 200 });
  try {
    await executeErasurePipeline(entryId, idempotencyKey);
    return new Response('Accepted', { status: 202 });
  } catch (err) {
    await redis.del(`gdpr:lock:${idempotencyKey}`); // Release lock for retry
    console.error('Pipeline failed:', err);
    return new Response('Processing error', { status: 500 });
  }
}
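To exercise the handler locally, a test payload needs a matching signature. The following is a minimal sketch, assuming the sha256=<hex> header format used in the handler; signPayload and verifySignature are hypothetical helper names, and the secret and payload values are placeholders.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Produce the header value the handler expects: "sha256=" + hex HMAC of the raw body.
function signPayload(secret: string, rawBody: string): string {
  return `sha256=${createHmac('sha256', secret).update(rawBody).digest('hex')}`;
}

// Same comparison as the handler: length guard, then constant-time compare.
function verifySignature(secret: string, rawBody: string, header: string | null): boolean {
  if (!header) return false;
  const given = Buffer.from(header);
  const expected = Buffer.from(signPayload(secret, rawBody));
  return given.length === expected.length && timingSafeEqual(given, expected);
}

// Placeholder event; field names follow the handler's destructuring.
const body = JSON.stringify({ type: 'entry.delete', entryId: 'user-123', idempotencyKey: 'req-abc' });
const header = signPayload('test-secret', body);

console.log(verifySignature('test-secret', body, header)); // true
console.log(verifySignature('test-secret', body + ' ', header)); // false: body changed after signing
```

The signature covers the raw request body, so the handler must verify before parsing; re-serializing parsed JSON can change whitespace or key order and invalidate an otherwise genuine signature.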
2. Cascade nullification
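Sanitize relational fields before hard deletion. The mutation returns a count of affected records per content type, not their IDs, so downstream invalidation relies on the user-scoped cache tags rather than on a list of nodes.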
async function nullifyRelations(userId: string) {
  // Disconnect the user from each content type that references them
  const mutation = `
    mutation CascadeNullify($userId: ID!) {
      updateArticles(where: { author: { id: { eq: $userId } } }, data: { author: { disconnect: true } }) { count }
      updateComments(where: { commenter: { id: { eq: $userId } } }, data: { commenter: { disconnect: true } }) { count }
    }
  `;
  const response = await fetch(process.env.CMS_GRAPHQL_ENDPOINT!, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${process.env.CMS_ADMIN_TOKEN}`
    },
    body: JSON.stringify({ query: mutation, variables: { userId } })
  });
  // GraphQL servers commonly answer 200 with an errors array, so check both layers
  if (!response.ok) throw new Error(`GraphQL cascade failed: HTTP ${response.status}`);
  const { data, errors } = (await response.json()) as { data?: unknown; errors?: unknown[] };
  if (errors?.length) throw new Error(`GraphQL cascade failed: ${JSON.stringify(errors)}`);
  return data;
}
3. CDN tag invalidation and ISR trigger
Use tag-based purges, not blanket ones, to preserve cache hit ratios.
async function invalidateEdgeCache(userId: string) {
  // Generic tag-purge payload: endpoint, auth, and field names differ per CDN, so adapt to your provider
  const purgePayload = {
    tags: [`pii-user-${userId}`, `profile-route-${userId}`],
    soft_purge: false // evict immediately rather than marking objects stale
  };
  const res = await fetch(`${process.env.CDN_API_URL}/purge`, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.CDN_API_TOKEN}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify(purgePayload)
  });
  if (!res.ok) throw new Error(`CDN purge failed: ${res.status}`);
  // Trigger ISR for affected routes only
  const revalidateRes = await fetch(`${process.env.NEXT_PUBLIC_SITE_URL}/api/revalidate`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ paths: [`/users/${userId}`, `/profile/${userId}`] })
  });
  // A failed revalidation leaves PII in pre-rendered output, so fail the pipeline
  if (!revalidateRes.ok) throw new Error(`Revalidation failed: ${revalidateRes.status}`);
}
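A tag purge only evicts responses that were tagged when they were cached. The sketch below derives those tags at render time, reusing the tag names from the purge payload. Surrogate-Key (space-separated, Fastly) and Cache-Tag (comma-separated, Cloudflare) are the providers' header names; piiCacheTags and tagHeaders are hypothetical helpers.

```typescript
type CdnFlavor = 'surrogate-key' | 'cache-tag';

// Must produce exactly the tag names that the purge call sends.
function piiCacheTags(userId: string): string[] {
  return [`pii-user-${userId}`, `profile-route-${userId}`];
}

// Build the response header that attaches tags to a cached object.
function tagHeaders(userIds: string[], flavor: CdnFlavor): Record<string, string> {
  const tags: string[] = [];
  for (const id of userIds) {
    for (const tag of piiCacheTags(id)) {
      // De-duplicate: a page can reference the same user more than once
      if (tags.indexOf(tag) === -1) tags.push(tag);
    }
  }
  return flavor === 'surrogate-key'
    ? { 'Surrogate-Key': tags.join(' ') }
    : { 'Cache-Tag': tags.join(',') };
}
```

Any page that renders a user's data, such as an article listing its author, would need that user's tags as well; otherwise the purge misses it.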
4. Audit trail
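Write each step to an append-only log, with a keyed hash (HMAC) computed over the whole entry for tamper evidence. Hashing only the step name and status would give every entry of the same kind an identical hash and detect nothing.
async function logAuditStep(idempotencyKey: string, step: string, status: 'success' | 'failed', details: Record<string, unknown>) {
  const record = {
    timestamp: new Date().toISOString(),
    idempotencyKey,
    step,
    status,
    details
  };
  // Sign the serialized record so a later change to any field invalidates the hash
  const hash = createHmac('sha256', process.env.AUDIT_SECRET!).update(JSON.stringify(record)).digest('hex');
  const res = await fetch(`${process.env.AUDIT_SERVICE_URL}/entries`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', 'Authorization': `Bearer ${process.env.AUDIT_TOKEN}` },
    body: JSON.stringify({ ...record, hash })
  });
  // An unrecorded step is a compliance gap, so surface the failure
  if (!res.ok) throw new Error(`Audit write failed: ${res.status}`);
}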
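A hash computed for each entry in isolation cannot reveal that an entry was removed or reordered. One common way to cover that gap is to chain each entry to the hash of its predecessor. The sketch below assumes an in-memory array in place of the audit service; AuditRecord, chainEntry, and verifyChain are hypothetical names.

```typescript
import { createHmac } from 'node:crypto';

interface AuditRecord {
  idempotencyKey: string;
  step: string;
  status: 'success' | 'failed';
}

interface ChainedRecord extends AuditRecord {
  prevHash: string;
  hash: string;
}

// Starting value for the first entry in a chain.
const GENESIS = '0'.repeat(64);

function hashRecord(secret: string, record: AuditRecord, prevHash: string): string {
  // Fixed field order keeps the serialization identical at write and verify time
  const material = JSON.stringify([prevHash, record.idempotencyKey, record.step, record.status]);
  return createHmac('sha256', secret).update(material).digest('hex');
}

// Returns a new log with the record appended and linked to its predecessor.
function chainEntry(secret: string, log: ChainedRecord[], record: AuditRecord): ChainedRecord[] {
  const prevHash = log.length ? log[log.length - 1].hash : GENESIS;
  return [...log, { ...record, prevHash, hash: hashRecord(secret, record, prevHash) }];
}

// Recomputes every link; an edit, removal, or reorder breaks the chain.
function verifyChain(secret: string, log: ChainedRecord[]): boolean {
  let prevHash = GENESIS;
  for (const entry of log) {
    if (entry.prevHash !== prevHash) return false;
    if (entry.hash !== hashRecord(secret, entry, prevHash)) return false;
    prevHash = entry.hash;
  }
  return true;
}
```

Truncating the tail of the log still verifies, so the latest hash would need to be anchored somewhere the log writer cannot modify.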
5. Pipeline orchestration
Wire the steps together with explicit error boundaries.
async function executeErasurePipeline(userId: string, idempotencyKey: string) {
  try {
    await logAuditStep(idempotencyKey, 'pipeline_start', 'success', { userId });
    await nullifyRelations(userId);
    await logAuditStep(idempotencyKey, 'relations_nullified', 'success', { userId });
    await invalidateEdgeCache(userId);
    await logAuditStep(idempotencyKey, 'cache_purged', 'success', { userId });
    // Final hard delete via CMS REST/GraphQL API
    const res = await fetch(`${process.env.CMS_API_URL}/entries/${userId}`, {
      method: 'DELETE',
      headers: { 'Authorization': `Bearer ${process.env.CMS_ADMIN_TOKEN}` }
    });
    // fetch resolves on HTTP errors, so check before logging success
    if (!res.ok) throw new Error(`Hard delete failed: ${res.status}`);
    await logAuditStep(idempotencyKey, 'hard_delete_executed', 'success', { userId });
  } catch (error) {
    // Do not let a failed audit write mask the original error
    await logAuditStep(idempotencyKey, 'pipeline_failure', 'failed', { error: (error as Error).message })
      .catch((auditErr) => console.error('Audit write failed:', auditErr));
    throw error;
  }
}
Validation and debugging
Idempotency and race conditions
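Fire concurrent webhook simulations with ab or k6 and watch Redis keys. Prefer --scan over KEYS, which blocks the server while it walks the whole keyspace:
redis-cli --scan --pattern "gdpr:lock:*"
Each idempotency key should produce exactly one lock and one pipeline_start audit entry. If the same key starts the pipeline more than once, the NX flag is not taking effect in your client. Pass an AbortController signal to long-running mutations so they can be cancelled on timeout or shutdown instead of lingering as zombie processes during deploy rollouts.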
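The AbortController advice can be sketched as a deadline wrapper. This assumes the wrapped task accepts an AbortSignal, as fetch does; withDeadline is a hypothetical helper name and the URL in the usage comment is a placeholder.

```typescript
// Run a task with a deadline; the signal fires when time runs out.
async function withDeadline<T>(
  ms: number,
  task: (signal: AbortSignal) => Promise<T>
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  try {
    return await task(controller.signal);
  } finally {
    // Always clear the timer so it cannot keep the process alive
    clearTimeout(timer);
  }
}

// Usage (placeholder URL): fetch rejects once the signal aborts.
// await withDeadline(10000, (signal) =>
//   fetch('https://cms.example.com/graphql', { method: 'POST', signal }));
```

The same controller pattern extends to shutdown handling: calling abort() from a SIGTERM handler cancels in-flight requests before the process exits.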
Stale cache states
When PII persists after deletion, check edge cache headers — x-cache and surrogate-key:
curl -I https://your-domain.com/users/123
If Cache-Control still carries max-age values above your compliance SLA, fix the CMS response headers or strip caching directives for PII routes in middleware. See MDN Cache-Control for precedence rules.
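The header check can be automated. The sketch below parses the freshness directives from a Cache-Control value and compares the resulting window with a compliance limit. The limit is whatever your SLA specifies; directiveSeconds and exceedsSla are hypothetical names.

```typescript
// Extract a numeric directive such as max-age=600 from a Cache-Control value.
function directiveSeconds(cacheControl: string, name: string): number | null {
  for (const part of cacheControl.split(',')) {
    const [key, value] = part.trim().toLowerCase().split('=');
    if (key === name && value !== undefined && /^\d+$/.test(value)) {
      return Number(value);
    }
  }
  return null;
}

// True when a shared cache may keep serving the response past the SLA window.
function exceedsSla(cacheControl: string, slaSeconds: number): boolean {
  if (/\bno-store\b/i.test(cacheControl)) return false;
  // Shared caches use s-maxage in preference to max-age when both are present
  const ttl = directiveSeconds(cacheControl, 's-maxage') ?? directiveSeconds(cacheControl, 'max-age');
  if (ttl === null) return false;
  // stale-while-revalidate extends how long a stale copy can still be served
  const swr = directiveSeconds(cacheControl, 'stale-while-revalidate') ?? 0;
  return ttl + swr > slaSeconds;
}
```

Running this against the headers of each PII route in CI is one way to catch a regressed caching policy before it reaches production.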
Rich-text PII leakage
For PII in rich text (@mentions, email strings), add a pre-save hook that scans AST nodes — regex or a lightweight NLP scanner — and redacts before the write hits the database. Send flagged entries to a quarantine queue for manual review.
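A minimal sketch of such a scan follows. It assumes a simplified rich-text AST in which text lives in a text field and child nodes nest under children; real editors use their own node shapes. The email pattern is deliberately loose, and RichTextNode and redactPII are hypothetical names.

```typescript
interface RichTextNode {
  type: string;
  text?: string;
  children?: RichTextNode[];
}

// Loose email matcher; tune it for your data before relying on it.
const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Returns a redacted copy plus a flag so flagged entries can be quarantined.
function redactPII(root: RichTextNode): { node: RichTextNode; flagged: boolean } {
  let flagged = false;
  const walk = (n: RichTextNode): RichTextNode => {
    const copy: RichTextNode = { ...n };
    if (typeof n.text === 'string') {
      copy.text = n.text.replace(EMAIL, () => {
        flagged = true;
        return '[redacted]';
      });
    }
    // Mention nodes identify a person by definition, so redact them whole
    if (n.type === 'mention') {
      flagged = true;
      copy.text = '[redacted]';
    }
    if (n.children) copy.children = n.children.map(walk);
    return copy;
  };
  return { node: walk(root), flagged };
}
```

The function copies rather than mutates, so the original entry can be sent to the quarantine queue unchanged while the redacted copy proceeds to the write.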
Data portability export
For Article 20, invert the deletion pipeline: query relational references, serialize to JSON-LD, checksum, and deliver via signed, time-limited URLs. Exclude third-party tracking identifiers and respect consent scopes.
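A minimal sketch of the checksum and signed-URL steps follows. It assumes a schema.org Person shape for the JSON-LD and a hypothetical download endpoint that validates expires and sig query parameters. Object storage services provide their own pre-signed URL mechanisms, which would normally replace signExportUrl and verifyExportUrl (both hypothetical names).

```typescript
import { createHash, createHmac, timingSafeEqual } from 'node:crypto';

interface ExportBundle {
  body: string;
  sha256: string;
}

// Serialize the subject's data as JSON-LD and checksum the exact bytes delivered.
function buildExport(user: { id: string; name: string; email: string }): ExportBundle {
  const body = JSON.stringify({
    '@context': 'https://schema.org',
    '@type': 'Person',
    identifier: user.id,
    name: user.name,
    email: user.email
  });
  return { body, sha256: createHash('sha256').update(body).digest('hex') };
}

// Sign path + expiry (epoch seconds) so the link cannot be extended or retargeted.
function signExportUrl(secret: string, baseUrl: string, path: string, expiresAt: number): string {
  const sig = createHmac('sha256', secret).update(`${path}:${expiresAt}`).digest('hex');
  return `${baseUrl}${path}?expires=${expiresAt}&sig=${sig}`;
}

// Server-side check: reject expired links, then compare signatures in constant time.
function verifyExportUrl(secret: string, url: string, nowSeconds: number): boolean {
  const parsed = new URL(url);
  const expires = Number(parsed.searchParams.get('expires'));
  const sig = parsed.searchParams.get('sig') ?? '';
  if (!Number.isFinite(expires) || nowSeconds > expires) return false;
  const expected = createHmac('sha256', secret).update(`${parsed.pathname}:${expires}`).digest('hex');
  const given = Buffer.from(sig);
  const wanted = Buffer.from(expected);
  return given.length === wanted.length && timingSafeEqual(given, wanted);
}
```

Delivering the checksum through a separate channel from the link lets the data subject confirm the download was not altered in transit.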
Schema isolation, webhook idempotency, cascade nullification, and tag-based invalidation turn GDPR compliance from a reactive liability into an auditable workflow.