GDPR compliance workflows for headless content teams

A GDPR erasure request in a headless stack must propagate across the CMS database, CDN edge caches, static build artifacts, and downstream services — and any one of them can keep serving PII after the others have deleted it. This page builds an idempotent, auditable pipeline for deletion (Article 17, Right to Erasure), export (Article 20), and consent updates that won’t leave orphaned references or stale hydration states.

Why erasure leaks PII

Three default behaviors break deterministic deletion:

  • Soft deletes. Many platforms soft-delete to preserve editorial history, so records flagged deleted or status: archived can still surface in GraphQL/REST responses.
  • Edge caching. CDNs cache JSON by Cache-Control and stale-while-revalidate, so pre-rendered pages serve PII long after CMS-side deletion.
  • No transactional cascades. Federated schemas rarely enforce cascades across content types, so removing a user leaves dangling references in author, commenter, or billing_contact that break hydration.

PII embedded directly in rich text or custom JSON blocks bypasses schema validation entirely, making automated extraction and redaction expensive and error-prone — which is why Headless CMS Architecture & Platform Selection matters for regulated environments.

Resolution

  1. Isolate PII at the schema level. Move PII into a dedicated UserPII or ConsentRecord type referenced relationally, with strict typing and field-level access controls — never inline text.
  2. Orchestrate from webhooks. Subscribe to entry.delete, entry.update, and asset.delete, routing events to a stateful service that checks an idempotency key before acting.
  3. Cascade, then hard-delete. Two phases: mark the record pending_erasure and nullify relational fields via GraphQL, then hard-delete once all dependents resolve.
  4. Invalidate by tag. Purge edge caches with tag-based invalidation (purge-by-tag: pii-user-{id}) and regenerate only affected routes.
  5. Log to tamper-evident storage. Record webhook receipt, mutation, purge, and build completion with retention aligned to legal requirements, per Enterprise CMS Governance & Compliance.
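
The two-phase rule in step 3 can be sketched as a small state model. This is only an illustration: the ErasureRecord shape, the unresolvedDependents counter, and the function names are hypothetical and not part of any CMS API.

```typescript
// Hypothetical state model for two-phase erasure; all names are illustrative.
type ErasureState = 'active' | 'pending_erasure' | 'erased';

interface ErasureRecord {
  userId: string;
  state: ErasureState;
  // Relational references (author, commenter, ...) that are not yet nullified.
  unresolvedDependents: number;
}

// Phase 1: flag the record. Calling this twice has no further effect.
function markPending(record: ErasureRecord): ErasureRecord {
  if (record.state === 'erased') return record; // nothing left to do
  return { ...record, state: 'pending_erasure' };
}

// Phase 2 gate: hard deletion is allowed only once every dependent is resolved.
function canHardDelete(record: ErasureRecord): boolean {
  return record.state === 'pending_erasure' && record.unresolvedDependents === 0;
}
```

Keeping the gate as a pure function makes it easy to unit-test separately from the network calls that perform the actual mutations.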

The erasure request fans out across every layer that can retain PII, gated by an idempotency lock and an append-only audit log:

sequenceDiagram
    participant CMS as CMS webhook
    participant Svc as Orchestrator
    participant Redis as Redis lock
    participant GQL as GraphQL API
    participant CDN as CDN + ISR
    participant Log as Audit log
    CMS->>Svc: entry.delete (signed)
    Svc->>Redis: SET lock NX (24h)
    Redis-->>Svc: acquired
    Svc->>GQL: cascade nullify relations
    Svc->>Log: relations_nullified
    Svc->>CDN: purge-by-tag + revalidate routes
    Svc->>Log: cache_purged
    Svc->>GQL: hard delete entry
    Svc->>Log: hard_delete_executed
    Svc-->>CMS: 202 Accepted

Implementation

The orchestration service below assumes Node.js, Redis (through ioredis) for idempotency tracking, and fetch imported from undici.

1. Webhook ingestion and idempotency guard

Webhooks retry on network timeouts; an idempotency key stops duplicate cascades.

TypeScript
// src/gdpr/webhook-handler.ts
import { createHmac, timingSafeEqual } from 'node:crypto';
import { Redis } from 'ioredis';
import { fetch } from 'undici';

const redis = new Redis(process.env.REDIS_URL!);
const CMS_WEBHOOK_SECRET = process.env.CMS_WEBHOOK_SECRET!;

export async function handleCMSWebhook(req: Request): Promise<Response> {
  const signature = req.headers.get('x-cms-signature');
  const payload = await req.text();

  // Verify webhook authenticity
  const hmac = createHmac('sha256', CMS_WEBHOOK_SECRET).update(payload);
  const expected = `sha256=${hmac.digest('hex')}`;
  const sigBuf = Buffer.from(signature || '');
  const expectedBuf = Buffer.from(expected);
  // timingSafeEqual throws on length mismatch, so guard length first
  if (!signature || sigBuf.length !== expectedBuf.length || !timingSafeEqual(sigBuf, expectedBuf)) {
    return new Response('Invalid signature', { status: 401 });
  }

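  let event: { type?: string; entryId?: string; idempotencyKey?: string };
  try {
    event = JSON.parse(payload);
  } catch {
    return new Response('Malformed JSON', { status: 400 });
  }
  const { type, entryId, idempotencyKey } = event;
  if (type !== 'entry.delete' && type !== 'entry.update') {
    return new Response('Ignored', { status: 200 });
  }
  // Without these, every event would share the key "gdpr:lock:undefined"
  if (!entryId || !idempotencyKey) {
    return new Response('Missing entryId or idempotencyKey', { status: 400 });
  }

  // Idempotency check (24h TTL): SET ... NX returns 'OK' only for the first caller
  const acquired = await redis.set(`gdpr:lock:${idempotencyKey}`, '1', 'EX', 86400, 'NX');
  if (!acquired) return new Response('Already processed', { status: 200 });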

  try {
    await executeErasurePipeline(entryId, idempotencyKey);
    return new Response('Accepted', { status: 202 });
  } catch (err) {
    await redis.del(`gdpr:lock:${idempotencyKey}`); // Release lock for retry
    console.error('Pipeline failed:', err);
    return new Response('Processing error', { status: 500 });
  }
}
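
To exercise the handler locally you need a correctly signed request. The helper below is a sketch that reproduces the sha256=<hex> format the handler compares against; signPayload, the sample payload, and the test secret are illustrative, and your CMS may use a different signing scheme.

```typescript
import { createHmac } from 'node:crypto';

// Produces a header value in the same `sha256=<hex>` format the handler expects.
// Assumption: the CMS signs the raw request body with a shared secret.
function signPayload(payload: string, secret: string): string {
  return `sha256=${createHmac('sha256', secret).update(payload).digest('hex')}`;
}

// Example event; the field names match what the handler destructures.
const samplePayload = JSON.stringify({
  type: 'entry.delete',
  entryId: 'user_123',
  idempotencyKey: 'req-0001',
});
const sampleSignature = signPayload(samplePayload, 'test-secret');
```

Send sampleSignature as the x-cms-signature header and samplePayload as the body, with CMS_WEBHOOK_SECRET set to the same secret. Replaying the identical request should return "Already processed" on the second attempt.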

2. Cascade nullification

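Sanitize relational fields before hard deletion. The mutation below returns only a count of affected records per type; if your schema exposes the affected node IDs, request those instead so downstream invalidation can target individual routes. The filter and disconnect syntax is illustrative and varies by CMS.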

TypeScript
async function nullifyRelations(userId: string) {
  const mutation = `
    mutation CascadeNullify($userId: ID!) {
      updateArticles(where: { author: { id: { eq: $userId } } }, data: { author: { disconnect: true } }) { count }
      updateComments(where: { commenter: { id: { eq: $userId } } }, data: { commenter: { disconnect: true } }) { count }
    }
  `;

  const response = await fetch(process.env.CMS_GRAPHQL_ENDPOINT!, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${process.env.CMS_ADMIN_TOKEN}`
    },
    body: JSON.stringify({ query: mutation, variables: { userId } })
  });

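  if (!response.ok) throw new Error(`GraphQL request failed: ${response.status}`);
  // undici types json() as unknown, so narrow it before destructuring
  const { data, errors } = (await response.json()) as { data?: unknown; errors?: unknown[] };
  if (errors?.length) throw new Error(`GraphQL cascade failed: ${JSON.stringify(errors)}`);
  return data;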
}

3. CDN tag invalidation and ISR trigger

Use tag-based purges, not blanket ones, to preserve cache hit ratios.

TypeScript
async function invalidateEdgeCache(userId: string) {
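  // Generic tag-purge payload. Fastly, Cloudflare, and Vercel each use their own
  // endpoint and field names, so adapt the URL and body to your provider.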
  const purgePayload = {
    tags: [`pii-user-${userId}`, `profile-route-${userId}`],
    soft_purge: false
  };

  const res = await fetch(`${process.env.CDN_API_URL}/purge`, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.CDN_API_TOKEN}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify(purgePayload)
  });

  if (!res.ok) throw new Error(`CDN purge failed: ${res.status}`);

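  // Trigger ISR for affected routes only. The /api/revalidate route is app-specific
  // and should require its own secret so it cannot be called anonymously.
  const revalidateRes = await fetch(`${process.env.NEXT_PUBLIC_SITE_URL}/api/revalidate`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ paths: [`/users/${userId}`, `/profile/${userId}`] })
  });
  if (!revalidateRes.ok) throw new Error(`Revalidation failed: ${revalidateRes.status}`);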
}
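
A tag purge only reaches responses that were tagged when they were cached. The sketch below builds the tagging headers for a response that embeds data from one or more users, reusing the pii-user-{id} convention from the purge call. piiCacheTagHeaders is a hypothetical helper; Surrogate-Key (Fastly, space-separated) and Cache-Tag (Cloudflare, comma-separated) are those providers' header names, so check which one your CDN reads.

```typescript
// Builds cache-tag headers for a response containing data from the given users.
function piiCacheTagHeaders(userIds: string[]): Record<string, string> {
  // De-duplicate so a user referenced twice on a page yields a single tag.
  const tags = [...new Set(userIds)].map((id) => `pii-user-${id}`);
  return {
    'Surrogate-Key': tags.join(' '), // Fastly: space-separated keys
    'Cache-Tag': tags.join(','),     // Cloudflare: comma-separated tags
  };
}
```

Attach these headers at the origin or in the rendering layer for every page and API response that includes user data, including pages where the user appears only as an author or commenter.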

4. Audit trail

Append each step to an append-only log with a structured metadata hash for tamper evidence.

TypeScript
async function logAuditStep(idempotencyKey: string, step: string, status: 'success' | 'failed', details: Record<string, unknown>) {
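  const fields = { timestamp: new Date().toISOString(), idempotencyKey, step, status, details };
  // Sign every field, not just step and status, so any later edit to the entry is detectable
  const hash = createHmac('sha256', process.env.AUDIT_SECRET!).update(JSON.stringify(fields)).digest('hex');
  const entry = { ...fields, hash };

  const res = await fetch(`${process.env.AUDIT_SERVICE_URL}/entries`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', 'Authorization': `Bearer ${process.env.AUDIT_TOKEN}` },
    body: JSON.stringify(entry)
  });
  // A lost audit record is a compliance gap, so surface write failures
  if (!res.ok) throw new Error(`Audit write failed: ${res.status}`);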
}
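
A per-entry hash detects edits to a single record but not deleted or reordered records. One common way to cover that is to chain each entry to the hash of the one before it. The following is a minimal sketch of that idea, not the audit service's actual format: the AuditEntry shape, sealEntry, and verifyChain are all hypothetical.

```typescript
import { createHmac } from 'node:crypto';

interface AuditEntry {
  timestamp: string;
  idempotencyKey: string;
  step: string;
  status: 'success' | 'failed';
  prevHash: string; // hash of the previous entry, '' for the first entry
  hash: string;
}

type UnsignedEntry = Omit<AuditEntry, 'prevHash' | 'hash'>;

// Computes the HMAC over the entry's fields plus the previous entry's hash.
function sealEntry(fields: UnsignedEntry, prevHash: string, secret: string): AuditEntry {
  const material = JSON.stringify([
    fields.timestamp, fields.idempotencyKey, fields.step, fields.status, prevHash,
  ]);
  const hash = createHmac('sha256', secret).update(material).digest('hex');
  return { ...fields, prevHash, hash };
}

// Recomputes every hash in order; any edit, deletion, or reordering breaks the chain.
function verifyChain(entries: AuditEntry[], secret: string): boolean {
  let expectedPrev = '';
  for (const entry of entries) {
    const { prevHash, hash, ...fields } = entry;
    if (prevHash !== expectedPrev) return false;
    if (sealEntry(fields, prevHash, secret).hash !== hash) return false;
    expectedPrev = hash;
  }
  return true;
}
```

Run verifyChain periodically, or before producing evidence for a regulator, over the full ordered log for an idempotency key.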

5. Pipeline orchestration

Wire the steps together with explicit error boundaries.

TypeScript
async function executeErasurePipeline(userId: string, idempotencyKey: string) {
  try {
    await logAuditStep(idempotencyKey, 'pipeline_start', 'success', { userId });
    
    await nullifyRelations(userId);
    await logAuditStep(idempotencyKey, 'relations_nullified', 'success', { userId });

    await invalidateEdgeCache(userId);
    await logAuditStep(idempotencyKey, 'cache_purged', 'success', { userId });

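    // Final hard delete via CMS REST/GraphQL API
    const deleteRes = await fetch(`${process.env.CMS_API_URL}/entries/${userId}`, {
      method: 'DELETE',
      headers: { 'Authorization': `Bearer ${process.env.CMS_ADMIN_TOKEN}` }
    });
    // fetch resolves on HTTP errors, so check the status before logging success
    if (!deleteRes.ok) throw new Error(`Hard delete failed: ${deleteRes.status}`);
    await logAuditStep(idempotencyKey, 'hard_delete_executed', 'success', { userId });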

  } catch (error) {
    await logAuditStep(idempotencyKey, 'pipeline_failure', 'failed', { error: (error as Error).message });
    throw error;
  }
}

Validation and debugging

Idempotency and race conditions

Fire concurrent webhook simulations with ab or k6 and watch Redis keys:

Bash
# --scan iterates incrementally; KEYS blocks the server while it walks the keyspace
redis-cli --scan --pattern "gdpr:lock:*"

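Redis keys are unique, so a broken guard never shows up as duplicate keys. Look instead for more than one pipeline_start audit entry with the same idempotency key; if you find any, confirm that your client is actually sending the NX flag. Pass an AbortSignal from an AbortController to long-running requests so deploy rollouts don’t strand zombie processes.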
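
One way to apply the AbortController advice is a small wrapper that gives each step a deadline. This is a sketch; withTimeout is a hypothetical helper, and the task must pass the signal on to fetch (or otherwise listen to it) for the abort to have any effect.

```typescript
// Runs `task` with an AbortSignal that fires after `ms` milliseconds.
async function withTimeout<T>(
  task: (signal: AbortSignal) => Promise<T>,
  ms: number,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  try {
    return await task(controller.signal);
  } finally {
    clearTimeout(timer); // avoid keeping the process alive after the task settles
  }
}
```

For example, withTimeout((signal) => fetch(url, { method: 'POST', body, signal }), 10_000) rejects with an abort error if the request takes longer than ten seconds. The pipeline's catch block can then log the failure and release the lock.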

Stale cache states

When PII persists after deletion, check edge cache headers — x-cache and surrogate-key:

Bash
curl -I https://your-domain.com/users/123

If Cache-Control still carries max-age values above your compliance SLA, fix the CMS response headers or strip caching directives for PII routes in middleware. See MDN Cache-Control for precedence rules.
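
A middleware-level override might look like the sketch below. The route prefixes are taken from the paths revalidated earlier; cacheControlFor and the fallback header value are assumptions to adapt to your framework.

```typescript
// Route prefixes assumed to render PII; adjust to your routing scheme.
const PII_ROUTE_PREFIXES = ['/users/', '/profile/'];

// Returns the Cache-Control value to send for a path, given the upstream header.
function cacheControlFor(pathname: string, upstream: string | null): string {
  const isPiiRoute = PII_ROUTE_PREFIXES.some((prefix) => pathname.startsWith(prefix));
  // no-store forbids shared caches from keeping a copy of the response.
  if (isPiiRoute) return 'private, no-store';
  // Non-PII routes keep whatever the CMS sent, with a conservative default.
  return upstream ?? 'public, max-age=0, must-revalidate';
}
```

Disabling caching on these routes trades cache hit ratio for certainty. If that is too costly, keep caching enabled and rely on tagged purges plus a short max-age instead.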

Rich-text PII leakage

For PII in rich text (@mentions, email strings), add a pre-save hook that scans AST nodes — regex or a lightweight NLP scanner — and redacts before the write hits the database. Send flagged entries to a quarantine queue for manual review.
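
A regex-based version of that hook can be sketched as follows. The RichTextNode shape is a simplified stand-in for whatever AST your CMS stores, and the email pattern is deliberately loose: it will miss unusual addresses and does not cover names, phone numbers, or @mentions.

```typescript
// Simplified rich-text node; real CMS ASTs carry more fields.
interface RichTextNode {
  type: string;
  text?: string;
  children?: RichTextNode[];
}

// Loose email matcher, intended for flagging rather than validation.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Returns a redacted copy of the tree and the number of matches found.
function redactEmails(root: RichTextNode): { node: RichTextNode; hits: number } {
  let hits = 0;
  const visit = (n: RichTextNode): RichTextNode => {
    const copy: RichTextNode = { ...n };
    if (typeof n.text === 'string') {
      copy.text = n.text.replace(EMAIL_PATTERN, () => {
        hits += 1;
        return '[redacted]';
      });
    }
    if (n.children) copy.children = n.children.map(visit);
    return copy;
  };
  const node = visit(root);
  return { node, hits };
}
```

A pre-save hook could persist the redacted node and send the entry to the quarantine queue whenever hits is greater than zero.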

Data portability export

For Article 20, invert the deletion pipeline: query relational references, serialize to JSON-LD, checksum, and deliver via signed, time-limited URLs. Exclude third-party tracking identifiers and respect consent scopes.
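
The serialize, checksum, and sign steps can be sketched like this. Everything here is illustrative: the schema.org Person mapping is one possible JSON-LD shape, and buildExport, signDownloadUrl, and the expires/sig query parameters are hypothetical names, not a storage provider's API.

```typescript
import { createHash, createHmac } from 'node:crypto';

// Serializes a user's data as JSON-LD and computes a SHA-256 checksum of the body.
function buildExport(person: { id: string; name: string; email: string }) {
  const document = {
    '@context': 'https://schema.org',
    '@type': 'Person',
    identifier: person.id,
    name: person.name,
    email: person.email,
  };
  const body = JSON.stringify(document);
  const checksum = createHash('sha256').update(body).digest('hex');
  return { body, checksum };
}

// Builds a download URL whose signature covers the path and its expiry time.
function signDownloadUrl(
  baseUrl: string,
  path: string,
  secret: string,
  ttlSeconds: number,
  nowMs: number = Date.now(),
): string {
  const expires = Math.floor(nowMs / 1000) + ttlSeconds;
  const sig = createHmac('sha256', secret).update(`${path}:${expires}`).digest('hex');
  return `${baseUrl}${path}?expires=${expires}&sig=${sig}`;
}
```

The download endpoint recomputes the signature from the path and the expires parameter, rejects the request if the signature differs or the expiry has passed, and can return the checksum so the recipient can verify the file.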

Schema isolation, webhook idempotency, cascade nullification, and tag-based invalidation turn GDPR compliance from a reactive liability into an auditable workflow.