Handling URL redirects during CMS decoupling
URL fragmentation is the first production risk when you decouple a monolithic CMS: legacy routing emits paths that don’t match modern framework conventions, so without an explicit mapping the cutover produces 404s, drops inbound link equity, and breaks editorial links. This guide covers intercepting, normalizing, and routing legacy URLs through the transition — edge static redirects, dynamic middleware fallbacks, and CI validation for zero-downtime migrations.
Why the Paths Diverge
Monolithic systems resolve routes server-side via database slug lookups, often embedding query params, category hierarchies, or date prefixes. Headless frontends use file-system routing or dynamic catch-all routes. When the CMS stops serving HTML and starts returning JSON/GraphQL, the frontend router has no historical route table to consult. Two extra wrinkles: editors hardcode absolute paths in rich-text fields, and legacy redirects live in .htaccess or NGINX configs that don’t translate to edge networks. That gap between server-side resolution and edge/client routing has to be bridged explicitly. For the surrounding migration sequence, see Legacy System Decoupling Strategies.
Resolution Steps
The redirect layer is staged from precomputed edge rules down to a runtime fallback, then gated by CI:
flowchart TD
A["Legacy route inventory"] --> B["Define transformation rules"]
B --> C["Deploy edge static redirects"]
D["Incoming legacy path"] --> E{"Matches static redirect?"}
E -->|yes| F["301 / 308 at edge"]
E -->|no| G["Middleware: cached CMS slug lookup"]
G --> H{"Mapping found?"}
H -->|yes| I["308 redirect, preserve query"]
H -->|no| J["Fall through to 404"]
C --> K["CI: crawl inventory vs staging"]
K --> L{"All 301/308 to correct target?"}
L -->|no| M["Fail the build"]
L -->|yes| N["Promote"]
- Extract the legacy route inventory. Query the legacy database or parse sitemap.xml for active paths. Normalize trailing slashes and lowercase variants, and strip tracking strings (?utm_*, ?fbclid, ?ref). Store as { legacy: string, target: string } objects.
- Define transformation rules. Map legacy patterns to headless equivalents: strip /blog/ prefixes, flatten date paths (/2023/04/15/post-slug → /post-slug), preserve UTM params through the redirect.
- Deploy edge redirects first. Static redirects at the CDN layer resolve high-traffic paths with minimal added latency and never invoke the framework runtime.
- Add a dynamic middleware fallback. For parameterized routes you can’t precompute, intercept unmatched paths in middleware, look up a matching slug, and issue a 301/308.
- Validate in CI. Crawl the legacy URL set against staging, asserting 301/308 responses and correct final destinations.
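The first two steps can be sketched as a small pure-function pipeline. This is a minimal illustration, not the author's tooling: the helper names (normalizeLegacyPath, toTarget, buildInventory) are hypothetical, the tracking-parameter list is the one named in step 1, and the transformation rules are only the two examples from step 2.

```typescript
// Hypothetical helpers; the tracking-parameter list comes from step 1.
const TRACKING_PARAMS: RegExp[] = [/^utm_/, /^fbclid$/, /^ref$/];

interface RedirectEntry {
  legacy: string;
  target: string;
}

// Name of a query pair: "utm_source=x" -> "utm_source".
function paramName(pair: string): string {
  const eq = pair.indexOf("=");
  return eq === -1 ? pair : pair.slice(0, eq);
}

// Step 1: lowercase the path, trim the trailing slash, drop tracking params.
function normalizeLegacyPath(raw: string): string {
  const q = raw.indexOf("?");
  const rawPath = q === -1 ? raw : raw.slice(0, q);
  const rawQuery = q === -1 ? "" : raw.slice(q + 1);

  let path = rawPath.toLowerCase();
  if (path.length > 1 && path.endsWith("/")) {
    path = path.slice(0, -1);
  }

  const kept = rawQuery
    .split("&")
    .filter((pair) => pair.length > 0)
    .filter((pair) => !TRACKING_PARAMS.some((re) => re.test(paramName(pair))));

  return kept.length > 0 ? `${path}?${kept.join("&")}` : path;
}

// Step 2: strip the /blog/ prefix, then flatten /YYYY/MM/DD/ date prefixes.
function toTarget(legacyPath: string): string {
  return legacyPath
    .replace(/^\/blog\//, "/")
    .replace(/^\/\d{4}\/\d{2}\/\d{2}\//, "/");
}

// Build de-duplicated { legacy, target } entries, skipping paths that need no redirect.
function buildInventory(rawPaths: string[]): RedirectEntry[] {
  const entries = new Map<string, RedirectEntry>();
  for (const raw of rawPaths) {
    const legacy = normalizeLegacyPath(raw);
    const target = toTarget(legacy);
    if (legacy !== target && !entries.has(legacy)) {
      entries.set(legacy, { legacy, target });
    }
  }
  return Array.from(entries.values());
}
```

Normalizing before de-duplicating means /Blog/Post-A/ and /blog/post-a?utm_source=x collapse into one inventory entry, which keeps the redirect table and the CI crawl list the same size.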
Production Configurations
Edge networks handle the bulk of legacy routing. Below are working configs for Vercel and Next.js middleware.
Vercel vercel.json static redirects
Intercept known legacy paths before they reach the runtime. In vercel.json, "permanent": true emits a 308, which preserves the HTTP method and body; that matters for form submissions and API proxying during migration.
{
"redirects": [
{
"source": "/blog/:slug",
"destination": "/articles/:slug",
"permanent": true
},
{
"source": "/:year(\\d{4})/:month(\\d{2})/:day(\\d{2})/:slug",
"destination": "/articles/:slug",
"permanent": true
},
{
"source": "/legacy-campaign",
"destination": "/campaigns/2024-q1-launch",
"permanent": true
}
]
}
Vercel processes rules in declaration order, so place specific paths before wildcards. For syntax and edge cases, see the Vercel Redirects documentation.
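If the redirects array is generated from the inventory rather than written by hand, the ordering rule can be enforced in the generator. The sketch below is one way to do that, under assumptions: orderRules and isPattern are hypothetical helpers, "/blog/legacy-campaign" is an invented path chosen because "/blog/:slug" would otherwise shadow it, and a source counts as a pattern if it contains a named segment, a regex group, or a wildcard.

```typescript
interface RedirectRule {
  source: string;
  destination: string;
  permanent: boolean;
}

// Treat a source as a pattern if it has a named segment (:), regex group ((), or wildcard (*).
function isPattern(source: string): boolean {
  return /[:(*]/.test(source);
}

// Put literal sources first, keeping the original relative order within each group.
function orderRules(rules: RedirectRule[]): RedirectRule[] {
  const literals = rules.filter((rule) => !isPattern(rule.source));
  const patterns = rules.filter((rule) => isPattern(rule.source));
  return [...literals, ...patterns];
}

// Hypothetical input: the pattern is declared first and would shadow the literal path.
const rules: RedirectRule[] = [
  { source: "/blog/:slug", destination: "/articles/:slug", permanent: true },
  { source: "/blog/legacy-campaign", destination: "/campaigns/2024-q1-launch", permanent: true },
];

// Serialized form, ready to be written to vercel.json by a build step.
const vercelConfig: string = JSON.stringify({ redirects: orderRules(rules) }, null, 2);
```

Partitioning instead of sorting keeps the author's order inside each group, so hand-tuned precedence among patterns is not disturbed.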
Next.js middleware dynamic fallback
With thousands of legacy URLs or dynamically generated slugs, static config becomes unmanageable. Middleware intercepts at the edge, runs a cached CMS lookup, and redirects without rendering a page.
// middleware.ts
import { NextRequest, NextResponse } from 'next/server';

const LEGACY_PREFIXES = ['/blog', '/news', '/updates'];
const CMS_REDIRECT_ENDPOINT = process.env.CMS_REDIRECT_API;

export async function middleware(req: NextRequest) {
  const { pathname } = req.nextUrl;

  // Skip static assets, API routes, and Next.js internals
  if (pathname.startsWith('/_next') || pathname.startsWith('/api') || pathname.includes('.')) {
    return NextResponse.next();
  }

  // Match whole path segments so /blogroll is not treated as /blog
  const isLegacyPath = LEGACY_PREFIXES.some(
    prefix => pathname === prefix || pathname.startsWith(`${prefix}/`)
  );
  if (!isLegacyPath || !CMS_REDIRECT_ENDPOINT) return NextResponse.next();

  // The last path segment is the slug
  const slug = pathname.split('/').filter(Boolean).pop();
  if (!slug) return NextResponse.next();

  try {
    // Query the CMS for a redirect mapping
    const response = await fetch(`${CMS_REDIRECT_ENDPOINT}/resolve?slug=${encodeURIComponent(slug)}`, {
      headers: { 'Authorization': `Bearer ${process.env.CMS_API_TOKEN}` },
      // Requests a 5-minute cache; verify that fetch cache options take effect
      // in Middleware on your Next.js version, or cache at the CMS/CDN layer
      next: { revalidate: 300 }
    });
    if (response.ok) {
      const { targetPath } = await response.json();
      // Skip self-redirects to avoid a loop
      if (targetPath && targetPath !== pathname) {
        const url = req.nextUrl.clone(); // clone keeps the query string (UTMs, tracking)
        url.pathname = targetPath;
        return NextResponse.redirect(url, 308);
      }
    }
  } catch {
    // CMS unreachable: fall through instead of failing the request
  }

  // No mapping: let the router handle the path (typically a 404)
  return NextResponse.next();
}

export const config = {
  matcher: ['/((?!_next/static|_next/image|favicon.ico).*)'],
};
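The edge runtime minimizes cold starts. A 308 preserves the request method and body, as defined in RFC 7538 (since folded into RFC 9110); the query string survives because the redirect URL is cloned from req.nextUrl, not because of the status code.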
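Because the lookup runs on every unmatched legacy request, it helps to be explicit about where the "cached" part lives. One option, sketched below under assumptions, is a small in-memory TTL cache in front of the CMS call. TtlCache and resolveCached are hypothetical names, the lookup function stands in for the CMS request, and module-level state in an edge runtime lasts only as long as the isolate, so this is a best-effort layer and not a shared cache.

```typescript
interface CacheEntry<V> {
  value: V;
  expiresAt: number;
}

// Minimal in-memory TTL cache with an injectable clock (useful for tests).
class TtlCache<V> {
  private readonly entries = new Map<string, CacheEntry<V>>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns undefined on a miss or when the entry has expired.
  get(key: string): V | undefined {
    const hit = this.entries.get(key);
    if (hit === undefined) return undefined;
    if (hit.expiresAt <= this.now()) {
      this.entries.delete(key);
      return undefined;
    }
    return hit.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Resolve a slug through the cache. A null result ("no mapping") is cached too,
// so repeated requests for unknown slugs do not hit the CMS every time.
async function resolveCached(
  slug: string,
  cache: TtlCache<string | null>,
  lookup: (slug: string) => Promise<string | null>,
): Promise<string | null> {
  const cached = cache.get(slug);
  if (cached !== undefined) return cached;
  const target = await lookup(slug);
  cache.set(slug, target);
  return target;
}
```

The map here is unbounded, which is acceptable for a sketch but would need a size cap before it faces crawler traffic that probes arbitrary slugs.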
CI Validation
Manual spot-checks don’t scale to enterprise migrations. Add a route validation step to CI using a headless crawler. A Node script over undici or linkinator iterates the extracted inventory, hits staging, and asserts response codes.
// scripts/validate-redirects.mjs
import { readFileSync } from 'fs';
import { fetch } from 'undici';

const legacyRoutes = JSON.parse(readFileSync('./legacy-routes.json', 'utf8'));
const BASE_URL = process.env.STAGING_URL || 'https://staging.example.com';

const results = await Promise.all(legacyRoutes.map(async ({ legacy, target }) => {
  // redirect: 'manual' exposes the first hop instead of following it
  const res = await fetch(`${BASE_URL}${legacy}`, { redirect: 'manual' });
  const location = res.headers.get('location');
  // Resolve relative Location headers, then compare paths exactly;
  // a substring check would accept /post-slug-2 for /post-slug
  const finalPath = location ? new URL(location, BASE_URL).pathname : null;
  const passed = (res.status === 301 || res.status === 308) && finalPath === target;
  return { legacy, status: res.status, target, location, passed };
}));

const failures = results.filter(r => !r.passed);
if (failures.length > 0) {
  console.error('Redirect validation failed:', JSON.stringify(failures, null, 2));
  process.exit(1);
}
console.log('All legacy routes validated successfully.');
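Fail the build on any 404 or multi-hop redirect chain. The script above inspects only the first hop, so it cannot see a target that itself redirects again; catch that case by requesting the target once more or by checking the inventory for targets that are also legacy keys. Google recommends avoiding redirect chains, so following their redirect guidelines keeps search equity transferring cleanly.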
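Chains can also be caught before anything is deployed by checking the inventory against itself. The sketch below assumes the inventory is the source of truth for the redirect table; findChains and ChainIssue are hypothetical names. It flags any entry whose target is itself a legacy key, and any loop.

```typescript
interface RedirectEntry {
  legacy: string;
  target: string;
}

interface ChainIssue {
  legacy: string;
  hops: string[]; // full path walked, starting at the legacy URL
  loop: boolean;
}

// Walk each entry through the redirect table and report chains and loops.
function findChains(entries: RedirectEntry[]): ChainIssue[] {
  const table = new Map<string, string>();
  for (const entry of entries) {
    table.set(entry.legacy, entry.target);
  }

  const issues: ChainIssue[] = [];
  for (const entry of entries) {
    const hops: string[] = [entry.legacy];
    const visited = new Set<string>([entry.legacy]);
    let current: string | undefined = entry.target;
    let loop = false;

    while (current !== undefined) {
      hops.push(current);
      if (visited.has(current)) {
        loop = true;
        break;
      }
      visited.add(current);
      current = table.get(current); // undefined once we reach a non-legacy path
    }

    // hops.length === 2 is the healthy case: legacy -> target, nothing further.
    if (loop || hops.length > 2) {
      issues.push({ legacy: entry.legacy, hops, loop });
    }
  }
  return issues;
}
```

A CI step could exit non-zero whenever findChains returns a non-empty array, which fails the build without a single network request. It complements the live crawl and does not replace it, since rules defined outside the inventory are invisible to this check.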
Hardcoded Link Remediation
Decoupling exposes hardcoded absolute URLs buried in legacy rich-text fields. Run a pre-deploy script that parses the CMS export JSON, finds absolute paths matching legacy domains, and rewrites them to relative paths or dynamic link resolvers. For drafts, route preview URLs around the redirect layer with a ?preview=true flag that hits the staging API directly. Phase out legacy path references with editorial before the final cutover so internal navigation doesn’t break post-launch.
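The rewrite step of such a pre-deploy script might look like the sketch below. It is an assumption-laden illustration: rewriteLegacyLinks is a hypothetical helper, www.old-cms.example.com in the usage note is a placeholder host, the input is treated as an HTML string pulled from one rich-text field, and URLs with explicit ports are not handled.

```typescript
// Escape a host name for safe use inside a RegExp.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Replace absolute URLs on legacy hosts with root-relative paths.
function rewriteLegacyLinks(html: string, legacyHosts: string[]): string {
  if (legacyHosts.length === 0) return html;
  const hosts = legacyHosts.map(escapeRegExp).join("|");
  // Scheme + legacy host, not followed by more host characters (so
  // old.example.com.evil.test is left alone), then an optional
  // path/query/fragment up to a quote, whitespace, or tag boundary.
  const pattern = new RegExp(
    `https?://(?:${hosts})(?![\\w.-])(/[^\\s"'<>]*)?`,
    "gi",
  );
  return html.replace(
    pattern,
    (_match: string, path: string | undefined) => path ?? "/",
  );
}
```

Calling rewriteLegacyLinks(field, ["www.old-cms.example.com"]) on each rich-text value leaves the resulting relative paths to pass through the same redirect layer as any other legacy URL, so the content rewrite and the redirect table do not need to agree on final targets.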
URL fragmentation is predictable and preventable. Prioritize edge static mappings, fall back to middleware for the dynamic cases, and enforce automated route validation — treat routing as a data pipeline, not an afterthought, and the transition stays clean across environments.