Dynamic OG Meta Tags in React SPAs: The Crawler Problem Nobody Talks About
If you’ve ever shared a link from your React app on Twitter, LinkedIn, or Slack and watched it render a blank preview (no title, no description, no image), you’ve hit one of the most frustrating invisible walls in frontend development.
It’s not a bug in your code. It’s an architectural mismatch between how React SPAs work and how social media crawlers work. The fix isn’t obvious, which is why so many React apps ship with broken link previews for years without anyone noticing.
In this article I’ll explain why this happens and how to solve it without migrating to a different framework, using the approach I built for Anonfeedback: Netlify Edge Functions for the tags, plus object storage to cache the expensive part, the preview images generated on the fly. By the end you’ll have a production-ready solution you can adapt to any app that needs a unique preview per route.
What Are OG Meta Tags?
OG stands for Open Graph, a protocol originally created by Facebook that’s now the de facto standard for link previews. When you paste a URL into Slack, Twitter, LinkedIn, iMessage, or Discord, the platform sends a bot to fetch your page and look for specific <meta> tags in the <head>:
<meta property="og:title" content="Your Page Title" />
<meta property="og:description" content="A short description of the page." />
<meta property="og:image" content="https://yourdomain.com/preview-image.png" />
<meta property="og:url" content="https://yourdomain.com/your-page" />
Twitter (now X) has its own variant, Twitter Cards, with twitter:title, twitter:description, twitter:image, and twitter:card tags that work the same way.
These tags are read from the raw HTML of the page. That word, raw, is the crux of the entire problem.
The SPA Problem: Why React Breaks Crawlers
When someone visits a typical React SPA:
1. The browser requests /some/route
2. The server returns index.html: a nearly empty file with a <div id="root"> and a <script> tag
3. The browser downloads and executes the JavaScript bundle
4. React renders the UI, including any dynamic <head> content
Step 3 is the problem. Social media crawlers don’t execute JavaScript. They fetch the raw HTML, parse it, and move on. By the time React would have injected your meta tags, the crawler is long gone. What the crawler actually sees is this:
<!doctype html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>My App</title>
<!-- No OG tags here -->
</head>
<body>
<div id="root"></div>
<script type="module" src="/assets/index-abc123.js"></script>
</body>
</html>
Every route in your app (/products/123, /articles/my-post, /forms/abc) returns this exact same index.html to the crawler. Same title, same missing description, same blank image.
Libraries like react-helmet are often suggested as the fix. They work great for real users in browsers, but they rely on JavaScript execution, so they don’t help crawlers at all.
This is the invisible wall: your app looks perfect in the browser, and every shared link looks like a blank card.
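To make the crawler's view concrete, here is a minimal sketch of what a preview bot effectively does: scan the raw HTML string for Open Graph tags, with no JavaScript execution. The readOgTags name and the regex-based parsing are assumptions for illustration; real crawlers use proper HTML parsers, but the outcome for an SPA shell is the same.

```typescript
// Hypothetical helper: approximates how a link-preview crawler reads a page.
// It only ever sees the raw HTML string; no scripts are executed.
function readOgTags(rawHtml: string): Record<string, string> {
  const tags: Record<string, string> = {}
  // Matches <meta property="og:..." content="..."> with double-quoted attributes
  const pattern = /<meta\s+property="(og:[^"]+)"\s+content="([^"]*)"\s*\/?>/gi
  let match: RegExpExecArray | null
  while ((match = pattern.exec(rawHtml)) !== null) {
    tags[match[1]] = match[2]
  }
  return tags
}

// The SPA shell that every route returns: nothing for the crawler to read
const spaShell = `<!doctype html>
<html lang="en">
<head><title>My App</title></head>
<body><div id="root"></div><script type="module" src="/assets/index-abc123.js"></script></body>
</html>`

// The same shell once tags have been injected before it leaves the server or edge
const injected = spaShell.replace(
  '<head>',
  '<head><meta property="og:title" content="Team Retro Form" />',
)

const shellTags = readOgTags(spaShell) // no OG tags found: a blank card
const injectedTags = readOgTags(injected) // og:title is now visible to the bot
console.log(shellTags, injectedTags)
```

Whatever fix you choose has to change the first result into the second before the HTML reaches the bot.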
How Other Frameworks Handle This
Server-rendered frameworks don’t have this problem, and understanding why clarifies what we need to replicate.
Next.js runs a generateMetadata function on the server before sending HTML, so the crawler receives fully rendered tags. Astro bakes meta tags into static HTML at build time (or at request time for SSR routes). Nuxt and Angular Universal do the same via server-side rendering.
The common thread: the server produces complete HTML before the client receives it. A plain React SPA has no server rendering step. It’s static files on a CDN, and that’s exactly the gap we’re going to fill.
Your Options as a React Developer
If migrating away from your SPA isn’t realistic, you have a few paths:
1. Migrate to Next.js or Remix. The nuclear option. Effective, but expensive for a mature codebase.
2. Pre-rendering tools like react-snap or Prerender.io, which execute your JavaScript and cache the rendered HTML for crawlers. Works, but adds infrastructure cost and stale-cache headaches.
3. A dedicated rendering server running Puppeteer or Playwright to render pages on demand for bots. Effective but operationally heavy; you’re now maintaining a browser farm.
4. Edge functions. Intercept requests at the CDN edge, fetch the metadata from your backend, inject it into the HTML, and return the modified response. Fast, cheap, and zero changes to your React app.
Option 4 is the one I landed on and the one this article covers.
The Edge Function Approach
Edge functions run at the CDN level, geographically close to the user, before the request reaches your origin. For a crawler requesting /forms/abc123:
Request -> Edge Function -> fetch metadata -> inject into HTML -> Response
- The edge function intercepts the request and extracts the resource ID (abc123)
- It calls your backend: GET /api/public/forms/abc123/meta
- It fetches the original index.html from the CDN
- It injects the OG tags into the <head> and returns the modified HTML
The key insight: you don’t change your React app at all. The edge function sits in front of it, invisible to your frontend code and your users. Only the crawlers see the difference, and that’s exactly what we want.
Setting Up Netlify Edge Functions
Netlify Edge Functions run on Deno, not Node.js, which affects how you import modules and which APIs are available. They live in netlify/edge-functions/ and are configured in netlify.toml:
[build]
publish = "dist"
command = "npm run build"
[[edge_functions]]
function = "inject-og-meta"
path = "/forms/*"
[[edge_functions]]
function = "inject-og-meta"
path = "/articles/*"
Only the listed routes run the function; static assets, API calls, and everything else bypass it entirely. That matters for performance.
Here’s the skeleton:
// netlify/edge-functions/inject-og-meta.ts
import type { Context } from 'https://edge.netlify.com'

export default async function handler(
  request: Request,
  context: Context,
): Promise<Response> {
  // Let non-HTML requests pass through (JS, CSS, images, API calls)
  const acceptHeader = request.headers.get('accept') || ''
  if (!acceptHeader.includes('text/html')) {
    return context.next()
  }

  const response = await context.next()
  const contentType = response.headers.get('content-type') || ''
  if (!contentType.includes('text/html')) {
    return response
  }

  const html = await response.text()
  // TODO: fetch metadata and inject OG tags
  return new Response(html, {
    status: response.status,
    headers: response.headers,
  })
}
context.next() is the key call: it passes the request through to your static files on the CDN and hands you the response to modify in flight. You’re acting as middleware.
Fetching Metadata from Your Backend
The edge function needs a public API endpoint that returns metadata without authentication (crawlers aren’t logged in). Keep it lightweight: just the fields needed for OG tags.
interface PageMetadata {
  title: string
  description: string
  imageUrl?: string
  url: string
}

const DEFAULT_FALLBACK: PageMetadata = {
  title: 'My App',
  description: 'Welcome to My App.',
  imageUrl: 'https://yourdomain.com/og-default.png',
  url: '',
}

async function fetchPageMetadata(
  resourceType: string,
  resourceId: string,
  apiBaseUrl: string,
): Promise<PageMetadata> {
  try {
    const res = await fetch(
      `${apiBaseUrl}/api/public/${resourceType}/${resourceId}/meta`,
      { signal: AbortSignal.timeout(3000) },
    )
    if (!res.ok) return DEFAULT_FALLBACK
    const data = await res.json()
    return {
      title: data.title || DEFAULT_FALLBACK.title,
      description: data.description || DEFAULT_FALLBACK.description,
      imageUrl: data.imageUrl || DEFAULT_FALLBACK.imageUrl,
      url: data.url || '',
    }
  } catch {
    // Network error, timeout, parse error: always fall back gracefully
    return DEFAULT_FALLBACK
  }
}
Three rules baked into that function:
- Always have a fallback. If your backend is down or the resource doesn’t exist, return your app’s generic title, description, and image rather than an error.
- Set a short timeout. Three seconds is generous. Fail fast and use the fallback. AbortSignal.timeout() is available natively in Deno.
- Never throw. An edge function that throws returns a 500. A missing OG tag is much better than a broken page.
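One gap worth guarding against: a check like `data.title || fallback` accepts any truthy value, so a backend that returned a number or an object for title would make the later string escaping throw, which breaks the "never throw" rule. Below is a minimal sketch of a stricter normalizer. The normalizeMetadata and cleanString names and the length caps (70, 200, and 2048 characters) are my own assumptions for illustration, not limits taken from any platform's specification.

```typescript
interface PageMetadata {
  title: string
  description: string
  imageUrl?: string
  url: string
}

const DEFAULT_FALLBACK: PageMetadata = {
  title: 'My App',
  description: 'Welcome to My App.',
  imageUrl: 'https://yourdomain.com/og-default.png',
  url: '',
}

// Returns the trimmed string, capped at maxLength, or undefined if unusable
function cleanString(value: unknown, maxLength: number): string | undefined {
  if (typeof value !== 'string') return undefined
  const trimmed = value.trim()
  return trimmed.length > 0 ? trimmed.slice(0, maxLength) : undefined
}

// Hypothetical helper: turns an untrusted JSON payload into PageMetadata,
// falling back field by field instead of trusting the payload's shape.
function normalizeMetadata(data: unknown): PageMetadata {
  if (typeof data !== 'object' || data === null) return DEFAULT_FALLBACK
  const record = data as Record<string, unknown>
  return {
    title: cleanString(record.title, 70) ?? DEFAULT_FALLBACK.title,
    description:
      cleanString(record.description, 200) ?? DEFAULT_FALLBACK.description,
    imageUrl: cleanString(record.imageUrl, 2048) ?? DEFAULT_FALLBACK.imageUrl,
    url: cleanString(record.url, 2048) ?? '',
  }
}

const good = normalizeMetadata({ title: '  Team Retro  ', description: 'Q3 feedback' })
const bad = normalizeMetadata({ title: 42, description: null })
console.log(good.title, bad.title)
```

Inside fetchPageMetadata, this could stand in for the hand-built return object, for example `return normalizeMetadata(await res.json())`.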
To know which resource to fetch, parse the URL:
const ROUTE_PATTERNS = [
  { regex: /^\/forms\/([^\/]+)/, type: 'forms' },
  { regex: /^\/articles\/([^\/]+)/, type: 'articles' },
]

function extractResourceInfo(
  pathname: string,
): { resourceType: string; resourceId: string } | null {
  for (const pattern of ROUTE_PATTERNS) {
    const match = pathname.match(pattern.regex)
    if (match) {
      return { resourceType: pattern.type, resourceId: match[1] }
    }
  }
  return null
}
Adapt the patterns to your own URL structure. The point is to extract a type and an ID your backend understands.
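As a quick sanity check on the patterns, the sketch below runs the same matching logic against a few sample paths. The one addition is an assumed ID whitelist (letters, digits, hyphens, underscores), which keeps odd path segments from being forwarded into the backend URL; adjust or drop it if your IDs look different.

```typescript
const ROUTE_PATTERNS = [
  { regex: /^\/forms\/([^\/]+)/, type: 'forms' },
  { regex: /^\/articles\/([^\/]+)/, type: 'articles' },
]

// Assumption for this sketch: IDs are URL-safe slugs or database IDs
const SAFE_ID = /^[A-Za-z0-9_-]+$/

function extractResourceInfo(
  pathname: string,
): { resourceType: string; resourceId: string } | null {
  for (const pattern of ROUTE_PATTERNS) {
    const match = pathname.match(pattern.regex)
    // Reject IDs with unexpected characters rather than passing them upstream
    if (match && SAFE_ID.test(match[1])) {
      return { resourceType: pattern.type, resourceId: match[1] }
    }
  }
  return null
}

const samples = [
  '/forms/abc123', // plain resource route
  '/articles/my-post/comments', // nested path: only the first segment is the ID
  '/pricing', // not a resource route
  '/forms/a%2Fb', // encoded slash: rejected by the whitelist
]
const results = samples.map((path) => extractResourceInfo(path))
console.log(results)
```

The first two paths resolve to a type and an ID, and the last two return null, so the handler would serve those pages unchanged.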
Injecting OG Tags into the HTML
Once you have the metadata, find the <head> tag and insert the meta tags right after it:
function escapeHtml(str: string): string {
  return str
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
}

function buildOgTags(metadata: PageMetadata, requestUrl: string): string {
  const title = escapeHtml(metadata.title)
  const description = escapeHtml(metadata.description)
  const imageUrl = metadata.imageUrl ? escapeHtml(metadata.imageUrl) : ''
  const pageUrl = escapeHtml(metadata.url || requestUrl)
  const tags = [
    `<meta property="og:title" content="${title}" />`,
    `<meta property="og:description" content="${description}" />`,
    `<meta property="og:url" content="${pageUrl}" />`,
    `<meta property="og:type" content="website" />`,
    `<meta name="twitter:card" content="summary_large_image" />`,
    `<meta name="twitter:title" content="${title}" />`,
    `<meta name="twitter:description" content="${description}" />`,
  ]
  if (imageUrl) {
    tags.push(`<meta property="og:image" content="${imageUrl}" />`)
    tags.push(`<meta property="og:image:width" content="1200" />`)
    tags.push(`<meta property="og:image:height" content="630" />`)
    tags.push(`<meta name="twitter:image" content="${imageUrl}" />`)
  }
  return tags.join('\n ')
}

function injectOgTagsIntoHtml(html: string, ogTagsHtml: string): string {
  // Remove any existing OG tags first to avoid duplicates
  const cleanedHtml = html
    .replace(/<meta\s+property="og:[^"]*"[^>]*\/>/gi, '')
    .replace(/<meta\s+name="twitter:[^"]*"[^>]*\/>/gi, '')
  // Inject right after the opening <head> tag, preserving any attributes.
  // A replacer function keeps "$" sequences in user content literal; a plain
  // replacement string would expand "$1" or "$&" found inside a title.
  return cleanedHtml.replace(
    /<head([^>]*)>/i,
    (openTag) => `${openTag}\n ${ogTagsHtml}`,
  )
}
The escapeHtml function is critical. User-generated content (form titles, article names, product descriptions) can contain characters that break your HTML or, worse, open XSS vulnerabilities. Always escape before injecting.
Stripping existing OG tags first prevents duplicates: your index.html probably has default tags, and duplicate tags can make crawlers pick up the wrong values.
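To see why the escaping matters, here is a small self-contained sketch that runs a hostile title through an escape helper and an injection step. The sample title and HTML shell are made up for illustration. It also shows a related subtlety: when the second argument to String.prototype.replace is a string, sequences like $1 and $& are treated as substitution patterns, so a replacer function is the safer way to insert user-generated text.

```typescript
function escapeHtml(str: string): string {
  return str
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
}

// Replacer function: the inserted text is used literally, "$" included
function injectAfterHead(html: string, tagsHtml: string): string {
  return html.replace(/<head([^>]*)>/i, (openTag) => `${openTag}\n${tagsHtml}`)
}

// Illustrative inputs, not taken from a real app
const shell = '<html><head lang="en"><title>My App</title></head><body></body></html>'
const hostileTitle = '"><script>alert(1)</script> Win $1 & more'

const tag = `<meta property="og:title" content="${escapeHtml(hostileTitle)}" />`
const safeHtml = injectAfterHead(shell, tag)

// For contrast: a plain string replacement expands "$1" to the captured
// <head> attributes instead of keeping it as text
const naiveHtml = shell.replace(/<head([^>]*)>/i, `<head$1>\n${tag}`)

console.log(safeHtml)
console.log(naiveHtml)
```

In the safe output the script tag arrives as inert text and "$1" survives intact; in the naive output the "$1" in the title has been swapped for the head tag's attributes.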
Do You Need to Cache the Metadata?
Every request to the edge function makes a network call to your backend’s metadata endpoint, and the instinct is to put a cache in front of it. Plenty of tutorials reach for Redis here. I didn’t, and it’s worth explaining why, because the honest answer is “it depends on what your endpoint does.”
For Anonfeedback the metadata endpoint is a single indexed MongoDB lookup that selects three or four fields and returns them as JSON. That responds in a few milliseconds, comfortably under the time the edge function already spends fetching the static HTML and rewriting it. Adding a cache would mean another moving part, another connection to manage from Deno, and a staleness window, all to shave a couple of milliseconds off an already cheap call. It wasn’t worth it.
If your metadata is expensive to produce (heavy joins, an upstream API, computed fields), that calculus flips and a cache earns its place. The pattern to reach for is cache-aside: check the cache first, on a hit skip the backend, on a miss fetch and store the result with a TTL. The cleanest place for it is usually inside your backend endpoint, using whatever Redis client you already have there. A backend cache hit responds in under 5ms and keeps your cache off the public internet. If you want it at the edge instead, note that Netlify Edge Functions run on Deno and can’t see your private network, so you’d need a store reachable over plain fetch, like Upstash Redis. Pick a TTL that matches how often the data changes: hours for articles, minutes for forms or listings.
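For reference, the cache-aside pattern described above can be sketched in a few lines. This is an in-memory stand-in rather than a Redis client: the TtlCache class, its method names, and the injectable clock are all assumptions made so the example is self-contained. With Redis, get and set would map onto reads and writes with an expiry.

```typescript
interface CacheEntry<V> {
  value: V
  expiresAt: number // epoch milliseconds
}

// Hypothetical in-memory stand-in for Redis; the clock is injectable for testing
class TtlCache<V> {
  private entries = new Map<string, CacheEntry<V>>()
  private ttlMs: number
  private now: () => number

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs
    this.now = now
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key)
    if (!entry) return undefined
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key) // expired: treat as a miss
      return undefined
    }
    return entry.value
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs })
  }

  // Cache-aside: return the cached value on a hit, otherwise load and store it
  async getOrLoad(key: string, load: () => Promise<V>): Promise<V> {
    const cached = this.get(key)
    if (cached !== undefined) return cached
    const fresh = await load()
    this.set(key, fresh)
    return fresh
  }
}

// Simulated clock so expiry can be shown without waiting
let fakeNow = 0
const cache = new TtlCache<string>(5 * 60 * 1000, () => fakeNow)
cache.set('forms:abc123', 'Team Retro')
const hit = cache.get('forms:abc123') // within the TTL: a hit
fakeNow += 5 * 60 * 1000
const afterExpiry = cache.get('forms:abc123') // TTL elapsed: a miss
console.log(hit, afterExpiry)
```

In a backend endpoint, getOrLoad would wrap the database query, so only misses reach the database.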
The expensive work in this pipeline isn’t fetching text, though. It’s generating the preview image, and that’s the thing actually worth caching. We’ll get to it shortly, and the cache there lives in object storage rather than Redis for a reason.
Putting It All Together
With all the helpers above in one file, the main handler is short:
// netlify/edge-functions/inject-og-meta.ts
import type { Context } from 'https://edge.netlify.com'

// ... PageMetadata, DEFAULT_FALLBACK, ROUTE_PATTERNS, escapeHtml,
// extractResourceInfo, fetchPageMetadata, buildOgTags, injectOgTagsIntoHtml

export default async function handler(
  request: Request,
  context: Context,
): Promise<Response> {
  const url = new URL(request.url)
  const acceptHeader = request.headers.get('accept') || ''
  if (!acceptHeader.includes('text/html')) {
    return context.next()
  }

  const resourceInfo = extractResourceInfo(url.pathname)
  if (!resourceInfo) {
    return context.next()
  }

  const apiBaseUrl = Deno.env.get('API_BASE_URL') || ''
  // No API configured: serve the page unchanged
  if (!apiBaseUrl) {
    return context.next()
  }

  const response = await context.next()
  const contentType = response.headers.get('content-type') || ''
  if (!contentType.includes('text/html')) {
    return response
  }

  let html = await response.text()
  const metadata = await fetchPageMetadata(
    resourceInfo.resourceType,
    resourceInfo.resourceId,
    apiBaseUrl,
  )
  html = injectOgTagsIntoHtml(html, buildOgTags(metadata, request.url))

  return new Response(html, {
    status: response.status,
    headers: response.headers,
  })
}
In your Netlify dashboard, set API_BASE_URL to your backend’s base URL. That’s the only variable the edge function needs; the metadata lookup and the image cache both live on the backend.
Your Backend Metadata Endpoint
On the backend, expose the public endpoint the function calls. A minimal Express version:
// GET /api/public/forms/:id/meta
router.get('/public/forms/:id/meta', async (req, res) => {
  try {
    const form = await Form.findById(req.params.id)
      .select('title description imageUrl')
      .lean()
    if (!form) {
      return res.status(404).json({ error: 'Not found' })
    }
    return res.json({
      title: form.title,
      description: form.description || `Fill out the ${form.title} form`,
      imageUrl: form.imageUrl || null,
      url: `https://yourdomain.com/forms/${req.params.id}`,
    })
  } catch {
    return res.status(500).json({ error: 'Server error' })
  }
})
Keep it fast: with no metadata cache in front of it, it’s called on every HTML request to a matching route, so select only the fields you need and avoid heavy joins.
Going Further: Dynamic OG Images
Everything so far injects text metadata, but the tag that makes the biggest visual difference is og:image. A static image is fine for your home page; for a specific form, article, or product, an image generated on the fly with that resource’s title and branding is dramatically more compelling.
The nice part is that the edge function doesn’t need to change at all. Instead of returning a static imageUrl, your metadata endpoint points og:image at another backend endpoint that renders the image on demand:
imageUrl: `https://api.yourdomain.com/og/image/forms/${id}`
To generate the image itself, the combination I use for Anonfeedback is Satori + Resvg: Satori (from Vercel) converts JSX-like markup into a 1200x630 SVG (no headless browser needed), and Resvg renders that SVG to a PNG. The endpoint fetches the resource, builds the layout, and responds with Content-Type: image/png.
This is where the real caching happens, and it’s why I keep saying object storage rather than Redis. A rendered PNG is a 50 to 200KB binary; Redis memory is expensive for that, while object storage (GCS, S3) costs pennies per GB and sits behind a CDN. So the image endpoint is itself a cache-aside, with the bucket as the store:
const cachePath = getCachePath(subdomain, slug, form.updatedAt, org.logoVersion)
const cacheFile = bucket.file(cachePath)

// Hit: download the cached PNG and send it straight back
const [exists] = await cacheFile.exists()
if (exists) {
  const [buffer] = await cacheFile.download()
  res.setHeader('Content-Type', 'image/png')
  res.setHeader('Cache-Control', 'public, max-age=31536000, immutable')
  res.setHeader('X-Cache-Status', 'HIT')
  return res.send(buffer)
}

// Miss: generate, respond immediately, upload in the background
const pngBuffer = await renderOgImage(/* ...satori + resvg... */)
setImmediate(() => {
  cacheFile
    .save(pngBuffer, { metadata: { contentType: 'image/png' } })
    // A failed upload must not become an unhandled rejection;
    // the next request simply misses and tries again
    .catch((err) => console.warn('OG image cache upload failed', err))
})
res.setHeader('Content-Type', 'image/png')
res.setHeader('X-Cache-Status', 'MISS')
res.send(pngBuffer)
Two details make this clean:
- The cache key is the storage path, and invalidation is baked into it. I hash the resource ID together with its updatedAt timestamp and a logo version into the filename: og_cache/{subdomain}/{slug}_{hash}.png. When the title or logo changes the hash changes, so the old path is simply never requested again. No manual cache clearing, ever. The X-Cache-Status header makes hits and misses easy to see from curl.
- The upload is fire and forget. setImmediate lets the response go out before the bucket write finishes. If the write fails, the next request is just another miss that regenerates and tries again.
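A version-keyed path builder along those lines might look like the sketch below. It is not the actual Anonfeedback implementation: the parameter types, the key layout, and the small FNV-1a hash (chosen only so the example needs no imports) are assumptions. A production version would more likely use SHA-256 from a crypto library.

```typescript
// Small non-cryptographic hash (FNV-1a, 32-bit) so the sketch has no imports
function fnv1a(input: string): string {
  let hash = 0x811c9dc5
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i)
    hash = Math.imul(hash, 0x01000193)
  }
  // Unsigned 32-bit value as 8 hex characters
  return ('0000000' + (hash >>> 0).toString(16)).slice(-8)
}

// Hypothetical implementation: everything that affects the rendered image
// goes into the hash, so any change produces a new storage path.
function getCachePath(
  subdomain: string,
  slug: string,
  updatedAt: Date,
  logoVersion: number,
): string {
  const key = `${subdomain}/${slug}|${updatedAt.getTime()}|${logoVersion}`
  return `og_cache/${subdomain}/${slug}_${fnv1a(key)}.png`
}

const original = getCachePath('myorg', 'my-form', new Date(1700000000000), 1)
const sameAgain = getCachePath('myorg', 'my-form', new Date(1700000000000), 1)
const afterEdit = getCachePath('myorg', 'my-form', new Date(1700000060000), 1)
const afterNewLogo = getCachePath('myorg', 'my-form', new Date(1700000000000), 2)
console.log(original, afterEdit, afterNewLogo)
```

The same inputs always map to the same object, while an edit or a new logo maps to a fresh one, which is what makes explicit invalidation unnecessary.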
The one thing hash-based paths don’t solve on their own is cleanup. A changed resource orphans its old image instead of overwriting it, so left alone the bucket grows forever. Rather than tracking and deleting those by hand, I let the bucket do it with a lifecycle rule that deletes cached images after 30 days. Anything still in use is regenerated and re-uploaded on its next request, so the worst case after expiry is a single cache miss.
The part that matters when you set this up is scoping the rule to the cache prefix. My bucket also holds org logos and user uploads, and a bucket-wide delete rule would happily delete those too. GCS lifecycle is a single bucket-wide policy with replace semantics, so the rule has to be prefix-scoped, and it has to carry any existing rules along with it:
{
  "rule": [
    {
      "action": { "type": "Delete" },
      "condition": { "age": 30, "matchesPrefix": ["og_cache/"] }
    },
    {
      "action": { "type": "SetStorageClass", "storageClass": "NEARLINE" },
      "condition": { "age": 90 }
    },
    {
      "action": { "type": "SetStorageClass", "storageClass": "COLDLINE" },
      "condition": { "age": 365 }
    }
  ]
}
gcloud storage buckets update gs://your-bucket --lifecycle-file=lifecycle.json
matchesPrefix is the line that confines the 30-day delete to og_cache/ and away from real user data. The two SetStorageClass rules are pre-existing cost-tiering rules; because applying a lifecycle file replaces the whole policy, leaving them out would silently drop them. One caveat if your bucket has versioning enabled: an age-based delete archives the live object to a noncurrent version rather than reclaiming the bytes, so add a condition on noncurrent versions if you want the space actually freed. The cleanest setup is a dedicated bucket for the cache, with no other prefixes and no versioning, which sidesteps both gotchas; mixing the cache into a shared bucket is the tradeoff I made, and the reason the prefix scoping matters so much.
The whole image cache is optional. Drop it and you regenerate on every request, trading CPU for zero storage. But generation is the expensive step, so for anything shared more than once, caching pays for itself quickly.
The full pipeline (custom fonts, emoji, logo embedding, layout design) deserves its own article, but the takeaway here is that og:image can be just another URL your edge function injects; the generation complexity lives entirely on the backend.
Testing Your Implementation
Use the official debuggers. These simulate exactly what the crawler sees: raw HTML, no JavaScript.
- Facebook: developers.facebook.com/tools/debug
- LinkedIn: linkedin.com/post-inspector
- Twitter / X: cards-dev.twitter.com/validator
One caveat: platforms cache previews aggressively. If you shared a URL before deploying your fix, hit “Scrape Again” to force a fresh fetch.
Curl the URL directly. This is the quickest sanity check:
curl -H "Accept: text/html" https://yourdomain.com/forms/abc123 | grep -i "og:"
If you see your injected tags, it works. If you see nothing, the edge function isn’t running on that route; check your netlify.toml path patterns.
Test locally with netlify dev. It runs your edge functions locally for the fastest feedback loop, picking up environment variables from a .env file or your Netlify dashboard.
Verify the image cache. Request a form’s image endpoint twice and watch the X-Cache-Status header flip from MISS to HIT:
curl -sI https://api.yourdomain.com/public/og/image/myorg/my-form | grep -i x-cache-status
You can also list what’s been cached straight from the bucket:
gcloud storage ls "gs://your-bucket/og_cache/**"
Common Issues
- Tags not appearing: check that your netlify.toml paths match your routes. /forms/* matches /forms/abc123 but not necessarily what you expect for nested paths.
- Always getting fallback tags: your metadata endpoint is probably timing out. Check its response times and add a database index on the ID field if needed.
- Duplicate OG tags: the cleanup regex may not match how your existing tags are formatted (single vs double quotes, self-closing or not). Inspect your raw index.html and adjust.
- Image cache never hits: check that your cache key is stable. If you hash in a value that changes every request (a fresh Date.now() instead of the resource’s updatedAt), every request misses and regenerates. List the bucket to confirm: a healthy cache has one object per resource version, not hundreds.
- Function not running at all: check the deploy logs for TypeScript errors. Deno is strict, and a type error will prevent the function from deploying.
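For the duplicate-tag case, one option is a more tolerant cleanup pattern. The sketch below is an assumption-laden alternative, not the regex used earlier: it accepts single or double quotes, any attribute order, and tags with or without a self-closing slash. It would still miss unusual markup such as a literal ">" inside an attribute value, so treat it as a starting point.

```typescript
// Hypothetical, more tolerant cleanup: matches any <meta> tag whose
// property or name attribute starts with "og:" or "twitter:"
const SOCIAL_META_TAG =
  /<meta\s+[^>]*?(?:property|name)\s*=\s*["'](?:og|twitter):[^"']*["'][^>]*>\s*/gi

function stripSocialMetaTags(html: string): string {
  return html.replace(SOCIAL_META_TAG, '')
}

// Sample head with differently formatted default tags
const sample = [
  '<head>',
  '<meta charset="UTF-8" />',
  '<meta property="og:title" content="Default" />',
  "<meta name='twitter:card' content='summary'>",
  '<meta content="https://yourdomain.com/og-default.png" property="og:image">',
  '<meta name="viewport" content="width=device-width, initial-scale=1.0" />',
  '<title>My App</title>',
  '</head>',
].join('\n')

const stripped = stripSocialMetaTags(sample)
console.log(stripped)
```

Only the charset, viewport, and title lines remain, whichever quoting style the default tags used.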
A Quick Note on Anonfeedback
This is exactly the problem I solved for Anonfeedback, an AI-powered anonymous feedback platform I’ve been building for the past two years. Organisations create feedback forms, share them with teams or students, and collect honest anonymous responses; the AI layer surfaces patterns, sentiment, and key themes without exposing individual answers.
When someone shares a form link on Slack or by email, the preview needs to show the form’s actual title and description, not a blank card. That’s what pushed me to build this edge function, and now every shared form renders a proper preview, which makes people far more likely to click through and respond.
If anonymous feedback would be useful for you (team retrospectives, course evaluations, surveys, 360 reviews), join the waitlist at anonfeedback.io. I’m rolling out access gradually and would love to have you in early.
Wrapping Up
The OG meta tag problem in React SPAs is easy to overlook until someone shares your link and gets a blank preview. The edge function approach is the cleanest fix I’ve found for apps that can’t or won’t migrate to a server-rendered framework:
- Non-invasive: zero changes to your React app
- Fast: runs at the CDN edge; the metadata lookup is a single indexed query and preview images are served straight from cache
- Scalable: the image cache absorbs traffic spikes, so a viral link renders each preview once instead of on every hit
- Resilient: graceful fallbacks at every layer mean an outage shows default tags, not errors
- Cheap: Netlify’s free tier includes edge functions, and cached images cost pennies per GB in object storage
To adapt it: update ROUTE_PATTERNS and the metadata endpoint path to match your app, set your DEFAULT_FALLBACK, add the API_BASE_URL variable in Netlify, and cache generated images in object storage if you render them. Deploy, test with the Facebook debugger, and enjoy finally having proper link previews.
If you found this useful or have questions, feel free to reach out. And if you’re building something that could benefit from anonymous feedback, check out Anonfeedback.