Soft 404

A soft 404 is a URL that returns an HTTP 200 OK status code (meaning “the server found the page and is serving it successfully”) but whose content is effectively a “page not found” experience. An empty search result, a “this item is no longer available” notice, a bare header with no body content, or a deleted page whose URL redirects to the homepage all count. From a user perspective, the content is missing. From an HTTP perspective, everything looks fine. Google detects the mismatch and flags these URLs in Search Console as Soft 404s.

Why soft 404s happen

Four common causes:

Thin content pages returning 200. A search results page with zero matches, a category page with no products, a blog tag page with no posts. The page “succeeds” technically, but the user gets nothing.

Deleted pages redirecting to homepage. When a product or article is removed, some teams redirect old URLs to the homepage rather than returning a proper 404. Google treats this mass redirect-to-root pattern as a soft 404 because the destination doesn’t match the original URL’s intent.

Temporary unavailability served as 200. “This item is out of stock” pages that return 200 OK. Arguably correct for occasional stockouts; problematic when the out-of-stock message persists indefinitely.

CMS quirks. Some content systems serve a “this page has been removed” template with a 200 status rather than issuing a proper 404 or 410. The template might even include a helpful message, but Google’s evaluation looks at the mismatch between URL expectation and content delivery.

Why soft 404s hurt SEO

Three specific problems:

Wasted crawl budget. Googlebot spends time fetching URLs that return no value. On large sites with many soft 404s, this displaces crawl time from the pages that actually matter. Smaller sites notice this less; enterprise sites with 100K+ URLs can lose meaningful indexing coverage.

Indexing confusion. Soft 404 URLs may or may not stay indexed. When indexed, they show up in search results for their original queries and deliver empty content to anyone who clicks through - a direct user-experience hit. When deindexed, any external backlinks to those URLs stop passing authority.

Signal contamination. A site with many soft 404s can look lower-quality to Google’s algorithms, regardless of the quality of its actual content. The pattern works as a site-wide health signal, not only a per-URL one.

How to detect soft 404s

Four methods:

Google Search Console. The Page indexing report (listed as “Pages”; called “Coverage” in older versions of Search Console) flags URLs Google has classified as Soft 404. Start here - it’s the most authoritative source because it’s Google’s own classification.

Site crawl tools. Screaming Frog, Sitebulb, and Ahrefs Site Audit can flag pages with minimal content but 200 status. Tune the thresholds to match your site (what counts as “too little content” varies by page type).

Server log analysis. Track Googlebot hit patterns. URLs that receive crawl visits but produce no indexed result are candidates for soft-404 investigation.

Manual sampling. Click a random selection of URLs from your sitemap. If any deliver “no results”, “out of stock”, or empty templates, you probably have more across the site.
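The crawl-tool and manual-sampling checks above can be approximated in a few lines. The following is a minimal sketch, not how Google or any named tool classifies pages: it flags responses that return 200 but contain an empty-state phrase or very little visible text. The phrase list, the 50-word threshold, and the function names are all assumptions to tune per page type.

```python
from html.parser import HTMLParser

# Assumed phrases and threshold; adjust both to match your own templates.
EMPTY_STATE_PHRASES = (
    "no results",
    "no matching products",
    "page not found",
    "no longer available",
    "has been removed",
)
MIN_WORDS = 50


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def visible_text(html):
    """Return the page's visible text with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def soft_404_reasons(status, html, min_words=MIN_WORDS):
    """Return reasons a response looks like a soft 404 (empty list if none)."""
    if status != 200:
        # A real 404/410 or a redirect is not a *soft* 404 by this check.
        return []
    text = visible_text(html).lower()
    word_count = len(text.split())
    reasons = []
    for phrase in EMPTY_STATE_PHRASES:
        if phrase in text:
            reasons.append(f"empty-state phrase: {phrase!r}")
    if word_count < min_words:
        reasons.append(f"thin content: {word_count} words")
    return reasons
```

Run over a crawl export of (status, HTML) pairs, any URL with a non-empty result is a candidate for manual review rather than a confirmed soft 404; Search Console remains the authoritative list.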

How to fix soft 404s

Four standard remedies, matched to the cause:

Return proper 404 or 410 status codes. Deleted pages should return 404 (not found) or 410 (gone). A 410 signals that the removal is permanent, which can help Google drop the URL from its index somewhat faster. Configure the CMS or server to return the appropriate status code when the content is genuinely absent.

Fix empty states. Category pages with no products should either redirect to a parent category, return a 404, or serve genuinely useful fallback content (related categories, top products, search suggestions). Empty templates are the worst of the options.

Use redirects to topically relevant destinations. When redirecting a deleted page, redirect to the closest live topical match, not the homepage. A 301 from /product/blue-hat to /products/hats works; a 301 to the homepage does not.

Noindex thin pages that need to exist. For pages that have to exist (search result pages, filter pages) but shouldn’t be indexed, use a noindex robots meta tag. These aren’t technically soft 404s, but the same anti-pattern applies and the fix is similar.
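The four remedies can be read as one decision about which response a URL should send. The sketch below is one possible way to encode that decision, under stated assumptions: the Page and Response classes, their field names, and choose_response are hypothetical, not part of any CMS. It sends the noindex signal as an X-Robots-Tag response header, a swapped-in equivalent of the meta tag, and it picks 404 for empty listings even though a redirect or useful fallback content are equally valid options.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Page:
    """Hypothetical view of a URL's state, as a CMS might expose it."""
    exists: bool                           # content is still live
    permanently_removed: bool = False      # deleted on purpose, not coming back
    replacement_url: Optional[str] = None  # closest live topical match, if any
    item_count: Optional[int] = None       # listing pages: products or posts shown
    is_internal_search: bool = False       # internal search or filter URL


@dataclass
class Response:
    status: int
    headers: dict = field(default_factory=dict)


def choose_response(page: Page) -> Response:
    """Pick a status code and headers that match what the page actually offers."""
    if not page.exists:
        # Redirect only to a topical match, never to the homepage.
        if page.replacement_url and page.replacement_url != "/":
            return Response(301, {"Location": page.replacement_url})
        return Response(410 if page.permanently_removed else 404)

    if page.item_count == 0:
        # Empty listing or zero-match search: one option among several.
        return Response(404)

    if page.is_internal_search:
        # The page must exist for users but should stay out of the index.
        return Response(200, {"X-Robots-Tag": "noindex"})

    return Response(200)
```

For example, choose_response(Page(exists=False, replacement_url="/products/hats")) yields a 301 to the category page, while the same deleted page with replacement_url="/" falls through to a 404.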

A practical worked example

An e-commerce site with 80,000 URLs saw organic traffic plateau despite ongoing content investment. Search Console showed 12,000 URLs classified as Soft 404 - mostly search-result pages returning 200 with “no matching products” templates, and old product URLs redirecting to the homepage. Fixes: search-result pages with zero matches were changed to return 404; deleted product URLs were redirected to their parent category pages; a noindex tag was added to all internal search and filter URLs. Six weeks later, the Soft 404 count had dropped to 900, the indexed URL count was stable (no loss of real content), and the crawl-to-index ratio on legitimate content had improved by 31%. Organic traffic grew 18% over the following quarter, most of it from existing content finally being crawled.

We built Penfriend to produce content that never ships as a soft 404 - every page has substantive content, proper status codes, and reasons to exist. Soft 404s accumulate silently on content-at-scale programmes that use thin AI drafts; voice-trained content doesn’t produce them.
