
A Close Look at 404 Errors in SEO

By August 3, 2010 December 3rd, 2012 5 Comments

Ever click on a page in a search result listing and get a “404 – Page Not Found” error?  It probably hasn’t happened much to you since the search engines do a fairly good job of not ranking pages with 404 errors, or even sites that have “coming soon” pages.

There are a couple of common ways you as a site owner can inadvertently generate these types of pages, and you want to make sure they are not indexed in the search engines.

The first way is probably the most common – you changed the URL and forgot to redirect the old one to the new one.  So you might have changed a page from “/relevance-of-404-errors/” to “/importance-of-404-errors/”.  The problem is that without permanently redirecting the old URL, it could still be visible in the search results, leading to that “404 – Page Not Found” error.  Whoops.
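A permanent redirect for the example above can be sketched as follows, assuming a small Python server built on the standard library. The `REDIRECTS` table and the `redirect_target` helper are illustrative names, not part of any particular CMS; most sites would configure the equivalent rule in their web server or publishing platform instead.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical table of retired URLs and their replacements.
REDIRECTS = {
    "/relevance-of-404-errors/": "/importance-of-404-errors/",
}


def redirect_target(path):
    """Return the new URL for a retired path, or None if there is none."""
    return REDIRECTS.get(path)


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = redirect_target(self.path)
        if target is not None:
            # 301 tells browsers and crawlers that the move is permanent.
            self.send_response(301)
            self.send_header("Location", target)
            self.end_headers()
            return
        # Anything not in the table is treated as missing in this sketch.
        self.send_error(404)
```

Keeping the old-to-new mapping in one table makes it easy to review which retired URLs are still covered whenever a page is renamed.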

The second way is when you simply remove pages from your website, not realizing the pages are still indexed in Google or other search engines. This is common with special promotional pages for marketing, or landing pages you might be temporarily using for paid search efforts.

The ideal 404 response:

Here /abc.html, /pqr.html and /xyz.html are pages that don’t exist.

There are two components to this:

1. Search Engine component: To avoid the SEO implications of 404 errors (discussed below), ensure that when a page that doesn’t exist is requested, the web server returns a ‘404 not found’ status code in the header.

2. Usability component: The browser should preferably render a custom 404 page. From a user’s perspective, a page that doesn’t exist should offer ways of getting back to the main page without hitting the back button.

If your domain doesn’t handle number 1, you risk running into duplicate content issues. The reason: if the server doesn’t return a “404 not found”, it gives the search engine a green signal to index the page. And since the same page is displayed whenever anyone requests a URL that doesn’t exist on your domain (theoretically, infinite variations are possible), this same page is indexed under multiple non-existent URLs. This is a duplicate content issue, and the search engine could possibly put a small red flag on your site. Something you definitely want to avoid.
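Both components can be satisfied by one response: a 404 status code in the header and a friendly page in the body. The sketch below assumes a Python standard-library server; `PAGES`, `NOT_FOUND_PAGE`, `respond`, and `serve` are hypothetical names chosen for illustration, and a real site would normally set this up in its web server or framework configuration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical site content: only the home page exists.
PAGES = {"/": b"<h1>Home</h1>"}

# Custom 404 page with a link back to the main page (usability component).
NOT_FOUND_PAGE = (
    b"<h1>Page not found</h1>"
    b'<p><a href="/">Back to the home page</a></p>'
)


def respond(path):
    """Return (status_code, body) for a requested path."""
    if path in PAGES:
        return 200, PAGES[path]
    # Search engine component: the status is 404, never 200.
    return 404, NOT_FOUND_PAGE


class SiteHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = respond(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Run the sketch locally; the port number is an arbitrary choice."""
    HTTPServer(("127.0.0.1", port), SiteHandler).serve_forever()
```

Requests for /abc.html, /pqr.html or /xyz.html would all show the same custom page, but each carries a 404 status, so none of them signals to a search engine that the URL should be indexed.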

The 404 myth:

The most common case is when someone thinks they have a valid 404 because they have a custom 404 page, but their server is not returning a ‘404 not found’. This is misleading. In a common scenario, the server returns a ‘200 OK’ for /abc.html, /pqr.html and /xyz.html, giving the search engine a green signal to index all three for the same 404 page. This leads to the search engine indexing the 404 page (which we don’t want) under all three URLs: a potential duplicate content issue.

 

How to check for 404’s:

Run an analysis of your site (it takes 30 seconds) on our Free Website Analyzer; it identifies 404 errors among other SEO factors.

There is a really useful Firefox plugin called ‘Live HTTP Headers’ where you can check the status code in the header to see if it’s a ‘404 not found’.
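The same check can also be scripted. This is a minimal sketch using Python's standard `urllib`; the helper names `probe_url`, `fetch_status`, and `verdict` are made up for this example. It requests a randomly named page that almost certainly doesn't exist and reports the status code the server sends back.

```python
import urllib.error
import urllib.request
import uuid


def probe_url(base_url):
    """Build a URL on the site that should not exist."""
    return base_url.rstrip("/") + "/" + uuid.uuid4().hex + ".html"


def fetch_status(url):
    """Return the HTTP status code the server sends for url.

    urlopen follows redirects, so this is the status of the final page.
    """
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; the code is still the answer.
        return error.code


def verdict(status):
    """Interpret the status code returned for a non-existent page."""
    if status == 404:
        return "ok: missing pages return 404"
    if status == 200:
        return "problem: missing pages return 200 and may be indexed"
    return "check manually: unexpected status %d" % status
```

For example, `print(verdict(fetch_status(probe_url("https://www.example.com"))))` would run the check against a site, with your own domain in place of the placeholder address.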

5 Comments

  • Clint Watson says:

    I’ve actually wondered about this issue for some time. I understand what you’re saying BUT, a question:

    If /abc.html is indexed by google wouldn’t that likely mean there was a link somewhere pointing to it? If you 404 it, won’t it drop out of the index, where if you 200 it, it will stay in the index? Or maybe you could detect any search queries bringing traffic to the custom 404 page and serve up slightly different custom 404’s (but return response code 200)? Or lastly, log the custom 404’s where visitors land (but keep returning 200 so they don’t drop out of the index) and then 301 the ones from search queries to appropriate content?

    Wow, I’m even making my own head hurt at this point…..

  • stewartie432 says:

    I found your blog on google and read a few of your other posts. I just added you to my Google News Reader. Keep up the good work. Look forward to reading more articles from you in the future.

  • Nisha Shah says:

    Very interesting post – I’m definitely going to bookmark you! Thank you for your info.

  • CodyThomas1 says:

    I still have to figure that out. Some pages of my blog at http://www.1capecoral.com/blog are indexed with the folder search in it and show 404. I was wondering if I could just put some HTML code with a link for people to click on it and go to the blog homepage for example. It’s not ideal but it’s better than just the 404.

  • DimitriAus says:

    Soo.. how do you return an HTTP “404 not found” header while still serving a custom HTML 404 page?
