Causes for 404 Error Code and Solutions
This morning I received an email from one of my blog readers who is a bit disappointed by the overwhelming number of 404 error codes showing up in his server log. He knows that the 404 HTTP status code stands for “Not Found”, but that confuses him further because all his links work fine and none of them leads to a missing page. (Well, I guess he is missing the fine difference between “Not Found” and “Page not found”: a 404 is returned for any requested resource the server cannot find, not just for pages.)
Instead of answering his email, I thought I would put up a post here so it can help all my readers who are stuck with such 404 error issues.
W3.org defines the 404 HTTP status code as “Not Found”. It further says:
The server has not found anything matching the Request-URI. No indication is given of whether the condition is temporary or permanent. The 410 (Gone) status code SHOULD be used if the server knows, through some internally configurable mechanism, that an old resource is permanently unavailable and has no forwarding address. This status code is commonly used when the server does not wish to reveal exactly why the request has been refused, or when no other response is applicable.
Here are some of the common causes of 404 errors:
- Moving a Web Page – At times you reorganize your website or move content from one page to another for various reasons. Even after you have removed the old page, you will likely still have references to it both on-site (links from other pages) and off-site (search engine result pages, links from external sites, etc.). While you can update or remove the internal references, you cannot control the external ones, and any visitor following those links gets a 404 error.
Solution: Set up a 301 redirect from the old page to the new page. All your visitors will be able to find the page, and search engines will recognize the move, drop the old page and index the new one.
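To make the idea concrete, here is a minimal sketch of the decision a server makes once a redirect is in place. The `REDIRECTS` table, its paths and the `resolve` function are made up for illustration; on a real site you would normally configure this in the web server itself rather than in application code.

```python
# Hypothetical table of moved pages: old path -> new path.
REDIRECTS = {
    "/old-page.html": "/new-page.html",
    "/services/seo.html": "/seo/",
}


def resolve(path, existing_paths):
    """Return (status, headers) for a requested path.

    existing_paths is the set of paths that currently exist on the site.
    """
    if path in REDIRECTS:
        # 301 tells browsers and search engines the move is permanent.
        return 301, {"Location": REDIRECTS[path]}
    if path in existing_paths:
        return 200, {}
    # Nothing matches the requested path, so the answer is "Not Found".
    return 404, {}
```

A visitor following a stale external link to `/old-page.html` would then be sent on to `/new-page.html` instead of hitting a 404.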
- Mistyping by Type-in Traffic – If your website is very popular, you might be getting a lot of type-in traffic. Any URL mistyped by those visitors returns a 404 error page.
Solution: There is no real solution for this, because you don’t know what your visitors are going to type. As a best practice, use file names that are logical, short and easy to remember. You can also win back some of this traffic with a custom 404 error page (I will write a separate post on this soon).
- Misspelled URLs in External Links – It’s always good to get links from other websites, but if the URL is misspelled, every visitor who follows the link gets a 404 status code.
Solution: Monitor your 404 error logs and check whether the requests are referred by a particular site. If it is a case of a misspelled URL, contact the webmaster and ask them to correct the link. If that is not possible, set up a 301 redirect from the misspelled URL to the intended page.
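As a rough sketch of that monitoring step, the snippet below counts 404 responses per referrer and requested path. It assumes your access log uses the common "combined" format, where the referrer is the quoted field after the response size; the `LOG_LINE` pattern and the `count_404s` name are my own and may need adjusting for your server.

```python
import re
from collections import Counter

# Matches the request, status, size and referrer fields of a
# combined-format access log line (an assumption about your log format).
LOG_LINE = re.compile(
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)"'
)


def count_404s(lines):
    """Count 404 responses, keyed by (referrer, requested path)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("status") == "404":
            counts[(match.group("referrer"), match.group("path"))] += 1
    return counts
```

Calling `count_404s(open("access.log"))` and then `.most_common(10)` on the result should show which external pages send the most visitors to misspelled URLs (the file name here is just a placeholder).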
- Renaming Documents & Change of URL Structure – At times pages are renamed or URL structures are modified. This is common with old dynamic websites that are moving from complex, parameter-laden URLs to clean, static-looking URLs through URL rewriting. Here too, the old URLs and file references remain in various locations on and off the site, and any visitor trying to access them gets a 404 error.
Solution: Again, a 301 redirect is the best way to avoid these 404 error codes.
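When a whole URL structure changes, listing every old URL by hand quickly becomes impractical, so the redirect target is usually computed from the old URL. The sketch below assumes a purely hypothetical old scheme of `/product.php?id=42` and a new one of `/product/42/`; substitute your own patterns.

```python
from urllib.parse import parse_qs, urlsplit


def static_url_for(old_url):
    """Map a hypothetical old dynamic URL to its new static-looking form.

    Returns the new path to use as the 301 target, or None when the URL
    does not follow the old scheme.
    """
    parts = urlsplit(old_url)
    if parts.path != "/product.php":
        return None
    ids = parse_qs(parts.query).get("id")
    # Only accept a numeric id; anything else is not a URL we know how to map.
    if not ids or not ids[0].isdigit():
        return None
    return f"/product/{ids[0]}/"
```

Any request for which this returns a path can be answered with a 301 to that path; everything else falls through to normal handling.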
- Case-Sensitive File Names – While this is not an issue with IIS on Windows, it can be a major problem if your website is on a Unix or Linux server, where file names are case-sensitive and Apache treats them accordingly. If your file is named seo.html and you have linked to SEO.html, your server returns a 404 HTTP status code.
Solution: Be very careful when naming your files. It’s a generally accepted practice to keep file names in lower case, and following that convention should save you much trouble if you are on a Unix or Linux server.
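If you already follow the lower-case convention, links with the wrong case can be rescued rather than left to fail. This is only a sketch under that assumption: `known_paths` is a hypothetical set of the lower-case paths that exist on your site, and `handle_case` is a name I made up.

```python
def handle_case(path, known_paths):
    """Return (status, location) for a request on a site with lower-case paths.

    known_paths is the set of paths that exist, all in lower case.
    """
    if path in known_paths:
        return 200, None
    lowered = path.lower()
    if lowered in known_paths:
        # Same page, wrong case: send the visitor to the canonical URL.
        return 301, lowered
    return 404, None
```

With `known_paths = {"/seo.html"}`, a request for `/SEO.html` is redirected to `/seo.html` instead of returning a 404.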
- Old Domains – Many of us buy an old domain and put our site on it, lured by those few extra backlinks, PR juice and domain authority. These domains might still have some of their old pages indexed in search engines or linked from other websites, which can bring you a trickle of visitors, most of whom get a 404 error code (unless, by coincidence, some of your URLs are the same as the old site’s).
Solution: You can probably ignore this traffic unless it is highly relevant to your website and the volume is good. If you do see a good amount of traffic, you can always set up a new page at the old URL.
- Missing Favicon.ico – Favicon.ico is a small graphic icon that is displayed in the address bar when you open a URL in Firefox or when you bookmark a URL in IE. It’s great for branding to have a small logo beside your URL or beside your bookmarked URL. Some browsers, like Mozilla, request the favicon.ico file whenever they open a page, and IE requests it whenever someone tries to bookmark a page. If you don’t have a favicon.ico, your server returns a 404 HTTP status code to the user agent.
Solution: Simple! Add a favicon.ico to your website. If you are good with graphics, you can create it yourself using Photoshop or a similar program. Alternatively, use this Favicon generator.
- Missing Robots.txt – Search engine spiders and bots request the robots.txt file whenever they visit your website. A robots.txt file contains the directives that tell the bots which parts of the website they should not spider. These days the robots.txt file is also used to provide links to XML sitemaps, so it is easier for spiders to crawl the entire website. Each website should have a robots.txt file in its root directory. Even if you don’t want to block any part of your website from the spiders, upload a blank robots.txt.
Solution: Make sure you have a robots.txt file in your root directory. In case you want to know more, read how to create a robots.txt.
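The last two causes are easy to check for automatically. The sketch below looks for favicon.ico and robots.txt in a site's root directory and can create a permissive robots.txt if none exists. The function names are my own, and it assumes you can reach the document root as a local folder; the `User-agent: *` line with an empty `Disallow:` is the standard way of saying that nothing is blocked.

```python
from pathlib import Path

# Files that browsers and crawlers request from the site root.
REQUIRED_ROOT_FILES = ("robots.txt", "favicon.ico")

# Applies to every crawler; an empty Disallow blocks nothing.
PERMISSIVE_ROBOTS = "User-agent: *\nDisallow:\n"


def missing_root_files(doc_root):
    """List the required files that are absent from the document root."""
    root = Path(doc_root)
    return [name for name in REQUIRED_ROOT_FILES if not (root / name).is_file()]


def ensure_robots_txt(doc_root):
    """Create a permissive robots.txt unless one already exists."""
    target = Path(doc_root) / "robots.txt"
    if not target.exists():
        target.write_text(PERMISSIVE_ROBOTS, encoding="utf-8")
    return target
```

Running `missing_root_files` against your document root tells you which of the two files would currently produce 404 entries in your log.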