In this article we’ll discuss how the way 404 errors are handled can impact your site. These errors are caused by requests for pages or resources that don’t exist on the server.
By default, many of the applications you might install on your website will try to dynamically process requests for pages that don’t exist.
This means that each time someone mistypes a URL, your application queries the database to figure out whether that page exists, and if it doesn’t, it serves a page-not-found error to the visitor.
In most cases this default 404 handling is completely harmless. However, as the number of 404 requests the server has to process dynamically increases, it can become more and more problematic.
Causes of 404 errors
Let’s take a look at some of the most common reasons a 404 error might be generated.
- A visitor tries to manually type in a page that doesn’t exist.
- Your pages were not internally or externally linked correctly in the first place.
- Resources such as images, .css, .js, or other files were removed from the server.
- One of your pages was moved, and internal or external links to it weren’t updated.
- Search engine crawlers or automated bots try to access generic pages that your site doesn’t have.
404 error page handling
A good webmaster will want to periodically check on their website statistics to see what requests are generating the most 404 errors.
You can use this information to your advantage. For example, let’s say you notice 500 people a month are trying to pull up a page called /about that doesn’t exist. You can then create that page, or set up a redirect on that URL to get those visitors pointed in the right direction and keep them coming back to your site.
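If you also have access to your raw access logs, a short script can produce a similar ranking of the most-requested missing URLs. The following is only a rough sketch: it assumes your log lines follow the Apache combined log format, and the names top_404_paths and sample_log are made up for this illustration.

```python
import re
from collections import Counter

# Pulls the requested path and the status code out of a log line such as:
# 1.2.3.4 - - [date] "GET /about HTTP/1.1" 404 512 "-" "Mozilla/5.0"
# Assumption: the log uses the Apache common/combined format.
LOG_LINE = re.compile(r'"[A-Z]+ (?P<path>\S+)[^"]*" (?P<status>\d{3}) ')


def top_404_paths(lines, limit=10):
    """Return (path, count) pairs for the most frequent 404 requests."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("status") == "404":
            # Drop the query string so /about?x=1 and /about count together.
            path = match.group("path").split("?", 1)[0]
            counts[path] += 1
    return counts.most_common(limit)


# Hypothetical sample data standing in for a real access log.
sample_log = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /about HTTP/1.1" 404 512 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [10/Oct/2023:13:56:01 +0000] "GET /about?ref=mail HTTP/1.1" 404 512 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [10/Oct/2023:13:57:12 +0000] "GET /old.css HTTP/1.1" 404 209 "-" "Mozilla/5.0"',
    '203.0.113.5 - - [10/Oct/2023:13:58:40 +0000] "GET /index.html HTTP/1.1" 200 4096 "-" "Mozilla/5.0"',
]

for path, count in top_404_paths(sample_log):
    print(count, path)
```

To run this against a real log, you would pass an open file object in place of sample_log. A URL that shows up near the top of the list is a good candidate for a new page or a redirect.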
Viewing 404 errors via Awstats
You can view your 404 errors via Awstats from within cPanel.
Once you’re logged into Awstats, scroll to the very bottom of your statistics report. There you should see the total number of 404s encountered and what percentage of your traffic that represents.
Then simply click on the 404 link to see the actual requests triggering these errors.
In this example you can clearly see that the URL /directory/search-all.php is by far the biggest cause of the 404 errors on this site.
You can also see that there are several resources such as .css stylesheets and .png and .gif images that are missing.
Note the Referrer listed for each of these resources, as that is the page containing the invalid link to the file that doesn’t exist.
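The same referrer lookup can be sketched against a raw access log. This example is only an illustration: it assumes the Apache combined log format (which records the referrer in quotes after the response size), and the names referrers_for_404s and sample_log are hypothetical.

```python
import re
from collections import defaultdict

# Captures the path, status code, and referrer from a combined-format line:
# ... "GET /logo.png HTTP/1.1" 404 209 "https://example.com/contact" "agent"
# Assumption: the referrer is the first quoted field after the response size.
LOG_LINE = re.compile(
    r'"[A-Z]+ (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)"'
)


def referrers_for_404s(lines):
    """Map each missing path to the sorted list of pages that linked to it."""
    pages = defaultdict(set)
    for line in lines:
        match = LOG_LINE.search(line)
        if not match or match.group("status") != "404":
            continue
        referrer = match.group("referrer")
        # "-" means the visitor's browser sent no referrer at all.
        if referrer and referrer != "-":
            pages[match.group("path")].add(referrer)
    return {path: sorted(refs) for path, refs in pages.items()}


# Hypothetical sample data standing in for a real access log.
sample_log = [
    '203.0.113.5 - - [10/Oct/2023:14:01:00 +0000] "GET /images/logo.png HTTP/1.1" 404 209 "https://example.com/contact" "Mozilla/5.0"',
    '203.0.113.6 - - [10/Oct/2023:14:02:00 +0000] "GET /images/logo.png HTTP/1.1" 404 209 "https://example.com/" "Mozilla/5.0"',
    '203.0.113.7 - - [10/Oct/2023:14:03:00 +0000] "GET /css/old.css HTTP/1.1" 404 209 "-" "Mozilla/5.0"',
    '203.0.113.8 - - [10/Oct/2023:14:04:00 +0000] "GET /index.html HTTP/1.1" 200 4096 "-" "Mozilla/5.0"',
]

for path, refs in referrers_for_404s(sample_log).items():
    print(path, "is linked from", ", ".join(refs))
```

Each page listed for a missing file is a place where the link should be corrected or removed. Requests with no referrer are skipped here, since they give no hint about where a broken link lives.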
Creating static custom 404 error pages and handling excessive requests
Now that we’ve seen a little bit about what a 404 error is, you might wish to dive deeper and configure a custom 404 error page to help steer your visitors to other sections of your site when they try to access a page that doesn’t exist.
Or, if too many 404 errors are causing performance issues, a static 404 error page can help with this as well. The following articles go more in-depth on handling 404 errors and are recommended reading for anyone looking to get the big picture on 404 error handling.