View request type, URL, and response codes from Apache access log

In this article I’ll be reviewing how to use Apache access logs on either your VPS (Virtual Private Server) or dedicated server in order to determine the types of requests your website is handling.

The usage on your server is often driven by the types of requests hitting your websites. If you’ve read either of my previous articles on advanced server load monitoring, or you’ve set up a server load monitoring script, you might be aware that your server’s load average has recently been spiking.

I’ve already covered determining the cause of a server usage spike, which takes a particular load spike’s timestamp and correlates it with your Apache access logs. I’ve also gone over how to view the level of traffic with Apache access logs, which lets you view the hits per day, per hour, and per minute to your website.

In this article I’ll be going over the different types of requests that show up in your Apache access logs, specifically the request type, requested URL, and response codes.

To follow along with the instructions in this article you’ll need to have either a VPS or dedicated server so that you can SSH into the server to run the commands we’ll go over.

View types of requests from Apache access log

  1. Log in to your server via SSH.
  2. In this example our domain is example.com and the cPanel username is userna5. You can run the following command to get to the /access-logs directory for that user:
    cd ~userna5/access-logs
  3. Run this command to see what Apache access logs are present:
    ls -lahtr
    You should get back:
    drwxr-xr-x 3 root at0m 4.0K Dec 31 16:47 .
    drwx--x--x 9 root wheel 4.0K Jan 4 06:01 ..
    -rw-r----- 2 root at0m 15K Jan 9 05:09 ftp.example.com-ftp_log
    -rw-r----- 2 root at0m 3M Jan 23 13:10 example.com
  4. View request types GET/HEAD/POST from Apache access log

  5. Run the following command to see which request types occur the most. A GET means a visitor is simply requesting a resource such as an HTML page or image. A HEAD asks for only the response headers, and typically comes from a web browser or bot checking whether the requested file has been updated since it was last accessed. A POST means a visitor has filled out information in a form and is submitting it to the server, as you would see from a login attempt.
    awk '{print $6}' example.com | sort | uniq -c | sort -n
    Code breakdown:

    awk '{print $6}' example.com Use the awk command to print the 6th column of data from the example.com Apache access log, which is the request type preceded by the opening double quote of the request line.
    sort | uniq -c | sort -n Sort the request types, count the occurrences of each one, then sort the counts numerically.

    You should get back something like this:
    1089 "HEAD
    105017 "POST
    221268 "GET
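The leading double quote in that output is part of field 6. As a minimal sketch, assuming the standard combined log format, the same count can be done in a single awk pass that strips the quote and lists the most frequent method first. The sample log lines, the temporary file, and the method_counts variable below are made up for illustration and are not from a real server:

```shell
# Build a tiny sample log (hypothetical data) so the example is self-contained.
log=$(mktemp)
printf '%s\n' \
  '203.0.113.5 - - [23/Jan/2013:13:10:01 -0500] "GET /index.html HTTP/1.1" 200 1234 "-" "Mozilla/5.0"' \
  '203.0.113.5 - - [23/Jan/2013:13:10:02 -0500] "GET /logo.png HTTP/1.1" 200 512 "-" "Mozilla/5.0"' \
  '198.51.100.7 - - [23/Jan/2013:13:10:03 -0500] "POST /wp-login.php HTTP/1.1" 302 0 "-" "Mozilla/5.0"' \
  '192.0.2.9 - - [23/Jan/2013:13:10:04 -0500] "HEAD / HTTP/1.1" 200 0 "-" "curl/7.29"' \
  '198.51.100.7 - - [23/Jan/2013:13:10:05 -0500] "GET /wp-login.php HTTP/1.1" 200 2048 "-" "Mozilla/5.0"' \
  > "$log"

# Strip the leading quote from field 6, tally each method in an awk array,
# then sort the totals so the most frequent method comes first.
method_counts=$(awk '{ sub(/^"/, "", $6); count[$6]++ }
                     END { for (m in count) print count[m], m }' "$log" | sort -rn)
echo "$method_counts"

rm -f "$log"
```

On a real server you would point awk at the access log itself (example.com in this article) instead of the temporary file.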

  6. View highest requested base URLs from Apache access log

  7. Run the following command to view the most requested base URLs from your website. For instance if you have requests for wp-cron.php?doing_wp_cron=135 and wp-cron.php?doing_wp_cron=136 this strips them down to just the base URL of wp-cron.php:
    cat example.com | cut -d\" -f2 | awk '{print $1 " " $2}' | cut -d? -f1 | sort | uniq -c | sort -n
    Code breakdown:

    cat example.com | cut -d\" -f2 Use the cat command to concatenate (read) the example.com Apache access log. Then use the cut command with the -delimiter set to a double quote, escaped with a backslash so the shell passes it through literally, and print the 2nd -field. That field is the text between the first and second double quotes, which is the request line.
    awk '{print $1 " " $2}' Use the awk command to print out the 1st column of data which is the request type, a space, then the 2nd column of data which is the requested URL.
    cut -d? -f1 Use the cut command with the -delimiter set to a question mark ?, then print the 1st -field, which is everything before the first question mark. This gives us back just the base URL.
    sort | uniq -c | sort -n Sort all the data, uniquely count them, then numerically sort them from lowest to highest.

    You should get back something like this:
    355 GET /wp-login.php
    1448 POST /wp-login.php
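On a busy site the full list can run to thousands of lines. The following is a rough sketch, assuming the combined log format, that does the quote splitting and query-string stripping inside awk and keeps only the ten most requested base URLs. The sample log lines and the top_urls variable are hypothetical:

```shell
# Hypothetical sample log so the example runs on its own.
log=$(mktemp)
printf '%s\n' \
  '192.0.2.9 - - [23/Jan/2013:13:10:01 -0500] "GET /wp-cron.php?doing_wp_cron=135 HTTP/1.1" 200 0 "-" "WordPress"' \
  '192.0.2.9 - - [23/Jan/2013:13:10:02 -0500] "GET /wp-cron.php?doing_wp_cron=136 HTTP/1.1" 200 0 "-" "WordPress"' \
  '198.51.100.7 - - [23/Jan/2013:13:10:03 -0500] "POST /wp-login.php HTTP/1.1" 302 0 "-" "Mozilla/5.0"' \
  > "$log"

# -F'"' splits each line on double quotes, so $2 is the request line,
# for example: GET /wp-cron.php?doing_wp_cron=135 HTTP/1.1
top_urls=$(awk -F'"' '{
    split($2, req, " ")        # req[1] is the method, req[2] is the URL
    sub(/\?.*/, "", req[2])    # drop the query string, leaving the base URL
    print req[1], req[2]
}' "$log" | sort | uniq -c | sort -rn | head -n 10)
echo "$top_urls"

rm -f "$log"
```

Here sort -rn puts the highest counts at the top and head -n 10 trims the list. With the sample data, the two wp-cron.php requests collapse into a single line.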

  8. View highest requested unique URLs from Apache access log

  9. Run the following command to view the most requested unique URLs from your website. For instance if you have requests for wp-cron.php?doing_wp_cron=135 and wp-cron.php?doing_wp_cron=136 these will be treated as unique requests. This is a good method to use to see if one exact URL is getting hit again and again:
    cat example.com | cut -d\" -f2 | awk '{print $1 " " $2}' | sort | uniq -c | sort -n
    Code breakdown:

    cat example.com | cut -d\" -f2 Use the cat command to concatenate (read) the example.com Apache access log. Then use the cut command with the -delimiter set to a double quote, escaped with a backslash so the shell passes it through literally, and print the 2nd -field. That field is the text between the first and second double quotes, which is the request line.
    awk '{print $1 " " $2}' Use the awk command to print out the 1st column of data which is the request type, a space, then the 2nd column of data which is the full unique requested URL.
    sort | uniq -c | sort -n Sort all the data, uniquely count them, then numerically sort them from lowest to highest.

    You should get back something like this:
    2 POST /wp-login.php?action=register
    4 GET /wp-login.php
    157 GET /wp-login.php?registration=disabled
    190 GET /wp-login.php?action=register
    1446 POST /wp-login.php
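If you only care about one script, you can narrow the unique URL counts down to it. This is a sketch under the assumption that the log uses the combined format. The path variable, the sample log lines, and the variants variable are illustrative choices of mine:

```shell
# Hypothetical sample log so the example runs on its own.
log=$(mktemp)
printf '%s\n' \
  '198.51.100.7 - - [23/Jan/2013:13:10:01 -0500] "POST /wp-login.php HTTP/1.1" 302 0 "-" "Mozilla/5.0"' \
  '198.51.100.7 - - [23/Jan/2013:13:10:02 -0500] "POST /wp-login.php HTTP/1.1" 302 0 "-" "Mozilla/5.0"' \
  '198.51.100.7 - - [23/Jan/2013:13:10:03 -0500] "POST /wp-login.php HTTP/1.1" 302 0 "-" "Mozilla/5.0"' \
  '192.0.2.9 - - [23/Jan/2013:13:10:04 -0500] "GET /wp-login.php?action=register HTTP/1.1" 302 0 "-" "Mozilla/5.0"' \
  '192.0.2.9 - - [23/Jan/2013:13:10:05 -0500] "GET /wp-login.php?action=register HTTP/1.1" 302 0 "-" "Mozilla/5.0"' \
  '203.0.113.5 - - [23/Jan/2013:13:10:06 -0500] "GET /wp-login.php HTTP/1.1" 200 2048 "-" "Mozilla/5.0"' \
  '203.0.113.5 - - [23/Jan/2013:13:10:07 -0500] "GET /index.html HTTP/1.1" 200 1234 "-" "Mozilla/5.0"' \
  > "$log"

path='/wp-login.php'   # hypothetical path of interest

# Keep the full URL (query string included) in the output, but compare
# only the part before the question mark against the chosen path.
variants=$(awk -F'"' -v path="$path" '{
    split($2, req, " ")        # req[1] is the method, req[2] is the URL
    url = req[2]
    base = url
    sub(/\?.*/, "", base)      # base URL used only for the comparison
    if (base == path) print req[1], url
}' "$log" | sort | uniq -c | sort -rn)
echo "$variants"

rm -f "$log"
```

Because the comparison is made on the base URL, every query-string variant of the script is counted separately while requests for other files are ignored.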

  10. View response codes from Apache access log

  11. Run the following command to view the most common response codes your server is returning to visitors.
    cat example.com | awk '{print $9}' | sort | uniq -c | sort -n
    Code breakdown:

    cat example.com | awk '{print $9}' Use the cat command to concatenate (read) the example.com Apache access log. Then use the awk command to only print out the 9th column of data which is the response code.
    sort | uniq -c | sort -n Sort all the data, uniquely count them, then numerically sort them from lowest to highest.

    You should get back something like this:

    306 500
    862 404
    893 301
    10012 302
    12485 200
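Once you know which response codes are common, a natural next step is to see which URLs produce a particular code. Here is a minimal sketch, assuming the combined log format where field 7 is the requested URL and field 9 is the response code. The status value, the sample log lines, and the error_urls variable are assumptions for illustration:

```shell
# Hypothetical sample log so the example runs on its own.
log=$(mktemp)
printf '%s\n' \
  '203.0.113.5 - - [23/Jan/2013:13:10:01 -0500] "GET /missing.png HTTP/1.1" 404 209 "-" "Mozilla/5.0"' \
  '203.0.113.5 - - [23/Jan/2013:13:10:02 -0500] "GET /missing.png HTTP/1.1" 404 209 "-" "Mozilla/5.0"' \
  '198.51.100.7 - - [23/Jan/2013:13:10:03 -0500] "GET /old-page HTTP/1.1" 404 209 "-" "Mozilla/5.0"' \
  '192.0.2.9 - - [23/Jan/2013:13:10:04 -0500] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0"' \
  '192.0.2.9 - - [23/Jan/2013:13:10:05 -0500] "GET /wp-cron.php HTTP/1.1" 500 0 "-" "WordPress"' \
  > "$log"

status=404   # hypothetical response code to investigate

# Print the requested URL (field 7) for every line whose response code
# (field 9) matches, then count the URLs and list the worst offenders first.
error_urls=$(awk -v status="$status" '$9 == status { print $7 }' "$log" \
    | sort | uniq -c | sort -rn | head -n 10)
echo "$error_urls"

rm -f "$log"
```

Changing status to 500 would list the URLs that are triggering internal server errors instead.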

You should now understand how to extract data out of your Apache access logs to get a good idea of the type of requests that your server is having to process.

Thoughts on “View request type, URL, and response codes from Apache access log”

  • Hello,

    The article was very helpful in finding what I was looking for. However, Step 5, under View highest requested base URLs from Apache access log, is not giving any output on my end. Could you help me correct the command?

    I get no output, as shown below.

    [root@server access-logs]# cat example.com | cut -d" -f2 | awk '{print $1 " " $2}' | cut -d? -f1 | sort |uniq -c | sort -n
    >
    >

    • Hi, Shiva! Just to clarify, did you replace ‘example.com’ with your domain when you ran the command? I still leave it in myself sometimes, so I’m just double checking!

      Anyway, I ran some tests with some of our Support Agents and it looks like we need to update the code in the article. Try using this version of the command:
      cat example.com | cut -d\" -f2 | awk '{print $1 " " $2}' | cut -d? -f1 | sort | uniq -c | sort -n
      Notice that we changed the -d" to -d\". That seems to be resolving the issue! I’ll update the article accordingly.

  • Great article.

    When I see HTTP/1.1" 500 - "-" returned for a HEAD request in the Apache logs, does this mean the check on the requested file failed, or is it a timeout issue?

  • Hi,

    Great article.

    I am looking for the top 10 slowest URLs. How would I build that with AWK? Response times are logged with %D (microseconds) in the log.

    Thanks!

    • Unfortunately, using the Apache logs to identify which of your pages loads the slowest is not very reliable, as many other factors, such as visitor location, affect load times. Your best solution would be to run your various pages through a tool such as GTmetrix, which provides much better data.
