
In this article I'm going to teach you how to parse archived raw access logs on your cPanel VPS (Virtual Private Server) or dedicated server. Reviewing the requests in these logs can bring to light a problematic request or user that is causing server issues and that you might not have caught otherwise.

Before trying to follow along with this guide, you should have already read my article about how to enable raw access log archiving for all cPanel accounts so that you actually have archived raw access logs to review.

The method we'll be going over for parsing these raw access logs is very handy because you can do it directly on the server. Accessing the raw access logs through cPanel instead requires you to download the logs to your own computer first.

To follow along with this guide you'll need root access to either your VPS or dedicated server so that you have full access to read all of the archived logs.

Review archived raw access logs

Using the steps below I'll show you how to connect to your server and run a command to read through your various archived raw access logs.

  1. Log in to your server via SSH as the root user.
  2. Review all requests that happened during the month of January 2013 using the following command:

    zgrep "Jan/2013" /home/*/logs/*-Jan-2013.gz | less

    You'll be able to use Page Up and Page Down to scroll up and down through all of the log data.

    You can also type a forward slash / to put the less command into search mode. For instance, typing / followed by 08/Jan and pressing Enter will drop you right to the first log entry for January 8th. Include the leading zero, because a search for 8/Jan would also match the 18th and 28th.

    Once you are done reviewing the log this way, you can simply hit q to quit the less command.

    You should see entries like the following. In this case we can see these lines are from our example.com site belonging to the userna5 user:

    /home/userna5/logs/example.com-Jan-2013.gz:123.123.123.123 - - [01/Jan/2013:00:09:10 -0500] "GET /category/linux/ HTTP/1.1" 200 3063 "-" "Mozilla/5.0 (compatible; AhrefsBot/4.0; +http://ahrefs.com/robot/)"
    /home/userna5/logs/example.com-Jan-2013.gz:123.123.123.123 - - [01/Jan/2013:02:57:05 -0500] "GET /2010/12/ HTTP/1.1" 200 5197 "-" "Mozilla/5.0 (compatible; AhrefsBot/4.0; +http://ahrefs.com/robot/)"
    /home/userna5/logs/example.com-Jan-2013.gz:123.123.123.123 - - [01/Jan/2013:04:06:32 -0500] "POST /wp-cron.php HTTP/1.0" 200 - "-" "WordPress/3.4.1; http://atomlabs.net"
    /home/userna5/logs/example.com-Jan-2013.gz:123.123.123.123 - - [01/Jan/2013:04:06:29 -0500] "GET /wp-login.php HTTP/1.1" 200 2147 "-" "Mozilla/5.0 (compatible; AhrefsBot/4.0; +http://ahrefs.com/robot/)"
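If you're not sure which day to jump to in less, a per-day request count can point you at the busiest dates first. The following is only a rough sketch: requests_per_day is a hypothetical helper name of my own, and it assumes the log format shown above, where the timestamp is the fourth whitespace-separated field.

```shell
# Hypothetical helper (not part of cPanel): count requests per day.
# Reads access-log lines on stdin and assumes the 4th whitespace-separated
# field looks like [01/Jan/2013:00:09:10
requests_per_day() {
    awk '{ day = substr($4, 2, 11); count[day]++ }
         END { for (d in count) print count[d], d }' | sort -k2
}

# Example usage on the server (path taken from the example above):
# zgrep "Jan/2013" /home/userna5/logs/example.com-Jan-2013.gz | requests_per_day
```

Each output line is a request count followed by the date, sorted by date, so a day with an unusually high count is a good place to start searching.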

Parse IPs from archived raw access logs

Below I'll show how to parse out all of the IP addresses from your raw access logs for the example.com domain.

  1. Run this command:

    zgrep "Jan/2013" /home/userna5/logs/example.com-Jan-2013.gz | awk '{print $1}' | sort -n | uniq -c | sort -n

    This will spit back info like this:

    76 123.123.123.129
    80 123.123.123.124
    599 123.123.123.125
    6512 123.123.123.123
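Once one IP address stands out, a natural next step is to see what that address was requesting. Here is a minimal sketch of how that might look. The urls_for_ip name is hypothetical, and it assumes the IP is the first field of each line and the requested URL is the seventh, as in the sample log entries earlier in this article.

```shell
# Hypothetical helper: list the URLs requested by one IP, most frequent first.
# Reads access-log lines on stdin; $1 is the IP address to look for.
urls_for_ip() {
    # Compare the whole first field so 1.2.3.4 does not also match 1.2.3.40
    awk -v ip="$1" '$1 == ip { print $7 }' | sort | uniq -c | sort -rn
}

# Example usage on the server:
# zgrep "Jan/2013" /home/userna5/logs/example.com-Jan-2013.gz | urls_for_ip 123.123.123.123
```

I compare the full first field here rather than using a plain grep for the address, since a substring match could also pick up longer addresses that merely begin with the same digits.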

Parse User-Agents from archived raw access logs

Below I'll show how to parse out all of the User-Agents from your raw access logs for the example.com domain.

  1. Run this command:

    zgrep "Jan/2013" /home/userna5/logs/example.com-Jan-2013.gz | awk -F\" '{print $6}' | sort | uniq -c | sort -n

    This will spit back info like this:

    192 Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)
    340 Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)
    1509 Mozilla/5.0 (compatible; SISTRIX Crawler; http://crawler.sistrix.net/)
    5548 Mozilla/5.0 (compatible; AhrefsBot/4.0; +http://ahrefs.com/robot/)
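If a single crawler dominates the list, it can help to know which IP addresses it is coming from. The sketch below is one way to approach that. The ips_for_agent name is my own invention, and it assumes the User-Agent is the sixth field when splitting on double quotes, the same layout the command above relies on.

```shell
# Hypothetical helper: count requests per IP for User-Agents containing a string.
# Reads access-log lines on stdin; $1 is a plain substring such as AhrefsBot.
ips_for_agent() {
    awk -F'"' -v ua="$1" '
        index($6, ua) {              # literal substring match, not a regex
            split($1, f, " ")        # text before the first quote starts with the IP
            print f[1]
        }' | sort | uniq -c | sort -rn
}

# Example usage on the server:
# zgrep "Jan/2013" /home/userna5/logs/example.com-Jan-2013.gz | ips_for_agent AhrefsBot
```

The match is case-sensitive and treats the search string literally, so characters like dots and slashes in a User-Agent need no escaping.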

Parse requested URLs from archived raw access logs

Below I'll show how to parse out all of the requested URLs from your raw access logs for the example.com domain.

  1. Run this command:

    zgrep "Jan/2013" /home/userna5/logs/example.com-Jan-2013.gz | awk '{print $7}' | sort | uniq -c | sort -n

    This will spit back info like this:

    172 /wp-login.php
    201 /robots.txt
    380 /
    2017 /opencart/undefined
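One thing to watch for with requested URLs is that the same page requested with different query strings is counted as separate entries. As a rough sketch, you could strip everything from the ? onward before counting. The top_paths name is hypothetical, and the seventh-field assumption is the same one the command above makes.

```shell
# Hypothetical helper: count requested paths with any query string removed.
# Reads access-log lines on stdin; the 7th whitespace-separated field is the URL.
top_paths() {
    awk '{ path = $7; sub(/\?.*/, "", path); print path }' |
        sort | uniq -c | sort -n
}

# Example usage on the server:
# zgrep "Jan/2013" /home/userna5/logs/example.com-Jan-2013.gz | top_paths
```

With this grouping, requests such as /search?q=one and /search?q=two are both counted under /search, which makes a heavily hit script easier to spot.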

Parse referrers from archived raw access logs

Below I'll show how to parse out all of the referrers from your raw access logs for the example.com domain.

  1. Run this command:

    zgrep "Jan/2013" /home/userna5/logs/example.com-Jan-2013.gz | awk -F\" '{print $4}' | sort | uniq -c | sort -n

    This will spit back info like this:

    219 http://example.com/prestashop/index.php
    337 http://example.com/list/admin/
    2009 http://example.com/
    2522 http://example.com/opencart/
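As the sample output shows, most referrers are usually pages on your own site. If you only care about traffic arriving from elsewhere, something along these lines might help. This is a sketch under assumptions: external_referrers is a hypothetical name, the referrer is taken to be the fourth field when splitting on double quotes, and the host name you pass in must match exactly (so www.example.com would count as external to example.com).

```shell
# Hypothetical helper: count referrers that do not come from your own host.
# Reads access-log lines on stdin; $1 is your own host name, e.g. example.com.
external_referrers() {
    awk -F'"' -v host="$1" '
        $4 == "-" { next }                   # no referrer was sent
        {
            ref = $4
            sub(/^https?:\/\//, "", ref)     # drop the scheme
            sub(/\/.*/, "", ref)             # keep only the host part
            if (ref != host) print $4
        }' | sort | uniq -c | sort -rn
}

# Example usage on the server:
# zgrep "Jan/2013" /home/userna5/logs/example.com-Jan-2013.gz | external_referrers example.com
```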

You should now understand how to parse the archived raw access logs on your server to get a better picture of the requests your sites have been receiving, including any that may be causing server usage issues.

You might also be interested in reading my article about blocking unwanted users from your site using .htaccess for an in-depth explanation of how you could block any users that are sending an excessive number of requests to your sites.
