Parse archived raw access logs from cPanel

Updated on August 16, 2021 by InMotion Hosting Contributor

In this article I’ll show you how to parse archived raw access logs on your cPanel VPS (Virtual Private Server) or dedicated server. Reviewing requests in your archived raw access logs can bring to light a problematic request or user that keeps causing server issues and that you might not have caught otherwise.

Before following along with this guide, you should have already read my article about how to enable raw access log archiving for all cPanel accounts, so that you actually have archived raw access logs to review.

The method covered here is handy because you can do it on the server directly, instead of accessing the raw access logs in cPanel, which requires you to download the logs to your own computer first.

To follow along you’ll need root access to your VPS or dedicated server so that you can read all of the archived logs.

Review archived raw access logs

Using the steps below I’ll show you how to connect to your server and run a command to read through your archived raw access logs.

1. Log in to your server via SSH as the root user.
2. Review all requests that happened during the month of January 2013 using the following command:

    zgrep "Jan/2013" /home/*/logs/*-Jan-2013.gz | less

You can use Page Up and Page Down to scroll through the log data. You can also type a forward slash /, which puts the less command into search mode. For instance, if you type / followed by 8/Jan and press Enter, you’ll be dropped right to the section of the logs for January 8th. Once you are done reviewing the log this way, press q to quit the less command.
You should see entries like the following. In this case the lines are from the example.com site belonging to the userna5 user:

    /home/userna5/logs/example.com-Jan-2013.gz:123.123.123.123 - - [01/Jan/2013:00:09:10 -0500] "GET /category/linux/ HTTP/1.1" 200 3063 "-" "Mozilla/5.0 (compatible; AhrefsBot/4.0; +https://ahrefs.com/robot/)"
    /home/userna5/logs/example.com-Jan-2013.gz:123.123.123.123 - - [01/Jan/2013:02:57:05 -0500] "GET /2010/12/ HTTP/1.1" 200 5197 "-" "Mozilla/5.0 (compatible; AhrefsBot/4.0; +https://ahrefs.com/robot/)"
    /home/userna5/logs/example.com-Jan-2013.gz:123.123.123.123 - - [01/Jan/2013:04:06:32 -0500] "POST /wp-cron.php HTTP/1.0" 200 - "-" "WordPress/3.4.1; https://atomlabs.net"
    /home/userna5/logs/example.com-Jan-2013.gz:123.123.123.123 - - [01/Jan/2013:04:06:29 -0500] "GET /wp-login.php HTTP/1.1" 200 2147 "-" "Mozilla/5.0 (compatible; AhrefsBot/4.0; +https://ahrefs.com/robot/)"

Parse IPs from archived raw access logs

Below I’ll show how to parse out all of the IP addresses from your raw access logs for the example.com domain. Run this command:

    zgrep "Jan/2013" /home/userna5/logs/example.com-Jan-2013.gz | awk '{print $1}' | sort -n | uniq -c | sort -n

Because only one file is searched here, zgrep does not prefix each line with the file name, so the IP address is the first field. If you search several files with a wildcard, as in the previous section, each line starts with the file name and a colon; in that case add sed 's#:# #' before awk and print $2 instead.

This will return output like this:

      76 123.123.123.129
      80 123.123.123.124
     599 123.123.123.125
    6512 123.123.123.123

Parse User-Agents from archived raw access logs

Below I’ll show how to parse out all of the User-Agents from your raw access logs for the example.com domain.
Run this command:

    zgrep "Jan/2013" /home/userna5/logs/example.com-Jan-2013.gz | awk -F'"' '{print $6}' | sort | uniq -c | sort -n

Here awk splits each line on double quotes, which makes the User-Agent the sixth field. This will return output like this:

     192 Mozilla/5.0 (compatible; YandexBot/3.0; +https://yandex.com/bots)
     340 Mozilla/5.0 (compatible; Baiduspider/2.0; +https://www.baidu.com/search/spider.html)
    1509 Mozilla/5.0 (compatible; SISTRIX Crawler; https://crawler.sistrix.net/)
    5548 Mozilla/5.0 (compatible; AhrefsBot/4.0; +https://ahrefs.com/robot/)

Parse requested URLs from archived raw access logs

Below I’ll show how to parse out all of the requested URLs from your raw access logs for the example.com domain. Run this command:

    zgrep "Jan/2013" /home/userna5/logs/example.com-Jan-2013.gz | awk '{print $7}' | sort | uniq -c | sort -n

This will return output like this:

     172 /wp-login.php
     201 /robots.txt
     380 /
    2017 /opencart/undefined

Parse referrers from archived raw access logs

Below I’ll show how to parse out all of the referrers from your raw access logs for the example.com domain. Run this command:

    zgrep "Jan/2013" /home/userna5/logs/example.com-Jan-2013.gz | awk -F'"' '{print $4}' | sort | uniq -c | sort -n

Splitting on double quotes again, the referrer is the fourth field. This will return output like this:

     219 https://example.com/prestashop/index.php
     337 https://example.com/list/admin/
    2009 https://example.com/
    2522 https://example.com/opencart/

You should now understand how to parse your archived raw access logs on the server to get a better picture of the requests that have been coming in and possibly causing server usage issues.

You might also be interested in my article about blocking unwanted users from your site using .htaccess, which explains in depth how to block any users that are sending an excessive number of requests to your sites.
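If you find yourself running these four reports often, a small shell function can wrap them. The following is only a sketch under a few assumptions: log_top is a hypothetical name of my own and not a cPanel tool, it expects a single archive whose lines look like the log entries shown earlier, and it uses gzip -dc piped into grep in place of zgrep, which yields the same lines when only one file is read. Unlike the commands above, it sorts the busiest values first and trims the list with head.

```shell
# Hypothetical helper (not part of cPanel): count the most frequent values
# of one field in a single archived raw access log.
#
# Usage: log_top FIELD PATTERN ARCHIVE [COUNT]
#   FIELD    one of: ip, url, referrer, agent
#   PATTERN  text to match, for example Jan/2013
#   ARCHIVE  path to one .gz log file
#   COUNT    how many values to show (default: 10)
log_top() {
    local field="$1"
    local pattern="$2"
    local archive="$3"
    local count="${4:-10}"
    local sep col

    case "$field" in
        ip)       sep=' ' col=1 ;;  # first whitespace-separated field
        url)      sep=' ' col=7 ;;  # path inside the quoted request
        referrer) sep='"' col=4 ;;  # second double-quoted string
        agent)    sep='"' col=6 ;;  # third double-quoted string
        *) echo "unknown field: $field" >&2; return 1 ;;
    esac

    # Assumption: reading one file, so no file name prefix on each line.
    gzip -dc "$archive" \
        | grep -e "$pattern" \
        | awk -F"$sep" -v col="$col" '{print $col}' \
        | sort | uniq -c | sort -rn | head -n "$count"
}
```

For example, log_top ip "Jan/2013" /home/userna5/logs/example.com-Jan-2013.gz 5 would list the five busiest IP addresses for that month, and swapping ip for agent, url, or referrer gives the other three reports.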