3077 Mozilla/5.0 (compatible; spbot/4.1.0; +http://OpenLinkProfiler.org/bot )
2353 Mozilla/5.0 (compatible; EasouSpider; +http://www.easou.com/search/spider.html)
2099 Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)
1800 Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)
1290 Mozilla/5.0 (compatible; freefind/2.1; +http://www.freefind.com/spider.html)
 873 Mozilla/5.0 (compatible; 007ac9 Crawler; http://crawler.007ac9.net/)
 612 Mozilla/5.0 (compatible; DotBot/1.1; http://www.opensiteexplorer.org/dotbot, [email protected])
 452 Mozilla/5.0 (compatible; Plukkie/1.5; http://www.botje.com/plukkie.htm)
 324 Baiduspider-image+(+http://www.baidu.com/search/spider.htm)
  83 Mozilla/5.0 (compatible; YandexImages/3.0; +http://yandex.com/bots)
  80 Mozilla/5.0 (compatible; MJ12bot/v1.4.5; http://www.majestic12.co.uk/bot.php?+)
  58 Mozilla/5.0 (compatible; AhrefsBot/5.0; +http://ahrefs.com/robot/)
  54 NerdyBot
  40 Mozilla/5.0 (compatible; 200PleaseBot/1.0; +http://www.200please.com/bot)
  28 Mozilla/5.0 (compatible; SISTRIX Crawler; http://crawler.sistrix.net/)
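For reference, a user-agent tally like the one above can be generated from an Apache access log with a one-liner along these lines (the access.log filename and the default "combined" LogFormat are assumptions here; adjust the field number if your log format differs):

```shell
# Count requests per User-Agent in a combined-format Apache access log.
# In the default "combined" LogFormat the User-Agent is the sixth
# double-quote-delimited field, hence -F'"' and $6.
awk -F'"' '{print $6}' access.log | sort | uniq -c | sort -rn | head -15
```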
If, for instance, you wanted to block all of these bots outright, here is a .htaccess rule you could use:

BrowserMatchNoCase "spbot" bots
BrowserMatchNoCase "EasouSpider" bots
BrowserMatchNoCase "YandexBot" bots
BrowserMatchNoCase "Baiduspider" bots
BrowserMatchNoCase "freefind" bots
BrowserMatchNoCase "007ac9" bots
BrowserMatchNoCase "DotBot" bots
BrowserMatchNoCase "Plukkie" bots
BrowserMatchNoCase "MJ12bot" bots
BrowserMatchNoCase "AhrefsBot" bots
BrowserMatchNoCase "200PleaseBot" bots
BrowserMatchNoCase "SISTRIX Crawler" bots
Order Allow,Deny
Allow from all
Deny from env=bots
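One caveat: if the server is running Apache 2.4 or newer, the Order/Allow/Deny directives have been superseded by mod_authz_core's Require syntax. A sketch of the equivalent block, reusing the same "bots" environment variable set by the BrowserMatchNoCase lines, would look like this:

```apache
# Apache 2.4+ (mod_authz_core) equivalent of the Order/Allow/Deny block,
# assuming the BrowserMatchNoCase lines above still set the "bots" variable
<RequireAll>
    Require all granted
    Require not env bots
</RequireAll>
```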
I also noticed that you have Deny from rules in your .htaccess file in this format:

Deny from 157.55.*
You don't actually need the asterisk (*); Apache treats a partial IP address as a match on the leading octets:

Deny from 157.55
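If you'd rather be explicit about the range, Deny from also accepts CIDR notation, so the same block can be written either way:

```apache
# Two equivalent ways to block 157.55.0.0 through 157.55.255.255
Deny from 157.55
Deny from 157.55.0.0/16
```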
Also, you typically don't want to block based on the PTR record (hostname) in your .htaccess file. A direct IP address is more reliable, or, in the case of the Chinese Baidu crawler, simply blocking by User-agent is more effective. Since we blocked the specific requests with &user=202.46 in the URL using this code:

ErrorDocument 503 "Temporarily unavailable"
RewriteEngine on
RewriteCond %{QUERY_STRING} ^.*user=202.46.*$
RewriteRule .* - [R=503,L]
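One small detail worth knowing about that RewriteCond pattern: an unescaped . in a regular expression matches any character, so user=202.46 would also match strings like user=202x46. A quick Python sketch (illustrative only, using Python's re module rather than Apache's regex engine) shows the difference escaping the dot makes:

```python
import re

# The pattern as written in the RewriteCond, with the dot unescaped
loose = re.compile(r"^.*user=202.46.*$")
# The stricter version, where \. matches only a literal dot
strict = re.compile(r"^.*user=202\.46.*$")

query = "page=1&user=202.46.12.7"
print(bool(loose.match(query)), bool(strict.match(query)))  # both match

odd = "page=1&user=202x46"
print(bool(loose.match(odd)), bool(strict.match(odd)))      # only the loose one matches
```

In practice the loose pattern is unlikely to cause false positives here, but escaping the dot (202\.46) costs nothing and makes the intent explicit.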
Your site has blocked 208 of those requests so far today, and it looks like your resource usage has dropped a bit. Blocking some of the bots you don't need crawling your site (Yandex, for instance, is a Russian search engine, and Baidu is a Chinese one) can help cut your resource usage even further. As always, you can view the CPU graphs in cPanel to help ensure that your usage isn't spiking again. Hope that helps, and please let us know if you have any other questions at all! - Jacob
P.S. This is the hostname/PTR-based rule I was referring to above:

deny from ptr.cnsat.com.cn