{"id":82582,"date":"2026-03-20T04:56:00","date_gmt":"2026-03-20T08:56:00","guid":{"rendered":"https:\/\/www.inmotionhosting.com\/blog\/?p=82582"},"modified":"2026-04-07T12:42:51","modified_gmt":"2026-04-07T16:42:51","slug":"rate-limiting-ai-crawler-bots-modsecurity","status":"publish","type":"post","link":"https:\/\/www.inmotionhosting.com\/blog\/rate-limiting-ai-crawler-bots-modsecurity\/","title":{"rendered":"Rate Limiting AI Crawler Bots with ModSecurity"},"content":{"rendered":"\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"538\" src=\"https:\/\/www.inmotionhosting.com\/blog\/wp-content\/uploads\/2026\/03\/Rate-Limiting-AI-Crawler-Bots-with-ModSecurity-1024x538.png\" alt=\"Rate Limiting AI Crawler Bots with ModSecurity - how we did it\" class=\"wp-image-82583\" srcset=\"https:\/\/www.inmotionhosting.com\/blog\/wp-content\/uploads\/2026\/03\/Rate-Limiting-AI-Crawler-Bots-with-ModSecurity-1024x538.png 1024w, https:\/\/www.inmotionhosting.com\/blog\/wp-content\/uploads\/2026\/03\/Rate-Limiting-AI-Crawler-Bots-with-ModSecurity-300x158.png 300w, https:\/\/www.inmotionhosting.com\/blog\/wp-content\/uploads\/2026\/03\/Rate-Limiting-AI-Crawler-Bots-with-ModSecurity-768x403.png 768w, https:\/\/www.inmotionhosting.com\/blog\/wp-content\/uploads\/2026\/03\/Rate-Limiting-AI-Crawler-Bots-with-ModSecurity.png 1200w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>AI training bots from OpenAI, Anthropic, Amazon, and a dozen other companies are now hitting production web servers with the same aggression as a DDoS attack, and robots.txt isn&#8217;t stopping them. 
This guide walks through how InMotion&#8217;s systems team uses ModSecurity to enforce per-bot rate limiting at the server level, without cutting off your site&#8217;s indexing exposure entirely.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">The Problem: AI Bots That Don&#8217;t Follow the Rules<\/h2>\n\n\n\n<p><strong>robots.txt<\/strong> has been the de facto agreement between websites and web crawlers for decades. A directive like <code>Crawl-delay: 10<\/code> tells compliant bots to wait 10 seconds between requests. Google gives you a way to configure crawl rate through <a href=\"https:\/\/www.inmotionhosting.com\/support\/website\/google-tools\/setting-a-crawl-delay-in-google-webmaster-tools\/\">Google Search Console<\/a>. Traditional search crawlers have operated within these boundaries long enough that most sysadmins never thought much about them.<\/p>\n\n\n\n<p>LLM training crawlers are a different story.<\/p>\n\n\n\n<p>Starting in 2024, InMotion&#8217;s systems administration teams began seeing a pattern of unusually heavy traffic across shared and dedicated infrastructure. The source wasn&#8217;t a single bot running wild. It was several bots, each operated by a different AI company, simultaneously crawling the same servers with no delay between requests and no respect for <code>Crawl-delay<\/code> directives. None of them coordinated with each other. None of them needed to. The combined load of GPTBot, ClaudeBot, Amazonbot, and their peers hitting the same server concurrently produces resource exhaustion that looks functionally identical to an unintentional distributed denial-of-service attack.<\/p>\n\n\n\n<p><em>That surprises a lot of website owners who assume <strong>robots.txt<\/strong> is binding. It isn&#8217;t. It&#8217;s a convention, and these bots aren&#8217;t observing it.<\/em><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Two Options, One Clear Tradeoff<\/h2>\n\n\n\n<p>The blunt instrument is a full block via .htaccess. 
You can <a href=\"https:\/\/www.inmotionhosting.com\/support\/website\/block-unwanted-users-from-your-site-using-htaccess\/#multi-agent\">deny access by User-Agent<\/a> and the bots stop hitting your server entirely. Problem solved, except it isn&#8217;t: your site also disappears from AI-driven discovery systems. For businesses that want to appear in AI-generated answers or LLM-powered search features, blocking training crawlers entirely carries a real long-term cost.<\/p>\n\n\n\n<p>Rate limiting is the better path. You slow the bots down to a pace your server can absorb. They still index your content. You still maintain visibility. And when a bot refuses to respect the rate limit you&#8217;ve set, you block that specific request rather than the bot permanently.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">How ModSecurity Rate Limiting Works<\/h2>\n\n\n\n<p><a href=\"https:\/\/github.com\/owasp-modsecurity\/ModSecurity\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">ModSecurity<\/a> is an open-source Web Application Firewall that operates inside Apache or Nginx, inspecting HTTP traffic in real time. It&#8217;s the same tool that blocks SQL injection attempts and cross-site scripting attacks on properly hardened servers. What makes it useful here is its ability to track request frequency by User-Agent and deny requests that exceed a defined threshold.<\/p>\n\n\n\n<p>The approach works in two steps:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Identify the incoming request by <code>User-Agent<\/code> string and increment a per-host counter.<\/li>\n\n\n\n<li>If that counter exceeds the allowed limit before it expires, deny the request with a <code>429 Too Many Requests<\/code> response and set a <code>Retry-After<\/code> header.<\/li>\n<\/ul>\n\n\n\n<p>That <code>Retry-After<\/code> header matters. It explicitly tells the bot how long to wait before its next request. A well-behaved crawler will honor it. 
One that doesn&#8217;t gets blocked again on its next attempt.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>The ModSecurity Rules<\/strong><\/h2>\n\n\n\n<p>Below are the rate-limiting rules InMotion Hosting&#8217;s systems team developed and currently deploys. Each rule set targets a specific bot by <code>User-Agent<\/code> and enforces a maximum of one request per 3 seconds per hostname.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">GPTBot (OpenAI)<\/h3>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#e1e4e8;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly># Limit GPTBot hits by user agent to one hit per 3 seconds\nSecRule REQUEST_HEADERS:User-Agent \"@pm GPTBot\" \\\n    \"id:13075,phase:2,nolog,pass,\\\n    setuid:%{request_headers.host},\\\n    setvar:user.ratelimit_gptbot=+1,\\\n    expirevar:user.ratelimit_gptbot=3\"\n\nSecRule USER:RATELIMIT_GPTBOT \"@gt 1\" \\\n    \"chain,id:13076,phase:2,deny,status:429,\\\n    setenv:RATELIMITED_GPTBOT,\\\n    log,msg:'RATELIMITED GPTBOT'\"\n    SecRule REQUEST_HEADERS:User-Agent \"@pm GPTBot\"\n\nHeader always set Retry-After \"3\" env=RATELIMITED_GPTBOT\nErrorDocument 429 \"Too Many Requests\"<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 
002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki github-dark\" style=\"background-color: #24292e\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #6A737D\"># Limit GPTBot hits by user agent to one hit per 3 seconds<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E1E4E8\">SecRule REQUEST_HEADERS:<\/span><span style=\"color: #F97583\">User<\/span><span style=\"color: #E1E4E8\">-Agent &quot;@pm GPTBot&quot; \\<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E1E4E8\">    &quot;id:<\/span><span style=\"color: #79B8FF\">13075<\/span><span style=\"color: #E1E4E8\">,phase:<\/span><span style=\"color: #79B8FF\">2<\/span><span style=\"color: #E1E4E8\">,nolog,pass,\\<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E1E4E8\">    setuid:%{<\/span><span style=\"color: #FFAB70\">request_headers.host<\/span><span style=\"color: #E1E4E8\">},\\<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E1E4E8\">    setvar:user.ratelimit_gptbot=+<\/span><span style=\"color: #79B8FF\">1<\/span><span style=\"color: #E1E4E8\">,\\<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E1E4E8\">    expirevar:user.ratelimit_gptbot=<\/span><span style=\"color: #79B8FF\">3<\/span><span style=\"color: #E1E4E8\">&quot;<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E1E4E8\">SecRule USER:RATELIMIT_GPTBOT &quot;@gt <\/span><span style=\"color: #79B8FF\">1<\/span><span style=\"color: #E1E4E8\">&quot; \\<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E1E4E8\">    &quot;chain,id:<\/span><span style=\"color: #79B8FF\">13076<\/span><span style=\"color: #E1E4E8\">,phase:<\/span><span style=\"color: 
#79B8FF\">2<\/span><span style=\"color: #E1E4E8\">,deny,status:<\/span><span style=\"color: #79B8FF\">429<\/span><span style=\"color: #E1E4E8\">,\\<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E1E4E8\">    setenv:RATELIMITED_GPTBOT,\\<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E1E4E8\">    log,msg:&#39;RATELIMITED GPTBOT&#39;&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E1E4E8\">    SecRule REQUEST_HEADERS:<\/span><span style=\"color: #F97583\">User<\/span><span style=\"color: #E1E4E8\">-Agent &quot;@pm GPTBot&quot;<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #F97583\">Header<\/span><span style=\"color: #E1E4E8\"> always <\/span><span style=\"color: #B392F0\">set<\/span><span style=\"color: #E1E4E8\"> Retry-After &quot;<\/span><span style=\"color: #79B8FF\">3<\/span><span style=\"color: #E1E4E8\">&quot; env=RATELIMITED_GPTBOT<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F97583\">ErrorDocument<\/span><span style=\"color: #E1E4E8\"> <\/span><span style=\"color: #79B8FF\">429<\/span><span style=\"color: #E1E4E8\"> &quot;Too Many Requests&quot;<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">ClaudeBot (Anthropic)<\/h3>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#e1e4e8;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly># Limit ClaudeBot hits by user agent to one hit per 3 seconds\nSecRule REQUEST_HEADERS:User-Agent \"@pm 
ClaudeBot\" \\\n    \"id:13077,phase:2,nolog,pass,\\\n    setuid:%{request_headers.host},\\\n    setvar:user.ratelimit_claudebot=+1,\\\n    expirevar:user.ratelimit_claudebot=3\"\n\nSecRule USER:RATELIMIT_CLAUDEBOT \"@gt 1\" \\\n    \"chain,id:13078,phase:2,deny,status:429,\\\n    setenv:RATELIMITED_CLAUDEBOT,\\\n    log,msg:'RATELIMITED CLAUDEBOT'\"\n    SecRule REQUEST_HEADERS:User-Agent \"@pm ClaudeBot\"\n\nHeader always set Retry-After \"3\" env=RATELIMITED_CLAUDEBOT\nErrorDocument 429 \"Too Many Requests\"<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki github-dark\" style=\"background-color: #24292e\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #6A737D\"># Limit ClaudeBot hits by user agent to one hit per 3 seconds<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E1E4E8\">SecRule REQUEST_HEADERS:<\/span><span style=\"color: #F97583\">User<\/span><span style=\"color: #E1E4E8\">-Agent &quot;@pm ClaudeBot&quot; \\<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E1E4E8\">    &quot;id:<\/span><span style=\"color: #79B8FF\">13077<\/span><span style=\"color: #E1E4E8\">,phase:<\/span><span style=\"color: #79B8FF\">2<\/span><span style=\"color: #E1E4E8\">,nolog,pass,\\<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E1E4E8\">    setuid:%{<\/span><span style=\"color: #FFAB70\">request_headers.host<\/span><span 
style=\"color: #E1E4E8\">},\\<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E1E4E8\">    setvar:user.ratelimit_claudebot=+<\/span><span style=\"color: #79B8FF\">1<\/span><span style=\"color: #E1E4E8\">,\\<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E1E4E8\">    expirevar:user.ratelimit_claudebot=<\/span><span style=\"color: #79B8FF\">3<\/span><span style=\"color: #E1E4E8\">&quot;<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E1E4E8\">SecRule USER:RATELIMIT_CLAUDEBOT &quot;@gt <\/span><span style=\"color: #79B8FF\">1<\/span><span style=\"color: #E1E4E8\">&quot; \\<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E1E4E8\">    &quot;chain,id:<\/span><span style=\"color: #79B8FF\">13078<\/span><span style=\"color: #E1E4E8\">,phase:<\/span><span style=\"color: #79B8FF\">2<\/span><span style=\"color: #E1E4E8\">,deny,status:<\/span><span style=\"color: #79B8FF\">429<\/span><span style=\"color: #E1E4E8\">,\\<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E1E4E8\">    setenv:RATELIMITED_CLAUDEBOT,\\<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E1E4E8\">    log,msg:&#39;RATELIMITED CLAUDEBOT&#39;&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E1E4E8\">    SecRule REQUEST_HEADERS:<\/span><span style=\"color: #F97583\">User<\/span><span style=\"color: #E1E4E8\">-Agent &quot;@pm ClaudeBot&quot;<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #F97583\">Header<\/span><span style=\"color: #E1E4E8\"> always <\/span><span style=\"color: #B392F0\">set<\/span><span style=\"color: #E1E4E8\"> Retry-After &quot;<\/span><span style=\"color: #79B8FF\">3<\/span><span style=\"color: #E1E4E8\">&quot; env=RATELIMITED_CLAUDEBOT<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F97583\">ErrorDocument<\/span><span style=\"color: #E1E4E8\"> <\/span><span style=\"color: #79B8FF\">429<\/span><span style=\"color: 
#E1E4E8\"> &quot;Too Many Requests&quot;<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">Amazonbot<\/h3>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#e1e4e8;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly># Limit Amazonbot hits by user agent to one hit per 3 seconds\nSecRule REQUEST_HEADERS:User-Agent \"@pm Amazonbot\" \\\n    \"id:13079,phase:2,nolog,pass,\\\n    setuid:%{request_headers.host},\\\n    setvar:user.ratelimit_amazonbot=+1,\\\n    expirevar:user.ratelimit_amazonbot=3\"\n\nSecRule USER:RATELIMIT_AMAZONBOT \"@gt 1\" \\\n    \"chain,id:13080,phase:2,deny,status:429,\\\n    setenv:RATELIMITED_AMAZONBOT,\\\n    log,msg:'RATELIMITED AMAZONBOT'\"\n    SecRule REQUEST_HEADERS:User-Agent \"@pm Amazonbot\"\n\nHeader always set Retry-After \"3\" env=RATELIMITED_AMAZONBOT\nErrorDocument 429 \"Too Many Requests\"<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 
2\"><\/path><\/svg><\/span><pre class=\"shiki github-dark\" style=\"background-color: #24292e\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #6A737D\"># Limit Amazonbot hits by user agent to one hit per 3 seconds<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E1E4E8\">SecRule REQUEST_HEADERS:<\/span><span style=\"color: #F97583\">User<\/span><span style=\"color: #E1E4E8\">-Agent &quot;@pm Amazonbot&quot; \\<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E1E4E8\">    &quot;id:<\/span><span style=\"color: #79B8FF\">13079<\/span><span style=\"color: #E1E4E8\">,phase:<\/span><span style=\"color: #79B8FF\">2<\/span><span style=\"color: #E1E4E8\">,nolog,pass,\\<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E1E4E8\">    setuid:%{<\/span><span style=\"color: #FFAB70\">request_headers.host<\/span><span style=\"color: #E1E4E8\">},\\<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E1E4E8\">    setvar:user.ratelimit_amazonbot=+<\/span><span style=\"color: #79B8FF\">1<\/span><span style=\"color: #E1E4E8\">,\\<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E1E4E8\">    expirevar:user.ratelimit_amazonbot=<\/span><span style=\"color: #79B8FF\">3<\/span><span style=\"color: #E1E4E8\">&quot;<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #E1E4E8\">SecRule USER:RATELIMIT_AMAZONBOT &quot;@gt <\/span><span style=\"color: #79B8FF\">1<\/span><span style=\"color: #E1E4E8\">&quot; \\<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E1E4E8\">    &quot;chain,id:<\/span><span style=\"color: #79B8FF\">13080<\/span><span style=\"color: #E1E4E8\">,phase:<\/span><span style=\"color: #79B8FF\">2<\/span><span style=\"color: #E1E4E8\">,deny,status:<\/span><span style=\"color: #79B8FF\">429<\/span><span style=\"color: #E1E4E8\">,\\<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E1E4E8\">    setenv:RATELIMITED_AMAZONBOT,\\<\/span><\/span>\n<span 
class=\"line\"><span style=\"color: #E1E4E8\">    log,msg:&#39;RATELIMITED AMAZONBOT&#39;&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #E1E4E8\">    SecRule REQUEST_HEADERS:<\/span><span style=\"color: #F97583\">User<\/span><span style=\"color: #E1E4E8\">-Agent &quot;@pm Amazonbot&quot;<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #F97583\">Header<\/span><span style=\"color: #E1E4E8\"> always <\/span><span style=\"color: #B392F0\">set<\/span><span style=\"color: #E1E4E8\"> Retry-After &quot;<\/span><span style=\"color: #79B8FF\">3<\/span><span style=\"color: #E1E4E8\">&quot; env=RATELIMITED_AMAZONBOT<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F97583\">ErrorDocument<\/span><span style=\"color: #E1E4E8\"> <\/span><span style=\"color: #79B8FF\">429<\/span><span style=\"color: #E1E4E8\"> &quot;Too Many Requests&quot;<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Adapting the Rules for Other Bots<\/h2>\n\n\n\n<p>The structure is the same for every bot. To add coverage for a new crawler, copy any rule set and make the following changes:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Replace the <code>User-Agent<\/code> string (e.g., <code>GPTBot<\/code>) with the new bot&#8217;s identifier.<\/li>\n\n\n\n<li>Assign unique <code>id<\/code> values, a unique counter variable name (e.g., <code>user.ratelimit_gptbot<\/code>), and a unique <code>env<\/code> variable name to avoid conflicts with existing rules.<\/li>\n<\/ul>\n\n\n\n<p>The <code>id<\/code> field must be unique across your entire ModSecurity configuration. If you&#8217;re adding these to an existing ruleset, check what IDs are already in use before assigning new ones. ModSecurity treats a duplicate ID as a configuration error, which can keep the web server from starting or reloading. The counter variable matters too: because the counter is stored per hostname, two bots that share one variable name would also share one rate limit.<\/p>\n\n\n\n<p>For reference, a growing list of known AI crawler identifiers includes <code>Bytespider<\/code>, <code>CCBot<\/code>, <code>Google-Extended<\/code>, <code>Meta-ExternalAgent<\/code>, and <code>PerplexityBot<\/code>, among others. Note that <code>Google-Extended<\/code> is a robots.txt control token rather than a string sent in the <code>User-Agent<\/code> request header, so these rules cannot match it. 
The <a href=\"https:\/\/darkvisitors.com\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Dark Visitors project<\/a> maintains a reasonably current catalogue of known AI agent identifiers.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">What Happens After You Deploy<\/h2>\n\n\n\n<p>Once these rules are active, a bot that makes two requests to the same hostname within a 3-second window receives a <code>429<\/code> on the second request. The <code>Retry-After: 3<\/code> header tells it to wait before trying again.<\/p>\n\n\n\n<p>From there, behavior splits into two categories:<\/p>\n\n\n\n<p><strong>Bots that respect the header <\/strong>slow down automatically. They continue indexing your content at a pace your server can handle. Resources are conserved, and your site stays accessible to the crawlers worth caring about.<\/p>\n\n\n\n<p><strong>Bots that ignore the header <\/strong>keep hitting the deny rule on every subsequent request until their internal retry logic kicks in or they move on. Either way, they&#8217;re consuming a fraction of the resources they would have without rate limiting in place.<\/p>\n\n\n\n<p>You won&#8217;t fix the underlying problem of AI companies deploying aggressive crawlers without consent. But you stop absorbing the cost of their indexing operations on your hardware.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Prerequisites and Where to Apply These Rules<\/h2>\n\n\n\n<p>These rules require ModSecurity to be installed and enabled on your server. On <a href=\"https:\/\/www.inmotionhosting.com\/dedicated-servers\">InMotion Hosting Dedicated Servers<\/a> and <a href=\"https:\/\/www.inmotionhosting.com\/vps-hosting\">VPS plans<\/a>, ModSecurity is available through cPanel&#8217;s <strong>WHM interface<\/strong> under <strong>Security Center > ModSecurity<\/strong>. 
The rules can be added as custom rules through WHM or directly in your server&#8217;s ModSecurity configuration directory.<\/p>\n\n\n\n<p>If you&#8217;re on a managed dedicated server, InMotion Hosting&#8217;s <a href=\"https:\/\/www.inmotionhosting.com\/support\/amp\/advanced-product-support\/\">Advanced Product Support team<\/a> can assist with custom ModSecurity rule deployment. Customers with <a href=\"https:\/\/www.inmotionhosting.com\/blog\/inmotion-premier-care\/\">Premier Care<\/a> have access to InMotion Solutions for exactly this kind of custom server configuration work.<\/p>\n\n\n\n<p>Shared hosting environments don&#8217;t support custom ModSecurity rules at the account level. If aggressive bot traffic is a problem on shared hosting, the options are limited to .htaccess blocks or upgrading to a VPS or dedicated server where you have full WAF configurability.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">A Note on robots.txt<\/h2>\n\n\n\n<p>None of this replaces a well-structured <strong>robots.txt<\/strong> file. Keeping crawl-delay directives in place for compliant bots remains worthwhile, and explicitly listing AI crawlers you want to restrict adds a documented signal of intent, even if some bots ignore it. The ModSecurity rules handle enforcement for the ones that won&#8217;t self-regulate.<\/p>\n\n\n\n<p>robots.txt for bots that respect conventions; ModSecurity rate limiting for the ones that don&#8217;t. The two layers work together.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Summary<\/h2>\n\n\n\n<p>AI training crawlers don&#8217;t observe <a href=\"https:\/\/www.inmotionhosting.com\/blog\/wordpress-seo\/\" type=\"post\" id=\"10322\">robots.txt<\/a> the way traditional search bots do, and the combined load from multiple simultaneous indexing operations can degrade server performance for legitimate traffic. 
ModSecurity&#8217;s User-Agent-based rate limiting gives you server-side control over how frequently these bots can request resources, without requiring you to block them from indexing your site entirely.<\/p>\n\n\n\n<p>The rules are straightforward to deploy, extend to any bot by copying the template, and provide explicit signaling via <code>Retry-After<\/code> headers for crawlers that are capable of honoring them.<\/p>\n\n\n\n<p>If you&#8217;re seeing unexplained spikes in server load or HTTP request volume that don&#8217;t correlate with real user traffic, check your access logs for <a href=\"https:\/\/www.inmotionhosting.com\/blog\/ai-crawlers-slowing-down-your-website\/\" type=\"post\" id=\"78540\">AI crawler User-Agents<\/a> before assuming you&#8217;re dealing with something more complex.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>AI training bots from OpenAI, Anthropic, Amazon, and a dozen other companies are now hitting production web servers with the same aggression as a DDoS attack, and robots.txt isn&#8217;t stopping them. 
This guide walks through how InMotion&#8217;s systems team uses ModSecurity to enforce per-bot rate limiting at the server level, without cutting off your site&#8217;s indexing exposure entirely.<\/p>\n","protected":false},"author":116,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[716],"tags":[],"class_list":["post-82582","post","type-post","status-publish","format-standard","hentry","category-ai-tools"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.2 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Rate Limiting AI Crawler Bots with ModSecurity | InMotion Hosting<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.inmotionhosting.com\/blog\/rate-limiting-ai-crawler-bots-modsecurity\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Rate Limiting AI Crawler Bots with ModSecurity | InMotion Hosting\" \/>\n<meta property=\"og:description\" content=\"AI training bots from OpenAI, Anthropic, Amazon, and a dozen other companies are now hitting production web servers with the same aggression as a DDoS attack, and robots.txt isn&#039;t stopping them. 
This guide walks through how InMotion&#039;s systems team uses ModSecurity to enforce per-bot rate limiting at the server level, without cutting off your site&#039;s indexing exposure entirely.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.inmotionhosting.com\/blog\/rate-limiting-ai-crawler-bots-modsecurity\/\" \/>\n<meta property=\"og:site_name\" content=\"InMotion Hosting Blog\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/inmotionhosting\" \/>\n<meta property=\"article:published_time\" content=\"2026-03-20T08:56:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2026-04-07T16:42:51+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.inmotionhosting.com\/blog\/wp-content\/uploads\/2026\/03\/Rate-Limiting-AI-Crawler-Bots-with-ModSecurity.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"630\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Sam Page\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@inmotionhosting\" \/>\n<meta name=\"twitter:site\" content=\"@inmotionhosting\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Sam Page\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"6 minutes\" \/>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Rate Limiting AI Crawler Bots with ModSecurity | InMotion Hosting","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.inmotionhosting.com\/blog\/rate-limiting-ai-crawler-bots-modsecurity\/","og_locale":"en_US","og_type":"article","og_title":"Rate Limiting AI Crawler Bots with ModSecurity | InMotion Hosting","og_description":"AI training bots from OpenAI, Anthropic, Amazon, and a dozen other companies are now hitting production web servers with the same aggression as a DDoS attack, and robots.txt isn't stopping them. This guide walks through how InMotion's systems team uses ModSecurity to enforce per-bot rate limiting at the server level, without cutting off your site's indexing exposure entirely.","og_url":"https:\/\/www.inmotionhosting.com\/blog\/rate-limiting-ai-crawler-bots-modsecurity\/","og_site_name":"InMotion Hosting Blog","article_publisher":"https:\/\/www.facebook.com\/inmotionhosting","article_published_time":"2026-03-20T08:56:00+00:00","article_modified_time":"2026-04-07T16:42:51+00:00","og_image":[{"width":1200,"height":630,"url":"https:\/\/www.inmotionhosting.com\/blog\/wp-content\/uploads\/2026\/03\/Rate-Limiting-AI-Crawler-Bots-with-ModSecurity.png","type":"image\/png"}],"author":"Sam Page","twitter_card":"summary_large_image","twitter_creator":"@inmotionhosting","twitter_site":"@inmotionhosting","twitter_misc":{"Written by":"Sam Page","Est. 
reading time":"6 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"TechArticle","@id":"https:\/\/www.inmotionhosting.com\/blog\/rate-limiting-ai-crawler-bots-modsecurity\/#article","isPartOf":{"@id":"https:\/\/www.inmotionhosting.com\/blog\/rate-limiting-ai-crawler-bots-modsecurity\/"},"author":{"name":"Sam Page","@id":"https:\/\/www.inmotionhosting.com\/blog\/#\/schema\/person\/b459c4b748083c4f8431d5312e795796"},"headline":"Rate Limiting AI Crawler Bots with ModSecurity","datePublished":"2026-03-20T08:56:00+00:00","dateModified":"2026-04-07T16:42:51+00:00","mainEntityOfPage":{"@id":"https:\/\/www.inmotionhosting.com\/blog\/rate-limiting-ai-crawler-bots-modsecurity\/"},"wordCount":1167,"commentCount":0,"publisher":{"@id":"https:\/\/www.inmotionhosting.com\/blog\/#organization"},"image":{"@id":"https:\/\/www.inmotionhosting.com\/blog\/rate-limiting-ai-crawler-bots-modsecurity\/#primaryimage"},"thumbnailUrl":"https:\/\/www.inmotionhosting.com\/blog\/wp-content\/uploads\/2026\/03\/Rate-Limiting-AI-Crawler-Bots-with-ModSecurity-1024x538.png","articleSection":["AI Tools"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.inmotionhosting.com\/blog\/rate-limiting-ai-crawler-bots-modsecurity\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.inmotionhosting.com\/blog\/rate-limiting-ai-crawler-bots-modsecurity\/","url":"https:\/\/www.inmotionhosting.com\/blog\/rate-limiting-ai-crawler-bots-modsecurity\/","name":"Rate Limiting AI Crawler Bots with ModSecurity | InMotion 
Hosting","isPartOf":{"@id":"https:\/\/www.inmotionhosting.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.inmotionhosting.com\/blog\/rate-limiting-ai-crawler-bots-modsecurity\/#primaryimage"},"image":{"@id":"https:\/\/www.inmotionhosting.com\/blog\/rate-limiting-ai-crawler-bots-modsecurity\/#primaryimage"},"thumbnailUrl":"https:\/\/www.inmotionhosting.com\/blog\/wp-content\/uploads\/2026\/03\/Rate-Limiting-AI-Crawler-Bots-with-ModSecurity-1024x538.png","datePublished":"2026-03-20T08:56:00+00:00","dateModified":"2026-04-07T16:42:51+00:00","breadcrumb":{"@id":"https:\/\/www.inmotionhosting.com\/blog\/rate-limiting-ai-crawler-bots-modsecurity\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.inmotionhosting.com\/blog\/rate-limiting-ai-crawler-bots-modsecurity\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.inmotionhosting.com\/blog\/rate-limiting-ai-crawler-bots-modsecurity\/#primaryimage","url":"https:\/\/www.inmotionhosting.com\/blog\/wp-content\/uploads\/2026\/03\/Rate-Limiting-AI-Crawler-Bots-with-ModSecurity.png","contentUrl":"https:\/\/www.inmotionhosting.com\/blog\/wp-content\/uploads\/2026\/03\/Rate-Limiting-AI-Crawler-Bots-with-ModSecurity.png","width":1200,"height":630},{"@type":"BreadcrumbList","@id":"https:\/\/www.inmotionhosting.com\/blog\/rate-limiting-ai-crawler-bots-modsecurity\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.inmotionhosting.com\/blog\/"},{"@type":"ListItem","position":2,"name":"AI Tools","item":"https:\/\/www.inmotionhosting.com\/blog\/ai-tools\/"},{"@type":"ListItem","position":3,"name":"Rate Limiting AI Crawler Bots with ModSecurity"}]},{"@type":"WebSite","@id":"https:\/\/www.inmotionhosting.com\/blog\/#website","url":"https:\/\/www.inmotionhosting.com\/blog\/","name":"InMotion Hosting Blog","description":"Web Hosting Strategy, Trends and 
Security","publisher":{"@id":"https:\/\/www.inmotionhosting.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.inmotionhosting.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.inmotionhosting.com\/blog\/#organization","name":"InMotion Hosting","url":"https:\/\/www.inmotionhosting.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.inmotionhosting.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.inmotionhosting.com\/blog\/wp-content\/uploads\/2019\/11\/imh-logo-all-colors-big.jpg","contentUrl":"https:\/\/www.inmotionhosting.com\/blog\/wp-content\/uploads\/2019\/11\/imh-logo-all-colors-big.jpg","width":1630,"height":430,"caption":"InMotion Hosting"},"image":{"@id":"https:\/\/www.inmotionhosting.com\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/inmotionhosting","https:\/\/x.com\/inmotionhosting"]},{"@type":"Person","@id":"https:\/\/www.inmotionhosting.com\/blog\/#\/schema\/person\/b459c4b748083c4f8431d5312e795796","name":"Sam Page","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/35c230f33cd7aacf52f0f53bc02230a2ee7840b5b221af549d491ab98f65a363?s=96&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/35c230f33cd7aacf52f0f53bc02230a2ee7840b5b221af549d491ab98f65a363?s=96&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/35c230f33cd7aacf52f0f53bc02230a2ee7840b5b221af549d491ab98f65a363?s=96&r=g","caption":"Sam Page"},"url":"https:\/\/www.inmotionhosting.com\/blog\/author\/samp\/"}]}},"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"primary_category":{"id":716,"name":"AI 
Tools","slug":"ai-tools","link":"https:\/\/www.inmotionhosting.com\/blog\/ai-tools\/"},"_links":{"self":[{"href":"https:\/\/www.inmotionhosting.com\/blog\/wp-json\/wp\/v2\/posts\/82582","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.inmotionhosting.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.inmotionhosting.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.inmotionhosting.com\/blog\/wp-json\/wp\/v2\/users\/116"}],"replies":[{"embeddable":true,"href":"https:\/\/www.inmotionhosting.com\/blog\/wp-json\/wp\/v2\/comments?post=82582"}],"version-history":[{"count":3,"href":"https:\/\/www.inmotionhosting.com\/blog\/wp-json\/wp\/v2\/posts\/82582\/revisions"}],"predecessor-version":[{"id":82731,"href":"https:\/\/www.inmotionhosting.com\/blog\/wp-json\/wp\/v2\/posts\/82582\/revisions\/82731"}],"wp:attachment":[{"href":"https:\/\/www.inmotionhosting.com\/blog\/wp-json\/wp\/v2\/media?parent=82582"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.inmotionhosting.com\/blog\/wp-json\/wp\/v2\/categories?post=82582"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.inmotionhosting.com\/blog\/wp-json\/wp\/v2\/tags?post=82582"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}