How to block OpenAI's GPTBot crawler

OpenAI operates a web crawler, known as “GPTBot”, that fetches and indexes website content. That content may then be used as training data for models such as the GPT series and the o series. According to our research, OpenAI's crawler generates hundreds of millions of requests per month, making it the most active AI crawler on the web.

Good bots vs. bad bots

Good Bots: Examples include Googlebot, Applebot, and Bingbot. They’re typically transparent about their intentions, respect robots.txt, and help improve search engine visibility.
Bad Bots: These bots may scrape content without permission, inflate server usage, or perform malicious activities. Even certain “legitimate” AI crawlers can become unwanted if they exceed fair-use limits or repeatedly crawl error pages (e.g., excessive 404 requests).

We'll focus on the OpenAI GPTBot here.

What is the Vercel Firewall?

Vercel Firewall (WAF) is a Web Application Firewall service that lets you:

Log or block requests that match certain criteria (IP address, user agent, request path, geolocation, etc.).
Challenge suspicious traffic with an automated check (e.g., requiring the visitor’s browser to pass a JavaScript challenge).
Rate limit excessive or malicious requests.

All configuration changes are applied globally within ~300ms, and can be rolled back instantly.

Identifying the Bot

User Agent

GPTBot identifies itself in the User-Agent header. Check your Firewall traffic for entries like Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot).

If you see repeated requests with that User-Agent, they are very likely from OpenAI, although any client can set this header, so a match is not proof on its own. Note that AI crawlers occasionally change their user agent strings or introduce variants over time.
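Because the exact string can vary, it is usually safer to match on the “GPTBot” token than on the full User-Agent. The sketch below shows one way to do that in application code. The function name `parseGptBotVersion` and the assumption that the token is always followed by a dotted version number are illustrative and not taken from OpenAI or Vercel documentation.

```typescript
// Hypothetical helper: extracts the version that follows the "GPTBot/" token,
// or returns null when the token is absent. Matching is case-insensitive
// (an assumption) so that minor variants of the string still match.
function parseGptBotVersion(userAgent: string): string | null {
  const match = /GPTBot\/(\d+(?:\.\d+)*)/i.exec(userAgent);
  return match?.[1] ?? null;
}

// The example string from this guide.
const sampleUserAgent =
  "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)";

console.log(parseGptBotVersion(sampleUserAgent));
console.log(parseGptBotVersion("Mozilla/5.0 (X11; Linux x86_64)"));
```

A non-null result only tells you what the client claims to be, so treat it as a hint when reviewing traffic.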

Taking action with the Firewall

Vercel Firewall allows a few approaches:

Rate Limit

If you don’t mind occasional crawling by OpenAI, but want to cap how frequently it hits your site, you can rate limit requests from the “GPTBot” user agent. This helps prevent resource overload while still allowing some AI crawler traffic.
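To illustrate what a rate limit does conceptually, here is a minimal fixed-window limiter. It is not how Vercel Firewall is implemented; the class name `FixedWindowLimiter`, the fixed-window strategy, and the numbers (2 requests per 60 seconds) are all assumptions chosen for the example.

```typescript
// Conceptual sketch only: a fixed-window counter keyed by a string such as
// "GPTBot". Each key may make `limit` requests per `windowMs` milliseconds.
class FixedWindowLimiter {
  private limit: number;
  private windowMs: number;
  private counts = new Map<string, { windowStart: number; count: number }>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed, false if it exceeds the limit.
  allow(key: string, nowMs: number): boolean {
    const entry = this.counts.get(key);
    // Start a new window on the first request or once the old window expires.
    if (!entry || nowMs - entry.windowStart >= this.windowMs) {
      this.counts.set(key, { windowStart: nowMs, count: 1 });
      return true;
    }
    if (entry.count < this.limit) {
      entry.count += 1;
      return true;
    }
    return false;
  }
}

// Assumed policy: at most 2 requests per 60 seconds for the "GPTBot" key.
const limiter = new FixedWindowLimiter(2, 60000);
const decisions = [0, 1000, 2000, 61000].map((t) => limiter.allow("GPTBot", t));
console.log(decisions); // [ true, true, false, true ]
```

The third request is rejected because it falls inside the first window, and the fourth is allowed because a new window has started.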

Challenge

If you want to filter out less sophisticated bots, you can challenge requests from certain user agents. The challenge forces a short browser-based security check. Most legitimate human traffic (with real browsers) will pass automatically, but a bot that can’t solve or respond to the challenge will be blocked.

Persistent Blocking (No Charge)

If you want to fully prevent GPTBot from crawling your site—and avoid incurring data transfer or function usage for these requests—you can persistently block it. Requests that match your block rule won’t reach your Vercel Functions or static pages, so you won’t be charged.

Templates

Vercel provides several Firewall Templates you can clone or learn from.

To block OpenAI GPTBot specifically, you can start with the “Block AI Bots Firewall Rule” template and modify it to match the GPTBot user agent.

Example: Creating a Custom Rule to Block OpenAI GPTBot

Navigate to “Firewall” in your Vercel project dashboard.
Click “Add Rule” (or “Create Rule” if using templates).
Select the “Block AI Bots” template (or a blank custom rule).
Match Condition: Set “User Agent” contains “GPTBot”.
Action: Choose “Deny” (to block), “Challenge” (to verify), or “Rate Limit” (to limit).
Review & Publish changes.

Note: Once published, changes take effect globally in ~300ms. You can always roll back if you block or challenge traffic unintentionally.
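To make the rule's logic concrete, the steps above amount to a “User Agent contains GPTBot” condition paired with an action. The sketch below models that as plain data. The `FirewallRule` type, its field names, the `evaluate` function, and the case-insensitive comparison are hypothetical; this is not Vercel's configuration format or API.

```typescript
// Hypothetical model of a firewall rule: one "contains" condition on the
// User-Agent header plus the action to take when it matches.
type Action = "deny" | "challenge" | "rate_limit";

interface FirewallRule {
  name: string;
  userAgentContains: string;
  action: Action;
}

// Returns the rule's action when the User-Agent matches, otherwise "allow".
// The comparison is case-insensitive, which is an assumption of this sketch.
function evaluate(rule: FirewallRule, userAgent: string): Action | "allow" {
  const matches = userAgent
    .toLowerCase()
    .includes(rule.userAgentContains.toLowerCase());
  return matches ? rule.action : "allow";
}

const blockGptBot: FirewallRule = {
  name: "Block OpenAI GPTBot",
  userAgentContains: "GPTBot",
  action: "deny",
};

console.log(
  evaluate(blockGptBot, "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)")
);
console.log(evaluate(blockGptBot, "Mozilla/5.0 (X11; Linux x86_64)"));
```

Changing `action` to "challenge" or "rate_limit" mirrors the other two choices in the Action step.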

Resources

These posts provide background on how advanced bots handle JavaScript, distribution, and crawling inefficiencies (e.g., excessive 404s).
