Bot Protection
Learn how to manage bot traffic to your site.
Bots generate nearly half of all internet traffic, with many originating from malicious sources. These automated threats scrape content, attempt unauthorized logins, or overload servers. Bot protection mitigates these risks by distinguishing between verified bots and potentially harmful automated traffic.
Bot protection systems analyze incoming traffic to identify whether a request originates from a real user, a trusted bot like a search engine crawler, or an unknown automated source. They then handle each request by:
- Allowing legitimate bots that correctly identify themselves
- Challenging suspicious traffic that behaves abnormally or does not resemble real browser activity
- Enforcing browser-like behavior by verifying navigation patterns and cache usage
To filter out harmful bot traffic, various techniques are used, including:
- Signature-based detection: Inspecting HTTP requests for known bot signatures
- Rate limiting: Restricting how often certain actions can be performed to prevent abuse
- Challenges: Using JavaScript checks to verify human presence
- Behavioral analysis: Detecting unusual patterns in user activity that suggest automation
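As an illustration of the rate limiting technique listed above, here is a minimal sliding-window sketch. It is not how Vercel implements rate limiting; the class name `SlidingWindowLimiter`, the client key, and the limits are all hypothetical, and the state lives in memory for a single process only.

```typescript
// Hypothetical in-memory sliding-window rate limiter (illustrative only).
class SlidingWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  // Timestamps (in ms) of recent allowed requests, grouped by client key.
  private readonly hits = new Map<string, number[]>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed, false if the client is over the limit.
  allow(clientKey: string, nowMs: number): boolean {
    const cutoff = nowMs - this.windowMs;
    // Keep only the requests that still fall inside the window.
    const recent = (this.hits.get(clientKey) ?? []).filter((t) => t > cutoff);
    const allowed = recent.length < this.limit;
    if (allowed) recent.push(nowMs);
    this.hits.set(clientKey, recent);
    return allowed;
  }
}

// Example: at most 5 login attempts per client per minute (assumed values).
const loginLimiter = new SlidingWindowLimiter(5, 60000);
const loginAllowed = loginLimiter.allow("203.0.113.7", Date.now());
```

A rejected request is not recorded in this sketch, so a client that keeps retrying is not penalized further; a stricter policy could count rejected attempts as well.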
With Vercel, you can use:
- Managed rulesets to challenge specific bot traffic
- Rate limiting and challenge actions with WAF custom rules to prevent bot activity from reaching your application
The bot filter managed ruleset challenges non-browser traffic before it reaches your applications. It filters out automated threats while allowing legitimate traffic:
- It identifies clients that violate browser-like behavior and serves a JavaScript challenge to them.
- It stops requests that falsely claim to be from a browser, such as a `curl` request identifying as Chrome.
- It automatically excludes verified bots, such as Google's crawler, from evaluation.
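To make the `curl`-identifying-as-Chrome case concrete, the sketch below shows one kind of consistency check a filter could apply. It is an assumption for illustration, not the bot filter's actual logic: the function name `looksLikeSpoofedChrome` and the header list are hypothetical, and the heuristic relies on recent Chrome versions normally sending these headers on HTTPS page loads.

```typescript
// Headers that recent Chrome versions normally send on HTTPS navigations.
// The list is an illustrative assumption, not Vercel's detection logic.
const EXPECTED_CHROME_HEADERS = ["sec-ch-ua", "sec-fetch-mode", "accept-language"];

// Returns true when the User-Agent claims Chrome but expected headers are missing.
function looksLikeSpoofedChrome(headers: Record<string, string>): boolean {
  // HTTP header names are case-insensitive, so normalize them first.
  const normalized = new Map<string, string>();
  for (const name of Object.keys(headers)) {
    normalized.set(name.toLowerCase(), headers[name]);
  }

  const userAgent = normalized.get("user-agent") ?? "";
  const claimsChrome = /\bChrome\/\d+/.test(userAgent);
  if (!claimsChrome) return false; // other clients are out of scope for this sketch

  const missing = EXPECTED_CHROME_HEADERS.filter((h) => !normalized.has(h));
  return missing.length > 0;
}
```

A header check like this is easy for a determined client to satisfy, which is one reason to pair it with a JavaScript challenge rather than rely on it alone.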
To learn more about how the ruleset works, review the Challenge section of Firewall actions. To understand the details of what gets logged and how to monitor your traffic, review Firewall Observability.
For trusted automated traffic, you can create custom WAF rules with bypass actions that will allow this traffic to skip the bot filter ruleset.
You can apply the ruleset to your project in log or challenge mode. Learn how to configure the bot filter managed ruleset.
Vercel maintains a comprehensive directory of known legitimate bots from across the internet and regularly updates it to include new legitimate services as they emerge. Attack Challenge Mode and bot filter automatically recognize and allow these bots to pass through without being challenged. You can block access to some or all of these bots by writing WAF custom rules with the User Agent match condition. To learn how to do this, review WAF Examples.
Bot name | Description | Documentation |
---|---|---|
AdsBot-Google | AdsBot-Google is Google's web crawler used for quality control of Google Ads. | View |
Adsense | The AdSense crawler visits participating sites in order to provide them with relevant ads. | View |
AhrefsBot | Powers the database for both Ahrefs, a marketing intelligence platform, and Yep, an independent, privacy-focused search engine. | View |
AhrefsSiteAudit | Powers Ahrefs’ Site Audit tool. Ahrefs users can use Site Audit to analyze websites and find both technical SEO and on-page SEO issues. | View |
Algolia | The Algolia Crawler extracts content from your site and makes it searchable. | View |
Amazon Kendra | Amazon Kendra is a managed information retrieval and intelligent search service that uses natural language processing and advanced deep learning models. | View |
Amazon Q | Amazon Q Business is a generative artificial intelligence (generative AI)-powered assistant that you can tailor to your business needs. | View |
Amazonbot | Amazonbot is Amazon's web crawler used to improve our services, such as enabling Alexa to more accurately answer questions for customers. | View |
APIs-Google | Crawling preferences addressed to the APIs-Google user agent affect the delivery of push notification messages by Google APIs. | View |
Applebot | Applebot is used to power search in Spotlight, Siri, and Safari. | View |
Artemis Web Crawler | Artemis is a calm web reader with which you can follow websites and blogs. | View |
Better Stack | Better Stack is a platform for monitoring and alerting on your applications. | View |
Bingbot | Bingbot is Microsoft's web crawler used for indexing websites for Bing Search. | View |
ChatGPT-User | Handles user-initiated requests in ChatGPT, accessing external content to provide real-time information; not used for automated crawling or AI training. | View |
Checkly | Checkly is a platform for monitoring and alerting on your applications. | View |
Chrome Lighthouse | PageSpeed Insights (PSI) reports on the user experience of a page on both mobile and desktop devices, and provides suggestions on how that page may be improved. | View |
Chrome Privacy Preserving Prefetch Proxy | Chrome's Privacy Preserving Prefetch Proxy service that fetches /.well-known/traffic-advice to enable privacy-preserving prefetch hints. | View |
Claude-SearchBot | Claude-SearchBot navigates the web to improve search result quality for users. It analyzes online content specifically to enhance the relevance and accuracy of search responses. | View |
Claude-User | Claude-User supports Claude AI users. When individuals ask questions to Claude, it may access websites using a Claude-User agent. | View |
ClaudeBot | ClaudeBot helps enhance the utility and safety of our generative AI models by collecting web content that could potentially contribute to their training. | View |
Cookiebot | Cookiebot automates compliance with cookie laws and helps you manage your cookie consent preferences. | View |
Datadog Synthetic Monitoring Robot | Datadog's automated monitoring service that performs synthetic tests to verify website availability and performance. | View |
DuckAssistBot | DuckAssistBot is a web crawler for DuckDuckGo Search that crawls pages in real-time for AI-assisted answers, which prominently cite their sources. This data is not used in any way to train AI models. | View |
DuckDuckBot | DuckDuckBot is a web crawler for DuckDuckGo. DuckDuckBot’s job is to constantly improve search results and offer users the best and most secure search experience possible. | View |
FacebookExternalHit | Fetches content for shared links on Meta platforms to generate rich previews. | View |
Feedfetcher | Feedfetcher is used for crawling RSS or Atom feeds for Google News and PubSubHubbub. | View |
GitHub Camo | GitHub's image proxy service | View |
GitHub Hookshot | GitHub's webhooks for events like push, pull request, etc. | View |
Google-CloudVertexBot | Crawling preferences addressed to the Google-CloudVertexBot user agent affect crawls requested by site owners for building Vertex AI Agents. It has no effect on Google Search or other products. | View |
Google-Extended | Google-Extended is a standalone product token that web publishers can use to manage whether their sites help improve Gemini Apps and Vertex AI generative APIs, including future generations of models that power those products. Grounding with Google Search on Vertex AI does not use web pages for grounding that have disallowed Google-Extended. Google-Extended does not impact a site's inclusion or ranking in Google Search. | View |
Google-InspectionTool | Crawling preferences addressed to the Google-InspectionTool user agent affect Search testing tools such as the Rich Result Test and URL inspection in Search Console. It has no effect on Google Search or other products. | View |
Google PageRenderer | Upon user request, Google Page Renderer fetches and renders web pages. | View |
Google Publisher Center | Google Publisher Center fetches and processes feeds that publishers explicitly supplied for use in Google News landing pages. | View |
Google Read Aloud | Upon user request, Google Read Aloud fetches and reads out web pages using text-to-speech (TTS). | View |
Google-Safety | The Google-Safety user agent handles abuse-specific crawling, such as malware discovery for publicly posted links on Google properties. As such it's unaffected by crawling preferences. | View |
Google Site Verifier | Google Site Verifier fetches Search Console verification tokens. | View |
Google StoreBot | Crawling preferences addressed to the Storebot-Google user agent affect all surfaces of Google Shopping (for example, the Shopping tab in Google Search and Google Shopping). | View |
Googlebot | Crawling preferences addressed to the Googlebot user agent affect Google Search (including Discover and all Google Search features), as well as other products such as Google Images, Google Video, Google News, and Discover. | View |
GoogleOther | Crawling preferences addressed to the GoogleOther user agent don't affect any specific product. GoogleOther is the generic crawler that may be used by various product teams for fetching publicly accessible content from sites. For example, it may be used for one-off crawls for internal research and development. It has no effect on Google Search or other products. | View |
GPTBot | Crawls web content to improve OpenAI's generative AI models; respects 'robots.txt' directives to exclude sites from training data. | View |
HetrixTools Uptime Monitoring Bot | HetrixTools Uptime Monitoring Bot is used by HetrixTools's monitoring services to perform various checks on websites, including uptime and performance monitoring. | View |
Hookdeck | A reliable Event Gateway for event-driven applications | View |
Hydrozen | Hydrozen is a tool for monitoring availability of your websites, Cronjobs, APIs, Domains, SSL etc. | View |
ImagesiftBot | ImageSiftBot is a web crawler that scrapes the internet for publicly available images to support Hive's suite of web intelligence products. | View |
Inngest | Inngest is a platform for building event-driven applications. | View |
LinkedInBot | LinkedInBot is a bot that renders links shared on LinkedIn. | View |
Lumar | The Lumar website intelligence platform is used by SEO, engineering, marketing and digital operations teams to monitor the performance of their site’s technical health, and ensure a high-performing, revenue-driving website. | View |
meta-externalagent | The Meta-ExternalAgent crawler crawls the web for use cases such as training AI models or improving products by indexing content directly. | View |
meta-externalfetcher | The Meta-ExternalFetcher crawler performs user-initiated fetches of individual links to support specific product functions. Because the fetch was initiated by a user, this crawler may bypass robots.txt rules. | View |
OAI-SearchBot | Indexes websites for inclusion in ChatGPT's search results; does not crawl content for AI model training. | View |
PayPal | PayPal delivers real-time event notifications for payments, subscriptions, and account updates. | View |
Perplexity-User | Handles user-initiated requests in Perplexity, accessing external content to provide real-time information; not used for automated crawling or AI training. | View |
PerplexityBot | Indexes websites for inclusion in Perplexity's search results; does not crawl content for AI model training. | View |
Pingdom Bot | Pingdom Bot is used by Pingdom's monitoring services to perform various checks on websites, including uptime and performance monitoring. | View |
Pinterest Bot | Pinterest Bot is a bot that crawls the web and indexes images and videos. | View |
QStash | QStash is a platform for building event-driven applications. | View |
Amazon Route 53 Health Check Service | Amazon Route 53 Health Check Service | View |
Semrush | Semrush is a platform for SEO, content marketing, competitor research, PPC and social media marketing. | View |
Sentry Uptime Monitoring Bot | Sentry's Uptime Monitoring Bot performs health checks on configured URLs to monitor the availability and reliability of web services. | View |
Site24x7 | Site24x7 Bot is used by Site24x7's monitoring services to perform various checks on websites, including uptime and performance monitoring. | View |
StatusCake | StatusCake is a website monitoring service that checks the uptime and performance of your website. | View |
Stripe Webhooks | Stripe's webhook service that delivers real-time event notifications for payment processing and account updates. | View |
svix | svix is a webhook service for sending events to webhooks. | View |
Twitterbot | Fetches content for shared links on X/Twitter to generate rich previews. | View |
Uptime Robot | Uptime Robot is a platform for monitoring and alerting on your applications. | View |
v0bot | Bot for v0 services. | View |
Vercel Favicon Bot | Vercel Favicon Bot | View |
vercelflags | vercel flags | View |
Vercel Screenshot Bot | Vercel Screenshot Bot | View |
Yandexbot | YandexBot is a web crawler operated by Yandex, a major Russian search engine. | View |
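Blocking one of the bots listed above with a User Agent match condition comes down to a substring test on the `User-Agent` header. The sketch below shows that matching logic only; it is not Vercel's rule syntax, and the function name `isBlockedBot` and the chosen tokens are hypothetical examples.

```typescript
// Hypothetical helper: true when the User-Agent contains any blocked bot token.
// Matching is case-insensitive, since bots vary in how they capitalize tokens.
function isBlockedBot(userAgent: string, blockedTokens: string[]): boolean {
  const ua = userAgent.toLowerCase();
  return blockedTokens.some((token) => ua.indexOf(token.toLowerCase()) !== -1);
}

// Example: block two AI training crawlers from the directory (assumed choice).
const blockedBotTokens = ["GPTBot", "ClaudeBot"];
```

Substring matching is deliberately narrow here: blocking `ClaudeBot` does not affect `Claude-User` or `Claude-SearchBot`, because those tokens do not contain the blocked string. Keep in mind that a User-Agent header is self-reported, so a match only tells you what the client claims to be.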