
How to Block Playwright Automation on Your Website

Playwright runs real browsers that look identical to human visitors at the network layer. Here is how to detect it, and why robots.txt and IP blocks both fail.

Jun 26, 2026 • 8 min read

Mike Kutlu Client-Side Security Consultant


Playwright is a browser automation framework built and maintained by Microsoft. It controls real Chromium, Firefox, and WebKit browser instances, executes JavaScript fully, and produces sessions that are indistinguishable from genuine users at the network layer. That makes it a preferred tool for a growing category of AI agents, scrapers, and fraud operations that are specifically designed to evade detection.

Blocking Playwright is not like blocking a declared crawler. There is no user-agent string to deny, no published IP range to restrict, and no robots.txt directive it will respect. Detection requires reading what happens inside the browser session itself. Forrester noted the shift in Q4 2025, formally renaming the analyst category to Bot and Agent Trust Management Software to reflect that the problem is no longer bots at the network edge but agents operating inside real browser sessions. For the broader pattern across automated agents, see our guide to blocking AI agents on your website.

What Is Playwright?

Quick answer: Playwright is an open-source browser automation library maintained by Microsoft. It launches and controls real browser instances (Chromium, Firefox, WebKit), executes JavaScript, handles authentication flows, and interacts with web application interfaces. It is widely used for legitimate testing, but also powers a significant share of AI agents, scrapers, and automated fraud operations targeting live websites.

Unlike earlier automation tools such as Selenium or PhantomJS, Playwright was designed for modern web applications. It handles dynamic content, single-page applications, shadow DOM, iframes, and complex authentication flows without the rendering gaps that older tools exhibited. When used by a malicious or undeclared agent, those capabilities make it particularly difficult to distinguish from a real user at the infrastructure level.

Playwright sessions leave a clean network footprint: a standard TLS handshake, a current browser user-agent, valid HTTP/2 headers, and no obvious automation fingerprints in the request stream. The session looks legitimate because it is running a legitimate browser. Operators running evasion-focused deployments also layer stealth libraries such as playwright-stealth or rebrowser-patches on top, which suppress the default JavaScript environment signals that simpler detection scripts rely on.

Why Standard Blocking Methods Fail Against Playwright

Quick answer: Playwright uses real browsers, so IP blocks, user-agent filtering, and robots.txt have no effect. A Playwright agent arrives with a valid IP, a current Chrome or Firefox user-agent, and correct TLS signatures. Network-layer tools cannot distinguish it from a human session without browser-layer signals.

Traditional bot blocking operates on the assumption that automated traffic looks different at the network edge. Playwright removes that assumption. Here is why each standard method fails:

robots.txt: Playwright agents do not check robots.txt. The framework provides no mechanism for honouring it, and any agent using it for scraping or fraud has no incentive to comply.

User-agent filtering: Playwright sessions can present user-agent strings identical to current Chrome, Firefox, or WebKit releases. Blocking common automation user-agents (like HeadlessChrome) catches only default headless Chromium launches. Operators who run headed browser instances or override the user-agent report the standard production string and pass the filter.

IP blocking: Playwright agents typically run on cloud infrastructure, residential proxies, or distributed networks with rotating IP addresses. IP-level blocking catches only unsophisticated deployments. Well-resourced operations rotate through IP pools faster than blocklists can be updated.

WAF and CDN rules: Rules that look for automation signals in headers or request patterns will treat a clean Playwright session as legitimate traffic. At the HTTP layer, the session is indistinguishable from a real user's.
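To make the user-agent point concrete, here is a minimal sketch of a denylist filter of the kind described above. The pattern list and both sample user-agent strings are illustrative assumptions, not rules taken from any particular WAF product.

```python
import re

# Hypothetical denylist of the kind a WAF rule might use. The patterns are
# illustrative only and are not taken from any specific product.
AUTOMATION_UA_PATTERNS = re.compile(
    r"HeadlessChrome|PhantomJS|python-requests|curl/", re.IGNORECASE
)


def blocked_by_user_agent(user_agent: str) -> bool:
    """Return True if the user-agent matches a known automation pattern."""
    return bool(AUTOMATION_UA_PATTERNS.search(user_agent))


# A user-agent that announces headless automation.
declared_ua = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) "
    "HeadlessChrome/126.0.0.0 Safari/537.36"
)
# A stock desktop Chrome user-agent, which an automated browser can also present.
stock_ua = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36"
)

print(blocked_by_user_agent(declared_ua))  # True
print(blocked_by_user_agent(stock_ua))     # False
```

The filter only ever sees the string the client chooses to send, so any session that presents a stock browser user-agent passes regardless of what is driving the browser.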

How Playwright Is Detected at the Browser Layer

Quick answer: Playwright leaves signals inside the browser session that are invisible at the network layer but readable from within the page. These include timing anomalies in interaction sequences, absent or unrealistic micro-movement patterns, navigation path uniformity, and specific JavaScript environment properties that differ from genuine user sessions.

Browser-layer detection reads the signals that Playwright automation cannot easily suppress without breaking its own functionality:

Interaction timing: Human users produce irregular timing between actions, with natural variance in click timing, keystroke intervals, and scroll velocity. Playwright automation produces statistically uniform timing or near-identical sequences across sessions. Even with randomisation added by the operator, the distribution differs from genuine human variance.

Navigation and engagement patterns: Human users exhibit non-linear navigation, revisit previous pages, interact with secondary elements, hover before clicking, and leave micro-corrections in input fields. Playwright sessions follow programmatic sequences with no exploratory behaviour, no pointer hesitation, and no abandoned form inputs.

JavaScript environment properties: Playwright exposes specific runtime properties that differ from genuine browser sessions. The most diagnostic include navigator.webdriver (set to true in unpatched Playwright), window.__playwright and window.__pw_manual context markers, and anomalies in the WebGL renderer string and timing API resolution. Stealth libraries attempt to overwrite these values, but doing so introduces secondary inconsistencies that remain detectable.

CDP traces: When Playwright drives Chromium through the Chrome DevTools Protocol (CDP), the control channel itself is not exposed to page scripts, but an attached CDP client can produce side effects that are observable from within the page context, such as changes in how objects passed to the console are serialised. These are not visible at the HTTP layer but are readable by in-page detection scripts. For a wider walkthrough of these techniques, see our guide to detecting AI agent traffic on your website.
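The JavaScript environment check can be sketched on the server side, assuming an in-page collector posts a small snapshot of the browser environment. The snapshot shape and field names (webdriver, window_globals) are hypothetical. The marker names are the ones listed above.

```python
from typing import Any

# Global names the text above lists as Playwright context markers.
PLAYWRIGHT_GLOBALS = {"__playwright", "__pw_manual"}


def environment_flags(snapshot: dict[str, Any]) -> list[str]:
    """Return the automation indicators present in a reported snapshot.

    `snapshot` is a hypothetical payload from an in-page collector:
      webdriver:      the value of navigator.webdriver
      window_globals: names of properties found on window
    """
    flags = []
    if snapshot.get("webdriver") is True:
        flags.append("navigator.webdriver")
    found = PLAYWRIGHT_GLOBALS & set(snapshot.get("window_globals", []))
    flags.extend(sorted(f"window.{name}" for name in found))
    return flags


unpatched = {"webdriver": True, "window_globals": ["__playwright", "dataLayer"]}
print(environment_flags(unpatched))
# ['navigator.webdriver', 'window.__playwright']
```

An empty result is not proof of a human session. Stealth libraries overwrite exactly these values, so a check like this is best treated as one input alongside behavioural signals.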

[Image: cside AI agent detection dashboard]

What This Looks Like in Practice

A retail competitor deploys a Playwright agent to monitor pricing on your product catalogue. The agent launches a Chromium instance with a current Chrome user-agent and a residential proxy IP from a UK address. Your WAF logs it as a standard browse session from a returning UK user. Your CDN serves it without a challenge.

Inside the session, cside observes: 47 product pages visited in 8 minutes, each with a 3.2-second dwell time and a scroll depth exactly reaching the price element. No hover events before any click. No cursor movement outside the interaction path. Zero variance in the inter-page timing. The pointer trajectory on each product title follows an identical straight line.
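Two of the signals in this session, uniform dwell time and straight-line pointer paths, reduce to simple statistics. The sketch below is an illustration under stated assumptions, not cside's scoring method: the function names are hypothetical, and any threshold applied to the results would need calibrating against real traffic.

```python
import math
from statistics import mean, pstdev


def coefficient_of_variation(intervals: list[float]) -> float:
    """Standard deviation divided by mean. 0.0 means perfectly uniform timing."""
    avg = mean(intervals)
    return pstdev(intervals) / avg if avg else 0.0


def straightness(points: list[tuple[float, float]]) -> float:
    """Ratio of straight-line distance to travelled path length.

    1.0 means the pointer moved in a perfectly straight line. A path with
    no movement at all is also reported as 1.0.
    """
    path = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    chord = math.dist(points[0], points[-1])
    return chord / path if path else 1.0


# Dwell times in seconds: a scripted session versus a made-up human-like one.
scripted_dwell = [3.2, 3.2, 3.2, 3.2, 3.2]
varied_dwell = [2.1, 9.4, 4.8, 15.0, 3.3]

print(coefficient_of_variation(scripted_dwell))  # 0.0
print(coefficient_of_variation(varied_dwell) > 0.5)

# Pointer samples as (x, y): a direct path versus one with a detour.
print(straightness([(0, 0), (3, 4)]))
print(straightness([(0, 0), (3, 0), (3, 4)]))
```

A single session with low variance or a straight path is weak evidence on its own. The pattern becomes diagnostic when the same values repeat across pages and sessions, as in the example above.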

The network layer saw a clean session. The browser layer saw a machine. cside classifies the session as a pricing-intelligence agent and triggers a policy response, while the WAF log still shows nothing unusual.

A second pattern cside sees frequently involves checkout fraud. A Playwright agent iterates through card number ranges on a low-friction checkout page, submitting one order attempt every 4 to 6 seconds across hundreds of sessions. Each session uses a different residential IP, a fresh browser context, and a realistic navigator.userAgent string. The WAF sees distributed, low-rate traffic from clean IPs. The browser layer sees zero real pointer movement, no form field hesitation, and navigator.webdriver returning true in unpatched instances. Intent classification at the browser layer stops the enumeration before meaningful card data is validated.

Blocking and Policy Options for Playwright Agents

Quick answer: Once a Playwright session is detected at the browser layer, policy responses can include hard block, silent misdirection (serving altered content to the agent while allowing genuine users through), rate limiting, or allow with monitoring. The right response depends on whether the agent's intent is commercial, malicious, or legitimately automated.

Hard blocking (returning a 403 or redirect) is appropriate for agents showing malicious intent signals, such as credential stuffing, checkout fraud, or high-volume data extraction. Silent misdirection is often more effective for pricing intelligence operations, as it degrades the quality of data the competitor receives without alerting them that detection has occurred.

Not all Playwright sessions are hostile. Legitimate testing pipelines, accessibility audits, and internal automation tools also use Playwright. An effective policy separates intent from tooling: the question is not "is this Playwright?" but "what is this session trying to do, and is it authorised to do it?"

Per-page policy rules allow different responses on different sections of the site. A Playwright agent on the pricing page presents a different risk profile than one on the blog. Blocking uniformly across the site risks disrupting legitimate automated workflows. The same intent-led logic applies to agentic browsers such as OpenAI Operator that drive real browser sessions.
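Per-page policy can be expressed as a small rule table keyed on path and classified intent. The sketch below assumes hypothetical path prefixes, intent labels, and a first-match-wins ordering. It is not cside's policy engine.

```python
from enum import Enum


class Action(Enum):
    BLOCK = "block"
    MISDIRECT = "misdirect"
    RATE_LIMIT = "rate_limit"
    MONITOR = "monitor"


# Hypothetical rules: (path prefix, intent, action). The first match wins.
POLICY_RULES = [
    ("/checkout", "fraud", Action.BLOCK),
    ("/pricing", "pricing_intelligence", Action.MISDIRECT),
    ("/pricing", "unknown_automation", Action.RATE_LIMIT),
    ("/blog", "unknown_automation", Action.MONITOR),
]


def decide(path: str, intent: str, default: Action = Action.MONITOR) -> Action:
    """Return the action for a session with the given intent on the given path."""
    for prefix, rule_intent, action in POLICY_RULES:
        if path.startswith(prefix) and intent == rule_intent:
            return action
    return default


print(decide("/pricing/widgets", "pricing_intelligence"))  # Action.MISDIRECT
print(decide("/blog/playwright", "unknown_automation"))    # Action.MONITOR
```

Defaulting to monitoring rather than blocking keeps unmatched automation, including your own pipelines, working while you gather evidence.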

Should You Try to Block All Playwright Traffic?

Quick answer: No. Playwright is also used by legitimate testing tools, monitoring services, and internal automation. The correct approach is intent classification, not tool blocking. Detect what the agent is doing, not what tool it is using, and apply policy based on the session's behaviour and the risk it presents to that specific page or function.

Organisations that attempt to block all Playwright traffic typically see two outcomes: they disrupt their own CI/CD pipelines and monitoring tools, and they push adversarial agents to switch to equivalent frameworks (Puppeteer, Selenium, custom browser automation) that present identical detection challenges.

The more durable approach is browser-layer intent classification. A Playwright session completing a legitimate synthetic monitoring check, a Playwright session scraping pricing data, and a Playwright session attempting card testing on a checkout form all use the same tooling. The session behaviour distinguishes them. In cside's controlled testing, traditional bot detection tools failed to classify malicious AI agent sessions correctly in 81 out of 100 test scenarios, a gap that reflects architecture rather than configuration. The same browser-layer approach extends to blocking AI content scrapers that arrive without a declared user-agent.
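As a rough illustration of separating the three sessions above by behaviour rather than tooling, here is a rule-based sketch. The feature names, intent labels, and thresholds are all assumptions chosen for the example. A production classifier would use far richer signals.

```python
from dataclasses import dataclass


@dataclass
class SessionFeatures:
    from_known_infrastructure: bool  # e.g. your own CI or monitoring hosts
    product_pages_viewed: int
    payment_attempts: int
    pointer_events: int


def classify_intent(s: SessionFeatures) -> str:
    """Assign a coarse intent label. The thresholds are placeholders."""
    if s.from_known_infrastructure:
        return "authorised_automation"
    if s.payment_attempts >= 3 and s.pointer_events == 0:
        return "card_testing"
    if s.product_pages_viewed >= 30 and s.pointer_events == 0:
        return "scraping"
    return "unclassified"


monitoring = SessionFeatures(True, 1, 0, 0)
scraper = SessionFeatures(False, 47, 0, 0)
card_tester = SessionFeatures(False, 0, 12, 0)

for session in (monitoring, scraper, card_tester):
    print(classify_intent(session))
# authorised_automation
# scraping
# card_testing
```

All three sessions could be running the same Playwright build. Only the behavioural features differ, which is the point of classifying on intent.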

Client-Side Security Consultant Mike Kutlu

Client-side security consultant at cside. 10+ years of experience implementing technology solutions for enterprises (previously at Oracle, Cloudflare, and Splunk). Now helping teams use client-side intelligence to catch & reduce fraud.



Frequently Asked Questions

Why is Playwright harder to block than declared crawlers?

Playwright controls real browser instances that produce valid TLS signatures, current user-agent strings, and standard HTTP headers. Unlike declared crawlers such as GPTBot or CCBot that identify themselves and can be blocked by user-agent or robots.txt, Playwright sessions look identical to genuine human sessions at the network layer. Detection requires reading signals from inside the browser session.

Does Playwright respect robots.txt?

No. Playwright does not check or honour robots.txt. The framework has no built-in compliance mechanism, and agents using Playwright for scraping or automation have no obligation or incentive to follow robots.txt directives. robots.txt is only effective against declared, cooperative crawlers.

Can a WAF detect Playwright?

Standard WAF rules based on IP reputation, user-agent matching, and request header analysis will not reliably detect Playwright. Playwright produces clean network-layer traffic. WAFs that incorporate behavioural anomaly detection can flag some patterns, but they lack access to the in-browser signals such as interaction timing, navigation patterns, and JavaScript environment properties that distinguish Playwright from a genuine user.

What signals reveal a Playwright session?

Key signals include statistically uniform interaction timing across sessions, absent hover events before clicks, straight-line pointer trajectories, scroll depth exactly reaching target elements, zero input correction events, and specific JavaScript runtime properties such as navigator.webdriver being set to true and the presence of window.__playwright context markers. These signals are not visible in HTTP logs but are readable by in-page detection scripts.

How do you block malicious Playwright traffic without disrupting legitimate automation?

Use intent classification rather than tool blocking. Legitimate Playwright usage such as internal testing, synthetic monitoring, and accessibility audits follows consistent, narrow behavioural patterns tied to known infrastructure. Malicious Playwright usage exhibits different behavioural signatures across scale, timing distribution, navigation targets, and interaction sequences. Per-page policy rules let you apply different responses on high-risk pages such as checkout, account creation, and pricing without disrupting legitimate automation elsewhere.
