You have three problems wearing the same costume. A human reading your checkout page, a search crawler indexing it, and a stealth browser enumerating stolen cards against it can all present a plausible Chrome user-agent and a clean residential IP. Treat them as one bucket and you either block revenue or wave through fraud.
The fix is a taxonomy and a decision: classify each session into a known class, read what it is trying to do, and map that to exactly one action (allow, monitor, challenge, serve agent content, or block). This post is the classification and decision framework. For the underlying signal mechanics, the guide to detecting AI agent traffic covers identity, network, browser, and behavioral signals; for picking a vendor, see how to choose an AI agent detection solution. When you need to know why older defenses miss this traffic, legacy bot detection in the age of AI agents explains the gap. Here, the job is deciding what to do once you can see the traffic.
A five-class taxonomy that maps to action
"Good bot vs bad bot" is too coarse, because a consumer's shopping agent is automated and welcome, while a search crawler is automated and welcome for a completely different reason. Split traffic into five operational classes, each tied to a default action:
| Class | Examples | Intent | Default action |
|---|---|---|---|
| Human | Real visitors, logged-in customers | Browse, buy, manage account | Allow, monitor risk |
| Good bot | Googlebot, GPTBot, ClaudeBot, PerplexityBot, partner API bots | Index content, declared integration | Allow, rate-limit, verify identity |
| Neutral automation | Uptime monitors, link checkers, RSS/preview fetchers | Operational, low value, low harm | Monitor, rate-limit |
| Consumer AI agent | Shopping and research agents acting for a real user | Complete a task on behalf of a person | Allow or serve agent content |
| Malicious agent | Scrapers, card testers, account-abuse bots, stealth browsers | Extract value or commit fraud | Challenge or block |
The class is not fixed for a session. A consumer agent browsing product pages is in the "allow" column right up until it starts submitting payment forms at machine speed, at which point its intent, and its class, have changed.
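The table can be sketched as a plain lookup. This is a minimal illustration, not a reference implementation: the enum and function names are hypothetical, and where the table lists several default actions for a class, the sketch assumes the first one is the primary action.

```python
from enum import Enum


class TrafficClass(Enum):
    HUMAN = "human"
    GOOD_BOT = "good_bot"
    NEUTRAL_AUTOMATION = "neutral_automation"
    CONSUMER_AI_AGENT = "consumer_ai_agent"
    MALICIOUS_AGENT = "malicious_agent"


class Action(Enum):
    ALLOW = "allow"
    MONITOR = "monitor"
    CHALLENGE = "challenge"
    SERVE_AGENT_CONTENT = "serve_agent_content"
    BLOCK = "block"


# Assumption: the first action listed in each table row is the primary default.
DEFAULT_ACTION = {
    TrafficClass.HUMAN: Action.ALLOW,
    TrafficClass.GOOD_BOT: Action.ALLOW,
    TrafficClass.NEUTRAL_AUTOMATION: Action.MONITOR,
    TrafficClass.CONSUMER_AI_AGENT: Action.ALLOW,
    TrafficClass.MALICIOUS_AGENT: Action.CHALLENGE,
}


def default_action(traffic_class: TrafficClass) -> Action:
    """Return the default action for a session's current class."""
    return DEFAULT_ACTION[traffic_class]


# The class is re-evaluated during a session, so the lookup is repeated
# whenever the class changes rather than cached once per visitor.
session_class = TrafficClass.CONSUMER_AI_AGENT
before = default_action(session_class)
session_class = TrafficClass.MALICIOUS_AGENT  # e.g. machine-speed payment submissions
after = default_action(session_class)
```

Keeping the mapping in data rather than in branching code makes the defaults easy to audit and to override per path.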
Identity tells you who; intent tells you what to do
Identity signals answer "who does this claim to be": user-agent, declared crawler name, fingerprint. They are necessary and almost free to spoof. A self-declaring GPTBot can be verified by cross-checking the request IP against the crawler's published ranges, which catches impersonators. But the dangerous classes never declare themselves.
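A rough sketch of that cross-check, using Python's standard `ipaddress` module. The CIDR blocks below are reserved documentation ranges used as placeholders, not any crawler's real ranges, and the function names are hypothetical; in practice the ranges would be loaded from each operator's published list.

```python
import ipaddress

# Placeholder ranges for illustration only (RFC 5737 documentation blocks).
# Real deployments would load each crawler's published ranges instead.
PUBLISHED_RANGES = {
    "GPTBot": ["192.0.2.0/24"],
    "ClaudeBot": ["198.51.100.0/24"],
}


def declared_crawler(user_agent: str):
    """Return the crawler name the user-agent claims, or None."""
    ua = user_agent.lower()
    for name in PUBLISHED_RANGES:
        if name.lower() in ua:
            return name
    return None


def verify_crawler(user_agent: str, remote_ip: str) -> str:
    """Classify a request as 'undeclared', 'verified', or 'impersonator'.

    Raises ValueError if remote_ip is not a valid IP address.
    """
    name = declared_crawler(user_agent)
    if name is None:
        return "undeclared"
    ip = ipaddress.ip_address(remote_ip)
    for cidr in PUBLISHED_RANGES[name]:
        if ip in ipaddress.ip_network(cidr):
            return "verified"
    return "impersonator"
```

An "undeclared" result is not a verdict. It only means identity has nothing more to say and the intent signals have to carry the decision.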
Intent signals answer "what is this session doing." They live in behavior and in the runtime, and they are far more expensive to fake convincingly:
- navigator.webdriver set, or suppressed too cleanly, on a session that otherwise looks like vanilla Chrome.
- CDP / Runtime leaks: Chrome DevTools Protocol artifacts (cdc_ properties, stripped accessibility nodes) that betray Playwright or Puppeteer driving the page.
- Fingerprint drift: WebGL, Canvas, and Audio context that do not tell a coherent story about one device, or that mutate across a session.
- Residential-proxy behavior: a "consumer" IP whose timezone, language, and ASN history don't line up, rotating across requests.
- Action cadence: a burst of card submissions in a few minutes is intent, not identity. No user-agent string will tell you that; the sequence of actions will.
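The cadence signal in particular is straightforward to sketch as a sliding-window counter. The class name, the five-minute window, and the limit of five submissions are illustrative assumptions, not recommended thresholds.

```python
from collections import deque


class ActionCadence:
    """Sliding-window counter for one sensitive action in one session.

    Assumes timestamps (in seconds) are passed in non-decreasing order.
    """

    def __init__(self, window_seconds: float = 300.0, max_events: int = 5):
        self.window_seconds = window_seconds
        self.max_events = max_events
        self._timestamps = deque()

    def record(self, timestamp: float) -> bool:
        """Record one event; return True if the window now exceeds the limit."""
        self._timestamps.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop events that have fallen out of the window.
        while self._timestamps and self._timestamps[0] < cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) > self.max_events


# Six card submissions ten seconds apart trip the limit on the sixth.
cadence = ActionCadence(window_seconds=300, max_events=5)
flags = [cadence.record(t) for t in range(0, 60, 10)]
```

One counter per session and per action type keeps the signal scoped: a burst of card submissions trips the checkout counter without touching ordinary page views.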
You classify on identity plus intent together. A session that passes every identity check but fails on runtime and cadence is exactly the malicious-agent case that network-only tooling waves through.
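One way to combine the two signal families is a small rule function. The field names, the "ambiguous" holding class, and the rule that a cadence burst plus at least one runtime hit means malicious are all assumptions made for illustration; real thresholds would be tuned against your own traffic.

```python
from dataclasses import dataclass


@dataclass
class SessionSignals:
    # Identity signals
    identity_checks_passed: bool = False  # plausible UA, clean IP, and so on
    verified_crawler: bool = False        # self-declared and IP-verified
    # Intent signals
    webdriver_flag: bool = False
    cdp_artifacts: bool = False
    fingerprint_drift: bool = False
    proxy_mismatch: bool = False
    cadence_burst: bool = False


def classify(s: SessionSignals) -> str:
    """Return 'malicious_agent', 'good_bot', 'ambiguous', or 'human'."""
    runtime_hits = sum(
        [s.webdriver_flag, s.cdp_artifacts, s.fingerprint_drift, s.proxy_mismatch]
    )
    # Passing identity checks does not rescue a session that fails
    # on both runtime and cadence.
    if s.cadence_burst and runtime_hits >= 1:
        return "malicious_agent"
    if s.verified_crawler:
        return "good_bot"
    if runtime_hits or s.cadence_burst:
        return "ambiguous"  # keep collecting signals before adding friction
    return "human"
```

Note that `identity_checks_passed` never upgrades a session on its own, which mirrors the point above: identity is necessary, but it does not outvote intent.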
Why this matters more in 2026
The malicious class got cheap. cside's 2026 web security research reports that playwright-stealth installs were roughly ten times higher through 2025, a clean proxy for how fast anti-detection automation moved from a niche into mainstream attack tooling. (Source: cside 2026 research report.)
At the same time the welcome classes grew. AI-search crawlers now drive real discovery, and consumer shopping agents complete real purchases. So the two ends of the taxonomy expanded at once: more automation you want to allow, and more automation built specifically to look like it. That is why a binary detector fails: it has no column for "automated and welcome." For the deep mechanics of how the malicious end hides, see stealth browsers and anti-detect browsers, explained. The same signals catch the credential-stuffing runs that hit the login once an agent shifts from browsing to attacking accounts.
Map each class to one enforcement action
Once a session is classified, enforcement should be deterministic. Five actions cover the taxonomy:
- Allow: humans and verified good bots in their expected paths. Log and move on.
- Monitor: neutral automation and any session whose class is still ambiguous. Collect signals, don't add friction yet.
- Challenge / throttle: sessions trending malicious. Slow them, step up verification, or rate-limit the specific action (login, checkout) rather than the whole site.
- Serve agent content: a known consumer agent on a path where you'd rather guide than block. Give it a purpose-built view or a "contact us" step instead of leaking raw pricing to a scraper-shaped session.
- Block: confirmed malicious intent such as card enumeration, credential stuffing, and account-abuse runs.
Two rules keep this honest. Scope enforcement to the action, not the visitor: challenge the checkout submission, don't 403 the homepage. And make the decision per page: a stealth browser reading a blog post is a monitor case; the same session on your card vault is a block case. For the playbook on the block end, see how to block AI agents on your website, and for the payment-fraud variant, how to block AI card-testing agents.
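Both rules can be expressed as a single deterministic function of class and path. This is a sketch under stated assumptions: the path prefixes, the class labels (including a hypothetical "suspected" label for sessions trending malicious), and the function name are illustrative, not a prescribed policy.

```python
# Hypothetical path prefixes; substitute your own routes.
SENSITIVE_PREFIXES = ("/checkout", "/login", "/account")
AGENT_GUIDED_PREFIXES = ("/pricing",)


def decide(traffic_class: str, path: str) -> str:
    """Map (class, path) to exactly one enforcement action."""
    sensitive = path.startswith(SENSITIVE_PREFIXES)

    if traffic_class == "malicious_agent":
        # Same session, different page, different action.
        return "block" if sensitive else "monitor"
    if traffic_class == "suspected":
        # Trending malicious: add friction only on the sensitive action.
        return "challenge" if sensitive else "monitor"
    if traffic_class == "consumer_ai_agent":
        if path.startswith(AGENT_GUIDED_PREFIXES):
            return "serve_agent_content"
        return "allow"
    if traffic_class == "neutral_automation":
        return "monitor"
    if traffic_class in ("human", "good_bot"):
        return "allow"
    # Unknown or still-ambiguous class: observe without friction.
    return "monitor"
```

Because the function has no hidden state, the same inputs always produce the same action, which keeps enforcement auditable and easy to test.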
Where the classification has to happen
This taxonomy only works if you can read intent, and intent lives in the browser. AI crawlers that never execute JavaScript never fire your analytics, so they're invisible to GA4 and PostHog. Consumer and malicious agents run real browsers and look human to those same tools. Neither end is separable at the analytics layer, and most of the malicious class passes network-layer checks by design: clean IP, valid user-agent, plausible request shape.
cside watches the browser runtime in real time. It captures the device and real IP, surfaces the automation and fingerprint signals that reveal intent, flags AI agents and stealth browsers inside the page, and exposes those signals via API so you can drive the allow / monitor / challenge / serve / block decision in your own workflow. That is the layer where a human, a good bot, and a malicious agent finally stop looking alike.







