The request path
A request hits your edge
An AI agent or a human visits a page on your site. The request arrives at
your CDN or edge before it reaches your origin.
The integration captures it
The integration you installed reads the request metadata: the user-agent, path,
method, country, IP, referer, and response status. It forwards a small event to
Rankly and lets the request continue to your site untouched. Your visitors
see no change.
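As a rough illustration of the capture step, the sketch below builds a small event from the fields listed above. The `RequestEvent` shape, its field names, and the `build_event` helper are hypothetical and are not Rankly's actual schema or API; forwarding the event is left out.

```python
from dataclasses import dataclass, asdict
from typing import Mapping, Optional


@dataclass(frozen=True)
class RequestEvent:
    """Hypothetical event shape; field names are illustrative only."""
    user_agent: str
    path: str
    method: str
    country: Optional[str]
    ip: str
    referer: Optional[str]
    status: int


def build_event(
    method: str,
    path: str,
    headers: Mapping[str, str],
    client_ip: str,
    status: int,
    country: Optional[str] = None,
) -> dict:
    """Collect request metadata into a plain dict without touching the request."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    event = RequestEvent(
        user_agent=lowered.get("user-agent", ""),
        path=path,
        method=method.upper(),
        country=country,
        ip=client_ip,
        referer=lowered.get("referer"),
        status=status,
    )
    return asdict(event)
```

The function only reads its inputs and returns a new dict, which mirrors the idea that the request itself continues to your site unchanged.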
Rankly classifies and verifies
Rankly names the agent (for example, GPTBot), groups it by what it does
(training, search-index, live-fetch), and checks its identity against the
vendor’s published IP ranges and signed-request signatures.
What Rankly captures
Beyond AI bots, the integrations also forward two kinds of browser-user-agent requests on purpose, because they power features that would otherwise read zero:
LLM referrals
A human who clicks a citation in ChatGPT, Perplexity, or Gemini arrives with
a normal browser user-agent and an LLM referer. We keep these so you can see
AI-driven traffic, not just crawls.
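One way such a referral could be recognized is by matching the referer's hostname against a list of LLM products. This is only a sketch: the host list and the `llm_referral_source` helper are assumptions for illustration, not the list or logic Rankly uses.

```python
from typing import Optional
from urllib.parse import urlsplit

# Illustrative hosts only (an assumption); a real list would be longer
# and maintained over time.
LLM_REFERER_HOSTS = {
    "chatgpt.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "gemini.google.com": "Gemini",
}


def llm_referral_source(referer: Optional[str]) -> Optional[str]:
    """Return the LLM product name for a referer URL, or None if it is not one."""
    if not referer:
        return None
    try:
        host = (urlsplit(referer).hostname or "").lower()
    except ValueError:
        # Malformed referer values are treated as "no referral".
        return None
    for known_host, product in LLM_REFERER_HOSTS.items():
        # Match the host itself or any subdomain, but not look-alike domains.
        if host == known_host or host.endswith("." + known_host):
            return product
    return None
```

Matching on the parsed hostname rather than a substring of the whole URL avoids counting a look-alike domain such as `notchatgpt.com` as a referral.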
Humanity signal
Real browsers fetch /favicon.ico; headless scrapers usually don’t. We use
favicon fetches as a lightweight signal to separate real humans from
look-alike bots.
Why the edge, not a tag
A JavaScript tag only fires when a real browser runs your page. Most AI crawlers never execute JavaScript, so a tag-based tool counts almost none of them. Because Rankly reads requests at the edge, it counts the crawl whether or not any code runs.
Some integrations read logs your platform already produces (CloudFront,
Google Cloud, Fastly, Akamai, Vercel/Netlify Log Drains). Others sit in the
request path (Cloudflare Worker, Vercel/Netlify edge functions). Both report
the same events; pick whichever your platform makes easiest.
What Rankly works out for every request
From the raw request, Rankly derives the full picture:
Who it is
The agent’s name, vendor, category, and (for AI bots) purpose: training,
search-index, or live-fetch.
Whether it's real
Verified, unverified, or spoofed, by checking published IP ranges and signed
requests.
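The IP-range half of this check can be sketched with Python's standard `ipaddress` module. The mapping of outcomes to the three labels below is an assumption made for illustration (no published ranges means unverified, an address outside the ranges means spoofed), the `verification_status` helper is hypothetical, and signed-request verification is omitted entirely.

```python
import ipaddress
from typing import Iterable


def verification_status(client_ip: str, published_cidrs: Iterable[str]) -> str:
    """Classify a request that claims to come from a known agent.

    published_cidrs stands in for the vendor's published IP ranges.
    """
    networks = [ipaddress.ip_network(cidr) for cidr in published_cidrs]
    if not networks:
        # Nothing to check against, so the claim can be neither confirmed nor refuted.
        return "unverified"
    address = ipaddress.ip_address(client_ip)
    # Membership is False when the address and network IP versions differ.
    if any(address in network for network in networks):
        return "verified"
    return "spoofed"
```

For example, with the documentation range `192.0.2.0/24` standing in for a vendor's published range, `verification_status("192.0.2.10", ["192.0.2.0/24"])` yields `"verified"`, while an address outside that range yields `"spoofed"`.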
Whether it behaved
Whether the crawler respected your robots.txt, and which paths it shouldn’t
have fetched.
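A minimal sketch of such a compliance check, using the standard library's `urllib.robotparser`, is shown here. The `disallowed_fetches` helper and the sample rules are hypothetical, and Rankly's own robots.txt evaluation may differ.

```python
from typing import Iterable, List
from urllib.robotparser import RobotFileParser


def disallowed_fetches(
    robots_txt: str, agent_token: str, fetched_paths: Iterable[str]
) -> List[str]:
    """Return the fetched paths that robots.txt disallows for this agent.

    agent_token is the crawler's product token (for example "GPTBot"),
    not the full User-Agent header value.
    """
    parser = RobotFileParser()
    # parse() takes the file's lines; no network request is made here.
    parser.parse(robots_txt.splitlines())
    return [
        path for path in fetched_paths
        if not parser.can_fetch(agent_token, path)
    ]


SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /private/

User-agent: *
Disallow:
"""
```

With these sample rules, a GPTBot crawl of `/`, `/private/report`, and `/blog` would flag only `/private/report`, while the same paths fetched by an agent covered by the `*` group would flag nothing.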
Whether it's disguised
A proprietary check that catches headless bots posing as real browsers.
Where your data lives
Every event is tied to your account through your ingest token and filtered to the domains you have verified. Events for hosts you have not verified are dropped, so your dashboard only ever shows your own traffic.
See the event fields
The exact fields Rankly reads from each request, and what it derives.