How to Check If AI Agents Can See Your Website (Step-by-Step)
Most websites are invisible to AI agents and don't know it. Here's how to check - from robots.txt to Schema.org to the real-world test.
Dani
A friend asked me recently: "If I ask ChatGPT to recommend a tool in my industry, will it know my company exists?"
I told him to try it. He did. It didn't.
That test is what makes the penny drop for most people I've talked to. Not the technical arguments about structured data or robots.txt permissions - just the simple, slightly uncomfortable moment of asking an AI about your own business and getting silence back.
Most websites are invisible to AI agents - and don't know it
The majority of websites are either partially or fully invisible to AI agents. Not because their content is bad, but because of technical blockers they don't even know about.
The most common culprit? Robots.txt rules that block AI crawlers. Many sites added Disallow rules for bots like GPTBot and ClaudeBot during the 2023-2024 panic about AI scraping copyrighted content. Others use Cloudflare or similar CDNs that started blocking AI bots by default. The site owner has no idea - they just notice their business never shows up in AI-generated answers.
Here's how to check, step by step.
Step 1: Check your robots.txt
Go to yoursite.com/robots.txt in your browser. Look for these user agents:
- GPTBot - OpenAI (ChatGPT)
- ChatGPT-User - OpenAI (ChatGPT browsing mode)
- ClaudeBot - Anthropic (Claude)
- Claude-Web - Anthropic (Claude web search)
- Google-Extended - Google (Gemini, AI Overviews)
- PerplexityBot - Perplexity AI
- Applebot-Extended - Apple Intelligence
- Bytespider - ByteDance AI
- CCBot - Common Crawl (used in many LLM training sets)
- Amazonbot - Amazon Alexa AI
If you see Disallow: / under any of these user agents, that AI system is blocked from your entire site. And a blanket User-agent: * group with Disallow: / blocks every bot that doesn't have its own rules - including all of these AI crawlers.
What good looks like:
User-agent: GPTBot
Allow: /
User-agent: ClaudeBot
Allow: /
User-agent: Google-Extended
Allow: /
This is something I check on every site I audit, and it's surprising how often the issue is right here. A one-line fix in robots.txt that nobody ever looked at.
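If you'd rather script this check than eyeball the file, here's a minimal sketch using Python's standard-library robots.txt parser. The agent list is the one from the checklist above; the blocked_agents helper name is my own, not an established API.

```python
from urllib.robotparser import RobotFileParser

# The AI crawler user agents from the checklist above.
AI_AGENTS = [
    "GPTBot", "ChatGPT-User", "ClaudeBot", "Claude-Web", "Google-Extended",
    "PerplexityBot", "Applebot-Extended", "Bytespider", "CCBot", "Amazonbot",
]

def blocked_agents(robots_txt: str, url: str = "https://example.com/") -> list:
    """Return the AI user agents that this robots.txt blocks for the given URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in AI_AGENTS if not parser.can_fetch(agent, url)]
```

Fetch yoursite.com/robots.txt, pass its text to blocked_agents, and anything in the returned list is an AI system that can't read your site.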
Step 2: Check your Cloudflare / CDN settings
This is the hidden gotcha. In 2025, Cloudflare changed its default configuration to block AI bots. If you use Cloudflare - and about 20% of the web does - your AI bot traffic may have been turned off without you doing anything.
Check your Cloudflare dashboard under Security > Bots. Look for an "AI Scrapers and Crawlers" toggle. If it's enabled, AI bots are being blocked at the CDN level, before they even reach your robots.txt.
I ran into this myself while building AgentReady's crawler. Sites with aggressive WAF settings reject any bot that doesn't look like Chrome. We had to add browser-like headers to our own crawler just to avoid getting blocked by overzealous firewalls. If it's happening to our crawler, it's happening to GPTBot and ClaudeBot too.
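One way to spot CDN-level blocking from the outside is to request your homepage twice - once presenting as an AI bot, once as a regular browser - and compare status codes. A sketch, with hypothetical helper names; the 403/503 pairing is the typical shape of a Cloudflare block or challenge, not an exhaustive list:

```python
import urllib.error
import urllib.request

def fetch_as_agent(url: str, user_agent: str) -> int:
    """Request a URL while presenting as the given user agent; return the HTTP status."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code

def looks_blocked(status: int) -> bool:
    # Bot challenges commonly surface as 403 (Forbidden)
    # or 503 (a challenge interstitial).
    return status in (403, 503)
```

If fetch_as_agent(url, "GPTBot") looks blocked while a browser user agent gets a 200, something between the internet and your robots.txt is filtering bots.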
Step 3: Test your Schema.org structured data
Your Schema.org JSON-LD markup is how AI agents understand what your site is about - your products, your business, your content structure.
The quickest way to check: go to Google's Rich Results Test and paste your URL. It'll show you what structured data it finds and whether it's valid.
What to look for:
- Organization - Does it know your company name, logo, description, contact info?
- Product - Can it identify your products, pricing, availability?
- Article / BlogPosting - Are your articles marked up with author, date, headline?
- FAQPage - Do your FAQ sections have proper schema so AI can extract Q&A pairs?
- BreadcrumbList - Can it understand your site's navigation structure?
If the Rich Results Test shows nothing or just a basic WebSite schema, AI agents are working with very limited information about your business.
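You can also check this locally by pulling the JSON-LD blocks out of your page source and listing their @type values. A simplified sketch using only the standard library - it ignores @graph wrappers and other edge cases a real validator handles:

```python
import json
from html.parser import HTMLParser

class JSONLDExtractor(HTMLParser):
    """Collect the @type value from every JSON-LD script block in a page."""
    def __init__(self):
        super().__init__()
        self.in_jsonld = False
        self.types = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self.in_jsonld = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_jsonld = False

    def handle_data(self, data):
        if self.in_jsonld and data.strip():
            try:
                blocks = json.loads(data)
            except json.JSONDecodeError:
                return  # malformed JSON-LD is itself a finding worth fixing
            if isinstance(blocks, dict):
                blocks = [blocks]
            self.types += [b.get("@type") for b in blocks if isinstance(b, dict)]

def schema_types(html: str) -> list:
    parser = JSONLDExtractor()
    parser.feed(html)
    return parser.types
```

Run it over your homepage HTML: an empty list, or just ["WebSite"], means AI agents have almost nothing structured to work with.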
The thing that surprised me when I started looking at this seriously: how much is missing from most sites, and how easy it is to fix. Most of these Schema.org types are 10-20 lines of JSON you add once. The impact is disproportionate to the effort.
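For a sense of scale, here's roughly what a minimal Organization block looks like - placeholder values throughout, dropped into a script tag with type="application/ld+json" in your page's head:

```json
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Example Co",
  "url": "https://example.com",
  "logo": "https://example.com/logo.png",
  "description": "One plain sentence about what the company does.",
  "contactPoint": {
    "@type": "ContactPoint",
    "email": "hello@example.com",
    "contactType": "customer support"
  }
}
```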
Step 4: Check if your content renders without JavaScript
Open your site in Chrome, then open DevTools (F12) → Settings → Debugger → check "Disable JavaScript." Reload the page. What do you see?
If you see your content - headings, text, images - you're fine. AI crawlers will see it too.
If you see a blank page, a loading spinner, or just a <div id="root"></div> - AI crawlers see the same nothing. Your entire site content is invisible to them.
This is common with single-page apps built in React, Vue, or Angular that do client-side rendering. The fix is usually server-side rendering (SSR) or static site generation (SSG) - Next.js, Nuxt, and similar frameworks make this straightforward.
Step 5: Ask an AI about your business
This is the real-world test. Open ChatGPT, Claude, and Perplexity. Ask them:
- "What does [your company name] do?"
- "Recommend a [your product category] tool"
- "Compare [your company] vs [competitor]"
What do they say? Do they know you exist? Is the information accurate? Are they recommending your competitors instead?
This isn't a technical audit - it's a reality check. It's exactly what my friend did, and it's the most persuasive argument I've found for why this stuff matters. When an AI can't tell someone about your business, you feel it.
The automated version
The steps above work, but they're manual and they miss a lot. There are over 50 signals that affect AI agent readiness - checking them all by hand is tedious and easy to get wrong.
That's why I built AgentReady. You paste a URL, and in about 30 seconds you get a score from 0-100 across three layers: Discovery (can agents find you), Data Quality (can they understand you), and Actionability (can they act on your behalf). Every check tells you what's passing, what's failing, and exactly how to fix it - with code examples you can copy-paste.
No signup, no email, no cost. I built this because the manual process above is what I was doing for every site I checked, and it took forever. I wanted anyone to be able to answer "can AI agents see my business?" without needing to hire someone or spend half a day in DevTools.
The fixes are easier than most people think. A few hours of work right now gives you a real advantage over everyone who hasn't started. And almost nobody has started.