Bot Crawl Statistics
Understanding how AI systems interact with your website is fundamental to GEO. This page explains the metrics we track and what they reveal about AI visibility.
What We Track
Every bot visit to a GEO-optimized site is logged in the bot_crawl_logs table with the following data points:
Bot Identity
Which AI crawler made the request — GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Bingbot, Applebot, and others. Identified via User-Agent string matching.
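User-Agent matching can be sketched as a simple ordered pattern scan. This is an illustrative implementation, not the production matcher; the patterns below are assumptions based on the public crawler names listed above.

```python
import re

# Illustrative patterns for the crawlers named above; real matching rules
# may be stricter (e.g. anchoring on the product token).
BOT_PATTERNS = [
    ("GPTBot", re.compile(r"GPTBot", re.IGNORECASE)),
    ("ClaudeBot", re.compile(r"ClaudeBot", re.IGNORECASE)),
    ("PerplexityBot", re.compile(r"PerplexityBot", re.IGNORECASE)),
    ("Google-Extended", re.compile(r"Google-Extended", re.IGNORECASE)),
    ("Bingbot", re.compile(r"bingbot", re.IGNORECASE)),
    ("Applebot", re.compile(r"Applebot", re.IGNORECASE)),
]

def identify_bot(user_agent: str):
    """Return the crawler name if the User-Agent matches a known AI bot,
    else None for ordinary (human or unrecognized) traffic."""
    for name, pattern in BOT_PATTERNS:
        if pattern.search(user_agent):
            return name
    return None
```

Unmatched requests simply fall through to `None`, which is why logging the full User-Agent string (below) matters for detecting new bots.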
Page Path
Which page the bot requested. This reveals which content AI systems prioritize and which pages may need optimization to attract more crawl attention.
Full User Agent
The complete User-Agent string for detailed analysis. Useful for distinguishing crawler versions, detecting new bots, and diagnosing access issues.
Timestamp
Precise timing of each crawl event. Enables frequency analysis, trend detection, and correlation with GEO infrastructure changes.
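Putting the four data points together, one row in bot_crawl_logs might look like the following. The field names here are assumptions for illustration; the actual table schema is not reproduced in this page.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical shape of one bot_crawl_logs row; column names are assumptions.
@dataclass
class BotCrawlLog:
    bot_name: str         # which AI crawler, e.g. "GPTBot"
    page_path: str        # which page the bot requested
    user_agent: str       # full User-Agent string for detailed analysis
    crawled_at: datetime  # precise timing of the crawl event

log = BotCrawlLog(
    bot_name="GPTBot",
    page_path="/pricing",
    user_agent="Mozilla/5.0 (compatible; GPTBot/1.1)",
    crawled_at=datetime.now(timezone.utc),
)
```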
Aggregated Metrics
Raw crawl logs are aggregated into the bot_crawl_hourly table for efficient trend analysis. Aggregated metrics include:
- Hourly crawl volume: Total bot visits per hour, broken down by crawler
- Page-level distribution: Which pages receive the most bot attention
- Crawler diversity: How many distinct AI systems are visiting
- Trend direction: Is crawl activity increasing, stable, or declining?
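The hourly rollup can be sketched as bucketing raw events by hour and crawler. This is a minimal in-memory sketch assuming raw logs arrive as (bot_name, page_path, timestamp) tuples; the real bot_crawl_hourly schema may differ.

```python
from collections import Counter
from datetime import datetime

def aggregate_hourly(logs):
    """Bucket raw crawl events into (hour, bot_name) counts,
    mirroring the per-crawler hourly volume metric."""
    counts = Counter()
    for bot_name, page_path, ts in logs:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        counts[(hour, bot_name)] += 1
    return counts

logs = [
    ("GPTBot", "/", datetime(2024, 5, 1, 9, 15)),
    ("GPTBot", "/pricing", datetime(2024, 5, 1, 9, 40)),
    ("ClaudeBot", "/", datetime(2024, 5, 1, 10, 5)),
]
hourly = aggregate_hourly(logs)
```

The same pass can feed the other metrics: keying by page_path gives page-level distribution, and counting distinct bot_name values gives crawler diversity.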
Why Crawl Stats Matter for GEO
Bot crawl activity is GEO Signal 3 in the 8-signal framework. It provides ground-truth validation that your GEO infrastructure is working:
- If AI crawlers are not visiting, your structured data and clean-room HTML are not being consumed — regardless of quality
- Increasing crawl frequency generally correlates with a higher likelihood of AI citation
- Sudden drops in crawl activity may indicate infrastructure issues (broken pages, blocked bots, DNS problems)
- Page-level crawl patterns reveal which content AI systems find most valuable
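The "sudden drop" signal above can be made concrete with a simple window comparison over hourly totals. The thresholds here are arbitrary assumptions for illustration, not part of the 8-signal framework.

```python
def crawl_drop_alert(hourly_totals, window=24, drop_ratio=0.5):
    """Flag a sudden drop: the latest window's crawl volume falls below
    drop_ratio times the previous window's volume.

    hourly_totals: list of per-hour bot visit counts, most recent last.
    """
    if len(hourly_totals) < 2 * window:
        return False  # not enough history to compare two windows
    recent = sum(hourly_totals[-window:])
    previous = sum(hourly_totals[-2 * window:-window])
    return previous > 0 and recent < drop_ratio * previous
```

A flagged drop is a prompt to check the usual infrastructure suspects: broken pages, blocked bots, DNS problems.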
Without crawl logging, GEO optimization is blind. You cannot improve what you cannot measure. Bot crawl statistics close this measurement gap.
Health Monitoring Integration
Crawl statistics feed into the daily health check system. The health-check-daily
function verifies that all bot-facing endpoints are operational, AI surface files are accessible,
and bot crawl logging is functioning. Results are stored in the health_check_daily_runs
table and surfaced in the daily GEO audit email.
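The endpoint-availability portion of such a check can be sketched as below. This is a hedged sketch only: the URL list is illustrative, and the real health-check-daily function and health_check_daily_runs storage are not reproduced here.

```python
import urllib.request

def check_endpoints(urls, timeout=10):
    """Return {url: ok} for each bot-facing endpoint, where ok means the
    endpoint responded with HTTP 200 within the timeout."""
    results = {}
    for url in urls:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                results[url] = resp.status == 200
        except OSError:
            # Connection refused, DNS failure, timeout, or HTTP error
            results[url] = False
    return results
```

In a real daily run, the per-URL results would be persisted alongside checks that AI surface files are accessible and that crawl logging itself is writing rows.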