Frequently Asked Questions
Everything you need to know about AI visibility check
Why AI visibility is different from SEO
Traditional search engines crawl your entire site, index every word, and have virtually unlimited resources to parse complex HTML. AI systems work differently. Every piece of information an AI reads costs tokens — and tokens cost money and time.
When ChatGPT, Gemini, or Perplexity look at your website, they need to extract useful information as quickly and efficiently as possible. A page buried in thousands of lines of JavaScript, inline styles, and deeply nested <div> tags forces the AI to spend far more tokens just to find a phone number or understand what your business does. In many cases, the AI will simply skip over that information or hallucinate an answer instead.
This means that for AI visibility, how you structure your data matters more than how much content you have. Clean semantic HTML, structured data (JSON-LD), and dedicated machine-readable files like llms.txt and page.md (a text-only version of your main pages) allow AI systems to find what they need in a fraction of the tokens — making it far more likely they'll reference your site accurately.
AI visibility check measures exactly this: how easy it is for AI systems to find, read, and understand your website.
AI visibility check analyzes how visible and accessible your website is to AI systems like ChatGPT, Gemini, and Perplexity. It crawls your site, evaluates key technical and content factors, and provides an actionable score from 0 to 100 — along with specific recommendations to improve your AI visibility.
Your score is the weighted sum of 6 categories, each normalized to a 0–100 scale:
| Category | Weight |
|---|---|
| Web Popularity | 20% |
| AI Readiness | 20% |
| AI Brand Awareness | 20% |
| Discoverability | 15% |
| Content Structure | 15% |
| Technical Efficiency | 10% |
Five of the six categories are calculated immediately when the analysis completes. The AI Brand Awareness category requires querying external AI models, so your initial score shows a maximum of 80/100. Once the AI evaluation finishes (typically 10–15 seconds), the score updates and animates to the final value out of 100.
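The weighting can be sketched in a few lines of Python. This is a minimal illustration, assuming each category score is already normalized to 0–100. The dictionary keys and the `total_score` function are hypothetical names, and the tool's real implementation may differ.

```python
# Category weights from the table above (they sum to 1.0).
WEIGHTS = {
    "web_popularity": 0.20,
    "ai_readiness": 0.20,
    "ai_brand_awareness": 0.20,
    "discoverability": 0.15,
    "content_structure": 0.15,
    "technical_efficiency": 0.10,
}


def total_score(category_scores: dict[str, float]) -> float:
    """Weighted sum of category scores, each on a 0-100 scale.

    Categories missing from the input (for example AI Brand Awareness
    before the AI evaluation finishes) contribute 0 points.
    """
    return sum(
        weight * category_scores.get(name, 0.0)
        for name, weight in WEIGHTS.items()
    )


# Every category at 100 except the pending AI Brand Awareness check.
provisional = {name: 100.0 for name in WEIGHTS if name != "ai_brand_awareness"}
print(round(total_score(provisional), 2))
```

With AI Brand Awareness still pending, a site that is perfect everywhere else cannot exceed 80, which is why the initial score is shown out of 80.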
Based on your total score, you receive one of these grades:
| Score | Grade |
|---|---|
| ≥ 90 | LLM-Optimized |
| ≥ 70 | Good Visibility |
| ≥ 50 | Partial Visibility |
| ≥ 30 | Low Visibility |
| ≥ 10 | Poor Visibility |
| < 10 | Invisible |
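As a small sketch, the grade table can be read as a list of lower bounds checked from highest to lowest. The function name `grade` is hypothetical.

```python
def grade(score: float) -> str:
    """Map a total score (0-100) to the grade label from the table above."""
    thresholds = [
        (90, "LLM-Optimized"),
        (70, "Good Visibility"),
        (50, "Partial Visibility"),
        (30, "Low Visibility"),
        (10, "Poor Visibility"),
    ]
    # Thresholds are ordered from highest to lowest, so the first match wins.
    for minimum, label in thresholds:
        if score >= minimum:
            return label
    return "Invisible"
```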
AI Readiness (20% of your score) evaluates how well your site communicates with AI crawlers and systems:
- llms.txt (3 pts) — A plain-text summary file at your domain root that helps AI systems understand your site's purpose and content at a glance.
- llms-full.txt (2 pts) — An extended version containing your full site content in a single, AI-readable file.
- robots.txt (0–5 pts) — Controls how bots access your site. Points are awarded for a well-configured robots.txt. Penalties apply if AI-specific bots (GPTBot, ClaudeBot, etc.) are blocked.
Discoverability (15% of your score) evaluates how easily AI systems can find and understand your content:
- sitemap.xml (2 pts) — Awarded for having a valid sitemap that helps crawlers discover all your pages.
- Structured Data (0–3 pts) — Scored from 0 to 100 based on the presence and completeness of JSON-LD structured data, then scaled to up to 3 points. The score evaluates four areas:
  - Universal Schemas (30%) — Core schema types every site should have: Organization/Business, WebSite, BreadcrumbList, page types (WebPage, AboutPage, etc.), and ContactPoint.
  - Local Business (20%) — Only evaluated if the site is a local business: PostalAddress, opening hours, geo coordinates, and reviews/ratings.
  - Site-Type Specific (30%) — Schemas relevant to the detected site type. E-commerce sites are checked for Product, Offer, and reviews. Blogs are checked for Article/BlogPosting, Author, and media objects. Service sites are checked for Service, Offer, and area served.
  - Enrichment (20%) — Additional schemas that boost visibility: FAQPage (with 3+ Q&A pairs for full score), social profiles (sameAs), and SearchAction for sitelinks.
- Meta Tags (0–5 pts) — Evaluates title, description, Open Graph tags, canonical URL, language attribute, and other metadata.
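The structured data sub-score described in the list above combines four weighted areas into a 0–100 value and then scales it to at most 3 points. The sketch below shows one way that arithmetic could work. It assumes each area is itself scored 0–100, and it assumes that for non-local sites the Local Business weight is dropped and the remaining weights are renormalized. The source does not state how that case is handled, so treat that step and all names as hypothetical.

```python
AREA_WEIGHTS = {
    "universal": 0.30,
    "local_business": 0.20,
    "site_type": 0.30,
    "enrichment": 0.20,
}
MAX_POINTS = 3.0


def structured_data_points(area_scores: dict[str, float], is_local: bool) -> float:
    """Combine per-area scores (0-100 each) into 0-3 Discoverability points.

    ASSUMPTION: for non-local sites the Local Business area is removed and
    the remaining weights are renormalized. The source only says the area
    is evaluated for local businesses.
    """
    weights = dict(AREA_WEIGHTS)
    if not is_local:
        del weights["local_business"]
    total_weight = sum(weights.values())
    score_0_100 = sum(
        weight * area_scores.get(name, 0.0) for name, weight in weights.items()
    ) / total_weight
    return score_0_100 / 100 * MAX_POINTS
```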
Content Structure (15% of your score) assesses how well your content is organized for both humans and machines:
- Heading Hierarchy (0–3.5 pts) — Checks your H1–H6 structure for proper nesting. Deductions for skipped levels, missing H1, duplicate H1s, or empty headings.
- Semantic HTML (0–2.5 pts) — Looks for meaningful HTML5 elements like <main>, <nav>, <article>, <section>, <header>, and <footer>.
- Content Freshness (0–2.5 pts) — Scores based on the age of the most recent date found on your pages. Fresher content scores higher.
- FAQ Content (0–1.5 pts) — Detects FAQ content across your site: dedicated FAQ pages, FAQPage schema, <details>/<summary> elements, heading-based Q&A patterns, and accordion components. Points are awarded for having FAQ content, a dedicated FAQ page, structured data markup, sufficient Q&A volume, and substantive answers.
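To check a page against the heading rules listed above before running an analysis, a rough self-check can be written with Python's standard `html.parser` module. This is only a sketch of the four deductions named in the list (missing H1, duplicate H1s, skipped levels, empty headings). The class and function names are hypothetical, and the tool's own checks and point deductions are not specified here.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, text) for every h1-h6 element in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.headings = []   # list of (level, text) tuples
        self._level = None   # level of the heading currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.headings.append((self._level, "".join(self._text).strip()))
            self._level = None


def heading_issues(html: str) -> list[str]:
    """Return a list of problems matching the deductions described above."""
    parser = HeadingCollector()
    parser.feed(html)
    parser.close()
    levels = [level for level, _ in parser.headings]
    issues = []
    h1_count = levels.count(1)
    if h1_count == 0:
        issues.append("missing H1")
    elif h1_count > 1:
        issues.append("duplicate H1s")
    # A heading may go at most one level deeper than the one before it.
    for previous, current in zip(levels, levels[1:]):
        if current > previous + 1:
            issues.append(f"skipped level: H{previous} -> H{current}")
    if any(not text for _, text in parser.headings):
        issues.append("empty heading")
    return issues
```

An empty list means none of the four problems were found on that page.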
FAQ pages are one of the most powerful tools for AI visibility because they directly mirror how people interact with AI assistants: by asking questions. When a user asks ChatGPT, Gemini, or Perplexity something like "What is the return policy for online orders?", the AI looks for content that matches that exact format — a clear question with a clear answer.
What AI visibility check checks
We scan all crawled pages for FAQ content using multiple detection methods:
- Dedicated FAQ pages — URLs containing /faq, /frequently-asked-questions, /help, /hjelp, /domande-frequenti, or links in your navigation whose anchor text suggests a FAQ.
- FAQPage structured data — JSON-LD markup with @type: "FAQPage" that makes your Q&A pairs machine-readable. This is the most impactful format for AI systems.
- HTML patterns — <details>/<summary> elements, heading-based Q&A sequences (3+ consecutive headings phrased as questions), and accordion components identified by CSS classes.
- FAQ section headings — Headings like "Frequently Asked Questions", "Vanlige spørsmål", "Domande frequenti", or "Q&A" followed by content.
Why it matters
FAQ content impacts your score in two ways:
- Discoverability — FAQPage structured data is part of the structured data score (0–100). Having a valid FAQPage schema with 3+ Q&A pairs earns full points in the Enrichment category; fewer pairs earn partial credit. This directly contributes to the up to 3 points that structured data adds to your Discoverability score.
- Content Structure — Having FAQ content (with or without structured data) contributes up to 1.5 points to your Content Structure score, rewarding dedicated pages, sufficient Q&A volume, and substantive answers.
Best practices
- Consolidate your FAQ — If Q&A content is scattered across multiple pages, consider creating a dedicated /faq page that brings it all together.
- Write natural questions — Use the same phrasing your customers would use when asking an AI assistant. "How much does X cost?" is better than "Pricing information".
- Give substantive answers — Short one-word answers are less useful to AI systems. Aim for at least 1–2 sentences per answer.
- Include 5+ Q&A pairs — A critical mass of questions signals that your FAQ is a comprehensive resource, not an afterthought.
- Keep it updated — Stale FAQs with outdated information are worse than no FAQ at all.
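For readers who maintain their FAQ by hand, the following sketch builds a Schema.org FAQPage JSON-LD string from a list of question and answer pairs. The function name and the sample pairs are placeholders for illustration, and the output still needs to be wrapped in a <script type="application/ld+json"> element on your FAQ page.

```python
import json


def build_faq_jsonld(pairs: list[tuple[str, str]]) -> str:
    """Serialize (question, answer) pairs as a Schema.org FAQPage JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    return json.dumps(data, indent=2, ensure_ascii=False)


# Placeholder Q&A pairs; replace them with your real questions and answers.
example_pairs = [
    ("What is the return policy for online orders?",
     "Placeholder answer: describe your return window and conditions here."),
    ("How much does shipping cost?",
     "Placeholder answer: describe your shipping rates here."),
]
print(build_faq_jsonld(example_pairs))
```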
Technical Efficiency (10% of your score) checks the technical health of your site from an AI crawling perspective:
- Code-to-Text Ratio (0–3 pts) — Proportion of visible text versus HTML code. A higher ratio means more meaningful content for AI to consume.
- JS-Dependent Content (0 or 2 pts) — Pass/fail. Verifies that content is visible without JavaScript, since many AI crawlers don't execute JS.
- Page Weight (0–3 pts) — Average HTML page size. Lighter pages are easier for AI systems to process.
- HTTPS (0 or 2 pts) — Pass/fail. Confirms your site uses a secure connection.
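The code-to-text ratio can be approximated locally. The sketch below assumes the ratio is the length of the visible text divided by the length of the full HTML source, with script and style contents excluded. The exact formula and the thresholds used for the 0–3 points are not given in this FAQ, so the names and the definition are assumptions.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Accumulate text found outside <script> and <style> elements."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def text_ratio(html: str) -> float:
    """Share of the HTML source (0.0-1.0) that is visible text."""
    if not html:
        return 0.0
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so indentation does not count as content.
    text = " ".join(" ".join(parser.chunks).split())
    return len(text) / len(html)
```

Comparing the ratio of the same content with and without wrapper markup shows how nested containers and inline styles dilute the text an AI system has to find.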
Web Popularity (20% of your score) gauges your site's presence and authority across the web — a key factor in whether AI systems reference your content:
- Domain Mentions (0–10 pts) — How often your domain is mentioned across the web. More mentions indicate higher authority.
- Indexed Pages (0 or 10 pts) — Whether your site appears in search engine indexes.
- Common Crawl (0 or 10 pts) — Whether your site is present in the Common Crawl archive, a major data source used to train AI models.
AI Brand Awareness (20% of your score) measures whether major AI systems actually know your business. Unlike the other categories — which analyze your website's technical setup — this category tests real-world AI visibility by directly asking 3 large language models (ChatGPT, Gemini, and Grok) about your company.
How it works
No website content is shared with the AI models. We provide only two pieces of information: your website URL and your company name. We then ask each model:
- Do you know this business?
- What industry does it operate in?
- Where is it located geographically?
- How confident are you in your answer? (high / medium / low)
- Does this business or URL appear in your training data?
- Would you consider it a well-known brand?
This approach is deliberately minimal: since no HTML content or crawled data is sent to the AI models, we do not violate any directive from the website's robots.txt or content policies that restrict sharing content with LLMs. The models can only answer based on what they already know from their training data.
Scoring breakdown
The brand score (0–10) is computed from three components:
- Recognition (0–6 pts) — Each AI model that recognizes your business contributes points based on its confidence level: high confidence = 2 pts, medium = 1.5 pts, low = 1 pt, unknown = 0.
- Accuracy (0–2 pts) — For each model that knows your business, we compare its answers (location, description) against the data we extracted from your website. Closer matches indicate the AI has accurate information about you.
- Consistency (0–2 pts) — Measures agreement between models. If all 3 recognize you with high confidence: 2 pts. All 3 known: 1.5 pts. 2 out of 3: 1 pt. 1 out of 3: 0.5 pts.
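The Recognition and Consistency rules above are precise enough to sketch in code. In the example below, each of the three models is represented by its confidence label, and the Accuracy component is passed in as a number because the source does not spell out how answer matching is scored. All function names are hypothetical.

```python
RECOGNITION_POINTS = {"high": 2.0, "medium": 1.5, "low": 1.0, "unknown": 0.0}


def recognition_points(confidences: list[str]) -> float:
    """Sum per-model recognition points (0-6 for three models)."""
    return sum(RECOGNITION_POINTS[c] for c in confidences)


def consistency_points(confidences: list[str]) -> float:
    """Agreement between the three models, following the tiers listed above."""
    known = [c for c in confidences if c != "unknown"]
    if len(known) == 3:
        return 2.0 if all(c == "high" for c in known) else 1.5
    return {2: 1.0, 1: 0.5}.get(len(known), 0.0)


def brand_score(confidences: list[str], accuracy: float) -> float:
    """Recognition + Accuracy + Consistency, capped at 10.

    `accuracy` (0-2) is supplied by the caller; how it is derived from the
    models' answers is not specified in the source.
    """
    return min(
        10.0,
        recognition_points(confidences) + accuracy + consistency_points(confidences),
    )
```

For example, `brand_score(["high", "unknown", "low"], 1.0)` adds 3 recognition points, 1 accuracy point, and 1 consistency point.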
What if my score is low?
A low AI Brand Awareness score is completely normal for small, local, or niche businesses. AI models are trained on large-scale web data and naturally know more about widely-covered brands. A low score here does not mean your website is poorly built — it simply reflects that your business hasn't yet reached the threshold of visibility where AI systems include it in their knowledge. Improving your Web Popularity (domain mentions, indexed pages, Common Crawl presence) is the most effective way to increase brand awareness over time.
The tool extracts business information (company name, address, phone, email, social media, founding year, employee count, revenue, opening hours) using a multi-source heuristic approach. For each data point, it searches across your crawled pages in priority order:
- JSON-LD structured data — The highest-priority source. The tool parses <script type="application/ld+json"> blocks looking for Schema.org types like Organization, LocalBusiness, Corporation, and their properties (name, address, telephone, email, sameAs, foundingDate, numberOfEmployees, etc.).
- Microdata and RDFa — Inline Schema.org markup embedded in HTML attributes (itemscope, itemprop, typeof, property).
- Meta tags — Open Graph tags (og:site_name, og:locale), HTML lang attribute, and other metadata.
- Semantic HTML elements — Content inside <address> tags, mailto: and tel: links, header/footer regions.
- Visible page text — As a last resort, pattern matching on page content: phone number formats, email addresses, street name patterns (e.g. Norwegian "gaten/veien", German "straße/weg"), postal code formats, copyright year ranges, and social media link URLs.
Each extracted value is tagged with its source (e.g. "json-ld", "meta tag", "footer text") and a confidence level, so you can see exactly where the data came from.
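The priority-order idea can be illustrated for a single data point. The sketch below looks for a phone number in JSON-LD first and falls back to tel: links, tagging the result with its source. It is a simplified illustration, not the tool's extractor: the regular expressions, the function name, and the "medium" confidence label for tel: links are assumptions.

```python
import json
import re

JSONLD_RE = re.compile(
    r'<script[^>]*type="application/ld\+json"[^>]*>(.*?)</script>',
    re.DOTALL | re.IGNORECASE,
)
TEL_LINK_RE = re.compile(r'href="tel:([^"]+)"', re.IGNORECASE)


def extract_phone(html: str):
    """Return {'value', 'source', 'confidence'} for the first phone found, or None.

    Sources are tried in priority order: JSON-LD first, then tel: links.
    """
    for block in JSONLD_RE.findall(html):
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # Skip malformed JSON-LD instead of failing the page.
        items = data if isinstance(data, list) else [data]
        for item in items:
            if isinstance(item, dict) and item.get("telephone"):
                return {"value": item["telephone"], "source": "json-ld",
                        "confidence": "high"}
    match = TEL_LINK_RE.search(html)
    if match:
        # ASSUMPTION: the confidence label for this source is illustrative.
        return {"value": match.group(1), "source": "tel link",
                "confidence": "medium"}
    return None
```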
When your business details live only in visible text (a footer, a "Contact us" page), AI crawlers have to guess what each piece of information means. A phone number next to an address could be misread. A name in a copyright notice might not be recognized as the company name. This guessing process is error-prone and often incomplete.
Structured data solves this problem entirely. By embedding your business information in JSON-LD (the format recommended by Google and understood by all major AI systems), you make every data point unambiguous and machine-readable — company name, address, phone, email, social links, founding date, and more are all explicitly labeled so no AI has to guess.
With structured data in place:
- AI systems can read your information directly from explicitly labeled fields — no guessing, no misinterpretation.
- Your data is picked up immediately, without the crawler needing to scan and pattern-match every page.
- AI assistants like ChatGPT and Gemini are far more likely to cite correct details about your business when answering user questions.
- It also improves your Google Knowledge Panel and rich search results.
If you currently rely on footer text or contact pages alone, adding a single JSON-LD block to your homepage is one of the highest-impact improvements you can make for AI visibility.
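As an illustration of what such a block might contain, the sketch below assembles an Organization JSON-LD snippet in Python and prints it wrapped in a script element. Every value is a placeholder, and the set of properties is only an example. Choose the Schema.org properties that apply to your business.

```python
import json

# Placeholder business details; replace every value with your own.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Company",
    "url": "https://www.example.com",
    "telephone": "+1-555-0100",
    "email": "hello@example.com",
    "foundingDate": "2015",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "1 Example Street",
        "addressLocality": "Example City",
        "postalCode": "00000",
        "addressCountry": "US",
    },
    "sameAs": ["https://www.linkedin.com/company/example"],
}

# Wrap the JSON in the script element that goes into the page's <head>.
snippet = (
    '<script type="application/ld+json">\n'
    + json.dumps(organization, indent=2)
    + "\n</script>"
)
print(snippet)
```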
AI visibility check looks for the following information across your crawled pages:
| Data Point | Sources (in priority order) |
|---|---|
| Company Name | JSON-LD name/legalName, og:site_name, logo alt text, copyright footer, <title> |
| Company Type | JSON-LD legalName suffixes (LLC, GmbH, AS, etc.), footer/legal text |
| Address | JSON-LD PostalAddress, <address> tag, contact page text, footer text, Google Maps embed |
| Phone | JSON-LD telephone, tel: links, meta tags, header/footer text |
| Email | JSON-LD email, mailto: links, visible text patterns |
| Social Media | Header/footer links, body links, JSON-LD sameAs |
| Founding Year | JSON-LD foundingDate, visible text, copyright year ranges |
| Employee Count | JSON-LD numberOfEmployees, visible text patterns |
| Annual Revenue | Visible text patterns (e.g. "$10M revenue") |
| Opening Hours | JSON-LD OpeningHoursSpecification, visible text schedules |
Data sourced from JSON-LD is marked as high confidence. Data from text pattern matching is marked as lower confidence and should be verified.
You have every right to decide how your content is used. There are several techniques to signal to AI systems that you don't want your content ingested by large language models:
The robots.txt approach
Your robots.txt file is the primary tool for controlling crawler access. A common first instinct is to block AI crawlers one by one — GPTBot, ClaudeBot, Google-Extended, PerplexityBot, CCBot, and so on. However, this approach is inherently fragile: new AI crawlers appear constantly, and if you miss even one, your content remains exposed to it. You're essentially playing catch-up against an ever-growing list.
The more effective strategy is the opposite: allow only the crawlers you trust (Googlebot, Bingbot, and other traditional search engine bots) and block everything else by default. This way, any new or unknown crawler — including future AI bots — is automatically denied access without you needing to update your configuration.
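A minimal sketch of the allowlist strategy, checked with Python's standard `urllib.robotparser`: the robots.txt content below is one possible configuration rather than a recommended final file, and the bot name "SomeFutureAIBot" and the example.com URL are made up for the test.

```python
from urllib.robotparser import RobotFileParser

# Allowlist-style robots.txt: named search bots may crawl, all others are denied.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: Bingbot
Allow: /

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Compliant crawlers not named in the file fall through to the wildcard rule.
for agent in ("Googlebot", "Bingbot", "GPTBot", "SomeFutureAIBot"):
    allowed = parser.can_fetch(agent, "https://www.example.com/blog/post")
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

This only shows how a compliant parser interprets the rules. It does not stop crawlers that ignore robots.txt.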
Selective protection
You don't have to block everything. A more nuanced approach is to protect specific content (blog posts, research articles, premium material) while keeping your core business pages accessible. This way, AI assistants can still find and reference your company — your name, what you do, how to reach you — without using your original content for training or answers.
Additional measures
- Meta tags — Add <meta name="robots" content="noai, noimageai"> to specific pages to signal AI systems not to use that content. Support for these directives is growing.
- AI-specific TDM (Text and Data Mining) headers — Some AI providers respect the TDM-Reservation HTTP header and machine-readable TDM policies that indicate you reserve your rights.
- llms.txt as a boundary file — You can use your llms.txt to explicitly state what content AI systems are allowed to reference, and what should be excluded.
The problem: not all AI crawlers play by the rules
All of the techniques above — robots.txt, meta tags, TDM headers — rely on one assumption: that the crawler will respect your instructions. Legitimate bots like Googlebot, GPTBot, and ClaudeBot generally do. But there is no technical enforcement: a crawler that ignores your robots.txt can still access and ingest your content, and you may never know it happened.
This is not a hypothetical risk. Undisclosed crawlers, scrapers, and AI training pipelines routinely ignore opt-out signals. Your robots.txt is a request, not a lock on the door.
How we can help: active protection with AI honeypots
Beyond configuring your robots.txt and permissions correctly, we can help you set up AI honeypot pages — decoy pages that look like real content but are designed to mislead unauthorized crawlers.
These pages are linked from your site in a way that legitimate users and search engines will never follow (hidden behind robots.txt Disallow rules, nofollow links, or CSS-hidden elements), but that non-compliant AI crawlers — which ignore these signals — will follow and ingest. The decoy content is crafted to appear topically relevant to your business, but contains no real information. Instead, it can:
- Dilute your real content — By flooding unauthorized crawlers with plausible but fictional data, the AI model's knowledge about your business becomes unreliable, reducing the chance it will confidently cite your actual content.
- Detect unauthorized access — Since no legitimate visitor should ever reach these pages, any access logged to a honeypot URL is evidence of a non-compliant crawler or scraper.
- Poison training data — If a crawler ingests honeypot content into its training set, the resulting model will contain controlled misinformation about your business — making it clear when your content has been used without permission.
This approach works precisely because it does not rely on the crawler's cooperation. It exploits the fact that bots which ignore robots.txt will also follow links that compliant bots would skip.
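The detection part of this approach amounts to watching your access logs for requests to the decoy URLs. The sketch below assumes logs in the common "combined" format and uses two made-up honeypot paths. Both the paths and the regular expression are assumptions to adapt to your own server.

```python
import re

# Hypothetical honeypot paths; use the decoy URLs you actually publish.
HONEYPOT_PATHS = ("/internal-archive/", "/research-drafts/")

# Pulls the client IP, request path, and user agent out of a combined-format line.
LOG_RE = re.compile(
    r'^(?P<ip>\S+) .*?"(?:GET|HEAD) (?P<path>\S+) [^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def honeypot_hits(log_lines):
    """Return (ip, path, user_agent) for every request to a honeypot path."""
    hits = []
    for line in log_lines:
        match = LOG_RE.match(line)
        if match and match.group("path").startswith(HONEYPOT_PATHS):
            hits.append((match.group("ip"), match.group("path"), match.group("agent")))
    return hits
```

Any entry returned here comes from a client that followed a link compliant crawlers are told to skip, so it is worth reviewing by hand before drawing conclusions.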
The key takeaway: you don't have to choose between "fully visible" and "completely invisible." With a careful setup — combining proper access controls with active honeypot defenses — you can protect your proprietary content while still maintaining your business's presence in AI-powered search results. Contact us for a customized solution.
Yes. Our crawler — VCbot/1.0 — is designed to be a responsible web citizen. Here's how:
- Respects robots.txt — If your robots.txt disallows crawling, VCbot will not crawl those pages. We verify whether AI bots are blocked on your site, but we follow the same rules ourselves.
- Respects Crawl-delay — If your robots.txt specifies a Crawl-delay directive, VCbot will honor it and slow down accordingly.
- Identifiable User-Agent — Our crawler identifies itself as VCbot/1.0, with a link back to this page. You can always see who is crawling and why.
- Rate limited — Maximum 2 requests per second per domain. If your Crawl-delay requires a slower rate, we use the slower rate.
- Short timeouts — 15-second request timeout. We won't hold connections open indefinitely.
- No login/form pages — VCbot automatically skips pages behind authentication, registration forms, admin panels, and checkout flows.
- Respects nofollow — Links marked with rel="nofollow" are not followed.
- Limited scope — Each crawl visits up to 25 pages. We never attempt to crawl your entire site.
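The rate-limiting rule in the list above (at most 2 requests per second, or slower if Crawl-delay asks for it) can be expressed as taking the larger of two intervals. This sketch uses Python's standard `urllib.robotparser` to read the directive. It illustrates the rule and is not VCbot's source code, and the function name is hypothetical.

```python
from urllib.robotparser import RobotFileParser

MIN_INTERVAL_SECONDS = 0.5  # 2 requests per second per domain


def request_interval(robots_txt: str, user_agent: str = "VCbot") -> float:
    """Seconds to wait between requests: the slower of the cap and Crawl-delay."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    crawl_delay = parser.crawl_delay(user_agent)  # None when not specified
    if crawl_delay is None:
        return MIN_INTERVAL_SECONDS
    return max(MIN_INTERVAL_SECONDS, float(crawl_delay))
```

A robots.txt with `Crawl-delay: 5` therefore yields one request every 5 seconds, while a file without the directive leaves the default cap in place.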
You can block VCbot at any time by adding the following to your robots.txt:
    User-agent: VCbot
    Disallow: /
No. VCbot is not an AI crawler and does not train any AI model. It is a traditional web crawler that fetches your HTML pages to analyze their structure, metadata, and content organization.
Our AI Brand Awareness feature does query third-party AI models (ChatGPT, Gemini, Grok), but we only send them your website URL and company name — no HTML code, no page content, and no data obtained through crawling. This means we never share your website's content with AI systems, fully respecting any robots.txt directives or content policies that restrict AI access to your content.
Structured data generation and Gemini
If you choose to generate structured data (Organization or FAQPage JSON-LD), we send plain text extracts — not raw HTML — to Google Gemini so it can produce a ready-to-use JSON-LD block. Specifically:
- Organization JSON-LD — Text is extracted only from your Home, About, and Contact pages, combined with any business data already found (name, address, phone, etc.). The resulting JSON-LD is intended for your homepage.
- FAQPage JSON-LD — Text is extracted from pages that contain questions and answers about your business (dedicated FAQ pages, help sections, or accordion content). If no FAQ content is detected, general page text is used so Gemini can suggest relevant Q&A pairs. The resulting JSON-LD is intended for your FAQ page.
This only happens if your robots.txt allows VCbot to access those specific pages. Before extracting text, our system verifies every page against your robots.txt rules. If a page is disallowed — for VCbot specifically or via a wildcard rule — its content is excluded and never sent to Gemini.
Structured data generation is always on-demand: it only runs when you explicitly click the generate button, never automatically.
The key distinction
| What | How it works |
|---|---|
| VCbot crawl | Direct request from our server to your website. Fetches HTML only. Respects robots.txt. You see it in your access logs as VCbot/1.0. |
| AI Brand Awareness | Our server sends only the site URL and company name to AI APIs and asks if they know the business. No HTML content, no crawled data, no page text is shared. The AI models answer based solely on their existing training data. |
| Structured data generation | On-demand only. Sends plain text extracts (not HTML) from specific pages to Google Gemini. Only pages allowed by your robots.txt are used. Home, About, and Contact pages for Organization JSON-LD; FAQ and Q&A pages for FAQPage JSON-LD. |
This means:
- Blocking AI crawlers (GPTBot, ClaudeBot, etc.) in your robots.txt will not prevent our analysis from running — because we don't crawl your site with those bots. Blocking them does, however, lower your AI Readiness score, as described above.
- Your website content is never sent to AI models automatically. The only exception is structured data generation, which requires your explicit action and respects your robots.txt.
- Even if your robots.txt blocks all AI crawlers, our AI Brand Awareness check remains fully compliant — because no content is shared, only a URL and a name.
- If you want to block our crawler specifically, block VCbot in your robots.txt — that's the only bot that touches your server.
AI visibility check automatically identifies the CMS or framework powering your website using meta tags, HTML signatures, and backend endpoint probing. Detection feeds both the context shown in your analysis and the optimization guides included with the paid plans.
Detection vs. tailored optimization guides
There are two levels of support, and it's worth knowing the difference:
- Detected — We recognize the platform and use it to enrich your analysis (site profile, more relevant recommendations). Every platform listed on this page is detected.
- Tailored guides (Pro & Max) — On top of detection, supported platforms get step-by-step, platform-specific instructions in the structured-data and configuration guides: exactly where to paste each JSON-LD snippet for your CMS, SEO plugin, or framework. Platforms that are only detected still receive a generic-but-actionable guide that works for any system.
Platforms with tailored optimization guides
CMS — dedicated instructions, including the SEO-plugin-specific path on WordPress:
- WordPress + Yoast SEO
- WordPress + Rank Math
- WordPress + All in One SEO
- WordPress + SEOPress
- WordPress (no SEO plugin)
- Shopify
- Wix
- Squarespace
- Webflow
Frameworks — instructions with framework-native code examples:
- Next.js
- Nuxt
- Astro
- Gatsby
- SvelteKit
- Eleventy (11ty)
- Jekyll
- Hugo
- Remix
- React
- Vue.js
- Angular
Headless CMS — a dedicated guide for injecting schema in the frontend framework that renders a headless setup (Contentful, Sanity, Storyblok, Prismic, DatoCMS, Strapi, or a headless WordPress).
Also detected (generic guidance)
These are recognized and used for context; their optimization guide uses the generic, platform-agnostic instructions rather than a dedicated template:
- WooCommerce
- Drupal
- Joomla!
- Umbraco
- Ghost
- Craft CMS
- MediaWiki
- Magento
- PrestaShop
- BigCommerce
- HubSpot
- TYPO3
- Blogger
- Weebly
- Sitecore
- Sitefinity
- Optimizely (Episerver)
- Kentico
- Svelte
- Framer
- Blazor
- Laravel
- Ruby on Rails
- Django
- Duda
- GoDaddy Website Builder
- Carrd
- Notion Sites
If your CMS or framework is not detected, it may produce clean HTML without identifiable patterns — which is actually a good practice for security. Detection does not affect your score; it is used only for informational purposes and platform-specific recommendations.
AI visibility check can generate ready-to-use JSON-LD structured data for your website using Google Gemini AI. This is available directly from your analysis results — no coding knowledge required.
Organization JSON-LD
After analyzing your site, if business details are missing or incomplete, AI visibility check offers to generate an Organization JSON-LD block. When you click "Use AI to extract further data", the tool:
- Sends a text extract of your key pages (Home, About, Contact) to Gemini AI — only the text content, not your raw HTML.
- Combines that with any business data already extracted (company name, address, phone, email, social links, opening hours, etc.).
- Returns a complete, valid Organization JSON-LD block you can copy and paste into your homepage's <head> section.
The generated JSON-LD includes all relevant Schema.org properties: name, URL, description, address, contact information, social media links, founding date, and more — tailored to your actual business data.
FAQPage JSON-LD
AI visibility check can also generate FAQPage structured data. This works in two ways:
- If FAQ content was detected — The tool sends the Q&A pairs it found on your site to Gemini, which enriches and formats them into a proper FAQPage JSON-LD block.
- If no FAQ content was found — Gemini analyzes your site content (homepage, about page, etc.) and generates relevant FAQ questions and answers that a potential customer might ask about your business.
In both cases, you get a copy-ready <script type="application/ld+json"> block that you can add directly to your FAQ page (or homepage) to boost AI visibility.
Privacy and caching
- Only plain text extracts are sent to Gemini — no raw HTML, no user data, no authentication details.
- Generated results are cached, so re-running an analysis will show the previously generated structured data instantly without calling the AI again.
- You can regenerate at any time to get updated results based on fresh crawl data.
AI visibility check automatically classifies your website into one of three primary types: E-commerce, Blog, or Services. This classification appears in the Site Profile card alongside your CMS/framework detection.
Detection methods
The tool uses multiple signals to determine your site type, each with different confidence levels:
- Navigation analysis — Links in your <nav> elements are examined for URL patterns (e.g. /shop, /blog, /services) and anchor text (e.g. "Products", "News", "Our Services"). Navigation links carry the highest weight.
- Sitemap classification — If you have a sitemap.xml, all URLs are categorized by path patterns. A site with 200 URLs under /products/ is strong evidence of e-commerce.
- Schema.org signals — JSON-LD types like Product, Offer, BlogPosting, and Article directly indicate site purpose.
- E-commerce indicators — Cart/checkout links, price displays, "Add to Cart" buttons, and platform markers (Shopify, WooCommerce, Magento).
- CMS context — E-commerce-specific platforms like Shopify or WooCommerce strongly indicate an online store.
Multi-language support
Detection works across multiple languages. URL patterns and navigation text are recognized in English, Norwegian, Italian, German, French, Spanish, Swedish, and Danish — including localized terms like "nettbutikk" (Norwegian), "prodotti" (Italian), "Leistungen" (German), and "tjenester" (Norwegian).
Detected sections
Beyond the primary type, the tool also identifies which sections your site has: products/shop, blog/news, services, and FAQ. A site can be classified as "E-commerce" while also having detected blog and FAQ sections. Each classification includes a confidence level (high, medium, or low).
Site type detection does not affect your score — it is used for informational purposes and to provide more relevant context in your analysis results.
Content Freshness (part of your Content Structure score, up to 2.5 points) measures how recently your website content was updated. AI models and search engines prefer fresh, regularly updated content over stale pages.
How it works
The tool scans your crawled pages for dates using multiple methods:
- Structured data dates — datePublished, dateModified, and dateCreated in JSON-LD blocks (highest priority).
- Meta tags — <meta name="date">, <meta property="article:published_time">, and similar tags.
- HTML time elements — <time datetime="..."> elements with valid dates.
- Visible text patterns — Date patterns found in page content as a last resort.
The most recent date found across all pages is used as your "last updated" reference. The score is then calculated based on how recent that date is:
- Updated within the last month — Full points.
- Updated within 3 months — Good score.
- Updated within 6 months — Moderate score.
- Updated over a year ago or no date found — Low score.
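The tiers above can be sketched as a simple age lookup. The source gives only qualitative labels, so the fractions of the 2.5-point maximum used below are illustrative assumptions, as is treating the unspecified 6 to 12 month range the same as "over a year". The function name is hypothetical.

```python
from datetime import date
from typing import Optional

MAX_POINTS = 2.5


def freshness_points(last_updated: Optional[date], today: date) -> float:
    """Map the most recent date found to Content Freshness points (0-2.5).

    ASSUMPTION: the fractions are illustrative. The source names the tiers
    ("full", "good", "moderate", "low") but not their point values.
    """
    low = MAX_POINTS * 0.2
    if last_updated is None:
        return low  # no date found is scored like very old content
    age_days = (today - last_updated).days
    if age_days <= 30:
        return MAX_POINTS          # updated within the last month
    if age_days <= 90:
        return MAX_POINTS * 0.75   # within 3 months
    if age_days <= 180:
        return MAX_POINTS * 0.5    # within 6 months
    return low                     # older (6-12 months is an assumption)
```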
Why it matters for AI
When AI systems decide which sources to cite, recency is a key factor. A site that was last updated in 2021 is less likely to be referenced than one updated last month — even if the underlying information hasn't changed. Adding dates to your content (especially via structured data or <time> elements) signals to both AI and search engines that your content is current and maintained.
After running a free analysis, you can purchase a one-time report with ready-to-use files and actionable fixes tailored to your site. There are three plans:
Starter — $19
Ideal for small sites with up to 25 pages. Includes:
- PDF report with per-page issue details and a prioritized action checklist.
- llms.txt + Markdown (.md) files for your key pages — an AI-readable index of your most important content.
- Homepage JSON-LD (Organization / LocalBusiness) structured data, ready to paste into your site.
- Optimized HTML files (if needed) with corrected headings, semantic HTML, structured data, and meta tags.
- Optimized robots.txt if your current one blocks AI crawlers.
- Directory submission checklist customized for your industry and country.
Pro — $49
Best for medium-sized sites with up to 100 pages. Everything in Starter, plus:
- llms-full.txt — the full text of every analyzed page in one AI-friendly file, going beyond the key-page index included in Starter.
- sitemap.xml generated for your site (if missing and the site is static).
- Page-type structured data — Product schemas for e-commerce, Service schemas for service sites, Article schemas for blogs.
- ContactPoint schema for your contact page.
- AI-generated FAQ page (HTML + Markdown + FAQPage JSON-LD) or optimization of your existing FAQ with structured data.
- CMS/framework configuration guide with platform-specific instructions for structured data, meta tags, and headings.
Max — $89
For large sites with up to 200 pages. Everything in Pro, plus:
- JSON-LD structured data for every page analyzed — not just the homepage.
- Per-page report with individual scores for each page.
- Priority map showing which pages need the most work.
- CSV export of all per-page data for your own analysis.
All plans include
- A free re-scan with an updated report after 30 days to verify your improvements.
- Upgrade credit — if you start with Starter or Pro, you can upgrade anytime and we'll credit what you already paid.
- Files available for download for 30 days after purchase.
Not sure which plan fits? Visit the pricing page for a personalized recommendation based on your analysis results.