The Authoritative Editorial

Technical SEO

Google May Expand Unsupported Robots.txt Rules List


Rakibul Hasan

English Writing Coach

April 24, 2026

Highlights

  • Google may expand the list of unsupported robots.txt rules in its documentation based on analysis of real-world robots.txt data collected through HTTP Archive.

  • Gary Illyes and Martin Splitt described the project on the latest episode of Search Off the Record. The work started after a community member submitted a pull request to Google’s robots.txt repository proposing two new tags be added to the unsupported list.


Illyes explained why the team broadened the scope beyond the two tags in the PR:

“We tried to not do things arbitrarily, but rather collect data.”

Rather than add only the two tags proposed, the team decided to look at the top 10 or 15 most-used unsupported rules. Illyes said the goal was “a decent starting point, a decent baseline” for documenting the most common unsupported tags in the wild.

How The Research Worked

The team used HTTP Archive to study what rules websites use in their robots.txt files. HTTP Archive runs monthly crawls across millions of URLs using WebPageTest and stores the results in Google BigQuery.

The first attempt hit a wall. The team found that the default crawl does not request robots.txt files, so the HTTP Archive datasets don’t typically include robots.txt content.

After consulting with Barry Pollard and the HTTP Archive community, the team wrote a custom JavaScript parser that extracts robots.txt rules line by line. The custom metric was merged before the February crawl, and the resulting data is now available in the custom_metrics dataset in BigQuery.
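To make the line-by-line extraction concrete, here is a minimal sketch in Python. It is not the team's JavaScript parser. The regular expression, the function name `extract_rules`, and the sample file are all assumptions made for illustration, and the pattern is only one plausible reading of "field, colon, value".

```python
import re
from collections import Counter

# Hypothetical pattern: a field name, a colon, then a value.
FIELD_LINE = re.compile(r"^\s*([A-Za-z][A-Za-z0-9_-]*)\s*:\s*(.*)$")


def extract_rules(robots_txt):
    """Return (field, value) pairs for every line shaped like 'field: value'."""
    rules = []
    for raw_line in robots_txt.splitlines():
        # Drop anything after '#', which starts a comment in robots.txt.
        line = raw_line.split("#", 1)[0].strip()
        match = FIELD_LINE.match(line)
        if match:
            # Lowercase the field so 'Disallow' and 'disallow' count together.
            rules.append((match.group(1).lower(), match.group(2).strip()))
    return rules


# Made-up sample file, used only to exercise the function.
sample = """\
User-agent: *
Disallow: /private/
Crawl-delay: 10   # not one of the four supported fields
Sitemap: https://example.com/sitemap.xml
"""

rules = extract_rules(sample)
field_counts = Counter(field for field, _ in rules)
print(rules)
print(field_counts)
```

Lowercasing the field name is a design choice in this sketch so that differently capitalized spellings of the same rule are counted as one.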

What The Data Shows

The parser extracted every line that matched a field-colon-value pattern. Illyes described the resulting distribution:

“After allow and disallow and user agent, the drop is extremely drastic.”

Beyond those three fields, rule usage falls into a long tail of less common directives, plus junk data from broken files that return HTML instead of plain text.
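A rough sketch of how such a tally might be built, and how HTML responses could be kept out of it, is shown below. The `looks_like_html` heuristic, the function names, and the three sample bodies are assumptions for illustration. The episode did not describe how the team handled junk data.

```python
from collections import Counter


def looks_like_html(body):
    """Heuristic (an assumption): treat bodies that open with markup as HTML."""
    head = body.lstrip().lower()
    return head.startswith(("<!doctype", "<html", "<head", "<body"))


def tally_fields(bodies):
    """Count field names across many robots.txt bodies, skipping HTML responses."""
    counts = Counter()
    for body in bodies:
        if looks_like_html(body):
            continue  # a broken "robots.txt" that is really an HTML page
        for line in body.splitlines():
            field, sep, _value = line.partition(":")
            field = field.strip().lower()
            # Keep only single-token field names that are not comments.
            if sep and field and " " not in field and not field.startswith("#"):
                counts[field] += 1
    return counts


# Made-up sample bodies: two plain-text files and one HTML error page.
bodies = [
    "User-agent: *\nDisallow: /tmp/\nAllow: /tmp/public/\n",
    "User-agent: *\nDisallow: /\nCrawl-delay: 5\n",
    "<!DOCTYPE html><html><body>Not found</body></html>",
]

counts = tally_fields(bodies)
for field, n in counts.most_common():
    print(field, n)
```

Sorting with `most_common()` puts the frequently used fields first and leaves the less common directives at the end of the list.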

Google currently supports four fields in robots.txt: user-agent, allow, disallow, and sitemap. The documentation says other fields “aren’t supported” without listing which unsupported fields are most common in the wild.

Google has clarified that unsupported fields are ignored. The current project extends that work by identifying specific rules Google plans to document.

The top 10 to 15 most-used rules beyond the four supported fields are expected to be added to Google’s unsupported rules list. Illyes did not name specific rules that would be included.

Typo Tolerance May Expand

Illyes said the analysis also surfaced common misspellings of the disallow rule:

“I’m probably going to expand the typos that we accept.”

His phrasing implies the parser already accepts some misspellings. Illyes didn’t commit to a timeline or name specific typos.
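Site owners who want to catch likely misspellings in their own files could use a generic similarity check. The sketch below uses Python's standard `difflib` module and is a swapped-in technique, not how Google's parser decides which typos to accept. The function name `suggest_field`, the 0.8 cutoff, and the test spellings are assumptions for illustration.

```python
import difflib

# The four fields the article lists as supported by Google.
SUPPORTED_FIELDS = ["user-agent", "allow", "disallow", "sitemap"]


def suggest_field(field, cutoff=0.8):
    """Return the supported field that `field` most resembles, or None.

    None is returned both for correctly spelled supported fields and for
    names that are not close to any supported field.
    """
    name = field.strip().lower()
    if name in SUPPORTED_FIELDS:
        return None  # already spelled correctly
    matches = difflib.get_close_matches(name, SUPPORTED_FIELDS, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Illustrative spellings only; these are not taken from the HTTP Archive data.
for candidate in ["dissallow", "disalow", "useragent", "Disallow", "crawl-delay"]:
    print(candidate, "->", suggest_field(candidate))
```

A high cutoff keeps unrelated field names from being mistaken for typos, at the cost of missing heavily mangled spellings.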

Why This Matters

Search Console already surfaces some unrecognized robots.txt tags. Documenting more unsupported directives would bring Google’s public documentation closer to what people already see flagged there.

Looking Ahead

The planned update would affect Google’s public documentation and how disallow typos are handled. Anyone maintaining a robots.txt file with rules beyond user-agent, allow, disallow, and sitemap should audit for directives that have never worked for Google.
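One way to run that kind of audit is sketched below, assuming the robots.txt content is already available as a string. The function name `audit_robots_txt` and the sample file are hypothetical. The `Crawl-delay` and `Noindex` lines are illustrative examples of fields outside the four supported ones, not rules Illyes named.

```python
# The four fields the article lists as supported by Google.
SUPPORTED_FIELDS = {"user-agent", "allow", "disallow", "sitemap"}


def audit_robots_txt(text):
    """Return (line_number, field) for each field outside the supported four."""
    findings = []
    for number, raw_line in enumerate(text.splitlines(), start=1):
        # Ignore comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field = line.split(":", 1)[0].strip().lower()
        if field and field not in SUPPORTED_FIELDS:
            findings.append((number, field))
    return findings


# Made-up sample file for illustration.
sample = """\
User-agent: *
Disallow: /admin/
Crawl-delay: 10
Noindex: /drafts/
Sitemap: https://example.com/sitemap.xml
"""

for number, field in audit_robots_txt(sample):
    print(f"line {number}: '{field}' is not one of the four supported fields")
```

The check reports line numbers so each flagged directive can be reviewed by hand before anything is removed.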

The HTTP Archive data is publicly queryable on BigQuery for anyone who wants to examine the distribution directly.
