Unvalidated Robots.Txt Risks Google Banishment

The web-crawling Googlebot may find a forgotten line in robots.txt that causes a site to be dropped from the search engine's index entirely.
Webmasters welcome being dropped from Google about as much as they enjoy flossing with barbed wire. Making it easier for Google to do so would be anathema to any webmaster. Why willingly exclude one's own site from Google?
Yet that could happen with an unvalidated robots.txt file. Robots.txt lets webmasters give standing instructions to visiting spiders, which helps a site get indexed faster and more accurately.
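For readers unfamiliar with the format, a minimal robots.txt pairs a User-agent line with one or more path rules. The file below is a hypothetical example that keeps all crawlers out of a /private/ directory and points them at a sitemap; the path and URL are placeholders:

    # Applies to every crawler that honors robots.txt
    User-agent: *
    Disallow: /private/

    # Sitemap is a standalone directive, not tied to a User-agent section
    Sitemap: http://www.example.com/sitemap.xml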
Google has been considering new syntax to recognize within robots.txt. The Sebastians-Pamphlets blog reported that Google confirmed it recognizes experimental directives such as Noindex in the robots.txt file.
This poses a danger to webmasters who have not validated their robots.txt: a line reading Noindex: / could get a site completely de-indexed.
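To make the hazard concrete, here is a hypothetical robots.txt in which a single forgotten line, left over from testing, would wipe every page from Google's index if the experimental directive is honored:

    User-agent: Googlebot
    Disallow: /cgi-bin/
    # Forgotten test line: with experimental syntax support,
    # this de-indexes the entire site
    Noindex: /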
The surname-less Sebastian recommended running one's file through Google's robots.txt analyzer, part of Google's Webmaster Tools, and using only the Disallow, Allow, and Sitemap crawler directives in the Googlebot section of robots.txt.
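Following that advice, a validated Googlebot section restricted to those well-supported directives might look like this; the paths and sitemap URL are placeholders:

    User-agent: Googlebot
    Allow: /articles/
    Disallow: /search/

    Sitemap: http://www.example.com/sitemap.xml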