Google Webmaster Central Introduces New Robots.txt Analysis Tool & Unavailable After Meta Tag

August 16th, 2007




John Blackburn announces that the refreshed robots.txt analysis tool can now recognize sitemap declarations and relative URLs.

"Earlier versions weren't aware of sitemaps at all, and understood only absolute URLs; anything else was reported as Syntax not understood. The improved version now tells you whether your sitemap's URL and scope are valid. You can also test against relative URLs with a lot less typing.

Reporting is better, too. You'll now be told of multiple problems per line if they exist, unlike earlier versions which only reported the first problem encountered. And we've made other general improvements to analysis and validation."

Say you want search engine bots to index everything on your site except the images folder, and your robots.txt file looks like this:

disalow images

user-agent: *
Disallow:

sitemap: http://www.example.com/sitemap.xml

You then visit Webmaster Central and test your site with the robots.txt analysis tool, using these two test URLs:

http://www.example.com

/archives

The previous version of the tool would report this:

[Image: gwtbefore.gif]

The new version of the tool reports this instead:

[Image: gwtafter.png]
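The kind of check described above can be sketched roughly in Python. This is a minimal illustration, not Google's implementation: the function names `check_robots_txt` and `resolve_test_url`, the set of recognized directives, and the "Sitemap URL is not absolute" message are assumptions made for this example. Only the "Syntax not understood" wording comes from the quoted announcement.

```python
from urllib.parse import urljoin, urlparse

# Directives this sketch recognizes (an assumption; the post gives no list).
KNOWN_DIRECTIVES = {"user-agent", "disallow", "allow", "sitemap"}


def check_robots_txt(text):
    """Return a list of (line_number, message) for each problem found."""
    problems = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        field, sep, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if not sep or field not in KNOWN_DIRECTIVES:
            problems.append((number, "Syntax not understood"))
            continue
        if field == "sitemap":
            parts = urlparse(value)
            # Treat a sitemap declaration as valid only if it is a full URL.
            if parts.scheme not in ("http", "https") or not parts.netloc:
                problems.append((number, "Sitemap URL is not absolute"))
    return problems


def resolve_test_url(site_root, test_url):
    """Expand a relative test URL such as /archives against the site root."""
    return urljoin(site_root, test_url)


ROBOTS_TXT = """disalow images

user-agent: *
Disallow:

sitemap: http://www.example.com/sitemap.xml
"""

if __name__ == "__main__":
    for number, message in check_robots_txt(ROBOTS_TXT):
        print(f"line {number}: {message}")
    print(resolve_test_url("http://www.example.com", "/archives"))
```

Run against the sample file, the sketch flags only the misspelled first line and accepts the sitemap declaration. The relative test URL /archives expands to http://www.example.com/archives.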

For more information, read the official Google Webmaster Central blog.

In other news, Google has confirmed the new unavailable_after META tag, which we reported on last week.

"Let's assume you are running a promotion that expires at the end of 2007. In the headers of page www.example.com/2007promotion.html, you can use the following:

[Image: untitled.jpg]

The second exciting news: the new X-Robots-Tag directive, which adds Robots Exclusion Protocol (REP) META tag support for non-HTML pages! Finally, you can have the same control over your videos, spreadsheets, and other indexed file types. Using the example above, let's say your promotion page is in PDF format. For www.example.com/2007promotion.pdf, you would use the following in the file's HTTP headers:

X-Robots-Tag: unavailable_after: 31 Dec 2007 23:59:59 EST

Remember, REP META tags can be useful for implementing noarchive, nosnippet, and now unavailable_after tags for page-level instruction, as opposed to robots.txt, which is controlled at the domain root."
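As a rough illustration of the two delivery mechanisms quoted above, the Python sketch below builds the X-Robots-Tag header line and an equivalent META tag for the same expiry time. The helper names are hypothetical. The META tag shape is also an assumption: it follows the generic `<meta name="..." content="...">` pattern with `googlebot` as a placeholder robot name, because the post's own META example survives only as an image. The date layout mirrors the header shown in the quote, and the time zone is attached as a plain label with no conversion.

```python
from datetime import datetime


def unavailable_after_value(expires, tz_label="EST"):
    """Format the directive value in the style of the quoted header."""
    # tz_label is appended verbatim; no time zone arithmetic is performed.
    return f"unavailable_after: {expires.strftime('%d %b %Y %H:%M:%S')} {tz_label}"


def x_robots_tag_header(expires):
    """Return the (name, value) HTTP header pair for a non-HTML file."""
    return ("X-Robots-Tag", unavailable_after_value(expires))


def robots_meta_tag(expires, robot_name="googlebot"):
    """Return a META tag for an HTML page (assumed shape, see note above)."""
    return f'<meta name="{robot_name}" content="{unavailable_after_value(expires)}">'


promotion_end = datetime(2007, 12, 31, 23, 59, 59)
print(": ".join(x_robots_tag_header(promotion_end)))
print(robots_meta_tag(promotion_end))
```

In practice the header pair would be attached to the response that serves the PDF, while the META tag would go in the head of the HTML page. How that is wired up depends on the web server or framework in use.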
