FAQ: Are robots.txt files outdated or worthwhile?

Google’s own staff have been divided for years on whether using robots.txt to block crawlers from websites is an advisable practice or not.

John Mueller used to freely advise people to use it to block Google as much as they liked, whereas the Search Quality team led by Matt Cutts recommended minimising the use of robots.txt and avoiding blocking Googlebot unless absolutely necessary. The Search Quality team appears to have won that argument, as all the main guidance in the past few years has swung towards not blocking Googlebot. Allowing free crawling is of course the default when you have no robots file, so it really doesn’t need to be defined.
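For what it’s worth, a robots.txt that imposes no restrictions at all looks like the two lines below (an empty Disallow blocks nothing) – having no file whatsoever achieves exactly the same result:

    User-agent: *
    Disallow: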

If you do a ‘fetch’ in Google Search Console you’ll see warnings wherever the crawling of assets like CSS and JavaScript has been disallowed by robots.txt. You’ll be warned about each blocked asset because Google needs to see these files to render the full visual design of a page, as humans see it, in order to properly assess the design quality and ensure there’s no spam going on.
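As a purely hypothetical illustration (the directory names are invented), rules like the following are what trigger those blocked-resource warnings; removing them, or adding explicit Allow lines for the asset paths (which Googlebot honours), clears the warnings:

    User-agent: *
    Disallow: /assets/css/
    Disallow: /assets/js/

While rules like these remain in place Google cannot render the page as a visitor would see it, so the simplest option is not to block asset directories at all.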

So what would you ever wish to block? Private or secret parts of the site, you might say, so that Google never displays them in search results if it stumbles across them by accident. But what you’re actually doing by listing these in your robots file is telling hackers exactly where to look!
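To see why, consider a hypothetical file like this (the paths are illustrative). Robots.txt sits at a well-known URL – yoursite.com/robots.txt – and anyone, including an attacker, can read it:

    User-agent: *
    Disallow: /admin/
    Disallow: /private-reports/

Rather than hiding anything, those two Disallow lines advertise exactly where the sensitive areas of the site are.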

It’s a highly controversial topic of course, as Google’s own staff have been divided on the matter for many years, and most SEO agencies will commonly use robots.txt tweaks as one of many ‘jobs to do’ to make their work look more substantial – much like creating XML Sitemaps on sites that don’t need them. There will always be agencies promoting this kind of fluff to justify their charges.