Last updated on November 20th, 2015
As well as being true to his word about Guantanamo,
“we reject as false the choice between our safety and our ideals. Our founding fathers, faced with perils we can scarcely imagine, drafted a charter to assure the rule of law and the rights of man, a charter expanded by the blood of generations”
new President Obama has also kicked out the blocking mechanisms from the White House website.
Apart from the fact that I’d never actually checked, I was astounded to find that the previous robots.txt file ran to over 2,300 lines of blocked files and folders! Now it has just two:
User-agent: *
Disallow: /includes/
You can check the file for yourself on any site; the White House’s is here: https://www.whitehouse.gov/robots.txt
It’s not necessary to have the file, and no robot or spider indexing the web actually has to abide by it, but it’s useful for speeding up crawling and for getting properly indexed by the “good” spiders.
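For anyone curious how a “good” spider would read those two lines, here is a minimal sketch using Python’s standard urllib.robotparser module. The rules are the two lines quoted above; the robot name “ExampleBot” and the two paths are made up purely for illustration, and nothing is fetched from the real site. It also shows why the file is only a request: the check happens inside the robot, so a robot that never asks is never stopped.

```python
from urllib.robotparser import RobotFileParser

# The two-line robots.txt quoted in this post, held as a string so that
# the example runs offline.
RULES = """\
User-agent: *
Disallow: /includes/
"""


def is_allowed(path, agent="ExampleBot"):
    """Return True if a polite robot would consider itself free to fetch path.

    "ExampleBot" is a hypothetical robot name used only for this sketch.
    """
    parser = RobotFileParser()
    parser.parse(RULES.splitlines())
    return parser.can_fetch(agent, path)


# Hypothetical paths: one under the blocked folder, one elsewhere.
print(is_allowed("/includes/header.html"))
print(is_allowed("/blog/"))
```

Everything under /includes/ is refused and the rest of the site is left open, which is about as permissive as a robots.txt gets without being empty.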
The robots.txt for this site exists and can be examined here: https://strangelyperfect.tv/robots.txt. It’s bigger than the one above because I was getting spurious results caused by the language translator(s) that I’ve used. To avoid confusion, and to stop the Google downgrade that happens when multiple pages carry the same content, I block the “virtual” pages from the index. I will review this soon because the plugin designs have changed since I did it. Like Obama, I’m for openness.
It does make one wonder about the mentality of the previous administration that so wanted to clamp down on the mechanisms of democratic government, or maybe it just confirms, yet again, what we already knew. And bizarrely, they seemed unaware that there is no compulsion whatsoever for a robot or spider to pay any attention to the actual file! It’s in the specs.
https://www.robotstxt.org/robotstxt.html – Robot Usage from the Organisation that sets the standard