"robots.txt" is a file you can create on your site to help indexing bots to index your site correctly. These bots first scans your robots.txt
file to see which pages to ignore.
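The format itself is tiny: a User-agent line naming which bots the rules apply to, followed by Disallow lines with path prefixes those bots should skip. A minimal sketch (the paths here are made up purely for illustration):

User-agent: *
Disallow: /cgi-bin/
Disallow: /drafts/

A line like "Disallow: /" blocks everything, and an empty "Disallow:" blocks nothing.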
This page is a good tool to keep in mind for validating your robots.txt
files. robotstxt.org has more information about the wannabe standard.
Just found out how to exclude my printer-friendly version and PDF version (see the bottom right-hand corner):
Disallow: /pv$
Disallow: /pv/pdf$
Let's hope it works.
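For the record, Disallow lines only mean something underneath a User-agent line, so the complete file presumably looks more like this (a sketch; the trailing $ end-of-URL match is an extension honoured by some crawlers, Googlebot for one, rather than part of the original standard):

User-agent: *
Disallow: /pv$
Disallow: /pv/pdf$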