21.1 Robots.txt
The first thing you need to know about getting your site indexed by search engines is that you control which pages are indexed and which are excluded. You do that with a file called robots.txt.
Robots.txt contains nothing more than a record of which robots should index which pages.
Without going into too much detail, there are two directives used in a robots.txt file:
User-agent: [Specifies which robots the directives that follow apply to.]
Disallow: [Lists the files or directories you want those robots to exclude.]
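Put together, a minimal robots.txt is just one or more User-agent lines, each followed by its Disallow lines (the /cgi-bin/ path here is only an illustration; substitute your own directories):

```
# Applies to every robot
User-agent: *
# Skip the scripts directory
Disallow: /cgi-bin/
```

An empty Disallow line means nothing is excluded, so that robot may index every page.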
In general, you’re probably going to use “User-agent: *” to make sure that you’re addressing the robots of every search engine, and you’ll probably want to include all of your pages (although you might want to exclude certain directories, such as your scripts directory: “Disallow: /cgi-bin/”).
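If you want to double-check what a given robots.txt actually permits before uploading it, Python’s standard library can parse the rules for you. This is a small sketch using `urllib.robotparser`; the file contents and paths are just the “User-agent: *” / “Disallow: /cgi-bin/” example from above:

```python
from urllib.robotparser import RobotFileParser

# The example rules discussed above: address every robot,
# and exclude only the /cgi-bin/ directory.
robots_txt = """\
User-agent: *
Disallow: /cgi-bin/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Ordinary pages are allowed for any robot...
print(parser.can_fetch("*", "/index.html"))          # True
# ...but anything under /cgi-bin/ is excluded.
print(parser.can_fetch("*", "/cgi-bin/search.cgi"))  # False
```

This is the same logic well-behaved search-engine robots apply when they read your file, so it is a quick way to catch a typo in a Disallow line.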
Robots.txt just allows you to control which robots index which pages. It’s important to have in your site’s root directory, but it won’t really increase your search engine rankings.