Robots & User-Agents
By Submit Corner

Overview: Robots rules are a means of telling search engines which pages to index and which links to follow (spider). Learn how to write robots rules that tell search engines exactly which pages you want indexed and spidered.

Robots, or user-agents, are automated scripts and programs that visit websites and attempt to index all the links and content they find on your webpages. If robots are not given instructions on what they may index and spider, they will keep following links on your site until there are no more, indexing everything they find. While this saves the time of submitting each individual page to a search engine, it can also pose problems. Many websites offer dynamic content through scripts and databases, which generally should not be indexed since they can produce an infinite number of webpages.

To give site owners a set of rules that tell search engine robots which files to index and which to skip, a standard called the Standard for Robot Exclusion was created. It involves creating a robots.txt file and placing it in the root web directory. Some users do not have access to write files to the root web directory, so another supported format is the Robots META tag.

The server-side (robots.txt) syntax is described below with two examples. To easily create a robots file or Robots META tag, we've created an online Robots Generator which will create the code for you in a matter of seconds.

Server Side Robots (robots.txt) Usage

The robots.txt file is a plain text file placed in your web root directory so that any client can access it by going to yoursite.com/robots.txt. If you cannot create a file at this location, you'll need to implement a Robots META tag instead.
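Because the file always sits at the root of the site, its address can be derived from any page URL on that site. The following is a minimal Python sketch of that idea; the function name robots_txt_url and the sample page URL are illustrative assumptions, not part of the standard.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt address for the site that serves page_url.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(page_url)
    # Keep the scheme and host, drop the page path, query and fragment,
    # and point at the fixed root location /robots.txt.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Example page URL (assumed for illustration).
print(robots_txt_url("http://yoursite.com/guides/robots.html?page=2"))
```

A robot performs the equivalent step before crawling: whatever page it is about to request, it first looks for the rules at the root of that host.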

robots.txt Usage:
File Location: "/robots.txt"
General Usage:
User-agent: <AGENT>
Disallow: /PATH/

<AGENT> is the name of a search engine's robot; use an asterisk (*) to address all robots.
/PATH/ is a path, relative to the web root, that the named robot should not access. It matches every URL that begins with it.
Examples:
User-agent: *
Disallow: /cgi-bin/

Comments: The above tells all robots not to access anything in the /cgi-bin/ directory.

User-agent: ia_archiver
Disallow: /

Comments: The above tells the robot called ia_archiver not to access anything on this webserver. All other user-agents have full access. User-agent names are available from each individual search engine, or you may use our Robots Generator Script to create your robots code for you.
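One way to confirm that the two examples behave as the comments describe is to run them through a robots.txt parser. The sketch below uses Python's standard urllib.robotparser module; the helper names build_record and allowed, the robot name AnyBot, and the test URLs are assumptions made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def build_record(agent, paths):
    """Format one robots.txt record: a User-agent line plus Disallow lines."""
    lines = [f"User-agent: {agent}"]
    lines += [f"Disallow: {path}" for path in paths]
    return "\n".join(lines) + "\n"


def allowed(robots_txt, agent, url):
    """Return True if the given rules let `agent` fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines, so no network
    # request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The two examples from this guide.
EXAMPLE_1 = build_record("*", ["/cgi-bin/"])
EXAMPLE_2 = build_record("ia_archiver", ["/"])

# Example 1: every robot is kept out of /cgi-bin/ only.
print(allowed(EXAMPLE_1, "AnyBot", "http://yoursite.com/cgi-bin/search.pl"))  # False
print(allowed(EXAMPLE_1, "AnyBot", "http://yoursite.com/index.html"))         # True

# Example 2: only ia_archiver is kept out; other robots have full access.
print(allowed(EXAMPLE_2, "ia_archiver", "http://yoursite.com/index.html"))    # False
print(allowed(EXAMPLE_2, "AnyBot", "http://yoursite.com/index.html"))         # True
```

Keep in mind that this only shows how a well-behaved robot would read the rules. The robots.txt file is a request, not an access control, so a robot that ignores the standard can still fetch the pages.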

Copyright ©2000 - 2006 Wired 2000 Corporation
All Rights Reserved