Working with robots.txt file 

by Pannu Jagdeep.S. 4/23/04

Synopsis:

Learn how to work with the robots.txt file. This guide covers what the robots.txt file is, its advantages and disadvantages, and how to use it to define the content you want excluded from indexing, thus saving the crawler's indexing time…
The Article

Disadvantages of the robots.txt file

Careless handling of directory and file names in robots.txt can invite snoopers. Anyone can read the file, so listing the files and directories that hold confidential content tells them exactly where to look. This is not a serious issue, because proper access controls on the content in question take care of it. For example, suppose your traffic log lives at a URL such as www.domain.com/stats/index.htm and you do not want robots to index it. You would add a rule to your robots.txt file, for example:

User-agent: *
Disallow: /stats/
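
As a quick check of how a well-behaved crawler would read these two lines, here is a minimal sketch using Python's standard urllib.robotparser module. The crawler name "ExampleBot" is made up for illustration, and the rules are parsed from a string rather than fetched from a live site. It only shows what compliant robots do; it does nothing to stop a person with a browser.

```python
from urllib.robotparser import RobotFileParser

# The same two rules as in the robots.txt example above.
rules = """\
User-agent: *
Disallow: /stats/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# "ExampleBot" is a hypothetical crawler name used only for this sketch.
blocked = parser.can_fetch("ExampleBot", "http://www.domain.com/stats/index.htm")
allowed = parser.can_fetch("ExampleBot", "http://www.domain.com/index.htm")

print(blocked, allowed)  # prints: False True
```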

However, it is easy for a snooper to guess what you are trying to hide, and simply typing the URL www.domain.com/stats into a browser would give access to it. This calls for one of the following remedies:
Change file names:

  Change the stats filename from index.htm to something different, such as stats-new.htm, so that your stats URL becomes www.domain.com/stats/stats-new.htm

  Place a simple text file containing the text, “Sorry, you are not authorized to view this page”, and save it as index.htm in your /stats/ directory.

This way the snooper cannot easily guess your actual filename and get to your banned content.
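
The two renaming steps above could be scripted roughly as follows. This is only a sketch: the function name hide_stats_page is hypothetical, and it assumes you have direct file access to the /stats/ directory on the server.

```python
from pathlib import Path

NOTICE = "Sorry, you are not authorized to view this page."


def hide_stats_page(stats_dir, new_name="stats-new.htm"):
    """Rename index.htm to new_name and leave a notice page in its place.

    hide_stats_page is a hypothetical helper written for this example.
    """
    stats_dir = Path(stats_dir)
    real_page = stats_dir / "index.htm"
    renamed = stats_dir / new_name

    # Refuse to overwrite a file that already uses the new name.
    if renamed.exists():
        raise FileExistsError(f"{renamed} already exists")

    # Step 1: move the real stats page to the less obvious name.
    real_page.rename(renamed)

    # Step 2: put the notice where the stats page used to be.
    real_page.write_text(NOTICE, encoding="utf-8")
    return renamed
```

Calling hide_stats_page("/path/to/site/stats") would apply both steps to that directory; the path shown is a placeholder.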

Use login passwords:

  Password-protect the sensitive content listed in your robots.txt file.
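
The article does not say which password mechanism to use. As one possible illustration, the sketch below checks an HTTP Basic "Authorization" header against a single expected username and password. The function name is_authorized is hypothetical, and in practice you would more likely rely on your web server's built-in password protection, served over an encrypted connection.

```python
import base64
import hmac


def is_authorized(auth_header, username, password):
    """Return True if an HTTP Basic Authorization header matches the credentials.

    is_authorized is a hypothetical helper written for this example.
    """
    if not auth_header or not auth_header.startswith("Basic "):
        return False
    try:
        raw = base64.b64decode(auth_header[len("Basic "):], validate=True)
        decoded = raw.decode("utf-8")
    except ValueError:
        # Covers malformed base64 and bytes that are not valid UTF-8.
        return False

    supplied_user, sep, supplied_password = decoded.partition(":")
    if not sep:
        return False

    # compare_digest avoids leaking information through comparison timing.
    user_ok = hmac.compare_digest(supplied_user.encode(), username.encode())
    pass_ok = hmac.compare_digest(supplied_password.encode(), password.encode())
    return user_ok and pass_ok
```

A request for anything under /stats/ would be served only when is_authorized returns True; otherwise the server would answer with a 401 response and ask for credentials.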
