How to create a perfect robots.txt: all about robots.txt

Posted by on Jun 3, 2011 in web designing | 4 comments

How to create a perfect robots.txt file for WordPress

WordPress is a delicate system, and with plugins we can extend its functionality.

Why we need robots.txt in WordPress –

robots.txt is one of the most important files for a webmaster, because it controls which directories of the website search robots may access.


  • By keeping search bots out of useless directories or folders, we can avoid 404 Not Found errors and other crawl problems that may hurt how Google treats the site.
  • The robots.txt file is a set of guidelines that tells Google or any other search bot where to look for the content and useful files that we want indexed.
  • It stops search bots from crawling useless WordPress files and other files, because these files may cause 404 Not Found errors.
  • Along with their functionality, plugins also create temporary files to do the job they are made for.
  • Once these temporary files no longer exist, their URLs show up as 404 Not Found errors in webmaster tools.
  • These 404 Not Found errors are considered among the worst from an SEO point of view, because they make a bad impression on the Google search bot.

How to make the robots.txt file

  • The robots.txt file is nothing but a simple text file. Create it in Notepad (or any text editor) on your system and name it robots.txt.
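As a rough sketch of the step above, the same file can also be produced with a few lines of Python instead of Notepad. The helper names `build_robots_txt` and `save_robots_txt` are assumptions made for illustration; they are not part of WordPress or of any library.

```python
from pathlib import Path


def build_robots_txt(disallowed, sitemap=None):
    """Return robots.txt text with one rule group for all bots (hypothetical helper)."""
    lines = []
    if sitemap:
        # Optional sitemap line, followed by a blank line before the rule group.
        lines.append(f"Sitemap: {sitemap}")
        lines.append("")
    lines.append("User-agent: *")
    for path in disallowed:
        lines.append(f"Disallow: {path}")
    return "\n".join(lines) + "\n"


def save_robots_txt(text, directory="."):
    """Write the text to a file named robots.txt inside the given directory."""
    target = Path(directory) / "robots.txt"
    target.write_text(text, encoding="utf-8")
    return target
```

Calling `save_robots_txt(build_robots_txt(["/cgi-bin/", "/wp-admin/"]))` would write a small robots.txt into the current folder, ready to upload.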

 

How to check the robots.txt file of any site –

 

  • Just type /robots.txt after the domain name of any site and press Enter; the site's robots.txt file will be shown, for example:

http://example.com/robots.txt
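The same check can be done from a script. This is only a sketch: `robots_url` and `fetch_robots_txt` are hypothetical helper names, and the fetch assumes the site is reachable and serves its robots.txt as text.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def robots_url(page_url):
    """Build the robots.txt address for the site that page_url belongs to."""
    parts = urlsplit(page_url)
    # Keep only the scheme and domain; robots.txt always sits at the site root.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def fetch_robots_txt(page_url, timeout=10):
    """Download and return the robots.txt text (needs network access)."""
    with urlopen(robots_url(page_url), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

For instance, `robots_url("http://example.com/2011/06/some-post/")` gives back the robots.txt address shown above, whatever page of the site you start from.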

 

Medicscientist robots.txt file –

sitemap: http://www.medicscientist.com/sitemap.xml

User-agent: *
# disallow all files in these directories
Disallow: /cgi-bin/
Disallow: /downloads/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/
Disallow: /go/
Disallow: /archives/*
Disallow: /*?*
Disallow: /wp-*
Disallow: /category/*
Disallow: /tag/*
Disallow: /author
Disallow: /comment-page/*
Disallow: /*/feed/$
Disallow: /*/feed/rss/$
Disallow: /*/trackback/$
Disallow: /*/comment-page-/$
Disallow: /comments/feed/*
Disallow: /comments/default
Disallow: /feeds/*
Disallow: /feed/*
Disallow: /trackback/

User-agent: Googlebot
Disallow: /*.php$
Disallow: /*.js$
Disallow: /*.php*
Disallow: /*.css$

User-agent: Mediapartners-Google*
Allow: /

User-agent: Googlebot-Image
Allow: /wp-content/uploads/

User-agent: Adsbot-Google
Allow: /

User-agent: Googlebot-Mobile
Allow: /

 

Shoutmeloud robots.txt –

sitemap: http://www.shoutmeloud.com/sitemap.xml

User-agent: *
# disallow all files in these directories
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/
Disallow: /go/
Disallow: /archives/
Disallow: /*?*
Disallow: /wp-*
Disallow: /author
Disallow: /comments/feed/
Disallow: /ar/
Disallow: /be/
Disallow: /bg/
Disallow: /ca/
Disallow: /cs/
Disallow: /da/
Disallow: /de/
Disallow: /el/
Disallow: /es/
Disallow: /et/
Disallow: /fa/
Disallow: /fi/
Disallow: /fr/
Disallow: /ga/
Disallow: /gl/
Disallow: /hi/
Disallow: /hr/
Disallow: /hu/
Disallow: /id/
Disallow: /is/
Disallow: /it/
Disallow: /iw/
Disallow: /ja/
Disallow: /ko/
Disallow: /lt/
Disallow: /lv/
Disallow: /mk/
Disallow: /ms/
Disallow: /mt/
Disallow: /nl/
Disallow: /no/
Disallow: /pl/
Disallow: /pt/
Disallow: /ro/
Disallow: /ru/
Disallow: /sk/
Disallow: /sl/
Disallow: /sq/
Disallow: /sr/
Disallow: /stale/
Disallow: /sv/
Disallow: /th/
Disallow: /tl/
Disallow: /tr/
Disallow: /uk/
Disallow: /vi/
Disallow: /zh-CN/
Disallow: *?replytocom

User-agent: Mediapartners-Google*
Allow: /

User-agent: Googlebot-Image
Allow: /wp-content/uploads/

User-agent: Adsbot-Google
Allow: /

User-agent: Googlebot-Mobile
Allow: /

 

Hellboundbloggers robots.txt –

sitemap: http://hellboundbloggers.com/sitemap.xml

User-agent: *
# disallow all files in these directories
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/
Disallow: /go/
Disallow: /archives/
Disallow: /*?*
Disallow: /wp-*
Disallow: /author
Disallow: /comments/feed/
Disallow: *?replytocom

User-agent: Mediapartners-Google*
Allow: /

User-agent: Googlebot-Image
Allow: /wp-content/uploads/

User-agent: Adsbot-Google
Allow: /

User-agent: Googlebot-Mobile
Allow: /

 

If your blog is installed in a subfolder like domain/blog, then you should use a robots.txt like the one below –

User-agent: *
# disallow all files in these directories
Disallow: /cgi-bin/
Disallow: /af/
Disallow: /downloads/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/
Disallow: /go/
Disallow: /archives/
Disallow: /*?*
Disallow: /wp-*
Disallow: /author
Disallow: /comments/feed/
Disallow: /blog/cgi-bin/
Disallow: /blog/wp-admin/
Disallow: /blog/wp-includes/
Disallow: /blog/wp-content/
Disallow: /blog/go/
Disallow: /blog/archives/
Disallow: /blog/*?*
Disallow: /blog/wp-*
Disallow: /blog/author
Disallow: /blog/comments/feed/

User-agent: Googlebot
Disallow: /blog/*.php$
Disallow: /blog/*.js$
Disallow: /blog/*.php*
Disallow: /blog/*.css$
Allow: /blog/wp-content/uploads/

User-agent: Googlebot-Image
Allow: /*
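The /blog/ lines in the file above are just the root rules with the subfolder put in front. A small sketch of that idea follows; `prefix_rules` is a hypothetical helper and the short list of paths is only a sample taken from the listing above.

```python
def prefix_rules(paths, subfolder):
    """Return each rule path with the subfolder in front (hypothetical helper)."""
    # Normalise "blog", "/blog" and "/blog/" to the same "/blog" prefix.
    base = "/" + subfolder.strip("/")
    return [base + path for path in paths]


# Sample root-level paths; replace them with the ones your own site needs.
root_paths = ["/cgi-bin/", "/wp-admin/", "/wp-includes/", "/*?*", "/author"]
for path in prefix_rules(root_paths, "blog"):
    print(f"Disallow: {path}")
```

If your subfolder has another name, pass that name instead of "blog".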

 

  • Replace the sitemap.xml URL with the one for your own domain and put this robots.txt file in your

How to add your own rule

  • If you want to block a directory (/example), as in http://domain.com/example/xyz/aaaa,
  • then add
  • Disallow: /example/
  • This keeps robots out of every file under /example/.
  • If you want to block URLs ending in a fragment like (/comment-page-) that show up as 404 Not Found, such as (http://acedesigno.com/2010/11/how-to-remove-blog-post-date-and-time-easiest-way-to-remove-date-and-time.html/comment-page-),
  • then add
  • Disallow: /*/comment-page-$
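Before uploading, a plain directory rule like Disallow: /example/ can be tried out with Python's standard `urllib.robotparser` module. This is a minimal sketch: the `is_allowed` helper and the bot name ExampleBot are made up for illustration, and the standard-library parser is used here only for a simple prefix rule, since it may not treat wildcard rules with * and $ the way Google does.

```python
from urllib.robotparser import RobotFileParser

# The rule from the list above, applied to every bot.
RULES = """\
User-agent: *
Disallow: /example/
"""


def is_allowed(url, agent="ExampleBot"):
    """Return True if the rules above let the given bot fetch the URL."""
    parser = RobotFileParser()
    parser.parse(RULES.splitlines())
    return parser.can_fetch(agent, url)


print(is_allowed("http://domain.com/example/xyz/aaaa"))
print(is_allowed("http://domain.com/about/"))
```

The first URL sits under /example/ and so is blocked, while the second is outside that directory and stays open to the bot.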

Note –

  • * (star) is a wildcard that matches any sequence of characters
  • $ marks the end of the URL
  • For example, to block all files ending in (.jpg), add
  • Disallow: /*.jpg$
  • If you want to block category and tag pages, you can also add
  • Disallow: /tag/*
  • Disallow: /category/*
  • That’s all about robots.txt.
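To see how the * and $ characters from the note behave, here is a minimal sketch that turns a rule into a regular expression and tests it against a URL path. The helper names `rule_to_regex` and `rule_matches` are assumptions for illustration, and this is a simplified model of wildcard matching, not the exact code any search engine runs.

```python
import re


def rule_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regular expression."""
    # A trailing $ means the rule must match right up to the end of the URL.
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Each * becomes "any run of characters"; everything else is matched literally.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def rule_matches(pattern, path):
    """Return True if the URL path is covered by the rule pattern."""
    return rule_to_regex(pattern).match(path) is not None


print(rule_matches("/*.jpg$", "/images/photo.jpg"))
print(rule_matches("/*.jpg$", "/images/photo.jpg?size=2"))
print(rule_matches("/tag/*", "/tag/seo/"))
```

The second check fails because of the $: the URL continues after .jpg, so the rule no longer reaches the end of it.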