It can be difficult to know which directories should be restricted from search engines and which should be allowed.  I thought it would be a good idea to create a robots.txt reference guide for popular content management systems and shopping carts.  Depending on what backend you are using, all you have to do is copy the the text into a robots.txt file and upload it to your server.  This should help manage the issue with duplicate content pages on your site.

WordPress Robots.txt File

**If your blog is in a sub-directory, prefix the below with the blog directory name. (ex: /blog/directory)

[plain] User-agent: *
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content/plugins
Disallow: /wp-content/cache
Disallow: /wp-content/themes
Disallow: /trackback
Disallow: /tag
Disallow: /author
Disallow: /wget/
Disallow: /httpd/
Disallow: /cgi-bin
Disallow: /images/
Disallow: /search
Disallow: /feed
Disallow: /feed/
Disallow: /trackback/
Disallow: /rss
Disallow: /comments/feed
Disallow: /feed/$
Disallow: /*/feed/$
Disallow: /*/feed/rss/$
Disallow: /*/trackback/$ [/plain]

Magento Robots.txt File

[plain]
User-agent: *
Disallow: /*?
Disallow: /*.js$
Disallow: /*.css$
Disallow: /checkout/
Disallow: /catalogsearch/
Disallow: /app/
Disallow: /downloader/
Disallow: /images/
Disallow: /js/
Disallow: /lib/
Disallow: /media/
Disallow: /*.php$
Disallow: /pkginfo/
Disallow: /report/
Disallow: /skin/
Disallow: /var/
Disallow: /catalog/product_compare/
Disallow: /catalog/ Disallow: /customer/
Disallow: /catalogsearch/advanced/
Disallow: /wishlist/
Disallow: /404/
Disallow: /admin/
Disallow: /api/
Disallow: /install/
Disallow: /catalog/product/view/id/</code>
Disallow: /customer/
[/plain]

Drupal Robots.txt File

[plain]

User-agent: *
# Directories
Disallow: /database/
Disallow: /includes/
Disallow: /misc/
Disallow: /modules/
Disallow: /sites/
Disallow: /themes/
Disallow: /scripts/
Disallow: /updates/
Disallow: /profiles/
# Paths (clean URLs)
Disallow: /admin/
Disallow: /aggregator/
Disallow: /comment/reply/
Disallow: /contact/
Disallow: /logout/
Disallow: /node/add/
Disallow: /search/
Disallow: /user/register/
Disallow: /contact
Disallow: /logout
Disallow: /user/register
Disallow: /user/password
Disallow: /user/login
Disallow: /user/password/
Disallow: /print/
Disallow: /forward/
# Files
Disallow: /xmlrpc.php
Disallow: /cron.php
Disallow: /update.php
Disallow: /install.php
Disallow: /INSTALL.txt
Disallow: /INSTALL.mysql.txt
Disallow: /INSTALL.pgsql.txt
Disallow: /CHANGELOG.txt
Disallow: /MAINTAINERS.txt
Disallow: /LICENSE.txt
Disallow: /UPGRADE.txt
# Block user tracker pages
Allow: /project/track
Disallow: /*/track$
Disallow: /*/track?page=

If you are not using static urls:

Disallow: /?q=admin/
Disallow: /?q=aggregator/
Disallow: /?q=comment/reply/
Disallow: /?q=contact/
Disallow: /?q=logout/
Disallow: /?q=node/add/
Disallow: /?q=search/
Disallow: /?q=user/password/
Disallow: /?q=user/register/
Disallow: /?q=user/login/
Disallow: /user/login/
[/plain]

Joomla Robots.txt File

[plain]
User-agent: *
Disallow: /administrator/
Disallow: /cache/
Disallow: /components/
Disallow: /editor/
Disallow: /help/
Disallow: /includes/
Disallow: /language/
Disallow: /mambots/
Disallow: /media/
Disallow: /modules/
Disallow: /templates/
Disallow: /installation/
Disallow: /libraries/
Disallow: /tmp/
Disallow: /xmlrpc/
Disallow: /admin
Disallow: /administrator
Disallow:/admin/
Disallow: /admin.html
Disallow:/admin.php
[/plain]

Robots.txt References