StayOnSearch

Improve Your Website Traffic by Cleaning Up Your Google Index

Saudiqua Thebus June 10, 2010

Your site's index gives you an indication of its health: a sudden spike or large drop in the number of pages Google has indexed is a warning sign that something is wrong and needs attention. We focus mostly on Google because, with its larger market share, it sends us the greater share of organic search traffic.

Remember that the more pages you have indexed in Google, the more (long-tail) keywords you can rank for, and the more keywords you rank for, the more traffic you will receive to your site.

Finding Issues

So how do we find issues within our site’s index? As an SEO, you need to be doing weekly health checks that let you monitor your site by tracking things like indexed pages, backlinks and duplicates. If you notice any significant changes in these areas, you should start investigating what has been going on.

A sign that your site’s health is in danger would be a combination of a large drop in indexed pages in Google and a large increase in indexed pages in Yahoo. This normally indicates a duplicate or junk content problem.

If your site has been verified with Google Webmaster Tools, you will be able to see things such as duplicate page titles and descriptions, 404 errors, etc.

I normally use a combination of Google’s Webmaster Tools console and Yahoo! Site Explorer to find any issues that may hinder my site from being indexed in Google.

How to Go About It

Here are some tips I agree with, which have been taken from this blog post.

1. META Tags

You can add a simple META tag to the top of every live page you have (between the <head> and </head> tags) and configure it to your liking. Here’s how the META tag should look:

<meta name="robots" content="noindex, follow">

NOINDEX signifies that site crawlers should not index the page. Alternatively, writing index would tell crawlers to index the page.

FOLLOW signifies that links on this page should be followed and given credit (alternatively, nofollow tells crawlers not to follow the links).

The "nofollow" attribute is also commonly used for individual links, by adding rel="nofollow" to the <a> HTML element, when web publishers don't want to get penalized for linking to suspicious or low-quality sites.
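For example, the difference between an ordinary link and a nofollow link looks like this (the URLs are illustrative):

```html
<!-- An ordinary link: crawlers follow it and give it credit -->
<a href="https://example.com/trusted-page">A trusted site</a>

<!-- A nofollow link: crawlers are asked not to pass credit to it -->
<a href="https://example.com/untrusted-page" rel="nofollow">An untrusted site</a>
```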

By default, every page is tagged as “index, follow”, but changing this attribute can help you configure your pages in a few different ways.

WordPress Meta Robots Plug-in

For WordPress users, you can install the Meta Robots plug-in. This will allow you to configure each and every post and page from the editor, as well as configure the global settings of your site.

2. Google Webmaster Tools – Remove URL Tool

You can use Google’s Remove URL tool in Google Webmaster Tools for an emergency URL removal. This tool can be found under “Crawler access”, as shown in the image below:


You’ll notice at the top of the page that there are three things, one of which you must do before being able to submit a Removal Request. For a directory, page or file to be removed from Google’s index, you must do one of the following (see Google’s URL removal requirements):

  • Make sure the content is no longer live on the web. Requests for the page or image you want to remove must return an HTTP 404 (not found) or 410 status code.
  • Block the content using a Meta No Index tag.
  • Block the content using a robots.txt file.

In other words, you must either delete the directory, page or file from your server, or do one of the two things I’ve already discussed.
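The robots.txt route from the list above is a one-file change at the root of your domain. A minimal sketch (the paths are illustrative):

```text
# robots.txt — placed at https://example.com/robots.txt
User-agent: *

# Block crawlers from an entire directory
Disallow: /old-directory/

# Block a single page
Disallow: /duplicate-page.html
```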

Once this has been done, go ahead and click the New Removal Request button. You’ll find that Google gives you options to choose from.


Choose “Remove page from search results and cache”, check the checkbox and submit your request. Your request will be added to a Pending list and, approximately 24 hours later, if all requirements have been satisfied, your request will be fulfilled and your directory, page or file will have been removed from Google’s index.
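Before submitting a removal request, it is worth confirming that a deleted page really does return a 404 or 410 status code, since that is what Google checks. A small sketch using only Python’s standard library (the URL is a placeholder, not a real page):

```python
import urllib.request
import urllib.error

def is_gone(status_code):
    """True if the status code satisfies Google's removal requirement:
    the content is no longer live on the web (404 or 410)."""
    return status_code in (404, 410)

def check_removal_ready(url):
    """Fetch the URL and report whether it is ready for a
    removal request (i.e. it answers with HTTP 404 or 410)."""
    try:
        urllib.request.urlopen(url)
        return False  # the page still answers with 200 OK
    except urllib.error.HTTPError as e:
        return is_gone(e.code)
```

A page that redirects (301/302) to a live page will not count as removed; it has to be genuinely gone.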

Some Stuff to Note

  • Ensure that you do not have any blocked URLs in your XML sitemap.
  • Note that the robots.txt file does not work alone; you will still see your blocked URLs in the index, but without a page title or meta description.
  • DO NOT add Google Analytics tracking scripts to pages you do not want Google to see.
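The first point above (no blocked URLs in your XML sitemap) can be checked automatically by comparing the sitemap against your robots.txt rules. A rough sketch using Python’s standard library; the sitemap contents and paths here are illustrative:

```python
import urllib.robotparser
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def blocked_sitemap_urls(sitemap_xml, robots_txt_lines, agent="Googlebot"):
    """Return sitemap URLs that robots.txt forbids the given agent
    from crawling -- these should be removed from the sitemap."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt_lines)
    root = ET.fromstring(sitemap_xml)
    urls = [loc.text for loc in root.iter(SITEMAP_NS + "loc")]
    return [u for u in urls if not parser.can_fetch(agent, u)]

sitemap = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/keep-this/</loc></url>
  <url><loc>https://example.com/old-directory/page.html</loc></url>
</urlset>"""

robots = ["User-agent: *", "Disallow: /old-directory/"]

print(blocked_sitemap_urls(sitemap, robots))
# → ['https://example.com/old-directory/page.html']
```

Any URL this reports is one where your sitemap and your robots.txt disagree, which is exactly the mixed signal the note above warns about.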

In conclusion, by cleaning up your Google index, you should see an increase in the amount of indexed pages in Google, with an associated increase in SEO keywords and traffic in your analytics.

About Author

Saudiqua Thebus

Currently the Senior SEO Strategist at a Cape Town-based digital marketing agency in South Africa and the mind behind Social-icious, a blog focusing on digital marketing. She has over 3 years' experience in online marketing and holds a Bachelor of Commerce degree in Information Systems from the illustrious University of Cape Town. She also consults on a freelance basis, has a passion for fast cars and good food, and is a single mum. :) You can follow Saudiqua on Twitter or connect via LinkedIn.

12 comments
Joel Gray

I'm quite impressed with the information you have given. I've been checking helpful sites like yours and ones such as http://www.TrafficTips101.com to widen my knowledge of improving website traffic. Thanks a lot!

autobaclink

One thing: traffic building is something that takes time, whether using SEO (search engine optimization), link building, blog carnivals, social networking, or other methods. Your post about Improve Your Website Traffic by Cleaning Up Your Google Index is a great post and quite informative. You write very well and keep it “simple”, easy, yet direct. Keep up the great work.

Kevin Tan

Thanks Phil and Mark for your suggested solution. I did the 301 redirect method to get rid of the indexed pages.

Phil

If those pages have PR, I would take a different approach and recreate the same content and same-date posts with the help of MySQL, but a 301 is still best.

Mark Thompson

Yes, recreate your XML sitemap and 301 redirect the pages that have PR.

Kevin Tan

Some non-existing pages still have PageRank on them and keep receiving traffic that hits 404 errors. So what I need to do now is just reformat my XML sitemap for Google and let the non-existing URLs expire (from Google's index database) automatically? Correct me if I'm wrong.

Mark Thompson

I wouldn't recommend blocking your entire site within Webmaster Tools. If you can 301 redirect the pages that are sending you traffic to the new corresponding pages, that will help keep the value of those pages. If the pages no longer exist, Google will start to drop them from its index automatically. I would also recommend recreating your XML sitemap and submitting it to Webmaster Tools with the proper URLs.

Kevin Tan

If I select to remove my entire site from Google Webmaster Tools, will Google re-index or reset the index status of my blog? I have too many URLs that need to be removed because I lost my WordPress database earlier, and Google is still sending me a lot of traffic to those non-existent URLs. I'm receiving too many 404 error reports each day. What would be your suggestion?

Claire

Hi, I even thought about another thing in your article and came back to re-read it! Told you it was useful!

Saudiqua

Thanks for your comment. I'm glad that you enjoyed the post and that it was of some use to you. :) Saudiqua

Claire Jones

A good round up and some really good notes on areas so often overlooked. Much appreciated and now implemented. Thanks Saudiqua.
