A question I hear a lot from new AOM users is “Why isn’t my site indexed yet?” People with little or no Internet savvy may have some preconceived ideas about affiliate marketing in general, and AOM in particular. This month I would like to discuss some of the general realities of indexing, hopefully to separate fact from myth.
I long ago lost count of how many times I’ve been asked why a certain site has not been indexed yet. Often it’s a site that’s been up for a week or two, or at any rate less than a month. Sites can be indexed within a few hours, or it can take several days. Or it may never happen. An important point that needs to be made is that:
Search engines are not required to index any site.
Anything you’ve heard to the contrary is a myth. Most sites are indexed eventually, but not all. And there’s little you can do to influence or speed up the process. Let’s have a quick overview of what indexing means, for those who have heard the term but do not know what it refers to, or why it should matter to them.
Search engines such as Google, Yahoo and Bing use programs called spiders (also known as ‘bots, short for robots) that crawl the Internet, following links on websites to reach new sites. The sites that spiders find are compiled in an index. The indexes are used to provide search results when you query the engine. Or to put it another way, when you ‘google’ something, the results you get are drawn from the indexes the spiders make. So if you look up fruitcake, you’ll get a list of sites that feature fruitcakes: recipe sites, cooking sites, grocery sites, crazy people sites, and so on.
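To make the crawl-then-index idea concrete, here is a minimal sketch in Python. It is only a toy: the four sites, their text and their links are all made up for illustration, and real search engines are vastly more complicated. The spider starts at one site it already knows, follows links to reach others, and records which words appear on which sites.

```python
from collections import deque

# Hypothetical miniature "web": every site name, text and link is invented.
WEB = {
    "recipes.example": {"text": "fruitcake recipe with rum",
                        "links": ["cooking.example"]},
    "cooking.example": {"text": "cooking tips and fruitcake history",
                        "links": ["grocery.example", "recipes.example"]},
    "grocery.example": {"text": "buy fruitcake and flour",
                        "links": []},
    # This site links out, but no other site links to it.
    "newsite.example": {"text": "best fruitcake reviews",
                        "links": ["recipes.example"]},
}


def crawl(seed):
    """Follow links outward from `seed` and return {word: set of sites}."""
    index = {}
    seen = {seed}
    queue = deque([seed])
    while queue:
        site = queue.popleft()
        page = WEB[site]
        # Record every word on this site in the index.
        for word in page["text"].split():
            index.setdefault(word, set()).add(site)
        # Queue any linked site the spider has not visited yet.
        for link in page["links"]:
            if link in WEB and link not in seen:
                seen.add(link)
                queue.append(link)
    return index


def search(index, word):
    """Return the indexed sites that contain `word`, in alphabetical order."""
    return sorted(index.get(word, set()))


index = crawl("recipes.example")
print(search(index, "fruitcake"))
print(search(index, "reviews"))
```

In this toy, the search for fruitcake returns the three sites the spider could reach by following links. The search for reviews returns nothing, because the only site containing that word has no links pointing to it and the spider never arrives there.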
A site that’s been indexed may eventually acquire PageRank. This is Google’s term for a rough measure of how ‘popular’ a site is. A PageRank (or ‘PR’) of 1 is better than 0. A PR of 2 is better than 1, and so on. It’s believed that each level of PR is roughly ten times harder to reach than the previous level. So whatever it takes to reach PR 1 (links, traffic and so on; Google does not reveal the exact metrics), you would need ten times more to achieve PR 2. Ten times more than that for PR 3 (or 10 x 10 times more than PR 1). Ten times more again for PR 4 (or 10 x 10 x 10 times more than PR 1). And so on. To make it more confusing, PR is not the same thing as the page of search results your site appears on. The Holy Grail is to be at the first position on page one of the search results. Or at least anywhere on the first page. Studies show most web surfers are too lazy to look much farther than the first page of results.
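The arithmetic is easy to lose track of, so here is a short Python sketch of it. It assumes the widely repeated but unconfirmed belief that each PR level takes ten times as much as the level before; the function name and the factor of 10 are illustrative assumptions, not anything Google has published.

```python
def relative_effort(pr, base=10):
    """Multiple of the PR 1 effort needed to reach level `pr`.

    Assumes each level costs `base` times the previous one, which is
    a popular belief rather than a documented Google formula.
    """
    if pr < 1:
        raise ValueError("this sketch only covers PR 1 and above")
    return base ** (pr - 1)


for pr in range(1, 5):
    print(f"PR {pr}: {relative_effort(pr)}x the PR 1 effort")
```

Under that assumption, PR 4 works out to 1000 times whatever it took to reach PR 1.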
When someone sets up a website, then, their primary aim is to get the site indexed, so that it shows up in a search for relevant keywords. The problem is that new sites tend to exist in a vacuum; if spiders hop from site to site by following links, a new site won’t have any links pointing to it, so the spiders never find it to add it to the index. It’s a bit like building a new house somewhere where there are no roads. You tend to not get much (if any) mail that way. So how do you get the search engines to index your site? Here are a few options to try:
Build links – These days, exchanging reciprocal links with other sites is not seen as being as important as it once was. There is some truth to this, since many site owners would trade links with other sites that are also not indexed, or with ‘link farms’ or directories that may actually harm a site’s chance of getting indexed. But a ‘quality’ link from a site that’s seen as an ‘authority’ on a particular keyword can be a blessing. If a high-traffic site for fruitcake happened to mention your site, it would drive a lot of traffic to you, and increase your chances of getting indexed. This is why spammers love blogs so much (even though blogs usually have a ‘nofollow’ directive for spiders). You can pay for backlinks to your site (links from sites that point ‘back’ to yours, without you linking to them in return), but there is still much debate as to whether this really helps or not.
Webmaster Tools (Google)/Site Explorer (Yahoo) – These are free services offered by the search engines (Bing has one as well) that are generally a good way to get a site indexed. They notify the engine directly about your site, which usually means a spider will come to visit eventually. Google especially offers a lot of information about best practices to keep your site out of trouble and make it spider-friendly. This is probably one of the best ways to get a site indexed, but bear in mind that while the site may be examined by spiders, that’s still no guarantee it will be indexed.
Some people believe that paid advertising, such as Google’s AdWords (whose ads are displayed on AdSense publisher pages), will help you get indexed. It won’t. It can, however, create ghost listings. These are search results for your site that lead to pages your ad was displayed on the last time the page was read by a spider. So a search for ‘www.mysite.com’ will provide results, but none of them currently have a link to ‘mysite.com’.
While it’s easy to fall into the trap of building ‘just’ for search engines, it’s crucial to remember that spiders and humans both love original content. Nobody wants to shop at sites that are just lists and lists of products. They want to read about the merchandise, gather opinions, discover new things. The days of cookie-cutter sites riding high in the listings are over. If you add information relevant to your site, you increase your chances of getting people to link to you, being indexed, and attracting traffic. And that’s the point, right?