Ensuring Your Site Is Indexed by Search Engines
Search engines use what are known as "robots", or "spiders", to continuously crawl around the internet looking for web pages.
These "spiders" essentially take a copy of each page they find and send it to the search engine's data centers for indexing.
The indexing process applies the ranking criteria explained in the next section.
But you need to ensure your site is "spiderable" in the first place.
To this end, "spiders" must be able to read all the relevant content on your site and reach all the relevant pages by following the links between them.
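As a rough illustration of what being "spiderable" means, the following Python sketch approximates what a very basic "spider" might take from a single page: the readable text and the plain links it could follow. It is a simplification under stated assumptions, not how any particular search engine works. The names SpiderView and spider_view, and the example.com URL in the usage note, are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class SpiderView(HTMLParser):
    """Collects the readable text and plain <a href> links of one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self.text_parts = []
        self._skip_depth = 0  # greater than zero while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            # Assumption: a basic spider cannot follow links that only
            # work by running script, so those are ignored here.
            if href and not href.lower().startswith("javascript:"):
                self.links.append(urljoin(self.base_url, href))

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when it is outside script and style blocks.
        if not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


def spider_view(html, base_url):
    """Return (text, links) roughly as a simple spider might see them."""
    parser = SpiderView(base_url)
    parser.feed(html)
    parser.close()
    return " ".join(parser.text_parts), parser.links
```

Calling spider_view(page_html, "http://www.example.com/") on one of your own pages returns the text and the absolute URLs of its followable links. If important content or pages are missing from that result, a simple "spider" may well miss them too.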
What "Spiders" Like to See
"Spiders" essentially replicate the functionality of the earliest web browsers. Thus they are happiest when your web site is made up of:
1) Static HTML, ASP, PHP or other web-standard files (Word documents, PDFs etc. are "spiderable", too, but are not as good for ranking purposes).
2) Text-based links to the other pages of the site.
3) "Clean" pages, with little extraneous code.
What "Spiders" Don't Like to See
Though search engine "spiders" are improving, it is still best practice to assume they behave like earlier versions, which have difficulty reading:
1) Flash.
2) Java-based links.
3) Frames.
4) Lots of code, which can often be housed in separate files and referenced from the web page, rather than being embedded in the page itself.
5) Dynamic pages with many query string variables.
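The list above can be turned into a rough automated check. The sketch below scans a page's HTML for each item and reports what it finds. It is illustrative only: the thresholds MAX_QUERY_VARS and MAX_INLINE_CODE_CHARS are assumed values rather than limits published by any search engine, the names SpiderAudit and audit are hypothetical, and "Java-based links" is interpreted here as both javascript: links and Java applets.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse, parse_qsl

MAX_QUERY_VARS = 2            # assumed threshold, not a documented limit
MAX_INLINE_CODE_CHARS = 2000  # assumed threshold, not a documented limit


class SpiderAudit(HTMLParser):
    """Flags page features that older spiders had difficulty reading."""

    def __init__(self):
        super().__init__()
        self.issues = set()
        self.inline_code_chars = 0
        self._in_code = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        values = " ".join(v.lower() for v in attrs.values() if v)
        if tag in ("frame", "frameset"):
            self.issues.add("frames")
        elif tag == "applet":
            self.issues.add("java applet")
        elif tag in ("object", "embed"):
            if ".swf" in values or "shockwave-flash" in values:
                self.issues.add("flash")
        elif tag in ("script", "style"):
            # Code referenced from an external file is not counted.
            self._in_code = "src" not in attrs
        elif tag == "a" and attrs.get("href"):
            href = attrs["href"]
            if href.lower().startswith("javascript:"):
                self.issues.add("script-based link")
            query = urlparse(href).query
            if len(parse_qsl(query, keep_blank_values=True)) > MAX_QUERY_VARS:
                self.issues.add("many query string variables")

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self._in_code = False

    def handle_data(self, data):
        if self._in_code:
            self.inline_code_chars += len(data)


def audit(html):
    """Return a sorted list of potential problems found in the page."""
    parser = SpiderAudit()
    parser.feed(html)
    parser.close()
    if parser.inline_code_chars > MAX_INLINE_CODE_CHARS:
        parser.issues.add("lots of inline code")
    return sorted(parser.issues)
```

An empty list from audit(page_html) does not guarantee that a page will be indexed. It only means that none of the patterns listed above were detected.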
