Most leading search engines, such as Google and Yahoo, use crawlers to find web-pages for their algorithmic search results. If a page is linked from another page that is already in a search engine's index, it does not need to be submitted, because it will be found automatically. Some search engines offer a paid, pay-per-click submission programme, which guarantees that the submitted pages are crawled. However, guaranteed crawling does not guarantee a natural ranking within the usual search results.
A search engine crawler takes a number of different factors into account when crawling a web-site, and not every page on a site will be indexed by any particular search engine. One such factor is the distance of a page from the root directory of the site: pages buried many levels deep are less likely to be crawled and indexed.
Webmasters, or an outsourced search engine optimisation consultant, can instruct spiders not to crawl certain files and/or directories through the standard robots.txt file placed in the root directory of the domain. This keeps undesirable content out of the search indexes. In addition, an individual web-page can be explicitly excluded from a search engine's database by means of a robots meta tag. Even so, a search engine may occasionally still list one of these pages, because a copy can remain cached from a previous crawl. In March 2007, Google warned webmasters that they should prevent the indexing of internal search results, because those pages are considered search spam.
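The two exclusion mechanisms described above can be sketched as follows. This is an illustrative robots.txt fragment; the directory names are hypothetical examples, not taken from any real site:

```text
# robots.txt -- served from the root of the domain (e.g. example.com/robots.txt)
# Apply the rules to all crawlers:
User-agent: *
# Keep internal search results out of the index (hypothetical path):
Disallow: /search/
# Keep a private directory from being crawled (hypothetical path):
Disallow: /private/
```

For per-page exclusion, the standard robots meta tag is placed in the page's head section, for example: <meta name="robots" content="noindex, nofollow">. Note that robots.txt only asks crawlers not to fetch the files; a page blocked this way can still appear in results if other sites link to it, which is why the meta tag is the more reliable way to keep a specific page out of the index.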
There are a variety of other search engine optimisation methods for getting a web-page indexed and shown in the search engine results. These include cross-linking between pages of the same web-site. An SEO firm may also attract more inbound links to the main pages of the web-site, which in turn improves the link-based ranking scores, such as PageRank, used by the search engines; in less scrupulous hands this becomes a link-farming campaign or comment spam. A webmaster will also work relevant keywords and keyword phrases into the content text and meta tags of the page; taken to excess, this is keyword stuffing. Search engine optimisation can also involve title and URL optimisation for any particular web-page, and will try to avoid canonical issues, where the same page is reachable at duplicate URLs.
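As an illustration of the on-page elements mentioned above, a page's head section might look like the sketch below. The domain, title and descriptions are hypothetical, and the rel="canonical" link element shown is one common remedy for duplicate-URL issues, telling engines which address is the preferred version of the page:

```html
<head>
  <!-- Descriptive, keyword-relevant page title -->
  <title>Hand-Made Oak Furniture | Example Co.</title>
  <!-- Meta tags carrying the page's keywords and description -->
  <meta name="keywords" content="oak furniture, hand-made furniture">
  <meta name="description" content="Hand-made oak tables and chairs, built to order.">
  <!-- Canonical link: names the preferred URL when duplicates exist,
       e.g. when the page is also reachable with tracking parameters -->
  <link rel="canonical" href="https://www.example.com/oak-furniture">
</head>
```

The canonical declaration does not remove the duplicate URLs; it simply consolidates their ranking signals onto the single preferred address.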