Spidering Websites
By Sharon Housley
Website spidering refers to the automated
process of indexing a website by a search engine. An
automated program, known as a web crawler or spider,
will go through a website following the links on each
page, and will gather pertinent information from each
page until it has properly indexed the entire website.
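The link-following process described above can be sketched in a few lines of Python. This is only a toy illustration, not how any particular search engine works: the site is a hypothetical in-memory dictionary (`SITE`), and the names `LinkCollector` and `crawl` are made up for this example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag found in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(pages, start_url):
    """Breadth-first walk over `pages`, a dict mapping URL -> HTML text.

    Returns the URLs in the order they were visited.
    """
    visited = []
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:
            continue  # the link leads outside the toy site
        visited.append(url)
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited


# Hypothetical four-page site, used only for illustration.
SITE = {
    "http://example.com/": '<a href="/products.htm">Products</a> <a href="/about.htm">About</a>',
    "http://example.com/products.htm": '<a href="/">Home</a>',
    "http://example.com/about.htm": "<p>No links here.</p>",
    "http://example.com/unlinked.htm": "<p>Nothing links to this page.</p>",
}

print(crawl(SITE, "http://example.com/"))
```

Note that `unlinked.htm` is never visited, because no other page links to it. A spider that only follows links has no way to discover such a page.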
If a search engine is unable to spider
a website, it may be unable to index some or all of
the content on that site. As a result, the website may
not appear in the search results from that search engine,
even when associated keywords are searched for. Potential
customers may use search engines to seek out a product
or service, but if a website does not appear in the
search results due to missing or incomplete indexing,
that website may be losing out on an opportunity. As
such, it is very important to make sure the search engine
spiders can indeed "crawl" and index your
website.
There are a number of things that webmasters
can do to improve the "crawlability" of their
websites to make them more spider-friendly...
Display Using HTML
HTML is by far the easiest type of content for search
engines to spider. If the webmaster uses scripting or
Flash to display some of the site's content, the search
engine spiders may have a difficult time following the
links.
Use a Sitemap
Sitemaps are simply roadmaps for a website.
The sitemap will help ensure that all the pages on the
website are indexed by the search engine. Create a proper
sitemap for the website, and then submit the sitemap
to the major search engines.
Sitemap Details - http://www.small-business-software.net/ins-and-outs-of-sitemaps.htm
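As a rough illustration, a basic XML sitemap can be generated with Python's standard library. This sketch assumes the sitemaps.org XML format with only the required `loc` element for each page; the function name `build_sitemap` and the example URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string listing each URL in `urls`."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical pages, used only for illustration.
sitemap_xml = build_sitemap([
    "http://example.com/",
    "http://example.com/products.htm",
])
print(sitemap_xml)
```

The resulting text would typically be saved as a file such as sitemap.xml on the website and then submitted to the search engines.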
Robots.txt
A properly-formatted robots.txt file will
help direct search engine spiders to the various parts
of the website that should be indexed, as well as specifying
any parts that should not be indexed. The robots.txt
file should be included in the website's root directory.
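One way to check how a spider might read a robots.txt file is Python's standard `urllib.robotparser` module. In this sketch, the rules in `ROBOTS_TXT`, the directory names, the `ExampleBot` user agent, and the helper `is_crawlable` are all assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: every spider may index the site
# except for the two directories listed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /cgi-bin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url, user_agent="ExampleBot"):
    """Return True if the rules above allow `user_agent` to fetch `url`."""
    return parser.can_fetch(user_agent, url)


print(is_crawlable("http://example.com/products.htm"))
print(is_crawlable("http://example.com/private/report.htm"))
```

With these rules, the first URL is allowed and the second is blocked. A quick check like this can help catch a rule that accidentally hides pages that should be indexed.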
Secure
Keep in mind that a search engine spider
cannot log in, so it cannot follow links behind a
password. Any important web pages that require indexing
should never be located behind a password. Pages on a
secure server (https) that do not require a login can
still be crawled by the major search engines.
Avoid ID=
Avoid using "ID=" or similar
parameters in the webpage URLs. Search engines will
often ignore any URLs that include an "ID="
as a parameter.
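A simple script can flag URLs that carry this kind of parameter. The sketch below is only an illustration: the article does not say which parameter names count as "similar", so the set `SUSPECT_NAMES` is an assumption, as is the function name `has_id_parameter`.

```python
from urllib.parse import parse_qs, urlsplit

# Assumed list of parameter names to flag; adjust as needed.
SUSPECT_NAMES = {"id", "sid", "sessionid"}


def has_id_parameter(url):
    """Return True if the URL's query string uses a flagged parameter name."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    return any(name.lower() in SUSPECT_NAMES for name in query)


print(has_id_parameter("http://example.com/page.php?ID=42"))
print(has_id_parameter("http://example.com/products.htm"))
```

Here the first URL is flagged and the second is not.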
No Frames
Avoid using frames if possible. Content
that is contained in a frame is often not spidered
properly by search engines.
Consider implementing these few easy steps
to increase the spiderability of your website, to help
ensure that the site will be properly indexed.
About the Author:
Sharon Housley manages marketing for FeedForAll (http://www.feedforall.com),
software for creating, editing, and publishing RSS feeds
and podcasts. In addition, Sharon manages marketing for
RecordForAll (http://www.recordforall.com),
audio recording and editing software.
**********************************************************
This article may be used freely in opt-in
publications and websites, provided that the resource
box is included and the links are active. A courtesy
copy of the issue or a link to any online posting would
be greatly appreciated; send an email to sharon@notepage.net.
Additional articles for publication are available
at http://www.small-business-software.net/free-website-content.htm
**********************************************************