Bot

Agent

A piece of software, such as a browser or spider, that requests content from a web server. A browser presents that content to the user as a web page; a search engine spider reads it in order to index it.
Examples: MS Internet Explorer, Netscape Navigator, Opera, Googlebot, Slurp, T-Rex

Applet

A small program, often written in Java, which usually runs in a web browser as part of a web page. Using such a program may cause spiders and robots to stop indexing a page.

Bot

Abbreviation for robot (also called a spider). Bots are software programs that scan the web. They vary in purpose from indexing web pages for search engines to harvesting e-mail addresses for spammers.

Cache

A copy of a web site held on a computer or in a search engine's index. On personal computers, the cache saves a copy of a web site's images, text and code to help speed up download on future visits to the site. On search engines, the cache serves as a record of the content of a web page as it was when the search engine last visited and indexed it.

Crawler

Also called bots or spiders; programs that follow links to visit web sites on behalf of search engines. Crawlers then process and index the code and content of a web page according to an algorithm and store the pages in the search engine's database. Googlebot is the crawler that travels the web finding and indexing pages for the Google search engine.
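
The link-following step can be sketched in a few lines of Python. This is only an illustration of how a crawler might collect the links on a page, not how any particular search engine does it; the class name LinkCollector, the function extract_links and the sample page are made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects absolute URLs from the href attribute of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own address.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return every link found in the given HTML, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


# Hypothetical page content, used only to show the idea.
sample_page = (
    '<p><a href="/about.html">About</a> '
    '<a href="http://www.example.org/">Elsewhere</a></p>'
)
print(extract_links(sample_page, "http://www.thesite.com/"))
```

A real crawler would go on to fetch each collected address in turn, skipping pages it has already visited.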

Googlebot

Googlebot is Google's spider, crawler or bot; it travels the web finding and indexing pages for the Google search engine. Googlebot leaves its identity in your web server's log file. Webmasters should study their server log files closely to check that Googlebot's visits are successful and that it is able to crawl the whole site.
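
As a rough sketch of such a log check, the Python below lists the pages requested by any visitor whose user agent mentions Googlebot, together with the status code the server returned. It assumes the log uses the common "combined" format with the user agent as the last quoted field; the sample lines, addresses and the function name googlebot_visits are invented for the example, and a user agent string can be faked, so this is an indication only.

```python
import re

# Assumed "combined" log format: the user agent is the last quoted field.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"$'
)


def googlebot_visits(lines):
    """Return (path, status) for each request whose user agent names Googlebot."""
    visits = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and "googlebot" in match.group("agent").lower():
            visits.append((match.group("path"), int(match.group("status"))))
    return visits


# Invented log lines for illustration only.
sample_log = [
    '192.0.2.10 - - [10/Oct/2005:13:55:36 +0000] "GET /index.html HTTP/1.1" '
    '200 2326 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
    '+http://www.google.com/bot.html)"',
    '192.0.2.7 - - [10/Oct/2005:13:56:01 +0000] "GET /index.html HTTP/1.1" '
    '200 2326 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
    '192.0.2.10 - - [10/Oct/2005:13:57:12 +0000] "GET /old-page.html HTTP/1.1" '
    '404 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
    '+http://www.google.com/bot.html)"',
]

for path, status in googlebot_visits(sample_log):
    print(status, path)
```

Any status other than 200 in the output (such as the 404 above) points to a page the crawler asked for but could not retrieve.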

Indexed Pages

Pages that are included in a search engine's index. An important step in search engine optimisation is ensuring that your web site's pages are included (indexed) in the search engine databases. Search engines index pages through their spiders or robots.

Robots.txt

Robots.txt is a file in a web site's root directory which spiders are supposed to read to determine which parts of the web site they may or may not visit.
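
For illustration, Python's standard urllib.robotparser module can read a set of robots.txt rules and answer whether a given robot may visit a given address. The rules, the /private/ directory and the helper name may_visit below are made up for the example; a well-behaved spider would fetch the real file from the site's root directory instead.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: every robot is asked to stay out of /private/.
EXAMPLE_RULES = """\
User-agent: *
Disallow: /private/
"""


def may_visit(rules_text, user_agent, url):
    """Return True if the given rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser.can_fetch(user_agent, url)


print(may_visit(EXAMPLE_RULES, "Googlebot", "http://www.thesite.com/index.html"))
print(may_visit(EXAMPLE_RULES, "Googlebot", "http://www.thesite.com/private/data.html"))
```

The first check is allowed and the second is refused. Note that the file is only a request: nothing forces a robot to obey it.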

Robot

Also known as a spider, bot or crawler. The part of a search engine that locates and indexes pages on the Web. Successful search engine optimisation depends on robots finding many or all of a web site's pages.

Trademark Infringement

When a company uses or impersonates the trademark of another company, resulting in confusion or deception of consumers.

Trademark infringements are commonplace online. One example is the use of keywords based on a competitor's brand name for search engine optimisation or pay-per-click advertising, with no mention of the competitor's brand in the advert. In effect, the advertiser is trying to take advantage of the brand strength developed by the competitor to promote their own site.

Another example is the inclusion of the trademark in the text of a competitor's advert. Both kinds of infringement make a good case for the search engine company to act on behalf of the trademark holder.


yourgooglesitemap.com




2004,2005 © yourgooglesitemap.com - Generated 2.918.189 Google Sitemap pages