Kasun’s Blog

Kasun Indrasiri

  • Info

    Department of CSE
    University of Moratuwa,
    Sri Lanka

Search Engine Technology: An In-Depth Analysis (I)

Posted by kasun04 on September 7, 2008

A search engine is an information retrieval (IR) system (see my prior post on IR systems) designed to help find information stored on a computer system. Search engines minimize the time required to find information and the amount of information that must be consulted.
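One common way a search engine avoids consulting every document for each query is an inverted index, which maps each term to the documents that contain it. The following is only a minimal sketch for illustration: the function names `build_index` and `search`, the sample documents, and the whitespace tokenization are assumptions made here, not the design of any particular engine.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every query term (AND semantics)."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Copy the first posting set so the index itself is never modified.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample documents, keyed by document id.
docs = {
    1: "web search engine",
    2: "enterprise search engine",
    3: "desktop search",
}
index = build_index(docs)
print(sorted(search(index, "search engine")))  # [1, 2]
```

The index is built once, so each query only touches the posting sets of its own terms rather than the full text of every document.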

Search engine technology has evolved over the past decade or so, owing to the continuous and rapid growth of the World Wide Web. As information systems grew bigger, the amount of information stored in them became enormous. As an information retrieval system, the search engine has had to evolve to stay in sync with the growing volume of stored data.

The most public and visible form of a search engine is the Web search engine, which searches for information on the World Wide Web. It is arguably also the most important form, because the Web is the world’s fastest-growing distributed, fault-tolerant information system. Enterprise search engines also play a critical role as a business product, since most companies need their own customizable information retrieval system that goes beyond a conventional Web search engine.

Whether for Web or enterprise search, the technology used by both forms is quite similar; in fact, the core (or heart) of both engines looks almost the same. Here is a brief overview of the different forms of search engines.

Web Search Engines

A Web search engine is a search engine designed to search for information on the World Wide Web. Information may consist of web pages, images and other types of files (including PDF, DOC, etc.). Web search products are typically free to use, and the vendors generate revenue through advertising.

Challenges:
  • The number of documents to be indexed (tens of billions)
  • The number of users (hundreds of millions)
Major Vendors:
  • Google
  • Microsoft
  • Yahoo

Enterprise Search Engines

Enterprise search is the practice of identifying and enabling specific content across the enterprise to be indexed, searched, and displayed to authorized users. An enterprise search engine applies search technology to information within an organization. Most enterprise search (ES) vendors tend to focus exclusively on enterprise search and do not also offer Web search or desktop search products.

Challenges:
  • Indexing documents from a variety of sources, such as file systems, intranets, document management systems, e-mail, and databases.
  • Presenting a consolidated list of relevance-ranked documents from these various sources.
  • Integrating structured data, which many applications require both as part of the search criteria and when presenting results back to the users.
  • Enforcing access controls, which are vital if users are to be restricted to documents to which they are granted access by the various document repositories within the enterprise.
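The consolidation and access-control challenges above can be sketched together in a few lines. This is a rough illustration only: the `Hit` class, the `consolidate` function, the group-based permission model, and the assumption that relevance scores from different sources are directly comparable are all hypothetical simplifications, not how any specific enterprise product works.

```python
from dataclasses import dataclass, field


@dataclass
class Hit:
    doc_id: str
    source: str   # repository the hit came from, e.g. "file_system" or "email"
    score: float  # relevance score, assumed comparable across sources
    allowed_groups: set = field(default_factory=set)


def consolidate(hits_by_source, user_groups):
    """Merge per-source hits, drop those the user may not see, rank by score."""
    visible = [
        hit
        for hits in hits_by_source.values()
        for hit in hits
        if hit.allowed_groups & user_groups  # user shares at least one group
    ]
    return sorted(visible, key=lambda h: h.score, reverse=True)


# Hypothetical results returned by three separate repositories.
hits_by_source = {
    "file_system": [Hit("report.doc", "file_system", 0.72, {"staff"})],
    "email": [Hit("msg-17", "email", 0.91, {"hr"})],
    "database": [Hit("row-42", "database", 0.85, {"staff", "hr"})],
}
ranked = consolidate(hits_by_source, user_groups={"staff"})
print([h.doc_id for h in ranked])  # ['row-42', 'report.doc']
```

In practice, scores produced by different repositories are rarely on the same scale, so a real system would need some normalization step before merging; the sketch skips that to keep the filtering and ranking logic visible.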
Major Vendors:
  • Microsoft (FAST Search & Transfer, Convera)
  • Autonomy
  • Dieselpoint
  • Endeca
  • Exalead
  • IBM
  • Oracle
  • Google

Desktop Search

Desktop search is about searching the documents personal to a single user (i.e. a local hard drive and email folders). Most of the leading desktop search products are provided as free downloads or for a small fee per user.

Challenges:
  • Offering search on the user’s local machine without degrading the performance of other applications running locally.
Vendors:
  • Microsoft
  • Google
  • Copernic/Coveo