Year

2004

Paper Type

Master's Thesis

College

College of Computing, Engineering & Construction

Degree Name

Master of Science in Computer and Information Sciences (MS)

Department

Computing

First Advisor

Dr. Behrooz Seyed-Abbassi

Second Advisor

Dr. Yap Chua

Third Advisor

Dr. Arturo Sanchez-Ruiz

Department Chair

Dr. Judith Solano

College Dean

Dr. Neal Coulter

Abstract

As the World Wide Web continues to grow, the tools for retrieving its information must improve at locating web pages, categorizing content, and returning quality pages. Web search engines have enhanced the online experience by making pages easier to find. Search engines have made a science of cataloging page content, but the cataloged data can age, becoming outdated and irrelevant.

By searching pages in real time in a localized area of the web, a real-time search engine guarantees that the information it retrieves is available at the time of the search. The real-time search engine's intriguing premise also poses an overwhelming challenge: because the web is searched in real time, the engine takes longer to execute than traditional search engines. The challenge is to determine what factors can enhance the performance of the real-time search engine.

This research examines three components for reducing the execution time of the real-time search engine: traversal methodologies for searching the web, concurrently executing spiders, and a caching resource. These components represent some basic methodologies to improve performance. Determining which implementations provide the best response can make the real-time search engine a faster and more useful searching tool for Internet users.
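As a rough illustration of how the three components named above can fit together, the following minimal Python sketch combines a breadth-first traversal, a pool of concurrent spider threads, and a shared page cache. It is not taken from the thesis or its attached code: the in-memory `PAGES` dictionary stands in for the live web, and the names `CachingFetcher` and `search`, the depth limit, and the worker count are all hypothetical choices made for this example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a localized area of the web:
# URL -> (page text, outgoing links). A real spider would issue HTTP requests.
PAGES = {
    "a": ("real-time search home", ["b", "c"]),
    "b": ("spider notes", ["d"]),
    "c": ("cache and search notes", ["d"]),
    "d": ("unrelated page", []),
}


class CachingFetcher:
    """Fetches pages and remembers them so repeat requests skip the fetch."""

    def __init__(self, pages):
        self.pages = pages
        self.cache = {}
        self.fetch_count = 0  # number of simulated network fetches
        self.lock = threading.Lock()

    def fetch(self, url):
        with self.lock:
            if url in self.cache:
                return self.cache[url]
        page = self.pages.get(url, ("", []))  # simulated page download
        with self.lock:
            self.fetch_count += 1
            self.cache[url] = page
        return page


def search(start, term, fetcher, max_depth=2, workers=4):
    """Breadth-first search from `start`, fetching each level concurrently."""
    seen = {start}
    frontier = [start]
    hits = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(max_depth + 1):
            if not frontier:
                break
            # Each URL in the current level is handled by a spider thread.
            results = list(pool.map(fetcher.fetch, frontier))
            next_frontier = []
            for url, (text, links) in zip(frontier, results):
                if term in text:
                    hits.append(url)
                for link in links:
                    if link not in seen:
                        seen.add(link)
                        next_frontier.append(link)
            frontier = next_frontier
    return hits


if __name__ == "__main__":
    fetcher = CachingFetcher(PAGES)
    print(search("a", "search", fetcher))  # pages whose text contains "search"
    print(fetcher.fetch_count)             # fetches made by the first search
    print(search("a", "spider", fetcher))  # second search reuses cached pages
    print(fetcher.fetch_count)             # unchanged if the cache was used
```

In this sketch the traversal order is fixed to breadth-first; swapping the level-by-level frontier for a stack would give a depth-first variant, which is one way the traversal methodologies could be compared against each other under the same spider and cache settings.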

bsw code.zip (32 kB)
Code
