Wide Area Information Service
Search Tools

This article is reprinted from the June, 1995 issue of PC Alamode Magazine
© Copyright 1995 by John Woody. All rights reserved.



Continuing our coverage of Internet search tools, we turn to another "stateless" search application. A stateless search application uses the client/server relationship: one part of the application (the server) lies in wait at the remote site, and the other part (the client) queries the Internet, including that site, when it is run. The client does not have to know the exact addresses of the files it is searching for. Instead, it sends the query to remote sites, each of which can answer it if the information or data is available. Internet Gopher and World Wide Web browser searches fall into this category. Another search tool which accomplishes these tasks is the Wide Area Information Service (WAIS), also widely known as Wide Area Information Servers.
 
 

What WAIS Is

WAIS is a client/server search tool which works with collections of data, or databases. It searches indexed material and files based on what they contain, using the index associated with each file to find the data related to the query. Each WAIS server contains a library of indexed subject matter (files, programs, descriptive material, etc.) in databases. One normally thinks of databases as containing mountains of numbers. This is not necessarily true: any digital data can be indexed into a database, especially text articles on a particular subject. There are over 500 WAIS libraries on the Internet now. Subject coverage is not complete, since each library is maintained by volunteers on donated computer time. Subjects with many volunteers are well covered, especially computer science, networking, molecular biology, and religion. Some literature libraries, such as Project Gutenberg, contain most of the classics and other literature. The US Geodetic Survey operates a WAIS server. The Dow Jones Information Service is provided commercially through a WAIS interface. In every WAIS server library, someone has created an index for the server to use during searches. When a text index is built, every word in a file is indexed.
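To make "every word in a file is indexed" concrete, here is a minimal sketch of a full-text index in Python. It is not the actual WAIS indexer; the function name build_index and the two sample documents are hypothetical and exist only for illustration.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map every word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        # Lower-case the text and treat each run of letters or digits as a word.
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


# Hypothetical sample library.
docs = {
    "gopher.txt": "Gopher is a menu based Internet search tool",
    "wais.txt": "WAIS is a search tool for indexed text",
}

index = build_index(docs)
print(sorted(index["search"]))   # ['gopher.txt', 'wais.txt']
print(sorted(index["indexed"]))  # ['wais.txt']
```

A search then only has to look up the query words in the index instead of reading every file again.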
 
 

WAIS Protocol

WAIS conforms to the draft Z39.50 standard, an ANSI standard from the National Information Standards Organization (NISO) for requesting bibliographic information. WAIS itself is a distributed text-searching tool developed by Thinking Machines Corporation, a supercomputer manufacturer. The protocol was developed to ease the search of on-line text. It works by allowing a user to send a combination of keywords in "search strings" to the proper WAIS server site. WAIS uses the client/server model as the basis for operations.
 
 

How WAIS Works

WAIS clients are used to formulate search strings in the form of queries, which are transmitted to the appropriate WAIS server. Keep initial search strings broad to gain the most from the query. WAIS servers return lists of documents which may be of benefit and "score" them for appropriateness. Scores are normalized so that the most appropriate document is given a score of 1000. The others in the listing are ranked below that number in order of appropriateness. The listing is limited to between 15 and 50 documents, depending on the WAIS client you use.
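The normalization described above can be sketched in a few lines of Python. This is an illustration under assumptions, not the WAIS server's own code: the function normalize_scores, the raw scores, and the file names are all hypothetical, and the default limit of 15 is simply the low end of the range mentioned above.

```python
def normalize_scores(raw_scores, top=1000, limit=15):
    """Scale raw scores so the best document gets `top`, then keep `limit` entries."""
    if not raw_scores:
        return []
    best = max(raw_scores.values())
    if best <= 0:
        return []
    # Highest raw score first.
    ranked = sorted(raw_scores.items(), key=lambda item: item[1], reverse=True)
    return [(name, round(top * score / best)) for name, score in ranked[:limit]]


# Hypothetical raw scores for three documents.
raw = {"b.txt": 4, "a.txt": 8, "c.txt": 2}
listing = normalize_scores(raw)
print(listing)  # [('a.txt', 1000), ('b.txt', 500), ('c.txt', 250)]
```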

A Typical Query

One enters the query into a WAIS client by typing, for example, "find me documents that contain 'Gingrich and Republican'". The query goes out on the Internet to the selected WAIS servers to be processed. The "scored" listing is returned by the appropriate server, giving the ranked files. There is a problem in that every word is ranked equally, i.e., "Gingrich", "and", and "Republican" all count in the score. An article with many "and"s may rank higher than another which just has "Gingrich" or "Republican" in it. Also, one cannot tell the query what order the words in the search string should appear in, i.e., there is no "contextual sensitivity" ability in the query. WAIS does, however, have one distinctive feature which makes it one of the best search tools on the Internet. That feature is "relevance feedback". Some clients allow one to find articles that are similar to the results of the original search and retrieve the portions of them which fit the query subject.
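The equal-weighting problem is easy to reproduce. The sketch below assumes the simplest possible scoring, one point per occurrence of any query word, which is a stand-in and not necessarily how a real WAIS server scores documents. The function raw_score and both sample texts are hypothetical.

```python
import re


def raw_score(query, text):
    """Count occurrences of each query word, weighting every word equally."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(words.count(term) for term in query.lower().split())


query = "Gingrich and Republican"
on_topic = "Gingrich spoke to the Republican caucus"
off_topic = "Rain and wind and hail and snow"

print(raw_score(query, on_topic))   # 2
print(raw_score(query, off_topic))  # 3
```

The off-topic text wins only because it repeats "and", which is why dropping common words from the search string tends to give better rankings.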
 
 

Access to a WAIS Server

There are WAIS clients for most standard operating systems and computers, such as Macintosh, DOS, X Windows, NeXT, and UNIX. Both the command line and the graphics interface clients run on top of the TCP/IP protocols. The graphics interface clients are organized in pages where the query and sources are entered. One of the source addresses should be "directory-of-servers.src". Frame your query, select directory-of-servers.src, and wait for the results. Then refine the query and run it again until you get exactly what you want.
 
 

WWW and Gopher

WAIS servers can also be accessed from Gopher and WWW browsers. They can also be reached by Telnetting to a known WAIS server. A basic public WAIS server is located at "quake.think.com"; log in as "wais". In a WWW browser, use a standard URL to start a WAIS search.
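As a rough illustration of such a URL, the sketch below builds one in the form given by the URL standard (RFC 1738): wais://host:port/database?search, with 210 as the default port. The helper wais_search_url is hypothetical, and the database name "directory-of-servers" on quake.think.com is an assumption made for the example, not something confirmed here.

```python
from urllib.parse import quote


def wais_search_url(host, database, query, port=210):
    """Build a WAIS search URL of the form wais://host:port/database?search."""
    # quote() percent-encodes spaces and other characters not allowed in a URL.
    return f"wais://{host}:{port}/{quote(database)}?{quote(query)}"


# Assumed database name, for illustration only.
url = wais_search_url("quake.think.com", "directory-of-servers", "molecular biology")
print(url)  # wais://quake.think.com:210/directory-of-servers?molecular%20biology
```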
 
 

Conclusion

The WAIS server system is an excellent way to search for text-based research data. It adds further research capability to our Internet connections.
 
 

Definitions

Free-net: An organization that provides free Internet access to people in a certain area.

FAQ: Either a frequently asked question, or a list of frequently asked questions about a particular subject, posted so that newcomers will not ask them again.

Gateway: A computer system that transfers data between normally incompatible applications or networks.