Wide Area Information Servers
Search Tools
This article is reprinted from the June, 1995 issue of PC Alamode
Magazine
© Copyright 1995 by John Woody. All rights reserved.
Continuing our coverage of Internet search tools, we turn to another
"stateless" search application. A stateless search application is a program
built on the client/server relationship: one part of the application waits
at the remote server site (the server), and the other part (the client)
sends queries across the Internet, including to that site, when it is run.
The client does not need to know the exact addresses of the files it is
searching for. Instead, it sends the query to remote sites, each of which
can answer if the information or data is available. Internet Gopher and
World Wide Web browser searches fall into this category. Another search tool
that accomplishes these tasks is Wide Area Information Servers (WAIS).
What WAIS Is
WAIS is a client/server search tool that works with collections of data,
or databases. It searches indexed material and files based on what they
contain: for each query, WAIS consults the index associated with a file
to find the data related to that query. Each WAIS server contains
a library of indexed subject matter (files, programs, descriptive material,
etc.) in databases. One normally thinks of databases as containing mountains
of numbers. This is not necessarily so, because any digital data can
be indexed into a database, especially text articles on a particular subject.
There are over 500 WAIS libraries on the Internet now. Subject coverage
is not complete, since each library is maintained by volunteers on donated
computer time. Subjects with many volunteers are well covered,
especially computer science, networking, molecular biology,
and religion. Some literature libraries, such as Project Gutenberg, contain
most of the classics and other literature. The US Geodetic Survey operates
a WAIS server. The Dow Jones Information Service is provided commercially
through a WAIS interface. In every WAIS server library, someone has created
an index for the server to use during searches. When textual data is
indexed, every word in a file is indexed.
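The idea of indexing every word in a file can be sketched in a few lines of
Python. This is only an illustration of a full-text (inverted) index, not the
format a real WAIS indexer writes; the function name build_index and the
two-document library are invented for the example.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map every word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        # Every word is indexed, including very common ones.
        for word in re.findall(r"[a-z0-9']+", text.lower()):
            index[word].add(name)
    return index


# Hypothetical library, invented for illustration only.
library = {
    "gopher.txt": "Gopher menus lead to files on many servers.",
    "wais.txt": "WAIS servers search indexed files by content.",
}

index = build_index(library)
print(sorted(index["servers"]))   # documents containing "servers"
print(sorted(index["indexed"]))   # documents containing "indexed"
```

A search then becomes a lookup in the index rather than a scan of every file,
which is why the index has to be built before the server can answer queries.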
WAIS Protocol
WAIS conforms to the draft standard Z39.50, a National Information Standards
Organization (NISO) ANSI standard for requesting bibliographic information.
WAIS is a distributed text-searching tool developed by Thinking Machines
Corporation, a supercomputer manufacturer. The protocol was developed
to ease the searching of on-line text. It works by allowing a user to send
a combination of keywords, in "search strings", to the proper WAIS server
site. WAIS uses the client/server model as the basis for operations.
How WAIS Works
WAIS clients are used to formulate search strings in the form of queries
which are transmitted to the appropriate WAIS server. Initial search strings
should be kept broad in nature to gain the most from the query. WAIS servers
return lists of documents which may be of benefit and "score" them for
appropriateness. Scores are normalized so that the most appropriate document
is given a score of 1000. Others in the listing are ranked below that number
in order of appropriateness. The listing is limited to between 15 and 50
documents, depending on the WAIS client you use.
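The normalization described above can be sketched as follows. This is a
minimal illustration, assuming the server starts from some raw per-document
score; the function name rank, the raw numbers, and the default limit of 15
are assumptions made for the example, not details of any particular WAIS
server or client.

```python
def rank(raw_scores, limit=15):
    """Scale raw scores so the best document gets 1000, then keep the top few."""
    if not raw_scores:
        return []
    best = max(raw_scores.values())
    if best <= 0:
        return []
    scaled = {name: round(1000 * score / best)
              for name, score in raw_scores.items()}
    # Highest score first; the client shows only the first `limit` entries.
    ordered = sorted(scaled.items(), key=lambda item: item[1], reverse=True)
    return ordered[:limit]


# Hypothetical raw scores, invented for illustration only.
raw = {"a.txt": 8, "b.txt": 4, "c.txt": 1}
print(rank(raw, limit=2))
```

Because scores are scaled against the best match for that one query, a score
of 1000 says only that the document was the best one found, not that it is a
good match in any absolute sense.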
A Typical Query
One enters the query into a WAIS client by typing, for example, "find
me documents that contain Gingrich and Republican". The query goes out
over the Internet to the WAIS servers selected as sources to be processed.
The "scored" listing is returned by the appropriate server, giving the
ranked files. There is a problem in that every word is ranked equally, i.e.,
"Gingrich", "and", and "Republican" all count in the score. An article with
many "and"s may rank higher than another which has just "Gingrich" or
"Republican" in it. Also, one cannot tell the query what order the words
must appear in, i.e., there is no "contextual sensitivity" ability in the
query. WAIS does, however, have one unusual feature which makes it one of
the best search tools on the Internet. That feature is "relevance feedback".
Some clients allow one to mark articles from the original search results,
find other articles that are similar to them, and retrieve the portions
which fit the query subject.
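Both the equal-weight problem and relevance feedback can be illustrated with
a small Python sketch. It is a simplified model written for this article,
not the scoring code of a real WAIS server: the functions score and
feedback_query and the three sample documents are invented, and feedback is
modeled simply as adding the words of a document the user liked to the query.

```python
import re
from collections import Counter


def words(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z']+", text.lower())


def score(query, text):
    """Count every occurrence of every query word, all weighted equally."""
    counts = Counter(words(text))
    return sum(counts[word] for word in set(words(query)))


def feedback_query(query, relevant_text):
    """Relevance feedback sketch: extend the query with a document the user liked."""
    return query + " " + relevant_text


# Hypothetical documents, invented for illustration only.
docs = {
    "news.txt": "Gingrich leads the Republican majority.",
    "filler.txt": "This and that and these and those and more.",
    "house.txt": "The Speaker of the House leads the majority in Congress.",
}

query = "Gingrich and Republican"
for name, text in docs.items():
    print(name, score(query, text))

# The user marks news.txt as relevant and searches again.
better = feedback_query(query, docs["news.txt"])
print("house.txt after feedback:", score(better, docs["house.txt"]))
```

In the first pass the filler document outranks the news story purely because
of its repeated "and"s, while house.txt scores nothing. After feedback,
house.txt picks up a score because it shares words with the article the user
marked, even though it contains none of the original query words.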
Access to a WAIS Server
There are WAIS clients for most standard operating systems and computers,
such as Macintosh, DOS, X Windows, NeXT, and UNIX. Command line and graphical
interface clients all run on top of the TCP/IP protocol suite. The graphical
interface clients are organized in pages where the query and sources are
entered. One of the sources should be "directory-of-servers.src".
Frame your query, select directory-of-servers.src, and wait for the results.
Refine the query with repeated runs to get exactly what
you want.
WWW and Gopher
WAIS servers can also be accessed from Gopher and WWW browsers. They can
also be reached by Telnetting to a known source host. A basic public WAIS
server is located at "quake.think.com" and is logged onto with "wais". In
WWW browsers, use the standard URL form to start a WAIS search.
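As a rough illustration of the URL form, RFC 1738 defines WAIS search URLs as
wais://host:port/database?search, with 210 as the default port. The sketch
below builds such a URL in Python. The function name wais_search_url is
invented for the example, and pairing the database name
"directory-of-servers" with quake.think.com is an assumption made for
illustration.

```python
from urllib.parse import quote


def wais_search_url(host, database, query, port=210):
    """Build a wais:// search URL in the form defined by RFC 1738."""
    # quote() percent-encodes spaces and other unsafe characters.
    return f"wais://{host}:{port}/{quote(database)}?{quote(query)}"


# Host taken from the article; database and query are assumed examples.
print(wais_search_url("quake.think.com", "directory-of-servers", "networking"))
print(wais_search_url("quake.think.com", "directory-of-servers",
                      "Gingrich Republican"))
```

Whether a given browser can follow such a URL depends on whether it has WAIS
support built in or reaches WAIS through a gateway.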
Conclusion
The WAIS server system is an excellent way to search for text-based
research data. It adds further research capability to our Internet
connections.
Definitions
Free-net: An organization that provides free Internet access to people
in a certain area.
FAQ: Either a frequently asked question, or a list of frequently
asked questions about a particular subject, posted so that newcomers
will not ask them again.
Gateway: A computer system that transfers data between
normally incompatible applications or networks.