Searching the Internet. Part 1: About Search Engines

Background

Most programs that let you search the Internet fall into two categories: Search Engines and Web Directories (Subject Guides), although many programs combine these two strategies. A search engine lets you look for a file (text, graphics, webpage, discussion group, etc.) by typing in a key word or combination of key words. To find information from the World Wide Web about Shakespeare, you could, for example, use the search engine Altavista and type in the phrase "William Shakespeare." If you wanted information about the play Hamlet, you could type in Hamlet, and so on. A subject guide, on the other hand, divides information into broad categories and lets you choose the categories most relevant to your topic. If you wanted to use the subject guide in Yahoo to find information about Shakespeare, you could choose the general category Arts, next the category Humanities, then Literature, then Playwrights, and finally Shakespeare. The actual address for Yahoo's page on Shakespeare is
http://dir.yahoo.com/Arts/Humanities/Literature/Authors/Playwrights/Shakespeare__William__1564_1616_/ 
which shows that the route to the Shakespeare page also passes through an Authors category between Literature and Playwrights. Yahoo classifies its material in several different ways, and so there may be more than one correct route to a page. Yahoo also provides you with a key word search engine, which allows you to search its files without having to go through a maze of subject headings.

Choosing between a Search Engine and a Subject Guide

Specific vs. general

Since a search engine allows you to focus very specifically on a term or combination of terms, it is usually the better program to use if you are looking for a very specific fact. For example, if you want to learn about Shakespeare's birthplace, you could type "Stratford upon Avon" in the Altavista text box, and you would find about 2,000 different pages with that phrase.

Subject guides are less useful for finding a specific fact, but more useful if you're not sure what you're looking for or if you want to browse through a subject. If you wanted general background on Bosnia, you could go to Yahoo and select Regional/Countries/Bosnia_and_Herzegovina/. There you would find information classified into such categories as Education, Government, History, Maps, News and Media, Political Opinion, and Travel.
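To make the idea of browsing by category concrete, here is a minimal sketch that treats a subject guide as nested categories. The category names follow the Yahoo path quoted above, but the page titles and the function names browse and list_choices are invented for illustration; this is not how Yahoo itself is implemented.

```python
# Hypothetical miniature directory. Category names follow the Yahoo path
# quoted in the text; the page titles are invented for illustration.
DIRECTORY = {
    "Regional": {
        "Countries": {
            "Bosnia_and_Herzegovina": {
                "History": ["Sample history page"],
                "Maps": ["Sample map page"],
                "News_and_Media": ["Sample news page"],
            }
        }
    }
}

def browse(directory, path):
    """Follow a slash-separated category path and return what is stored there."""
    node = directory
    for category in path.strip("/").split("/"):
        node = node[category]  # raises KeyError if the category does not exist
    return node

def list_choices(directory, path):
    """Return the sorted subcategories a reader could choose next."""
    node = browse(directory, path)
    return sorted(node) if isinstance(node, dict) else []

choices = list_choices(DIRECTORY, "Regional/Countries/Bosnia_and_Herzegovina/")
print(choices)
```

Each step narrows the choices, which is why a subject guide suits browsing better than hunting for one specific fact.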

Inclusive vs. selective

Regardless of whether you're looking for a specific fact or general information, you may find that some programs give you too much information, while some give you too little. Most search engines use an automatic program that captures information from throughout the World Wide Web and indexes the pages so that they can be retrieved by a key word. Altavista is one of the most inclusive of these engines, indexing 31 million pages found on 476,000 servers. A simple search for the word Shakespeare will find about 100,000 pages, including every Web-based course syllabus that has the word Shakespeare on it.
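The indexing step described above can be sketched with a small key word index. This is only an illustration under stated assumptions: the three sample pages and the function names build_index and search are invented, and a real engine gathers and ranks pages in far more elaborate ways.

```python
import re
from collections import defaultdict

# Invented sample pages standing in for the Web.
PAGES = {
    "syllabus.html": "English 201 syllabus: Shakespeare and Marlowe",
    "hamlet.html": "Hamlet, a tragedy by William Shakespeare",
    "cats.html": "Caring for cats and other pets",
}

def build_index(pages):
    """Map each lowercased word to the set of pages that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index

def search(index, keyword):
    """Return the pages indexed under a single key word, sorted by name."""
    return sorted(index.get(keyword.lower(), set()))

index = build_index(PAGES)
results = search(index, "Shakespeare")
print(results)
```

Because every page containing the word is indexed, the course syllabus turns up alongside the page about Hamlet, just as it does in the search described above.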

Subject guides typically are much more selective. They list pages that the author of the guide has found useful, or pages that individuals have submitted to be added to the collection. Using the search engine within Yahoo to find only the Shakespeare sites within the Yahoo subject guide collection will turn up about 174 pages. The pages you find from a subject guide will tend to be more useful than those you find from a search engine, but you will almost always find more pages from a search engine that indexes the entire Web than from a subject guide.

The combined search

Unless you know exactly what you're looking for, the best strategy is often to use both a search engine and a subject guide. You can begin by browsing pages in a Web Directory (subject guide) and from these pages, you can find keywords to plug into a keyword-based search engine.

Suppose you wanted information about the role of the United States in the Bosnian conflict. The phrase "U.S. military involvement in Bosnia" in Altavista yields only about ten sites, few of which would be useful. On the other hand, using the subject guide path Regional/Countries/Bosnia_and_Herzegovina/News_and_Media/Peacekeeping_Forces/ followed by a little browsing takes you to a page that gives the official name of one of the NATO peacekeeping operations, LANDCENT. By switching to Altavista and typing landcent in the simple search text box, you can find a recent NATO press briefing about military activity in Bosnia.

Refining a Key Word Search Using Boolean Operators

Most search engines allow you to focus a search by using certain terms called Boolean operators. The primary Boolean operators are and, or, not, and near. Some engines also use symbols such as + and - to restrict searches to certain words. The example below shows a typical search. Before using Boolean operators on any site, you should read that site's help section.

To find information on U.S. military involvement in Bosnia, you could go to Altavista and then choose Advanced Search, which allows you to use Boolean operators. In the text box, you could type the phrase (without quotation marks) military and U.S. and Bosnia. You'd get about 8,000 pages. In Altavista the first pages listed are generally the most relevant, and the first page turned up by this sample search was the homepage for the United States European Command. Boolean searches take practice, and different engines may vary in how they use Boolean operators.
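One way to picture the four operators is as operations on sets of pages. The sketch below is a simplified illustration, not a description of how Altavista works: the three pages are invented, and the window size used for near is an assumption, since each engine defines "near" in its own way.

```python
# Invented pages, each reduced to the list of words it contains, in order.
PAGES = {
    "eucom": "u.s. military command in europe and bosnia".split(),
    "travel": "travel guide to bosnia".split(),
    "army": "u.s. military history".split(),
}

def pages_with(word):
    """Return the set of pages that contain the word."""
    return {name for name, words in PAGES.items() if word in words}

def near(word_a, word_b, window=10):
    """Pages where the two words occur within `window` words of each other.

    The window size is an assumption; engines define "near" differently.
    """
    hits = set()
    for name, words in PAGES.items():
        pos_a = [i for i, w in enumerate(words) if w == word_a]
        pos_b = [i for i, w in enumerate(words) if w == word_b]
        if any(abs(a - b) <= window for a in pos_a for b in pos_b):
            hits.add(name)
    return hits

# military and u.s. and bosnia: pages that contain all three words
all_three = pages_with("military") & pages_with("u.s.") & pages_with("bosnia")
# military or bosnia: pages that contain at least one of the words
either = pages_with("military") | pages_with("bosnia")
# bosnia and not military: pages with the first word but without the second
excluded = pages_with("bosnia") - pages_with("military")
# military near bosnia, with a hypothetical window of three words
close = near("military", "bosnia", window=3)

print(sorted(all_three), sorted(either), sorted(excluded), sorted(close))
```

In this toy collection, and narrows the results to a single page, or widens them to all three, and the near search with a three-word window finds nothing because the two words sit five words apart.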

The following Diagrams to Illustrate Boolean Operators provide a simple way to visualize how Boolean logic works.

You can also use parentheses (as in algebraic expressions) to combine Boolean operators. The search strategy (pets and cats) or dogs will find both those pages that have the words (pets and cats) and those that have the word (dogs). The search strategy pets and (cats or dogs) will find a smaller number of pages--those that have (pets and cats) and those that have (pets and dogs). For more information, see the help section of your search engine.
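The effect of the parentheses can be checked on a tiny example. The four pages below are invented, and the helper name has is hypothetical; the sketch only shows that moving the parentheses changes which pages match.

```python
# Invented pages, each reduced to the set of key words it contains.
PAGES = {
    "page1": {"pets", "cats"},
    "page2": {"dogs"},
    "page3": {"pets", "dogs"},
    "page4": {"cats"},
}

def has(word):
    """Return the set of pages that contain the word."""
    return {name for name, words in PAGES.items() if word in words}

# (pets and cats) or dogs
broad = (has("pets") & has("cats")) | has("dogs")
# pets and (cats or dogs)
narrow = has("pets") & (has("cats") | has("dogs"))

print(sorted(broad))
print(sorted(narrow))
```

Here the first strategy matches page1, page2, and page3, while the second matches only page1 and page3, because page2 mentions dogs without mentioning pets.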

For an excellent handout on Boolean terms, see Boolean Searching on the Internet from the University at Albany Libraries.

Natural Language Searching

Several newer search engines (for example, Ask Jeeves) allow you to use natural language (as opposed to the more mathematical terms of Boolean logic). At present (January 2001), a search using one of these natural language engines will probably not yield as many results as a precise Boolean search, but it may find some sites that the Boolean search didn't. (Note: as with key words, you should try several combinations and word choices when phrasing a natural language question.) As computers increase in power and speed, the accuracy of natural language search engines should improve.

Note: some GALILEO databases (e.g. ProQuest) now include Natural Language questions.
