Many teachers and trainers are beginning to use the web on a regular basis and are looking for guidance and organizational strategies for what appears to be a chaotic, and certainly a ubiquitous, environment. They find that simply accessing personal web pages does not provide a useful scheme for finding information to perform higher level operations such as research. They are also finding that traditional methods of searching for information do not apply in the distributed resource environment of the world wide web. Educators and trainers need unique searching strategies for efficiently and effectively finding information in this dynamic environment.
There are three main factors in improving searches on the web: 1) efficiency (speed to increase the number of hits per minute), 2) effectiveness (accuracy to get what you wanted to get), and 3) extraction (strategies for extracting information from a text in the most efficient ways possible). This page will concentrate on efficiency (speed) and effectiveness (accuracy), or what I term active searching.
Active searching is an important concept. For example, when you are reading it is often useful to highlight, underline, and annotate the text. On the web it is also necessary to use strategies which improve your efficiency and effectiveness. These include the appropriate use of searching techniques and tools. Active searching also helps keep you focused on the task at hand and avoids wandering...or in web terms...surfing.
When you begin to examine this topic, you will soon find that using just "a search engine" may or may not be good enough to meet the goals of efficiency and effectiveness. Full-text keyword searches are easily implemented and commonly offered by search engines. But full-text searching simply may not be the right tool for the job you have to do...in the way you need to do it.
Productivity in a Ubiquitous Environment
If you have browsed the web, you have surely found that it contains vast amounts of information, some useful to your interests and some not. As the information contained on the web continues to grow exponentially, filtering out what I term non-useful information (e.g., advertisements, personal resumes) has forced me to develop strategies that make my job as an educator easier rather than more complicated. To give you an example of this growth, although no one knows exactly how many documents there actually are on the web now, the Association of Public Data Users and International Association for Social Science Information Service and Technology (1996) state:
Where you only need the shallowest knowledge on the topic, you can use simple locating and sampling strategies (skimming). If you need moderate levels of information, you can use collecting (scanning) strategies. As your need becomes more detailed (i.e., research), you may find it necessary to use conceptual strategies (studying). Although all of these tools may be used for every level of information gathering, using the right tool for the job at hand may just make your job easier.
There are many tips and techniques for searching the web efficiently and effectively. To increase your web searching productivity, try some or all of the active searching tips provided below.
Boolean Operators
Most search engines now offer Boolean capabilities. Boolean operators express different and specific relationships between words and phrases used in the search.
AND limits a search by requiring that each term be present. For example, a search on learning AND cognition specifies that you want information on BOTH learning and cognition. If an article only has the term learning in it, it will not be matched. Using AND will usually produce fewer hits.
OR expands the search by combining discrete terms into a conditional set. Searching for learning OR cognition specifies that you want information on either learning or cognition. Using OR usually produces the most hits.
NOT limits the search by specifying that a term not be present. Searching for learning NOT training will find matches with the term learning but not training.
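The three operators correspond closely to set operations. The following is a minimal sketch, assuming a tiny made-up collection of documents and a simple word index; the document names and the helper functions are illustrative only, and real search engines tokenize and rank in their own ways.

```python
# Hypothetical mini-collection; names and texts are made up for illustration.
DOCS = {
    "doc1": "learning and cognition in the classroom",
    "doc2": "learning through training programs",
    "doc3": "cognition research methods",
}

def build_index(docs):
    """Map each lowercase word to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)
    return index

INDEX = build_index(DOCS)

def search_and(a, b):
    """Documents containing BOTH terms (set intersection)."""
    return INDEX.get(a, set()) & INDEX.get(b, set())

def search_or(a, b):
    """Documents containing EITHER term (set union)."""
    return INDEX.get(a, set()) | INDEX.get(b, set())

def search_not(a, b):
    """Documents containing a but not b (set difference)."""
    return INDEX.get(a, set()) - INDEX.get(b, set())
```

With this sample data, learning AND cognition matches only doc1, learning OR cognition matches all three documents, and learning NOT training matches only doc1, which mirrors the "fewer hits" and "most hits" behavior described above.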
Proximity Operators
With some search engines you can use proximity operators such as OpenText's NEAR operator, Webcrawler's ADJacent operator, or the FOLLOWED BY operator. With each of these operators, word order is important. For example, placing terms in square brackets, such as [learning theory], causes a hit if the terms are found within 100 words of each other (Gray, 1966).
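A proximity test can be pictured as a comparison of word positions. The sketch below is only an illustration under stated assumptions: the function name near, its max_distance and ordered parameters, and the whitespace-based word splitting are all hypothetical and do not reproduce the behavior of any particular search engine.

```python
def near(text, first, second, max_distance=100, ordered=False):
    """Return True if the two terms occur within max_distance words of each other.

    With ordered=True, `first` must appear before `second`
    (a rough stand-in for a "followed by" style operator).
    Words are split on whitespace only; punctuation is not handled.
    """
    words = text.lower().split()
    first_positions = [i for i, w in enumerate(words) if w == first]
    second_positions = [i for i, w in enumerate(words) if w == second]
    for i in first_positions:
        for j in second_positions:
            gap = j - i
            if ordered and gap <= 0:
                continue
            if 0 < abs(gap) <= max_distance:
                return True
    return False
```

For example, near("theory of learning", "learning", "theory") is True, while the same call with ordered=True is False because theory comes before learning in that text.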
Truncation (*)
You can use truncation on most search engines. That is, you can use the asterisk (*) operator to end a root word. For example, searching for teach* will find teacher, teaching, and teachers. Note: the asterisk cannot be the first or second letter of a root word.
Wildcard (?)
You can find words that share some but not all characters using the question mark (?) operator. For example, Johns?n will find Johnson and Johnsen. Note: the ? cannot be the first character in the search.
You may also use combinations of truncation (*) and single character wildcard (?) in your searches.
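Truncation and single-character wildcards behave much like the * and ? patterns used for matching file names. As a rough sketch, Python's standard fnmatch module can mimic them against a word list; the vocabulary and the match_terms helper are made up for illustration, and a given search engine may apply extra rules (such as the position restrictions noted above) that this sketch does not enforce.

```python
from fnmatch import fnmatchcase

# Hypothetical vocabulary standing in for the words a search engine has indexed.
VOCAB = ["teach", "teacher", "teaching", "teachers", "learner",
         "Johnson", "Johnsen", "Johnston"]

def match_terms(pattern, vocabulary):
    """Return the words matching a pattern where * is any run of characters
    (including none) and ? is exactly one character. Matching ignores case.
    Note: fnmatch also treats [...] specially, which this sketch does not use.
    """
    pattern = pattern.lower()
    return [word for word in vocabulary if fnmatchcase(word.lower(), pattern)]
```

Here match_terms("teach*", VOCAB) returns teach, teacher, teaching, and teachers, and match_terms("Johns?n", VOCAB) returns Johnson and Johnsen but not Johnston, because ? stands for exactly one character.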
Throughout the process of searching for information you will find many useful sites. If you do not have time to examine these sites in detail, you may either print them for off-line review or simply set a bookmark to easily return to them later. Although bookmarks are simple to set and will certainly help your overall searching, organizing your bookmarks dramatically increases your efficiency. Netscape, for example, allows you to organize bookmarks into folders. I suggest that you make bookmark folders thematic according to your search topic(s). Now when you add a bookmark, you can "drag" and "drop" it in the thematic folder of your choice (Win95 and Mac). This will certainly make returning to these sites for further information or citations a much easier task. Additionally, you can export these files to share with your colleagues for collaborative working arrangements.
Search tools are certainly proliferating on the web. These tools have grown from early naive indexing tools to those that now use a form of artificial intelligence algorithms termed heuristics. Heuristic searching tools are designed to aid the user in learning, discovering or problem solving through self-educating techniques (i.e., feedback) to improve performance.
It is important to note that there is no best tool for searching the web. Different search engines have different attributes. As stated earlier, the choice of search tool depends on what you want to know, and on knowing which tool(s) will best help you find that information efficiently and effectively.
To determine which search engine(s) you should use to aid you in your task, you need to know a little about various strategies they use and features they provide.
I have grouped search strategies into five categories (rating, sampling, locating, collecting, and concept searching). Search engines may be segregated into one or more of these categories. As search engines continue to develop, many are integrating multiple strategies into their capabilities and thus blurring these categorical "lines." But for now, I have defined each of these categories to help you understand the uses of various search engines.
The decision regarding which search engine to use depends upon your knowledge of how an engine searches and indexes web pages. To better understand this, let's look at a few examples. The Lycos indexing search engine examines only specific parts of a web page such as the title, headings, and the most significant 100 words, whereas Webcrawler examines every word on a web page (Webster & Paul, 1996). But these are not the only criteria to consider when selecting a search engine. The size of the database (i.e., listings) is also a major factor.
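The practical effect of partial versus full-text indexing can be sketched in a few lines. This is only an illustration under loud assumptions: the page dictionary layout and both function names are hypothetical, and taking the first word_limit words of the body is a simple stand-in for "most significant" words, not the method any actual engine uses.

```python
# Hypothetical page record; the field names are assumptions for this sketch.
PAGE = {
    "title": "Learning Theory",
    "headings": ["Cognition"],
    "body": "an overview of instruction with a late mention of constructivism",
}

def partial_terms(page, word_limit=100):
    """Terms from the title, the headings, and the first word_limit body words.

    Using the FIRST words is a stand-in for 'most significant' words.
    """
    terms = set(page["title"].lower().split())
    for heading in page["headings"]:
        terms.update(heading.lower().split())
    terms.update(page["body"].lower().split()[:word_limit])
    return terms

def full_text_terms(page):
    """Terms from the title, the headings, and every word of the body."""
    terms = partial_terms(page, word_limit=0)
    terms.update(page["body"].lower().split())
    return terms
```

With word_limit=3, the word constructivism near the end of the sample body is missing from the partial index but present in the full-text one, so a search on that word would find the page only through the full-text approach.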
I have provided hyperlinks to examples for each of the five categories for your examination and better understanding of their uses. I recommend that after you have reviewed these search engines that you bookmark those that are most useful to you for this course and for your professional work. This way you will not have to continually return to this web page to access your preferred search tools.
Rating Strategy (rating and reviews) - Finding rated and reviewed sites.
Use: When you want to find out how others have rated topical sites.
Sampling Strategy (subject trees) - Finding a few high quality sources based on topics.
Concept Searching (heuristics, fuzzy matching and relevancy matching) - Find information on topics using feedback.
Searching Educational Databases
Some of the most useful research tools to be introduced in the last ten years are the CD Database Search Systems. These tools allow you to search such databases as ERIC, Dissertation Abstracts, etc. Today, these tools are migrating to the web. If you would like to perform database searches on the web, you can access the Reference Services Subject Guide on the OSU web site. This site will provide you with links to database search tools for numerous topics. Among these links is the new FirstSearch Web Service. This service will allow you to search education databases such as ERIC, Education Abstracts, and many others. Give them a try; you will find these links useful whenever you need to research a topic.
SEARCH.COM provides a unique feature for a regular search engine user...the ability to build a personalized search engine page. This is a useful tool in that the personalizer will guide you through the process of gathering the search engines that are most helpful to your work.
You can put all your favorite search engines onto your own personal page. All you need is a few minutes and a cookie-capable browser (Netscape Navigator or Microsoft Internet Explorer). The three steps are: 1) select the categories you're interested in; 2) from the next page, select the search engines you're interested in (you can also add a few of your favorite Web links); 3) from the next page, tell the developers the order in which you'd like to list your choices. That's it; you should find it a very useful tool. You can go to the SEARCH.COM page now if you would like to create your personalized grouping of search engines.
Robots, Spiders, Worms, WebAnts, and Agents
Most search engines create indexes that are compiled by computer programs known as robots, spiders, webants, or worms. Robots and spiders are the same thing, but worms are technically different in that they are replicating programs, whereas WebAnts are distributed cooperating robots. These programs traverse the web, examine each document and index it or enter it into a database, and then recursively retrieve all documents that it references (Koster, 1996). In other words, these robots follow the hyperlinks to other documents and index those also. Agents have numerous meanings in the computing arena. Agents are programs which act autonomously on a task. The most common agents found on the web are: Autonomous Agents, which are programs that travel between sites and decide based on algorithms when to move and what to do; and Intelligent Agents, which are programs that help users with things such as forms and heuristics. They choose a product, guide a user through forms, or help users find information.
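The traversal described above can be sketched without touching the network. The example below assumes a small made-up link graph in place of real web pages, and it uses a queue (breadth-first order) rather than literal recursion; the page names and the crawl function are hypothetical, and a real robot would fetch and parse each page where the comment indicates.

```python
from collections import deque

# Hypothetical link graph standing in for the web: page -> pages it links to.
LINKS = {
    "home": ["about", "courses"],
    "about": ["home"],
    "courses": ["syllabus", "home"],
    "syllabus": [],
}

def crawl(start, links):
    """Visit every page reachable from start exactly once, in breadth-first order."""
    visited = []
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        visited.append(page)  # a real robot would fetch and index the page here
        for target in links.get(page, []):
            if target not in seen:  # skip pages already queued or visited
                seen.add(target)
                queue.append(target)
    return visited
```

Starting from home, this visits home, about, courses, and syllabus. The seen set is what keeps the robot from looping forever when pages link back to each other.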
There are those who believe that robots are bad for the web. In fact, some poorly constructed robots can overload networks and web servers and therefore may need attention if you have set up a web server. If you are concerned that you are being visited by robots and that they may be causing problems with your system, check your server logs. According to Koster (1996), robots may retrieve large numbers of documents from your site in a very short period of time. One way to check this is to examine your user-agent logging; when you notice a site repeatedly looking for the file '/robots.txt', it is likely that the visitor is a robot. If you have a medium or high performance server, it should be able to deal with a high load of several requests per second.
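A log check of this kind can be roughly automated. The sketch below assumes your server writes access logs in the Common Log Format, which records the requesting host and the requested path but not the user agent; the sample lines, host names, and the find_likely_robots function are made up for illustration, and your own server's log layout may differ.

```python
import re
from collections import Counter

# Assumes Common Log Format: host ident user [date] "METHOD path protocol" status bytes
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(\S+) (\S+)[^"]*" \d{3} \S+')

def find_likely_robots(lines):
    """Return {host: total_requests} for hosts that asked for /robots.txt."""
    requests = Counter()
    asked_for_robots = set()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        host, _method, path = match.groups()
        requests[host] += 1
        if path == "/robots.txt":
            asked_for_robots.add(host)
    return {host: requests[host] for host in asked_for_robots}

# Made-up sample entries for illustration only.
SAMPLE = [
    'crawler.example.com - - [10/Oct/1996:13:55:36 -0700] "GET /robots.txt HTTP/1.0" 200 120',
    'crawler.example.com - - [10/Oct/1996:13:55:37 -0700] "GET /index.html HTTP/1.0" 200 2326',
    'user.example.org - - [10/Oct/1996:13:56:01 -0700] "GET /index.html HTTP/1.0" 200 2326',
]
```

Calling find_likely_robots(SAMPLE) flags crawler.example.com with two requests and ignores user.example.org. A high request count for a flagged host is a hint that the robot is putting a noticeable load on the server.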
If you would like to examine a database of web robots, you can visit the Webcrawler Robots Page.
Heuristics, Metaheuristics, and Knowledge Management Tools
A number of new search tools are starting to crop up that take search strategies to new levels. These new tools are based on heuristics, metaheuristics, and the management of knowledge. These differ from index search engines and even meta-index search engines. These are "smart" search tools that learn from your input and provide feedback to assist you with more efficiently and effectively finding and managing information. A few of these tools are: Saqqara's Step Search, the Inso Search Wizard, and QUARTERDECK's Web Compass.
Inso Search Wizard is a concept based search technology. The technology helps users figure out what they really mean and want to find. The Inso Search Wizard uses what is termed "computational linguistics" to conceptually search for specific information.
QUARTERDECK's Web Compass searches all search engines with a single command, purges duplicates, publishes results, and keeps watch for updates. QUARTERDECK terms this tool "your personal research assistant on the World Wide Web, Internet, and Intranet." This is certainly the way new searching toolsets are going. It is no longer enough for search engines just to perform searches; the new tools will manage your data in an intelligent way.
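The idea of querying several engines and purging duplicates can be illustrated with a short sketch. This is not how Web Compass itself works, which the source does not describe; the merge_results function and the example URLs are hypothetical, and treating addresses as equal after trimming whitespace and a trailing slash is a deliberately crude assumption.

```python
def merge_results(*result_lists):
    """Merge result lists in the order given, keeping the first copy of each URL.

    Two URLs count as duplicates if they match after trimming surrounding
    whitespace and any trailing slash (a crude rule, assumed for this sketch).
    """
    seen = set()
    merged = []
    for results in result_lists:
        for url in results:
            key = url.strip().rstrip("/")
            if key not in seen:
                seen.add(key)
                merged.append(url)
    return merged

# Made-up result lists standing in for the output of two search engines.
ENGINE_A = ["http://a.example/x", "http://b.example/"]
ENGINE_B = ["http://b.example", "http://c.example/y"]
```

Here merge_results(ENGINE_A, ENGINE_B) keeps three entries: the b.example address returned by both engines appears only once, in the form the first engine reported it.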
These are just a few of the strategies and new tools teachers and trainers can use to make working on the web more productive. As teachers and trainers continue to use the web they will soon see the next generation of web "knowledge tools" begin to emerge. These will include multidimensional tools that are created to manage data on the web using factors such as "virtual neighborhoods of information," "organic structuring," and "mental model based searching and flying mechanisms" (Eichmann, 1994). These are tools which are intended to make the world wide web more manageable for the user.
Let us now go back to my original statement...that the goals of search strategies and engines should be to increase your efficiency and effectiveness when looking for information on the web. Only you can decide which search/knowledge management strategies and tools actually improve your productivity. It is my hope that this article helps you with making these decisions.