Evaluating Quality on the Net

by Hope N. Tillman, Director of Libraries, Babson College, Babson Park, Massachusetts

May 30, 2000 version. It was originally created in 1995 with the title "Evaluating the Quality of Information on the Internet or Finding a Needle in a Haystack" as a presentation delivered at the John F. Kennedy School of Government, Harvard University, Cambridge, Massachusetts, September 6, 1995. I expect the paper to continue to evolve and appreciate feedback.

I see most of my talk as pure common sense from a librarian's standpoint. We need to use the same critical evaluative skills in looking for information on the Internet that we would use with a book, a paper index, a musical score, or an online commercial database. The content of the Internet is only more diverse because of the potential of interaction with more media. By media, I mean not just audio and video but all forms of technology-assisted communication.

With the growth of information on the Internet and the development of more sophisticated searching tools, there is now a greater likelihood of finding information and answers to real questions. But within the morass of networked data are both valuable nuggets and an incredible amount of junk. How should users today approach searching on the net and critically evaluating the data they find?

You need a systematic approach to evaluating the tools you will use for searching, including what they will cause you to receive or keep you from receiving. You also need a systematic approach to evaluating the document or result that you receive from your search. As information professionals we are in the best position to determine and expand the relevance of existing criteria to new and future formats.

What this paper will address:
How should we look at Internet information?

Consider the continuum of information on the net as opposed to the continuum in print. Is it really any different? And if so, what makes that difference?

In print: vanity to very scholarly/specific
The "home page" may be nothing more than a form of vanity or self-publishing. Within what I might characterize as vanity would be the sites where an individual decides to share working papers or information they have been working on for a dissertation. Many home pages have been through a rigorous review process and should not be equated with the term "vanity."

Vanity publishing

A vanity work may be a very specific document that has information of great value, but it hasn't been through the peer review process intrinsic to scholarship or it hasn't been disseminated by the trade publishing industry. Heretofore, vanity and short-run specialty publishing have been possible in print and can be "quality" in nature, although their value may not be as easy to determine without analysis. Such a work will not have some of the visual clues which facilitate the viewer's critical analysis.

My grandfather had my grandmother's childhood memoirs published and distributed to family and friends. I always thought of it as a very entertaining and pretty well written story of a little girl growing up as part of an acting troupe in the Midwest. The title was "A Little Girl Goes Barnstorming." Reading it now, I think it belongs in the history of the American stage in the late nineteenth century. How did it really differ from regular publishing? It was carefully edited but no publisher was involved. We look to publishers to give us assurance of added value and to provide quality control -- both editorial review and adherence to standards. While the term vanity press is a derogatory one, the content of what comes out of a vanity press may not be bad. But it is, from an information professional's standpoint, much more suspect. It lacks any of the trappings that scholarly publishing affords.

Grey literature is another category -- pamphlets, preprints, technical reports. I am not sure the Internet is any better or worse in its indexing than were the subject-based vertical files of my early library career years.
ERIC has played a valuable role in giving us access to some of the gray literature for the education and library profession. I would think anything that is submitted to ERIC today probably could find its way onto the Web as well, and probably should.

Professional associations have played a historical role in the indexing of hard-to-find materials within their scope. For instance, in 1972 the American Gas Association formed the Library Services Committee to participate in information sharing among members, including the preparation of bibliographies of concern to the industry, a directory of gas industry libraries, and a union list of reference tools and services. (Shirk, Virginia R. and Davis, Marc L. "Gas Libraries: An industry-wide network," Science & Technology Libraries, vol. 1 no. 2 (1981), 15-22). Distribution of those tools was limited to members of that association, not so much by their choice as by feasibility.

Today, a group of professionals such as the Australian Firenet can share their information with the world, for better or worse. Firenet, hosted by the Australian National University, is a cooperative set of World Wide Web servers for discipline specialists in the field of fire management and fire ecology. In this case librarians have not been involved. FIRENET's specialized publications are locally mounted and managed and distributed via the Internet. Among other awards, they have been honored with the 911 Fire Police Medical Web Page First Alarm Site Award. In this case, I would consider a professional award much more telling than one from one of the many Internet awarding bodies. The role of professional associations can already be seen.

Contrast FIRENET with the American Mathematical Society, which I would put on the scholarly end of the spectrum.
Access is provided to MathSciNet, a web-accessible subscription database of the data in Mathematical Reviews (MR) and Current Mathematical Publications (CMP), which index and review the mathematics research literature from 1940 to the present. Bibliographic data only is available from 1940 to 1979, and from 1980 to the present both bibliographic data and review texts are available. Items listed in the annual indexes of Mathematical Reviews but not given an individual review are also included. Those in Mathematical Reviews appear first in Current Mathematical Publications. Institutional site licenses are the primary way that users get access. The cost for an individual can be steep, but MathSci Online is offered as an option via commercial services such as Dialog and CompuServe. In this case the web is integrated with the association's publishing program and can be seen as just another distribution medium to meet the needs of their customers.

Current experimentation by all types of publishers includes parallel publishing with print and/or supplementary publishing, that is, putting some information on the Internet but holding back something for the print publication.

The Internet gives us access to large volumes of data. One of the earliest research projects that the net facilitated was the Genome Project. The net also allows us to manage materials that many libraries have not collected before, such as the statistics site Statlib at Carnegie Mellon.

Advertising and Public Relations as an Additional Category

At the original 1995 NEASIS presentation, Clifford Lynch brought up this category that I had not originally put in my list. Since then marketing has taken a front seat on the Internet, and I certainly agree it belongs as a category of its own. Internet publishing categories include promotion, from self-publishing to the commercial variety. Along with providing information about products, it is perfectly natural for companies to promote them.
Consider the automobile sites which describe all the features of this year's models. There is nothing wrong with this information being available and I certainly want to have access to it, but as an information professional, I also want to be aware of the bias of what I am viewing. This is no different from the need to understand what you are reading in a 10K document filed with the SEC and to contrast that with the role of a company's annual report.

A perfect example of the value added that a promotional site can bring can be seen in the bookstore sites, such as Amazon Book Store. Not only can you find bibliographic citations and order books, but here are comments from authors and unsolicited reviews of books by anyone who wants to contribute them, good or bad, as well as professional reviews. Amazon compiles a wealth of information on its site to encourage anyone to return and **by the way** <smiley-face> to order a book or two because it is such an easy and cost-effective way to get what you need. What is most impressive is the level of customer service provided and the speed of delivery. Amazon is not alone; its competitor Barnes and Noble has partnered with sites such as the Northern Light search engine to provide search for books and CDs once you have finished searching for articles on a topic.

There are a growing number of sites that may have started out because some people felt that the content belonged on the web, but now these sites need to support themselves. An example is the excellent Internet Movie Database. The commercial label is blurred, and the important thing to pay attention to is whether a site has valuable content and whether its presentation or content biases make any difference in terms of what you need to get out of it.

Multimedia Issues

Given the continuum of Internet "publishing", additional criteria must be added to reflect the multimedia nature of the medium. Quality of sound is still pretty early in its evolutionary cycle.
Sound files of any size may take an unreasonable time to transfer, but that is getting better and I have confidence video will be improving as well. [Multimedia can bring immediate access to bird images and sounds or animation of a bird in flight.] I am not a proponent of the medium for its own sake, but where it is used effectively, it can provide an enhanced product. For example, the National Geographic River Wild--Running the Selway is an excellent example of merging sound and graphics with print content to enhance the educational and recreational experience. However, there is the caveat that you need to have the right technology (hardware and software) to be able to take advantage of the sound, in this case a sound card and RealAudio software. The multimedia technology is not sufficiently developed that the browsers have everything you need built in.

Print publishers can run the gamut of quality as well, and as information professionals we have generally gleaned something about a whole line of a publisher's works and the care with which titles are brought out. In the Internet publishing field, for instance, there are currently some shops that are known to move books out so fast that you can expect typos and errata, which will be corrected if there are later printings or which can be tracked down with some effort by going to their web site.

Some publishers are known to be advocates or supporters of different causes, and their biases are part of what we keep in mind when we evaluate them. Consider the Sierra Club -- their publications are slanted in a particular direction, just as I would expect of campaign literature or any other form of advocacy or activist publishing. This translates onto the Internet, and we must look at the viewpoint of the site. The viewpoint may be explicit in a scope statement, or you may not be able to confirm your suspicions except by analyzing the point of view of the contents of the site.
The Internet has enabled a vast new group to enter the world of publishing -- those who didn't learn the culture of the print publishing trade. We need them to supply the right information so that we can evaluate their sites. So we have a responsibility to explain the rules to new publishers, just as the Internet community tells new users the netiquette rules of the road.

So how do you come to terms with quality, be it vanity or grey literature or scholarly? I take a pragmatic view of quality. At the very least, I want my facts accurate and current, and the bias and authority of authors clear.
Just to look at some of the issues to consider in evaluation of a web site, take a look at a site I think very highly of: the Gilbert and Sullivan Archive. There is a clear table of contents and very good navigation. It is designed to be viewed both by text browsers (Lynx) and graphics browsers (Netscape Navigator and Microsoft Internet Explorer). Graphics load quickly. The G&S Photo Gallery displays black and white photographs, which show best on monitors with high resolution. A collection of public domain photographs of the stars and other principals of the original Gilbert & Sullivan productions has been scanned. Some, such as the picture of Alice Barnett as Queen of the Fairies in Iolanthe, have some text, while others are just the picture and the name of the star.

The MIDI and MPEG audio files are particularly appropriate and well done for this site. Since this is for aficionados, the karaoke nature of the MIDI files is designed for the members who want to sing the parts. The MPEG files, such as the Mikado March by John Philip Sousa, are not as easy to play: even though the format was set as a helper application, the browser insisted that I download the file and play it with the MP2 format player, while the MIDI files play directly. This represents an existing problem, solvable, but a hurdle to overcome.

What is the authority of the site? The webmaster, Alex Feldman, is Associate Professor in the Department of Mathematics and Computer Science at Boise State University, Boise, Idaho, which hosts the web site. The curator of the archive is Jim Farron, who is a computer and electronic publishing specialist with the U.S. government. They are joined by a number of others who participate in making this such a rich site. For instance, interested individuals are contributing libretti, diaries of festivals, and additional audio files. One member is compiling a complete discography of all G&S that has been recorded, based on his own collection as well as that of others.
The peer review process for a site such as the G&S Archive is the care and attention of its contributors. Just as with any print or other type of resource, the viewer must bring his or her own critical evaluative questioning to the content.

How complete is the Gilbert & Sullivan archive? What can one expect to find here? The web site archive has grown from the initial files, such as the photo gallery and a couple of libretti which had been on the FTP site, to at least one libretto for each of the operas. They have now moved on to adding works by either Gilbert or Sullivan individually. The content includes libretti in the public domain, and sources are identified. While there is minimal dating of entries as a whole, there is a current What's New archive that goes back in the current year further than the dating in the What's New section.
Generic Criteria for Evaluation
Current State of Evaluation Tools on the Net
General Guides and Directories
You will want to compare, in terms of value to you, the level of specificity in Yahoo and the WWW Virtual Library and the newer general directories versus the set of categories in the various directories of the search engines.

Specialized Guides

"the Internet allows all types of publishing in the broadest sense--much of the information contained in Internet resident discussion groups is transitory--and this network of networks will continue to expand exponentially so that bibliographic control will continue to be out of reach. There is no Dialog superstructure to create a "dialindex" of indexes, and one is not likely to exist in the future because of the distributed nature of the system and the ephemeral quality of much of the information posted to network repositories. Librarian skill at creating specialized indexes or other retrieval tools will be needed." (Sharyn J. Ladner and Hope N. Tillman, Internet and Special Librarians: Use, Training, and the Future. Washington, D.C.: Special Libraries Association, 1993, p. 58)

What a difference a couple of years makes. Our crystal ball was not very good. There is the potential for a whole lot more bibliographic control today; and at the same time there is increasing complexity. I still believe in the importance of information professionals' contributing their skill to develop the searching tools for whatever the Internet is going to become.

General Guides

Argus Clearinghouse

What started as the University of Michigan ClearingHouse project now is the Argus Clearinghouse. It is now truly separate in name as well as management. It has had growing pains. There is now a tighter process to ensure the quality of their guides. An early flaw that is being remedied is that many of these developed as student projects, and after the end of the year, the students left. Now there is a staff to do the reviews. Not all guides are done by students, and Internet gurus including John December and Diane Kovacs have been among the contributors. Guides not updated in the past year are listed in a separate file. Several years ago, I did a review of the ClearingHouse project handling of business resources for the Journal of Business and Finance Librarianship.
Since it has been over two years, I have removed it from my web site as out of date. The original project leader, Lou Rosenfeld, began the ClearingHouse project while a Ph.D. candidate at the University of Michigan library school. With Peter Morville, he currently heads a business, Argus Associates. In Fall 1995, to improve the value of the guide, a plan was put into effect to rate guides according to 4 criteria:
last checked by Clearinghouse: May 27, 1997

This gives an excellent set of characteristics to frame how to look at that particular site and what to expect from it. In this case, it is interesting that the weak link is the resource evaluation of what his site points to. The site gives a great view of the universe of education available via the Internet. However, its annotations about the resources it points to are no more than one-liners. I will not have unwarranted expectations about the evaluations but will expect the site to have an excellent organizational structure. The biggest value is in leading you to explore the strengths of a work.

Gale's Cyberhound Guide -- an early casualty

Gale has been in the directory service business for a long time, as its many library customers will attest. It looked to leverage its indexing skills to help those looking for information on the Internet. However, its web-accessible endeavor was short-lived, as it has pulled the plug on the Cyberhound, formerly at http://www.cyberhound.com/, and will just be providing print reviews.

"Searching for the best sites on the web, 24 hours a day, 365 days a year has Cyberhound completely fried. (No wonder you never catch him without his shades.) He's retiring from the Internet spotlight to pursue his writing career. From now on, please access Cyberhound reviews in one of his quality softcover reference volumes."

Given the ability to update information on the web (if done), I certainly could not expect a print publication to be timely, and that is a major requirement of an Internet evaluation tool.

Internet Tools of the Profession, 2nd edition, 1997

The 2nd edition of this resource guide has a web site where URLs of the reviewed titles can be updated, and chapter authors can add new sites, as needed.

Martindale-Hubbell Lawyer Locator
In some cases, the contributors identified how they or one of their patrons had used this particular site to answer a question. As with any book about the Internet, even though this title was released in June, it is out of date. At this point address changes and updates are on http://www.sla.org.pubs/itotp/. For the second edition, eleven divisions came forward with chapters, and for others there is identification of their listservs and websites.

Specialized Guides

A good resource for identifying the best of these is the searching feature of the Clearinghouse website described above. I have particularly enjoyed the development of this specialized evaluation site, which has been rating business school web sites for several years. This site uses a table to display the criteria by which the business schools' sites are evaluated, so that not only is it clear whether or not they have met a particular criterion, but you can "click" on that category and see its display at the specific site. The table formatting is particularly effective as a way to see the comparison between the business schools' web sites.

Directories

Yahoo
See their categories from the Yahoo home page. Yahoo has grown its list very quickly using people and technology to assist. Those who submit URLs are forced to select from among existing categories; there is a place to recommend alternate or new categories. Its categories are home grown using a variety of techniques but no different than any library list of subject headings, with its own set of biases developed because of the nature of Yahoo and what they have looked at. For instance, they poll automatically to see if sites are up or available. They may not catch forwarding addresses with this technique.
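The availability polling described above can be sketched in a few lines of Python. This is only an illustration and not Yahoo's actual method: the function names `classify` and `poll` are hypothetical, and the choice of a single HEAD request through the standard `http.client` module is an assumption. Because `http.client` does not follow redirects on its own, a forwarding address shows up as a 3xx status with a `Location` header, which is the case a plain up-or-down check would miss.

```python
import http.client
from urllib.parse import urlsplit


def classify(status, location=None):
    """Map an HTTP status code to a coarse availability label."""
    if status in (301, 302, 303, 307, 308):
        # A redirect is the web's version of a forwarding address.
        return "moved to " + location if location else "moved"
    if 200 <= status < 300:
        return "up"
    if status in (404, 410):
        return "gone"
    return "unavailable"


def poll(url, timeout=10):
    """Send one HEAD request without following redirects (needs network access)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        conn.request("HEAD", parts.path or "/")
        response = conn.getresponse()
        return classify(response.status, response.getheader("Location"))
    except (OSError, http.client.HTTPException):
        # Connection refused, DNS failure, timeout, or a malformed reply.
        return "unavailable"
    finally:
        conn.close()
```

Calling `poll` on a real address requires a live network connection; `classify` can be tried on its own with any status code.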
Of particular interest is the WWW Virtual Library disclaimer: This information is provided in good faith but no warranty can be made for its accuracy. Opinions expressed are entirely those of myself and/or my colleagues and cannot be taken to represent views past present or future of our employers.
My key indicators of quality (my checklist):
Advice for those "publishing," promoting, or "communicating" via the net

What do creators of Internet information (especially web sites) need to consider in "publishing" on the net so that their valuable nuggets can be found and so that they will be appreciated as credible?

For librarians a well-indexed title or a periodical that is indexed by a major index are analogous. Certainly, Internet information providers want their pages indexed by the major search tools like AltaVista, Infoseek, and Lycos, and need to understand those search engines well enough to get the most important content indexed. Creating good meta tag description statements is valuable for those search engines that will use them (AltaVista, HotBot, Infoseek). In addition to these meta tags, you need to build a summary paragraph into your web page which can be used by the search engines which do not use the meta tags. Excite used to have a statement saying it did not use meta tags because it considered them to be unreliable.

For the search engines that are looking at the visible text, consider what is being said in the first 250 characters of the web site when the page loads. Engines like WebCrawler and OpenText will use this information for their summary of your web page. Paying attention to the top words on the home page would be a basic suggestion, no different from providing a good table of contents in a book. Sites like Lycos look at words in terms of how far into the document they are. Topmost info gets higher ratings.

I'm sure you are very aware of the difference between the Internet and the online services in terms of indexing. From my own experience I see these search engines as very powerful resources but reminiscent in their interfaces of early DIALOG or BRS, with their use of cryptic characters to carry out commands (for instance the plus symbol). I enjoyed the statement in AltaVista, which does use the terms and, or, not, etc., that if you are nostalgic for algebra you can use the symbols. But nostalgia is not my problem. Fortunately the search engines are fast learners and keep improving the searchability of their databases.

It is very important that you do not turn off your target audience because your pages have software requirements that are beyond the capabilities of the viewers or their browsers. For instance, until recently very few had browsers that could work with 1024 by 768 display screens unless they were graphic artists. Many browsers in use today still do not support frames. Keep the text-only people in mind too, who cannot navigate with bitmapped image maps or frames. There are other caveats to consider, such as keeping your graphics small for quick loading. See Walt Howe's graphics guide for some good comments on this.
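The advice above about meta tag descriptions and the first 250 characters of visible text can be checked before a page is published. What follows is a minimal sketch using Python's standard `html.parser` module. The names `SummaryProbe` and `preview` are hypothetical, and the 250-character cutoff simply repeats the figure quoted above; it is not a claim about how any particular search engine actually builds its summaries.

```python
from html.parser import HTMLParser


class SummaryProbe(HTMLParser):
    """Collects the meta description and the visible text of a page."""

    HIDDEN = ("script", "style", "title")  # text here is not shown in the page body

    def __init__(self):
        super().__init__()
        self.description = None
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content")
        elif tag in self.HIDDEN:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.HIDDEN and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def visible_text(self):
        # Collapse runs of whitespace so the character count reflects what a reader sees.
        return " ".join(" ".join(self._chunks).split())


def preview(html, limit=250):
    """Return (meta description or None, first `limit` characters of visible text)."""
    probe = SummaryProbe()
    probe.feed(html)
    probe.close()
    return probe.description, probe.visible_text()[:limit]
```

If the second value returned by `preview` reads like navigation links or filler, a search tool that summarizes from the top of the page will likely show the same thing, which is a sign the summary paragraph should move closer to the top.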
© Copyright 1995, 1996, 1997, 1998, 1999, 2000 Hope N. Tillman