AECT Handbook of Research

25. Technologies for Information Access in Library and Information Centers
25.5 RESEARCH ON INFORMATION

In this section, information is examined as a tangible entity and a tangible process (Buckland, 1991, p. 6). The focus is on information as a thing that requires processing and organizing to provide access. Access to knowledge as enhanced or restricted by public policies and private usage is also discussed. Early work on information in LIS tended to be structural, with the focus on mathematics, linguistics, and logic representation. More recently, other aspects of information have become prevalent in the research. Searching for pictorial information, representing audio data in a database, and finding moving images have become part of the information substructure. These have enlarged the foundations used to understand, construct, and interpret information. This review examines three general areas that relate to the tangible aspects of information: (1) What is information? (2) How is information organized? (3) How is information controlled? Issues such as group uses of information, using information to generate knowledge or to become informed, and information seeking are discussed in other sections of this chapter.

25.5.1 What Is Information?

This section defines information in terms of its use in the access process and looks at common characteristics of information as discussed within LIS. Because of its position in this review, information is seen as it relates to users and access issues. The information presented in this section builds on that presented in other sections. Order of presentation can affect interpretation of information. The linear presentation suggests that users come first, then access, and then information. It implies a causal chain of events. Another reviewer could have presented information research first, then users, and finally access issues. One of the constituents of information might be relative position in the information access sequence.

The characteristics of information that are important in the context of information access as discussed in this chapter have to do with the ability to organize, structure, and retrieve information. Each type of information (textual, visual, moving, iconic) will have different characteristics that interact with the need to store and retrieve information. Until recently, most major information storage and retrieval systems focused almost exclusively on retrieval via words and numbers. Even pictorial information was assigned indexing terms (words) to aid in retrieving maps, pictures, and slides.

The addition of computers has changed both the information that is stored and its organization and retrieval. This has implications for the characteristics of information that are considered essential to represent in storage systems. Another change that has occurred is the storage of entire documents or other original materials, as opposed to surrogate representations (e.g., bibliographic citations, abstracts). Until recently, libraries, document rooms, museums, or other large facilities were required for storage of original materials. The characteristics of information that were important to represent had to do with physical descriptions, subject or content representation, and methods of retrieval. With new electronic capabilities to store original materials, the range of characteristics that must be considered and the organizational sophistication necessary for retrieval are more complex.

Another aspect of understanding what constitutes information has to do with philosophical assumptions about the nature of human beings and their relationship to information. One position among researchers is that information does not exist without human construction. No matter what the storage medium, the organizational structure, or the retrieval mechanisms, there is no meaning until an individual creates meaning. Among researchers who deal exclusively with information, mathematical representations, and database structures, there would be those who agree and those who disagree with this assumption.

One of the difficulties in discussing information is the need to tie information to the retrieval system employed. Often the structure of the database or storage system is designed with specific retrieval capabilities. Some ways to access information are not possible due to the original system design. For example, the Colorado Alliance of Research Libraries (CARL) only allows keyword access. There is no subject access capability. Searches for a subject such as "library research" retrieve all the materials that have the keywords "library" and "research." One such item could be "The Library of Research in Primate Behavior." Another system would allow the retrieval of only information that had the subject heading "library research." System restrictions can create problems in information access. Retrieval systems must interface with humans and must respond to human logic and needs. Information storage systems must function within the limitations and physical characteristics of the hardware used. These two needs are not necessarily compatible. The limitations of the hardware can be imposed on the retrieval system rather than the system being designed to meet retrieval needs. This basic incompatibility creates dilemmas for researchers (Peters, 1991). Should the limitations of the system be researched and suggestions for improvement created? Should the needs of the user be explored and instruction developed to improve the user's skills to access the system? Should some other combination of factors be employed in the research process? The construction of information and its storage systems have implications for both users and access.
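
The difference between keyword access and subject access described above can be illustrated with a minimal sketch. The records, titles, subject headings, and function names below are hypothetical and are not drawn from CARL or any actual catalog; the sketch only assumes that a keyword search matches words anywhere in a title, while a subject search matches an assigned heading.

```python
# Hypothetical catalog records; titles and subject headings are illustrative only.
RECORDS = [
    {"title": "The Library of Research in Primate Behavior",
     "subjects": ["primates -- behavior"]},
    {"title": "Methods of Library Research",
     "subjects": ["library research"]},
    {"title": "Research Funding in Higher Education",
     "subjects": ["education, higher -- finance"]},
]


def keyword_search(records, query):
    """Return titles that contain every query word, in any order or position."""
    words = query.lower().split()
    hits = []
    for rec in records:
        title_words = rec["title"].lower().split()
        if all(w in title_words for w in words):
            hits.append(rec["title"])
    return hits


def subject_search(records, heading):
    """Return titles whose assigned subject headings include the given heading."""
    heading = heading.lower()
    return [rec["title"] for rec in records if heading in rec["subjects"]]


print(keyword_search(RECORDS, "library research"))
print(subject_search(RECORDS, "library research"))
```

Under these assumptions the keyword search returns both the primate behavior title and the library research title, whereas the subject search returns only the latter.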

The concepts of value, utility, relevance, pertinence, and acceptability are also characteristics of information. It is possible to store all types of information. Everything from a child's finger painting on the refrigerator to a holographic image of a lion to complete sound and three-dimensional virtual reality of the Globe theater has the potential to be stored and retrieved. What actually is saved, the characteristics stored, the features considered unnecessary, and the method of storage have to do with decisions of value, utility, relevance, and acceptability. The decisions are affected by economic factors about who will pay for the initial storage, the long-term care, and the retrieval and maintenance costs. Information is sorted and discarded before it is ever stored. The characteristics of what should be saved related to what could be saved need to be considered.

To summarize what constitutes information: It is things (symbols, ideas, knowledge, wisdom, antelopes); it is the characteristics of those things (names, places, dates, subjects, content, values); and it is the process that is used to make decisions about which of those things actually are important or useful information (storing, saving, making available, selling).

25.5.2 How Is Information Organized?

Larson (1991) developed a four-part explanation of the functional components of an on-line public access catalog (OPAC), which describes the steps that come between users and information:

  • User interface
  • Database management system interface
  • Database management system
  • Database
In a more generic sense, these four components can be used to account for all forms of information organization and will be used to guide the review of literature that follows.

Description of the user interface and associated research was discussed earlier in this chapter. It will be referred to here as it has bearing on information organization. Generically, the user interface must be conceptualized as human-human interactions, human-computer interactions, and all other forms of socially constructed human-information interactions (e.g., human beings using indexes, bibliographies, tables of contents, telephones, hypermedia nodes, Veronicas, Archies, gophers, and virtual-reality devices). The user interface is exactly what the user sees, hears, touches, and tries to interpret.

The database system management interface is hidden to the user. In an on-line public access catalog, it is the software that translates the information from the user interface into commands the system database manager can understand and handle. In human terms, the reference librarian could be considered a system management interface. The reference librarian helps the user interpret the language and structure of a system such as Social Sciences Citation Index or the Library of Congress Classification system.

In an OPAC, the database system manager is the software and sets of algorithms, rules, or heuristics that search the database and retrieve information. In noncomputer applications, sets of rules, algorithms, or heuristics that organize and provide a view into information could be considered system managers. For example, the Anglo-American Cataloging Rules II Revised, the Dewey Decimal Classification System, Library of Congress Subject Headings, or any indexing system provides structure and process windows into specific bodies of information.

The database (storage system) in an OPAC consists of information chosen to represent a collection of materials. This information is stored in some predefined structure, with limitations on the type and extent of characteristics that can be included. All other information storage systems such as libraries, on-line databases, CD-ROMs, laser discs, videotapes, or maps have structures that limit storage possibilities. Only certain information is chosen to be included in an information set. Limited characteristics or values related to that information are provided to help retrieve the information. (Even in full-text databases, there are limitations on how much and what is included and how it may be searched.) Some predefined structure for storage is developed. The storage medium and technical process limitations on the structure also have implications for retrieval. In the next three sections, organizational schemes, structures and research related to systems interfaces, the systems themselves, and the storage of information are explored.

25.5.3 The System Manager Interface

Vickery and Vickery (1993, p. 160) review and describe many prototype and operational interfaces. Their conclusion demonstrates the variety of techniques that information science has contributed to system management interfaces. Listed below are techniques and examples of interfaces:

•Technique: Thesaurus relations and classification hierarchies
    Systems: MAI, CITE NLM, INSERM INTERFACE, TOME SEARCHER, METACAT, BIBLIOGRAPHY MANAGER, CANSEARCH, CIRCE, EDOR, DIANE-GUIDE/NLA
•Technique: Stoplists
    Systems: FASIT, CITE NLM, TOME SEARCHER, CIRCE, METACAT, LEXIQUEST
•Technique: Recognition of suffixes and stemming
    Systems: FASIT, CITE NLM, ERLI/MINITEL, METACAT, BIBLIOGRAPHY MANAGER, DIANEGUIDE/NLA
•Technique: Formation of Boolean search statements
    Systems: TOME SEARCHER, CANSEARCH, EURISKO, DIANEGUIDE/NLA
•Technique: Manipulation of Boolean search statements
    Systems: QUESTQUORUM, CIRT
•Technique: Near matching of search terms rather than exact
    Systems: CIRCE, METACAT
•Technique: Calculation of term relevance, document weighting and ranking
    Systems: CITE NLM, CIRT, SABRE
•Technique: Query amendment by relevance feedback
    Systems: CIRCE, EURISKO
•Technique: Co-occurrence of terms within documentary items
    Systems: LEXIQUEST, ESA ZOOM
The purpose of the system manager interface is to release human searchers from routine and technical acts of searching that can be effectively automated. Different system interface managers can be created for different users, such as expert searchers, end users, and novice users. Expert search intermediaries need system interfaces that help them access unfamiliar systems and databases. Experts have strong search skills but may need aid in applying those skills to new systems. End users and novice users are more likely to need system interfaces that aid them in choosing and using search strategies. End users need their natural-language queries translated into system language. Novice users need help in learning how to search. Expert systems are being developed as system management interfaces to provide multiple approaches for all levels of users. Specific examples of these systems are discussed in the next section on system managers.
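
Three of the techniques listed above (stoplists, suffix recognition, and formation of Boolean search statements) can be combined in a brief sketch of how an interface might translate a natural-language query into system language. The stoplist, the suffix list, and the function names are assumptions made for illustration; they do not reproduce the behavior of any of the systems named in the list.

```python
# Illustrative stoplist and suffix list; operational systems use far larger ones.
STOPLIST = {"the", "of", "in", "a", "an", "and", "on", "for", "to", "about"}
SUFFIXES = ("ing", "es", "s")  # tried in this order


def strip_suffix(word):
    """Remove the first matching suffix, keeping a stem of at least 3 letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def to_boolean_statement(natural_query):
    """Drop stopwords, stem the remaining words, and join them with AND."""
    terms = []
    for word in natural_query.lower().split():
        word = word.strip(".,?!")
        if not word or word in STOPLIST:
            continue
        stem = strip_suffix(word)
        if stem not in terms:  # avoid repeating a term
            terms.append(stem)
    return " AND ".join(terms)


print(to_boolean_statement("Searching the catalogs for frogs"))
# search AND catalog AND frog
```

Crude suffix stripping of this kind will conflate some unrelated words and miss some related ones, which is one reason the stemming rules in real interfaces are considerably more elaborate.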

25.5.4 The System Manager

The system manager interface and the system manager are conceptualized as performing different functions. Their definitions provide a viewfinder for thinking about information retrieval at different levels of mechanical access. In reality the differences can be difficult to separate. Functions are becoming merged due to technical advances and innovations. A program with the appearance of a system management interface may also serve as a system manager, and the reverse. The previous section briefly reviewed system interface techniques applied to information retrieval. This section details information retrieval research. A brief review of information retrieval history is followed by examining the two most prevalent retrieval methods: statistical/probabilistic and cognitive. Expert systems, hypermedia, and future issues in information retrieval conclude the discussion.

Scholars of information retrieval date its beginning to the early 1950s. The Cranfield projects (reviewed in Ellis, 1990, pp. 3-14) were early information retrieval research programs. They developed operationalizations for three dependent variables that are still used in retrieval research. These measures are recall, precision, and relevance. Relevance measures whether an item retrieved contains information to meet the search request (see Eisenberg, 1988; Park, 1993). Recall and precision are ratios that relate to relevance (see Buckland & Gey, 1994). Recall is the ratio of the number of relevant items actually retrieved by the search to the number of all relevant items in the data set. For example, if a database contained 20 items about frogs, and 10 of those items were retrieved during a search, the recall for frogs would be .5. Precision is the ratio of the number of relevant items retrieved to the total number of items retrieved. For example, if a search resulted in 40 items, and 10 of those items were about frogs, the precision for frogs would be .25.
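
The two ratios can be worked through in a short sketch that reuses the frog figures from the examples above. The item identifiers and the function name are invented for illustration, and the sketch assumes that the set of relevant items is known in advance, which is rarely true outside a test collection.

```python
def recall_and_precision(retrieved, relevant):
    """Return (recall, precision) for a set of retrieved and a set of relevant items."""
    hits = retrieved & relevant  # relevant items that were actually retrieved
    recall = len(hits) / len(relevant) if relevant else 0.0
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    return recall, precision


# 20 items about frogs exist in the database (hypothetical identifiers).
relevant = {f"frog-{i}" for i in range(20)}
# The search returns 40 items: 10 about frogs and 30 about other topics.
retrieved = {f"frog-{i}" for i in range(10)} | {f"other-{i}" for i in range(30)}

recall, precision = recall_and_precision(retrieved, relevant)
print(recall, precision)  # 0.5 0.25
```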

In the period following the Cranfield projects until the 1980s, intense debate in information retrieval surrounded the concept of relevance. Relevance is a judgment about whether or not a particular item meets the search request. Personal and economic factors can influence the judgment about an item's relevance. For example, in a study to test a new proprietary system that would automate the assignment of indexing terms, two differing judgments of relevance were found (reviewed in Ellis, 1990, pp. 1-3). The company's representatives found that the items retrieved were relevant. The representatives from professional indexing found the items retrieved less relevant. Each of these parties had personal and economic values attached to the outcomes. The company wanted to sell its product. The professional indexers may have seen the product as a threat to their livelihood. Another explanation would be that each group had different set points or standards about how much information was necessary in an item to achieve relevance. In relevance judgments, issues of variability (how much information is necessary for an item to be considered relevant) and consistency (can different individuals apply the criteria in the same way?) are critical and can affect judgments about what is relevant. Saracevic (1970, 1975) provides extensive reviews of the relevance controversy during the 1960s and early 1970s.

Current information retrieval research focuses on two areas: statistics and probability research (for an overview of issues see Belkin & Croft, 1987, and Fidel, 1987) and cognitive research (for reviews of current issues see Ingwersen, 1992, and Jacobs, 1993). Statistics and probability research uses techniques such as automated indexing, classification, searching, and abstracting. Statistics and probability measures are based on matching the query as expressed in the search statement with the representations in the database searched (query needs). They look at physical representations, mathematical probabilities, and logical rules. Cognitive research creates models of users (see 5.3.6), develops expert systems (see 24.8.1), and applies other methods to help the users match their needs with the system. Other techniques looking at the integration of users and information systems include user modeling, expert systems, and hypermedia applications (see 21.1). User modeling involves creating a representation of the user to interact with the system. Expert systems are designed to help the user understand and interact with the system more effectively. Hypermedia applications are designed to improve browsing, navigation, and user interaction capabilities (e.g., Chang & Rice, 1993; Newby, 1990).

Statistical and probability efforts can be divided into two types of retrieval techniques: exact matching and partial matching. Exact matching indicates that the search request and the items found in the database or retrieval set are identical. Techniques such as Boolean searching, full text, and string matching represent exact matching. Most operational information retrieval systems are based on exact matching techniques.
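
A minimal sketch of exact matching through Boolean searching follows. It assumes a small inverted index that maps each word to the set of items containing it; the sample items and function names are hypothetical. An item is either in the retrieved set or not, with no ranking and no partial credit.

```python
from collections import defaultdict

# Hypothetical items, identified by number.
DOCS = {
    1: "frogs and toads of north america",
    2: "library research methods",
    3: "research on frogs in library collections",
}


def build_index(docs):
    """Map each word to the set of item numbers whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.split():
            index[word].add(doc_id)
    return index


def boolean_and(index, *terms):
    """Items containing every term."""
    postings = [index.get(t, set()) for t in terms]
    return set.intersection(*postings) if postings else set()


def boolean_or(index, *terms):
    """Items containing at least one term."""
    return set().union(*(index.get(t, set()) for t in terms))


def boolean_not(index, all_ids, term):
    """Items that do not contain the term."""
    return set(all_ids) - index.get(term, set())


index = build_index(DOCS)
print(boolean_and(index, "frogs", "library"))  # {3}
print(boolean_not(index, DOCS, "frogs"))       # {2}
```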

Partial matching techniques are those where the retrieved documents or their representatives are not a complete match with the search request. Belkin and Croft (1987, p. 112) provide a schematic classification system that depicts types of partial match techniques and their relationships. The most frequently studied partial match techniques are: (1) networking techniques that look at groups of documents and include clustering, browsing, and spreading activation, and (2) individual techniques that examine one item at a time and include fuzzy set, vector space, and probability techniques.

Networking Techniques

  • Document clustering: Closely linked documents are relevant to the same requests (e.g., Willett, 1988).
  • Browsing: User browses through nodes and connections in a network (e.g., Croft & Thompson, 1987).
  • Spreading activation: Similar to browsing, but the system rather than the user activates parts of the network and their relationships (e.g., Cohen & Kjeldsen, 1987; Lee, Kim & Lee, 1993).
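
Spreading activation, the last of the networking techniques above, can be sketched as follows. The network, the decay factor, and the fixed number of steps are all assumptions chosen for illustration; published spreading activation models differ in how activation is weighted, combined, and halted.

```python
def spread_activation(graph, seeds, decay=0.5, steps=2):
    """Propagate activation outward from seed nodes for a fixed number of steps.

    graph maps each node to the nodes it links to. Each step passes a
    decayed share of a node's newly received activation to its neighbors.
    """
    activation = {node: 1.0 for node in seeds}
    frontier = dict(activation)
    for _ in range(steps):
        next_frontier = {}
        for node, energy in frontier.items():
            for neighbor in graph.get(node, ()):
                passed = energy * decay
                next_frontier[neighbor] = next_frontier.get(neighbor, 0.0) + passed
        for node, energy in next_frontier.items():
            activation[node] = activation.get(node, 0.0) + energy
        frontier = next_frontier
    return activation


# Hypothetical network of index terms.
GRAPH = {
    "frogs": ["amphibians", "ponds"],
    "amphibians": ["toads"],
    "ponds": [],
    "toads": [],
}

print(spread_activation(GRAPH, ["frogs"]))
# {'frogs': 1.0, 'amphibians': 0.5, 'ponds': 0.5, 'toads': 0.25}
```

Here the system, not the user, decides which parts of the network to visit: nodes nearer the starting term receive more activation than those reached only indirectly.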

Individual Techniques

  • Fuzzy set: Integrates Boolean queries with ranking techniques (e.g., Bookstein, 1985).
  • Vector space: Represents documents by weighted terms in a multidimensional space where each dimension corresponds to an index term (e.g., Buckley & Lewit, 1985; Wilbur, 1992).
  • Probability ranking principle: Similar to vector space, attempts to estimate how relevant a document will be to a search request (e.g., Bookstein, 1983; Croft, 1986).
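
The vector space technique in the list above lends itself to a compact sketch. The version below assumes raw term counts as weights and cosine similarity as the matching function; both are simplifications chosen for illustration, and the sample items are hypothetical. Unlike an exact-match search, it returns a ranked list in which an item can match the request only partially.

```python
import math
from collections import Counter


def vectorize(text):
    """Represent a text as term counts, one dimension per distinct term."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine of the angle between two term vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs):
    """Return ids of items with nonzero similarity, most similar first."""
    q = vectorize(query)
    scored = [(cosine(q, vectorize(text)), doc_id) for doc_id, text in docs.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in scored if score > 0]


DOCS = {
    "a": "frogs frogs ponds",
    "b": "frogs library research",
    "c": "library catalogs",
}
print(rank("frogs ponds", DOCS))  # ['a', 'b']
```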

Cognitive-based research in information retrieval looks at the interaction of the user and the information system. The attempt is to create, via system enhancements or changes, a better representation of the user request. Allen (1994) is an example of this type of research. Two experiments on the relationships between users' cognitive abilities and information system features were conducted. In each experiment, systems that included different approaches to the design of information were explored. In addition, participants' cognitive abilities were tested, and participants were randomly assigned to the different systems. A general linear modeling statistic (a statistic that combines features of ANOVA and linear regression) was used to test the hypothesis that there would be an interaction between system design and individual differences in cognitive ability. In one study, results showed an interaction between logical reasoning and order of presentation of references. In the other study, no interaction was discovered between perceptual speed and the way index terms were presented in browsable displays. Allen (1994) interprets these results for the overall design of information systems. System designers may wish to consider different orders of presentation as user-selected options to take into account the logical reasoning differences. On the other hand, since the browsable displays showed no impact on search precision, other factors may be more relevant in the choice of browsing displays.

Belkin's anomalous states of knowledge (ASK) model (Belkin, 1980; Belkin, Oddy & Brooks, 1982a, 1982b) is an example of cognitive user modeling in LIS research. ASK looks at a network of associations between items on a database. Two aspects are critical: (1) the author's decision to communicate and (2) the users' decision to search and the decision that a particular item meets the search need. ASK relates to the second component. The searcher is aware of an anomaly in his or her state of knowledge about a problem or issue. The searcher examines items from a knowledge structure to interact with the searcher's request. This process continues until the ASK is resolved.

Two features underlie this cognitive approach to information retrieval: (1) construction of a model of the user of the system and (2) derivation of this model from cognitive characteristics of the user (Ellis, 1990, p. 67). The searcher interacts with a database via the creation of a model of her or his perceptions and requirements. In theory, conceptual associations are at the foundation of the model constructed. In reality, term associations are more likely to be used. Most of the research and development in this area are prototypes rather than operational systems.

Expert systems (see 24.8) and expert system intermediaries are more likely to be operational systems than user-modeling prototypes. Some expert systems have user-modeling components and some do not. Most expert systems are founded on assumptions about cognition and the user of the system (for reviews see Borko, 1987; Croft, 1987; Hawkins, 1988; Smith, 1987). Brooks says, "The influence of expert systems has shifted [information retrieval] research from a paradigm concerned largely with retrieval algorithms to one in which users, retrieval heuristics, knowledge, and human-computer interaction are key themes" (1987, p. 379). Expert systems have been influential in adding the users' perspective to information retrieval research. Expert systems engage users in dialogue to acquire a detailed request model or provide multiple retrieval techniques. Expert systems are used for query formulation, database selection, retrieval in subject domains, user modeling, and knowledge acquisition (Drenth et al., 1991).

Drenth et al. (1991) suggest three categories of expert systems that are under development in LIS: search advisors, intelligent front ends, and intelligent intermediaries. Search advisors teach users how to accomplish such tasks as searching an on-line system. Intelligent front ends provide search tactics, search formulations, selection of terms, selection of databases, and search strategies. Intelligent intermediaries draw on knowledge of users and search tactics to interpret and elaborate search requests. They also use conceptual knowledge from the database or storage system. Information retrieval expert systems are primarily intelligent intermediaries (see Gauch, 1992, for an introduction to intelligent information retrieval). They serve to bridge the system gaps between the user and the stored information. Examples of expert system development in LIS include Croft and Thompson, 1987; Fox, 1987; Gauch and Smith, 1993; Khoo and Poo, 1994; Shute and Smith, 1993.

I3R (Croft & Thompson, 1987) is an example of an intelligent intermediary expert system in information retrieval. I3R (Intelligent Intermediary for Information Retrieval) is a prototype system that optimizes the system's picture of the user's information need. It uses both probabilistic and clustering algorithms for retrieval. It also adds browsing, domain knowledge, and natural language processing. It is a multiple retrieval strategy system including both statistical and cognitive techniques. The system tries to build an accurate picture of the user's request by incorporating query analysis, domain knowledge, and browsing. This is then used with probabilistic and cluster retrieval techniques to retrieve documents through an inference process. User evaluation and further browsing provide a feedback loop to refine the request model and lead to more retrieval.

Multimedia- and hypermedia-based retrieval have looked at two functions: (1) integrating database management and information retrieval systems into a single model and (2) applying hypermedia as a browsing interface (retrieval by association). Agosti (1993) suggests that these new hypermedia retrieval models need to incorporate the concept of navigation as well as direct search. In one navigation model, Arents and Bogaerts (1993) developed a concept-based retrieval system that includes three-dimensional index navigation and semantic hyperindexing. They indicate that this type of navigation based on concept indexing could result in more effective retrieval of information in hypermedia environments.

Other LIS approaches to retrieval in hypermedia environments include plausible inference, subject browsing, and the use of classification systems. In most hypertext systems, retrieval can be accomplished through browsing or searching. In a plausible inference system (Lucarella & Zanzi, 1993), these two strategies are combined to increase retrieval effectiveness and search efficiency. Pollard, on the other hand, suggests improving efficiency in hypertext through the use of subject thesauri as navigational aids. The outcome is improved access to the subject content of the bibliographic database. Rada, Wang, and Birchall (1993) use a similar thesaurus approach in the development of their MUCH (Many Using and Creating Hypertext) system. Aboud, Chrisment, Razouk, Sedes, and Soule-Dupuy (1993) suggest the application of another traditional method from LIS to increase effective use of hypermedia. They describe a navigation approach that uses classification processes with the graphical interface. Selected nodes are ordered through their relevance, thus favoring some entry points in the database over others. This could reduce the disorientation to users in browsing space.

LIS retrieval research within hypermedia environments also addresses problems such as full-text retrieval, application of search strategies, and user interactions. Full-text databases create special problems for information retrieval via hypermedia methods. Browsing-based hypermedia systems may provide ease of access for beginners, but they often perform poorly with large document bases (Dunlop & van Rijsbergen, 1993). Dunlop and van Rijsbergen (1993) conducted experiments to test a hybrid variation on the problems of browsing from large databases and retrieval through multimedia access. They used the results of the experiments to design a prototype system that minimizes the negative effects. Croft and Turtle (1993) examine another retrieval issue in hypertext: search strategies. They designed a probabilistic model based on inference nets. Results showed this retrieval strategy to be as effective as the more standard spreading activation technique. Belkin, Marchetti, and Cool (1993) designed a user interface that focused on user interactions for retrieval of bibliographic information. It used a two-level hypertext model and many different search strategies to increase interactions with users.

Whether the research is statistical/probabilistic or cognitive, whether the applications are expert systems, hypermedia, or another system, research in information retrieval has four ongoing areas of concentration:

  • Operational systems compared to experimental systems
  • Effectiveness of the different retrieval techniques
  • Use of multiple strategies as opposed to a single-strategy search
  • Retrieving information other than text

Belkin and Croft (1987) discuss the relationship between currently operational systems and experimental techniques. They ask "... why has the experimental experience had little effect on the operational environment?" (p. 112). It appears that most operational retrieval processes are based on exact-match retrieval and use Boolean, string searching, or full-text match as their basis. Suggestions for why new techniques are not applied include cost, time, and the need to learn to use new systems. In addition, the experimental techniques often have been tried on limited data sets and have not shown their effectiveness for large-scale database application. The addition of expert systems and hypermedia applications to LIS has increased the number of operational systems that use alternative retrieval methods.

All retrieval techniques seem to have certain areas where they are more effective. This leads to the belief that since current systems function, there is no need to add the time and cost of the experimental techniques. On the other hand, often experimental or theoretical techniques perform better than those in current use. Salton, Fox, and Voorhees (1986) found that, in general, partial-match techniques have been shown to respond better than exact match. One suggested reason is that cumulative results mask the effects of individual queries in comparison experiments. Information retrieval techniques are ways of comparing the search query with the document (or item) to be retrieved. Representations of items (e.g., citations, abstracts) interact with retrieval techniques and influence the retrieval of relevant items.

One of the consistent findings across retrieval research is that use of multiple retrieval strategies is more effective than use of a single strategy (e.g., Saracevic & Kantor, 1988b). Expert systems are one method for providing users with access to multiple retrieval strategies. Ongoing issues in this research are how to choose the strategies that will be accessed for any particular search and which strategies should be made available (Belkin & Croft, 1987).

New areas of retrieval that go beyond text and document retrieval are being explored, particularly in expert system, multimedia, and hypermedia environments. Other research areas include: pattern recognition, image matching, numerical representation, and chemical structures. A complete listing of new research areas in information storage and retrieval is provided at the end of the "Information Storage and Structure" section.

25.5.5 Information Storage and Structure

Information storage is a highly technical area of research and development in LIS. Because of its complexity and depth, the literature and research of how materials are stored will not be extensively addressed in this review. Key issues are briefly discussed below. Interested readers can find entry points to the literature in the following: Burt and Kinnucan, 1990; Fox, Levitin, and Redman, 1994; Lancaster and Warner, 1993; Meadows, 1992; Pao, 1989; Soergel, 1985; Tremblay, 1985; Wiederhold, 1987.

The storage of information is related to characteristics of the information such as format, size, and retrieval needs. These characteristics can interact with the different types of information storage:

  • Information representations (e.g., subject headings, descriptive cataloging, bibliographic citations, sound bytes, thesauri, abstracts)
  • Original materials (e.g., books, videotapes, maps, speeches, holograms)

An OPAC, an on-line search service such as ERIC, and a bibliographic index such as Psychology Abstracts could be considered storage of representations of original materials. A library, a full-text database, a CD of Martin Luther King's speeches, or a museum could be considered storage of original materials.
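The distinction between storing representations and storing original materials can be sketched as two linked stores. The class names, fields, and records below are hypothetical and chosen only for illustration; they do not describe any actual catalog or database format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class OriginalItem:
    """An original material, such as a recording or full text (hypothetical fields)."""
    item_id: str
    media: str
    content: str

@dataclass
class Representation:
    """A surrogate that describes an original without containing it."""
    item_id: str
    title: str
    subject_headings: List[str] = field(default_factory=list)
    abstract: str = ""

# Storage of original materials (toy data).
originals: Dict[str, OriginalItem] = {
    "sp-001": OriginalItem("sp-001", "audio CD", "<full recording of the speech>"),
}

# Storage of representations, as in a catalog or bibliographic index (toy data).
catalog: List[Representation] = [
    Representation("sp-001", "Collected speeches", ["Civil rights", "Oratory"]),
    Representation("bk-104", "Indexing handbook", ["Indexing", "Abstracting"]),
]

def search_catalog(term: str, records: List[Representation]) -> List[Representation]:
    """Search the representations only, by subject heading."""
    term = term.lower()
    return [r for r in records if any(term in h.lower() for h in r.subject_headings)]

def fetch_original(rep: Representation, store: Dict[str, OriginalItem]) -> Optional[OriginalItem]:
    """Follow a representation to its original, if this store holds it."""
    return store.get(rep.item_id)

hits = search_catalog("civil rights", catalog)
found = fetch_original(hits[0], originals)
missing = fetch_original(catalog[1], originals)  # described in the catalog, not held locally
```

The search runs entirely against the representations. The original is retrieved in a second step and may not be available at all, which is the situation of a bibliographic index that describes items it does not hold.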

Creating, organizing, and storing representations of information is a core area of LIS. The development of rules, procedures, algorithms, heuristics, and other organizational structures has been a cornerstone of LIS research and development since the profession began. Before computers, this work was in the form of card catalogs, indexes, and abstracts. Some of the conventions from these earlier structures have been translated into use with computerized storage. Some processes, such as the Anglo-American Cataloging Rules II (AACRII), have well-established conventions for describing and representing the information housed in libraries and information centers. Others, such as subject analysis and content coding (cataloging) of materials, have no equivalent to the AACRII conventions. Research issues in this area include: models of data structure (e.g., linear, relational, hierarchical, network), semantic nets, indexing, subject analysis (including automated analysis), natural and artificial language, and information representation.

Storage of original materials is advancing with changes in electronic technologies. Mechanical access media such as microfilm and microfiche are being replaced by electronic media such as laser discs and computer discs. There are technical implications about length of storage, speed of access, and cost of replication. Research in storage of original materials includes such issues as: knowledge base construction, access to very large databases, database construction, information construction, storage of nontextual information, hypermedia and multimedia, and data compression.

The 1994 request for proposal for a digital library initiative from the National Science Foundation and the Advanced Research Projects Agency laid out three areas for future research: capturing data, advanced algorithms, and networked databases. The details of this proposal are provided below as an overview of future research areas for information storage and retrieval.

1. Capturing data of all forms and categorizing and organizing electronic information in a variety of formats
  • Optical character reader (OCR) page layout
  • Speech recognition, audio segmentation, broadcast capture
  • Graphics understanding (image, drawing, graphs)
  • Indexing, interpretation, classification, and cataloging of electronic information
  • Multilingual indexing
  • Hypermedia structuring and linking
  • Graphical interfaces
  • Browsing technology
2. Advanced software and algorithms for browsing, searching, filtering, abstracting, summarizing, and combining large volumes of data, imagery, and all kinds of information
  • Retrieval theories and models for data, metadata, information, knowledge bases, evaluation methods
  • Formal structures of documents and texts, query languages
  • Feature-based image analysis and classification, pattern recognition
  • Spatial-temporal feature indexing of video
  • Filtering, routing, alerting and selection, dissemination of information
  • Natural language analysis
  • Adaptive learning systems
  • Pictorial feature recognition, image classification
  • Multiscale displays, zooming
  • Data visualization, interactive visualization control, simulation to improve visualization
  • Navigation, hypermedia, metaphors, virtual reality
3. Research on networking protocols and standards needed to ensure the ability of the digital network to accommodate high volume and worldwide distribution
  • Network security
  • Protocol design
  • Data compression
  • Scalability for large numbers of simultaneous users
  • Knowbots, agents, mediators, intelligent gatekeepers
  • Personalized interactive news, magazine, and journal services
  • Modeling and simulating usage
  • Collaboration technology

25.5.6 How Is Information Controlled?

In a democratic society, we prefer to believe that information is not controlled, that access to any type of information is available to every citizen. Yet we all know cases where information is not available for reasons of government security, because no one thought it was important enough to distribute widely, because it would limit sales, or because the library or information agency was required to remove the item due to complaints. Even in a democratic society, there are controls placed on both the information that is made available and the access to that information. It is through balance, understanding, and constant vigilance that those controls are kept from becoming repressive. One significant control on information is economic. The availability of information can be limited by the cost of making it available.

Three issues that have social, economic, and cultural implications for information access are discussed. Public policies about information availability and information gatekeepers are explored. Access through United States government resources is examined. Proprietary aspects of information access are discussed. The control of information can have positive and negative connotations. Control can limit, reduce, and provide barriers. Control can also add access points, increase public awareness of information availability, and raise questions for consideration and reflection.

25.5.6.1. Public Policy. Libraries and public information agencies are an example of a U.S. public policy for making information easily available to all of its members. These agencies are the results of ongoing policies and cultural belief systems that indicate free access to all types of information is important. Through research on users, user needs, user feelings, and user beliefs, library and information centers try to balance services, facilities, and materials to make the most available to the most people. This public policy to provide information and access to all citizens is an example of a positive element. Other elements, however, can serve to hamper this general policy. Local policies such as library hours, information center location, type of access (e.g., telephone reference, electronic access), and services for special populations can provide barriers and constraints to all members of a community sharing equal access to information resources. Economic issues can also limit access to information in libraries and information centers. Selection and collection development policies are created to take into account the economic necessities, but they can also limit what is made available. For example, with limited funding, a school library media center may restrict purchases to curriculum-related items. Students with special interests in personal reading or viewing may not be able to find items of interest in their school library media center.

Cheryl Metoyer-Duran (1993a, 1993b, 1993c) describes another aspect of control: gatekeepers. Gatekeepers are individuals who influence the access of others to information. In a large body of literature, gatekeepers are considered to perform the function of restricting access to information and providing negative controls. Metoyer-Duran suggests that gatekeepers, particularly in ethno-linguistically different communities, can improve access to information for community members.

25.5.6.2. United States Government Access. The United States government is a special case of information collection, storage, and dissemination. One purpose of government is to educate and inform the people it serves. It tries to do this through information dissemination that is economically viable. There is also a need to protect national security and other sensitive areas of government. Hernon and McClure divide government information into two types: (1) public information, which they define as that collected or developed by the government, not classified personal or proprietary; and (2) private information, which is for use only by the government for reasons of a privacy right or statutory obligation (1987, pp. 6-7).

Public service, economic constraints, and legal obligations create conflicting values and needs that influence the collection, storage, and dissemination of government information. Three emerging areas of research and discussion related to government information are discussed below: electronic information, access for special-needs populations, and economic conflicts. While the discussion for this review centers on U.S. federal government access, the issues discussed are equally relevant to local and state government information. For further research on government access to information, see: access (Hernon & McClure, 1988), electronic information (Hernon & McClure, 1993), federal statistics (Sy & Robbin, 1990), Internet (Kalhin, 1991; Lynch & Preston, 1990), National Research and Education Network, or NREN (McClure, Bishop, Doty, & Rosenbaum, 1991), privatizing government information (Stewart, 1990), and technology and information policies (Ballard, 1987).

A new issue in government access to information is electronic availability (Hernon & McClure, 1993). Government agencies are attempting to reduce cost and increase information access through electronic availability, such as government files on the Internet, and through alternative formats such as microfiche and CD-ROM. Sprehe (1992) suggests that federal agencies will need to organize and administer public access to maintain the greatest benefit to the user with the least disruption to the agency.

Other issues related to government access include information availability to special populations. Marshall (1992) suggests a number of issues related to the print disabled (blind and others who cannot read print). Information needs to be formatted in specific ways in order for it to be read by speech synthesizers or translated into Braille copy. There may be limited access that relates to hardware, software, and standards. Economic constraints to access also exist. Certain products are too expensive to purchase and translate (e.g., the Federal Register). Together, these factors create significant barriers to access to government information by print-disabled persons.

The cost for government agencies to gather, organize, and disseminate information lies both in real dollars and in resolving the conflicts created by agency policies. Different groups have needs that may conflict in regard to the price of government information and the ease of accessibility. A summary of economic problems and issues associated with access to electronic government information is provided by Hernon and McClure (1993, p. 76):

  • Librarians want to increase access but limit cost.
  • Budgeters want to decrease the federal deficit and increase revenue.
  • Economists want marginal cost pricing in order to maximize efficiency.
  • Lawyers want precedents and consistency with other laws.
  • Political scientists want an equitable process for setting prices.
  • Researchers and scientists want data available, they want to know the format, and they are not much interested in prices.
  • Statisticians want to maintain the integrity and accuracy of the data.
  • Computer specialists want controls to ensure efficient use (e.g., price controls).
  • Computer users want friendliness and flexibility in access.

While these economic factors were developed to account for differing needs around electronic access, they also are relevant to information access in general. Information has differing values and meanings to groups and individuals. For some, cost is a serious consideration in accessing information. For others, cost is unimportant, but the nature of the information's storage is critical (e.g., Braille, computer tape, hologram). For still others, it is the policy issues that regulate the dissemination of information that are crucial. Information needs can conflict and create confusion and dissent in policy development and implementation of access procedures.

25.5.6.3. Proprietary Interests. Services such as the Internet are resulting in mixtures of public and proprietary information sources that may have conflicting values, beliefs, and needs about who accesses information, what information is made available, and how much it costs. Some recent information scenarios illustrate conflicting values in information access between proprietary and public interests. The first scenario is an example of a conflict between public access to information and the proprietary use and sale of government information. The second scenario is related to public access to proprietary information. Both examples can be interpreted to respond to different public and private needs and pressures.

In the mid-1980s, a lobbying campaign by information industries resulted in information produced directly by government agencies becoming a commodity for sale to private companies (Smith, 1985). Private firms organize, package, and sell government information. This in itself is not a problem. The problem occurs because government versions of the same information, which used to be free to the public, now are difficult and sometimes impossible to obtain. Proprietary and public interests are in conflict.

Another example of conflict between proprietary and public interests is reported by Pfaffenberger (1990, p. 12). A librarian for the AFL-CIO attempted to search a DIALOG database produced by Dun & Bradstreet. The AFL-CIO had paid all the appropriate subscription and on-line fees for usage of DIALOG databases. They were denied access to the Dun & Bradstreet database. Dun & Bradstreet had sent a list to DIALOG indicating approximately 240 groups that were not allowed access. These groups were predominantly labor, consumer, and environmental organizations. There is a conflict between a proprietary organization's right to limit sales of its product and a strong societal belief that information should be available to everyone.

Services such as the Internet provide access to both free and private information. Issues of who should have access to the proprietary and public information will continue to be part of a debate about information control. In addition, other issues are being raised about certain groups of people and their access to information. For example, should students be allowed free access to all resources of the Internet, including the sexual and incendiary?

Doctor (1992) summarizes many of the research issues related to information technology and social equity. Five areas are discussed (p. 45): (1) the relationship between society and technology, (2) implementations of democracy and control relationships, (3) social justice and social equity, (4) information needs, sources, and uses, and (5) mass information delivery systems such as high-capacity computer networks. A common theme related to research in technology and society is the nature of their interactions. No technology is free of social pressures for its application and use. Technology and society are interdependent. These interactions can be seen in such research questions as: How is equity of access to technology achieved? What are implications of cultural lag in the workplace?

In discussing power and control in a democracy, Doctor (1992) addresses social justice and the distribution of power. Information-based power is considered as one of the possible outcomes of increasing technological implementation. Disparities in wealth and information access can also be seen to affect other aspects of social justice and social equity. "The gap between the wealthiest people in America and the poorest is increasing. Disparities in income, and therefore in the ability to acquire information resources, are worsening" (p. 54).

Doctor (1992) discusses the impact of mass information systems in terms of the distribution of information resources across society. Federal funding has helped some community-based agencies, including libraries, experiment with different types of information and referral systems. More recently, computer technologies are affecting the interactive delivery of information. Doctor suggests that there are two basic types of systems: specialist based and consumer based. Specialist-based systems are services such as DIALOG, BRS, and CompuServe. Consumer-based systems are services such as library-based community information systems, free-nets, Prodigy, and telephone company gateway systems. However, few of these current systems serve the daily information needs of the poor. "They are designed to serve upper- and upper-middle-income groups; only incidentally do some effectively reach down to middle- and lower-income groups" (p. 79). Doctor suggests that the development of programs to ensure distribution of resources to the one-third of the population that is information poor is both a research and a professional challenge.

25.5.7 Summary of Research Issues Related to Information

Information retrieval processes that interface with the storage of information are a substantive research area in information access. Two methods predominate in information retrieval: statistical/probabilistic and cognitive. Statistical and probability techniques focus on the development of management systems. They look at improving technological processes and retrieval functions. Cognitive retrieval efforts attempt to create models of users and develop interfaces for storage systems. They look at translating user needs into useful system retrieval methods. An example of cognitive research is the provision of multiple search strategies via an expert system interface.

Early cognitive research in information retrieval focused on developing prototype models. More recently, expert system intermediaries and hypermedia interfaces have been put into operation and their effects studied. The problems associated with applying both the prototypes and the operational systems to large databases are still under consideration. In addition, research to integrate sound, pictorial, and three-dimensional and moving images into the interface and retrieval processes is beginning. Issues such as image classification, spatial-temporal indexing of video, and hypermedia structuring are becoming part of the research of LIS. Merging features in electronic technologies are beginning to indicate that future research will focus on processes (such as interface design, retrieval strategies, and knowledge construction) rather than on the specifics of technologies such as CD-ROM, OPACs, or on-line search services.

Technical considerations are one component of information research in LIS. Social, political, and economic issues that affect public policy and information access are another area of research and development. What gets stored must be considered. Who has access to the stored information is part of the decision-making process for building information structures. Information has inherent meaning and value. Sufficient mechanical storage and retrieval by technical means does not necessarily meet the meaning and value needs of users. Some researchers concentrate on improving storage and retrieval mechanisms, technologies, and processes. Other researchers focus on social, political, and economic issues (see 13.6.1) that affect information access. A future research direction might be to integrate the technical and social aspects of information into a holistic research agenda.


Updated August 3, 2001
Copyright © 2001
The Association for Educational Communications and Technology
