5.3 MENTAL REPRESENTATION

5. Cognitive Perspectives in Psychology

The previous section showed the historical origins of the two major aspects of cognitive psychology that are addressed in this and the next section. These are mental representation and mental processes. Our example of representation was the mental image, and passing reference was made to memory structures and hierarchical chunks of information. We also talked generally about the input, processing, and output functions of the cognitive system, and paid particular attention to Marr's account of the processes of vision.

This section deals with cognitive theories of mental representation. How we store information in memory, represent it in our mind's eye, or manipulate it through the processes of reasoning has always seemed relevant to researchers in educational technology. Our field has sometimes supposed that the way in which we represent information mentally is a direct mapping of what we see and hear about us in the world (see Knowlton, 1966; Cassidy & Knowlton, 1983; Sless, 1981). Educational technologists have paid a considerable amount of attention to how visual presentations of different levels of abstraction affect our ability to reason literally and analogically (Winn, 1982). Since the earliest days of our discipline (Dale, 1946), we have been intrigued by the idea that the degree of realism with which we present information to students determines how well they learn. More recently (Salomon, 1979), we have come to believe that our thinking uses various symbol systems as tools, enabling us both to learn and to develop skills in different symbolic modalities. How mental representation is affected by what a student encounters in the environment has become inextricably bound up with the part of our field we call message design (Fleming & Levie, 1993; Rieber, 1994; Chapter 7).

5.3.1 Schema Theory

The concept of "schema" is central to cognitive theories of representation. There are many descriptions of what schemata are. All descriptions concur that a schema has the following characteristics: (1) It is an organized structure that exists in memory and, in aggregate with all other schemata, contains the sum of our knowledge of the world (Paivio, 1974). (2) It exists at a higher level of generality, or abstraction, than our immediate experience with the world. (3) It consists of concepts that are linked together in propositions. (4) It is dynamic, amenable to change by general experience or through instruction. (5) It provides a context for interpreting new knowledge as well as a structure to hold it. Each of these features requires comment.

5.3.1.1. Schema as Memory Structure. The idea that memory is organized in structures goes back to the work of Bartlett (1932). In experiments designed to explore the nature of memory that required subjects to remember stories, Bartlett was struck by two things: First, recall, especially over time, was surprisingly inaccurate; second, the inaccuracies were systematic in that they betrayed the influence of certain common characteristics of stories and turns of event that might be predicted from common occurrences in the world. Unusual plots and story structures tended to be remembered as closer to "normal" than in fact they were. Bartlett concluded from this that human memory consisted of cognitive structures that were built over time as the result of our interaction with the world and that these structures colored our encoding and recall of subsequently encountered ideas. Since Bartlett's work, both the nature and function of schemata have been amplified and clarified experimentally. The next few paragraphs describe how.

5.3.1.2. Schema as Abstraction. A schema is a more abstract representation than a direct perceptual experience. When we look at a cat, we observe its color, the length of its fur, its size, its breed if that is discernible, and any unique features it might have, such as a torn ear or unusual eye color. However, the schema that we have constructed from experience to represent "cat" in our memory, and by means of which we are able to identify any cat, does not contain these details. Instead, our "cat" schema will tell us that it has eyes, four legs, raised ears, a particular shape, and habits. However, it leaves those features that vary among cats, like eye color and length of fur, unspecified. In the language of schema theory, these are "place-holders," "slots," or "variables" to be "instantiated" through recall or recognition (Norman & Rumelhart, 1975).
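The slot-and-filler idea can be made concrete with a small sketch. The Python below is only an illustration and is not drawn from Norman and Rumelhart; the `Schema` class, its field names, and the particular cat features are assumptions chosen to mirror the example above.

```python
from dataclasses import dataclass, field


@dataclass
class Schema:
    """A toy schema: fixed features plus unfilled slots (all names hypothetical)."""
    name: str
    fixed: dict                               # features shared by every instance
    slots: set = field(default_factory=set)   # features left unspecified

    def instantiate(self, **observed):
        """Fill the slots from one observation; reject features that are not slots."""
        unknown = set(observed) - self.slots
        if unknown:
            raise ValueError(f"not a slot of {self.name!r}: {sorted(unknown)}")
        # Slots that were not observed stay unfilled (None).
        filled = {slot: observed.get(slot) for slot in self.slots}
        return {**self.fixed, **filled}


cat = Schema(
    name="cat",
    fixed={"legs": 4, "ears": "raised", "has_eyes": True},
    slots={"eye_color", "fur_length"},
)

# Recognizing a particular cat instantiates the variable slots.
my_cat = cat.instantiate(eye_color="green", fur_length="short")
```

The schema itself never stores an eye color; only an instantiated instance such as `my_cat` does, which is the sense in which the schema is more general than any single experience.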

It is this abstraction, or generality, that makes schemata useful. If memory required that we encode every feature of every experience that we had, without stripping away variable details, recall would require us to match every experience against templates in order to identify objects and events, a suggestion that has long since been discredited for its unrealistic demands on memory capacity and cognitive processing resources (Pinker, 1985). On rare occasions, the generality of schemata may prevent us from identifying something. For example, we may misidentify a penguin because, superficially, it has few features of a bird. As we shall see below, learning requires the modification of schemata so that they can accurately accommodate unusual instances, like penguins, while still maintaining a level of specificity that makes them useful.

5.3.1.3. Schema as Network. Schemata have been conceived of and described in many ways. One of the most prevalent conceptions of schema has been as a network of concepts connected by links. Illustrative is Palmer's (1975) description of a schema to represent the concept "face." The schema consists of nodes and links that describe the relations between node pairs. The central node in the network is the head, which is roughly oval in shape. The other nodes, representing other features of a face such as eyes, nose, and mouth, are described in terms of their relationship to the head. The right eye is connected to the head by three links specifying shape, size, and location. Thus, the eye is an oval, like the head, but turned through an angle of 90° relative to the head; it is roughly one-eighth the size of the head; it is located above and to the right of the head's center. In this schema, the relationships (size, shape, and orientation) are constant, and the nodes (eye, nose, mouth) are "placeholders" whose exact nature varies from case to case. Eye color, for example, is not specified in the face schema. But eyes are always above the nose. As in most cases, it is therefore the schema's structure, determined by the links, rather than characteristics of individual nodes that is encoded and against which new information is compared.
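One hedged way to picture such a network is as a set of labeled links, with recognition amounting to a check that the links are present. The sketch below is not Palmer's formalism; the triple format, the names `FACE_SCHEMA` and `fits_schema`, and the wording of the link values are assumptions, and only the three right-eye links described above are encoded.

```python
# Each link is (part, relation, value), stated relative to the head node.
FACE_SCHEMA = frozenset({
    ("right_eye", "shape", "oval, rotated 90 degrees"),
    ("right_eye", "size", "one-eighth of head"),
    ("right_eye", "location", "above and right of center"),
})


def fits_schema(observed_links, schema=FACE_SCHEMA):
    """True when the observation contains every link the schema specifies.

    Links the schema does not mention (eye color, say) are simply ignored.
    """
    return schema <= set(observed_links)


# Same structure, plus a detail the schema leaves open.
brown_eyed = set(FACE_SCHEMA) | {("right_eye", "color", "brown")}

# Same parts, but one structural link is violated.
misplaced = (
    set(FACE_SCHEMA) - {("right_eye", "location", "above and right of center")}
) | {("right_eye", "location", "below center")}
```

Here `fits_schema(brown_eyed)` holds while `fits_schema(misplaced)` does not: the comparison is made against the links, not against the properties of any individual node.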

5.3.1.4. Schema as Dynamic Structure. A schema is not immutable. As we learn new information, either from instruction or from day-to-day interaction with the environment, our memory and understanding of our world will change. Schema theory proposes that our knowledge of the world is constantly interpreting new experience and adapting to it. These processes, which Piaget (1968) has called assimilation and accommodation, and which Thorndyke and Hayes-Roth (1979) have called bottom-up and top-down processing, interact dynamically in an attempt to achieve cognitive equilibrium without which the world would be a tangled blur of meaningless experiences. The process works like this: When we encounter a new object, experience, or piece of information, we attempt to match its features and structure (nodes and links) to a schema in memory (bottom-up). On the basis of the success of this first attempt at matching, we construct a hypothesis about the identity of the object, experience, or information, on the basis of which we look for further evidence to confirm our identification (top-down). If further evidence confirms our hypothesis, we assimilate the experience to the schema. If it does not, we revise our hypothesis, thus accommodating to the experience.

Let us return to Palmer's (1975) "face" schema to illustrate. Palmer describes what happens when a person is shown a "face," whose head consists of a watermelon, whose eyes are apples, whose nose is a pear, and whose mouth is a banana. At first glance, on the basis of structural cues, one interprets the picture as a face. However, this hypothesis is not borne out when confirming evidence is sought and a "fruit" schema (or perhaps "fruitface" schema) is hypothesized. Admittedly, this example is a little unusual. However, it brings home the importance of structure in schemata and illustrates the fact that accommodation of a schema to new information is often achieved by reconciling discrepancies between global and local features.
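The two-pass matching just described, a structural first guess followed by a check of local evidence, can be sketched in a few lines. This is a deliberately crude illustration under stated assumptions: the function `identify`, the cue and part labels, and the idea of scoring a hypothesis by counting shared cues are all hypothetical and are not taken from Palmer (1975).

```python
def identify(cues, parts, schemata):
    """Return (schema name, outcome) for one observation.

    cues  -- set of global, structural features of the observation
    parts -- set of local features (what each component actually is)
    """
    # Bottom-up: hypothesize the schema sharing the most structural cues.
    hypothesis = max(schemata, key=lambda name: len(schemata[name]["cues"] & cues))
    # Top-down: seek confirming evidence among the local features.
    if parts <= schemata[hypothesis]["parts"]:
        return hypothesis, "assimilated"
    # The evidence conflicts: revise the hypothesis to a schema that fits the parts.
    for name, schema in schemata.items():
        if name != hypothesis and parts <= schema["parts"]:
            return name, "revised"
    return hypothesis, "unresolved"


SCHEMATA = {
    "face": {
        "cues": {"oval outline", "two items above one", "one item at bottom"},
        "parts": {"head", "eye", "nose", "mouth"},
    },
    "fruit": {
        "cues": {"round or curved shapes"},
        "parts": {"watermelon", "apple", "pear", "banana"},
    },
}

face_cues = {"oval outline", "two items above one", "one item at bottom"}
ordinary = identify(face_cues, {"head", "eye", "nose", "mouth"}, SCHEMATA)
fruit_face = identify(face_cues, {"watermelon", "apple", "pear", "banana"}, SCHEMATA)
```

Both pictures trigger the "face" hypothesis on structural grounds, but only the ordinary one survives the search for confirming evidence; the fruit arrangement forces a revision.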

Learning takes place as schemata change, as they accommodate to new information in the environment, and as new information is assimilated by them. Rumelhart and Norman (1981) discuss important differences in the extent to which these changes take place. Learning takes place by accretion, by schema tuning, or by schema creation.

In the case of accretion, the match between new information and schemata is so good that the new information is simply added to an existing schema with almost no accommodation of the schema at all. A hiker might learn to recognize a golden eagle simply by matching it to an already-familiar bald eagle schema, noting only the absence of the latter's white head and tail.

Schema tuning results in more radical changes in a schema. A child raised in the inner city might have formed a "bird" schema on the basis of seeing only sparrows and pigeons. The features of this schema might be: a size of between 3 and 10 inches, flying by flapping wings, found around and on buildings. This child's first sighting of an eagle would probably be confusing, and might lead to a misidentification as an airplane, which is bigger than 10 inches long and does not flap its wings. Learning, perhaps through instruction, that this creature was indeed a bird would lead to changes in the "bird" schema, to include soaring as a means of getting around, large size, and mountain habitat.
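As a rough sketch of what tuning might amount to, the child's "bird" schema can be written as a set of admissible values that is widened just enough to admit the new instance. The dictionary layout and the function names `fits` and `tune` are assumptions made for illustration, and the eagle's size of 36 inches is an assumed figure, not one given in the text.

```python
def fits(schema, instance):
    """True when the instance falls inside everything the schema allows."""
    low, high = schema["size_in"]
    return (
        low <= instance["size_in"] <= high
        and instance["locomotion"] in schema["locomotion"]
        and instance["habitat"] in schema["habitat"]
    )


def tune(schema, instance):
    """Return a copy of the schema widened just enough to admit the instance."""
    low, high = schema["size_in"]
    return {
        "size_in": (min(low, instance["size_in"]), max(high, instance["size_in"])),
        "locomotion": schema["locomotion"] | {instance["locomotion"]},
        "habitat": schema["habitat"] | {instance["habitat"]},
    }


city_bird = {"size_in": (3, 10), "locomotion": {"flapping"}, "habitat": {"buildings"}}
sparrow = {"size_in": 6, "locomotion": "flapping", "habitat": "buildings"}
eagle = {"size_in": 36, "locomotion": "soaring", "habitat": "mountains"}  # size is assumed

tuned_bird = tune(city_bird, eagle)
```

Before tuning, the eagle fails to match the schema; afterward both the eagle and the familiar sparrow fit. The sketch also hints at the cost: a schema widened too far stops discriminating at all.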

Rumelhart and Norman describe schema creation as occurring by analogy. Stretching the bird example to the limits of credibility, imagine someone from a country that has no birds but lots of bats for whom a "bird" schema does not exist. The creation of a bird schema could take place by temporarily substituting the features birds have in common with bats and then specifically teaching the differences. The danger, of course, is that a significant residue of bat features could persist in the bird schema, in spite of careful instruction. Analogies can therefore be misleading (Spiro, Feltovich, Coulson & Anderson, 1989) if they are not used with extreme care.

5.3.1.5. Schema as Context. A schema does more than serve as a repository of experiences. It provides a context that affects how we interpret new experiences and even directs our attention to particular sources of experience and information. From the time of Bartlett, schema theory has been developed largely from research in reading comprehension, and it is from this area of research that the strongest evidence comes for the decisive role of schemata in interpreting text.

The research design for these studies requires the activation of a well-developed schema to set a context, the presentation of a text that is often deliberately ambiguous, and a comprehension posttest. For example, Bransford and Johnson (1972) had subjects study a text that was so ambiguous as to be meaningless without the presence of an accompanying picture. Anderson, Reynolds, Schallert, and Goetz (1977) presented ambiguous stories to different groups of people. A story that could have been about weight lifting or a prison break was interpreted to be about weight lifting by students in a weight-lifting class, but in other ways by other students. Musicians interpreted a story that could have been about playing cards or playing music as if it were about music.

Neisser (1976) has argued that schemata not only determine interpretation but also affect people's anticipations of what they are going to find in the environment. Thus, in what Neisser calls a perceptual cycle, "anticipatory schemata" direct our exploration of the environment. Our exploration of the environment leads us to some sources of information rather than others. The information we find modifies our schemata, in ways we have already encountered, and the cycle repeats itself.

5.3.2 Schema Theory and Educational Technology

Schema theory has influenced educational technology in a variety of ways. For instance, the notion of activating a schema in order to provide a relevant context for learning finds a close parallel in Gagné, Briggs, and Wager's (1988) third instructional "event," "stimulating recall of prerequisite learning." Reigeluth's (Reigeluth & Stein, 1983) "elaboration theory" of instruction consists of, among other things, prescriptions for the progressive refinement of schemata. The notion of a "generality," which has persisted through the many stages of Merrill's instructional theory (Merrill, 1983, 1988; Merrill, Li & Jones, 1991), is close to a schema.

There are, however, three particular ways in which educational technology research has used schema theory (or at least some of the ideas it embodies, in common with other cognitive theories of representation). The first concerns the assumption, and attempts to support it, that schemata can be more effectively built and activated if the material that students encounter is somehow isomorphic to the putative structure of the schema. This line of research extends into the realm of cognitive theory's earlier attempts to propose and validate a theory of audiovisual (usually more visual than audio) education and concerns the role of pictorial and graphic illustration in instruction (Dale, 1946; Carpenter, 1953; Dwyer, 1972, 1978, 1987).

The second way in which educational technology has used schema theory has been to develop and apply techniques for students to use to impose structure on what they learn and thus make it more memorable. These techniques are referred to, collectively, by the term information mapping.

The third line of research consists of attempts to use schemata to represent information in a computer and thereby to enable the machine to interact with information in ways analogous to human assimilation and accommodation. This brings us to a consideration of the role of schemata, or "scripts" (Schank & Abelson, 1977) or "plans" (Minsky, 1975) in AI and "intelligent" instructional systems (see 19.2.3.1). The next sections examine these lines of research.

5.3.3 Schema-Message Isomorphism: Imaginal Encoding

There are two ways in which pictures and graphics can affect how information is encoded in schemata. Some research suggests that a picture is encoded directly as a mental image. This means that encoding leads to a schema that retains many of the properties of the message that the student saw, such as its spatial structure and the appearance of its features. Other research suggests that the picture or graphic imposes a structure on information first and that propositions about this structure rather than the structure itself are encoded. The schema therefore does not contain a mental image but information that allows an image to be created in the mind's eye when the schema becomes active. This and the next section examine these two possibilities.

Research into imaginal encoding is typically conducted within the framework of theories that propose two (at least) separate, though connected, memory systems (see 29.2.3). Paivio's (1983; Clark & Paivio, 1992) "dual coding" theory and Kulhavy's (Kulhavy, Lee & Caterino, 1985; Kulhavy, Stock & Caterino, 1994) "conjoint retention" theory are typical. Both theories assume that people can encode information as languagelike propositions or as picturelike mental images. This research has provided evidence that (1) pictures and graphics contain information that is not contained in text, and (2) information shown in pictures and graphics is easier to recall because it is encoded in both memory systems, as propositions and as images, rather than just as propositions, which is the case when students read text. As an example, Schwartz and Kulhavy (1981) had subjects study a map while listening to a narrative describing the territory. Map subjects recalled more spatial information related to map features than nonmap subjects, while there was no difference between recall of the two groups on information not related to map features. In another study, Abel and Kulhavy (1989) found that subjects who saw maps of a territory recalled more details than subjects who read a corresponding text, suggesting that the map provided "second stratum cues" that made it easier to recall information.

5.3.4 Schema-Message Isomorphism: Structural Encoding

Evidence for the claim that graphics help students organize content by determining the structure of the schema in which it is encoded comes from studies that have examined the relationship between spatial presentations and cued or free recall. The assumption is that the spatial structure of the information on the page reflects the semantic structure of the information that gets encoded. For example, Winn (1980) used text with or without a block diagram to teach about a typical food web to high school subjects. Estimates of subjects' semantic structures representing the content were obtained from their free associations to words naming key concepts in the food web (e.g., consumer, herbivore). It was found that the diagram significantly improved the closeness of the structure the students acquired to the structure of the content.

More recently, McNamara, Hardy, and Hirtle (1989) had subjects learn spatial layouts of common objects. Ordered trees, constructed from free-recall data, revealed hierarchical clusters of items that formed the basis for organizing the information in memory. A recognition test, in which targeted items were primed by items either within or outside the same cluster, produced response latencies that were faster for same-cluster items than for different-cluster items. The placement of an item in one cluster or another was determined, for the most part, by the spatial proximity of the items in the original layout.

In another study, McNamara (1986) had subjects study the layout of real objects placed in an area on the floor. The area was divided by low barriers into four quadrants of equal size. Primed recall produced response latencies suggesting that the physical boundaries imposed categories on the objects when they were encoded that overrode the effect of absolute spatial proximity. For example, recall responses were slower to items physically close but separated by a boundary than to items farther apart but within the same boundary. The results of studies like these have been the basis for recommendations about when and how to use pictures and graphics in instructional materials (Levin, Anglin & Carney, 1987; Winn, 1989b).

5.3.5 Schemata and Information Mapping

Strategies exploiting the structural isomorphism of graphics and knowledge schemata have also formed the basis for a variety of text- and information-mapping schemes aimed at improving comprehension (Armbruster & Anderson, 1982, 1984) and study skills (Dansereau et al., 1979; Holley & Dansereau, 1984). Research on the effectiveness of these strategies and its application is one of the best examples of how cognitive theory has come to be used by instructional designers.

The assumptions underlying all information-mapping strategies are that if information is well organized in memory, it will be better remembered and more easily associated with new information, and that students can be taught techniques exploiting the spatial organization of information on the page that make what they learn better organized in memory (see 24.7). We have already given examples of research that bears out the first of these assumptions. We turn now to research on the effectiveness of information-mapping techniques.

All information-mapping strategies (reviewed and summarized by Hughes, 1989) require students to learn ways to represent information, usually text, in spatially constructed diagrams. With these techniques, they construct diagrams that represent the concepts they are to learn as verbal labels often in boxes and that show interconcept relations as lines or arrows. The most obvious characteristic of these techniques is that students construct the information maps for themselves rather than studying diagrams created by someone else. In this way, the maps require students to process the information they contain in an effortful manner, while allowing a certain measure of idiosyncrasy in the ideas shown, both of which are attributes of effective learning strategies.

Some mapping techniques are radial, with the key concept in the center of the diagram and related concepts on arms reaching out from the center (Hughes, 1989). Other schemes are more hierarchical, with concepts placed on branches of a tree (Johnson, Pittelman & Heimlich, 1986).

Still others maintain the roughly linear format of sentences but use special symbols to encode interconcept relations, like equals signs or different kinds of boxes (Armbruster & Anderson, 1984). Some computer-based systems provide more flexibility by allowing "zooming" in or out on concepts to reveal subconcepts within them and by allowing users to introduce pictures and graphics from other sources (see 24.7; Fisher et al., 1990).

Regardless of format, information mapping has been shown to be effective. In some cases, information-mapping techniques have formed part of study skills curricula (Holley & Dansereau, 1984; Schewel, 1989). In other cases, the technique has been used to improve reading comprehension (Ruddell & Boyle, 1989) or for review at the end of a course (Fisher et al., 1990). Information mapping has been shown to be useful for helping students write about what they have read (Sinatra, Stahl-Gemake & Morgan, 1986) and works with disabled readers as well as with normal ones (Sinatra, Stahl-Gemake & Borg, 1986). Information mapping has proved to be a successful technique in all of these tasks and contexts, showing it to be remarkably robust.

Information mapping can, of course, be used by instructional designers (Jonassen, 1991, 1996; Jonassen, Beissner & Yacci, 1993). In this case, the technique is used not so much to improve comprehension as to help designers understand the relations among concepts in the material they are working with. Often, understanding such relations makes strategy selection more effective. For example, a radial outline based on the concept "zebra" (Hughes, 1989) shows, among other things, that a zebra is a member of the horse family and also that it lives in Africa on the open grasslands. From the layout of the radial map, it is clear that membership of the horse family is a different kind of interconcept relation than the relation with Africa and grasslands. The designer will therefore be likely to organize the instruction so that a zebra's location and habitat are taught together and not at the same time as the zebra's place in the mammalian taxonomy is taught. We will return to instructional designers' use of information mapping techniques in our discussion of cognitive objectives in section 5.5.
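The designer's use of the zebra map can be sketched as data: each arm of the radial map becomes a labeled link, and links of the same kind are grouped so that they can be taught together. This is only an illustrative sketch; the tuple layout, the function `group_by_kind`, and the "taxonomy" and "habitat" labels are assumptions, not part of Hughes's (1989) scheme.

```python
from collections import defaultdict

# (concept, relation, related concept, kind of relation); the "kind" labels are assumed.
ZEBRA_MAP = [
    ("zebra", "is a member of", "horse family", "taxonomy"),
    ("zebra", "lives in", "Africa", "habitat"),
    ("zebra", "lives on", "open grasslands", "habitat"),
]


def group_by_kind(links):
    """Collect the statements on the map under the kind of relation they express."""
    groups = defaultdict(list)
    for concept, relation, other, kind in links:
        groups[kind].append(f"{concept} {relation} {other}")
    return dict(groups)


lessons = group_by_kind(ZEBRA_MAP)
```

In `lessons`, the two habitat statements fall into one group and the taxonomic statement into another, which mirrors the sequencing decision described above.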

All of this seems to suggest that imagery-based and information-structuring strategies based on graphics have been extremely useful in practice. However, the whole idea of isomorphism between an information display outside the learner and the structure and content of a memory schema implies that information in the environment is mapped fairly directly into memory. As we have seen, this basic assumption of much of cognitive theory is currently being seriously challenged. The extent to which this challenge threatens the usefulness of using pictures and graphics in instruction remains to be seen.

5.3.6 Schemata and AI

Another way in which theories of representation have been used in educational technology is to suggest ways in which computer programs designed to "think" like people might represent information. Clearly, this application embodies the "computer models of mind" assumption that we looked at above (Boden, 1988).

The structural nature of schemata makes them particularly attractive to cognitive scientists working in the area of artificial intelligence. The reason for this is that they can be described using the same "language" that is used by computers and therefore provide a convenient link between human and artificial thought. The best examples are to be found in the work of Minsky (1975) and of Schank and his associates (Schank & Abelson, 1977). Here, schemata provide constraints on the meaning of information that the computer and the user share that make the interaction between them more manageable and useful. The constraints arise from only allowing what typically happens in a given situation to be considered. For example, certain actions and verbal exchanges commonly take place in a restaurant. You enter. Someone shows you to your table. Someone brings you a menu. After a while, the waiter comes back, and you order your meal. Your food is brought to you in a predictable sequence. You eat it in a predictable way. When you have finished, someone brings you the bill, which you pay. You leave. It is not likely (though not impossible, of course) that someone will bring you a basketball rather than the food you ordered. Usually, you will eat your food rather than sing to it. You use cash or a credit card to pay for your meal rather than offering a giraffe. In this way, the almost infinite number of things that can occur in the world are constrained to relatively few, which means that the machine has a better chance of figuring out what your words or actions mean.
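A minimal sketch shows how a script narrows interpretation. It is not Schank and Abelson's representation; the event labels, the list `RESTAURANT_SCRIPT`, and the functions `expected_next` and `interpret_event` are hypothetical names, and the script is reduced to a single fixed sequence for simplicity.

```python
RESTAURANT_SCRIPT = [
    "enter", "be seated", "receive menu", "order",
    "be served", "eat", "receive bill", "pay", "leave",
]


def expected_next(script, done):
    """Events the script allows once `done` steps have already occurred."""
    return script[done:done + 1]


def interpret_event(script, done, candidates):
    """Pick the candidate reading the script predicts; None if none fits."""
    allowed = expected_next(script, done)
    for reading in candidates:
        if reading in allowed:
            return reading
    return None


# After the bill arrives (7 steps done), an ambiguous action is read as paying.
after_bill = interpret_event(RESTAURANT_SCRIPT, 7, ["offer giraffe", "pay"])

# After the food is served (5 steps done), singing has no place in the script.
after_serving = interpret_event(RESTAURANT_SCRIPT, 5, ["sing to food"])
```

Of the many things a diner might do, the script admits only one at each point, so `after_bill` is "pay", while the singing diner yields `None`: the script alone has nothing to say about an event it does not anticipate.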

Even so, schemata (or "scripts" as Schank [1984] calls them) cannot contend with every eventuality. This is because the assumptions about the world that are implicit in our schemata, and therefore often escape our awareness, have to be made explicit in scripts that are used in AI. Schank (1984) provides examples as he describes the difficulties encountered by TALE-SPIN, a program designed to write stories in the style of Aesop's fables.

One day Joe Bear was hungry. He asked his friend Irving Bird where some honey was. Irving told him there was a beehive in the oak tree. Joe walked to the oak tree. He ate the beehive.

Here, the problem is that we know beehives contain honey, and while they are indeed a source of food, they are not themselves food but contain it. The program did not know this, nor could it infer it. A second example, with Schank's own analysis, makes a similar point:

Henry Ant was thirsty. He walked over to the river bank where his good friend Bill Bird was sitting. Henry slipped and fell in the river. He was unable to call for help. He drowned. This was not the story that TALE-SPIN set out to tell. ... Had TALE-SPIN found a way for Henry to call to Bill for help, this would have caused Bill to try to save him. But the program had a rule that said that being in water prevents speech. Bill was not asked a direct question, and there was no way for any character to just happen to notice something. Henry drowned because the program knew that that's what happens when a character that can't swim is immersed in water (1984, p. 84).

The rules that the program followed, leading to the sad demise of Henry, are rules that normally apply. People do not usually talk when they're swimming. However, in this case, a second rule should have applied, as those of us who understand a calling-for-help-while-drowning schema are well aware.
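The interaction of the two rules can be illustrated with a toy function. This is emphatically not TALE-SPIN's code or rule format, which the text does not show; the function `outcome` and its parameters are hypothetical, and the rescue rule is the "second rule" supplied here by assumption.

```python
def outcome(can_swim, friend_nearby, water_prevents_speech=True):
    """Toy story step for a character who has fallen into a river."""
    if can_swim:
        return "swims to shore"
    # Rule from the quoted analysis: being in water prevents speech.
    can_call = not water_prevents_speech
    # Assumed exception rule: a call for help brings a nearby friend to the rescue.
    if can_call and friend_nearby:
        return "rescued"
    # A non-swimmer left in the water drowns.
    return "drowns"


as_told = outcome(can_swim=False, friend_nearby=True)
with_exception = outcome(can_swim=False, friend_nearby=True, water_prevents_speech=False)
```

With the speech rule applied without exception, Henry drowns even though Bill is sitting on the bank; relaxing that one rule changes the ending. The point is that the exception has to be stated explicitly before a program can use it.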

The more general issue that arises from these examples is that people have extensive knowledge of the world that goes beyond any single set of circumstances that might be defined in a script. And human intelligence rests on the judicious use of this general knowledge. Thus, on the rare occasion that we do encounter someone singing to their food in a restaurant, we have knowledge from beyond the immediate context that lets us conclude the person has had too much to drink, or is preparing to sing a role at the local opera and is therefore not really singing to her food at all, or belongs to a cult for whom praising the food about to be eaten in song is an accepted ritual. The problem for the AI designer is therefore how much of this general knowledge to allow the program to have. Too little, and the correct inferences cannot be made about what has happened when there are even small deviations from the norm. Too much, and the task of building a production system that embodies all the possible reasons for something to occur becomes impossibly complex.

It has been claimed that AI has failed (Dreyfus & Dreyfus, 1986) because "intelligent" machines do not have the breadth of knowledge that permits human reasoning. A current project called "Cyc" (Guha & Lenat, 1991; Lenat, Guha, Pittman, Pratt & Shepherd, 1990) has as its goal to imbue a machine with precisely the breadth of knowledge that humans have. Over a period of years, programmers will have worked away at encoding an impressive number of facts about the world. If this project is successful, it will be testimony to the usefulness of general knowledge of the world for problem solving and will confirm the severe limits of a "schema" or "script" approach to AI. It may also suggest that the schema metaphor is misleading. Maybe people do not organize their knowledge of the world in clearly delineated structures. A lot of thinking is "fuzzy," and the boundaries among schemata are permeable and indistinct.

5.3.7 Mental Models

Another way in which theories of representation have influenced research in educational technology is through psychological and human factors research on mental models. A mental model, like a schema, is a putative structure that contains knowledge of the world. For some, mental models and schemata are synonymous. However, there are two properties of mental models that make them somewhat different from schemata. Mayer (1992, p. 431) identifies these as (1) representations of objects in whatever the model describes and (2) descriptions of how changes in one object effect changes in another. Roughly speaking, a mental model is broader in conception than a schema because it specifies causal actions among objects that take place within it. However, you will find any number of people who disagree with this distinction.

The term envisionment is often applied to the representation of both the objects and the causal relations in a mental model (DeKleer & Brown, 1981; Strittmatter & Seel, 1989). This term draws attention to the visual metaphors that often accompany discussion of mental models. When we use a mental model, we "see" a representation of it in our "mind's eye." This representation has spatial properties akin to those we notice with our biological eye. Some objects are "closer to" some than to others. And from seeing changes in our mind's eye in one object occurring simultaneously with changes in another, we infer causality between them. This is especially true when we consciously bring about a change in one object ourselves. For example, Sternberg and Weil (1980) gave subjects such problems to solve as: "If A is bigger than B and C is bigger than A, who is the smallest?" Subjects who changed the representation of the problem by placing the objects A, B, and C in a line from tallest to shortest were most successful in solving the problem, because envisioning it in this way allowed them simply to "see" the answer. Likewise, envisioning what happens in an electrical circuit that includes an electric bell (DeKleer & Brown, 1981) allows someone to come to understand how it works. In short, a mental model can be "run" like a film or computer program and watched in the mind's eye while it is running. You may have observed world-class skiers "running" their model of a slalom course, eyes closed, body leaning into each gate, before they make their run.
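The strategy of lining the objects up and reading off the answer has a simple computational analogue. The sketch below is only an analogy, not a model of what Sternberg and Weil's subjects did; the function `line_up` and the pair format are assumptions. It uses Python's standard `graphlib.TopologicalSorter` (Python 3.9 or later) to turn pairwise premises into a single line.

```python
from graphlib import TopologicalSorter


def line_up(premises):
    """Arrange items from biggest to smallest, given (bigger, smaller) pairs."""
    sorter = TopologicalSorter()
    for bigger, smaller in premises:
        # Declaring `bigger` a predecessor of `smaller` places it earlier in the line.
        sorter.add(smaller, bigger)
    return list(sorter.static_order())


# "A is bigger than B and C is bigger than A."
order = line_up([("A", "B"), ("C", "A")])
smallest = order[-1]
```

Once the premises are rearranged into the line C, A, B, no further inference is needed: the smallest item is simply the one at the end. When the premises do not fix a single order, more than one line is consistent with them, and the function returns just one of these.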

The greatest interest in mental models by educational technologists lies in ways of getting learners to create good ones. This implies, as in the case of schema creation, that instructional materials and events interact with what learners already understand in order to construct a mental model that the student can use to develop understanding. Just how instruction affects mental models has been the subject of considerable research, summarized by Gentner and Stevens (1983), Mayer (1989a), and Rouse and Morris (1986), among others. At the end of his review, Mayer lists seven criteria that instructional materials should meet to induce mental models that are likely to improve understanding. (Mayer refers to the materials, typically illustrations and text, as "conceptual models" that describe in graphic form the objects and causal relations among them.) A good model is: Complete (it contains all the objects, states, and actions of the system); Concise (it contains just enough detail); Coherent (it makes "intuitive sense"); Concrete (it is presented at an appropriate level of familiarity); Conceptual (it is potentially meaningful); Correct (the objects and relations in it correspond to actual objects and events); and Considerate (it uses appropriate vocabulary and organization).

If these criteria are met, then instruction can lead to the creation of models that help students understand systems and solve problems arising from the way the systems work. For example, Mayer (1989b) and Mayer and Gallini (1990) have demonstrated that materials, conforming to these criteria, in which graphics and text work together to illustrate both the objects and causal relations in systems (hydraulic drum brakes, bicycle pumps) were effective at promoting understanding. Subjects were able to answer questions requiring them to draw inferences from their mental models of the system using information they had not been explicitly taught. For instance, the answer (not explicitly taught) to the question "Why do brakes get hot?" can be found only in an understanding of the causal relations among the pieces of a brake system. A correct answer implies that an accurate mental model has been constructed.

A second area of research on mental models in which educational technologists are now engaging arises from a belief that interactive multimedia systems are effective tools for model building (Hueyching & Reeves, 1992; Kozma, Russell, Jones, Marx & Davis, 1993; Seel & Dörr, 1994). For the first time, we are able, with reasonable ease, to build instructional materials that are both interactive and that, through animation, can represent the changes of state and causal actions of physical systems. Kozma et al. describe a computer system that allows students to carry out simulated chemistry experiments. The graphic component of the system (which certainly meets Mayer's criteria for building a good model) presents information about changes of state and causality within a molecular system. It "corresponds to the molecular level mental models that chemists have of such systems" (Kozma et al., 1993, p. 16). Analysis of constructed student responses and of think-aloud protocols has demonstrated the effectiveness of this system at helping students construct good mental models of chemical reactions. Byrne, Furness, and Winn (1995) describe a virtual environment in which students learn about atomic and molecular structure by building atoms from their subatomic components. The most successful treatment for building mental models was a highly interactive one.

5.3.8 Mental Representation and the Development of Expertise

The knowledge we represent as schemata or mental models changes as we work with it over time. It becomes much more readily accessible and useable, requiring less conscious effort to use it effectively. At the same time, its own structure becomes more robust, and it is increasingly internalized and automatized. The result is that its application becomes relatively straightforward and automatic, and frequently occurs without our conscious attention. When we drive home after work, we do not have to think hard about what to do or where we are going. It is important in the research that we shall examine below that this process of "knowledge compilation and translation" (Anderson, 1983) is a slow process. One of the biggest oversights in our field has occurred when instructional designers have assumed that task analysis should describe the behavior of experts rather than novices, completely ignoring the fact that expertise develops in stages and that novices cannot simply "get there" in one jump.

Out of the behavioral tradition that continues to dominate a great deal of thinking in educational technology comes the assumption that it is possible for mastery to result from instruction. In mastery learning, the only instructional variable is the time required to learn something. Therefore, given enough time, anyone can learn anything. The evidence that this is the case is compelling (Bloom, 1984, 1987; Kulik, 1990a, b). However, "enough time" typically comes to mean the length of a unit, module, or semester, and "mastery" means mastery of performance, not of high-level skills such as problem solving.

There is a considerable body of opinion that expertise arises from a much longer exposure to content in a learning environment than that implied in the case of mastery learning. Labouvie-Vief (1990) has suggested that wisdom arises during adulthood from processes that represent a fourth "stage" of human development, beyond Piaget's traditional three. Achieving a high level of expertise in chess (Chase & Simon, 1973) or in the professions (Schön, 1983, 1987) takes many years of learning and applying what one has learned. This implies that learners move through stages on their way from novicehood to expertise, and that, as in the case of cognitive development (Piaget & Inhelder, 1969), each stage is a necessary prerequisite for the next and cannot be skipped. In this case, expertise does not arise directly from instruction. It may start with some instruction, but it develops fully only with maturity and experience on the job (Lave & Wenger, 1991).

An illustrative account of the stages a person goes through on the way to expertise is provided by Dreyfus and Dreyfus (1986). The stages are: novice, advanced beginner, competence, proficiency, and expertise. Dreyfus and Dreyfus's examples are exceptionally useful in clarifying the differences between stages. The following few paragraphs are therefore based on their narrative (1986, pp. 21-35).

Novices learn objective and unambiguous facts and rules about the area that they are beginning to study. These facts and rules are typically learned out of context. For example, beginning nurses learn how to take a patient's blood pressure and are taught rules about what to do if the reading is normal, high, or very high. However, they do not yet necessarily understand what blood pressure really indicates or why the actions specified in the rules are necessary or how they affect the patient's recovery. In a sense, the knowledge they acquire is "inert" (Cognition and Technology Group at Vanderbilt, 1990) in that, though it can be applied, it is applied blindly and without a context or rationale.

Advanced beginners continue to learn more objective facts and rules. However, with their increased practical experience, they also begin to develop a sense of the larger context in which their developing knowledge and skill operate. Within that context, they begin to associate the objective rules and facts they have learned with particular situations they encounter on the job. Their knowledge becomes "situational" or "contextualized." For example, student nurses begin to recognize patients' symptoms by means that cannot be expressed in objective, context-free rules. The way a particular patient's breathing sounds may be sufficient to indicate that a particular action is necessary. However, the sound itself cannot be described objectively, nor can recognizing it be learned anywhere except on the job.

As the student moves into competence and develops further sensitivity to information in the working environment, the number of context-free and situational facts and rules begins to overwhelm the student. The situation can be managed only when the student learns effective decision-making strategies. Student nurses at this stage often appear to be unable to make decisions. They are still keenly aware of the things they have been taught to look out for and the procedures to follow in the maternity ward. However, they are also now sensitive to situations in the ward that require them to change the rules and procedures. They begin to realize that the baby screaming its head off requires immediate attention even if to give that attention is not something set down in the rules. They are torn between doing what they have been taught to do and doing what they sense is more important at that moment. And often they dither, as Dreyfus and Dreyfus put it, "like a mule between two bales of hay" (1986, p. 24).

Proficiency is characterized by quick, effective, and often unconscious decision making. Unlike the merely competent student, who has to think hard about what to do when the situation is at variance with objective rules and prescribed procedures, the proficient student easily grasps what is going on in any situation and acts, as it were, automatically to deal with whatever arises. The proficient nurse simply notices that a patient is psychologically ready for surgery, without consciously weighing the evidence.

With expertise comes the complete fusion of decision making and action. So completely is the expert immersed in the task, and so complete is the expert's mastery of the task and of the situations in which it is necessary to act, that ". . . When things are proceeding normally, experts don't solve problems and don't make decisions; they do what normally works" (Dreyfus & Dreyfus, 1986, pp. 30-31). Clearly, such a state of affairs can arise only after extensive experience on the job. With such experience comes the expert's ability to act quickly and correctly from information without needing to analyze it into components. Expert radiologists can perform accurate diagnoses from X rays by matching the pattern formed by light and dark areas on the film to patterns they have learned over the years to be symptomatic of particular conditions. They act on what they see as a whole and do not attend to each feature separately. Similarly, early research on expertise in chess (Chase & Simon, 1973) revealed that grand masters rely on the recognition of patterns of pieces on the chessboard to guide their play and engage in less in-depth analysis of situations than merely proficient players. Expert nurses sometimes sense that a patient's situation has become critical without there being any objective evidence, and, although they cannot explain why, they are usually correct.

A number of things are immediately clear from this account of the development of expertise. The first is that any student must start by learning explicitly taught facts and rules even if the ultimate goal is to become an expert who apparently functions perfectly well without using them at all. Spiro et al. (1992) claim that learning by allowing students to construct knowledge only works for "advanced knowledge" that assumes the basics have already been mastered.

Second, though, is the observation that students begin to learn situational knowledge and skills as early as the "advanced beginner" stage. This means that the abilities that appear intuitive, even magical, in experts are already present in embryonic form at a relatively early stage in a student's development. The implication is that instruction should foster the development of situational, nonobjective knowledge and skill as early as possible in a student's education. This conclusion is corroborated by the study of situated learning (Brown, Collins & Duguid, 1989) and apprenticeships (Lave & Wenger, 1991) in which education is situated in real-world contexts from the start (see also 7.4.4, 20.3).

Third is the observation that as students become more expert, they are less able to rationalize and articulate the reasons for their understanding of a situation and for their solutions to problems. Instructional designers and knowledge engineers generally are acutely aware of the difficulty of deriving a systematic and objective description of knowledge and skills from an expert as they go about content or task analyses. Experts just do things that work and do not engage in specific or describable problem solving. This also means that assessment of what students learn as they acquire expertise becomes increasingly difficult and eventually impossible by traditional means, such as tests. Tacit knowledge (Polanyi, 1962) is extremely difficult to measure.

Finally, we can observe that what educational technologists spend most of their time doing (developing explicit and measurable instruction) is only relevant to the earliest step in the process of acquiring expertise. There are two implications of this. First, we have, until recently, ignored the potential of technology to help people learn anything except objective facts and rules. And these, in the scheme of things we have just described, though necessary, are intended to be quickly superseded by other kinds of knowledge and skills that allow us to work effectively in the world. We might conclude that instructional design, as traditionally conceived, has concentrated on creating nothing more than training wheels for learning and acting that are to be jettisoned for more important knowledge and skills as quickly as possible. The second implication is that by basing instruction on the knowledge and skills of experts, we have completely ignored the protracted development that has led up to that state. The student must go through a number of qualitatively different stages that come between novicehood and expertise, and can no more jump directly from stage 1 to stage 5 than a child can go from Piaget's preoperational stage of development to formal operations without passing through the intervening developmental steps. If we try to teach the skills of the expert directly to novices, we shall surely fail.

The Dreyfus and Dreyfus account is by no means the only description of how people become experts. Nor is it to any great extent given in terms of the underlying psychological processes that enable it to develop. In the next paragraphs, we look briefly at more specific accounts of how expertise is acquired, focusing on two cognitive processes: automaticity and knowledge organization.

5.3.8.1. Automaticity. From all accounts of expertise, it is clear that experts still do the things they learned to do as novices, but, more often than not, they do them without thinking about them. The automatization of cognitive and motor skills is a step along the way to expertise that occurs in just about every explanation of the process. By enabling experts to function without deliberate attention to what they are doing, automaticity frees up cognitive resources that the expert can then bring to bear on problems that arise from unexpected and hitherto unexperienced events, as well as allowing more attention to be paid to the more mundane though particular characteristics of the situation. This has been reported to be the case for such diverse skills as learning psychomotor skills (Romiszowski, 1993), developing skill as a teacher (Leinhart, 1987), typing (Larochelle, 1982), and the interpretation of X rays (Lesgold et al., 1988).

Automaticity occurs as a result of overlearning (Shiffrin & Schneider, 1977). Under the mastery learning model (Bloom, 1984), a student keeps practicing and receiving feedback, iteratively, until some predetermined criterion has been achieved. At that point, the student is taught and practices the next task. In the case of overlearning, the student continues to practice after attaining mastery, even if the achieved criterion is 100% performance. The more students practice using knowledge and skill beyond just mastery, the more fluid and automatic their skill will become. This is because practice leads to discrete pieces of knowledge and discrete steps in a skill becoming fused into larger pieces, or "chunks." Anderson (1983, 1986) speaks of this process as "knowledge compilation" in which declarative knowledge becomes procedural. Just as a computer compiles statements in a computer language into a code that will actually run, so, Anderson claims, the knowledge that we first acquire as explicit assertions of facts or rules is "compiled" by extended practice into knowledge and skill that will run on its own without our deliberately having to attend to them. Likewise, Landa (1983) describes the process whereby knowledge is transformed first into skill and then into ability through practice. At an early stage of learning something, we constantly have to refer to statements in order to be able to think and act. Fluency only comes when we no longer have to refer explicitly to what we know. Further practice will turn skills into abilities that are characterized by being our natural, intuitive manner of doing things.
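Anderson's compilation analogy can be given a loose programming illustration. The Python sketch below is not Anderson's model; the rule names, the `RULES` table, and the functions `interpret` and `compile_steps` are all hypothetical, chosen only to contrast two ways of carrying out the same steps. In the first, each separately stated rule is looked up as it is needed, and each lookup stands for an act of deliberate attention. In the second, the steps are fused once into a single procedure, or "chunk," that then runs without further reference to the individual rules.

```python
from functools import reduce

# Declarative knowledge: explicit, separately stated steps (hypothetical).
RULES = {
    "double": lambda x: x * 2,
    "add_three": lambda x: x + 3,
    "square": lambda x: x * x,
}


def interpret(step_names, value, attention_log):
    """Novice-style performance: consult each stated rule in turn."""
    for name in step_names:
        attention_log.append(name)  # one entry per deliberate lookup
        value = RULES[name](value)
    return value


def compile_steps(step_names):
    """Fuse the steps into one procedure that needs no further lookups."""
    steps = [RULES[name] for name in step_names]  # rules consulted once, here
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)


plan = ["double", "add_three", "square"]

log = []
print(interpret(plan, 2, log))  # 49, after three logged lookups
print(len(log))                 # 3

chunk = compile_steps(plan)
print(chunk(2))                 # 49, with no lookups at run time
```

Both routes give the same result. What differs is how often the stated rules must be consulted during performance, which is the point of the analogy.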

5.3.8.2. Knowledge Organization. We mentioned briefly above that experts appear to solve problems by recognizing and interpreting the patterns in bodies of information, not by breaking down the information into its constituent parts. If automaticity corresponds to the "cognitive process" side of expertise, then knowledge organization is the equivalent of "mental representation" of knowledge by experts.

There is considerable evidence that experts organize knowledge in qualitatively different ways from novices. It appears that the chunking of information that is characteristic of experts' knowledge leads them to consider patterns of information when they are required to solve problems rather than improving the way they search through what they know to find an answer. For example, chess masters are far less affected by time pressure than lesser players (Calderwood, Klein & Crandall, 1988). Requiring players to increase the number of moves they make in a minute will obviously reduce the amount of time they have to search through what they know about the relative success of potential moves. However, pattern recognition is a much more instantaneous process and will therefore not be as affected by increasing the number of moves per minute. Since masters were less affected than less-expert players by increasing the speed of a game of chess, it seems that they use pattern recognition rather than search as their main strategy.

Charness (1989) reported changes in a chess player's strategies over a period of 9 years. There was little change in the player's skill at searching through potential moves. However, there were noticeable changes in recall of board positions, evaluation of the state of the game, and chunking of information, all of which, Charness claims, are pattern-related rather than search-related skills. Moreover, Saariluoma (1990) reported, from protocol analysis, that strong chess players in fact engaged in less extensive search than intermediate players, concluding that what is searched is more important than how deeply the search is conducted.

It is important to note that some researchers (Patel & Groen, 1991) explicitly discount pattern recognition as the primary means by which some experts solve problems. Also, in a study of expert X-ray diagnosticians, Lesgold et al. (1988) propose that experts' knowledge schemata are developed through "deeper" generalization and discrimination than novices'. It is important to note that in cases where pattern recognition is not taken to be the key to expert performance, studies nonetheless supply evidence of qualitative differences in the nature and use of knowledge between experts and novices.

5.3.9 Summary

In this section we have seen that theories of mental representation have influenced research in educational technology in a number of ways. Schema theory, or something very much like it, is basic to just about all cognitive research on representation. And schema theory is centrally implicated in what we call message design. Establishing predictability and control over what appears in instructional materials and how the depicted information is represented has been high on the research agenda. So it has been of prime importance to discover (a) the nature of mental schemata and (b) how changing messages affects how schemata change or are created.

Mental representation is also the key to information mapping techniques that have proved to help students understand and remember what they read. Here, however, the emphasis is on how the relations among objects and events are encoded and stored in memory and less on how the objects and events are shown. Also, these interconcept relations are often metaphorical. Within the graphical conventions of information maps (hierarchies, radial outlines, and so on), "above," "below," "close to," and "far from" use the metaphor of space to convey semantic, not spatial, structure (see Winn & Solomon, 1991, for research on these "metaphorical" conventions). Nonetheless, the supposition is that representing these relations in some kind of structure in memory improves comprehension and recall.

The construction of schemata as the basis for computer reasoning has not been entirely successful. This is largely because computers are literal minded and cannot draw on general knowledge of the world outside the scripts they are programmed to follow. The results of this, for storywriting at least, are often whimsical and humorous. However, some would claim that the broader implication is that AI is impossible to attain.

Mental model theory has a lot in common with schema theory. However, studies of comprehension and transfer of changes of state and causality in physical systems suggest that well-developed mental models can be "envisioned" and "run" as students seek answers to questions. The ability of multimedia computer systems to show the dynamic interactions of components suggests that this technology has the potential for helping students develop models that represent the world in accurate and accessible ways.

The way in which mental representation changes with the development of expertise has perhaps received less attention from educational technologists than it should. This is partly because instructional prescriptions and instructional design procedures (particularly the techniques of task analysis) have not taken into account the stages a novice must go through on the way to expertise, each of which requires the development of qualitatively different forms of knowledge. This is an area to which educational technologists could profitably devote more of their attention.

Updated October 14, 2003
Copyright © 2001
The Association for Educational Communications and Technology