
1995 Award Addresses

ACT

A Simple Theory of Complex Cognition

John R. Anderson

Carnegie Mellon University

In the Adaptive Character of Thought (ACT-R) theory,

complex cognition arises from an interaction of procedural

and declarative knowledge. Procedural knowledge is rep-

resented in units called production rules, and declarative

knowledge is represented in units called chunks. The in-

dividual units are created by simple encodings of objects

in the environment (chunks) or simple encodings of transformations

in the environment (production rules). A great

many such knowledge units underlie human cognition.

From this large database, the appropriate units are selected

for a particular context by activation processes that

are tuned to the statistical structure of the environment.

According to the ACT-R theory, the power of human cog-

nition depends on the amount of knowledge encoded and

the effective deployment of the encoded knowledge.

the concrete illustration to such an abstract statement. It

certainly seems like the kind of cognitive act that we are

unlikely to see from any other species.

We have studied extensively how people write re-

cursive programs (e.g., Anderson, Farrell, & Sauers, 1984;

Pirolli & Anderson, 1985). To test our understanding of

the process, we have developed computer simulations that

are themselves capable of writing recursive programs in

the same way humans do. Underlying this skill are about

500 knowledge units called production rules. For instance,

one of these production rules for programming recursion,

which might apply in the midst of the problem solving, is

IF the goal is to identify the recursive relationship in a function with a number argument
THEN set as subgoals to
   1. Find the value of the function for some N
   2. Find the value of the function for N - 1
   3. Try to identify the relationship between the two answers.

flects the fact that there is something special about

human cognition--that it achieves a kind of intel-

ligence not even approximated in other species. One can

point to marks of that intelligence in many domains.

Much of my research has been in the area of mathematics

and computer programming, fields in which the capacity

to come up with abstract solutions to problems is one

ability that is frequently cited with almost mystical awe.

A good example of this is the ability to write recursive

programs.

Consider writing a function to calculate the factorial

of a number. The factorial of a number can be described

to someone as the result you get when you multiply all

the positive integers up to that number. For instance,

Thus, in the case above, this might lead to finding that

factorial(5) = 120 (Step 1), factorial(4) = 24 (Step 2), and

that factorial(N) = factorial(N-1) × N (Step 3).
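The three subgoals can be traced concretely. The following Python sketch computes the two values directly from the multiply-all-the-positive-integers description of factorial and then checks the relationship between them. The name `factorial_by_product` is illustrative only and is not taken from the simulations described in this article.

```python
from math import prod


def factorial_by_product(n: int) -> int:
    """Multiply all positive integers up to n (the empty product is 1 when n = 0)."""
    return prod(range(1, n + 1))


value_n = factorial_by_product(5)          # Step 1: value of the function for some N
value_n_minus_1 = factorial_by_product(4)  # Step 2: value of the function for N - 1

# Step 3: the relationship between the two answers, factorial(N) = factorial(N-1) × N
relationship_holds = value_n == value_n_minus_1 * 5
```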

We (e.g., Anderson, Boyle, Corbett, & Lewis, 1990;

Anderson, Corbett, Koedinger, & Pelletier, 1995; Ander-

son & Reiser, 1985) have created computer-based in-

structional systems, called intelligent tutors, for teaching

cognitive skills based on this kind of production-rule

analysis. By basing instruction on such rules, we have

been able to increase students' rate of learning by a factor

of 3. Moreover, within our tutors we have been able to

factorial(5) = 5 × 4 × 3 × 2 × 1 = 120

In addition (it might appear by arbitrary convention), the

factorial of zero is defined to be 1. In writing a recursive

program to calculate the factorial for any number N, one

defines factorial in terms of itself. Below is what such a

program might look like:

Editor's note. Articles based on APA award addresses are given special consideration in the American Psychologist's editorial selection process. A version of this article was originally presented as part of an Award for Distinguished Scientific Contributions address at the 103rd Annual Convention of the American Psychological Association, New York, NY, August 1995.

Author's note. This research was supported by Grant ONR N0014-90-J-1489 from the Office of Naval Research and Grant SBR 94-21332 from the National Science Foundation.

I would like to thank Marsha Lovett and Lynne Reder for their comments on the article.

Correspondence concerning this article should be addressed to John R. Anderson, Department of Psychology, Carnegie Mellon University, Pittsburgh, PA 15213. For more information on the ACT theory, consult the ACT-R home page on the World Wide Web: http://sands.psy.cmu.edu.

factorial(N) = 1                     if N = 0

             = factorial(N-1) × N    if N > 0.
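For readers who want to run the two-case specification, here is a minimal Python sketch of it. The function name and the rejection of negative arguments are our own choices for illustration; the article states the specification only for N = 0 and N > 0.

```python
def factorial(n: int) -> int:
    """Recursive factorial mirroring the two-case specification."""
    if n < 0:
        # Assumption: the specification does not cover negative N, so we refuse it.
        raise ValueError("factorial is specified here only for N >= 0")
    if n == 0:
        return 1                    # first case: factorial(0) = 1
    return factorial(n - 1) * n     # second case: factorial(N-1) × N
```

Calling `factorial(5)` unwinds through `factorial(4)`, `factorial(3)`, and so on down to the base case, then multiplies the results back up.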

The first part of the specification, factorial(0) = 1, is just

stating part of the definition of factorial. But the second

recursive specification seems mysterious to many and ap-

pears all the more mysterious that anyone can go from

April 1996 • American Psychologist

Vol. 51, No. 4, 355-365


The designation of our species as homo sapiens re-

Figure 1

Mean Actual Error Rate and Expected Error Rate Across

Successive Rule Applications


proposed a distinction between declarative knowledge,

which HAM dealt with, and procedural knowledge, which

HAM did not deal with. Borrowing ideas from Newell

(1972, 1973), it was proposed that procedural knowledge

was implemented by production rules. A production-sys-

tem model called ACTE was proposed to embody this

joint procedural-declarative theory. After 7 years of

working with variants of that system, we were able to

develop a theory called ACT* (Anderson, 1983) that em-

bodied a set of neurally plausible assumptions about how

such a system might be implemented and also psycho-

logically plausible assumptions about how production

rules might be acquired. That system remained with us

for 10 years, but a new system called ACT-R was then

put forward by Anderson (1993b). Reflecting technical

developments in the past decade, this system now serves

as a computer simulation tool for a small research com-

munity. The key insight of this version of the system is

that the acquisition and deployment processes are tuned

to give adaptive performance given the statistical structure

of the environment. It is the ACT-R system that we will

describe.

Representational Assumptions

Declarative and procedural knowledge are intimately

connected in the ACT-R theory. Production rules embody

procedural knowledge, and their conditions and actions

are defined in terms of declarative structures. A specific

production rule can only apply when that rule's condi-

tions are satisfied by the knowledge currently available

in declarative memory. The actions that a production

rule can take include creating new declarative structures.

Declarative knowledge in ACT-R is represented in

terms of chunks (Miller, 1956; Servan-Schreiber, 1991)

that are schema-like structures, consisting of an isa pointer

specifying their category and some number of additional

pointers encoding their contents. Figure 2 is a graphical

display of a chunk encoding the addition fact that 3 + 4

= 7. This chunk can also be represented textually:

[Figure 1 plot: actual error rate and expected error rate, by opportunity to apply rule (required exercises only).]

Note. From "Student Modeling in the ACT Programming Tutor," by A. T. Corbett, J. R. Anderson, and A. T. O'Brien, 1995, in P. Nichols, S. Chipman, and B. Brennan, Cognitively Diagnostic Assessment, Hillsdale, NJ: Erlbaum. Copyright 1995 by Erlbaum. Reprinted by permission.

track the learning of such rules and have found that they

improve gradually with practice, as illustrated in Figure

1. Our evidence indicates that underlying the complex,

mystical skill of recursive programming is about 500 rules

like the one above, and that each rule follows a simple

learning curve like Figure 1.

This illustrates the major claim of this article:

All that there is to intelligence is the simple accrual and

tuning of many small units of knowledge that in total

produce complex cognition. The whole is no more than

the sum of its parts, but it has a lot of parts.

The credibility of this claim has to turn on whether

we can establish in detail how the claim is realized in

specific instances of complex cognition. The goal of the

ACT theory, which is the topic of this article, has been

to establish the details ofthis claim. It has been concerned

with three principal issues: How are these units of knowl-

edge represented, how are they acquired, and how are

they deployed in cognition?

The ACT theory has origins in the human associative

memory (HAM) theory of human memory (Anderson &

Bower, 1973), which attempted to develop a theory of

how memories were represented and how those repre-

sentations mediated behavior that was observed in mem-

ory experiments. It became apparent that this theory only

dealt with some aspects of knowledge; Anderson (1976)


Figure 2

Network Representation of an ACT-R Chunk

[Figure 2 diagram: the chunk fact3+4 linked by isa to Addition-fact, by addend1 to Three, by addend2 to Four, and by sum to Seven.]


fact3+4
   isa       addition-fact
   addend1   three
   addend2   four
   sum       seven
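To make the chunk format concrete, the sketch below encodes the same addition fact as a small Python data structure. The `Chunk` class and its field names are hypothetical illustrations of "an isa pointer plus additional pointers"; they are not the representation used by the ACT-R software.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Chunk:
    """A declarative unit: a name, an isa category, and named slots for its contents."""
    name: str
    isa: str
    slots: Dict[str, str] = field(default_factory=dict)


# The addition fact 3 + 4 = 7, following the textual chunk notation.
fact_3_plus_4 = Chunk(
    name="fact3+4",
    isa="addition-fact",
    slots={"addend1": "three", "addend2": "four", "sum": "seven"},
)
```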

time for previous symbols) reflect the time for the extra

production. The next symbol to be encoded (the 3) takes

approximately 550 milliseconds to process (see Part e of

Figure 3), reflecting again two productions but this time

also retrieval of the fact 4 + 3 = 7. The mental represen-

tation of the equation at this point is collapsed into x +

7. The = sign is next processed in Part f of Figure 3. It

takes a particularly short time. We think this reflects the

strategy of some participants of just skipping over that

symbol. The final symbol comes in (see Part g of Figure

3) and leads to a long latency reflecting seven productions

that need to apply to transform the equation and the

execution of the motor response of typing the number

key.

Procedural knowledge, such as mathematical prob-

lem-solving skill, is represented by productions. Produc-

tion rules in ACT-R respond to the existence of specific

goals and often involve the creation of subgoals. For in-

stance, suppose a child was at the point illustrated below

in the solution of a multicolumn addition problem:

531

+248

The example in Figure 3 is supposed to reflect the

relative detail in which we have to analyze human cog-

nition in ACT-R to come up with faithful models. The

simulation is capable of solving the same problems as the

participants. It can actually interact with the same ex-

perimental software as the participants, execute the same

scanning actions, read the same computer screen, and

execute the same motor responses with very similar tim-

ing (Anderson, Matessa, & Douglass, 1995). When I say,

"The whole is no more than the sum of its parts but it

has a lot of parts," these are the parts I have in mind.

These parts are the production rules and the chunk

structures that represent long-term knowledge and the

evolving understanding of the problem.

Knowledge units like these are capable of giving rel-

atively accurate simulations of human behavior in tasks

such as these. However, the very success of such simu-

lations only makes salient the two other questions that

the ACT-R theory must address, which are how did the

prior knowledge (productions and long-term chunks)

come to exist in the first place and how is it, if the mind

is composed of a great many of these knowledge units,

that the appropriate ones usually come to mind in a par-

ticular problem-solving context? These are the questions

of knowledge acquisition and knowledge deployment.

Focused on the tens column, the following production

rule might apply from the simulation of multicolumn

addition (Anderson, 1993b):

IF the goal is to add n1 and n2 in a column
   and n1 + n2 = n3
THEN set as a subgoal to write n3 in that column

This production rule specifies in its condition the goal of

working on the tens column and involves a retrieval of a

declarative chunk like the one illustrated in Figure 2. In

its action, it creates a subgoal that might involve things

like processing a carry. The subgoal structure assumed

in the ACT-R production system imposes this strong ab-

stract, hierarchical structure on behavior. As argued else-

where (Anderson, 1993a), this abstract, hierarchical

structure is an important part of what sets human cog-

nition apart from that of other species.
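The interplay described above (a goal condition, a retrieval from declarative memory, and a subgoal as the action) can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the dictionary-based goals, the `ADDITION_FACTS` table, and the function name are hypothetical and do not reproduce the ACT-R production system.

```python
from typing import Any, Dict, Optional, Tuple

# Declarative memory, reduced to single-digit addition facts: (n1, n2) -> n3.
ADDITION_FACTS: Dict[Tuple[int, int], int] = {
    (a, b): a + b for a in range(10) for b in range(10)
}


def add_column_production(goal: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """IF the goal is to add n1 and n2 in a column and n1 + n2 = n3,
    THEN return a subgoal to write n3 in that column; otherwise return None."""
    if goal.get("type") != "add-column":
        return None  # the goal condition is not satisfied
    n3 = ADDITION_FACTS.get((goal["n1"], goal["n2"]))
    if n3 is None:
        return None  # no matching addition fact in declarative memory
    return {"type": "write", "column": goal["column"], "value": n3}


# Tens column of 531 + 248: the digits to add are 3 and 4.
subgoal = add_column_production(
    {"type": "add-column", "column": "tens", "n1": 3, "n2": 4}
)
```

The rule fires only when both parts of its condition hold, which is the sense in which a production's conditions are defined in terms of declarative structures.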

Much of the recent effort in the ACT-R theory has

gone into detailed analyses of specific problem-solving

tasks. One of these involves equation solving by college

students (e.g., Anderson, Reder, & Lebiere, in press). We

have collected data on how they scan equations, including

the amount of time spent on each symbol in the equation.

Figure 3 presents a detailed simulation of the solution of

equations like X + 4 + 3 = 13, plus the average scanning

times of participants solving problems of this form (mixed

in with many other types of equations in the same ex-

periment). As can be seen in Parts a-c of that figure, the

first three symbols are processed to create a chunk struc-

ture of the form x + 4. In the model, there is one pro-

duction responsible for processing each type of symbol.

The actual times for the first three symbols are given in

Parts a-c of Figure 3. They are on the order of 400 mil-

liseconds, which we take as representing approximately

300 milliseconds to implement the scanning and encoding

of the symbol and 100 milliseconds for the production

to create the augmentation to the representation. 2
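The timing figures quoted in this section amount to simple arithmetic, which the sketch below spells out. The 300 ms encoding time and 100 ms per production come from the text; the 50 ms retrieval time is our own inference from the roughly 550 ms reported for the symbol that requires two productions plus a fact retrieval, and should be read as an assumption rather than a reported parameter.

```python
ENCODING_MS = 300     # approximate time to scan and encode one symbol (from the text)
PRODUCTION_MS = 100   # approximate time per production firing (from the text)
RETRIEVAL_MS = 50     # assumption: inferred as 550 - 300 - 2 * 100, not stated in the text


def symbol_latency_ms(n_productions: int, retrieval_ms: int = 0) -> int:
    """Approximate processing time for one symbol under the additive account."""
    return ENCODING_MS + PRODUCTION_MS * n_productions + retrieval_ms


one_production = symbol_latency_ms(1)                   # first three symbols
two_productions = symbol_latency_ms(2)                  # the second plus sign
with_retrieval = symbol_latency_ms(2, RETRIEVAL_MS)     # the 3, which also retrieves 4 + 3 = 7
```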

The next symbol to be encoded, the +, takes about

500 milliseconds to process in Part d. As can be seen, it

involves two productions, one to create a higher level

chunk structure and another to encode the plus into that

structure. The extra 100 milliseconds (over the encoding

Knowledge Acquisition

A theory of knowledge acquisition must address both the

issue of the origins of the chunks and of the origins of

production rules. Let us first consider the origin of

chunks. As the production rules in Figure 3 illustrate,

chunks can be created by the actions of production rules.

However, as we will see shortly, production rules originate

from the encodings of chunks. To avoid circularity in the

theory we also need an independent source for the origin

of the chunks. That independent source involves encoding

from the environment. Thus, in the terms of Anderson

and Bower (1973), ACT-R is fundamentally a sensation-

1. This involves a scheme wherein participants must point at the part of the equation that they want to read next.

2. Although our data strongly constrain the processing, there remain a number of arbitrary decisions about how to represent the equation that could have been made differently.


seven


alist theory in that its knowledge structures result from

environmental encodings.

We have only developed our ideas about environ-

mental encodings of knowledge with respect to the visual

modality (Anderson, Matessa, & Douglass, 1995). In this

area, it is assumed that the perceptual system has parsed

the visual array into objects and has associated a set of

features with each object. ACT-R can move its attention

over the visual array and recognize objects. We have

embedded within ACT-R a theory that might be seen as

a synthesis of the spotlight metaphor of Posner (1980),

the feature-synthesis model of Treisman (Treisman &

Sato, 1990), and the attentional model of Wolfe (1994).

Features within the spotlight can be synthesized into rec-

ognized objects. Once synthesized, the objects are then

available as chunks in ACT's working memory for further

processing. In ACT-R the calls for shifts of attention are

controlled by explicit firings of production rules.

The outputs of the visual module are working mem-

ory elements called chunks in ACT-R. The following is

a potential chunk encoding of the letter H:

object

and encode that the second structure is dependent on the

first. What the learner must do is find some mapping

between the two structures. The default assumption is

that identical structures directly map. In this case, it is

assumed the 3x in the first equation maps onto the 3x

in the second equation. This leaves the issue of how to

relate the 7 and 13 to the 6. ACT-R looks for some chunk

structure to make this mapping. In this case, it will find

a chunk encoding that 7 + 6 = 13. Completing the map-

ping ACT-R will form a production rule to map one

structure onto the other:

IF the goal is to solve an equation of the form arg + n1 = n3
   and n1 + n2 = n3
THEN make the goal to solve an equation of the form arg = n2
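A rule of this form can be exercised on the worked example. The Python sketch below applies it to 3x + 7 = 13 by searching a small table of addition facts for one that bridges 7 and 13. The `Equation` and `AdditionFact` types, the `FACTS` table, and the function name are hypothetical; the sketch shows the rule being applied, not the analogy mechanism by which ACT-R forms it.

```python
from typing import List, NamedTuple, Optional


class AdditionFact(NamedTuple):
    addend1: int
    addend2: int
    total: int


class Equation(NamedTuple):
    arg: str                # term carried over unchanged, e.g. "3x"
    addend: Optional[int]   # n1 in "arg + n1 = n3"; None once it has been removed
    rhs: int                # right-hand side


# Declarative memory, reduced to single-digit addition facts.
FACTS: List[AdditionFact] = [
    AdditionFact(a, b, a + b) for a in range(10) for b in range(10)
]


def isolate_arg(eq: Equation, facts: List[AdditionFact]) -> Optional[Equation]:
    """IF the goal is to solve arg + n1 = n3 and n1 + n2 = n3,
    THEN make the goal to solve arg = n2; return None if no fact matches."""
    if eq.addend is None:
        return None  # equation is not of the form arg + n1 = n3
    for fact in facts:
        if fact.addend1 == eq.addend and fact.total == eq.rhs:
            return Equation(arg=eq.arg, addend=None, rhs=fact.addend2)
    return None


result = isolate_arg(Equation("3x", 7, 13), FACTS)   # uses the fact 7 + 6 = 13
```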

This approach takes a very strong view on instruction.

This view is that one fundamentally learns to solve problems

by mimicking examples of solutions. This is certainly

consistent with the substantial literature showing that ex-

amples are as good as or better than abstract instruction

that tells students what to do (e.g., Cheng, Holyoak, Nis-

bett, & Oliver, 1986; Fong, Krantz, & Nisbett, 1986; Reed

& Actor, 1991). Historically, learning by imitation was

given bad press as cognitive psychology broke away from

behaviorism (e.g., Fodor, Bever, & Garrett, 1974). How-

ever, these criticisms assumed a very impoverished com-

putational sense of what is meant by imitation.

It certainly is the case that abstract instruction does

have some effect on learning. There are two major func-

tions for abstract instruction in the ACT-R theory. On

the one hand, it can provide or make salient the right

chunks (such as 7 + 6 = 13 in the example above) that

are needed to bridge the transformations. It is basically

this that offers the sophistication to the kind of imitation

practiced in ACT-R. Second, instruction can take the form

of specifying a sequence of subgoals to solve a task (as

one finds in instruction manuals). In this case, assuming

the person already knows how to achieve such subgoals,

instruction offers the learner a way to create an example

of such a problem solution from which they can then

learn production rules like the one above.

The most striking thing about the ACT-R theory of

knowledge acquisition is how simple it is. One encodes

chunks from the environment and makes modest infer-

ences about the rules underlying the transformations in-

volved in examples of problem solving. There are no great

leaps of insight in which large bodies of knowledge are

reorganized. The theory implies that acquiring compe-

tence is very much a labor-intensive business in which

one must acquire one-by-one all the knowledge compo-

nents. This flies very much in the face of current edu-

cational fashion but, as Anderson, Reder, and Simon

(1995) have argued and documented, this educational

fashion is having a very deleterious effect on education.

We need to recognize and respect the effort that goes into

acquiring competence (Ericsson, Krampe, & Tesch-Römer,

1993). However, it would be misrepresenting the

   isa              H
   left-vertical    bar1
   right-vertical   bar2
   horizontal       bar3

We assume that before the recognition of the object, these

features (the bars) are available as parts of an object but

that the object itself is not recognized. In general, we

assume that the system can respond to the appearance

of a feature anywhere in the visual field. However, the

system cannot respond to the conjunction of features that

define a pattern until it has moved its attention to that

part of the visual field and recognized the pattern of fea-

tures. Thus, there is a correspondence between this model

and the feature synthesis model of Treisman (Treisman

& Sato, 1990).

A basic assumption is that the process of recognizing

a visual pattern from a set of features is identical to the

process of categorizing an object given a set of features.

We have adapted the Anderson and Matessa (1992) ra-

tional analysis of categorization to provide a mechanism

for assigning a category (such as H) to a particular con-

figuration of features. This is the mechanism within

ACT-R for translating stimulus features from the envi-

ronment into chunks like the ones above that can be pro-

cessed by the higher level production system.

With the environmental origins of chunks specified,

we can now turn to the issue of the origins of production

rules. Production rules specify the transformations of

chunks, and we assume that they are encoded from

examples of such transformations in the environment.

Thus, a student might encounter the following example

in instruction:

3x+7= 13

3x=6
