Basic learning theories [Abstract No. 3880]

Introduction

The basic tenet of learning theory is that almost all behavior is acquired through learning. For example, any psychopathology is understood as the acquisition of maladaptive behavior or as a failure to acquire adaptive behavior. Instead of psychotherapy, proponents of learning theories speak of behavior modification and behavior therapy. The aim is to modify or change specific actions rather than to resolve the internal conflicts underlying those actions or to reorganize the personality. Since most problem behaviors were once learned, they can be unlearned or otherwise changed using special procedures based on the laws of learning.

To this day, there is no unified theory of learning, although many general laws of learning are widely accepted and have been empirically confirmed by various researchers. Within the framework of learning theory, there are three main directions: Pavlov's teaching, classical behaviorism, and neobehaviorism.

The term “learning theory” is applied primarily to behavioral psychology. In contrast to the pedagogical concepts of training, education, and upbringing, learning theory covers a wide range of processes by which individual experience is formed, such as habituation, imprinting, the formation of simple conditioned reflexes, complex motor and speech skills, sensory discrimination responses, and so on.

Partial reinforcement.

Instrumental learning using rewards (for example, training a rat in a Skinner box to press a lever for food, or praising a child when he says “thank you” and “please”) involves several types of relationships between behavior and reinforcement. The most common type of relationship is continuous reinforcement, in which a reward is given for each correct response. Another option is partial reinforcement, in which only some correct responses are reinforced: say, every third time the desired behavior occurs, or every tenth time, or the first time it occurs in every hour or every day. The effects of partial reinforcement are important and of great interest. With partial reinforcement, it takes longer to learn the desired behavior, but the results are much more durable. The persistence of the effect is especially noticeable when reinforcement is stopped; this procedure is called “extinction.” Behavior learned with partial reinforcement persists for a long time, while behavior acquired with continuous reinforcement quickly ceases.
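
The two kinds of schedule described above can be sketched in a few lines of Python. This is only an illustration of which responses earn a reward; it does not model how quickly behavior is learned or extinguished. The names `continuous`, `fixed_ratio`, and `rewarded_responses` are hypothetical and do not come from the source.

```python
from typing import Callable, List


def continuous(response_number: int) -> bool:
    """Continuous reinforcement: every correct response is rewarded."""
    return True


def fixed_ratio(n: int) -> Callable[[int], bool]:
    """Partial reinforcement: reward only every n-th correct response."""
    def schedule(response_number: int) -> bool:
        return response_number % n == 0
    return schedule


def rewarded_responses(schedule: Callable[[int], bool], total: int) -> List[int]:
    """Return the 1-based numbers of the correct responses that earn a reward."""
    return [k for k in range(1, total + 1) if schedule(k)]


# Nine correct responses under each schedule.
every_response = rewarded_responses(continuous, 9)
every_third = rewarded_responses(fixed_ratio(3), 9)
print(every_response)
print(every_third)
```

A time-based schedule (the first response in every hour or day) would need timestamps rather than response counts and is left out of this sketch.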

The concept of "learning"

Before moving on to learning theory, let's consider the concept of “learning.”

There are several concepts related to a person’s acquisition of life experience in the form of knowledge, skills, and abilities: learning, studying, and teaching.

The most general concept is learning. Intuitively, each of us has an idea of what learning is. We speak of learning when a person begins to know or be able to do something that he did not know or could not do before. These new knowledge, skills, and abilities can result from activities aimed at acquiring them, or they can arise as a side effect of behavior that pursues goals unrelated to them.

Learning is the acquisition of individual experience: a broad class of mental processes that ensure the formation of new, adaptive reactions.

In foreign psychology, the concept of learning is often used as an equivalent of “studying.” In Russian psychology, at least during the Soviet period of its development, it was customary to use it in relation to animals. Recently, however, a number of scholars (I. A. Zimnyaya, V. N. Druzhinin, Yu. M. Orlov, and others) have used this term in relation to humans.

Types of learning in humans:

Imprinting;
Operant conditioning;
Conditioned-reflex learning;
Vicarious learning;
Verbal learning.

Encoding information in memory.

Many types of learning involve three essential elements: sound, meaning, and sight. Suppose, for example, that an association must be formed between the words “dog” and “table.” Learning by encoding sounds requires repeating those words over and over again, listening to how they sound together, and remembering how they feel when they are repeated. This acoustic method, called rote memorization, is sometimes necessary, but it is significantly inferior to encoding by meaning. Meaningful learning of the association between the words “dog” and “table” involves thinking about a dog, thinking about a table, and making some kind of connection between them, such as the statement that a dog never works at a table. Semantic encoding is the most important factor in successful school education. Long hours of hard work using rote memorization do not produce the same results as far fewer sessions that focus on the meaning of the lesson. Sometimes the third method, forming visual images, turns out to be the most effective. In the case of “dog” and “table,” the procedure would be to create a realistic mental image in which both the dog and the table play important roles, such as an image of an antique desk on which stands a paperweight with a handle in the shape of a hunting dog. The more vivid the image, the easier it is to remember the connection between the two objects later. Of course, in some cases, especially with abstract concepts like “misfortune” and “energy,” there is no simple way to form a visual representation, and one has to rely on semantic encoding alone. Thus, effective learning is achieved not only through the time and effort spent on practice; the nature of the practice itself is also of great importance.

Learning theories: patterns and provisions

Learning theories in psychology are based on two main principles:

All behavior is acquired through learning.

In order to maintain scientific rigor, the principle of data objectivity must be observed when testing hypotheses. External reasons (food reward) are chosen as variables that can be manipulated, in contrast to “internal” variables in the psychodynamic direction (instincts, defense mechanisms, self-concept), which cannot be manipulated.

The laws of learning include:

Law of readiness: the stronger the need, the more successful the learning.

Law of Effect: Behavior that leads to a beneficial outcome reduces the need and will therefore be repeated.

Law of Exercise: All other things being equal, repetition of a particular action makes the behavior easier to perform and leads to faster performance and a reduced likelihood of errors.

Law of Recency: the material presented at the end of a series is learned best. This law conflicts with the primacy effect, the tendency to learn better the material presented at the beginning of the learning process. The contradiction is resolved by formulating the law of the “edge effect”: material at both ends of a series is learned better than material in the middle. The U-shaped dependence of the degree of learning of material on its position in the learning process reflects this effect and is called the “positional curve.”

Law of Correspondence: There is a proportional relationship between the probability of a response and the probability of reinforcement.
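
The Law of Correspondence lends itself to a small worked example. The source states only that the relationship is proportional; the sketch below assumes the simplest reading, in the style of what the operant literature calls the matching law, where the share of responses given to each alternative equals its share of the total reinforcement. The function name and the lever labels are hypothetical.

```python
from typing import Dict


def predicted_response_shares(reinforcement_rates: Dict[str, float]) -> Dict[str, float]:
    """Split responding across alternatives in proportion to reinforcement.

    Assumption: a strictly proportional rule with no bias or sensitivity terms.
    """
    total = sum(reinforcement_rates.values())
    if total <= 0:
        raise ValueError("at least one alternative must be reinforced")
    return {name: rate / total for name, rate in reinforcement_rates.items()}


# Illustrative rates only: one lever pays off three times as often as the other.
shares = predicted_response_shares({"left_lever": 30.0, "right_lever": 10.0})
print(shares)
```

Under this assumption, an alternative that delivers three quarters of the reinforcement is predicted to attract three quarters of the responses.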

There are three main learning theories:

I. P. Pavlov's theory of classical conditioning;
B. F. Skinner's theory of operant conditioning;
A. Bandura's theory of social learning.

I. P. Pavlov's theory of classical conditioning

Classical conditioning is closely associated with the name of I. P. Pavlov, who made a fundamental contribution to the theory of classical conditioned reflexes, which became the basis for the development of behavioral psychotherapy.

The basic scheme of a conditioned reflex is S-R, where S is a stimulus, R is a reaction (behavior). In the classical Pavlovian scheme, reactions occur only in response to the influence of some stimulus, an unconditioned or conditioned stimulus. Pavlov was the first to answer the question of how a neutral stimulus can cause the same reaction as an unconditioned reflex, which occurs automatically, on an innate basis, and does not depend on the individual’s previous experience. Or, in other words, how a neutral stimulus becomes a conditioned stimulus. The formation of a conditioned reflex occurs in the presence of:

contiguity, that is, coincidence in time of the indifferent and unconditioned stimuli, with the indifferent stimulus slightly preceding the unconditioned one;
repetition, that is, multiple pairings of the indifferent and unconditioned stimuli.

The experimenter presents a conditioned stimulus (bell) and reinforces it with an unconditioned stimulus (food); that is, the unconditioned stimulus is used to evoke an unconditioned response (salivation) in the presence of an initially neutral stimulus (bell). After a number of repetitions, the response (salivation) becomes associated with this new stimulus (bell); in other words, a connection is established such that the previously neutral stimulus (bell) evokes a conditioned response (salivation). The result or product of learning according to this scheme is respondent behavior, that is, behavior elicited by a certain stimulus (S). Reinforcement in this case is tied to the stimulus (S), so this type of learning, in which a connection is formed between stimuli, is designated type S learning.

A well-known example of classical conditioning is J. Watson's experiment. In 1918, J. Watson began laboratory experiments with children that demonstrated that learning experiences in childhood had long-lasting effects. In one experiment, he first showed that a nine-month-old boy, Albert, was not afraid of a white rat, a rabbit, or other white objects, and then struck a steel bar next to Albert's head every time a white rat appeared. After several blows, Albert began to shudder, cry, and try to crawl away at the sight of the rat. He reacted in a similar way when J. Watson showed him other white objects. Here J. Watson used classical conditioning: by pairing a loud sound (unconditioned stimulus) with the presentation of a rat (conditioned stimulus), he produced a new reaction, a conditioned fear response, to a previously neutral animal.

This experiment also demonstrates a phenomenon discovered by I. P. Pavlov and called “stimulus generalization.” Its essence is that if a conditioned reaction has developed, then it will also be caused by stimuli similar to the conditioned one: a child can be taught to fear what previously seemed harmless, and this fear will spread to similar objects. Little Albert began to experience fear of all fur toys. From his experiments, Watson concluded that children can learn almost anything, including phobic symptoms (Alexandrov A. A., 1997).

We can name two more phenomena associated with the name of Pavlov and used in behavioral psychotherapy. The first is stimulus discrimination, or differentiation, through which people learn to distinguish between similar stimuli. A baby's cry becomes a conditioned stimulus for the mother: she wakes from deep sleep at the slightest stirring of her own child but can sleep soundly when someone else's child cries. The second is extinction, the gradual disappearance of a conditioned response once the connection between the conditioned and unconditioned stimuli is eliminated. Extinction occurs because the conditioned stimulus continues to evoke the conditioned response only if the unconditioned stimulus appears at least periodically. If the conditioned stimulus is not at least sometimes reinforced by the unconditioned one, the strength of the conditioned response begins to decrease.
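
The formation of a conditioned reflex through repeated pairing, and its extinction when reinforcement stops, can be illustrated with a toy simulation. This is not Pavlov's own formalism: it assumes a simple linear update in the spirit of later associative models such as Rescorla-Wagner, and the learning rate of 0.3 and the function names are arbitrary, hypothetical choices.

```python
from typing import Iterable, List


def update_strength(strength: float, us_present: bool, rate: float = 0.3) -> float:
    """Move associative strength toward 1 if the unconditioned stimulus (US)
    follows the conditioned stimulus (CS), and toward 0 if the CS occurs alone."""
    target = 1.0 if us_present else 0.0
    return strength + rate * (target - strength)


def run_trials(strength: float, us_sequence: Iterable[bool]) -> List[float]:
    """Apply one update per trial and record the strength after each trial."""
    history = []
    for us_present in us_sequence:
        strength = update_strength(strength, us_present)
        history.append(strength)
    return history


# Ten trials of bell followed by food, then ten trials of the bell alone.
acquisition = run_trials(0.0, [True] * 10)
extinction = run_trials(acquisition[-1], [False] * 10)
print([round(s, 3) for s in acquisition])
print([round(s, 3) for s in extinction])
```

Passing a mixed sequence such as `[True, False, False]` repeated several times shows the case where the unconditioned stimulus appears only periodically: under this rule the strength then fluctuates instead of decaying toward zero.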

Organization of practice.

When mastering a skill, as in many other situations, it is helpful to take frequent rest breaks rather than practice continuously. The same number of lessons leads to more effective learning if they are distributed over time rather than concentrated in a single block, as is done in so-called massed practice. Classes held partly in the morning and partly in the evening provide greater variety in learning conditions than classes held only in the morning or only in the evening. However, part of the learning process is for the learner to recall stored information, and such recall is facilitated by recreating the situation in which something was learned. For example, test results are better if the test is conducted not in a special examination room but in the same room where the training took place.

B. F. Skinner's theory of operant conditioning

The theory of operant conditioning is associated with the names of Edward Lee Thorndike (E. L. Thorndike) and Burrhus Frederic Skinner (B. F. Skinner). In contrast to the principle of classical conditioning (S->R), they developed the principle of operant conditioning (R->S), according to which behavior is controlled by its results and consequences. Based on this formula, the main way to influence behavior is to influence its results.

As mentioned earlier, respondent behavior is B. F. Skinner's version of the Pavlovian concept of behavior. He called it type S conditioning to emphasize the importance of the stimulus that appears before a response and elicits it. However, Skinner believed that, in general, animal and human behavior cannot be explained in terms of classical conditioning. He emphasized behavior that is not associated with any known stimuli and argued that behavior is mainly influenced by the stimulus events that come after it, namely its consequences. Because this type of behavior involves the organism actively operating on its environment to change events in some way, Skinner defined it as operant behavior. He also called it type R conditioning to emphasize the impact of a response on future behavior.

So, the key structural unit of the behaviorist approach in general, and of the Skinnerian approach in particular, is the response. Responses can range from simple reflexes (e.g., salivating at food, flinching at a loud sound) to complex patterns of behavior (e.g., solving a math problem, covert forms of aggression).

A response is an external, observable part of behavior that can be linked to environmental events. The essence of the learning process is the establishment of connections (associations) of reactions with events in the external environment.

In his approach to learning, Skinner distinguished between responses that are elicited by clearly defined stimuli (such as the blink reflex in response to a puff of air) and responses that cannot be associated with any single stimulus. Responses of the second type are generated by the organism itself and are called operants. Skinner believed that environmental stimuli do not force an organism to behave in a certain way and do not induce it to act. The root cause of behavior is found in the organism itself.

Operant behavior (produced by operant conditioning) is determined by the events that follow the response. That is, behavior is followed by a consequence, and the nature of this consequence changes the tendency of the organism to repeat the behavior in the future. For example, rollerblading, playing the piano, throwing darts, and writing one's name are operant responses, or operants, controlled by the outcomes that follow the corresponding behavior. These are voluntary, acquired responses for which there is no recognizable stimulus. Skinner held that it is pointless to speculate about the origin of operant behavior, since we do not know the stimulus or internal cause responsible for its occurrence. It happens spontaneously.

If the consequences are favorable to the organism, the likelihood that the operant will be repeated in the future increases. When this occurs, the consequences are said to be reinforcing, and the operant response resulting from the reinforcement is said to be conditioned (in the sense that it is highly likely to occur). The strength of a positive reinforcing stimulus is thus determined by its effect on the subsequent frequency of the responses that immediately preceded it.

Conversely, if the consequences of a response are unfavorable or not reinforcing, the probability of the operant decreases. Skinner believed that operant behavior is therefore also controlled by negative consequences. By definition, negative or aversive consequences weaken the behavior that produces them and strengthen the behavior that eliminates them.

Operant learning can be thought of as a learning process based on the stimulus-response-reinforcement relationship, in which behavior is formed and maintained due to certain consequences.
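
The R->S principle, in which the consequence that follows a response changes the tendency to repeat it, can be sketched as follows. This is a toy illustration under stated assumptions, not Skinner's own model: the tendency is represented as a probability, the step size of 0.2 is arbitrary, and the names `apply_consequence` and `press_lever` are hypothetical.

```python
from typing import Dict


def apply_consequence(tendencies: Dict[str, float], response: str,
                      consequence: str, step: float = 0.2) -> None:
    """Adjust the tendency to emit `response` according to what followed it."""
    p = tendencies[response]
    if consequence == "reinforcing":
        # A favorable outcome makes the operant more likely.
        tendencies[response] = p + step * (1.0 - p)
    elif consequence == "aversive":
        # An unfavorable outcome makes the operant less likely.
        tendencies[response] = p - step * p
    else:
        raise ValueError(f"unknown consequence: {consequence!r}")


tendencies = {"press_lever": 0.1}
start = tendencies["press_lever"]

# Five lever presses followed by food.
for _ in range(5):
    apply_consequence(tendencies, "press_lever", "reinforcing")
after_reward = tendencies["press_lever"]

# Five lever presses followed by an aversive event.
for _ in range(5):
    apply_consequence(tendencies, "press_lever", "aversive")
after_punishment = tendencies["press_lever"]

print(start, after_reward, after_punishment)
```

Note that the update depends only on the event that comes after the response; no eliciting stimulus appears anywhere in the function, which is the point of the operant scheme.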

Secondary reinforcement.

During associative learning, some signals that initially had no value and did not indicate danger become associated in the mind with events that have value or are associated with danger. When this happens, signals or events that were previously neutral begin to act as rewards or punishments; this process is called secondary reinforcement. A classic example of secondary reinforcement is money. Animals in a Skinner box will press a lever to obtain special tokens that can be exchanged for food, or to make a bell ring whose sound they have learned to associate with the appearance of food. Avoidance learning illustrates a variant of secondary reinforcement through punishment. The animal performs certain actions when a signal appears that, although not itself unpleasant, regularly accompanies some unpleasant event. For example, a dog that is often beaten cowers and runs away when its owner raises his hand, although there is nothing dangerous in the raised hand itself. When positive and negative secondary reinforcement is used to control behavior, there is no need for frequent actual rewards or punishments. Thus, when animals are trained by the method of successive approximations, the reinforcement for each attempt is usually only the clicking sound that previously regularly accompanied the appearance of food.

A. Bandura's theory of social learning

A. Bandura criticized radical behaviorism, which denied the determinants of human behavior arising from internal cognitive processes. For Bandura, individuals are neither autonomous systems nor mere mechanical conveyors of the influences of their environment; they possess higher abilities that enable them to predict the occurrence of events and create the means to exercise control over what affects their daily lives. Insofar as traditional theories of behavior were incorrect, it was because they provided an incomplete rather than an inaccurate explanation of human behavior.

From A. Bandura's point of view, people are neither driven by intrapsychic forces nor simply reactive to their environment. The causes of human functioning must be understood in terms of the continuous interaction of behavior, cognition, and environment. This approach to the analysis of the causes of behavior, which Bandura called reciprocal determinism, implies that dispositional factors and situational factors are interdependent causes of behavior.

Human functioning is viewed as a product of the interaction of behavior, personality factors, and environmental influences.

Simply put, internal determinants of behavior, such as belief and expectation, and external determinants, such as reward and punishment, are part of a system of interacting influences that act not only on behavior, but also on various parts of the system.

Bandura's triadic model of reciprocal determinism shows that while behavior is influenced by the environment, it is also partly the product of human activity, meaning that people can have some influence on their own behavior. For example, a person's rude behavior at a dinner party can make the reactions of those nearby more likely to punish than to encourage him. In any case, behavior changes the environment. Bandura also argued that because of their extraordinary ability to use symbols, people can think, create, and plan; that is, they are capable of cognitive processes that are constantly manifested in overt actions.

Each of the three variables in the reciprocal determinism model is capable of influencing the others. Depending on the strength of each variable, first one, then another, then the third dominates. Sometimes the influences of the external environment are strongest, sometimes internal forces dominate, and sometimes expectations, beliefs, goals, and intentions shape and guide behavior. Ultimately, however, Bandura believes that because of the bidirectional interaction between overt behavior and environmental circumstances, people are both the product and the producer of their environment. Thus, social cognitive theory describes a model of reciprocal causation in which cognitive, affective, and other personality factors and environmental events operate as interdependent determinants.

Foreseen consequences. Learning researchers emphasize reinforcement as a necessary condition for the acquisition, maintenance, and modification of behavior. Thus, Skinner argued that external reinforcement is necessary for learning.

A. Bandura, although he recognizes the importance of external reinforcement, does not regard it as the only way in which behavior is acquired, maintained, or changed. People can learn by observing, reading about, or hearing about other people's behavior. As a result of previous experience, people may expect certain behaviors to produce consequences they value, others to produce undesirable outcomes, and still others to be ineffective. Our behavior is therefore governed to a large extent by anticipated consequences. In each case, we are able to imagine in advance the consequences of inadequate preparation for action and take the necessary precautions. Through our ability to represent actual outcomes symbolically, future consequences can be translated into immediate incentives that influence behavior in much the same way as potential consequences. Our higher mental processes give us the capacity for foresight.

At the core of social cognitive theory is the proposition that new forms of behavior can be acquired in the absence of external reinforcement. Bandura notes that much of the behavior we exhibit is learned through example: we simply observe what others do and then imitate their actions. This emphasis on learning through observation or example rather than direct reinforcement is the most characteristic feature of Bandura's theory.

Self-regulation and cognition in behavior. Another characteristic feature of social cognitive theory is the important role it assigns to a person’s unique capacity for self-regulation. By arranging their immediate environment, providing cognitive support, and being aware of the consequences of their own actions, people are able to exert some influence on their behavior. Of course, the functions of self-regulation are created, and not infrequently supported, by environmental influences. They are thus of external origin, but it should not be overlooked that, once established, internal influences partly regulate what actions a person performs. Further, Bandura argues that higher intellectual abilities, such as the ability to manipulate symbols, give us a powerful means of influencing our environment. Through verbal and figurative representations, we process and store experiences in such a way that they serve as guides for future behavior. Our ability to form images of desired future outcomes results in behavioral strategies designed to guide us toward distant goals. Using symbolic ability, we can solve problems without resorting to trial and error, and can thus anticipate the likely consequences of various actions and change our behavior accordingly.

Hypothetico-deductive method

Since Clark L. Hull gravitated toward the exact sciences, he sought to create a universal scientific method for psychology. He wanted to solve the problem of the extreme subjectivity of the science and believed that the large number of approaches only complicated the situation: each concept tried, unsuccessfully, to become comprehensive and applicable to any problem in psychology. Existing diagnostic methods did not inspire his confidence either [5]. In his work “Mathematico-Deductive Theory of Rote Learning,” Hull described the methods of scientific knowledge that he considered the most objective:

Observation;
Controlled observation;
Experimental testing of the hypothesis;
Hypothetico-deductive method. [2]

While three of these methods were already known in psychology, the fourth was Hull's innovation. The hypothetico-deductive method was understood as a method of testing hypotheses according to a strictly defined plan:

The researcher determines a system of primary axioms for the problem under study (the axiom often includes the use of formulas and certain variables necessary for the scientist to better structure the initial information);
Primary axioms are taken as the basis for drawing up a hypothesis/theorem that needs to be tested;
The scientist conducts an experimental test of the hypothesis/theorem. There are then two possible outcomes: empirical testing either confirms the predictions or fails to confirm them;
If the experiment is considered successful (if the initial hypotheses and predictions are confirmed), then the hypothesis is retained. If the experiment was not successful (predictions were not confirmed empirically), then the hypothesis is reformulated or discarded completely. [6]
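
The four steps above can be sketched as a simple retain-or-discard loop. This is a minimal illustration under stated assumptions, not Hull's actual procedure or equations: the candidate hypotheses, the tolerance of 0.5, and the observations are invented toy values, and the names `is_confirmed` and `retain_confirmed` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Prediction = Callable[[float], float]
Observation = Tuple[float, float]  # (number of reinforced trials, measured response)


def is_confirmed(predict: Prediction, observations: Sequence[Observation],
                 tolerance: float = 0.5) -> bool:
    """A hypothesis is confirmed if every prediction falls within the
    tolerance of the corresponding observation."""
    return all(abs(predict(x) - y) <= tolerance for x, y in observations)


def retain_confirmed(candidates: Sequence[Tuple[str, Prediction]],
                     observations: Sequence[Observation]) -> List[str]:
    """Keep the hypotheses whose predictions survive the experimental test;
    the rest are to be reformulated or discarded."""
    return [name for name, predict in candidates if is_confirmed(predict, observations)]


# Hypotheses derived from some assumed primary axioms (toy examples).
candidates = [
    ("response grows linearly with trials", lambda n: 2.0 * n),
    ("response does not depend on trials", lambda n: 2.0),
]

# Invented toy measurements, used only to exercise the loop.
observations = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9)]

retained = retain_confirmed(candidates, observations)
print(retained)
```

In this sketch a rejected hypothesis is simply dropped; reformulating it and running the test again would correspond to adding a new entry to `candidates`.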

Clark L. Hull believed that with careful study and experimental testing, the hypotheses that survived would themselves develop into a set of first principles consistent with the results of experimental tests of new hypotheses. Such an approach, in his opinion, should lead to the renewal and improvement of psychology and, finally, to its truly scientific unification. In essence, he shared the aspiration characteristic of behaviorism and neobehaviorism in general: to make psychology more rigorously scientific, in contrast to unsupported claims. Hull wanted to create his own objective methodological system that would be able to “predict both the behavior of a rat in a maze and the behavior of a person under the influence of everyday conditions” [7, p. 515].

Hull himself used the hypothetico-deductive approach to construct his concept of learning.