The differential reinforcement of successive approximations, more commonly called shaping, is a conditioning procedure used primarily in the experimental analysis of behavior. It was introduced by B. F. Skinner with pigeons and later extended to dogs, dolphins, humans and other species. In shaping, the form of an existing response is gradually changed across successive trials toward a desired target behavior by rewarding exact segments of behavior. Skinner's explanation of shaping was this:
We first give the bird food when it turns slightly in the direction of the spot from any part of the cage. This increases the frequency of such behavior. We then withhold reinforcement until a slight movement is made toward the spot. This again alters the general distribution of behavior without producing a new unit. We continue by reinforcing positions successively closer to the spot, then by reinforcing only when the head is moved slightly forward, and finally only when the beak actually makes contact with the spot. . . . The original probability of the response in its final form is very low; in some cases it may even be zero. In this way we can build complicated operants which would never appear in the repertoire of the organism otherwise. By reinforcing a series of successive approximations, we bring a rare response to a very high probability in a short time. . . . The total act of turning toward the spot from any point in the box, walking toward it, raising the head, and striking the spot may seem to be a functionally coherent unit of behavior; but it is constructed by a continual process of differential reinforcement from undifferentiated behavior, just as the sculptor shapes his figure from a lump of clay.
Reinforcement, in Skinner's case, meant delivery of food to reward a particular behavior.
The successive approximations reinforced are increasingly accurate approximations of a response desired by a trainer. As training progresses, the trainer stops reinforcing the less accurate approximations. For example, in training a rat to press a lever, the following successive approximations might be reinforced:
- simply turning toward the lever will be reinforced
- only stepping toward the lever will be reinforced
- only moving to within a specified distance from the lever will be reinforced
- only touching the lever with any part of the body, such as the nose, will be reinforced
- only touching the lever with a specified paw will be reinforced
- only depressing the lever partially with the specified paw will be reinforced
- only depressing the lever completely with the specified paw will be reinforced
The trainer would start by reinforcing all behaviors in the first category, then restrict reinforcement to responses in the second category, and then progressively restrict reinforcement to each successive, more accurate approximation. As training progresses, the response reinforced becomes progressively more like the desired behavior.
The culmination of the process is that the strength of the response (measured here as the frequency of lever-pressing) increases. At the beginning, the rat is unlikely to depress the lever except by accident. Through training, the rat can be brought to depress the lever frequently.
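The progressive restriction of reinforcement described above can be sketched as a small simulation. This is only an illustrative sketch: the numeric ordering of the approximations, the function name `shape`, and the rule of advancing the criterion after a fixed number of reinforced responses (`advance_after`) are assumptions made for the example, not part of Skinner's procedure as described here.

```python
# Hypothetical ordering of the lever-press approximations listed above,
# from least accurate (index 0) to the target response (last index).
APPROXIMATIONS = [
    "turn toward lever",
    "step toward lever",
    "move within a specified distance of lever",
    "touch lever with any body part",
    "touch lever with specified paw",
    "partially depress lever with specified paw",
    "fully depress lever with specified paw",
]


def shape(observed, advance_after=3):
    """Replay observed responses (indices into APPROXIMATIONS).

    A response is reinforced only if it is at least as accurate as the
    current criterion. After `advance_after` reinforced responses at one
    criterion (an assumed rule), the criterion moves up one step, so less
    accurate approximations stop being reinforced.
    Returns (list of reinforced flags, final criterion index).
    """
    criterion = 0
    reinforced_at_criterion = 0
    reinforced = []
    for response in observed:
        meets_criterion = response >= criterion
        reinforced.append(meets_criterion)
        if meets_criterion:
            reinforced_at_criterion += 1
            at_target = criterion == len(APPROXIMATIONS) - 1
            if reinforced_at_criterion >= advance_after and not at_target:
                criterion += 1
                reinforced_at_criterion = 0
    return reinforced, criterion


flags, final = shape([0, 0, 0, 0, 1, 1, 1, 0, 2])
print(flags, APPROXIMATIONS[final])
```

In this replay, the fourth and eighth responses (plain turning) go unreinforced because the criterion has already moved past them, which mirrors the trainer ceasing to reinforce the less accurate approximations.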
Successive approximation should not be confused with feedback processes, as feedback generally refers to numerous types of consequences. Notably, consequences can also include punishment, while shaping instead relies on the use of positive reinforcement. Feedback also often denotes a consequence for a specific response out of a range of responses, such as the production of a desired note on a musical instrument versus the production of incorrect notes. Shaping, on the other hand, involves the reinforcement of each intermediate response that more closely resembles the desired response.
Shaping is used in two areas of psychology: in training operant responses in laboratory animals, and in applied behavior analysis or behavior modification to change human or animal behaviors considered to be maladaptive or dysfunctional. It also plays an important role in commercial animal training.
Shaping assists in "discrimination", which is the ability to tell the difference between stimuli that are and are not reinforced, and in "generalization", which is the application of a response learned in one situation to a different but similar situation.
Autoshaping (sometimes called "sign tracking") is any of a variety of experimental procedures used to study classical conditioning in pigeons. In autoshaping, in contrast to shaping, food comes irrespective of the behavior of the pigeon. In its simplest form, autoshaping is very similar to Pavlov's salivary conditioning procedure using dogs. In Pavlov's best-known procedure, a short audible tone reliably preceded the presentation of food to dogs. The dogs naturally, unconditionally, salivated (unconditioned response) to the food (unconditioned stimulus) given them, but through learning, conditionally, came to salivate (conditioned response) to the tone (conditioned stimulus) that predicted food. In autoshaping, a light is reliably turned on shortly before pigeons are given food. The pigeons naturally, unconditionally, peck at the food given them, but through learning, conditionally, come to peck at the light source that predicts food. In most experiments, the light source is a lighted key.
Autoshaping provides an interesting conundrum for B. F. Skinner’s assertion that one must employ shaping as a method for teaching a pigeon to peck a key. After all, if an animal can shape itself, why use the laborious process of shaping? Autoshaping also contradicts Skinner’s principle of reinforcement. During autoshaping, food comes irrespective of the behavior of the animal. If reinforcement were occurring, random behaviors should increase in frequency because they would have been rewarded by randomly timed food. Nonetheless, key-pecking reliably develops in pigeons, even though this behavior has never been rewarded.
The clearest evidence that autoshaping is under Pavlovian and not Skinnerian control was found using the omission procedure. In that procedure, food is normally scheduled for delivery following each keylight presentation, except when the bird actually pecks at the key, in which case food is withheld. If the behavior were under instrumental control, the pigeon would stop pecking the key, as pecking is followed by the withholding of grain. Instead, pigeons persist in pecking the key for thousands of trials (a phenomenon known as negative automaintenance), unable to cease pecking at the lit key even when it costs them food to do so.
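The difference between the two contingencies described above can be made explicit in a short sketch. The function names `food_delivered` and `food_count` and the procedure labels are hypothetical, introduced only for illustration; the sketch encodes the scheduling rules as stated in the text and does not model the pigeon's behavior.

```python
def food_delivered(pecked_key, procedure):
    """Return True if food follows a single keylight presentation.

    procedure is one of two labels assumed for this example:
      "autoshaping": food follows every keylight, whatever the bird does.
      "omission":    food follows the keylight unless the bird pecked the key.
    """
    if procedure == "autoshaping":
        return True
    if procedure == "omission":
        return not pecked_key
    raise ValueError(f"unknown procedure: {procedure!r}")


def food_count(pecks, procedure):
    """Count food deliveries over a session; pecks holds one bool per trial."""
    return sum(food_delivered(p, procedure) for p in pecks)


# A made-up session in which the bird pecks on three of five trials.
session = [True, True, False, True, False]
print(food_count(session, "autoshaping"))  # every trial ends with food
print(food_count(session, "omission"))     # only the no-peck trials do
```

Under the omission rule every peck lowers the amount of food obtained, which is why continued pecking is hard to explain as instrumentally reinforced behavior.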
- Animal testing
- Behavior therapy
- B. F. Skinner
- Operant conditioning
- Applied behavior analysis
- Society for Quantitative Analysis of Behavior
- Peterson, G. B. (2004). A day of great illumination: B. F. Skinner's discovery of shaping. Journal of the Experimental Analysis of Behavior, 82: 317–328. http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pmcentrez&artid=1285014
- Skinner, B. F. (1953). Science and human behavior. pp.92-93. Oxford, England: Macmillan.
- Barbara Engler "Personality theories"
- Brown, P., & Jenkins, H. M. (1968). Auto-shaping of the pigeon’s key peck. Journal of the Experimental Analysis of Behavior, 11: 1–8.
- see Sheffield, 1965; Williams & Williams, 1969
- Killeen 2003