Putting the nail in the positive/negative/reinforcement/punishment coffin.

Lessons
 Posted by jeremy on August 20th, 2009

Many texts explain basic operant conditioning and behaviorism in Skinnerian terms. They usually can’t avoid mentioning B. F. Skinner in the first paragraph or two, but then they explain the difference between reinforcement and punishment like this:

              decreases likelihood of behavior    increases likelihood of behavior
presented     positive punishment                 positive reinforcement
taken away    negative punishment                 negative reinforcement

Basically, the distinction between punishment and reinforcement is this: something that decreases behavior is punishment, while something that increases behavior is reinforcement. Either can be positive (if the stimulus is administered) or negative (if the stimulus is removed).
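To make the textbook scheme concrete, here is a minimal Python sketch of the table as a two-question lookup. The names `Change`, `Effect`, and `textbook_label` are my own, made up for illustration; they come from no text or library, and the sketch encodes only the table's labels, not Skinner's usage.

```python
from enum import Enum


class Change(Enum):
    """What happens to the stimulus (hypothetical names for the table's rows)."""
    PRESENTED = "presented"
    TAKEN_AWAY = "taken away"


class Effect(Enum):
    """What happens to the behavior (hypothetical names for the table's columns)."""
    DECREASES = "decreases likelihood of behavior"
    INCREASES = "increases likelihood of behavior"


def textbook_label(change: Change, effect: Effect) -> str:
    """Return the table's label for one row/column combination."""
    # Row: presented -> "positive", taken away -> "negative".
    sign = "positive" if change is Change.PRESENTED else "negative"
    # Column: increases -> "reinforcement", decreases -> "punishment".
    kind = "reinforcement" if effect is Effect.INCREASES else "punishment"
    return f"{sign} {kind}"


# Print all four cells of the table.
for change in Change:
    for effect in Effect:
        print(f"{change.value:<11}{effect.value:<34}{textbook_label(change, effect)}")
```

Notice that the label depends entirely on how you choose to describe the stimulus and the behavior.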

There are three problems with this model: First, as Blackman (1974) put it, this definition is “rather different from that prompted by common sense.” This model’s definition of punishment is wildly different from the vernacular definition that dates to the 12th century. (According to the Random House Dictionary, poena, the root of punish, is related to penalty and pain.)

Second, despite claims to the contrary, there is no functional distinction between “positive punishment” and “negative reinforcement.” For example, two psychologists tried to explain:

You have an electronic fence around your yard. Your dog wears a collar that gives him a small electric shock every time he tries to cross the wire buried in the yard. The aversive consequence is the shock, the behavior that is reduced or eliminated is walking or running across the barrier. After several exposures to the shock the dog learns by punishment learning not to run or walk across the barrier.

Negative reinforcement is a different way of learning where a behavior is made more likely to occur because some unpleasant consequence is removed or avoided. Where punishment decreases or eliminates a behavior, negative reinforcement has the opposite effect of increasing behavior.

So when the dog stays in the yard (an increased behavior) because he is not getting shocked (an unpleasant consequence avoided) that’s negative reinforcement. But when the dog doesn’t leave the yard (decreased behavior) because he gets shocked when he does (unpleasant consequence administered), it’s punishment. Got that?

Third, and most importantly, this model of positive/negative/reinforcement/punishment does not reflect Skinner’s theories or his experimental results. I do not mean that the model is wrong (though my second point reveals a fundamental weakness), but it’s problematic that many texts present Skinner as the father of operant conditioning and then present ideas he rejected without noting the incongruence.

Here is how Skinner himself described positive reinforcement in Walden Two (pp. 259-260; emphasis added):

[If] it’s in our power to create any of the situations which a person likes or to remove any situation he doesn’t like, we can control his behavior. When he behaves as we want him to behave, we simply create a situation he likes, or remove one he doesn’t like. As a result, the probability that he will behave that way again goes up, which is what we want. Technically it’s called ‘positive reinforcement’.

While this explanation may fit into the table’s classification of positive reinforcement, it encroaches on “negative reinforcement” because he explicitly states that removing an unpleasant stimulus is also positive reinforcement. For Skinner, the positive/negative dichotomy referred to whether the consequence of the behavior was appetitive (pleasant) or aversive (unpleasant).

Skinner also explained that it is dangerous to place punishment, “removing a situation a person likes or setting up one he doesn’t like,” in symmetry with positive reinforcement. Punishment does not lead to lasting behavioral modification and it presents a slew of socially and individually damaging side effects. In fact, I believe the reason Skinner preferred the term punishment over negative reinforcement – even though he saw them as synonymous – was that he wanted to emphasize that they are not on the same plane.

So, where did this commonly accepted model originate if not from the father of behaviorism? As Holth pointed out in a 2005 article in The Behavior Analyst Today, Azrin and Holz first proposed this newer, less effectual model in 1966 (in a chapter on punishment in a volume edited by Honig). For them, giving a cookie to a child when he doesn’t wet his bed is actually punishment because its purpose is to reduce a behavior.

As ridiculous as that sounds, it is more preposterous that this view would be ascribed to Skinner.
