Reinforcement in psychology and its positive and negative examples

B.F. Skinner, one of the main theorists of behaviorism, defined reinforcement as a type of learning in which a behavior becomes associated with the consequences that follow it, increasing the likelihood that the behavior will be repeated. When the consequences are unpleasant, we talk about punishment; when they are pleasant, we talk about reward or praise. Within reinforcement itself, experts distinguish two types: positive and negative.

While positive reinforcement occurs when a behavior is followed by something pleasant, negative reinforcement consists of avoiding or removing an aversive stimulus. Let's look at the main features of both procedures and at how you can use reinforcement in everyday life.

In this article:

What is positive reinforcement
Examples of positive reinforcement in the family
What is negative reinforcement
Primary reinforcers - satisfaction of basic needs
Secondary reinforcers - reward is not immediate
Mixing different reinforcers
Unwanted positive reinforcers

What is positive reinforcement

In positive reinforcement training, performing a behavior is associated with pleasant consequences. The reward doesn't have to be an object, or anything material at all.

Food, a caress, a smile, a kind word, or anything else that produces pleasant emotions can serve as a positive reinforcer in many contexts.

A mother who congratulates her young daughter every time she uses the toilet correctly promotes learning through positive reinforcement.

The same thing happens when a company pays a bonus to its most productive employees, and even gambling winnings can be viewed this way. In psychology, however, the term "positive reinforcer" refers to the reward that follows the behavior, while "positive reinforcement" is the process by which the learner forms the association.

In technical terms, with positive reinforcement there is a positive contingency between a particular response and a pleasant stimulus. Awareness of this contingency motivates the subject to perform the behavior in order to receive the reward (the reinforcer).

Method 6: Reinforce behavior change

You reinforce every behavior other than the unwanted one. For example, a child asks you for an expensive gift that you are not going to buy, and you have already told him so. Yet he whines and whines. You do not react to the whining (this is the extinction method). It is important not to react at all, in any way, and not to show your reluctance to discuss the topic. If, while he whines about the gift, you keep repeating “I don’t even want to discuss this with you!” or “How long are you going to whine? You can see I’m not responding to your requests!”, the child sees perfectly well that you are reacting, and how! But as soon as the child starts talking about something else, respond to him promptly. It is important to reinforce the change of topic: notice it immediately and support it. Don't miss this moment.

Examples of positive reinforcement in the family

Positive reinforcement should be used in moderation.

There are many situations in which parents praise their children. However, for reinforcement to remain meaningful, a child should not come to expect a reward for every little thing.

In the long run, it should go without saying that you clear your own desk or take out your trash. That does not necessarily mean, however, that praise should be withheld at this stage.

See how positive reinforcement works in the family and how it can be implemented in different ways:

In the evening, the child clears the table, even if he is not asked to do so. As a direct consequence, he is allowed to stay awake 10 minutes longer.
Your child cleans his room. Praise him and show him that you are pleased.
If a school report from a teacher is positive, many parents reward their child with money or a toy.

If you want to use positive reinforcement to your advantage, make sure that the appropriate reward comes as soon as possible.

If too much time passes between the action and the reward, the connection is lost and the desired effect (repetition of the behavior) does not materialize.

Positive reinforcement method

Unlearning

Karen Pryor also writes about the process of unlearning, that is, what to do when there is some unwanted behavior you want to get rid of. She gives 8 principles of unlearning. The first four are negative, and the last four are positive. As you can guess, the second half of the principles works better and produces lasting results.

Kill, delete, get rid of. Simply remove the source of the behavior, or restrict it so that it physically cannot perform the unwanted action.
Punishment. Put a child in the corner, hit a dog with a stick, deprive a programmer of a bonus.
Negative reinforcement.
Extinction. You pay no attention to the unwanted behavior and do not reinforce it in any way, neither negatively nor positively.
Development of incompatible behavior. Develop a new behavior that is incompatible with the unwanted one.
Ensure that the behavior occurs on a signal, and then gradually remove the signal.
Shaping the absence. Anything except the unwanted behavior is reinforced.
Change of motivation. Determine why and for what purpose the unwanted behavior occurs, and try to replace the goal of the behavior with a more appropriate one.

PS:

Karen Pryor writes a lot about animal training, but the same principles can be applied just as successfully in everyday life. While reading the book, I noticed how well positive reinforcement worked on me personally. I can say that by mastering the science outlined in the book, you really can get +1 to communication, as announced on its cover.

What is negative reinforcement

Unlike positive reinforcement, in negative reinforcement the instrumental response leads to the disappearance of an aversive stimulus, that is, an object or situation that prompts the subject to escape or to avoid contact with it.

From a behavioral point of view, the reinforcer in this procedure is the disappearance or absence of the aversive stimulation. The term "negative" refers to the fact that the reward is not the receipt of a stimulus, but its absence.

In one form of negative reinforcement (avoidance), the behavior prevents the aversive stimulus from appearing at all. For example, a person suffering from agoraphobia deliberately does not use public transport in order to avoid an attack of fear.

In the other form (escape), the aversive stimulus is already present and disappears once the subject performs the behavior.

Examples include an annoying alarm clock that stops at the press of a button, a mother who buys her child something so that he stops crying, or taking a painkiller when you are in pain.

Now let's talk about some nuances.

Shaping process

When the subject is already doing what is needed and the behavior simply has to be reinforced, everything is more or less clear.
But what should you do if the desired behavior does not exist yet and there seems to be nothing to reinforce? Shaping consists of taking the slightest tendency toward the desired behavior and moving it step by step toward the goal. Break the final goal into a series of smaller, sequential goals. As a first step, find some behavior that is already happening. The subject will often perform the desired task (or part of it) by accident; when that happens, be sure to notice the behavior and reinforce it. Below are the 10 rules of shaping, which the author examines in detail. A full description will not fit within the scope of this article, but here is a brief overview.

Raise the criterion little by little, so that the learner always has a realistic chance of meeting it and receiving reinforcement.
Practice one thing at a time. Don't try to work on several criteria simultaneously.
Before raising the criterion, consolidate the current level.
When introducing new criteria, temporarily relax the old ones.
Plan your training program so that you are always ready for sudden progress by the learner.
Do not change trainers while developing a specific skill.
If one way of shaping a behavior doesn't bring success, find another. There are many of them.
Don't end a training session without providing positive reinforcement. This is tantamount to punishment.
If a skill deteriorates, quickly go back through the entire previous learning process, giving reinforcement along the way.
End each session on a high note. The end of training should be joyful, not sad.

Primary reinforcers – satisfaction of basic needs

In practice, however, reinforcement is not so simple, because many questions are judged subjectively. A striking example is the belief that a baby can be “trained” to demand being held if his parents hug him at the first cry.

But it is important to remember that in psychology, primary reinforcers are those that directly address a person's needs.

For babies and toddlers, the most important needs are food and drink, as well as love and closeness. Meeting these needs should never be made conditional, so that children can develop the basic trust they need.

Positive and negative reinforcement should only be used as something extra, beyond the usual level of need satisfaction.

There's nothing wrong with after-dinner dessert, sweets, or a hug from your parents.

Method 5. Ensure that unwanted behavior occurs on cue.

Then, later on, you stop giving this cue.

There is a parable about a wise old man who valued peace and quiet. A noisy group of children got into the habit of playing near his house. One day the old man came out to the children and gave them a coin, saying that he really liked listening to their cheerful screams. The next day he gave them a coin again. This went on for some time. Then the old man came out to the children and said that he no longer had money for them. The children replied: “Do you take us for idiots? We’re not going to scream for you for free!” and left.

Suppose the child is noisy and angry. Invite him to make as much noise as possible together with you, on command. Do this a couple of times. First, it's fun and unusual. Second, such an activity takes a lot of energy, and the child tires quite quickly. After that, simply stop giving the command. Or suppose the child makes a mess in his room and throws his things around. Agree to make as much of a mess as possible in the room within 5 minutes. Perhaps the child had not noticed his scattered things at all before. Now he will. After he restores order (perhaps with your help), do not give such commands anymore.

Yes, this requires a certain courage and imagination. Of course, raising children is a challenge and requires creativity.

Secondary reinforcers – the reward is not immediate

Unlike primary reinforcers, which satisfy a need directly, secondary reinforcers satisfy a need only indirectly.

The simplest example is money. If a person receives a certain amount of money for certain activities, he can later buy something for himself. Again, this could cover basic needs such as food or clothing.

In families, some parents also use a kind of token system. Positive behavior is marked with a star. Once a certain number of stars has been collected, the child can choose something from the store.

For example, these could be simple things like eating ice cream after getting five stars or going to the zoo after getting 25 stars.
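The star chart described above can be written down as a small bookkeeping sketch. This is only an illustration: the `StarChart` class and its method names are made up for this example, the prices are the two from the text, and the rule that redeemed stars are used up is an assumption (the article does not say whether stars are spent).

```python
from dataclasses import dataclass

# Star prices taken from the examples in the text; a real chart would be
# agreed on with the child.
REWARDS = {"ice cream": 5, "zoo visit": 25}


@dataclass
class StarChart:
    """Hypothetical token chart: stars are earned now, rewards come later."""
    stars: int = 0

    def award(self, count: int = 1) -> None:
        """Record stars for a desired behavior."""
        if count < 1:
            raise ValueError("award at least one star")
        self.stars += count

    def affordable(self) -> list[str]:
        """Rewards the child could choose with the current number of stars."""
        return [name for name, cost in REWARDS.items() if cost <= self.stars]

    def redeem(self, reward: str) -> bool:
        """Exchange stars for a reward; returns False if there are too few."""
        cost = REWARDS[reward]
        if cost > self.stars:
            return False
        self.stars -= cost  # assumption: redeemed stars are used up
        return True
```

After five calls to `award()`, `affordable()` lists only the ice cream. The zoo visit stays out of reach until 25 stars have been saved, which is exactly the delayed reward that makes stars a secondary reinforcer.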

Method 7: Changing Motivation

This is the best method, but also the most difficult. A change in motivation means that the child no longer wants to do what you consider bad, or wants to do what you consider good. How it works: the child's behavior is related to his needs.

Imagine that your child is irritable and speaks to you rudely.

Perhaps this happens because he is tired and hasn’t gotten enough sleep. Help him establish a proper routine, and the irritation will disappear. If his rudeness comes from a lack of self-confidence and an attempt to take it out on you, find ways to strengthen his confidence in himself. Or maybe he is being rude because he is upset about a fight with his friends. Support him and show him that you understand his feelings, but don’t pester him with advice. This way you can better help him cope with his distress.

Mixing different reinforcers

Many different types of reinforcers are used to facilitate operant learning. They cannot always be placed in a clear category: some are neither clearly negative nor clearly positive.

In general, however, there are three different types of reinforcers:

Material reinforcers.
Social reinforcers. These are words of praise and recognition, although a reassuring smile or a friendly nod may be enough.
Activity reinforcers. Here the reward is a chosen activity, such as a visit to the zoo, a movie evening together, or a concert.

It is better to avoid material incentives as much as possible.

Negative reinforcement is a manipulative technique.

and the trade union? “A person goes through a certain system of situations, and each of these situations is imprinted on him as his morphological characteristics. And a person is a trace, an imprint of those situations that he went through, and these situations are nothing more than a contemporary social or socio-cultural organization, and a person has no choice, except for a choice of this type: either to stay or to leave. He cannot stay because he has already been carried away. And he can’t leave either, because it’s actually a blow to himself. The receipt is that he cannot. What choice is there? Between what and what?

Teachers of the late 19th century built the education system as a system that opposed this mechanism of rigid “dragging”, involving a person in activity and in thinking. They recorded the genesis factor. It was objectified in the idea of human development. Starting with Comenius, this line acquires a very powerful sound. And they tried to build an alternative structure in which a person could leave the totality of social relations and go through some situations that are characteristic of a given social structure, no longer forcibly, but as if moving along a certain path. In the idea of a closed school, a person objectifies his experience too, since he clearly understands that a person is completely involved in all this. And no matter how much you invent and show off, the machine works.

Therefore, all pedagogy is, in a certain sense, a reflection of life and a reflection of the fact of involvement. And therefore, the introduction of a genetic factor and an attempt to design a sequence of situations that a person must go through. But at the same time, another fact is realized: no matter how much we design all this, a person is still dragged through these situations. Is there another street? There is modern production, which drags a person into the system of division of labor and is interested only in his working part.

And between ethics and the theory of activity, ethics and the theory of thinking, a paradox arises, which has been very well discussed: about thinking and activity in one form or another to questions of ethics and self-determination and ask the question: What does this mean? If a person begins to think about a person and abandons the idea of “stop descriptions” (fixing a snapshot of what exists today), then he enters the area of human projects and the human project of collective mental activity, and is removed from the situation of the described paradoxes. And it is the response of the system-thought-activity project to this sum of paradoxes, an attempt to articulate the compulsion of the deployment of activity and freedom.

One could, for example, say: only in collective mental activity is a person free. Or, rather, this way: only in collective mental activity can he be free. Whether he is free or not is another question, and all these structures of a person’s existence and his soldering into activity, into a group, into a class and into all other social structures, his dissolution in them, are only the effects of the fact that knowledge works in a certain way, and when knowledge does not create any support for what we call a person due to the presence of structures of modern communication and the availability of knowledge and communication as a factor of democratization: not all knowledge, when launched into the structures of mass communication, produces a shift towards democratization. Only strictly defined knowledge. Because subject knowledge does not democratize, but vice versa.

It fragments into professional spheres according to the principle of inclusion, group cohesion, and even the professional thinking provided to us by the Jesuit order (as an inheritance and the opportunity to survive in the conditions of the church hierarchy): even today it is actually dismantled and melted in the processes of group inclusion of a person in which he with greater joy prefers to renounce human existence and become an element of activity, an element of a group, and who wants to be a person? And it turns out that along this parameter there will immediately be a dividing line, because they don’t want to be people. Why? Well-fed, no need to think. (c)

Challenges Facing Reinforcement Learning

Reinforcement learning, although it has great potential, can be difficult to deploy and unfortunately remains limited in application. One of the obstacles to deploying this branch of machine learning is its dependence on exploring the environment.

For example, if you deploy a robot that uses reinforcement learning to navigate its environment, it will seek out new states and try different actions as it moves. However, it is difficult to take the best actions consistently when the environment changes frequently. If the robot's environment is your home, then after you rearrange objects or furniture, the device will have to adapt to the new conditions from scratch.
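The furniture example can be made concrete with a minimal sketch. It assumes tabular Q-learning on a tiny grid, which is one common reinforcement learning method and not something the article specifies. The floor plans, the function names, and the parameter values are all hypothetical; the robot explores with purely random moves, which is enough here because Q-learning learns the value of the greedy policy regardless of how it explores.

```python
import random

ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

# Hypothetical floor plans: S = start, G = goal, # = furniture, . = free floor.
LAYOUT_A = ["S..",
            "##.",
            "G.."]
LAYOUT_B = ["S..",   # same room after the furniture has been moved
            ".##",
            "G.."]


def find(layout, symbol):
    """Return the (row, column) of the first cell containing `symbol`."""
    for r, row in enumerate(layout):
        c = row.find(symbol)
        if c != -1:
            return (r, c)
    raise ValueError(f"{symbol!r} not found in layout")


def step(layout, state, action):
    """Move one cell; bumping into furniture or a wall leaves the robot in place."""
    dr, dc = ACTIONS[action]
    r, c = state[0] + dr, state[1] + dc
    blocked = not (0 <= r < len(layout) and 0 <= c < len(layout[0]))
    if blocked or layout[r][c] == "#":
        r, c = state
    reward = 1.0 if layout[r][c] == "G" else 0.0
    return (r, c), reward


def train(layout, episodes=2000, max_steps=200, alpha=1.0, gamma=0.9, seed=0):
    """Learn a fresh Q-table for one layout using random exploration."""
    rng = random.Random(seed)
    q = {}  # (state, action) -> estimated value, missing entries count as 0
    start, goal = find(layout, "S"), find(layout, "G")
    for _ in range(episodes):
        state = start
        for _ in range(max_steps):
            action = rng.choice(list(ACTIONS))
            nxt, reward = step(layout, state, action)
            # The goal is terminal, so nothing is bootstrapped from it.
            best_next = 0.0 if nxt == goal else max(
                q.get((nxt, a), 0.0) for a in ACTIONS)
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
            if state == goal:
                break
    return q


def greedy_steps(q, layout, max_steps=50):
    """Follow the learned policy; return steps to the goal, or None if stuck."""
    state, goal = find(layout, "S"), find(layout, "G")
    for n in range(1, max_steps + 1):
        action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
        state, _ = step(layout, state, action)
        if state == goal:
            return n
    return None
```

A policy trained with `train(LAYOUT_A)` takes the long way around the furniture in layout A. Run unchanged in layout B, it keeps driving into the moved furniture and `greedy_steps` returns `None`, even though the goal is now only two moves away. Only a new call to `train(LAYOUT_B)` finds the short route, which is the costly re-adaptation the paragraph above describes.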

The time needed to train a reinforcement learning system properly can limit its usefulness, and training may require significant computing resources. As learning environments become more complex, the demands on time and computing resources grow with them. These are the problems that reinforcement learning specialists will have to solve in the near future.