
The science behind dog training

Updated: Apr 4

If you know anything about psychology, you'll know that dog training and behavioural training come from psychological studies of learning theory and behaviour in a whole host of animals and in humans. Having studied psychology myself for over two years, I was surprised by how much crossover there is between human psychology and dog psychology, and that the scientific studies I covered on human behaviour and learning theory are the same ones used in dog learning and behavioural training. I'm going to focus mainly on the fundamentals of learning theory: the initial studies that became the foundation on which all later learning and behavioural theory has been built.

 

If you don't want to read it all, feel free to jump to the bottom and just read the practical applications of the studies and the section on how this is all relevant to dog training today.

 

Let's start with one of the original studies, carried out in 1902 by Ivan Pavlov, a Russian physiologist. Pavlov taught dogs to associate the ringing of a bell with food being presented. By the end of the study, the dogs produced a physical response (salivating) when the bell was rung, meaning they had learnt an association between the two stimuli. This was called Classical Conditioning, which simply means that two stimuli are linked together to produce a learned response in a person or animal. Pavlov's study concluded that you could shape a dog's behaviour using classical conditioning to create a required response.




 

So for example, if you are teaching your dog to sit, you use a treat to get them into the required position, then mark the behaviour with 'sit' and give the treat as a reward. The dog then learns that sitting when given the command 'sit' means they are rewarded.

 

Classical conditioning emphasises the importance of learning from the environment and so supports nurture over nature. It is also scientific, as it is based on empirical evidence from controlled experiments that consistently demonstrate its existence.

 

Now we come to the more complex part. In 1938 a psychologist called B.F. Skinner began further experiments, as he believed that classical conditioning was far too simplistic to be a complete explanation of complex human behaviour. He set out to understand behaviour by looking at the causes of an action and its consequences. This was named Operant Conditioning.


Operant conditioning is a method of learning that occurs through rewards and punishments for behaviour. Through operant conditioning, an individual makes an association between a particular behaviour and a consequence. Skinner's research was based on, and built upon, Thorndike's law of effect (1898), which said:

 

Behaviour that is followed by pleasant consequences is likely to be repeated, and behaviour that is followed by unpleasant consequences is less likely to be repeated.

 

Now I'm not going to go into a huge amount of detail on Skinner's experiments and will just summarise them, but I have put links at the bottom to more detailed write-ups of all the studies mentioned if you would like to read up on them yourself.


So Skinner introduced a new term into the Law of Effect: reinforcement. Behaviour which is reinforced tends to be repeated (i.e. strengthened) and behaviour which is not reinforced tends to die out (i.e. weakened). Skinner studied operant conditioning by conducting experiments using animals which he placed in a 'Skinner box'. (You can read about the full study in the link below.)



Skinner identified three types of responses that can follow behaviour:

1. Neutral operants: responses from the environment that neither increase nor decrease the probability of a behaviour being repeated.

2. Reinforcers: responses from the environment that increase the probability of a behaviour being repeated. These can be either positive or negative.

3. Punishers: responses from the environment that decrease the likelihood of a behaviour being repeated. Punishment weakens behaviour.

 

For example, if you were younger and tried smoking at school, and the main consequence was that you got in with the crowd you always wanted to hang out with, you would have been positively reinforced (i.e. rewarded) and would likely repeat the behaviour.

If, however, the main consequence was that you were caught, told off, suspended from school and your parents became involved, you would most certainly have been punished. As a result, you would be much less likely to smoke again.

 

Positive reinforcement:

Skinner showed through his experiments that Positive Reinforcement strengthens a behaviour by providing a consequence an individual finds rewarding. For example, if your teacher gives you £5 every time you complete your homework (i.e. a reward) you will be more likely to repeat this behaviour in the future, which means the behaviour of completing your homework will be strengthened.


Negative reinforcement:

The removal of an unpleasant stimulus can also strengthen a behaviour. This is known as negative reinforcement because it is the removal of an aversive stimulus which is 'rewarding' to the learner. Negative reinforcement strengthens behaviour because it stops or removes an unpleasant experience.

For example, if you do not complete your homework, you have to give your teacher £5. You will complete your homework to avoid paying £5, thus strengthening the behaviour of completing your homework.


Punishment:

Punishment is defined as the opposite of reinforcement since it is designed to weaken or eliminate a response rather than increase it. It is an aversive event that decreases the behaviour that it follows.

Like reinforcers, punishment can work either by directly applying an unpleasant stimulus like a shock (positive punishment, i.e. adding) or removing a potentially rewarding stimulus like withholding a treat to punish undesirable behaviour (negative punishment, i.e. removing).
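
If it helps to see the four combinations laid out, here is a minimal Python sketch. The function name `quadrant` and its two yes/no inputs are my own illustration, not something taken from Skinner's work: 'positive' just means something is added, 'negative' means something is removed, and the other half of the name depends on whether the behaviour becomes more or less likely.

```python
def quadrant(stimulus_added: bool, behaviour_increases: bool) -> str:
    """Name the operant conditioning quadrant for a consequence.

    stimulus_added: True if something is applied, False if something is removed.
    behaviour_increases: True if the behaviour becomes more likely afterwards.
    """
    sign = "positive" if stimulus_added else "negative"
    kind = "reinforcement" if behaviour_increases else "punishment"
    return f"{sign} {kind}"


# The examples used in the text above.
print(quadrant(True, True))    # £5 is given for completed homework
print(quadrant(False, True))   # an unpleasant experience is removed
print(quadrant(True, False))   # a shock is applied
print(quadrant(False, False))  # a treat is withheld
```

The point of writing it this way is that 'positive' and 'negative' say nothing about good or bad, only about adding or taking away.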


Now, there can be some problems with only using punishment, such as:

-punished behaviour isn't forgotten, only suppressed, so the behaviour can return

-it can increase aggression by making aggression a coping mechanism

-it can create fear that generalises to other situations and leads to undesirable behaviours, e.g. fear of school

-it does not necessarily guide toward the desired behaviour - reinforcement tells you what to do, punishment only tells you what not to do


However, there are limited studies on the effects of punishment-only training in dogs.


 

We are now going to talk briefly about schedules of reinforcement, which describe how the pattern of rewards affects how quickly the learner works during training (the response rate) and how quickly the learner gives up once the rewards stop (the extinction rate).

Behaviourists discovered that different patterns (or schedules) of reinforcement had different effects on the speed of learning and on extinction. This was investigated experimentally by Ferster and Skinner in 1957.


They found that different ways of delivering reinforcement had effects on:

1. The response rate - the rate (or speed) at which the learner worked

2. The extinction rate - the rate at which the learner gives up


Now I'm only going to talk about the reinforcement schedules that are relevant to dogs, but you can look at the full list via the link at the bottom if you'd like to.

These are:


A - Continuous Reinforcement

A learner is positively reinforced every time a specific behaviour occurs.

Response rate is SLOW.

Extinction rate is FAST.


^This essentially means that the learner works at a slow rate and gives up very quickly once the rewards stop.


B - Fixed Ratio Reinforcement

Behaviour is reinforced only after the behaviour occurs a specified number of times e.g. a child is given a star for every 5 words spelled correctly.

Response rate is FAST.

Extinction rate is MEDIUM.


^Here the response rate is fast, but the learner still gives up eventually.
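
To make the difference between the two schedules concrete, here is a small Python sketch. The function name `reinforced_responses` is hypothetical and only illustrates which responses earn a reward under each schedule; it does not model response rate or extinction. Continuous reinforcement is simply the special case where the ratio is 1.

```python
def reinforced_responses(n_responses: int, ratio: int) -> list[bool]:
    """Mark which of n_responses are rewarded on a fixed ratio schedule.

    ratio=1 rewards every response (continuous reinforcement);
    ratio=5 rewards every fifth response, like the spelling example.
    """
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    # Responses are counted from 1, and every ratio-th one is rewarded.
    return [count % ratio == 0 for count in range(1, n_responses + 1)]


continuous = reinforced_responses(10, ratio=1)
fixed_ratio = reinforced_responses(10, ratio=5)
print(sum(continuous), "rewards for 10 responses on continuous reinforcement")
print(sum(fixed_ratio), "rewards for 10 responses on a fixed ratio of 5")
```

With the same ten behaviours, the learner on the fixed ratio schedule has to do more work for each reward.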

 

Behaviour modification

Behaviour modification is a set of therapies and techniques based on operant conditioning (using positive reinforcement and punishment). The main principle comprises changing the environmental events that are related to a person's behaviour: for example, reinforcing desired behaviours and ignoring or punishing undesired ones.


This isn't, however, as simple as it sounds: always reinforcing desired behaviour, for example, is basically bribery, while always punishing undesired behaviour is cruel.


In 1951 Skinner came up with a method called behaviour shaping, which has been a huge contribution to behaviour modification in both humans and dogs. Skinner argued that the principles of operant conditioning can be used to produce extremely complex behaviour if rewards and punishers are delivered in such a way as to encourage the learner closer and closer to the desired behaviour each time.
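
As a rough illustration of the 'closer and closer' idea, here is a minimal Python sketch. Everything in it is an assumption for the sake of the example: the function name `shape`, the idea of measuring how many seconds a dog holds a sit, and the rule of raising the bar by a fixed step after each success. It only covers the reward side of shaping and is not Skinner's own procedure.

```python
def shape(attempts: list[int], target: int, step: int) -> tuple[list[int], int]:
    """Reward attempts that meet the current criterion, then raise the criterion.

    attempts: hypothetical sit durations in seconds, in the order they happened.
    target: the final duration we want.
    step: how much harder the criterion gets after each rewarded attempt.
    """
    criterion = step
    rewarded = []
    for attempt in attempts:
        if attempt >= criterion:
            rewarded.append(attempt)
            # Ask for a little more next time, but never more than the target.
            criterion = min(criterion + step, target)
    return rewarded, criterion


rewarded, final_criterion = shape([1, 2, 1, 3, 4, 2, 5], target=5, step=1)
print("Rewarded attempts:", rewarded)
print("Criterion at the end:", final_criterion)
```

Attempts that fall short of the current criterion simply go unrewarded here, which is why the short sits in the middle of the list do not appear among the rewarded ones.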

 

Practical applications


Skinner's findings on operant conditioning are widely used in schools and are relevant to shaping skill performance.


For example, if a teacher praises a student for getting the right answer (positively reinforces) the child is more likely to carry on with the behaviour.

However, if a child doesn't complete their homework and is told off and given a detention (a punisher), the behaviour of not completing their homework is reduced.


Another example of operant conditioning in the real world is the law. If someone breaks the law and is punished, either through being fined or given a prison sentence, they are less likely to repeat that behaviour.


It is important to remember that psychology is scientific and that the experiments mentioned were carried out in laboratories under controlled conditions. The findings of these studies are relevant not just to human learning and behaviour but also to dogs and other animal species. Operant and classical conditioning are widely observed throughout the animal kingdom and in society today.

 

So how is this relevant to dog training today?


The four quadrants of learning

The principles of operant and classical conditioning are the fundamentals of learning theory for humans and dogs and are completely scientific.


However, when it comes to science, you can't ignore parts of the studies or the results, as they come as a whole.


Classical conditioning is too simplistic on its own, and operant conditioning has room for error if not used with all four quadrants found in Skinner's study.



When it comes to basic dog training like trick training, classical conditioning is great! The more you reinforce a behaviour with a reward to make the association, the better your dog will understand the command and the more likely they will be to give the required outcome.


However, when it comes to behavioural modification, operant conditioning is the most effective and longest lasting option. As Skinner concluded in his study, to change an unwanted behaviour, you need to use a mixture of positive reinforcement and punishment in order for the dog to fully understand what is required of them.


Using one without the other is like giving someone only half of the information they need to do something they have never done before and expecting them to get to the right answer on their own.


If you only ever tell your dog they are doing good, how will they know they are doing something wrong?

If you only tell your dog they are doing bad, how will they know when they are doing something right?


For operant conditioning to work, it requires a balanced approach to behavioural modification so that the dog has all of the information required to reach the desired outcome. This means using some sort of punisher to tell the dog when they are wrong and a reward to tell them when they are doing something right.


Now understand that punishing our dogs is not about beating them and causing them to be fearful and shut down. Something as simple as a little pop on the lead when they pull, then redirecting them to the correct position and rewarding with a treat, is a very effective way to teach your dog to walk to heel without using punishment in an abusive way. Using a correction of some sort and then redirecting to a wanted behaviour allows dogs to understand both what is not wanted and what is, and allows learning to be clear and productive.


As with everything, too much of one and not the other causes issues, which is why there are four quadrants in learning theory. All have their place and all are there for a reason. Missing one or two out isn't doing anyone any favours and can actually be detrimental to learning, which Skinner talked about in his findings.


We as dog owners and dog trainers need to be doing everything we can, and using proper, scientific training methods, to help teach our dogs how to live in our human world. The fundamentals of learning theory and behavioural modification training have been around since the 1900s and they haven't changed; further studies have simply added to our understanding.


If your teacher never told you that the answer you gave to a question was wrong, how would you ever have known to look for another answer? In order for our dogs to learn what is right and wrong, we have to tell them. Sadly, dogs don't speak English, so we can't simply tell them, as we would another human, that what they are doing is wrong and that they shouldn't do it. We need to be communicating with our dogs in a way that they understand, in a language they know.


That language is essentially classical and operant conditioning. Your dog will thank you for giving them all of the information they need to be able to make the right choices in life and live stress free knowing where the line is and where it stops.


So next time you go to teach your dog something or get a trainer to work with your dog, make sure you understand how dogs learn through classical and operant conditioning. If you don't, ask your trainer to explain it to you. If your trainer can't explain it to you, misses parts out or says certain parts aren't important, ask why. Science is science, right? The results speak for themselves: you're either following science, all of it, or you're not. It's that simple.

 

References:
