Several posts in this series have suggested “starting over” if your dog makes an error within a chain. How exactly do we start over? Do we use a No Reward Marker (NRM) to tell the dog that she was wrong? At what point in a dog’s career do we begin this process? And…what exactly is a no reward marker?
A No Reward Marker (NRM) is usually a word or sound that tells your dog that whatever they just did has ended the possibility of reinforcement. NRM’s are supposed to be neutral in tone and free from disapproval; they simply “mark” the moment of an error. NRM’s can also be hand signals or body movements that mean the same thing, namely, that reinforcement will not happen.
It’s like a “reverse” clicker. It marks an exact moment in time, but instead of marking that a reinforcer is coming it takes the possibility of one away.
The paragraphs above give you the basic idea; it’s a neutral communication, right?
No, not really.
In addition to marking a moment in time, a NRM also interrupts your dog’s behavior, and an interruption communicates that you didn’t like what happened at that exact moment. You have an opinion and it’s not neutral at all!
There’s really no way around this. A NRM is a marker of both the dog’s specific behavior and the opinion of the person giving it. Remember that your teammate is a dog and not a chicken; your dog cares if you are pleased with them, and also cares if you are displeased. Your dog knows that you have an opinion about your training time together because 95% of the time you should be emotionally engaged and happy about what you are doing. There is no reason to hide that from your dog and indeed, I think it’s critical to show your joy if you want an engaged picture of teamwork with your dog as opposed to a methodical automaton who couldn’t care less about you.
How your dog reacts to that NRM/interruption depends on a lot of factors. Does your dog care more about your opinion and/or reinforcer than what they wanted to do? If the interrupter causes the dog to return immediately for another chance, then your dog cares. If your dog’s response to the NRM is to continue whatever they were choosing to do that you did not like (for example, sniffing a nice spot near the dumbbell that they should have been retrieving), then your dog does not care enough about your opinion or your reinforcers to change their course of action – they are more gratified by their choice. And if your dog stops sniffing but freezes with unhappiness or starts hysterically snapping at your face (stress), then the marker was too harsh.
If your dog cares enough to stop the unwanted behavior and also returns cheerfully to try again, then you’ve used the NRM effectively. That’s good. But what if your dog isn’t absolutely sure what caused that NRM? Or what if your dog isn’t absolutely sure about what they should do to avoid another one on their next attempt?
If your dog doesn’t know how to complete the chain correctly, possibly because you never taught that specific behavior correctly in the first place, then your NRM communicates disapproval but without the information needed to make a better decision. That can lead to shutting down or uninspired work very quickly.
Imagine you are a child learning to read and your teacher quietly told you “no” (without emotion) when you got a word wrong. Your teacher may well think that the problem is of the same type as the ones you just did correctly (maybe the word is “cat” when the last one was “rat”) but you just don’t see it – you think “r” looks like “c” because you never learned those sounds thoroughly, and now you don’t know how to decode the word. The teacher’s unemotional “no” has informed you that you are wrong but you’re not sure what is right. Are you having fun with your lesson, or are you deflated?
What happens now depends on your temperament. If you’re enthusiastic, motivated and relatively sturdy, you’ll just keep trying to sound out that word until you get it right. But if you’re more passive by nature, less enthusiastic or never cared about reading in the first place, then it’s quite likely that you won’t try again. Instead, maybe you’ll stare intently at a dirty spot on the floor; a bit of classic human avoidance.
How about your dog? You asked for a dumbbell retrieve but he made an error so you used a NRM to end the chain.
It’s about the same. If your dog loves to work or is highly motivated by the possible reward, he may well keep trying. But if he’s more fragile by training or temperament, or not all that interested in your reinforcers, then maybe he’ll start sniffing a random spot on the floor; a bit of classic doggy avoidance.
What’s your plan now, if your dog/child continues to study that random spot? Are you going to escalate from a NRM to a punisher, such as marching in (physical intimidation) or raising your voice (mental intimidation)?
How about if the dog or child just ignores your NRM altogether and continues to proceed as if nothing were wrong? The child reads on in spite of the incorrect word and the dog continues to fetch in spite of the quick sniff. The result is the same really; you haven’t accomplished your goal of a correct behavior chain.
With a reward marker (RM) such as a clicker, if you mis-click then your dog gets a free cookie. While unfortunate for your training goals, it does not erode your dog’s CER (conditioned emotional response) towards working with you; it’s just poor training. But with a NRM there is a more active risk. It’s not just the behavior itself, it’s about you and your dog and your relationship. That’s your foundation on the line.
Now that I’ve scared the crap out of you, I’ll add that there IS a place for NRM’s in training, but before I can go there you need to understand that NRM’s carry risk. So now you know.
NRM’s are almost always a polishing technique and not a teaching one because they tend to depress behavior, and the last thing you want to do with a dog that is learning is to shut down their desire to try.
Let’s go back to the examples above to make that clearer.
Once again you are a child learning to read. You’ve about had your fill of reading for the day, and you’re ready to go play with your friends outside but first you have to finish one paragraph of reading. As a result, you hurry through as fast as you can and when you get to “rat” you say “cat” and try to keep going, hoping that your teacher won’t notice. You have been thoroughly trained on your sounds, and you are well aware that “c” and “r” have different sounds. But your teacher is fast…as you skim over the word she catches you and marks that with a ‘no’. Quietly. You know that she’s going to bring you back to the word so you save her the trouble and find it yourself. You slow down just enough to say it correctly and then you show a bit more care on the rest of the paragraph so there will be no more interruptions. You finish your reading and you are released from your lessons to your friends waiting outside.
And your dog? Your dog is sent to retrieve that dumbbell but on the way your dog makes a quick side trip to sniff. At that exact moment you mark the dog’s incorrect choice with a quiet “no”. Your dog returns, knowing that the delicious cookie sitting in your pocket will not happen this time. The trainer gets the dumbbell and the exercise is restarted. This time, as the dog goes past the nice sniffing spot, he carefully avoids it and completes the exercise correctly. He is rewarded, either with a reinforcer or with the chance to move to another exercise which may well be rewarded if that is the end of the chain.
My first rule of thumb for the appropriate use of NRM’s is to get your dog trained to fluency on the individual pieces before you pull those behaviors together into chains. And then….when your dog demonstrates fluency on those chains, and only then….we can use NRM’s when the dog makes a mistake.
In the correctly executed version of the above example, the NRM was information. Your dog (or child) already knows what to do. Now you’ve provided more information by telling your dog what is NOT correct. You’ve also made it clear that you expect it to be done correctly if the dog or child has any hope of receiving reinforcement. That’s fine, and appropriate, but the learner must be clear on what is right before you spend any energy on what is wrong.
In the next blog we’ll look at a second type of NRM; a “cheerful” verbal/physical interrupter.