Using Data Science to Keep Raccoons Out of the House

My girlfriend and I have 3 wonderful cats.  They spend about 90% of their time inside sleeping and eating, but with that remaining 10%, we let them wander around the neighborhood (despite our best efforts, we cannot get them to grasp the concept of property lines).

Most nights they are great about coming home when we call them, but every once in a while, one of them is having too much fun and can’t be bothered.  In this case, we face a dilemma: we could let them sleep outside, or we could leave the cat door open all night. If we choose the latter, then all three can come and go as they please, and they tend to get into more trouble that way.

Leo after a night of exploring

I have a soft spot for them, so I used to always default to leaving the cat door open.  This was never a problem, until one night I awoke to what sounded like a cat dinner party going on in the kitchen.  Now two of the cats are rather large, so I assumed that they were just going to town on their dry food. 

But after about a minute, I realized that there was no way that what I was hearing could be cats.  I got up and went to the kitchen to find 2 raccoons shoveling cat kibble into their mouths.  I chased them out of the cat door and locked it, and thought that this was the end of it.

A few weeks later, Kathy and I were sitting on the couch in the middle of the day when we heard one of the cats start to growl.  I looked over and saw one of the raccoons walking around the kitchen like it owned the place.  So I chased it back out again, and realized I would have to try something different.

I wondered if I could build some sort of raccoon alarm that would go off if a raccoon tried to come through the cat door.  So I turned to Data Science to see what I could do.

The Plan

I knew that this would be a case best handled by computer vision, but I was not sure which approach I wanted to take.  Nowadays, for most Machine Learning tasks there are well-built “libraries” available to make the process much faster and easier.

For computer vision I have used libraries such as OpenCV, TensorFlow, and PyTorch, so it came down to picking the right one for the job.  My eventual plan was to get this running remotely on some sort of photo eye. I did some research and found that Google offers a Raspberry Pi kit that can run custom TensorFlow models, so this was the library I decided to go with. (Check it out here: https://aiyprojects.withgoogle.com/vision/)

Raspberry Pi Vision kit

The Data

The next step was to gather some data.  Since I was going to be using a deep learning model, the more data the better.  However, as far as computer vision goes, this is actually a pretty easy problem.  Ultimately, I just needed to train the model to differentiate between two options: raccoon or cat.  So I turned to Kaggle to see what kind of data was already out there.

*If you’re not familiar, Kaggle is a site that hosts Machine Learning Competitions, datasets, and discussion boards.  Unless the problem is hyper-specific, there is probably a dataset for what you need already.  https://www.kaggle.com/

Immediately I found the “Trash Panda Detection” dataset (https://www.kaggle.com/andrewmvd/racoon-detection), containing 196 images of raccoons.  This was on the lower end of what I was hoping to find, but I knew that I could use data augmentation techniques to make the training set a bit more robust.

However, I decided to scrape about 200 more images from Google Images to have a better starting point. Once I had around 400 raccoon images, I pulled 400 cat images from a previous project for a total of ~800 training images balanced across two “classes”.

With these examples in hand, the last thing I had to do was generate some additional examples using “data augmentation”. Basically, I took the existing images and slightly altered them using mirror flips and/or random rotations. They are still recognizable, but they will look novel to the model as it is training, giving me even more training examples.
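As a rough sketch, this kind of augmentation can be done with Keras preprocessing layers (the specific rotation amount here is an arbitrary choice on my part, not the exact settings I used):

```python
import tensorflow as tf

# A minimal augmentation pipeline: random mirror flips and small
# random rotations, applied on the fly during training.
augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),  # up to ~36 degrees either way
])

# Stand-in for a batch of real photos; each pass through `augment`
# produces a slightly altered copy that looks novel to the model.
images = tf.random.uniform((4, 224, 224, 3))
augmented = augment(images, training=True)
```

Because the layers only apply their transforms when `training=True`, the same model can include them and still see unaltered images at prediction time.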

Augmentation Sample

Finally, I split the data into three groups. The first was the “training data”, which was about 75% of my total examples. These are the ones the model would use to learn. The next was the “validation data”, which was about 15% of my data, so I could evaluate my progress and accuracy while I trained. Finally, I held out the remaining 10% as “test data” to evaluate my model on at the very end.
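A split like this takes only a few lines of Python. The file names below are stand-ins, but the 75/15/10 ratios match the ones I used:

```python
import random

def split_dataset(paths, train=0.75, val=0.15, seed=42):
    """Shuffle file paths and split them into train/validation/test groups."""
    paths = list(paths)
    random.Random(seed).shuffle(paths)  # fixed seed makes the split repeatable
    n_train = int(len(paths) * train)
    n_val = int(len(paths) * val)
    return (paths[:n_train],
            paths[n_train:n_train + n_val],
            paths[n_train + n_val:])        # whatever remains (~10%) is test

# With ~800 images this gives roughly 600 / 120 / 80 examples.
train_set, val_set, test_set = split_dataset(
    f"img_{i}.jpg" for i in range(800))
```

Shuffling before splitting matters here: without it, all the raccoon files could end up in one group and all the cats in another.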

The Model

With the data in hand, all I had left to do was train the model.  I am a big fan of “transfer learning”, which is when you take a robust model that was trained for a similar task, and you “fine tune” it to solve your specific task.  I decided to use the MobileNetV2 model, which was pre-trained by Google on 1.4 million images across 1000 classes.

Now for this next part to make sense, you’ll have to know the absolute basics of deep learning, so I wrote a 2-minute description to get you up to speed. Check it out here: https://thaddeus-segura.com/nns-in-200words/

So now that you’re an expert on neural nets, let’s talk about the details of my training process.

As I mentioned, I used MobileNetV2. This model is pretty big… it has 155 layers and 2,257,984 parameters. Because it’s pre-trained on a much more complex task, I really didn’t need to tweak these parameters much. So instead I just added another small layer to the end of it, resulting in 1,281 trainable parameters.
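In TensorFlow, that setup looks roughly like this (the Dropout rate is an assumption on my part; the single Dense output unit is what yields the 1,281 trainable parameters: 1,280 weights plus 1 bias):

```python
import tensorflow as tf

# Load MobileNetV2 without its ImageNet classification head,
# and freeze every pre-trained parameter.
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False

# Stack one small trainable head on top: pool the 1280-channel
# feature map down to a vector, then a single raccoon-vs-cat output.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.2),               # assumed rate
    tf.keras.layers.Dense(1),                   # 1280 weights + 1 bias = 1,281
])
```

With a single output unit, the model emits one logit: positive leans toward one class, negative toward the other, trained with a binary cross-entropy loss.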

Above is the diagram for my model. The part highlighted in pink is the trainable part, and all those tiny words in the curly brace are the 155 layers.

Step one was to do a little bit of training on that final layer. I gave the model 10 rounds to learn on all of my training examples (each round is called an “epoch”). This got me to about 92% accuracy, which isn’t bad, but I knew I could do better.

The next step was to do a little bit more training, but this time to tap into some of those extra layers and parameters. I decided to unfreeze everything from layer 100 and beyond.

The areas highlighted in pink are what were set to trainable. (*technically there are no parameters in Pooling or Dropout layers to be trained). This gave me 1,863,873 trainable parameters, which was an increase of 145,501%.

This would give me a lot more predictive power, but I could run into the problem of “overfitting”, which is when the model basically starts to memorize the training examples, and then does worse on the test examples.

To try to avoid this, I set the model to learn 20x slower than before, meaning that all adjustments to the parameters would be much smaller. Then I trained the model on only 5 additional epochs. This took me to an accuracy of 98.66% on my validation data. Great, but did it overfit?
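This unfreeze-and-slow-down step can be sketched as follows. The base learning rate of 1e-3 and the head layers are assumptions on my part; the cut at layer 100 and the 20x slowdown match what I described above:

```python
import tensorflow as tf

# Rebuild the transfer-learning model, but this time unfreeze the
# MobileNetV2 base from layer 100 onward.
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = True
for layer in base.layers[:100]:
    layer.trainable = False        # layers 0-99 stay frozen

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(1),
])

# Recompile with a 20x smaller learning rate so the fine-tuning
# adjustments to the unfrozen parameters stay small.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3 / 20),
    loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
    metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=5)  # 5 more epochs
```

The `fit` call is left commented because it needs the actual image pipelines; everything above it runs as-is.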

Starting with the top chart, we can see that the accuracy of the model (orange line) stalled out after just the 2nd epoch when I just used the tiny training layer. Once I started the fine tuning (green line), the accuracy shot up more.

The bottom chart is the one we care about for overfitting, and we can normally identify it if the blue line keeps going down, while the orange line goes up. Here we see no evidence of that, and I stopped training right when they converged so we are good on this front as well.

But the ultimate test is to see how it does on the test set (these are images it has never seen before). So I fed them in, and it came back with 100% accuracy!

*There was probably some luck involved here given how small my test set was, but I feel confident that the true accuracy is in the range of 98% – 100%.

Finally, let’s look at some of the predictions it made on the test set.

Predictions

Conclusion + Next Steps

The question I set out to answer with this post was whether I could use data science to keep raccoons out of the house.  Given the accuracy of the model, I feel like the answer is probably going to be a yes.  But I’ll have to do a little more work first.

The data I used was great as a toy example, but in practice I would need to get images of the animals as the photo eye would see them. So when I get ready to deploy this, I will gather the training data for the cats using the photo eye, but I will have to manually assemble acceptable raccoon training images from the web. I’ll post an update once the kit arrives and I finish this all off.

While this was fun, a model like this could have many other applications.  For example, the model could be deployed on products like Ring or Nest, and trained to recognize the homeowner and let them in if they forgot their keys. Obviously any of these applications come with other risks and major privacy concerns, so for now I’m just going to stick to the raccoons.