In this tutorial we will see the simplest possible way to train the Mask R-CNN detector. Mask R-CNN is a Convolutional Neural Network (CNN) that not only identifies an object and its position but also outlines its shape with a pixel-level mask. This is useful, for example, if you want to Identify and Measure precisely Objects distance | with Deep Learning and Intel Realsense.

What will we see in this tutorial on training Mask R-CNN?

1 Collect the images and prepare the dataset

2 How to train on the dataset with a Colab Notebook

3 Take the model and use it to detect objects

screwdriver detected

1 Collect the images and prepare the dataset

In this case, I decided to identify a specific screwdriver model.

screwdriver model Mask R-CNN

The first step is to take many photos of the object in different positions, with different lighting, among other objects, and against different backgrounds. In practice, the more images you collect and the more varied they are, the more effective your model will be. To create even more variety I recommend using Image Augmentation (Improve your Dataset) | with Imgaug.

When you have collected enough images, we need to move on to the annotations, i.e. manually defining the position of the object and assigning a label. There are several annotation tools, but I recommend this open-source project: https://www.makesense.ai.

When you are on the home page, click on “Get Started”.

makesense get started

Now you see the page where you can import all the images.

Makesense drop images

At the end of the upload, it will ask you to insert the labels. For the moment there will be only one class, so let’s write, in my case, “screwdriver” and then click the “Start Project” button.

screwdriver label for mask r-cnn

Now we can finally make the annotations to train Mask R-CNN. Remember that there are different types of annotations: we need segmentation annotations, so draw a polygon around every screwdriver.

makesense draw Mask R-CNN polygon

When you have annotated all your images, you need to export them. Go to the Actions menu at the top and then choose Export annotations.

Makesense export annotations

Now you have to check the Coco Format box. It is important to select this format because the notebook I wrote only supports it.

export coco format json annotation for Mask R-CNN training
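To give an idea of what ends up in the exported file: a COCO-format annotation stores each polygon as a flat list of coordinates, together with a bounding box and an area. The sketch below shows how those fields relate to the points you click. The helper name `polygon_to_coco` and the sample coordinates are made up for illustration, and the file written by makesense.ai may differ in detail.

```python
def polygon_to_coco(points, image_id, category_id, annotation_id):
    """Build a COCO-style annotation dict from a list of (x, y) vertices.

    Hypothetical helper for illustration; not part of makesense.ai or the notebook.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]

    # Shoelace formula: polygon area computed from its vertices.
    area = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        area += x1 * y2 - x2 * y1
    area = abs(area) / 2.0

    return {
        "id": annotation_id,
        "image_id": image_id,
        "category_id": category_id,
        # COCO stores a polygon as a flat list: [x1, y1, x2, y2, ...]
        "segmentation": [[coord for point in points for coord in point]],
        # COCO bounding boxes are [x_min, y_min, width, height]
        "bbox": [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)],
        "area": area,
        "iscrowd": 0,
    }


# A made-up rectangular outline, 40 px wide and 10 px tall.
example = polygon_to_coco([(10, 20), (50, 20), (50, 30), (10, 30)], 1, 1, 1)
print(example["bbox"], example["area"])
```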

Save your file with the name “Annotations” and put it in a folder with your dataset, just like in the picture.

annotations segmentations
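Before uploading anything, it can be worth a quick sanity check of the exported file. This is a minimal sketch, assuming the standard COCO keys (`images`, `annotations`, `categories`) and polygon segmentations. The function names are my own and are not part of the notebook.

```python
import json
from collections import Counter


def summarize_coco(coco):
    """Return basic counts and a list of problems for a COCO-style dict."""
    image_ids = {img["id"] for img in coco.get("images", [])}
    category_ids = {cat["id"] for cat in coco.get("categories", [])}
    per_image = Counter()
    problems = []

    for ann in coco.get("annotations", []):
        per_image[ann["image_id"]] += 1
        if ann["image_id"] not in image_ids:
            problems.append(f"annotation {ann['id']}: unknown image_id")
        if ann["category_id"] not in category_ids:
            problems.append(f"annotation {ann['id']}: unknown category_id")
        # Assumes polygon segmentations: 3 points means at least 6 flat values.
        polygons = ann.get("segmentation") or []
        if not polygons or any(len(p) < 6 for p in polygons):
            problems.append(f"annotation {ann['id']}: missing or degenerate polygon")

    return {
        "images": len(image_ids),
        "categories": len(category_ids),
        "annotations": sum(per_image.values()),
        # Images that were uploaded but never annotated.
        "unlabeled_image_ids": sorted(image_ids - set(per_image)),
        "problems": problems,
    }


def summarize_coco_file(path):
    """Same check, reading the exported JSON file from disk."""
    with open(path, encoding="utf-8") as f:
        return summarize_coco(json.load(f))
```

With a single class you would expect `categories` to be 1, and ideally empty `unlabeled_image_ids` and `problems` lists.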

2 How to train on the dataset with a Colab Notebook

For the training of Mask R-CNN, I have prepared a notebook for Google Colab that you can get from the download link. If you are not familiar with it, Google Colab is a notebook environment offered by Google for online training. You just need a Gmail or Google account, and you can load the notebook there for free.

Google Colab Mask R-CNN training

First of all, we need to enable the GPU, because the training runs on the graphics card. Go to Edit -> Notebook Settings and, when the window opens, select GPU from the drop-down menu, then save.

Google Colab GPU
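If you want to confirm that the GPU is really active before starting, you can run a small check in a cell. This is only a sketch and not part of my notebook: it assumes the runtime exposes the NVIDIA `nvidia-smi` tool when a GPU is attached, and the function name `gpu_visible` is my own.

```python
import shutil
import subprocess


def gpu_visible():
    """Return True if the NVIDIA driver tool reports at least one GPU."""
    if shutil.which("nvidia-smi") is None:
        return False  # no NVIDIA driver tools found on this machine
    result = subprocess.run(
        ["nvidia-smi", "-L"],  # -L lists the attached GPUs, one per line
        capture_output=True,
        text=True,
    )
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU available:", gpu_visible())
```

If this prints False after you changed the notebook settings, try saving the settings again and reconnecting the runtime.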

Now we can start. I divided the notebook into 4 big steps:

  1. Installation Mask R-CNN
  2. Image Dataset
  3. Training
  4. Detection (test your model on a random image)

1. Installation Mask R-CNN

In this first step, Mask R-CNN is installed on Google Colab in a totally automatic way; you just need to run the cell. TensorFlow and all the necessary libraries are also installed by the script, so you won’t have to worry about a thing.

Google Colab installation Mask R-CNN

When you read … done downloading pre-trained model! you can go to the next step.

downloading pretrained model done!

2. Image Dataset

First of all, we have to put our images folder in a .zip archive, so that we have a dataset.zip file. Then we just have to upload the files to the notebook. The easiest way is to click on the folder icon in the sidebar and drop the files one at a time: annotations.json and dataset.zip.
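You can create the archive with your operating system's tools, or with a few lines of Python on your computer. The sketch below keeps the images folder itself inside the archive. That layout is my assumption based on the description above, so adjust it if the notebook reports a wrong path. The function name `make_dataset_zip` is hypothetical.

```python
import shutil
from pathlib import Path


def make_dataset_zip(images_dir, output_stem="dataset"):
    """Zip the images folder into <output_stem>.zip and return the archive path."""
    images_dir = Path(images_dir).resolve()
    if not images_dir.is_dir():
        raise FileNotFoundError(f"{images_dir} is not a folder")
    # make_archive appends ".zip" itself. base_dir keeps the folder name
    # as the top-level entry inside the archive.
    return shutil.make_archive(
        output_stem,
        "zip",
        root_dir=images_dir.parent,
        base_dir=images_dir.name,
    )
```

For example, `make_dataset_zip("images")` would write `dataset.zip` in the current directory.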

Google Colab drop file

If the upload completes without problems, you will see your files in the file list, and by clicking Play on the second step you will get the total number of images uploaded. Make sure that the path is correct and that there are no other errors; you should get a result like the one in the screenshot below.

image dataset run
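If the number of images looks wrong, a common cause is a mismatch between the file names in the annotations and the files in the archive. Here is a minimal sketch to compare the two, assuming the standard COCO `file_name` field. The helper `missing_images` is my own and is not a function of the notebook.

```python
import zipfile
from pathlib import PurePosixPath


def missing_images(coco, zip_path):
    """List file names referenced in the annotations but absent from the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        # Compare by base name, so a wrapping folder inside the zip does not matter.
        present = {PurePosixPath(name).name for name in archive.namelist()}
    return sorted(
        img["file_name"]
        for img in coco.get("images", [])
        if PurePosixPath(img["file_name"]).name not in present
    )
```

An empty list means every annotated image was found in the archive.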

Let’s go to the next step, which loads the images into memory. In this case there is only one class, so as you can see from the screenshot everything is correct.

Mask R-CNN load images on the memory

To see if everything is correct, load some image samples. This function loads random images so you can verify that the annotations are correct.

Load image samples

3. Training

In the third step there are two cells. The first checks how many images there are and performs the preparatory processing for the training of Mask R-CNN. You don’t see it, but there is a lot of code underneath that processes everything in seconds.

In the next cell, you can start the training. Remember that in this version of the notebook you cannot change any settings and the training is for one class only (in my case the screwdrivers), because it was designed as a simplified first approach to training Mask R-CNN. If you are interested, I suggest the complete course on “Training Mask R-CNN” that I have made available.

Start training Mask R-CNN

After a few seconds, we can already check whether a first model is ready: mask_rcnn_object_0001.h5. The path is Mask_RCNN -> logs -> object20210802T1353. Your folder name will differ slightly, but you will certainly be able to find it.

get .h5 file
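If you prefer not to browse the folders by hand, the newest .h5 file can also be located with a short script. This sketch assumes the layout shown above (one folder per training run inside logs, with files named like mask_rcnn_object_0001.h5) and that the run folder names sort by time. The function name `find_latest_weights` is hypothetical.

```python
import re
from pathlib import Path


def find_latest_weights(logs_dir):
    """Return the newest .h5 checkpoint under logs_dir, or None if there is none."""
    candidates = []
    for path in Path(logs_dir).glob("*/mask_rcnn_*.h5"):
        match = re.search(r"_(\d+)\.h5$", path.name)
        if match:
            # Order by run folder name (it embeds a timestamp), then by epoch number.
            candidates.append((path.parent.name, int(match.group(1)), path))
    if not candidates:
        return None
    return max(candidates, key=lambda c: (c[0], c[1]))[2]
```

In Colab you would call it with the logs folder, for example `find_latest_weights("Mask_RCNN/logs")`.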

4. Detection with Mask R-CNN (test your model on a random image)

The fourth step also has 2 cells. The first loads the last model created, which is expected to be the most accurate. Then we can finally test the model produced by the Mask R-CNN training.

You do not need to upload anything, because the cell picks random images and runs Mask R-CNN on them.

Mask R-CNN test

Run your custom model in real-time with multiple classes

The notebook you used in this tutorial is simplified to give you a first approach to training a Mask R-CNN model. If you need something more customizable and complete, I recommend my dedicated Mask R-CNN Pro mini course.

Train Mask-RCNN

Do you want to detect multiple categories and run it in realtime?

You can detect multiple categories, improve the accuracy of your model, run it in realtime or from a video, and more.
