Train Mask R-CNN for Image Segmentation (Free Online GPU)
Mask R-CNN on Google Colab is no longer supported
We will see, in the simplest way possible, how to train the Mask R-CNN detector. Mask R-CNN is a Convolutional Neural Network (CNN) that not only identifies an object and its position but also draws a polygon that follows the object's outline. This is useful, for example, when you need to identify objects and measure their distance precisely, as in Identify and Measure precisely Objects distance | with Deep Learning and Intel Realsense.
What will we see in this tutorial on training Mask R-CNN?
1 Collect the images and prepare the dataset
2 How to train on the dataset with a Colab Notebook
3 Take the model and use it to detect objects

1 Collect the images and prepare the dataset
In this case, I decided to identify a specific screwdriver model.

The first step is to take many photos of the object in different positions, with different lighting, among other objects, and against different backgrounds. In practice, the more images you collect and the more varied they are, the more effective your model will be. To create even more variety, I recommend the technique described in Image Augmentation (Improve your Dataset) | with Imgaug.
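One point to keep in mind when augmenting a segmentation dataset: the polygons must be transformed together with the pixels, otherwise the annotations no longer match the image. Here is a minimal, library-free sketch of that idea for a horizontal flip. The function name and the list-of-rows image representation are hypothetical and only for illustration; in a real project an augmentation library would do this work.

```python
def flip_horizontal(image, polygons):
    """Mirror an image and its polygons left to right.

    image:    list of pixel rows (each row is a list of pixel values).
    polygons: COCO-style flat lists [x1, y1, x2, y2, ...] in pixels.

    Assumption: coordinates are continuous positions where pixel
    column i spans [i, i + 1), so x maps to (width - x).
    """
    width = len(image[0])
    flipped_image = [list(reversed(row)) for row in image]

    flipped_polygons = []
    for poly in polygons:
        flipped = []
        # Walk the flat list as (x, y) pairs; only x changes.
        for x, y in zip(poly[0::2], poly[1::2]):
            flipped.extend([width - x, y])
        flipped_polygons.append(flipped)
    return flipped_image, flipped_polygons


# Tiny 3x2 "image" and one triangle, just to show the call.
image = [[1, 2, 3],
         [4, 5, 6]]
new_image, new_polygons = flip_horizontal(image, [[0, 0, 1, 0, 1, 2]])
print(new_image)
print(new_polygons)
```

The same rule applies to rotations, crops, and scaling: whatever moves the pixels has to move the polygon vertices in the same way.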
When you have collected enough images, we move on to the annotations, i.e. manually marking the position of the object and assigning it a label. There are several annotation tools, but I recommend this open-source project: https://www.makesense.ai.
When you are on the home page, click on “Get Started”.

You will now see the page where you import all the images.

At the end of the upload, it will ask you to enter the labels. For the moment there will be only one class, so type its name (in my case, “screwdriver”) and then click the “Start Project” button.

Now we can finally make the annotations for training Mask R-CNN. Remember that there are different types of annotations: we need Segmentation Annotation, so draw a polygon around every screwdriver.
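To give an idea of what a polygon annotation contains, here is a small sketch that takes the vertices of one polygon and derives its bounding box and area (using the shoelace formula). The flat `[x1, y1, x2, y2, ...]` layout and the `[x, y, width, height]` box follow the COCO convention; the function name is hypothetical and the numbers are made up for the example.

```python
def polygon_bbox_and_area(flat_polygon):
    """Return ([x, y, width, height], area) for one polygon.

    flat_polygon: flat list [x1, y1, x2, y2, ...] of vertex
    coordinates in pixels, listed in drawing order.
    """
    xs = flat_polygon[0::2]
    ys = flat_polygon[1::2]

    # Bounding box in COCO order: top-left corner plus size.
    bbox = [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)]

    # Shoelace formula: sum the cross products of consecutive vertices.
    twice_area = 0.0
    n = len(xs)
    for i in range(n):
        j = (i + 1) % n  # wrap around to close the polygon
        twice_area += xs[i] * ys[j] - xs[j] * ys[i]
    return bbox, abs(twice_area) / 2.0


# A 40 x 20 pixel rectangle drawn clockwise from its top-left corner.
bbox, area = polygon_bbox_and_area([10, 20, 50, 20, 50, 40, 10, 40])
print(bbox, area)
```

The more vertices you place along the edge of the object, the closer the polygon (and therefore the mask the model learns from) follows its real shape.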

When you have annotated all your images, you need to export the annotations. Go to the Actions menu at the top and then choose Export annotations.

Now check the COCO format box. It is important to select this format because the notebook I wrote only supports this one.

Save your file with the name “Annotations” and put it in a folder with your dataset, just like in the picture.
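Before uploading anything, it can be worth checking the exported file locally. The sketch below assumes the export is a standard COCO JSON file with the top-level keys `images`, `annotations`, and `categories`; the function name and the file name in the usage comment are hypothetical, so adjust them to your own export.

```python
import json


def summarize_coco(path):
    """Load a COCO annotation file and report basic statistics."""
    with open(path, "r", encoding="utf-8") as f:
        coco = json.load(f)

    # A COCO file is expected to have these three top-level lists.
    for key in ("images", "annotations", "categories"):
        if key not in coco:
            raise ValueError(f"missing top-level key: {key}")

    annotated_ids = {ann["image_id"] for ann in coco["annotations"]}
    return {
        "images": len(coco["images"]),
        "annotations": len(coco["annotations"]),
        "classes": [cat["name"] for cat in coco["categories"]],
        # Images you uploaded but never drew a polygon on.
        "images_without_annotations": sum(
            1 for img in coco["images"] if img["id"] not in annotated_ids
        ),
        # Annotations exported without polygon data (e.g. boxes only).
        "annotations_without_polygon": sum(
            1 for ann in coco["annotations"] if not ann.get("segmentation")
        ),
    }


# Usage (the file name is an assumption, use the name of your export):
# print(summarize_coco("annotations.json"))
```

If `classes` does not list exactly one name, or if many annotations have no polygon, it is easier to fix the project in makesense.ai now than to debug it later in the notebook.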

2 How to train the dataset with Colab Notebook
For the training of Mask R-CNN, I have prepared a notebook for Google Colab that you can get from the download link. If you are not familiar with it, Google Colab is a notebook service offered by Google for online training. You only need a Gmail or Google account, and you can load the notebook there for free.

First of all, we need to enable the GPU, because the training needs a graphics card. Go to Edit -> Notebook Settings and, when the window opens, select GPU from the drop-down menu, then save.
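If you want to confirm that the GPU is really active, a quick check can be run in a notebook cell. This is only a sketch and not part of my notebook: it assumes the `nvidia-smi` command-line tool is present on GPU runtimes, and the function name is hypothetical.

```python
import shutil
import subprocess


def list_gpus():
    """Return the lines printed by `nvidia-smi -L`, or [] if no GPU."""
    # Without the tool on PATH there is no NVIDIA driver to talk to.
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"], capture_output=True, text=True
        )
    except OSError:
        return []
    if result.returncode != 0:
        return []
    return [line for line in result.stdout.splitlines() if line.strip()]


gpus = list_gpus()
if gpus:
    print("GPU available:")
    for line in gpus:
        print("  " + line)
else:
    print("No GPU found; check the notebook settings.")
```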

Now we can start. I divided the notebook into 4 main steps:
- Installation Mask R-CNN
- Image Dataset
- Training
- Detection (test your model on a random image)
1. Installation Mask R-CNN
In this first step, Mask R-CNN is installed on Google Colab fully automatically; you just need to run the cell. TensorFlow and all the necessary libraries are also installed by the script, so you won’t have to worry about a thing.

When you see the message “… done downloading pre-trained model!”, you can go to the next step.

2. Image Dataset
First of all, we have to put our images folder in a .zip archive, which gives us a dataset.zip file. Then we upload the files to the notebook. The easiest way is to click the folder icon in the sidebar and drop in the files one at a time:
- annotations.json
- dataset.zip
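If you prefer to create the archive with a script instead of your file manager, the sketch below zips only the image files of a folder. It is an illustration under assumptions: the function name, the list of extensions, and the choice to keep the folder name inside the archive (as happens when you compress a folder by hand) are mine, so check the result against what the notebook expects.

```python
import zipfile
from pathlib import Path

# Extensions treated as images (an assumption; extend if needed).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def zip_image_folder(image_dir, archive_path="dataset.zip"):
    """Zip the images in image_dir and return the stored names."""
    image_dir = Path(image_dir).resolve()
    stored = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(image_dir.iterdir()):
            # Skip sub-folders and non-image files such as the JSON export.
            if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
                arcname = f"{image_dir.name}/{path.name}"
                archive.write(path, arcname=arcname)
                stored.append(arcname)
    return stored


# Usage (the folder name is hypothetical):
# names = zip_image_folder("dataset")
# print(len(names), "images written to dataset.zip")
```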

If the upload succeeds, your files will appear in the file list, and clicking Play on the second step will show the total number of images uploaded. Make sure that the path is correct and that there are no other errors, and you will get a result like the one in the photo below.
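If the number of images looks wrong, you can compare the archive with the annotation file yourself. The following sketch is not the code used by the notebook; it assumes a standard COCO file in which every entry of `images` has a `file_name`, and the function name is hypothetical.

```python
import json
import zipfile
from pathlib import PurePosixPath


def check_dataset(archive_path, annotations_path):
    """Compare file names in the zip with those in the COCO file."""
    with zipfile.ZipFile(archive_path) as archive:
        # Keep only the base name so a folder inside the zip is ignored.
        in_zip = {
            PurePosixPath(name).name
            for name in archive.namelist()
            if not name.endswith("/")  # skip directory entries
        }

    with open(annotations_path, "r", encoding="utf-8") as f:
        coco = json.load(f)
    in_json = {PurePosixPath(img["file_name"]).name for img in coco["images"]}

    return {
        "zip_count": len(in_zip),
        "missing_from_zip": sorted(in_json - in_zip),
        "not_annotated": sorted(in_zip - in_json),
    }


# Usage with the two uploaded files:
# print(check_dataset("dataset.zip", "annotations.json"))
```

Both lists should be empty: a name in `missing_from_zip` means an annotated image was not uploaded, and a name in `not_annotated` means the archive contains a file the annotation file does not mention.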

Let’s go to the next step, which loads the images into memory. In this case there is only one class, so, as you can see from the screen, everything is correct.

To check the result, run the cell that loads image samples. This function displays random images so you can verify that the annotations are correct.

3. Training
In the third step there are two cells. The first checks how many images there are and performs the preparatory processing for the training of Mask R-CNN. You don’t see it, but there is a lot of code underneath that processes everything in seconds.
With the second cell, you can start the training. Keep in mind that in this version of the notebook you cannot change any settings and the training is for one class only (in my case, the screwdrivers), because it was designed as a first, simplified approach to training Mask R-CNN. If you are interested, I suggest you see the complete course on “Training Mask R-CNN” that I have made available.

After a few seconds, we can already check whether a first model is ready: mask_rcnn_object_0001.h5. You will find it under Mask_RCNN -> logs -> object20210802T1353. The name of the last folder will differ slightly in your case, since it includes the date and time of your run, but you will easily find it.
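If several runs and several model files pile up in the logs folder, a few lines of code can pick the most recent one. This is a sketch based only on the names shown above: it assumes run folders sort correctly by name (because of the timestamp) and that the number at the end of the file name is the epoch. The function name is hypothetical.

```python
import re
from pathlib import Path

# Matches names such as mask_rcnn_object_0001.h5 and captures the number.
CHECKPOINT_PATTERN = re.compile(r"mask_rcnn_object_(\d+)\.h5$")


def find_latest_checkpoint(logs_dir):
    """Return the newest .h5 file under logs_dir, or None if none exist."""
    candidates = []
    for path in Path(logs_dir).glob("*/mask_rcnn_object_*.h5"):
        match = CHECKPOINT_PATTERN.search(path.name)
        if match:
            # Rank by run folder name first, then by the epoch number.
            candidates.append((path.parent.name, int(match.group(1)), path))
    if not candidates:
        return None
    return max(candidates, key=lambda item: (item[0], item[1]))[2]


# Usage with the folder from this tutorial:
# print(find_latest_checkpoint("Mask_RCNN/logs"))
```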

4. Detection with Mask R-CNN (test your model on a random image)
The fourth step also has 2 blocks. The first loads the most recent model created, which should be the most accurate. With the second, we can finally test the model produced by the Mask R-CNN training.
You do not need to upload anything, because it takes random images and runs Mask R-CNN on them.

Run your custom model in real-time with multiple classes
The notebook you used in this tutorial is simplified to give you a first approach to training a Mask R-CNN model. If you need something more customizable and complete, I recommend my dedicated Mask R-CNN Pro mini course.