Retraining AWS Rekognition with Custom Labels on an annotated dataset

Published by Samy Doloris

AWS Rekognition

AWS Rekognition is an AWS product that lets you easily perform image and video analysis, in particular object detection.
It is based on Machine Learning (ML), but no ML knowledge is required to use it.

It also provides more advanced features such as:

  • Content moderation
  • Text detection
  • Face detection and analysis
  • Face search and verification
  • Celebrity recognition
  • Pathing
  • Custom Labels

This article focuses on Custom Labels, which extends AWS Rekognition's capabilities by letting you, or any user you authorize, label images directly in AWS Rekognition's web interface.


AWS Rekognition Custom Labels web interface for drawing boxes

More specifically, I will show you how to retrain an object detection model on AWS Rekognition with a custom dataset (here we use OpenImages Dataset V5).
If you are not yet familiar with AWS Rekognition, I suggest you explore its features yourself and at least understand how projects and datasets work.

You might want to retrain object detection on Rekognition when you need a specific object detection application based on a custom dataset.
Even though AWS Rekognition is constantly enhanced and upgraded, the base product is still limited in the range of objects it can detect.

Let’s stop talking and see how it can be done!

How to use new images in Rekognition

There are several ways to use new images in Rekognition, but all of them require creating a dataset in Rekognition's web interface (or via the CLI, of course).

Once you create a dataset, you have several choices, as you can see on the image below:

We will use the first option: “Import images labeled by SageMaker Ground Truth”.

As you will see later, this option requires a manifest file stored on AWS S3.

OpenImages dataset

OpenImages is a dataset of around 9 million images annotated with image-level labels, object bounding boxes, object segmentation masks, visual relationships, and localized narratives.
We will use the bounding boxes to retrain AWS Rekognition.

You will need to download all the required data: the images, the bounding boxes, and the class descriptions (which contain the English names of the bounding-box labels).
The images need to be in an S3 bucket, but the other files can be processed locally if you don't want to process them on AWS.
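To get a feel for the annotations, here is a minimal sketch of parsing the boxes CSV locally with only the standard library. The column names match the OpenImages box-annotation header; the two sample rows (and their values) are invented for illustration.

```python
import csv
import io

# Invented sample in the OpenImages box-annotation column layout
# (the real file has additional columns, which DictReader ignores safely).
SAMPLE_CSV = """ImageID,LabelName,XMin,XMax,YMin,YMax
0001eeaf4aed83f9,/m/0cmf2,0.022,0.964,0.071,0.800
0001eeaf4aed83f9,/m/0cmf2,0.141,0.179,0.676,0.732
"""

def read_boxes(csv_file):
    """Group normalized [0, 1] bounding boxes by image ID."""
    boxes = {}
    for row in csv.DictReader(csv_file):
        boxes.setdefault(row["ImageID"], []).append({
            # LabelName is a machine ID (MID); map it to an English name
            # with the class-descriptions file.
            "label": row["LabelName"],
            "xmin": float(row["XMin"]), "xmax": float(row["XMax"]),
            "ymin": float(row["YMin"]), "ymax": float(row["YMax"]),
        })
    return boxes

boxes = read_boxes(io.StringIO(SAMPLE_CSV))
print(len(boxes["0001eeaf4aed83f9"]))  # 2 boxes for this sample image
```

In the real pipeline you would pass an open file handle to the downloaded CSV instead of the in-memory sample.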

Manifest files

A manifest file is a JSON Lines file: each line is a single JSON object that describes one image and its annotations.

The expected format of one entry is shown below (the "# Required" comments are for illustration only; they are not valid JSON and must be omitted from the real file):



{
    "source-ref": "S3 bucket location", # Required
    "bounding-box": { # Required
        "image_size": [ # Required
            {
                "width": 500, # Required
                "height": 400, # Required
                "depth": 3 # Required
            }
        ],
        "annotations": [ # Required
            {
                "class_id": 0, # Required
                "left": 111, # Required
                "top": 134, # Required
                "width": 61, # Required
                "height": 128 # Required
            },
            {
                "class_id": 5, # Required
                "left": 161, # Required
                "top": 250, # Required
                "width": 30, # Required
                "height": 30 # Required
            },
            {
                "class_id": 5, # Required
                "left": 20, # Required
                "top": 20, # Required
                "width": 30, # Required
                "height": 30 # Required
            }
        ]
    },
    "bounding-box-metadata": { # Required
        "objects": [ # Required
            {"confidence": 0.8}, # Required
            {"confidence": 0.9}, # Required
            {"confidence": 0.9} # Required
        ],
        "class-map": { # Required
            "0": "dog", # Required
            "5": "bone" # Required
        }, 
        "type": "groundtruth/object-detection", # Required
        "human-annotated": "yes", # Required
        "creation-date": "2018-10-18T22:18:13.527256", # Required
        "job-name": "identify-dogs-and-toys" # Not Required
    }
 }
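Since each manifest line must be one comment-free JSON object, building entries in code and serializing them with a JSON library avoids formatting mistakes. Below is a hedged sketch: the bucket path, class names, and values are invented placeholders.

```python
import json
from datetime import datetime, timezone

# One manifest entry for a single image (all values are illustrative).
entry = {
    "source-ref": "s3://my-bucket/images/dog.jpg",  # assumed bucket/key
    "bounding-box": {
        "image_size": [{"width": 500, "height": 400, "depth": 3}],
        "annotations": [
            {"class_id": 0, "left": 111, "top": 134, "width": 61, "height": 128},
        ],
    },
    "bounding-box-metadata": {
        "objects": [{"confidence": 0.8}],
        "class-map": {"0": "dog"},
        "type": "groundtruth/object-detection",
        "human-annotated": "yes",
        "creation-date": datetime.now(timezone.utc).isoformat(),
        "job-name": "identify-dogs",
    },
}

# Append one JSON object per line: this is the JSON Lines layout
# the manifest format expects.
with open("output.manifest", "a") as f:
    f.write(json.dumps(entry) + "\n")
```

Repeating this for every image in your dataset produces the complete manifest file.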

All you need to do is convert the dataset annotations from the OpenImages format to the Rekognition format in a manifest file!

A sample code for converting the dataset is available here. You can use it and modify it for any other dataset!
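The heart of any such conversion is the coordinate mapping: OpenImages stores corner coordinates normalized to [0, 1], while the manifest format above uses pixel-based left/top/width/height. A minimal sketch of that step, under those assumptions:

```python
def openimages_to_rekognition(xmin, xmax, ymin, ymax, img_w, img_h):
    """Convert normalized corner coordinates (OpenImages style) to the
    pixel-based left/top/width/height box the manifest format expects."""
    return {
        "left": round(xmin * img_w),
        "top": round(ymin * img_h),
        "width": round((xmax - xmin) * img_w),
        "height": round((ymax - ymin) * img_h),
    }

box = openimages_to_rekognition(0.2, 0.6, 0.1, 0.5, img_w=500, img_h=400)
print(box)  # {'left': 100, 'top': 40, 'width': 200, 'height': 160}
```

Note that this needs each image's pixel size, which you can read from the image files themselves or from a metadata file.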

Once the manifest file is generated, you can upload it to S3.

Retraining Rekognition

Now that our images are on S3 and the manifest file contains all the information required for training, nothing can stop you from retraining AWS Rekognition!

All you need to do is create your dataset in Rekognition using the option “Import images labeled by SageMaker Ground Truth”, specify the URI of the manifest file on S3, and voilà!
You can retrain AWS Rekognition!