Maximize Object Detection: Image Rotation and Shearing Techniques

Updated for 2024, this tutorial covers two image augmentation techniques, rotation and shearing, and shows how to apply them to object detection datasets so that the bounding boxes stay consistent with the transformed images.

Unlock Data Augmentation Benefits for Bounding Boxes

Enhance model generalization by introducing variability.
Overcome data scarcity issues by creating synthetic examples.
Improve model robustness and accuracy in object detection tasks.

Get the Code: GitHub Repository

Find all the augmentation techniques from this article, including rotation and shearing, in the following GitHub repository:

https://github.com/Paperspace/DataAugmentationForObjectDetection

Let's dive into how to use image rotation and image shearing to level up your training data!

Image Rotation: A Deep Dive

Rotating an image turns it by a given angle about a central point. This introduces more variance into your data, but it also makes it harder to augment the bounding boxes correctly.

Understanding Affine Transformations

Affine transformations keep parallel lines parallel after the transformation. Scaling, translation, rotation, and shearing all fall under this category, and they play a central role in data augmentation for bounding boxes.

The Transformation Matrix

A transformation matrix is a convenient tool for carrying out affine transformations. Multiplying the matrix M by a point's homogeneous coordinates produces the transformed point:

$$ T_p = M \cdot [x \; y \; 1]^T $$

Thankfully, OpenCV's cv2.warpAffine function does most of the hard work, allowing us to focus on the augmentations.
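
To make the matrix multiplication concrete, here is a minimal sketch in plain Python. The helper names rotation_matrix and apply_affine are illustrative and not part of the repository; the matrix layout follows the one documented for cv2.getRotationMatrix2D with a scale of 1.

```python
import math


def rotation_matrix(cx, cy, angle_deg):
    """Build a 2x3 matrix that rotates about (cx, cy) by angle_deg degrees.

    Same layout as the matrix documented for cv2.getRotationMatrix2D
    with scale 1 (an illustrative re-derivation, not the library code).
    """
    a = math.cos(math.radians(angle_deg))
    b = math.sin(math.radians(angle_deg))
    return [
        [a, b, (1 - a) * cx - b * cy],
        [-b, a, b * cx + (1 - a) * cy],
    ]


def apply_affine(M, x, y):
    """Multiply the 2x3 matrix M by the homogeneous vector [x, y, 1]^T."""
    point = (x, y, 1.0)
    return tuple(sum(m * v for m, v in zip(row, point)) for row in M)


# A point 10 pixels to the right of the centre, rotated by 90 degrees,
# ends up 10 pixels above it (y grows downwards in image coordinates).
print(apply_affine(rotation_matrix(50, 50, 90), 60, 50))  # approximately (50.0, 40.0)
```

The centre of rotation maps to itself, which is a quick sanity check for any rotation matrix built this way.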

Implementation: Rotating Images with OpenCV

First, initialize the rotation with an angle.

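def __init__(self, angle=10):
    self.angle = angle

    # A tuple is treated as an explicit (min, max) range;
    # a single number a becomes the symmetric range (-a, a).
    if isinstance(self.angle, tuple):
        assert len(self.angle) == 2, "Invalid range"
    else:
        self.angle = (-self.angle, self.angle)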
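
How the stored range is consumed is not shown in this section. One plausible reading is that a fresh angle is drawn from it on every call. The sketch below assumes uniform sampling; RandomRotateSketch and sample_angle are hypothetical names used only for illustration.

```python
import random


class RandomRotateSketch:
    """Illustrative stand-in that holds only the angle-range logic."""

    def __init__(self, angle=10):
        if isinstance(angle, tuple):
            assert len(angle) == 2, "Invalid range"
            self.angle = angle
        else:
            self.angle = (-angle, angle)

    def sample_angle(self):
        # Assumption: draw uniformly between the two bounds of the range.
        return random.uniform(*self.angle)


rotate = RandomRotateSketch(15)
print(rotate.angle)           # (-15, 15)
print(rotate.sample_angle())  # some value between -15 and 15
```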

Next, get the transformation matrix to rotate the image, then get the rotated image with the warpAffine function.

(h, w) = image.shape[:2]
(cX, cY) = (w // 2, h // 2)
M = cv2.getRotationMatrix2D((cX, cY), angle, 1.0)

image = cv2.warpAffine(image, M, (w, h))

Passing (w, h) as the third argument preserves the original resolution.

Accounting for Cropping: Expanding the Image

Because the output keeps the original width and height, the corners of the rotated image can fall outside the frame and be cropped. To counteract this, we calculate the new dimensions needed to fully enclose the rotated image, using absolute values so the result holds for any angle:

$$ N_w = h \cdot |\sin(\theta)| + w \cdot |\cos(\theta)| \\ N_h = h \cdot |\cos(\theta)| + w \cdot |\sin(\theta)| $$
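
As a quick numeric check of these dimensions, the sketch below evaluates them in plain Python. The function name rotated_bounds is hypothetical, and the absolute values are taken so that negative angles and angles beyond 90 degrees still give positive sizes.

```python
import math


def rotated_bounds(w, h, angle_deg):
    """Return (new_w, new_h) of the frame that encloses a w x h image
    rotated by angle_deg degrees."""
    c = abs(math.cos(math.radians(angle_deg)))
    s = abs(math.sin(math.radians(angle_deg)))
    return int(h * s + w * c), int(h * c + w * s)


print(rotated_bounds(200, 100, 0))   # (200, 100): no rotation, no change
print(rotated_bounds(200, 100, 90))  # (100, 200): width and height swap
```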

Finally, apply a translation so that the rotated image stays centered in the enlarged frame instead of being cut off at the edges. The resulting code for the image rotation function rotate_im lives in bbox_util.py.
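
The translation step can be sketched as an adjustment to the last column of the rotation matrix. This is a minimal illustration under the assumption that the matrix has the cv2.getRotationMatrix2D layout; expanded_rotation_matrix is a hypothetical name, not the repository's rotate_im.

```python
import math


def expanded_rotation_matrix(w, h, angle_deg):
    """Return (M, new_w, new_h): a 2x3 rotation matrix shifted so the
    rotated image is centred in its enlarged frame."""
    cx, cy = w // 2, h // 2
    a = math.cos(math.radians(angle_deg))
    b = math.sin(math.radians(angle_deg))
    M = [
        [a, b, (1 - a) * cx - b * cy],
        [-b, a, b * cx + (1 - a) * cy],
    ]
    new_w = int(h * abs(b) + w * abs(a))
    new_h = int(h * abs(a) + w * abs(b))
    # Move the old centre onto the centre of the new frame.
    M[0][2] += new_w / 2 - cx
    M[1][2] += new_h / 2 - cy
    return M, new_w, new_h


M, new_w, new_h = expanded_rotation_matrix(200, 100, 90)
print(new_w, new_h)  # 100 200
```

The adjusted matrix and the new size would then be passed to cv2.warpAffine in place of M and (w, h).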

Rotating Bounding Boxes: The Challenge

Next, rotate the bounding boxes. This gives a tilted rectangular box. Then, find the tightest rectangle parallel to the sides of the image containing the tilted rectangular box.

To perform this transformation, we need the four corners of the bounding box. With these corners, we can apply a rotation using the transformation matrix.

Step 1: Get the Corners

First, we write the function get_corners in the file bbox_util.py to get all four corners of each bounding box.

def get_corners(bboxes):
...
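
The body of get_corners is elided here. As a rough idea of what such a step involves, the sketch below expands a single box into its four corners. It assumes a box is given as (x1, y1, x2, y2) with the top-left and bottom-right points, which this section does not state, and box_corners is a hypothetical name.

```python
def box_corners(box):
    """Expand an (x1, y1, x2, y2) box into its four corner points.

    Assumed order: top-left, top-right, bottom-left, bottom-right.
    """
    x1, y1, x2, y2 = box
    return [(x1, y1), (x2, y1), (x1, y2), (x2, y2)]


print(box_corners((10, 20, 30, 60)))
# [(10, 20), (30, 20), (10, 60), (30, 60)]
```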

Step 2: Rotate the Box

Now define the function rotate_box in the file bbox_util.py, which rotates the bounding boxes by returning their transformed corner points. It makes use of the transformation matrix.

def rotate_box(corners, angle, cx, cy, h, w):
...
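
The body of rotate_box is elided here. The sketch below shows one way the same arguments could be used: the corners go through the rotation matrix, and h and w are used to shift the result into the enlarged frame so that boxes and image stay aligned. This is an assumption about the intent, written in plain Python under the hypothetical name rotate_corners.

```python
import math


def rotate_corners(corners, angle_deg, cx, cy, h, w):
    """Rotate (x, y) corner points about (cx, cy) and shift them into
    the enlarged frame that encloses the rotated w x h image."""
    a = math.cos(math.radians(angle_deg))
    b = math.sin(math.radians(angle_deg))
    new_w = int(h * abs(b) + w * abs(a))
    new_h = int(h * abs(a) + w * abs(b))
    # Translation column of the matrix, including the re-centring shift.
    tx = (1 - a) * cx - b * cy + (new_w / 2 - cx)
    ty = b * cx + (1 - a) * cy + (new_h / 2 - cy)
    return [(a * x + b * y + tx, -b * x + a * y + ty) for x, y in corners]


# The four corners of a 200 x 100 image rotated by 90 degrees land on the
# corners of the 100 x 200 frame (up to floating-point rounding).
image_corners = [(0, 0), (200, 0), (0, 100), (200, 100)]
print(rotate_corners(image_corners, 90, 100, 50, 100, 200))
```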

Step 3: Get the Enclosing Box

Define a function get_enclosing_box to calculate the tightest axis-aligned box around the corners of the rotated box.

def get_enclosing_box(corners):
...
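
The body of get_enclosing_box is elided here. The idea itself is small enough to sketch: the tightest axis-aligned box around a set of points is given by the minimum and maximum of their coordinates. The name enclosing_box and the (x1, y1, x2, y2) return format are assumptions for illustration.

```python
def enclosing_box(corners):
    """Return the tightest axis-aligned (x1, y1, x2, y2) box that
    contains every (x, y) point in corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return (min(xs), min(ys), max(xs), max(ys))


# A square tilted by 45 degrees (a diamond) needs a larger upright box.
print(enclosing_box([(5, 0), (10, 5), (5, 10), (0, 5)]))  # (0, 0, 10, 10)
```

Note that the enclosing box is generally larger than the original box, so rotated annotations become looser as the angle grows.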

Implementation: Putting it All Together

With the helper functions defined, we can implement the __call__ function that actually performs the augmentation.

def __call__(self, img, bboxes):
...

Image Shearing: Tilting Your Perspective

Image shearing skews the image along one or both axes, creating a parallelogram-like distortion. Like rotation, it changes how objects appear in the image, forcing the model to learn more general features.

The Shearing Matrix

A horizontal shear with shear factor alpha uses the following matrix:

$$ \begin{bmatrix} 1 & \alpha \\ 0 & 1 \end{bmatrix} $$

Implementation: Shearing with OpenCV

Since we are only covering horizontal shear, we only need to change the x coordinates of the corners of the boxes according to the equation x = x + alpha*y. Our __call__ function looks like:

def __call__(self, img, bboxes):
...
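
The body of __call__ is elided here. To illustrate just the box arithmetic, the sketch below applies x = x + alpha*y to the four corners of one box and takes the enclosing x range. It assumes boxes are (x1, y1, x2, y2) tuples, and shear_box_horizontal is a hypothetical name; the image itself would be warped separately, for example with cv2.warpAffine.

```python
def shear_box_horizontal(box, alpha):
    """Shear an (x1, y1, x2, y2) box horizontally by factor alpha and
    return the axis-aligned box that encloses the result."""
    x1, y1, x2, y2 = box
    corners = [(x1, y1), (x2, y1), (x1, y2), (x2, y2)]
    # Horizontal shear moves each point sideways in proportion to its y.
    sheared_x = [x + alpha * y for x, y in corners]
    # The y coordinates are unchanged by a horizontal shear.
    return (min(sheared_x), y1, max(sheared_x), y2)


print(shear_box_horizontal((10, 20, 30, 60), 0.5))  # (20.0, 20, 60.0, 60)
```

With a negative alpha the sheared coordinates can become negative and fall outside the frame; handling that case is left out of this sketch.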