Maximize Object Detection: Image Rotation and Shearing Techniques
Updated in 2024, this tutorial covers two image augmentation techniques for object detection: rotation and shearing. Learn how to apply each transformation to an image and how to adjust the bounding boxes so that they still enclose their objects.
Unlock Data Augmentation Benefits for Bounding Boxes
- Enhance model generalization by introducing variability.
- Overcome data scarcity issues by creating synthetic examples.
- Improve model robustness and accuracy in object detection tasks.
Get the Code: GitHub Repository
Find all the augmentation techniques from this article, including rotation and shearing, in the following GitHub repository:
https://github.com/Paperspace/DataAugmentationForObjectDetection
Let's dive into how to use image rotation and image shearing to level up your training data!
Image Rotation: A Deep Dive
Image rotation turns an image by a given angle about a central point. It introduces greater variance into your data, but it also makes it harder to augment the bounding boxes correctly.
Understanding Affine Transformations
Affine transformations keep parallel lines parallel. Scaling, translation, rotation, and shearing all fall under this category, and they play a central role in data augmentation for bounding boxes.
The Transformation Matrix
A transformation matrix is a handy tool for carrying out affine transformations. Multiplying the matrix M by a point's coordinates, extended with a trailing 1, produces the transformed point:
$$ T_p = M \cdot [x \; y \; 1]^T $$
Thankfully, OpenCV's cv2.warpAffine function does most of the hard work, allowing us to focus on the augmentations.
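To make the matrix product above concrete, here is a minimal NumPy sketch. The helper names rotation_matrix and transform_point are hypothetical. The matrix is built by hand in the 2x3 layout documented for cv2.getRotationMatrix2D, so the example does not need OpenCV installed.

```python
import numpy as np

def rotation_matrix(center, angle_deg, scale=1.0):
    """Hypothetical helper: 2x3 rotation matrix in the layout documented
    for cv2.getRotationMatrix2D (positive angle = counter-clockwise)."""
    cx, cy = center
    theta = np.deg2rad(angle_deg)
    a = scale * np.cos(theta)
    b = scale * np.sin(theta)
    return np.array([
        [a, b, (1 - a) * cx - b * cy],
        [-b, a, b * cx + (1 - a) * cy],
    ])

def transform_point(M, x, y):
    """Compute T_p = M . [x, y, 1]^T for a single point."""
    return M @ np.array([x, y, 1.0])

M = rotation_matrix(center=(50, 50), angle_deg=90)
centre = transform_point(M, 50, 50)   # the rotation centre stays where it is
moved = transform_point(M, 60, 50)    # a point right of the centre moves above it
print(centre, moved)
```

With a 90 degree rotation about (50, 50), the point (60, 50) lands at approximately (50, 40), which is "up" in image coordinates because the y axis points downward.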
Implementation: Rotating Images with OpenCV
First, initialize the rotation with an angle.
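def __init__(self, angle=10):
    self.angle = angle
    if isinstance(self.angle, tuple):
        # A tuple is taken as an explicit (min, max) range of angles.
        assert len(self.angle) == 2, "Invalid range"
    else:
        # A single number becomes a symmetric range around zero.
        self.angle = (-self.angle, self.angle)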
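One way the stored range might be used is to draw a fresh angle for every image. The sketch below assumes uniform sampling with random.uniform; the class name RandomRotateSketch and the method sample_angle are hypothetical and exist only for illustration.

```python
import random

class RandomRotateSketch:
    """Hypothetical stand-in that only handles the angle range."""

    def __init__(self, angle=10):
        if isinstance(angle, tuple):
            assert len(angle) == 2, "Invalid range"
            self.angle = angle
        else:
            self.angle = (-angle, angle)

    def sample_angle(self):
        # Assumption: angles are drawn uniformly from the stored range.
        return random.uniform(*self.angle)

symmetric = RandomRotateSketch(15)        # range becomes (-15, 15)
explicit = RandomRotateSketch((0, 30))    # range is used as given
print(symmetric.angle, explicit.angle, symmetric.sample_angle())
```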
Next, get the transformation matrix for the rotation, then produce the rotated image with the warpAffine function.
(h, w) = image.shape[:2]
(cX, cY) = (w // 2, h // 2)
M = cv2.getRotationMatrix2D((cX, cY), angle, 1.0)
image = cv2.warpAffine(image, M, (w, h))
Passing (w, h) as the third argument keeps the output at the original resolution.
Accounting for Cropping: Expanding the Image
When an image is rotated, its corners can fall outside the original frame and get cropped. To counteract this, we calculate the new dimensions needed to fully enclose the rotated image. The absolute values keep the result valid for any angle:
$$ N_w = h \cdot |\sin(\theta)| + w \cdot |\cos(\theta)| \\ N_h = h \cdot |\cos(\theta)| + w \cdot |\sin(\theta)| $$
Finally, apply a translation so that the rotated image is centered in the enlarged canvas. The resulting image rotation function, rotate_im, lives in bbox_util.py.
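The canvas expansion and the centering translation can be sketched in a few lines of NumPy. This is not the rotate_im function from the repository. The name expanded_rotation is hypothetical, and the rotation matrix is written out by hand in the layout documented for cv2.getRotationMatrix2D.

```python
import numpy as np

def expanded_rotation(w, h, angle_deg):
    """Hypothetical helper: rotation matrix and canvas size that keep
    the whole rotated image visible."""
    cx, cy = w / 2, h / 2
    theta = np.deg2rad(angle_deg)
    a, b = np.cos(theta), np.sin(theta)

    # Rotation about the image centre.
    M = np.array([
        [a, b, (1 - a) * cx - b * cy],
        [-b, a, b * cx + (1 - a) * cy],
    ])

    # New canvas size that encloses the rotated image.
    new_w = int(round(h * abs(b) + w * abs(a)))
    new_h = int(round(h * abs(a) + w * abs(b)))

    # Shift so the old centre lands on the centre of the new canvas.
    M[0, 2] += new_w / 2 - cx
    M[1, 2] += new_h / 2 - cy
    return M, new_w, new_h

M, new_w, new_h = expanded_rotation(w=200, h=100, angle_deg=90)
print(new_w, new_h)
```

In an OpenCV pipeline, M and (new_w, new_h) would then be passed to cv2.warpAffine in place of the original matrix and (w, h).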
Rotating Bounding Boxes: The Challenge
Next, rotate the bounding boxes. Rotating a box yields a tilted rectangle, so we then find the tightest rectangle with sides parallel to the image edges that contains the tilted one.
To perform this transformation, we need the four corners of the bounding box. With these corners, we can apply a rotation using the transformation matrix.
Step 1: Get the Corners
First, we write the function get_corners in the file bbox_util.py to get all four corners.
def get_corners(bboxes):
...
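The body of get_corners is elided above. As a rough illustration of the idea only, the sketch below assumes boxes are stored as an N x 4 array of [x1, y1, x2, y2] and returns an N x 8 array of corner coordinates. The name corners_from_boxes and the corner order are assumptions, not necessarily what the repository does.

```python
import numpy as np

def corners_from_boxes(bboxes):
    """Hypothetical helper: expand [x1, y1, x2, y2] boxes into four corners.

    Returns one row per box in the order
    top-left, top-right, bottom-left, bottom-right.
    """
    bboxes = np.asarray(bboxes, dtype=float)
    x1, y1, x2, y2 = bboxes[:, 0], bboxes[:, 1], bboxes[:, 2], bboxes[:, 3]
    return np.stack([x1, y1, x2, y1, x1, y2, x2, y2], axis=1)

corners = corners_from_boxes([[10, 20, 50, 80]])
print(corners)
```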
Step 2: Rotate the Box
Now define the function rotate_box in the file bbox_util.py, which rotates the bounding boxes by applying the transformation matrix to their corners and returning the transformed points.
def rotate_box(corners, angle, cx, cy, h, w):
...
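The core of this step is one matrix product over all corners. The sketch below is a simplified illustration rather than the repository's rotate_box: it takes a ready-made 2x3 matrix M instead of the angle and image size, and it assumes corners are stored as an N x 8 array. The name rotate_corners is hypothetical.

```python
import numpy as np

def rotate_corners(corners, M):
    """Hypothetical helper: apply a 2x3 affine matrix to N x 8 corners."""
    corners = np.asarray(corners, dtype=float)
    pts = corners.reshape(-1, 2)                         # one (x, y) per row
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])     # append the 1
    rotated = pts_h @ M.T                                # T_p = M . [x, y, 1]^T
    return rotated.reshape(-1, 8)

# 90 degree rotation of a 200 x 100 image, already shifted onto a
# 100 x 200 canvas (values worked out by hand for this example).
M = np.array([[0.0, 1.0, 0.0],
              [-1.0, 0.0, 200.0]])
rotated = rotate_corners([[10, 20, 50, 20, 10, 80, 50, 80]], M)
print(rotated)
```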
Step 3: Get the Enclosing Box
Define a function get_enclosing_box to calculate the tightest axis-aligned box around the corners of the rotated box.
def get_enclosing_box(corners):
...
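The enclosing box follows from the minimum and maximum of the corner coordinates. The sketch below assumes the same N x 8 corner layout (x and y interleaved) and returns [x_min, y_min, x_max, y_max] per box. The name enclosing_boxes is hypothetical.

```python
import numpy as np

def enclosing_boxes(corners):
    """Hypothetical helper: tightest axis-aligned box around each set of corners."""
    corners = np.asarray(corners, dtype=float)
    xs = corners[:, 0::2]   # x1, x2, x3, x4
    ys = corners[:, 1::2]   # y1, y2, y3, y4
    return np.stack(
        [xs.min(axis=1), ys.min(axis=1), xs.max(axis=1), ys.max(axis=1)],
        axis=1,
    )

boxes = enclosing_boxes([[20, 190, 20, 150, 80, 190, 80, 150]])
print(boxes)
```

For rotations that are not multiples of 90 degrees, the enclosing box is larger than the original box, so heavily rotated boxes fit their objects more loosely.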
Implementation: Putting it All Together
With the helper functions defined, we can implement the __call__ function that actually performs the augmentation.
def __call__(self, img, bboxes):
...
Image Shearing: Tilting Your Perspective
Image shearing skews the image along one or both axes, turning rectangles into parallelograms. Like rotation, it changes how objects appear in the image, pushing the model to learn more general features.
The Shearing Matrix
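Shearing transformations use a shearing matrix. For the horizontal shear covered here, with shear factor alpha, the matrix is:
$$ \begin{bmatrix} 1 & \alpha \\ 0 & 1 \end{bmatrix} $$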
Implementation: Shearing with OpenCV
Since we are only covering horizontal shear, we only need to change the x coordinates of the corners of the boxes according to the equation x = x + alpha*y. Our __call__ function looks like:
def __call__(self, img, bboxes):
...
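The box update implied by x = x + alpha*y can be sketched on its own, separately from the elided __call__ body. The example assumes an N x 4 array of [x1, y1, x2, y2] boxes. The name shear_boxes_horizontally is hypothetical. It shears all four corners and then takes the enclosing box, which is one possible approach and not necessarily the repository's.

```python
import numpy as np

def shear_boxes_horizontally(bboxes, alpha):
    """Hypothetical helper: boxes after the shear x -> x + alpha * y."""
    bboxes = np.asarray(bboxes, dtype=float)
    x1, y1, x2, y2 = bboxes.T

    # x coordinates of the four sheared corners; y is unchanged.
    xs = np.stack(
        [x1 + alpha * y1, x2 + alpha * y1, x1 + alpha * y2, x2 + alpha * y2],
        axis=1,
    )
    return np.stack([xs.min(axis=1), y1, xs.max(axis=1), y2], axis=1)

sheared = shear_boxes_horizontally([[10, 20, 50, 80]], alpha=0.5)
print(sheared)
```

A negative alpha pushes x coordinates to the left and can make them negative, so a real pipeline would also need to translate or flip the image to keep everything on the canvas. The sheared image itself is wider than the original, so the output width passed to cv2.warpAffine has to grow accordingly.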