Ship Detection - Part 1 (Data Wrangling)

What am I gonna learn?

By the end of this blog, you will know how to use PyTorch dataset iterator to efficiently convert DOTA format annotations to Darknet format.

You will have built the following api to do the darknet format conversion.

train_dataset = ShipDetectionDataset(train_files)
val_dataset = ShipDetectionDataset(val_files)

for img, path, boxes in tqdm(train_dataset):
    convert2darknet(img, path, boxes)

for img, path, boxes in tqdm(val_dataset):
    convert2darknet(img, path, boxes, val=True)

Introduction

Detection of ships is an important task when it comes to congestion control and tracking of ships that have turned off the AIS (Automatic Identification System).

Efficient detection of ships can allow us to optimize the cargo transportation and track ships. Ship tracking using object detection models can be used in law enforcement by altering authorities of ships that are suspected of illegal trafficking of goods.

In this series, we will go over the ship detection competition organized by Data Driven Science. This series is divided into three parts:

The dataset for this competition can be downloaded from HuggingFace which is a subset of DOTA dataset.

Directory Structure

The structure of the dataset directory is given as follows:

ship-detection
├── .extras
│   ├── submission_sample.csv
│   └── train.csv
├── test
└── train

The .extras directory contains a sample of submissions and annotations of training data. The train directory contains training images and likewise, test directory contains testing images.

Conversion to Darknet

Since we want to use yolov5, we need to convert the data annotations into darknet format. As mentioned in the documentations, each image width and height are between 0 and 1. Thus, widths and heights of the objects are normalized in this range as well.

Darknet format annotations consists of *.txt files, one for each image. These files can contain multiple rows, one for each object. These rows are in class x_center y_center width height format. Class labels start from 0 and since there’s only one class in our case (ships), 0 will be the only label here.

For more details, refer to this.

Now we dive into the coding part:

Code

We begin by importing the libraries we are going to use:

import os
import shutil

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split

import torch
from torch.utils.data import Dataset, DataLoader, random_split

import torchvision
from torchvision.utils import draw_bounding_boxes
from torchvision.io import read_image

Now we read the train.csv in the extras directory.

df = pd.read_csv("ship-detection/.extras/train.csv")
df.head(5)

	id	xmin	ymin	xmax	ymax
0	0.png	6690	3599	7164	3850
1	0.png	6772	3386	7039	3546
2	0.png	6705	3291	7060	3485
3	0.png	6230	3442	6597	3647
4	0.png	5501	790	5552	868

We can read annotations of any file as following:

df[df['id'] == "0.png"]

	id	xmin	ymin	xmax	ymax
0	0.png	6690	3599	7164	3850
1	0.png	6772	3386	7039	3546
2	0.png	6705	3291	7060	3485
3	0.png	6230	3442	6597	3647
4	0.png	5501	790	5552	868
5	0.png	2076	3189	2634	3797
6	0.png	6195	3530	6246	3565

Now we will split the training files into 80-20 ratio.

files = df['id'].unique() # get distinct files

train_files, val_files = train_test_split(files, test_size=0.2, random_state=42) # for reproducablity
print(f"No. of training images: {len(train_files)}")
print(f"No. of validation images: {len(val_files)}")

Now we will create a torch dataset to iterate over training and validation splits. We will be using this just as an iterator for now to easily convert annotations to darknet format.

class ShipDetectionDataset(Dataset): # this will help us perform data operations
    
    def __init__(self, files):
        self.dataset_loc = "ship-detection"
        self.files = files
        self.bbox_df = pd.read_csv(f"{self.dataset_loc}/.extras/train.csv")
        # bounding box columns ['xmin', 'ymin', 'xmax', 'ymax']
        self.boxes_cols = self.bbox_df.columns[1:]
        
    def __len__(self):
        return len(self.files)
    
    def __getitem__(self, idx):
        filename = self.files[idx]
        filepath = f"{self.dataset_loc}/train/{filename}"
        
        bboxes = self.bbox_df[self.bbox_df['id'] == filename][self.boxes_cols]
        targets = torch.tensor(bboxes.values)
        img = read_image(filepath)
        return img, filename, targets

Next, we define a functions to convert the annotations format.


DATASET_LOC = "ship-detection"

def get_darknet_annots(img, boxes):
    _, height, width = img.shape
    annots = []
    for b in boxes:
        x1, y1, x2, y2 = b.tolist()
        w = x2 - x1
        h = y2 - y1
        x_center = (x1 + x2) / (2*width)
        y_center = (y1 + y2) / (2*height)
        w = w/width
        h = h/height
        annot = f"0 {x_center} {y_center} {w} {h}" # 0 => ship class
        annots.append(annot)
    return annots

def convert2darknet(img, path, boxes, val=False):
    annots = get_darknet_annots(img, boxes)
    
    filename = path.split(".")[0]
    annot_filepath = f"{DATASET_LOC}/{'val' if val else 'train'}/labels/{filename}.txt"
    with open(annot_filepath, "w") as f:
        for annot in annots:
            f.write(f"{annot}\n")
            
    src = f"{DATASET_LOC}/{path}"
    dest = f"{DATASET_LOC}/{'val' if val else 'train'}/images/{path}"
    shutil.move(src, dest)

Creating the needed directories:

!mkdir ship-detection/train/labels ship-detection/train/images

!mkdir ship-detection/val ship-detection/val/labels ship-detection/val/images

Now we will create dataset iterators for training and validation files and call convert2darknet.

train_dataset = ShipDetectionDataset(train_files)
val_dataset = ShipDetectionDataset(val_files)

for img, path, boxes in tqdm(train_dataset):
    convert2darknet(img, path, boxes)

for img, path, boxes in tqdm(val_dataset):
    convert2darknet(img, path, boxes, val=True)

At this point, we are finally done wrangling the dataset.