Bird species classification is a challenging computer vision task because it
requires both localizing discriminative regions and learning fine-grained
features.
We addressed these challenges with a transfer learning-based approach and
multistage training, aiming for accurate species recognition from
high-resolution photographs despite intensity variation, varied poses, and
class imbalance.
We used a pretrained Mask R-CNN for bird region localization and an ensemble of Inception networks (Inception V3 and Inception ResNet V2). The architecture was designed to extract both micro- and macro-level features from bird regions of interest (ROIs) for
accurate classification.
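The text does not say how the two networks' outputs were combined. One common option is to average their class probabilities, so the following is a minimal sketch under that assumption, using NumPy only. The function names, the optional `weights` parameter, and the logit values are hypothetical and stand in for real model outputs.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def ensemble_predict(logits_per_model, weights=None):
    """Average class probabilities across models; return (labels, probabilities)."""
    probs = np.stack(
        [softmax(np.asarray(l, dtype=np.float64)) for l in logits_per_model]
    )
    if weights is None:
        weights = np.ones(len(logits_per_model))
    weights = np.asarray(weights, dtype=np.float64)
    weights = weights / weights.sum()  # normalize so the result is still a distribution
    avg = np.tensordot(weights, probs, axes=1)  # shape: (n_images, n_classes)
    return avg.argmax(axis=1), avg

# Hypothetical logits for 2 images and 3 species from the two networks.
inception_v3_logits = np.array([[2.0, 0.5, 0.1], [0.2, 0.3, 0.4]])
inception_resnet_v2_logits = np.array([[1.5, 0.7, 0.2], [0.1, 2.0, 0.3]])

labels, probs = ensemble_predict([inception_v3_logits, inception_resnet_v2_logits])
```

Averaging probabilities rather than raw logits keeps each model's contribution on the same scale, which matters when the two networks are not equally confident.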
The dataset was imbalanced, so we applied several data augmentation techniques: Gaussian noise, Gaussian blur, flip, contrast, hue, add, multiply, sharpen, and
affine transform. This increased the training set from 150 to 1330 images.
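To illustrate a few of these augmentations, here is a rough NumPy-only sketch covering Gaussian noise, flip, add, multiply, and contrast. It is not the project's actual pipeline: the function names, parameter values, and the random test image are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_gaussian_noise(img, sigma=10.0):
    """Add zero-mean Gaussian noise, then clip back to the valid pixel range."""
    noisy = img.astype(np.float32) + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def horizontal_flip(img):
    """Mirror the image left to right."""
    return img[:, ::-1].copy()

def add_value(img, value=20):
    """Shift every pixel by a constant (brightens for positive values)."""
    return np.clip(img.astype(np.int16) + value, 0, 255).astype(np.uint8)

def multiply(img, factor=1.2):
    """Scale every pixel by a constant factor."""
    return np.clip(img.astype(np.float32) * factor, 0, 255).astype(np.uint8)

def linear_contrast(img, alpha=1.5):
    """Stretch pixel values away from mid-grey (128) by a factor alpha."""
    out = 128.0 + alpha * (img.astype(np.float32) - 128.0)
    return np.clip(out, 0, 255).astype(np.uint8)

AUGMENTERS = [add_gaussian_noise, horizontal_flip, add_value, multiply, linear_contrast]

def augment(img):
    """Return the original image followed by one variant per augmenter."""
    return [img] + [fn(img) for fn in AUGMENTERS]

# Hypothetical 64x64 RGB image standing in for a bird photograph.
image = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
variants = augment(image)
```

Each source image here yields six training samples (the original plus five variants); applying more augmenters to under-represented classes is one way to reduce imbalance.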
We used Mask R-CNN pretrained on the COCO dataset to localize birds in images, and ImageNet-pretrained weights for initialization.
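Once a segmentation model has produced a binary mask for the bird, the region of interest can be cropped from the image. The sketch below shows one simple way to do this with NumPy, assuming the mask is a 2-D boolean array aligned with the image. The `crop_to_mask` helper, its `margin` parameter, and the synthetic mask are hypothetical and are not taken from the project.

```python
import numpy as np

def crop_to_mask(image, mask, margin=0):
    """Crop image to the bounding box of the True pixels in mask.

    Returns the image unchanged when the mask is empty (no detection).
    """
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return image
    height, width = mask.shape
    top = max(int(ys.min()) - margin, 0)
    bottom = min(int(ys.max()) + 1 + margin, height)
    left = max(int(xs.min()) - margin, 0)
    right = min(int(xs.max()) + 1 + margin, width)
    return image[top:bottom, left:right]

# Synthetic 100x120 image with a rectangular "bird" mask.
image = np.zeros((100, 120, 3), dtype=np.uint8)
mask = np.zeros((100, 120), dtype=bool)
mask[30:70, 40:90] = True

roi = crop_to_mask(image, mask, margin=5)
print(roi.shape)  # (50, 60, 3)
```

A small margin keeps some context around the bird, and falling back to the full image avoids dropping samples when nothing is detected.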
The Inception ResNet V2 model, trained on both original and Mask R-CNN cropped images, showed the best results. Training on the cropped images improved accuracy by 2-3%.
Posted Feb 21, 2024