
UNSUPERVISED IMAGE CLASSIFICATION OF FISH WITHOUT THE INFERENCE OF CLUSTER NUMBER

Date

2023-04-14

Authors

Bhupathiraju, Akhilesh Varma

Abstract

This thesis investigates deep learning techniques, particularly unsupervised image classification, for identifying and clustering fish images captured with underwater cameras. In collaboration with Innovasea, the goal is to streamline fish species identification and reduce labor-intensive manual labeling of camera data at the White Rock Dam test site in Nova Scotia, Canada. We developed an unsupervised clustering framework based on the DeepDPM deep learning model. We first reproduced DeepDPM results on several standard datasets. We then integrated ViT MAE embeddings with DeepDPM and applied ESRGAN-based image processing to enhance fish images, which are often blurry and low resolution. These techniques improved clustering accuracy from 30% to 80% for five species of fish. However, using a cluster visualization tool we developed, we observed that fish with similar appearances were clustered together. Our results demonstrate progress towards automating fish species classification and suggest future avenues of research towards this goal.

Description

This thesis looks at how recent deep learning techniques, especially unsupervised image classification methods, can be used to identify and group images of fish taken with underwater cameras. The main goal of the thesis is to reduce the work involved in identifying and labeling fish species by automatically grouping them into different clusters based on their species. This automation would cut down on the time-consuming and expensive process of hand-labeling and reduce the risk of human error, which can lead to a lack of accuracy. This work was conducted in collaboration with Innovasea with the intention of deploying the model at the White Rock Dam test site in Nova Scotia, Canada. Innovasea has set out to accelerate the scalability of machine learning models for identifying and classifying fish around commercial infrastructure. After deploying this model, the identified clusters from the model could potentially be further analyzed by fish experts to rapidly generate a set of training data used to train a low-power YOLO model for the continuous detection and identification of fish.

To achieve this objective, we developed a deep learning-based clustering framework that combines embeddings obtained from ViT MAE and clustering based on DeepDPM for classifying fish images into different classes. First, we started by reproducing the results of the DeepDPM algorithm on the MNIST, Fashion-MNIST, USPS, STL10, and ImageNet datasets. Second, we used the algorithm on a subset of the Fish4-Knowledge dataset, which was made up of unlabeled images of five different fish species, to find groups of rare fish. To improve the performance of the clustering algorithm, we used an image enhancement technique based on ESRGAN. Finally, we developed a visualization tool to analyze the clustering results and identify opportunities for improvement.

Our results obtained from the deep learning-based clustering model are promising as they show progress towards the goal of automating fish species classification without human intervention. The model has the potential to enable automated monitoring and detection of unknown fish species around commercial infrastructure. However, it is essential to note that further testing and refinement may be necessary before the model can be used well in every situation.
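
The abstract reports clustering accuracy without stating how it was computed. A common convention for unsupervised classification is to score the best one-to-one matching between predicted clusters and ground-truth species, and the sketch below illustrates that convention only; it is an assumption, not the thesis's evaluation code. The function name `clustering_accuracy` and the sample labels are hypothetical. The matching is brute-forced over permutations, which is workable for a handful of clusters such as five species; larger problems would normally use the Hungarian algorithm instead.

```python
from collections import Counter
from itertools import permutations


def clustering_accuracy(true_labels, cluster_ids):
    """Accuracy under the best one-to-one mapping of clusters to classes.

    Illustrative assumption: each cluster may be matched to at most one
    class. Clusters left unmatched (when a model that infers the cluster
    count finds more clusters than classes) contribute no correct samples.
    """
    if len(true_labels) != len(cluster_ids):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("label sequences must not be empty")

    classes = list(dict.fromkeys(true_labels))   # unique, order preserved
    clusters = list(dict.fromkeys(cluster_ids))  # unique, order preserved

    # Co-occurrence counts: how many samples of class k fell into cluster c.
    counts = Counter(zip(cluster_ids, true_labels))

    # Pad with None so that surplus clusters can stay unmatched.
    slots = classes + [None] * max(0, len(clusters) - len(classes))

    best = 0
    for assignment in permutations(slots, len(clusters)):
        matched = sum(
            counts[(c, k)]
            for c, k in zip(clusters, assignment)
            if k is not None
        )
        best = max(best, matched)
    return best / len(true_labels)


# Hypothetical labels for six images; cluster ids are arbitrary integers.
species = ["cod", "cod", "cod", "eel", "eel", "bass"]
clusters = [1, 1, 0, 0, 0, 2]
print(f"clustering accuracy: {clustering_accuracy(species, clusters):.3f}")
```

Because cluster ids carry no meaning of their own, the score does not change when clusters are renumbered, which is why a plain label comparison is not suitable here.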

Keywords

DeepDPM, unsupervised classification, EM, unsupervised clustering, SCAN, split-merge, feature-extraction, self-supervised training, image-enhancement, ViT, MAE, ESRGAN, fish clustering, Non-parametric clustering, fish species classification, automated monitoring

Citation