Forecasting algae blooms in aquaculture using mussels' openings data
Date
2021-04-28T18:15:22Z
Authors
Pondichery Vellamuthu Kripashanker, Deepan Shankar
Abstract
Time series data consists of a series of measurements collected over a period of time.
Such data is relevant in many domains, including healthcare, manufacturing,
finance, and the environment. In these domains it is often essential
to predict the future values of these time series. Activity
monitoring is a task related to forecasting where time series data is used as input
signals to some events whose occurrence is supposed to depend on the values of these
series. These events are typically of critical importance to the end users and the goal
is to be able to anticipate them with sufficient lead time. Due to the uncertainty of
the future, forecasting and anticipating these scenarios could help prevent or mitigate
hazardous activities. In this thesis we address one such application: the anticipation
of algae blooms in aquaculture industries. Because algal blooms hinder the growth
of the aquaculture species, it is important to monitor the farms to avoid
serious damage to the species. In this thesis we propose a method for anticipating
algae blooms based on measurements of mussels’ valve openings that domain experts
think can be used as bio-sentinels of the blooms. We use machine learning models to
address this predictive task and obtain models that can predict future algal bloom
events based on the micro-closures of the mussels. We focus on predicting
the presence of the alga Alexandrium tamarense in the water. Due to
the rarity of algae blooms, sampling procedures were used to balance the distribution
of the target variable to facilitate the task of the learning algorithms. Overall, our
experimental comparisons show that the models obtain very good results, particularly in
anticipating a high percentage
of the blooms (80%), although with some false alarms (48%). Our results have also
shown the advantage of adding sampling procedures to overcome the imbalanced distribution
of our target variable. In summary, in this thesis we have developed a series
of forecasting approaches based on feature engineering, machine learning models and
sampling methods that have shown great potential for preventing algae
blooms in aquaculture farms.
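
The abstract does not say which sampling procedures were used. As a rough illustration of the general idea only, the sketch below balances a rare "bloom" class by random oversampling, which is one common option and an assumption here, not the thesis's method. The function name, the toy feature values, and the labels are all hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of the rarer classes until every
    class is as frequent as the most common one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data: one feature per row (e.g. a micro-closure rate),
# label 1 = bloom observed, label 0 = no bloom.
rows = [[0.1], [0.2], [0.3], [0.4], [0.9]]
labels = [0, 0, 0, 0, 1]

bal_rows, bal_labels = random_oversample(rows, labels)
print(Counter(bal_labels))
```

Resampling of this kind is normally applied to the training data only, so that evaluation still reflects the true rarity of the events.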
Keywords
Algal Blooms, Machine Learning, Activity Monitoring, Time Series