Using data science to optimise shrimp farm yields

from Aqua Culture Asia Pacific March/April 2023


How data science offers a systematic and scientific approach to detect variables associated with a high survival rate of more than 80% at a farm in central Philippines

By Neil Arvin Bretaña

Seafood is a vital source of sustenance for a growing global population and plays a significant role in food security. With wild fish stocks declining, aquaculture has become a crucial component in meeting the demand for seafood, which is expected to increase five-fold in the next decade. However, ensuring global seafood security is a complex issue that requires a multi-disciplinary approach, including the use of data science to optimise aquaculture farm yield. Maximising yield while minimising resources, waste, and environmental impacts requires a thorough understanding of the biological and environmental factors that affect the growth and health of aquatic species.

Traditional aquaculture methods rely on trial and error and on intuition, which can lead to suboptimal results and increased costs. Data science, on the other hand, offers a systematic and scientific approach to optimising aquaculture. By collecting and analysing large amounts of data, data scientists can identify patterns and relationships that would be difficult to discern through intuition alone.

A case study in collaboration with a farm in Central Philippines

NSB-NB5 is a shrimp farm in Central Philippines that is starting to embrace data-driven innovation. In three previously recorded harvest cycles, the farm saw a loss of up to 50% in expected yield. This meant that up to half of the projected harvest failed to reach the market, despite attempts to maintain the farm’s standard inputs, including feed, probiotics, and water quality. The farm wanted to identify which of the modifiable parameters could be associated with this outcome. NSB-NB5 tapped Birkentech Solutions Pty Ltd, an Australian data science consultancy firm that focuses on supporting agriculture and aquaculture, to analyse its pond variables and identify the ones related to high shrimp survival and yield.

Statistics-based data science approach

Birkentech recommended a statistics-based data science approach to achieve this goal. This was done in close collaboration with the farm management team and technicians. Data were collected from three harvest cycles across nine Penaeus vannamei shrimp ponds. The data consisted of daily recorded values for physicochemical measurements, feed and supplement data, and water management inputs.

Physicochemical variables included pH, dissolved oxygen, temperature, salinity, depth, transparency, water colour and weather. Water management input included organic disinfectant, water probiotic, and minerals. Feed and supplement names were de-identified and coded to protect proprietary information.

At the end of each harvest cycle, the percent survival rate was recorded for each pond. The recorded survival rate, indicating the health outcome of the shrimp culture at the end of each harvest cycle, was utilised as the target variable for the study with a survival threshold set to 80%.
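The labelling step described above can be sketched in a few lines of Python. This is only an illustration: the pond records are made up, the names `records`, `SURVIVAL_THRESHOLD` and `label_survival` are hypothetical, and reading the threshold as "strictly more than 80%" is an assumption based on the article's wording rather than a confirmed detail of the study.

```python
# Hypothetical pond records: (pond id, harvest cycle, survival rate in %).
# The values are invented for illustration and are not the farm's data.
records = [
    ("pond_1", 1, 86.0),
    ("pond_2", 1, 52.5),
    ("pond_3", 1, 80.0),
    ("pond_1", 2, 91.2),
]

SURVIVAL_THRESHOLD = 80.0  # percent, as stated in the article


def label_survival(rate_percent, threshold=SURVIVAL_THRESHOLD):
    """Return 1 for a high-survival cycle, 0 otherwise.

    Assumption: "more than 80%" means strictly greater than the threshold.
    """
    return 1 if rate_percent > threshold else 0


# Attach the binary target to every pond/cycle record.
labels = [(pond, cycle, label_survival(rate)) for pond, cycle, rate in records]
for row in labels:
    print(row)
```

Turning the continuous survival rate into a binary target in this way makes it possible to ask which pond variables differ between high-survival and low-survival cycles.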

A total of 22,968 data points were made available. First, the basic summary of the data was described (Table 1).

This included the mean, median, minimum, maximum, and standard deviation of the numeric variables. For categorical variables such as colour and weather, the summary consisted of the frequency distribution. Missing values were handled by imputation using the MICE package in R.
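As a rough illustration of this kind of summary, the sketch below uses only the Python standard library. The readings are invented, the function names are hypothetical, and the sample standard deviation is an assumed choice. Missing values are simply skipped and counted here; the multiple imputation performed with MICE in the study is not reproduced.

```python
from collections import Counter
import statistics

# Hypothetical daily readings for one pond; None marks a missing value.
ph_readings = [7.8, 8.1, None, 7.9, 8.3, 8.0]
weather = ["sunny", "cloudy", "sunny", "rainy", "sunny", "cloudy"]


def summarise_numeric(values):
    """Mean, median, min, max and sample standard deviation, ignoring None."""
    observed = [v for v in values if v is not None]
    return {
        "n": len(observed),
        "missing": len(values) - len(observed),
        "mean": statistics.mean(observed),
        "median": statistics.median(observed),
        "min": min(observed),
        "max": max(observed),
        "sd": statistics.stdev(observed),
    }


def summarise_categorical(values):
    """Frequency distribution as {category: count}."""
    return dict(Counter(values))


ph_summary = summarise_numeric(ph_readings)
weather_summary = summarise_categorical(weather)
print(ph_summary)
print(weather_summary)
```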

Ponds were compared in terms of variable variation using an unsupervised clustering technique to assess whether all ponds were similar or different. K-means clustering was applied; this is one of the most basic and widely used unsupervised machine learning techniques for finding underlying patterns by grouping similar data points together.

The silhouette method indicated k=2 as the optimal number of cluster centroids. However, the two clusters formed at k=2 overlapped with each other, which implies that all data points could be regarded as one cluster and that no single pond behaved differently enough to form a separate cluster (Figure 1). For instance, 68 of the 119 data points in pond 2 grouped with one centroid and 51 with the other. A similar split between the two centroids was found for the data points of all other ponds. Therefore, analysis was performed on all data points collectively.
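The check described above (choose k by silhouette, then see how each pond's points divide between the clusters) might be sketched as follows. This is a minimal standard-library illustration, not the study's code: the points are synthetic two-dimensional values, far more cleanly separated than the farm's overlapping clusters, and `kmeans`, `silhouette` and the pond names are hypothetical.

```python
import math
import random
from collections import Counter


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns a cluster index for each point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        assign = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assign


def silhouette(points, assign):
    """Mean silhouette width; points alone in their cluster score 0."""
    clusters = sorted(set(assign))
    scores = []
    for i, p in enumerate(points):
        mean_dist = {}
        for c in clusters:
            dists = [math.dist(p, q) for j, q in enumerate(points)
                     if assign[j] == c and j != i]
            if dists:
                mean_dist[c] = sum(dists) / len(dists)
        own = assign[i]
        if own not in mean_dist or len(mean_dist) < 2:
            scores.append(0.0)
            continue
        a = mean_dist[own]                                      # cohesion
        b = min(d for c, d in mean_dist.items() if c != own)    # separation
        scores.append((b - a) / (max(a, b) or 1.0))
    return sum(scores) / len(scores)


# Synthetic data: every pond contributes points to both groups.
ponds, points = [], []
for pond in (1, 2, 3):
    for i in range(5):
        ponds.append(f"pond_{pond}")
        points.append((0.1 * i, 0.1 * pond))
        ponds.append(f"pond_{pond}")
        points.append((5 + 0.1 * i, 5 + 0.1 * pond))

# Pick the k with the highest mean silhouette width.
scores = {k: silhouette(points, kmeans(points, k)) for k in (2, 3, 4)}
best_k = max(scores, key=scores.get)

# Cross-tabulate pond against cluster for the chosen k.
assign = kmeans(points, best_k)
split = {p: Counter(a for q, a in zip(ponds, assign) if q == p)
         for p in sorted(set(ponds))}
print(best_k, split)
```

If one pond's points fell almost entirely into a single cluster while the other ponds did not, that pond would be a candidate for separate analysis. An even split across ponds, as reported for pond 2 (68 and 51 of 119 points), supports pooling the data.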
