
RESEARCH ARTICLE
Detecting the presence-absence of bluefin tuna by automated analysis of medium-range sonars on fishing vessels

Jon Uranga¹*, Haritz Arrizabalaga¹, Guillermo Boyra¹, Maria Carmen Hernandez²,³, Nicolas Goñi¹, Igor Arregui¹, Jose A. Fernandes⁴, Yosu Yurramendi², Josu Santiago¹

1 AZTI-Tecnalia, Marine Research Division, Pasaia, Spain; 2 Computer Science and Artificial Intelligence Department, University of the Basque Country, Donostia, Spain; 3 Centre for the Research and Technology of Agro-Environmental and Biological Sciences, University of Trás-os-Montes and Alto Douro, Vila Real, Portugal; 4 Plymouth Marine Laboratory, Plymouth, United Kingdom

* juranga.research@gmail.com
Abstract
This study presents a methodology for the automated analysis of commercial medium-range sonar signals for detecting presence/absence of bluefin tuna (Thunnus thynnus) in the Bay of Biscay. The approach uses image processing techniques to analyze sonar screenshots. For each sonar image we extracted measurable regions and analyzed their characteristics. Scientific data were used to classify each region into a class ("tuna" or "no-tuna") and to build a dataset to train and evaluate classification models using supervised learning. The methodology performed well when validated with commercial sonar screenshots, and has the potential to automatically analyze high volumes of data at a low cost. This represents a first milestone towards the development of acoustic, fishery-independent indices of abundance for bluefin tuna in the Bay of Biscay. Future research lines and additional alternatives to inform stock assessments are also discussed.
Introduction
The Atlantic bluefin tuna (Thunnus thynnus) is an emblematic species, exploited for several centuries, that has supported economically important industrial fisheries [1]. The International Commission for the Conservation of Atlantic Tunas (ICCAT) manages two Atlantic bluefin tuna stocks: the western stock, which spawns in the Gulf of Mexico, and the eastern stock, which spawns in the Mediterranean. Both stocks have been overfished in recent decades [2] and are currently under recovery plans. Furthermore, the scientific community has warned about the large uncertainty surrounding the eastern stock status [3], which is being addressed with a set of research programs under the Atlantic-wide Research Programme for Bluefin Tuna (GBYP) promoted by ICCAT. In order to quantify the effects of the implemented recovery plan, it is of utmost importance to be able to monitor changes in abundance and stock status through accurate indicators.
Fishery-independent scientific surveys are used to monitor the stock abundance of many groundfish and small pelagics [4]. Absolute and relative stock abundance estimates are useful
PLOS ONE | DOI:10.1371/journal.pone.0171382 February 2, 2017 1 / 18
Citation: Uranga J, Arrizabalaga H, Boyra G,
Hernandez MC, Goñi N, Arregui I, et al. (2017)
Detecting the presence-absence of bluefin tuna by
automated analysis of medium-range sonars on
fishing vessels. PLoS ONE 12(2): e0171382.
doi:10.1371/journal.pone.0171382
Editor: Brian R. MacKenzie, Technical University of
Denmark, DENMARK
Received: September 6, 2016
Accepted: January 18, 2017
Published: February 2, 2017
Copyright: © 2017 Uranga et al. This is an open
access article distributed under the terms of the
Creative Commons Attribution License, which
permits unrestricted use, distribution, and
reproduction in any medium, provided the original
author and source are credited.
Data Availability Statement: All relevant data are
within the paper and its Supporting Information
files.
Funding: This research was supported by the Basque Government through PhD grant 0033-2011 to JU and grant GV 351NPVA00062 to HA (AZTI-Tecnalia). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing Interests: The authors have declared
that no competing interests exist.

to inform management of exploited fish stocks. Many of the uncertainties associated with our ability to estimate fish stock abundances can be linked directly to limitations in the spatial coverage of our sampling systems [5]. For example, in the case of scientific acoustic surveys, highly precise narrow vertical beam acoustic equipment might fail to detect aggregations if these are sparsely distributed or if fish are aggregated in the unsampled surface layer. In such situations, the use of commercial fishing vessels and their acoustic equipment allows for a substantial increase in spatial coverage. In fact, major progress has been made in the use of this information as the basis for stock assessment [6, 7, 8, 9], as well as to analyze fish behavior [10], vessel avoidance [11] and fish distribution [12].
In tuna stock assessments, time series of standardized catch per unit effort (CPUE) indices are used as proxies for relative abundance. However, these series, based on fishery data, have known analytical challenges, such as lack of scientific design, correlated observations, non-random sampling or variable catchability [13], and do not necessarily reflect trends in population abundance. In the case of bluefin tuna, the drastic reduction in fishing opportunities as part of the recovery plan has affected the CPUE indices, and the Standing Committee on Research and Statistics (SCRS) of ICCAT has recommended urgently developing fishery-independent indices of abundance [14].
There are very few fishery-independent surveys for tuna and other highly mobile species with wide distributional ranges, because the cost associated with research vessels covering the whole distribution area is prohibitive. Moreover, it is not possible to account for the uncertainty associated with this type of surveying (e.g. double counting). Therefore, some fishery-independent surveys for tuna have focused on early life stages (larvae) or spawners, whose distributional range is much more compact and spatially limited to spawning areas [15]. When the focus has been on juveniles and adults (with high migration capabilities), airplanes have been used instead of research vessels to provide broad distribution coverage in reasonable timeframes and at reasonable cost [16, 17], estimating the approximate horizontal shape of the visible portion of schools [18]. Some sonar- and echosounder-based acoustic surveys have also been implemented to monitor southern bluefin tuna recruitment [19], together with trolling transect surveys [20].
The standardized CPUE of the Bay of Biscay baitboat fleet is used as the only abundance index for the juvenile fraction of the entire eastern stock [21, 22]. Catchability by baitboats can be affected by several factors, including food availability, feeding behavior and stomach repletion [23, 24]. These variables are difficult to incorporate during the CPUE standardization process. Consequently, inter-annual variability could induce bias in the abundance indices (e.g. a large tuna biomass could yield a low baitboat CPUE if plenty of food is available in the environment and tunas are not attracted by the bait). However, Bay of Biscay baitboats use omni-mode Medium Range Sonars (MRS) to search for tuna, and omnidirectional sonars have proven to be useful tools for characterizing large pelagic schools [25, 26]. Thus, the information obtained by these sonars could provide data about the number and size of tuna schools in the search area, independent of food availability and feeding behavior. These sonars are analog and non-scientific, used only for display, and all the information collected is lost as soon as it disappears from the screen. Our approach is therefore to record sonar screenshots on a large number of fishing vessels during the tuna fishing campaigns and to design an automated methodology for analyzing these images, as a way to utilize data that are currently wasted. The automated processing of images has proven useful in biological studies and is a fast-evolving area of research [27, 28, 29].
In summary, the estimation of bluefin tuna abundance in the Bay of Biscay using fishery-independent methods remains challenging, but new technologies, datasets and approaches provide new opportunities to address the challenge. The main objective of this study is to

develop an automated image analysis procedure for detecting presence-absence of bluefin tuna in commercial sonar images, together with a validation of the procedure based on data mining. The utility of the procedure to track abundance of juvenile bluefin tuna in the Bay of Biscay is also discussed. This constitutes a first milestone towards the longer-term objective of developing new fishery-independent indices of abundance for Atlantic bluefin tuna based on acoustics.
Materials and methods
The research presented in this manuscript involved no endangered or protected species. No
experimentation with animals was performed and no specific field permits were required as
the scientific observations were conducted on commercial fishing activities regulated by the
International Commission for the Conservation of Atlantic Tunas (ICCAT). No other ethical
issues applied to the present research project.
The study area is delimited by the activity of the baitboat fleet in the southeast corner of the Bay of Biscay, between 43–47˚N and 2–6˚W, from June to October (Fig 1). The Bay of Biscay represents a relatively small fraction of the total bluefin tuna habitat in the Atlantic [30]. However, it is the most important known feeding area for juveniles during their feeding migration to the Northeast Atlantic around summer [31].
Pole-and-line fishing with live bait has been the traditional fishing technique used by the Basque fleet for bluefin tuna in the Bay of Biscay since the early 1950s. Live bait (mainly small horse mackerel, sardine, mackerel and anchovy) is caught with a small purse seine and kept in water tanks onboard. Tuna schools are spotted visually at large distances and then detected acoustically by sonar once the school is within the detection range. When the boat is close to the tuna school, live bait is thrown into the water to keep the tuna next to the boat, while the boat sprays water so that it is not seen by the tuna. At this point, baited hooks are used to catch the tuna.
In this study we created a reference dataset of sonar images with known categories ("tuna" or "no tuna", based on tuna presence and absence data observed by scientists) to validate an image analysis and classification procedure. This dataset was used to test the methodology developed in this study, which consists of several steps: 1) image acquisition and categorization based on scientific data, 2) feature extraction, 3) training dataset elaboration, and 4) model training and evaluation.
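The four steps above can be sketched as a minimal data-flow pipeline. This is an illustrative Python sketch only; the study's implementation was written in Java, and all function names and data shapes below are assumptions, not the authors' code:

```python
# Illustrative sketch of the four-step methodology (hypothetical names;
# the study's actual implementation was written in Java).

def build_training_set(labeled_images, extract_features):
    """Steps 1-3: pair sonar screenshots with scientific labels, extract
    one feature vector per candidate blob, and collect labeled rows."""
    rows = []
    for image, label in labeled_images:           # label: "tuna" / "no-tuna"
        for features in extract_features(image):  # one vector per blob
            rows.append((features, label))
    return rows

def evaluate(model, rows):
    """Step 4 (evaluation half): fraction of blobs classified correctly
    by a model, where `model` maps a feature vector to a label."""
    correct = sum(1 for features, label in rows if model(features) == label)
    return correct / len(rows)
```

Any supervised learner can then be trained on the rows and scored with `evaluate` against held-out labeled blobs.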
Image acquisition and categorization based on scientific data
The images processed in this study were obtained from the commercial MAQ 90 kHz sonar. This omnidirectional MRS is used by the majority of the Bay of Biscay baitboat fleet. The searching range of the sonar varies with sea conditions and skipper preferences but, in general, range settings of 100–300 m are used when searching for tuna, with a slight downward tilt of 5–7˚ off the horizontal and narrow vertical and horizontal beam widths (5˚).
The screen dumps were acquired using an image acquisition device composed of a 400 MHz video splitter, an external VGA capture device and a laptop running a script for continuous data acquisition. The images selected for this study correspond to six different trips from two scientific tuna surveys conducted in the summers of 2009 and 2011. The scientific surveys were conducted using a baitboat that behaved similarly to the rest of the commercial baitboat fleet; thus, the area searched during the scientific surveys significantly overlapped the fishing area used by the commercial fleet (Fig 1). The main activities conducted by the scientists during the surveys were characterization of the vessel activities, recording of MAQ sonar screenshots and SIMRAD EK60 signal, tuna tagging and biological sampling (length measurements as well as collection of genetic tissue). The presence of bluefin tuna in the sonar was validated when bluefin


tuna was the only species caught during fishing operations. Presence of bluefin tuna was annotated in the scientific logbooks, and this information was used to classify the images into "tuna" and "no tuna" categories. For this study, the reference dataset was built by selecting a balanced set of images: 1397 images of bluefin tuna presence and 1398 images of bluefin tuna absence. Bluefin tuna absence was defined as lack of tuna echo in the image and lack of tuna catch. With the aim of including the main types of images recorded, the reference dataset included images with different background colors as well as images with and without surface noise (Fig 2).
Feature extraction
The image processing application was developed in Java and consisted of three steps: pre-processing, segmentation and extraction of characteristics.
Pre-processing. The pre-processing phase removed the non-relevant parts of the sonar screen image. The screen of the MAQ sonar has two main regions, the echogram display circle and the menu panel (Fig 2). The menu panel provides user information on the operation and system control settings, whereas the echogram represents the acoustic data. During the pre-processing we divided the sonar screen into these two basic regions and then focused on the echogram. Within the echogram, we worked with the upper half of the circle, as tuna schools are not clearly detected in the lower half due to the vessel's wake. Furthermore, schools were observed to appear first in the upper part of the echogram, because the vessels move faster than the fish and the sonar display was set up so that forward observations were located at the top of the screen. Additionally, the echogram was cleaned of noise, and sonar display lines and marks, such as cursor crosses, vessel tracks or range circumferences, were removed (Fig 3).
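The cropping described above can be sketched as follows. The circle geometry (centre, radius) and the 2-D array representation are illustrative assumptions, not the actual MAQ screen layout:

```python
# Minimal sketch of the pre-processing crop: keep only pixels inside the
# upper half of the echogram display circle and zero out everything else
# (menu panel, lower half, pixels outside the circle). The circle centre
# (cx, cy) and radius are assumed known for the given screen layout.

def mask_upper_echogram(image, cx, cy, radius):
    """image: 2-D list of pixel values; (cx, cy): circle centre (col, row)."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            inside = (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2
            if inside and y <= cy:  # upper half = forward-looking sector
                out[y][x] = image[y][x]
    return out
```

Removing overlay marks (cursor crosses, vessel tracks, range circumferences) would follow the same pattern, zeroing pixels matching the known overlay colors or positions.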
Segmentation. In the segmentation phase, the selected part of the echogram was partitioned into sub-images or blobs. First, the zero-valued (i.e., black) pixels were considered background and removed, whereas the non-zero (i.e., colored) pixels were grouped into blobs using the 8-adjacency rule. Then, in order to reduce the size of the training dataset, blobs containing fewer than 100 pixels were removed. We believe that this decision is conservative, since the smallest tuna school identified by expert judgement in the reference dataset contained 415 pixels, and so it does not restrict the utility of the classification algorithm developed.
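The 8-adjacency grouping and minimum-size filter can be sketched with a simple flood fill. This is a hypothetical re-implementation for illustration, not the authors' Java code:

```python
# Sketch of the segmentation step: group non-zero pixels into blobs using
# 8-adjacency (each pixel connects to all 8 neighbours, diagonals included),
# then drop blobs smaller than a minimum pixel count (100 in the study).

from collections import deque

def find_blobs(image, min_pixels=100):
    h, w = len(image), len(image[0])
    seen = [[False] * w for _ in range(h)]
    blobs = []
    for y0 in range(h):
        for x0 in range(w):
            if image[y0][x0] == 0 or seen[y0][x0]:
                continue
            # BFS over the 8-connected component starting at (y0, x0)
            blob, queue = [], deque([(y0, x0)])
            seen[y0][x0] = True
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and image[ny][nx] != 0 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(blob) >= min_pixels:
                blobs.append(blob)
    return blobs
```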
Extraction of characteristics. The remaining blobs were considered tuna candidates and were subject to a characteristics extraction process. For each one, 20 morphological characteristics were measured, related to area, perimeter, position, the smallest rectangle containing the blob, the best ellipse fitting the blob, aspect ratio, circularity, solidity, the greatest distance between any pair of pixels of the blob (known as Feret's diameter), the projections of Feret's diameter on the axes, the angle of Feret's diameter with respect to the horizontal axis and the minimum value of Feret's diameter. Finally, the blobs were labeled with two possible categories, "tuna" and "no-tuna", according to scientific observations.
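A few of these measurements can be sketched as follows. Note that the enclosing rectangle here is axis-aligned for simplicity, and the exact feature definitions used in the study may differ:

```python
# Sketch of the feature-measurement step for one blob, computing a subset
# of the 20 characteristics named above: area, bounding-box dimensions,
# aspect ratio, and Feret's diameter (greatest distance between any pair
# of pixels). Brute-force O(n^2) Feret computation for clarity only.

from math import hypot

def blob_features(pixels):
    """pixels: list of (row, col) coordinates belonging to one blob."""
    rows = [p[0] for p in pixels]
    cols = [p[1] for p in pixels]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    feret = max(hypot(r1 - r2, c1 - c2)
                for r1, c1 in pixels for r2, c2 in pixels)
    return {
        "area": len(pixels),        # pixel count
        "bbox": (width, height),    # axis-aligned enclosing rectangle
        "aspect_ratio": width / height,
        "feret": feret,
    }
```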
Training dataset elaboration
Based on the reference images, a training dataset of blobs was created to train automatic classification programs and to test their efficiency before they were used to classify new unsupervised images (e.g. those collected onboard commercial fishing vessels without an observer

Fig 1. The study area. A) Atlantic bluefin tuna distribution based on ICCAT catch data for the period 2000–2013 [14]. B) The study area, bluefin tuna fishing locations based on logbook data [22] and scientific surveys conducted in 2009 and 2011. doi:10.1371/journal.pone.0171382.g001
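As a purely illustrative stand-in for the supervised-learning step (the models actually evaluated are described later in the paper, not here), a minimal nearest-centroid classifier over labeled blob feature vectors might look like:

```python
# Hypothetical nearest-centroid classifier over (feature_vector, label)
# rows, e.g. the blob training dataset described above. Illustrative only;
# this is not the classification method used in the study.

def fit_centroids(rows):
    """rows: list of (feature_vector, label). Returns label -> centroid."""
    sums, counts = {}, {}
    for features, label in rows:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, v in enumerate(features):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc]
            for label, acc in sums.items()}

def predict(centroids, features):
    """Assign the label whose centroid is closest in squared distance."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(features, c))
    return min(centroids, key=lambda label: dist2(centroids[label]))
```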

References (partial, as rendered by the source page)

- R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing.
- Breiman L (2001). Random forests. Machine Learning 45(1): 5–32.
- Cortes C, Vapnik V (1995). Support-vector networks. Machine Learning 20(3): 273–297.
- Witten IH, Frank E, Hall MA. Data Mining: Practical Machine Learning Tools and Techniques, 3rd ed.
- Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009). The WEKA data mining software: an update. SIGKDD Explorations 11(1).