Abstract
Fish spawning phenology is a major concern for conservation and fisheries management. New intensive data sources such as GPS-based tracking data or high resolution catch declaration data are progressively becoming available in the field of marine ecology. These benefit from high spatio-temporal resolution and open new research avenues to investigate inter-annual and seasonal variability of phenology.
In this paper, we illustrate how catch declarations modeling coupled with spatio-temporal dimension reduction methods known as Empirical Orthogonal Functions (EOF) can be used to synthetize spatio-temporal signals in fish distribution; Specifically, we address the following questions; (1) can we identify spatio-seasonal patterns that can be interpreted in terms of seasonal migration between essential habitats? (2) can we identify changes in the phenology? (3) are those changes related to environmental drivers?
The analysis is illustrated through the analysis of the reproduction phenology on three key commercial species in the Bay of Biscay (Hake, Sole and Sea Bass). The EOF analysis on these species emphasizes strong seasonal spatio-temporal patterns that correspond to migration patterns between feeding areas and reproduction areas. Based on this methodology, we identify seasonal variations in the timing of the reproduction and we relate them to Sea Surface Temperature, a key driver of fish reproduction.
Keywords: species distribution, spatio-temporal modeling, reproduction timing, spawning season.
Introduction
To complete their life cycle, fish require different habitats specific to different life stages (Harden, 1969). Those habitats, also known as Essential Fish habitats (Magnuson-Stevens Fishery Act, 2007) are associated with key demographic processes in the fish life cycle such as spawning, feeding and migrations and are characterized by a strong concentration of individuals within a spatially restricted area. However, rapid environmental changes may force fish to adapt, by tracking their essential habitat in space and time, and by changing the seasonal timing of their demographic processes (termed “phenology”).
Understanding changes in phenology of demographic processes is critical for the management of fish population. Seasonal habitat utilization, timing of migrations and location of spawning areas are key knowledge to preserve fish essential habitats and ensure the renewal of marine resources (Delage and Le Pape, 2016; Lieth, 2013). For instance, areas where fish aggregate for spawning may require specific attention in terms of fisheries management (Biggs et al., 2021; Grüss et al., 2019). Also,marine Spatial Planning requires a good knowledge of fish essential habitats to implement offshore wind farms or limit the impact of marine aggregate extraction (Bastardie et al., 2015, 2014; Campbell et al., 2014).
Still the available data to investigate fish spatio-temporal demographic processes generally have sparse spatio-temporal coverage (e.g.scientific survey data, mark-recapture tagging data). Typically, scientific surveys usually occur once a year and provide samples only on the time span of the survey (Bastardie et al., 2015). Onboard observer data provide additional data on the whole year by recording fishing catches on a small portion of the commercial fleets (Rufener et al., 2021). With these data, it is possible to infer fish distribution at a seasonal or at a quarterly level at best (Kai et al., 2017; Olmos et al., 2023). However, this temporal resolution is generally not enough to investigate precisely the phenology of demographic processes that occur at a shorter temporal scale e.g. month, week (Biggs et al., 2021).
In the last decade, methods to combine fishermen declarations (logbook) with Vessel Monitoring System (VMS) data (‘VMS x logbook’ hereafter) have been developed to provide a fine scale information on fishing activity and fishing landings (Bastardie et al., 2010; Hintzen et al., 2012). In the last decade, ‘VMS x logbook’ data sources have been used to infer fish spatio-temporal distribution at a fine scale (Alglave et al., 2022; Azevedo and Silva, 2020; Dambrine et al., 2021; Murray et al., 2013). These data benefit from a high spatio-temporal resolution and consequently they open huge research avenues to investigate inter- and intra-annual variability of fish spatial distributions.
Recently, a modeling framework has been built (1) to integrate ‘VMS x logbook’ data from distinct fishing fleets to infer fish spatial distributions and (2) to handle preferential sampling of fisheries data (Alglave et al., 2022). The framework has been extended in time at a monthly time step. It has been applied to map fish aggregation areas to identify spawning grounds for a few key species of the Bay of Biscay (Alglave et al., 2023). Still, these approaches only investigate a small part of the time series: single year of data in Azevedo and Silva (2020) or a specific period over several years in Alglave et al. (2023). Consequently, they left apart the huge amount of information that can be extracted from the analysis of a long-term time series. One reason for this is the difficulty of simultaneously interpreting inter-, intra-annual and spatial variations in fish distribution.
Dimension-reductions techniques such as Empirical Orthogonal Functions (EOF - Hannachi et al., 2007; Lorenz, 1956) can provide insights into the spatio-temporal variability of fish population processes. EOF have been mostly used to characterize physical oceanography conditions. Some recent studies have investigated fish processes using EOF (Grüss et al., 2021; Petitgas et al., 2014; Thorson et al., 2020b, 2020a). However, to the best of our knowledge, previous studies on EOFs for biological processes have only aimed to synthesize the inter-annual variability of these processes and have never studied the intra-annual variability (i.e. phenology).
In this paper we aimed at demonstrating the potential of integrated spatio-temporal hierarchical models (ISTHM - Alglave et al., 2023, 2022) combined with EOF to:
Taking sole, sea bass and hake in the Bay of Biscay as case studies, inferences derived from EOF analyses are compared to the literature. This allows to highlight the added value of our results with regards to the available knowledge of the location of spawning grounds, the intra-annual variability of spawning, and the environmental drivers of reproduction. Finally, we also expect that the EOF analysis will help to identify lesser-known or unknown essential habitats, such as feeding grounds.
Material and methods
Outline of the approach
Our approach includes different steps that are detailed hereafter:
We selected three case studies that are important species in the Bay of Biscay and for which some knowledge on essential habitats is available but incomplete: sole , hake and sea bass(ICES 2020, 2022). Most of the literature on these species focus on spawning phenology (summarized in Figure 1). For sole , Arbault et al. (1986), Petitgas (1997) and Alglave et al. (2022) identified spawning grounds along the Bay of Biscay from January to March. For hake , Alvarez et al. (2004) provided similar analysis based on surveys conducted in the 90’s and Poulard (2001) have investigated rough scale spatio-temporal distribution of hake based on logbook data. For seabass , recent analyses have investigated the spawning area and timing based on ‘VMS x logbooks’ data and provide information on phenology (Dambrine et al., 2021). Additional information on the adults feeding grounds is available for Sole (Figure 1).
Model structure and data to fit the model
Data and commercial fleets
We analyze catch per unit of effort (CPUE) of trawlers between 2008 – 2018, a relatively long period that allows to evidence intra- and inter-annual changes in species distribution and phenology.
As we only want to interpret the spatio-temporal dynamics of adult individuals, we filtered the mature fraction of the declarations by crossing catch declarations with the size distribution in each commercial category (see Alglave et al. (2023) for further details).
We selected the data of several trawler fleets as they benefit from a relatively opportunistic behavior, and they usually cover a wide area (Figure 2). Furthermore, their CPUE provides a good indicator of fish relative biomass (Hovgêrd et al. 2008). The selected fleets for each species are presented in Table 1.
Model structure and spatio-temporal resolution
To map the spatio-temporal distribution of the biomass of these different species, we used the framework developed in Alglave et al. (2023, 2022). The framework is a hierarchical integrated statistical model that combines multiple data sources to infer spatial distribution of fish density. The model is fitted to the data between 2008 and 2018 at a monthly time step on a 0.05° grid. It is structured in 3 layers:
(1) the latent field of relative biomass spatial distribution (the field we want to infer); (2) the observations layer; this layer can handle CPUE data from different fleets including the distinct catchability of the fleets; CPUE data are related to the same unique spatio-temporal field of relative abundance (3) unknown parameters, including the ones that control the shape of the biomass latent field;
We simplified the framework developed by Alglave et al. (2022) by ignoring the preferential sampling of fishermen. Indeed, previous results by Alglave et al. (2022) have shown that preferential sampling of trawlers is low. Considering it will therefore only slightly affect spatial predictions while strongly increasing the computation burden (Alglave et al., 2022).
EOF to identify essential habitats and to highlight changes in phenology
EOF Basics: a gentle overview
EOF was initially developed by Lorenz (1956) for weather forecasting. The broad idea is to generalize the classical dimension reduction techniques like Principal Component Analysis to spatio-temporal dimensions. EOF seeks to summarize the information brought by a set of spatio-temporal maps into a smaller set of maps that best describe and summarize the spatio-temporal patterns.
Let’s defined \(S(x,t)\ \)a biomass field defined at a time step\(t\ (t\ =\{1,\ldots,T)\) and spatial cell \(x\), and the centered field of biomass\(S^{*}(x,t)=S(x,t)\ -\ \overset{\overline{}}{S(x,\cdot)}\) (with\(\overset{\overline{}}{S(x,\cdot)}\) the spatial average of\(S(x,t)\)). \(S^{*}(x,t)\) is expressed as a linear combination of spatial patterns \(p_{m}\) (or maps, named EOF) related to temporal indices (or loading factors) \(\alpha_{m}(t)\).
\begin{equation} S^{*}(x,t)=\sum_{m=1}^{M}{\alpha_{m}(t)\cdot p_{m}(x)}\ ;\ \ \ x\in\{1,...,n\},\ t\in\{1,...,T\},\ M\ \leq\ T\nonumber \\ \end{equation}
The loading factors \(\alpha_{m}(t)\) and the spatial patterns\(p_{m}(x)\) are defined to maximize the variation captured by the spatial patterns \(p_{m}(x)\) and to ensure the spatial patterns and the loading factors are orthogonal between each other. The first spatial map\(p_{1}(x)\) captures the biggest amount of spatial variation; the second spatial pattern \(p_{2}(x)\) is orthogonal to the first one and captures the second biggest amount of spatial variation. In matrix terms, this falls back to a diagonalization problem and is equivalent to make a PCA analysis on a data frame where individuals are time steps and variables are locations (Lorenz, 1956). Classical PCA representation can be used to represent EOF results. Typically, the first two loading factors can be projected on the first two spatial patterns to get a visual representation of the spatio-temporal decomposition of the signal on the first plan of variability.
In practice, the diagonalization is performed through Singular Value Decomposition (Banerjee and Roy (2014). It is available in R through the function svd (R Core Team, 2023). Spatial patterns are normalized to 1 and loading factors are standardized by the square root of their eigenvalue.
Filtering EOF dimensions and locations of the spatial pattern
For each species, we filter the number of dimensions based on the graph of the variance explained by each dimension. As a commonly used empirical rule of thumb, we cut the graph at the dimension where there is a drop in the variance explained. When plotting the spatial patterns, all the locations that contribute less to 1 / (number of grid cells over the spatial domain) are shaded to highlight the locations that contribute most to the variation. In standard PCA, this is equivalent to keeping only the variables (i.e. locations in our case) that explain or contribute more than a single variable (or location).
Identifying EOF results to phenological phases