The research performed so far can be divided into three important steps:
Data collection
Data importation
Data analysis
In this chapter, we'll cover the basics of each of these steps.
By the end, you'll hopefully have a clear idea of how we did our research.
The data analyzed for our research originated from two main sources: GPS devices and interviews. In this section, I'll provide more information on our methods for gathering it.
In addition to this data, our maps also used the publicly available bathymetry data of the General Bathymetric Chart of the Oceans (GEBCO). The method for obtaining this data is mentioned here, but won't be detailed further on this website.
In order to collect accurate and precise spatial data, we provided many artisanal fishers from Governor Generoso, Samal Island and Malita with GPS devices which they could take on their fishing trips.
Once activated, these devices repeatedly logged the location, speed and direction of the fishing vessel. After a month, all logs were compiled into one Excel file per fisher and sent to me for analysis.
To assess how reliable interview data is for mapping the spatial distribution of fishing effort, many artisanal fishers were also interviewed.
In these interviews, fishers were asked to highlight the regions they most frequently visited on the grid seen in image 5. The results from all fishers of a municipality were then combined in an Excel table, whose rows and columns corresponded to those of the original grid. Once all fishers were interviewed, this file was sent to me for analysis.
The interviewed fishers were also asked other questions, for example about the average duration of their fishing trips and their general catch composition. Most of these results were gathered for other studies, however, and will therefore not be discussed further here.
Once the data was obtained, it had to be imported into the software program best suited to a particular analysis. For our research, these programs were QGIS (v.3.14.16) and R (v.4.0.3).
Most of the time, this step was quite easy. Importing the bathymetry data, for example, simply required dragging the downloaded file into QGIS. Occasionally, however, it was easier said than done.
In this section, I'll provide more info on the methods I used for GPS and interview data.
Importing and georeferencing the GPS data I received into the QGIS environment was fairly easy.
After downloading the data as separate Excel files for each fisher tracked, I converted them into Comma Separated Value (.csv) files, which was necessary to import the data into QGIS. As the GPS data was split across more than fifty files for each month, I performed this as a batch process. Because I didn't initially know how to do this, I consulted online resources, which quickly helped me out (image 6).
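For illustration, a minimal R sketch of such a batch conversion could look as follows, assuming the readxl package and hypothetical folder and file names (this is not necessarily the exact procedure I followed):

```r
# Convert every Excel log in a folder to a QGIS-readable CSV file.
# Folder and file names are hypothetical.
library(readxl)

xlsx_files <- list.files("gps_logs", pattern = "\\.xlsx$", full.names = TRUE)

for (f in xlsx_files) {
  logs <- read_excel(f)                        # read one fisher's log
  out  <- sub("\\.xlsx$", ".csv", f)           # same name, .csv extension
  write.csv(logs, out, row.names = FALSE)      # write the CSV next to the original
}
```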
Transferring the interview data into QGIS wasn't as easy, however, as the results were given in an Excel table which QGIS couldn't read. As a result, I had to find a creative solution.
After several attempts, I managed to use the ImageJ program (v.1.53c) to recreate the grid seen in image 5 in QGIS. By reformatting the Excel table containing the interview results and using the "join layers by attribute" function of QGIS in a novel way, I was then able to join the data table with the empty grid I had created, successfully transferring all the data.
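To illustrate the idea behind such an attribute join (the actual join was done in QGIS), here is a small R sketch with hypothetical column names, where each grid cell and each interview result share a common cell identifier:

```r
# Hypothetical example: join interview counts onto an empty grid via a shared key.
grid_cells <- data.frame(cell_id = 1:100)                 # stand-in for the empty grid layer
interviews <- data.frame(cell_id   = c(3, 17, 42),        # cells highlighted by fishers
                         n_fishers = c(12, 7, 25))

joined <- merge(grid_cells, interviews, by = "cell_id", all.x = TRUE)
joined$n_fishers[is.na(joined$n_fishers)] <- 0            # cells no one selected get zero
```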
Importing the GPS data into R first required formatting and filtering it on a case-by-case basis, which was done using QGIS and Excel.
Filtering the GPS data often required me to develop new expressions in the QGIS "field calculator". For some analyses, for example, I had to filter out incomplete data tracks by removing those containing a measurement whose associated duration exceeded four hours, or containing no measurements within one kilometer of the nearest port.
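In R, the same filter could be sketched roughly as follows, assuming a hypothetical table of GPS fixes with columns track_id, duration_h (hours since the previous fix) and port_dist_km (distance to the nearest port); in practice I built the equivalent expressions in the QGIS field calculator:

```r
library(dplyr)

# Hypothetical GPS fixes, grouped into tracks
gps_points <- data.frame(
  track_id     = c(1, 1, 2, 2),
  duration_h   = c(0.5, 0.5, 5.0, 0.5),    # hours since the previous fix
  port_dist_km = c(0.2, 8.0, 0.3, 9.0)     # distance to the nearest port
)

complete_tracks <- gps_points %>%
  group_by(track_id) %>%
  filter(max(duration_h) <= 4,             # drop tracks with a gap of over four hours
         min(port_dist_km) <= 1) %>%       # keep only tracks with at least one fix near a port
  ungroup()
```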
By experimenting with the functions included in the QGIS field calculator, I learned a great deal about the intricacies of working with this program.
After importing our data, we could proceed with analyzing it.
In QGIS, these analyses mostly concerned the production of maps, while in R they mostly concerned Kruskal-Wallis analyses.
In this section, I'll discuss the general steps we took for each of these analyses.
Analyses in QGIS usually proceeded in roughly five steps: data formatting, data filtration, data processing and calculations, layer styling, and map production.
As you can imagine, the exact processes involved in these steps varied widely depending on both the data being worked with and the map we aimed to make. To keep this section at an appropriate length, I'll only briefly describe what each of these steps generally entailed.
Data formatting is sometimes required when QGIS doesn't correctly recognize the original format of your data. This step often involved working in the QGIS field calculator and using the "Refactor fields" function.
Data filtration involves taking the subsets of your data that are best suited to producing a particular map. For this, I most often worked in the field calculator, but occasionally had to use functions such as "select by location" or "distance to nearest hub".
Important to note here is the filter used for inferring whether or not a GPS point represents a fishing activity, as this filter can strongly affect the end result. For our analysis, we assumed that GPS points recorded at a speed below 5 km/h represented fishing activities, a limit often used in research (Forero et al., 2017).
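Expressed in R on a hypothetical speed_kmh column, the rule is simply the following (in practice it was applied through the QGIS field calculator):

```r
# Keep only fixes recorded below 5 km/h; these are assumed to represent fishing activity.
gps_points <- data.frame(speed_kmh = c(2.1, 4.8, 12.3, 3.5))   # hypothetical speeds
fishing_points <- subset(gps_points, speed_kmh < 5)
```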
Data processing and calculations are required if your data isn't yet presented as the correct map element, or if certain values have to be calculated from existing ones before the result can be produced. This step often involved both the field calculator and various functions. As an example, I had to sum up the duration values of all GPS recordings representing fishing activities within each sub-section of the recreated grid of image 5, using the "join attributes by location (summary)" function in QGIS.
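As an illustration of that calculation, assuming the spatial join has already tagged each fishing point with the grid cell it falls in (a hypothetical cell_id column), the per-cell totals amount to a grouped sum:

```r
# Hypothetical fishing fixes already tagged with the grid cell they fall in
fishing_points <- data.frame(
  cell_id    = c(3, 3, 17, 42, 42),
  duration_h = c(1.0, 0.5, 2.0, 0.25, 0.75)
)

# Total fishing time per grid cell
effort_per_cell <- aggregate(duration_h ~ cell_id, data = fishing_points, FUN = sum)
```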
Layer styling is quite a simple step which mainly involves assigning appropriate colors and labels to the layers of your map. Sometimes it was hard to find the legend that best visualized the gradients within an important layer, but I believe anyone can manage this step if they take their time.
Map production involves working in the QGIS layout manager. More specifically, it involves arranging the map elements you've prepared in the best possible way. For this, I believe it's important to curiously explore the different possibilities of the layout manager, as this allows you to fine-tune your maps and obtain a professional-looking result.
Analyses in R usually proceeded in approximately four steps: data exploration, testing of ANOVA assumptions, Kruskal-Wallis analysis with effect size estimation, and lastly plot production.
Again, in order to keep this section short, I'll only quickly describe what each of these steps entailed for our work.
Data exploration involved using functions such as head() and summary() to explore the data after it had been imported, as well as producing some initial exploration plots. This step is important to ensure all data is correctly formatted before further analysis. For our analyses, this was always the case, as we had made sure of it beforehand in QGIS.
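A typical first look at an imported table could look like this, using a hypothetical CSV file of GPS fixes with a hypothetical speed_kmh column:

```r
gps <- read.csv("fisher_01_tracks.csv")   # hypothetical file name

head(gps)       # first rows: check column names and formats
summary(gps)    # value ranges and missing data per column
hist(gps$speed_kmh, main = "Recorded speeds", xlab = "Speed (km/h)")   # quick exploration plot
```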
Testing the ANOVA assumptions was essential to our statistical analyses, because an ANOVA initially seemed the ideal test for each of them. To test the assumptions, we mainly evaluated the ANOVA diagnostic plots and performed the Shapiro-Wilk test for normality and Bartlett's test for homogeneity of variance. Unfortunately, the distance and speed variables we analyzed were not normally distributed, and the residuals were most often heteroscedastic.
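As a sketch of those checks, with hypothetical columns speed_kmh (the response) and municipality (the grouping variable):

```r
set.seed(1)
gps <- data.frame(
  municipality = rep(c("Governor Generoso", "Samal Island", "Malita"), each = 20),
  speed_kmh    = rexp(60, rate = 0.3)      # skewed values, like real vessel speeds
)

model <- aov(speed_kmh ~ municipality, data = gps)

par(mfrow = c(2, 2))
plot(model)                                           # ANOVA diagnostic plots
shapiro.test(residuals(model))                        # Shapiro-Wilk test for normality
bartlett.test(speed_kmh ~ municipality, data = gps)   # Bartlett's test for equal variances
```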
Kruskal-Wallis analyses were therefore performed instead, as they were appropriate for the comparisons we analyzed and assume neither normality nor homoscedasticity. Additionally, Kruskal-Wallis effect size estimations were performed using the kruskal_effsize() function included in the rstatix package for R.
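The corresponding calls look roughly like this, reusing the hypothetical data layout from the sketch above:

```r
library(rstatix)

gps <- data.frame(
  municipality = rep(c("Governor Generoso", "Samal Island", "Malita"), each = 20),
  speed_kmh    = rexp(60, rate = 0.3)      # hypothetical, skewed speed values
)

kruskal.test(speed_kmh ~ municipality, data = gps)    # base-R Kruskal-Wallis test
kruskal_effsize(gps, speed_kmh ~ municipality)        # eta-squared based effect size (rstatix)
```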
Plot production was lastly done using the ggplot2 package for R. Although I was initially inexperienced at working with this package, I quickly got used to it with the help of an active online community.
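As an example of the kind of plot this produced, again with hypothetical data:

```r
library(ggplot2)

gps <- data.frame(
  municipality = rep(c("Governor Generoso", "Samal Island", "Malita"), each = 20),
  speed_kmh    = rexp(60, rate = 0.3)      # hypothetical speed values
)

ggplot(gps, aes(x = municipality, y = speed_kmh)) +
  geom_boxplot() +
  labs(x = NULL, y = "Vessel speed (km/h)", title = "Recorded speeds per municipality")
```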