To provide a robust understanding of the impact of hurricanes on Florida gas prices, we collected, cleaned, and preprocessed two main datasets.
Hurricane data from NOAA (National Oceanic and Atmospheric Administration) was integral because it serves multiple purposes. Its primary purpose is to supply a rich collection of predictors for the model. It also provided a foundation for understanding the nature of hurricane events themselves.
Gas data from AAA (American Automobile Association) was also pivotal to our understanding of the impact of hurricanes on Florida's gas markets. This price data serves as the response variable in the model for our main question.
For information on the method of collection and the data source, please visit the Data Acquisition tab.
For snapshots of the raw and processed hurricane and gas price data, please see the Data Preprocessing tab.
The hurricane and gas price datasets were not fully integrated because the gas price data was recorded per metro area. Fully integrating them would repeat every hurricane record for each metro, making the dataset extremely large and redundant. Instead, we created a method to merge one user-specified metro’s gas data with the hurricane data.
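The per-metro merge described above might look roughly like the following sketch. It is not the team's actual method: it assumes the two datasets are joined on a shared date field, and the field names (`metro`, `date`, `price`, `name`, `max_wind_kt`) and the sample rows are hypothetical placeholders, not real observations.

```python
from collections import defaultdict


def merge_metro_with_hurricanes(gas_rows, hurricane_rows, metro):
    """Left-join one metro's gas prices to hurricane records by date.

    Assumption: both inputs are lists of dicts sharing a "date" key.
    """
    # Index hurricane records by date; several storms can share one date.
    storms_by_date = defaultdict(list)
    for storm in hurricane_rows:
        storms_by_date[storm["date"]].append(storm)

    merged = []
    for row in gas_rows:
        if row["metro"] != metro:
            continue  # keep only the user-specified metro
        # Dates with no storm still produce one row, with empty storm fields.
        storms = storms_by_date.get(row["date"]) or [None]
        for storm in storms:
            merged.append({
                "metro": row["metro"],
                "date": row["date"],
                "price": row["price"],
                "storm_name": storm["name"] if storm else None,
                "max_wind_kt": storm["max_wind_kt"] if storm else None,
            })
    return merged


# Illustrative placeholder rows only (not real prices or storms).
gas_rows = [
    {"metro": "Metro A", "date": "2000-01-01", "price": 1.00},
    {"metro": "Metro A", "date": "2000-01-02", "price": 1.10},
    {"metro": "Metro B", "date": "2000-01-02", "price": 1.20},
]
hurricane_rows = [
    {"date": "2000-01-02", "name": "EXAMPLE", "max_wind_kt": 100},
]
metro_a = merge_metro_with_hurricanes(gas_rows, hurricane_rows, "Metro A")
```

Because only one metro is selected before joining, each hurricane record is repeated at most once per matching date rather than once per metro, which avoids the redundancy a full integration would introduce.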
Data Ethics
The team took ethical implications into consideration when collecting and subsequently using the data.
To mitigate the potential negative ramifications of scraping data from the Wayback Machine, the team limited the number of HTTP requests by scraping only once. The team decided that a single pull documenting historic prices through February 2026 was sufficient, as further collections would not add substantial information to the analysis.
More broadly, including the hurricane data, none of the data collected and used was private, personally identifiable, or restricted.
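One way to keep a scrape to a single pull is to cache the result on disk and reuse it on every later run. The sketch below is only an illustration of that idea, not the team's scraper: `load_snapshot` and its `fetch` argument are hypothetical names, and `fetch` stands in for whatever function actually performs the request.

```python
from pathlib import Path


def load_snapshot(cache_path, fetch):
    """Return the cached snapshot text, calling `fetch` only if no cache exists.

    `fetch` is a hypothetical zero-argument callable that performs the
    single network request and returns the page text.
    """
    path = Path(cache_path)
    if path.exists():
        # A previous run already pulled the data; do not hit the server again.
        return path.read_text(encoding="utf-8")
    text = fetch()  # the only request sent to the remote site
    path.write_text(text, encoding="utf-8")
    return text
```

Passing the request function in as an argument keeps the caching logic independent of any particular HTTP library.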
Limitations
Because we grouped by metropolitan area to join the data, we lose a small amount of detail in the gas data within each metro. The manner in which AAA collected the data could also limit its detail, as variance within the county is not accounted for. However, given the large radius of these hurricanes, neighborhood-level data could potentially reduce the accuracy of the models by introducing noise.
For the geospatial linkage of the hurricane and gas data sets, we utilized a coordinate data set. Simplemaps.com, a reputable interactive maps and data site, amalgamated data from the United States Census Bureau and the Bureau of Labor Statistics to create an open source data set, most recently updated in February 2026. This set includes the metropolitan area name, the 5-digit FIPS code, the full county name, and the latitude and longitude of the metro area.
The team utilized the basic, free data set. The data license is included in the GitHub repository. As this static set was primarily for joining data, we did not need a live data source or API.
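As an illustration of how metro coordinates can be linked to hurricane positions, the sketch below computes the great-circle (haversine) distance between a metro and a storm track point and checks whether the metro falls within a chosen radius. The text does not state how the linkage was actually computed, so the use of haversine distance, the `lat`/`lng` field names, and the `within_radius` helper are all assumptions made for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(metro, track_point, radius_km):
    """True if a storm track point lies within `radius_km` of the metro.

    Both arguments are dicts with hypothetical "lat" and "lng" keys.
    """
    distance = haversine_km(metro["lat"], metro["lng"],
                            track_point["lat"], track_point["lng"])
    return distance <= radius_km
```

A static coordinate table is enough for this kind of lookup, since metro locations do not change between runs.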