Diego Burgos

Predicting Hurricane Trajectories Using Geospatial Data

https://github.com/youngdeezy/capstone

Phase I

Introduction

Hurricanes are naturally occurring events that can cause great destruction in the regions they affect. Because of this, hurricane forecasting is an important safety measure for any hurricane-prone region. Knowing in advance when a hurricane will arrive and how intense it will be allows these places to plan how to respond to the storm.

Although traditional hurricane forecasting is an established method that has been in use for decades, applying machine learning to it may open up new possibilities. Improved forecasts would further help these areas respond to storm events.

Objective

•Use geospatial data to analyze hurricane events

•Turn the hurricane data into a dataset

•Use the dataset to train a machine learning algorithm that predicts the paths of ongoing hurricanes

•Plot a chart of each generated hurricane path so it can be compared to previous hurricanes

Goals

•Compare severity and paths of past hurricanes

•Make predictions on hurricane tracks

•Compare predicted results with actual hurricane data

•See how a machine learning model fares at predicting hurricanes compared to traditional forecasting

•Create an easily presentable visualization that will display generated storms as well as previous ones

Literature Review

Before approaching any problem, it is important to see whether any previous work relates to it. Machine learning is a field of study that is applied across countless fields and industries, so I was able to find several articles of interest. The two most important ones are from:

Sheila Alemany - Her group predicted the trajectory of hurricanes using a Recurrent Neural Network. The neural network is employed over a fine grid to reduce truncation errors. This technique is used to predict up to 120 hours of hurricane paths.

Albert Kahira - His work uses Convolutional Neural Networks. The input data are monthly averages of 6 weather variables (sea surface temperature, mean sea level pressure, sea ice cover, 2 metre pressure, U wind speed and V wind speed) from 1901 to 2010, provided by the earth science department of Barcelona Supercomputing Center.

Exploratory Data Analysis

Data points from Hurricane Wilma's storm track

The dataset that I obtained was a CSV file containing 34,415 rows of hurricane data. It contains standard hurricane statistics such as name, time, location, wind speed, and pressure.

All storms that were labelled as unnamed were removed from the dataset.

When the data scrubbing was completed, the DataFrame had 23,583 rows of hurricane data. In total, 10,832 rows were removed from the DataFrame.
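The scrubbing step described above could look something like the following. This is a minimal sketch, assuming the storm name lives in a column called "Name" and that unnamed storms are marked with the label "UNNAMED"; both are assumptions, since the report does not list the actual column names or label.

```python
import pandas as pd


def drop_unnamed_storms(df, name_col="Name", unnamed_label="UNNAMED"):
    """Return a copy of df without rows whose storm name is the unnamed label.

    name_col and unnamed_label are hypothetical defaults; adjust them to
    match the real CSV.
    """
    # Normalize whitespace and case so " unnamed " and "UNNAMED" both match.
    names = df[name_col].astype(str).str.strip().str.upper()
    return df[names != unnamed_label].reset_index(drop=True)


# Placeholder rows for illustration only, not values from the real dataset.
sample = pd.DataFrame(
    {
        "Name": ["UNNAMED", "KATRINA", " unnamed ", "WILMA"],
        "Wind": [35, 150, 30, 160],
    }
)
named = drop_unnamed_storms(sample)
rows_removed = len(sample) - len(named)
```

Comparing the row count before and after the filter, as in the last line, is a quick way to confirm how many rows were dropped.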

Statistics of the dataset

Example of Geospatial dataset used to make continental map of U.S.

Each row represents each state of the U.S.

Visualization of Hurricane Katrina's track.

Wind speed intensity is indicated by color (redder points mark higher wind speeds)

The visualization that I made converts all the rows of a given hurricane into a GeoDataFrame, which can then be plotted onto a map as in the figure above.
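As a rough illustration of that conversion, the sketch below selects one storm's rows and wraps them in a GeoDataFrame with point geometries. The column names "Name", "Latitude" and "Longitude" are assumptions, and the coordinates are assumed to be plain WGS 84 degrees; this is not necessarily how the project code does it.

```python
import geopandas as gpd
import pandas as pd


def storm_to_geodataframe(df, storm_name, name_col="Name",
                          lat_col="Latitude", lon_col="Longitude"):
    """Build a GeoDataFrame of point geometries for a single storm.

    All column names are hypothetical defaults.
    """
    # Keep only the rows that belong to the requested storm.
    storm = df[df[name_col] == storm_name].copy()
    # points_from_xy expects x (longitude) first, then y (latitude).
    geometry = gpd.points_from_xy(storm[lon_col], storm[lat_col])
    # EPSG:4326 is the usual CRS for latitude/longitude in degrees.
    return gpd.GeoDataFrame(storm, geometry=geometry, crs="EPSG:4326")
```

The result can be drawn on top of a base map with the GeoDataFrame's plot method, for example by passing a wind speed column as column and a red colormap as cmap so that stronger winds appear redder.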

Planned Implementation

•Using GeoPandas and Matplotlib, the program plots a hurricane's track onto a map of the US

•After the visualization is complete, the dataset is used to train and deploy a model that can predict the tracks of hurricanes

•Predicted storm tracks will be plotted against their real counterparts using this program

Phase II

Now that the visualization side of the project is complete, it is time to approach the machine learning side. To get the best model predictions, it is important to find values that capture trends in the data. Some possible training variables include location, wind speed, pressure, and distance between each point.

Clustering

I clustered the entire dataset to see if there were any trends. Unfortunately, it is hard to get any information from this. It would be better to perform clustering on smaller, more localized datasets.

Distance

One measurement that may be used when training a model is the distance between consecutive points. I modified the dataset so that each row also holds the longitude and latitude of the previous row. With that, I was able to calculate the distance between the two points using the haversine formula, which takes the latitude and longitude of both points as input.
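The distance calculation can be sketched as follows. This is a minimal version in plain Python, assuming a track is a list of (latitude, longitude) pairs in degrees, a mean Earth radius of 6371 km, and a distance of 0 for the first point of a track (which has no previous point). It pairs each point with its predecessor using zip rather than a DataFrame column, so it is an illustration of the idea and not the project's actual code.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an assumed constant


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def step_distances(track):
    """Distance from each point to the previous one along a storm track.

    track is a list of (lat, lon) tuples. The first point gets 0.0
    because it has no previous point.
    """
    if not track:
        return []
    distances = [0.0]
    for (prev_lat, prev_lon), (lat, lon) in zip(track, track[1:]):
        distances.append(haversine_km(prev_lat, prev_lon, lat, lon))
    return distances
```

Distances should be computed per storm, so that the first point of one hurricane is never paired with the last point of another.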

Model

The model that I trained on the data was a Recurrent Neural Network (RNN). It is implemented with scikit-learn and TensorFlow.

Results

Although I was able to train and deploy a model, the results were less than optimal, and I clearly need to improve it before I can make any predictions.
The next step is to look for ways to improve the model's performance.

Phase III

After the suboptimal results of the RNN I deployed, I investigated ways to fix the problem and meet the main goals of the project. In this phase I was able to make great strides and deploy a model that predicts the track of hurricanes.

Grid System

After looking back on Sheila Alemany's use of grids to train her group's model, I decided to implement a grid system for the purpose of training.

New Dataset

To improve the performance of the model, I decided to use a different hurricane dataset that was obtained from Unisys Weather.

It covers the years 1920 to 2012 and contains approximately 33,248 rows of hurricane data.

This dataset mostly contains standard hurricane data, except for the unique-key value. The unique-key value combines the name of the storm, the year, and the hurricane number. This common identifier is used for training purposes. The distance and direction columns are used to implement the grid system. Each point is assigned to a specific location on the grid.
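One simple way to assign a point to a grid location is to divide a bounding box into equal cells and number them row by row. The sketch below does this directly from latitude and longitude; the bounds, the 1-degree cell size, and the numbering scheme are all assumptions chosen for illustration, and it does not reproduce the distance-and-direction approach described above or the grid used by Alemany's group.

```python
# Hypothetical bounding box and resolution for the grid (degrees).
LAT_MIN, LAT_MAX = 0.0, 70.0
LON_MIN, LON_MAX = -110.0, 0.0
CELL_DEG = 1.0

N_ROWS = int((LAT_MAX - LAT_MIN) / CELL_DEG)
N_COLS = int((LON_MAX - LON_MIN) / CELL_DEG)


def to_grid_id(lat, lon):
    """Map a latitude/longitude pair to a single integer grid cell ID.

    Cells are numbered row by row, starting at 0 in the south-west corner.
    """
    if not (LAT_MIN <= lat < LAT_MAX and LON_MIN <= lon < LON_MAX):
        raise ValueError("point lies outside the grid bounds")
    row = int((lat - LAT_MIN) // CELL_DEG)
    col = int((lon - LON_MIN) // CELL_DEG)
    return row * N_COLS + col
```

Turning each position into one integer like this makes the prediction target a discrete class, at the cost of losing any detail finer than the cell size.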

Chart showing all of the datapoints plotted onto a map

Model

The main difference with this RNN model is that grid locations are now being used as the label.
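To show what using grid locations as the label can mean in practice, here is a minimal sketch of how training pairs for a sequence model might be built: each input is a short window of consecutive observations from one storm, and its label is the grid location of the observation that follows. The window length of 5 and the function name are assumptions; the report does not state how its sequences were constructed.

```python
def make_sequences(features, grid_ids, window=5):
    """Build (inputs, labels) pairs for one storm.

    features: list of per-observation feature tuples, in time order.
    grid_ids: list of grid cell IDs, one per observation.
    Each input is `window` consecutive feature tuples and each label is
    the grid ID of the observation immediately after that window.
    """
    if len(features) != len(grid_ids):
        raise ValueError("features and grid_ids must have the same length")
    inputs, labels = [], []
    # A storm with `window` or fewer observations yields no pairs.
    for start in range(len(features) - window):
        inputs.append(features[start:start + window])
        labels.append(grid_ids[start + window])
    return inputs, labels
```

Sequences should be built separately for each storm, for example by grouping rows on the unique-key value, so that a window never spans two different hurricanes.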

After the model was deployed, the grid locations it generated were precise enough to mimic the behavior of their real counterparts, although there are some inaccuracies.

Chart comparing the predicted track of Hurricane Irene with its real counterpart

Chart of a data tuple comparing its predicted grid locations with their real counterparts
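A comparison like the ones charted above could also be summarized as a single number. The sketch below computes the fraction of time steps at which the predicted grid location exactly matches the real one. This metric is only a suggestion and an assumption on my part; the report does not state which measure, if any, was used to score the model.

```python
def cell_match_rate(predicted, actual):
    """Fraction of time steps where the predicted grid ID equals the real one.

    predicted and actual are equal-length lists of grid cell IDs for one storm.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot score an empty track")
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)
```

An exact-match rate is strict, since a prediction one cell away counts the same as one far off, so it is best read alongside the charts and not in place of them.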

Conclusions

Due to the slight inaccuracies in my model's predictions, I believe that traditional forecasting is superior for the time being. This model requires further tuning to make better predictions before it can start being implemented with legitimate hurricane forecasting. However, these are some encouraging first steps to reaching that goal.

Due to time constraints, there is one thing I did not get to work on further. I would have liked to see whether a generated storm track can be converted back into a form that can be plotted onto a map. Being able to compare the generated and real tracks on an actual map would have made it easier to see the performance of the generated hurricane track.

Works Cited:

Alemany, Sheila & Beltran, Jonathan & Perez, Adrian & Ganzfried, Sam. (2018). Predicting Hurricane Trajectories Using a Recurrent Neural Network. Proceedings of the AAAI Conference on Artificial Intelligence. 33. 10.1609/aaai.v33i01.3301468.

Asif, Amina & Dawood, Muhammad & Jan, Bismillah & Khurshid, J. & DeMaria, Mark & Minhas, Fayyaz ul Amir Afsar. (2020). PHURIE: hurricane intensity estimation from infrared satellite imagery using machine learning. Neural Computing and Applications. 32. 10.1007/s00521-018-3874-6.

Links cited to develop code:

https://heartbeat.fritz.ai/working-with-geospatial-data-in-machine-learning-ad4097c7228d

https://towardsdatascience.com/hurricane-florence-building-a-simple-storm-track-prediction-model-1e1c404eb045

https://medium.com/@kap923/hurricane-path-prediction-using-deep-learning-2f9fbb390f18

https://www.datacamp.com/community/tutorials/geospatial-data-python

https://github.com/sheilaalemany/hurricane-rnn

https://www.freecodecamp.org/news/the-ultimate-guide-to-recurrent-neural-networks-in-python/
