I propose to develop a Neural Architecture Search (NAS) Machine Learning (ML) model to emulate chemical variables using output from the Weather Research and Forecasting model coupled with Chemistry (WRF-Chem). The problem with running WRF-Chem is that it is computationally expensive and time-consuming.
Using historical output from the WRF-Chem model, I will develop a Machine Learning model that predicts future values.
By doing this, I aim to reduce the computational cost and the processing time needed to generate chemical variables.
Neural Architecture Search (NAS): A technique that automates neural network architecture engineering.
Artificial Neural Network (ANN): An information processing model inspired by biological nervous systems.
Recurrent Neural Network (RNN): A class of neural networks that allow previous outputs to be used as inputs while maintaining hidden states.
NAS is the process of automating neural network architecture engineering.
We provide a NAS system with a dataset and a task (classification, regression, etc.), and it returns an architecture.
This architecture will perform best among all candidate architectures for the given task when trained on the provided dataset.
Generally, NAS can be categorized along three dimensions: a search space, a search strategy, and a performance estimation strategy.
Figure: The fundamentals of neural architecture search [7]
Search space: The search space determines which neural architectures are assessed. It contains every architecture design (often an infinite number) that can be generated by the NAS approach. It may cover all sets of layer configurations stacked on each other, or more complicated architectures that include skip connections. To reduce the dimension of the search space, it may also be built from sub-module designs. [7]
Performance estimation strategy: This provides a number that reflects the efficiency of each architecture in the search space. It is usually the accuracy of a model architecture after training on a reference dataset for a predefined number of epochs, followed by testing. [7]
Search strategy: NAS relies heavily on search strategies, including random and grid search, gradient-based strategies, evolutionary algorithms, and reinforcement learning. A grid search explores the space systematically, whereas random search randomly picks architectures from the search space and then evaluates each one through the performance estimation strategy. [7]
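As a concrete illustration of random search, the sketch below samples small architectures from a toy search space (hidden-layer width and learning rate), scores each with a simple performance estimation step (training a tiny NumPy MLP on synthetic data), and keeps the best. The search space, the toy task, and all names here are illustrative assumptions, not the strategy AutoKeras uses internally.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression task standing in for the real data (illustrative only).
X = rng.normal(size=(200, 4))
y = np.sin(X).sum(axis=1, keepdims=True)
X_train, y_train = X[:150], y[:150]
X_val, y_val = X[150:], y[150:]

# Search space: hidden-layer width and learning rate (assumed, for the sketch).
SEARCH_SPACE = {"units": [4, 8, 16, 32], "lr": [0.001, 0.01, 0.1]}

def train_and_score(units, lr, epochs=200):
    """Performance estimation: train a one-hidden-layer MLP
    and return its validation MSE (lower is better)."""
    W1 = rng.normal(scale=0.5, size=(4, units)); b1 = np.zeros(units)
    W2 = rng.normal(scale=0.5, size=(units, 1)); b2 = np.zeros(1)
    for _ in range(epochs):
        h = np.tanh(X_train @ W1 + b1)               # forward pass
        pred = h @ W2 + b2
        grad = 2 * (pred - y_train) / len(X_train)   # dMSE/dpred
        gW2 = h.T @ grad                             # backprop, layer 2
        gh = grad @ W2.T * (1 - h ** 2)              # backprop, layer 1
        W2 -= lr * gW2; b2 -= lr * grad.sum(axis=0)
        W1 -= lr * X_train.T @ gh; b1 -= lr * gh.sum(axis=0)
    val_pred = np.tanh(X_val @ W1 + b1) @ W2 + b2
    return float(np.mean((val_pred - y_val) ** 2))

# Search strategy: random search over the space.
best = None
for _ in range(10):                                  # 10 random trials
    cand = {k: rng.choice(v) for k, v in SEARCH_SPACE.items()}
    score = train_and_score(int(cand["units"]), float(cand["lr"]))
    if np.isfinite(score) and (best is None or score < best[1]):
        best = (cand, score)

print("best config:", best[0], "val MSE:", round(best[1], 4))
```

A grid search would replace the random sampling loop with an exhaustive sweep over every (units, lr) pair; the performance estimation step stays the same.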
Figure: Dimensions of NAS methods [8]
AutoKeras: An AutoML system based on Keras, developed by the DATA Lab at Texas A&M University. The goal of AutoKeras is to make machine learning accessible to everyone; it provides libraries that implement Neural Architecture Search. [5]
The Weather Research and Forecasting (WRF) Model is a next-generation mesoscale numerical weather prediction system designed for both atmospheric research and operational forecasting applications.[3]
It features two dynamical cores, a data assimilation system, and a software architecture supporting parallel computation and system extensibility.
For researchers, WRF can produce simulations based on actual atmospheric conditions (i.e., from observations and analyses) or idealized conditions.
In our study, the input data comes from the WRF model. For this study, we focus on two major variables from the WRF model: temperature and pressure.
We will attempt to emulate the exact temperature and pressure data from the historical data.
The initial data is in the NetCDF file format.
A major part of data loading and preprocessing involves extracting the useful data from the NetCDF files, pickling it as a dictionary, and storing it on our server.
NetCDF (network Common Data Form) is a file format for storing multidimensional scientific data (variables) such as temperature, humidity, pressure, wind speed, and direction.
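A minimal sketch of the pickling step described above, assuming the NetCDF variables have already been read into NumPy arrays (e.g. with the netCDF4 package, not shown); the variable names, file name, and zero-filled arrays are illustrative.

```python
import pickle
import numpy as np

# Stand-ins for arrays extracted from a NetCDF file (shapes as reported below).
data = {
    "latitude": np.zeros((29, 29)),
    "longitude": np.zeros((29, 29)),
    "temperature": np.zeros((29, 29, 29)),
    "pressure": np.zeros((29, 29, 29)),
}

# Pickle the dictionary so later runs can skip NetCDF parsing entirely.
with open("may2019.pkl", "wb") as f:
    pickle.dump(data, f)

# Reload for training.
with open("may2019.pkl", "rb") as f:
    restored = pickle.load(f)

print(restored["temperature"].shape)   # (29, 29, 29)
```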
For the initial experiment, we extracted one month (May 2019) of data from the overall year and a half of data.
Latitude shape: (29, 29)
Longitude shape: (29, 29)
Temperature shape: (29, 29, 29)
Pressure shape: (29, 29, 29)
So at a single point in time and at a single latitude/longitude grid point, we have 29 values of temperature, one per vertical level.
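To make the indexing concrete, here is a synthetic array with the temperature shape above; the dimension order (level, latitude index, longitude index) and the chosen grid point are assumptions for illustration.

```python
import numpy as np

# Synthetic temperature field with the shape reported above.
temperature = np.arange(29 * 29 * 29, dtype=float).reshape(29, 29, 29)

# Fix one horizontal grid point (one latitude/longitude index pair):
i, j = 10, 12
column = temperature[:, i, j]   # all 29 vertical levels at that point

print(column.shape)   # (29,)
```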
Planetary Boundary Layer (PBL): The lowest part of the atmosphere, whose behavior is directly influenced by its contact with the planetary surface.
Potential Temperature: Unlike regular temperature, potential temperature is not affected by the physical lifting or sinking associated with flow over obstacles or large-scale atmospheric turbulence.
Perturbation Potential Temperature: In WRF output, the variable T is the perturbation potential temperature, i.e. the difference between the total potential temperature and a constant reference value of 300 K. We will emulate the perturbation potential temperature.
According to the WRF users manual, the formula to convert perturbation potential temperature to total potential temperature in K is:
Total potential temperature (K) = T + 300, where T is the perturbation potential temperature. [4]
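The conversion above is a constant offset, so a small helper suffices; the function name is illustrative, and the formula is the one given above.

```python
import numpy as np

def total_potential_temperature(perturbation_theta):
    """Convert WRF perturbation potential temperature (variable T)
    to total potential temperature in kelvin: theta = T + 300."""
    return np.asarray(perturbation_theta) + 300.0

print(total_potential_temperature(12.5))   # 312.5
```

It accepts scalars or whole NumPy arrays, so an entire (29, 29, 29) field can be converted in one call.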
The WRF-Chem experiment used the YSU scheme and ran from 00:00 UTC 1 Jan 2018 to 31 May 2019 over the United States at 2.5 x 2.5 degree resolution with 29 vertical levels, on a 5N-70N, 160W-32W domain, with hourly output.
We used WRF-Chem output from multiple locations in the United States, trained using data from May 2019.
Locations included in this study are Lamont, OK; Florida; and the University of Maryland, Baltimore County.
The models trained are all univariate RNN models, so the input and output variable is the same.
Model Architecture:
Layers: 2 bidirectional LSTM layers, 1 dense layer
Number of Neurons: 29
Activation Function: Tanh
Epochs: 100
Max Trials: 50
Evaluation Metrics:
R squared: 0.96
RMSE: 0.38
Training Time: 15 Min
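The architecture above can be written out in plain Keras as a sketch; the input window length (24 steps) is an assumption, while the layer types, 29 units per layer, and tanh activation follow the configuration listed.

```python
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    # 24-step input window is assumed; 29 features = one per vertical level.
    layers.Input(shape=(24, 29)),
    layers.Bidirectional(layers.LSTM(29, activation="tanh",
                                     return_sequences=True)),
    layers.Bidirectional(layers.LSTM(29, activation="tanh")),
    layers.Dense(29),   # one output per vertical level
])
model.compile(optimizer="adam", loss="mse")
print(model.output_shape)   # (None, 29)
```

In the experiments, AutoKeras discovered this layout via its search; the sketch simply reproduces the reported result.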
Model Architecture:
Layers: 2 bidirectional GRU layers, 1 flatten layer, 1 dense layer
Number of Neurons: 29
Activation Function: Tanh
Epochs: 1000
Max Trials: 100
Evaluation Metrics:
R squared: 0.96
RMSE: 18.5
Training Time: 30 Min
Model Architecture:
Layers: 2 bidirectional LSTM layers, 1 dense layer
Number of Neurons: 29
Activation Function: Tanh
Epochs: 100
Max Trials: 50
Evaluation Metrics:
R squared: 0.98
RMSE: 0.58
Training Time: 4 Min
Model Architecture:
Layers: 2 bidirectional GRU layers, 1 dense layer
Number of Neurons: 29
Activation Function: Tanh
Epochs: 1000
Max Trials: 100
Evaluation Metrics:
R squared: 0.98
RMSE: 10.55
Training Time: 50 Min
Model Architecture:
Layers: 2 bidirectional LSTM layers, 1 dense layer
Number of Neurons: 29
Activation Function: Tanh
Epochs: 100
Max Trials: 50
Evaluation Metrics:
R squared: 0.98
RMSE: 0.42
Training Time: 5 Min
Model Architecture:
Layers: 2 bidirectional GRU layers, 1 dense layer
Number of Neurons: 29
Activation Function: Tanh
Epochs: 100
Max Trials: 100
Evaluation Metrics:
R squared: 0.98
RMSE: 57.3
Training Time: 12 Min
References:
[1] Lilian Weng: Neural Architecture Search, Lil'Log, Aug 6, 2020. https://lilianweng.github.io/lil-log/2020/08/06/neural-architecture-search.html
[2] Afshine Amidi and Shervine Amidi: Recurrent Neural Networks cheatsheet, Stanford CS 230. https://stanford.edu/~shervine/teaching/cs-230/cheatsheet-recurrent-neural-networks
[3] About the WRF-Chem model, National Center for Atmospheric Research. https://www2.acom.ucar.edu/wrf-chem
[4] Agnes Mika: Perturbation Potential Temperature, wrf-users mailing list. https://mailman.ucar.edu/pipermail/wrf-users/2010/001896.html; https://en.wikipedia.org/wiki/Potential_temperature#Potential_temperature_perturbations
[5] Haifeng Jin, Qingquan Song, and Xia Hu: "Auto-Keras: An Efficient Neural Architecture Search System." Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, ACM, 2019.
[6] George Seif: Everything You Need to Know About AutoML and Neural Architecture Search, Towards Data Science. https://towardsdatascience.com/everything-you-need-to-know-about-automl-and-neural-architecture-search-8db1863682bf
[7] Arjun Ghosh: The Fundamentals of Neural Architecture Search (NAS), Towards AI. https://towardsai.net/p/machine-learning/the-fundamentals-of-neural-architecture-search-nas
[8] Thomas Elsken, Jan Hendrik Metzen, and Frank Hutter: Neural Architecture Search: A Survey. https://arxiv.org/pdf/1808.05377.pdf