GPT 4 Climate Data

A new era has arrived. See how ChatGPT transforms climate data analysis.

Are we becoming so lazy that we no longer wish to write code or even click webpage buttons? Maybe, but luck might be on our side. ChatGPT's new plugin could facilitate this laid-back approach to climate data analysis. Let's take a glimpse at how it works.

I will demonstrate an example of plotting and analyzing the Kaplan Sea Surface Temperature Anomaly (SSTA) dataset. First, download the NetCDF data (sst.mean.anom.nc as shown in the image). Second, upload it to ChatGPT4's Advanced Data Analysis plugin. I will show you the input (prompt) and output, skipping the GPT dialog. My first prompt was "produce a horizontal global map for January sst". ChatGPT then uses Python to perform the analysis.

First, upload the data file "sst.mean.anom.nc", which is in NetCDF format.
GPT output:

Here's a horizontal global map showing the sea surface temperature (SST) anomalies for the first January in the dataset. This map uses a basic grid layout without the detailed geographic projections that cartopy would provide.

Clearly, the map lacks any geographical reference. So I said to it:

GPT output:

The map now includes a geographical outline, showing the coastlines and continental boundaries, along with the sea surface temperature (SST) anomalies for January. This provides better context for the spatial distribution of the anomalies relative to the land masses. Next, I told it to...

And it knows the coordinates of the Niño 3.4 region 😚!
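For reference, the standard Niño 3.4 box spans 5°S to 5°N and 170°W to 120°W. The code ChatGPT actually ran is not shown here, so the following is only a minimal sketch of how such a regional average could be computed. The function name `box_mean`, the 0-360 longitude convention, and the synthetic 5-degree grid are my assumptions, not details taken from the session.

```python
import numpy as np

def box_mean(field, lat, lon, lat_bounds, lon_bounds):
    """Area-weighted mean of field[time, lat, lon] over a lat/lon box.

    Longitudes are assumed to run 0-360 degrees east; NaNs (land) are ignored.
    """
    lat_mask = (lat >= lat_bounds[0]) & (lat <= lat_bounds[1])
    lon_mask = (lon >= lon_bounds[0]) & (lon <= lon_bounds[1])
    sub = field[:, lat_mask, :][:, :, lon_mask]
    # cos(latitude) weights, broadcast to the shape of the sub-box
    w = np.cos(np.deg2rad(lat[lat_mask]))[None, :, None] * np.ones_like(sub)
    # Give missing points zero weight so they do not bias the mean
    w = np.where(np.isnan(sub), 0.0, w)
    return np.nansum(sub * w, axis=(1, 2)) / w.sum(axis=(1, 2))

# Synthetic stand-in for a gridded SST anomaly field (not the real Kaplan data)
lat = np.arange(-87.5, 90.0, 5.0)
lon = np.arange(2.5, 360.0, 5.0)
rng = np.random.default_rng(0)
ssta = rng.normal(size=(24, lat.size, lon.size))

# Nino 3.4: 5S-5N, 170W-120W, which is 190E-240E on a 0-360 grid
nino34 = box_mean(ssta, lat, lon, (-5.0, 5.0), (190.0, 240.0))
print(nino34.shape)
```

If the file stores longitudes as -180 to 180 instead, the box bounds would be (-170, -120).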

GPT output:

The time series plot above illustrates the Nino3.4 region average sea surface temperature (SST) anomalies from January 1950 to December 2020. The horizontal dashed line at zero indicates the baseline for the anomalies. Next, I said...

GPT output:

On the left is the Niño 4 SST anomalies time series, and on the right is the power spectrum with the periods converted to years. Each plot now clearly displays the respective data in a side-by-side format for easy comparison. 
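If you want to check a spectrum like this yourself, a plain FFT periodogram is one simple way to do it. This is a sketch under my own assumptions (monthly sampling, a synthetic series with a built-in 4-year cycle, and the hypothetical helper `power_spectrum`); it is not necessarily the method ChatGPT chose.

```python
import numpy as np

def power_spectrum(series, samples_per_year=12):
    """Return (period_years, power) from a simple FFT periodogram."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    power = np.abs(np.fft.rfft(x)) ** 2 / x.size
    # Frequencies in cycles per year, because the sample spacing is given in years
    freq = np.fft.rfftfreq(x.size, d=1.0 / samples_per_year)
    # Skip the zero frequency, whose period is infinite
    return 1.0 / freq[1:], power[1:]

# Synthetic monthly series with a 4-year cycle plus noise (illustrative only)
rng = np.random.default_rng(1)
t = np.arange(12 * 60) / 12.0
series = np.sin(2 * np.pi * t / 4.0) + 0.3 * rng.normal(size=t.size)

period, power = power_spectrum(series)
print(f"dominant period: {period[np.argmax(power)]:.1f} years")
```

Converting frequency to period is what puts the x-axis in years, as in the right-hand panel described above.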

And next, something exciting happened, as I said: 

GPT output:

And this magically happened. It now displays the global correlation map of SSTAs with the Niño 4 region. The map shows the Pearson correlation coefficient values across the globe, indicating how SSTAs in different locations correlate with those in the Niño 4 region. Areas with positive values (reds) are positively correlated, while areas with negative values (blues) are negatively correlated.
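There is no real magic behind a map like this: it is the Pearson correlation between one index time series and the time series at every grid point. Here is a minimal sketch with synthetic data; the helper name `correlation_map` and the array shapes are assumptions for illustration.

```python
import numpy as np

def correlation_map(index, field):
    """Pearson r between index[time] and every grid point of field[time, lat, lon]."""
    idx = index - index.mean()
    anom = field - field.mean(axis=0)
    # Sum over the time axis to get the covariance numerator at each grid point
    cov = np.tensordot(idx, anom, axes=(0, 0))
    denom = np.sqrt((idx ** 2).sum() * (anom ** 2).sum(axis=0))
    return cov / denom

# Synthetic example: 120 time steps on a tiny 6 x 8 grid (illustrative only)
rng = np.random.default_rng(2)
index = rng.normal(size=120)
field = rng.normal(size=(120, 6, 8))
field[:, 2, 3] = index  # one grid point that follows the index exactly

r_map = correlation_map(index, field)
print(r_map.shape, round(float(r_map[2, 3]), 3))
```

Grid points that contain NaNs (for example, land) simply come out as NaN in the resulting map.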

You can even do some crazy things. I asked it to "hatch the area where the correlation coefficients are significant at p<.05", and here is what it produced! 

GPT output:

The global correlation map and areas where the correlation coefficients are significant at p<0.05 are indicated with dots. This visualization accurately reflects the relationship between global SST anomalies and those in the Niño 4 region, with significant correlations highlighted.
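How might the significance mask be built? I did not see the exact test ChatGPT used, so the sketch below swaps in the Fisher z approximation, which needs only NumPy. It assumes the samples are independent; SST anomalies are autocorrelated, so the effective sample size is smaller and this test is on the generous side. The function name `significant_mask` is hypothetical.

```python
import numpy as np

def significant_mask(r, n_samples, z_crit=1.96):
    """True where |r| passes a two-sided test at about p < 0.05 (Fisher z approximation)."""
    r = np.clip(r, -0.999999, 0.999999)  # avoid infinities at |r| = 1
    z = np.arctanh(r) * np.sqrt(n_samples - 3)
    return np.abs(z) > z_crit

# Four example correlation coefficients, each computed from 100 samples
r = np.array([[0.05, 0.25], [-0.30, 0.10]])
mask = significant_mask(r, n_samples=100)
print(mask)
```

With matplotlib, a boolean mask like this is commonly overlaid on the correlation map through the `hatches` argument of `contourf`.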


Next, performing time series analysis...



In the following example, I'll demonstrate basic time series processing using a text file of snow water equivalent (SWE) data for Northern Utah, courtesy of Matthew.

The example data file is a one-column text entry, which looks like the following.

Next, I uploaded the text data to ChatGPT4, shown below as utah_swe.txt. I then told it to plot the data.

ChatGPT was able to plot it, as shown below.

Then, I gave GPT further instructions as follows:

Next, I wanted to see the power spectrum of this time series. So I prompted:

And GPT generated this power spectral analysis, alongside an interpretation.

To make the figure more telling, specifically for its decadal variability, I said:

And beautifully, it produced a spectrum that shows the known dominant decadal-scale variability of SWE in this region.

You can further instruct GPT to create multi-panel, overlaid figures by telling it "Overlay a 8-16 year bandpass filtered line on this time series. Overlay the two time series by using the SWE with its mean removed. Add another plot of SWE's power spectral analysis." Then GPT created this:
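The 8-16 year bandpass in that prompt can be approximated in several ways. The sketch below uses a simple FFT mask rather than a proper filter design such as Butterworth, and I do not know which one ChatGPT picked. It assumes one value per year; the synthetic series and the name `fft_bandpass` are mine.

```python
import numpy as np

def fft_bandpass(series, min_period, max_period, dt=1.0):
    """Keep only variability with periods between min_period and max_period.

    The mean is removed first, so the result is an anomaly series.
    """
    x = np.asarray(series, dtype=float)
    spec = np.fft.rfft(x - x.mean())
    freq = np.fft.rfftfreq(x.size, d=dt)
    keep = (freq >= 1.0 / max_period) & (freq <= 1.0 / min_period)
    # Zero every Fourier coefficient outside the band, then transform back
    return np.fft.irfft(np.where(keep, spec, 0.0), n=x.size)

# Synthetic annual series: a 12-year cycle plus a 3-year cycle (illustrative only)
years = np.arange(96)
swe = 20.0 + 5.0 * np.sin(2 * np.pi * years / 12.0) + 2.0 * np.sin(2 * np.pi * years / 3.0)

decadal = fft_bandpass(swe, min_period=8.0, max_period=16.0)
print(decadal[:3])
```

A hard spectral cutoff like this can ring near the ends of short records, so treat the first and last few years of the filtered line with some caution.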

Case study demo: 

Palm oil yield vs. climate anomaly

Based on actual research, I'll demonstrate how you can perform direct analysis using real-world data. 

First, I upload the data file in Excel format as follows:

The data contains 2003-2021 values of palm oil-related variables. Next, let's see how ChatGPT4 dissects the data. I will show my prompt and you can click the prompt to see the text response from ChatGPT4.

Here are the plots for each variable in the dataset, with the year on the x-axis.
Price seems to anticorrelate with yield, which makes sense.

Next, I uploaded the NetCDF file of station-based precipitation data from the Climate Prediction Center, with a grid spacing of 0.5-deg longitude and latitude.

ChatGPT understands this data:

And then, it started to do magic:

Wow... we now suspect that the product value of palm oil in Southern Thailand may be linked to increased precipitation, meaning that climate change may have a role in its economy.

Next, let's find out whether there is a seasonal lag between the accumulated precipitation and palm oil yield or value.

We now know that the most relevant precipitation months for palm oil value are the previous late-spring through the current summer.
Now we see a strong inter-decadal signal in the precipitation's variability.

So, does the increased precipitation affect palm oil's yield and value? Given the inter-decadal variation, further attribution research is needed.

And it can be your homework, if you so choose.



Next: Understanding the Sea Ice Concentration (SIC) variability and its relationship with common climate modes.


Let's assume we have a spreadsheet containing monthly data on Sea Ice Concentration (SIC) along with other climate indices such as the Arctic Oscillation (AO), Oceanic Niño Index (ONI), North Pacific Oscillation (NPO), and East Atlantic pattern (EAT). These are arranged month by month over a period of 40 years, as illustrated here. I have uploaded the data as demonstrated next, and provided the following instruction (prompt).

Here are the plots for September Sea Ice Concentration (SIC) across different years:

Left Plot: This shows the raw SIC data. It depicts the actual measurements without any modifications.

Right Plot: This is the detrended SIC data. Detrending removes long-term trends from the data, highlighting variations that might be obscured by the trend.

There appears to be an inter-decadal trend in the linearly detrended SIC, so I added:

The updated plots now include the detrended Sea Ice Concentration (SIC) data, from which a second-order polynomial fit has been removed. This detrending method using a polynomial fit allows for the removal of more complex trends than those possible with simple linear detrending, potentially revealing different aspects of the underlying variability in the sea ice concentration data.
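Removing a second-order polynomial takes only a few lines, so it is easy to verify what ChatGPT did. Here is a minimal sketch with a synthetic September series; the numbers and the helper name `poly_detrend` are made up for illustration.

```python
import numpy as np

def poly_detrend(years, values, order=2):
    """Subtract a least-squares polynomial fit of the given order."""
    t = np.asarray(years, dtype=float)
    t = t - t.mean()  # centre the time axis to keep the fit well conditioned
    coeffs = np.polyfit(t, values, order)
    return values - np.polyval(coeffs, t)

# Synthetic 40-year series with an accelerating decline plus noise (not real SIC)
rng = np.random.default_rng(3)
years = np.arange(1980, 2020)
elapsed = years - 1980
sic = 7.0 - 0.05 * elapsed - 0.001 * elapsed ** 2 + 0.3 * rng.normal(size=years.size)

dsic = poly_detrend(years, sic, order=2)
print(round(float(dsic.mean()), 6))
```

Setting `order=1` reproduces ordinary linear detrending, which makes it easy to compare the two residual series side by side.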

Next, to examine its relationships with other indices of climate modes, let's consider:

The revised plot now includes both the cross-correlation of the detrended Sea Ice Concentration (dSIC) for September with other variables (AO, ONI, NPO, EAT) and the autocorrelation of dSIC with September SIC values of each year.

Well, the results suggest that September Sea Ice Concentration (SIC) fluctuations aren't really tied to these climate modes, no matter the season. This is backed up by the low correlation coefficients seen across the various months.
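A month-by-month check like this is straightforward to reproduce. The sketch below assumes each index has been reshaped to a years-by-12 array and that the detrended September SIC has one value per year. The data are random noise and the function name `monthly_correlations` is hypothetical.

```python
import numpy as np

def monthly_correlations(index_by_month, target):
    """Correlate target[year] with each calendar-month column of index_by_month[year, 12]."""
    return np.array(
        [np.corrcoef(index_by_month[:, m], target)[0, 1] for m in range(12)]
    )

# Synthetic 40-year example: an index and a detrended September series (pure noise)
rng = np.random.default_rng(4)
ao = rng.normal(size=(40, 12))
dsic_sep = rng.normal(size=40)

r_by_month = monthly_correlations(ao, dsic_sep)
print(np.round(r_by_month, 2))
```

As a rough yardstick, with 40 independent years a correlation needs to exceed about 0.31 in absolute value to pass a two-sided test at p<0.05.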

To examine further, I uploaded the NetCDF file of Kaplan SST Anomaly Data from NOAA (see the first example on top). And here's my prompt:

The output correlation map now visualizes the relationship between the detrended September SIC and the SST anomalies during the June-July-August (JJA) period. This map can provide insights into how regional variations in summer SST are correlated with the subsequent sea ice concentration in September.

Since the summer season leading up to September's Sea Ice Concentration (SIC) doesn't display a particularly strong correlation pattern, let's examine its concurrent month, that is, with September Sea Surface Temperature (SST).

Nice! Now the subtropical North Pacific really lights up, showing some potentially meaningful Sea Surface Temperature (SST) correlations. It kind of looks like Nino4, but check it out — it's off the equator by 15 degrees to the north.

PMM - Pacific Meridional Mode.

This pattern is not an El Nino, given its lack of equatorial signals. It might resemble the PMM pattern, though.

Zoomed in on the North Pacific

Unsure? Now, you can ask ChatGPT again, and this is really amazing -- Check it out!

Crazy, isn't it? 🙀

Up to now, we're a bit more confident that September Sea Ice Concentration (SIC) variability might be tied to the Pacific Meridional Mode (PMM). So, to drive this point home, let's whip up a schematic graph using ChatGPT's DALL-E plugin.

A research proposal was born :-)

This AI graphic highlights the north-south variability in SST across the Pacific and depicts atmospheric circulation patterns associated with the PMM. Additionally, it integrates imagery representing the declining sea ice at the poles, indicating the impact of climate change on sea ice.

Soil moisture data

Next, let's try the CPC global soil moisture data from a NetCDF file. Note that this dataset is trickier because it uses a Gaussian grid, whose latitudes are not evenly spaced, rather than a regular rectangular grid.
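To see why a Gaussian grid is trickier, it helps to generate one. Its latitudes are the arcsines of the Gauss-Legendre quadrature nodes, so their spacing is not uniform, and the quadrature weights double as area weights. The sketch below uses 94 latitudes purely as an illustrative choice; I have not checked the actual grid size of this file.

```python
import numpy as np

def gaussian_latitudes(n_lat):
    """Latitudes (degrees, south to north) and area weights of an n_lat Gaussian grid."""
    nodes, weights = np.polynomial.legendre.leggauss(n_lat)
    return np.degrees(np.arcsin(nodes)), weights

# 94 latitudes is an assumed size for illustration only
lat, w = gaussian_latitudes(94)
spacing = np.diff(lat)
print(f"spacing ranges from {spacing.min():.3f} to {spacing.max():.3f} degrees")

# Area-weighted global mean of a zonal-mean profile (synthetic values)
profile = np.cos(np.deg2rad(lat))
global_mean = np.sum(profile * w) / np.sum(w)
print(round(float(global_mean), 4))
```

In practice you would read the latitudes straight from the NetCDF file; the point is only that code assuming a constant latitude step will misplace the data.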


Then, it was able to create the global map of soil moisture distribution:

Initially, GPT missed half of the globe, so after I reminded it, a correct map was produced.

Well, I did talk to GPT like I'm talking to a graduate student 😈. Old habits die hard. But after a few corrective instructions, GPT produced the desired time series:

GPT encountered a lot of errors but eventually could grasp the concept and produced the time series.

Next, let's do some analysis.

GPT was able to show that a predominant decadal cycle exists in the soil moisture data over South Korea:

There are a lot more tricks to be learned. Check out How to use ChatGPT for data analysis and research – Beginners Guide


There are numerous ways to make the most of ChatGPT's Advanced Data Analysis plugin. Here's just one example to illustrate its capabilities. Feel free to explore further and join the exciting technological journey of the 21st century.


Explore more on your own: