Image Collection: A collection of geospatial images representing various Earth observations, such as land cover and land surface temperature.
Band Selection: The process of selecting specific bands or variables within a geospatial image for analysis.
Region of Interest (ROI): A specific area or location within a study area that is of interest for analysis or investigation.
Geospatial Sampling: Sampling data points or pixels within a region of interest to extract information for analysis.
Data Transformation: Transforming raw data into a more usable format or converting units, such as converting temperature units from Kelvin to Celsius.
Data Visualization: Creating visual representations of geospatial data to better understand patterns and trends, such as scatter plots and maps.
Map Visualization: Displaying geospatial data layers on a map interface for visualization and analysis.
Layer Visualization Parameters: Setting parameters such as color palette and range for visualizing geospatial data layers.
Basemap Addition: Adding a basemap layer to provide context and reference for the geospatial data layers.
Legend and Colorbar: Adding legends and color bars to maps to provide interpretation of the visualized data.
An Intro to the Earth Engine Python API - MODIS (LST, Land Cover) + USGS DEM
At this point you should know what MODIS is and be familiar with some of its major products:
You should also know what a DEM is, how it is collected, and how you get elevation from it.
You should also be proficient at looking at the GEE catalogue info and getting relevant info about the dataset you are using:
Example: The description page of the collection tells us that the band holding the daytime LST is named LST_Day_1km and is in units of Kelvin. In addition, its stored values range from 7,500 to 65,535, with a scale factor of 0.02.
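As a quick check of how these catalogue numbers fit together, the short sketch below applies the 0.02 scale factor to the two ends of the stored value range. The constant and function names are illustrative only and are not part of the Earth Engine API.

```python
# Scale factor and stored value range quoted from the catalogue entry.
SCALE_FACTOR = 0.02
RAW_MIN, RAW_MAX = 7500, 65535


def raw_to_kelvin(raw_value):
    """Convert a stored LST_Day_1km value to Kelvin using the scale factor."""
    return SCALE_FACTOR * raw_value


kelvin_min = raw_to_kelvin(RAW_MIN)
kelvin_max = raw_to_kelvin(RAW_MAX)
print(f"Stored range {RAW_MIN} to {RAW_MAX} covers "
      f"{kelvin_min:.1f} K to {kelvin_max:.1f} K")
```

In other words, the stored integers only become temperatures in Kelvin after they are multiplied by the scale factor.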
A note on PROJECTIONS:
These datasets differ in resolution, frequency, and possibly projection. They range from daily images at 1 km resolution for LST (hence an ee.ImageCollection, that is, a collection of several ee.Images) to a single image representing data for the year 2000 at 30 m resolution for elevation (ELV). While we need to keep an eye on the frequency, GEE takes care of resolution and projection by resampling and reprojecting all the data we work with to a common projection (learn more about projections in Earth Engine). We can define the resolution (called scale in GEE) whenever necessary, and we also have the option to force no reprojection.
Please note that you have run a query as part of this tutorial. Namely, you have done a spatial query by pulling data out of the imagery for specific locations.
Note that you have also applied a filter, i.e. you filtered the collection by date.
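The two operations just mentioned, filtering by date and querying at a location, might look roughly like the sketch below. It is an outline rather than the tutorial's exact code: the collection ID, the dates, the coordinates, and the helper name sample_lst_at_point are assumptions chosen for illustration. The function only works in a session where the earthengine-api package is installed and ee.Initialize() has already succeeded.

```python
def sample_lst_at_point(lon, lat, start_date, end_date, scale=1000):
    """Return daytime LST rows for one point (hypothetical helper).

    Requires an authenticated Earth Engine session.
    """
    import ee  # imported here so the function can be defined without the package

    # Assumed collection ID; check the Data Catalogue for the current version.
    lst = ee.ImageCollection('MODIS/061/MOD11A1')

    # Band selection followed by the date filter.
    lst = lst.select('LST_Day_1km').filterDate(start_date, end_date)

    # Spatial query: pull the values at the point, at `scale` metres.
    point = ee.Geometry.Point(lon, lat)
    return lst.getRegion(point, scale).getInfo()


# Example call with placeholder coordinates near Chicago and placeholder dates.
# It is left commented out because it needs Earth Engine credentials.
# rows = sample_lst_at_point(-87.63, 41.88, '2017-01-01', '2020-01-01')
```

The result of such a call is a list of rows whose first row holds the column names, which is the shape that the pandas helper in this guide expects.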
You should know how to get lat/long points from Google Maps
You have learned how to look at a time series of data from a point and view the data that are produced to evaluate whether the data are impacted by clouds:
Next you have learned that you can take this time series and put it into a table (a dataframe) using "pandas". Pandas is a powerful Python library designed for data manipulation and analysis. The code is annotated below so you can get an idea of what is happening. You would not have to write this code, only alter it for your own use.
1. Import Pandas:
import pandas as pd
This line brings in the pandas library, used to work with dataframes.
2. Define Function:
def ee_array_to_df(arr, list_of_bands):
This creates a function named ee_array_to_df that takes two inputs:
arr: The data pulled out of Earth Engine, typically the list of rows returned by a call such as getRegion(...).getInfo(). The first row holds the column names.
list_of_bands: Names of the different data layers in the image.
3. Convert Array to Dataframe:
df = pd.DataFrame(arr)
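This turns arr (a list of rows) into a dataframe. At this stage each row is one observation (a pixel at a given time) and each column is a field such as longitude, latitude, time, or a band. The column names are still sitting in the first row.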
4. Fix the Headers:
headers = df.iloc[0]
df = pd.DataFrame(df.values[1:], columns=headers)
The first row holds the column names rather than data, so we pull it out as headers. Then we rebuild the dataframe from the remaining rows and use the extracted row as the column names.
5. Keep Rows with Data:
df = df[['longitude', 'latitude', 'time', *list_of_bands]].dropna()
We keep only the longitude, latitude, and time columns plus all the bands you listed, then drop any row that has a missing value in one of those columns.
6. Convert Values to Numbers:
for band in list_of_bands:
    df[band] = pd.to_numeric(df[band], errors='coerce')
This loop makes sure all the values in each data layer (band) are numbers. If a value can't be converted, it's replaced with NaN (Not a Number).
7. Convert Time to Date & Time:
df['datetime'] = pd.to_datetime(df['time'], unit='ms')
Earth Engine reports time in milliseconds since 1 January 1970 (the Unix epoch), so unit='ms' turns it into a proper date and time format for easier handling.
8. Pick Useful Columns:
df = df[['time','datetime', *list_of_bands]]
We only keep the columns we need: time, datetime, and all the data layer names.
9. Return the Dataframe:
return df
Finally, the function gives back the cleaned and formatted dataframe, ready for further analysis!
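Putting the nine steps together gives the function below. To make it runnable without an Earth Engine account, it is followed by a call on a tiny hand-written array. The three data rows are invented placeholders shaped like the input described above (a header row followed by one row per observation), not real MODIS measurements.

```python
import pandas as pd


def ee_array_to_df(arr, list_of_bands):
    """Turn a list of rows (header row first) into a tidy pandas DataFrame."""
    df = pd.DataFrame(arr)

    # Use the first row as column names and the rest as data.
    headers = df.iloc[0]
    df = pd.DataFrame(df.values[1:], columns=headers)

    # Keep the columns of interest and drop rows with missing values.
    df = df[['longitude', 'latitude', 'time', *list_of_bands]].dropna()

    # Make sure every band holds numbers.
    for band in list_of_bands:
        df[band] = pd.to_numeric(df[band], errors='coerce')

    # Convert milliseconds since the Unix epoch into datetimes.
    df['datetime'] = pd.to_datetime(df['time'], unit='ms')

    # Keep only the columns needed for a time series.
    df = df[['time', 'datetime', *list_of_bands]]
    return df


# Placeholder input: one header row plus three invented observations.
sample_arr = [
    ['id', 'longitude', 'latitude', 'time', 'LST_Day_1km'],
    ['2017_01_01', -87.63, 41.88, 1483228800000, 13650],
    ['2017_01_02', -87.63, 41.88, 1483315200000, None],   # e.g. a cloudy day
    ['2017_01_03', -87.63, 41.88, 1483401600000, 13700],
]
sample_df = ee_array_to_df(sample_arr, ['LST_Day_1km'])
print(sample_df)
```

The row with the missing band value is dropped, so the resulting dataframe has two rows and the columns time, datetime, and LST_Day_1km.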
You also learned how to convert the MODIS values from Kelvin to the more familiar Celsius using information from the Data Catalogue (the scale factor):
def t_modis_to_celsius(t_modis):
"""Converts MODIS LST units to degrees Celsius."""
t_celsius = 0.02*t_modis - 273.15
return t_celsius
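To see the conversion on actual numbers, the sketch below repeats the function and applies it to a few stored values. The input values are made up for illustration and are not taken from the tutorial's data.

```python
def t_modis_to_celsius(t_modis):
    """Converts MODIS LST units to degrees Celsius."""
    t_celsius = 0.02 * t_modis - 273.15
    return t_celsius


# Made-up stored values, for illustration only.
raw_values = [13650, 14500, 15000]

# Round to two decimals to hide floating-point noise.
celsius_values = [round(t_modis_to_celsius(v), 2) for v in raw_values]
print(celsius_values)
```

A stored value of 15000, for example, is 300 K after scaling, which is 26.85 degrees Celsius. The same function can also be applied to a whole dataframe column, for instance with the pandas apply method.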
DON'T worry about the generation and plotting of the fitting curves. I am most concerned that you can generate a time series and plot the data points. This is the relevant code:
# Import the plotting library (this is where plt comes from).
import matplotlib.pyplot as plt
# Subplots.
fig, ax = plt.subplots(figsize=(14, 6))
# Add scatter plots.
ax.scatter(lst_df_urban['datetime'], lst_df_urban['LST_Day_1km'],
           c='black', alpha=0.2, label='Urban (data)')
ax.scatter(lst_df_rural['datetime'], lst_df_rural['LST_Day_1km'],
           c='green', alpha=0.35, label='Rural (data)')
# Add some parameters.
ax.set_title('Daytime Land Surface Temperature Near Chicago', fontsize=16)
ax.set_xlabel('Date', fontsize=14)
ax.set_ylabel('Temperature [C]', fontsize=14)
ax.set_ylim(-20, 50)
ax.grid(lw=0.2)
ax.legend(fontsize=14, loc='lower right')
plt.show()
Here is an explanation of what each line of this code does:
fig, ax = plt.subplots(figsize=(14, 6)): This line creates a figure containing a single plotting area. It returns a figure object (fig) and an axes object (ax). The figsize=(14, 6) argument sets the width and height of the figure to 14 inches by 6 inches.
ax.scatter(lst_df_urban['datetime'], lst_df_urban['LST_Day_1km'], c='black', alpha=0.2, label='Urban (data)'): This line adds a scatter plot to the axes (ax). It plots the values of 'LST_Day_1km' (daytime land surface temperature) from the DataFrame lst_df_urban against the corresponding 'datetime' values. The points are colored black (c='black'), with an opacity of 0.2 (alpha=0.2), and labeled as 'Urban (data)'.
ax.scatter(lst_df_rural['datetime'], lst_df_rural['LST_Day_1km'], c='green', alpha=0.35, label='Rural (data)'): This line adds another scatter plot to the same axes (ax). It plots the values of 'LST_Day_1km' from the DataFrame lst_df_rural against the corresponding 'datetime' values. The points are colored green (c='green'), with an opacity of 0.35 (alpha=0.35), and labeled as 'Rural (data)'.
ax.set_title('Daytime Land Surface Temperature Near Chicago', fontsize=16): This line sets the title of the plot to 'Daytime Land Surface Temperature Near Chicago' with a fontsize of 16.
ax.set_xlabel('Date', fontsize=14): This line sets the label for the x-axis to 'Date' with a fontsize of 14.
ax.set_ylabel('Temperature [C]', fontsize=14): This line sets the label for the y-axis to 'Temperature [C]' (temperature in Celsius) with a fontsize of 14.
ax.set_ylim(-20, 50): This line sets the limits of the y-axis from -20 to 50.
ax.grid(lw=0.2): This line adds grid lines to the plot with a linewidth of 0.2.
ax.legend(fontsize=14, loc='lower right'): This line adds a legend to the plot with a fontsize of 14 and locates it at the lower right corner.
plt.show(): This line displays the plot.
Overall, this code creates a plot with two scatter plots representing daytime land surface temperature data for urban and rural areas near Chicago, with appropriate labels, title, and legend.
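If you want to try the plotting code without first running the Earth Engine queries, a minimal sketch like the one below can stand in. It builds two small placeholder dataframes with the same column names (datetime and LST_Day_1km) and then repeats the plotting calls. The temperature values are a made-up seasonal curve, not real measurements, and the two-degree urban offset is an arbitrary choice for illustration.

```python
import math

import matplotlib.pyplot as plt
import pandas as pd

# Placeholder data: a smooth seasonal cycle, NOT real MODIS measurements.
dates = pd.date_range('2017-01-01', periods=36, freq='MS')
seasonal = [15 + 15 * math.sin(2 * math.pi * (i - 3) / 12) for i in range(36)]
lst_df_rural = pd.DataFrame({'datetime': dates, 'LST_Day_1km': seasonal})
lst_df_urban = pd.DataFrame({'datetime': dates,
                             'LST_Day_1km': [t + 2 for t in seasonal]})

# Same plotting calls as in the tutorial code.
fig, ax = plt.subplots(figsize=(14, 6))
ax.scatter(lst_df_urban['datetime'], lst_df_urban['LST_Day_1km'],
           c='black', alpha=0.2, label='Urban (data)')
ax.scatter(lst_df_rural['datetime'], lst_df_rural['LST_Day_1km'],
           c='green', alpha=0.35, label='Rural (data)')
ax.set_title('Daytime Land Surface Temperature Near Chicago', fontsize=16)
ax.set_xlabel('Date', fontsize=14)
ax.set_ylabel('Temperature [C]', fontsize=14)
ax.set_ylim(-20, 50)
ax.grid(lw=0.2)
ax.legend(fontsize=14, loc='lower right')
# Call plt.show() here to display the figure when running as a script.
```

Swapping the placeholder dataframes for the ones produced from your own points gives the real time series plot.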
The last part of this code creates a map with the 3 layers represented (elevation, LST, Land Cover). This map also has legends that represent these 3 layers. This mapping is something that you have seen in previous exercises so please make sure you understand how this mapping works and how to implement it.