My wife is a music concert tour manager, so it's crucial for her to check the weather conditions in the cities she will visit.
For example, if she goes on a tour in the U.S. for over a month, she will face different weather conditions in different cities. Sometimes the season even changes during her stay in a single city.
I’ve always thought having a weather dashboard would greatly help her, especially when packing clothes before leaving and deciding what to wear daily during the tour.
(Also, it would help her check on her colleagues going on tours in cities outside the U.S., such as Bangkok, Jakarta, and Manila.)
Weather information requires regular updates, and it would be ideal to refresh the dashboard as frequently as possible.
This weather dashboard idea started as a way to showcase my Tableau skills; however, achieving a certain level of automation became another goal of this self-directed data analysis project.
So, this project has two parts:
Establishing the source data for the Tableau dashboard by extracting, transforming, and loading the Open-Meteo weather data (and automating the process)
Creating the Tableau dashboard
1-1. Extracting Weather Forecast Data from API
The Open-Meteo API is a free and reliable data source that offers hourly weather forecasts for any geographical coordinate.
I'm going to use R to extract, transform, and load the weather data from Open-Meteo, with Google BigQuery as the place to store it.
I have selected 16 cities in North America and Asia that my wife and her team frequently visit. We need each city's latitude and longitude to send a request to the Open-Meteo API.
Let's start by creating a data frame in R that defines the cities with their latitude and longitude.
# Define cities with latitude & longitude
cities <- data.frame(
city = c("New York", "Chicago", "Atlanta", "San Antonio", "San Francisco", "Los Angeles", "Mexico City", "Seoul", "Tokyo", "Osaka", "Singapore", "Manila", "Jakarta", "Bangkok", "Macau", "Taipei"),
latitude = c(40.7128, 41.8781, 33.753746, 29.4241, 37.7749, 34.0522,
19.4326, 37.5665, 35.6828, 34.6937, 1.3521,
14.5995, -6.2088, 13.7563, 22.1987, 25.0330),
longitude = c(-74.0060, -87.6298, -84.386330, -98.4936, -122.4194, -118.2437,
-99.1332, 126.9780, 139.7595, 135.5023, 103.8198,
120.9842, 106.8456, 100.5018, 113.5439, 121.5654),
utc_offset = c(-5, -6, -5, -6, -8, -8, -6, 9, 9, 9, 8, 8, 7, 7, 8, 8), # to convert UTC to local time
utc_offset_dst = c(-4, -5, -4, -5, -7, -7, -6, 9, 9, 9, 8, 8, 7, 7, 8, 8) # applying Daylight Saving Time to U.S. cities
)
You can see "utc_offset" and "utc_offset_dst" variables in addition to latitudes and longitudes.
I need those 'time-offseting' values because the weather data we collect from the Open-Meteo API is based on Coordinated Universal Time(UTC). However, we want to display the local time of each city on our weather dashboard like the map below.
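To make the idea concrete, here is a minimal sketch of how such an offset could be applied to a timestamp in R. This is only an illustration (the actual conversion is covered in Part 2), and the use of the lubridate package here is my own assumption:
# Illustration only: shift a UTC timestamp by a numeric hour offset
# (assumes the lubridate package; the real conversion is shown in Part 2)
library(lubridate)
utc_time <- as.POSIXct("2025-03-01 06:00", tz = "UTC")
utc_time + hours(9)  # prints "2025-03-01 15:00:00 UTC"; the clock reading now matches Seoul local time (UTC+9)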
The Open-Meteo API will provide us with hourly weather data for 7 consecutive days starting with the exact date we request. As you already know, I'm going to pull the data for 16 cities. If our data extraction is successful, there will be 24 * 7 * 16 = 2688 rows in the "all_weather" data frame.
The code below handles extracting the weather data from the Open-Meteo API. It uses the variables in the "cities" data frame to build the "all_weather" data frame: for each city, it calls the API to retrieve 7 days of hourly weather forecast data, using the httr and jsonlite packages in R.
# Load the packages used for the API calls and data wrangling
library(httr)       # GET(), status_code(), content()
library(jsonlite)   # fromJSON()
library(dplyr)      # %>% pipe and bind_rows()

# Initialize an empty data frame to store all results
all_weather <- data.frame()

for (i in 1:nrow(cities)) {
  city_name <- cities$city[i]
  lat <- cities$latitude[i]
  lon <- cities$longitude[i]
  utc_offset <- cities$utc_offset[i]
  utc_offset_dst <- cities$utc_offset_dst[i]

  # API request URL (hourly temperature and precipitation; 7-day forecast by default)
  url <- paste0("https://api.open-meteo.com/v1/forecast?latitude=", lat,
                "&longitude=", lon, "&hourly=temperature_2m,precipitation")

  # Fetch API response
  response <- GET(url)

  # Convert JSON response to R object
  if (status_code(response) == 200) { # check if request was successful
    data <- content(response, as = "text") %>% fromJSON()
    weather_data <- data$hourly

    weather_df <- data.frame(
      city = city_name,
      lat = lat,
      long = lon,
      timestamp = as.POSIXct(weather_data$time, format = "%Y-%m-%dT%H:%M", tz = "UTC"),
      temperature = as.numeric(weather_data$temperature_2m),
      precipitation = as.numeric(weather_data$precipitation),
      timezone_offset = utc_offset,
      dst_offset = utc_offset_dst
    )

    # Append to master data frame - all_weather
    all_weather <- bind_rows(all_weather, weather_df)
  } else {
    message("Failed to fetch data for ", city_name)
  }
}
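This check isn't required for the dashboard; it's just a quick, optional way to verify that the pull matches the 24 * 7 * 16 = 2688 figure mentioned above:
# Optional sanity check on the extracted data
nrow(all_weather)            # should be 2688 (16 cities x 7 days x 24 hours)
dplyr::glimpse(all_weather)  # inspect column names and data types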
1-2. Transforming the Data for Accuracy and Compatibility with Tableau
Let's revisit some of the code from the extraction step. You can see that I performed several transformation steps there, using the "dplyr" and "lubridate" libraries.
Warning: this is not new code. I'm just pulling out part of the extraction code above to show how I transform the data during extraction.
weather_df <- data.frame(
city = city_name,
lat = lat,
long = lon,
timestamp = as.POSIXct(weather_data$time, format="%Y-%m-%dT%H:%M", tz="UTC"),
temperature = as.numeric(weather_data$temperature_2m),
precipitation = as.numeric(weather_data$precipitation),
timezone_offset = utc_offset,
dst_offset = utc_offset_dst
)
The first focus was data types. Establishing data types that would work well in Tableau is essential. I converted API timestamps into R’s POSIXct format to support proper time-based operations.
The second was to handle local times in advance. We discussed this when we created the "utc_offset" and "utc_offset_dst" variables in the "cities" data frame before extraction. Now they appear as "timezone_offset" and "dst_offset" (only the names have changed). You will see exactly how they work in Part 2 of this project.
1-3. Loading to Google BigQuery for Scalability and Tableau Integration
It doesn't have to be Google BigQuery; I chose it because it was the only database I had access to at the moment. (Tableau also supports Google Sheets as a source data type, so that is another option.)
I've been using R to extract and transform the data, and I will also use it to load the data into Google BigQuery. Then I'll have the Windows Task Scheduler run this R code (so, extraction + transformation + load in a single script) whenever I want, to keep the weather data updated daily.
# Load the necessary packages
library(bigrquery)
library(DBI)

# Download your Google Cloud service account JSON key,
# then point the environment variable at it in R
Sys.setenv(GOOGLE_APPLICATION_CREDENTIALS = "C:/Users/yohan/Documents/R/Weather Forecast Dashboard in Tableau 2025/yohan-geocoding-cd6bc68e792e.json")

# Set the project, dataset, and table
project_id <- "yohan-geocoding"   # use your actual project ID
dataset_id <- "weather_data"      # the dataset in BigQuery
table_id   <- "all_weather"       # the table name in BigQuery

# Define the full table reference
table_ref <- bq_table(project_id, dataset_id, table_id)

# Upload the data to BigQuery
bq_table_upload(
  x = table_ref,
  values = all_weather,
  write_disposition = "WRITE_TRUNCATE"  # options: WRITE_APPEND or WRITE_TRUNCATE
)
The code above consists of three parts:
I used the bigrquery and DBI packages to authenticate securely with BigQuery using a service account key loaded via an environment variable.
I then defined the target project, dataset (weather_data), and table (all_weather).
Each time the R script runs, it uploads a fresh dataset to BigQuery using WRITE_TRUNCATE mode to ensure clean and up-to-date data.
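If you want to double-check that the upload landed, a quick query through the DBI interface works well. This is just an optional sketch that reuses the project_id and dataset_id defined above; it isn't part of the scheduled script:
# Optional: verify the upload by counting the rows in the BigQuery table
con <- dbConnect(
  bigrquery::bigquery(),
  project = project_id,
  dataset = dataset_id
)
dbGetQuery(con, "SELECT COUNT(*) AS n_rows FROM all_weather")  # expect 2688
dbDisconnect(con)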
Now, let's set up the Windows Task Scheduler to automatically run the R code we've written whenever we want.
First, I need to create a batch file (.bat file). This file tells Windows to launch R and run the script. So, open Notepad, write the code shown below, and save it as a .bat file. When you double-click this file, it should run your R script.
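As a rough sketch, the batch file can be as short as the lines below. The Rscript.exe path and the script path are placeholders I'm using for illustration, so replace them with your own R installation path and the location of your ETL script:
@echo off
REM Placeholder paths for illustration: point these at your own Rscript.exe and ETL script
"C:\Program Files\R\R-4.4.1\bin\Rscript.exe" "C:\path\to\weather_etl.R"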
Next, I'm going to the Windows Task Scheduler so that the computer can run this .bat file at the time I set. I want the code to run and update our data table on Google BigQuery every time I log on to my computer, so the trigger should be "When I log on."
Windows Task Scheduler settings: Run a .bat file when I log on to the computer.
Below is what you'll see on your screen whenever you log on to your computer: the computer automatically runs the R script for extracting, transforming, and loading (so, ETL) the weather data from the Open-Meteo API, resulting in an up-to-date data table on Google BigQuery.
In the next part, you'll see the "all_weather" data table uploaded to Google BigQuery as a result of our ETL efforts in this part.