Publish AI, ML & data-science insights to a global community of data professionals.

Watching Storms from Space: A Python Script for Creating an Amazing View

Creating an animation from geostationary satellite images to dynamically monitor storms in real-time using Python and Google Colab

Table of Contents

  1. 🌟 Introduction
  2. 🌐 Geostationary Operational Environmental Satellite (GOES)
  3. πŸ” GOES Database in AWS
  4. ⏳ Download GOES NetCDFs
  5. πŸ“Š Visualization of GOES Images
  6. πŸŽ₯ Timelapse of the Storm
  7. πŸ“„ Conclusion
  8. πŸ“š References

🌟 Introduction

If you are keeping up with the news, one of the breaking stories right now is about the powerful storms known as the Atmospheric River set to hit California and affect central and southern regions. An atmospheric river is a narrow channel of winds that transports concentrated water vapor (moisture) from the tropics. The upcoming storm is expected to bring heavy rainfall and significant flash flooding to parts of the state.

Last night, while checking my Twitter account (now called X), I came across numerous tweets about the storm, along with colorful maps projecting and simulating the amount of rainfall over California. These maps quickly became my motivation to write this story.

In this post, we will explore how to download images captured by Geostationary satellites and visualize them using Python. Since we can download a sequence of images with a relatively short time interval, we will expand our code to display these images like an animation (timelapse). By the end of reading this post, I hope you feel that you are at a fixed point in space, watching the Earth from top to bottom!


🌐 Geostationary Operational Environmental Satellite (GOES)

The Geostationary Operational Environmental Satellite (GOES) is a series of weather satellites operated by the United States National Oceanic and Atmospheric Administration (NOAA) and the National Aeronautics and Space Administration (NASA). These satellites are positioned over specific locations, allowing for constant monitoring of changing weather conditions in particular regions.

GOES plays a crucial role in monitoring phenomena such as hurricanes, tornadoes, and storms. In terms of spatial and temporal resolution, the latest generations of GOES (GOES-R) with the Advanced Baseline Imager (ABI) provide imagery with several spectral bands (Visible, Near Infrared, Microwave, and Longwave) at resolutions ranging from 0.5 to 2 kilometers, capturing data every 5 to 15 minutes over the Contiguous United States (CONUS) region.


πŸ” GOES Database in AWS

GOES images are accessible on Amazon Web Services S3 (refer to this [link](https://noaa-goes18.s3.amazonaws.com/index.html)), and you can explore the bucket through this link. In this post, we will be focusing on ABI-L1b-RadC, where "ABI" stands for Advanced Baseline Imager, "L1b" represents Level 1b data, which is calibrated and geolocated data. "RadC" stands for Radiance product for CONUS. You can find detailed descriptions of other products at the following link:

https://docs.opendata.aws/noaa-goes16/cics-readme.html

According to the information provided in that link, as mentioned in the "cisc-readme.htm" file, the real-time feed and full historical archive of original (ABI) radiance data (Level 1b) are freely available on Amazon S3 for anyone to use.

⏳ Download GOES NetCDFs

In this step, we will learn how to connect to the S3 bucket, retrieve a list of available files, filter the GOES netCDF files, and download the netCDF files. Let’s begin by setting up the Google Colab environment, installing only one library, and importing the required libraries:

pip install s3fs
# Import libraries
import os
import s3fs
import numpy as np
from datetime import datetime
import xarray as xr
import matplotlib.pyplot as plt

Next, we will create three folders in our main directory (/content/ in my case) to store GOES netCDF files, "JPG" files of GOES files, and the timelapse file in the "Animation" folder.

# Create the folder names list
folder_names = ["GOES", "JPG", "Animation"]

# Iterate through the folder names and create the folders
for folder_name in folder_names:
    # Check if the folder doesn't exist, then create it
    if not os.path.exists(folder_name):
        os.makedirs(folder_name)
        print(f"Created folder: {folder_name}")
    else:
        print(f"Folder {folder_name} already exists")

GOES_dir=os.path.join('/content','GOES')
JPG_dir =os.path.join('/content','JPG')
Animation_dir =os.path.join('/content','Animation')

If you explore the AWS S3 bucket for GOES images, you can see that GOES-18 images for ABI-L1b-RadC are saved in a structure like this for each hour: noaa-goes18/ABI-L1b/RadC/Year/DOY/Hour.

According to this structure, we need to specify the bucket name, product level, and the hour for downloading the netCDF files:

# Define the bucket and product
bucket = 'noaa-goes18' # other options: 'noaa-goes16', 'noaa-goes17', 'noaa-goes18'
product = 'ABI-L1b-RadC'

# Define the hour, day of the year and year
hour = 22
doy = 31    # day of the year
year = 2024

Keep in mind that you can use GOES-16 and GOES-17 in the bucket name as well, but GOES-18 essentially replaced GOES-17 in January. There are no images available for GOES-17 after January 10th.

Now it’s time to connect to the S3 bucket and get a list of files available in this directory noaa-goes18/ABI-L1b/RadC/Year/DOY/Hour:

# Connect to the S3 file system
fs = s3fs.S3FileSystem(anon=True)

# Define the master URL
master_url = f'{bucket}/{product}/{year}/{doy:03.0f}/{hour:02.0f}/'
print(master_url)

# Get the URLs for different channels
urls = fs.ls(master_url)

If you print "urls", you can see there are many files in that folder. However, for the visualization part, we are only interested in three channels: red, blue, and green. It’s important to note that GOES doesn’t capture images in the green band. The good news is that there is a way to synthetically generate the green band based on the blue, red, and NIR bands, which I will cover later. To filter the list for these three channels (blue, red, and NIR), we can select the files that have "C01" for blue, "C02" for red, and "C03" for the NIR band in their names:

urls_C01 = []
urls_C02 = []
urls_C03 = []

urls_C01 = [url for url in urls if url.endswith('.nc') and 'C01' in url]
urls_C02 = [url for url in urls if url.endswith('.nc') and 'C02' in url]
urls_C03 = [url for url in urls if url.endswith('.nc') and 'C03' in url]

# Sort the URLs
urls_C01.sort()
urls_C02.sort()
urls_C03.sort()

I have sorted them to ensure that the order of the images is correct for creating the time-lapse part.

In this step, we will start downloading the first netCDF file in each list (each channel) and save them in our "GOES" folder:

# Define the paths for different channels
path_C01 = os.path.join(GOES_dir,urls_C01[0].split("/")[-1])
path_C02 = os.path.join(GOES_dir,urls_C02[0].split("/")[-1])
path_C03 = os.path.join(GOES_dir,urls_C03[0].split("/")[-1])

# Download the data for different channels
print("Starting...")

fs.download(urls_C01[0], path_C01)
print(f"{urls_C01[0]} has been downloaded")

fs.download(urls_C02[0], path_C02)
print(f"{urls_C02[0]} has been downloaded")

fs.download(urls_C03[0], path_C03)
print(f"{urls_C03[0]} has been downloaded")

if you follow this step correctly, you have these files in your GOES folder:

Snapshot of the netCDF files saved in the GOES folder. Image by the author.
Snapshot of the netCDF files saved in the GOES folder. Image by the author.

πŸ“Š Visualization of GOES Images

Now that we have the netCDF files for blue, red, and NIR, we can read them using xarray and extract the data as a NumPy array using the following:

# Open the downloaded data using xarray
Blue = xr.open_dataset(path_C01)
Red = xr.open_dataset(path_C02)
Nir = xr.open_dataset(path_C03)

# Extract the data arrays
Blue_array = Blue.Rad.data
Red_array = Red.Rad.data
Nir_array = Nir.Rad.data

As I mentioned earlier, the raw data in ABI-L1b-RadC is in radiance, and it is better to convert them to reflectance for standardizing the image and ensuring consistency. This can be achieved by reading the kappa values stored in the array. Multiplying the kappa by the radiance values will yield the reflectance values. Since reflectance values should fall within the range of 0 and 1, there might be instances where, after the conversion, some pixels become negative or above 1. In such cases, we will clip the values to ensure they are within the valid range of 0 to 1:

# Calculate the reflectance values
Blue_reflec = Blue.kappa0.data * Blue_array
Red_reflec = Red.kappa0.data * Red_array
Nir_reflec = Nir.kappa0.data * Nir_array

# Clip the reflectance values between 0 and 1
Blue_reflec_clip = np.clip(Blue_reflec, 0, 1)
Red_reflec_clip = np.clip(Red_reflec, 0, 1)
Nir_reflec_clip = np.clip(Nir_reflec, 0, 1)

After converting the raw data to reflectance, we can apply gamma correction to adjust the brightness. Feel free to change this parameter after visualizing the image. Usually, a lower gamma yields a darker image, while a higher gamma results in a brighter image.

# Apply gamma correction
gamma = 3
Blue_gamma = np.power(Blue_reflec_clip, 1/gamma)
Red_gamma = np.power(Red_reflec_clip, 1/gamma)
Nir_gamma = np.power(Nir_reflec_clip, 1/gamma)

Now, all three bands have been converted to reflectance and corrected for brightness based on the gamma factor. If you compare the xarray of the blue and NIR bands to the red band, you’ll notice that the red band has different dimensions (6000 x 10000) compared to the blue or NIR bands (3000 x 5000). In other words, the red band has a higher resolution (0.5 km) compared to the blue and NIR bands (1 km). To fix this inconsistency between spatial resolutions, we can upscale the red band to have the same resolution as the blue and NIR bands. This step is necessary as we aim to create a stacked image of red, green, and blue channels to visualize it in true color. In the following lines, we define a function to aggregate the red band:

# Define a function to downsample the array
def downsample_array(input_array):
    # Calculate the dimensions of the output array
    output_rows = input_array.shape[0] // 2
    output_cols = input_array.shape[1] // 2

    # Reshape the input array into a 4D array, then compute the mean along the last two dimensions
    reshaped_array = input_array.reshape(output_rows, 2, output_cols, 2)
    output_array = reshaped_array.mean(axis=(1, 3))

    return output_array

# Downsample the Red_gamma array
Red_gamma_reshaped = downsample_array(Red_gamma)

As I mentioned earlier, GOES doesn’t have the green band, and one approach to creating a synthetic green channel is with an empirical equation based on the blue, red, and NIR bands. In the following lines, we use the coefficients mentioned in this paper to create the green band. Next, we will create the stacked image based on the blue, synthetic green, and red bands:

# Calculate the Green channel using the Red, Blue and Nir channels
# // Bah, M. K., Gunshor, M. M., & Schmit, T. J. (2018). Generation of GOES-16 true color imagery without a green band. Earth and Space Science, 5, 549–558. https://doi.org/10.1029/2018EA000379
G_gamma = 0.45 * Red_gamma_reshaped + 0.1 * Nir_gamma + 0.45 * Blue_gamma

# Combine the Red, Green and Blue channels to create the RGB array for the true color image
RGB = np.dstack([Red_gamma_reshaped, G_gamma, Blue_gamma])

Now that we have the stacked RGB of the GOES image, we can plot them by the following line:

# Plot the true color image
fig = plt.figure(figsize=(15, 12))

plt.imshow(RGB)

# Get the scan start time and format it
scan_start = datetime.strptime(Blue.time_coverage_start, '%Y-%m-%dT%H:%M:%S.%fZ')
Time = scan_start.strftime('%d %B %Y %H:%M')

# Add title and axis labels to the plot
plt.title(f'GOES-18 in True Color (RGB) : {Time}', fontweight='bold', fontsize=20)
plt.axis('off')

# Show the plot
plt.show()

The saved image will be:

The output image based on GOES-18 in true color: January 31st, 2024 at 22:00 UTC (2:00 pm local time)
The output image based on GOES-18 in true color: January 31st, 2024 at 22:00 UTC (2:00 pm local time)

Can you see the Pineapple Express storm approaching the West Coast in this image? Let’s create an animation of this storm to visualize how it moves across the ocean in the next section.


πŸŽ₯ Timelapse of the Storm

In this step, we will download more images captured by GOES for this storm and place each image in one frame to observe the movement of the storm over a couple of hours. The steps required are similar to the previous sections with some minor changes. After installing and importing the libraries as explained in the previous section, you need to modify the hours in the script. We will download the images captured for this storm from 16:00 UTC to 23:00 UTC or 8:00 AM to 3:00 PM local time.

# Define the bucket and product
bucket = 'noaa-goes18' # other options: 'noaa-goes16', 'noaa-goes17', 'noaa-goes18'
product = 'ABI-L1b-RadC'

# Define the hour, day of the year and year
hours = range(16,24)
doy = 31    # day of the year
year = 2024

We will get a list of available images for those hours and sort them using these:

# Connect to the S3 file system
fs = s3fs.S3FileSystem(anon=True)

urls_C01 = []
urls_C02 = []
urls_C03 = []

for hour in hours:
  # Define the master URL
  master_url = f'{bucket}/{product}/{year}/{doy:03.0f}/{hour:02.0f}/'
  print(master_url)

  # Get the URLs for different channels
  urls = fs.ls(master_url)

  urls_C01_hour = [url for url in urls if url.endswith('.nc') and 'C01' in url]
  urls_C02_hour = [url for url in urls if url.endswith('.nc') and 'C02' in url]
  urls_C03_hour = [url for url in urls if url.endswith('.nc') and 'C03' in url]

  urls_C01.extend(urls_C01_hour)
  urls_C02.extend(urls_C02_hour)
  urls_C03.extend(urls_C03_hour)

# Sort the URLs
urls_C01.sort()
urls_C02.sort()
urls_C03.sort()

Since GOES-18 captures images every 5 minutes (12 images per hour), and there will be a large number of images from 8:00 am to 3:00 pm (7 12 = 84 images), we will reduce this by downloading images captured every 15 minutes (4 images per hour) resulting in a total of (7 4 = 28 images). This number of images is sufficient for our purpose:

for i in range(0,len(urls_C01),4):
  path_C01 = os.path.join(GOES_dir,urls_C01[i].split("/")[-1])
  path_C02 = os.path.join(GOES_dir,urls_C02[i].split("/")[-1])
  path_C03 = os.path.join(GOES_dir,urls_C03[i].split("/")[-1])

  print("Starting...")
  fs.download(urls_C01[i], path_C01)
  print(f"{urls_C01[i]} has been downloaded")
  fs.download(urls_C02[i], path_C02)
  print(f"{urls_C02[i]} has been downloaded")
  fs.download(urls_C03[i], path_C03)
  print(f"{urls_C03[i]} has been downloaded")

In this step, you should have these netCDF files in your GOES folder:

Snapshot of the netCDF files saved in the GOES folder. Image by the author.
Snapshot of the netCDF files saved in the GOES folder. Image by the author.

Similar to the previous step, we define a function to resample the red band as it has a higher resolution than the blue and NIR bands. Also, we will define a function for plotting that includes all the corrections we discussed (radiance to reflectance, adjusting brightness, etc.).

# Define a function to downsample the array
def downsample_array(input_array):
    # Calculate the dimensions of the output array
    output_rows = input_array.shape[0] // 2
    output_cols = input_array.shape[1] // 2

    # Reshape the input array into a 4D array, then compute the mean along the last two dimensions
    reshaped_array = input_array.reshape(output_rows, 2, output_cols, 2)
    output_array = reshaped_array.mean(axis=(1, 3))

    return output_array

def plot_GOES(path_C01,path_C02,path_C03,jpg_folder):
  # Open the downloaded data using xarray
  Blue = xr.open_dataset(path_C01)
  Red = xr.open_dataset(path_C02)
  Nir = xr.open_dataset(path_C03)

  # Extract the data arrays
  Blue_array = Blue.Rad.data
  Red_array = Red.Rad.data
  Nir_array = Nir.Rad.data

  # Calculate the reflectance values
  Blue_reflec = Blue.kappa0.data * Blue_array
  Red_reflec = Red.kappa0.data * Red_array
  Nir_reflec = Nir.kappa0.data * Nir_array

  # Clip the reflectance values between 0 and 1
  Blue_reflec_clip = np.clip(Blue_reflec, 0, 1)
  Red_reflec_clip = np.clip(Red_reflec, 0, 1)
  Nir_reflec_clip = np.clip(Nir_reflec, 0, 1)

  # Apply gamma correction
  gamma = 3
  Blue_gamma = np.power(Blue_reflec_clip, 1/gamma)
  Red_gamma = np.power(Red_reflec_clip, 1/gamma)
  Nir_gamma = np.power(Nir_reflec_clip, 1/gamma)

  # Downsample the Red_gamma array
  Red_gamma_reshaped = downsample_array(Red_gamma)

  # Calculate the Green channel using the Red and Nir channels
  # // Bah, Gunshor, Schmit, Generation of GOES-16 True Color Imagery without a Green Band, 2018. https://doi.org/10.1029/2018EA000379
  G_gamma = 0.45 * Red_gamma_reshaped + 0.1 * Nir_gamma + 0.45 * Blue_gamma

  # Combine the Red, Green and Blue channels to create the RGB array for the true color image
  RGB = np.dstack([Red_gamma_reshaped, G_gamma, Blue_gamma])

  # Plot the true color image
  fig = plt.figure(figsize=(15, 12))

  plt.imshow(RGB)

  # Get the scan start time and format it
  scan_start = datetime.strptime(Blue.time_coverage_start, '%Y-%m-%dT%H:%M:%S.%fZ')
  Time = scan_start.strftime('%d %B %Y %H:%M')

  # Add title and axis labels to the plot
  plt.title(f'GOES-18 in True Color (RGB) : {Time}', fontweight='bold', fontsize=20)
  plt.axis('off')

  # Save the plot as a JPEG file
  plt.savefig(os.path.join(f'/content/{jpg_folder}', scan_start.strftime('%Y_%m_%d_%H_%M') + '.jpg'), dpi=300, bbox_inches='tight')

  # Close the plot to free up memory
  plt.close(fig)

We can call these two functions in a loop to plot and save all the images downloaded and processed from GOES in our JPG folder.

for i in range(0,len(urls_C01),4):
  path_C01 = os.path.join(GOES_dir,urls_C01[i].split("/")[-1])
  path_C02 = os.path.join(GOES_dir,urls_C02[i].split("/")[-1])
  path_C03 = os.path.join(GOES_dir,urls_C03[i].split("/")[-1])

  # Call the function
  plot_GOES(path_C01,path_C02,path_C03,jpg_folder='JPG')

If you have followed these steps correctly, you should have these JPG files in your folder:

Snapshot of the images saved in the JPG folder. Image by the author.
Snapshot of the images saved in the JPG folder. Image by the author.

Now we can use the Pillow library in Python to create an animation from the series of images saved in our folder:

pip install pillow
from PIL import Image
import os
import glob

# Replace 'path/to/folder' with the path to your folder containing the jpg images
folder_path = '/content/JPG'
output_gif = os.path.join('/content/Animation', 'GOES_animation.gif')

# Get a sorted list of all jpg files in the folder
file_list = sorted(glob.glob(os.path.join(folder_path, '*.jpg')))

# Set the desired dimensions for the images
width, height = 1500, 1200

# Read the images and store them in a list
frames = []
for file in file_list:
    frame = Image.open(file)

    # Resize the image
    frame = frame.resize((width, height), Image.ANTIALIAS)

    # Add a white background to the image
    background = Image.new('RGB', frame.size, (255, 255, 255))
    background.paste(frame)
    frame = background.convert('RGB')

    # Convert the image to P mode with a global color table
    frame = frame.convert('P', palette=Image.ADAPTIVE, colors=256)
    frames.append(frame)

# Save the frames as an animated GIF
if frames:
    frames[0].save(
        output_gif,
        save_all=True,
        append_images=frames[1:],
        duration=500,  # Set the duration between frames in milliseconds
        loop=0,  # Set the number of loops (0 means infinite)
        optimize=True,
    )
else:
    print("No jpg files found.")

The output will be an animation, saved in your animation folder, showing the images captured by GOES in true color for the approaching Pineapple Express storm on the West Coast on January 31st, 2024:

πŸ“„ Conclusion

Monitoring weather phenomena, particularly storms and wildfires, is possible through GOES images covering the CONUS every 5 minutes. These images are valuable sources for monitoring these events and projecting them for the next few hours. In this post, we covered how to get access to these images, download and correct them, create a synthetic green band based on other bands, visualize them, and create a nice animation for monitoring one of these extreme events called the Pineapple Express Storm hitting the West Coast. I hope you enjoy reading this story, and feel free to reach out if you have any questions.


πŸ“š References

NOAA Geostationary Operational Environmental Satellites (GOES) 16, 17 & 18 was accessed on 2/3/2024 from https://registry.opendata.aws/noaa-goes

Bah, M. K., Gunshor, M. M., & Schmit, T. J. (2018). Generation of GOES-16 true color imagery without a green band. Earth and Space Science, 5, 549–558. https://doi.org/10.1029/2018EA000379


πŸ“± Connect with me on other platforms for more engaging content! LinkedIn, ResearchGate, Github, and Twitter.

Here are the relevant posts available through these links:

How to Download, Process, and Visualize Climate Data in Python

How to Download Satellite Images Without Writing a Single Line of Code

A Long Python Script to Make a Short Video Using Geospatial Big Data

Visualization of Geospatial Wildfire Smoke Data with Python (Google Colab)


Towards Data Science is a community publication. Submit your insights to reach our global audience and earn through the TDS Author Payment Program.

Write for TDS

Related Articles