Publish AI, ML & data-science insights to a global community of data professionals.

Quantifying Burned Areas from Wildfires Using Satellite Imagery

Determining the burned area in forests due to wildfires using Sentinel-2 images with Python in Google Colab

Sentinel-2 image captured over Walbridge fire area in RGB bands, visualized by the author
Sentinel-2 image captured over Walbridge fire area in RGB bands, visualized by the author
Sentinel-2 images with different bands captured over the Walbridge fire incident, visualized by the author
Sentinel-2 images with different bands captured over the Walbridge fire incident, visualized by the author

Table of Contents

  1. 🌟 Introduction
  2. πŸ”₯ Walbridge Wildfire in California
  3. 🏷 ️ Method
  4. πŸš€ Setup Google Colab
  5. πŸ›° ️ Load Clear Sentinel-2 Images
  6. πŸ”’ Count the Number of Vegetation Pixels
  7. πŸ“ˆ Plot the Timeseries of the Vegetation Area
  8. 🌍 Visualization of Sentinel-2 images for Wildfires
  9. πŸ“„ Conclusion
  10. πŸ“š References

🌟 Introduction

Several years ago, when most climate models predicted that more floods, heat waves, and wildfires would happen if we didn’t take the necessary steps, I didn’t expect these unusual catastrophic phenomena to become common events. Among these, wildfires destroy a huge amount of forest area yearly. If you search for a table of major wildfires in different places, you will find alarming statistics showing how much forest area is vanishing from our earth due to wildfires.

In this tutorial, I’ll combine stories that I’ve already published about downloading, processing satellite images, and visualizing wildfires to quantify the burned area of one of the major wildfires that happened in California. Similar to previous posts, I’ll use the Google Colab environment, and all code will be written in Python, so anyone can follow these steps without significant preparation (installing Python, creating environment variables, etc.). In the end, you can do the same calculation for any wildfire that you are interested in or compare the calculated values with those reported in official reports.

πŸ”₯ Walbridge Wildfire in California

Here we will focus on the Walbridge Fire, which started on August 17, 2020, in Sonoma County, California. This wildfire was part of one of the largest fires in California’s history, called the LNU Lightning Complex, which was caused by a lightning storm. It is reported that the Walbridge Fire burned almost 50,000 acres, while the LNU Lightning Complex burned a total of 363,000 acres. We will compare these numbers with our calculations at the end. I should mention that the Walbridge Fire was eventually contained in mid-September, and the LNU Complex was contained by October 2020.

🏷 ️ Method

To quantify the burned area due to the Walbridge Fire, we will start by loading the clear Sentinel-2 images captured of the Walbridge Fire and consider a buffer based on the fire perimeter to clip them from June 2020 to December 2020 (Step 1). We will ensure that the loaded images cover the entire fire perimeter (Step 2). Then, we will calculate the number of pixels classified as vegetation based on the scene layer of Sentinel-2 images in each image and create a dataframe showing the total vegetation pixels along with the image date (Step 3). Next, we will convert the total number of vegetation pixels to the total vegetation area using pixel sizes (Step 4). After finalizing the dataframe, we will be able to plot the time series and expect to see a significant drop in the area of vegetation between August and September, as the fire started in August, was contained in September, and burned the vegetation area (Step 5). By comparing the vegetation area pre- and post-wildfire, we can calculate the burned area and compare it with reported numbers in the media. At the end, we will visualize the Sentinel-2 images to map the wildfire (Step 6).

πŸš€ Setup Google Colab

To set up your Google Colab environment, you just need to install and import the following libraries:

pip install pystac_client
pip install --upgrade dask odc-stac

from pystac_client import Client
from odc.stac import load

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from datetime import datetime

Installing "pystac_client" and "dask" is necessary for loading satellite images captured over the wildfire location, and the rest of the libraries are used for manipulating dataframes, plotting time series, and visualizing satellite images.

πŸ›° ️ Load Clear Sentinel-2 Images

To implement step 1 described in the method section, I refer you to the tutorial I published on downloading clipped images for our area of interest (AOI), which is a buffer around the Walbridge Fire.

A Simple Way for Downloading Hundreds of Clipped Satellite Images Without Retrieving the Entire…

To consider the buffer and wildfire incident date, you just need to use the following parameters in the client.search function:


bbox= [-123.12,38.52,-122.88,38.70]
cloud_cover= (0, 10)
date="2020-06-01/2020-12-30"

After loading the data with those parameters, you should get a cubic dataset with 59 images:

data =load(search.items(),bbox=bbox , groupby = "solar_day", chunks ={})
The cubic dataset after loading clipped and clear sentinel-2 images, image by the author
The cubic dataset after loading clipped and clear sentinel-2 images, image by the author

πŸ”’ Count the Number of Vegetation Pixels

In this section, we will implement steps 2, 3, and 4, which are related to ensuring that each image covers our bounding box, calculating the vegetation area in each image, and exporting the results to a dataframe. The following function can cover all three steps:


def count_vegetation_pixels(data):
    vegetation_counts = []
    vegetation_areas = []
    coverage_ratios = []
    date_labels = []

    total_pixels = data.dims['y'] * data.dims['x']
    total_area_km2 = total_pixels * 10 * 10 / 1e6

    for t in range(len(data.time)):
        scl_image = data["scl"].isel(time=t).values
        date_string = pd.to_datetime(data.time.values[t]).strftime("%Y-%m-%d")
        print(date_string)

        vegetation_pixel_count = np.count_nonzero(scl_image == 4)
        vegetation_area_km2 = vegetation_pixel_count * 10 * 10 / 1e6

        valid_pixel_count = np.count_nonzero(np.isin(scl_image, np.arange(1, 12)))
        valid_area_km2 = valid_pixel_count * 10 * 10 / 1e6
        coverage_ratio = valid_area_km2 / total_area_km2

        if coverage_ratio < 0.8:
            continue

        vegetation_counts.append(vegetation_pixel_count)
        vegetation_areas.append(vegetation_area_km2)
        coverage_ratios.append(coverage_ratio)
        date_labels.append(date_string)

    data_dict = {
        'Date': pd.to_datetime(date_labels),
        'Coverage Ratio': coverage_ratios,
        'Vegetation Surface Area': vegetation_areas
    }

    df = pd.DataFrame(data_dict)

    return df
table= count_vegetation_pixels(data)

If you print the table, you should see the table below summarizing the images, coverage, and vegetation surface area in square kilometers:

The dataframe summarizing the images, coverage, and vegetation surface area in Km2, image by the author
The dataframe summarizing the images, coverage, and vegetation surface area in Km2, image by the author

The calculated vegetation surface area could be noisy since Sentinel-2 satellites are twin satellites (Sentinel-2A and Sentinel-2B), and there are minor differences in their sensor configurations as well as a +/- 10-meter position error in those images. To smooth the data, we check the changes in each two consecutive vegetation surface areas, and if the change is less than 15%, we remove that record. This can be done with the following script:

# Calculate the percentage change between consecutive rows 
table['pct_change'] = table['Vegetation Surface Area'].pct_change() * -100
rows_to_remove = table['pct_change'] > 15

# Remove these rows from the DataFrame
table_filtered = table[~rows_to_remove].drop(columns=['pct_change'])  
print(table_filtered)

If you are interested in learning more about the scene classification of Sentinel-2 and the coverage ratio, I refer you to this post:

Tracking The Great Salt Lake’s Shrinkage Using Satellite Images (Python)

πŸ“ˆ Plot the Timeseries of the Vegetation Area

Now that the small changes in the vegetation surface area have been removed and the dataframe has been smoothed, we can move to step 5 and plot a time series showing the vegetation surface area over time using the following script:

# Convert the 'date' column to datetime type
table_filtered['Date'] = pd.to_datetime(table_filtered['Date'])
table_filtered.set_index('Date', inplace=True)

# Plotting
plt.figure(figsize=(10, 6))  
plt.plot(table_filtered.index, table_filtered['Vegetation Surface Area'], marker='o', linestyle='-', color='b') 

plt.title('Vegetation Surface Area Over Time', fontsize=16)
plt.xlabel('Date', fontsize=14)
plt.ylabel('Vegetation Surface Area', fontsize=14)

plt.gca().xaxis.set_major_formatter(plt.matplotlib.dates.DateFormatter('%Y-%m-%d'))
plt.gca().xaxis.set_major_locator(plt.matplotlib.dates.AutoDateLocator())

plt.xticks(rotation=45)  
plt.tight_layout()  
plt.grid(True)  
plt.show()  

The output will be:

Time series of the vegetation surface area over time, image by the author
Time series of the vegetation surface area over time, image by the author

As shown in the graph, there is a significant drop in the vegetation surface area after mid-August, exactly when the Walbridge Fire occurred. You can see that the total vegetation surface area in June was about 375 sq km, while it was reduced to almost 200 sq km by the end of November, which means a reduction of 175 sq km in vegetation area, equivalent to 43,000 acres. This number is almost close to the 50,000 acres reported in the media.

Time series of the vegetation surface area over time along with Walbridge incident time, image by the author
Time series of the vegetation surface area over time along with Walbridge incident time, image by the author

🌍 Visualization of Sentinel-2 images for Wildfires

In addition to the time series, it would be interesting to take a look at the individual Sentinel-2 images captured over the Walbridge Fire (step 6). The following function will plot and save the RGB images that were clear and complete from August 1st to the end of August:

def process_rgb_images(data):
    numb_days = data["red"].time.size

    for t in range(numb_days):
        rgb_image = data[["red", "green", "blue"]].isel(time=t).to_array()
        dt = pd.to_datetime(rgb_image.time.values)

        year = dt.year
        month = dt.month
        day = dt.day

        if month != 8:
            continue

        date_string = f"{year}_{month:02d}_{day:02d}"
        date_string_title = f"{month:02d}/{day:02d}/{year}"

        green_band = rgb_image.sel(variable='green').values
        count_green_pixels = np.count_nonzero(green_band)
        total_pixels = green_band.size
        ratio = count_green_pixels / total_pixels

        if ratio > 0.7:
            fig, ax = plt.subplots()
            rgb_image.plot.imshow(robust=True, ax=ax)
            ax.set_title(f"Walbridge {date_string_title} by Sentinel-2")
            ax.axis('off')            
            plt.savefig(f"rgb_image_{date_string}.png")
            plt.close(fig)

process_rgb_images(data)

The RGB images will be:

As you can see, there is no trace of fire and smoke until August 15th. Both appear in images captured on August 20th and August 22nd. I’ve published a story below on how shortwave (SWIR) and near-infrared (NIR) bands can be useful to see fires below the smoke layers, as they can penetrate through the smoke.

Satellites Can See Invisible Lava Flows and Active Wildires, But How? (Python)

Let’s use those bands to see if we can detect fires beneath the smoke layer:

def process_swir_images(data):
    numb_days = data["swir22"].time.size

    for t in range(numb_days):
        swir_nir_image = data[["swir22","swir16","nir"]].isel(time=t).to_array()
        dt = pd.to_datetime(swir_nir_image.time.values)

        year = dt.year
        month = dt.month
        day = dt.day

        if month != 8:
            continue

        date_string = f"{year}_{month:02d}_{day:02d}"
        date_string_title = f"{month:02d}/{day:02d}/{year}"

        swir22_band = swir_nir_image.sel(variable='swir22').values
        count_swir22_pixels = np.count_nonzero(swir22_band)
        total_pixels = swir22_band.size
        ratio = count_swir22_pixels / total_pixels

        if ratio > 0.7:
            fig, ax = plt.subplots()
            swir_nir_image.plot.imshow(robust=True, ax=ax)
            ax.set_title(f"Walbridge {date_string_title} by Sentinel-2")
            ax.axis('off')
            plt.savefig(f"SWIR_NIR_image_{date_string}.png")
            plt.close(fig)

process_swir_images(data)

The sentinel-2 images with SWIR and NIR bands will be:

Sentinel-2 image captured over Walbridge fire area in SWIR and NIR bands, visualized by the author
Sentinel-2 image captured over Walbridge fire area in SWIR and NIR bands, visualized by the author

It is evident that with the combination of SWIR and NIR bands, we are able to see the hotspots (bright yellow and orange and red pixels are fires), particularly in the image captured on August 20th (3 days after the incident), while they were not visible in the same image using RGB bands.

πŸ“„ Conclusion

The scene classification layer of satellite images is a great source of information to track and monitor changes in land cover types. In this tutorial, we used the classified Sentinel-2 images to estimate the burned vegetation area due to one of the major fires that occurred in California in August 2020. The estimated burned area was about 43,000 acres compared to the 50,000 acres reported in the news. Additionally, we used the Sentinel-2 images with different band combinations to visualize the smoke and fires over the area. I hope you enjoyed reading this story, and feel free to reach out if you have any questions.

πŸ“š References

https://github.com/stac-utils/pystac-client/blob/main/docs/quickstart.rst

https://www.element84.com/earth-search/examples/

Copernicus Sentinel data [2024] for Sentinel data

Copernicus Service information [2024] for Copernicus Service Information.

πŸ“± Connect with me on other platforms for more engaging content! LinkedIn, ResearchGate, Github, and Twitter.

Here are the relevant posts available through these links:

Creating a Streamlit App for Satellite Imagery Visualization: A Step-by-Step Guide

Watching Storms from Space: A Python Script for Creating an Amazing View


Towards Data Science is a community publication. Submit your insights to reach our global audience and earn through the TDS Author Payment Program.

Write for TDS

Related Articles