Measuring Wildfire Change Over Time

Data

California wildfire data in CSV and shapefile format were drawn from the California State Geoportal at gis.data.ca.gov (https://gis.data.ca.gov/maps/e3802d2abf8741a187e73a9db49d68fe/about). The dataset provides information on all recorded wildfires in the state from 1950 to 2022, including their cause, start date, containment date, acreage, and geographic location and extent. The state boundary was drawn from US Census Bureau shapefiles (https://www.census.gov/cgi-bin/geo/shapefiles/index.php).

I first mapped the wildfire extents from the four most recent decades using the ggplot2 and sf (Simple Features) packages. The ca.gov dataset only includes data up to 2022, so the 2020s bin covers only three years (2020-2022). I filtered the shapefile by decade and plotted the fire multipolygons along with the state boundary.

library(ggplot2)
library(sf)

fires_all <- st_read("California_Fire_Perimeters_(all).shp")
## Reading layer `California_Fire_Perimeters_(all)' from data source 
##   `D:\RStudio Projects\geog490project\California_Fire_Perimeters_(all).shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 21926 features and 18 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -13848330 ymin: 3833204 xmax: -12705610 ymax: 5255380
## Projected CRS: WGS 84 / Pseudo-Mercator
fires_1990s <- fires_all[fires_all$DECADES == 1990, ]
fires_2000s <- fires_all[fires_all$DECADES == 2000, ]
fires_2010s <- fires_all[fires_all$DECADES == 2010, ]
fires_2020s <- fires_all[fires_all$DECADES == 2020, ]

stateshapes <- st_read("tl_2023_us_state.shp")
## Reading layer `tl_2023_us_state' from data source 
##   `D:\RStudio Projects\geog490project\tl_2023_us_state.shp' using driver `ESRI Shapefile'
## Simple feature collection with 56 features and 15 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -179.2311 ymin: -14.60181 xmax: 179.8597 ymax: 71.43979
## Geodetic CRS:  NAD83
cashape <- stateshapes[stateshapes$NAME == "California", ]

# The fire perimeters (Pseudo-Mercator) and the state boundary (NAD83) use
# different CRSs; geom_sf() draws all layers in a common CRS, so no manual
# st_transform() is needed here.
# Outline width is set with linewidth (ggplot2 >= 3.4.0; older versions use size).
ggplot() +
  geom_sf(data = fires_1990s, fill = "darkorange", color = "darkred") +
  geom_sf(data = cashape, fill = NA, color = "black", linewidth = 1) +
  ggtitle("Fire Extents: 1990s") +
  theme_minimal()

ggplot() +
  geom_sf(data = fires_2000s, fill = "darkorange", color = "darkred") +
  geom_sf(data = cashape, fill = NA, color = "black", linewidth = 1) +
  ggtitle("Fire Extents: 2000s") +
  theme_minimal()

ggplot() +
  geom_sf(data = fires_2010s, fill = "darkorange", color = "darkred") +
  geom_sf(data = cashape, fill = NA, color = "black", linewidth = 1) +
  ggtitle("Fire Extents: 2010s") +
  theme_minimal()

ggplot() +
  geom_sf(data = fires_2020s, fill = "darkorange", color = "darkred") +
  geom_sf(data = cashape, fill = NA, color = "black", linewidth = 1) +
  ggtitle("Fire Extents: 2020-2022") +
  theme_minimal()

The maps suggest that the area burned per decade is increasing. The 2020s map shows fewer fires in total, but a high number of very large ones. To visualize the increase in area burned over time, I made a bar chart of acreage burned per decade, in millions of acres, using filtered CSV data.

cafires_data <- read.csv("cafires.csv")

cafires_bar_data <- aggregate(GIS_ACRES ~ DECADES, data = cafires_data, sum)

cafires_bar_data <- cafires_bar_data[cafires_bar_data$DECADES %in% c(1990, 2000, 2010, 2020), ]


ggplot(cafires_bar_data, aes(x = factor(DECADES), y = GIS_ACRES)) +
  geom_bar(stat = "identity", aes(fill = ifelse(DECADES == 2020, "darkorange", "darkred")), width = 0.5) +
  labs(x = "Decades", y = "Millions of Acres", title = "Acres Burned by Decade") +
  scale_y_continuous(labels = function(x) format(x / 1e6, scientific = FALSE)) +
  scale_x_discrete(labels = c("1990", "2000", "2010", "2020-2022")) +  
  scale_fill_identity() + 
  theme_minimal()
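
Because the 2020-2022 bin covers only three years while the other bins cover ten, one way to compare decades on an equal footing is to divide each total by the number of years it spans. The sketch below assumes a data frame shaped like cafires_bar_data; the acreage values are placeholders for illustration only and are not taken from the dataset.

```r
# Placeholder decade totals in acres (illustrative only, NOT real values).
decade_totals <- data.frame(
  DECADES   = c(1990, 2000, 2010, 2020),
  GIS_ACRES = c(1e6, 2e6, 3e6, 4e6)
)

# Years of data in each bin: full decades have 10, the 2020-2022 bin has 3.
decade_totals$n_years <- ifelse(decade_totals$DECADES == 2020, 3, 10)

# Average acres burned per year within each bin.
decade_totals$acres_per_year <- decade_totals$GIS_ACRES / decade_totals$n_years

print(decade_totals)
```

Replacing the placeholder data frame with cafires_bar_data would give the annualized figures for the real data.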

This shows that the first three years of the 2020s saw more land burned than any of the three preceding decades, suggesting that the decade's total area burned will likely far surpass those of previous decades in the coming years. Plotting the same data by year instead of by decade shows the pattern at a finer resolution.

library(ggplot2)
cafires_data <- read.csv("cafires.csv")
# Drop records with no year first, then keep 1990-2022.
cafires_data_years <- cafires_data[!is.na(cafires_data$YEAR_), ]
cafires_data_years_filtered <- cafires_data_years[cafires_data_years$YEAR_ >= 1990 & cafires_data_years$YEAR_ <= 2022, ]

ggplot(cafires_data_years_filtered, aes(x = as.factor(YEAR_), y = GIS_ACRES)) +
  geom_bar(stat = "identity", fill = "darkred") +
  labs(x = "Year", y = "Millions of Acres") +
  scale_y_continuous(labels = function(x) format(x / 1e6, scientific = FALSE)) +  
  ggtitle("Acres Burned by Year") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

The yearly chart suggests that periods of one to three years of above-average burning tend to be followed by steep drop-offs lasting one to two years. This is consistent with the cyclical nature of droughts and with the way fire clears out areas of dry, built-up vegetation.
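
One way to make these drop-offs explicit is to total the acreage per year and compute the year-over-year change. The sketch below is a minimal example on a made-up data frame that reuses the YEAR_ and GIS_ACRES column names; the values are hypothetical and only illustrate the calculation.

```r
# Hypothetical fire records (illustrative only, NOT real values).
toy_fires <- data.frame(
  YEAR_     = c(2001, 2001, 2002, 2003, 2003, 2004),
  GIS_ACRES = c(100, 400, 900, 50, 150, 700)
)

# Total acres burned per year.
annual <- aggregate(GIS_ACRES ~ YEAR_, data = toy_fires, FUN = sum)

# Change from the previous year, in acres and as a percentage.
annual$change     <- c(NA, diff(annual$GIS_ACRES))
annual$pct_change <- 100 * annual$change / c(NA, head(annual$GIS_ACRES, -1))

print(annual)
```

Large negative values in the change column mark the drop-off years.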

I created boxplots to show the distribution of wildfire sizes across the decades of interest. Because fire sizes vary enormously (many wildfires are smaller than 1 acre, while some cover hundreds of thousands of acres), I plotted the natural logarithm of the acreage. I filtered out wildfires smaller than 1 acre because their logarithms are negative.

library(ggplot2)
library(dplyr)

cafires <- read.csv("cafires.csv")
cafires_1990 <- cafires %>% filter(DECADES == 1990 & !is.na(GIS_ACRES) & GIS_ACRES >= 1)
cafires_2000 <- cafires %>% filter(DECADES == 2000 & !is.na(GIS_ACRES) & GIS_ACRES >= 1)
cafires_2010 <- cafires %>% filter(DECADES == 2010 & !is.na(GIS_ACRES) & GIS_ACRES >= 1)
cafires_2020 <- cafires %>% filter(DECADES == 2020 & !is.na(GIS_ACRES) & GIS_ACRES >= 1)

ggplot() +
  geom_boxplot(data = cafires_1990, aes(x = "", y = log(GIS_ACRES)), fill = "red", color = "darkred") +
  geom_boxplot(data = cafires_2000, aes(x = "", y = log(GIS_ACRES)), fill = "orange", color = "darkorange3") +
  geom_boxplot(data = cafires_2010, aes(x = "", y = log(GIS_ACRES)), fill = "red", color = "darkred") +
  geom_boxplot(data = cafires_2020, aes(x = "", y = log(GIS_ACRES)), fill = "orange", color = "darkorange3") +
  facet_wrap(~factor(DECADES), scales = "free", nrow = 1) +
  theme_minimal() +
  labs(x = "", y = "Log of Acres Burned", title = "Area Burned Per Decade Boxplots") +
  # Use >= 1 to match the filters above, so 1-acre fires (log = 0) stay inside the limits.
  ylim(log(range(cafires$GIS_ACRES[cafires$GIS_ACRES >= 1], na.rm = TRUE)))
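
The effect of the log transform on small fires can be seen with a few numbers. The fire sizes below are hypothetical and chosen only to show why values under 1 acre come out negative. The log10() call is an alternative to the natural log used above, not what the plot itself uses.

```r
# Hypothetical fire sizes in acres (illustrative only).
sizes <- c(0.5, 1, 10, 250000)

# Natural log, as used in the boxplots: negative below 1 acre, 0 at exactly 1 acre.
log_sizes <- log(sizes)

# Base-10 log: each unit is one order of magnitude, which is easier to read back.
log10_sizes <- log10(sizes)

print(data.frame(acres = sizes, ln = log_sizes, log10 = log10_sizes))
```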

The boxplots show that the median fire size is actually decreasing each decade. Taken together with the previous visualizations, this suggests that while the typical wildfire may be getting smaller, the worst wildfires are getting larger and more destructive.
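
This reading could be checked numerically by comparing the median with upper-tail statistics for each decade. The sketch below assumes a data frame with the same DECADES and GIS_ACRES columns as cafires; the helper size_summary and the toy values are hypothetical and were made up to illustrate the calculation, not drawn from the dataset.

```r
# Made-up fire sizes for two decades (illustrative only, NOT real values).
toy <- data.frame(
  DECADES   = rep(c(1990, 2020), each = 5),
  GIS_ACRES = c(5, 20, 40, 80, 1000, 2, 8, 15, 60, 50000)
)

# Hypothetical helper: centre and upper tail of a vector of fire sizes.
size_summary <- function(acres) {
  c(n      = length(acres),
    median = median(acres),
    mean   = mean(acres),
    p95    = unname(quantile(acres, 0.95)),
    max    = max(acres))
}

# One row per decade, one column per statistic.
summary_by_decade <- t(sapply(split(toy$GIS_ACRES, toy$DECADES), size_summary))

print(summary_by_decade)
```

A falling median alongside a rising 95th percentile or maximum would support the interpretation above; running the same helper on the filtered cafires data would show whether that holds for the real records.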

Discussion

The simple visualizations above reveal trends in California’s worsening fire seasons. By area burned, the 2020s are already the most destructive of the decades examined, and this is likely to continue in coming years. The cyclical nature of wildfire seasons is also evident; this analysis could be combined with drought, precipitation, and other climate variables to estimate when fire seasons will be most destructive. Perhaps the most worrying trend is the widening spread of fire sizes visible in the boxplots. The shrinking typical fire might suggest that firefighting capabilities are improving, but extreme wildfires are growing. Extremely large fires are unpredictable and stretch firefighting resources thin, leading to more deaths and greater destruction of ecosystems and the built environment. As the effects of climate change are projected to worsen, these trends are likely to continue.