In the past we have written about how social media can be used to monitor dust storms and how multi-modal large language models (MLLMs) can be used to analyze images. At the recent American Geophysical Union (AGU) Fall Meeting, we (Sage Keidel, Stuart Evans and I) brought these two strands of research together in a poster entitled "Creating and Assessing an Unconventional Global Database of Dust Storms Utilizing Generative AI."
In this work we showcase how MLLMs provide new opportunities and accessible methods for extracting information from imagery data. We use geo-located images from Flickr that have a dust keyword tag in one of several languages (e.g., Arabic, English, Spanish). We run these images through ChatGPT, which classifies each one as a dust storm or not, and compare this classification with human-classified images. If this sounds of interest, below you can read the abstract and see the poster, along with a selection of images that have been labeled as a dust storm or not and ChatGPT's confidence in its classification. The dust storm database itself can be found here.
Abstract:
Complete observations of dust events are difficult, as dust’s spatial and temporal variability means satellites may miss dust due to overpass time or cloud coverage, while ground stations may miss dust because they are not in the plume. As a result, an unknown number of dust events go unrecorded in traditional datasets. Dust’s importance both for atmospheric processes and as a health and travel hazard makes detecting dust events whenever possible important. In particular, studies of the health impacts of dust are limited by a lack of detailed exposure information.
In recent years, social media platforms have emerged as a valuable source of unconventional data to study events such as earthquakes and flooding around the world. However, one challenge with using such data is classifying and labeling it (i.e., is it a dust storm or not?). While it is relatively simple to classify textual data through natural language processing, this is not the case with imagery data. Traditionally, classifying imagery data was a complex computer vision task. However, recent advancements in generative artificial intelligence (AI), especially multi-modal large language models (MLLMs), are opening up new opportunities and offering accessible methods for information extraction from imagery data. Therefore, in this study we collected geotagged Flickr images referencing dust from around the globe in multiple languages (e.g., English, Spanish, Arabic) and used generative AI (i.e., ChatGPT) to classify the images as dust storms or not. Furthermore, we compare a sample of these ChatGPT-classified images with human-classified images to assess its classification accuracy. Our results suggest that ChatGPT can detect dust storms from Flickr images relatively accurately and thus help us create an unconventional global database of dust storm events that might otherwise go unobserved in more traditional datasets.
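To make the accuracy assessment described in the abstract a little more concrete, here is a minimal sketch of how one might compare model labels against human labels for the same images. It is not the code used for the poster: the function name `agreement_metrics`, the label strings `"dust"` and `"not_dust"`, and the eight toy labels are all hypothetical and chosen only for illustration. It uses only the Python standard library and reports accuracy, precision, recall and Cohen's kappa.

```python
from collections import Counter


def agreement_metrics(model_labels, human_labels, positive="dust"):
    """Compare model labels with human labels for the same images.

    Hypothetical helper for illustration only; human labels are treated
    as the reference when computing precision and recall.
    """
    if len(model_labels) != len(human_labels):
        raise ValueError("label lists must have the same length")
    if not model_labels:
        raise ValueError("label lists must not be empty")

    n = len(model_labels)
    pairs = Counter(zip(model_labels, human_labels))

    # Confusion-matrix cells for the binary task (dust storm vs. not).
    tp = sum(c for (m, h), c in pairs.items() if m == positive and h == positive)
    fp = sum(c for (m, h), c in pairs.items() if m == positive and h != positive)
    fn = sum(c for (m, h), c in pairs.items() if m != positive and h == positive)
    tn = n - tp - fp - fn

    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_yes = ((tp + fp) / n) * ((tp + fn) / n)
    p_no = ((fn + tn) / n) * ((fp + tn) / n)
    p_expected = p_yes + p_no
    kappa = (
        (accuracy - p_expected) / (1 - p_expected)
        if p_expected != 1
        else float("nan")
    )

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "kappa": kappa,
    }


# Toy labels, made up for this example (not results from the study).
model = ["dust", "dust", "not_dust", "dust", "not_dust", "not_dust", "dust", "not_dust"]
human = ["dust", "not_dust", "not_dust", "dust", "not_dust", "dust", "dust", "not_dust"]

metrics = agreement_metrics(model, human)
print(metrics)
```

Reporting kappa alongside raw accuracy is one reasonable design choice here, since a keyword-filtered image collection may be unbalanced between dust and non-dust images, and accuracy alone can look high by chance in that situation.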
[Figure: Workflow]

[Figure: Poster]

[Figure: Dust storm database]
Keidel, S., Evans, S. and Crooks, A.T. (2025), Creating and Assessing an Unconventional Global Database of Dust Storms Utilizing Generative AI, American Geophysical Union (AGU) Fall Meeting, 15th–19th December, New Orleans, LA. (pdf of poster).



