Advancing Air Pollution Exposure Models with Open-Vocabulary Object Detection and Semantic Segmentation of Street-View Images

Publication date

2025-10-07

Authors

Yuan, Zhendong (ISNI 000000050789514X)
Kerckhoffs, Jules (ORCID 0000-0001-9065-6916; ISNI 0000000492497930)
Lin, Pi I. Debby
Suel, Esra
Li, Hao
Yi, Li
Jimenez, Marcia Pescador
James, Peter
de Hoogh, Kees
Hoek, Gerard (ISNI 0000000394591966)

Document Type

Article
Open Access

License

CC BY

Abstract

Mobile monitoring campaigns combined with land use regression (LUR) models effectively capture fine-scale spatial variations in urban air pollution. However, traditional predictor variables often fail to capture the nuances of the built environment and undocumented emission sources. To address this, we developed a framework integrating customizable object-level and segmentation-level visual features from street-view images into stepwise regression and random-forest-based LUR models. Using 5.7 million mobile air pollution measurements (2019-2020) and 0.37 million street-view images (2008-2024), we mapped nitrogen dioxide (NO₂), black carbon (BC), and ultrafine particles (UFP) across 46,664 road segments in Amsterdam, The Netherlands. Incorporating street-view images improved model performance, increasing R² by 0.01-0.05 and reducing mean absolute errors by 0.7-10.3%. Sensitivity analyses indicated that key street-view-derived visual features remained stable across years and seasons. Using images from nearby years expanded training instances, thereby enhancing alignment with mobile measurements at fine granularity. Our open-vocabulary object detection module identified influential but previously unrecognized object predictors, such as chimneys, traffic lights, and shops. Combined with segmentation-derived features (e.g., walls, roads, grass), street-view images contributed 8-18% feature importance to model predictions. These findings highlight the potential of visual data in enhancing hyperlocal air pollution mapping and exposure assessment.
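The abstract reports three kinds of summary figures: a change in R², a percentage reduction in mean absolute error, and the share of feature importance attributed to street-view features. The sketch below shows one conventional way such figures could be computed. It is not the authors' code. All function names, feature names, and numbers are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
# Illustrative sketch only. Every value below is made up for demonstration
# and is NOT taken from the study.

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observations and predictions."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def relative_mae_reduction(mae_baseline, mae_new):
    """Percentage by which the new model lowers MAE relative to the baseline."""
    return 100.0 * (mae_baseline - mae_new) / mae_baseline


def street_view_share(importances, street_view_features):
    """Percentage of total feature importance held by street-view features."""
    total = sum(importances.values())
    street_view = sum(
        value for name, value in importances.items()
        if name in street_view_features
    )
    return 100.0 * street_view / total


# Hypothetical concentrations on four road segments.
observed = [20.0, 30.0, 40.0, 50.0]
pred_baseline = [24.0, 28.0, 44.0, 46.0]  # assumed model without street-view features
pred_with_sv = [22.0, 29.0, 42.0, 48.0]   # assumed model with street-view features

# Hypothetical feature importances from a fitted tree-based model.
importances = {
    "traffic_intensity": 0.40,
    "population_density": 0.25,
    "road_length": 0.20,
    "chimney_count": 0.05,        # object-level street-view feature
    "traffic_light_count": 0.04,  # object-level street-view feature
    "grass_fraction": 0.06,       # segmentation-level street-view feature
}
street_view_features = {"chimney_count", "traffic_light_count", "grass_fraction"}

delta_r2 = r_squared(observed, pred_with_sv) - r_squared(observed, pred_baseline)
mae_drop = relative_mae_reduction(
    mean_absolute_error(observed, pred_baseline),
    mean_absolute_error(observed, pred_with_sv),
)
sv_share = street_view_share(importances, street_view_features)

print(f"R2 change: {delta_r2:.3f}")
print(f"MAE reduction: {mae_drop:.1f}%")
print(f"Street-view importance share: {sv_share:.1f}%")
```

The toy improvements here are far larger than those reported in the abstract, because the inputs were picked for readable arithmetic rather than realism.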

Keywords

air pollution, deep learning, exposure assessment, land use regression (LUR), mobile sensing, street-view image, vision-language model (VLM), vision-transformer models (ViT), General Chemistry, Environmental Chemistry, SDG 11 - Sustainable Cities and Communities, SDG 15 - Life on Land

Citation

Yuan, Z, Kerckhoffs, J, Lin, P I D, Suel, E, Li, H, Yi, L, Jimenez, M P, James, P, de Hoogh, K, Hoek, G & Vermeulen, R 2025, 'Advancing Air Pollution Exposure Models with Open-Vocabulary Object Detection and Semantic Segmentation of Street-View Images', Environmental Science & Technology, vol. 59, no. 39, pp. 21237-21247. https://doi.org/10.1021/acs.est.5c09687