Development of Europe-Wide Models for Particle Elemental Composition Using Supervised Linear Regression and Random Forest

Publication: Contribution to journal, journal article, peer-reviewed

  • Jie Chen
  • Kees De Hoogh
  • John Gulliver
  • Barbara Hoffmann
  • Ole Hertel
  • Matthias Ketzel
  • Gudrun Weinmayr
  • Mariska Bauwelinck
  • Aaron Van Donkelaar
  • Ulla A. Hvidtfeldt
  • Richard Atkinson
  • Nicole A.H. Janssen
  • Randall V. Martin
  • Evangelia Samoli
  • Bente M. Oftedal
  • Massimo Stafoggia
  • Tom Bellander
  • Maciej Strak
  • Kathrin Wolf
  • Danielle Vienneau
  • Bert Brunekreef
  • Gerard Hoek

We developed Europe-wide models of long-term exposure to eight elements (copper, iron, potassium, nickel, sulfur, silicon, vanadium, and zinc) in particulate matter with diameter <2.5 μm (PM2.5) using standardized measurements for one-year periods between October 2008 and April 2011 in 19 study areas across Europe, with supervised linear regression (SLR) and random forest (RF) algorithms. Potential predictor variables were obtained from satellites, chemical transport models, land-use, traffic, and industrial point source databases to represent different sources. Overall model performance across Europe was moderate to good for all elements with hold-out-validation R-squared ranging from 0.41 to 0.90. RF consistently outperformed SLR. Models explained within-area variation much less than the overall variation, with similar performance for RF and SLR. Maps proved a useful additional model evaluation tool. Models differed substantially between elements regarding major predictor variables, broadly reflecting known sources. Agreement between the two algorithm predictions was generally high at the overall European level and varied substantially at the national level. Applying the two models in epidemiological studies could lead to different associations with health. If both between- and within-area exposure variability are exploited, RF may be preferred. If only within-area variability is used, both methods should be interpreted equally.
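The distinction between overall and within-area performance can be illustrated with a small sketch. It is not the authors' code. It assumes an MSE-based definition of R-squared (1 minus residual sum of squares over total sum of squares) and assumes that within-area performance is computed after subtracting each study area's mean from both the observations and the predictions. The function names and the six data points are hypothetical, chosen so that the model captures the contrast between areas but none of the variation inside them.

```python
from collections import defaultdict
from statistics import mean


def r_squared(observed, predicted):
    """MSE-based R^2: 1 - SS_residual / SS_total (an assumed definition)."""
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def center_by_area(values, areas):
    """Subtract each study area's mean from the values measured in that area."""
    grouped = defaultdict(list)
    for value, area in zip(values, areas):
        grouped[area].append(value)
    area_means = {area: mean(vals) for area, vals in grouped.items()}
    return [value - area_means[area] for value, area in zip(values, areas)]


def overall_and_within_r2(observed, predicted, areas):
    """Return (overall R^2, within-area R^2) for hold-out predictions."""
    overall = r_squared(observed, predicted)
    within = r_squared(
        center_by_area(observed, areas),
        center_by_area(predicted, areas),
    )
    return overall, within


# Hypothetical hold-out data: two study areas with very different levels.
# The predictions reproduce each area's level but are flat inside each area.
areas = ["A", "A", "A", "B", "B", "B"]
observed = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
predicted = [2.0, 2.0, 2.0, 12.0, 12.0, 12.0]

overall, within = overall_and_within_r2(observed, predicted, areas)
print(f"overall R^2 = {overall:.3f}, within-area R^2 = {within:.3f}")
```

In this toy case the overall R-squared is high because most of the variance lies between areas, while the within-area R-squared is zero. This mirrors, in exaggerated form, the pattern the abstract reports for both algorithms.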

Journal: Environmental Science and Technology
Issue number: 24
Pages (from-to): 15698-15709
Number of pages: 12
Status: Published - 2020

Bibliographic note

Funding Information:
The research described in this article was conducted under contract to the Health Effects Institute (HEI), an organization jointly funded by the United States Environmental Protection Agency (EPA) (Assistance award no. R-82811201) and certain motor vehicle and engine manufacturers. The contents of this article do not necessarily reflect the views of the HEI, or its sponsors, nor do they necessarily reflect the views and policies of the EPA or motor vehicle and engine manufacturers. This work was also supported by a scholarship under the State Scholarship Fund by the China Scholarship Council (file no. 201606010329).

Publisher Copyright:
© 2020 American Chemical Society.

ID: 269668847