Constructing causal life course models: Comparative study of data-driven and theory-driven approaches

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

Constructing causal life course models: Comparative study of data-driven and theory-driven approaches. / Petersen, Anne Helby; Ekstrøm, Claus Thorn; Spirtes, Peter; Osler, Merete.

In: American Journal of Epidemiology, Vol. 192, No. 11, 2023, p. 1917–1927.

Harvard

Petersen, AH, Ekstrøm, CT, Spirtes, P & Osler, M 2023, 'Constructing causal life course models: Comparative study of data-driven and theory-driven approaches', American Journal of Epidemiology, vol. 192, no. 11, pp. 1917–1927. https://doi.org/10.1093/aje/kwad144

APA

Petersen, A. H., Ekstrøm, C. T., Spirtes, P., & Osler, M. (2023). Constructing causal life course models: Comparative study of data-driven and theory-driven approaches. American Journal of Epidemiology, 192(11), 1917–1927. https://doi.org/10.1093/aje/kwad144

Vancouver

Petersen AH, Ekstrøm CT, Spirtes P, Osler M. Constructing causal life course models: Comparative study of data-driven and theory-driven approaches. American Journal of Epidemiology. 2023;192(11):1917–1927. https://doi.org/10.1093/aje/kwad144

Author

Petersen, Anne Helby ; Ekstrøm, Claus Thorn ; Spirtes, Peter ; Osler, Merete. / Constructing causal life course models: Comparative study of data-driven and theory-driven approaches. In: American Journal of Epidemiology. 2023 ; Vol. 192, No. 11. pp. 1917–1927.

Bibtex

@article{18e2525afa0948a2aa6fc8bebd2cb582,
title = "Constructing causal life course models: Comparative study of data-driven and theory-driven approaches",
abstract = "Life course epidemiology relies on specifying complex (causal) models that describe how variables interplay over time. Traditionally, such models have been constructed by perusing existing theory and previous studies. By comparing data-driven and theory-driven models, we investigate whether data-driven causal discovery algorithms can help this process. We focus on a longitudinal dataset following a cohort of Danish men. The theory-driven models are constructed by two subject-field experts. The data-driven models are constructed by use of the temporal Peter-Clark (TPC) algorithm. TPC utilizes the temporal information embedded in life course data. We find that the data-driven models recover some, but not all, causal relationships included in the theory-driven expert models. The data-driven method is especially good at identifying direct causal relationships that the experts have high confidence in. Moreover, in a post-hoc assessment we found that most of the direct causal relationships proposed by the data-driven model, but not included in the theory-driven model, were plausible. Thus, the data-driven model may propose additional meaningful causal hypotheses that are new or have been overlooked by the experts. In conclusion, data-driven methods can aid causal model construction in life course epidemiology, and combining both data-driven and theory-driven methods can lead to even stronger models.",
author = "Petersen, {Anne Helby} and Ekstr{\o}m, {Claus Thorn} and Peter Spirtes and Merete Osler",
note = "{\textcopyright} The Author(s) 2023. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.",
year = "2023",
doi = "10.1093/aje/kwad144",
language = "English",
volume = "192",
pages = "1917--1927",
journal = "American Journal of Epidemiology",
issn = "0002-9262",
publisher = "Oxford University Press",
number = "11",

}

RIS

TY - JOUR

T1 - Constructing causal life course models

T2 - Comparative study of data-driven and theory-driven approaches

AU - Petersen, Anne Helby

AU - Ekstrøm, Claus Thorn

AU - Spirtes, Peter

AU - Osler, Merete

N1 - © The Author(s) 2023. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

PY - 2023

Y1 - 2023

N2 - Life course epidemiology relies on specifying complex (causal) models that describe how variables interplay over time. Traditionally, such models have been constructed by perusing existing theory and previous studies. By comparing data-driven and theory-driven models, we investigate whether data-driven causal discovery algorithms can help this process. We focus on a longitudinal dataset following a cohort of Danish men. The theory-driven models are constructed by two subject-field experts. The data-driven models are constructed by use of the temporal Peter-Clark (TPC) algorithm. TPC utilizes the temporal information embedded in life course data. We find that the data-driven models recover some, but not all, causal relationships included in the theory-driven expert models. The data-driven method is especially good at identifying direct causal relationships that the experts have high confidence in. Moreover, in a post-hoc assessment we found that most of the direct causal relationships proposed by the data-driven model, but not included in the theory-driven model, were plausible. Thus, the data-driven model may propose additional meaningful causal hypotheses that are new or have been overlooked by the experts. In conclusion, data-driven methods can aid causal model construction in life course epidemiology, and combining both data-driven and theory-driven methods can lead to even stronger models.

AB - Life course epidemiology relies on specifying complex (causal) models that describe how variables interplay over time. Traditionally, such models have been constructed by perusing existing theory and previous studies. By comparing data-driven and theory-driven models, we investigate whether data-driven causal discovery algorithms can help this process. We focus on a longitudinal dataset following a cohort of Danish men. The theory-driven models are constructed by two subject-field experts. The data-driven models are constructed by use of the temporal Peter-Clark (TPC) algorithm. TPC utilizes the temporal information embedded in life course data. We find that the data-driven models recover some, but not all, causal relationships included in the theory-driven expert models. The data-driven method is especially good at identifying direct causal relationships that the experts have high confidence in. Moreover, in a post-hoc assessment we found that most of the direct causal relationships proposed by the data-driven model, but not included in the theory-driven model, were plausible. Thus, the data-driven model may propose additional meaningful causal hypotheses that are new or have been overlooked by the experts. In conclusion, data-driven methods can aid causal model construction in life course epidemiology, and combining both data-driven and theory-driven methods can lead to even stronger models.

U2 - 10.1093/aje/kwad144

DO - 10.1093/aje/kwad144

M3 - Journal article

C2 - 37344193

VL - 192

SP - 1917

EP - 1927

JO - American Journal of Epidemiology

JF - American Journal of Epidemiology

SN - 0002-9262

IS - 11

ER -
