Misspecified Poisson regression models for large-scale registry data: inference for 'large n and small p'

Publication: Contribution to journal › Journal article › Research › peer-reviewed

Poisson regression is an important tool in register-based epidemiology where it is used to study the association between exposure variables and event rates. In this paper, we will discuss the situation with 'large n and small p', where n is the sample size and p is the number of available covariates. Specifically, we are concerned with modeling options when there are time-varying covariates that can have time-varying effects. One problem is that tests of the proportional hazards assumption, of no interactions between exposure and other observed variables, or of other modeling assumptions have large power due to the large sample size and will often indicate statistical significance even for numerically small deviations that are unimportant for the subject matter. Another problem is that information on important confounders may be unavailable. In practice, this situation may lead to simple working models that are then likely misspecified. To support and improve conclusions drawn from such models, we discuss methods for sensitivity analysis, for estimation of average exposure effects using aggregated data, and a semi-parametric bootstrap method to obtain robust standard errors. The methods are illustrated using data from the Danish national registries investigating the diabetes incidence for individuals treated with antipsychotics compared with the general unexposed population.
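
As a rough illustration of what Poisson regression on aggregated data estimates in the simplest case, the sketch below computes a crude rate ratio from event counts and person-time in two groups. With a single binary exposure and a log person-time offset, the Poisson maximum likelihood estimate equals this ratio, and the model-based standard error of its logarithm is sqrt(1/d1 + 1/d0). The function name and the numbers are hypothetical and are not taken from the paper. The interval shown is the ordinary model-based one, not the semi-parametric bootstrap the abstract refers to.

```python
import math


def rate_ratio(events_exposed, pt_exposed, events_unexposed, pt_unexposed):
    """Crude rate ratio with a Wald 95% interval from aggregated data.

    Equivalent to a Poisson regression with one binary exposure and a
    log person-time offset. The standard error is model-based, so it
    relies on the Poisson variance assumption and is not robust to
    misspecification.
    """
    rate_exposed = events_exposed / pt_exposed
    rate_unexposed = events_unexposed / pt_unexposed
    log_rr = math.log(rate_exposed / rate_unexposed)
    # Model-based standard error of the log rate ratio.
    se_log_rr = math.sqrt(1.0 / events_exposed + 1.0 / events_unexposed)
    z = 1.959963984540054  # 97.5% quantile of the standard normal
    return {
        "rr": math.exp(log_rr),
        "lower": math.exp(log_rr - z * se_log_rr),
        "upper": math.exp(log_rr + z * se_log_rr),
        "se_log_rr": se_log_rr,
    }


# Made-up counts and person-years, for illustration only.
result = rate_ratio(200, 10_000.0, 1_000, 100_000.0)
print(result)
```

With a very large registry the interval from such a calculation becomes extremely narrow, which is one reason the abstract stresses sensitivity analysis and robust standard errors over significance tests alone.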

Original language: English
Journal: Statistics in Medicine
Volume: 35
Issue number: 7
Pages (from-to): 1117-1129
Number of pages: 13
ISSN: 0277-6715
Status: Published - 30 Mar 2016

ID: 157491044