Confounding Adjustment in Comparative Effectiveness Research Conducted Within Distributed Research Networks
Sengwee Toh, ScD,* Joshua J. Gagne, PharmD, ScD,† Jeremy A. Rassen, ScD,† Bruce H. Fireman, MA,‡ Martin Kulldorff, PhD,* and Jeffrey S. Brown, PhD*
Background: A distributed research network (DRN) of electronic health care databases, in which data reside behind the firewall of each data partner, can support a wide range of comparative effectiveness research (CER) activities. An essential component of a fully functional DRN is the capability to perform robust statistical analyses to produce valid, actionable evidence without compromising patient privacy, data security, or proprietary interests.
Objectives and Methods: We describe the strengths and limitations of different confounding adjustment approaches that can be considered in observational CER studies conducted within DRNs, and the theoretical and practical issues to consider when selecting among them in various study settings.
Results: Several methods can be used to adjust for multiple confounders simultaneously, either as individual covariates or as confounder summary scores (eg, propensity scores and disease risk scores), including: (1) centralized analysis of patient-level data, (2) case-centered logistic regression of risk set data, (3) stratified or matched analysis of aggregated data, (4) distributed regression analysis, and (5) meta-analysis of site-specific effect estimates. These methods require that different granularities of information be shared across sites and afford investigators different levels of analytic flexibility.
Conclusions: DRNs are growing in use, and sharing of highly detailed patient-level information is not always feasible in DRNs. Methods that incorporate confounder summary scores allow investigators to adjust for a large number of confounding factors without the need to transfer potentially identifiable information in DRNs. They have the potential to let investigators perform many analyses traditionally conducted through a centralized dataset with detailed patient-level information.
Key Words: comparative effectiveness research, distributed research network, confounding, propensity score, disease risk score, instrumental variable, marginal structural model, pharmacoepidemiology
(Med Care 2013;51: S4–S10)
A major goal of comparative effectiveness research (CER) is to provide timely and actionable evidence regarding the relative benefits and risks of various treatments in different patients.¹⁻³ Electronic health care databases, such as administrative claims databases and electronic health record databases, have become an important data source for developing evidence on the comparative effectiveness of medical products and delivery of care.¹,⁴ These databases chronicle clinical encounters, medical services, and pharmacy prescriptions or dispensings of a large number of individuals. When analyzed appropriately, they can provide useful information to support clinical and regulatory decision making.
Using multiple databases to conduct CER offers a number of advantages. The combined sample size is often large enough to allow evaluations of rare treatments, outcomes, or populations, and can provide more timely evidence. A network of databases with diverse populations allows assessment of treatment heterogeneity and improves generalizability.⁵⁻⁷ Although storing all databases in a centralized repository is theoretically appealing, in practice it often creates concerns about privacy, confidentiality, regulations, and proprietary interests. A distributed approach, in which data are kept physically behind data partners’ firewalls and under their direct control, is often preferred because it minimizes these concerns and expands the number of data partners willing to contribute information to studies.⁵⁻⁹
A fundamental issue encountered in distributed research network (DRN) studies is the tradeoff between analytic flexibility and the granularity of information being shared. Many existing DRNs can accommodate simple confounder adjustment using aggregate data (eg, age-stratified and sex-stratified analysis), but investigators generally have to request patient-level analytic datasets to adjust for multiple confounders. Although these analytic datasets typically include little or no Protected Health Information, concerns about privacy and confidentiality still linger due to the difficulty of complete deidentification,¹⁰ as well as concerns about loss of control over data security and restrictions on future use.
From the *Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute; †Division of Pharmacoepidemiology and Pharmacoeconomics, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA; and ‡Division of Research, Kaiser Permanente Northern California, Oakland, CA.
S.T. and J.S.B. were supported by Agency for Healthcare Research and Quality (AHRQ) grant R01HS019912. J.A.R. is a recipient of a career development award from AHRQ (K01 HS018088).
The authors declare no conflict of interest.
Reprints: Sengwee Toh, ScD, Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, 133 Brookline Ave 6th Floor, Boston, MA 02215. E-mail: darrentoh@post.harvard.edu.
Copyright © 2013 by Lippincott Williams & Wilkins
ISSN: 0025-7079/13/5108-00S4
Analytic Methods. Medical Care, Volume 51, Number 8 Suppl 3, August 2013. www.lww-medicalcare.com
A fully functional DRN should have the capability to minimize transfer of potentially identifiable and proprietary information while permitting statistical analyses that are unbiased and efficient. However, there is currently no conceptual analytic framework that maps existing methods to CER needs for studies conducted within DRNs. In this paper, we describe: (1) the strengths and limitations of existing approaches that can handle a large number of confounders, either as individual covariates or as confounder summary scores, in CER studies conducted within DRNs; and (2) the theoretical and practical issues to consider when selecting among these approaches in various study settings.
SCOPE
Data harmonization is the most common approach used to perform multisite studies in DRNs. Multisite harmonization through use of a common data model enables investigators to develop and test the analytic code necessary for analysis and allows that program to be executed independently at each site. This paper focuses on studies that are conducted within DRNs with a common data model. We focus solely on observational studies with between-person comparisons because most CER studies compare ≥2 groups of patients receiving different treatments. Confounding can sometimes be addressed through within-person comparisons; interested readers are referred to papers that discuss self-controlled designs.¹¹,¹²
CONFOUNDING ADJUSTMENT IN OBSERVATIONAL CER STUDIES
Confounding may arise when individuals receiving different treatments have unequal underlying risks of developing the outcome of interest. In observational CER studies, confounding is a reflection of real-world clinical practice where patients receive a certain treatment based on, among other things, their clinical condition (indication), disease severity, and prognosis. Within a health plan or delivery system, confounding may also originate at the provider level from differences in treatment preference, care offered, and other ways that affect the outcome. In a network of multiple health plans or delivery systems, confounding may further arise from institutional guidelines or reimbursement policies that prefer one treatment to another based on patients’ disease severity or treatment history.
Confounding may sometimes be more problematic in observational studies that examine anticipated treatment effects than those examining unanticipated effects.⁴,¹³ There may be more confounding when comparing different types of treatments (eg, pharmacotherapy vs. surgery to treat atrial fibrillation) than comparing treatments of the same therapeutic class (eg, captopril vs. enalapril to treat hypertension). Generating valid comparative effectiveness evidence using observational data relies heavily on the ability to identify confounders, the availability and accuracy of confounder information, and the use of appropriate methods to analyze this information.
Individual Confounders Versus Confounder Summary Scores
When the number of confounders is small relative to the number of outcome events, investigators can handle each confounder individually in the analysis. However, in most observational CER studies, adjustment for a large number of confounders is necessary because of the expected imbalances in many outcome risk factors between the treatment groups. Confounder summary scores condense the information of many confounders into a single variable. They obscure potentially identifiable information into nonidentifiable measures and are therefore particularly useful for DRN studies. The exposure propensity score (PS)¹⁴,¹⁵ and the disease risk score (DRS)¹⁶,¹⁷ are the most commonly used confounder summary scores. PSs are the probabilities of having the study exposure given patients’ baseline characteristics, whereas DRSs are patients’ probabilities or hazards of having the study outcome conditional on their baseline characteristics. The 2 scores have been shown to provide results comparable to those from individual covariate adjustment.¹⁸⁻²¹ In general, PSs are particularly well suited for CER studies that compare the effects of 2 treatments on multiple outcomes, whereas DRSs are more practical than PSs when there are >2 treatments and a single outcome.¹⁷ When there are 2 treatments and 1 outcome, the choice between PSs and DRSs depends on the prevalence of the exposure and the outcome, the specific need of the study, and the investigator’s preference. PSs are more favorable than DRSs when the exposure is common and the outcome is infrequent; DRSs are preferred if the study aims include assessment of treatment heterogeneity by baseline outcome risk.
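The two score definitions can be illustrated with a minimal Python sketch. It assumes a small set of categorical baseline covariates, so that each score can be estimated as a simple proportion within each covariate cell; in practice a regression model would usually be fitted instead. The toy cohort, the field names, and the choice to estimate the DRS among unexposed patients only are illustrative assumptions, not taken from the studies cited here.

```python
from collections import defaultdict

def cell_proportions(rows, key_fields, event_field, keep=lambda r: True):
    """Return {covariate cell: proportion of kept rows with event_field == 1}."""
    totals = defaultdict(int)
    events = defaultdict(int)
    for r in rows:
        if not keep(r):
            continue
        cell = tuple(r[f] for f in key_fields)
        totals[cell] += 1
        events[cell] += r[event_field]
    return {cell: events[cell] / totals[cell] for cell in totals}

# Hypothetical toy cohort with two binary baseline covariates.
cohort = [
    {"older": 1, "diabetes": 1, "exposed": 1, "outcome": 1},
    {"older": 1, "diabetes": 1, "exposed": 1, "outcome": 0},
    {"older": 1, "diabetes": 1, "exposed": 0, "outcome": 1},
    {"older": 1, "diabetes": 1, "exposed": 0, "outcome": 0},
    {"older": 0, "diabetes": 0, "exposed": 1, "outcome": 0},
    {"older": 0, "diabetes": 0, "exposed": 0, "outcome": 0},
    {"older": 0, "diabetes": 0, "exposed": 0, "outcome": 0},
    {"older": 0, "diabetes": 0, "exposed": 0, "outcome": 1},
]
covariates = ["older", "diabetes"]

# PS: probability of exposure given baseline covariates.
ps = cell_proportions(cohort, covariates, "exposed")

# DRS: probability of the outcome given baseline covariates,
# estimated here among unexposed patients only (an assumption of this sketch).
drs = cell_proportions(cohort, covariates, "outcome",
                       keep=lambda r: r["exposed"] == 0)
```

Each patient is then assigned the score of his or her covariate cell, and only that single number, rather than the individual covariates, needs to enter the subsequent analysis.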
Although there is no standardized threshold for the number of outcome events per confounder over which one would prefer using confounder summary scores to handling each confounder individually, Cepeda et al²² suggested that PSs may perform better than the multivariable outcome logistic regression approach when there were ≤7 outcome events per confounder. Peduzzi et al²³ found that at least 10 events per confounder might be needed for the outcome logistic regression model to produce valid estimates.
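The two rules of thumb above can be written as a small helper. This is only a sketch: the function names and the returned labels are hypothetical, and the cutoffs of 7 and 10 events per confounder are the values reported in the cited studies, not universal thresholds.

```python
def events_per_confounder(n_outcome_events, n_confounders):
    """Number of outcome events available per confounder."""
    if n_confounders <= 0:
        raise ValueError("n_confounders must be positive")
    return n_outcome_events / n_confounders

def suggested_adjustment(n_outcome_events, n_confounders,
                         ps_cutoff=7, regression_minimum=10):
    """Map events per confounder to a rough suggestion (illustrative only)."""
    epc = events_per_confounder(n_outcome_events, n_confounders)
    if epc <= ps_cutoff:
        # Range in which Cepeda et al found PSs may perform better.
        return "propensity score"
    if epc >= regression_minimum:
        # Range in which Peduzzi et al found outcome regression may be valid.
        return "outcome regression or summary score"
    return "borderline"

suggestion = suggested_adjustment(n_outcome_events=60, n_confounders=12)
```

With 60 outcome events and 12 confounders there are 5 events per confounder, which falls in the range where a PS may be preferable to a multivariable outcome model.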
Estimating Confounder Summary Scores in DRNs
Within DRNs, each database likely includes patients with different characteristics, and each may be created by organizations with distinct practice, coding patterns, or institutional policies. It is generally necessary to estimate confounder summary scores by site. Investigators could have each site run a distributed program that fits a statistical model containing the same covariates. The advantage of this approach is consistency, but by utilizing only variables common to all sites, the approach may not fully utilize the information available at each site.
An alternative would be to have each site fit its own model. This approach reduces residual confounding at sites that can adjust for more confounders; this may be the case if, for instance, certain sites have laboratory results data while others do not. This approach allows each site to provide the maximally adjusted result based on the local data available, but it is operationally more cumbersome, as each site must build its own model, and some sites may not have the required programming or analytic expertise to do that.
A variant of fitting the site-specific PS model is the high-dimensional PS approach, which allows investigators to prespecify a set of common confounders and use an automated approach to empirically identify additional site-specific confounders.²⁴⁻²⁶ A constraint of this approach is that smaller sites may not have adequate sample size to perform the analysis.²⁷ Another potential weakness is the inclusion of variables that are only predictive of the exposure but not of the outcome except through their associations with the exposure (ie, instrumental variables), although some studies suggested that any resulting bias may be considered trivial relative to the primary source of bias (residual confounding).²⁸,²⁹
Estimating Confounder Summary Scores in CER Studies of Newly Approved Treatments
The issue of confounding is particularly complex in studies that compare a newly approved treatment to existing alternatives.³⁰ As evidence on comparative effectiveness is generally sparse when a new treatment is approved,³¹ some physicians may reserve the treatment for sicker patients or those who fail prior therapies. Patients, physicians, insurers, or delivery systems that adopt a new treatment earlier may be different from those who embrace it later. Treatment choice and characteristics of patients receiving the new treatment will change over time as more is learned of its benefits and risks. Investigators should allow the contribution of individual confounders to vary over time by, for example, estimating the PSs at regular intervals (eg, quarterly), starting from the time the new treatment is available.³⁰,³² In contrast, outcome risk factors are generally more stable over time; therefore, evolving prescribing dynamics in the early marketing period may have a smaller impact on the DRS estimation. For certain outcomes, it may even be possible to fit a DRS model in the period before the introduction of the new treatment, and use the model to estimate exposed and unexposed patients’ disease risk after introduction.³³
METHODS FOR CONFOUNDING ADJUSTMENT IN DRNs
Several analytic approaches can handle a large number of confounders, with or without using confounder summary scores, in DRN studies. As discussed below, some allow investigators to conduct an array of prespecified and ad hoc analyses but require more granular information, whereas others limit what investigators can do analytically but provide good protection for patient privacy, data security, and proprietary interests.³⁴
Centralized Analysis of Patient-Level Data
With this approach, the participating sites send the lead team a patient-level analytic dataset with individual covariate information necessary for the analysis, yielding what is essentially a single centralized dataset. Individual confounders can be incorporated into the analysis through restriction, stratification, matching, weighting, or outcome modeling, and the data can be considered all together or stratified by contributing site.³⁵,³⁶ Confounder summary scores can be estimated after centralizing the data. This approach offers the most analytic flexibility at the expense of sharing potentially identifiable patient-level information and participating sites losing operational control over potentially sensitive and proprietary data.³⁷ In principle, most, if not all, Protected Health Information or proprietary information can be removed before leaving participating sites’ firewalls. In practice, however, one often cannot completely rule out the possibility of reidentifying distinctive patients, especially at smaller sites,¹⁰,³⁸ and many data partners may be unwilling to give up operational control over sensitive data, data security, and potential future uses.
Alternatively, each participating site can first estimate confounder summary scores and then send the lead team a patient-level analytic dataset with information on the exposure, outcome, follow-up (for time-to-event analysis), confounder summary scores, and other variables needed for the analysis (eg, age group information if one wishes to perform age-stratified analysis).³⁷ Confounder summary scores can be incorporated into the analysis through restriction, stratification, matching, weighting, or outcome modeling.¹⁷,³⁹,⁴⁰ This approach can perform essentially all the prespecified analyses afforded by the approach that shares individual confounder information. Depending on the variables requested, it may or may not be able to accommodate ad hoc analyses. For example, if race is included in PS estimation but race information is not requested separately, investigators will not be able to perform a secondary, race-stratified analysis.
Combining patients with similar values of PS across sites should not be done if PSs were estimated separately within each site. PSs will likely not be comparable across sites as patients’ PS values will depend on the prevalence of the exposure in the population in which the PS is estimated. Many factors, including formularies and regional prescribing patterns, will influence the prevalence of exposure. Investigators should account for site in the analysis by either including it as a stratification variable or performing within-site PS matching. In contrast, the influence of risk factors on the outcome is more stable across data sources, even if the outcome incidence varies by site. For example, the relation between male sex and the 1-year risk for acute myocardial infarction, conditional on all other risk factors, should be similar across sites. Therefore, it may be possible to combine DRSs across sites.
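The dependence of the PS on exposure prevalence can be seen in a short numerical sketch. It assumes, purely for illustration, a logistic PS model in which two hypothetical sites share the same covariate coefficients and differ only in the intercept, which reflects how common the exposure is at each site; all coefficient values are made up.

```python
import math

def logistic(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))

def propensity(intercept, coefs, covariates):
    """PS from a logistic model: P(exposed | covariates)."""
    return logistic(intercept + sum(b * x for b, x in zip(coefs, covariates)))

# Hypothetical log odds ratios shared by both sites.
coefs = [0.8, -0.5]
# One patient profile, identical at both sites.
patient = [1, 1]
# Hypothetical intercepts: the exposure is rarer at site_A than at site_B.
site_intercepts = {"site_A": -2.0, "site_B": 0.0}

ps_by_site = {site: propensity(a, coefs, patient)
              for site, a in site_intercepts.items()}
```

The same patient receives a PS of roughly 0.15 at one site and 0.57 at the other, so a PS stratum or matching caliper defined on pooled values would group patients who are not actually comparable. Keeping site as part of the stratum or matching within site avoids this.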
Case-centered Logistic Regression of Risk Set Data
This approach was originally developed for vaccine safety research⁴¹ and has since been expanded to studies of other medical products.³²,⁴² Sites transfer an aggregated dataset to the lead team that includes 1 record per risk set, with each risk set anchored by a case (ie, patient with the outcome of interest) and comprised of the cases and comparable individuals at risk of the outcome at the time the case occurs (see below). Each record includes a binary variable indicating whether the case is exposed to the treatment of interest and the log odds of the site-specific proportion of exposed patients in the risk set. The lead team fits a logistic regression model with the indicator variable as the dependent variable and the log odds as the independent variable (specified as an offset). Fireman et al⁴¹ have shown that such a model maximizes the same likelihood as a Cox model fit using patient-level data, and both yield the same parameter estimates.
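The model fitted by the lead team can be sketched in a few lines of Python. The sketch assumes a single binary exposure, so the only free parameter is the intercept of a logistic model whose offset is the log odds of exposure in each risk set; that intercept is the log hazard ratio. The function name, the Newton-Raphson fitting routine, and the toy records are illustrative assumptions, not the authors' software.

```python
import math

def fit_case_centered(records, tol=1e-10, max_iter=50):
    """Estimate the log hazard ratio and its standard error from risk-set records.

    records: iterable of (case_exposed, proportion_exposed), one per risk set.
    """
    # Risk sets in which everyone or no one is exposed carry no information
    # about the effect and have an undefined log odds, so they are skipped.
    data = [(y, math.log(p / (1 - p))) for y, p in records if 0 < p < 1]
    if not data:
        raise ValueError("no informative risk sets")

    def score_and_info(beta):
        mus = [1 / (1 + math.exp(-(beta + offset))) for _, offset in data]
        score = sum(y - mu for (y, _), mu in zip(data, mus))
        info = sum(mu * (1 - mu) for mu in mus)
        return score, info

    beta = 0.0
    for _ in range(max_iter):  # Newton-Raphson on the intercept
        score, info = score_and_info(beta)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    _, info = score_and_info(beta)
    return beta, 1 / math.sqrt(info)

# Hypothetical records: (is the case exposed?, proportion exposed in its risk set)
risk_sets = [(1, 0.30), (0, 0.30), (1, 0.25), (1, 0.40), (0, 0.35), (1, 0.20)]
log_hr, se = fit_case_centered(risk_sets)
hazard_ratio = math.exp(log_hr)
```

The connection to the Cox model is that, in a risk set with a proportion p exposed, the probability that the case is the exposed patient equals the inverse logit of the log hazard ratio plus log(p / (1 - p)), which is exactly the logistic model with an offset. The sketch needs both exposed and unexposed cases in the data for the estimate to be finite.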
Confounding adjustment is achieved through the selection of comparable patients into the risk sets. In stratification, the risk set comprises at-risk patients belonging to the same stratum (eg, same age group or PS stratum) as the case. In 1:1 matching, the risk set includes all at-risk patients in the matched cohort. In general, the number of confounders that can be adjusted for may be relatively small if stratifying by or matching on individual confounders. In addition, stratification requires that all stratifying variables be dichotomized or categorized. Investigators may use PSs or DRSs to adjust for a larger number of confounders, and handle these scores through stratification or matching.
Stratified or Matched Analysis of Aggregated Data
In a stratified analysis, participating sites send the lead team the total exposed and unexposed persons or person-times, and the number of exposed and unexposed outcomes within each stratum. In the matched analysis, if each site matches in the same fixed ratio, the information needed for the analysis includes only the total exposed and unexposed persons or person-times, and the number of exposed and unexposed outcomes in the matched cohort. As noted above, stratification requires that all stratifying variables be dichotomized or categorized. Therefore, the number of confounders that can be adjusted for may be relatively small if stratifying by or matching on individual confounders. Using confounder summary scores allows adjustment for a larger number of confounders, but as discussed above, matching, stratifying, or other treatment of PSs should occur within site rather than across sites.
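As one possible way to analyze such aggregated counts, the sketch below pools stratum-level data with a Mantel-Haenszel risk ratio. The choice of this estimator, the field names, and the counts are assumptions made for illustration; each stratum is keyed by site and PS stratum so that PS strata are never combined across sites.

```python
def mantel_haenszel_rr(strata):
    """Mantel-Haenszel risk ratio from per-stratum aggregated counts."""
    numerator = 0.0
    denominator = 0.0
    for s in strata:
        total = s["exposed_n"] + s["unexposed_n"]
        numerator += s["exposed_events"] * s["unexposed_n"] / total
        denominator += s["unexposed_events"] * s["exposed_n"] / total
    return numerator / denominator

# Hypothetical counts that sites might send: one row per (site, PS stratum).
strata = [
    {"site": "A", "ps_stratum": 1, "exposed_events": 20, "exposed_n": 100,
     "unexposed_events": 10, "unexposed_n": 100},
    {"site": "B", "ps_stratum": 1, "exposed_events": 8, "exposed_n": 200,
     "unexposed_events": 2, "unexposed_n": 100},
]

pooled_rr = mantel_haenszel_rr(strata)
```

In this toy example the risk ratio is 2 within each stratum, and the pooled estimate is also 2. No patient-level record leaves any site.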
Distributed Regression Analysis
Distributed regression analysis fits regression models, with or without confounder summary scores, on individual databases within DRNs and produces results identical to those from centralized outcome regression analysis of patient-level data.⁴³⁻⁴⁶ The approach involves an iterative process, with participating sites transferring only summary statistics to the lead team at each step. The number of iterations depends on the regression model (eg, linear, logistic) and the complexity of the model. For example, fitting a linear regression model is a 2-step process. At step 1, each site executes a distributed program locally and submits intermediate summarized statistical results to the lead team. The lead team combines the intermediate results and computes the parameter estimates. At step 2, participating sites execute another distributed program and deliver the variance/covariance estimates of the parameter estimates to the lead team to compute the confidence intervals. Although new approaches may be developed in the future, this method is currently limited to linear and logistic regression.
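The 2-step linear regression process can be sketched as follows. This is one possible realization under stated assumptions, not the cited implementations: at step 1 each site is assumed to send the cross-product summaries X'X and X'y, and at step 2 each site is assumed to send its residual sum of squares for the pooled estimates, from which the lead team derives the variances. All function names and data are hypothetical.

```python
import math

def site_step1(X, y):
    """Run at a site: return X'X, X'y, and the number of patients."""
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return xtx, xty, len(y)

def site_step2(X, y, beta):
    """Run at a site: return the residual sum of squares for the pooled beta."""
    return sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2
               for r, yi in zip(X, y))

def invert(a):
    """Gauss-Jordan inverse with partial pivoting (small matrices only)."""
    n = len(a)
    m = [list(row) + [float(i == j) for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        pv = m[col][col]
        m[col] = [v / pv for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [row[n:] for row in m]

def lead_combine(step1_results):
    """Run by the lead team: sum the site summaries and solve for beta."""
    p = len(step1_results[0][1])
    xtx = [[sum(s[0][i][j] for s in step1_results) for j in range(p)]
           for i in range(p)]
    xty = [sum(s[1][i] for s in step1_results) for i in range(p)]
    n = sum(s[2] for s in step1_results)
    xtx_inv = invert(xtx)
    beta = [sum(xtx_inv[i][j] * xty[j] for j in range(p)) for i in range(p)]
    return beta, xtx_inv, n

# Hypothetical data: each row of X is [1, x] (intercept plus one covariate).
sites = {
    "site_A": ([[1, 0], [1, 1], [1, 2]], [1.1, 2.9, 5.0]),
    "site_B": ([[1, 3], [1, 4]], [7.2, 8.8]),
}

# Step 1: sites send summaries; the lead team computes the parameter estimates.
beta, xtx_inv, n = lead_combine([site_step1(X, y) for X, y in sites.values()])

# Step 2: sites send residual sums of squares; the lead team computes SEs.
rss = sum(site_step2(X, y, beta) for X, y in sites.values())
sigma2 = rss / (n - len(beta))
std_errors = [math.sqrt(sigma2 * xtx_inv[i][i]) for i in range(len(beta))]
```

Because X'X and X'y are sums over patients, adding the site-level summaries gives exactly the quantities a centralized ordinary least squares fit would use, which is why the estimates match those from pooled patient-level data.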
Meta-Analysis
In meta-analysis, only site-specific effect estimates and their variances (or other information needed to calculate weight) are sent to the lead team. The site-specific estimates can be obtained from restriction, stratification, matching, weighting, or outcome modeling, with or without using confounder summary scores. These estimates are then pooled through meta-analysis.⁴⁷,⁴⁸ This approach requires the least amount of potentially identifiable information to leave participating sites’ firewalls. It has been shown to produce similar pooled estimates when compared with patient-level data analysis.⁴²,⁴⁹ However, it is very analytically rigid. Every subgroup or sensitivity analysis requires all sites to perform each analysis internally, and then transfer the effect estimates to the lead team. Smaller sites may not be able to perform certain analyses, although sometimes using confounder summary scores may help.
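A minimal sketch of the pooling step is given below. It assumes inverse-variance, fixed-effect weighting of log-scale estimates, which is only one of several possible weighting schemes (a random-effects model is a common alternative); the function name and the site-specific values are hypothetical.

```python
import math

def pool_fixed_effect(estimates):
    """Inverse-variance pooling of (log_effect, variance) pairs."""
    weights = [1.0 / variance for _, variance in estimates]
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    pooled_variance = 1.0 / sum(weights)
    return pooled, pooled_variance

# Hypothetical site-specific log hazard ratios and their variances.
site_estimates = [(0.20, 0.04), (0.40, 0.04), (0.10, 0.16)]

pooled_log_hr, pooled_var = pool_fixed_effect(site_estimates)
half_width = 1.96 * math.sqrt(pooled_var)
ci_95 = (math.exp(pooled_log_hr - half_width),
         math.exp(pooled_log_hr + half_width))
```

Each site contributes two numbers per analysis, which is why this approach shares the least information but also why every additional subgroup or sensitivity analysis requires a new round of site-level work.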
ADDITIONAL CONSIDERATIONS
Unmeasured Confounding
None of the methods discussed thus far is robust against unmeasured confounders. In the presence of unmeasured confounding, instrumental variable analyses may provide valid effect estimates.⁵⁰,⁵¹ This approach has been used in observational studies.⁵² In DRNs, a number of potential instrumental variables can be entertained, such as geographic variation in the propensity to use one or another treatment⁵³,⁵⁴ and physician preference.⁵⁵,⁵⁶ In theory, instrumental variable analyses enable sites to send only a small amount of patient-level data: exposure, outcome, and instrumental variable information.
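To show how little information such an analysis needs, the sketch below computes a simple ratio (Wald) instrumental variable estimate from triples of instrument, exposure, and outcome. The binary instrument, the estimator, and the toy records are assumptions chosen for illustration; the validity of any real analysis rests on the instrument assumptions discussed in this section.

```python
def wald_iv_estimate(records):
    """Ratio (Wald) estimate from (instrument, exposure, outcome) triples.

    The instrument is assumed to be binary (0 or 1).
    """
    def mean(values):
        return sum(values) / len(values)

    by_z = {0: [], 1: []}
    for z, x, y in records:
        by_z[z].append((x, y))

    exposure_diff = (mean([x for x, _ in by_z[1]])
                     - mean([x for x, _ in by_z[0]]))
    outcome_diff = (mean([y for _, y in by_z[1]])
                    - mean([y for _, y in by_z[0]]))
    if exposure_diff == 0:
        raise ValueError("instrument is unrelated to exposure in these data")
    return outcome_diff / exposure_diff

# Hypothetical records, eg, z = 1 if the physician prefers the new treatment.
records = [
    (1, 1, 1), (1, 1, 1), (1, 1, 0), (1, 0, 0),
    (0, 1, 1), (0, 0, 0), (0, 0, 0), (0, 0, 0),
]

iv_risk_difference = wald_iv_estimate(records)
```

Here the instrument shifts the probability of exposure from 0.25 to 0.75 and the outcome risk from 0.25 to 0.50, giving an estimated risk difference of 0.5.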
However, instrumental variable analyses are not without limitations. The greatest challenge is to identify a valid instrument. There is no way to verify empirically whether one of the necessary assumptions (that the instrument is not associated with the outcome except through the exposure) holds in any observational study.⁵⁷⁻⁵⁹ In other words, instrumental variable analyses in observational studies replace the assumption of no unmeasured confounding for the exposure-outcome relation with the assumptions of (i) no unmeasured confounding for the instrument-outcome relation, and (ii) no direct effect of the instrument on the outcome by paths other than through exposure. Investigators have to weigh the plausibility of each set of assumptions when choosing between available methods. In addition, the treatment effect estimated by this approach applies only to the “marginal population” or “complier population,” which is a group of patients that cannot really be identified in practice.⁵¹ To address unmeasured confounding, investigators should always perform sensitivity analysis to examine the robustness of their results.⁶⁰,⁶¹
Time-varying Confounders
Time-dependent treatments are ubiquitous in CER. Sometimes the treatment of interest is a dynamic regimen that depends on patients’ responses or prognoses. An example may be “take drug A; if the cholesterol level is still above 240 mg/dL after 2 months, then add drug B.” CER studies of time-dependent treatments must appropriately adjust for time-varying confounding. Standard approaches, such as matching, stratification, and regression, may introduce bias if the time-varying confounders are also intermediate variables on the causal pathway of the exposure-outcome relation.⁶² To appropriately adjust for such confounders, investigators should instead use methods such as inverse probability weighting,⁶³,⁶⁴ g-estimation,⁶⁵,⁶⁶ or the g-formula.⁶⁷,⁶⁸ Although electronic health care databases provide longitudinal information for a large number of patients, the availability and accuracy of time-varying confounder information may be inadequate for certain studies.
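As a small illustration of inverse probability weighting for a time-dependent treatment, the sketch below builds stabilized weights for one patient. It assumes that the per-interval probabilities of the treatment actually received have already been estimated by two models, one conditioning on treatment history only (numerator) and one also conditioning on time-varying confounders (denominator); the function name and the probabilities are hypothetical.

```python
def stabilized_weights(numerator_probs, denominator_probs):
    """Cumulative stabilized inverse probability weights for one patient.

    Both inputs list, per interval, the probability of the treatment the
    patient actually received under the corresponding model.
    """
    weights = []
    cumulative = 1.0
    for num, den in zip(numerator_probs, denominator_probs):
        cumulative *= num / den
        weights.append(cumulative)
    return weights

# Hypothetical probabilities for three follow-up intervals.
numerator = [0.50, 0.60, 0.60]     # given treatment history only
denominator = [0.25, 0.60, 0.80]   # given treatment and confounder history

weights = stabilized_weights(numerator, denominator)
```

The weight at each interval is the running product of the interval-specific ratios, so a patient whose treatment was unlikely given his or her confounder history receives more weight in the weighted outcome model.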
Operational Efficiency
For all methods, more operational efficiency can be gained when sites first transform their source data into a common data structure.⁵⁻⁷ The lead team can develop and test code that creates the analytic dataset or performs the analysis. Other participating sites can then execute the code, often with minimal or no modification. The methods discussed above require different levels of statistical sophistication at participating sites; those that involve more programming and analytic efforts at participating sites may sometimes be less logistically feasible.³⁴
CER STUDIES WITH PROSPECTIVE DATA COLLECTION
Studies that combine information recorded routinely in electronic health care databases with additional, prospectively collected data are increasingly common.⁶⁹ Such additional information may improve the accuracy of exposure or outcome classification, the ability to study patient-reported outcomes, and the capability to adjust for otherwise unmeasured confounders. It also introduces a number of issues that need to be addressed, many of which may be more challenging in DRNs, as DRNs usually require a concerted effort among sites.
Missing Information in Subset of Study Population
A typical database study can include thousands or even millions of patients. Prospective data collection is often only feasible in a subset of the study population. Even if data are prospectively collected for all patients, nonresponse will lead to missing data. There are many ways to handle missing data. Some, like the complete-case analysis and the missing indicator approach, are easy to implement but are valid only under very strong assumptions.⁷⁰,⁷¹ Investigators should consider methods that require weaker assumptions, such as multiple imputation,⁷²,⁷³ inverse probability weighting,⁷⁴ and PS calibration.⁷⁵ None of these methods is appropriate for all studies, so investigators should be aware of their strengths and constraints when choosing among them. At the design phase, a 2-stage sampling approach may be considered.⁷⁶
Analyzing Data as They Accrue or After All Data are Collected
Electronic health care databases capture patient experiences longitudinally, so they can be used to conduct prospective studies. If timely comparative effectiveness information is needed, then data should be analyzed weekly, monthly, or quarterly as they arrive using sequential analytic techniques.⁷⁷ However, the fresher the data, the more likely it is that they may be inaccurate or incomplete because they may not have undergone the usual adjudication process done as part of claims processing or data quality check.⁷⁸ Hence, such sequential analysis should be viewed as an activity that needs follow-up investigations whenever a potential relationship is detected.
CONCLUSIONS
Cross-institutional sharing of detailed patient-level information for CER is not always feasible, which has led to an increase in the use of DRNs. A fully operational DRN should have the capability to perform robust statistical analysis while maintaining patient privacy, data security, and proprietary interests. A range of analytic options are available. Methods that incorporate confounder summary scores have the potential to adjust adequately for a large number of confounders without requiring potentially identifiable information to leave participating sites’ firewalls, and allow investigators to perform a wide range of analyses traditionally afforded by a centralized dataset with detailed patient-level information.
REFERENCES
1. Federal Coordinating Council for Comparative Effectiveness Research.
Report to the President and Congress on Comparative Effectiveness
Research. Washington, DC: Department of Health and Human Services;
2009.
2. Institute of Medicine. Initial Nationa l Priorities for Comparative Effective-
ness Research. Washington, DC: The National Academies Press; 2009.
3. Selby JV, Beal AC, Frank L. The Patient-Centered Outcomes Research
Institute (PCORI) national priorities for research and initial research
agenda. JAMA. 2012;307:1583–1584.
4. Schneeweiss S. Developments in post-marketing comparative effective-
ness research. Clin Pharmacol Ther. 2007;82:143–156.
5. Maro JC, Platt R, Holmes JH, et al. Design of a national distributed
health data network. Ann Intern Med. 2009;151:341–344.
6. Brown JS, Holmes JH, Shah K, et al. Distributed health data networks: a
practical and preferred approach to multi-institutional evaluations of
comparative effectiveness, safety, and quality of care. Med Care. 2010;
48:S45–51.
7. Toh S, Platt R, Steiner JF, et al. Comparative-effectiveness research in
distributed health data networks. Clin Pharmacol Ther. 2011;90:883–887.
8. McMurry AJ, Gilbert CA, Reis BY, et al. A self-scaling, distributed
information architecture for public health, research, and clinical care.
J Am Med Inform Assoc. 2007;14:527–533.
9. Diamond CC, Mostashari F, Shirky C. Collecting and sharing data for
population health: a new paradigm. Health Aff (Millwood). 2009;28:
454–466.
10. Ohm P. Broken promises of privacy: responding to the surprising failure
of anonymization. UCLA Law Review. 2010;57:1701–1777.
11. Maclure M, Mittleman MA. Should we use a case-crossover design?
Annu Rev Public Health. 2000;21:193–221.
12. Whitaker HJ, Farrington CP, Spiessens B, et al. Tutorial in biostatistics:
the self-controlled case series method. Stat Med. 2006;25:1768–1797.
13. Vandenbroucke JP. When are observational studies as credible as
randomised trials? Lancet. 2004;363:1728–1731.
14. Rosenbaum PR, Rubin DB. The central role of the propensity score in
observational studies for causal effects. Biometrika. 1983;70:41–55.
15. Rosenbaum PR, Rubin DB. Reducing bias in observational studies using
subclassification on the propensity score. J Am Stat Assoc. 1984;79:
516–524.
16. Miettinen OS. Stratification by a multivariate confounder score. Am J
Epidemiol. 1976;104:609–620.
17. Arbogast PG, Ray WA. Use of disease risk scores in
pharmacoepidemiologic studies. Stat Methods Med Res. 2009;18:67–80.
Toh et al. Medical Care, Volume 51, Number 8 Suppl 3, August 2013
S8 | www.lww-medicalcare.com © 2013 Lippincott Williams & Wilkins

Citations

Launching PCORnet, a national patient-centered clinical research network

TL;DR: The Patient-Centered Outcomes Research Institute has launched PCORnet, a major initiative to support an effective, sustainable national research infrastructure that will advance the use of electronic health data in comparative effectiveness research (CER) and other types of research.

The European Network of Centres for Pharmacoepidemiology and Pharmacovigilance (ENCePP)

TL;DR: The design of a study of the drug etiology of agranulocytosis and aplastic anemia and the role of automated record linkage in postmarketing comparative effectiveness research are reviewed.

Estimating Causal Effects in Observational Studies using Electronic Health Data: Challenges and (Some) Solutions

TL;DR: The challenges in estimating causal effects using electronic health data are outlined and some solutions are offered, with particular attention paid to propensity score methods that help ensure comparisons between similar groups.
References

Meta-Analysis in Clinical Trials*

TL;DR: This paper examines eight published reviews each reporting results from several related trials in order to evaluate the efficacy of a certain treatment for a specified medical condition and suggests a simple noniterative procedure for characterizing the distribution of treatment effects in a series of studies.

The central role of the propensity score in observational studies for causal effects

Paul R. Rosenbaum, Donald B. Rubin - 01 Apr 1983
TL;DR: The authors discuss the central role of propensity scores and balancing scores in the analysis of observational studies and show that adjustment for the scalar propensity score is sufficient to remove bias due to all observed covariates.
Book

Multiple imputation for nonresponse in surveys

TL;DR: In this article, a survey of drinking behavior among men of retirement age was conducted and the results showed that the majority of the participants reported that they did not receive any benefits from the Social Security Administration.

A simulation study of the number of events per variable in logistic regression analysis.

TL;DR: Findings indicate that a low number of events per variable (EPV) can lead to major problems: regression coefficients were biased in both positive and negative directions, and paradoxical associations (significance in the wrong direction) were increased.

Marginal Structural Models and Causal Inference in Epidemiology

TL;DR: In this paper, the authors introduce marginal structural models, a new class of causal models that allow for improved adjustment of confounding in observational studies with exposures or treatments that vary over time, when there exist time-dependent confounders that are also affected by previous treatment.
Frequently Asked Questions (6)
Q1. What are the contributions in "Confounding adjustment in comparative effectiveness research conducted within distributed research networks"?

A distributed research network (DRN) of electronic health care databases, in which data reside behind the firewall of each data partner, can support a wide range of comparative effectiveness research (CER) activities. This paper describes the strengths and limitations of different confounding adjustment approaches that can be considered in observational CER studies conducted within DRNs.

In most observational CER studies, adjustment for a large number of confounders is necessary because of the expected imbalances in many outcome risk factors between the treatment groups.

The exposure propensity score (PS)14,15 and the disease risk score (DRS)16,17 are the most commonly used confounder summary scores. 

PSs are the probabilities of having the study exposure given patients’ baseline characteristics, whereas DRSs are patients’ probabilities or hazards of having the study outcome conditional on their baseline characteristics. 
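The two definitions above can be written compactly. The notation is an assumption introduced here for illustration: $A$ is the binary study exposure, $Y$ the study outcome, $X$ the vector of baseline characteristics, and $\lambda$ the outcome hazard at follow-up time $t$.

```latex
\begin{align}
  \mathrm{PS}(x)  &= \Pr(A = 1 \mid X = x) \\
  \mathrm{DRS}(x) &= \Pr(Y = 1 \mid X = x)
  \quad \text{or} \quad
  \mathrm{DRS}(x, t) = \lambda(t \mid X = x)
\end{align}
```

Either score reduces the many components of $X$ to a single number per patient.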

In general, PSs are particularly well suited for CER studies that compare the effects of 2 treatments on multiple outcomes, whereas DRSs are more practical than PSs when there are >2 treatments and a single outcome. 

Confounder summary scores condense potentially identifiable information into nonidentifiable measures and are therefore particularly useful for DRN studies.
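How a summary score can keep patient-level rows behind each site's firewall may be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each site has already estimated a PS for every patient, that the network has agreed on fixed PS cut points (the hypothetical `PS_CUTS`), and that the coordinating center combines the shared counts with a Mantel-Haenszel risk ratio stratified by site and PS stratum. All function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical network-wide PS cut points agreed on in advance (assumption).
PS_CUTS = (0.2, 0.4, 0.6, 0.8)


def stratum_of(ps: float) -> int:
    """Index (0..len(PS_CUTS)) of the PS stratum a score falls into."""
    return sum(ps >= cut for cut in PS_CUTS)


def aggregate_site(records):
    """Collapse one site's patient-level rows into per-stratum counts.

    records: iterable of (ps, exposed, outcome), with exposed/outcome 0 or 1.
    Returns {stratum: [exposed_n, exposed_events,
                       unexposed_n, unexposed_events]}.
    Only this table, not the rows, would leave the site.
    """
    table = defaultdict(lambda: [0, 0, 0, 0])
    for ps, exposed, outcome in records:
        cell = table[stratum_of(ps)]
        if exposed:
            cell[0] += 1
            cell[1] += outcome
        else:
            cell[2] += 1
            cell[3] += outcome
    return dict(table)


def mantel_haenszel_rr(site_tables):
    """Mantel-Haenszel risk ratio over every (site, stratum) cell."""
    numerator = denominator = 0.0
    for table in site_tables:
        for n1, a, n0, c in table.values():
            if n1 == 0 or n0 == 0:
                continue  # no comparison group in this cell; contributes nothing
            total = n1 + n0
            numerator += a * n0 / total
            denominator += c * n1 / total
    if denominator == 0:
        raise ValueError("no unexposed events in informative strata")
    return numerator / denominator


if __name__ == "__main__":
    # Toy data for two sites; values are made up purely for illustration.
    site_a = [(0.1, 1, 1), (0.1, 1, 0), (0.1, 0, 1),
              (0.1, 0, 0), (0.1, 0, 0), (0.1, 0, 0)]
    site_b = [(0.9, 1, 1), (0.9, 1, 1), (0.9, 1, 0), (0.9, 1, 0),
              (0.9, 0, 1), (0.9, 0, 0), (0.9, 0, 0), (0.9, 0, 0)]
    tables = [aggregate_site(site_a), aggregate_site(site_b)]
    print(tables)
    print(mantel_haenszel_rr(tables))
```

In practice small cell counts can themselves be identifying, so a real network would likely add a minimum cell size rule before sharing; that step is omitted from this sketch.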