Author

Hossein Estiri

Other affiliations: Partners HealthCare, University of Washington

Bio: Hossein Estiri is an academic researcher at Harvard University. The author has contributed to research on data quality and energy policy, has an h-index of 9, and has co-authored 37 publications that have received 405 citations. Previous affiliations of Hossein Estiri include Partners HealthCare and the University of Washington.

Papers

Journal Article•DOI•

A Harmonized Data Quality Assessment Terminology and Framework for the Secondary Use of Electronic Health Record Data

Michael G. Kahn¹, Tiffany J. Callahan¹, Juliana Barnard¹, Alan Bauck², Jeffrey S. Brown³, Bruce N. Davidson⁴, Hossein Estiri⁵, Carsten Goerg¹, Erin Holve⁶, Steven G. Johnson⁷, Siaw-Teng Liaw⁸, Marianne Hamilton-Lopez⁹, Daniella Meeker¹⁰, Toan C. Ong¹¹, Patrick B. Ryan¹², Ning Shang¹³, Nicole G. Weiskopf¹⁴, Chunhua Weng¹³, Meredith N. Zozus¹⁵, Lisa M. Schilling¹¹

Anschutz Medical Campus¹, Kaiser Permanente², Harvard University³, Hoag⁴, University of Washington⁵, AcademyHealth⁶, University of Minnesota⁷, University of New South Wales⁸, National Academy of Sciences⁹, University of Southern California¹⁰, University of Colorado Denver¹¹, Janssen Pharmaceutica¹², Columbia University¹³, Oregon Health & Science University¹⁴, University of Arkansas for Medical Sciences¹⁵

11 Sep 2016

TL;DR: A consistent, common DQ terminology, organized into a logical framework, is an initial step in enabling data owners and users, patients, and policy makers to evaluate and communicate data quality findings in a well-defined manner with a shared vocabulary.

Abstract: Objective: Harmonized data quality (DQ) assessment terms, methods, and reporting practices can establish a common understanding of the strengths and limitations of electronic health record (EHR) data for operational analytics, quality improvement, and research. Existing published DQ terms were harmonized to a comprehensive unified terminology with definitions and examples and organized into a conceptual framework to support a common approach to defining whether EHR data is ‘fit’ for specific uses. Materials and Methods: DQ publications, informatics and analytics experts, managers of established DQ programs, and operational manuals from several mature EHR-based research networks were reviewed to identify potential DQ terms and categories. Two face-to-face stakeholder meetings were used to vet an initial set of DQ terms and definitions that were grouped into an overall conceptual framework. Feedback received from data producers and users was used to construct a draft set of harmonized DQ terms and categories. Multiple rounds of iterative refinement resulted in a set of terms and organizing framework consisting of DQ categories, subcategories, terms, definitions, and examples. The harmonized terminology and logical framework’s inclusiveness was evaluated against ten published DQ terminologies. Results: Existing DQ terms were harmonized and organized into a framework by defining three DQ categories: (1) Conformance (2) Completeness and (3) Plausibility and two DQ assessment contexts: (1) Verification and (2) Validation. Conformance and Plausibility categories were further divided into subcategories. Each category and subcategory was defined with respect to whether the data may be verified with organizational data, or validated against an accepted gold standard, depending on proposed context and uses. The coverage of the harmonized DQ terminology was validated by successfully aligning to multiple published DQ terminologies. 
Discussion: Existing DQ concepts, community input, and expert review informed the development of a distinct set of terms, organized into categories and subcategories. The resulting DQ terms successfully encompassed a wide range of disparate DQ terminologies. Operational definitions were developed to provide guidance for implementing DQ assessment procedures. The resulting structure is an inclusive DQ framework for standardizing DQ assessment and reporting. While our analysis focused on the DQ issues often found in EHR data, the new terminology may be applicable to a wide range of electronic health data such as administrative, research, and patient-reported data. Conclusion: A consistent, common DQ terminology, organized into a logical framework, is an initial step in enabling data owners and users, patients, and policy makers to evaluate and communicate data quality findings in a well-defined manner with a shared vocabulary. Future work will leverage the framework and terminology to develop reusable data quality assessment and reporting methods.

271 citations

Journal Article•DOI•

Predicting COVID-19 mortality with electronic medical records

Hossein Estiri¹, Zachary H. Strasser, Jeffy G. Klann¹, Pourandokht Naseri², Kavishwar B. Wagholikar¹, Shawn N. Murphy

Harvard University¹, University of Sydney²

04 Feb 2021

TL;DR: In this paper, the authors used a combination of computational methods and clinical expertise to predict death after COVID-19 using only the past medical information collected in electronic health records (EHRs) and understand the differences in risk factors across age groups.

Abstract: This study aims to predict death after COVID-19 using only the past medical information routinely collected in electronic health records (EHRs) and to understand the differences in risk factors across age groups. Combining computational methods and clinical expertise, we curated clusters that represent 46 clinical conditions as potential risk factors for death after a COVID-19 infection. We trained age-stratified generalized linear models (GLMs) with component-wise gradient boosting to predict the probability of death based on what we know from the patients before they contracted the virus. Despite only relying on previously documented demographics and comorbidities, our models demonstrated similar performance to other prognostic models that require an assortment of symptoms, laboratory values, and images at the time of diagnosis or during the course of the illness. In general, we found age as the most important predictor of mortality in COVID-19 patients. A history of pneumonia, which is rarely asked in typical epidemiology studies, was one of the most important risk factors for predicting COVID-19 mortality. A history of diabetes with complications and cancer (breast and prostate) were notable risk factors for patients between the ages of 45 and 65 years. In patients aged 65-85 years, diseases that affect the pulmonary system, including interstitial lung disease, chronic obstructive pulmonary disease, lung cancer, and a smoking history, were important for predicting mortality. The ability to compute precise individual-level risk scores exclusively based on the EHR is crucial for effectively allocating and distributing resources, such as prioritizing vaccination among the general population.

74 citations

Journal Article•DOI•

Evolving phenotypes of non-hospitalized patients that indicate long COVID.

Hossein Estiri¹, Zachary H. Strasser¹, Gabriel A. Brat¹, Yevgeniy R. Semenov¹, Chirag J. Patel¹, Shawn N. Murphy¹

Harvard University¹

27 Sep 2021-BMC Medicine

TL;DR: In this paper, the authors applied a computational framework for knowledge discovery from clinical data, MLHO, to identify phenotypes that positively associate with a past positive reverse transcription-polymerase chain reaction (RT-PCR) test for COVID-19.

Abstract: For some SARS-CoV-2 survivors, recovery from the acute phase of the infection has been grueling with lingering effects. Many of the symptoms characterized as the post-acute sequelae of COVID-19 (PASC) could have multiple causes or are similarly seen in non-COVID patients. Accurate identification of PASC phenotypes will be important to guide future research and help the healthcare system focus its efforts and resources on adequately controlled age- and gender-specific sequelae of a COVID-19 infection. In this retrospective electronic health record (EHR) cohort study, we applied a computational framework for knowledge discovery from clinical data, MLHO, to identify phenotypes that positively associate with a past positive reverse transcription-polymerase chain reaction (RT-PCR) test for COVID-19. We evaluated the post-test phenotypes in two temporal windows at 3–6 and 6–9 months after the test and by age and gender. Data from longitudinal diagnosis records stored in EHRs from Mass General Brigham in the Boston Metropolitan Area was used for the analyses. Statistical analyses were performed on data from March 2020 to June 2021. Study participants included over 96 thousand patients who had tested positive or negative for COVID-19 and were not hospitalized. We identified 33 phenotypes among different age/gender cohorts or time windows that were positively associated with past SARS-CoV-2 infection. All identified phenotypes were newly recorded in patients’ medical records 2 months or longer after a COVID-19 RT-PCR test in non-hospitalized patients regardless of the test result. 
Among these phenotypes, a new diagnosis record for anosmia and dysgeusia (OR 2.60, 95% CI [1.94–3.46]), alopecia (OR 3.09, 95% CI [2.53–3.76]), chest pain (OR 1.27, 95% CI [1.09–1.48]), chronic fatigue syndrome (OR 2.60, 95% CI [1.22–2.10]), shortness of breath (OR 1.41, 95% CI [1.22–1.64]), pneumonia (OR 1.66, 95% CI [1.28–2.16]), and type 2 diabetes mellitus (OR 1.41, 95% CI [1.22–1.64]) is one of the most significant indicators of a past COVID-19 infection. Additionally, more new phenotypes were found with increased confidence among the cohorts who were younger than 65. The findings of this study confirm many of the post-COVID-19 symptoms and suggest that a variety of new diagnoses, including new diabetes mellitus and neurological disorder diagnoses, are more common among those with a history of COVID-19 than those without the infection. Additionally, more than 63% of PASC phenotypes were observed in patients under 65 years of age, pointing out the importance of vaccination to minimize the risk of debilitating post-acute sequelae of COVID-19 among younger adults.

68 citations

Journal Article•DOI•

Age matters: Ageing and household energy demand in the United States

Hossein Estiri¹, Emilio Zagheni²

Harvard University¹, Max Planck Society²

01 Sep 2019-Energy Research & Social Science

TL;DR: In this paper, the authors evaluated the presence and shape of an age-energy consumption profile in the U.S. residential sector, using household-level data from four waves of the Residential Energy Consumption Survey (RECS) in 1987, 1990, 2005, and 2009.

Abstract: Age is an important proxy for many life course trajectories, and has complex and understudied relationships with energy consumption. We evaluated the presence and the shape of an age-energy consumption profile in the U.S. residential sector, using household-level data from four waves of the Residential Energy Consumption Survey (RECS) in 1987, 1990, 2005, and 2009. We constructed pseudo-cohorts from Bayesian generalized linear model estimates to create micro-profiles for energy consumption across the life course. Overall, we found that residential energy consumption increases over the life course. Much of the increase in energy consumption is due to housing size. Variations in the age-energy consumption micro-profiles can be described by concave and convex functions that transform from one to another across the life course. We conclude with a demographic perspective on the future of residential energy demand in the U.S. The confluence of demographic and climatic changes will likely cause an amplification of effects, challenging the supply and demand of energy for the older population.

59 citations

Posted Content•DOI•

Evolving Phenotypes of non-hospitalized Patients that Indicate Long Covid

Hossein Estiri¹, Zachary H. Strasser¹, Gabriel A. Brat¹, Yevgeniy R. Semenov¹, Chirag J. Patel¹, Shawn N. Murphy¹

Harvard University¹

10 Jul 2021-medRxiv

Abstract: For some SARS-CoV-2 survivors, recovery from the acute phase of the infection has been grueling with lingering effects. Many of the symptoms characterized as the post-acute sequelae of COVID-19 (PASC) could have multiple causes or are similarly seen in non-COVID patients. Accurate identification of phenotypes will be important to guide future research and help the healthcare system focus its efforts and resources on adequately controlled age- and gender-specific sequelae of a COVID-19 infection. In this retrospective electronic health records (EHR) cohort study, we applied a computational framework for knowledge discovery from clinical data, MLHO, to identify phenotypes that positively associate with a past positive reverse transcription-polymerase chain reaction (RT-PCR) test for COVID-19. We evaluated the post-test phenotypes in two temporal windows at 3-6 and 6-9 months after the test and by age and gender. Data from longitudinal diagnosis records stored in EHRs from Mass General Brigham in the Boston metropolitan area was used for the analyses. Statistical analyses were performed on data from March 2020 to June 2021. Study participants included over 96 thousand patients who had tested positive or negative for COVID-19 and were not hospitalized. We identified 33 phenotypes among different age/gender cohorts or time windows that were positively associated with past SARS-CoV-2 infection. All identified phenotypes were newly recorded in patients’ medical records two months or longer after a COVID-19 RT-PCR test in non-hospitalized patients regardless of the test result.
Among these phenotypes, a new diagnosis record for anosmia and dysgeusia (OR 2.60, 95% CI [1.94–3.46]), alopecia (OR 3.09, 95% CI [2.53–3.76]), chest pain (OR 1.27, 95% CI [1.09–1.48]), chronic fatigue syndrome (OR 2.60, 95% CI [1.22–2.10]), shortness of breath (OR 1.41, 95% CI [1.22–1.64]), pneumonia (OR 1.66, 95% CI [1.28–2.16]), and type 2 diabetes mellitus (OR 1.41, 95% CI [1.22–1.64]) are some of the most significant indicators of a past COVID-19 infection. Additionally, more new phenotypes were found with increased confidence among the cohorts who were younger than 65. Our approach avoids a flood of false positive discoveries while offering a more robust probabilistic approach compared to the standard linear phenome-wide association study (PheWAS). The findings of this study confirm many of the post-COVID symptoms and suggest that a variety of new diagnoses, including new diabetes mellitus and neurological disorder diagnoses, are more common among those with a history of COVID-19 than those without the infection. Additionally, more than 63 percent of PASC phenotypes were observed in patients under 65 years of age, pointing out the importance of vaccination to minimize the risk of debilitating post-acute sequelae of COVID-19 among younger adults.

49 citations

Cited by

Book Chapter•DOI•

Prospective Cohort Study

Victor R. Preedy, Ronald R. Watson

01 Jan 2010

5,842 citations

Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study

Fei Zhou¹, Ting Yu, Ronghui Du, Guohui Fan², Ying Liu, Zhibo Liu¹, Jie Xiang³, Yeming Wang⁴, Bin Song, Xiaoying Gu¹, Xiaoying Gu², Lulu Guan, Yuan Wei, Li Hui¹, Xudong Wu, Jiuyang Xu⁵, Shengjin Tu, Yi Zhang¹, Hua Chen, Bin Cao

Peking Union Medical College¹, China-Japan Friendship Hospital², Wuhan Jinyintan Hospital³, Capital Medical University⁴, Tsinghua University⁵

01 Jan 2020

TL;DR: Prolonged viral shedding provides the rationale for a strategy of isolation of infected patients and optimal antiviral interventions in the future.

Abstract: Background: Since December, 2019, Wuhan, China, has experienced an outbreak of coronavirus disease 2019 (COVID-19), caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Epidemiological and clinical characteristics of patients with COVID-19 have been reported but risk factors for mortality and a detailed clinical course of illness, including viral shedding, have not been well described. Methods: In this retrospective, multicentre cohort study, we included all adult inpatients (≥18 years old) with laboratory-confirmed COVID-19 from Jinyintan Hospital and Wuhan Pulmonary Hospital (Wuhan, China) who had been discharged or had died by Jan 31, 2020. Demographic, clinical, treatment, and laboratory data, including serial samples for viral RNA detection, were extracted from electronic medical records and compared between survivors and non-survivors. We used univariable and multivariable logistic regression methods to explore the risk factors associated with in-hospital death. Findings: 191 patients (135 from Jinyintan Hospital and 56 from Wuhan Pulmonary Hospital) were included in this study, of whom 137 were discharged and 54 died in hospital. 91 (48%) patients had a comorbidity, with hypertension being the most common (58 [30%] patients), followed by diabetes (36 [19%] patients) and coronary heart disease (15 [8%] patients). Multivariable regression showed increasing odds of in-hospital death associated with older age (odds ratio 1·10, 95% CI 1·03–1·17, per year increase; p=0·0043), higher Sequential Organ Failure Assessment (SOFA) score (5·65, 2·61–12·23; p…). Interpretation: The potential risk factors of older age, high SOFA score, and d-dimer greater than 1 μg/mL could help clinicians to identify patients with poor prognosis at an early stage. Prolonged viral shedding provides the rationale for a strategy of isolation of infected patients and optimal antiviral interventions in the future.
Funding: Chinese Academy of Medical Sciences Innovation Fund for Medical Sciences; National Science Grant for Distinguished Young Scholars; National Key Research and Development Program of China; The Beijing Science and Technology Project; and Major Projects of National Science and Technology on New Drug Creation and Development.

4,408 citations

Journal Article•

Statistical Analysis With Missing Data (2nd ed.) (Book)

Russell V. Lenth

01 Jan 2004-Journal of the American Statistical Association

1,583 citations

Journal Article•

Interactome networks and human disease

Marc Vidal

02 Dec 2014-The Biomedical & Life Sciences Collection

TL;DR: This review details why interactome networks are important to consider in biology, how they can be mapped and integrated with each other, what global properties are starting to emerge from interactome network models, and how these properties may relate to human disease.

Abstract: Complex biological systems and cellular networks may underlie most genotype to phenotype relationships. Here, we review basic concepts in network biology, discussing different types of interactome networks and the insights that can come from analyzing them. We elaborate on why interactome networks are important to consider in biology, how they can be mapped and integrated with each other, what global properties are starting to emerge from interactome network models, and how these properties may relate to human disease.

1,323 citations

Posted Content•

Deep Learning for Anomaly Detection: A Survey.

Raghavendra Chalapathy¹, Sanjay Chawla²

Cooperative Research Centre¹, Qatar Computing Research Institute²

10 Jan 2019-arXiv: Learning

TL;DR: A structured and comprehensive overview of research methods in deep learning-based anomaly detection, with state-of-the-art techniques grouped into categories based on the underlying assumptions and approach adopted.

Abstract: Anomaly detection is an important problem that has been well-studied within diverse research areas and application domains. The aim of this survey is twofold. First, we present a structured and comprehensive overview of research methods in deep learning-based anomaly detection. Second, we review the adoption of these methods for anomaly detection across various application domains and assess their effectiveness. We have grouped state-of-the-art research techniques into different categories based on the underlying assumptions and approach adopted. Within each category, we outline the basic anomaly detection technique, along with its variants, and present key assumptions used to differentiate between normal and anomalous behavior. For each category, we also present the advantages and limitations and discuss the computational complexity of the techniques in real application domains. Finally, we outline open issues in research and challenges faced while adopting these techniques.

522 citations
