SciSpace (formerly Typeset)
Author

Alan Karthikesalingam

Bio: Alan Karthikesalingam is an academic researcher at Google. He has contributed to research on topics including endovascular aneurysm repair and abdominal aortic aneurysm, has an h-index of 49, and has co-authored 170 publications receiving 8,980 citations. His previous affiliations include St George's Hospital and the University of London.


Papers
Journal ArticleDOI
TL;DR: A novel deep learning architecture performs device-independent tissue segmentation of clinical 3D retinal scans, followed by a separate diagnostic classification stage whose referral recommendations meet or exceed those of human experts in retinal disease.
Abstract: The volume and complexity of diagnostic imaging is increasing at a pace faster than the availability of human expertise to interpret it. Artificial intelligence has shown great promise in classifying two-dimensional photographs of some common diseases and typically relies on databases of millions of annotated images. Until now, the challenge of reaching the performance of expert clinicians in a real-world clinical pathway with three-dimensional diagnostic scans has remained unsolved. Here, we apply a novel deep learning architecture to a clinically heterogeneous set of three-dimensional optical coherence tomography scans from patients referred to a major eye hospital. We demonstrate performance in making a referral recommendation that reaches or exceeds that of experts on a range of sight-threatening retinal diseases after training on only 14,884 scans. Moreover, we demonstrate that the tissue segmentations produced by our architecture act as a device-independent representation; referral accuracy is maintained when using tissue segmentations from a different type of device. Our work removes previous barriers to wider clinical use without prohibitive training data requirements across multiple pathologies in a real-world setting.
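The two-stage design described above can be sketched as follows; the function names and interface are hypothetical (this is not the authors' code), but the structure mirrors the description: a segmentation model maps a raw OCT scan to a tissue map, and a separate classifier maps the tissue map to a referral decision, so the classifier never sees device-specific pixel data.

```python
def referral_recommendation(scan, segment, classify):
    """Sketch of the two-stage pipeline (hypothetical interface).

    `segment` maps a raw OCT scan to a tissue segmentation; `classify`
    maps that segmentation to a referral decision. Because the tissue
    map is device-independent, adapting to a new scanner type only
    requires revisiting the first stage.
    """
    tissue_map = segment(scan)      # device-specific stage
    return classify(tissue_map)     # device-independent stage
```

The design choice this illustrates is the one the abstract emphasises: referral accuracy is maintained across devices because only the intermediate representation, not the classifier, depends on the scanner.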

1,665 citations

Journal ArticleDOI
01 Jan 2020 - Nature
TL;DR: A robust assessment of the AI system paves the way for clinical trials to improve the accuracy and efficiency of breast cancer screening; combining AI and human inputs could improve screening efficiency further.
Abstract: Screening mammography aims to identify breast cancer at earlier stages of the disease, when treatment can be more successful1. Despite the existence of screening programmes worldwide, the interpretation of mammograms is affected by high rates of false positives and false negatives2. Here we present an artificial intelligence (AI) system that is capable of surpassing human experts in breast cancer prediction. To assess its performance in the clinical setting, we curated a large representative dataset from the UK and a large enriched dataset from the USA. We show an absolute reduction of 5.7% and 1.2% (USA and UK) in false positives and 9.4% and 2.7% in false negatives. We provide evidence of the ability of the system to generalize from the UK to the USA. In an independent study of six radiologists, the AI system outperformed all of the human readers: the area under the receiver operating characteristic curve (AUC-ROC) for the AI system was greater than the AUC-ROC for the average radiologist by an absolute margin of 11.5%. We ran a simulation in which the AI system participated in the double-reading process that is used in the UK, and found that the AI system maintained non-inferior performance and reduced the workload of the second reader by 88%. This robust assessment of the AI system paves the way for clinical trials to improve the accuracy and efficiency of breast cancer screening. An artificial intelligence (AI) system performs as well as or better than radiologists at detecting breast cancer from mammograms, and using a combination of AI and human inputs could help to improve screening efficiency.
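The double-reading simulation described above can be sketched as below; the function names and interface are ours, not the paper's, but the logic follows the description: the AI stands in for the second reader, and a human second read happens only when the AI and the first reader disagree.

```python
def simulated_double_read(first_reader_recall, ai_recall, second_reader):
    """Sketch of AI participation in UK-style double reading.

    Returns (decision, second_reader_consulted). When the AI agrees
    with the first human reader, that decision stands and no second
    human read occurs; disagreements fall back to the usual human
    second read (and, in practice, arbitration). High agreement rates
    are what produced the reported 88% reduction in second-reader
    workload.
    """
    if ai_recall == first_reader_recall:
        return first_reader_recall, False
    return second_reader(), True
```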

1,413 citations

Journal ArticleDOI
TL;DR: The safe and timely translation of AI research into clinically validated and appropriately regulated systems that can benefit everyone is challenging, and robust clinical evaluation, using metrics that are intuitive to clinicians and ideally go beyond measures of technical accuracy, is essential.
Abstract: Artificial intelligence (AI) research in healthcare is accelerating rapidly, with potential applications being demonstrated across various domains of medicine. However, there are currently limited examples of such techniques being successfully deployed into clinical practice. This article explores the main challenges and limitations of AI in healthcare, and considers the steps required to translate these potentially transformative technologies from research to clinical practice. Key challenges for the translation of AI systems in healthcare include those intrinsic to the science of machine learning, logistical difficulties in implementation, and consideration of the barriers to adoption as well as of the necessary sociocultural or pathway changes. Robust peer-reviewed clinical evaluation as part of randomised controlled trials should be viewed as the gold standard for evidence generation, but conducting these in practice may not always be appropriate or feasible. Performance metrics should aim to capture real clinical applicability and be understandable to intended users. Regulation that balances the pace of innovation with the potential for harm, alongside thoughtful post-market surveillance, is required to ensure that patients are not exposed to dangerous interventions nor deprived of access to beneficial innovations. Mechanisms to enable direct comparisons of AI systems must be developed, including the use of independent, local and representative test sets. Developers of AI algorithms must be vigilant to potential dangers, including dataset shift, accidental fitting of confounders, unintended discriminatory bias, the challenges of generalisation to new populations, and the unintended negative consequences of new algorithms on health outcomes. The safe and timely translation of AI research into clinically validated and appropriately regulated systems that can benefit everyone is challenging. 
Robust clinical evaluation, using metrics that are intuitive to clinicians and ideally go beyond measures of technical accuracy to include quality of care and patient outcomes, is essential. Further work is required (1) to identify themes of algorithmic bias and unfairness while developing mitigations to address these, (2) to reduce brittleness and improve generalisability, and (3) to develop methods for improved interpretability of machine learning predictions. If these goals can be achieved, the benefits for patients are likely to be transformational.

855 citations

Journal ArticleDOI
01 Aug 2019 - Nature
TL;DR: A deep learning approach that predicts the risk of acute kidney injury and provides confidence assessments and a list of the clinical features that are most salient to each prediction, alongside predicted future trajectories for clinically relevant blood tests are developed.
Abstract: The early prediction of deterioration could have an important role in supporting healthcare professionals, as an estimated 11% of deaths in hospital follow a failure to promptly recognize and treat deteriorating patients1. To achieve this goal requires predictions of patient risk that are continuously updated and accurate, and delivered at an individual level with sufficient context and enough time to act. Here we develop a deep learning approach for the continuous risk prediction of future deterioration in patients, building on recent work that models adverse events from electronic health records2–17 and using acute kidney injury—a common and potentially life-threatening condition18—as an exemplar. Our model was developed on a large, longitudinal dataset of electronic health records that cover diverse clinical environments, comprising 703,782 adult patients across 172 inpatient and 1,062 outpatient sites. Our model predicts 55.8% of all inpatient episodes of acute kidney injury, and 90.2% of all acute kidney injuries that required subsequent administration of dialysis, with a lead time of up to 48 h and a ratio of 2 false alerts for every true alert. In addition to predicting future acute kidney injury, our model provides confidence assessments and a list of the clinical features that are most salient to each prediction, alongside predicted future trajectories for clinically relevant blood tests9. Although the recognition and prompt treatment of acute kidney injury is known to be challenging, our approach may offer opportunities for identifying patients at risk within a time window that enables early treatment. A deep learning approach that predicts the risk of acute kidney injury may help to identify patients at risk of health deterioration within a time window that enables early treatment.
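The quoted alert ratio maps directly onto precision; a one-line conversion makes the operating point concrete (the helper name is ours, not from the paper):

```python
def alert_ppv(false_alerts_per_true_alert: float) -> float:
    """Positive predictive value implied by a false:true alert ratio."""
    return 1.0 / (1.0 + false_alerts_per_true_alert)

# Two false alerts for every true alert, as reported above, means
# roughly one in three alerts is a true positive.
ppv_at_reported_operating_point = alert_ppv(2)   # ~0.33
```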

617 citations

Posted Content
TL;DR: This work identifies underspecification as a common failure mode in practical ML pipelines, demonstrates that it appears across a wide variety of domains, and argues that pipelines intended for real-world deployment must account for it explicitly.
Abstract: ML models often exhibit unexpectedly poor behavior when they are deployed in real-world domains. We identify underspecification as a key reason for these failures. An ML pipeline is underspecified when it can return many predictors with equivalently strong held-out performance in the training domain. Underspecification is common in modern ML pipelines, such as those based on deep learning. Predictors returned by underspecified pipelines are often treated as equivalent based on their training domain performance, but we show here that such predictors can behave very differently in deployment domains. This ambiguity can lead to instability and poor model behavior in practice, and is a distinct failure mode from previously identified issues arising from structural mismatch between training and deployment domains. We show that this problem appears in a wide variety of practical ML pipelines, using examples from computer vision, medical imaging, natural language processing, clinical risk prediction based on electronic health records, and medical genomics. Our results show the need to explicitly account for underspecification in modeling pipelines that are intended for real-world deployment in any domain.

374 citations


Cited by
Journal ArticleDOI
TL;DR: Writing group membership of the American Heart Association's Heart Disease and Stroke Statistics—2017 Update (the indexed abstract is the author list).
Abstract: WRITING GROUP MEMBERS Emelia J. Benjamin, MD, SCM, FAHA Michael J. Blaha, MD, MPH Stephanie E. Chiuve, ScD Mary Cushman, MD, MSc, FAHA Sandeep R. Das, MD, MPH, FAHA Rajat Deo, MD, MTR Sarah D. de Ferranti, MD, MPH James Floyd, MD, MS Myriam Fornage, PhD, FAHA Cathleen Gillespie, MS Carmen R. Isasi, MD, PhD, FAHA Monik C. Jiménez, ScD, SM Lori Chaffin Jordan, MD, PhD Suzanne E. Judd, PhD Daniel Lackland, DrPH, FAHA Judith H. Lichtman, PhD, MPH, FAHA Lynda Lisabeth, PhD, MPH, FAHA Simin Liu, MD, ScD, FAHA Chris T. Longenecker, MD Rachel H. Mackey, PhD, MPH, FAHA Kunihiro Matsushita, MD, PhD, FAHA Dariush Mozaffarian, MD, DrPH, FAHA Michael E. Mussolino, PhD, FAHA Khurram Nasir, MD, MPH, FAHA Robert W. Neumar, MD, PhD, FAHA Latha Palaniappan, MD, MS, FAHA Dilip K. Pandey, MBBS, MS, PhD, FAHA Ravi R. Thiagarajan, MD, MPH Mathew J. Reeves, PhD Matthew Ritchey, PT, DPT, OCS, MPH Carlos J. Rodriguez, MD, MPH, FAHA Gregory A. Roth, MD, MPH Wayne D. Rosamond, PhD, FAHA Comilla Sasson, MD, PhD, FAHA Amytis Towfighi, MD Connie W. Tsao, MD, MPH Melanie B. Turner, MPH Salim S. Virani, MD, PhD, FAHA Jenifer H. Voeks, PhD Joshua Z. Willey, MD, MS John T. Wilkins, MD Jason HY. Wu, MSc, PhD, FAHA Heather M. Alger, PhD Sally S. Wong, PhD, RD, CDN, FAHA Paul Muntner, PhD, MHSc On behalf of the American Heart Association Statistics Committee and Stroke Statistics Subcommittee Heart Disease and Stroke Statistics—2017 Update

7,190 citations

Journal ArticleDOI
TL;DR: Writing group membership of an American Heart Association Heart Disease and Stroke Statistics update (the indexed abstract is the author list).
Abstract: Author(s): Writing Group Members; Mozaffarian, Dariush; Benjamin, Emelia J; Go, Alan S; Arnett, Donna K; Blaha, Michael J; Cushman, Mary; Das, Sandeep R; de Ferranti, Sarah; Despres, Jean-Pierre; Fullerton, Heather J; Howard, Virginia J; Huffman, Mark D; Isasi, Carmen R; Jimenez, Monik C; Judd, Suzanne E; Kissela, Brett M; Lichtman, Judith H; Lisabeth, Lynda D; Liu, Simin; Mackey, Rachel H; Magid, David J; McGuire, Darren K; Mohler, Emile R; Moy, Claudia S; Muntner, Paul; Mussolino, Michael E; Nasir, Khurram; Neumar, Robert W; Nichol, Graham; Palaniappan, Latha; Pandey, Dilip K; Reeves, Mathew J; Rodriguez, Carlos J; Rosamond, Wayne; Sorlie, Paul D; Stein, Joel; Towfighi, Amytis; Turan, Tanya N; Virani, Salim S; Woo, Daniel; Yeh, Robert W; Turner, Melanie B; American Heart Association Statistics Committee; Stroke Statistics Subcommittee

6,181 citations

Journal ArticleDOI
TL;DR: Writing group membership of the American Heart Association's Heart Disease and Stroke Statistics—2019 Update (the indexed abstract is the author list).
Abstract: March 5, 2019 e1 WRITING GROUP MEMBERS Emelia J. Benjamin, MD, ScM, FAHA, Chair Paul Muntner, PhD, MHS, FAHA, Vice Chair Alvaro Alonso, MD, PhD, FAHA Marcio S. Bittencourt, MD, PhD, MPH Clifton W. Callaway, MD, FAHA April P. Carson, PhD, MSPH, FAHA Alanna M. Chamberlain, PhD Alexander R. Chang, MD, MS Susan Cheng, MD, MMSc, MPH, FAHA Sandeep R. Das, MD, MPH, MBA, FAHA Francesca N. Delling, MD, MPH Luc Djousse, MD, ScD, MPH Mitchell S.V. Elkind, MD, MS, FAHA Jane F. Ferguson, PhD, FAHA Myriam Fornage, PhD, FAHA Lori Chaffin Jordan, MD, PhD, FAHA Sadiya S. Khan, MD, MSc Brett M. Kissela, MD, MS Kristen L. Knutson, PhD Tak W. Kwan, MD, FAHA Daniel T. Lackland, DrPH, FAHA Tené T. Lewis, PhD Judith H. Lichtman, PhD, MPH, FAHA Chris T. Longenecker, MD Matthew Shane Loop, PhD Pamela L. Lutsey, PhD, MPH, FAHA Seth S. Martin, MD, MHS, FAHA Kunihiro Matsushita, MD, PhD, FAHA Andrew E. Moran, MD, MPH, FAHA Michael E. Mussolino, PhD, FAHA Martin O’Flaherty, MD, MSc, PhD Ambarish Pandey, MD, MSCS Amanda M. Perak, MD, MS Wayne D. Rosamond, PhD, MS, FAHA Gregory A. Roth, MD, MPH, FAHA Uchechukwu K.A. Sampson, MD, MBA, MPH, FAHA Gary M. Satou, MD, FAHA Emily B. Schroeder, MD, PhD, FAHA Svati H. Shah, MD, MHS, FAHA Nicole L. Spartano, PhD Andrew Stokes, PhD David L. Tirschwell, MD, MS, MSc, FAHA Connie W. Tsao, MD, MPH, Vice Chair Elect Mintu P. Turakhia, MD, MAS, FAHA Lisa B. VanWagner, MD, MSc, FAST John T. Wilkins, MD, MS, FAHA Sally S. Wong, PhD, RD, CDN, FAHA Salim S. Virani, MD, PhD, FAHA, Chair Elect On behalf of the American Heart Association Council on Epidemiology and Prevention Statistics Committee and Stroke Statistics Subcommittee

5,739 citations

01 Mar 2007
TL;DR: Describes an initiative to develop uniform standards for defining and classifying AKI, and to establish a forum for multidisciplinary interaction, in order to improve care for patients with or at risk for AKI.
Abstract: Acute kidney injury (AKI) is a complex disorder for which currently there is no accepted definition. Having a uniform standard for diagnosing and classifying AKI would enhance our ability to manage these patients. Future clinical and translational research in AKI will require collaborative networks of investigators drawn from various disciplines, dissemination of information via multidisciplinary joint conferences and publications, and improved translation of knowledge from pre-clinical research. We describe an initiative to develop uniform standards for defining and classifying AKI and to establish a forum for multidisciplinary interaction to improve care for patients with or at risk for AKI. Members representing key societies in critical care and nephrology along with additional experts in adult and pediatric AKI participated in a two day conference in Amsterdam, The Netherlands, in September 2005 and were assigned to one of three workgroups. Each group's discussions formed the basis for draft recommendations that were later refined and improved during discussion with the larger group. Dissenting opinions were also noted. The final draft recommendations were circulated to all participants and subsequently agreed upon as the consensus recommendations for this report. Participating societies endorsed the recommendations and agreed to help disseminate the results. The term AKI is proposed to represent the entire spectrum of acute renal failure. Diagnostic criteria for AKI are proposed based on acute alterations in serum creatinine or urine output. A staging system for AKI which reflects quantitative changes in serum creatinine and urine output has been developed. We describe the formation of a multidisciplinary collaborative network focused on AKI. We have proposed uniform standards for diagnosing and classifying AKI which will need to be validated in future studies. 
The Acute Kidney Injury Network offers a mechanism for proceeding with efforts to improve patient outcomes.
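The creatinine-based part of the proposed staging can be sketched as below. The thresholds are paraphrased from the AKIN consensus criteria; the sketch omits the urine-output criteria and the 48-hour time window, so treat it as an illustration of the staging structure, not a clinical rule.

```python
def akin_stage_from_creatinine(baseline_scr, current_scr, on_rrt=False):
    """Illustrative AKIN stage from serum creatinine (mg/dl) alone.

    The full consensus criteria also stage by urine output and require
    the creatinine change to occur within 48 hours; both are omitted.
    Returns 0 when the creatinine criteria for AKI are not met.
    """
    rise = current_scr - baseline_scr
    ratio = current_scr / baseline_scr
    # Stage 3: >3x baseline, or >=4.0 mg/dl with an acute rise >=0.5,
    # or receipt of renal replacement therapy.
    if on_rrt or ratio > 3.0 or (current_scr >= 4.0 and rise >= 0.5):
        return 3
    # Stage 2: more than 2x baseline.
    if ratio > 2.0:
        return 2
    # Stage 1: absolute rise >=0.3 mg/dl or >=1.5x baseline.
    if rise >= 0.3 or ratio >= 1.5:
        return 1
    return 0
```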

5,467 citations