Showing papers by "Johns Hopkins University" published in 2019


Journal ArticleDOI
28 Aug 2019-BMJ
TL;DR: The Cochrane risk-of-bias tool has been updated to respond to developments in understanding how bias arises in randomised trials, and to address user feedback on and limitations of the original tool.
Abstract: Assessment of risk of bias is regarded as an essential component of a systematic review on the effects of an intervention. The most commonly used tool for randomised trials is the Cochrane risk-of-bias tool. We updated the tool to respond to developments in understanding how bias arises in randomised trials, and to address user feedback on and limitations of the original tool.

9,228 citations


Journal ArticleDOI
TL;DR: This work presents a method named HISAT2 (hierarchical indexing for spliced alignment of transcripts 2) that can align both DNA and RNA sequences using a graph Ferragina Manzini index, and uses it to represent and search an expanded model of the human reference genome.
Abstract: The human reference genome represents only a small number of individuals, which limits its usefulness for genotyping. We present a method named HISAT2 (hierarchical indexing for spliced alignment of transcripts 2) that can align both DNA and RNA sequences using a graph Ferragina Manzini index. We use HISAT2 to represent and search an expanded model of the human reference genome in which over 14.5 million genomic variants in combination with haplotypes are incorporated into the data structure used for searching and alignment. We benchmark HISAT2 using simulated and real datasets to demonstrate that our strategy of representing a population of genomes, together with a fast, memory-efficient search algorithm, provides more detailed and accurate variant analyses than other methods. We apply HISAT2 for HLA typing and DNA fingerprinting; both applications form part of the HISAT-genotype software that enables analysis of haplotype-resolved genes or genomic regions. HISAT-genotype outperforms other computational methods and matches or exceeds the performance of laboratory-based assays. A graph-based genome indexing scheme enables variant-aware alignment of sequences with very low memory requirements.

4,855 citations
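
The HISAT2 abstract above refers to a graph Ferragina Manzini (FM) index. As a rough illustration of the underlying idea only, the sketch below builds a plain FM index over a single linear string and counts pattern occurrences by backward search. It is not the graph variant and not HISAT2's implementation; the function names (`build_fm_index`, `find`) and the example sequence are hypothetical.

```python
def build_fm_index(text):
    """Build a toy FM index (suffix array, first-column offsets, occurrence table)."""
    text += "$"  # sentinel: assumed absent from text and smaller than every symbol
    # Naive suffix array; fine for short examples, far too slow for genomes.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # Burrows-Wheeler transform: the character preceding each sorted suffix.
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # first[c]: number of characters in the text strictly smaller than c.
    first, total = {}, 0
    for c in alphabet:
        first[c] = total
        total += text.count(c)
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return sa, first, occ


def find(pattern, fm):
    """Return sorted start positions of pattern, using backward search."""
    sa, first, occ = fm
    lo, hi = 0, len(sa)  # current half-open range of matching suffixes
    for c in reversed(pattern):
        if c not in first:
            return []
        lo = first[c] + occ[c][lo]
        hi = first[c] + occ[c][hi]
        if lo >= hi:
            return []
    return sorted(sa[i] for i in range(lo, hi))


fm = build_fm_index("ACGTACGT")  # made-up reference sequence
print(find("ACG", fm))
print(find("GTA", fm))
print(find("TT", fm))
```

Each step of the backward search narrows a range in the suffix array using only the two lookup tables, which is why the search cost depends on the pattern length rather than on the size of the indexed text.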


Journal ArticleDOI
TL;DR: Food in the Anthropocene: the EAT-Lancet Commission on healthy diets from sustainable food systems focuses on meat, fish, vegetables and fruit as sources of protein.

4,710 citations


Journal ArticleDOI
TL;DR: On a shelf in the sunny, open-plan office of Cochrane Australia in Melbourne, there's a large, white ring-binder that, it's fair to say, hasn't been opened in a while.
Abstract: On a shelf in the sunny, open-plan office of Cochrane Australia in Melbourne, there's a large, white ring-binder that, it's fair to say, hasn't been opened in a while. It's a printed copy of the original, 1994 edition of the Cochrane Collaboration Handbook, edited by Dave Sackett,[1] and within it the original guidance on the methods to be used. The section on preparing and maintaining systematic reviews, edited by Andy Oxman, weighs in at a total of 76 pages.[2]

4,228 citations


Journal ArticleDOI
TL;DR: Kraken 2 improves upon Kraken 1 by reducing memory usage by 85%, allowing greater amounts of reference genomic data to be used, while maintaining high accuracy and increasing speed fivefold.
Abstract: Although Kraken’s k-mer-based approach provides a fast taxonomic classification of metagenomic sequence data, its large memory requirements can be limiting for some applications. Kraken 2 improves upon Kraken 1 by reducing memory usage by 85%, allowing greater amounts of reference genomic data to be used, while maintaining high accuracy and increasing speed fivefold. Kraken 2 also introduces a translated search mode, providing increased sensitivity in viral metagenomics analysis.

2,261 citations
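
To illustrate the general idea of k-mer-based classification mentioned in the Kraken 2 abstract above, here is a toy sketch: every k-mer of each reference sequence is mapped to a taxon label, and a read is assigned to the taxon that most of its unambiguous k-mers hit. This is a simplified illustration, not Kraken's actual algorithm or data structure; the taxon names, the sequences, and the choice k = 4 are made up.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping substring of length k."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(references, k):
    """Map each k-mer to the set of taxa whose reference contains it."""
    index = {}
    for taxon, seq in references.items():
        for kmer in kmers(seq, k):
            index.setdefault(kmer, set()).add(taxon)
    return index


def classify(read, index, k):
    """Return the taxon hit by the most unambiguous k-mers, or None."""
    votes = Counter()
    for kmer in kmers(read, k):
        taxa = index.get(kmer, set())
        if len(taxa) == 1:  # ignore k-mers shared between taxa
            votes[next(iter(taxa))] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Hypothetical reference sequences, for illustration only.
references = {
    "taxon_A": "ACGTACGTTAGC",
    "taxon_B": "TTGGCCAATTGG",
}
index = build_index(references, k=4)
print(classify("CGTACGTT", index, k=4))
print(classify("GGCCAATT", index, k=4))
print(classify("AAAAAAAA", index, k=4))
```

A dictionary of sets is easy to read but memory-hungry. The abstract's point is precisely that the memory footprint of the lookup structure limits how much reference data can be used.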


Journal ArticleDOI
TL;DR: In patients with severe aortic stenosis who were at low surgical risk, TAVR with a self‐expanding supraannular bioprosthesis was noninferior to surgery with respect to the composite end point of death or disabling stroke at 24 months.
Abstract: Background Transcatheter aortic-valve replacement (TAVR) is an alternative to surgery in patients with severe aortic stenosis who are at increased risk for death from surgery; less is know...

2,240 citations


Journal ArticleDOI
TL;DR: In this paper, an improved determination of the Hubble constant (H0) from HST observations of 70 long-period Cepheids in the Large Magellanic Cloud was presented.
Abstract: We present an improved determination of the Hubble constant (H0) from Hubble Space Telescope (HST) observations of 70 long-period Cepheids in the Large Magellanic Cloud. These were obtained with the same WFC3 photometric system used to measure Cepheids in the hosts of Type Ia supernovae. Gyroscopic control of HST was employed to reduce overheads while collecting a large sample of widely-separated Cepheids. The Cepheid Period-Luminosity relation provides a zeropoint-free link with 0.4% precision between the new 1.2% geometric distance to the LMC from Detached Eclipsing Binaries (DEBs) measured by Pietrzynski et al. (2019) and the luminosity of SNe Ia. Measurements and analysis of the LMC Cepheids were completed prior to knowledge of the new LMC distance. Combined with a refined calibration of the count-rate linearity of WFC3-IR with 0.1% precision (Riess et al. 2019), these three improved elements together reduce the full uncertainty in the LMC geometric calibration of the Cepheid distance ladder from 2.5% to 1.3%. Using only the LMC DEBs to calibrate the ladder we find H0 = 74.22 +/- 1.82 km/s/Mpc including systematic uncertainties, 3% higher than before for this particular anchor. Combining the LMC DEBs, masers in NGC 4258 and Milky Way parallaxes yields our best estimate: H0 = 74.03 +/- 1.42 km/s/Mpc, including systematics, an uncertainty of 1.91%, which is 15% lower than our best previous result. Removing any one of these anchors changes H0 by < 0.7%. The difference between H0 measured locally and the value inferred from Planck CMB+LCDM is 6.6 +/- 1.5 km/s/Mpc or 4.4 sigma (P=99.999% for Gaussian errors) in significance, raising the discrepancy beyond a plausible level of chance. We summarize independent tests which show this discrepancy is not readily attributable to an error in any one source or measurement, increasing the odds that it results from a cosmological feature beyond LambdaCDM.

1,924 citations
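
As a rough check of the significance quoted in the Hubble constant abstract above, the sketch below converts the stated difference (6.6 +/- 1.5 km/s/Mpc) into a number of standard deviations and a two-sided Gaussian confidence level. It uses only figures given in the abstract; the function name `gaussian_significance` is illustrative and not from the paper.

```python
import math


def gaussian_significance(difference, sigma):
    """Return (n_sigma, two-sided Gaussian confidence) for a measured difference."""
    n_sigma = difference / sigma
    # Probability that a Gaussian variable lies within +/- n_sigma of its mean.
    confidence = math.erf(n_sigma / math.sqrt(2.0))
    return n_sigma, confidence


# Values quoted in the abstract: local H0 minus the Planck CMB+LCDM value.
n_sigma, confidence = gaussian_significance(6.6, 1.5)
print(f"{n_sigma:.1f} sigma, confidence {100 * confidence:.3f}%")

# Fractional uncertainty of the best estimate, H0 = 74.03 +/- 1.42 km/s/Mpc.
fractional = 1.42 / 74.03
print(f"fractional uncertainty {100 * fractional:.2f}%")
```

With these rounded inputs the result is about 4.4 sigma and a confidence close to 99.999%, in line with the abstract. The fractional uncertainty comes out near 1.92%; the small difference from the quoted 1.91% presumably reflects rounding of the published inputs.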


Journal ArticleDOI
TL;DR: TMB, in concert with PD-L1 expression, has been demonstrated to be a useful biomarker for ICB selection across some cancer types; however, further prospective validation studies are required.

1,490 citations


Posted ContentDOI
03 Oct 2019-bioRxiv
TL;DR: Analysis of the v8 data provides insights into the tissue-specificity of genetic effects, and shows that cell type composition is a key factor in understanding gene regulatory mechanisms in human tissues.
Abstract: The Genotype-Tissue Expression (GTEx) project was established to characterize genetic effects on the transcriptome across human tissues, and to link these regulatory mechanisms to trait and disease associations. Here, we present analyses of the v8 data, based on 17,382 RNA-sequencing samples from 54 tissues of 948 post-mortem donors. We comprehensively characterize genetic associations for gene expression and splicing in cis and trans, showing that regulatory associations are found for almost all genes, and describe the underlying molecular mechanisms and their contribution to allelic heterogeneity and pleiotropy of complex traits. Leveraging the large diversity of tissues, we provide insights into the tissue-specificity of genetic effects, and show that cell type composition is a key factor in understanding gene regulatory mechanisms in human tissues.

1,243 citations


Journal ArticleDOI
TL;DR: The NCCN Guidelines for Prostate Cancer include recommendations regarding diagnosis, risk stratification and workup, treatment options for localized disease, and management of recurrent and advanced disease for clinicians who treat patients with prostate cancer.
Abstract: The NCCN Guidelines for Prostate Cancer include recommendations regarding diagnosis, risk stratification and workup, treatment options for localized disease, and management of recurrent and advanced disease for clinicians who treat patients with prostate cancer. The portions of the guidelines included herein focus on the roles of germline and somatic genetic testing, risk stratification with nomograms and tumor multigene molecular testing, androgen deprivation therapy, secondary hormonal therapy, chemotherapy, and immunotherapy in patients with prostate cancer.

1,218 citations



Journal ArticleDOI
TL;DR: These data provide the most comprehensive survey of genetic risk within Parkinson's disease to date, providing a biological context for these risk factors, and showing that a considerable genetic component of this disease remains unidentified.
Abstract: Background: Genome-wide association studies (GWAS) in Parkinson's disease have increased the scope of biological knowledge about the disease over the past decade. We aimed to use the largest aggregate of GWAS data to identify novel risk loci and gain further insight into the causes of Parkinson's disease. Methods: We did a meta-analysis of 17 datasets from Parkinson's disease GWAS available from European ancestry samples to nominate novel loci for disease risk. These datasets incorporated all available data. We then used these data to estimate heritable risk and develop predictive models of this heritability. We also used large gene expression and methylation resources to examine possible functional consequences as well as tissue, cell type, and biological pathway enrichments for the identified risk factors. Additionally, we examined shared genetic risk between Parkinson's disease and other phenotypes of interest via genetic correlations followed by Mendelian randomisation. Findings: Between Oct 1, 2017, and Aug 9, 2018, we analysed 7·8 million single nucleotide polymorphisms in 37 688 cases, 18 618 UK Biobank proxy-cases (ie, individuals who do not have Parkinson's disease but have a first degree relative that does), and 1·4 million controls. We identified 90 independent genome-wide significant risk signals across 78 genomic regions, including 38 novel independent risk signals in 37 loci. These 90 variants explained 16–36% of the heritable risk of Parkinson's disease depending on prevalence. Integrating methylation and expression data within a Mendelian randomisation framework identified putatively associated genes at 70 risk signals underlying GWAS loci for follow-up functional studies. Tissue-specific expression enrichment analyses suggested Parkinson's disease loci were heavily brain-enriched, with specific neuronal cell types being implicated from single cell data. We found significant genetic correlations with brain volumes (false discovery rate-adjusted p=0·0035 for intracranial volume, p=0·024 for putamen volume), smoking status (p=0·024), and educational attainment (p=0·038). Mendelian randomisation between cognitive performance and Parkinson's disease risk showed a robust association (p=8·00 × 10⁻⁷). Interpretation: These data provide the most comprehensive survey of genetic risk within Parkinson's disease to date, to the best of our knowledge, by revealing many additional Parkinson's disease risk loci, providing a biological context for these risk factors, and showing that a considerable genetic component of this disease remains unidentified. These associations derived from European ancestry datasets will need to be followed-up with more diverse data. Funding: The National Institute on Aging at the National Institutes of Health (USA), The Michael J Fox Foundation, and The Parkinson's Foundation (see appendix for full list of funding sources).

Journal ArticleDOI
Eli A. Stahl, Gerome Breen, Andreas J. Forstner, +339 more (107 institutions)
TL;DR: Genome-wide analysis identifies 30 loci associated with bipolar disorder, allowing for comparisons of shared genes and pathways with other psychiatric disorders, including schizophrenia and depression.
Abstract: Bipolar disorder is a highly heritable psychiatric disorder. We performed a genome-wide association study (GWAS) including 20,352 cases and 31,358 controls of European descent, with follow-up analysis of 822 variants with P < 1 × 10⁻⁴ in an additional 9,412 cases and 137,760 controls. Eight of the 19 variants that were genome-wide significant (P < 5 × 10⁻⁸) in the discovery GWAS were not genome-wide significant in the combined analysis, consistent with small effect sizes and limited power but also with genetic heterogeneity. In the combined analysis, 30 loci were genome-wide significant, including 20 newly identified loci. The significant loci contain genes encoding ion channels, neurotransmitter transporters and synaptic components. Pathway analysis revealed nine significantly enriched gene sets, including regulation of insulin secretion and endocannabinoid signaling. Bipolar I disorder is strongly genetically correlated with schizophrenia, driven by psychosis, whereas bipolar II disorder is more strongly correlated with major depressive disorder. These findings address key clinical questions and provide potential biological mechanisms for bipolar disorder.

Journal ArticleDOI
TL;DR: With prolonged follow-up, first-line pembrolizumab monotherapy continues to demonstrate an OS benefit over chemotherapy in patients with previously untreated, advanced NSCLC without EGFR/ALK aberrations, despite crossover from the control arm to pembrolizumab as subsequent therapy.
Abstract: Purpose: In the randomized, open-label, phase III KEYNOTE-024 study, pembrolizumab significantly improved progression-free survival and overall survival (OS) compared with platinum-based chemotherapy in patients with previously untreated advanced non–small-cell lung cancer (NSCLC) with a programmed death ligand 1 tumor proportion score of 50% or greater and without EGFR/ALK aberrations. We report an updated OS and tolerability analysis, including analyses adjusting for potential bias introduced by crossover from chemotherapy to pembrolizumab. Patients and Methods: Patients were randomly assigned to pembrolizumab 200 mg every 3 weeks (for up to 2 years) or investigator’s choice of platinum-based chemotherapy (four to six cycles). Patients assigned to chemotherapy could cross over to pembrolizumab upon meeting eligibility criteria. The primary end point was progression-free survival; OS was an important key secondary end point. Crossover adjustment analysis was done using the following three methods: simplified ...

Journal ArticleDOI
Željko Ivezić, Steven M. Kahn, J. Anthony Tyson, Bob Abel, +332 more (55 institutions)
TL;DR: The Large Synoptic Survey Telescope (LSST) is a large, wide-field ground-based system designed to obtain repeated images covering the sky visible from Cerro Pachon in northern Chile.
Abstract: We describe here the most ambitious survey currently planned in the optical, the Large Synoptic Survey Telescope (LSST). The LSST design is driven by four main science themes: probing dark energy and dark matter, taking an inventory of the solar system, exploring the transient optical sky, and mapping the Milky Way. LSST will be a large, wide-field ground-based system designed to obtain repeated images covering the sky visible from Cerro Pachon in northern Chile. The telescope will have an 8.4 m (6.5 m effective) primary mirror, a 9.6 deg² field of view, a 3.2-gigapixel camera, and six filters (ugrizy) covering the wavelength range 320–1050 nm. The project is in the construction phase and will begin regular survey operations by 2022. About 90% of the observing time will be devoted to a deep-wide-fast survey mode that will uniformly observe an 18,000 deg² region about 800 times (summed over all six bands) during the anticipated 10 yr of operations and will yield a co-added map to r ~ 27.5. These data will result in databases including about 32 trillion observations of 20 billion galaxies and a similar number of stars, and they will serve the majority of the primary science programs. The remaining 10% of the observing time will be allocated to special projects such as Very Deep and Very Fast time domain surveys, whose details are currently under discussion. We illustrate how the LSST science drivers led to these choices of system parameters, and we describe the expected data products and their characteristics.

Journal ArticleDOI
TL;DR: A new population of CAFs that express MHC class II and CD74 but not classical co-stimulatory molecules is described; these cells activate CD4+ T cells in an antigen-specific fashion in a model system, confirming their putative immune-modulatory capacity.
Abstract: Cancer-associated fibroblasts (CAF) are major players in the progression and drug resistance of pancreatic ductal adenocarcinoma (PDAC). CAFs constitute a diverse cell population consisting of several recently described subtypes, although the extent of CAF heterogeneity has remained undefined. Here we use single-cell RNA sequencing to thoroughly characterize the neoplastic and tumor microenvironment content of human and mouse PDAC tumors. We corroborate the presence of myofibroblastic CAFs and inflammatory CAFs and define their unique gene signatures in vivo. Moreover, we describe a new population of CAFs that express MHC class II and CD74, but do not express classic costimulatory molecules. We term this cell population "antigen-presenting CAFs" and find that they activate CD4+ T cells in an antigen-specific fashion in a model system, confirming their putative immune-modulatory capacity. Our cross-species analysis paves the way for investigating distinct functions of CAF subtypes in PDAC immunity and progression. SIGNIFICANCE: Appreciating the full spectrum of fibroblast heterogeneity in pancreatic ductal adenocarcinoma is crucial to developing therapies that specifically target tumor-promoting CAFs. This work identifies MHC class II-expressing CAFs with a capacity to present antigens to CD4+ T cells, and potentially to modulate the immune response in pancreatic tumors. See related commentary by Belle and DeNardo, p. 1001. This article is highlighted in the In This Issue feature, p. 983.

Journal ArticleDOI
TL;DR: The optimization of circular consensus sequencing (CCS) is reported to improve the accuracy of single-molecule real-time (SMRT) sequencing (PacBio) and generate highly accurate (99.8%) long high-fidelity (HiFi) reads with an average length of 13.5 kilobases (kb).
Abstract: The DNA sequencing technologies in use today produce either highly accurate short reads or less-accurate long reads. We report the optimization of circular consensus sequencing (CCS) to improve the accuracy of single-molecule real-time (SMRT) sequencing (PacBio) and generate highly accurate (99.8%) long high-fidelity (HiFi) reads with an average length of 13.5 kilobases (kb). We applied our approach to sequence the well-characterized human HG002/NA24385 genome and obtained precision and recall rates of at least 99.91% for single-nucleotide variants (SNVs), 95.98% for insertions and deletions [...] 15 megabases (Mb) and concordance of 99.997%, substantially outperforming assembly with less-accurate long reads. High-fidelity reads improve variant detection and genome assembly on the PacBio platform.

Proceedings ArticleDOI
15 Jun 2019
TL;DR: This paper proposes to search the network-level structure in addition to the cell-level structure, forming a hierarchical architecture search space; the resulting architecture, Auto-DeepLab, attains state-of-the-art performance without any ImageNet pretraining.
Abstract: Recently, Neural Architecture Search (NAS) has successfully identified neural network architectures that exceed human designed ones on large-scale image classification. In this paper, we study NAS for semantic image segmentation. Existing works often focus on searching the repeatable cell structure, while hand-designing the outer network structure that controls the spatial resolution changes. This choice simplifies the search space, but becomes increasingly problematic for dense image prediction which exhibits a lot more network level architectural variations. Therefore, we propose to search the network level structure in addition to the cell level structure, which forms a hierarchical architecture search space. We present a network level search space that includes many popular designs, and develop a formulation that allows efficient gradient-based architecture search (3 P100 GPU days on Cityscapes images). We demonstrate the effectiveness of the proposed method on the challenging Cityscapes, PASCAL VOC 2012, and ADE20K datasets. Auto-DeepLab, our architecture searched specifically for semantic image segmentation, attains state-of-the-art performance without any ImageNet pretraining.

Journal ArticleDOI
TL;DR: A Kavli Institute for Theoretical Physics workshop in July 2019 directed attention to the Hubble constant discrepancy; proposed solutions focused on the pre-recombination era.
Abstract: A Kavli Institute for Theoretical Physics workshop in July 2019 directed attention to the Hubble constant discrepancy. New results showed that it does not appear to depend on the use of any one method, team or source. Proposed solutions focused on the pre-recombination era.

Proceedings ArticleDOI
15 Jun 2019
TL;DR: It is suggested that adversarial perturbations on images lead to noise in the features constructed by these networks, and new network architectures are developed that increase adversarial robustness by performing feature denoising.
Abstract: Adversarial attacks to image classification systems present challenges to convolutional networks and opportunities for understanding them. This study suggests that adversarial perturbations on images lead to noise in the features constructed by these networks. Motivated by this observation, we develop new network architectures that increase adversarial robustness by performing feature denoising. Specifically, our networks contain blocks that denoise the features using non-local means or other filters; the entire networks are trained end-to-end. When combined with adversarial training, our feature denoising networks substantially improve the state-of-the-art in adversarial robustness in both white-box and black-box attack settings. On ImageNet, under 10-iteration PGD white-box attacks where prior art has 27.9% accuracy, our method achieves 55.7%; even under extreme 2000-iteration PGD white-box attacks, our method secures 42.6% accuracy. Our method was ranked first in Competition on Adversarial Attacks and Defenses (CAAD) 2018 --- it achieved 50.6% classification accuracy on a secret, ImageNet-like test dataset against 48 unknown attackers, surpassing the runner-up approach by ~10%. Code is available at https://github.com/facebookresearch/ImageNet-Adversarial-Training.

Journal ArticleDOI
TL;DR: This selection from the NCCN Guidelines for Esophageal and Esophagogastric Junction Cancers focuses on recommendations for the management of locally advanced and metastatic adenocarcinoma of the esophagus and EGJ.
Abstract: Esophageal cancer is the sixth leading cause of cancer-related deaths worldwide. Squamous cell carcinoma is the most common histology in Eastern Europe and Asia, and adenocarcinoma is most common in North America and Western Europe. Surgery is a major component of treatment of locally advanced resectable esophageal and esophagogastric junction (EGJ) cancer, and randomized trials have shown that the addition of preoperative chemoradiation or perioperative chemotherapy to surgery significantly improves survival. Targeted therapies including trastuzumab, ramucirumab, and pembrolizumab have produced encouraging results in the treatment of patients with advanced or metastatic disease. Multidisciplinary team management is essential for all patients with esophageal and EGJ cancers. This selection from the NCCN Guidelines for Esophageal and Esophagogastric Junction Cancers focuses on recommendations for the management of locally advanced and metastatic adenocarcinoma of the esophagus and EGJ.

Journal ArticleDOI
TL;DR: Gilteritinib resulted in significantly longer survival and higher percentages of patients with remission than salvage chemotherapy among patients with relapsed or refractory FLT3-mutated AML.
Abstract: Background Patients with relapsed or refractory acute myeloid leukemia (AML) with mutations in the FMS-like tyrosine kinase 3 gene (FLT3) infrequently have a response to salvage chemothera...

Journal ArticleDOI
TL;DR: Both oscillometric and auscultatory methods are considered acceptable for measuring BP in children and adolescents and initial and ongoing training of technicians and healthcare providers and the use of validated and calibrated devices are critical for obtaining accurate BP measurements.
Abstract: The accurate measurement of blood pressure (BP) is essential for the diagnosis and management of hypertension. This article provides an updated American Heart Association scientific statement on BP measurement in humans. In the office setting, many oscillometric devices have been validated that allow accurate BP measurement while reducing human errors associated with the auscultatory approach. Fully automated oscillometric devices capable of taking multiple readings even without an observer being present may provide a more accurate measurement of BP than auscultation. Studies have shown substantial differences in BP when measured outside versus in the office setting. Ambulatory BP monitoring is considered the reference standard for out-of-office BP assessment, with home BP monitoring being an alternative when ambulatory BP monitoring is not available or tolerated. Compared with their counterparts with sustained normotension (ie, nonhypertensive BP levels in and outside the office setting), it is unclear whether adults with white-coat hypertension (ie, hypertensive BP levels in the office but not outside the office) have increased cardiovascular disease risk, whereas those with masked hypertension (ie, hypertensive BP levels outside the office but not in the office) are at substantially increased risk. In addition, high nighttime BP on ambulatory BP monitoring is associated with increased cardiovascular disease risk. Both oscillometric and auscultatory methods are considered acceptable for measuring BP in children and adolescents. Regardless of the method used to measure BP, initial and ongoing training of technicians and healthcare providers and the use of validated and calibrated devices are critical for obtaining accurate BP measurements.

Posted Content
TL;DR: This paper shows that pretraining multilingual language models at scale leads to significant performance gains for a wide range of cross-lingual transfer tasks, and trains a Transformer-based masked language model on one hundred languages using more than two terabytes of filtered CommonCrawl data.
Abstract: This paper shows that pretraining multilingual language models at scale leads to significant performance gains for a wide range of cross-lingual transfer tasks. We train a Transformer-based masked language model on one hundred languages, using more than two terabytes of filtered CommonCrawl data. Our model, dubbed XLM-R, significantly outperforms multilingual BERT (mBERT) on a variety of cross-lingual benchmarks, including +14.6% average accuracy on XNLI, +13% average F1 score on MLQA, and +2.4% F1 score on NER. XLM-R performs particularly well on low-resource languages, improving 15.7% in XNLI accuracy for Swahili and 11.4% for Urdu over previous XLM models. We also present a detailed empirical analysis of the key factors that are required to achieve these gains, including the trade-offs between (1) positive transfer and capacity dilution and (2) the performance of high and low resource languages at scale. Finally, we show, for the first time, the possibility of multilingual modeling without sacrificing per-language performance; XLM-R is very competitive with strong monolingual models on the GLUE and XNLI benchmarks. We will make our code, data and models publicly available.

Posted ContentDOI
Daniel Taliun, Daniel N. Harris, Michael D. Kessler, Jedidiah Carlson, +191 more (61 institutions)
06 Mar 2019-bioRxiv
TL;DR: The nearly complete catalog of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and non-coding sequence variants to phenotypic variation; the paper also describes TOPMed goals and design, resources, and early insights from the sequence data.
Abstract: The Trans-Omics for Precision Medicine (TOPMed) program seeks to elucidate the genetic architecture and disease biology of heart, lung, blood, and sleep disorders, with the ultimate goal of improving diagnosis, treatment, and prevention. The initial phases of the program focus on whole genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here, we describe TOPMed goals and design as well as resources and early insights from the sequence data. The resources include a variant browser, a genotype imputation panel, and sharing of genomic and phenotypic data via dbGaP. In 53,581 TOPMed samples, >400 million single-nucleotide and insertion/deletion variants were detected by alignment with the reference genome. Additional novel variants are detectable through assembly of unmapped reads and customized analysis in highly variable loci. Among the >400 million variants detected, 97% have frequency

Journal ArticleDOI
TL;DR: A Markov Chain Monte Carlo search of the parameter space for the EDE parameters, in conjunction with the standard cosmological parameters, identifies regions in which H_{0} inferred from Planck CMB data agrees with the SH0ES local measurement.
Abstract: Early dark energy (EDE) that behaves like a cosmological constant at early times (redshifts z≳3000) and then dilutes away like radiation or faster at later times can solve the Hubble tension. In these models, the sound horizon at decoupling is reduced resulting in a larger value of the Hubble parameter H_{0} inferred from the cosmic microwave background (CMB). We consider two physical models for this EDE, one involving an oscillating scalar field and another a slowly rolling field. We perform a detailed calculation of the evolution of perturbations in these models. A Markov Chain Monte Carlo search of the parameter space for the EDE parameters, in conjunction with the standard cosmological parameters, identifies regions in which H_{0} inferred from Planck CMB data agrees with the SH0ES local measurement. In these cosmologies, current baryon acoustic oscillation and supernova data are described as successfully as in the cold dark matter model with a cosmological constant, while the fit to Planck data is slightly improved. Future CMB and large-scale-structure surveys will further probe this scenario.

Journal ArticleDOI
Nasim Mavaddat, Kyriaki Michailidou, Joe Dennis, +307 more (105 institutions)
TL;DR: A PRS optimized for prediction of estrogen receptor (ER)-specific disease is developed from the largest available genome-wide association dataset and empirically validated; it is a powerful and reliable predictor of breast cancer risk that may improve breast cancer prevention programs.
Abstract: Stratification of women according to their risk of breast cancer based on polygenic risk scores (PRSs) could improve screening and prevention strategies. Our aim was to develop PRSs, optimized for prediction of estrogen receptor (ER)-specific disease, from the largest available genome-wide association dataset and to empirically validate the PRSs in prospective studies. The development dataset comprised 94,075 case subjects and 75,017 control subjects of European ancestry from 69 studies, divided into training and validation sets. Samples were genotyped using genome-wide arrays, and single-nucleotide polymorphisms (SNPs) were selected by stepwise regression or lasso penalized regression. The best performing PRSs were validated in an independent test set comprising 11,428 case subjects and 18,323 control subjects from 10 prospective studies and 190,040 women from UK Biobank (3,215 incident breast cancers). For the best PRSs (313 SNPs), the odds ratio for overall disease per 1 standard deviation in ten prospective studies was 1.61 (95%CI: 1.57-1.65) with area under receiver-operator curve (AUC) = 0.630 (95%CI: 0.628-0.651). The lifetime risk of overall breast cancer in the top centile of the PRSs was 32.6%. Compared with women in the middle quintile, those in the highest 1% of risk had 4.37- and 2.78-fold risks, and those in the lowest 1% of risk had 0.16- and 0.27-fold risks, of developing ER-positive and ER-negative disease, respectively. Goodness-of-fit tests indicated that this PRS was well calibrated and predicts disease risk accurately in the tails of the distribution. This PRS is a powerful and reliable predictor of breast cancer risk that may improve breast cancer prevention programs.
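
The per-standard-deviation odds ratio in the breast cancer PRS abstract above can be read as a multiplicative scale. A minimal sketch, assuming a log-linear model in which the odds ratio compounds with each standard deviation of the score (an assumption made here for illustration); the SNP weights and dosages are invented and are not taken from the paper, and the function names are hypothetical:

```python
import math


def polygenic_risk_score(weights, dosages):
    """Weighted sum of risk-allele dosages (0, 1 or 2 per SNP)."""
    if len(weights) != len(dosages):
        raise ValueError("weights and dosages must have the same length")
    return sum(w * d for w, d in zip(weights, dosages))


def odds_ratio_at(z, or_per_sd=1.61):
    """Odds ratio relative to the mean for a score z standard deviations above it.

    Assumes the log odds are linear in the standardized score.
    """
    return math.exp(z * math.log(or_per_sd))


# Hypothetical per-SNP log odds ratios and one individual's genotype dosages.
weights = [0.05, -0.02, 0.10]
dosages = [2, 1, 0]
score = polygenic_risk_score(weights, dosages)  # 0.05*2 - 0.02*1 + 0.10*0

print(f"raw score: {score:.2f}")
for z in (-2, -1, 0, 1, 2):
    print(f"z = {z:+d}: odds ratio {odds_ratio_at(z):.2f}")
```

Under this assumption a score two standard deviations above the mean corresponds to an odds ratio of 1.61 squared, roughly 2.6, relative to the mean. The tail risks reported in the abstract are measured against the middle quintile and are not reproduced by this sketch.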

Journal ArticleDOI
TL;DR: This manuscript discusses guiding principles for the workup, staging, and treatment of early stage and locally advanced cervical cancer, as well as evidence for these recommendations.
Abstract: Cervical cancer is a malignant epithelial tumor that forms in the uterine cervix. Most cases of cervical cancer are preventable through human papilloma virus (HPV) vaccination, routine screening, and treatment of precancerous lesions. However, due to inadequate screening protocols in many regions of the world, cervical cancer remains the fourth-most common cancer in women globally. The complete NCCN Guidelines for Cervical Cancer provide recommendations for the diagnosis, evaluation, and treatment of cervical cancer. This manuscript discusses guiding principles for the workup, staging, and treatment of early stage and locally advanced cervical cancer, as well as evidence for these recommendations. For recommendations regarding treatment of recurrent or metastatic disease, please see the full guidelines on NCCN.org.

Journal ArticleDOI
TL;DR: StringTie2 is a reference-guided transcriptome assembler that works with both short and long reads and offers the ability to work with full-length super-reads assembled from short reads, which further improves the quality of short-read assemblies.
Abstract: RNA sequencing using the latest single-molecule sequencing instruments produces reads that are thousands of nucleotides long. The ability to assemble these long reads can greatly improve the sensitivity of long-read analyses. Here we present StringTie2, a reference-guided transcriptome assembler that works with both short and long reads. StringTie2 includes new methods to handle the high error rate of long reads and offers the ability to work with full-length super-reads assembled from short reads, which further improves the quality of short-read assemblies. StringTie2 is more accurate and faster and uses less memory than all comparable short-read and long-read analysis tools.

Posted Content
TL;DR: There is substantial room for improvement in NLI systems, and the HANS dataset, which contains many examples where the heuristics fail, can motivate and measure progress in this area.
Abstract: A machine learning system can score well on a given test set by relying on heuristics that are effective for frequent example types but break down in more challenging cases. We study this issue within natural language inference (NLI), the task of determining whether one sentence entails another. We hypothesize that statistical NLI models may adopt three fallible syntactic heuristics: the lexical overlap heuristic, the subsequence heuristic, and the constituent heuristic. To determine whether models have adopted these heuristics, we introduce a controlled evaluation set called HANS (Heuristic Analysis for NLI Systems), which contains many examples where the heuristics fail. We find that models trained on MNLI, including BERT, a state-of-the-art model, perform very poorly on HANS, suggesting that they have indeed adopted these heuristics. We conclude that there is substantial room for improvement in NLI systems, and that the HANS dataset can motivate and measure progress in this area.
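
To make two of the heuristics named in the abstract concrete, here is a minimal sketch of the lexical overlap and subsequence heuristics as simple rules. The tokenizer, function names, and example sentences are assumptions chosen for illustration and are not the HANS code. The constituent heuristic is omitted because it would need a syntactic parser.

```python
def tokenize(sentence):
    """Lowercase, drop periods, and split on whitespace (a deliberately crude tokenizer)."""
    return sentence.lower().replace(".", "").split()


def lexical_overlap_heuristic(premise, hypothesis):
    """Predict entailment when every hypothesis word also occurs in the premise."""
    premise_words = set(tokenize(premise))
    return all(word in premise_words for word in tokenize(hypothesis))


def subsequence_heuristic(premise, hypothesis):
    """Predict entailment when the hypothesis is a contiguous word sequence of the premise."""
    p, h = tokenize(premise), tokenize(hypothesis)
    n = len(h)
    return any(p[i:i + n] == h for i in range(len(p) - n + 1))


# Illustrative pairs where the heuristic says "entailed" but the hypothesis does not follow.
overlap_fires = lexical_overlap_heuristic(
    "The doctor was paid by the actor.", "The doctor paid the actor."
)
subsequence_fires = subsequence_heuristic(
    "The doctor near the actor danced.", "The actor danced."
)
```

In both pairs the rule predicts entailment even though the premise does not entail the hypothesis. Per the abstract, HANS is built around cases of this kind, where the heuristics fail.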