Global and regional mortality from 235 causes of death for 20 age groups in 1990 and 2010: a systematic analysis for the Global Burden of Disease Study 2010
TL;DR: The Global Burden of Diseases, Injuries, and Risk Factors Study 2010 aimed to estimate annual deaths for the world and 21 regions between 1980 and 2010 for 235 causes, with uncertainty intervals (UIs), separately by age and sex, using the Cause of Death Ensemble model.
Abstract: Summary Background Reliable and timely information on the leading causes of death in populations, and how these are changing, is a crucial input into health policy debates. In the Global Burden of Diseases, Injuries, and Risk Factors Study 2010 (GBD 2010), we aimed to estimate annual deaths for the world and 21 regions between 1980 and 2010 for 235 causes, with uncertainty intervals (UIs), separately by age and sex. Methods We attempted to identify all available data on causes of death for 187 countries from 1980 to 2010 from vital registration, verbal autopsy, mortality surveillance, censuses, surveys, hospitals, police records, and mortuaries. We assessed data quality for completeness, diagnostic accuracy, missing data, stochastic variations, and probable causes of death. We applied six different modelling strategies to estimate cause-specific mortality trends depending on the strength of the data. For 133 causes and three special aggregates we used the Cause of Death Ensemble model (CODEm) approach, which uses four families of statistical models testing a large set of different models using different permutations of covariates. Model ensembles were developed from these component models. We assessed model performance with rigorous out-of-sample testing of prediction error and the validity of 95% UIs. For 13 causes with low observed numbers of deaths, we developed negative binomial models with plausible covariates. For 27 causes for which death is rare, we modelled the higher level cause in the cause hierarchy of the GBD 2010 and then allocated deaths across component causes proportionately, estimated from all available data in the database. For selected causes (African trypanosomiasis, congenital syphilis, whooping cough, measles, typhoid and paratyphoid, leishmaniasis, acute hepatitis E, and HIV/AIDS), we used natural history models based on information on incidence, prevalence, and case-fatality. 
We separately estimated cause fractions by aetiology for diarrhoea, lower respiratory infections, and meningitis, as well as disaggregations by subcause for chronic kidney disease, maternal disorders, cirrhosis, and liver cancer. For deaths due to collective violence and natural disasters, we used mortality shock regressions. For every cause, we estimated 95% UIs that captured both parameter estimation uncertainty and uncertainty due to model specification where CODEm was used. We constrained cause-specific fractions within every age-sex group to sum to total mortality based on draws from the uncertainty distributions. Findings In 2010, there were 52·8 million deaths globally. At the most aggregate level, communicable, maternal, neonatal, and nutritional causes were 24·9% of deaths worldwide in 2010, down from 15·9 million (34·1%) of 46·5 million in 1990. This decrease was largely due to decreases in mortality from diarrhoeal disease (from 2·5 to 1·4 million), lower respiratory infections (from 3·4 to 2·8 million), neonatal disorders (from 3·1 to 2·2 million), measles (from 0·63 to 0·13 million), and tetanus (from 0·27 to 0·06 million). Deaths from HIV/AIDS increased from 0·30 million in 1990 to 1·5 million in 2010, reaching a peak of 1·7 million in 2006. Malaria mortality also rose by an estimated 19·9% since 1990 to 1·17 million deaths in 2010. Tuberculosis killed 1·2 million people in 2010. Deaths from non-communicable diseases rose by just under 8 million between 1990 and 2010, accounting for two of every three deaths (34·5 million) worldwide by 2010. 8 million people died from cancer in 2010, 38% more than two decades ago; of these, 1·5 million (19%) were from trachea, bronchus, and lung cancer. Ischaemic heart disease and stroke collectively killed 12·9 million people in 2010, or one in four deaths worldwide, compared with one in five in 1990; 1·3 million deaths were due to diabetes, twice as many as in 1990. 
The fraction of global deaths due to injuries (5·1 million deaths) was marginally higher in 2010 (9·6%) compared with two decades earlier (8·8%). This was driven by a 46% rise in deaths worldwide due to road traffic accidents (1·3 million in 2010) and a rise in deaths from falls. Ischaemic heart disease, stroke, chronic obstructive pulmonary disease (COPD), lower respiratory infections, lung cancer, and HIV/AIDS were the leading causes of death in 2010. Ischaemic heart disease, lower respiratory infections, stroke, diarrhoeal disease, malaria, and HIV/AIDS were the leading causes of years of life lost due to premature mortality (YLLs) in 2010, similar to what was estimated for 1990, except for HIV/AIDS and preterm birth complications. YLLs from lower respiratory infections and diarrhoea decreased by 45–54% since 1990; ischaemic heart disease and stroke YLLs increased by 17–28%. Regional variations in leading causes of death were substantial. Communicable, maternal, neonatal, and nutritional causes still accounted for 76% of premature mortality in sub-Saharan Africa in 2010. Age standardised death rates from some key disorders rose (HIV/AIDS, Alzheimer's disease, diabetes mellitus, and chronic kidney disease in particular), but for most diseases, death rates fell in the past two decades; including major vascular diseases, COPD, most forms of cancer, liver cirrhosis, and maternal disorders. For other conditions, notably malaria, prostate cancer, and injuries, little change was noted. Interpretation Population growth, increased average age of the world's population, and largely decreasing age-specific, sex-specific, and cause-specific death rates combine to drive a broad shift from communicable, maternal, neonatal, and nutritional causes towards non-communicable diseases. Nevertheless, communicable, maternal, neonatal, and nutritional causes remain the dominant causes of YLLs in sub-Saharan Africa. 
Overlaid on this general pattern of the epidemiological transition, marked regional variation exists in many causes, such as interpersonal violence, suicide, liver cancer, diabetes, cirrhosis, Chagas disease, African trypanosomiasis, melanoma, and others. Regional heterogeneity highlights the importance of sound epidemiological assessments of the causes of death on a regular basis. Funding Bill & Melinda Gates Foundation.
Summary
- Cause-specific mortality is arguably one of the most fundamental metrics of population health.
- For the remaining deaths, which are not medically certified, many different data sources and diagnostic approaches, drawn from surveillance systems, demographic research sites, surveys, censuses, disease registries, and police records, must be used to construct a consolidated picture of causes of death in various populations.
- These efforts often include very specific steps undertaken for different data sources and are frequently poorly documented.
- GBD revisions for 1999, 2000, 2001, 2002, 2004, and 2008 have used these compositional models to allocate deaths according to three broad cause groups: communicable, maternal, neonatal, and nutritional causes; noncommunicable diseases; and injuries.
- Given the profusion of statistical modeling options, an important innovation has been the reporting of out-of-sample predictive validity to document the performance of complex models.
Data and methods
- Some general aspects of the analytical framework such as the creation of the 21 GBD regions and the full hierarchical cause list including the mapping of the International Classification of Diseases and Injuries (ICD) to the GBD cause list are reported elsewhere.
- While results are reported in this paper at the regional level for 1990 and 2010, the cause of death analysis has been undertaken at the country level for 187 countries from 1980 to 2010.
- Using longer time series improves the performance of many types of estimation models; data prior to 1980, however, are much sparser for developing countries so the authors have restricted the analysis to 1980-2010.
- Over the five year duration of the GBD 2010 study, the authors have sought to identify all published and unpublished data sources relevant to estimating causes of death for 187 countries from 1980 to 2010.
- Web Table 1a provides a summary of the site-years of data identified by broad type of data system and, similarly, Web Table 1b illustrates the number of site-years by GBD region.
- Of the GBD regions, Central sub-Saharan Africa has the most limited evidence base, with data on only 27 causes from at least one country.
- In addition, there is country-to-country variation in the detail used to report causes of death included in national reporting lists, namely the basic tabulation list for ICD-9, the ICD-10 tabulation list, three-digit and four-digit detail, and special reporting lists.
Verbal autopsy data collected through sample registration systems, demographic surveillance systems, or surveys
- Verbal autopsy (VA) is a means for ascertaining the cause of death of individuals and the cause-specific mortality fractions in populations with incomplete vital registration systems.
Population-based cancer registries
- Population-based cancer registries provide an important source of data on incidence of cancers in various countries.
- The authors identified 2,715 site-years of cancer registry data across 93 countries.
- The log of the mortality-to-incidence (MI) ratio has been estimated as a function of national income per capita with random effects for country, year, and age.
- The estimated mortality to incidence ratios have been used to map cancer registry data on incidence to expected deaths which have been incorporated into the database.
- MI ratios by country, age, and sex are available on request.
- In most countries, police and crime reports are an important source of information for some types of injuries, notably road injuries and inter-personal violence.
- The police reports used in this analysis were collected from published studies, national agencies, and institutional surveys such as the United Nations Crime Trends survey and the WHO Global Status Report on Road Safety Survey [30,41].
- The authors identified 32 site-years of burial and mortuary data in 11 countries from ministries of health, published reports, and mortuaries themselves.
- The authors also identified 52 surveys/censuses covering injury mortality across 65 survey/census years.
- Surveillance data on the number of maternal deaths, or the maternal mortality ratio multiplied by births, were converted into cause fractions by dividing by the total number of deaths estimated in the reproductive age groups.
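The mortality-to-incidence mapping described for cancer registries above can be illustrated with a minimal sketch. It assumes a log-linear relationship between the MI ratio and income per capita; the intercept, slope, income scaling, and the cap at 1.0 are all made-up values for illustration, and the country, year, and age random effects mentioned in the text are left out.

```python
import math

def predicted_mi_ratio(income_per_capita, intercept=-0.2, slope=-0.3, max_ratio=1.0):
    """Predict a mortality-to-incidence (MI) ratio from a log-linear model.

    intercept, slope, and max_ratio are hypothetical illustrative values,
    not estimates from the paper; random effects are omitted.
    """
    log_mi = intercept + slope * math.log(income_per_capita / 1000.0)
    # Assumption: cap the ratio so expected deaths never exceed incident cases.
    return min(math.exp(log_mi), max_ratio)

def expected_deaths(incident_cases, income_per_capita):
    """Map registry incidence to expected deaths: deaths = cases * MI ratio."""
    return incident_cases * predicted_mi_ratio(income_per_capita)

# Two hypothetical registries reporting 1,000 incident cases each.
low_income = expected_deaths(1000, 1_000)
high_income = expected_deaths(1000, 40_000)
```

With a negative slope, the same number of incident cases maps to fewer expected deaths in the higher-income setting, which is the direction such a model would be expected to capture.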
Health facility data
- The authors chose to only incorporate deaths due to injury from this source because of known bias.
- In settings where a data source does not capture all deaths in a population, the cause composition of the deaths captured may differ from that of the deaths that are not.
- At the global level, the numbers of deaths estimated in 2010 for acute respiratory infections (ARI) and diarrhea, for example, differ by 0·9% and 1·2%, respectively, between models that include all data and those that exclude data where under-five death registration is below 70% complete.
- Garbage codes are causes of death that should not be identified as underlying causes of death but have been entered as the underlying cause of death on death certificates.
- For the GBD 2010, the authors have identified causes that should not be assigned as underlying cause of death at a much more detailed level.
Cause of Death Ensemble modeling (CODEm)
- For all major causes of death except for HIV/AIDS and measles, the authors have used cause of death ensemble modeling – 133 causes in the cause list and three other special aggregates.
- In addition, four families of statistical models are developed using covariates: mixed effects linear models of the log of the death rate, mixed effects linear models of the logit of the cause fraction, spatial-temporal Gaussian process regression (ST-GPR) models of the log of the death rate, and ST-GPR of the logit of the cause fraction.
- Based on out-of-sample predictive validity, the best performing model or ensemble is selected.
- Web Table 6 summarizes the performance of the CODEm models developed for 133 causes in the cause list for which the authors exclusively use CODEm and three special aggregates in the GBD 2010.
- In all cases the out-of-sample performance is worse (larger RMSE) than the in-sample performance.
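The selection step described above, ranking candidate models and ensembles by out-of-sample error, can be sketched on toy data. This is not CODEm itself: the candidates here are single-covariate least-squares fits to the log death rate, the data are invented, and the rank-based ensemble weights (proportional to 1/rank) are an assumption made only for illustration.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns a prediction function."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x

def rmse(predicted, observed):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed))

# Invented toy data: the log death rate depends on "smoking" but not "income".
smoking = [1, 2, 3, 4, 5, 6, 7, 8, 9]
income = [5, 3, 8, 1, 9, 2, 7, 4, 6]
noise = [0.05, -0.04, 0.03, -0.02, 0.01, -0.03, 0.02, -0.01, 0.04]
log_rate = [1.0 + 0.5 * s + e for s, e in zip(smoking, noise)]
covariates = {"smoking": smoking, "income": income}
train, test = slice(0, 6), slice(6, 9)  # hold out the last three points

# Fit each component model on the training data, score it on held-out data.
models, out_of_sample = {}, {}
for name, values in covariates.items():
    models[name] = fit_line(values[train], log_rate[train])
    predictions = [models[name](x) for x in values[test]]
    out_of_sample[name] = rmse(predictions, log_rate[test])

# Rank components and build an ensemble with weights proportional to 1/rank.
ranked = sorted(out_of_sample, key=out_of_sample.get)
raw = {name: 1.0 / (rank + 1) for rank, name in enumerate(ranked)}
weights = {name: w / sum(raw.values()) for name, w in raw.items()}
ensemble_predictions = [
    sum(weights[name] * models[name](covariates[name][i]) for name in ranked)
    for i in range(6, 9)
]
out_of_sample["ensemble"] = rmse(ensemble_predictions, log_rate[test])

# Select whichever component or ensemble has the lowest held-out RMSE.
best = min(out_of_sample, key=out_of_sample.get)
```

In this toy case a single component wins because the second covariate carries no signal; with several informative components an ensemble can score better than any one of them.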
Negative binomial models
- For 13 causes, the number of deaths observed in the database is too low to generate stable estimates of out-of-sample predictive validity.
- For these causes, the authors developed negative binomial models using plausible covariates.
- These causes are identified in Web Table 5.
- For these negative binomial models, standard model building practice was followed where plausible covariates were included in the model development and reverse stepwise procedures followed for covariate inclusion.
- Uncertainty distributions were estimated using both uncertainty in the regression betas for the covariates and from the gamma distribution of the negative binomial.
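How the two sources of uncertainty named above might be combined can be sketched by simulation. The coefficient means, standard errors, and dispersion below are hypothetical, the betas are drawn independently (ignoring any covariance between them), and the gamma term is scaled to have mean 1 so that it only adds spread; none of these choices is taken from the paper.

```python
import math
import random

def death_rate_draws(covariate_value, n_draws=1000, seed=0):
    """Simulate death-rate draws from a negative binomial (gamma-Poisson) model.

    All fitted values below are hypothetical placeholders.
    """
    rng = random.Random(seed)
    beta0_mean, beta0_se = -9.0, 0.10   # intercept on the log-rate scale
    beta1_mean, beta1_se = 0.40, 0.05   # coefficient on the covariate
    dispersion = 0.25                   # negative binomial overdispersion
    draws = []
    for _ in range(n_draws):
        # Parameter uncertainty: draw the regression betas.
        beta0 = rng.gauss(beta0_mean, beta0_se)
        beta1 = rng.gauss(beta1_mean, beta1_se)
        # Gamma heterogeneity with shape 1/dispersion and scale dispersion (mean 1).
        multiplier = rng.gammavariate(1.0 / dispersion, dispersion)
        draws.append(math.exp(beta0 + beta1 * covariate_value) * multiplier)
    return sorted(draws)

def interval_95(sorted_draws):
    """Approximate 2.5th and 97.5th percentiles of already sorted draws."""
    n = len(sorted_draws)
    return sorted_draws[int(0.025 * n)], sorted_draws[int(0.975 * n) - 1]

draws = death_rate_draws(covariate_value=1.5)
lower, upper = interval_95(draws)
```

Setting the dispersion close to zero in this sketch leaves only the beta uncertainty, which makes it easy to see how much of the interval width comes from each source.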
Fixed proportion models
- In 28 cases where death is a rare event, the authors have first modeled the parent cause in the GBD hierarchy using CODEm and then allocated deaths to specific causes using a fixed proportion.
- Proportions have been computed using all available data in the database and are fixed over time, but, depending on data density, allowed to vary by region, age, or sex.
- Finally, cellulitis, decubitus ulcer, other skin and subcutaneous diseases, abscess, impetigo, and other bacterial skin diseases all varied by age and sex.
- The meta-regressions have generated region-age-sex estimates, with uncertainty, of etiological fractions for diarrhea, LRI, meningitis, chronic kidney disease, maternal conditions, and cirrhosis.
- In the cases of cirrhosis, liver cancer, maternal conditions, and chronic kidney disease, the studies or datasets on etiology identify primary cause as assessed clinically; for diarrhea, LRI, and meningitis, etiology is based on laboratory findings.
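The fixed proportion step described at the start of this section amounts to splitting a modeled parent cause by shares that do not change over time. A minimal sketch follows; the sub-cause names, pooled counts, and parent estimates are hypothetical, and a single global set of proportions is used rather than proportions varying by region, age, or sex.

```python
def allocate_fixed_proportions(parent_deaths, pooled_counts):
    """Split parent-cause deaths across sub-causes using time-invariant shares.

    parent_deaths: {year: deaths for the parent cause}
    pooled_counts: {sub_cause: deaths observed across all pooled data}
    """
    total = sum(pooled_counts.values())
    proportions = {cause: n / total for cause, n in pooled_counts.items()}
    return {
        year: {cause: deaths * share for cause, share in proportions.items()}
        for year, deaths in parent_deaths.items()
    }

# Hypothetical inputs: a parent cause modeled for two years and pooled
# observed deaths for its two rare sub-causes.
parent_deaths = {1990: 1200.0, 2010: 900.0}
pooled_counts = {"subcause_a": 30, "subcause_b": 10}
allocated = allocate_fixed_proportions(parent_deaths, pooled_counts)
```

Because the shares are fixed, any trend in a sub-cause comes entirely from the trend in the parent cause.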
Natural History Models
- For a few selected causes, there is evidence that cause of death data systems do not capture sufficient information for one of two reasons.
- Second, there are reasons to believe that there is systematic misclassification of deaths in cause of death data sources, particularly for congenital syphilis [54,55], whooping cough [56], measles [57], and HIV/AIDS [58].
- In the case of HIV/AIDS, a hybrid approach has been used.
- For 36 countries, with complete and high quality vital registration systems, the authors have used CODEm, in consultation with UNAIDS.
- For the remaining countries, the authors have used the estimates with uncertainty by age and sex provided directly by UNAIDS based on their 2012 revision.
Mortality Shock Regressions
- To estimate deaths directly due to natural disasters or collective violence, the authors use a different approach.
- Details of this approach are outlined in Murray et al. [59].
- To develop the covariate on battle deaths during collective violence, the authors used data from the Armed Conflict Database from the International Institute for Strategic Studies (1997-2011), the Uppsala Conflict Data Program (UCDP)/PRIO Armed Conflict Dataset (1946-present), and available data from complete VR systems.
- The relationships between under-five mortality and adult mortality and the disaster and collective violence covariates are estimated using 43 empirical observations for disasters and 206 empirical observations for collective violence (only years with over 1 per 10,000 crude death rate from shocks are kept in this analysis).
- The relationship is estimated for excess mortality from these data sources by first subtracting from observed mortality rates the expected death rates in shock years using the methods outlined in Murray et al. [59].
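The excess-mortality regression described above can be sketched as follows. The observations are invented, the model is a single-covariate regression through the origin (an assumption made to keep the example short), and the threshold is applied to the shock death rate used as the covariate.

```python
def excess_mortality(observed_rate, expected_rate):
    """Excess mortality in a shock year: observed minus expected death rate."""
    return observed_rate - expected_rate

THRESHOLD = 1.0 / 10_000  # keep only years above 1 shock death per 10,000

# Invented observations:
# (shock death rate, observed under-five rate, expected under-five rate)
observations = [
    (0.00005, 0.0301, 0.0300),  # below the threshold, dropped
    (0.0002, 0.0312, 0.0300),
    (0.0005, 0.0430, 0.0400),
    (0.0010, 0.0360, 0.0300),
    (0.0020, 0.0520, 0.0400),
]

kept = [
    (shock, excess_mortality(observed, expected))
    for shock, observed, expected in observations
    if shock > THRESHOLD
]

# Least-squares slope for a regression through the origin (assumption).
slope = sum(shock * excess for shock, excess in kept) / sum(shock ** 2 for shock, _ in kept)

# Predict excess under-five mortality for a hypothetical shock death rate.
predicted_excess = slope * 0.0015
```

The fitted coefficient is then applied to the shock covariate in other country-years to predict excess deaths there.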
Combining Results for Individual Causes of Death to Generate Final Estimates
- Given that the authors develop single cause models, it is imperative as a final step to ensure that individual cause estimates sum to the all-cause mortality estimate for each age-sex-country-year group.
- The authors use a simple algorithm called CoDCorrect: at the level of each draw from the posterior distribution of each cause, they proportionately rescale each cause such that the sum of the cause-specific estimates equals the number of deaths from all causes generated from the demographic analysis [47].
- The authors have chosen levels for each cause based on consideration of the amount and quality of available data.
- Because there are substantially more data on all cardiovascular causes from verbal autopsy studies than for specific cardiovascular causes, the authors have designated “all cardiovascular” as a level 1 cause for CoDCorrect.
- For the presentation of leading causes of death, the level at which one ranks causes is subject to debate.
- Given the GBD cause list tree structure, multiple options are possible such as all cancers versus site-specific cancers.
- The authors have opted to produce tables of rankings using the level of disaggregation that seems most relevant for public health decision-making.
- The reference standard has been constructed using the lowest observed death rate in each age group across countries with a population greater than five million (see Murray et al. [39] for details).
- Because the all-cause mortality analysis is undertaken, however, for more detailed age-groups up to age 110, the authors are able to take into account the mean age of death over 80 in each country-year-sex group in computing YLLs.
- To help understand the drivers of change in the numbers of deaths by cause or region, the authors have decomposed change from 1990 to 2010 into growth in total population, change in population age- and sex-structure, and change in age- and sex-specific rates.
- The difference between 2010 deaths and the population growth and aging scenario is the difference in death numbers due to epidemiological change in age- and sex-specific death rates.
- Each of these three differences is also presented as a percent change with reference to the 1990 observed death number.
- Further details on the data and methods used for specific causes of death are available on request.
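The three-way decomposition of change in deaths described above can be written as a pair of counterfactual scenarios. The sketch below assumes the scenarios are built in a fixed order (population growth first, then age structure, then rates); the two age groups and all numbers are invented for illustration.

```python
def decompose_change(pop_1990, rates_1990, pop_2010, rates_2010):
    """Decompose the 1990-2010 change in deaths, as percent of 1990 deaths.

    Each argument is a list with one entry per age-sex group.
    """
    deaths_1990 = sum(p * r for p, r in zip(pop_1990, rates_1990))
    deaths_2010 = sum(p * r for p, r in zip(pop_2010, rates_2010))
    # Scenario 1: total population grows; 1990 age structure and rates are kept.
    growth_scenario = deaths_1990 * sum(pop_2010) / sum(pop_1990)
    # Scenario 2: 2010 population size and age structure with 1990 rates.
    aging_scenario = sum(p * r for p, r in zip(pop_2010, rates_1990))
    parts = {
        "population_growth": growth_scenario - deaths_1990,
        "population_aging": aging_scenario - growth_scenario,
        "rate_change": deaths_2010 - aging_scenario,
    }
    return {name: 100.0 * value / deaths_1990 for name, value in parts.items()}

# Invented example with a younger and an older age group.
shares = decompose_change(
    pop_1990=[800, 200], rates_1990=[0.002, 0.05],
    pop_2010=[900, 400], rates_2010=[0.001, 0.04],
)
```

The three percentages sum to the total percent change in deaths, so in this toy case falling death rates only partly offset the increases due to growth and aging.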
Global Causes of Death
- The GBD cause list divides causes into three broad groups.
- With declining age-specific death rates from all three groups of causes, including noncommunicable diseases, the global shift towards noncommunicable diseases and injuries as leading causes of death is being driven by population growth and aging, and not by increases in age-sex-cause specific death rates.
- Communicable diseases, notably lower respiratory infections (194 thousand), diarrhea (77 thousand), and meningitis (46 thousand), account for the remaining neonatal deaths.
- A number of causes have much larger uncertainty intervals than adjacent causes in the rank list.
- At both time periods, there is substantial variation across regions in the relative importance of different causes, with communicable diseases and related causes being much more important in parts of sub-Saharan Africa and parts of Asia than in North Africa, and vascular diseases and cancer predominating in most other regions.
- The GBD 2010 is the most comprehensive and systematic analysis of causes of death undertaken to date.
- The global health community can now draw on annual estimates of mortality, by age and sex, for 21 regions of the world, for each year from 1980 to 2010, for 235 separate causes, each with 95% uncertainty intervals to aid interpretation.
- An innovative dimension of the GBD 2010 has been the addition of estimates of deaths due to different diarrhea and lower respiratory infection (LRI) pathogens.
- First, cause of death data even in settings with medical certification may not always accurately capture the underlying cause of death.
Figures and Tables
- Decomposition analysis of the change of global death numbers by broad cause groups from 1990 to 2010 into total population growth, population aging, and changes in age-, sex-, and cause-specific death rates.
- Percentage of global deaths for females and males in 1990 and 2010 by cause and age.
- The colors represent the various level one causes; blue is for non-communicable diseases, red is for communicable, maternal, neonatal and nutritional conditions, and green is for injuries.
- The dashed lines signify descending order in rank, while the solid lines signify ascending order in rank.
- Some cause abbreviations used in the figure are lower respiratory infections (“LRI”); ischemic heart disease (“IHD”); chronic obstructive pulmonary disease (“COPD”); protein energy malnutrition (“PEM”); tuberculosis (“TB”); neonatal encephalopathy (“N Enceph”); neonatal sepsis (“N Sepsis”); road injuries (“Road Inj”); cancer (“CA”); and chronic kidney disease (“CKD”).
Frequently Asked Questions (15)
Q1. What have the authors contributed in "Global and regional mortality from 235 causes of death for 20 age groups in 1990 and 2010: a systematic analysis"?
In this paper, the authors estimate annual deaths for the world and 21 regions between 1980 and 2010 for 235 causes, with uncertainty intervals, separately by age and sex.
Q2. What are the future works mentioned in the paper "Global and regional mortality from 235 causes of death for 20 age groups in 1990 and 2010: a systematic analysis"?
Improved estimation of mortality from HIV/AIDS including uncertainty in the future will come both from continued progress in the estimation of the time course of the HIV epidemic by UNAIDS as well as further data on the levels of adult mortality in some key countries such as Nigeria. Estimates of deaths by pathogen are important both for the prioritization of existing treatments, such as rotavirus or pneumococcal vaccines, and for the development of future technologies. When large multi-center studies such as GEMS publish their results this will be an important addition to the analysis; future revisions of the GBD should make use of these results as they become available. The authors believe that for causes where the magnitude of these corrections is comparatively large, future research should be targeted to trying to build a better understanding of the strengths and weaknesses of the various data sources, whether epidemiological or demographic.
Q3. Why do the authors use only information on the fraction of injuries due to specific sub-causes?
Because of known bias in the epidemiological composition of burial and mortuary data, the authors only use information on the fraction of injuries due to specific sub-causes from these sources.
Q4. What could be learned about causes of death in countries where death certification is poor?
Much could be learned about causes of death in countries where death certification is poor through the more widespread testing and application of recent advances in verbal autopsy methods which greatly reduce heterogeneity in diagnostic practices across populations where VA is currently used.
Q5. Which causes of death dominate by the post-neonatal period?
By the post-neonatal period, causes of death are dominated by diarrhea, LRI, and other infectious diseases such as measles, among others.
Q6. Why are the causes of death assessments subject to debate?
Because of the variety of data sources and their associated biases, cause of death assessments are inherently uncertain and subject to vigorous debate.
Q7. Why did many choices about data sources, quality adjustments, and modeling strategies have to be made?
The ambition to estimate mortality from 235 causes with uncertainty for 187 countries over time from 1980 to 2010 means that many choices about data sources, quality adjustments to data and modeling strategies had to be made.
Q8. Which causes have the authors chosen to include in the ranking list, and why?
Although the authors report more disaggregated causes, because of considerations related to public health programs, the authors have chosen to include diarrheal diseases, lower respiratory infections, maternal causes, cerebrovascular disease, liver cancer, cirrhosis, drug use, road injury, exposure to mechanical forces, animal contact, homicide, and congenital causes in the ranking list.
Q9. What are the four families of statistical models developed using covariates?
In addition, four families of statistical models are developed using covariates: mixed effects linear models of the log of the death rate, mixed effects linear models of the logit of the cause fraction, spatial-temporal Gaussian process regression (ST-GPR) models of the log of the death rate, and ST-GPR of the logit of the cause fraction.
Q10. How many empirical observations are used to estimate the relationships between under-five and adult mortality?
The relationships between under-five mortality and adult mortality and the disaster and collective violence covariates are estimated using 43 empirical observations for disasters and 206 empirical observations for collective violence (only years with over 1 per 10,000 crude death rate from shocks are kept in this analysis).
Q11. For how many causes is the number of observed deaths too low to generate stable estimates of out-of-sample predictive validity?
For 13 causes, the number of deaths observed in the database is too low to generate stable estimates of out-of-sample predictive validity.
Q12. What are the opportunities for improving cause of death data?
Opportunities for strengthening death registration, cause of death certification, and the more widespread use of verbal autopsy exist.
Q13. How are excess deaths from natural disasters and collective violence predicted?
The coefficients from these regressions and the disaster and collective violence covariates are used to predict excess deaths from these two causes.
Q14. What are the reasons to be concerned about the bias in the data?
There are reasons, however, to also be concerned that deaths recorded in systems with low coverage may be biased towards selected causes that are more likely to occur in hospital.
Q15. What is the effect of CoDCorrect on the size of more certain causes?
Although at the draw level the same scalar is applied to all causes, the net effect of CoDCorrect is to change the size of more uncertain causes by more than is done for more certain causes, a desirable property.
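The draw-level rescaling performed by CoDCorrect can be illustrated with a minimal sketch. The cause names, distributions, and all-cause envelope below are simulated and hypothetical; only the proportional rescaling within each draw follows the description in the text.

```python
import random

def codcorrect(cause_draws, envelope_draws):
    """Rescale cause-specific draws so each draw sums to the all-cause envelope.

    cause_draws: {cause: list of draws}; envelope_draws: list of all-cause draws.
    The same scalar is applied to every cause within a given draw.
    """
    corrected = {cause: [] for cause in cause_draws}
    for i, envelope in enumerate(envelope_draws):
        total = sum(draws[i] for draws in cause_draws.values())
        scalar = envelope / total
        for cause, draws in cause_draws.items():
            corrected[cause].append(draws[i] * scalar)
    return corrected

# Simulated inputs: one tightly estimated cause and one highly uncertain cause.
rng = random.Random(0)
n_draws = 1000
cause_draws = {
    "certain_cause": [rng.gauss(600, 10) for _ in range(n_draws)],
    "uncertain_cause": [max(rng.gauss(500, 150), 1.0) for _ in range(n_draws)],
}
envelope_draws = [rng.gauss(1000, 20) for _ in range(n_draws)]
corrected = codcorrect(cause_draws, envelope_draws)

def mean(values):
    return sum(values) / len(values)

# Relative change in the mean of each cause after rescaling, for inspection.
relative_change = {
    cause: mean(corrected[cause]) / mean(cause_draws[cause]) - 1.0
    for cause in cause_draws
}
```

Comparing the entries of `relative_change` lets a reader inspect how the same per-draw scalar affects the more certain and the more uncertain cause differently once the draws are summarized.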