Author

Andy Boyd

Bio: Andy Boyd is an academic researcher from the University of Bristol. The author has contributed to research on the topics Longitudinal study and Record linkage. The author has an h-index of 14 and has co-authored 54 publications receiving 4272 citations. Previous affiliations of Andy Boyd include the Institute of Education and the University of Glasgow.


Papers
Journal ArticleDOI
TL;DR: The Avon Longitudinal Study of Parents and Children (ALSPAC) is a transgenerational prospective observational study investigating influences on health and development across the life course and is currently set up as a supported access resource.
Abstract: The Avon Longitudinal Study of Parents and Children (ALSPAC) is a transgenerational prospective observational study investigating influences on health and development across the life course. It considers multiple genetic, epigenetic, biological, psychological, social and other environmental exposures in relation to a similarly diverse range of health, social and developmental outcomes. Recruitment sought to enroll pregnant women in the Bristol area of the UK during 1990-92; this was extended to include additional children eligible using the original enrollment definition up to the age of 18 years. The children from 14541 pregnancies were recruited in 1990-92, increasing to 15247 pregnancies by the age of 18 years. This cohort profile describes the index children of these pregnancies. Follow-up includes 59 questionnaires (4 weeks-18 years of age) and 9 clinical assessment visits (7-17 years of age). The resource comprises a wide range of phenotypic and environmental measures in addition to biological samples, genetic (DNA on 11343 children, genome-wide data on 8365 children, complete genome sequencing on 2000 children) and epigenetic (methylation sampling on 1000 children) information and linkage to health and administrative records. Data access is described in this article and is currently set up as a supported access resource. To date, over 700 peer-reviewed articles have been published using ALSPAC data.

2,440 citations

Journal ArticleDOI
TL;DR: The Avon Longitudinal Study of Children and Parents (ALSPAC) was established to understand how genetic and environmental characteristics influence health and development in parents and children.
Abstract: The Avon Longitudinal Study of Children and Parents (ALSPAC) was established to understand how genetic and environmental characteristics influence health and development in parents and children. All pregnant women resident in a defined area in the South West of England, with an expected date of delivery between 1st April 1991 and 31st December 1992, were eligible and 13 761 women (contributing 13 867 pregnancies) were recruited. These women have been followed over the last 19–22 years and have completed up to 20 questionnaires, have had detailed data abstracted from their medical records and have information on any cancer diagnoses and deaths through record linkage. A follow-up assessment was completed 17–18 years postnatal at which anthropometry, blood pressure, fat, lean and bone mass and carotid intima media thickness were assessed, and a fasting blood sample taken. The second follow-up clinic, which additionally measures cognitive function, physical capability, physical activity (with accelerometer) and wrist bone architecture, is underway and two further assessments with similar measurements will take place over the next 5 years. There is a detailed biobank that includes DNA, with genome-wide data available on >10 000, stored serum and plasma taken repeatedly since pregnancy and other samples; a wide range of data on completed biospecimen assays are available. Details of how to access these data are provided in this cohort profile.

1,902 citations

Journal ArticleDOI
14 Mar 2019
TL;DR: A total of 913 additional G1 (the cohort of index children) participants have been enrolled in the study since the age of 7 years, with 195 of these joining since the age of 18, which provides a baseline sample of 14,901 G1 participants who were alive at 1 year of age.
Abstract: The Avon Longitudinal Study of Parents and Children (ALSPAC) is a prospective population-based study. Initial recruitment of pregnant women took place in 1990-1992 and the health and development of the index children from these pregnancies and their family members have been followed ever since. The eligible sampling frame was constructed retrospectively using linked recruitment and health service records. Additional offspring that were eligible to enrol in the study have been welcomed through major recruitment drives at the ages of 7 and 18 years; and through opportunistic contacts since the age of 7. This data note provides a status update on the recruitment of the index children since the age of 7 years with a focus on enrolment since the age of 18, which has not been previously described. A total of 913 additional G1 (the cohort of index children) participants have been enrolled in the study since the age of 7 years with 195 of these joining since the age of 18. This additional enrolment provides a baseline sample of 14,901 G1 participants who were alive at 1 year of age.

358 citations

Journal ArticleDOI
TL;DR: The technical implementation of DataSHIELD is described, using a modified R statistical environment linked to an Opal database deployed behind the computer firewall of each data computer (DC); DataSHIELD is currently used by the Healthy Obese Project and the Environmental Core Project for the federated analysis of 10 data sets across eight European countries.
Abstract: Background: Research in modern biomedicine and social science requires sample sizes so large that they can often only be achieved through a pooled co-analysis of data from several studies. But the pooling of information from individuals in a central database that may be queried by researchers raises important ethico-legal questions and can be controversial. In the UK this has been highlighted by recent debate and controversy relating to the UK’s proposed ‘care.data’ initiative, and these issues reflect important societal and professional concerns about privacy, confidentiality and intellectual property. DataSHIELD provides a novel technological solution that can circumvent some of the most basic challenges in facilitating the access of researchers and other healthcare professionals to individual-level data. Methods: Commands are sent from a central analysis computer (AC) to several data computers (DCs) storing the data to be co-analysed. The data sets are analysed simultaneously but in parallel. The separate parallelized analyses are linked by non-disclosive summary statistics and commands transmitted back and forth between the DCs and the AC. This paper describes the technical implementation of DataSHIELD using a modified R statistical environment linked to an Opal database deployed behind the computer firewall of each DC. Analysis is controlled through a standard R environment at the AC. Results: Based on this Opal/R implementation, DataSHIELD is currently used by the Healthy Obese Project and the Environmental Core Project (BioSHaRE-EU) for the federated analysis of 10 data sets across eight European countries, and this illustrates the opportunities and challenges presented by the DataSHIELD approach. 
Conclusions: DataSHIELD facilitates important research in settings where: (i) a co-analysis of individual-level data from several studies is scientifically necessary but governance restrictions prohibit the release or sharing of some of the required data, and/or render data access unacceptably slow; (ii) a research group (e.g. in a developing nation) is particularly vulnerable to loss of intellectual property—the researchers want to fully share the information held in their data with national and international collaborators, but do not wish to hand over the physical data themselves; and (iii) a data set is to be included in an individual-level co-analysis but the physical size of the data precludes direct transfer to a new site for analysis.

187 citations
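
The DataSHIELD abstract above describes an analysis computer (AC) that sends commands to several data computers (DCs) and receives only non-disclosive summary statistics in return. The following is a minimal Python sketch of that idea, not the Opal/R implementation the paper describes. The class and function names, and the minimum record count used as a disclosure check, are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Summary:
    """Aggregate statistics a site is willing to release (illustrative)."""
    n: int
    total: float
    total_sq: float


class DataComputer:
    """Hypothetical stand-in for one site holding individual-level data."""

    def __init__(self, values: Sequence[float], min_count: int = 5):
        self._values = list(values)   # individual records stay at the site
        self._min_count = min_count   # assumed disclosure threshold

    def summarise(self) -> Summary:
        # Refuse to answer if the aggregate could reveal individuals.
        if len(self._values) < self._min_count:
            raise ValueError("too few records to release a summary")
        return Summary(
            n=len(self._values),
            total=sum(self._values),
            total_sq=sum(v * v for v in self._values),
        )


def pooled_mean_and_variance(sites: List[DataComputer]) -> Tuple[float, float]:
    """Analysis-computer side: combine per-site summaries only."""
    summaries = [site.summarise() for site in sites]
    n = sum(s.n for s in summaries)
    total = sum(s.total for s in summaries)
    total_sq = sum(s.total_sq for s in summaries)
    mean = total / n
    # Sample variance recovered from pooled sums, without any raw records.
    variance = (total_sq - n * mean * mean) / (n - 1)
    return mean, variance


if __name__ == "__main__":
    site_a = DataComputer([1, 2, 3, 4, 5])
    site_b = DataComputer([6, 7, 8, 9, 10])
    print(pooled_mean_and_variance([site_a, site_b]))
```

The pooled result matches what a central analysis of all records would give, while each site exposes only counts and sums.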

Journal ArticleDOI
TL;DR: A well-standardized metabolomics platform is used to identify metabolic predictors of long-term mortality in the circulation of 44,168 individuals and identifies key metabolites independently associated with all-cause mortality risk.
Abstract: Predicting longer-term mortality risk requires collection of clinical data, which is often cumbersome. Therefore, we use a well-standardized metabolomics platform to identify metabolic predictors of long-term mortality in the circulation of 44,168 individuals (age at baseline 18–109), of whom 5512 died during follow-up. We apply a stepwise (forward-backward) procedure based on meta-analysis results and identify 14 circulating biomarkers independently associating with all-cause mortality. Overall, these associations are similar in men and women and across different age strata. We subsequently show that the prediction accuracy of 5- and 10-year mortality based on a model containing the identified biomarkers and sex (C-statistic = 0.837 and 0.830, respectively) is better than that of a model containing conventional risk factors for mortality (C-statistic = 0.772 and 0.790, respectively). The use of the identified metabolic profile as a predictor of mortality or surrogate endpoint in clinical studies needs further investigation.

160 citations


Cited by
Journal ArticleDOI
TL;DR: An adaption of Egger regression can detect some violations of the standard instrumental variable assumptions and provide an effect estimate that is not subject to these violations; it also provides a sensitivity analysis for the robustness of the findings from a Mendelian randomization investigation.
Abstract: Background: The number of Mendelian randomization analyses including large numbers of genetic variants is rapidly increasing. This is due to the proliferation of genome-wide association studies, and the desire to obtain more precise estimates of causal effects. However, some genetic variants may not be valid instrumental variables, in particular due to them having more than one proximal phenotypic correlate (pleiotropy). Methods: We view Mendelian randomization with multiple instruments as a meta-analysis, and show that bias caused by pleiotropy can be regarded as analogous to small study bias. Causal estimates using each instrument can be displayed visually by a funnel plot to assess potential asymmetry. Egger regression, a tool to detect small study bias in meta-analysis, can be adapted to test for bias from pleiotropy, and the slope coefficient from Egger regression provides an estimate of the causal effect. Under the assumption that the association of each genetic variant with the exposure is independent of the pleiotropic effect of the variant (not via the exposure), Egger’s test gives a valid test of the null causal hypothesis and a consistent causal effect estimate even when all the genetic variants are invalid instrumental variables. Results: We illustrate the use of this approach by re-analysing two published Mendelian randomization studies of the causal effect of height on lung function, and the causal effect of blood pressure on coronary artery disease risk. The conservative nature of this approach is illustrated with these examples. Conclusions: An adaption of Egger regression (which we call MR-Egger) can detect some violations of the standard instrumental variable assumptions, and provide an effect estimate which is not subject to these violations. The approach provides a sensitivity analysis for the robustness of the findings from a Mendelian randomization investigation.

3,392 citations
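
The MR-Egger abstract above treats each genetic variant as a study in a meta-analysis and takes the slope of an Egger-style regression as the causal estimate. A rough Python sketch of that regression follows. It assumes the usual summary-data setup, which the abstract does not spell out: variant-outcome associations are regressed on variant-exposure associations with an intercept, weighted by the inverse variance of the outcome associations, after orienting each variant so its exposure association is positive. The function and argument names are illustrative.

```python
from typing import Sequence, Tuple


def mr_egger(
    beta_exposure: Sequence[float],
    beta_outcome: Sequence[float],
    se_outcome: Sequence[float],
) -> Tuple[float, float]:
    """Return (intercept, slope) of a weighted regression with an intercept.

    The slope is read as the causal effect estimate; a non-zero intercept
    suggests directional pleiotropy (assumed interpretation).
    """
    # Orient every variant so that its exposure association is positive.
    xs, ys = [], []
    for bx, by in zip(beta_exposure, beta_outcome):
        sign = -1.0 if bx < 0 else 1.0
        xs.append(sign * bx)
        ys.append(sign * by)

    # Inverse-variance weights from the outcome standard errors.
    weights = [1.0 / (se * se) for se in se_outcome]
    w_sum = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, xs)) / w_sum
    y_bar = sum(w * y for w, y in zip(weights, ys)) / w_sum

    # Closed-form weighted least squares for a single predictor.
    s_xx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, xs))
    s_xy = sum(
        w * (x - x_bar) * (y - y_bar) for w, x, y in zip(weights, xs, ys)
    )
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


if __name__ == "__main__":
    # Made-up summary statistics, for illustration only.
    intercept, slope = mr_egger(
        beta_exposure=[0.1, -0.2, 0.3, 0.4],
        beta_outcome=[0.10, -0.15, 0.20, 0.25],
        se_outcome=[0.01, 0.02, 0.01, 0.03],
    )
    print(intercept, slope)
```

Standard errors and the test of the intercept are omitted here; a real analysis would use an established Mendelian randomization package.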

Journal ArticleDOI
TL;DR: The Avon Longitudinal Study of Parents and Children (ALSPAC) is a transgenerational prospective observational study investigating influences on health and development across the life course and is currently set up as a supported access resource.
Abstract: The Avon Longitudinal Study of Parents and Children (ALSPAC) is a transgenerational prospective observational study investigating influences on health and development across the life course. It considers multiple genetic, epigenetic, biological, psychological, social and other environmental exposures in relation to a similarly diverse range of health, social and developmental outcomes. Recruitment sought to enroll pregnant women in the Bristol area of the UK during 1990-92; this was extended to include additional children eligible using the original enrollment definition up to the age of 18 years. The children from 14541 pregnancies were recruited in 1990-92, increasing to 15247 pregnancies by the age of 18 years. This cohort profile describes the index children of these pregnancies. Follow-up includes 59 questionnaires (4 weeks-18 years of age) and 9 clinical assessment visits (7-17 years of age). The resource comprises a wide range of phenotypic and environmental measures in addition to biological samples, genetic (DNA on 11343 children, genome-wide data on 8365 children, complete genome sequencing on 2000 children) and epigenetic (methylation sampling on 1000 children) information and linkage to health and administrative records. Data access is described in this article and is currently set up as a supported access resource. To date, over 700 peer-reviewed articles have been published using ALSPAC data.

2,440 citations

Journal ArticleDOI
TL;DR: The Avon Longitudinal Study of Children and Parents (ALSPAC) was established to understand how genetic and environmental characteristics influence health and development in parents and children.
Abstract: The Avon Longitudinal Study of Children and Parents (ALSPAC) was established to understand how genetic and environmental characteristics influence health and development in parents and children. All pregnant women resident in a defined area in the South West of England, with an expected date of delivery between 1st April 1991 and 31st December 1992, were eligible and 13 761 women (contributing 13 867 pregnancies) were recruited. These women have been followed over the last 19–22 years and have completed up to 20 questionnaires, have had detailed data abstracted from their medical records and have information on any cancer diagnoses and deaths through record linkage. A follow-up assessment was completed 17–18 years postnatal at which anthropometry, blood pressure, fat, lean and bone mass and carotid intima media thickness were assessed, and a fasting blood sample taken. The second follow-up clinic, which additionally measures cognitive function, physical capability, physical activity (with accelerometer) and wrist bone architecture, is underway and two further assessments with similar measurements will take place over the next 5 years. There is a detailed biobank that includes DNA, with genome-wide data available on >10 000, stored serum and plasma taken repeatedly since pregnancy and other samples; a wide range of data on completed biospecimen assays are available. Details of how to access these data are provided in this cohort profile.

1,902 citations

Journal ArticleDOI
TL;DR: It is found that deep learning has yet to revolutionize biomedicine or definitively resolve any of the most pressing challenges in the field, but promising advances have been made on the prior state of the art.
Abstract: Deep learning describes a class of machine learning algorithms that are capable of combining raw inputs into layers of intermediate features. These algorithms have recently shown impressive results across a variety of domains. Biology and medicine are data-rich disciplines, but the data are complex and often ill-understood. Hence, deep learning techniques may be particularly well suited to solve problems of these fields. We examine applications of deep learning to a variety of biomedical problems (patient classification, fundamental biological processes and treatment of patients) and discuss whether deep learning will be able to transform these tasks or if the biomedical sphere poses unique challenges. Following from an extensive literature review, we find that deep learning has yet to revolutionize biomedicine or definitively resolve any of the most pressing challenges in the field, but promising advances have been made on the prior state of the art. Even though improvements over previous baselines have been modest in general, the recent progress indicates that deep learning methods will provide valuable means for speeding up or aiding human investigation. Though progress has been made linking a specific neural network's prediction to input features, understanding how users should interpret these models to make testable hypotheses about the system under study remains an open challenge. Furthermore, the limited amount of labelled data for training presents problems in some domains, as do legal and privacy constraints on work with sensitive health records. Nonetheless, we foresee deep learning enabling changes at both bench and bedside with the potential to transform several areas of biology and medicine.

1,491 citations