
Harnessing genomic information for livestock improvement

01 Mar 2019-Nature Reviews Genetics (Nature Publishing Group)-Vol. 20, Iss: 3, pp 135-156
TL;DR: Genomic information of increasing complexity (including genomic, epigenomic, transcriptomic and microbiome data), combined with technological advances for its cost-effective collection and use, will make a major contribution to tackling the looming food crisis.
Abstract: The world demand for animal-based food products is anticipated to increase by 70% by 2050. Meeting this demand in a way that has a minimal impact on the environment will require the implementation of advanced technologies, and methods to improve the genetic quality of livestock are expected to play a large part. Over the past 10 years, genomic selection has been introduced in several major livestock species and has more than doubled genetic progress in some. However, additional improvements are required. Genomic information of increasing complexity (including genomic, epigenomic, transcriptomic and microbiome data), combined with technological advances for its cost-effective collection and use, will make a major contribution.

Summary

Introduction

  • Since 1960, global livestock productivity (including carcass weight of meat-producing species, milk yield of dairy cows and egg production) has increased by 20–30% as a result of advances in nutrition, disease control and genetics1.
  • Genome-wide SNP arrays are available for the main livestock species.
  • These efforts have uncovered millions of genetic variants for all the main livestock species and profoundly changed our understanding of the domestication process (Box 1).
  • Genetic architecture is the description of the number, location and effects of the genetic variants that affect a phenotype of interest.
  • Unlike in humans, who typically produce only one affected offspring, samples were available from a number of affected animals, which greatly facilitated the identification of the causative mutation.

GS for complex agricultural traits

  • With the exception of the breed-defining characteristics, inherited defects and EL mutations discussed above, nearly all economically important traits in livestock are complex polygenic traits.
  • Examples of such major gene effects segregating within breeds include, among others, variants in MSTN in cattle112–115 and sheep67 and RYR1, PRKAG3 and IGF2 in pig116–118, which all affect muscularity; DGAT1, GHR and ABCG2, which affect milk yield and composition in cattle119–121; and PLAG1, HMGA2 and LCORL, which affect stature in cattle122,123.
  • The accuracy of GEBV will be highest when the prior distribution best matches the true distribution of SNP effects111.
  • Balancing selection for variants with large effects is common in livestock.

GBLUP selection

  • A more complete understanding of balancing selection operating at specific loci could be exploited to prioritize or avoid specific matings in breeding programmes (see From selecting animals to selective matings using genomic information).
  • Strategies to further improve the accuracy of whole-genome-sequence-based GS currently involve either selecting or assigning more weight to a subset of imputed variants that are more likely to be causative.
  • Much remains to be learned about how variants perturb regulatory elements, including whether they need to be within the element or can influence regulatory function from a distance.
  • First, the linkage phase between causative variants and distant genotyped SNPs may differ between breeds.

Editing livestock genomes

  • Programmable nucleases have revived interest in editing livestock genomes.
  • The anticipated revolution has yet to occur.
  • Indeed, the rate of double-stranded break-induced NHEJ is now high enough that, despite mosaicism commonly reducing germline transmission, it has become more effective (in terms of the number of embryos required to obtain an edited offspring) to circumvent SCNT and inject the programmable nucleases directly into the zygote when aiming to generate LoF mutations172.
  • Thus far, efforts in editing the genome of livestock have mostly concentrated on largely uncontroversial human health applications, such as generating animal models of human genetic diseases, producing biopharmaceuticals and xenotransplantation.
  • As the number of animals with both phenotypic records and SNP genotypes increases into the millions and as fine-mapping methods continue to improve, a growing number of causative variants (particularly those with the largest effects) is bound to be identified, as has been shown to occur for common complex diseases in humans36.

New applications of genomic technology

  • Detecting cows with subclinical mastitis by bulk genotyping of tank milk.
  • One of the major health issues on dairy farms is mastitis186.
  • This ensemble of allelic ratios reflects the combination of the cows’ known SNP genotypes, and the unknown proportion of DNA contributed by each cow to the tank milk.

Conclusions and future perspectives

  • The field of animal breeding just completed a prototypical, once-in-a-lifetime Gartner hype cycle.
  • Whole-genome sequencing of 234 bulls facilitates mapping of monogenic and complex traits in cattle.


Since 1960, global livestock productivity (including carcass weight of meat-producing species, milk yield of dairy cows and egg production) has increased by 20–30% as a result of advances in nutrition, disease control and genetics1. Genetic improvement has accrued through breed substitution, cross-breeding and within-breed selection. In contrast to the one-off measures of breed substitution and cross-breeding, within-breed selection drives sustained, cumulative genetic progress. It has increasingly relied on sophisticated statistical methods, including mixed model methodology, to provide ever more accurate individual estimated breeding values (EBVs)2. Spectacular genetic improvements have been achieved in several species by combining within-breed selection with reproductive technologies (such as artificial insemination and embryo transfer) to more effectively disseminate elite genomes. For example, average annual milk yield per cow in the United States increased from 1,890 kg in 1924 to 9,682 kg in 2011, and more than 50% of this progress was attributed to improved genetics1. Between 1957 and 2001, the time for broiler chickens to reach market weight decreased threefold despite a decrease in feed consumption3,4. Typically, within-breed selection is expected to result in annual genetic gains of ~1–3% (REF. 1).
Currently, the most effective route to minimize the detrimental environmental impact of livestock is to increase productivity: the carbon footprint of 1 kg of milk produced in the United States in 2007 was 37% of that in 1944, and the carbon footprint of the total US dairy industry was reduced by 41% over the same period despite the 250% increase in total milk production5. However, exclusive emphasis on production has led to detrimental correlated responses in other traits, particularly those associated with fitness. For example, although selection for milk production in dairy cattle was extremely successful, there was a substantial undesired decline in fertility over the same period6. Thus, selection schemes now increasingly attempt to balance animal health, fertility, production and environmental impact.
The emergence of genomics as a discipline in the 1980s led to the concept of marker-assisted selection (MAS), in which genetic variants and genes that influence agriculturally important traits would be identified and used to further increase genetic response. A global chase for quantitative trait loci (QTL) ensued in all livestock species. QTL with large effects on economically important traits were indeed mapped, first by linkage analyses and then by genome-wide association studies (GWAS), but these did not account for a large enough proportion of the heritability to render them useful selection tools on their own7. MAS met with limited enthusiasm from the breeding industry until a landmark paper proposed genomic selection (GS)8. In its simplest form, GS makes the same assumption as standard EBV selection: the genetic variance for the traits of interest reflects the additive effects of thousands of variants with very small (and, unlike QTL, unmappable) effects that are uniformly scattered throughout the genome2,9. As soon as genome-wide single-nucleotide polymorphism arrays (SNP arrays) became an affordable reality, GS was tested and was soon widely adopted by the dairy cattle breeding industry as an effective and easily implemented alternative to the time-consuming and costly standard progeny testing (PT). Since 2008, more than 3 million dairy animals have been genotyped worldwide, and GS has become an essential tool for breeding companies that is expected to double genetic progress10.
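The additive assumption behind GS can be illustrated with a minimal sketch in which an animal's genomic breeding value is the sum, over all SNPs, of its centred allele count multiplied by a small additive effect. The genotype matrix, the SNP effects and the function names below are hypothetical and chosen only for illustration. In practice the effects would first be estimated from a reference population with both genotypes and phenotypes, a step that is not shown here.

```python
def allele_frequencies(genotypes):
    """Frequency of the counted allele at each SNP (genotypes coded 0, 1 or 2)."""
    n_animals = len(genotypes)
    n_snps = len(genotypes[0])
    return [sum(row[j] for row in genotypes) / (2 * n_animals) for j in range(n_snps)]


def genomic_breeding_values(genotypes, snp_effects):
    """Sum of centred allele counts times additive SNP effects, per animal."""
    freqs = allele_frequencies(genotypes)
    return [
        sum((row[j] - 2 * freqs[j]) * snp_effects[j] for j in range(len(snp_effects)))
        for row in genotypes
    ]


# Hypothetical toy data: 4 animals x 5 SNPs; effects are in trait units per allele copy.
genotypes = [
    [0, 1, 2, 1, 0],
    [1, 1, 0, 2, 1],
    [2, 0, 1, 0, 2],
    [1, 2, 1, 1, 1],
]
snp_effects = [0.10, -0.05, 0.02, 0.08, -0.03]

gebv = genomic_breeding_values(genotypes, snp_effects)
# Rank animals from highest to lowest predicted breeding value.
ranking = sorted(range(len(gebv)), key=lambda i: gebv[i], reverse=True)
```

Because the allele counts are centred on frequencies computed from the same animals, the predicted values average to zero, so each value reads as a deviation from the mean of this toy population.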
Within-breed selection
A process by which sires and dams that have above average breeding values are selected as parents to produce the next generation of animals.
Genetic gains
Differences in the average breeding values of the population before and after selection. Genetic gain is a function of the amount of genetic variance, the accuracy of selection, the intensity of selection and the generation interval.
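One common way to write the function named in this definition is the breeder's equation, in which annual gain equals selection intensity times accuracy times the additive genetic standard deviation, divided by the generation interval in years. The sketch below assumes that form. The two parameter sets are hypothetical and only illustrate how a shorter generation interval can outweigh a lower accuracy.

```python
def annual_genetic_gain(intensity, accuracy, genetic_sd, generation_interval_years):
    """Breeder's equation: gain per year = i * r * sigma_A / L."""
    if generation_interval_years <= 0:
        raise ValueError("generation interval must be positive")
    return intensity * accuracy * genetic_sd / generation_interval_years


# Hypothetical scheme relying on progeny testing: accurate but slow.
progeny_test_gain = annual_genetic_gain(
    intensity=2.0, accuracy=0.9, genetic_sd=1.0, generation_interval_years=6.0
)

# Hypothetical scheme relying on genomic predictions: less accurate but much faster.
genomic_gain = annual_genetic_gain(
    intensity=2.0, accuracy=0.7, genetic_sd=1.0, generation_interval_years=2.0
)
```

With these made-up inputs the gain is 0.3 genetic standard deviations per year in the first scheme and 0.7 in the second. Real values depend on the species and breeding programme.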
Harnessing genomic information for livestock improvement
Michel Georges1,2*, Carole Charlier1,2 and Ben Hayes3
1 Unit of Animal Genomics, GIGA Institute, University of Liège, Liège, Belgium.
2 Faculty of Veterinary Medicine, University of Liège, Liège, Belgium.
3 Queensland Alliance for Agriculture and Food Innovation (QAAFI), Queensland Bioscience Precinct, The University of Queensland, Brisbane, Queensland, Australia.
*e-mail: michel.georges@uliege.be
https://doi.org/10.1038/s41576-018-0082-2
REVIEWS
Nature reviews
|
Genetics

GS is increasingly being adopted by other livestock industries and in plant breeding11,12. Similar methods are now also used in human genetics to study the genetic architecture of common complex diseases and to predict individual disease risk13–15. Although GS as implemented today is expected to enable genetic progress of up to twofold in dairy cattle and layer hens, with more modest gains in other species, it is unlikely to be sufficient to meet the expected 70% increase in the world demand of animal products by 2050 (REF. 16). Further improvements and additions to GS will be needed to meet this target.
In this Review, we examine the status of genomic resources available in the major livestock species (that is, cattle, sheep, goat, pig, poultry and salmon) and how these are being used to accelerate the discovery and management of defect-causing genes, to improve the accuracy and extend the scope of GS, to orient genome editing strategies and to develop new applications that take advantage of the genomic information that is becoming widely available.
Genomic resources for livestock species
New scaffolding methods have dramatically improved livestock reference genomes. Following the lead of the human and mouse genome projects, the animal genomics community generated draft reference genomes for the major livestock species (poultry17, cattle18, pig19, goat20, sheep21 and salmon22), first using Sanger sequencing with hybrid (clone-by-clone and whole genome) shotgun approaches23, increasingly complemented with massively parallel generation of short reads. These efforts provided initial insights into the evolution of the gene repertoire underlying adaptive features (such as plumage and beak formation, rumination and lactation, wool growth, and smell and taste specification) and into changes resulting from whole-genome duplication in salmon. They also contributed to the identification of evolutionarily conserved elements24. However, the quality of most of these reference genomes has remained a source of concern. They were highly fragmented and littered with assembly errors, which affect positional cloning efforts and imputation accuracy, among other uses25. Critical mass and funding have long been missing to upgrade their status from highly fragmented drafts to high-quality finished genomes. However, the development of new scaffolding approaches, including long-read sequences (such as PacBio), optical mapping (such as Bionano Genomics) and chromatin conformation capture, now provides an affordable path to high-quality reference genomes for all species26. The integrated use of these methods has recently enabled spectacular improvements in the quality of reference genomes for goat27 and other livestock species (TABLE 1) (genome assemblies are available through the NCBI Genome database).
Genome-wide SNP arrays are available for the main livestock species. Draft reference genomes were typically accompanied by shallow (1–2-fold depth) sequencing of tens to hundreds of individuals representing distinct breeds and populations in order to characterize genetic variation and infer demographic history (see, for example, REFS 19,28,29). These efforts have uncovered millions of genetic variants for all the main livestock species and profoundly changed our understanding of the domestication process (BOX 1). Databases of available SNPs (such as dbSNP) have been used to develop a large number of arrays that allow cost-effective genotyping of tens of thousands to hundreds of thousands of variants in the major livestock species (Supplementary Table 1). These arrays are extensively used to conduct GWAS, as well as GS. It is estimated that at least 3 million cattle and possibly millions of pigs and poultry have been genotyped using genome-wide SNP arrays10,11.
Population-based resequencing for imputation-based GWAS and GS. As sequencing costs continue to decrease, livestock geneticists are resequencing the genomes of a growing number of animals. More than 2,500 cattle have had their whole genome resequenced, while the corresponding numbers are at least in the hundreds for pig, poultry, sheep and goats (M. Groenen, R. Hawken and G. Tosser-Klopp, personal communications). The best-known large-scale resequencing initiative in livestock genetics is the 1,000 Bull Genomes Project25, but other large sequencing projects are being conducted by academic groups and breeding companies30. These efforts are largely inspired by the human 1,000 Genomes Project31 and hope to achieve deep characterization of the genetic variation between and within populations. Importantly, sequencing the whole genomes of a reference population of hundreds of animals enables genotype imputation at millions of common variants in the much larger number of animals that have been genotyped with genome-wide SNP arrays. This approach can be implemented using pyramidal schemes in which a top layer of a few (possibly hundreds) highly influential animals are sequenced, an intermediate layer of ‘multiplier’ animals are genotyped with high-density SNP arrays and the most populated bottom layer of animals are genotyped with low-density SNP arrays. Sequence information is then projected from the upper two layers onto the animals of the bottom layer using a two-step imputation strategy32. Livestock represent a unique opportunity to implement this approach because samples from key ancestors of the population are often available in the form of semen straws or ampules. In the 1,000 Bull Genomes Project, bulls born in the 1960s are included in the set of sequenced animals. The availability of whole-genome sequence information for tens to hundreds of animals from specific breeds has proved extremely useful to pinpoint the causative mutations underlying monogenic defects25,33. Imputation of sequence information on large cohorts of phenotyped animals greatly accelerates fine-mapping and identification of causative variants for QTL detected by GWAS25,34. It is also anticipated that imputed sequence information could increase the accuracy of GS (see Increasing the accuracy of GS using whole-genome sequence imputation).

Table 1 | Current status of the reference genomes for the most important livestock species
Species | Assembly | Release date | Coverage | Number of contigs | Contig N50 (Mb) | Total (Gb)
Pig (Sus scrofa) | Sscrofa11.1 | 7 Feb 2017 | 65× | 1,118 | 48.2 | 2.5
Goat (Capra hircus) | ARS1 | 24 Aug 2016 | 50× | 30,399 | 26.2 | 2.9
Cattle (Bos taurus) | ARS-UCD1.2 | 11 Apr 2018 | 80× | 2,597 | 25.9 | 2.7
Chicken (Gallus gallus) | GRCg6a | 27 Mar 2018 | 82× | 1,402 | 17.6 | 1
Sheep (Ovis aries) | Oar_rambouillet_V1.0 | 2 Nov 2017 | 126× | 7,485 | 2.6 | 2.9
Zebu (Bos indicus) | AM293397v1 | 22 Feb 2018 | 100× | 337,292 | 0.064 | 2.7
Salmon (Salmo salar) | ICSASG_v2 | 10 Jun 2015 | 206× | 368,060 | 0.058 | 3

Quantitative trait loci (QTL)
Regions in the genome that encompass genetic variants with an effect on a quantitative trait of interest.
Genome-wide association studies (GWAS)
Scan of the entire genome to identify genetic variants for which variation in genotype is associated with variation for one or more phenotypes of interest.
Genomic selection (GS)
An ensemble of methods to estimate the breeding values of individual animals on the basis of genome-wide single-nucleotide polymorphism genotype information.
Single-nucleotide polymorphism arrays (SNP arrays)
Microarrays used to determine the genotype of individuals for hundreds to millions of SNPs at once.
Progeny testing (PT)
An approach by which the breeding value of an animal is estimated from phenotypic measures made on its progeny.
Genetic architecture
The description of the number, location and effects of the genetic variants that affect a phenotype of interest.
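The idea of projecting sequence information from a sequenced reference panel onto array-genotyped animals can be sketched with a deliberately simplified toy example. It assumes phased haplotypes coded 0/1 and copies the missing alleles from the single reference haplotype that best matches the target at the genotyped sites. Production imputation software uses probabilistic haplotype models instead, and all names and data below are hypothetical.

```python
def impute_haplotype(target, reference_panel):
    """Fill None entries in `target` from the best-matching reference haplotype.

    `target` lists alleles (0/1) at genotyped sites and None at ungenotyped sites.
    `reference_panel` lists fully sequenced haplotypes over the same sites.
    """
    observed = [j for j, allele in enumerate(target) if allele is not None]

    def n_matches(reference):
        # Number of genotyped sites where the reference agrees with the target.
        return sum(1 for j in observed if reference[j] == target[j])

    best = max(reference_panel, key=n_matches)
    return [best[j] if allele is None else allele for j, allele in enumerate(target)]


# Hypothetical panel of three sequenced haplotypes over eight variant sites.
reference_panel = [
    [0, 0, 1, 1, 0, 1, 0, 0],
    [1, 0, 0, 1, 1, 0, 1, 0],
    [1, 1, 0, 0, 1, 1, 0, 1],
]

# Hypothetical array-genotyped haplotype: only sites 0, 3 and 6 were typed.
target = [1, None, None, 1, None, None, 1, None]

imputed = impute_haplotype(target, reference_panel)
```

The second reference haplotype agrees with the target at all three genotyped sites, so its alleles are copied into the five untyped positions. This works only to the extent that neighbouring variants are in linkage disequilibrium.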
Epigenome maps and eQTL data sets enable functional follow-up of GWAS hits. It is increasingly recognized that regulatory (rather than coding) variants account for the majority of the genetic variation underlying complex traits, such as common complex diseases in humans or economically important traits in plants and animals35,36. Most of these regulatory variants are expected to affect components of gene switches, that is, proximal promoters and more distant enhancers and silencers. To aid in the identification of such regulatory variants and the genes whose expression they affect, the animal genomics community has begun to generate epigenome maps, mainly using ChIP-Seq (chromatin immunoprecipitation followed by sequencing), DNase-Seq (DNase I hypersensitive site sequencing) and ATAC-Seq (assay for transposase-accessible chromatin using sequencing), which will provide exhaustive catalogues of gene regulatory elements in livestock. Most of these efforts are coordinated through the international Functional Annotation of Animal Genomes (FAANG) project37,38. Liver-specific comparative enhancer maps based on histone modification data have already been generated for 20 mammals, including cow, pig and rabbit39. In addition, bovine DNA methylation maps have been generated for ten somatic tissues using reduced representation bisulfite sequencing40.
Epigenome maps are complemented by multi-tissue transcriptome data sets for the analysis of expression quantitative trait loci (eQTL). In cattle, such data sets have been generated for mammary gland, liver, blood and adrenal gland and have been used to identify causative genes underlying GWAS-identified QTL41–45. In pigs, eQTL studies have been conducted in skeletal muscle, lung, adipose tissue and liver46–57. In poultry, genome-wide eQTL analyses have been reported for liver, bone, adrenal gland and hypothalamus58–61. The time seems right for the animal genomics community to take advantage of working with livestock species to collaboratively generate large, multi-omic, multi-tissue data sets similar to the human Genotype-Tissue Expression (GTEx) data set62. This approach would provide invaluable comparative information about genome function and greatly facilitate follow-up studies of GWAS and GS hits in these species.
Important Mendelian traits in livestock
The early 20th century saw a heated debate between Mendelists and Galtonists, with Galtonists claiming that Mendelian genes accounted for only a small proportion of inherited features. The debate was settled when it was realized that quantitative traits derive their continuous distribution from the combined effects of many segregating Mendelian genes (that is, they are polygenic traits). It remains true, however, that Mendelian traits, that is, phenotypes that are fully determined by one gene (monogenic) or a small number of genes (oligogenic), are the exception rather than the rule. In humans, Mendelian traits are largely limited to blood groups and an admittedly long list of severe genetic defects that are compiled in the Online Mendelian Inheritance in Man (OMIM) database and that include the ‘inborn errors of metabolism’. In addition to blood groups and a similar list of severe genetic defects compiled in the Online Mendelian Inheritance in Animals (OMIA) database, Mendelian traits in domestic animals also include an extended list of breed-defining characteristics, such as coat colour, tegument variation, polledness, double-muscling and hyper-prolificacy.

Genotype imputation
The in silico prediction of the genotype of an individual for ungenotyped variants on the basis of known genotypes at neighbouring variants and a reference population with genotype information for all variants. Imputation exploits the nonrandom association of alleles at neighbouring variants, referred to as linkage disequilibrium.
Soft sweeps
The process by which the frequency of a favourable old variant rapidly increases in the population by positive selection until eventual fixation. Soft sweeps are not associated with the concomitant fixation of one predominant haplotype, as the variant has been distributed over multiple haplotypes by recombination before selection. Old variants that are substrates for new selection constitute the standing variation in the population.

Box 1 | Genetic variation provides insight into the domestication process
Punctuated versus continuous domestication. One of the most striking insights gained from studying genetic variation in livestock species is the realization that the degree of genetic variation, measured, for instance, by the average heterozygosity per nucleotide site (π), is typically higher in livestock than in humans19,28,29. This observation is against expectations. It is often assumed that animal domestication occurred through rare, isolated events involving a limited number of animals, which would have caused drastic genetic bottlenecks. Further reduction in effective population size would have accompanied more recent breed creation and been accentuated by intensifying selection schemes. However, domestic animal populations remain more variable than the people who domesticated them. This realization forces us to revisit our views of the domestication process. Domestication most likely involved continuous gene flow between domestic and wild individuals from the same species, as well as from inter-fertile sub-species, during most of agricultural history191. As a result, the genomes of the majority of domestic livestock species probably have a mosaic structure that is at least as pronounced as that of the laboratory mouse192 or human193. Some haplotypes segregating within pig and cattle breeds have been shown to differ approximately every 100 bp, which is a similar sequence identity to humans and chimpanzees; thus, they possibly coalesced ~5 million years ago, which is before the creation of the studied species118,194,195.
Hard sweeps, soft sweeps and polygenic adaptation during domestication. Comparisons between the genome sequences of domestic animals and their wild extant or extinct progenitors (that is, red jungle fowl196, rabbit197, wild boar191, bezoar198 and aurochs199) have identified chromosome regions that may have undergone hard sweeps driven by the domestication process. These regions seem to be enriched in genes that control behaviour and stature. The most convincing 40 kb hard sweep signature encompasses the G558R missense mutation in the chicken thyroid stimulating hormone receptor (TSHR), known to have a key role in metabolic regulation and photoperiod control of reproduction196. These genomic regions may correspond to islands of domestication that resist recurrent gene flow from wild progenitor species191. It is worth noting that the methods used to detect selective sweeps associated with domestication pick up only hard sweeps acting on very rare or de novo mutations. Soft sweeps acting on older and hence more common mutations (that is, standing variation in the wild progenitor) require alternative methods for their detection200. It is also noteworthy that evidence suggests that tame behaviour in rabbits and possibly other species evolved through shifts in allelic frequency at many loci (that is, polygenic adaptation) rather than critical changes at a few domestication loci197.
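Box 1 uses the average heterozygosity per nucleotide site (π) as its measure of genetic variation. A minimal sketch of one standard way to compute it, assuming a set of aligned haplotype sequences of equal length, is to average the number of differences over all pairs of sequences and divide by the number of sites. The sequences and function name below are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Average pairwise differences per site among aligned sequences."""
    n_sites = len(sequences[0])
    if any(len(seq) != n_sites for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(sequences, 2))
    total_differences = sum(
        sum(1 for a, b in zip(first, second) if a != b) for first, second in pairs
    )
    return total_differences / (len(pairs) * n_sites)


# Hypothetical sample of four aligned haplotypes over ten sites.
haplotypes = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGAACGTAC",
    "ACGAACGTTC",
]

pi = nucleotide_diversity(haplotypes)
```

Here the six pairs differ at 7 positions in total, so π is 7 / (6 × 10), roughly 0.117 per site, far above any realistic genome-wide value. For comparison, two haplotypes that differ about once every 100 bp, as mentioned in Box 1, correspond to about 0.01 differences per site for that pair.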
Most breed-defining traits have been molecularly characterized. For millennia, animal breeders have performed what amounts to a mega-scale phenotype-driven mutagenesis screen. In the process, they have identified a series of mutations with large phenotypic effects that, when desirable, were selected, often becoming trademarks and breed-defining features. In many instances, mutant variants were valued because of their aesthetic effects on the animals, such as patterns of coat and plumage colour; shape of ears, horns, wattles or combs; and tonality of songs. The long-standing interest of breeders for ‘fancy’ animals is well illustrated by 7,000-year-old rock paintings in the Sahara63. In other instances, the value of the mutant variants reflects their utility. For instance, mutations with major beneficial effects on hair (such as quality of angora or cashmere) and skin texture (such as heat tolerance of slick cattle), fertility (such as twinning) and muscularity (for example, double-muscling) are all highly desired. Although some mutant phenotypes are easily recognized, others are subtler and may require human–animal proximity for their detection. An example of such a phenotype is pacing in horses, which was shown recently to result from a premature stop codon in DMRT3, a gene that controls spinal circuitry64. It is hard to imagine that such a phenotype could be detected in the systematic phenotype-driven screens that are currently being conducted in the mouse.
Over the past 10 years, as genomic resources and methods improved, the causative genes and mutations underlying most of these breed-defining characteristics have been identified, and a number of dominant themes have emerged65 (TABLE 2). Most (75%) of the corresponding mutations are at least partially dominant, that is, heterozygotes express a phenotype; such mutations would have been easier to detect and maintain in the population than recessive ones. A large proportion of mutations affect gene regulation (43%), resulting in gain-of-function phenotypes through ectopic gene expression. Regulatory mutations are often structural (64%) and involve duplications, insertions (including of retroelements), inversions or combinations thereof. The same phenotype is often determined by mutations in the same gene in different species and by allelic series within species, which indicates that mutations of only that gene can generate the corresponding phenotype without major deleterious pleiotropy. Different phenotypes are sometimes caused by allelic series that have evolved one from the other by serial accumulation of multiple mutations.
The molecular dissection of breed-defining traits has revealed some remarkable biology, including the demonstration of serial translocation by circular intermediates, which is likely to be an ancient exon shuffling mechanism66, the identification of a hypomorphic MSTN mutation resulting from the acquisition of an illegitimate microRNA target site67 and the interplay between cis-effects and microRNA-mediated trans-effects underlying polar overdominance of the callipyge phenotype in sheep68. However, it is noteworthy that the molecular underpinnings of the cashmere and mohair wool types in goat and the very widespread recessive piebald phenotype in cattle remain unknown.
About the number of defect- causing recessive muta-
tions carried per individual. Diploidy has enabled an
increase in genome size while ensuring that most indi-
viduals in the population have at least one functional
copy of each gene. Concomitantly, most individuals
are expected to be heterozygous for loss- of-function
(LoF) alleles in a number of
haplosufficient genes. The
analysis of whole- genome sequences from large num-
bers of individuals indicates that this number is ~100
in humans
69
. It appears to be very similar in livestock species, including the cow³⁰. This estimate is much higher than expected from epidemiological studies, which suggest that humans carry on average ~0.5–1 allele that is lethal when homozygous⁷⁰. This apparent conundrum can be explained by the observation that for the majority of genes (~75%), homozygosity for LoF mutations is viable but confers a modest selective disadvantage that is sufficient to preclude fixation of the mutations, which explains the evolutionary conservation of the corresponding gene. Indeed, data from the International Mouse Phenotype Consortium indicate that only ~25% of mammalian genes are essential in the sense that at least one functional allele is needed for survival until reproductive age. Homozygosity (or compound heterozygosity) for LoF mutations in the corresponding genes is lethal, either before (embryonic lethal (EL)) or after birth. The number of such recessive lethal alleles that are carried, on average, by healthy individuals has been of considerable interest for a long time⁷¹. Indeed, this number determines, for instance, the increased morbidity endured by offspring of consanguineous marriages or matings. Simulations for mammalian genomes³⁰ suggest that this number increases with effective population size (Nₑ), from ~0.5 for Nₑ = 100 (which is the Nₑ for many livestock populations) to ~5 for Nₑ = 10,000 (which is the Nₑ of the human population). Approximately 1% (independent of Nₑ) of conceptuses succumb to homozygosity (or compound heterozygosity) for at least one of around ten common EL mutations (frequency >0.02) in livestock compared with at least one among thousands of rare EL mutations in humans³⁰. This observation suggests that managing severe genetic defects in livestock populations (including EL mutations) is a tractable problem that requires the identification and tracking of around ten such common mutations per population. It is noteworthy that in humans (and probably in other mammals), an estimated 3,000 genes are haploinsufficient and hence LoF-intolerant⁷².
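The arithmetic behind these figures can be sketched in a few lines. The following is a minimal illustration, not the simulation model of the cited study: it assumes random mating, independent loci and fully recessive lethality, and the allele frequencies and inbreeding coefficient used are illustrative values chosen here, not data from the source. The function names are hypothetical.

```python
from math import prod


def fraction_lost_random_mating(allele_freqs):
    """Fraction of conceptuses homozygous for at least one recessive EL allele.

    Assumes random mating and independent loci, so a conceptus is
    homozygous at a locus with allele frequency q with probability q**2.
    """
    return 1.0 - prod(1.0 - q * q for q in allele_freqs)


def excess_risk_inbred(lethals_per_individual, inbreeding_coefficient):
    """First-order extra risk for offspring with inbreeding coefficient F.

    A locus is autozygous with probability F, and the autozygous allele is
    a lethal with probability q; summing q over loci gives the number of
    lethals per gamete, i.e. half the number carried per diploid individual.
    """
    return inbreeding_coefficient * lethals_per_individual / 2.0


# Illustrative (assumed) scenario: ten common EL mutations at frequency 0.03.
ten_common = [0.03] * 10
print(f"loss under random mating: {fraction_lost_random_mating(ten_common):.4f}")

# Offspring of first cousins (F = 1/16) for 0.5 versus 5 lethals per individual.
for carried in (0.5, 5):
    print(f"{carried} lethals carried: {excess_risk_inbred(carried, 1 / 16):.4f}")
```

With ten mutations at a frequency of 0.03 the expected loss is roughly 0.9% of conceptuses, and at 0.02 it is roughly 0.4%, which is the order of magnitude quoted in the text. The second function shows why the number of lethals carried per individual matters for consanguineous matings: under these assumptions the extra risk scales linearly with it.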
Identifying causative mutations for recessive defects has become trivial. Livestock populations are characterized by recurrent outbursts of genetic defects. This is particularly true for species such as cattle in which artificial insemination allows elite sires to have tens
Epigenome
The combination of chemical modifications of the DNA sequence (such as cytosine methylation) or nucleosomes (such as methylation of Lys 27 of histone H3) that mark functionally distinct segments of the genome (such as active enhancers) and are inherited mitotically and/or meiotically.

ChIP-Seq
A combination of chromatin immunoprecipitation and next-generation sequencing for genome-wide mapping of binding sites occupied by specific DNA-binding proteins or chromatin regions enriched in specific histone modifications.

DNase-Seq
A method based on next-generation sequencing for genome-wide detection of gene-switch components on the basis of their open chromatin conformation and resulting hypersensitivity to digestion by DNase I.

ATAC-Seq
An assay based on next-generation sequencing for genome-wide detection of gene-switch components on the basis of their open chromatin conformation and resulting increased accessibility to transposase Tn5.

Expression quantitative trait loci (eQTL)
Quantitative trait loci that influence the transcript levels of specific genes. Cis-eQTL are due to regulatory variants that control the levels of RNA molecules transcribed from gene copies located on the same DNA molecule as the variant. Trans-eQTL are due to regulatory variants that can also control the levels of RNA molecules transcribed from gene copies located on different DNA molecules to the variant (homologous or other chromosomes).

Pleiotropy
The ability of a genetic variant to affect more than one phenotype.

Hypomorphic
Pertaining to an allele with partial loss of function when compared with the wild-type allele.

Table 2 | Breed-defining traits that have been characterized at the molecular level

Gene | Species | Transmission | Phenotype | Mutations (coding/regulatory)

Coat or feather colour: melanocyte development
KIT | Bovine | dom | Colour-sided | DUPC6ᵃ,ᵇ, DUPC29ᵃ,ᵇ
KIT | Bovine | 1/2 dom | Degree of white spotting | Unknown
KIT | Pig | dom | White | (SS + DUP1ᵃ + DUP2–4ᵃ)ᵇ
KIT | Pig | dom | Patch | DUP1ᵃ,ᵇ
KIT | Pig | dom | Belt | (DUP2–4)ᵃ,ᵇ
KIT | Pig | codom | Roan | SS
KITLG | Bovine | codom | Roan | MS
KITLG | Goat | codom | Roan | Unknown
MITF | Bovine | dom | White, blue eyes, hearing loss | MS 3bpDEL
MITF | Bovine | 1/2 dom | Degree of white spotting | Unknown
SOX10 | Chicken | rec | Dark brown | DELᵃ
TWIST2 | Bovine | dom | White belt | QUADᵃ
CDKN2A | Chicken | Z-linked dom | Extreme dilution | (SNP1–2)ᵇ
CDKN2A | Chicken | Z-linked 1/2 dom | Dilution | (MS1 + SNP1–2)ᵇ
CDKN2A | Chicken | Z-linked dom | Barring | (MS2 + SNP1–2)ᵇ
EDNRA | Goat | 1/2 dom | Degree of white spotting | (MS + CNVᵃ)
EDNRB2 | Chicken | rec | Mottled | MS
EDNRB2 | Chicken | rec | White | MS

Coat or feather colour: melanin synthesis
MC1R | Bovine | dom | Black | MS
MC1R | Bovine | rec | Red | FS
MC1R | Bovine | rec | Telstar | Unknown
MC1R | Pig | dom | Black | MS1, MS2
MC1R | Pig | rec | Red | MS3 + MS4
MC1R | Pig | | Coat-colour diversity | MS1–8, FS1
MC1R | Pig | som | Black spotting | (FS + MS2)
MC1R | Sheep | dom | Black | MS1, MS2
MC1R | Goat | dom | Black | MS
MC1R | Goat | rec | Red | SG
MC1R | Chicken | dom | Extended black | MS1ᵇ
MC1R | Chicken | rec | Buttercup | (MS1 + MS2)ᵇ
ASIP | Bovine | 1/2 dom | Brindle | INS (LINE)ᵃ
ASIP | Sheep | dom | White or tan | DUPᵃ
ASIP | Sheep | rec | Self-colour black | FS, 9bpDEL, MS
TYR | Bovine | rec | Albino | FS
TYR | Chicken | rec | Albino | 6bpDEL
TYR | Chicken | rec | White | INS (ERV)ᵃ
TYRP1 | Bovine | rec | Dun | MS
TYRP1 | Pig | 1/2 dom | Brown or blond | 6bpDEL
TYRP1 | Sheep | rec | Light coat | MS
TYRP1 | Goat | dom | Brown | MS

Coat or feather colour: melanin transport
PMEL | Bovine | rec | Dilution | MS, 3bpDEL
PMEL | Chicken | dom | White | 9bpINSᵇ
PMEL | Chicken | dom | Smokey | (9bpINS + 12bpDEL)ᵇ
PMEL | Chicken | dom | Dun | 15bpDEL

Citations
Journal Article
TL;DR: The comparison of related genomes has emerged as a powerful lens for genome interpretation as mentioned in this paper, which reveals a small number of new coding exons, candidate stop codon readthrough events and over 10,000 regions of overlapping synonymous constraint within protein-coding exons.
Abstract: The comparison of related genomes has emerged as a powerful lens for genome interpretation. Here we report the sequencing and comparative analysis of 29 eutherian genomes. We confirm that at least 5.5% of the human genome has undergone purifying selection, and locate constrained elements covering ∼4.2% of the genome. We use evolutionary signatures and comparisons with experimental data sets to suggest candidate functions for ∼60% of constrained bases. These elements reveal a small number of new coding exons, candidate stop codon readthrough events and over 10,000 regions of overlapping synonymous constraint within protein-coding exons. We find 220 candidate RNA structural families, and nearly a million elements overlapping potential promoter, enhancer and insulator regions. We report specific amino acid residues that have undergone positive selection, 280,000 non-coding elements exapted from mobile elements and more than 1,000 primate- and human-accelerated elements. Overlap with disease-associated variants indicates that our findings will be relevant for studies of human biology, health and disease.

926 citations

Journal Article
30 Dec 2002-Genomics
TL;DR: Using a denser chromosome 20 marker map and exploiting linkage disequilibrium using two distinct approaches, strong evidence is provided that a chromosome segment including the gene coding for the growth hormone receptor accounts for at least part of the chromosome 20 QTL effect.

382 citations

Journal ArticleDOI
TL;DR: The authors review how genomics is being applied to aquaculture species at all stages of the domestication process to optimize selective breeding and how combining genomic selection with biotechnological innovations, such as genome editing and surrogate broodstock technologies, may further expedite genetic improvement in aquaculture.
Abstract: Aquaculture is the fastest-growing farmed food sector and will soon become the primary source of fish and shellfish for human diets. In contrast to crop and livestock production, aquaculture production is derived from numerous, exceptionally diverse species that are typically in the early stages of domestication. Genetic improvement of production traits via well-designed, managed breeding programmes has great potential to help meet the rising seafood demand driven by human population growth. Supported by continuous advances in sequencing and bioinformatics, genomics is increasingly being applied across the broad range of aquaculture species and at all stages of the domestication process to optimize selective breeding. In the future, combining genomic selection with biotechnological innovations, such as genome editing and surrogate broodstock technologies, may further expedite genetic improvement in aquaculture.

257 citations

19 Aug 2014
TL;DR: It is demonstrated that embryonic lethal mutations account for a non-negligible fraction of the decline in fertility of domestic cattle, and that associated positive effects on milk yield may account for part of the negative genetic correlation.
Abstract: In dairy cattle, the widespread use of artificial insemination has resulted in increased selection intensity, which has led to spectacular increase in productivity. However, cow fertility has concomitantly severely declined. It is generally assumed that this reduction is primarily due to the negative energy balance of high-producing cows at the peak of lactation. We herein describe the fine-mapping of a major fertility QTL in Nordic Red cattle, and identify a 660-kb deletion encompassing four genes as the causative variant. We show that the deletion is a recessive embryonically lethal mutation. This probably results from the loss of RNASEH2B, which is known to cause embryonic death in mice. Despite its dramatic effect on fertility, 13%, 23% and 32% of the animals carry the deletion in Danish, Swedish and Finnish Red Cattle, respectively. To explain this, we searched for favorable effects on other traits and found that the deletion has strong positive effects on milk yield. This study demonstrates that embryonic lethal mutations account for a non-negligible fraction of the decline in fertility of domestic cattle, and that associated positive effects on milk yield may account for part of the negative genetic correlation. Our study adds to the evidence that structural variants contribute to animal phenotypic variation, and that balancing selection might be more common in livestock species than previously appreciated.

122 citations

Journal ArticleDOI
13 Jan 2020
TL;DR: Large-scale application of genomic selection in plants can be achieved by refining field management to improve heritability estimation and prediction accuracy and developing optimum GS models with the consideration of genotype-by-environment interaction and non-additive effects, along with significant cost reduction.
Abstract: Although long-term genetic gain has been achieved through increasing use of modern breeding methods and technologies, the rate of genetic gain needs to be accelerated to meet humanity's demand for agricultural products. In this regard, genomic selection (GS) has been considered most promising for genetic improvement of the complex traits controlled by many genes each with minor effects. Livestock scientists pioneered GS application largely due to livestock's significantly higher individual values and the greater reduction in generation interval that can be achieved in GS. Large-scale application of GS in plants can be achieved by refining field management to improve heritability estimation and prediction accuracy and developing optimum GS models with the consideration of genotype-by-environment interaction and non-additive effects, along with significant cost reduction. Moreover, it would be more effective to integrate GS with other breeding tools and platforms for accelerating the breeding process and thereby further enhancing genetic gain. In addition, establishing an open-source breeding network and developing transdisciplinary approaches would be essential in enhancing breeding efficiency for small- and medium-sized enterprises and agricultural research systems in developing countries. New strategies centered on GS for enhancing genetic gain need to be developed.

115 citations

References
Journal ArticleDOI
Adam Auton, Gonçalo R. Abecasis, David Altshuler, Richard Durbin, +514 more
01 Oct 2015-Nature
TL;DR: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations, and has reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.
Abstract: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.

12,661 citations

Journal ArticleDOI
15 Feb 2013-Science
TL;DR: The type II prokaryotic CRISPR (clustered regularly interspaced short palindromic repeats)/Cas adaptive immune system has been shown to facilitate RNA-guided site-specific DNA cleavage as discussed by the authors.
Abstract: Functional elucidation of causal genetic variants and elements requires precise genome editing technologies. The type II prokaryotic CRISPR (clustered regularly interspaced short palindromic repeats)/Cas adaptive immune system has been shown to facilitate RNA-guided site-specific DNA cleavage. We engineered two different type II CRISPR/Cas systems and demonstrate that Cas9 nucleases can be directed by short RNAs to induce precise cleavage at endogenous genomic loci in human and mouse cells. Cas9 can also be converted into a nicking enzyme to facilitate homology-directed repair with minimal mutagenic activity. Lastly, multiple guide sequences can be encoded into a single CRISPR array to enable simultaneous editing of several sites within the mammalian genome, demonstrating easy programmability and wide applicability of the RNA-guided nuclease technology.

12,265 citations

01 Feb 2013
TL;DR: Two different type II CRISPR/Cas systems are engineered and it is demonstrated that Cas9 nucleases can be directed by short RNAs to induce precise cleavage at endogenous genomic loci in human and mouse cells, demonstrating easy programmability and wide applicability of the RNA-guided nuclease technology.
Abstract: Genome Editing Clustered regularly interspaced short palindromic repeats (CRISPR) function as part of an adaptive immune system in a range of prokaryotes: Invading phage and plasmid DNA is targeted for cleavage by complementary CRISPR RNAs (crRNAs) bound to a CRISPR-associated endonuclease (see the Perspective by van der Oost). Cong et al. (p. 819, published online 3 January) and Mali et al. (p. 823, published online 3 January) adapted this defense system to function as a genome editing tool in eukaryotic cells. A bacterial genome defense system is adapted to function as a genome-editing tool in mammalian cells. [Also see Perspective by van der Oost] Functional elucidation of causal genetic variants and elements requires precise genome editing technologies. The type II prokaryotic CRISPR (clustered regularly interspaced short palindromic repeats)/Cas adaptive immune system has been shown to facilitate RNA-guided site-specific DNA cleavage. We engineered two different type II CRISPR/Cas systems and demonstrate that Cas9 nucleases can be directed by short RNAs to induce precise cleavage at endogenous genomic loci in human and mouse cells. Cas9 can also be converted into a nicking enzyme to facilitate homology-directed repair with minimal mutagenic activity. Lastly, multiple guide sequences can be encoded into a single CRISPR array to enable simultaneous editing of several sites within the mammalian genome, demonstrating easy programmability and wide applicability of the RNA-guided nuclease technology.

10,746 citations

Journal ArticleDOI
Monkol Lek, Konrad J. Karczewski, Eric Vallabh Minikel, Kaitlin E. Samocha, Eric Banks, Timothy Fennell, Anne H. O’Donnell-Luria, James S. Ware, Andrew J. Hill, Beryl B. Cummings, Taru Tukiainen, Daniel P. Birnbaum, Jack A. Kosmicki, Laramie E. Duncan, Karol Estrada, Fengmei Zhao, James Zou, Emma Pierce-Hoffman, Joanne Berghout, David Neil Cooper, Nicole A. Deflaux, Mark A. DePristo, Ron Do, Jason Flannick, Menachem Fromer, Laura D. Gauthier, Jackie Goldstein, Namrata Gupta, Daniel P. Howrigan, Adam Kiezun, Mitja I. Kurki, Ami Levy Moonshine, Pradeep Natarajan, Lorena Orozco, Gina M. Peloso, Ryan Poplin, Manuel A. Rivas, Valentin Ruano-Rubio, Samuel A. Rose, Douglas M. Ruderfer, Khalid Shakir, Peter D. Stenson, Christine Stevens, Brett Thomas, Grace Tiao, María Teresa Tusié-Luna, Ben Weisburd, Hong-Hee Won, Dongmei Yu, David Altshuler, Diego Ardissino, Michael Boehnke, John Danesh, Stacey Donnelly, Roberto Elosua, Jose C. Florez, Stacey Gabriel, Gad Getz, Stephen J. Glatt, Christina M. Hultman, Sekar Kathiresan, Markku Laakso, Steven A. McCarroll, Mark I. McCarthy, Dermot P.B. McGovern, Ruth McPherson, Benjamin M. Neale, Aarno Palotie, Shaun Purcell, Danish Saleheen, Jeremiah M. Scharf, Pamela Sklar, Patrick F. Sullivan, Jaakko Tuomilehto, Ming T. Tsuang, Hugh Watkins, James G. Wilson, Mark J. Daly, Daniel G. MacArthur
18 Aug 2016-Nature
TL;DR: The aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC) provides direct evidence for the presence of widespread mutational recurrence.
Abstract: Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.

8,758 citations

Book
01 Jan 1996
TL;DR: This book discusses the genetic Basis of Quantitative Variation, Properties of Distributions, Covariance, Regression, and Correlation, and Properties of Single Loci, and Sources of Genetic Variation for Multilocus Traits.
Abstract: I. The Genetic Basis of Quantitative Variation - An Overview of Quantitative Genetics - Properties of Distributions - Covariance, Regression, and Correlation - Properties of Single Loci - Sources of Genetic Variation for Multilocus Traits - Sources of Environmental Variation - Resemblance Between Relatives - Introduction to Matrix Algebra and Linear Models - Analysis of Line Crosses - Inbreeding Depression - Matters of Scale - II. Quantitative-Trait Loci - Polygenes and Polygenic Mutation - Detecting Major Genes - Basic Concepts of Marker-Based Analysis - Mapping and Characterizing QTLs: Inbred-Line Crosses - Mapping and Characterizing QTLs: Outbred Populations - III. Estimation Procedures - Parent-Offspring Regression - Sib Analysis - Twins and Clones - Cross-Classified Designs - Correlations Between Characters - Genotype x Environment Interaction - Maternal Effects - Sex Linkage and Sexual Dimorphism - Threshold Characters - Estimation of Breeding Values - Variance-Component Estimation with Complex Pedigrees - Appendices - Expectations, Variances and Covariances of Compound Variables - Path Analysis - Matrix Algebra and Linear Models - Maximum Likelihood Estimation and Likelihood-Ratio Tests - Estimation of Power of Statistical Tests

6,530 citations

Frequently Asked Questions (14)
Q1. How many SNPs did the study estimate for milk yield?

The study estimated that for milk yield, 4,330 SNPs had a non-zero effect, with only 7 SNPs explaining 1% or more of the genetic variation.

Before dissemination, the sire’s genome would be edited for a number of causative variants to render them homozygous for the favourable allele. 

DNMs with large effects on the selected traits sequentially undergo hard sweeps, causing large effects detectable by GWAS until the corresponding variants reach fixation¹²⁶.

The number of SNPs needed to achieve adequate accuracy depends on the number of cows on the farm: tens of thousands of SNPs are sufficient for farms with tens of cows, but hundreds of thousands of SNPs are needed for farms with several hundred cows. 

One way to compensate for the fact that most causative SNPs are not directly interrogated on the arrays is to impute full sequence information on genotyped animals. 

The inability to derive embryonic stem cells prevented homologous recombination-based techniques until the development of somatic cell nuclear transfer (SCNT)¹⁷¹, which enabled refined gene replacement by homologous recombination in cultured fetal fibroblasts followed by nuclear transfer to enucleated oocytes.

Effects of a magnitude that is virtually impossible under this model have been identified and, with GBLUP, their effects will be over-conservatively regressed downwards in genomic predictions.

Thus far, efforts in editing the genome of livestock have mostly concentrated on largely uncontroversial human health applications, such as generating animal models of human genetic diseases, producing biopharmaceuticals and xenotransplantation. 

The most convincing evidence indicates that the remainder of the heritability is highly polygenic, corresponding to hundreds if not thousands of genetic variants that each has a very small effect on the trait of interest⁸⁹.

Examples of such major gene effects segregating within breeds include, among others, variants in MSTN in cattle¹¹²⁻¹¹⁵ and sheep⁶⁷ and RYR1, PRKAG3 and IGF2 in pig¹¹⁶⁻¹¹⁸, which all affect muscularity; DGAT1, GHR and ABCG2, which affect milk yield and composition in cattle¹¹⁹⁻¹²¹; and PLAG1, HMGA2 and LCORL, which affect stature in cattle¹²²,¹²³.

Classic examples include a RYR1 variant in pigs that increases carcass yield in heterozygotes but causes porcine stress syndrome and related syndromes in homozygotes¹¹² and bovine MSTN LoF variants that increase muscle mass in heterozygotes but cause birthing difficulties for mothers of homozygous calves.

Before GS, candidate elite dairy sires that had identical EBVs based on pedigree information (for instance, because they were full-sibs) required expensive and time-consuming PT to expose differences in the BVs: their individual EBVs were estimated from the performances of tens to hundreds (depending on the country) of daughters, and PT took at least 5 years at a cost of ~US$50,000 per bull¹⁰.

Nearly 1 million evolutionarily constrained elements that overlap potential promoters, enhancers and insulators have been identified²⁴.

eQTL information can certainly help to identify the target genes whose expression is perturbed by these regulatory variants⁴¹,⁴²,¹⁵⁵.
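The remark above that GBLUP regresses large effects downwards can be made concrete with a toy calculation. This is only a sketch under stated assumptions: it uses single-SNP ridge regression as a stand-in for GBLUP (the two are equivalent when all SNP effects share one prior variance), the genotypes and phenotypes are noise-free made-up values, and the penalty `lam` is an arbitrary illustrative choice. The function name is hypothetical.

```python
def single_snp_estimates(genotypes, phenotypes, lam):
    """Return (least-squares, ridge) estimates of one SNP's effect.

    Genotypes and phenotypes are centred first. The ridge estimate adds
    the penalty lam to the genotype sum of squares, which is what a common
    prior variance on all SNP effects amounts to for a single marker.
    """
    n = len(genotypes)
    g_mean = sum(genotypes) / n
    y_mean = sum(phenotypes) / n
    sxx = sum((g - g_mean) ** 2 for g in genotypes)
    sxy = sum((g - g_mean) * (y - y_mean) for g, y in zip(genotypes, phenotypes))
    return sxy / sxx, sxy / (sxx + lam)


# Assumed toy data: allele counts (0, 1 or 2) for eight animals.
genotypes = [0, 1, 2, 1, 0, 2, 1, 1]

for true_effect in (1, 10):
    # Noise-free phenotypes so that the shrinkage is the only thing on display.
    phenotypes = [true_effect * g for g in genotypes]
    ols, ridge = single_snp_estimates(genotypes, phenotypes, lam=4.0)
    print(f"true effect {true_effect}: least squares {ols}, ridge {ridge}")
```

In this toy setting both the small and the large effect are shrunk by the same factor (one half here), because the penalty does not adapt to the size of the effect. That is the behaviour the answer describes as over-conservative for variants with genuinely large effects.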