- In Zeberg and Pääbo (2020), Figure 1a shows a Manhattan plot. The y-axis shows significance, but the significance of what? (10 points)
- In Figure 1b, the bar represents the core Neanderthal haplotype. Variants within this core haplotype are in near-complete LD, but LD starts to drop for variants outside it. Why? (10 points)
- Look at Extended Data Figure 2 of Zeberg and Pääbo (2020). In the depicted genomic region (~45.72-46.58), where is recombination most active? Explain your answer. (10 points)
A genomic region associated with protection against
severe COVID-19 is inherited from
Neandertals
Hugo Zeberg (a,b,1) and Svante Pääbo (a,c,1)
(a) Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, D-04103 Leipzig, Germany; (b) Department of Neuroscience,
Karolinska Institutet, SE-17177 Stockholm, Sweden; and (c) Human Evolutionary Genomics Unit, Okinawa Institute of Science and Technology, Okinawa
904-0495, Japan
Contributed by Svante Pääbo, January 22, 2021 (sent for review December 21, 2020; reviewed by Tobias L. Lenz and Lluis Quintana-Murci)
It was recently shown that the major genetic risk factor associated
with becoming severely ill with COVID-19 when infected by severe
acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is inherited
from Neandertals. New, larger genetic association studies now allow
additional genetic risk factors to be discovered. Using data from the
Genetics of Mortality in Critical Care (GenOMICC) consortium, we
show that a haplotype at a region on chromosome 12 associated with
requiring intensive care when infected with the virus is inherited from
Neandertals. This region encodes proteins that activate enzymes that
are important during infections with RNA viruses. In contrast to the
previously described Neandertal haplotype that increases the risk for
severe COVID-19, this Neandertal haplotype is protective against
severe disease. It also differs from the risk haplotype in that it has a
more moderate effect and occurs at substantial frequencies in all
regions of the world outside Africa. Among ancient human genomes in
western Eurasia, the frequency of the protective Neandertal haplotype
may have increased between 20,000 and 10,000 y ago and again
during the past 1,000 y.
Neandertals | COVID-19 | OAS1 | SARS-CoV-2
Neandertals evolved in western Eurasia about half a million
years ago and subsequently lived largely separated from the
ancestors of modern humans in Africa (1), although limited gene
flow from Africa is likely to have occurred (2–5). Neandertals as
well as Denisovans, their Asian sister group, then became extinct
about 40,000 y ago (6). However, they continue to have a bio-
logical impact on human physiology today through genetic con-
tributions to modern human populations that occurred during
the last tens of thousands of years of their existence (e.g., refs.
7–10).
Some of these contributions may reflect adaptations to envi-
ronments outside Africa where Neandertals lived over several
hundred thousands of years (11). During this time, they are likely
to have adapted to infectious diseases, which are known to be
strong selective factors that may, at least partly, have differed
between sub-Saharan Africa and Eurasia (12). Indeed, several
genetic variants contributed by archaic hominins to modern hu-
mans have been shown to affect genes involved in immunity (e.g.,
refs. 7, 8, 13, 14). In particular, variants at several loci containing
genes involved in innate immunity come from Neandertals and
Denisovans (15), for example, toll-like receptor gene variants
which decrease the susceptibility to Helicobacter pylori infections
and the risk for allergies (16). Furthermore, proteins interacting
with RNA viruses have been shown to be encoded by DNA re-
gions introgressed from Neandertals more often than expected
(17), and RNA viruses might have driven many adaptive events in
humans (18).
Recently, it was shown that a haplotype in a region on chromosome 3
is associated with becoming critically ill upon infection with the
novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)
(19) and was contributed to modern humans by Neandertals (20).
Each copy of this haplotype approximately doubles the risk of its
carriers requiring intensive care when infected by SARS-CoV-2. It
reaches carrier frequencies of up to ∼65% in South Asia and ∼16%
in Europe, whereas it is almost absent in East Asia. Thus, although
this haplotype is detrimental for its carriers during the current
pandemic, it may have been beneficial in earlier times in South Asia
(21), perhaps by conferring protection against other pathogens,
whereas it may have been eliminated in East Asia by negative
selection.
A new study from the Genetics of Mortality in Critical Care
(GenOMICC) consortium, which includes 2,244 critically ill
COVID-19 patients and controls (22), recently became available.
In addition to the risk locus on chromosome 3, it identifies seven
loci with genome-wide significant effects located on chromo-
somes 6, 12, 19, and 21. Here, we show that, at one of these loci,
a haplotype associated with reduced risk of becoming severely ill
upon SARS-CoV-2 infection is derived from Neandertals.
Results and Discussion
A Neandertal Haplotype on Chromosome 12. We investigated whether
the index single-nucleotide polymorphisms (SNPs), that is, the
SNPs with the strongest association (Materials and Methods), at
the seven loci associated with risk of requiring intensive care upon
SARS-CoV-2 infection on chromosomes 6, 12, 19, and 21 (22)
harbor Neandertal-like alleles. To this end, we required that one
of the alleles of the index SNPs should match all three high-quality
Neandertal genomes, while being absent in the genomes of 108
African Yoruba individuals [r2 > 0.80; the 1000 Genomes Project
(23)]. None of the index SNPs for the loci on chromosomes 6, 19,
and 21 fulfilled these criteria, whereas the locus on chromosome
12 did.
To further investigate this locus, we used data from the
COVID-19 Host Genetics Initiative [HGI; round 4 (24)]. We find
that the SNPs in the chromosome 12 locus associated with
COVID-19 hospitalization (P < 1.0e-5; Fig. 1) are in linkage
disequilibrium (LD) (r2 ≥ 0.8) in Europeans and form a haplotype
Significance
We show that a haplotype on chromosome 12, which is asso-
ciated with a ∼22% reduction in relative risk of becoming se-
verely ill with COVID-19 when infected by SARS-CoV-2, is
inherited from Neandertals. This haplotype is present at sub-
stantial frequencies in all regions of the world outside Africa.
The genomic region where this haplotype occurs encodes
proteins that are important during infections with RNA viruses.
Author contributions: H.Z. and S.P. designed research; H.Z. performed research; H.Z.
analyzed data; and H.Z. and S.P. wrote the paper.
Reviewers: T.L.L., Max Planck Institute for Evolutionary Biology; and L.Q.-M.,
Institut Pasteur.
The authors declare no competing interest.
This open access article is distributed under Creative Commons Attribution License 4.0
(CC BY).
(1) To whom correspondence may be addressed. Email: hugo.zeberg@ki.se or paabo@eva.mpg.de.
This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2026309118/-/DCSupplemental.
Published February 15, 2021.
PNAS 2021 Vol. 118 No. 9 e2026309118 https://doi.org/10.1073/pnas.2026309118 | 1 of 5
of ∼75 kb (chr12: 113,350,796 to 113,425,679; hg19). LD to the
index SNP of the GenOMICC study is given in SI Appendix, Table
S1. Haplotypes of this length carrying alleles absent in Yoruba but
present in Neandertals are likely to have been introduced into
the gene pool of modern humans due to interbreeding with
Neandertals (25).
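The LD criterion used above (r2 ≥ 0.8) can be illustrated with a minimal Python sketch that computes r2 between two biallelic SNPs from phased haplotype counts. The function name `ld_r2` and the toy counts are hypothetical and are not taken from the study, which reports using LDlink for this step (Materials and Methods).

```python
def ld_r2(n_AB, n_Ab, n_aB, n_ab):
    """Return r^2 between two biallelic SNPs from phased haplotype counts.

    n_AB, n_Ab, n_aB, n_ab are the numbers of haplotypes carrying each
    combination of alleles (A/a at the first SNP, B/b at the second).
    """
    n = n_AB + n_Ab + n_aB + n_ab
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p_AB = n_AB / n               # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / n       # allele frequency of A
    p_B = (n_AB + n_aB) / n       # allele frequency of B
    d = p_AB - p_A * p_B          # coefficient of linkage disequilibrium D
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    return d * d / denom


# Toy counts (hypothetical): 100 haplotypes in each example.
perfect = ld_r2(30, 0, 0, 70)   # the two SNPs always co-occur
partial = ld_r2(28, 2, 2, 68)   # a few recombinant haplotypes
print(f"perfect LD: r2 = {perfect:.3f}; partial LD: r2 = {partial:.3f}")
```

In the second toy example, four recombinant haplotypes out of 100 are enough to bring r2 down to just above the 0.8 threshold.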
To test whether the 75-kb haplotype is the result of gene flow
from Neandertals, we analyzed its relationship to present-day
and archaic genomes. To do this, we used the haplotypes seen
more than 10 times among the individuals in the 1000 Genomes
Project (23) and the genome sequences of a ∼70,000-y-old Ne-
andertal from Chagyrskaya Cave in southern Siberia (26), a
∼50,000-y-old Neandertal from Vindija Cave in Croatia (27), a
∼120,000-y-old Neandertal from Denisova Cave in southern
Siberia (1), and a ∼80,000-y-old Denisovan individual from the
same site (28). Fig. 2 shows a phylogenetic tree estimating the
relationships among these haplotypes. Among the 64 modern
human haplotypes, eight form a monophyletic group with the
three Neandertal sequences.
Genomic segments with similarity to Neandertal genomes may
either derive from common ancestors of the two groups that lived
about half a million years ago or be contributed by Neandertals to
modern humans by mixing between the two groups when they met
less than 100,000 y ago (25). To test whether a segment of 75 kb
may have survived in this region of the genome since the common
ancestor of the groups without being broken down by recombi-
nation that affects chromosomes in each generation, we use a
published equation (29), a generation time of 29 y (30), a regional
recombination rate of 0.80 cM/Mb (31), and a split time between
Neandertals and modern humans of 550,000 y (1) followed by
interbreeding ∼50,000 y ago. Under these assumptions, in this
region, segments of length 16.3 kb or longer are not expected to
derive from the population ancestral to Neandertals and modern
humans (P = 0.05), making it highly unlikely that a 75-kb haplo-
type does so (P = 8.2e-9). We thus conclude that the haplotype
entered the human gene pool from Neandertals. In agreement
with this, a previous study (32) has described gene flow from
Neandertals in this genomic region.
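The calculation above can be sketched in Python. This is an illustration under stated assumptions, not the authors' code: it assumes that the equation in ref. 29 models the length of a segment shared through incomplete lineage sorting as Gamma distributed with shape 2 and expected length L = 1/(r × t), and that t is the combined length of the Neandertal and modern human branches (split time plus split time minus interbreeding time). Under these assumptions, the sketch approximately reproduces the 16.3-kb threshold at P = 0.05 and gives a probability of the order of 1e-8 for a 75-kb segment. The function name is hypothetical.

```python
import math


def ils_segment_probability(segment_bp, recomb_cm_per_mb, branch_years, generation_years):
    """P(shared ancestral segment is at least segment_bp long).

    Assumes segment length ~ Gamma(shape=2, rate=r*t), whose survival
    function is exp(-x) * (1 + x) with x = segment_bp * r * t.
    """
    r = recomb_cm_per_mb * 1e-8            # cM/Mb -> crossovers per bp per generation
    t = branch_years / generation_years    # branch length in generations
    expected_len = 1.0 / (r * t)           # expected segment length L in bp
    x = segment_bp / expected_len
    return math.exp(-x) * (1.0 + x)


SPLIT_YEARS = 550_000      # Neandertal/modern human split (from the text)
ADMIX_YEARS = 50_000       # time of interbreeding (from the text)
# Assumption: t covers both lineages since the split, up to the interbreeding.
branch_years = SPLIT_YEARS + (SPLIT_YEARS - ADMIX_YEARS)

p_16kb = ils_segment_probability(16_300, 0.80, branch_years, 29)
p_75kb = ils_segment_probability(75_000, 0.80, branch_years, 29)
print(f"P(length >= 16.3 kb) = {p_16kb:.3f}")
print(f"P(length >= 75 kb)   = {p_75kb:.1e}")
```

The key design point is that recombination acts on both lineages, so the time available for breaking up an ancestral segment is roughly twice the split time rather than the split time itself.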
COVID-19 Protection and Geographic Distribution. We find that the
index variant of the protective haplotype in the GenOMICC study
(rs10735079, P = 1.7e-8) matches all three Neandertal genomes
available. The relative risk of needing intensive care is reduced by
∼22% per copy of the Neandertal haplotype (under the rare disease
assumption, odds ratio [OR] = 0.78, 95% CI 0.71 to 0.85). As
expected given the phylogeny (Fig. 2), almost all of the alleles
cosegregating with the protective allele of the index SNP are found
in the Neandertal genomes (34 of 35 called SNPs; see SI Appendix,
Table S2, which, in contrast to Fig. 1, includes data contributed by
23andMe to HGI).
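The step from the odds ratio to the ∼22% figure is simple arithmetic, shown here as a short Python sketch. Under the rare disease assumption, the OR approximates the relative risk, so the per-copy risk reduction is about 1 − OR. The function name is hypothetical, and the two-copy value rests on an additional assumption that is not stated in the text, namely that the effect is multiplicative per copy.

```python
def risk_reduction_from_or(odds_ratio, copies=1):
    """Approximate relative risk reduction, treating OR as relative risk.

    Assumes a rare outcome (OR ~ RR) and, for copies > 1, a multiplicative
    per-copy effect (an illustrative assumption, not a result of the study).
    """
    if odds_ratio <= 0:
        raise ValueError("odds ratio must be positive")
    return 1.0 - odds_ratio ** copies


point = risk_reduction_from_or(0.78)                # OR reported in the text
ci_low = risk_reduction_from_or(0.85)               # from upper CI bound of the OR
ci_high = risk_reduction_from_or(0.71)              # from lower CI bound of the OR
two_copies = risk_reduction_from_or(0.78, copies=2)

print(f"per copy: {point:.0%} (95% CI {ci_low:.0%} to {ci_high:.0%})")
print(f"two copies, if multiplicative: {two_copies:.0%}")
```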
Today, the haplotype is almost completely absent in African
populations south of the Sahara but exists at frequencies of ∼25 to
30% in most populations in Eurasia (Fig. 3). In the Americas, it
occurs at lower frequencies in some populations of African ancestry,
presumably due to gene flow from populations of European or
Native American ancestry (33).
Putative Functional Variants. The Neandertal haplotype protective
against severe COVID-19 on chromosome 12 contains parts or all
of the three genes OAS1, OAS2, and OAS3, which encode oligoa-
denylate synthetases. These enzymes are induced by interferons and
activated by double-stranded RNA. They produce short-chain pol-
yadenylates, which, in turn, activate ribonuclease L, an enzyme that
degrades intracellular double-stranded RNA and activates other
antiviral mechanisms in cells infected by viruses (reviewed by
ref. 34).
To investigate which of these genes might be involved in pro-
tection against severe COVID-19, we plot the genomic location of
Fig. 1. Genetic variants associated with COVID-19 hospitalization at the
OAS locus. Variants marked in red have P values less than 1e-5. In Europeans,
they are in LD with the index variant (r2 ≥ 0.8), forming a haplotype (black
bar) with the genomic coordinates chr12: 113,350,796 to 113,425,679. P
values are from the HGI (24), excluding the 23andMe data for which only
sparse SNP data are available. The x axis gives hg19 coordinates; genes in the
region are indicated below. The three OAS genes are transcribed from left to
right. Yellow dots indicate rs10735079 (right, the GenOMICC index SNP) and
rs1156361 (left, typed by the Human Origins Array).
Fig. 2. Phylogeny relating DNA sequences associated with COVID-19 severity
on chromosome 12. Haplotypes from three Neandertal genomes, the Denisovan
genome, and haplotypes seen more than 10 times in individuals in the
1000 Genomes Project are included. The colored area indicates haplotypes that
carry the protective allele at rs1156361. The tree is rooted with the inferred
ancestral sequence from Ensembl (46). Six heterozygous positions in the archaic
genomes were excluded. Haplotypes XXIX and XXX are partially made
up of Neandertal-like DNA sequences due to recombination events.
the OAS genes below the P values for the SNPs associated with
severe COVID-19 (Fig. 1). While the association (P < 1.0e-5)
overlaps all three OAS genes, the SNPs with the most significant
associations (P < 5.0e-8) are in OAS3. However, the high level of
LD and stochasticity in the associations make any conclusion re-
garding causality based on P values tenuous.
Nevertheless, there are alleles on the Neandertal haplotype
which stand out as potentially functionally important. One SNP
(rs10774671) has been described as affecting a splice acceptor
site in OAS1 (35). The derived allele at this SNP, which is the
most frequent allele in present-day humans, alters splicing of
OAS1 transcript such that several protein isoforms are produced
instead of the ancestral isoform which is preserved in Neander-
tals (p46) (36). The latter, Neandertal-like isoform has higher
enzymatic activity than the derived isoforms common in modern
humans (37). Outside Africa, the ancestral allele is present only
in the context of the Neandertal haplotype, whereas, in Africa, it
exists independently of this haplotype, presumably as a genetic
variant inherited from the common ancestors of modern humans
and Neandertals that was lost in modern human populations that
left Africa (35).
In addition to the splice acceptor site, the Neandertal haplotype
contains a missense variant (rs2660) in OAS1, a missense variant
(rs1859330) and two synonymous variants (rs1859329 and
rs2285932) in OAS3, and a missense variant in OAS2 (rs1293767).
Three of these Neandertal-like variants are ancestral and occur in
Africa (rs2660, rs1859330, and rs1859329), whereas two are derived
in Neandertals (rs2285932 and rs1293767).
Several SNPs on the chromosome 12 haplotype have previously
been studied with respect to their effects on other viral infections.
The Neandertal-like splice acceptor variant has been associated
with protection against West Nile Virus (rs10774671, OR = 0.63,
95% CI 0.5–0.83) (38), and the Neandertal-like haplotype has
been associated with increased resistance to hepatitis C infections
(39). Notably, the Neandertal missense variant in OAS1 (rs2660)
(or variants in LD with this variant) has been shown to be asso-
ciated with moderate to strong protection against SARS-CoV
[OR = 0.42, 95% CI: 0.20 to 0.89 (40)], although this study was
limited in numbers of cases and controls. The SARS-CoV is
closely related to SARS-CoV-2, emerged in 2003, and caused a
mortality rate of ∼9% among infected individuals of all ages, and
much higher rates of fatalities in older individuals (41). Finally, the
Fig. 3. Geographic distribution of the allele indicative of the Neandertal haplotype protective against severe COVID-19. Pie charts indicate minor allele
frequency in red at rs1156361. Frequency data are from the 1000 Genomes Project (23). Map source data are from OpenStreetMap.
Fig. 4. Frequencies across time of two Neandertal haplotypes associated with COVID-19 severity. Frequencies for rs1156361 at the OAS locus on chromosome
12 (A) and rs10490770 at the chromosome 3 locus (B). Error bars indicate SE (Wilson scores). Time periods are indicated in years before present (bp). Ancient
data are from a compiled dataset (42), and present-day data are from the 1000 Genomes Project (23).
Neandertal versions of the OAS genes are expressed differently in
response to different viral infections in cells in tissue culture in
terms of both expression levels and splice forms (35).
Haplotype Frequencies across Time. During the past few years,
genome-wide data from thousands of prehistoric humans have
been generated and compiled (42). This makes it possible to begin
to directly gauge how frequencies of genetic variants have changed
over time. Although this approach is still limited by the relatively
small numbers of individuals and geographic regions for which
data are available, we apply it here for the two Neandertal-derived
haplotypes that affect the clinical outcomes upon infection with
SARS-CoV-2.
To tag the Neandertal OAS haplotype on chromosome 12, we
use an SNP (rs1156361) that carries a derived Neandertal-like
allele, is associated with the index variant of the GenOMICC
study (r2 = 0.99 in Eurasia), and is typed by the Affymetrix Human
Origins array used to study the majority of ancient human ge-
nomes used here (42). Although this analysis is limited in that it
tracks a single tag SNP, the fact that it is derived on the Nean-
dertal lineage and in LD with the Neandertal haplotype makes this
analysis feasible. We restrict the analysis to Eurasia and divide the
data into five time windows that vary between 20,000 and 2,000 y
in length, to balance the number of genomes available while still
allowing potential differences in frequency to be discerned.
Fig. 4A shows that the Neandertal OAS haplotype seems to have
occurred at frequencies below 10% prior to 20,000 y ago. Between
20,000 and 10,000 y ago, the allele frequency was on the order of
15%. Subsequently, it seems to have been present at frequencies at
or slightly below 20% until 3,000 y to 1,000 y ago. Intriguingly, the
current allele frequency in Eurasia is ∼30%, suggesting that the
Neandertal OAS haplotype may have increased in frequency relatively
recently.
To similarly estimate the frequency of the Neandertal risk
haplotype on chromosome 3 (20), we use the SNP rs10490770, which
fulfills the criteria applied above for the chromosome 12 haplotype
(Fig. 4B). Prior to 20,000 y ago, we find no carrier of the risk
haplotype among 16 genomes available. Among individuals who
lived between 20,000 and 10,000 y ago and later, the haplotype is
present in ∼10% until today, when it occurs at a frequency of
∼12.5%. Thus, as at the OAS locus, the frequency of the Neandertal
chromosome 3 haplotype seems to be lower in the period prior to
20,000 y ago than in the later periods. However, the data
are still scarce, making this observation preliminary. In contrast to
the OAS locus, there is no indication of any increase in the frequency
of the Neandertal haplotype on chromosome 3 in historical times.
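To make concrete why an observation such as "no carrier among 16 genomes" is preliminary, the following Python sketch computes a Wilson score interval for a proportion, the type of interval named in the legend of Fig. 4. It is an illustration only: the function name is hypothetical, the choice of z = 1.96 (a 95% interval) is an assumption, and treating each genome as a single independent observation is a simplification.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion.

    successes: number of observations carrying the allele
    n: total number of observations
    z: normal quantile (1.96 for ~95%; an assumed default)
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must lie between 0 and n")
    p_hat = successes / n
    denom = 1.0 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the boundaries.
    return max(0.0, centre - half), min(1.0, centre + half)


# Zero carriers among 16 observations, as for the oldest time window.
low, high = wilson_interval(0, 16)
print(f"0 of 16: interval from {low:.3f} to {high:.3f}")
```

Even with zero observed carriers, the upper bound of such an interval remains well above zero when only 16 observations are available, which is why the text treats the apparent absence before 20,000 y ago with caution.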
We caution that the prehistoric data available are heavily bi-
ased toward western Eurasia and are still sparse, particularly for
older periods. However, additional data from ancient human
remains are rapidly being generated, making us confident that it
will soon be possible to identify loci that may have been the
targets of positive and negative selection, by studying allele fre-
quencies over time in certain geographical regions while cor-
recting for migration events that caused genome-wide shifts in
allele frequencies.
Despite these caveats, it is interesting that the Neandertal-derived
OAS locus has recently increased in frequency in Eurasia.
This is compatible with previous work on the variation
among present-day populations (32, 35, 43) suggesting that this
locus has been positively selected. It is also compatible with
Denisovans having contributed a version of this locus, which
carries ancestral variants, for example, at the splice acceptor site
(rs10774671), to people in Oceania, where it occurs at substantial
frequencies today (44).
Conclusions. A Neandertal haplotype on chromosome 12 is protective
against severe disease in the current SARS-CoV-2 pandemic.
It is present in populations in Eurasia and the Americas at carrier
frequencies that often reach and exceed 50%. The ancestral
Neandertal OAS locus variants may thus have been advantageous
to modern humans throughout Eurasia, perhaps due to
one or many epidemics involving RNA viruses, especially given
that the Neandertal haplotype has been found to be protective
against at least three RNA viruses (West Nile virus, hepatitis C virus,
SARS-CoV). Supporting this notion, simulations have demonstrated
that the Neandertal OAS haplotype has been under
positive selection in modern humans (35). Strikingly, the OAS1
protein encoded by the modern human OAS haplotype is of
lower enzymatic activity than the one encoded by the Neandertal
haplotype (37). This may have been advantageous at some point
in Africa, because loss-of-function mutations of the OAS1 locus
have occurred numerous times among primates (45), suggesting
that the maintenance of OAS1 activity is costly to an organism.
One may speculate that, when modern humans encountered new
RNA viruses outside Africa, the higher enzymatic activity of the
ancestral variants that they acquired through genetic interactions
with Neandertals may have been advantageous.
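The relation between the allele frequencies of ∼25 to 30% reported for Eurasia and carrier frequencies near 50% can be illustrated with a short Python sketch. It assumes Hardy-Weinberg proportions, an assumption introduced here for illustration and not stated in the text; the function name is hypothetical.

```python
def carrier_frequency(allele_freq):
    """Fraction of individuals carrying at least one copy of an allele.

    Assumes Hardy-Weinberg proportions: non-carriers have frequency (1 - p)^2.
    """
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    return 1.0 - (1.0 - allele_freq) ** 2


carriers_25 = carrier_frequency(0.25)
carriers_30 = carrier_frequency(0.30)
print(f"p = 0.25 -> carriers = {carriers_25:.1%}")
print(f"p = 0.30 -> carriers = {carriers_30:.1%}")
```

Under this assumption, an allele frequency of 30% corresponds to slightly more than half of all individuals carrying at least one copy of the haplotype.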
Intriguingly, there is evidence that the Neandertal-like OAS
haplotype may have recently increased in frequency in Eurasia
(Fig. 4A), suggesting that selection may have positively affected the
Neandertal-derived OAS locus in the last millennium. Future
studies of human remains from historical times will clarify whether,
and when, this occurred.
Materials and Methods
The index variants for the seven novel loci (rs9380142, rs143334143, rs3131294,
rs10735079, rs74956615, rs2109069, and rs2236757) were obtained from
GenOMICC (22). The regional summary statistics from the round 4 release of
the metaanalysis carried out by the COVID-19 HGI (24) (https://covid19hg.org/results)
were used to analyze the chromosome 12 locus (hospitalized vs. population
controls, i.e., “B2” phenotype, using all ancestries but not including the
23andMe study, due to the limited number of variants released). LD was
calculated using LDlink 4.1, and alleles were compared to the archaic genomes
using tabix (HTSlib 1.10). The haplotype associated with protection against
severe COVID-19 was investigated using phylogenetic software (PhyML 3.0),
and the probability of observing a haplotype of a certain length or longer due
to incomplete lineage sorting was calculated as described (29). The present-
day haplotypes were constructed by including all variable positions in the re-
gion chr12: 113,350,796 to 113,425,679, excluding singletons. Haplotypes seen
more than 10 times were included in the phylogenetic analysis. The inferred
ancestral states at variable positions among present-day humans were taken
from Ensembl. Genotypes of ancient genomes of modern humans were
obtained from a compiled database (42). Maps displaying allele frequencies of
different populations were made using Mathematica 11.0 (Wolfram Research,
Inc.) and OpenStreetMap data.
Data Availability. Previously published data were used for this work (COVID-19
HGI and 1000 Genomes Project).
ACKNOWLEDGMENTS. We are indebted to the COVID-19 HGI for making
the summary statistics of the genetic associations available and to the Max
Planck Society and the NOMIS Foundation for funding.
1. K. Prüfer et al., The complete genome sequence of a Neanderthal from the Altai
Mountains. Nature 505, 43–49 (2014).
2. M. Kuhlwilm et al., Ancient gene flow from early modern humans into Eastern Ne-
anderthals. Nature 530, 429–433 (2016).
3. M. Meyer et al., Nuclear DNA sequences from the Middle Pleistocene Sima de los
Huesos hominins. Nature 531, 504–507 (2016).
4. C. Posth et al., Deeply divergent archaic mitochondrial genome provides lower time
boundary for African gene flow into Neanderthals. Nat. Commun. 8, 16046 (2017).
5. M. Petr et al., The evolutionary history of Neanderthal and Denisovan Y chromo-
somes. Science 369, 1653–1656 (2020).
6. T. Higham et al., The timing and spatiotemporal patterning of Neanderthal disap-
pearance. Nature 512, 306–309 (2014).
7. C. N. Simonti et al., The phenotypic legacy of admixture between modern humans
and Neandertals. Science 351, 737–741 (2016).
8. M. Dannemann, J. Kelso, The contribution of Neanderthals to phenotypic variation in
modern humans. Am. J. Hum. Genet. 101, 578–589 (2017).
9. H. Zeberg et al., A Neanderthal sodium channel increases pain sensitivity in present-
day humans. Curr. Biol. 30, 3465–3469.e4 (2020).
10. H. Zeberg, J. Kelso, S. Pääbo, The Neandertal progesterone receptor. Mol. Biol. Evol.
37, 2655–2660 (2020).
11. F. Racimo, S. Sankararaman, R. Nielsen, E. Huerta-Sánchez, Evidence for archaic
adaptive introgression in humans. Nat. Rev. Genet. 16, 359–371 (2015).
12. E. K. Karlsson, D. P. Kwiatkowski, P. C. Sabeti, Natural selection and infectious disease
in human populations. Nat. Rev. Genet. 15, 379–393 (2014).
13. L. Abi-Rached et al., The shaping of modern human immune systems by multiregional
admixture with archaic humans. Science 334, 89–94 (2011).
14. H. Quach et al., Genetic adaptation and Neandertal admixture shaped the immune
system of human populations. Cell 167, 643–656.e17 (2016).
15. M. Deschamps et al., Genomic signatures of selective pressures and introgression from
archaic hominins at human innate immunity genes. Am. J. Hum. Genet. 98, 5–21
(2016).
16. M. Dannemann, A. M. Andrés, J. Kelso, Introgression of Neandertal- and Denisovan-
like haplotypes contributes to adaptive variation in human toll-like receptors. Am.
J. Hum. Genet. 98, 22–33 (2016).
17. D. Enard, D. A. Petrov, Evidence that RNA viruses drove adaptive introgression be-
tween Neanderthals and modern humans. Cell 175, 360–371.e13 (2018).
18. D. Enard, D. A. Petrov, Ancient RNA virus epidemics through the lens of recent ad-
aptation in human genomes. Philos. Trans. R. Soc. Lond. B Biol. Sci. 375, 20190575
(2020).
19. D. Ellinghaus et al., Genomewide association study of severe Covid-19 with respira-
tory failure. N. Engl. J. Med. 383, 1522–1534 (2020).
20. H. Zeberg, S. Pääbo, The major genetic risk factor for severe COVID-19 is inherited
from Neanderthals. Nature 587, 610–612 (2020).
21. S. R. Browning, B. L. Browning, Y. Zhou, S. Tucci, J. M. Akey, Analysis of human se-
quence data reveals two pulses of archaic Denisovan admixture. Cell 173, 53–61.e9
(2018).
22. E. Pairo-Castineira et al.; GenOMICC Investigators; ISARICC Investigators; COVID-19
Human Genetics Initiative; 23andMe Investigators; BRACOVID Investigators; Gen-
COVID Investigators, Genetic mechanisms of critical illness in Covid-19. Nature,
10.1038/s41586-020-03065-y (2020).
23. A. Auton et al.; 1000 Genomes Project Consortium, A global reference for human
genetic variation. Nature 526, 68–74 (2015).
24. COVID-19 Host Genetics Initiative, The COVID-19 Host Genetics Initiative, a global
initiative to elucidate the role of host genetic factors in susceptibility and severity of
the SARS-CoV-2 virus pandemic. Eur. J. Hum. Genet. 28, 715–718 (2020).
25. S. Sankararaman, N. Patterson, H. Li, S. Pääbo, D. Reich, The date of interbreeding
between Neandertals and modern humans. PLoS Genet. 8, e1002947 (2012).
26. F. Mafessoni et al., A high-coverage Neandertal genome from Chagyrskaya Cave.
Proc. Natl. Acad. Sci. U.S.A. 117, 15132–15136 (2020).
27. K. Prüfer et al., A high-coverage Neandertal genome from Vindija Cave in Croatia.
Science 358, 655–658 (2017).
28. M. Meyer et al., A high-coverage genome sequence from an archaic Denisovan in-
dividual. Science 338, 222–226 (2012).
29. E. Huerta-Sánchez et al., Altitude adaptation in Tibetans caused by introgression of
Denisovan-like DNA. Nature 512, 194–197 (2014).
30. K. E. Langergraber et al., Generation times in wild chimpanzees and gorillas suggest
earlier divergence times in great ape and human evolution. Proc. Natl. Acad. Sci.
U.S.A. 109, 15716–15721 (2012).
31. A. Kong et al., A high-resolution recombination map of the human genome. Nat.
Genet. 31, 241–247 (2002).
32. F. L. Mendez, J. C. Watkins, M. F. Hammer, Neandertal origin of genetic variation at
the cluster of OAS immunity genes. Mol. Biol. Evol. 30, 798–801 (2013).
33. A. R. Martin et al., Human demographic history impacts genetic risk prediction across
diverse populations. Am. J. Hum. Genet. 100, 635–649 (2017).
34. U. Y. Choi, J.-S. Kang, Y. S. Hwang, Y.-J. Kim, Oligoadenylate synthase-like (OASL)
proteins: Dual functions and associations with diseases. Exp. Mol. Med. 47, e144
(2015).
35. A. J. Sams et al., Adaptively introgressed Neandertal haplotype at the OAS locus
functionally impacts innate immune responses in humans. Genome Biol. 17, 246
(2016).
36. H. Li et al.; for UK Primary Sjögren’s Syndrome Registry, Identification of a Sjögren’s
syndrome susceptibility locus at OAS1 that influences isoform switching, protein ex-
pression, and responsiveness to type I interferons. PLoS Genet. 13, e1006820 (2017).
37. V. Bonnevie-Nielsen et al., Variation in antiviral 2′,5′-oligoadenylate synthetase
(2‘5’AS) enzyme activity is controlled by a single-nucleotide polymorphism at a splice-
acceptor site in the OAS1 gene. Am. J. Hum. Genet. 76, 623–633 (2005).
38. J. K. Lim et al., Genetic variation in OAS1 is a risk factor for initial infection with West
Nile virus in man. PLoS Pathog. 5, e1000321 (2009).
39. M. K. El Awady et al., Single nucleotide polymorphism at exon 7 splice acceptor site of
OAS1 gene determines response of hepatitis C virus patients to interferon therapy.
J. Gastroenterol. Hepatol. 26, 843–850 (2011).
40. J. He et al., Association of SARS susceptibility with single nucleic acid polymorphisms
of OAS1 and MxA genes: A case-control study. BMC Infect. Dis. 6, 106 (2006).
41. M. D. Sørensen et al., Severe acute respiratory syndrome (SARS): Development of
diagnostics and antivirals. Ann. N. Y. Acad. Sci. 1067, 500–505 (2006).
42. David Reich Lab, Allen Ancient DNA Resource (AADR): Downloadable genotypes of
present-day and ancient DNA data, version 42.4, https://reich.hms.harvard.edu/allen-ancient-dna-resource-aadr-downloadable-genotypes-present-day-and-ancient-dna-data.
Accessed 19 April 2020.
43. S. Yair, K. M. Lee, G. Coop, The timing of human adaptation from Neanderthal in-
trogression. bioRxiv, [Preprint] (2020). 2020.10.04.325183. Accessed 30 November
2020.
44. F. L. Mendez, J. C. Watkins, M. F. Hammer, Global genetic variation at OAS1 provides
evidence of archaic admixture in Melanesian populations. Mol. Biol. Evol. 29,
1513–1520 (2012).
45. C. M. Carey et al., Recurrent loss-of-function mutations reveal costs to OAS1 antiviral
activity in primates. Cell Host Microbe 25, 336–343.e4 (2019).
46. A. D. Yates et al., Ensembl 2020. Nucleic Acids Res. 48, D682–D688 (2020).
Zeberg and Pääbo PNAS | 5 of 5
A genomic region associated with protection against severe COVID-19 is inherited from
Neandertals
https://doi.org/10.1073/pnas.2026309118
610 | Nature | Vol 587 | 26 November 2020
Article
The major genetic risk factor for severe
COVID-19 is inherited from Neanderthals
Hugo Zeberg1,2 ✉ & Svante Pääbo1,3 ✉
A recent genetic association study1 identified a gene cluster on chromosome 3 as a risk
locus for respiratory failure after infection with severe acute respiratory syndrome
coronavirus 2 (SARS-CoV-2). A separate study (COVID-19 Host Genetics Initiative)2
comprising 3,199 hospitalized patients with coronavirus disease 2019 (COVID-19) and
control individuals showed that this cluster is the major genetic risk factor for severe
symptoms after SARS-CoV-2 infection and hospitalization. Here we show that the risk
is conferred by a genomic segment of around 50 kilobases in size that is inherited from
Neanderthals and is carried by around 50% of people in south Asia and around 16% of
people in Europe.
The COVID-19 pandemic has caused considerable morbidity and mortality,
and has resulted in the death of over a million people to date3. The
clinical manifestations of the disease caused by the virus, SARS-CoV-2,
vary widely in severity, ranging from no or mild symptoms to rapid
progression to respiratory failure4. Early in the pandemic, it became
clear that advanced age is a major risk factor, as well as being male and
some co-morbidities5. These risk factors, however, do not fully explain
why some people have no or mild symptoms whereas others have severe
symptoms. Thus, genetic risk factors may have a role in disease
progression. A previous study1 identified two genomic regions that are
associated with severe COVID-19: one region on chromosome 3, which
contains six genes, and one region on chromosome 9 that determines
ABO blood groups. Recently, a dataset was released by the COVID-19
Host Genetics Initiative in which the region on chromosome 3 is the
only region that is significantly associated with severe COVID-19 at the
genome-wide level (Fig. 1a). The risk variant in this region confers an
odds ratio for requiring hospitalization of 1.6 (95% confidence interval,
1.42–1.79) (Extended Data Fig. 1).
The genetic variants that are most associated with severe COVID-19
on chromosome 3 (45,859,651–45,909,024 (hg19)) are all in high
linkage disequilibrium (LD); that is, they are all strongly associated
with each other in the population (r² > 0.98). They span 49.4 thousand
bases (kb) (Fig. 1b). This ‘core’ haplotype is furthermore in weaker
linkage disequilibrium with longer haplotypes of up to 333.8 kb (r² > 0.32)
(Extended Data Fig. 2). Some such long haplotypes have entered the
human population by gene flow from Neanderthals or Denisovans,
extinct hominins that contributed genetic variants to the ancestors of
present-day humans around 40,000–60,000 years ago6,7. We therefore
investigated whether the haplotype may have come from Neanderthals
or Denisovans.
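As a rough illustration of the r² statistic quoted above, the following sketch computes it from phased haplotypes coded as 0/1 alleles at two sites. The function name `ld_r2` and the toy data are hypothetical and are not taken from the study, which reports using LDlink for this calculation.

```python
from typing import Sequence, Tuple


def ld_r2(haplotypes: Sequence[Tuple[int, int]]) -> float:
    """Return r^2 between two biallelic sites.

    Each haplotype is a pair (a, b) of 0/1 alleles at site A and site B.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at site A
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at site B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a site is monomorphic")
    d = p_ab - p_a * p_b                      # LD coefficient D
    return d * d / denom


# Toy data (hypothetical): the two sites always carry the same allele.
perfect = [(1, 1)] * 3 + [(0, 0)] * 7
# Toy data (hypothetical): the alleles combine independently.
independent = [(1, 1), (1, 0), (0, 1), (0, 0)]

print(ld_r2(perfect))
print(ld_r2(independent))
```

Values near 1 mean that the allele at one site almost always predicts the allele at the other, as within the core haplotype; values near 0 mean the sites vary independently.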
The index variants of the two studies1,2 are in high linkage
disequilibrium (r² > 0.98) in non-African populations (Extended Data Fig. 3).
We found that the risk alleles of both of these variants are present in a
homozygous form in the genome of the Vindija 33.19 Neanderthal, an
approximately 50,000-year-old Neanderthal from Croatia in southern
Europe8. Of the 13 single-nucleotide polymorphisms constituting the
core haplotype, 11 occur in a homozygous form in the Vindija 33.19
Neanderthal (Fig. 1b). Three of these variants occur in the Altai9 and
Chagyrskaya 8 (ref. 10) Neanderthals, both of whom come from the Altai
Mountains in southern Siberia and are around 120,000 and about
60,000 years old, respectively (Extended Data Table 1), whereas none
of the variants occurs in the Denisovan genome11. In the 333.8-kb
haplotype, the alleles associated with risk of severe COVID-19 similarly
match alleles in the genome of the Vindija 33.19 Neanderthal (Fig. 1b).
Thus, the risk haplotype is similar to the corresponding genomic region
in the Neanderthal from Croatia and less similar to the Neanderthals
from Siberia.
We next investigated whether the core 49.4-kb haplotype might be
inherited by both Neanderthals and present-day people from the common
ancestors of the two groups that lived about 0.5 million years ago9.
The longer a present-day human haplotype shared with Neanderthals
is, the less likely it is to originate from the common ancestor, because
recombination in each generation will tend to break up haplotypes into
smaller segments. Assuming a generational time of 29 years (ref. 12), the
local recombination rate13 (0.53 cM per Mb), a split between Neanderthals
and modern humans of 550,000 years (ref. 9) and interbreeding between the
two groups around 50,000 years ago, and using a published equation14,
we exclude that the Neanderthal-like haplotype derives from the common
ancestor (P = 0.0009). For the 333.8-kb-long Neanderthal-like
haplotype, the probability of an origin from the common ancestral
population is even lower (P = 1.6 × 10⁻²⁶). The risk haplotype thus entered
the modern human population from Neanderthals. This is in agreement
with several previous studies, which have identified gene flow
from Neanderthals in this chromosomal region15–21 (Extended Data
Table 2). The close relationship of the risk haplotype to the Vindija 33.19
Neanderthal is compatible with this Neanderthal being closer to the
majority of the Neanderthals who contributed DNA to present-day
people than the other two Neanderthals10.
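The calculation described above can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes the published equation takes the form recalled here: with recombination rate r per base pair per generation and a total branch length of t generations, the expected length of a shared ancestral segment is L = 1/(r × t), and the probability of a segment of at least m base pairs is the upper tail of a gamma distribution with shape 2 and rate 1/L, which simplifies to exp(−m/L) × (1 + m/L). The branch length used below (both lineages counted from the split to the time of interbreeding) is also an assumption, so the output will not necessarily reproduce the published P values.

```python
import math


def ils_tail_probability(length_bp: float, rate_cm_per_mb: float,
                         branch_years: float,
                         generation_years: float = 29.0) -> float:
    """Probability that a shared ancestral segment is at least length_bp long."""
    # 1 cM per Mb corresponds to 1e-8 recombination events per bp per generation.
    r = rate_cm_per_mb * 1e-8
    t = branch_years / generation_years      # branch length in generations
    x = length_bp * r * t                    # m / L, with L = 1 / (r * t)
    # Upper tail of a gamma distribution with shape 2 and rate 1/L.
    return math.exp(-x) * (1.0 + x)


# ASSUMPTION: both lineages run from the split (550,000 years ago)
# to the interbreeding event (50,000 years ago).
branch_years = 2 * (550_000 - 50_000)

p_core = ils_tail_probability(49_400, 0.53, branch_years)
p_long = ils_tail_probability(333_800, 0.53, branch_years)
print(f"core haplotype: {p_core:.1e}")
print(f"long haplotype: {p_long:.1e}")
```

Whatever branch length is chosen, the qualitative point of the paragraph holds: the tail probability falls steeply with haplotype length, so the 333.8-kb haplotype is far less compatible with inheritance from the common ancestor than the 49.4-kb core.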
https://doi.org/10.1038/s41586-020-2818-3
Received: 3 July 2020
Accepted: 22 September 2020
Published online: 30 September 2020
1Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany. 2Department of Neuroscience, Karolinska Institutet, Stockholm, Sweden. 3Okinawa Institute of Science and Technology,
Onna-son, Japan. ✉e-mail: hugo.zeberg@ki.se; paabo@eva.mpg.de
A Neanderthal haplotype that is found in the genomes of the present
human population is expected to be more similar to a Neanderthal
genome than to other haplotypes in the current human population.
To investigate the relationships of the 49.4-kb haplotype to Neanderthal
and other human haplotypes, we analysed all 5,008 haplotypes
in the 1000 Genomes Project22 for this genomic region. We included
all positions that are called in the Neanderthal genomes and excluded
variants found on only one chromosome and haplotypes seen only once
in the 1000 Genomes Project data. This resulted in 253 present-day
haplotypes that contained 450 variable positions. Figure 2 shows a
phylogeny relating the haplotypes that were found more than 10 times
(see Extended Data Fig. 4 for all haplotypes). We find that all risk
haplotypes associated with severe COVID-19 form a clade with the three
high-coverage Neanderthal genomes. Within this clade, they are
most closely related to the Vindija 33.19 Neanderthal.
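The two exclusion steps described above can be sketched as follows. This is an illustration only: the function name, the 0/1 string encoding, the order of the two steps and the toy input are assumptions, and the authors' actual pipeline is not shown in the text.

```python
from collections import Counter
from typing import Dict, Sequence


def filter_haplotypes(haps: Sequence[str]) -> Dict[str, int]:
    """Filter haplotypes given as equal-length strings of '0'/'1' alleles.

    Returns the remaining distinct haplotypes with their counts.
    """
    if not haps:
        return {}
    n_sites = len(haps[0])
    # Step 1 (assumed order): drop variants carried by exactly one chromosome.
    keep = [j for j in range(n_sites)
            if sum(h[j] == "1" for h in haps) != 1]
    trimmed = ["".join(h[j] for j in keep) for h in haps]
    # Step 2: drop haplotypes that are seen only once.
    counts = Counter(trimmed)
    return {h: c for h, c in counts.items() if c > 1}


# Toy input (hypothetical): six chromosomes, four sites.
toy = ["0101", "0101", "0001", "1100", "1100", "0010"]
print(filter_haplotypes(toy))
```

In the toy input the third site is carried by a single chromosome and is dropped, after which two distinct haplotypes occur more than once and are retained.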
Among the individuals in the 1000 Genomes Project, the
Neanderthal-derived haplotypes are almost completely absent from
Africa, consistent with the idea that gene flow from Neanderthals into
African populations was limited and probably indirect20. The Neanderthal
core haplotype occurs in south Asia at an allele frequency of 30%,
in Europe at an allele frequency of 8%, among admixed Americans with
an allele frequency of 4% and at lower allele frequencies in east Asia23
(Fig. 3). In terms of carrier frequencies, we find that 50% of people in
south Asia carry at least one copy of the risk haplotype, whereas 16% of
people in Europe and 9% of admixed American individuals carry at least
one copy of the risk haplotype. The highest carrier frequency occurs in
Bangladesh, where more than half the population (63%) carries at least
one copy of the Neanderthal risk haplotype and 13% is homozygous for
the haplotype. The Neanderthal haplotype may thus be a substantial
contributor to COVID-19 risk in some populations in addition to other
risk factors, including advanced age. In apparent agreement with this,
individuals of Bangladeshi origin in the UK have about twice the risk
of dying from COVID-19 compared with the general population24
(hazard ratio of 2.0, 95% confidence interval, 1.7–2.4).
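The relationship between the allele frequencies and the carrier frequencies quoted above can be approximated with a short sketch. It assumes Hardy–Weinberg proportions, which the text does not state; the reported carrier frequencies are presumably counted directly from genotypes, so small differences are expected. The function names are hypothetical.

```python
def carrier_frequency(allele_freq: float) -> float:
    """Fraction of people with at least one copy, under Hardy-Weinberg."""
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    return 1.0 - (1.0 - allele_freq) ** 2


def homozygote_frequency(allele_freq: float) -> float:
    """Fraction of people with two copies, under Hardy-Weinberg."""
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    return allele_freq ** 2


# Allele frequencies as quoted in the text.
for region, p in [("south Asia", 0.30), ("Europe", 0.08),
                  ("admixed Americans", 0.04)]:
    print(f"{region}: carriers {carrier_frequency(p):.3f}, "
          f"homozygotes {homozygote_frequency(p):.3f}")
```

Under this assumption an allele frequency of 30% corresponds to about 51% carriers, 8% to about 15% and 4% to about 8%, close to the reported 50%, 16% and 9%.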
It is notable that the Neanderthal risk haplotype occurs at a frequency
of 30% in south Asia whereas it is almost absent in east Asia (Fig. 3). This
extent of difference in allele frequencies between south and east Asia is
unusual (P = 0.006, Extended Data Fig. 5) and indicates that it may have
been affected by selection in the past. Indeed, previous studies have
suggested that the Neanderthal haplotype has been positively selected
in Bangladesh25. At this point, we can only speculate about the reason
for this; one possibility is protection against other pathogens. It is also
possible that the haplotype has decreased in frequency in east Asia
owing to negative selection, perhaps because of coronaviruses or other
pathogens. In any case, the COVID-19 risk haplotype on chromosome 3 is
similar to some other Neanderthal and Denisovan genetic variants that
have reached high frequencies in some populations owing to positive
selection or drift14,26–28, but it is now under negative selection owing to
the COVID-19 pandemic.
[Figure 1. Panel a: y axis, −log10(P); x axis, chromosome (1–22). Panel b: y axis, linkage disequilibrium (r²); x axis, chromosome 3 coordinate (Mb); genes shown: LIMD1, SACM1L, SLC6A20, LZTFL1, XCR1, FYCO1, CXCR6, CCR9, CCR1, CCR3.]
Fig. 1 | Genetic variants associated with severe COVID-19. a, Manhattan plot
of a genome-wide association study of 3,199 hospitalized patients with
COVID-19 and 897,488 population controls. The dashed line indicates
genome-wide significance (P = 5 × 10⁻⁸). Data were modified from the COVID-19 Host
Genetics Initiative2 (https://www.covid19hg.org/). b, Linkage disequilibrium
between the index risk variant (rs35044562) and genetic variants in the 1000
Genomes Project. Red circles indicate genetic variants for which the alleles are
correlated to the risk variant (r² > 0.1) and the risk alleles match the Vindija 33.19
Neanderthal genome. The core Neanderthal haplotype (r² > 0.98) is indicated
by a black bar. Some individuals carry longer Neanderthal-like haplotypes. The
locations of the genes in the region are indicated below using standard gene
symbols. The x axis shows hg19 coordinates.
[Figure 2. Phylogenetic tree. Tips are present-day haplotypes labelled with Roman numerals, together with Ancestral, Altai, Chagyrskaya and Vindija. Scale bar, 0.01. Bootstrap values shown: 97 and 100.]
Fig. 2 | Phylogeny relating the DNA sequences that cover the core
Neanderthal haplotype in individuals from the 1000 Genomes Project and
Neanderthals. The coloured area highlights the haplotypes that carry the risk
allele at rs35044562, that is, the risk haplotypes for severe COVID-19. Arabic
numbers indicate bootstrap support (100 replicates). The phylogeny is rooted
with the inferred ancestral sequence of present-day humans. The three
Neanderthal genomes carry no heterozygous positions in this region. Scale
bar, number of substitutions per nucleotide position.
It is currently not known what feature in the Neanderthal-derived
region confers risk for severe COVID-19 and whether the effects of
any such feature are specific to SARS-CoV-2, to other coronaviruses or
to other pathogens. Once the functional feature is elucidated, it may
be possible to speculate about the susceptibility of Neanderthals to
relevant pathogens. However, with respect to the current pandemic,
it is clear that gene flow from Neanderthals has tragic consequences.
Online content
Any methods, additional references, Nature Research reporting
summaries, source data, extended data, supplementary information,
acknowledgements, peer review information; details of author
contributions and competing interests; and statements of data and code
availability are available at https://doi.org/10.1038/s41586-020-2818-3.
1. Ellinghaus, D. et al. Genomewide association study of severe COVID-19 with respiratory
failure. N. Engl. J. Med. https://doi.org/10.1056/NEJMoa2020283 (2020).
2. COVID-19 Host Genetics Initiative. The COVID-19 Host Genetics Initiative, a global
initiative to elucidate the role of host genetic factors in susceptibility and severity of the
SARS-CoV-2 virus pandemic. Eur. J. Hum. Genet. 28, 715–718 (2020).
3. WHO. Coronavirus disease (COVID-19) Weekly Epidemiological Update and Weekly
Operational Update: Weekly Epidemiological Update 14 September 2020
https://www.who.int/emergencies/diseases/novel-coronavirus-2019/situation-reports (2020).
4. Vetter, P. et al. Clinical features of COVID-19. Br. Med. J. 369, m1470 (2020).
5. Zhou, F. et al. Clinical course and risk factors for mortality of adult inpatients with
COVID-19 in Wuhan, China: a retrospective cohort study. Lancet 395, 1054–1062 (2020).
6. Green, R. E. et al. A draft sequence of the Neandertal genome. Science 328, 710–722
(2010).
7. Sankararaman, S., Patterson, N., Li, H., Pääbo, S. & Reich, D. The date of interbreeding
between Neandertals and modern humans. PLoS Genet. 8, e1002947 (2012).
8. Prüfer, K. et al. A high-coverage Neandertal genome from Vindija Cave in Croatia. Science
358, 655–658 (2017).
9. Prüfer, K. et al. The complete genome sequence of a Neanderthal from the Altai
Mountains. Nature 505, 43–49 (2014).
10. Mafessoni, F. et al. A high-coverage Neandertal genome from Chagyrskaya Cave. Proc.
Natl Acad. Sci. USA 117, 15132–15136 (2020).
11. Meyer, M. et al. A high-coverage genome sequence from an archaic Denisovan individual.
Science 338, 222–226 (2012).
12. Langergraber, K. E. et al. Generation times in wild chimpanzees and gorillas suggest
earlier divergence times in great ape and human evolution. Proc. Natl Acad. Sci. USA 109,
15716–15721 (2012).
13. Kong, A. et al. A high-resolution recombination map of the human genome. Nat. Genet.
31, 241–247 (2002).
14. Huerta-Sánchez, E. et al. Altitude adaptation in Tibetans caused by introgression of
Denisovan-like DNA. Nature 512, 194–197 (2014).
15. Sankararaman, S. et al. The genomic landscape of Neanderthal ancestry in present-day
humans. Nature 507, 354–357 (2014).
16. Vernot, B. & Akey, J. M. Resurrecting surviving Neandertal lineages from modern human
genomes. Science 343, 1017–1021 (2014).
17. Vernot, B. et al. Excavating Neandertal and Denisovan DNA from the genomes of
Melanesian individuals. Science 352, 235–239 (2016).
18. Steinrücken, M., Spence, J. P., Kamm, J. A., Wieczorek, E. & Song, Y. S. Model-based
detection and analysis of introgressed Neanderthal ancestry in modern humans. Mol.
Ecol. 27, 3873–3888 (2018).
19. Gittelman, R. M. et al. Archaic hominin admixture facilitated adaptation to out-of-Africa
environments. Curr. Biol. 26, 3375–3382 (2016).
20. Chen, L., Wolf, A. B., Fu, W., Li, L. & Akey, J. M. Identifying and interpreting apparent
Neanderthal ancestry in African individuals. Cell 180, 677–687 (2020).
21. Skov, L. et al. The nature of Neanderthal introgression revealed by 27,566 Icelandic
genomes. Nature 582, 78–83 (2020).
22. The 1000 Genomes Project Consortium. A global reference for human genetic variation.
Nature 526, 68–74 (2015).
23. OpenStreetMap. Planet OSM. https://planet.osm.org/ (2017).
24. Public Health England. COVID-19: Review of Disparities in Risks and Outcomes.
https://www.gov.uk/government/publications/covid-19-review-of-disparities-in-risks-and-outcomes (2020).
25. Browning, S. R., Browning, B. L., Zhou, Y., Tucci, S. & Akey, J. M. Analysis of human
sequence data reveals two pulses of archaic Denisovan admixture. Cell 173, 53–61 (2018).
26. Dannemann, M., Andrés, A. M. & Kelso, J. Introgression of Neandertal- and Denisovan-like
haplotypes contributes to adaptive variation in human Toll-like receptors. Am. J. Hum.
Genet. 98, 22–33 (2016).
27. Zeberg, H., Kelso, J. & Pääbo, S. The Neandertal progesterone receptor. Mol. Biol. Evol. 37,
2655–2660 (2020).
28. Zeberg, H. et al. A Neanderthal sodium channel increases pain sensitivity in present-day
humans. Curr. Biol. 30, 3465–3469 (2020).
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in
published maps and institutional affiliations.
© The Author(s), under exclusive licence to Springer Nature Limited 2020
Fig. 3 | Geographical distribution of the Neanderthal core haplotype that confers risk for severe COVID-19. Pie charts show the minor allele frequency at
rs35044562. Frequency data were obtained from the 1000 Genomes Project22. Map source data were obtained from OpenStreetMap23.
Methods
Linkage disequilibrium was calculated using LDlink 4.1 (ref. 29) and alleles were
compared to the archaic genomes8–11 using tabix30 (HTSlib 1.10).
Haplotypes were constructed from the phase 3 release of the 1000 Genomes
Project22 as described. Phylogenies were estimated with PhyML 3.3 (ref. 31)
using the Hasegawa–Kishino–Yano-85 (ref. 32) substitution model with a
gamma shape parameter and the proportion of invariant sites estimated
from the data. The probability of observing a haplotype of a particular
length or longer owing to incomplete lineage sorting was calculated as
previously described14. The inferred ancestral states at variable positions
among present-day humans were taken from Ensembl33. The distribution
of frequency differences of Neanderthal haplotypes between east
and south Asia was computed by filtering diagnostic Neanderthal variants
(fixed positions in the three high-coverage Neanderthal genomes
and the Neanderthal allele missing in 108 Yoruba individuals) using a
published introgression map20, followed by pruning using PLINK 1.90 (ref. 34)
(r² cut-off of 0.5 in a sliding window of 100 variants) and allele frequency
assessment in the 1000 Genomes Project. Maps displaying allele
frequencies and linkage disequilibrium in different populations were made
using Mathematica 11.0 (Wolfram Research) and OpenStreetMap data.
For the meta-analysis carried out by the COVID-19 Host Genetics
Initiative2, participants consented and ethical approvals were obtained
(https://www.covid19hg.org/partners/). The following eight studies
contributed to the meta-analysis of hospitalization versus population
controls: Genetic modifiers for COVID-19-related disease
‘BelCovid’ (Université Libre de Bruxelles, Belgium), Genetic determinants
of COVID-19 complications in the Brazilian population ‘BRACOVID’
(University of Sao Paulo, Brazil), deCODE (deCODE Genetics,
Iceland), FinnGen (Institute for Molecular Medicine Finland, Finland),
GEN-COVID (University of Siena, Italy), Genes & Health (Queen Mary
University of London, UK), COVID-19-Host(age) (Kiel University and
University Hospitals of Oslo and Schleswig-Holstein, Germany and
Norway) and the UK Biobank (UK).
Reporting summary
Further information on research design is available in the Nature
Research Reporting Summary linked to this paper.
Data availability
The summary statistics of the genome-wide association study that
support the finding of this study are available from the COVID-19 Host
Genetics Initiative (round 3, ANA_B2_V2: hospitalized patients with
COVID-19 compared with population controls; https://www.covid19hg.org/).
The genomes used are available from the 1000 Genomes Project
(phase 3 release, https://www.internationalgenome.org/) and the Max
Planck Institute for Evolutionary Anthropology (Chagyrskaya, Altai
and Vindija 33.19, http://cdna.eva.mpg.de/neandertal/). The ancestral
alleles are available at Ensembl (release 100, https://www.ensembl.org/).
Map data are from OpenStreetMap and available from
https://www.openstreetmap.org.
29. Machiela, M. J. & Chanock, S. J. LDlink: a web-based application for exploring
population-specific haplotype structure and linking correlated alleles of possible
functional variants. Bioinformatics 31, 3555–3557 (2015).
30. Li, H. Tabix: fast retrieval of sequence features from generic TAB-delimited files.
Bioinformatics 27, 718–719 (2011).
31. Guindon, S. et al. New algorithms and methods to estimate maximum-likelihood
phylogenies: assessing the performance of PhyML 3.0. Syst. Biol. 59, 307–321 (2010).
32. Hasegawa, M., Kishino, H. & Yano, T. Dating of the human–ape splitting by a molecular
clock of mitochondrial DNA. J. Mol. Evol. 22, 160–174 (1985).
33. Yates, A. D. et al. Ensembl 2020. Nucleic Acids Res. 48, D682–D688 (2020).
34. Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer
datasets. Gigascience 4, 7 (2015).
Acknowledgements We thank the COVID-19 Host Genetics Initiative for making the data from
the genome-wide association study available, and the Max Planck Society and the NOMIS
Foundation for funding.
Author contributions H.Z. performed the haplotype analysis. H.Z. and S.P. jointly wrote the
manuscript.
Competing interests The authors declare no competing interests.
Additional information
Supplementary information is available for this paper at
https://doi.org/10.1038/s41586-020-2818-3.
Correspondence and requests for materials should be addressed to H.Z. or S.P.
Peer review information Nature thanks Tobias Lenz, Yang Luo and the other, anonymous,
reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are
available.
Reprints and permissions information is available at http://www.nature.com/reprints.
Extended Data Fig. 1 | Odds ratios for hospitalization owing to COVID-19 for
cohorts contributing to the meta-analysis (round 3) of the COVID-19 Host
Genetics Initiative (rs35044562). The odds ratio and the P value for the
summary effect are odds ratio = 1.60 (95% confidence interval, 1.42–1.79) and
P = 3.1 × 10⁻¹⁵ (two-sided z-test, n = 3,199 patients with COVID-19 and 897,488
controls over 8 independent studies). Data are the odds ratios and 95%
confidence intervals. HOST(age), UK Biobank European (EUR), GENCOVID,
deCODE and BelCovid use European population controls. BRACOVID, Genes &
Health and FinnGen use American, south Asian and Finnish population
controls, respectively.
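As a consistency check on the summary effect reported in this caption, a z statistic and two-sided P value can be approximated from the odds ratio and its 95% confidence interval. This sketch assumes the interval is symmetric on the log scale (a Wald-type interval), which the caption does not state; the function name is hypothetical. Because the published interval is rounded, the result should be of the same order of magnitude as the reported P value rather than identical to it.

```python
import math


def z_and_p_from_ci(odds_ratio: float, ci_low: float, ci_high: float,
                    z_crit: float = 1.959964):
    """Approximate z and two-sided P from an odds ratio and its 95% CI."""
    # Standard error of log(OR), recovered from the width of the interval.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


z, p = z_and_p_from_ci(1.60, 1.42, 1.79)
print(f"z = {z:.2f}, P = {p:.1e}")
```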
Extended Data Fig. 2 | Pairwise linkage disequilibrium between diagnostic
Neanderthal variants. Heat map of linkage disequilibrium between genetic
variants in which one allele is shared with three Neanderthal genomes and
missing in 108 Yoruba individuals. The black box highlights a haplotype of
333.8 kb between rs17763537 and rs13068572 (chromosome 3: 45,843,315–
46,177,096). Red, r² correlation; blue, D′ correlation.
Extended Data Fig. 3 | Linkage disequilibrium between index variant
rs11385942 and the index variant of the COVID-19 Host Genetics Initiative
(rs35044562). Shades of red indicate the extent of linkage disequilibrium (r²)
in the populations included in the 1000 Genomes Project. Populations labelled
‘n/a’ are monomorphic for the protective allele of rs35044562. The previously
described index variant (rs11385942; ref. 1) does not have any genetic variants in
linkage disequilibrium (r² > 0.8) in populations from Africa. Map source data
from OpenStreetMap23.
Extended Data Fig. 4 | Phylogeny of haplotypes in individuals included in
the 1000 Genomes Project and Neanderthals covering the genomic region
of the core risk haplotype. The shaded area highlights a monophyletic group
that contains all present-day haplotypes carrying the risk allele at rs35044562
and the haplotypes of the three high-coverage Neanderthals. Arabic numbers
show bootstrap support (100 replicates). The tree is rooted with the inferred
ancestral human sequence. Scale bar, number of substitutions per nucleotide
position.
Extended Data Fig. 5 | Frequency differences between south and east Asia for haplotypes introgressed from Neanderthals. The dashed line indicates the
frequency difference for the Neanderthal haplotype that confers risk of severe COVID-19.
Extended Data Table 1 | Genetic variants in LD (r² > 0.98) with rs35044562 and the corresponding Neanderthal variants
Data from the 1000 Genomes Project22. ‘Ref’ indicates the alleles from hg19. The three Neanderthal genomes are homozygous at these positions. LD, linkage disequilibrium.
Extended Data Table 2 | Previous studies that identified gene flow from Neanderthals at the core haplotype
The hg19 coordinates for the previously identified15–21 introgressed haplotypes are shown.