DNA sequence alignments used in Patterson et al. Nature 2006
“Genetic evidence for complex speciation of humans and chimpanzees”
Patterson N, Richter DJ, Gnerre S, Lander ES and Reich D; Nature 2006
Sequence obtained for this study: We sequenced 117,862 reads of DNA: 115,152 from a western lowland gorilla (Gorilla gorilla, individual NG05251 in the Coriell catalog: locus.umdnj.edu/primates/species_summ.html) and 2,710 from a black-handed spider monkey (Ateles geoffroyi, individual NG05352). All sequencing reads are publicly available at the NCBI trace archive (http://www.ncbi.nlm.nih.gov/Traces); to access them, carry out the following queries:
(1) Gorilla data (Gorilla gorilla):
CENTER_NAME='WIBR' and CENTER_PROJECT='G611'
CENTER_NAME='WIBR' and CENTER_PROJECT='G612'
CENTER_NAME='WIBR' and CENTER_PROJECT='G618'
CENTER_NAME='WIBR' and CENTER_PROJECT='G619'
CENTER_NAME='WIBR' and CENTER_PROJECT='G744'
(2) New World monkey data (Ateles geoffroyi):
CENTER_NAME='WIBR' and CENTER_PROJECT='G820'
We note that the NCBI trace archive contains slightly more reads than we report in our analyses, because not every read submitted to the trace archive passed standard pre-filtering steps.
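The queries above all follow one pattern, so they can be generated rather than typed by hand. The sketch below only builds the query strings and checks that the per-species read counts add up to the stated total; it does not contact the trace archive. The names `PROJECTS`, `READ_COUNTS`, and `trace_queries` are illustrative and not part of any NCBI tool.

```python
# Project codes exactly as listed above, keyed by species.
PROJECTS = {
    "Gorilla gorilla": ["G611", "G612", "G618", "G619", "G744"],
    "Ateles geoffroyi": ["G820"],
}

# Read counts reported in the text (reads that passed pre-filtering).
READ_COUNTS = {"Gorilla gorilla": 115_152, "Ateles geoffroyi": 2_710}


def trace_queries(center="WIBR", projects=PROJECTS):
    """Return {species: [query string, ...]} in the form shown above."""
    return {
        species: [
            f"CENTER_NAME='{center}' and CENTER_PROJECT='{code}'"
            for code in codes
        ]
        for species, codes in projects.items()
    }


if __name__ == "__main__":
    # The two per-species counts should sum to the 117,862 reads reported.
    assert sum(READ_COUNTS.values()) == 117_862
    for species, queries in trace_queries().items():
        print(species)
        for query in queries:
            print("  " + query)
```

Each printed string can then be pasted into the trace archive query box as is.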
Alignments: The alignments of humans, chimpanzees, gorillas, and more distantly related primates can be downloaded below or online at Nature. The first two data sets are packaged into “tar” files. When opened with the unix command “tar -xvf name”, each expands into many files, one per alignment. The third and fourth data sets, corresponding to alignments of contiguous sequence, are in Threaded Blockset Aligner (TBA) format and are packaged into “gz” files. These can be opened with the unix command “gunzip name”.
| HCGOM shotgun data | 33,016 alignments |
| HCGM shotgun data | 51,966 alignments |
| HCGOM contiguous chr. 7 | 1 contiguous alignment |
| HCGOM contiguous chr. X | 1 contiguous alignment |
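For readers without the unix tools, the same unpacking can be done with Python's standard library. This is a minimal sketch, not the authors' procedure: the helper names `extract_tar` and `gunzip` are illustrative, and the file names in the usage note are placeholders standing in for “name” above. Unlike the unix `gunzip` command, this version leaves the compressed file in place.

```python
import gzip
import shutil
import tarfile
from pathlib import Path


def extract_tar(archive_path, dest_dir):
    """Unpack a tar file (like `tar -xvf name`) and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path) as archive:
        names = archive.getnames()
        # Use the safer "data" extraction filter where this Python supports it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        archive.extractall(dest, **kwargs)
    return names


def gunzip(src, dest=None):
    """Decompress a .gz file (like `gunzip name`), keeping the original."""
    src = Path(src)
    dest = Path(dest) if dest is not None else src.with_suffix("")
    if dest == src:
        raise ValueError("source has no suffix to strip; pass dest explicitly")
    with gzip.open(src, "rb") as fin, open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)  # stream, so large files fit in memory
    return dest
```

Assuming a downloaded archive saved as `name.tar`, `extract_tar("name.tar", "alignments")` writes one file per alignment into `alignments/`. Likewise, `gunzip("name.gz")` writes the decompressed file `name` alongside it.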
Data sets: The filtered data can be accessed below or online at Nature. Data are packaged into “gz” files, which can be opened with the unix command “gunzip name”.
| HCGOM shotgun data | 498,771 divergent sites |
| HCGM shotgun data | 858,941 divergent sites |
| HCGOM contiguous chr. 7 | 69,521 divergent sites |
| HCGOM contiguous chr. X | 8,769 divergent sites |
Further questions: Please contact David Reich (reich at genetics.med.harvard.edu) for any further clarification about these data.