Washington University School of Medicine SNP Research Facility


Annotating SNPs in the Human Genome

According to dbSNP (build 124), there are over 10 million SNPs in the human genome, more than half of which have been validated. Although this amounts to an average of roughly one SNP every 300 bp (10 million SNPs across a genome of about 3 billion bp), studies have shown that SNPs tend to cluster near one another rather than being evenly spaced throughout the genome.

We obtained repeat-masked FASTA files for the human genome from the FTP server of UCSC's Golden Path (build hg18; Karolchik et al., 2003; International Human Genome Sequencing Consortium, 2001). Using information from public databases, we marked the position of each non-indel SNP with the IUPAC ambiguity code for its alleles. With both repetitive and polymorphic elements annotated, these files are well suited to primer design. This work was presented as a poster at the AAAS meeting in February 2006.
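The marking step described above can be sketched roughly as follows. This is a minimal illustration, not the facility's actual pipeline: the function name `mask_snps`, the 0-based position convention, and the choice to preserve letter case (so that lowercase soft-masked repeats stay lowercase) are all assumptions made for the example. The allele-to-code table itself is the standard IUPAC nucleotide nomenclature.

```python
# Standard IUPAC ambiguity codes, keyed by the set of alleles they represent.
IUPAC = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
    frozenset("CGT"): "B",
    frozenset("AGT"): "D",
    frozenset("ACT"): "H",
    frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def mask_snps(sequence, snps):
    """Return `sequence` with each SNP position replaced by an IUPAC code.

    `snps` maps a 0-based position to a string of observed alleles,
    e.g. {1: "CT"}. Both the name and the input layout are hypothetical.
    """
    bases = list(sequence)
    for pos, alleles in snps.items():
        code = IUPAC.get(frozenset(alleles.upper()))
        if code is None:
            # Fewer than two distinct alleles, or a non-ACGT allele: leave as-is.
            continue
        # Assumption: keep the original case so soft-masked repeats stay visible.
        bases[pos] = code.lower() if bases[pos].islower() else code
    return "".join(bases)


if __name__ == "__main__":
    # A C/T SNP at position 1 and a C/G SNP at position 5 (inside a repeat).
    print(mask_snps("ACGTacgt", {1: "CT", 5: "CG"}))
```

A primer-design tool can then reject any candidate primer whose sequence contains a character other than uppercase A, C, G, or T.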

SNP-masked Genomic Sequence
Region SNPs   File
Chromosome 1 671,783   chr1.ambigs.fasta.gz  
Chromosome 2 651,716   chr2.ambigs.fasta.gz  
Chromosome 3 522,952   chr3.ambigs.fasta.gz  
Chromosome 4 521,207   chr4.ambigs.fasta.gz  
Chromosome 5 485,152   chr5.ambigs.fasta.gz  
Chromosome 6 551,388   chr6.ambigs.fasta.gz  
Chromosome 7 452,712   chr7.ambigs.fasta.gz  
Chromosome 8 404,562   chr8.ambigs.fasta.gz  
Chromosome 9 394,448   chr9.ambigs.fasta.gz  
Chromosome 10 438,671   chr10.ambigs.fasta.gz  
Chromosome 11 432,671   chr11.ambigs.fasta.gz  
Chromosome 12 402,692   chr12.ambigs.fasta.gz  
Chromosome 13 315,940   chr13.ambigs.fasta.gz  
Chromosome 14 238,882   chr14.ambigs.fasta.gz  
Chromosome 15 221,678   chr15.ambigs.fasta.gz  
Chromosome 16 256,474   chr16.ambigs.fasta.gz  
Chromosome 17 216,458   chr17.ambigs.fasta.gz  
Chromosome 18 222,368   chr18.ambigs.fasta.gz  
Chromosome 19 172,776   chr19.ambigs.fasta.gz  
Chromosome 20 244,331   chr20.ambigs.fasta.gz  
Chromosome 21 119,681   chr21.ambigs.fasta.gz  
Chromosome 22 145,223   chr22.ambigs.fasta.gz  
Chromosome X 285,722   chrX.ambigs.fasta.gz  
Chromosome Y 34,748   chrY.ambigs.fasta.gz  
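
As a rough sanity check on a downloaded file, the ambiguity codes it contains can be tallied as sketched below. The function name `count_ambiguity_codes` is hypothetical, and the tally will not necessarily match the SNP counts in the table: `N` is deliberately excluded here on the assumption that it may also denote gaps or masked sequence, so any SNP encoded as `N` is not counted.

```python
import gzip

# Ambiguity codes for two or three alleles; N is excluded (see note above).
AMBIGUITY_CODES = set("RYSWKMBDHV")


def count_ambiguity_codes(path):
    """Count IUPAC ambiguity codes in a gzip-compressed FASTA file."""
    count = 0
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith(">"):
                # Skip FASTA header lines.
                continue
            count += sum(1 for base in line.strip().upper()
                         if base in AMBIGUITY_CODES)
    return count
```

For example, `count_ambiguity_codes("chr21.ambigs.fasta.gz")` would scan the chromosome 21 file listed above, reading it line by line so the whole sequence never has to be held in memory.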

Copyright 2007, Washington University School of Medicine SNP Research Facility. All rights reserved.