Single nucleotide polymorphisms (SNPs)

How many SNPs are there in humans today?
• The human mutation rate is ~2.5 x 10^-8 mutations / site / generation
• ~150 mutations / diploid genome / generation (2.5 x 10^-8 x 3 x 10^9 sites x 2 genome copies)
• 6.3 billion people in the world x 150 = 945,000,000,000 new mutations in the world today
• With 3 billion nucleotides per genome, each nucleotide site is mutated on average 315 times among people alive today

Example of a SNP marker

Use of SNP markers
• species (or genetic group) identification and analysis of hybridization
• phylogeography
• population genetics (genetic variation, individual identification, parentage, relatedness, population structure, population size, changes in population size)

Advantages
• abundant and widespread in many genomes, in both coding and non-coding regions (millions of loci)
• spaced every 300-1000 bp
• biparentally inherited (vs. mtDNA)
• evolution is well described by simple mutation models (vs. microsatellites)
• only short fragments are needed, so SNPs can be used with non-invasive methods

Disadvantages
• ascertainment bias: loci are selected from an unrepresentative sample of individuals
• low variability per locus (usually bi-allelic)
• more loci are needed in population genetic applications (4-10 times more loci)

Methods
• Locus discovery (ascertainment)
• Genotyping

SNP discovery
Sequencing

DNA sequencing
• Maxam-Gilbert (chemical) method: base-specific chemical modification and cleavage of DNA fragments
• Sanger (enzymatic) method: termination of replication with ddNTPs

Sanger dideoxy method
• Read length 500-1000 bp
• 4 capillaries: one 96-sample plate overnight
• Sequencers with 96 capillaries also exist

SNP genotyping = determining the genotype of a given individual

SNP genotyping by sequencing?
• Expensive, and ambiguous in heterozygotes
Heterozygotes?

SNP genotyping by cloning followed by sequencing?
- separation of the two alleles (or more, in duplicated genes)

SNP genotyping: old standards
Methods of mutation detection (comparing a specimen's pattern with the patterns of known alleles)
• Thermal gradient gel electrophoresis (TGGE)
• Denaturing gradient gel electrophoresis (DGGE)
• Single-strand conformation polymorphism (SSCP)
• = special electrophoresis methods based on differences in mobility between different DNA sequences
• used to detect genetically determined diseases, e.g. cystic fibrosis

Denaturing gradient gel electrophoresis (DGGE)
(TGGE is similar, but uses a temperature gradient)

Single-strand conformation polymorphism (SSCP)

The use of automated sequencers
(denaturing polymer POP7: ssDNA, e.g. microsatellites)
Why not non-denaturing electrophoresis?

Advantages of CE-SSCP
• high throughput (with a 4-, 16-, or 96-capillary sequencer), saving time and money
• no need for gel preparation or autoradiography
• the two DNA strands are distinguished by labelling them in two colours (usually FAM and HEX)
• potential for multiplexing (not yet used)

Disadvantages
• electrophoresis must be optimised (running temperature, sieving matrix, sample dilution)
• "complex" patterns in some sequences
• alleles with the same pattern may occur, though rarely
• several run temperatures must be tested

Data analysis
• GeneMapper (Applied Biosystems)
• a different "Size Standard" for each temperature
• alignment of multiple samples

Applications
• Genotyping of codominant markers (e.g. single-copy MHC genes)
• Identification of the number of genes (e.g.
duplicated MHC genes)

SNP genotyping: new methods
ASPE: allele-specific primer extension
SBE: single base extension

Detection of SBE products
Microarray detection of SBE products

Microarray analysis of SNPs
Detection: Affymetrix, Illumina, and others

New approaches to DNA sequencing
"Next generation" sequencing (Hudson 2008)

454 pyrosequencing
• emulsion-based amplification in picolitre volumes
• simultaneous sequencing on a fibre-optic plate
• detection of pyrophosphate released when bases are incorporated
• first generation GS20: 200,000 reactions at once (roughly 20 million bp); today FLX: 400,000 reactions at once
• problems with homopolymers
• individual read length 100-400 bp

Solexa/Illumina 1G SBS technology (SBS = sequencing by synthesis)
• 1 Gb (six times the Drosophila genome)
• considerably cheaper
• read length 35 bp
• fluorescence, reversible terminators
• better suited to resequencing

SOLiD (Sequencing by Oligonucleotide Ligation and Detection)
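
Worked check of the 454 and Solexa/Illumina throughput figures quoted above. This is only a back-of-the-envelope sketch: the function names are hypothetical helpers, not part of any vendor software, and the ~100 bp GS20 read length is an assumption inferred from the slide's "200,000 reactions, roughly 20 million bp". The 3 billion bp human genome size is the figure used on the opening slide.

```python
# Rough throughput arithmetic for the platforms named on the slides.
# All helper names are illustrative; nothing here is a vendor API.

def total_bases(n_reads: int, read_length_bp: int) -> int:
    """Bases produced in one run: number of reads times read length."""
    return n_reads * read_length_bp

def reads_needed(target_bases: int, read_length_bp: int) -> int:
    """Reads required to reach a target yield, rounded up."""
    return -(-target_bases // read_length_bp)  # ceiling division on ints

def fold_coverage(yield_bases: int, genome_size_bp: int) -> float:
    """Average number of times each genome position is sequenced."""
    return yield_bases / genome_size_bp

# 454 GS20: 200,000 reads; ASSUMED ~100 bp per read (implied by ~20 million bp)
gs20_yield = total_bases(200_000, 100)

# Solexa/Illumina 1G: 1 Gb per run at 35 bp per read
illumina_reads = reads_needed(1_000_000_000, 35)

# One GS20 run against a 3 billion bp human genome
gs20_human_coverage = fold_coverage(gs20_yield, 3_000_000_000)

print(f"GS20 yield: {gs20_yield:,} bp")
print(f"Illumina 1G reads for 1 Gb: {illumina_reads:,}")
print(f"GS20 coverage of a human genome: {gs20_human_coverage:.4f}x")
```

The comparison shows why short-read platforms are described as better suited to resequencing: 1 Gb at 35 bp means tens of millions of very short reads, which are easier to place against an existing reference than to assemble from scratch.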