Karel Klepárník (klep@iach.cz)
Department of Bioanalytical Instrumentation, Institute of Analytical Chemistry, Czech Academy of Sciences, Brno (www.iach.cz)
Modern analytical instrumentation for genetic research, medical diagnostics, and molecular identification of organisms

1990: Ústav analytické chemie AV ČR (Institute of Analytical Chemistry of the Czech Academy of Sciences)

Polymerase chain reaction (PCR) amplification

PCR amplification scheme
Labels: DNA template, DNA primers
• DNA dissociation, 90 °C
• Primer annealing, 62 °C
• DNA synthesis, 72 °C
Correct copies (strands delimited by a primer at both ends): \(N = 2^{n+1} - 2(n+1)\)
• 1st cycle, n = 1: \(2^{2} - 2 \cdot 2 = 0\)
• 2nd cycle, n = 2: \(2^{3} - 2 \cdot 3 = 2\)
• 3rd cycle, n = 3: \(2^{4} - 2 \cdot 4 = 8\)

Kary B. Mullis
The Nobel Prize in Chemistry 1993
Born 1944
La Jolla, CA, USA
University of British Columbia
"For his invention of the polymerase chain reaction (PCR) method"

DNA sequencing

Synthesis of Sanger sequencing fragments
Frederick Sanger, MRC Laboratory of Molecular Biology,
Cambridge, UK, 1918–2013
Nobel Prize in Chemistry 1980

DNA sequencing strategy [figure]

Separation methods
Capillary electrophoresis (CE)

Why capillary electrophoresis?
[Figure labels: T, L, R; solid–solid and air–solid interfaces; T0, TR, ΔT]
Miniature capillary: low R => fast separation
1) high resistivity => low current at high voltage => low heat production
2) efficient heat transport => low temperature difference inside the capillary

LIF (laser-induced fluorescence) detection

Four-channel LIF detection arrangement (spectral filtering)
• Ar-ion laser, 40 mW
• 50% 488 nm, 50% 514 nm
• lens
• separation capillary, ID 50 µm
• objective 40×; 0.65
• blockers, 520 nm
• beam splitter
• band pass filters: 540 nm, 570 nm, 590 nm, 610 nm
• PMT

Scheme of confocal detector (spatial filtering)
Labels: laser, optics, beam splitter, microscope objective, focus, pinhole, sensor

Sheath-flow cuvette
Labels: polymer-filled capillaries, sheath-flow cuvette, open tubings, laser beam, excited sample, electrode chambers
Prof. Norman Dovichi, University of Notre Dame, Indiana, USA

Prof. Hideki Kambara, Hitachi Central Research Laboratory, Tokyo, Japan

DNA sequencing record

DNA sequencing up to 1300 bases in 2 hours
Separation matrix: LPA (linear polyacrylamide), 2.0% (w/w) 17 MDa and 0.5% (w/w) 270 kDa
E: 125 V/cm, T: 70 °C
Barry L. Karger, The Barnett Institute, Northeastern University, Boston, MA

ABI PRISM® 3700 DNA Analyzer: 96 active and eight reserve capillaries

ABI PRISM® 3700 DNA Analyzer: sheath-flow cuvette

J. Craig Venter
The Institute for Genomic Research (TIGR)
The first president of Celera Genomics
The completed sequence of the human genome was published in February 2001 in Science:
Venter, J. C. et al. Science 2001, 291, 1304–1351.

J. CRAIG VENTER, Ph.D., PRESIDENT, CELERA GENOMICS
REMARKS AT THE HUMAN GENOME ANNOUNCEMENT
THE WHITE HOUSE
MONDAY, JUNE 26, 2000
Mr. President, Honorable members of the Cabinet, Honorable members of Congress, distinguished guests.
Today, June 26, 2000 marks an historic point in the 100,000-year record of humanity. We are announcing today that for the first time our species can read the chemical letters of its genetic code. At 12:30 p.m. today, in a joint press conference with the public genome effort, Celera Genomics will describe the first assembly of the human genetic code from the whole genome shotgun sequencing method. Starting only nine months ago on September 8, 1999, eighteen miles from the White House, a small team of scientists headed by myself, Hamilton O. Smith, Mark Adams, Gene Myers and Granger Sutton began sequencing the DNA of the human genome using a novel method pioneered by essentially the same team five years earlier at The Institute for Genomic Research in Rockville, Maryland. The method used by Celera has determined the genetic code of five individuals...
…There would be no announcement today, if it were not for the more than $1 billion that PE Biosystems invested in Celera and in the development of the automated DNA sequencer that both Celera and the public effort used to sequence the genome…

DNA mutation analysis

Next generation sequencing
Single molecule detection

Stretching of dsDNA in nanochannels
• evaluation of size
• chromatography or electrophoresis
• detection of nucleotides consecutively cleaved by exonuclease

Single molecule reaction monitoring

Parallel single molecule sequencing by synthesis
Helicos, The HeliScope™ Sequencer
• \(2 \times 10^{9}\) b/day
• \(10^{9}\) reads/run
• 25–55 bp read lengths
Genome Sequencer FLX System
• \(3 \times 10^{8}\) b/day
• 100 Mb per 7.5-hour run
• 400,000 reads per 7.5 hours
• 200–300 bp read lengths
Illumina (Solexa), Illumina Genome Analyzer
• \(6 \times 10^{8}\) b/day
• \(3 \times 10^{9}\) b per 5-day run
• \(50 \times 10^{6}\) oligo clusters
• 36–50 bp read lengths

The HeliScope™ Sequencer
http://helicosbio.com/

Photocleavable dideoxy nucleotides

Single molecule real time sequencing (SMRT™), Pacific Biosciences
Next generation DNA sequencing
• DNA sequencing: DNA polymerase
• RNA sequencing: reverse transcriptase
• Codon-resolved translation elongation by single ribosomes
• Tens of nucleotide peaks in 1 s
• Read length 1–15 kb
• 80,000 detection points
• 15 min/genome: 50 nt/s × 80,000 points × 15 min × 60 s/min = 3.6 Gb
• φ29 DNA polymerase: processivity 20 kb, 400 b/s
• Some enzymes are not processive
• $100/genome

Pacific Biosciences: Single Molecule Real Time (SMRT™) DNA sequencing
www.pacificbiosciences.com

PacBio RS instrument

Single molecule real time sequencing

Pacific Biosciences: read length

Ion Torrent: The Ion Personal Genome Machine (PGM™) sequencer
http://www.iontorrent.com/
A hydrogen ion is released as a byproduct when a nucleotide is incorporated into a strand of DNA by a polymerase.
• The world's smallest solid-state pH meter
• Digital output
[Figure: http://www.iontorrent.com/lib/images/step1.jpg]

High-density array of micro-machined wells. Each well holds a different DNA template. Beneath the wells is an ion-sensitive layer and a proprietary ion sensor.
[Figure: http://www.iontorrent.com/lib/images/step2.jpg]

If a nucleotide is added to a DNA template and is then incorporated into a strand of DNA, a hydrogen ion is released. The charge from that ion changes the pH of the solution. The sequencer, in effect the world's smallest solid-state pH meter, then calls the base.
[Figure: http://www.iontorrent.com/lib/images/step3.jpg]

The sequencer sequentially floods the chip with one nucleotide after another. If the next nucleotide that floods the chip is not a match, no voltage change will be recorded.
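The flow-by-flow readout described on these Ion Torrent slides can be sketched as a toy simulation. This is only an illustration under stated assumptions, not Ion Torrent's base-calling software: the function names, the noise-free integer signal, and the T, A, C, G flow order are all invented for the example. A run of identical bases is modeled as a proportionally larger signal within a single flow, and a non-matching flow gives zero signal.

```python
from itertools import cycle

FLOW_ORDER = "TACG"  # assumed flow order, chosen only for illustration


def simulate_flows(synthesized, flow_order=FLOW_ORDER):
    """Return an idealized signal per flow: the number of bases incorporated.

    `synthesized` is the strand being built by the polymerase (the complement
    of the template), written in the order in which bases are added.
    """
    if not set(synthesized) <= set(flow_order):
        raise ValueError("sequence contains a base that is never flowed")
    signals = []
    pos = 0
    for base in cycle(flow_order):
        if pos == len(synthesized):
            break
        count = 0
        # A homopolymer run is incorporated completely within one flow.
        while pos < len(synthesized) and synthesized[pos] == base:
            count += 1
            pos += 1
        signals.append(count)  # 0 means no match, so no voltage change
    return signals


def call_bases(signals, flow_order=FLOW_ORDER):
    """Turn per-flow signals back into a base sequence."""
    return "".join(base * n for base, n in zip(cycle(flow_order), signals))


if __name__ == "__main__":
    flows = simulate_flows("TTACGG")
    print(flows)
    print(call_bases(flows))
```

In a real instrument the signal is analog and noisy, so long homopolymer runs are harder to count than this integer model suggests.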
[Figure: http://www.iontorrent.com/lib/images/step4.jpg]

If there are two identical bases on the DNA strand, the voltage is double, and the chip records two identical bases.
[Figure: http://www.iontorrent.com/lib/images/step5.jpg]

Single molecule passage through a pore

Oxford Nanopore Technologies: schematic of the nanopore device

Oxford Nanopore Technologies: principle and instrumentation

DNA sequencing development
• 2001: Genome draft of 5 individuals in 9 months, more than $1 billion
• 2015: Complete human genome in an hour, about $100

Sample preparation for next gen. DNA/RNA sequencing: single cell profiling

Single-Cell RNA-Seq
Traditional techniques:
• Analysis of a few genes in populations of thousands of cells (e.g. in situ hybridization)
• Expression profile of thousands of genes in a tissue homogenate
Single-cell RNA-Seq:
• Transcriptomes of thousands of individual cells of various types and states
Examples of Single-Cell RNA-Seq applications:
• Understanding heterogeneity: tumors, clonal evolution, metastatic clones, drug resistance, etc.
• Understanding complex tissues, e.g. neural tissue. (The complete transcriptional profile of individual neurons activated by external stimuli is a crucial step toward uncovering how a memory trace is captured and stored.)
• Reliable identification of cell types and markers; understanding differentiation pathways in developmental biology and systems biology
Experimental requirements for single-cell sequencing of thousands of cells:
• Handling thousands of tissue cells: micro-containers (\(10^{5}\) droplets/min)
• Cell lysis: inside the container
• Sequencing of gene regions: RNA
• RNA of a single cell in a container: a specifically labeled particle for hybridization
• Complete transcriptome: RNA hybridization inside the container; excess of oligo primers on each particle
• Cell identification: a cell barcode for each RNA fragment
• Sequence identification: a molecular barcode; the same sequence of one fragment may be analyzed several times
• RNA constructs suitable for reverse transcription, PCR, and high-throughput next gen. sequencing

Drop-seq enables highly parallel analysis of thousands of individual cells by RNA-seq
• Analysis of RNA or transcriptome variation in identified cells
(Macosko et al., Cell, 2015, 161, 1202–14)

RNA barcoding
• separation of thousands of cells in suspension
• cell–RNA assignment: barcoding
• analyses of cellular transcriptomes

Molecularly barcoded cellular transcriptomes
Primer on the bead:
• PCR handle: identical for all beads
• cell barcode, 12 nt (\(4^{12} \approx 1.7 \times 10^{7}\)): identical for all primers on a bead, i.e. for the cell in the drop
• molecular identifier, 8 nt (\(4^{8} = 65{,}536\)): different on each primer (reveals PCR duplicates)
• poly dT30: captures polyA on mRNA and primes reverse transcription
Steps:
• inside 0.5 nL droplets: cellular mRNA hybridized
• outside droplets: reverse transcription (cDNA), PCR-amplified cDNA, high-throughput sequencing
Other labels: bead ~30 µm; ~14 pL; 1000 beads in µL; \(10^{8}\) reads on a single bead

Synthesis of cellular barcodes and molecular identifiers on microparticles
"Split-and-pool" strategy: the same sequence for all primers on a single bead
• barcodes: \(4^{12}\) (16,777,216) possible barcodes after 12 rounds
• different microparticles have different sequences
Degenerate synthesis: 8 synthesis rounds with 4 DNA bases
• unique molecular identifier (UMI): \(4^{8}\) (65,536) possible sequences on each particle
• specific sequences for each primer
30 dT sequence: complementary to polyA RNA
Millions of primers on a microparticle

nL droplets
• 100,000 nL-sized droplets/min
• barcoded microparticles suspended in a lysis buffer
Single Cell RNA-Seq
Complex neural tissue: mouse retina
• transcriptomes from 44,808 mouse retinal cells analyzed
• 39 transcriptionally distinct cell populations identified

1) What is the principle of the polymerase chain reaction (PCR amplification)?
An enzymatic reaction (DNA polymerase) on a genomic DNA template, in the presence of two specific primers (short oligonucleotide chains that define the start and end of the amplified region) and deoxynucleotides (dATP, dTTP, dCTP, dGTP) as the basic building blocks, leads to the targeted synthesis of selected fragments. Cycling between 92, 62, and 72 °C brings about, in turn, dissociation of the double-stranded DNA, annealing of the primers, and synthesis of fragments on both complementary strands. The products of one cycle thus become the templates of the next, and the number of selected fragments grows exponentially (\(2^{n+1}\), where n is the cycle number).

2) What is the principle of the Sanger sequencing reaction?
Enzymatic synthesis (DNA polymerase) of a strand complementary to the template (genomic DNA), in the presence of specific primers (short oligonucleotide chains that define the start site of synthesis), dideoxy terminators (ddATP, ddTTP, ddCTP, ddGTP), and deoxynucleotides (dATP, dTTP, dCTP, dGTP) as the basic building blocks, yields a mixture of fragments of different lengths. The position of each terminal nucleotide is encoded in the length of the corresponding Sanger sequencing fragment. Separating these fragments (specifically fluorescently labeled on the primers or on the dideoxy terminators) therefore gives the sequence of nucleotides in the genome.

3) What is the principle of the most modern DNA sequencing methods?
a) Massively parallel monitoring of the incorporation of individual nucleotides into a single dsDNA molecule in real time during polymerase synthesis.
b) Massively parallel monitoring of the current as DNA molecules pass through the pores of an artificial membrane.
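The exponential growth in answer 1 and the "correct copies" formula on the PCR scheme slide, \(N = 2^{n+1} - 2(n+1)\), can be checked with a short strand-counting sketch. It assumes an idealized reaction: one double-stranded template molecule, every strand copied exactly once per cycle, and no losses. The function name and the three strand classes are introduced here for illustration only.

```python
def pcr_strand_counts(cycles):
    """Count single strands after a number of ideal PCR cycles.

    Strand classes (idealized):
      original - the two template strands
      long     - copies of an original strand (one end set by a primer)
      correct  - strands delimited by a primer at both ends
    """
    original, long_, correct = 2, 0, 0
    for _ in range(cycles):
        new_long = original            # each original strand yields a long product
        new_correct = long_ + correct  # long and correct strands yield correct copies
        long_ += new_long
        correct += new_correct
    return original, long_, correct


if __name__ == "__main__":
    for n in range(1, 6):
        original, long_, correct = pcr_strand_counts(n)
        total = original + long_ + correct
        # Closed forms: total = 2^(n+1), correct = 2^(n+1) - 2(n+1)
        print(n, total, correct, 2 ** (n + 1) - 2 * (n + 1))
```

The total number of strands doubles every cycle, while the originals and long products together grow only linearly, as 2(n+1). The correct copies therefore dominate after a few cycles.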
Single Cell RNA-Seq
Traditional techniques:
• analysis of a few genes in thousands of individual cells (e.g., in situ hybridization)
• expression profile of thousands of genes, but only on a tissue homogenate
Single-cell RNA-Seq:
• transcriptomes of thousands of single cells varying in type and state
Examples of Single Cell RNA-Seq applications:
• Understanding tumor heterogeneity and clonal evolution: lineage analysis, cancer stem cells, and drug-resistant and metastatic clones.
• Understanding complex tissues (e.g. neural tissues: the first look at the entire transcriptional profile of individual neurons activated by external stimuli is a critical step in ultimately discovering how a memory is captured and stored).
• High-resolution identification of cell types and markers, and understanding differentiation pathways in developmental and systems biology.
Experimental conditions for single-cell sequencing:
• Thousands of cells from a tissue: capturing containers (\(10^{5}\) droplets/min)
• Gene coding regions: RNA
• Complete transcriptome: excess of capturing oligo primers
• Cell identification: cell barcode for each RNA fragment
• Sequence identification: one sequence could be analyzed many times
• RNA constructs amenable to reverse transcription, PCR, and high-throughput next gen. sequencing
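The cell barcode and molecular identifier listed above can be illustrated with a minimal counting sketch: reads are grouped by cell barcode, and reads that share the same molecular identifier for the same gene are counted once, since they are treated as PCR duplicates of one captured mRNA. The tag layout (12-nt cell barcode followed directly by an 8-nt molecular identifier), the function name, and the assumption that each read has already been assigned to a gene are all simplifications made for this example. It is not the published analysis pipeline.

```python
from collections import defaultdict

CELL_BARCODE_LEN = 12  # nt, 4**12 = 16,777,216 possible cell barcodes
UMI_LEN = 8            # nt, 4**8 = 65,536 possible molecular identifiers


def count_molecules(reads):
    """Collapse PCR duplicates and count distinct molecules per cell and gene.

    `reads` is an iterable of (tag, gene) pairs, where `tag` is the cell
    barcode followed by the molecular identifier (hypothetical layout).
    Returns {cell_barcode: {gene: number_of_distinct_identifiers}}.
    """
    seen = defaultdict(set)
    for tag, gene in reads:
        if len(tag) != CELL_BARCODE_LEN + UMI_LEN:
            continue  # skip malformed tags
        cell, umi = tag[:CELL_BARCODE_LEN], tag[CELL_BARCODE_LEN:]
        seen[(cell, gene)].add(umi)  # duplicates collapse in the set

    counts = {}
    for (cell, gene), umis in seen.items():
        counts.setdefault(cell, {})[gene] = len(umis)
    return counts


if __name__ == "__main__":
    cell_a, cell_b = "A" * 12, "C" * 12
    example_reads = [
        (cell_a + "ACGTACGT", "geneX"),
        (cell_a + "ACGTACGT", "geneX"),  # PCR duplicate of the read above
        (cell_a + "TTTTTTTT", "geneX"),
        (cell_b + "ACGTACGT", "geneX"),
    ]
    print(count_molecules(example_reads))
```

Sequencing errors in the barcode or identifier would split one molecule into several in this exact-match version. Real pipelines usually tolerate small mismatches.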