From discovery to technology explosion
•1869: Discovery of DNA
•1953: Watson and Crick propose the double-helix structure
•1977: Sanger sequencing
•1985: PCR
•2000: Working draft of the human genome announced (Sanger method)
•2005: 454 sequencer launched (pyrosequencing)
•2006: Genome Analyzer launched (Solexa sequencing)
•2007: SOLiD launched (ligation sequencing)
•2009: A whole human genome no longer merits a Nature/Science paper
•2010: "Third-gen" systems
Cost of a human genome: $3 billion, $2–3 million, $250k, $50k, $20k, <$1k
Figure source: http://www.nature.com/nature/journal/v464/n7289/images/cover_nature.jpg

Frederick Sanger
•1958: Nobel Prize for determining the structure of insulin
•1977: Dideoxy sequencing method
•1977: Sequenced Φ-X174 (5,386 bp)
•1980: Received his second Nobel Prize in Chemistry
•Later (1982): sequenced bacteriophage λ using the shotgun method (48,502 bp)

Genome sequencing
•1986: Leroy Hood: the first automated sequencer
•1986: Human Genome Initiative
•1990: Human genome sequencing project started (expected duration 15 years)
Photo: Leroy Hood

Genome sequencing
•1995: John Craig Venter sequenced the first bacterial genome
•1996: First eukaryotic genome (yeast) sequenced
Photo: John Craig Venter

Craig Venter
•Global Ocean Sampling Expedition
•Synthetic genomics
•Human Longevity Inc
http://www.youtube.com/watch?v=J0rDFbrhjtI

Which applications are labs performing?

Oxford Nanopore
•Sensor array chip: many nanopores in parallel
•Adaptable protein nanopore: DNA sequencing, proteins, polymers, small molecules
•Electronic read-out system

Nanopore
•Pros: extremely long sequences, single molecule, portable (MinION)
•Cons: very high error rates (up to 38% reported)

DNA degradation
•Mechanical damage during tissue homogenization.
•Wrong pH and ionic strength of the extraction buffer.
•Incomplete removal of, or contamination with, nucleases.
•Phenol: too old or not properly buffered (it should be at pH 7.8–8.0); incomplete removal.
•Wrong pH of the DNA solvent (acidic water). Recommended: 1:10 TE for short-term storage, or 1x TE for long-term storage.
•Vigorous pipetting (use wide-bore pipette tips).
•Vortexing of DNA at high concentrations.
•Too many freeze-thaw cycles (we tested 5, still OK).
•Debatable: sequence-dependent degradation.

What are the main contaminants?
Polysaccharides, lipopolysaccharides, growth media residuals, chitin, proteins, fats, polyphenols, secondary metabolites, pigments

BioNano
[Figure: BioNano workflow]

Top sequencing companies
#1. Illumina
Revenues: $2.752 billion in 2017
#2. Thermo Fisher Scientific
Revenues: "Just under" $418.36 million in 2017
Ion AmpliSeq technology is available to researchers using Illumina's NGS platforms under the name AmpliSeq for Illumina. Thermo Fisher includes NGS within its life sciences solutions segment, which accounted for $5.73 billion of the company's total revenue of $20.918 billion.
#7. Pacific Biosciences of California (PacBio)
2017 revenues: $93.5 million
Acquisition by Illumina announced in 2018 (abandoned in 2020)
#9. 10x Genomics
2017 revenues: $71 million
The company announced a new version of its Chromium de novo assembly solution, which includes a new version of the assembly software, Supernova 2.0. The company's offerings also include Linked-Reads, a sequencing technology designed to provide long-range information from short-read sequencing data. 10x Genomics, which completed a $55 million Series C financing in 2016, organizes genetic information based on what is known as "read clouds" to map the larger picture of the genome.
#10. Oxford Nanopore Technologies
2016 revenues: £4.5 million
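The "read clouds" idea behind Linked-Reads can be illustrated with a small sketch: short reads that share a barcode and map close together are assumed to come from the same long DNA molecule. This is only a toy illustration, not how Supernova or any 10x Genomics software works. The function name `read_clouds`, the `max_gap` threshold and the example barcodes and positions are all hypothetical.

```python
from collections import defaultdict


def read_clouds(alignments, max_gap=50_000):
    """Group mapped reads into clouds: same barcode, neighbours within max_gap bp.

    alignments: iterable of (barcode, position) pairs (hypothetical input format).
    Returns a list of (barcode, start, end, read_count) tuples.
    """
    by_barcode = defaultdict(list)
    for barcode, pos in alignments:
        by_barcode[barcode].append(pos)

    clouds = []
    for barcode, positions in sorted(by_barcode.items()):
        positions.sort()
        start = prev = positions[0]
        count = 1
        for pos in positions[1:]:
            if pos - prev > max_gap:
                # Gap too large: close the current cloud and start a new one,
                # since one barcode may tag more than one molecule.
                clouds.append((barcode, start, prev, count))
                start, count = pos, 0
            prev = pos
            count += 1
        clouds.append((barcode, start, prev, count))
    return clouds


if __name__ == "__main__":
    # Made-up barcodes and mapping positions, for illustration only.
    demo = [("BC1", 1000), ("BC2", 5000), ("BC1", 30000),
            ("BC1", 400000), ("BC2", 20000)]
    for cloud in read_clouds(demo):
        print(cloud)
```

Each cloud spans far more sequence than a single short read, which is the long-range information the slide refers to.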
Sequencing without a limit?
•Rapid progress in next-generation sequencing technologies promises to provide complete (reference) DNA sequences
•The bottleneck:
–NOT the sequencing capacity
–BUT the ability to assemble many short reads when repeated DNA (and polyploidy) is prevalent

Two strategies
•Whole genome shotgun (bottom-up)
•Clone-by-clone (top-down)
http://olomouc.ueb.cas.cz/

Genome sequencing
•Whole genome shotgun: sequence assembly
•Clone-by-clone: physical map construction, then sequence assembly
Figure source: http://corelabs.cgrb.oregonstate.edu/sites/default/files/HTS_HISeq2000.png

Three genomes of hexaploid wheat
Genome size
•Oryza sativa (2n = 2x = 24): 1C ~ 400 Mbp
•Triticum aestivum (2n = 6x = 42, AABBDD): 1C ~ 17,000 Mbp
§Chromosomes: 605–995 Mbp (3.6–5.9% of the genome)
§Chromosome arms: 225–585 Mbp (1.3–3.4% of the genome)

Chromosome genomics
§Application of genomics to flow-sorted chromosomes
Doležel et al., Chromosome Research 15: 51, 2007

Chromosome sorting using flow cytometry
§Sample rate: ~1000 chromosomes per second
§Yield per day: 2–5 × 10⁵ chromosomes
[Figure labels: chromosomes in suspension, sheath fluid, flow chamber, laser, excitation light, scattered light, fluorescence emission, deflection plates, left collector (L), right collector (R), flow-sorted chromosomes, waste; flow karyotype (relative fluorescence intensity)]

Creating chromosome-specific BAC libraries
•Chromosome sorting: 5 × 10⁶ flow-sorted chromosomes (~6 weeks of sorting)
•Partial digestion
•Size selection by PFGE
•Ligation into a dephosphorylated BAC vector
•Transformation of Escherichia coli
•Colonies
•Ordering into 384-well plates
[Figure: PFGE gel, chromosome 3B, size markers 2200, 825 and 225 kbp]
Major challenges:
•Quantity of DNA (1–5 μg DNA)
•Quality of DNA (HMW)
•Cloning efficiency
•Insert size
Šafář et al., Plant Journal 39: 986, 2004
Šafář et al., Cytogenet.
Genome Research 129: 211, 2010

Subgenomic BAC libraries
•Main advantages:
-Chromosome specificity
-Small number of clones (in wheat ~5 × 10⁴ vs. >1 × 10⁶)
•Subgenomic BAC libraries facilitate:
-Targeted development of DNA markers (BAC end sequencing)
-Positional gene cloning
-Assembly of ready-to-sequence physical maps (BAC fingerprinting, WGP)

Laser microdissection
•Advantages: high purity
•Disadvantages: small number of chromosomes, labor-intensive

Genome sequencing
•GenBank was established in 1982 from the Los Alamos Sequence Database
Photo: Walter Goad
[Figure: GenBank growth]

Why keep sequencing?
•Comparative genomics
•Biomedical research
•Personal genome
•2010: Ideal human genome sequenced

The new GMO revolution: CRISPR molecular scissors
What is the CRISPR/Cas system?
•A prokaryotic immune system that defends the cell against foreign DNA
•Clustered Regularly Interspaced Short Palindromic Repeats
[Figure labels: promoter 1, Cas9 (Streptococcus pyogenes), promoter 2, guide RNA, nuclease, donor, target DNA, mutation. © Wikipedia]

Why CRISPR?
•Speed, precision, cost
•Multiple mutations at once
•Modification of the system into a "molecular courier"
[Figure labels: classical mutagenesis vs. CRISPR/Cas9 for gene X in the wheat subgenomes A, B and D. © Wikipedia, modified]

GMO versus evolution
•White wine arose about 7,000 years ago through the insertion of a transposon into an anthocyanin gene of the original red grapevine

The area sown with GM crops in the Czech Republic is declining
•8,380 ha in 2008
•6,480 ha in 2009
•4,500 ha in 2013

How much GM crop is actually grown?
USA
•soybean 94%
•cotton 90%
•oilseed rape 90%
•sugar beet 95%
•maize 88%

Maize resistant to the corn borer
•Bacillus thuringiensis
•Bt delta endotoxin
•Insertion of foreign genes

"Synthetic" wheat
Symbiosis between wheat and a bacterium:
1) remove the genes for resistance to the bacterium from the wheat genome
2) insert genes responsible for symbiotic interactions into the genome

Craig Venter: Synthetic genomics
Synthia: artificial life (2016)
•Craig Venter: "the first species... whose parent is a computer... and it is also the first species to have a link to its website written in its DNA"
•473 genes
•Richard Feynman: "What I cannot build, I cannot understand"
Synthetic cell (Science)

454 pyrosequencing (2005)
Genome Sequencer 20 System
http://www.454.com
DNA library preparation
•DNA fragmentation
•Adapter ligation
•Capture of DNA molecules
•Denaturation
emPCR
•Emulsion formation (oil)
•emPCR
•Bead capture
•Denaturation
•Sequencing primer
•Dispersion onto the slide
•Microreactor parameters
Sequencing

SOLiD (Sequencing by Oligonucleotide Ligation and Detection) (2007)
•2-base encoding sequencing

Solexa (2006)

HELICOS (2008)
•True Single Molecule Sequencing (tSMS)

Pacific Biosciences
•Single Molecule Real-Time (SMRT) sequencing
•20 zeptoliters

Ion Torrent

Oxford Nanopore

Other technologies
•Microelectrophoresis
•Microarray-based sequencing

CHALLENGES IN GENOME SEQUENCING
Chromosomal approach, BAC-by-BAC sequencing: a challenge!
De novo genome assemblies using only short-read data from NGS technologies are generally incomplete and highly fragmented due to:
§Large duplications
§High proportion of repetitive DNA
§Large genome size (~17 Gb)
§Polyploidy (3 subgenomes)

Chromosomal approach
BAC-BY-BAC SEQUENCING
§The physical map is composed of contigs of overlapping BAC clones
§BAC contigs are anchored to the chromosome through markers contained in the contigs

SOLUTIONS FOR THE REPEATS
§Long mate-pair reads (> 10 kb)
§Long-read technologies: PacBio, Oxford Nanopore
§Optical mapping
–Single-molecule mapping of genomic DNA hundreds of kilobases to several megabases in size
–Creates sequence-motif maps, which provide a long-range template for ordering genomic sequences
–Visualisation of reality: "Seeing is Believing"

OPTICAL MAPPING
Three enzymatic approaches
§Restriction enzymes: sequence-specific cleavage of DNA immobilized on a surface
§Nicking enzymes: fluorescent labelling of the nicking site in solution (BioNano Genomics, Irys)
§Methyltransferase enzymes: labelling with ultra-high density
Nick labelling steps: nicking, strand displacement, incorporation of fluorescent nucleotides

BIONANO GENOME MAPPING ON NANOCHANNEL ARRAYS
1. Sequence-specific labeling with a nickase (Nt.BspQI)
2. DNA linearization
3. Fluorescence imaging
4. Map construction
5. Building consensus map
Lam et al., Nat. Biotechnol. 30(8) 2012

Fluorescent dye-conjugated nucleotides (Alexa 546 dUTP) were incorporated at the Nt.BspQI sites by Vent (exo−) polymerase. Next, we stained the labeled DNA molecules with the DNA-intercalating dye, YOYO-1, which facilitates visualization of the DNA molecule and measurement of its size. Then, we loaded the DNA onto a nanochannel array chip and applied an electric field, which gradually drives the long, coiled DNA molecules in free suspension through a series of micro- and nanofluidic structures. Once the nanochannels were populated by a set of linearized DNA molecules, we imaged them with automated high-resolution fluorescent microscopy.
We determined the size of each DNA molecule by directly measuring its contour length. The histogram peaks represent the location of each sequence motif along the molecules.
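The motif locations measured on imaged molecules can be compared with an in silico map computed from a sequence. The sketch below is a minimal illustration under stated assumptions: it takes the Nt.BspQI recognition motif to be GCTCTTC (not given in the slides, so check the enzyme datasheet), searches both strands, and ignores the optical resolution limit. The function names and the demo sequence are hypothetical.

```python
# Assumed Nt.BspQI recognition motif; not stated in the slides.
MOTIF = "GCTCTTC"

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def label_positions(seq, motif=MOTIF):
    """Return sorted 0-based start positions of the motif on either strand."""
    seq = seq.upper()
    positions = set()
    for pattern in (motif, reverse_complement(motif)):
        start = seq.find(pattern)
        while start != -1:
            positions.add(start)
            start = seq.find(pattern, start + 1)
    return sorted(positions)


def label_intervals(positions):
    """Distances between neighbouring labels: the pattern compared across maps."""
    return [b - a for a, b in zip(positions, positions[1:])]


if __name__ == "__main__":
    # Made-up sequence: one forward-strand site and one reverse-strand site.
    demo = "AAAA" + "GCTCTTC" + "TTTTT" + "GAAGAGC" + "CC"
    sites = label_positions(demo)
    print(sites)                   # [4, 16]
    print(label_intervals(sites))  # [12]
```

The list of label-to-label distances, rather than the absolute positions, is what allows a sequence contig to be placed against a consensus map.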