CG920 Genomics
Lesson 8: Structure and organization of genomes

Markéta Pernisová
Functional Genomics and Proteomics of Plants, Mendel Centre for Plant Genomics and Proteomics, CEITEC - Central European Institute of Technology, Masaryk University, Brno
marketa.pernisova@ceitec.muni.cz, www.ceitec.muni.cz

Outline
1. Eukaryotic nuclear genomes
2. Genomes of prokaryotes and eukaryotic organelles
3. Virus genomes and mobile elements
4. Online sources
5. Literature

GENOME
Genome – set of genetic information of an organism; the complete biological information needed to construct, maintain and replicate/reproduce it
• eukaryotic: chromosomes in nucleus, mitochondria + chloroplasts
• prokaryotic: nucleoid, plasmids
• virus

EUKARYOTIC NUCLEAR GENOMES
= set of linear DNA molecules; at least two chromosomes, no known exceptions

STRUCTURE OF CHROMOSOMES
DNA + histones = nucleosome
• 140-150 bp
• linker: 50-70 bp
• + linker histones
• core histones: 2x (H2A, H2B, H3, H4)
"beads-on-a-string" form of chromatin – 11 nm

STRUCTURE OF CHROMOSOMES
30 nm chromatin fiber – interphase chromosomes
• several theories, 2 models:
  • solenoid model – only linker histones (e.g. H1)
  • helical ribbon – linker histones + core histone "tails"; chemical modifications of these tails open up the 30 nm fiber

STRUCTURE OF CHROMOSOMES
Condensed metaphase chromosomes: 1400 nm
One chromatid: 700 nm
Centromeres, telomeres – repetitive sequences

METAPHASE CHROMOSOMES
Human karyogram

UNUSUAL CHROMOSOMES
• Minichromosomes
  • short, high gene density
  • e.g. chicken: 33 minichromosomes – 1/3 of genome, ~75% of genes
• B chromosomes
  • individual, not in all populations
  • fragments of normal chromosomes, probably the result of unusual events during nuclear division
  • common in plants, associated with reduced viability
  • also in fungi, insects, animals
• Holocentric chromosomes
  • no single centromere, but multiple centromeric structures along the chromosome
  • e.g.
Caenorhabditis elegans

GENOME ORGANIZATION

GENES AND GENE-RELATED SEQUENCES

GENES
Genes
• UTR – untranslated region
• Introns – removed by splicing
• Exons – functional products
Multigene families – groups of genes of identical or similar sequence
• simple (or classical)
  • gene duplication
  • e.g. genes coding for ribosomal RNA in the human genome:
    • 2000 genes for 5S rRNA – all in a single cluster on chromosome 1
    • 280 copies of a repeat unit for 28S, 5.8S and 18S rRNA – five clusters of 50-70 repeats on five chromosomes
• complex
  • similar (but not identical) sequences, distinctive properties
  • e.g. mammalian globin genes – expressed at different developmental stages

GENE DISTRIBUTION ON CHROMOSOME
• uneven
• genes also in centromere, at lower density
• Human: 1-64 genes per 100 kb
• Chromosome 1 in Arabidopsis (figure labels): 38 genes per 100 kb; 1 gene per 100 kb

PSEUDOGENES
• evolutionary relics
• 2 groups
  • conventional – arise due to mutation
    • partially functional or nonfunctional
  • processed – derived from an mRNA copy by reverse transcription (RT)
    • no introns
    • lack regulatory sequences upstream of the gene
    • nonfunctional

GENE FRAGMENTS
• Truncated genes
• Gene fragments

INTERGENIC DNA
• "junk" DNA – not true

REPETITIVE DNA
• Tandemly repeated DNA
  • repeat units are placed next to each other in an array
• Interspersed repeats
  • genome-wide
  • repeat units are distributed randomly around the genome

TANDEMLY REPEATED DNA
• ~ satellite DNA
• arose due to errors during genome replication
• long series of tandem repeats, hundreds of kilobases in length
• single genome: several different types of satellite DNA, each with a different repeat unit (5-200 bp)
• in centromeres or LTR
• Minisatellites ("variable number of tandem repeats" – VNTRs)
  • repeat unit up to 25 bp, clusters up to 20 kb
  • telomeres
• Microsatellites ("simple tandem repeats" – STRs)
  • repeat unit up to 13 bp, clusters up to 150 bp
  • function not clear
  • use: genetic profiling

INTERSPERSED REPEATS
• random
• arise due to transposition
• some
of them descended from transposable viruses
• LINEs (long interspersed nuclear elements)
  • over 300 bp
• SINEs (short interspersed nuclear elements)
  • up to 300 bp

HUMAN GENOME
[Figure: composition of the human genome; values shown: 1.5%, 37.5%, 62.5%, 43.75%, 18.75%, 36%]

NUCLEAR GENOME ORGANIZATION
Human genome – 50 kb section
• 4 genes
• 88 repeats
  • LINEs
  • SINEs
  • LTRs
  • DNA transposons
• 7 microsatellites (4 in introns)
• 30% noncoding, nonrepetitive, single-copy DNA of no known function

GENOME ORGANIZATION COMPARED

GENOME ORGANIZATION
• C-value paradox (C-value enigma) – size of the genome does NOT correlate with organism complexity

GENOME ORGANIZATION
• size of the genome does not correlate with the number of genes

GENE CATALOG
• Organisms with sequenced genome
• Human gene catalog:
  • cannot tell us what makes a human being …

PROKARYOTIC GENOME
• Prokaryota
  • bacteria
  • archaea

OPERON
• lactose operon
  • utilization of lactose
• tryptophan operon
  • same biochemical pathway
• Methanococcus jannaschii (archaea) and Aquifex aeolicus (bacteria)
  • different functions

PLASMIDS
• additional genetic information
• adaptation to environmental conditions
• advantage for the host
• in some cases inserted into the main genome

PROKARYOTIC GENOMES

SIZE OF PROKARYOTIC GENOMES
• in most cases the size of the genome correlates with the number of genes
• average: 950 genes per 1 Mb

PROKARYOTES vs EUKARYOTES
[Figure: E. coli, human]

PROKARYOTES vs.
EUKARYOTES
Prokaryotes:
• nucleoid
• free in cytoplasm
• plasmids
• compact
• majority of genome: coding sequences
• operons
• few repetitive sequences
Eukaryotes:
• chromosomes
• in nucleus
• introns
• C-value paradox
• majority of genome: noncoding sequences
• large number of repetitive sequences
• mitochondrial and chloroplast genomes

EUKARYOTIC ORGANELLE GENOMES

ORGANELLE GENOMES
• endosymbiont theory of organelle origin
  • relics of free-living bacteria
  • symbiotic association with the precursor of the eukaryotic cell
  • endosymbiosis
• from 1 up to 100 copies in one mitochondrion
• heritability – as one copy ???
• transfer of DNA from organelles into the nucleus and between organelles
  • Arabidopsis
    • mitochondrial genome contains nuclear and chloroplast DNA
    • nuclear genome contains sequences of chloroplast and mitochondrial DNA
  • vertebrates
    • mitochondrial DNA in nuclear genome

MITOCHONDRIAL GENOME
• circular or linear
• 1 mitochondrion – 10 identical molecules = approximately 8000 in one cell (human)
• rRNA, tRNA, respiratory chain components, ribosomal proteins, transcription, translation, transport proteins ...
[Figure: human, yeast]

CHLOROPLAST GENOME
• similar set of approximately 200 genes
• rRNA, tRNA, ribosomal proteins, photosynthetic components ...
[Figure: rice]

SIZE OF ORGANELLE GENOMES

VIRUS GENOMES AND MOBILE ELEMENTS

VIRUS GENOMES
• virus – nucleoprotein particle
• dependent on the host = obligate parasites – they need the host's ribosomes and translational apparatus to synthesise the protein coat
• bacteriophages (~ phages)
• eukaryotic
• virus genome
  • DNA or RNA
  • circular or linear
  • ss or ds
  • segmented or nonsegmented
• examples (figure labels): MS2, M13, T4, λ

BACTERIOPHAGE GENOMES
• number of genes: 3-200
• overlapping genes
• phages
  • lytic (virulent), e.g. T4
  • lysogenic (temperate), e.g. phage λ

LYTIC INFECTION
• = virulent, productive
• e.g.
phage T4
• fast cell lysis and death
• latent period – time needed for phage reproduction in the host
  • 22 minutes

LYSOGENIC INFECTION
• = temperate, quiescent
• e.g. phage λ
• immediately after phage DNA entry – integration of the virus genome into the host genome by site-specific recombination – prophage
• induction of prophage excision – chemical or physical factors – probably connected with DNA damage

EUKARYOTIC VIRUSES
• variable
• DNA or RNA; ds or ss; circular or linear; segmented or nonsegmented
• size: 1.5-240 kb

EUKARYOTIC VIRUSES
• capsid – icosahedral or filamentous
• lipid membrane – derived from host
• plant viruses – usually RNA
• lytic and lysogenic infection
  • e.g. viral retroelements
    • retroviruses – RNA genome
    • pararetroviruses – DNA genome

RETROVIRUSES
• genome – each of the three genes encodes a polyprotein that is cleaved, after translation, into two or more functional gene products
  • gag – viral core structure = group antigens
  • pol – reverse transcriptase, integrase, protease functions
  • env – viral envelope proteins
• LTR – important regulatory regions for transcription and replication

RETROVIRUSES
• integration of the retroviral genome into the host genome

VIRUSOIDS AND VIROIDS
• satellite viruses or virusoids – especially in plants
  • RNA molecules, 320-400 bases
  • satellite virus – shares the capsid with the genome of the helper virus
  • virusoid – encapsidated on its own
• viroid
  • RNA molecule, 240-375 bases, no genes, never becomes encapsidated = naked RNA
• circular single-stranded molecules
• replicated by enzymes coded by the host or helper virus
• self-catalyzed cleavage
  • probably related to evolution of RNA splicing

MOBILE ELEMENTS
• = transposons
• transposition – a segment of DNA can move from one position to another in a genome
  • conservative
  • replicative
• involves recombination

MOBILE ELEMENTS
• RNA transposons
  • retrotransposons with LTR
  • retrotransposons without LTR
• DNA transposons
  • in prokaryotic genomes
    • Insertion sequence (IS)
    • Composite transposon
    • Tn3-type transposon
    • Transposable phage
    • …
  • in eukaryotic genomes
    • Ac/Ds
    • Spm
    • …

RETROTRANSPOSONS
• transposition via RNA intermediate
• retrotransposons
  • with LTR sequence
  • without LTR sequence
• figure label: retroviruses

RETROTRANSPOSONS WITH LTR
• Ty element
  • first discovered in yeast
  • 6.3 kb, 25-35 copies
• "delta" element
  • LTR sequences
  • 330 bp
  • around 100 copies

RETROTRANSPOSONS WITH LTR
• Ty1/copia
  • most abundant
  • env gene is missing
  • cannot form infectious virus particles – cannot escape from the host cell
  • form virus-like particles (VLP)
• Ty3/gypsy
  • env equivalent
  • some of them form infectious viruses
• endogenous retroviruses (ERV)
  • humans, mammals

RETROTRANSPOSONS WITHOUT LTR
• retroposons
• LINEs (long interspersed nuclear elements)
  • pol gene
  • functional reverse transcriptase
• SINEs (short interspersed nuclear elements)
  • 100-400 bp
  • no gene
  • "borrow" reverse transcriptase from LINE
  • e.g.
Alu element

DNA TRANSPOSONS IN PROKARYOTES
• do not require RNA intermediate
• less common than retrotransposons
• IS – insertion sequence
  • conservative and replicative transposition
• composite transposons
• Tn3-type
  • lacks IS
  • replicative transposition
• Transposable phages
  • replicative transposition

DNA TRANSPOSONS IN EUKARYOTES
"JUMPING GENES"
• Human genome
  • 350 000 transposons
  • inverted terminal repeats (ITR)
  • gene for transposase
  • usually nonfunctional
• Maize
  • Ac/Ds elements
  • Spm element
• Drosophila
  • P element

MOBILE ELEMENTS IN THE HUMAN GENOME

SUMMARY
• Eukaryotic nuclear genome
  • chromosomes
  • genes
  • intergenic DNA
  • gene catalog
• Prokaryotic genome
  • nucleoid
  • plasmids
• Mitochondrial and chloroplast genomes
• Virus genomes
  • bacterial viruses – phages
  • eukaryotic viruses
• Mobile elements
  • RNA transposons
  • DNA transposons

ONLINE AND LITERATURE SOURCES

ONLINE SOURCES
• https://gold.jgi-psf.org/index
• http://www.ebi.ac.uk/genomes/
• http://www.genomenewsnetwork.org/

LITERATURE
• T. A. Brown: Genomes
• Alberts et al.: Molecular Biology of the Cell
• G. Gibson and S. V. Muse: A Primer of Genome Science
+ internet, scientific papers …
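
EXAMPLE: FINDING MICROSATELLITES
As a small illustration of the microsatellite (STR) definition on the TANDEMLY REPEATED DNA slide (repeat unit up to 13 bp), the Python sketch below scans a DNA string for perfect tandem repeats. The function name `find_microsatellites` and the `min_copies` threshold of 3 are assumptions made for this example and are not part of the lecture; it does not enforce the cluster-length limit, and real STR-calling tools also tolerate imperfect repeats.

```python
import re

def find_microsatellites(seq, max_unit=13, min_copies=3):
    """Return (start, unit, copies) for each perfect tandem repeat in seq.

    max_unit follows the slide (repeat unit up to 13 bp);
    min_copies is an illustrative threshold chosen for this sketch.
    """
    # A short unit (lazy match) followed by at least min_copies - 1 more copies.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        copies = len(m.group(0)) // len(unit)
        hits.append((m.start(), unit, copies))
    return hits

# Toy sequence: a (CA)6 microsatellite flanked by non-repetitive DNA.
print(find_microsatellites("GGTT" + "CA" * 6 + "GGTC"))  # [(4, 'CA', 6)]
```

The lazy quantifier makes the search prefer the shortest repeat unit, so a run such as AAAAA is reported as unit A with 5 copies rather than as a longer unit.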