Microbial activity in the soil and its seasonal changes: The picture provided by metatranscriptomics and metaproteomics
Petr Baldrian
Institute of Microbiology, Prague, Czech Republic

Ecology of forest topsoil
[Figure: tree and mycorrhiza in summer; tree and mycorrhiza in winter]

Fungal activity in litter and soil in contrasting seasons
[Figure: September and March (under snow) panels showing litter and the organic horizon, with CO2, (CHO)n photosynthates, (CHO)n litter, mycorrhizal fungi and saprotrophic fungi]
• The high share of fungi in the ecosystem is reflected by a high fungal contribution to transcription and protein production, especially in litter
• Fungal activity is important for the decomposition of complex organic matter
• Seasonal differences in rhizodeposition affect the ratio of root-associated to saprotrophic fungi
• The activity of root-associated fungi decreases in winter

Methods choice and methodology

Exploring ecosystem functioning
Stable isotope probing: identifies microorganisms involved in processes that incorporate C (stable isotope 13C) or N (stable isotope 15N); a labelled substrate is needed. Microorganisms that build biomass from the incorporated substrate get their biomolecules (PLFA, DNA, RNA) labelled. There is a danger of cheating (commensalism) and cross-feeding (predation on microorganisms that have accumulated the labelled isotope), which leads to "label dilution" over time.
Metaproteomics: isolation of total protein and "sequencing" of peptides. Peptides can be identified, but they carry less information than nucleic acids (degenerate genetic code). Sequence assignment is difficult, and proteins differ in their stability in the environment over time. Can be extremely powerful in combination with metatranscriptomics or metagenomics, but is difficult to use alone.
Microarray technologies: hybridization of DNA or RNA with chips can avoid reverse transcription bias or PCR bias. High resolution across levels of expression. However, microarrays only cover the genes used for their design.
Excellent for the study of individual taxa in environments where sequenced genomes are available.
Isolation and analysis of microbial strains: challenging but can be very useful.

Exploring ecosystem functioning - metagenomics
Pros:
- can indicate the genetic potential of the community
- the DNA community is relatively stable over time (good representation of the ecosystem)
- can theoretically reveal co-occurrence of genes (when longer contigs of DNA covering several genes are assembled)
- sometimes gives the exact identity of the gene source (when the gene sequence and 16S co-occur on an assembled molecule)
- powerful for the exploration of bacteria / archaea
Cons:
- eukaryotes (e.g. fungi) contain a lot of noncoding DNA (in some cases over 90%)
- eukaryotic genes contain introns
- due to the above, short reads of eukaryotic genes can give no information (noncoding DNA) or be difficult to assign (they contain both exons and introns)
- need for assembly
- potential is not function, presence is not activity (extracellular DNA, pseudogenes, levels of expression...)
- for diverse ecosystems, a high depth of sequencing is required

Metatranscriptomics - opportunities
- can indicate real activity in the studied ecosystem; fast response to disturbance / experimental treatment
- little danger of "ancient" RNA from dead cells: such RNA decomposes rapidly
- avoids the problem with noncoding DNA
- for coding sequences, functional and taxonomic assignment is simpler than for DNA, even for shorter reads
- powerful for the exploration of both prokaryotes and eukaryotes
- in eukaryotes, mRNA is easy to isolate and purify by "fishing" for the poly-A tails of mRNA molecules (but this does not always work)
- metatranscriptomes are much less complex than metagenomes, which results in easier assembly (higher coverage with the same number of reads)
- with sufficient depth of sequencing, the relative importance of individual processes can be analysed to some depth by comparing transcription levels
- while metagenomics tells
which genes may be involved, metatranscriptomics tells which genes actually are involved (expressed)

Metatranscriptomics - limitations
- expression is highly regulated and corresponds to the "actual" conditions, not the "mean" conditions of the site; for example, transcription increases by orders of magnitude when dry soil is moistened
- mRNA is short-lived, so the metatranscriptome reveals what happened within the last tens of minutes
- the amount of extracted RNA usually makes amplification necessary; PCR amplification of cDNA introduces bias
- extracted RNA contains a lot of rRNA that can be difficult to remove
- genes with a low level of expression are difficult to recover
- there is little (if any useful) information on mRNA stability over time and on translation rate, and thus on the number of protein molecules synthesized per mRNA molecule in its lifetime
- metagenomics can theoretically deliver long contigs, i.e. chromosome fragments with multiple genes from the same genome; this is impossible for metatranscriptomics

Workflow
Sampling
- sample collection
- stabilization
- storage
Library preparation
- RNA isolation
- RNA purification
- DNA removal
- rRNA removal (to recover all mRNA) or capture of eukaryotic mRNA (poly A)
- RNA fragmentation
- cDNA synthesis
- RNA removal
- adapter ligation (with barcodes when multiple samples will be sequenced)
- amplification
- selection of molecules of appropriate size
Sequencing
Data analysis
- long sequences: direct annotation (function, binning to microbial taxa)
- short sequences: sequence assembly and annotation of contigs (scaffolds)

Sample collection and storage
Each study site: 64 m2
Eight 4.5 cm soil cores at defined locations, approx.
2.0 m from each other
After core collection (within 30 min, on site):
- collection of core material
- vegetation, roots and mineral soil discarded
- litter material and soil organic horizon separated into two composite samples (each containing material from all cores)
- sieving of soil (0.5 mm sieve), cutting of litter (scissors)
- aliquoting into tubes
- flash-freezing in liquid nitrogen
Storage for transport: on dry ice
Storage upon arrival: at -80 °C

RNA isolation and purification
- sample homogenization by mortar and pestle in liquid nitrogen
- MoBio RNA PowerSoil Kit used for isolation, in combination with the Zymo Research OneStep PCR Inhibitor Removal Kit (PVPP removal of humic and fulvic acids)
- DNA removal (DNase)
- verification of DNA removal (no PCR amplification of 16S)
- check of RNA yield (Qubit)
- check of RNA quality (Bioanalyzer)
- storage of isolated total RNA (-80 °C)

Removal of ribosomal RNA
- use of Epicentre RiboZero kits
- to remove both bacterial and eukaryotic rRNA, the RiboZero Metabacterial and RiboZero Human/Mouse/Rat kits were combined
- efficiency of rRNA removal verified on the Bioanalyzer
[Figure: Bioanalyzer traces labelled "4L RNA" and "4L -rRNA", with the 16S, 18S and 28S rRNA peaks marked]

Library preparation: ScriptSeq™ v2 RNA-Seq Kit (Epicentre)
[Figure: traces labelled "1L ss cDNA clean" and "1L illumina library"]

Library preparation
- RNA fragmentation to get the desired size distribution of molecules
- cDNA synthesis (reverse transcription); success checked by amplification of 16S from cDNA
- RNA removal (RNase treatment)
- Illumina adaptor ligation to cDNA
- amplification from Illumina adaptors with barcodes for each sample (10-15 cycles)
- check of DNA yield and size distribution
- removal of short fragments (Agencourt AMPure beads) and long fragments (cut from gel)
- equimolar combination of samples into one common library
[Figure: fragment size distribution with the desired size range marked (200 bp, 650 bp)]

Sequencing
Samples were sequenced on Illumina HiSeq (2 x 150-base paired-end reads)
The library for one lane contained 12 samples (6x sample from
litter + 6x sample from soil, all from summer); another lane was used for the winter samples
Theoretically, one lane should deliver up to 350 million paired-end sequences (some 30 million per sample).

Data analysis and interpretation
- Paired-end 2 x 150-base reads were used for the assembly of contigs (scaffolds); the assembly (Velvet) was performed externally, as we do not have enough computing power at the moment
- Contigs were taxonomically binned and functionally annotated (in MG-RAST)
- Individual sequence reads were mapped to contigs
- Transcript abundance in each sample was expressed as coverage per base = sequence count x sequence length / contig length

The seasonality project
Sampling
- 6 sites x 2 horizons (litter, soil) x 2 seasons (September, March) = 24 samples
Community analysis
- Amplicon sequencing of DNA-derived and RNA-derived ITS2 sequences (MiSeq)
Metatranscriptomics: shotgun sequencing of rRNA-depleted RNA
- Isolation of total RNA
- Depletion of bacterial rRNA and eukaryotic rRNA (communities analysed by 16S and ITS sequencing of DNA and RNA)
- Sequencing on Illumina HiSeq: 2 lanes, 2 x 150 bases
- 673 000 000 sequences
- Assembly of reads from all samples together
- 4 500 000 contigs longer than 200 bases
- Annotation using MG-RAST and GenBank
- 44% of reads mapped to contigs, 21% to identified contigs (taxon, function)
Metadata
- Microbial biomass, enzyme activity, chemistry
Metaproteomics
- Identification of fungal / bacterial proteins

Analysis of microbial activity in summer and winter
Sampling
- September: soil temperature 15 °C
- March: soil temperature 2 °C

[Figure: rRNA genes and their transcripts. Prokaryota (bacteria): 16S, tRNA, 23S, 5S, tRNA. Eukaryota (fungi): 18S, ITS1, 5.8S, ITS2, 28S. DNA is transcribed into an RNA transcript, which is cleaved into mature RNA and built into the ribosome; PCR products are derived from DNA or from RNA. In bacteria, RNA amplicons indicate microorganisms possessing ribosomes; in fungi, RNA amplicons indicate microorganisms producing ribosomes.]
Identification of active microbes by
16S / ITS sequencing from RNA

Community composition and activity of bacteria
[Figure: panels for genomes, ribosomes, mRNA of ribosomal proteins and all mRNA; litter and soil, columns S and W (summer and winter)]
Žifčáková et al. Environ. Microbiol. 2016

Composition of total (DNA) and active (RNA) communities of fungi
[Figure: DNA and RNA panels for litter and soil, columns S and W (summer and winter)]
Žifčáková et al. Environ. Microbiol. 2016

Exploring microbial activity: assignment of mRNA taxonomy and function
Functional annotation of predicted genes works well for bacteria but far less well for fungi and archaea, where many hits are to "hypothetical proteins". The situation is even worse for nonmicrobial sequences (protozoa, invertebrates...).
The size of the charts corresponds to the numbers of transcripts.
Žifčáková et al. Environ. Microbiol. 2016

Exploring microbial activity: combining taxonomy and function
Bacterial (but not fungal) reads can be reliably identified at the level of phyla.
NMDS shows that the functional profiles of various microbial taxa differ.

Seasonal contribution of microbial taxa to mRNA production
[Figure: two panels, "All transcripts" and "Transcripts with identified function"; litter and soil, summer and winter]
The share of fungal transcripts is higher in litter than in soil.
Many fungal reads are unidentified "hypothetical" proteins.
In soil, the share of fungal transcripts decreases dramatically in winter.

Involvement of microbial taxa in soil processes
Contributions to activity in litter/summer, litter/winter, soil/summer and soil/winter
Fungi are the dominant producers of cellulolytic enzymes in litter and important (but not dominant) producers in soil.

Involvement of taxa in the decomposition of cellulose
[Figure: panels for endocellulase, exocellulase and β-glucosidase]

Functional biodiversity: high redundancy of functions (starch and sucrose metabolism as an example)
Numbers indicate transcript counts for each function.
(one species may produce one or more transcripts)

Seasonal changes in fungal expression are more pronounced in soil
[Figure: transcripts increased in summer and increased in winter, with a 10-fold mark; green: litter, brown: soil]

Seasonal changes in expression
[Figure: panels for Archaea, Bacteria and Fungi]
29-51% of dominant transcripts show seasonal changes in relative abundance
Žifčáková et al. Environ. Microbiol. 2016

Fungal transcription profiles
Transcription profiles differ between litter and soil.
The seasonal pattern of transcription is more apparent in soil than in litter.

Fungal transcription in soil
Increased in summer:
- Spliceosome components
- Ribosome components
- Phenylalanine metabolism
Increased in winter:
- Fatty acid biosynthesis
- DNA repair
- RNA degradation
- Amino acid metabolism
- Yeast-specific cell processes
[Figure: groups labelled soil summer, soil winter, litter winter and litter summer]

Seasonal activity of fungal divisions: transcription of beta-tubulin
Beta-tubulin can be reasonably assigned to fungal phyla.
The relative share of transcripts of Basidiomycota (most ectomycorrhizal fungi) decreases in winter.
[Figure: soil profile; Ascomycota and Basidiomycota in litter and soil, summer and winter]

Fungal genes involved in mycorrhizal symbiosis
[Figure legend: exclusively transcribed in summer / in winter; transcription increased in summer / in winter; no seasonal difference in transcription]
Transcripts with high similarity (tblastx, E-value below 10^-50) to Laccaria laccata genes involved in mycorrhizal symbiosis are more frequently transcribed in summer. Over 50% of them are exclusively transcribed in summer.

Metaproteomics
[Figure: LTQ Orbitrap mass spectrometer, image from http://www.orioninternational.pk/images/products/LTQ%20Orbitrap.png]

Analysis of microbial proteins by metaproteomics
[Figure: soil profile, followed by the workflow below]
- Extraction of proteins
- Separation and fragmentation
- Mass spectrometry
- Identification of peptide fragments
- Functional and taxonomic annotation
Illustrations by Katharina Riedel and Stephan Fuchs, Greifswald

Metaproteomics - opportunities
- measures proteins produced, not genes transcribed (a better proxy for function)
- proteins can be stable, so the picture given by metaproteomics represents a longer period of time (but how long?)
- protein extraction from litter or soil is no longer a technical problem

Metaproteomics - limitations
- difficult annotation (complex metaproteomes can only be annotated when a metatranscriptome or metagenome assembly is available)
- peptides are sequenced, not proteins, so it is unclear which peptides originate from the same protein
- only a fraction of proteins is typically annotated
- annotation is extremely computationally demanding
- some proteins can be too stable
- high costs (over 1000 EUR for one metatranscriptome sample) force shallow replication

Metaproteome annotation
Slide provided by Stephan Fuchs, Greifswald

Illustration by Stephan Fuchs, Greifswald
Eukaryotic proteins are quantitatively important in soils
The share of annotated fungal proteins in litter is comparable to that of bacteria

Illustration by Stephan Fuchs, Greifswald
Microbial community composition and activity as reflected in the metagenome, metatranscriptome and metaproteome (only annotated reads considered)
[Figure: panels for litter and soil]

Illustration by Stephan Fuchs, Greifswald
Methodological challenges: annotation of eukaryotic genes in soils is
challenging

Other methods to detect microbial activity
After the addition of a substrate labelled with the stable isotope 13C, the label accumulates in the biomass of actively metabolizing microorganisms. The composition of the "active" community can then be assessed by lipid analysis or, after the separation of "heavy" (labelled) and "light" DNA molecules, by molecular methods.

Stable Isotope Probing (SIP)
In practice, the separation between 12C and 13C nucleic acids is usually not this perfect, because some organisms are only partially labelled. There is therefore a gradual transition, seen as a smear, between the 12C and 13C bands.
Different community analysis methods are used to examine the fractions. DGGE is very popular and is probably the least expensive. T-RFLP may be more sensitive but is also complex to run.
After community analysis has confirmed the presence of heavy RNA, it can be taxonomically identified by either
- cutting out and sequencing DGGE bands, or
- cloning and sequencing the entire heavy fraction.

Stable Isotope Probing (SIP)
[Figure: SIP workflow. Microcosm bottle with the microbial community and 13C-labelled cellulose; CO2 sampling (share of 13C-CO2, total respiration); DNA extraction; separation of 13C-DNA and unlabelled 12C-DNA by ultracentrifugation based on buoyant density; identification of the fractions containing DNA by qPCR; precipitation of DNA in the fractions; analysis of the samples containing 13C-DNA or 12C-DNA by DGGE, T-RFLP or sequencing]
Speaker note: I will not describe these molecular methods (DGGE, T-RFLP, sequencing), since they have been mentioned several times during the conference and I will assume that everyone has at least a rough idea of them.
We analysed both the fungal and the bacterial community; apart from CO2 and delta 13C, the results presented concern fungi.
- Culture-independent approach
- Soil DNA extraction: Fast DNA Spin Kit for Soil (MPBio)
- qPCR: Bacteria 1108F and 1132R
- T-RFLP: Bacteria 16Seu27f-HEX and 1492R; Fungi ITS1f-HEX and ITS4-FAM
- Primers: Fungi ITS1f and ITS2*

Figure 2. Distribution of fungal orders (as identified by the PlutoF pipeline) based on the relative abundance* of the 291 most abundant fungal OTUs (A) and distribution of bacterial phyla (as identified by) based on the relative abundance* of the most abundant bacterial OTUs (B), representative of the following samples: L = litter horizon, H = organic horizon, 13 = 13C-labelled community, 12 = non-labelled community.
*Relative abundance: the number of sequences in each sample type was normalized as parts per thousand for each OTU; total abundance was then expressed as the sum of the normalized abundances in all samples per OTU.

Example: bacterial decomposers of cellulose
13C-labelled bacteria belong mainly to Betaproteobacteria and Bacteroidetes.
The genera Mucilaginibacter and Herminiimonas accumulated the most 13C carbon from cellulose.

Stable Isotope Probing (SIP)
Advantages and disadvantages of the method
Advantages:
- distinguishes active and inactive microorganisms (biomass)
- allows the use of a particular (labelled) substrate to be followed selectively
- in combination with RT-PCR it is extremely sensitive and allows the composition of a non-growing community to be detected
Disadvantages:
- has to be combined with other techniques (PCR or RT-PCR and DGGE, RFLP or FAME)
- special equipment is needed (ultracentrifuge and ideally also RT-PCR)
- substrates labelled with stable isotopes are expensive (13C glucose: 2,000 CZK/g, 13C cellulose: 36,000 CZK/g, 13C lignin: 500,000 CZK/g) and only a limited number of them is available

Isolation and characterization of bacterial strains: enzyme activity
SEE MORE IN THE LECTURE OF SALVADOR LLADÓ ON WEDNESDAY
Lladó et al. Biol. Fertil. Soils in press

Most versatile taxa belong to Proteobacteria.
Acidobacteria and Bacteroidetes can grow on multiple mono- and oligosaccharides.
(Slide: Isolation and characterization of bacterial strains: carbon sources)
SEE MORE IN THE LECTURE OF SALVADOR LLADÓ ON WEDNESDAY
Lladó et al. Biol. Fertil. Soils in press

Dominant bacteria: Expression in forest soil and litter across seasons
- Major bacteria isolated from forest litter and soil were genome-sequenced
- Genome sequences were used to identify transcripts belonging to the isolated taxa
- Between 10 and 300 genes were identified as transcripts in litter and soil
- Bacteria expressed similar genes in litter and soil
- Spectra of expressed genes differed between summer and winter
SEE MORE IN THE LECTURE OF SALVADOR LLADÓ ON WEDNESDAY

Activity of the dominant fungus, mycorrhizal Russula ochroleuca
- Russula ochroleuca is the second most abundant fungal taxon in the studied forest soil
- As a mycorrhizal symbiont, it does not grow in culture
- The genome was obtained by shotgun sequencing of DNA isolated from a fresh fruitbody
- The vast majority of predicted genes have no clear relatives for which the producer and function are known
- The composition of expressed genes differs between seasons

Metatranscriptomics - summary
- a higher percentage of reads receive taxonomic annotation than functional annotation (due to hypothetical proteins known from sequenced genomes)
- functional annotation is more reliable than taxonomic annotation
- in fungi, most transcripts cannot be reliably assigned to either Ascomycota or Basidiomycota
- current resources for annotation (GenBank, MG-RAST) still contain few fungal sequences (and genomes); in reality, many more genomes have been sequenced, but it is difficult to use them for annotation
- most microbial transcripts/functions represent basic metabolism, which can be of limited value for the exploration of environmental processes

Soil metatranscriptomics is currently technically feasible and can deliver interesting data.
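
Worked example: coverage per base
The transcript abundance measure from the data analysis slide (coverage per base = sequence count x sequence length / contig length) can be illustrated with a minimal Python sketch. This is not the pipeline used in the study: the function names, the contig names and all numbers below are hypothetical and serve only to show the arithmetic, assuming a fixed read length of 150 bases.

```python
from typing import Dict


def coverage_per_base(read_count: int, read_length: int, contig_length: int) -> float:
    """Coverage per base = sequence count x sequence length / contig length."""
    if contig_length <= 0:
        raise ValueError("contig_length must be positive")
    return read_count * read_length / contig_length


def sample_coverage(
    mapped_reads: Dict[str, int],
    contig_lengths: Dict[str, int],
    read_length: int = 150,
) -> Dict[str, float]:
    """Compute coverage per base for every contig with mapped reads in one sample."""
    return {
        contig: coverage_per_base(count, read_length, contig_lengths[contig])
        for contig, count in mapped_reads.items()
    }


# Hypothetical illustration values, not data from the study.
contig_lengths = {"contig_A": 1200, "contig_B": 300}
mapped_reads = {"contig_A": 40, "contig_B": 40}

coverage = sample_coverage(mapped_reads, contig_lengths)
print(coverage)  # {'contig_A': 5.0, 'contig_B': 20.0}
```

With the same number of mapped reads, the shorter contig gets the higher coverage; dividing by contig length is what makes transcript abundances comparable between contigs of different size.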