SYLICA 'Omic Technologies – Bowater Feb 2013

SYLICA 2013 Bowater lectures
Using ‘Omic Technologies to Investigate Gene Function

Bowater Lectures in Brno, Feb. 2013
4 lectures on linked topics will be delivered during the coming week:
•Contemporary DNA Sequencing Technologies – 26/2/2013 @ 10:00
•Using ‘Omic Technologies to Investigate Gene Function – 26/2/2013 @ 14:00
•Biophysical Methods to Study Molecular Interactions – 27/2/2013 @ 10:00
•Synthetic Biology & Nanotechnology: Tomorrow’s Molecular Biology? – 28/2/2013 @ 10:00

Genomics, ‘Omics & Technology
•Molecular biology: major scientific discipline for past ~50 years
•Genomics = “analysis of genomes”: became important science during 1990s
•Analyses of various other biological molecules have developed into their own scientific disciplines; e.g. Metabolomics = “analysis of metabolites”, etc.
•Transcriptomics/Proteomics: developed during past 10-15 years
•Bioinformatics: has developed as major branch of science - enables efficient analysis of data from “omics” experiments

Genomics & Technology
•Significance of “omics” coincides with dramatic improvements in different technologies:
  –molecular biology: increased range of approaches for purification and manipulation of proteins and nucleic acids
  –computers: required for gathering and analysis of data
  –internet: allows data to be shared, quickly and easily
•All developments have increased speed and cost-effectiveness - available to much wider audience

Transcriptomes
•Genome: all of hereditary information encoded in the DNA (or RNA)
•Transcriptome: set of all mRNAs (“transcripts”) produced from a genome
•Term can be applied to:
  –complete set of transcripts for a given organism
  –specific subset of transcripts present in a particular cell type or under specific growth conditions
•Transcriptome varies because it reflects genes that are actively expressed at any given time

DNA Microarrays Show Differences in Gene Expression
•Microarray chips contain fragments from genes in the group to be analyzed
  –Full genome of bacteria or yeast, or protein families from larger genomes
•mRNA or cDNA from different samples are differentially tagged
•Analysis on the same chip shows differences

Transcriptomics
•Transcriptomics uses high-throughput techniques based on DNA microarrays
•For further details about microarrays see Lucchini et al., Microbiology, 147, 1403-1414 (2001)
Nelson & Cox, “Lehninger, Principles of Biochemistry”, 4th edn, 2004, p. 328

Transcriptomics
•Experiments performed under different conditions
•Determines effect of conditions on expression
•Produces huge amount of data
•Lots of repeats required - expensive
Nelson & Cox, “Lehninger, Principles of Biochemistry”, 4th edn, 2004, p. 328

Polymerase Chain Reaction (PCR)
•Used to amplify DNA in the test tube
  –Can amplify regions of interest (genes) within DNA
  –Can amplify complete circular plasmids
•Mix together
  –Target DNA
  –Primers (oligonucleotides complementary to target)
  –Nucleotides: dATP, dCTP, dGTP, dTTP
  –Thermostable DNA polymerase
•Place the mixture into thermocycler
  –Melt DNA at ~95°C
  –Cool to ~50–60°C, primers anneal to target
  –Polymerase extends primers in 5’→3’ direction
  –After a round of elongation is done, repeat steps

General Steps of PCR
FIGURE 9–12a Amplification of a DNA segment by the polymerase chain reaction (PCR). (a) The PCR procedure has three steps. DNA strands are (1) separated by heating, then (2) annealed to an excess of short synthetic DNA primers (orange) that flank the region to be amplified (dark blue); (3) new DNA is synthesized by polymerization catalyzed by DNA polymerase. The three steps are repeated for 25 or 30 cycles. The thermostable Taq DNA polymerase (from Thermus aquaticus, a bacterial species that grows in hot springs) is not denatured by the heating steps.
•Repeat steps 1–3 many times

Photolithographic Synthesis of DNA
FIGURE 9–22 Photolithography to create a DNA microarray. (1) A computer is programmed with the desired oligonucleotide sequences. (2) The reactive groups, attached to a solid surface, are initially rendered inactive by photoactive blocking groups, which can be removed by a flash of light. An opaque screen blocks the light from some areas of the surface, preventing their activation. Other areas or “spots” are exposed. (3) A solution containing one activated nucleotide (e.g., A*) is washed over the spots. The 5’ hydroxyl of the nucleotide is blocked to prevent unwanted reactions, and the nucleotide links to the surface groups at the appropriate spots through its 3’ hydroxyl. The surface is washed successively with solutions containing each remaining activated nucleotide (G*, C*, T*). The 5’-blocking groups on each nucleotide limit the reactions to addition of one nucleotide at a time, and these groups can also be removed by light. Once each spot has one nucleotide, a second nucleotide can be added to extend the nascent oligonucleotide at each spot, using screens and light to ensure that the correct nucleotides are added at each spot in the correct sequence. This continues until the required sequences are built up on each of the thousands of spots in a DNA microarray.

DNA Microarrays: Applications
•DNA microarrays allow simultaneous screening of many thousands of genes: high-throughput screening
•Genome-wide genotyping
  –Which genes are present in this individual?
•Tissue-specific gene expression
  –Which genes are used to make proteins?
•Mutational analysis
  –Which genes have been mutated?
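To make the expression comparison concrete, the sketch below shows one simple way signal from two differentially tagged samples on the same chip might be compared: a per-gene log2 ratio of the two intensities. It is an illustration only, not the analysis used in the lectures. The gene names, intensity values, the intensity floor and the twofold cut-off are all hypothetical, and a real analysis would also need normalization and replicate experiments.

```python
import math


def log2_ratios(sample_a, sample_b, floor=1.0):
    """Return log2(B / A) for every gene measured in both samples.

    sample_a, sample_b: dicts mapping gene name -> spot intensity.
    floor: hypothetical minimum intensity, avoids dividing by zero.
    """
    ratios = {}
    for gene, a in sample_a.items():
        if gene not in sample_b:
            continue  # gene not measured in the second sample
        a = max(a, floor)
        b = max(sample_b[gene], floor)
        ratios[gene] = math.log2(b / a)
    return ratios


def changed_genes(ratios, threshold=1.0):
    """Genes whose log2 ratio is at least `threshold` in either direction.

    threshold=1.0 corresponds to a twofold change (an assumed cut-off).
    """
    return sorted(g for g, r in ratios.items() if abs(r) >= threshold)


# Hypothetical intensities for three genes in two growth conditions.
control = {"geneA": 200.0, "geneB": 800.0, "geneC": 500.0}
treated = {"geneA": 800.0, "geneB": 200.0, "geneC": 520.0}

ratios = log2_ratios(control, treated)
for gene in sorted(ratios):
    print(gene, round(ratios[gene], 2))
print("changed:", changed_genes(ratios))
```

A positive ratio means more signal in the second sample, a negative ratio means less; working in log2 makes a fourfold increase and a fourfold decrease symmetric (+2 and -2).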

Adaptations to PCR
•Reverse Transcriptase PCR (RT-PCR)
  –Used to amplify RNA sequences
  –First step uses reverse transcriptase to convert RNA to DNA
•Quantitative PCR (Q-PCR)
  –Used to show quantitative differences in gene levels

qPCR
FIGURE 9–13 Quantitative PCR. PCR can be used quantitatively, by carefully monitoring the progress of a PCR amplification and determining when a DNA segment has been amplified to a specific threshold level. (a) The amount of PCR product present is determined by measuring the level of a fluorescent probe attached to a reporter oligonucleotide complementary to the DNA segment that is being amplified. Probe fluorescence is initially not detectable due to a fluorescence quencher attached to the same oligonucleotide. When the reporter oligonucleotide pairs with its complement in a copy of the amplified DNA segment, the fluorophore is separated from the quenching molecule and fluorescence results. (b) As the PCR reaction proceeds, the amount of the targeted DNA segment increases exponentially, and the fluorescent signal also increases exponentially as the oligonucleotide probes anneal to the amplified segments. After many PCR cycles, the signal reaches a plateau as one or more reaction components become exhausted. When a segment is present in greater amounts in one sample than another, its amplification reaches a defined threshold level earlier. The “No template” line follows the slow increase in background signal observed in a control that does not include added sample DNA. CT is the cycle number at which the threshold is first surpassed.
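The relationship between CT and starting amount can be sketched numerically. The example below assumes idealized amplification in which the product doubles every cycle, which real reactions only approximate before the plateau. The function names, the `efficiency` parameter and the CT values are hypothetical and chosen for illustration.

```python
def copies_after(cycles, start_copies=1, efficiency=2.0):
    """Copies of the target after `cycles` rounds of PCR.

    efficiency is the assumed per-cycle multiplication factor;
    2.0 means perfect doubling in every cycle.
    """
    return start_copies * efficiency ** cycles


def fold_difference(ct_sample, ct_reference, efficiency=2.0):
    """Relative starting amount of sample versus reference.

    A sample that crosses the threshold earlier (lower CT) started with
    more template: each cycle of head start is one factor of `efficiency`.
    """
    return efficiency ** (ct_reference - ct_sample)


# One template molecule after 30 ideal cycles.
print(copies_after(30))

# Hypothetical CT values: the sample crosses the threshold 3 cycles earlier.
print(fold_difference(ct_sample=22.0, ct_reference=25.0))
```

Under the doubling assumption, a three-cycle difference in CT corresponds to an eightfold difference in starting template; with a lower assumed efficiency the same CT gap implies a smaller difference.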

Proteomes
•Proteome: set of all proteins produced under a given set of conditions
•Term can be applied to:
  –complete set of proteins for a given organism
  –specific subset of proteins present in a particular cell type or under specific growth conditions
•Proteome varies because it reflects genes that are actively expressed at any given time
•Proteomics analyses many samples using 2D-electrophoresis and mass spectrometry
•High-throughput, but less than transcriptomics

Gel Electrophoresis
•Electrophoresis separates molecules by size
•Resolution is limited
Berg, Tymoczko & Stryer, “Biochemistry”, 6th edn, 2006, p. 71

Isoelectric Focusing
•Electrophoresis across a pH gradient
•Proteins migrate to their isoelectric pH
Berg, Tymoczko & Stryer, “Biochemistry”, 6th edn, 2006, p. 73

Two-dimensional Gel Electrophoresis
•Protein sample initially fractionated in one dimension by isoelectric focusing
•SDS-PAGE performed perpendicular to original direction
•Separates proteins according to pI and mass
Berg, Tymoczko & Stryer, “Biochemistry”, 6th edn, 2006, p. 74

Two-dimensional Gel Electrophoresis
•Proteins from E. coli separated by 2D-electrophoresis
•>1,000 proteins can be resolved
Berg, Tymoczko & Stryer, “Biochemistry”, 6th edn, 2006, p. 74

Mass Spectrometry
•MALDI-TOF mass spectrometry
•Protein sample is ionized and exposed to electric field
•Ions travel according to their mass-to-charge ratio (m/z)
Berg, Tymoczko & Stryer, “Biochemistry”, 6th edn, 2006, p. 94

MALDI-TOF Mass Spectrum
•MALDI-TOF gives good estimates of molecular weights
•Can be used to identify a few proteins within a mixture
Berg, Tymoczko & Stryer, “Biochemistry”, 6th edn, 2006, p. 94

Proteomic Analysis by Mass Spectrometry
•Proteins separated by 2D electrophoresis
•Single proteins eluted
•Digestion with trypsin will give fragments with unique set of sizes
•Sizes identified by mass spectrometry and matched to database
•Allows identification of unknown proteins
Berg, Tymoczko & Stryer, “Biochemistry”, 6th edn, 2006, p. 95

Transcriptomics v Proteomics
•Transcriptomics and proteomics are both very powerful
•Differences in their practical application:
  –Transcriptomics is robust, relatively cost-effective and user-friendly
  –Proteomics still relatively limited – problems can remain with purification and stability of proteins
•Increasing potential to combine and compare data sets - for discussion see Hegde et al., Curr. Opin. Biotech., 14, 647-651 (2003)

Bioinformatics: Mining the Data

Bioinformatics & Databases
•Latest biological data is gathered, organised and disseminated through large databases
•Databases include:
  –EBI, NCBI, Pfam, SMART, SWISS-PROT, TAIR
•Information in bioinformatic databases:
  –sequences, structures, homology searches
•Fast search engines allow searches by all with internet access – databases are as useful as the results they help generate!
•Improved tools for analysis of sequences

Databases – Some URLs
Resource                                      URL
European Bioinformatics Institute             www.ebi.ac.uk/
GenBank                                       www.ncbi.nlm.nih.gov/Genbank/
NCBI                                          www.ncbi.nlm.nih.gov/
Protein DataBank                              http://www.rcsb.org/pdb/home/home.do
Sanger Centre                                 www.sanger.ac.uk/
SMART                                         smart.embl-heidelberg.de
The Arabidopsis Information Resource (TAIR)   www.arabidopsis.org/

NCBI: Complete Genomes
NCBI: Eukaryotic Genomes
NCBI: Microbial Genomes

Databases - The Caveats
•Databases contain mistakes (low as a proportion of total data)
  –primary data errors
  –data analysis errors
  –annotation errors
•Errors are difficult to correct
•Make the interpretation of data your own responsibility!!

NCBI – Useful Links
Links to brief description of all resources at NCBI
PubMed: retrieval system containing citations, abstracts, and indexing terms for journal articles in the biomedical sciences
BLAST: BLAST® (Basic Local Alignment Search Tool) is a set of similarity search programs
All resources: provides integrated access to nucleotide and protein sequence data from >100,000 organisms
Genetics & Medicine: continuously updated catalogues of human genes and genetic disorders
Domains & Structure: NCBI Structure Group, including tools to search and display structures
Taxonomy: general information about taxonomy

Databases Summary
•Many databases are available
  –some have a lot of general information (NCBI, EBI)
  –some have specific data (Pfam, SWISS-PROT)
  –some relate to specific research interests (TAIR)
•Become well acquainted with specific databases
•Wide range of databases, web sites and other resources are available for in silico analysis of biological data
•Great advantages, but beware caveats and potential pitfalls – understand capabilities and limitations!
•Use information intelligently:
  –always ask if the conclusions make biological sense
  –may require further analyses or experimentation

“Omics” Overview
•Analyses of various biological molecules have developed into their own scientific disciplines; e.g. Metabolomics = “analysis of metabolites”, etc.
•Transcriptome: set of all mRNAs (“transcripts”) produced from a genome
•Proteome: set of all proteins produced under a given set of conditions
•Both can vary because they reflect genes that are actively expressed at any given time
•Transcriptomics and proteomics are both powerful, but are used differently: transcriptomics is cheaper and more user-friendly than proteomics