Secondary structure diagrams of proteins, protein families and ligands
Radka Svobodová
NCBR, CEITEC Masaryk University

Current trends: Number of available structures grows

Current trends: Size of deposited structures also grows

Current trends: Protein families are getting bigger
Analysis of an individual structure
Analysis of a whole family

Protein family structures and their analysis
§Comparison of protein family members
§Different species
§Different substituents
§Mutations
§Active and inactive forms
§Firm (conserved) and flexible regions
§Binding of ligands

Protein family structures and their analysis: How to do it?
Cytochrome P450 (protein family 1.10.630.10)
Aldolase class I (protein family 3.20.20.70)

Protein family structures and their analysis: How to do it?
Insight into protein family: Secondary structure 2D diagrams
Cytochrome P450 (protein family 1.10.630.10), multi image without ligands
Aldolase class I (protein family 3.20.20.70)

Protein family structures and their analysis
Secondary structure utilization: necessary steps
§Detection
§Annotation
§Visualization

Visualization of secondary structure in 2D: Solved in the past? Not for protein families!
ISSUE 1: Similar proteins have different 2D diagrams
Example: 1TQN and 1OG2 (RMSD: 2.295 Å); diagrams from HERA, PDBe

ISSUE 2: Secondary structure elements that are close in 2D diagrams are far apart in reality
Example: 1TQN; diagrams from HERA, PDBe

ISSUE 3: 2D diagrams do not reflect the shape of a protein
Example: 1ORW; diagram from HERA

Protein family based 2D diagrams: How to get them?
§Input:
  §CATH: Protein Structure Classification Database at UCL
  §RCSB PDB
§Step 1: Detection & annotation
  §Find secondary structure elements (SSEs)
  §Annotate them
§Step 2: Statistics
  §Average length of each SSE
  §Average occurrence of each SSE

Protein family based 2D diagrams: How to get them?
§Step 3: Construct the 2D diagram
  §Group all β-strands into sheets.
  §Divide the helices and sheets into primary (common to most of the domains) and secondary (the remaining ones).
  §Place all primary helices and sheets into the 2D diagram.
  §Adjust the angles of the primary helices and sheets.
  §Add all secondary helices and sheets into the 2D diagram.
  §Adjust the angles of the secondary helices and sheets.
§Step 4: Draw the 2D diagrams

Protein family 2D diagrams: 2DProts database
https://2dprots.ncbr.muni.cz

2DProts outputs: 2D diagram of a protein domain

2DProts outputs: Multiple 2D diagram of the protein domains in a family
Variants shown: with opacity, no opacity

Superfamily: Dipeptidylpeptidase IV (2.140.10.30)
Panels: PROTEIN and PROTEIN FAMILY diagrams; current solution (HERA, CATH) versus 2DProts

Superfamily: Rhodopsin 7-helix transmembrane proteins (1.20.1070.10)
Panels: PROTEIN and PROTEIN FAMILY diagrams; current solution (HERA, CATH) versus 2DProts

Superfamily: Aldolase class I (3.20.20.70)
Panels: PROTEIN and PROTEIN FAMILY diagrams; current solution (HERA, CATH) versus 2DProts

2DProts integration into CATH

2DProts integration into OverProt
https://overprot.ncbr.muni.cz

Publications
Sillitoe, I., ..., Berka, K., Hutařová Vařeková, I., Svobodová, R., et al. (2021). CATH: increased structural coverage of functional space. Nucleic Acids Research, 49(D1), D266-D273.
Hutařová Vařeková, I., Hutař, J., Midlik, A., Horský, V., Hladká, E., Svobodová, R., & Berka, K. (2021). 2DProts: database of family-wide protein secondary structure diagrams. Bioinformatics, 37(23), 4599-4601.

2DProts: Coloring by structure properties
Example: Occurrence of secondary structures
Porin, family 2.40.160.10
Cytochrome reductase, family 2.140.10.30

2DProts: Integration of ligands
PDB ID 2bgn, domain A00
Cytochrome reductase, family 2.140.10.30

2DProts: Integration of ligands
OMPF Porin, PDB ID 2zfg, domain A00
Porin, family 2.40.160.10

2DProts: 2D diagrams for proteins
Hemoglobin, PDB ID 1v4w
Pseudomonas aeruginosa lectin II, PDB ID 1gzt

2DProts: Integration of AlphaFoldDB
E. coli PapC protein, C-terminal domain, family 2.60.40.2070
Structures from PDB
Structures from AlphaFoldDB
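
Sketch: SSE statistics and the primary/secondary split
Step 2 of the diagram construction (average length and occurrence of each SSE) and the primary/secondary division in Step 3 can be illustrated with a small Python sketch. This is not the 2DProts implementation. The function names, the SSE labels (H1, E1, ...), the toy family and the 0.5 threshold standing in for "common to most of the domains" are all assumptions made for illustration.

```python
from collections import defaultdict


def sse_statistics(domains):
    """Per-SSE statistics over a family.

    domains: {domain_id: {sse_label: length_in_residues}}
    (hypothetical input format, assumed for this sketch).
    Returns {sse_label: {"occurrence": fraction of domains containing it,
                         "avg_length": mean length where it occurs}}.
    """
    lengths = defaultdict(list)
    for sses in domains.values():
        for label, length in sses.items():
            lengths[label].append(length)
    n_domains = len(domains)
    return {
        label: {
            "occurrence": len(vals) / n_domains,
            "avg_length": sum(vals) / len(vals),
        }
        for label, vals in lengths.items()
    }


def split_primary_secondary(stats, threshold=0.5):
    """Primary SSEs occur in more than `threshold` of the domains.

    The threshold value is an assumption, not a documented 2DProts setting.
    """
    primary = sorted(l for l, s in stats.items() if s["occurrence"] > threshold)
    secondary = sorted(l for l, s in stats.items() if s["occurrence"] <= threshold)
    return primary, secondary


# Hypothetical toy family of three domains with annotated SSE lengths.
family = {
    "dom1": {"H1": 12, "H2": 8, "E1": 5},
    "dom2": {"H1": 14, "E1": 7},
    "dom3": {"H1": 10, "H2": 10, "E1": 6, "H3": 4},
}

stats = sse_statistics(family)
primary, secondary = split_primary_secondary(stats)
print(primary, secondary)
```

In this toy family H1, H2 and E1 fall into the primary group and H3 into the secondary one, so a layout routine would place and orient H1, H2 and E1 first and add H3 afterwards.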