Hammock: a hidden Markov model-based peptide clustering
algorithm to identify protein-interaction consensus motifs in
large datasets

KREJČÍ, Adam, TR HUPP, Matej LEXA, Bořivoj VOJTĚŠEK and Petr MÜLLER. Hammock: a hidden Markov model-based peptide clustering algorithm to identify protein-interaction consensus motifs in large datasets. Bioinformatics. Oxford: Oxford University Press, 2016, vol. 32, No 1, p. 9-16. ISSN 1367-4803. Available from: https://dx.doi.org/10.1093/bioinformatics/btv522.

Basic information
Original name	Hammock: a hidden Markov model-based peptide clustering algorithm to identify protein-interaction consensus motifs in large datasets
Authors	KREJČÍ, Adam (203 Czech Republic), TR HUPP (826 United Kingdom of Great Britain and Northern Ireland), Matej LEXA (703 Slovakia, belonging to the institution), Bořivoj VOJTĚŠEK (203 Czech Republic) and Petr MÜLLER (203 Czech Republic).
Edition	Bioinformatics, Oxford, Oxford University Press, 2016, 1367-4803.

Other information
Original language	English
Type of outcome	Article in a journal
Field of Study	10201 Computer sciences, information science, bioinformatics
Country of publisher	United Kingdom of Great Britain and Northern Ireland
Confidentiality degree	is not subject to a state or trade secret
Impact factor	7.307
RIV identification code	RIV/00216224:14330/16:00089377
Organization unit	Faculty of Informatics
Doi	http://dx.doi.org/10.1093/bioinformatics/btv522
UT WoS	000368357800002
Keywords in English	phage display; sequence logo; clustering
Tags	International impact, Reviewed
Changed by	doc. Ing. Matej Lexa, Ph.D., učo 31298. Changed: 13/3/2018 14:02.

Abstract

Motivation: Proteins often recognize their interaction partners on the basis of short linear motifs located in disordered regions on the protein surface. Experimental techniques that study such motifs use short peptides to mimic the structural properties of interacting proteins. Continued development of these methods allows for large-scale screening, resulting in vast amounts of peptide sequences, potentially containing information on multiple protein-protein interactions. Processing of such datasets is a complex but essential task for large-scale studies investigating protein-protein interactions.

Results: The software tool presented in this article is able to rapidly identify multiple clusters of sequences carrying shared specificity motifs in massive datasets from various sources and generate multiple sequence alignments of identified clusters. The method was applied to a previously published smaller dataset containing distinct classes of ligands for SH3 domains, as well as to a new dataset, an order of magnitude larger, containing epitopes for several monoclonal antibodies. The software successfully identified clusters of sequences mimicking epitopes of antibody targets, as well as secondary clusters revealing that the antibodies accept some deviations from original epitope sequences. Another test indicates that processing of even much larger datasets is computationally feasible.

