AlphaFind: Discover structure similarity across the entire known proteome

PROCHÁZKA, David, Terézia SLANINÁKOVÁ, Jaroslav OĽHA, Adrián ROŠINEC, Katarína GREŠOVÁ et al.

Basic information

Original title

AlphaFind: Discover structure similarity across the entire known proteome

Authors

Edition

2024

Publisher

bioRxiv

Additional information

Result type

Research report

Publisher's country

Czech Republic

Confidentiality

not subject to state or trade secrecy

Links

AlphaFind web application; Pre-print

Organizational unit

Faculty of Informatics

Keywords in English

Protein structure similarity;Learned metric index;Learned indexing;Protein structure search;AlphaFold DB

Tags

DISA, learned indexing, LMI, protein structures

Flags

International significance

Changed: 19 February 2024, 10:17, RNDr. Terézia Slanináková

Abstract

In the original

AlphaFind is a web-based search engine that provides fast structure-based retrieval in the entire set of AlphaFold DB structures. Unlike other protein processing tools, AlphaFind is focused entirely on tertiary structure, automatically extracting the main 3D features of each protein chain and using a machine learning model to find the most similar structures. This indexing approach and the 3D feature extraction method used by AlphaFind have both demonstrated remarkable scalability to large datasets as well as to large protein structures. The web application itself has been designed with a focus on clarity and ease of use. The searcher accepts any valid Uniprot ID, PDB ID or gene symbol as input, and returns a set of similar protein chains from AlphaFold DB, including various similarity metrics between the query and each of the retrieved results. In addition to the main search functionality, the application provides 3D visualizations of protein structure superpositions in order to allow researchers to instantly analyze the structural similarity of the retrieved results. The AlphaFind web application is available online for free and without any registration at https://alphafind.fi.muni.cz.

Related projects

GF23-07040K, R&D project

Name: Learned indexes for similarity search

Investor: Czech Science Foundation, Learned indexes for similarity search, Lead Agency

LM2023055, R&D project

Name: Czech National Infrastructure for Biological Data

Investor: Ministry of Education, Youth and Sports of the Czech Republic, ELIXIR-CZ: Czech National Infrastructure for Biological Data

721/2023, MU internal code

Name: Searching large protein sets by the similarity of their structures, built on a learned metric index

Investor: CESNET, Searching large protein sets by the similarity of their structures, built on a learned metric index

90254, large research infrastructure

Name: e-INFRA CZ II

Cite

PROCHÁZKA, David, Terézia SLANINÁKOVÁ, Jaroslav OĽHA, Adrián ROŠINEC, Katarína GREŠOVÁ, Miriama JÁNOŠOVÁ, Jakub ČILLÍK, Jana PORUBSKÁ, Radka SVOBODOVÁ, Vlastislav DOHNAL and Matej ANTOL. AlphaFind: Discover structure similarity across the entire known proteome. bioRxiv, 2024.

@misc{2375820,
   author = {Procházka, David and Slanináková, Terézia and Oľha, Jaroslav and Rošinec, Adrián and Grešová, Katarína and Jánošová, Miriama and Čillík, Jakub and Porubská, Jana and Svobodová, Radka and Dohnal, Vlastislav and Antol, Matej},
   keywords = {Protein structure similarity;Learned metric index;Learned indexing;Protein structure search;AlphaFold DB},
   publisher = {bioRxiv},
   title = {AlphaFind: Discover structure similarity across the entire known proteome},
   url = {https://alphafind.fi.muni.cz/},
   year = {2024}
}

TY  - GEN
ID  - 2375820
AU  - Procházka, David
AU  - Slanináková, Terézia
AU  - Oľha, Jaroslav
AU  - Rošinec, Adrián
AU  - Grešová, Katarína
AU  - Jánošová, Miriama
AU  - Čillík, Jakub
AU  - Porubská, Jana
AU  - Svobodová, Radka
AU  - Dohnal, Vlastislav
AU  - Antol, Matej
PY  - 2024
TI  - AlphaFind: Discover structure similarity across the entire known proteome
PB  - bioRxiv
KW  - Protein structure similarity;Learned metric index;Learned indexing;Protein structure search;AlphaFold DB
UR  - https://alphafind.fi.muni.cz/
N2  - AlphaFind is a web-based search engine that provides fast structure-based retrieval in the entire set of AlphaFold DB structures. Unlike other protein processing tools, AlphaFind is focused entirely on tertiary structure, automatically extracting the main 3D features of each protein chain and using a machine learning model to find the most similar structures. This indexing approach and the 3D feature extraction method used by AlphaFind have both demonstrated remarkable scalability to large datasets as well as to large protein structures. The web application itself has been designed with a focus on clarity and ease of use. The searcher accepts any valid Uniprot ID, PDB ID or gene symbol as input, and returns a set of similar protein chains from AlphaFold DB, including various similarity metrics between the query and each of the retrieved results. In addition to the main search functionality, the application provides 3D visualizations of protein structure superpositions in order to allow researchers to instantly analyze the structural similarity of the retrieved results. The AlphaFind web application is available online for free and without any registration at https://alphafind.fi.muni.cz.
ER  -

PROCHÁZKA, David, Terézia SLANINÁKOVÁ, Jaroslav OĽHA, Adrián ROŠINEC, Katarína GREŠOVÁ, Miriama JÁNOŠOVÁ, Jakub ČILLÍK, Jana PORUBSKÁ, Radka SVOBODOVÁ, Vlastislav DOHNAL and Matej ANTOL. \textit{AlphaFind: Discover structure similarity across the entire known proteome}. bioRxiv, 2024.
