D 2015

Retrieval, visualization and validation of affinities between documents

TRIGO, Luis, Martin VÍTA, Rui SARMENTO and Pavel BRÁZDIL

Basic information

Original name

Retrieval, visualization and validation of affinities between documents

Authors

TRIGO, Luis (620 Portugal), Martin VÍTA (203 Czech Republic, guarantor, belonging to the institution), Rui SARMENTO (620 Portugal) and Pavel BRÁZDIL (203 Czech Republic)

Edition

Lisbon; Portugal, Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2015) - Volume 3: KMIS, p. 452-459, 8 pp. 2015

Publisher

SciTePress

Other information

Language

English

Type of outcome

Proceedings paper

Field of Study

10201 Computer sciences, information science, bioinformatics

Country of publisher

Portugal

Confidentiality degree

is not subject to a state or trade secret

Publication form

printed version "print"

RIV identification code

RIV/00216224:14330/15:00087400

Organization unit

Faculty of Informatics

ISBN

978-989-758-158-8

Keywords in English

Affinity network; Centrality measures; Comparison of rankings; Graph-based representation of documents; Information retrieval; Knowledge artifacts
Changed: 3/5/2016 14:55, RNDr. Pavel Šmerk, Ph.D.

Abstract

In the original

We present an Information Retrieval tool that helps users search for particular information that is of interest to them. Our system processes a given set of documents to produce a graph in which nodes represent documents and links represent the similarities between them. The aim is to offer the user a tool for navigating this space easily; nodes can be collapsed and expanded. Our case study shows affinity groups based on similarities in the text production of researchers. This goes beyond the already established communities revealed by co-authorship. The system characterizes the activity of each author by a set of automatically generated keywords and by membership in a particular affinity group. The importance of each author is highlighted visually by the size of the node corresponding to the number of publications and different measures of centrality. Regarding the validation of the method, we analyse the impact of using different combinations of titles, abstracts and keywords on capturing the similarity between researchers.
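
The pipeline outlined in the abstract (documents, pairwise similarities, a graph, centrality) can be illustrated with a minimal sketch. The abstract does not state which similarity measure, edge threshold or centrality measures the authors use, so everything here is an assumption chosen for illustration: TF-IDF weighting with cosine similarity, a hypothetical `threshold` parameter for keeping links, and degree centrality as one possible measure. The function names are likewise hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document.

    Assumption: whitespace tokenization and raw term frequency times
    log(N / document frequency); the paper may use a different scheme.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: c * math.log(n_docs / df[t]) for t, c in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def affinity_graph(docs, threshold=0.1):
    """Build an undirected graph as a dict {(i, j): similarity} with i < j.

    Nodes are document indices; a link is kept only when the similarity
    reaches `threshold` (an illustrative cut-off, not taken from the paper).
    """
    vectors = tfidf_vectors(docs)
    edges = {}
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            sim = cosine(vectors[i], vectors[j])
            if sim >= threshold:
                edges[(i, j)] = sim
    return edges


def degree_centrality(n_nodes, edges):
    """Degree of each node divided by the maximum possible degree."""
    degree = [0] * n_nodes
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    denom = max(n_nodes - 1, 1)
    return [d / denom for d in degree]
```

In a setting like the case study, each "document" could be the concatenated titles, abstracts or keywords of one researcher, so that the choice of input text (the factor the validation examines) changes only what is passed as `docs`.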