Design of a system for the analysis of social media content

Dalibor Toth (xtoth2@fi.muni.cz) & Jaroslav Ráček (Jaroslav.Racek@ibacz.eu)
Faculty of Informatics, Masaryk University & IBA CZ, Brno, Czech Republic

Introduction

Active users of social media produce large volumes of data every day. This data may contain patterns and expressions of sentiment that are of value to both commerce and academia. Many algorithms and methods for analysing social media data exist, but no single platform unites these tools and services.

• Chronologically arranged posts vs. threads of posts
• Expressions of opinions, attitudes and reactions that can be viewed by other users
• Indirect conversation with the benefit of hindsight
• Direct and indirect conversations, sharing of profile content
• Mainly short messages, no general topic
• Long main article and related short comments
• General or focused on a narrow area of interest
• Different purposes: personal diary, significant life events, information about interests, company sites
• Interactive (commented) or static sites

Methodology

The solution is the design of a single system with four layers that integrates all required elements. The layers are: collection of data from different sources; data management and storage; application of analytical algorithms focused on relevant aspects; and visualisation of analytical results and source data.

• Integration of all components in a single platform
• Provides human and software interfaces for visualisation of results and access to retrieved data
• Specific configuration for different uses of the system
• Adapters for different data sources
• Specialized solutions for managing and indexing massive amounts of data
• Possibility of initial filtering of data
• Essential task for text analysis
• Identification of key terms contained in a source text
• “Social” network of users, groups, similar behaviour
• General part of any specific analytical tool
• Visualisation of analysis results in the form of different types of graphs and record details
• Identification of multiple identities as the same person
• Using behaviour patterns, timing of posts, typos, phrases
• Evaluation of text combinations by comparison to records in a knowledge base

Conclusion

This paper introduces the design of a new component-based system for social media analysis, which can be easily configured for use in both academic and commercial environments in traditional areas of social media analysis. The system was designed on the basis of applied research in cooperation with industrial partners of FI MU (Faculty of Informatics, Masaryk University).
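The four-layer design described above (source adapters, storage with initial filtering, an analytical module, and visualisation) can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Post`, `SourceAdapter`, `InMemoryAdapter`, `PostStore`, `key_terms`, `render_bar_chart` and `run_pipeline` are hypothetical, the adapter serves canned posts instead of calling a real social media API, and key-term identification is reduced to a simple word-frequency count with an assumed stopword list.

```python
import re
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Iterable, List, Protocol, Tuple


@dataclass
class Post:
    source: str
    author: str
    text: str


class SourceAdapter(Protocol):
    """Layer 1: one adapter per data source (hypothetical interface)."""

    def fetch(self) -> Iterable[Post]: ...


class InMemoryAdapter:
    """Stand-in adapter that serves canned posts instead of a real API."""

    def __init__(self, source: str, rows: List[Tuple[str, str]]):
        self.source = source
        self.rows = rows

    def fetch(self) -> Iterable[Post]:
        return [Post(self.source, author, text) for author, text in self.rows]


class PostStore:
    """Layer 2: storage, with optional initial filtering on ingest."""

    def __init__(self, keep: Callable[[Post], bool] = lambda post: True):
        self.keep = keep
        self.posts: List[Post] = []

    def ingest(self, adapter: SourceAdapter) -> None:
        self.posts.extend(p for p in adapter.fetch() if self.keep(p))


# Assumed, deliberately tiny stopword list for the example.
STOPWORDS = {"the", "a", "is", "and", "this", "but", "of", "to"}


def key_terms(posts: Iterable[Post], top: int = 3) -> List[Tuple[str, int]]:
    """Layer 3: naive key-term identification by word frequency."""
    words = (w for p in posts for w in re.findall(r"[a-z]+", p.text.lower()))
    counts = Counter(w for w in words if w not in STOPWORDS)
    return counts.most_common(top)


def render_bar_chart(term_counts: List[Tuple[str, int]]) -> str:
    """Layer 4: text-only visualisation of the analysis result."""
    return "\n".join(f"{term:<10} {'#' * n}" for term, n in term_counts)


def run_pipeline(adapters, keep=lambda post: True, top=3):
    """Integration: wire the layers together for one configuration."""
    store = PostStore(keep)
    for adapter in adapters:
        store.ingest(adapter)
    return key_terms(store.posts, top)


if __name__ == "__main__":
    forum = InMemoryAdapter("forum", [("ann", "The battery is great"),
                                      ("bob", "Battery life is short")])
    blog = InMemoryAdapter("blog", [("cy", "Great screen and great battery")])
    print(render_bar_chart(run_pipeline([forum, blog], top=2)))
```

Keeping each layer behind a small interface means a different configuration (another adapter, a stricter `keep` filter, a different analytical function) can be swapped in without touching the other layers, which is the kind of per-use configuration the design calls for.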