Future Trends in Similarity Searching
Pavel Zezula
Masaryk University, Brno, Czech Republic
SISAP 2012, Toronto, Canada, August 9-10

Outline of the talk
• Introduction
• Many faces of similarity
• Real life and digital similarity
• On the importance of searching
• Current technology and its limitations
• Similarity search computing services
• SISAP after 5 years

Introduction
• Honor and responsibility of keynote talks
• Technical talk versus a conceptual one
• SISAP is a narrow community, and I am a part of it
• Most of my technical knowledge is summarized in the Similarity Search book
• My objectives: current experience, where to go, how to proceed
• Presented observations reflect my personal experience with: similarity, search, applications

Metric Searching technology
• Hanan Samet: Foundations of Multidimensional and Metric Data Structures. Morgan Kaufmann, 2006
• P. Zezula, G. Amato, V. Dohnal, and M. Batko: Similarity Search: The Metric Space Approach. Springer, 2006

Real-life Similarity
• Are they similar?

Real-Life Motivation
The social psychology view
• Any event in the history of an organism is, in a sense, unique.
• Recognition, learning, and judgment presuppose an ability to categorize stimuli and classify situations by similarity.
• Similarity (proximity, resemblance, communality, representativeness, psychological distance, etc.) is fundamental to theories of perception, learning, judgment, etc.

Contemporary Networked Media
The digital data view
• Almost everything that we see, read, hear, write, measure, or observe can be digital.
• Users autonomously contribute to the production of global media, and the growth is exponential.
• Sites like Flickr, YouTube, and Facebook host user-contributed content for a variety of events.
• The elements of networked media are related by numerous multi-faceted links of similarity.

Challenge
• Networked media is getting close to the human “fact-bases”; the gap between the physical and the digital has blurred.
• Similarity data management is needed to connect, search, filter, merge, relate, rank, cluster, classify, identify, or categorize objects across various collections.
WHY? It is similarity that is revealing in the world.

Similarity & Geometry
• Figures that have the same shape but not necessarily the same size are similar figures.
• Any two line segments are similar.
• Any two circles are similar.
[figure labels: A, B, C, D]

Similarity & Geometry
• Any two squares are similar.
• Any two equilateral triangles are similar.

Similarity & Geometry
• Definition: two polygons are similar to each other if:
  1. Their corresponding angles are equal.
  2. The lengths of their corresponding sides are proportional.

Similarity & Geometry
• Example:
•