Advanced Search Techniques for Large Scale Data Analytics
Pavel Zezula and Jan Sedmidubsky
Masaryk University
http://disa.fi.muni.cz

For the following graph:
[Figure: directed graph on nodes A, B, C; the edges are not recoverable from the extracted text]
- Compute the PageRank of each page, assuming no taxation.

Pavel Zezula, Jan Sedmidubsky. Advanced Search Techniques for Large Scale Data Analytics (PA212)

For the following graph:
[Figure: directed graph on nodes 1 to 6; the edges are not recoverable from the extracted text]
1) Set up the PageRank equations, assuming β = 0.8.
2) Order the nodes by PageRank, from lowest to highest.

For the following graph:
[Figure: directed graph on nodes A, B, C, D; the edges are not recoverable from the extracted text]
- Assuming β = 0.8, compute the topic-sensitive PageRank for the following teleport sets:
1) {A}
2) {A, C}

Suppose we apply the BALANCE algorithm, with bids of 0 or 1 only, to a situation where:
- Advertiser A bids on query words x and y.
- Advertiser B bids on query words x and z.
- Both have a budget of $2.
Decide whether each of the following query sequences is certainly handled optimally by the algorithm:
1) yzyy
2) xyyz
3) xyzx

- A bookstore has enough ratings to use a more advanced recommendation system.
- Suppose the mean rating of books is 3.4 stars.
- Alice has rated 350 books, and her average rating is 0.4 stars higher than the average user's rating.
- Animal Farm is a book in the bookstore with 250,000 ratings, whose average rating is 0.7 stars higher than the global average.
- What is a baseline estimate of Alice's rating for Animal Farm?
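
The bookstore exercise above can be checked with a few lines of Python. This is a minimal sketch, assuming the usual additive form of a baseline estimate (global mean plus the user's deviation plus the item's deviation). The function name `baseline_estimate` and its parameter names are illustrative and do not come from the slides.

```python
def baseline_estimate(global_mean, user_deviation, item_deviation):
    """Assumed additive baseline: global mean + user deviation + item deviation."""
    return global_mean + user_deviation + item_deviation


# Values taken from the exercise: mean 3.4, Alice +0.4, the book +0.7.
estimate = baseline_estimate(3.4, 0.4, 0.7)
print(round(estimate, 2))
```

Note that the number of ratings (350 and 250,000) does not enter this simple form; it only indicates that both deviations are based on enough data to be trusted.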
Computers A, B and C have the following features:

Feature            A      B      C
Processor speed    3.06   2.68   2.92
Disk size          500    320    640
Main-memory size   6      4      6

- Treating the features as a vector for each computer (e.g., A's vector is [3.06, 500, 6]), we can compute the cosine distance between any two vectors.
- Scaling the dimensions can give some components more weight than others.
- Assume a scale factor of 1 for processor speed, α for disk size, and β for main-memory size, and compute:
1) The cosines of the angles between each pair of vectors (in terms of α and β)
2) The angles between the vectors if α = β = 1

A user has rated the three computers as follows:
- A: 4 stars, B: 2 stars, C: 5 stars
Tasks:
1) Normalize the ratings for this user.
2) Compute a user profile for the user, using the same three features (processor speed, disk size, main-memory size) as in the table above.
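
The two computer exercises above can be explored numerically with the sketch below. It rests on two assumptions that the slides do not state: ratings are normalized by subtracting the user's mean rating, and each component of the user profile is the average, over the rated computers, of the normalized rating times the feature value. All function and variable names are illustrative.

```python
import math

# Feature vectors [processor speed, disk size, main-memory size] from the table.
computers = {
    "A": [3.06, 500.0, 6.0],
    "B": [2.68, 320.0, 4.0],
    "C": [2.92, 640.0, 6.0],
}


def scaled(vec, alpha, beta):
    """Apply scale factors 1, alpha and beta to the three components."""
    return [vec[0], alpha * vec[1], beta * vec[2]]


def cosine(u, v):
    """Cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pairwise_angles(alpha=1.0, beta=1.0):
    """Angles in degrees between every pair of scaled computer vectors."""
    names = sorted(computers)
    angles = {}
    for i, p in enumerate(names):
        for q in names[i + 1:]:
            c = cosine(scaled(computers[p], alpha, beta),
                       scaled(computers[q], alpha, beta))
            # Clamp to guard against rounding slightly above 1.0 before acos.
            angles[(p, q)] = math.degrees(math.acos(min(1.0, c)))
    return angles


def normalize(ratings):
    """Assumed normalization: subtract the user's mean rating."""
    mean = sum(ratings.values()) / len(ratings)
    return {name: r - mean for name, r in ratings.items()}


def user_profile(ratings):
    """Assumed convention: per-feature average of normalized rating * value."""
    norm = normalize(ratings)
    return [
        sum(norm[name] * computers[name][k] for name in norm) / len(norm)
        for k in range(3)
    ]


ratings = {"A": 4, "B": 2, "C": 5}
print(pairwise_angles())       # angles for alpha = beta = 1
print(normalize(ratings))
print(user_profile(ratings))
```

With α = β = 1 the disk-size component dominates every vector, so all three angles come out very small. Passing other values to `pairwise_angles` shows how scaling changes which computers look similar.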