Prologue

CZ.1.07/2.2.00/28.0041 Centrum interaktivních a multimediálních studijních opor pro inovaci výuky a efektivní učení
(Centre of interactive and multimedia study materials for the innovation of teaching and effective learning)

You should spend most of your time thinking about what you should think about most of your time.

RANDOMIZED ALGORITHMS AND PROTOCOLS - 2020

WEB PAGE of the LECTURE: http://www.fi.muni.cz/usr/gruska/random20
Prof. Jozef Gruska, DrSc
Wednesday, 10.00-11.40, B410

FINAL EXAM: You need to answer four questions out of five given to you.
CREDIT (ZAPOČET): You need to answer three questions out of five given to you.

EXERCISES/TUTORIALS
EXERCISES/TUTORIALS: Thursdays 14.00-15.40, C525
TEACHER: RNDr. Matej Pivoluška, PhD
Language: English
NOTE: Exercises/tutorials are not obligatory.

CONTENTS - preliminary
■ Basic concepts and examples of randomized algorithms
■ Types and basic design methods for randomized algorithms
■ Basics of probability theory
■ Simple methods for design of randomized algorithms
■ Game theory and analysis of randomized algorithms
■ Basic techniques I: moments and deviations
■ Basic techniques II: tail probabilities inequalities
■ Probabilistic method I
■ Markov chains - random walks
■ Algebraic techniques - fingerprinting
■ Fooling the adversary - examples
■ Randomized cryptographic protocols
■ Randomized proofs
■ Probabilistic method II
■ Quantum algorithms

LITERATURE - Part I
■ R. Motwani, P. Raghavan: Randomized algorithms, Cambridge University Press, UK, 1995
■ J. Gruska: Foundations of computing, International Thompson Computer Press, USA, 715 pages, 1997
■ J. Hromkovič: Design and analysis of randomized algorithms, Springer, 275 pages, 2005
■ N. Alon, J. H. Spencer: The probabilistic method, Wiley-Interscience, 2008

Chapter 9. RANDOM WALKS - MARKOV CHAINS/MODELS

Random walks on graphs are a very simple, interesting and fundamental tool with surprisingly many applications in informatics and also in mathematics and natural sciences. Design and analysis of randomized algorithms is one of them.

The concept of a random walk is closely related to that of a Markov chain (model) - one of the key concepts of discrete stochastic processes.

BASIC CONCEPTS I

Notation: Let G = (V, E) be a connected and undirected graph with n nodes (in V) and m edges (in E). For a node v ∈ V, let $\Gamma_G(v)$ denote the set of neighbors of v in G.

A random walk on G is the following sequence of moves of a process that starts in some (initial) node $v_0$: a neighbor $v_1$ of $v_0$ is chosen from $\Gamma_G(v_0)$, randomly and independently, and the process moves (walks) from $v_0$ to $v_1$. Afterwards a neighbor $v_2$ of $v_1$ is chosen, again randomly and independently, and the process walks/moves from $v_1$ to $v_2$. The process then continues to walk/move, in the same way, from the current node to a randomly chosen neighbouring one, until, for some reason, it stops moving, or it moves again and again, even for ever.

Typical problems to explore concerning a walk on a given graph G are:
■ What is the expected number of steps to get from a given node u to a given node v?
■ What is the expected number of steps needed to visit all nodes of G at least once when starting in a given node u?

Figure: simple hypercubes (up to the 6-dimensional hypercube).
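To make these two quantities concrete, a minimal simulation sketch in Python follows; the small example graph, its adjacency-list encoding and the function names are illustrative choices, not part of the lecture.

```python
import random

# A small undirected graph given as an adjacency list (illustrative example).
graph = {
    0: [1, 2],
    1: [0, 2, 3],
    2: [0, 1, 3],
    3: [1, 2],
}

def hitting_time(graph, u, v, trials=10_000):
    """Estimate the expected number of steps of a random walk from u to v."""
    total = 0
    for _ in range(trials):
        node, steps = u, 0
        while node != v:
            node = random.choice(graph[node])  # move to a random neighbor
            steps += 1
        total += steps
    return total / trials

def cover_time(graph, u, trials=10_000):
    """Estimate the expected number of steps to visit all nodes, starting from u."""
    total = 0
    for _ in range(trials):
        node, visited, steps = u, {u}, 0
        while len(visited) < len(graph):
            node = random.choice(graph[node])
            visited.add(node)
            steps += 1
        total += steps
    return total / trials

print(hitting_time(graph, 0, 3))
print(cover_time(graph, 0))
```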
DRUNKARD'S WALK EXAMPLE

Let a drunken seaman walk on a linear, both sides infinite, graph/pathway, each time making a step right or left, with the same probability.

What are the probabilities for such a drunken man to be in a particular position after some steps, in case he starts in some fixed initial position?

EXAMPLE

Let
■ $G = K_n$ be the complete graph of n nodes,
■ $u \ne v$ be any two vertices of G.

It holds:
■ The expected number of steps of a random walk that begins in u and ends when it reaches v for the first time is ???????

EXAMPLE - to finish

Let $G = K_n$ be the complete graph of n nodes. It holds:

■ The expected number of steps of a random walk in $K_n$ that begins in a fixed node u and ends when first reaching a fixed node v is $n - 1$. Indeed, at each step the walk moves to one of the other $n - 1$ nodes chosen uniformly, so v is reached in each step with probability $p = \frac{1}{n-1}$; the number of steps is therefore geometrically distributed with expectation $\frac{1}{p} = n - 1$.
■ The expected number of steps to visit all nodes of G, starting from any node u, is $(n-1)H_n$, where $H_n$ is the so-called harmonic number
  $H_n = \sum_{i=1}^{n} \frac{1}{i} = \Theta(\ln n).$

A RELATED PROBLEM - COUPON COLLECTION

Coupon collector problem: There are n types of coupons and at each time a coupon is chosen randomly and returned. It has been shown that the average number of trials needed to have a coupon of each type is $nH_n$.

EXAMPLE

Let us consider the graph $K_3$ with the initial probabilities of its three nodes, say $A_0, A_1, A_2$, being $p_0, p_1, p_2$, and let 1/2 be the probability of the transition through any edge.

If $p_i$ is the initial probability of one of the above nodes, then the probability of being in the same node after one step is
$p_i^{(1)} = (1 - p_i)\frac{1}{2} = \frac{1}{2} - \frac{p_i}{2},$
after two steps it is
$p_i^{(2)} = \left(1 - p_i^{(1)}\right)\frac{1}{2} = \frac{1}{2} - \frac{1}{4} + \frac{p_i}{4},$
and after j steps the probability is
$p_i^{(j)} = \sum_{k=1}^{j} \frac{(-1)^{k+1}}{2^k} + (-1)^j \frac{p_i}{2^j}.$
Therefore
$\lim_{j \to \infty} p_i^{(j)} = \frac{1}{3}.$

EXAMPLE - 2-SATISFIABILITY

A simple polynomial-time (Monte Carlo) algorithm will be given to solve the 2-satisfiability problem (2-SAT):

1. Start with an arbitrary assignment.
2. While there is an unsatisfied clause C, choose randomly one of the two literals of C and complement its value in the current assignment.

Theorem: The expected number of steps of the above algorithm to find a satisfying assignment is $O(n^2)$ (where n is the number of variables).

RELATION TO A RANDOM WALK ON THE LINE

Let A be a particular satisfying assignment. The progress of the above algorithm can be represented by a particle moving between the integers {0, 1, ..., n} on the real line. The position of the particle will always indicate how many variables in the current assignment have the correct value (as in A).

Crucial observation: In an unsatisfied clause at least one of the two literals has an incorrect value. Therefore, at each step of the algorithm, with probability at least 1/2 we increase by one the number of variables having their correct value, and with probability at most 1/2 the number of variables having the correct value is decreased by one.

The motion of the particle therefore resembles a random walk on the line (that is, on the linear graph).
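A minimal sketch of this randomized 2-SAT procedure follows; the clause representation (pairs of signed literals) and the cut-off of 2n^2 steps are illustrative assumptions, not part of the lecture.

```python
import random

def random_walk_2sat(n, clauses, max_steps=None):
    """Try to find a satisfying assignment of a 2-CNF formula.

    n       -- number of variables (numbered 1..n)
    clauses -- list of pairs of literals; literal +i means x_i, -i means not x_i
    Returns an assignment (dict) or None if none was found within max_steps.
    """
    if max_steps is None:
        max_steps = 2 * n * n  # of the order of the O(n^2) expected bound
    assignment = {i: random.choice([False, True]) for i in range(1, n + 1)}

    def satisfied(lit):
        return assignment[abs(lit)] == (lit > 0)

    for _ in range(max_steps):
        unsatisfied = [c for c in clauses if not (satisfied(c[0]) or satisfied(c[1]))]
        if not unsatisfied:
            return assignment                            # all clauses satisfied
        clause = random.choice(unsatisfied)
        lit = random.choice(clause)                      # pick one of the two literals
        assignment[abs(lit)] = not assignment[abs(lit)]  # flip its variable
    return None

# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
print(random_walk_2sat(3, [(1, 2), (-1, 3), (-2, -3)]))
```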
STOCHASTIC PROCESSES in RANDOMIZED ALGORITHMS

A stochastic process P is a sequence of random variables $P = \{X_t\}_{t \ge 1}$, where we think of the values of $X_t$ as values (states) of the process P at time t.

Two types of stochastic processes come up over and over again in the analysis of randomized algorithms:

1. Martingale, where the value of each next variable may depend, even in a complicated way, on the past history, but its expected change, given the past, is 0.
2. Markov chain, where the next state always depends only on the current state and not on the way the chain got there - not on the past history.

The most useful algorithmic property of Markov chains, to be explored in what follows, is their convergence to a fixed (probability) distribution on states.

MARKOV CHAINS - 1st DEFINITION

A Markov chain is a discrete-time stochastic process defined over a set of states S in terms of a matrix P of transition probabilities $P(i,j) = p_{ij}$ - the probability that the next state will be j if the current state is i.

Probability conditions: For any i, j it has to hold $0 \le p_{ij} \le 1$ and $\sum_j p_{ij} = 1$.

Denote by $X_t$ the state of the Markov chain at time t. The stochastic process $\{X_t\}_{t \ge 0}$, specifying the history of the evolution of the Markov chain up to time t, has the following memoryless property: the future behavior of a Markov chain depends on its current state, and not on how the chain arrived at the current state. That is, it holds, for any $t \ge 1$:

$\Pr[X_{t+1} = j \mid X_0 = i_0, X_1 = i_1, \ldots, X_t = i] = \Pr[X_{t+1} = j \mid X_t = i] = p_{ij}.$

NOTE

Markov chains do not need to have prespecified initial states. In general, initial states are chosen according to some (initial) probability distribution $X_0$ over S. Such a distribution is called the initial (probability) distribution.

APPLICATIONS of MARKOV CHAINS

In physical sciences, Markov chains provide a fundamental model for the emergence of global properties from local interactions.

In informatics, random walks provide a general paradigm for the random exploration of exponentially large combinatorial structures (for example graphs), by a sequence of simple and local transitions.

HIDDEN MARKOV MODELS

A Hidden Markov Model (HMM) is a Markov model, with random transitions among states, extended by random outputs of the states.

A HMM works in such a way that an observer can see only a sequence of the states' outputs and not the internal structure (states, transition and emission probabilities) of the underlying Markov model.

Hidden Markov Models have a lot of applications, especially in artificial intelligence. For example, almost all fast speech and pattern recognition systems use HMM.

DEVELOPMENT of MARKOV MODEL

The concept of a Markov chain/model was introduced by Andrey Andreevich Markov in 1906, in order to consider the case of a sequence of random experiments in which the result of any experiment depends also on the result of the previous experiment.

Before that, only such sequences of random experiments were considered where the result of each random experiment was fully independent of all previous experiments.
UNIVERSALITY of QUANTUM RANDOM WALKS

It can also be shown that any quantum evolution can be seen as a so-called continuous quantum walk.

MARKOV CHAINS - 2nd DEFINITION

A Markov chain/model is a discrete time stochastic process $\{X_k\}_{k \ge 0}$ of random variables with values in a countable set I such that for every $k \ge 1$ and every $i_0, i_1, \ldots, i_k$ from I we have

$\Pr[X_k = i_k \mid X_{k-1} = i_{k-1}, \ldots, X_0 = i_0] = \Pr[X_k = i_k \mid X_{k-1} = i_{k-1}] = p_{i_{k-1} i_k}.$

The matrix $P(i,j) = p_{ij}$ is called the transition matrix of one-step transition probabilities. k-step transition probabilities are defined by

$p^{(k)}_{ij} = \Pr[X_{m+k} = j \mid X_m = i]$

and they do not depend on m. If we define a matrix $P^{(k)}$ by $P^{(k)}(i,j) = p^{(k)}_{ij}$, then (Chapman-Kolmogorov equations)

$P^{(k+m)} = P^{(k)} P^{(m)}.$

The matrix $P^{(k)}$ is said to be the k-step transition matrix.

DETAILS of MARKOV MODEL STEPS

Rows of all k-step transition matrices have non-negative entries and sum up to 1. Such matrices are called stochastic matrices.

The probability distribution over the states of a Markov chain C with n nodes $N_1, \ldots, N_n$ and with a transition n × n matrix P at any given time t is given by a row vector $Q_t = (P(1,t), \ldots, P(n,t))$, where $P(i,t)$ is the probability that the chain is in the state $N_i$ after t steps. $Q_t$ is therefore the distribution vector for time step t, i.e. the probability distribution of the states after t steps.

The distribution vector of such a Markov chain at time t is then given by $Q_t = Q_0 P^t$.

BASIC REACHABILITY CONCEPTS

A state j is called reachable/accessible from the state i if there is a $k \ge 0$ such that $p^{(k)}_{ij} > 0$.

We say that states i and j are mutually reachable if i is reachable from j and vice versa.

A Markov chain is called irreducible if any two of its states are mutually reachable.

GENERAL OBSERVATION

Systems represented by Markov chains change randomly and therefore it is generally impossible to determine the exact state of the system in the future. However, we can often determine various useful statistical properties of Markov chains.

EXAMPLE - EATING HABITS

A creature in a ZOO eats once a day, either grapes, or cheese, or lettuce, according to the following rules:
■ If it ate cheese yesterday, it will not eat it today and will eat lettuce or grapes, each with probability 0.5.
■ If it ate grapes yesterday, it will today eat grapes with probability 0.1, cheese with probability 0.4 and lettuce with probability 0.5.
■ If it ate lettuce yesterday, it will eat grapes with probability 0.4 and cheese with probability 0.6.

These eating habits can be modelled by the Markov model in the next figure.

MARKOV CHAIN for EATING HABITS

A Markov chain is often described by a directed graph with edges labelled by the probabilities of going from one state/node to another one.

Figure: Eating habits of a creature (a three-state chain with the transition probabilities listed above).

One statistical property that can be computed is the percentage of days the creature eats grapes (or cheese); a small computation of this kind is sketched below.
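A minimal sketch, assuming only the three transition rules above, that computes the long-run fraction of days each food is eaten by iterating the distribution vector $Q_t = Q_0 P^t$; the state ordering (grapes, cheese, lettuce) and the use of NumPy are my own choices.

```python
import numpy as np

# States ordered as: 0 = grapes, 1 = cheese, 2 = lettuce.
P = np.array([
    [0.1, 0.4, 0.5],   # yesterday grapes
    [0.5, 0.0, 0.5],   # yesterday cheese (never cheese twice in a row)
    [0.4, 0.6, 0.0],   # yesterday lettuce
])

q = np.array([1.0, 0.0, 0.0])   # start, say, on a "grapes" day
for _ in range(1000):            # iterate q <- q P until it stabilizes
    q = q @ P

print(q)   # approximately the stationary distribution pi, with pi P = pi
```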
EHRENFEST MODEL

There are two urns that, in total, always contain four balls. At each step, one of the balls is chosen at random and moved to the other urn.

If we choose as states the number of balls in the first urn, then the transition matrix, where rows and columns are labelled (from top to bottom and from left to right) 0, 1, 2, 3, 4, looks as follows:

$P = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 \\ 1/4 & 0 & 3/4 & 0 & 0 \\ 0 & 1/2 & 0 & 1/2 & 0 \\ 0 & 0 & 3/4 & 0 & 1/4 \\ 0 & 0 & 0 & 1 & 0 \end{pmatrix}$

P(i,j) is the probability that if the first urn has i balls, then in the next step it will have j balls.

ABSORBING DRUNKARD'S WALK

A drunk man walks on a 5-node (0, 1, 2, 3, 4) linear graph, where the leftmost node (0) is his home and the rightmost one (4) is a bar. If he is at home or in the bar he keeps staying there. Otherwise he moves with probability 1/2 to the left node and with probability 1/2 to the right node.

The transition matrix has therefore the form

$P = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 1/2 & 0 & 1/2 & 0 & 0 \\ 0 & 1/2 & 0 & 1/2 & 0 \\ 0 & 0 & 1/2 & 0 & 1/2 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}$

where rows and columns are labeled by 0, 1, 2, 3, 4 and P(i,j) is the probability that the drunken man goes from the node i to the node j.

ABSORBING MARKOV CHAINS

Definition: A state $s_i$ of a Markov chain is called absorbing if it is impossible to leave it (i.e. $p_{ii} = 1$). A Markov chain is called absorbing if it has at least one absorbing state, and if from any state one can go to an absorbing state in some number of steps.

The drunkard's walk is an example of an absorbing Markov chain.

If the transition matrix of a Markov chain has a absorbing states and t non-absorbing states, one can renumber the states so that the first t rows and columns represent the non-absorbing states and the remaining ones the absorbing states. The matrix then has the following canonical form:

$P = \begin{pmatrix} Q & R \\ 0 & I \end{pmatrix}$

where Q is a t × t matrix, R is a t × a matrix, 0 is an a × t zero matrix and I is an a × a identity matrix, with the first t rows and columns labeled by non-absorbing states. This means that, for any integer n, the top-left t × t block of $P^n$ is $Q^n$.

BASIC CONCEPTS - 1.

Let

$r^{(t)}_{ij} = \Pr[X_t = j \text{ and } X_s \ne j \text{ for } 1 \le s < t \mid X_0 = i].$

Given an initial state $X_0 = i$, the probability that there is a visit to state j at some time $t > 0$ is given by

$f_{ij} = \sum_{t>0} r^{(t)}_{ij}.$

$f_{ii}$ is the probability that the Markov chain will return to the state i at least once when started in the state i. If $f_{ii} = 1$, the state i is called persistent/recurrent; otherwise it is called transient. A Markov chain is recurrent if every one of its states is recurrent. All states of a finite irreducible Markov chain are recurrent.

If $f_{ii} < 1$, then each time the chain is in the state i, with probability $1 - f_{ii}$ it will never return to i again. It therefore holds:

$\Pr[\text{the number of visits to } i \text{ from } i \text{ equals } k] = f_{ii}^k (1 - f_{ii}).$

BASIC CONCEPTS II.

Denote by $h_{ij}$, the hitting time, the expected number of steps needed to visit the state/node j for the first time when starting from the state/node i. Clearly, it holds

$h_{ij} = \sum_{t>0} t \cdot r^{(t)}_{ij}.$

If $h_{ii} < \infty$, then the state i is called positive (non-null) recurrent/persistent; otherwise it is called null-recurrent/persistent.

If a state i is reachable from itself, then the greatest common divisor of the set of positive k's such that $p^{(k)}_{ii} > 0$ is called the period of i and is denoted by $d_i$. If $d_i = 1$, then the state i is said to be aperiodic.

A finite Markov chain all states of which are aperiodic and recurrent is called ergodic.
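A minimal sketch using the absorbing drunkard's walk above: it forms the 5 × 5 transition matrix, checks the Chapman-Kolmogorov identity $P^{(k+m)} = P^{(k)}P^{(m)}$, and reads absorption probabilities off a high power of P; NumPy and the chosen power 200 are my own assumptions.

```python
import numpy as np

# Absorbing drunkard's walk on nodes 0..4 (0 = home, 4 = bar).
P = np.array([
    [1,   0,   0,   0,   0  ],
    [0.5, 0,   0.5, 0,   0  ],
    [0,   0.5, 0,   0.5, 0  ],
    [0,   0,   0.5, 0,   0.5],
    [0,   0,   0,   0,   1  ],
])

# Chapman-Kolmogorov: the (k+m)-step matrix is the product of the k- and m-step ones.
P3 = np.linalg.matrix_power(P, 3)
P4 = np.linalg.matrix_power(P, 4)
P7 = np.linalg.matrix_power(P, 7)
print(np.allclose(P7, P3 @ P4))          # True

# For large n, row i of P^n gives the absorption probabilities from node i.
Pn = np.linalg.matrix_power(P, 200)
print(Pn[2])   # from the middle node: about [0.5, 0, 0, 0, 0.5]
```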
EQUIVALENT DEFINITIONS

A state i has period k if any return to the state i must occur in a number of time steps that is a multiple of k. A state i is called aperiodic if it is not periodic for any k > 1.

A state i is said to be transient if, given that we start in the state i, there is a non-zero probability that we will never return to i. If a state i is not transient, then it is called recurrent.

SUMMARY

A state i is
■ transient if $f_{ii} < 1$, $h_{ii} = \infty$;
■ null recurrent if $f_{ii} = 1$, $h_{ii} = \infty$;
■ non-null recurrent if $f_{ii} = 1$, $h_{ii} < \infty$.

EXAMPLE of a Markov chain with null-recurrent states

Consider a Markov chain whose states are all positive integers. From each state i the next states are the state i+1 (with probability $\frac{i}{i+1}$) and the state 1 (with probability $\frac{1}{i+1}$).

Starting at state 1, the probability of not having returned to state 1 within the first t steps is

$\prod_{j=1}^{t} \frac{j}{j+1} = \frac{1}{t+1}.$

Hence the probability of never returning to state 1 from 1 is 0, and state 1 is recurrent. It holds also

$r^{(t)}_{1,1} = \frac{1}{t} - \frac{1}{t+1} = \frac{1}{t(t+1)}.$

However, the expected number of steps until the first return to state 1 from state 1 is

$\sum_{t=1}^{\infty} t \cdot r^{(t)}_{1,1} = \sum_{t=1}^{\infty} \frac{1}{t+1},$

which is unbounded.

ERGODIC MARKOV MODEL REVISITED

An equivalent definition of ergodic Markov chains:

Definition: A Markov chain with a transition matrix P is called ergodic if it is possible to go from every state to every state and there is an integer n such that all entries of the matrix $P^n$ are positive.

However, it also holds: in a finite Markov chain at least one state is recurrent and all recurrent states are positive recurrent.

ERGODIC THEOREM

Let us have an ergodic Markov chain C with the set of states S = {1, ..., n}. Then it holds:
■ There exists a vector $\pi = (\pi_1, \ldots, \pi_n)$ such that for every $i, j \in S$ it holds
  $\pi_j = \lim_{t \to \infty} p^{(t)}_{ij}$
  and the limit (limiting probability) does not depend on i.
■ The vector $\pi$ is the only non-negative solution of the system of linear equalities
  $\pi_j = \sum_{i=1}^{n} \pi_i p_{ij}, \qquad \sum_{j=1}^{n} \pi_j = 1.$
■ The vector $\pi$ (the stationary probability distribution) satisfies the identity $\pi = \pi P$.
■ For every $1 \le i \le n$, it holds that $f_{ii} = 1$ and $h_{ii} = \frac{1}{\pi_i}$.
■ If N(i,t) denotes the number of visits to the state i within the first t steps, then
  $\lim_{t \to \infty} \frac{N(i,t)}{t} = \pi_i.$

Implications: Ergodic Markov chains always "forget", after a number of steps, their initial probability distribution.

WEATHER FORECAST

The following Markov chain wf depicts probabilities for going from sunny days to rainy ones and vice versa.

Figure: two states, sunny and rainy, with self-loop probabilities 0.9 and 0.5.

The transition matrix has the form

$P = \begin{pmatrix} 0.9 & 0.1 \\ 0.5 & 0.5 \end{pmatrix}$

the stationary probability distribution of wf is $\pi = (0.833, 0.167)$, and it holds $\pi P = \pi$.

EXAMPLE - QUEUE - Q_n - I./II.

Consider a queue that can hold at most n customers; in each step (if the queue is not full) a new customer arrives with probability λ and (if the queue is not empty) the customer being served leaves with probability μ. Taking the number of customers in the queue as the state, the transition probabilities are

$P_{i,i} = \begin{cases} 1 - \lambda & \text{if } i = 0 \\ 1 - \lambda - \mu & \text{if } 1 \le i \le n-1 \\ 1 - \mu & \text{if } i = n \end{cases}$

$P_{i,i+1} = \lambda \text{ if } i < n; \qquad P_{i,i-1} = \mu \text{ if } i > 0.$

This Markov chain is ergodic and therefore it has a unique stationary distribution $\pi$. It holds

$\pi_0 = \pi_0 (1 - \lambda) + \pi_1 \mu$
$\pi_i = \pi_{i-1} \lambda + \pi_i (1 - \lambda - \mu) + \pi_{i+1} \mu, \qquad 1 \le i \le n-1$
$\pi_n = \pi_{n-1} \lambda + \pi_n (1 - \mu).$
EXAMPLE - QUEUE - Q_n - III.

From that one can show that

$\pi_i = \pi_0 \left(\frac{\lambda}{\mu}\right)^i$

is the solution of the above system of equations. Since it holds $\sum_{i=0}^{n} \pi_i = 1$, we get

$\pi_0 = \frac{1}{\sum_{i=0}^{n} (\lambda/\mu)^i}.$

MARKOV CHAINS for GRAPHS

Let G = (V, E) be a connected, non-bipartite, and undirected graph with |V| = n and |E| = m. G induces a Markov chain, denoted by $M_G$, whose states are the nodes of G, and for any two nodes $u, v \in V$

$P_{uv} = P(u,v) = \begin{cases} \frac{1}{d(u)} & \text{if } (u,v) \in E; \\ 0 & \text{otherwise}; \end{cases}$

where d(u) is the degree of u (in G).

Properties of $M_G$:
■ $M_G$ is irreducible.
■ The periodicity of $M_G$ is the greatest common divisor of the lengths of all closed walks in G.
■ Since G is undirected, there are closed walks of length 2.
■ Since G is non-bipartite, it has odd cycles and therefore the greatest common divisor of the lengths of all closed walks is 1. Hence, $M_G$ is aperiodic.

STATIONARY DISTRIBUTION of MARKOV CHAINS on GRAPHS

The ergodic theorem therefore implies that $M_G$ has a unique stationary distribution $\pi$. This $\pi$ is easy to determine. Indeed, it holds:

Lemma: For all $v \in V$, $\pi_v = \frac{d(v)}{2m}$.

Proof: Let $[\pi P]_v$ be the v-th component of $\pi P$. Then

$[\pi P]_v = \sum_{u} \pi_u P(u,v) = \sum_{(u,v) \in E} \frac{d(u)}{2m} \cdot \frac{1}{d(u)} = \frac{d(v)}{2m} = \pi_v.$

Corollary: For all $v \in V$, $h_{vv} = \frac{2m}{d(v)}$.

Comment: Google's page ranking algorithm is essentially a Markov chain over the graph of the web.

APPLICATIONS of MARKOV MODELS

■ Physics - especially thermodynamics and statistical mechanics.
■ Chemistry.
■ The PageRank of a webpage, as used by Google, is defined by a Markov chain. It is the probability of being at page i in the stationary distribution of the following Markov chain on all known webpages: if N is the number of known webpages, and a page i has links to $k_i$ webpages, then the probability to go to any of these linked pages is
  $\frac{\alpha}{k_i} + \frac{1-\alpha}{N}$
  and the probability to go to any other page is
  $\frac{1-\alpha}{N},$
  where the parameter α (experimentally chosen) is about 0.85.
■ Economics and finance
■ Social sciences
■ Mathematical biology
■ Algorithmic music composition

(TIME) REVERSIBLE MARKOV CHAINS

An ergodic Markov chain with a stationary distribution $\pi$ and transition probabilities $P_{ij}$ is called (time) reversible if it holds, for any states i and j:

$\pi_i P_{ij} = \pi_j P_{ji}.$

Informally, in time reversible Markov chains, for each pair of states i, j, the long-run rate at which the chain makes a transition from state i to state j equals the long-run rate at which the chain makes a transition from state j to state i.

(TIME) REVERSIBLE MARKOV CHAINS - II.

For time reversible Markov chains the stationary distribution is easy to compute, due to the following theorem.

Theorem: Consider an ergodic Markov chain with states {1, ..., n} and the transition matrix P. If there is a vector $\pi = (\pi_1, \pi_2, \ldots, \pi_n)$ of non-negative numbers such that $\sum_i \pi_i = 1$ and for any pair of states i, j it holds $\pi_i P_{ij} = \pi_j P_{ji}$, then $\pi$ is the stationary distribution corresponding to P.

Proof: Consider the j-th entry of $\pi P$. The conditions of the theorem imply, for any j,

$\sum_{i=1}^{n} \pi_i P_{ij} = \sum_{i=1}^{n} \pi_j P_{ji} = \pi_j \cdot 1 = \pi_j,$

and therefore it holds $\pi P = \pi$. Hence $\pi$ is the stationary distribution and can be computed from the above system of linear equations.
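A minimal numerical check of the lemma $\pi_v = d(v)/2m$ and of the reversibility condition $\pi_u P_{uv} = \pi_v P_{vu}$ for the random walk on a small graph; the example graph and the NumPy-based check are illustrative choices.

```python
import numpy as np

# Undirected, connected, non-bipartite graph (a triangle with a pendant vertex).
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
n, m = 4, len(edges)

adj = {v: [] for v in range(n)}
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)

# Transition matrix of the induced chain M_G: P(u,v) = 1/d(u) on edges.
P = np.zeros((n, n))
for u in range(n):
    for v in adj[u]:
        P[u][v] = 1.0 / len(adj[u])

# Stationary distribution predicted by the lemma: pi_v = d(v) / (2m).
pi = np.array([len(adj[v]) / (2.0 * m) for v in range(n)])

print(np.allclose(pi @ P, pi))                           # pi P = pi
print(all(np.isclose(pi[u] * P[u][v], pi[v] * P[v][u])   # detailed balance
          for u in range(n) for v in range(n)))
```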
DESIGN of TIME-REVERSIBLE CHAINS

The reason why a reversible chain is called reversible is that if we start in the stationary distribution at time 0, then the sequence of random variables $(X_0, \ldots, X_t)$ has exactly the same distribution as the reversed sequence $(X_t, \ldots, X_0)$.

Given any finite Markov chain with a transition matrix P and stationary distribution $\pi$, the matrix P*, where

$\pi_i p^*_{ij} = \pi_j p_{ji},$

is the transition matrix of the time-reversed Markov chain. Indeed, it holds:

1. The matrix P* is stochastic, because
   $\sum_j p^*_{ij} = \sum_j p_{ji} \pi_j / \pi_i = \frac{\pi_i}{\pi_i} = 1.$
2. The reversed chain has the same stationary distribution, because P*'s paths starting from the stationary distribution are the reverses of P's paths starting from the same distribution.

REVERSIBLE and TIME REVERSIBLE MARKOV CHAINS

Let us have a Markov chain process

$C_1: \ldots, X_{n-2}, X_{n-1}, X_n$

with transition probabilities $P_{ij}$ and stationary probabilities $\pi_i$.

The reverse process

$C_2: \ldots, X_n, X_{n-1}, X_{n-2}, \ldots$

is also a Markov chain, with transition probabilities

$Q_{ij} = \frac{\pi_j P_{ji}}{\pi_i}.$

The Markov chain $C_1$ is called time reversible if $Q_{ij} = P_{ij}$ for all i, j, which is equivalent to

$\pi_i P_{ij} = \pi_j P_{ji}.$

Theorem: Suppose an ergodic irreducible Markov chain C has transition probabilities $P_{ij}$. If there are non-negative numbers $x_i$ summing up to 1 and satisfying the equalities $x_i P_{ij} = x_j P_{ji}$ for all i, j, then C is time reversible and $x_i = \pi_i$ for all i.

EXAMPLE 1

Let us have a connected, undirected and non-bipartite graph G with the set of nodes {1, 2, ..., n} and with weights $w_{ij} = w_{ji}$ assigned to each edge (i, j).

To G we can associate a Markov chain $C_G$ with transition probabilities

$P_{ij} = \frac{w_{ij}}{\sum_k w_{ik}};$

then $C_G$ is time reversible and

$\pi_i = \frac{\sum_j w_{ij}}{\sum_k \sum_j w_{kj}}$

are its stationary probabilities.

EXAMPLE 2

Let us have the graph G with nodes {1, 2, 3, 4, 5} and with non-zero-weight edges:

$w_{12} = 3, \quad w_{14} = 1, \quad w_{15} = 2, \quad w_{35} = 6, \quad w_{45} = 4.$

Then

$P_{12} = 1/2; \quad P_{14} = 1/6; \quad P_{15} = 1/3; \quad P_{21} = 1; \quad P_{35} = 1; \quad P_{41} = 1/5; \quad P_{45} = 4/5$

and $C_G$ is time reversible.

APPLICATIONS - SAMPLING

SAMPLING

Sampling in a set S according to a given probability distribution $\pi$ on the elements of S is picking up an element $x \in S$ with probability $\pi(x)$.

MONTE CARLO ESTIMATION OF π

■ Let Z = (X, Y) be a point chosen randomly in a 2 × 2 square centered at (0, 0).
■ This is equivalent to choosing X and Y randomly from the interval [-1, 1].
■ Let Z also be considered as a random variable that has value 1 (0) if the point (X, Y) lies (does not lie) in the circle of radius 1 centered at the point (0, 0).
■ Clearly $\Pr(Z = 1) = \frac{\pi}{4}$.
■ If we perform such an experiment m times, $Z_i$ is the value of Z at the i-th run, and $W = \sum_{i=1}^{m} Z_i$, then
  $E[W] = E\left[\sum_{i=1}^{m} Z_i\right] = \sum_{i=1}^{m} E[Z_i] = \frac{m\pi}{4},$
  and therefore $W' = (4/m)W$ is a natural estimation for π.

A natural question now is how good the estimation of π obtained from $W' = (4/m)W$ is. An application of the Chernoff bound gives:

$\Pr(|W - E[W]| \ge \varepsilon E[W]) \le 2e^{-m\pi\varepsilon^2/12}.$

Therefore, taking m large enough, we get an arbitrarily good approximation of π.
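A minimal sketch of this estimator; the choice of m = 100000 samples is an arbitrary illustrative one.

```python
import random

def estimate_pi(m=100_000):
    """Estimate pi as (4/m) * W, where W counts sample points falling in the unit circle."""
    w = 0
    for _ in range(m):
        x, y = random.uniform(-1, 1), random.uniform(-1, 1)  # uniform in the 2x2 square
        if x * x + y * y <= 1:                                # inside the unit circle
            w += 1
    return 4.0 * w / m

print(estimate_pi())   # typically close to 3.14
```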
HOW to do SAMPLING?

In the previous example the Monte Carlo method used uniform sampling in a square to achieve a counting - to determine the value of π.

In various other cases we can do efficient computation provided we can perform sampling according to a given probability distribution.

A Markov-models-induced Monte Carlo method provides a very general approach to sample according to a desired probability distribution. The basic idea is to create an ergodic Markov chain whose states form the sample space and whose stationary distribution is the required sampling distribution.

Let $X_0, X_1, \ldots$ be a run of such a Markov chain. The chain converges to the stationary distribution from any state $X_0$, and so after a sufficiently large number of steps r, the distribution of the state $X_r$ will be close to the stationary distribution and so it can be used for the required sampling. We can repeat the same argument starting with $X_r$ and getting to $X_{2r}$, and so on.

DESIGN of MARKOV CHAINS with UNIFORM STATIONARY DISTRIBUTION - I.

We show first how to construct a Markov chain with a stationary distribution that is uniform over a given state space Ω.

The first step is to define on Ω a neighbourhood relation - that is, to make out of Ω a graph - and in such a way that the graph obtained is irreducible.

The second step is to define transition probabilities in such a way that the resulting stationary distribution is uniform. The next lemma shows how this can be done in case we can introduce also self-loops.

DESIGN of MARKOV CHAINS with UNIFORM STATIONARY DISTRIBUTION - II.

Notation: For any $x \in \Omega$ let N(x) be the set of its neighbours in the created graph and let $M \ge N = \max_{x \in \Omega} |N(x)|$.

Lemma: For a finite state space Ω, a given neighbourhood structure, and any $M \ge N$, let us design a Markov chain such that for any $x, y \in \Omega$

$P_{x,y} = \begin{cases} 1/M & \text{if } x \ne y,\ y \in N(x); \\ 0 & \text{if } x \ne y,\ y \notin N(x); \\ 1 - |N(x)|/M & \text{if } x = y. \end{cases}$

Then, if this chain is irreducible and aperiodic, its stationary distribution is the uniform distribution.

It is easy to see that the chain is time reversible with respect to the uniform distribution, and so we can apply the theorem on time reversible chains above. Then for any x, $\pi_x = \frac{1}{|\Omega|}$.

EXAMPLE - INDEPENDENT SETS of a GRAPH - I.

A set S of nodes of a graph G is called independent if no two nodes in S are connected by an edge in G.

Consider a Markov chain whose states are the independent sets of a graph G = (V, E):

1. Let $X_0$ be an arbitrary independent set in G.
2. To determine $X_{i+1}$ do the following:
   ■ choose a node v uniformly and randomly from V;
   ■ if $v \in X_i$, then $X_{i+1} = X_i - \{v\}$;
   ■ if $v \notin X_i$ and adding v to $X_i$ still gives an independent set, then $X_{i+1} = X_i \cup \{v\}$;
   ■ otherwise set $X_{i+1} = X_i$.

EXAMPLE - INDEPENDENT SETS of a GRAPH - II.

Using the construction from the previous lemma one can show that if x, y are neighbouring independent sets then $P_{xy} = 1/|V|$ and the stationary distribution is the uniform distribution.
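A minimal sketch of this chain; the example graph (a 4-cycle), the number of steps per sample, and the frozenset representation of independent sets are illustrative assumptions.

```python
import random
from collections import Counter

# Undirected graph as an adjacency list (illustrative example: a 4-cycle).
adj = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
V = list(adj)

def step(X):
    """One transition of the chain on independent sets."""
    v = random.choice(V)                 # choose a node uniformly at random
    if v in X:
        return X - {v}                   # remove it
    if not (adj[v] & X):                 # adding v keeps the set independent
        return X | {v}
    return X                             # otherwise stay put

def sample_independent_set(r=1000):
    X = set()                            # X_0: the empty independent set
    for _ in range(r):
        X = step(X)
    return frozenset(X)                  # approximately uniform over independent sets

# Rough check of (near) uniformity over the 7 independent sets of the 4-cycle.
print(Counter(sample_independent_set() for _ in range(2000)))
```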
EXERCISES

1. From a bag of white and black balls you should pick a white (black) ball with a prescribed probability. How can you do that?

2. You have 11 boxes of balls. You should pick a ball from the i-th box with a prescribed probability $p_i$, for i = 1, ..., 11, with $\sum_i p_i = 1$. How can you do that?

The METROPOLIS ALGORITHM

This is a general method to transform any irreducible Markov chain on a state space Ω into a Markov chain with a required stationary distribution.

Let us assume that we have already designed an irreducible state space (graph) and we want to construct a Markov chain on this state space with a stationary distribution $\pi_x = b(x)/B$, where $b(x) > 0$ for all $x \in \Omega$ and $B = \sum_{x \in \Omega} b(x)$ is finite.

Theorem: For a finite state space Ω, its neighbourhood structure $\{N(x) \mid x \in \Omega\}$ and $N = \max_{x \in \Omega} |N(x)|$, let $M \ge N$ be any such number. For any $x \in \Omega$, let $\pi_x$ be the desired probability of the state x in the stationary distribution. Consider a Markov chain with

$P_{x,y} = \begin{cases} \frac{1}{M} \min\{1, \pi_y/\pi_x\} & \text{if } x \ne y,\ y \in N(x); \\ 0 & \text{if } x \ne y,\ y \notin N(x); \\ 1 - \sum_{y \ne x} P_{x,y} & \text{if } x = y. \end{cases}$

Then, if this Markov chain is irreducible and aperiodic, its stationary distribution is given by the probabilities $\pi_x$.
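A minimal sketch of the Metropolis chain on a path-shaped state space with an unnormalized weight b(x); the target weights, M = 2, and the number of steps are my own illustrative choices.

```python
import random
from collections import Counter

# States 0..4 arranged on a path; desired stationary distribution pi_x ~ b(x).
b = [1, 2, 3, 2, 1]                      # unnormalized weights (illustrative)
neighbours = {x: [y for y in (x - 1, x + 1) if 0 <= y < len(b)] for x in range(len(b))}
M = 2                                    # M >= max degree of the neighbourhood graph

def metropolis_step(x):
    # Propose each neighbour with probability 1/M; otherwise stay.
    if random.random() < len(neighbours[x]) / M:
        y = random.choice(neighbours[x])
        if random.random() < min(1.0, b[y] / b[x]):   # accept with min{1, pi_y/pi_x}
            return y
    return x

x, counts = 0, Counter()
for _ in range(200_000):
    x = metropolis_step(x)
    counts[x] += 1

total = sum(counts.values())
print([round(counts[s] / total, 3) for s in range(len(b))])   # ~ [1,2,3,2,1]/9
```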
DOING SAMPLING - once more

In many important applications we need to sample elements of a set S according to a given probability distribution π on S.

One way to do that is to design a Markov chain M on the set S that has π as the stationary distribution, then to start a random walk on M, and to stop when one can expect that the stationary distribution is (almost) reached.

To find the time for halting we need an estimation of the rate of convergence of any initial distribution to the stationary one.

CONVERGENCE of RANDOM WALKS

CONVERGENCE of RANDOM WALKS - BASICS

In the next slides we deal with the following problem: how many steps are needed for a random walk (that starts at some node) to converge to the stationary distribution - this means that the walk will be in each node with the probability specified by the stationary distribution.

We present two techniques to deal with the above problem.
■ The stopping rule method calculates the rate of convergence directly, by defining a proper stopping rule.
■ The coupling method reduces the problem of determining the rate of convergence to that of calculating the number of steps needed to meet another, imaginary, random walk that starts at the stationary distribution.

PRELIMINARIES

Distance ||P - Q|| between two probability distributions P and Q on a set I of states is defined by

$\|P - Q\| = \max_{I' \subseteq I} |P(I') - Q(I')|.$

STOPPING RULE

Given are random variables $Z_1, Z_2, \ldots$, and a random variable T whose range are the natural numbers. T is a stopping rule for the variables $Z_1, Z_2, \ldots$ if, for every i, the event T = i is independent of the variables $Z_j$ for all $j > i$.

The idea is that the variables $Z_i$ are observed in order, one at each time step: at first $Z_1$, then $Z_2$ and so on. The value of the variable T then shows the number of variables observed when such a process has to stop. Observe that if the sequence $\{Z_i\}$ is not independent, then the event T = i may depend on some variable $Z_j$, $j > i$.

It is sometimes required that $\Pr(T < \infty) = 1$, i.e. that T is almost surely finite.

The intuition behind such a definition of the stopping rule is that at any particular time it is enough to look at the sequence of variables/states considered so far in order to be able to tell whether it is time to stop.

STOPPING RULE - OBSERVATIONS

A stopping rule can be seen as a mechanism for deciding whether to continue or to stop a process on the basis of the present position and past events, and which will always lead to a decision to stop at some time.

Examples: Consider a gambler playing roulette, starting with $100.
■ Playing one and only one game corresponds to the stopping time T = 1, and this is a stopping rule.
■ Playing until she either runs out of money or has played 500 games is a stopping rule.
■ Playing until she doubles her money is not a stopping rule (here it is assumed that betting systems have some limitations on doubling or tripling the money).
■ Playing until she is the maximum amount ahead she will ever be is not a stopping rule (as it requires information about the future, as well as the present and the past).

STRONG UNIFORM STOPPING RULE/TIME

For a finite ergodic Markov chain, a strong uniform stopping time T is a stopping rule which satisfies the condition

$\Pr[Z_k = i \mid T = k] = \pi_i,$

where $Z_k$ is the state at the k-th step of the Markov chain, and $\pi_i$ is the probability, under the stationary distribution, of being at the state i.

The next theorem relates the strong uniform stopping rule (time) and the rate of convergence of a random walk.

Theorem: Let π be the stationary distribution of a random walk, and let $Q^{(t)}$ be the probability distribution of that walk after t steps. In addition, let T be a strong uniform stopping time. Then

$\|Q^{(t)} - \pi\| \le \Pr[T > t].$

Proof: Let $X_t$ be the random variable producing the state (from I) visited by the random walk at step t. For every $I' \subseteq I$,

$Q^{(t)}(I') = \Pr[X_t \in I'] = \left(\sum_{j \le t} \Pr[(X_t \in I') \cap (T = j)]\right) + \Pr[(X_t \in I') \cap (T > t)]$
$= \sum_{j \le t} \Pr[X_t \in I' \mid T = j]\Pr[T = j] + \Pr[X_t \in I' \mid T > t]\Pr[T > t]$
$= \sum_{j \le t} \pi(I')\Pr[T = j] + \Pr[X_t \in I' \mid T > t]\Pr[T > t]$
$= \pi(I')(1 - \Pr[T > t]) + \Pr[X_t \in I' \mid T > t]\Pr[T > t]$
$\le \pi(I') + \Pr[T > t]\Pr[X_t \in I' \mid T > t] \le \pi(I') + \Pr[T > t].$

The equality $\Pr[X_t \in I' \mid T = j] = \pi(I')$ follows from the fact that T is a strong uniform stopping time and that if a random walk is in the stationary distribution, then it stays in it forever. The equality $\sum_{j \le t} \Pr[T = j] = 1 - \Pr[T > t]$ follows from $\Pr[T \le t] = 1 - \Pr[T > t]$.

Since both $\Pr[X_t \in I' \mid T > t]$ and $\pi(I')$ are at most 1, we get, for every $I' \subseteq I$,

$|Q^{(t)}(I') - \pi(I')| \le \Pr[T > t].$

EXAMPLE - HYPERCUBE

We show how fast a special random walk on the hypercube converges to the stationary distribution. Consider the following random walk:

1. Choose uniformly a coordinate, i.e. a neighbor of the currently visited node.
2. With probability 1/2 move to the chosen node; otherwise do not move at all.

(The last trick is needed in order to have an ergodic (aperiodic) Markov chain.)

Let us define as the stopping rule T the first step by which every coordinate has been chosen at least once (even if not all choices of coordinates yielded a move).

Note that T is a strong uniform stopping rule (informally, this is because we have an equal probability of being at any node after all coordinates have been chosen).

In order to be able to use the last theorem we need to determine $\Pr[T > t]$. However, this is exactly the coupon collector problem, since we can see coordinates as coupons that need to be collected. For the coupon collector problem, the expected number of trials needed to see all n coupons is $O(n \lg n)$. Therefore, the expected number of steps until we choose all coordinates is $O(n \lg n)$. Using Markov's inequality we get

$\Pr[T > t] \le \frac{E[T]}{t} = O\left(\frac{n \lg n}{t}\right).$
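A minimal simulation of this lazy walk on the n-dimensional hypercube, recording the strong uniform stopping time T (the first step by which every coordinate has been chosen); n = 10 and the number of trials are illustrative choices.

```python
import random

def stopping_time(n):
    """Steps of the lazy hypercube walk until every coordinate has been chosen once."""
    state = [0] * n
    chosen, steps = set(), 0
    while len(chosen) < n:
        i = random.randrange(n)          # choose a coordinate (i.e. a neighbor) uniformly
        chosen.add(i)
        if random.random() < 0.5:        # move with probability 1/2, else stay
            state[i] ^= 1
        steps += 1
    return steps

n, trials = 10, 5000
avg = sum(stopping_time(n) for _ in range(trials)) / trials
harmonic = sum(1.0 / i for i in range(1, n + 1))
print(avg, n * harmonic)                 # average T equals n*H_n (coupon collector)
```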
EXAMPLE - CARD SHUFFLING

Suppose we want to shuffle a pack of n cards, numbered 1, 2, ..., n, according to the following policy: move the top card to a random location in the pack.

How long will it take to have all cards in a random distribution (to reach the stationary distribution)?

This problem can be viewed as a random walk on a graph with n! vertices corresponding to all possible permutations. Edges are determined by the shuffling policy.

Denote by BOTTOM the card that was originally at the bottom of the pack. Let T be the step at which BOTTOM is moved from the top of the pack into the pack (that is, the first step at which the card moved from the top is BOTTOM). We claim that T is a strong uniform time.

This is due to the fact that once we remove BOTTOM from the top of the pack, we are already in the stationary distribution. (It can be shown by induction on the number of cards below BOTTOM that the cards below BOTTOM are always in a random order.)

We show now that the above stopping rule behaves as a coupon collector. Indeed, define $T_i$ to be the number of steps until there are i cards below BOTTOM (counting BOTTOM itself among them). Since T is determined by $T_n$ we have

$T_n = T_1 + (T_2 - T_1) + (T_3 - T_2) + \ldots + (T_n - T_{n-1}).$

Moreover, $T_{i+1} - T_i$ has a geometric distribution with parameter $\frac{i}{n}$. This is similar to the situation in coupon collection. Indeed, let $V_j$ denote the number of steps until we see j coupons and let $V = V_n$. The similarity between V and T follows from the fact that

$V = V_1 + (V_2 - V_1) + (V_3 - V_2) + \ldots + (V_n - V_{n-1})$

and that $V_{j+1} - V_j$ also has a geometric distribution, with parameter $\frac{n-j}{n}$. Hence $\Pr[T > t] = \Pr[V > t]$.

ELECTRICAL NETWORKS - BASICS 1/2

Figure: a small network in which the potentials of the nodes are $\phi(u) = 2$, $\phi(a) = \frac{3}{2}$, $\phi(v) = 1$; the voltage difference between u and v is 1 and between a and v is $\frac{1}{2}$.

Kirchhoff's law: The sum of all currents entering a node u of a network equals the sum of all currents leaving u.

Ohm's law: $I = \frac{V}{R}$, where I is the current, V the voltage, and R the resistance.

For each edge e, let $R_e$ be the resistance of e. For each node u, let $i_u$ be the current that enters (and exits) the node u.

Electric potential is the work needed to move a unit charge from infinity to a given point of the force field along an arbitrary path. Electric potential is a quantity characterizing the energy state at a given point of an electric force field.

Voltage is the work done in an electric field when a unit electric charge is moved along a certain path. In a potential electric field the voltage does not depend on the path and equals the difference of potentials.
ELECTRICAL NETWORKS - BASICS 2/2

Figure: the same network with one ampere injected into u and removed from v; the potentials of the nodes are $\phi(u) = 2$, $\phi(a) = \frac{3}{2}$, $\phi(v) = 1$; the voltage difference between u and v is 1 and between a and v is $\frac{1}{2}$; the resistance of the edge between u and v is 2, but the effective resistance between u and v is only 1.

We would like to compute the potential $\phi(v)$ for each node v and the current $i_{uv}$ for each edge e = (u, v). By Ohm's law (V = IR),

$i_{uv} = \frac{\phi(u) - \phi(v)}{R_e}.$

The effective resistance $R_{uv}$ between two nodes u and v is the voltage difference between u and v when one ampere is injected into u and removed from v. Hence

$R_{uv} = \phi(u) - \phi(v).$

Observe that the effective resistance between the endpoints of an edge can be smaller than the resistance of the edge (see the above figure).

COMMUTE TIME and EFFECTIVE RESISTANCE

To each graph G we associate an electrical network N(G) in which the resistance of each edge is 1.

Theorem: The commute time for any two vertices u and v in G is

$C_{uv} = 2m R_{uv} = h_{uv} + h_{vu}.$

Proof: Notation:
■ d(x) is the degree of any node x.
■ $\phi_{zv}$ is the potential of any node z, relative to v, when d(x) units of current enter each node $x \in V$ and 2m units of current exit v.

For every edge $(u, w) \in E$ it holds, by Ohm's law:

$\phi_{uv} - \phi_{wv} = i_{uw} R_{uw} = i_{uw}.$

DERIVATION

Kirchhoff's law implies that for every $u \in V - \{v\}$

$d(u) = \sum_{w \in \Gamma(u)} i_{uw} = \sum_{w \in \Gamma(u)} (\phi_{uv} - \phi_{wv}) = d(u)\,\phi_{uv} - \sum_{w \in \Gamma(u)} \phi_{wv},$

and therefore

$\phi_{uv} = \frac{1}{d(u)} \sum_{w \in \Gamma(u)} (1 + \phi_{wv}).$

On the other hand, by the definition of $h_{uv}$, for every $u \in V - \{v\}$

$h_{uv} = \sum_{w \in \Gamma(u)} \frac{1}{d(u)} (1 + h_{wv}),$

and so we got the same system of equations for $\phi_{uv}$ as for $h_{uv}$; hence $h_{uv} = \phi_{uv}$.

PROOF - CONTINUATION 1.1

Consider now the following network: a current of magnitude 2m enters u and a current of magnitude d(x) exits every node $x \in V$. If the potential of u in this network is assumed to be equal to 0, then the potential of v is equal to $-h_{vu}$.

Let us now perform a superposition of such a network with the network considered on the previous slide (in which d(x) units of current enter each node $x \in V$ and 2m units of current exit v). In the resulting network, all external currents cancel, except for those at the vertices u (where a current of magnitude 2m enters) and v (where a current of magnitude 2m exits).

PROOF - CONTINUATION 1.2

The difference of potentials between u and v is:

$h_{uv} - (-h_{vu}) = h_{uv} + h_{vu} = C_{uv}.$

Therefore $C_{uv}$ is the voltage between u and v in the last network. Hence, by Ohm's law (V = IR), $C_{uv} = 2m R_{uv}$.

Claim: For every pair of vertices u and v, the effective resistance $R_{uv}$ is not more than the distance between u and v in G.
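A minimal numerical check of the theorem $C_{uv} = 2mR_{uv}$ on a small graph: effective resistances are computed from the pseudoinverse of the graph Laplacian (a standard identity, used here as an assumption of the sketch), and hitting times are estimated by simulation.

```python
import random
import numpy as np

# Small undirected graph with unit-resistance edges.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
n, m = 4, len(edges)
adj = {v: [] for v in range(n)}
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)

# Effective resistance via the Moore-Penrose pseudoinverse of the Laplacian:
# R_uv = L+_uu + L+_vv - 2 L+_uv.
L = np.zeros((n, n))
for u, v in edges:
    L[u][u] += 1; L[v][v] += 1
    L[u][v] -= 1; L[v][u] -= 1
Lp = np.linalg.pinv(L)
R = lambda u, v: Lp[u][u] + Lp[v][v] - 2 * Lp[u][v]

def hit(u, v, trials=20_000):
    """Simulated hitting time h_uv of the random walk on the graph."""
    total = 0
    for _ in range(trials):
        x, steps = u, 0
        while x != v:
            x = random.choice(adj[x]); steps += 1
        total += steps
    return total / trials

u, v = 0, 3
print(2 * m * R(u, v))          # commute time predicted by the theorem
print(hit(u, v) + hit(v, u))    # simulated h_uv + h_vu, should be close
```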
COVER TIME

Corollary: Let G = (V, E) be a graph, n = |V|, m = |E| and $u, v \in V$.
■ If $(u, v) \in E$, then $C_{uv} \le 2m$.
■ If $u, v \in V$, then $C_{uv} \le 2m(n-1)$.
■ If $u, v \in V$, then $C_{uv} \le n^3$.

Theorem: For a graph G with n nodes and m edges, $C(G) \le 2m(n-1)$.

Proof: Let T be any spanning tree of G = (V, E). Then there is a traversal of T, visiting nodes $v_0, v_1, \ldots, v_{2n-2} = v_0$, that traverses each edge of T exactly once in each direction. Consider a random walk that starts at $v_0$, visits all nodes in the order prescribed by the traversal, and terminates after returning to $v_0$. An upper bound on the expected length of such a walk is an upper bound on $C_{v_0}(G)$:

$C_{v_0}(G) \le \sum_{j=0}^{2n-3} h_{v_j v_{j+1}} = \sum_{(u,v) \in T} C_{uv}.$

Since $(v_j, v_{j+1}) \in E$, $C_{v_j v_{j+1}} \le 2m$ by the previous corollary. Therefore $C_{v_0}(G) \le 2m(n-1)$. The above bound is independent of the choice of $v_0$. Hence $C(G) \le 2m(n-1)$.

SPECIAL TYPES of GRAPHS

Let us derive the cover time for several special graphs.

■ A complete graph $K_n$: The problem of determining the cover time is essentially the coupon collector problem. Therefore $C(K_n) \approx nH_n = n\ln n + O(n)$.
■ A star graph $S_n$: The problem of calculating the cover time is again essentially the coupon collector problem, with two steps per trial. Therefore $C(S_n) \approx 2nH_n = 2n\ln n + O(n)$.
■ A line $L_n$: Let $u_1, u_2$ be the end points of $L_n$. Since $R_{u_1,u_2} = n - 1$, we have
  $C_{u_1,u_2} = 2|E(L_n)| R_{u_1,u_2} = 2(n-1)(n-1) = 2(n-1)^2.$
  By symmetry, $h_{u_1,u_2} = (n-1)^2$, and therefore the expected time to cover $L_n$ starting from an end point is $(n-1)^2$; hence $C(L_n) = \Theta(n^2)$.
■ A lollipop graph: the cover time is $\Theta(n^3)$.

EFFECTIVE RESISTANCE of GRAPHS - II.

The effective resistance R(G) of a graph G is defined by

$R(G) = \max_{\{u,v\} \subseteq V(G)} R_{uv}.$

Theorem:

$m R(G) \le C(G) \le 2e^3 m R(G) \ln n + n.$

Proof. Lower bound: Let $R(G) = R_{uv}$ for some vertices $u, v \in V$. Then

$C(G) \ge \max(h_{uv}, h_{vu}) \ge \frac{C_{uv}}{2} = \frac{2m R_{uv}}{2} = m R(G).$

EFFECTIVE RESISTANCE of GRAPHS - III.

Upper bound: Create a random walk of length $2e^3 m R(G) \ln n$ and divide it into $\ln n$ phases of the same length [= $2e^3 m R(G)$].

For any vertices u and v, the hitting time $h_{uv}$ is at most $2m R(G)$. (This is the average time to get through any of the $\ln n$ phases.)

By Markov's inequality ($\Pr[Y \ge t] \le \frac{E[Y]}{t}$), the probability that v is not visited during a single phase is at most

$\frac{2m R(G)}{2e^3 m R(G)} = \frac{1}{e^3},$

where $t = 2e^3 m R(G)$ and $E[Y] \le 2m R(G)$.

Therefore, the probability that v is not visited during any of the $\ln n$ phases is at most $\left(\frac{1}{e^3}\right)^{\ln n} = \frac{1}{n^3}$.

Summing over the n choices of v, we get that the probability that there is a node not visited within $2e^3 m R(G) \ln n$ steps is at most $\frac{1}{n^2}$.

When this happens (that is, if there is a node not visited during $2e^3 m R(G) \ln n$ steps), we "continue to walk until all nodes are visited" (and $n^3$ steps are enough for that - which happens with probability $1/n^2$). The expected total time is therefore

$2e^3 m R(G) \ln n + \frac{1}{n^2} \cdot n^3 = 2e^3 m R(G) \ln n + n.$

APPLICATION of RAYLEIGH's MONOTONICITY LAW

Rayleigh's monotonicity law states that the effective resistance of a graph does not increase (decrease) whenever the resistance of any edge of the graph is decreased (increased).

Corollary: The effective resistance of a graph cannot increase by adding edges.

Lemma: The effective resistance of a graph is not more than its diameter diam(G).
Proof: The whole graph can be generated by adding edges to a subgraph that corresponds to a shortest path realizing the diameter.

Fact: If G is a k-regular graph with n nodes, then $\mathrm{diam}(G) \le \frac{3n}{k}$.

Theorem: If G is a k-regular graph with n nodes, then $C(G) = O(n^2 \ln n)$.
Proof: Since $m \le \frac{nk}{2}$ and, by the last theorem, $C(G) = O(m R(G) \ln n)$, and since $R(G) \le \mathrm{diam}(G) \le \frac{3n}{k}$, we have

$C(G) = O\left(\frac{nk}{2} \cdot \frac{3n}{k} \cdot \ln n\right) = O(n^2 \ln n).$
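Returning to the special graphs above, a minimal simulation comparing the empirical cover time of the complete graph $K_n$ (about $nH_n$) and of the line $L_n$ started at an end point (about $(n-1)^2$); n = 20 and the number of trials are illustrative choices.

```python
import random

def cover_time_from(adj, start, trials=2000):
    """Empirical expected number of steps to visit every node, starting from `start`."""
    total = 0
    for _ in range(trials):
        node, visited, steps = start, {start}, 0
        while len(visited) < len(adj):
            node = random.choice(adj[node])
            visited.add(node)
            steps += 1
        total += steps
    return total / trials

n = 20
complete = {v: [u for u in range(n) if u != v] for v in range(n)}
line = {v: [u for u in (v - 1, v + 1) if 0 <= u < n] for v in range(n)}

harmonic = sum(1.0 / i for i in range(1, n + 1))
print(cover_time_from(complete, 0), n * harmonic)      # roughly n H_n
print(cover_time_from(line, 0), (n - 1) ** 2)          # roughly (n-1)^2 from an end point
```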
USTCON PROBLEM

USTCON is the problem of deciding, given an undirected graph G and two vertices s and t, whether there is a path from s to t.

Let RLP be the family of languages L for which there exists a probabilistic off-line log-space Turing machine M such that for any input x

$\Pr[M \text{ accepts } x] \begin{cases} \ge \frac{1}{2} & \text{if } x \in L \\ = 0 & \text{if } x \notin L. \end{cases}$

Theorem: USTCON ∈ RLP.

Proof: Let a log-space bounded probabilistic Turing machine M simulate a random walk of length $2n^3$ through the given graph, starting from s. If M encounters t during such a walk, it outputs YES; otherwise it outputs NO.

The probability of the output YES instead of NO is 0. What is the probability that M outputs NO instead of YES? We know that $h_{st} \le n^3$. By Markov's inequality, if t is reachable from s, then the probability that t is not visited during $2n^3$ steps is at most $\frac{1}{2}$.

M needs space only to count up to $2n^3$ and to keep track of its position in the graph during the walk. Therefore it needs space $O(\lg n)$.

NONUNIFORM, DETERMINISTIC, LOG-SPACE ALGORITHMS for USTCON

We will consider regular graphs of degree d with n nodes such that all edges of each node are labeled by labels from {1, 2, ..., d}.

UNIVERSAL TRAVERSAL SEQUENCE

A sequence of edge labels from {1, 2, ..., d} is called universal for a class G of such labeled graphs if, for every graph in G and every start vertex, walking according to the sequence visits all nodes of the graph.

If a sufficiently long sequence of labels is chosen at random, then for a fixed graph and a fixed start vertex the probability that the corresponding walk fails to visit all nodes is small. Summing up over the n choices of the start vertex v and the |G| choices of the labeled graph G, the probability that the random walk (sequence) fails to be universal is less than 1. As a consequence, there is a sequence of such a length that is universal for the class G. (We have just used the probabilistic method.)

HIDDEN MARKOV MODELS (HMM)

HIDDEN MARKOV MODELS - DEFINITIONS

Hidden Markov Models (HMM) have, similarly to the Markov chains considered so far, a set of states and their transition probabilities (that are given). However, in addition, each state has a set of outputs it can produce, according to its given emission probabilities, each time the system comes to that state.

However, in a HMM the states are hidden, as are their transition and emission probabilities, from any observer. An observer can see only the sequences of outputs the states produce. The task is to determine, from a large amount of such outputs, all parameters of the model, i.e. its transition and emission (production) probabilities.

Hidden Markov Models have been very successfully used in pattern recognition, speech recognition, handwriting and gesture recognition, machine translation, gene prediction, bio-informatics, human activity recognition, as well as in many other applications. In general, a HMM can be applied when the goal is to recover a data sequence that is not immediately observable (but other data that depend on the sequence are).

HMM - Figure

Figure: Probabilistic parameters of a hidden Markov model (example):
X - states
y - possible observations
a - state transition probabilities
b - output (emission) probabilities
EXAMPLE: URN PROBLEM

In a room not visible to an observer there is a robot and urns $X_1, X_2, \ldots, X_n$, each containing a known mixture of balls labeled $y_1, y_2, \ldots$.

The robot works as follows. It chooses randomly, according to a given initial probability distribution, one urn, randomly draws a ball from it, emails its label to the observer, puts the ball back and, according to the probability distribution associated with that urn, chooses the next urn, and the process continues. This process can continue many times.

Observers see each time only a sequence of labels $y_{i_1}, y_{i_2}, \ldots, y_{i_k}$.

The task for the observers is to determine the parameters: the transition probabilities between the states (of the ordinary Markov chain behind the process) and the emission probabilities - actually the numbers of the different balls in the different urns.

EXAMPLE - WEATHER

Alice and Bob live far apart from each other and talk daily about what Bob did the previous day. His actions (walking, shopping, cleaning) depend on the weather.

Figure: a two-state weather HMM (rainy/sunny) with outputs walk, shop, clean.

From their phone calls Alice tries to deduce how the weather was, and is, in the place where Bob lives.

INFERENCE PROBLEMS

In the following, x(t) is the state at time t and y(t) is the output at time t.

Probability of an observed sequence: The probability of observing an output sequence $Y = y(0), y(1), \ldots, y(l-1)$ of length l is given by

$\Pr(Y) = \sum_X \Pr(Y \mid X)\Pr(X),$

where the sum runs over all possible hidden-node sequences $X = x(0), x(1), \ldots, x(l-1)$. This problem can be handled efficiently using the so-called forward algorithm.

Filtering: The task is to compute, given the chain's parameters and a sequence of observations, the distribution of the last state at the end of the observations, i.e. to compute

$\Pr(x(t) \mid y(1), \ldots, y(t)).$

EXAMPLE

Figure: a small HMM together with one of its output sequences and the state sequences that could have produced it.

EXAMPLE 2. Markov model

For the (sunny/rainy/foggy) weather Markov model of the corresponding figure, show that:
■ Provided that today is sunny, 0.04 is the probability that tomorrow is sunny and the day after is rainy.
■ 0.34 is the probability that it will be rainy two days from now, provided it is foggy today.

EXAMPLE 2. Hidden Markov Model

Let us add to the previous model two outputs, "umbrella" and "no umbrella", and let the probability of having an umbrella be 0.1 (0.8) [0.3] for a sunny (rainy) [foggy] day.

Suppose you were locked in a room for several days and you were asked about the weather outside. The only piece of evidence you have is whether the man bringing you food carries an umbrella or not.

■ Suppose the day you were locked in was sunny. The next day the man carrying food came with an umbrella. Assume that the prior probability of the man carrying an umbrella on any day is 0.5. Show that 0.08 is the probability that the second day was rainy.
■ Suppose the day you were locked in the room was sunny and that the man brought an umbrella on day 2 but not on day 3. Show that 0.19 is the probability that it was foggy on day 3.
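A minimal sketch of the forward (filtering) computation described above. Only the umbrella emission probabilities 0.1/0.8/0.3 come from the example; the weather transition matrix below is an invented placeholder (the lecture's figure is not reproduced here), so the printed values are illustrative and are not meant to reproduce the 0.08 and 0.19 answers of the exercise.

```python
import numpy as np

states = ["sunny", "rainy", "foggy"]

# ASSUMED transition matrix (rows: today, columns: tomorrow) - an illustrative
# placeholder, not the matrix from the lecture's figure.
T = np.array([
    [0.8, 0.05, 0.15],
    [0.2, 0.6,  0.2 ],
    [0.2, 0.3,  0.5 ],
])

# Emission probabilities of the observation "umbrella", as given in the example.
p_umbrella = np.array([0.1, 0.8, 0.3])

def forward_filter(start_state, observations):
    """Return Pr(x(t) | y(1..t)) for the last day, via the forward algorithm.

    observations is a list of booleans: True = umbrella seen that day."""
    alpha = np.zeros(len(states))
    alpha[states.index(start_state)] = 1.0       # day-1 state is known
    for saw_umbrella in observations:
        emit = p_umbrella if saw_umbrella else 1.0 - p_umbrella
        alpha = (alpha @ T) * emit               # predict, then weight by the evidence
        alpha /= alpha.sum()                     # normalize to a conditional distribution
    return dict(zip(states, alpha))

print(forward_filter("sunny", [True]))           # distribution of day-2 weather
print(forward_filter("sunny", [True, False]))    # distribution of day-3 weather
```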
HMM - SPEECH RECOGNITION - EXAMPLE

Figure: a simple hidden Markov model topology to recognize two spoken words: "one" is modelled as the phoneme sequence W-AX-N and "two" as T-OO; each phoneme has its own small HMM, connected between an entry node and an exit node.

HIERARCHICAL HIDDEN MARKOV MODEL

In a Hierarchical Hidden Markov Model (HHMM) each state can itself be a HHMM.

Figure: Illustration of the structure of a HHMM. Gray lines show vertical transitions. The horizontal transitions are shown as black lines. The light gray circles are the internal states and the dark gray circles are the terminal states that return control to the activating state. The production states are not shown in this figure.

HHMM and SPEECH RECOGNITION

A huge amount of samples of speech, from many different individuals, is applied to a HHMM to infer the hierarchy of states and all transition and emission probabilities (essentially a simulation of the neocortex producing speech), and then the resulting HHMM is used to recognize new utterances.

APPENDIX

SECRETARY PROBLEM

The problem:
■ There is a single secretary position to fill in an institute.
■ There are n applicants for the position, and the value of n is known.
■ Each applicant has a unique "quality value" - the interviewing committee has no knowledge of the quality values of those applicants that have not been interviewed yet, and no knowledge of how large the best quality value of the applicants is.
■ The applicants are interviewed in a random order.
■ After each interview, the applicant is immediately accepted or rejected.
■ The decision to accept or reject an applicant can be based only on the relative "quality values" of the applicants interviewed so far.
■ Rejected applicants cannot be recalled.
■ The goal is to select the applicant with the best "quality value".
■ How should the selection committee best proceed?

SOLUTION

Terminology: A candidate is an applicant who, when interviewed, is better than all the applicants interviewed previously. Since the goal in the secretary problem is to select the single best applicant, only candidates will be considered for acceptance.

Optimal policy for this problem (the stopping rule): For large n the optimal policy is to interview and reject the first n/e applicants and then accept the next one who is better than all applicants interviewed till then.

As n gets larger, the probability of selecting the best applicant goes to 1/e, which is around 37%.

ANDREY ANDREEVICH MARKOV

■ Russian mathematician (1856-1922).
■ He introduced Markov models in 1906.
■ The original motivation was to extend the law of large numbers to dependent events.
■ In 1913 he applied his findings to the first 20 000 letters of Pushkin's Eugene Onegin.
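Returning to the secretary problem, a minimal simulation of the 1/e rule described above; n = 100 applicants and the number of trials are illustrative choices.

```python
import math
import random

def secretary_trial(n):
    """One run of the 1/e rule; returns True if the best applicant is selected."""
    quality = list(range(n))
    random.shuffle(quality)                  # applicants arrive in random order
    r = int(n / math.e)                      # observe-and-reject phase
    best_seen = max(quality[:r]) if r > 0 else -1
    for q in quality[r:]:
        if q > best_seen:                    # first candidate better than all seen so far
            return q == n - 1                # did we pick the overall best?
    return False                             # never accepted anyone

n, trials = 100, 20_000
success = sum(secretary_trial(n) for _ in range(trials)) / trials
print(success, 1 / math.e)                   # empirical probability vs. 1/e ~ 0.37
```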