
Seminar
On
RANDOMIZED ALGORITHMS
Prepared by
Shamseena K
S1 M.Tech
Software Engineering
CUSAT

Seminar Report-2005
Randomized Algorithms
Department of Computer Science
CUSAT
ABSTRACT
A randomized algorithm is an algorithm that uses random bits as an auxiliary input to guide its behavior. It achieves good performance in the "average case". Formally, the algorithm's performance will be a random variable determined by the random bits, with (hopefully) good expected value. The "worst case" is typically so unlikely to occur that it can be ignored.
The first and foremost reason for using randomized algorithms is simplicity: a simple randomized algorithm can often match the performance of a deterministic algorithm. For some problems, the best-known randomized algorithms are also faster than the best-known deterministic algorithms. This is achieved by requiring only that the correct answer be found with high probability, or that the algorithm run in expected polynomial time. This means that the randomized algorithm may return an incorrect answer in some cases or may take a very long time.
There are two types of randomized algorithms:
1. Las Vegas algorithms
2. Monte Carlo algorithms
Las Vegas algorithm: the algorithm always outputs the correct answer; there is just a small probability that it takes long to execute.
Monte Carlo algorithm: the algorithm always completes quickly, but allows a small probability of error.
Quicksort is probably the most familiar "real-world" algorithm in which randomness is very useful. The deterministic version of this algorithm requires O(n²) time to sort n numbers for some degenerate inputs -- such as the array elements being already in sorted order! However, if the algorithm randomly permutes the elements before starting, it finishes in O(n log n) time with high probability, for any input. The difference between the two is crucial when sorting large datasets.

CONTENTS
1. INTRODUCTION
1.1 DETERMINISTIC ALGORITHM
1.2 RANDOMIZED ALGORITHM
1.3 WHY RANDOMIZED ALGORITHMS?
2. LAS VEGAS AND MONTE CARLO
2.1 LAS VEGAS ALGORITHM
2.2 MONTE CARLO ALGORITHM
3. RANDOMIZED QUICKSORT ALGORITHM
3.2 RANDOMIZED QUICKSORT
4. RANDOMIZED MIN-CUT ALGORITHM
5. COMPLEXITY CLASSES
5.1 CLASS RP
5.2 CLASS ZPP
5.3 CLASS PP
5.4 CLASS BPP
6. ADVANTAGES
7. DISADVANTAGES
8. SCOPE
9. CONCLUSION
10. REFERENCES

1. INTRODUCTION
An algorithm is any well-defined computational procedure that takes some value, or set of values, as input and produces some value, or set of values, as output. An algorithm is thus a sequence of computational steps that transform the input into the output. A problem can be represented as a tuple (instances, solutions), where instances are the inputs to the problem and solutions are the outputs. To solve a problem we can use different algorithms, and these algorithms can differ dramatically in their efficiency.
A randomized algorithm is an algorithm that uses random bits as an auxiliary input to guide its behavior. It achieves good performance in the "average case".
1.1 DETERMINISTIC ALGORITHM
The goal is to solve the problem correctly (always) and quickly. Typically the number of steps should be polynomial in the size of the input.
(Figure: Input → Algorithm → Output)
1.2 RANDOMIZED ALGORITHM
A randomized algorithm is an algorithm that is allowed to access a source of independent, unbiased random bits, and is then allowed to use these random bits to influence its computation.
(Figure: Input + Random bits → Algorithm → Output)

Formally, the algorithm's performance will be a random variable determined by the random bits, with (hopefully) good expected value. The "worst case" is typically so unlikely to occur that it can be ignored. Bounds on the performance of randomized algorithms depend on their random choices and not on any assumptions about the inputs. It is important to distinguish this from the probabilistic analysis of an algorithm, in which one assumes a distribution on the inputs and analyzes an algorithm that may itself be deterministic.
1.3 WHY RANDOMIZED ALGORITHMS?
Simplicity: This is the first and foremost reason for using randomized algorithms. There are numerous examples where an easy randomized algorithm can match (or even surpass) the performance of a deterministic algorithm.
Example: Sorting algorithms.
The Merge-Sort algorithm is an asymptotically optimal deterministic algorithm, and it is not too hard to describe. However, the same asymptotic expected running time can be achieved by the simpler randomized Quicksort algorithm. The algorithm picks a random element as a pivot, partitions the rest of the elements into those smaller than the pivot and those bigger than the pivot, and recurses on these two partitions. We will give the analysis of the running time later.
Speed: For some problems, the best-known randomized algorithms are faster than the best-known deterministic algorithms. This is achieved by requiring only that the correct answer be found with high probability, or that the algorithm run in expected polynomial time. This means that the randomized algorithm may return an incorrect answer in some cases or may take a very long time.
Example: Checking whether a multivariate polynomial is the zero polynomial.
There is no known deterministic polynomial-time algorithm that determines whether a given multivariate polynomial is the zero polynomial. On the other hand, there is a very simple and efficient randomized algorithm for this problem: just evaluate the given polynomial at random points. Note that such a polynomial could be represented implicitly, e.g., as the determinant of a matrix whose entries contain different variables.
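The random-evaluation idea can be sketched in a few lines of Python. This is an illustrative sketch rather than the full Schwartz-Zippel algorithm over a finite field: the polynomial is treated as a black box, and the function name, trial count, and sampling domain are choices made here for the example.

```python
import random

random.seed(0)  # fixed seed so the demonstration is reproducible

def is_probably_zero(poly, num_vars, trials=20, domain=10**6):
    """Evaluate the black-box polynomial at random integer points.
    A nonzero evaluation proves the polynomial is not identically
    zero; if every trial evaluates to 0, the polynomial is the zero
    polynomial with high probability (one-sided error)."""
    for _ in range(trials):
        point = [random.randrange(domain) for _ in range(num_vars)]
        if poly(*point) != 0:
            return False  # certainly not the zero polynomial
    return True           # probably the zero polynomial

# (x + y)^2 - (x^2 + 2xy + y^2) is identically zero...
print(is_probably_zero(lambda x, y: (x + y)**2 - (x**2 + 2*x*y + y**2), 2))  # True
# ...while x*y - x is not.
print(is_probably_zero(lambda x, y: x*y - x, 2))  # False
```

Note the one-sided error: a "not zero" answer is always correct, while a "zero" answer can be wrong only if every random point happens to be a root.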

2. LAS VEGAS AND MONTE CARLO
There are two types of Randomized Algorithms:
1) Las Vegas Algorithm
2) Monte Carlo Algorithm
2.1 LAS VEGAS ALGORITHM
A Las Vegas algorithm always produces the correct answer, but its runtime for each input is a random variable whose expectation is bounded.
Example: Quick-sort algorithm
2.2 MONTE CARLO ALGORITHM
A Monte Carlo algorithm runs for a fixed number of steps on each input and produces an answer that is correct with a bounded probability; that is, it may produce an incorrect solution. It has a fixed, deterministic running time. If the algorithm is run repeatedly with independent random choices each time, the failure probability can be made arbitrarily small, at the expense of running time.
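The effect of independent repetition can be simulated directly. The sketch below is purely illustrative: `one_run` stands in for any hypothetical one-sided-error algorithm that errs with probability 1/2 on YES instances and never errs on NO instances; repeating it k times drives the failure probability down to (1/2)^k.

```python
import random

rng = random.Random(42)  # fixed seed for reproducibility

def one_run(answer_is_yes, err=0.5):
    """One execution of a modeled one-sided-error algorithm: on a
    YES instance it wrongly answers NO with probability `err`; on
    a NO instance it is always correct."""
    if answer_is_yes:
        return rng.random() >= err
    return False

def amplified(answer_is_yes, runs=20):
    """Run independently `runs` times and answer YES if any run
    says YES; the failure probability falls to err**runs."""
    return any(one_run(answer_is_yes) for _ in range(runs))

# Estimate the failure rate of the amplified algorithm on YES instances:
failures = sum(not amplified(True) for _ in range(10_000))
print(failures)  # expected about 10_000 * (1/2)**20, i.e. almost always 0
```

The same trick is used later to boost the success probability of the min-cut algorithm.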
For decision problems (problems for which the answer to an instance is YES or NO), there are two kinds of Monte Carlo algorithms: those with one-sided error and those with two-sided error. A Monte Carlo algorithm is said to have two-sided error if there is a non-zero probability that it errs whether it outputs YES or NO. It is said to have one-sided error if the probability that it errs is zero for at least one of the possible outputs (YES / NO) that it produces.
Example: Min-Cut Algorithm.
Which is better, Monte Carlo or Las Vegas? The answer depends on the application: in some applications an incorrect solution may be catastrophic. A Las Vegas algorithm is by definition a Monte Carlo algorithm with error probability zero. It is also possible to derive a Las Vegas algorithm from a Monte Carlo algorithm; the efficiency of the derivation procedure depends on the time taken to verify the correctness of a solution to the problem.
We say that a Las Vegas algorithm is an efficient Las Vegas algorithm if on any input its expected running time is bounded by a polynomial function of the input size. Similarly, a Monte Carlo algorithm is an efficient Monte Carlo algorithm if on any input its worst-case running time is bounded by a polynomial function of the input size.

3. RANDOMIZED QUICKSORT
Consider sorting a set S of n numbers into ascending order. Pick the first element as the pivot element, say y. Partition the set S − {y} into two sets S1 and S2, where S1 consists of those elements of S that are smaller than y, and S2 has the remaining elements. Recursively sort S1 and S2, then output the elements of S1 in ascending order, followed by y, and then the elements of S2 in ascending order.
Analysis:
If we could find y in cn steps for some constant c, we could partition S − {y} in n − 1 additional steps by comparing each element of S with y. If y always split the set evenly, the total number of steps in our sorting procedure would be given by the recurrence

T(n) <= 2T(n/2) + (c+1)n,

where T(k) represents the time taken by this method to sort k numbers on the worst-case input. This recurrence has the solution T(n) <= c'n log n for a constant c'.
The difficulty with the above scheme in practice is finding the element y that splits S − {y} into two sets of the same size. One simple solution is to choose an element of S at random. This does not always ensure a splitter giving a roughly even split; however, it is reasonable to hope that in the recursive algorithm we will be lucky fairly often. The result is an algorithm we call RandQS, for Randomized Quicksort.

3.2 RANDOMIZED QUICKSORT
The Randomized Quicksort algorithm makes random choices during execution.

Algorithm RandQS:
Input: A set of numbers S.
Output: The elements of S sorted in increasing order.
1. Choose an element y uniformly at random from S: every element in S has equal probability of being chosen.
2. By comparing each element of S with y, determine the set S1 of elements smaller than y and the set S2 of elements larger than y.
3. Recursively sort S1 and S2. Output the sorted version of S1, followed by y, and then the sorted version of S2.
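The three steps above translate directly into Python. This sketch additionally keeps elements equal to the pivot in a separate list, so it also handles duplicates, which the description above (S being a set) does not need to worry about.

```python
import random

def rand_qs(s):
    """Randomized Quicksort as described above."""
    if len(s) <= 1:
        return list(s)
    y = random.choice(s)                       # step 1: uniformly random pivot
    s1 = [x for x in s if x < y]               # step 2: elements smaller than y
    s2 = [x for x in s if x > y]               #         elements larger than y
    equal = [x for x in s if x == y]           # duplicates of the pivot
    return rand_qs(s1) + equal + rand_qs(s2)   # step 3: recurse and concatenate

print(rand_qs([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]
```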
Analysis:
As for any sorting algorithm, we measure the running time of RandQS in terms of the number of comparisons it performs. Our goal is to analyze the expected number of comparisons in an execution of RandQS. All the comparisons are performed in Step 2.
Let S(i) be the i-th smallest element in the input list S. Define the random variable X_ij to assume the value 1 if S(i) and S(j) are compared in an execution, and the value 0 otherwise. Thus X_ij is a count of comparisons between S(i) and S(j), and so the total number of comparisons is

Σ_{i=1}^{n} Σ_{j>i} X_ij.

The expected runtime t of RandQS is

t = E[ Σ_{i=1}^{n} Σ_{j>i} X_ij ] = Σ_{i=1}^{n} Σ_{j>i} E[X_ij].

Let p_ij denote the probability that S(i) and S(j) are compared in an execution. Since X_ij only assumes the values 0 and 1,

E[X_ij] = p_ij · 1 + (1 − p_ij) · 0 = p_ij.

To compute p_ij, we make two observations.
1. There is a comparison between S(i) and S(j) if and only if S(i) or S(j) occurs earlier in the permutation π (the order in which elements are chosen as pivots) than any element S(l) such that i < l < j. To see this, let S(k) be the earliest in π from among all elements of rank between i and j, and view the recursion as a binary tree of pivots. If k is neither i nor j, then S(i) will belong to the left sub-tree of S(k) while S(j) will belong to the right sub-tree of S(k), implying that there is no comparison between S(i) and S(j). Conversely, when k = i or j, there is an ancestor-descendant relationship between S(i) and S(j), implying that the two elements are compared by RandQS.
2. Any of the elements S(i), S(i+1), ..., S(j) is equally likely to be the first of these elements to be chosen as a partitioning element, and hence to appear first in π. Thus, the probability that this first element is either S(i) or S(j) is exactly 2/(j−i+1).
Therefore, the expected runtime t is

t = Σ_{i=1}^{n} Σ_{j>i} E[X_ij] = Σ_{i=1}^{n} Σ_{j>i} p_ij = Σ_{i=1}^{n} Σ_{j>i} 2/(j−i+1) <= Σ_{i=1}^{n} Σ_{k=1}^{n} 2/k ≈ 2n ln n.

Note that 1 + 1/2 + 1/3 + ... + 1/n ≈ ln n.
Randomized Quicksort is a Las Vegas algorithm.
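The 2n ln n bound can be checked empirically by instrumenting RandQS to count the comparisons made in Step 2: each partitioning of a list of length m costs m − 1 comparisons. The input below is already sorted, the worst case for a deterministic first-element pivot; with a random pivot it behaves like any other input. The instrumented function, seed, and trial counts are illustrative choices.

```python
import math
import random

rng = random.Random(0)  # fixed seed for reproducibility

def comparisons(s):
    """Number of comparisons RandQS performs on input s:
    partitioning costs len(s) - 1, plus the recursive costs."""
    if len(s) <= 1:
        return 0
    y = rng.choice(s)
    s1 = [x for x in s if x < y]
    s2 = [x for x in s if x > y]
    return (len(s) - 1) + comparisons(s1) + comparisons(s2)

n, trials = 1000, 50
avg = sum(comparisons(list(range(n))) for _ in range(trials)) / trials
print(avg < 2 * n * math.log(n))  # the observed average respects the 2n ln n bound
```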

The expected running time bound holds for every input: it is an expectation that depends only on the random choices made by the algorithm, and not on any assumptions about the distribution of the input. The behavior of a randomized algorithm can vary even on a single input, from one execution to another. The running time becomes a random variable, and the running-time analysis involves understanding the distribution of this random variable.

4. RANDOMIZED MIN-CUT ALGORITHM
Let G = (V, E) be a connected, undirected multigraph with n vertices. A multigraph may contain multiple edges between any pair of vertices. A cut in G is a set of edges whose removal results in G being broken into two or more components. A min-cut is a cut of minimum cardinality. For instance, if this graph represents a network, a min-cut tells us how robust the network is, in the sense of the minimum number of links whose failure causes the network to become disconnected. The minimum-cut problem is to find a cut of minimum size.
Min-cut algorithm
Algorithm Min-Cut
Input: Connected, undirected multigraph G(V, E).
Output: Size of the min-cut.
1. while |V| > 2 do
2.     Pick an edge e uniformly at random and contract it.
3.     Remove self-loops.
4. end while
5. return |E|
Once we have contracted the first random edge, the rest of the algorithm proceeds recursively on the remaining (n − 1)-node graph. Here is a lucky example.
(Figure: a sequence of edge contractions that happens to preserve the min-cut.)
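A compact way to implement contraction is with a union-find structure: contracting an edge merges its endpoints' super-vertices, and a self-loop is simply an edge whose endpoints already share a super-vertex. The sketch below is one illustrative implementation; the example graph, seed, and repetition count are choices made here, not part of the algorithm.

```python
import random

def contract_min_cut(edges, rng):
    """One run of the randomized contraction algorithm on a connected
    multigraph given as a list of (u, v) pairs. Returns the size of
    the cut it finds, which equals the min-cut with probability >= 2/n^2."""
    vertices = {v for e in edges for v in e}
    leader = {v: v for v in vertices}  # union-find leaders

    def find(v):
        while leader[v] != v:
            leader[v] = leader[leader[v]]  # path halving
            v = leader[v]
        return v

    n = len(vertices)
    live = list(edges)
    while n > 2:
        u, v = rng.choice(live)
        ru, rv = find(u), find(v)
        if ru == rv:
            live.remove((u, v))  # self-loop: remove it, no contraction
        else:
            leader[ru] = rv      # contract the chosen edge
            n -= 1
    # Edges still crossing between the two remaining super-vertices form the cut.
    return sum(1 for u, v in live if find(u) != find(v))

rng = random.Random(1)
# Two triangles joined by one bridge edge: the min-cut has size 1.
g = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
best = min(contract_min_cut(g, rng) for _ in range(50))
print(best)
```

Taking the minimum over 50 independent runs is the repetition idea analyzed next: any single run may contract the bridge, but all 50 missing the min-cut is overwhelmingly unlikely.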


Analysis:
Look at what happens to the edges after they are contracted: they become self-loops and are removed. After the i-th iteration of the while loop (the i-th contraction), n − i nodes remain. Let k be the size of the min-cut, and assume that no min-cut edge has been contracted so far. (We want to know the probability of this 'lucky event' happening.) Each node in the contracted graph has degree >= k (why? because contractions do not decrease the min-cut). Thus after the i-th iteration, the total number of edges in the graph is >= k(n−i)/2.
The probability q1 of picking one of those k min-cut edges in the first merging step is at most k/(kn/2) = 2/n. The probability p1 of not picking any of those k edges in the first merging step is therefore >= 1 − 2/n. Repeat the same argument for the first n − 2 merging steps: the probability p of not picking any of those k edges in all the merging steps is

p >= (1 − 2/n)(1 − 2/(n−1))(1 − 2/(n−2)) ... (1 − 2/3).
Therefore, the probability of finding the min-cut is

p >= (1 − 2/n)(1 − 2/(n−1)) ... (1 − 2/3)
   = ((n−2)/n) · ((n−3)/(n−1)) · ... · (2/4) · (1/3)
   = 2 / (n(n−1)) >= 2/n².

If we repeat the whole procedure n²/2 times, with independent random choices each time, the probability of not finding the min-cut is at most

(1 − 2/n²)^(n²/2) <= 1/e.

By this process of repetition, we have managed to reduce the probability of failure from 1 − 2/n² to a more respectable 1/e. Further repetitions will make the failure probability arbitrarily small, the only consideration being that repetitions increase the running time.
Randomized Min-Cut is a Monte Carlo algorithm.

5. COMPLEXITY CLASSES
A complexity class is a collection of languages all of whose recognition problems can be solved under prescribed bounds on the computational resources. A language recognition problem is to decide whether a given string x in Σ* belongs to a language L. An algorithm solves a language recognition problem for a specific language L by accepting (output YES) any input string contained in L, and rejecting (output NO) any input string not contained in L. We are primarily interested in various forms of efficient algorithms, where efficient is defined as polynomial time. Recall that an algorithm has polynomial running time if it halts within n^O(1) time on any input of length n.
Some basic complexity classes involving randomized algorithms are:
1) Randomized polynomial time (RP)
2) Zero-error probabilistic polynomial time (ZPP)
3) Probabilistic polynomial time (PP)
4) Bounded-error probabilistic polynomial time (BPP)
5.1 CLASS RP
The class RP (for randomized polynomial time) consists of all languages L that
have a randomized algorithm A running in worst case polynomial time such that for any
input x in ?*,
Â¢ x ?L ? Pr[A(x) accepts] >= Ã‚Â½
Â¢ x ? L ? Pr[A(x) accepts] = 0
The choice of the bound on the error probability Ã‚Â½ is arbitrary. In fact , as was
observed in the case of the min-cut algorithm, independent repetitions of the algorithm
can be used to go from the case where the probability of success is polynomially small to
the case where the probability of error is exponentially small while changing only thePage 18

Seminar Report-2005
Randomized Algorithms
Department of Computer Science
CUSAT
18
degree of the polynomial that bounds the running time . Thus , the success probability
can be changed to an inverse polynomial function of the input size without significantly
affecting the definition of RP.
Observe that an RP algorithm is a Monte Carlo algorithm that can err only when x ∈ L. This is referred to as one-sided error. The class co-RP consists of languages that have polynomial-time randomized algorithms erring only in the case when x ∉ L. A problem belonging to both RP and co-RP can be solved by a randomized algorithm with zero-sided error, i.e., a Las Vegas algorithm.
5.2 CLASS ZPP
The class ZPP (for zero-error probabilistic polynomial time) is the class of languages which have Las Vegas algorithms running in expected polynomial time.

ZPP = RP ∩ co-RP
5.3 CLASS PP
The class PP (for probabilistic polynomial time) consists of all languages L that have a randomized algorithm A running in worst-case polynomial time such that for any input x in Σ*:
• x ∈ L ⇒ Pr[A(x) accepts] > 1/2
• x ∉ L ⇒ Pr[A(x) accepts] < 1/2
To reduce the error probability of a two-sided-error algorithm, we can perform several independent iterations on the same input and produce the output that occurs in the majority of these iterations. Unfortunately, the definition of the class PP is rather weak: because we have no bound on how far from 1/2 the probabilities are, it may not be possible to use a small number of repetitions of an algorithm A with such two-sided error probability to obtain an algorithm with significantly smaller error probability.

5.4 CLASS BPP
The class BPP (for bounded-error probabilistic polynomial time) consists of all languages L that have a randomized algorithm A running in worst-case polynomial time such that for any input x in Σ*:
• x ∈ L ⇒ Pr[A(x) accepts] >= 3/4
• x ∉ L ⇒ Pr[A(x) accepts] <= 1/4
For this class of algorithms the error probability can be reduced to 1/2^n with only a polynomial number of iterations. In fact, the probability bounds 3/4 and 1/4 can be changed to 1/2 + 1/p(n) and 1/2 − 1/p(n), respectively, for any polynomially bounded function p(n), without affecting this error-reduction property or the definition of the class BPP to a significant extent.
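The majority-vote amplification that BPP's constant gap makes possible can be simulated. The sketch below is illustrative: `bpp_run` models any algorithm that is correct with probability 3/4 on every input, and the seed and repetition count are arbitrary choices for the example.

```python
import random

rng = random.Random(7)  # fixed seed for reproducibility

def bpp_run(x_in_l, err=0.25):
    """One run of a modeled two-sided-error algorithm: it gives the
    correct YES/NO answer with probability 1 - err = 3/4."""
    correct = rng.random() >= err
    return x_in_l if correct else (not x_in_l)

def majority_vote(x_in_l, runs=101):
    """Repeat independently and output the majority answer; by a
    Chernoff bound the error drops exponentially in `runs`."""
    yes_votes = sum(bpp_run(x_in_l) for _ in range(runs))
    return yes_votes > runs // 2

# Estimate the amplified error rate over many simulated YES instances:
errors = sum(majority_vote(True) is not True for _ in range(1000))
print(errors)  # exponentially small amplified error: almost certainly 0
```

For a PP-style algorithm, where the gap above and below 1/2 can shrink with the input size, this majority vote would need far more repetitions to help, which is exactly the weakness noted above.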

6. ADVANTAGES
There are two principal advantages to randomized algorithms. The first is performance: for many problems, randomized algorithms run faster than the best-known deterministic algorithms. Second, many randomized algorithms are simpler to describe and implement than deterministic algorithms of comparable performance. Randomized Quicksort is an example.

7. DISADVANTAGES
Difficulty of sampling distributions using only random bits: for example, consider an algorithm that picks a random real number from some interval. This requires infinitely many random bits. There is also sometimes a non-trivial computational overhead associated with sampling from a seemingly well-behaved distribution: for example, consider the problem of using a source of unbiased random bits to sample uniformly from a set S whose cardinality is not a power of 2.

8. SCOPE
• Number-theoretic algorithms: primality testing (Monte Carlo).
• Data structures: sorting, order statistics, searching, and computational geometry.
• Algebraic identities: polynomial and matrix identity verification, and interactive proof systems.
• Mathematical programming: faster algorithms for linear programming, and rounding linear-program solutions to integer-program solutions.
• Graph algorithms: minimum spanning trees, shortest paths, and minimum cuts.
• Counting and enumeration: the matrix permanent and counting combinatorial structures.
• Parallel and distributed computing.
• Probabilistic existence proofs: show that a combinatorial object arises with nonzero probability among objects drawn from a suitable probability space.

9. CONCLUSION
The ideas underlying randomized algorithms can be traced back to Monte Carlo methods used in numerical analysis, statistical physics, and simulation. A randomized algorithm is an algorithm that uses random bits as an auxiliary input to guide its behavior. It achieves good performance in the "average case": formally, the algorithm's performance will be a random variable determined by the random bits, with (hopefully) good expected value, and the "worst case" is typically so unlikely to occur that it can be ignored. For randomized Quicksort, the expected running-time bound holds for every input.
There are some interesting complexity classes involving randomized algorithms:
1. Randomized polynomial time (RP)
2. Zero-error probabilistic polynomial time (ZPP)
3. Probabilistic polynomial time (PP)
4. Bounded-error probabilistic polynomial time (BPP)

10. REFERENCES
Books:
Rajeev Motwani and Prabhakar Raghavan: Randomized Algorithms, Cambridge University Press, 1995.
Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein: Introduction to Algorithms.
Websites:
http://en.wikipedia.org/wiki/Probabilistic_algorithm
http://en.wikipedia.org/wiki/Las_Vegas_algorithm