
Reading Notes on "Data Structure and Algorithm Analysis in C++", Part 2

2018-01-11 18:37

Chapter 2 Algorithm Analysis

Topics:

*How to estimate the time required for a program.

*How to reduce the running time of a program from days or years to fractions of a second.

*The results of careless use of recursion.

*Very efficient algorithms to raise a number to a power and to compute the greatest common divisor of two numbers.

2.1 Mathematical Background



The definitions below compare relative rates of growth:

Def 1. T(N) = O(f(N)) if the growth rate of T(N) is <= that of f(N).

Def 2. T(N) = Ω(g(N)) if the growth rate of T(N) is >= that of g(N).

Def 3. T(N) = Θ(h(N)) if the growth rate of T(N) is == that of h(N).

Def 4. T(N) = o(p(N)) if the growth rate of T(N) is < that of p(N).

When we say that T(N) = O(f(N)), we are guaranteeing that the function T(N) grows at a rate no faster than f(N); thus f(N) is an upper bound on T(N). Equivalently, when T(N) = Ω(g(N)), g(N) is a lower bound on T(N).

Example: N^3 grows faster than N^2, so N^2 = O(N^3) and, equivalently, N^3 = Ω(N^2).

Example: f(N) = N^2 and g(N) = 2N^2 grow at the same rate, so both f(N) = O(g(N)) and f(N) = Ω(g(N)) are true.

When two functions grow at the same rate, whether or not to signify this with Θ() depends on the particular context.

Another example: if g(N) = 2N^2, then we can say that g(N) = O(N^4), g(N) = O(N^3), and g(N) = O(N^2),

but the last option is the best answer.
Writing g(N) = Θ(N^2) says not only that g(N) = O(N^2) but also that the result is as good as possible.









2.2 Model

Assume a standard computer with fixed-size (say, 32-bit) integers and no fancy operations, such as matrix inversion or sorting, which clearly cannot be done in one time unit. We also assume infinite memory.

2.3 What to Analyze

The most important resource to analyze is generally the running time. In the theoretical model, the main factors are the algorithm used and the input to the algorithm.



If a program is running much more slowly than the algorithm analysis suggests, there may be an implementation inefficiency. This can occur in C++ when arrays are inadvertently copied in their entirety, instead of passed by reference.

Generally, the quantity required is the worst-case time unless otherwise specified.

Average-case bounds are usually much more difficult to compute.





The book's figures compare the running times of algorithms with different growth rates, first for small input sizes and then for large ones; for small inputs the curves are close together, while for large inputs the growth rate dominates.

2.4 Running-Time Calculations

Normally we are essentially doing the analysis by computing a Big-Oh running time (ignore the low-order terms, throw away leading constants, etc.). This guarantees that the program will terminate within a certain time period; the program may stop earlier than this, but never later.

2.4.1 A Simple Example



2.4.2 General Rules

Rule 1-FOR loops

The running time of a for loop is at most the running time of the statements inside the for loop (including tests) times the number of iterations.

Rule 2-Nested loops

Analyze these inside out. The total running time of a statement inside a group of nested loops is the running time of the statement multiplied by the product of the sizes of all the loops.



Rule 3-Consecutive Statements

These just add (which means that the maximum is the one that counts).



Rule 4-If/Else

The running time of an if/else statement is never more than the running time of the test plus the larger of the running times of the two branches.



The other rules are obvious, but the basic strategy of analyzing from the inside (or deepest part) out works. If there are function calls, they must be analyzed first. If there are recursive functions, there are several options. If the recursion is really just
a thinly veiled for loop, the analysis is usually trivial.

The following function is really just a simple loop and is O(N):



The next example is a terrible use of recursion.



Even for N around 40 it is terribly inefficient; the growth rate of this algorithm is exponential.

Let T(N) be the running time for the function call fib(n).

If N = 0 or N = 1, the running time is some constant value; we say that T(0) = T(1) = 1.

When N >= 2, the time to execute the function is the constant work at line 1 plus the work at line 3.

Line 3 consists of an addition and two function calls. The function calls are not simple operations; they must be analyzed by themselves.

The first function call is fib(n - 1) and hence, by the definition of T, requires T(N - 1) units of time.

The second function call, fib(n - 2), similarly requires T(N - 2) units of time.

The total time required is then T(N) = T(N - 1) + T(N - 2) + 2.

Since fib(N) = fib(N - 1) + fib(N - 2), an easy induction shows that T(N) >= fib(N). By Section 1.2.5, fib(N) < (5/3)^N, and for N > 4, fib(N) >= (3/2)^N, so the running time grows exponentially.

2.4.3 Solutions for the Maximum Subsequence Sum Problem

The first algorithm to solve the maximum contiguous subsequence sum problem posed earlier is O(N^3):

/*
* Cubic maximum contiguous subsequence sum algorithm.
*/
int maxSubSum1(const std::vector<int> & a)
{
    int maxSum = 0;

    for (int i = 0; i < a.size(); ++i)
        for (int j = i; j < a.size(); ++j)
        {
            int thisSum = 0;

            for (int k = i; k <= j; ++k)
                thisSum += a[k];

            if (thisSum > maxSum)
                maxSum = thisSum;
        }
    return maxSum;
}

Precise Analysis:
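The omitted precise analysis can be reconstructed by counting how many times the innermost statement thisSum += a[k] executes:

```latex
\sum_{i=0}^{N-1}\sum_{j=i}^{N-1}\sum_{k=i}^{j} 1
  \;=\; \sum_{i=0}^{N-1}\sum_{j=i}^{N-1} (j - i + 1)
  \;=\; \sum_{i=0}^{N-1} \frac{(N-i)(N-i+1)}{2}
  \;=\; \frac{N(N+1)(N+2)}{6} \;=\; O(N^3)
```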





Algorithm 2 is clearly O(N^2):

/*
* Quadratic maximum contiguous subsequence sum algorithm.
*/
int maxSubSum2(const std::vector<int> & a)
{
    int maxSum = 0;

    for (int i = 0; i < a.size(); ++i)
    {
        int thisSum = 0;
        for (int j = i; j < a.size(); ++j)
        {
            thisSum += a[j];

            if (thisSum > maxSum)
                maxSum = thisSum;
        }
    }
    return maxSum;
}

Algorithm 3 is a recursive and relatively complicated O(N log N) solution.
It uses the "divide-and-conquer" strategy. The idea is to split the problem into two roughly equal subproblems,

which are then solved recursively. This is the "divide" part. The "conquer" stage consists of patching together the two solutions of the subproblems, and possibly doing a small amount of additional work, to arrive at a solution for the whole problem.

The core of the implementation is to divide the sequence into two halves: compute the maximum subsequence sum of the first half, compute the maximum subsequence sum of the second half, and compute the largest sum in the first half that includes the last element of the first half plus the largest sum in the second half that includes the first element of the second half (these two sums can be added together because the resulting subsequence crosses the middle).

Then choose the maximum of these three values.

#include <iostream>
#include <vector>

// Returns the maximum of three values.
inline int max3(int a, int b, int c)
{
    return a > b ?
        (a > c ? a : c) :
        (b > c ? b : c);
}

/*
 * Recursive maximum contiguous subsequence sum algorithm.
 * Finds maximum sum in subarray spanning a[left..right].
 * Does not attempt to maintain actual best sequence.
 */
int maxSumRec(const std::vector<int> & a, int left, int right)
{
    if (left == right) // base case
    {
        if (a[left] > 0)
            return a[left];
        else
            return 0;
    }

    int center = (left + right) / 2;
    int maxLeftSum = maxSumRec(a, left, center);
    int maxRightSum = maxSumRec(a, center + 1, right);

    int maxLeftBorderSum = 0, leftBorderSum = 0;
    for (int i = center; i >= left; --i)
    {
        leftBorderSum += a[i];
        if (leftBorderSum > maxLeftBorderSum)
            maxLeftBorderSum = leftBorderSum;
    }

    int maxRightBorderSum = 0, rightBorderSum = 0;
    for (int j = center + 1; j <= right; ++j)
    {
        rightBorderSum += a[j];
        if (rightBorderSum > maxRightBorderSum)
            maxRightBorderSum = rightBorderSum;
    }

    return max3(maxLeftSum, maxRightSum,
                maxLeftBorderSum + maxRightBorderSum);
}

/*
 * Driver for divide-and-conquer maximum contiguous
 * subsequence sum algorithm.
 */
int maxSubSum3(const std::vector<int> & a)
{
    return maxSumRec(a, 0, a.size() - 1);
}

int main(int argc, char ** argv)
{
    std::vector<int> v = {4, -3, 5, -2, -1, 2, 6, -2};
    std::cout << maxSubSum3(v) << std::endl;
    return 0;
}

Analysis:
Let T(N) be the time it takes to solve the problem.

If N = 1, then T(1) = 1.

and T(N) = 2T(N/2) + O(N)   

2T(N/2) for recursive and O(N) for the two loops.

To simplify the calculation, we can replace the O(N) term in the equation above with N; since T(N) will be expressed in Big-Oh notation anyway, this does not affect the answer.

If T(N) = 2T(N/2) + N and T(1) = 1, then T(2) = 4 = 2 * 2, T(4) = 12 = 4 * 3, T(8) = 32 = 8 * 4, and T(16) = 80 = 16 * 5.

If N = 2^k, then T(N) = N * (k + 1) = N log N + N = O(N log N).

The analysis holds only when N is a power of 2; if it is not, a more complicated analysis is required, but the Big-Oh result remains unchanged.
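The pattern above can be verified by telescoping the recurrence: divide T(N) = 2T(N/2) + N through by N and sum the resulting equations:

```latex
\frac{T(N)}{N} = \frac{T(N/2)}{N/2} + 1,\qquad
\frac{T(N/2)}{N/2} = \frac{T(N/4)}{N/4} + 1,\;\dots
\;\Rightarrow\; \frac{T(N)}{N} = \frac{T(1)}{1} + \log_2 N
\;\Rightarrow\; T(N) = N\log_2 N + N = O(N\log N)
```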

Algorithm 4 has time complexity O(N):

/*
* Linear-time maximum contiguous subsequence sum algorithm.
*/
int maxSubSum4(const std::vector<int> & a)
{
    int maxSum = 0, thisSum = 0;
    for (int j = 0; j < a.size(); ++j)
    {
        thisSum += a[j];

        if (thisSum > maxSum)
            maxSum = thisSum;
        else if (thisSum < 0)
            thisSum = 0;
    }
    return maxSum; // the original transcription was missing this return
}
Sketch of the correctness argument for the algorithm above: if thisSum ever becomes negative, no subsequence that begins at the current starting position and extends to or beyond j can be optimal, so the running sum can safely be reset to zero. Any maximum subsequence must begin either at the start of the array or immediately after such a reset, so a single pass finds it.

An online algorithm that requires only constant space and runs in linear time is just about as good as possible.

2.4.4 Logarithms in the Running Time

An algorithm is O(log N) if it takes constant (O(1)) time to cut the problem size by a fraction (which is usually 1/2). On the other hand, if constant time is required to merely reduce the problem by a constant amount (such as making the problem smaller by 1), then the algorithm is O(N).

When we talk about O(logN) algorithms for these kinds of problems, we usually presume that the input is preread.

Binary Search



#define NOT_FOUND (-1)
/**
 * Performs the standard binary search using two comparisons per level.
 * Returns index where item is found or -1 if not found.
 */
template <typename Comparable>
int binarySearch(const std::vector<Comparable> & a, const Comparable & x)
{
    int low = 0, high = a.size() - 1;
    while (low <= high)
    {
        int mid = (low + high) / 2;
        if (a[mid] < x)
            low = mid + 1;
        else if (a[mid] > x)
            high = mid - 1;
        else
            return mid;
    }
    return NOT_FOUND; // NOT_FOUND is defined as -1
}



Euclid's Algorithm

Calculates the greatest common divisor of two integers.

long long gcd(long long m, long long n)
{
    while (n != 0)
    {
        long long rem = m % n;
        m = n;
        n = rem;
    }
    return m;
}



Exponentiation

Calculates the power of an integer.

long long pow(long long x, int n)
{
    if (n == 0)
        return 1;
    if (n == 1)
        return x;
    if (n % 2 == 0)
        return pow(x * x, n / 2);
    else
        return pow(x * x, n / 2) * x;
}

For example, pow(2, 10) calls pow(4, 5), then pow(16, 2) (times 4), then pow(256, 1), yielding 1024; since n is halved at every step, only O(log n) multiplications are performed.

2.4.5 Limitations of Worst-Case Analysis

Sometimes the analysis is shown empirically to be an overestimate. If this is the case, then either the analysis needs to be tightened (usually by a clever observation), or it may be that the average running time is significantly less than the worst-case running time and no improvement in the bound is possible. For many complicated algorithms the worst-case bound is achievable by some bad input but is usually an overestimate in practice. Unfortunately, for most of these problems an average-case analysis is extremely complex (in many cases still unsolved), and a worst-case bound, even though overly pessimistic, is the best analytical result known.