
Dynamic Programming and the Longest Common Subsequence (LCS)

2014-09-20 20:43

Characteristics of dynamic programming:

1. Optimal substructure

An optimal solution to a problem contains optimal solutions to its subproblems.

2. Overlapping subproblems

A recursive solution contains a small number of distinct subproblems, each repeated many times.
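To make the second property concrete, here is a minimal sketch in C that counts how many calls a naive recursive LCS makes. The names lcs_naive, lcs_naive_counted and naive_calls are illustrative and do not come from the original post.

```c
#include <assert.h>
#include <string.h>

static long naive_calls;  /* illustrative counter: number of recursive calls made */

/* Naive recursive LCS length of x[0..i-1] and y[0..j-1], with no table. */
static int lcs_naive(const char *x, const char *y, int i, int j) {
    naive_calls++;
    if (i == 0 || j == 0)
        return 0;
    if (x[i - 1] == y[j - 1])
        return lcs_naive(x, y, i - 1, j - 1) + 1;
    int up = lcs_naive(x, y, i - 1, j);
    int left = lcs_naive(x, y, i, j - 1);
    return up > left ? up : left;
}

/* Run the naive recursion on whole strings and report how many calls it took. */
int lcs_naive_counted(const char *x, const char *y, long *calls) {
    assert(x != NULL && y != NULL && calls != NULL);
    naive_calls = 0;
    int len = lcs_naive(x, y, (int)strlen(x), (int)strlen(y));
    *calls = naive_calls;
    return len;
}
```

For x = "AAA" and y = "BBB" (no characters in common) this makes 39 calls, although only (3+1)(3+1) = 16 distinct (i, j) pairs exist. That gap is what a table of stored results removes.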

Longest common subsequence



1. Memoization algorithm

Pseudocode

LCS(x, y, i, j)  // ignoring base cases
    if c[i][j] = NULL
        then if x[i] = y[j]
                 then c[i][j] = LCS(x, y, i-1, j-1) + 1
                 else c[i][j] = max{LCS(x, y, i-1, j), LCS(x, y, i, j-1)}
    return c[i][j]
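The memoized pseudocode can be sketched in C roughly as follows. This is one possible rendering, not the author's code: the bound MAXLEN, the use of -1 as the "not computed" marker (in place of NULL), and the names lcs_rec and lcs_memo are all assumptions made for the example. Indices are shifted so that x[i-1] is the i-th character.

```c
#include <assert.h>
#include <string.h>

#define MAXLEN 50  /* assumed upper bound on string length for this sketch */

static int memo[MAXLEN + 1][MAXLEN + 1];  /* -1 marks "not computed yet" */

/* LCS length of x[0..i-1] and y[0..j-1], computed top-down with a memo table. */
static int lcs_rec(const char *x, const char *y, int i, int j) {
    if (i == 0 || j == 0)
        return 0;                     /* base case: an empty prefix */
    if (memo[i][j] != -1)
        return memo[i][j];            /* already solved: reuse the stored value */
    if (x[i - 1] == y[j - 1]) {
        memo[i][j] = lcs_rec(x, y, i - 1, j - 1) + 1;
    } else {
        int up = lcs_rec(x, y, i - 1, j);
        int left = lcs_rec(x, y, i, j - 1);
        memo[i][j] = up > left ? up : left;
    }
    return memo[i][j];
}

/* Reset the memo table, then solve for the full strings. */
int lcs_memo(const char *x, const char *y) {
    size_t m = strlen(x), n = strlen(y);
    assert(m <= MAXLEN && n <= MAXLEN);
    memset(memo, -1, sizeof memo);    /* every byte 0xFF, so every int reads as -1 */
    return lcs_rec(x, y, (int)m, (int)n);
}
```

Each (i, j) pair is solved at most once, so the running time is proportional to the table size rather than to the number of recursive paths.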
2. Dynamic programming

Traceback example

      Ø  A  G  C  A  T
   Ø  0  0  0  0  0  0
   G  0  0  1  1  1  1
   A  0  1  1  1  2  2
   C  0  1  1  2  2  2

Pseudocode

function LCSLength(X[1..m], Y[1..n])
    C = array(0..m, 0..n)
    for i := 0..m
        C[i,0] = 0
    for j := 0..n
        C[0,j] = 0
    for i := 1..m
        for j := 1..n
            if X[i] = Y[j]
                C[i,j] := C[i-1,j-1] + 1
            else
                C[i,j] := max(C[i,j-1], C[i-1,j])
    return C[m,n]

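#include <stdio.h>

/* c[i][j] = LCS length of x[0..i-1] and y[0..j-1]; requires m, n < 50 */
int c[50][50];

/* Fill the table c for x (length m) and y (length n). */
void LCSlength(char x[], char y[], int m, int n) {
    int i, j;
    for (i = 0; i <= m; i++)
        c[i][0] = 0;            /* first column: y prefix is empty */
    for (j = 0; j <= n; j++)
        c[0][j] = 0;            /* first row: x prefix is empty */
    for (i = 0; i < m; i++)
        for (j = 0; j < n; j++) {
            if (x[i] == y[j])
                c[i + 1][j + 1] = c[i][j] + 1;
            else if (c[i + 1][j] > c[i][j + 1])
                c[i + 1][j + 1] = c[i + 1][j];
            else
                c[i + 1][j + 1] = c[i][j + 1];
        }
}

/* Write one LCS of x and y into lcs, filling it from the last character back. */
void LCS(char *lcs, char *x, char *y, int m, int n) {
    int i, j, k;
    LCSlength(x, y, m, n);
    i = m - 1;
    j = n - 1;
    k = c[m][n] - 1;            /* index of the last LCS character */
    while (i >= 0 && j >= 0) {
        if (x[i] == y[j]) {
            lcs[k--] = x[i];
            i--;
            j--;
        }
        else if (c[i][j + 1] > c[i + 1][j])
            i--;
        else
            j--;
    }
}

int main() {
    char x[7] = {'A','B','C','B','D','A','B'};
    char y[6] = {'B','D','C','A','B','A'};
    char lcs[7];                /* up to min(m, n) characters plus '\0' */
    LCS(lcs, x, y, 7, 6);
    lcs[c[7][6]] = '\0';
    printf("%d %s\n", c[7][6], lcs);
    return 0;
}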

Reading out one LCS by recursive backtracking

Pseudocode

function backtrack(C[0..m,0..n], X[1..m], Y[1..n], i, j)
    if i = 0 or j = 0
        return ""
    else if X[i] = Y[j]
        return backtrack(C, X, Y, i-1, j-1) + X[i]
    else
        if C[i,j-1] > C[i-1,j]
            return backtrack(C, X, Y, i, j-1)
        else
            return backtrack(C, X, Y, i-1, j)

Backtracking all LCSs
Pseudocode

function backtrackAll(C[0..m,0..n], X[1..m], Y[1..n], i, j)
    if i = 0 or j = 0
        return {""}
    else if X[i] = Y[j]
        return {Z + X[i] for all Z in backtrackAll(C, X, Y, i-1, j-1)}
    else
        R := {}
        if C[i,j-1] ≥ C[i-1,j]
            R := backtrackAll(C, X, Y, i, j-1)
        if C[i-1,j] ≥ C[i,j-1]
            R := R ∪ backtrackAll(C, X, Y, i-1, j)
        return R
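C has no built-in set type, so a C version of backtrackAll needs some stand-in for the set union. The sketch below is one way to do it, under several assumptions of mine: a fixed-size result array with a linear duplicate check replaces the set, the bounds MAXLEN and MAXRES are arbitrary, and the names lcs_all, has_result, add_result and walk are illustrative. It follows both branches whenever the two neighbouring table entries tie, as the pseudocode does.

```c
#include <assert.h>
#include <string.h>

#define MAXLEN 50   /* assumed bound on input length */
#define MAXRES 64   /* assumed bound on the number of distinct LCSs kept */

static int tbl[MAXLEN + 1][MAXLEN + 1];
static char results[MAXRES][MAXLEN + 1];
static int nresults;

/* Returns 1 if s is already among the collected results. */
int has_result(const char *s) {
    for (int r = 0; r < nresults; r++)
        if (strcmp(results[r], s) == 0)
            return 1;
    return 0;
}

/* Store s unless it is already present (stands in for the set union). */
static void add_result(const char *s) {
    if (has_result(s))
        return;
    assert(nresults < MAXRES);
    strcpy(results[nresults++], s);
}

/* buf[k..] holds the tail of the LCS built so far; k always equals tbl[i][j]. */
static void walk(const char *x, const char *y, int i, int j, char *buf, int k) {
    if (i == 0 || j == 0) {
        add_result(buf + k);
        return;
    }
    if (x[i - 1] == y[j - 1]) {
        buf[k - 1] = x[i - 1];
        walk(x, y, i - 1, j - 1, buf, k - 1);
        return;
    }
    if (tbl[i][j - 1] >= tbl[i - 1][j])
        walk(x, y, i, j - 1, buf, k);
    if (tbl[i - 1][j] >= tbl[i][j - 1])
        walk(x, y, i - 1, j, buf, k);
}

/* Fill the table, then collect every distinct LCS; returns how many were found. */
int lcs_all(const char *x, const char *y) {
    int m = (int)strlen(x), n = (int)strlen(y);
    assert(m <= MAXLEN && n <= MAXLEN);
    for (int i = 0; i <= m; i++)
        for (int j = 0; j <= n; j++) {
            if (i == 0 || j == 0)
                tbl[i][j] = 0;
            else if (x[i - 1] == y[j - 1])
                tbl[i][j] = tbl[i - 1][j - 1] + 1;
            else
                tbl[i][j] = tbl[i][j - 1] > tbl[i - 1][j] ? tbl[i][j - 1] : tbl[i - 1][j];
        }
    char buf[MAXLEN + 1];
    buf[tbl[m][n]] = '\0';
    nresults = 0;
    walk(x, y, m, n, buf, tbl[m][n]);
    return nresults;
}
```

For "ABCBDAB" and "BDCABA" this collects three strings: BCBA, BCAB and BDAB. The walk is not memoized, so it can revisit the same cell many times on longer inputs.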

Related problems:

1. Shortest common supersequence

u is a common supersequence of x and y if and only if both x and y are subsequences of u.

Given two sequences X = <x1,...,xm> and Y = <y1,...,yn>, a sequence U = <u1,...,uk> is a common supersequence of X and Y if U is a supersequence of both X and Y. In other words, a shortest common supersequence of strings x and y is a shortest string z such that both x and y are subsequences of z.

For example, with X = ABCBDAB and Y = BDCABA (the strings used in the C program above), one lcs is Z = BCBA. By inserting the non-lcs symbols while preserving the symbol order, we get an scs: U = ABDCABDAB.

Relationship to LCS
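As background (a standard identity, not something recovered from the original page): if X has length m and Y has length n, a shortest common supersequence has length m + n - |LCS(X, Y)|, since the LCS symbols are the only ones the two strings can share. The sketch below builds one SCS by walking the LCS table backwards and emitting the skipped symbols along the way. The names scs_build and is_subsequence and the bound MAXLEN are assumptions made for the example.

```c
#include <assert.h>
#include <string.h>

#define MAXLEN 50  /* assumed bound on input length */

static int t[MAXLEN + 1][MAXLEN + 1];

/* Returns 1 if s is a subsequence of u. */
int is_subsequence(const char *s, const char *u) {
    while (*s && *u) {
        if (*s == *u)
            s++;
        u++;
    }
    return *s == '\0';
}

/* Write one shortest common supersequence of x and y into out, which needs
   room for strlen(x) + strlen(y) + 1 chars. Returns the SCS length. */
int scs_build(const char *x, const char *y, char *out) {
    int m = (int)strlen(x), n = (int)strlen(y);
    assert(m <= MAXLEN && n <= MAXLEN);
    for (int i = 0; i <= m; i++)
        for (int j = 0; j <= n; j++) {
            if (i == 0 || j == 0)
                t[i][j] = 0;
            else if (x[i - 1] == y[j - 1])
                t[i][j] = t[i - 1][j - 1] + 1;
            else
                t[i][j] = t[i][j - 1] > t[i - 1][j] ? t[i][j - 1] : t[i - 1][j];
        }
    int len = m + n - t[m][n];
    int i = m, j = n, k = len;
    out[k] = '\0';
    while (i > 0 && j > 0) {
        if (x[i - 1] == y[j - 1]) {         /* shared symbol: emit it once */
            out[--k] = x[i - 1];
            i--;
            j--;
        } else if (t[i - 1][j] >= t[i][j - 1]) {
            out[--k] = x[i - 1];            /* symbol only needed for x */
            i--;
        } else {
            out[--k] = y[j - 1];            /* symbol only needed for y */
            j--;
        }
    }
    while (i > 0)
        out[--k] = x[--i];                  /* leftover prefix of x */
    while (j > 0)
        out[--k] = y[--j];                  /* leftover prefix of y */
    assert(k == 0);
    return len;
}
```

For "ABCBDAB" and "BDCABA" the result has length 7 + 6 - 4 = 9, and is_subsequence confirms that both inputs are subsequences of it. Which of the several possible SCS strings comes out depends on how ties are broken.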



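2. Edit distance / Levenshtein distance

The edit distance, also called the Levenshtein distance, between two strings is the minimum number of edit operations needed to turn one into the other. The permitted operations are replacing one character with another, inserting a character, and deleting a character.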
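The definition translates into a table-filling routine much like the one used for LCS. The following is a minimal sketch assuming unit cost for all three operations. The names levenshtein and min3 and the bound MAXLEN are illustrative.

```c
#include <assert.h>
#include <string.h>

#define MAXLEN 50  /* assumed bound on input length */

static int min3(int a, int b, int c) {
    int m = a < b ? a : b;
    return m < c ? m : c;
}

/* Levenshtein distance between x and y with unit cost for
   substitution, insertion and deletion. */
int levenshtein(const char *x, const char *y) {
    static int d[MAXLEN + 1][MAXLEN + 1];
    int m = (int)strlen(x), n = (int)strlen(y);
    assert(m <= MAXLEN && n <= MAXLEN);
    for (int i = 0; i <= m; i++)
        d[i][0] = i;                        /* delete all i characters */
    for (int j = 0; j <= n; j++)
        d[0][j] = j;                        /* insert all j characters */
    for (int i = 1; i <= m; i++)
        for (int j = 1; j <= n; j++) {
            int sub = d[i - 1][j - 1] + (x[i - 1] != y[j - 1]);
            d[i][j] = min3(sub, d[i - 1][j] + 1, d[i][j - 1] + 1);
        }
    return d[m][n];
}
```

For example, levenshtein("kitten", "sitting") is 3: two substitutions and one insertion.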

The edit distance when only insertion and deletion are allowed (no substitution), or when the cost of a substitution is double the cost of an insertion or deletion, is:

d'(X, Y) = m + n - 2 * |LCS(X, Y)|

where m and n are the lengths of X and Y.
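The insertion/deletion-only distance equals m + n - 2 * |LCS(X, Y)|: every character outside an LCS is either deleted from X or inserted from Y. The sketch below computes the distance directly with its own table and computes the LCS length separately, so the two can be compared. The names lcs_len and indel_distance and the bound MAXLEN are assumptions made for the example.

```c
#include <assert.h>
#include <string.h>

#define MAXLEN 50  /* assumed bound on input length */

/* LCS length of x and y via the bottom-up table. */
int lcs_len(const char *x, const char *y) {
    static int c[MAXLEN + 1][MAXLEN + 1];
    int m = (int)strlen(x), n = (int)strlen(y);
    assert(m <= MAXLEN && n <= MAXLEN);
    for (int i = 0; i <= m; i++)
        for (int j = 0; j <= n; j++) {
            if (i == 0 || j == 0)
                c[i][j] = 0;
            else if (x[i - 1] == y[j - 1])
                c[i][j] = c[i - 1][j - 1] + 1;
            else
                c[i][j] = c[i][j - 1] > c[i - 1][j] ? c[i][j - 1] : c[i - 1][j];
        }
    return c[m][n];
}

/* Edit distance when only insertions and deletions are allowed,
   computed directly without going through the LCS. */
int indel_distance(const char *x, const char *y) {
    static int d[MAXLEN + 1][MAXLEN + 1];
    int m = (int)strlen(x), n = (int)strlen(y);
    assert(m <= MAXLEN && n <= MAXLEN);
    for (int i = 0; i <= m; i++)
        d[i][0] = i;                        /* delete i characters */
    for (int j = 0; j <= n; j++)
        d[0][j] = j;                        /* insert j characters */
    for (int i = 1; i <= m; i++)
        for (int j = 1; j <= n; j++) {
            if (x[i - 1] == y[j - 1]) {
                d[i][j] = d[i - 1][j - 1];  /* matching characters cost nothing */
            } else {
                int del = d[i - 1][j] + 1;
                int ins = d[i][j - 1] + 1;
                d[i][j] = del < ins ? del : ins;
            }
        }
    return d[m][n];
}
```

For "ABCBDAB" and "BDCABA", indel_distance returns 5, which matches 7 + 6 - 2 * 4.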
