HDU 3518 Boring counting (suffix array: count the distinct substrings that occur at least twice without overlapping)
2015-03-03 20:47
Problem link: http://acm.hdu.edu.cn/showproblem.php?pid=3518
Problem Description
035 now faces a tough problem: his English teacher gives him a string consisting of n lowercase letters, and he must figure out how many substrings appear at least twice, where the appearances must not overlap each other.
Take aaaa as an example. "a" appears four times, and "aa" appears two times without overlapping. However, "aaa" cannot appear more than once without overlapping, since we can only get "aaa" from [0-2] (string positions begin at 0) and [1-3], and the intervals [0-2] and [1-3] overlap each other. So "aaa" is not counted. Therefore, the answer is 2 ("a" and "aa").
Input
The input consists of several test cases and ends with a line "#". Each test case contains a string of lowercase letters whose length n does not exceed 1000 (n <= 1000).
Output
For each test case, output an integer ans, the answer for that test case. You'd better use int64 to avoid unnecessary trouble.
Sample Input
aaaa
ababcabb
aaaaaa
#
Sample Output
2
3
3
Source
2010 ACM-ICPC Multi-University Training Contest (9), hosted by HNU
Problem summary:
As the title says: count the distinct substrings of the string that occur at least twice without overlapping.
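To make the task concrete, here is a minimal brute-force sketch that can serve as a reference on small inputs. It is not the approach used in this post: the function name `countBruteForce` is hypothetical, and the idea is simply to take every distinct substring and compare its leftmost and rightmost occurrences, since those two are the pair that lies furthest apart.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical helper (not from the original post): brute-force reference.
// Counts the distinct substrings that have two non-overlapping occurrences.
long long countBruteForce(const std::string& s) {
    const std::size_t n = s.size();
    std::set<std::string> seen;  // distinct substrings already examined
    long long answer = 0;
    // A substring longer than n/2 cannot occur twice without overlapping.
    for (std::size_t len = 1; 2 * len <= n; ++len) {
        for (std::size_t start = 0; start + len <= n; ++start) {
            const std::string sub = s.substr(start, len);
            if (!seen.insert(sub).second) continue;  // examined before
            const std::size_t first = s.find(sub);   // leftmost occurrence
            const std::size_t last = s.rfind(sub);   // rightmost occurrence
            assert(first != std::string::npos);      // sub was cut out of s
            // The two occurrences are disjoint when they start at least len apart.
            if (last >= first + len) ++answer;
        }
    }
    return answer;
}
```

On the three sample strings this returns 2, 3 and 3, matching the sample output. It is far too slow to be a sensible solution for n = 1000, but it is handy for cross-checking a suffix array solution on short random strings.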
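PS:
This solution uses the properties of a suffix array.
First compute the sa, rank and height arrays of the suffix array.
Then enumerate the length i of the repeated substring. For each i, scan the height array for runs of consecutive entries whose value is at least i (why the entries must be consecutive is explained below), and within each run find the largest and smallest starting positions sa of the suffixes involved.
If the difference between the largest and smallest position is at least i (the two occurrences do not overlap), add one to the answer.
Why must the height entries be consecutive?
Because this avoids double counting: suffixes that share the same prefix of length i are necessarily adjacent in rank order, so their height entries form one consecutive run and that prefix is counted exactly once. Suffixes with a different prefix fall into a different run and are checked separately.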
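The scan described above can be sketched in isolation as follows. To keep the example short and self-contained, the suffix array is built here by sorting suffix start positions directly with `std::sort`, and the LCP values are computed by direct comparison. That is a deliberately naive stand-in, not the doubling algorithm this post relies on. The names `buildSuffixArrayNaive`, `buildHeightNaive` and `countNonOverlappingRepeats` are hypothetical, and the arrays are 0-indexed with no sentinel character.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical helper: sorts suffix start positions directly (naive, slow).
std::vector<int> buildSuffixArrayNaive(const std::string& s) {
    std::vector<int> sa(s.size());
    std::iota(sa.begin(), sa.end(), 0);
    std::sort(sa.begin(), sa.end(), [&s](int a, int b) {
        return s.compare(a, std::string::npos, s, b, std::string::npos) < 0;
    });
    return sa;
}

// height[k] = longest common prefix of suffixes sa[k-1] and sa[k]; height[0] = 0.
std::vector<int> buildHeightNaive(const std::string& s, const std::vector<int>& sa) {
    const int n = static_cast<int>(s.size());
    std::vector<int> height(sa.size(), 0);
    for (std::size_t k = 1; k < sa.size(); ++k) {
        const int a = sa[k - 1], b = sa[k];
        int l = 0;
        while (a + l < n && b + l < n && s[a + l] == s[b + l]) ++l;
        height[k] = l;
    }
    return height;
}

// For each length, every maximal run of height values >= len is one distinct
// substring; it counts if its extreme start positions are at least len apart.
long long countNonOverlappingRepeats(const std::string& s) {
    const int n = static_cast<int>(s.size());
    const std::vector<int> sa = buildSuffixArrayNaive(s);
    const std::vector<int> height = buildHeightNaive(s, sa);
    assert(sa.size() == height.size());
    long long answer = 0;
    for (int len = 1; 2 * len <= n; ++len) {
        int lo = n, hi = -1;  // smallest / largest start position in the current run
        for (int k = 1; k < n; ++k) {
            if (height[k] >= len) {
                lo = std::min({lo, sa[k - 1], sa[k]});
                hi = std::max({hi, sa[k - 1], sa[k]});
            } else {
                if (hi - lo >= len) ++answer;  // close the run
                lo = n;
                hi = -1;
            }
        }
        if (hi - lo >= len) ++answer;  // close the final run
    }
    return answer;
}
```

For "aaaa" the sorted suffixes start at 3, 2, 1, 0 and the height values are 0, 1, 2, 3. With length 2, the run covers start positions 0 to 2, and 2 - 0 >= 2, so "aa" is counted. With length 1 the run covers 0 to 3, so "a" is counted, giving 2 in total.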
The code is as follows:
#include <cstdio>
#include <cstring>

const int N = 1017;
int wa[N], wb[N], wv[N], ws[N];
int rank[N];     // rank array: rank[i] is the position of suffix i in sorted order
int height[N];   // longest common prefix of two suffixes adjacent in rank
char str[N];
int s[N], sa[N]; // sa is the suffix array: the start positions of the n suffixes in sorted order

int Max(int a, int b)
{
    return a > b ? a : b;
}
int Min(int a, int b)
{
    return a < b ? a : b;
}
int cmp(int *r, int a, int b, int l)
{
    return r[a] == r[b] && r[a+l] == r[b+l];
}

// Parameter n of get_sa is the number of characters in the string,
// including the 0 appended manually at the end.
// Parameter m of get_sa is the range of character values, used by the radix sort.
// If the original sequence consists of letters, 128 can be used directly.
// If the original sequence consists of integers, m can be the largest value plus 1.
void get_sa(int *r, int *sa, int n, int m) // doubling algorithm
{
    int i, j, p, *x = wa, *y = wb, *t;
    for (i = 0; i < m; i++) ws[i] = 0;
    for (i = 0; i < n; i++) ws[x[i] = r[i]]++;
    for (i = 1; i < m; i++) ws[i] += ws[i-1];
    for (i = n-1; i >= 0; i--) sa[--ws[x[i]]] = i; // sort the substrings of length 1
    for (p = 1, j = 1; p < n; j *= 2, m = p)
    {
        for (p = 0, i = n-j; i < n; i++) y[p++] = i;
        for (i = 0; i < n; i++) if (sa[i] >= j) y[p++] = sa[i]-j; // order by second key
        for (i = 0; i < n; i++) wv[i] = x[y[i]];
        for (i = 0; i < m; i++) ws[i] = 0;
        for (i = 0; i < n; i++) ws[wv[i]]++;
        for (i = 1; i < m; i++) ws[i] += ws[i-1];
        for (i = n-1; i >= 0; i--) sa[--ws[wv[i]]] = y[i]; // sort by first key
        for (t = x, x = y, y = t, p = 1, x[sa[0]] = 0, i = 1; i < n; i++)
            x[sa[i]] = cmp(y, sa[i-1], sa[i], j) ? p-1 : p++; // update the rank array
    }
    return;
}

void get_height(int *r, int *sa, int n) // compute the height array
{
    int i, j, k = 0;
    for (i = 1; i <= n; i++) rank[sa[i]] = i;
    for (i = 0; i < n; height[rank[i++]] = k)
        for (k ? k-- : 0, j = sa[rank[i]-1]; r[i+k] == r[j+k]; k++);
    return;
}

int main()
{
    while (scanf("%s", str) != EOF)
    {
        int len = strlen(str);
        if (str[0] == '#')
            break;
        for (int i = 0; i < len; i++)
            s[i] = str[i]-'a'+1;
        s[len] = 0; // setting this sentinel to 0 is the key point
        get_sa(s, sa, len+1, 27);
        get_height(s, sa, len);
        // for (int i = 0; i < len+1; i++) printf("%d %d\n", i, sa[i]);
        int ans = 0;
        for (int i = 1; i <= (len+1)/2; i++)
        {
            // Checking half the length is enough: a substring longer than
            // (len+1)/2 cannot occur twice without overlapping.
            // Repeated substrings of length i:
            int minn = N;
            int maxn = -1;
            for (int j = 1; j <= len; j++)
            {
                if (height[j] >= i) // still inside a consecutive run
                {
                    minn = Min(minn, Min(sa[j-1], sa[j]));
                    maxn = Max(maxn, Max(sa[j-1], sa[j]));
                }
                else // the run of height values >= i has ended
                {
                    if (minn+i <= maxn)
                        ans++;
                    minn = N, maxn = -1;
                }
            }
            if (minn+i <= maxn)
                ans++;
        }
        printf("%d\n", ans);
    }
    return 0;
}
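The compact loop in get_height can be hard to read. The sketch below spells out the same idea in a more explicit form. It is a 0-indexed variant without the sentinel character, and the function name `buildHeight` is hypothetical. The technique is the linear-time LCP computation usually attributed to Kasai et al.: suffixes are processed in text order, and when moving from suffix i to suffix i+1 the matched length can drop by at most one, so the comparison never restarts from zero.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: height[k] = LCP of suffixes sa[k-1] and sa[k], height[0] = 0.
// Assumes sa is a valid 0-indexed suffix array of s (no sentinel character).
std::vector<int> buildHeight(const std::string& s, const std::vector<int>& sa) {
    const int n = static_cast<int>(s.size());
    assert(static_cast<int>(sa.size()) == n);
    std::vector<int> rank(n), height(n, 0);
    for (int k = 0; k < n; ++k) rank[sa[k]] = k;  // inverse of sa
    int matched = 0;  // characters known to match for the current suffix
    for (int i = 0; i < n; ++i) {
        if (rank[i] == 0) {  // smallest suffix has no predecessor in sorted order
            matched = 0;
            continue;
        }
        const int j = sa[rank[i] - 1];  // suffix just before suffix i in sorted order
        while (i + matched < n && j + matched < n && s[i + matched] == s[j + matched])
            ++matched;
        height[rank[i]] = matched;
        if (matched > 0) --matched;  // suffix i+1 keeps at least matched-1 characters
    }
    return height;
}
```

For example, with s = "aaaa" and sa = {3, 2, 1, 0}, this yields height = {0, 1, 2, 3}. The explicit bounds checks replace the role that the appended 0 plays in the code above, where the unique sentinel guarantees that the comparison loop stops.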
Finally, here are two links that explain suffix arrays in great detail:
1、http://wenku.baidu.com/link?url=g6BzfQTbVcypv0wSYhjw0dtlf6eKSp20Jm5KZDT_EzMKZec_tNGNCUU7lzUWcyC6BvhC73raG2E0Uda2lmykTkmGEuhaWKxJV6CA8K3ZzG_
2、http://www.cnblogs.com/staginner/archive/2012/02/02/2335600.html