
A POJ problem about hashing: POJ1200 Crazy Search

2013-12-27 21:39
I. Problem description

The problem statement is pasted here; the original page is at http://poj.org/problem?id=1200. It reads as follows:

Description:

Many people like to solve hard puzzles some of which may lead them to madness. One such puzzle could be finding a hidden prime number in a given text. Such number could be the number of different substrings of a given size that exist in the text. As you soon will discover, you really need the help of a computer and a good algorithm to solve such a puzzle.

Your task is to write a program that given the size, N, of the substring, the number of different characters that may occur in the text, NC, and the text itself, determines the number of different substrings of size N that appear in the text.

As an example, consider N=3, NC=4 and the text "daababac". The different substrings of size 3 that can be found in this text are: "daa"; "aab"; "aba"; "bab"; "bac". Therefore, the answer should be 5.

Input:

The first line of input consists of two numbers, N and NC, separated by exactly one space. This is followed by the text where the search takes place. You may assume that the maximum number of substrings formed by the possible set of characters does not exceed 16 Millions.

Output:

The program should output just an integer corresponding to the number of different substrings of size N found in the given text.

Sample Input:

3 4

daababac

Sample Output:

5

Hint:

Huge input, scanf is recommended.

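In short: given a text, count the distinct substrings of length N, where the text contains at most NC different characters. The approach is to treat NC as a radix: each substring is converted into a number in base NC, and that number is used directly as an index into a hash table to check whether the substring has been seen before. Because the problem guarantees that the number of possible substrings (NC^N) does not exceed 16,000,000, a table with 16,000,000 entries is enough. Each character is also assigned an integer to make the conversion easy.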
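The character-to-integer assignment can be sketched as below. This is only an illustration, not code from the original solution: the function name `assign_codes` and the 256-entry table are assumptions, and codes are handed out 1-based in order of first appearance, with 0 meaning "not seen yet".

```c
#include <assert.h>
#include <stddef.h>

/* Assign each distinct byte of `text` a 1-based code in order of first
 * appearance (0 means "not seen yet"). `code` must have 256 entries.
 * Returns the number of distinct bytes found. Hypothetical helper. */
int assign_codes(const char *text, int code[256])
{
    assert(text != NULL);

    int distinct = 0;
    for (int i = 0; i < 256; i++)
        code[i] = 0;

    for (size_t i = 0; text[i] != '\0'; i++) {
        /* Index with unsigned char so bytes above 127 cannot go negative. */
        unsigned char ch = (unsigned char)text[i];
        if (code[ch] == 0)
            code[ch] = ++distinct;
    }
    return distinct;
}
```

For the sample text "daababac" this assigns d=1, a=2, b=3, c=4 and returns 4, which matches NC in the sample input.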

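For example, mapping the characters of daababac from the problem to integers (d=1, a=2, b=3, c=4) gives 12232324, and the substrings then convert to the following numbers:

daa->122->011 (subtract 1 from each digit, because base-4 digits run from 0 to 3)->5 (the base-4 number converted to decimal, which is used as this substring's hash index);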

aab->223->112->22;

aba->232->121->25;
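The conversions above can be checked with a small sketch. The helper name `encode_window` is hypothetical, and it assumes a table of 1-based character codes like the one used in the example (d=1, a=2, b=3, c=4).

```c
#include <assert.h>

/* Interpret the n characters starting at s as an n-digit number in the
 * given base. code[] holds 1-based codes, hence the "- 1" that turns
 * each code into a digit in 0..base-1. Hypothetical helper. */
long encode_window(const char *s, int n, int base, const int code[256])
{
    assert(n > 0 && base > 0);

    long value = 0;
    for (int i = 0; i < n; i++)
        value = value * base + (code[(unsigned char)s[i]] - 1);
    return value;
}
```

With N=3 and NC=4, the windows daa, aab and aba come out as 5, 22 and 25, as computed by hand above; the remaining windows bab and bac come out as 38 and 39.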

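Because the value of each substring is derived from the value of the previous one in constant time, the time complexity is O(n), where n is the length of the text. The implementation is as follows: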

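#include <stdio.h>
#include <string.h>
#define mem(a) memset(a, 0, sizeof(a))

/* One byte per possible value; the problem guarantees NC^N <= 16 million. */
unsigned char hash[16000000 + 5];
/* c[ch] is the 1-based code of character ch, or 0 if not assigned yet. */
int c[256];
char str[1000000];

int main(void)
{
    int len, base;
    while (scanf("%d%d", &len, &base) == 2)
    {
        mem(c);
        mem(hash);
        /* The width limit keeps the read inside str. */
        if (scanf("%999999s", str) != 1)
            break;
        int num = 0;
        int i, j = 0, length = (int)strlen(str), tp = 1;
        /* A text shorter than len contains no substring of that size. */
        if (length < len)
        {
            printf("0\n");
            continue;
        }
        /* Give each distinct character a code 1..base in order of first appearance. */
        for (i = 0; i < length; i++)
        {
            unsigned char ch = (unsigned char)str[i];
            if (c[ch] == 0) c[ch] = ++j;
            if (j == base) break;
        }
        /* Value of the first window as a number in the given base. */
        for (i = 0; i < len; i++)
        {
            num = num * base + c[(unsigned char)str[i]] - 1;
            tp *= base;
        }
        tp /= base; /* tp is now base^(len-1), the weight of the leading digit */
        hash[num] = 1;
        int count = 1;
        /* Slide the window: drop the leading digit, shift, append the new digit. */
        for (i = 1; i <= length - len; i++)
        {
            num = (num - (c[(unsigned char)str[i - 1]] - 1) * tp) * base
                  + c[(unsigned char)str[i + len - 1]] - 1;
            if (!hash[num])
            {
                hash[num] = 1;
                count++;
            }
        }
        printf("%d\n", count);
    }
    return 0;
}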
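The sliding-window update can also be packaged as a standalone function, which makes it easier to test against the sample. The sketch below is a variation, not the program above: the name `count_distinct` is hypothetical, and it allocates a table of exactly NC^N bytes on the heap instead of using a fixed global array. It assumes the text really contains at most `base` distinct characters and that base^n fits in `size_t`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Count the distinct substrings of length n in text, treating each
 * window as an n-digit number in the given base. Returns -1 if the
 * table cannot be allocated. Hypothetical helper. */
long count_distinct(const char *text, int n, int base)
{
    assert(n > 0 && base > 0);

    size_t length = strlen(text);
    if (length < (size_t)n)
        return 0;

    /* 1-based codes in order of first appearance; 0 = unassigned. */
    int code[256] = {0};
    int distinct = 0;
    for (size_t i = 0; i < length; i++) {
        unsigned char ch = (unsigned char)text[i];
        if (code[ch] == 0)
            code[ch] = ++distinct;
    }
    assert(distinct <= base); /* stated assumption about the input */

    size_t high = 1; /* base^(n-1): weight of the leading digit */
    for (int i = 1; i < n; i++)
        high *= (size_t)base;

    unsigned char *seen = calloc(high * (size_t)base, 1);
    if (seen == NULL)
        return -1;

    /* Value of the first window. */
    size_t value = 0;
    for (int i = 0; i < n; i++)
        value = value * (size_t)base
                + (size_t)(code[(unsigned char)text[i]] - 1);
    seen[value] = 1;
    long count = 1;

    /* Each later window: remove the outgoing digit, shift, add the new one. */
    for (size_t i = 1; i + (size_t)n <= length; i++) {
        size_t out = (size_t)(code[(unsigned char)text[i - 1]] - 1);
        size_t in = (size_t)(code[(unsigned char)text[i + n - 1]] - 1);
        value = (value - out * high) * (size_t)base + in;
        if (!seen[value]) {
            seen[value] = 1;
            count++;
        }
    }

    free(seen);
    return count;
}
```

Calling `count_distinct("daababac", 3, 4)` returns 5, the sample answer.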
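What this problem taught me: in string processing, you can exploit the fact that the number of distinct characters is limited and that characters can be mapped to integers. This turns a string problem into one over integers, which makes it much easier to solve.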