
HDU 2473: Junk-Mail Filter

2015-10-21 20:51

Junk-Mail Filter

Time Limit: 15000/8000 MS (Java/Others) Memory Limit: 32768/32768 K (Java/Others)

Total Submission(s): 7717 Accepted Submission(s): 2435

Problem Description
Recognizing junk mails is a tough task. The method used here consists of two steps:

1) Extract the common characteristics from the incoming email.

2) Use a filter matching the set of common characteristics extracted to determine whether the email is a spam.

We want to extract the set of common characteristics from the N sample junk emails available at the moment, and thus having a handy data-analyzing tool would be helpful. The tool should support the following kinds of operations:

a) “M X Y”, meaning that we think that the characteristics of spam X and Y are the same. Note that the relationship defined here is transitive, so relationships (other than the one between X and Y) need to be created if they are not present at the moment.

b) “S X”, meaning that we think spam X had been misidentified. Your tool should remove all relationships that spam X has when this command is received; after that, spam X will become an isolated node in the relationship graph.

Initially no relationships exist between any pair of the junk emails, so the number of distinct characteristics at that time is N.

Please help us keep track of any necessary information to solve our problem.

Input
There are multiple test cases in the input file.

Each test case starts with two integers, N and M (1 ≤ N ≤ 10^5, 1 ≤ M ≤ 10^6), the number of email samples and the number of operations. M lines follow, each line is one of the two formats described above.

Two successive test cases are separated by a blank line. A case with N = 0 and M = 0 indicates the end of the input file, and should not be processed by your program.

Output
For each test case, please print a single integer, the number of distinct common characteristics, to the console. Follow the format as indicated in the sample below.

Sample Input

5 6
M 0 1
M 1 2
M 1 3
S 1
M 1 2
S 3

3 1
M 1 2

0 0


Sample Output

Case #1: 3
Case #2: 2
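
The expected answers can be cross-checked with a deliberately slow reference model. The sketch below is only an illustration under assumptions, not the intended solution: `Op` and `countGroupsBruteForce` are hypothetical names, every email carries a group label, "M" relabels one whole group in O(N), and "S" hands the email a fresh label. That is far too slow for the stated limits, but it makes the semantics of the two operations easy to verify.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical operation record: type is 'M' (merge x and y) or 'S' (isolate x).
struct Op {
    char type;
    int x;
    int y;  // ignored for 'S'
};

// Brute-force reference: label[i] is the group id of email i.
int countGroupsBruteForce(int n, const std::vector<Op>& ops) {
    std::vector<int> label(n);
    for (int i = 0; i < n; ++i) label[i] = i;
    int nextLabel = n;  // fresh labels handed out by 'S'
    for (const Op& op : ops) {
        assert(op.x >= 0 && op.x < n);
        if (op.type == 'M') {
            assert(op.y >= 0 && op.y < n);
            int from = label[op.x], to = label[op.y];
            if (from == to) continue;
            // Move every member of x's group into y's group.
            for (int i = 0; i < n; ++i)
                if (label[i] == from) label[i] = to;
        } else {
            // Isolate x; the rest of its old group is untouched.
            label[op.x] = nextLabel++;
        }
    }
    // The answer is the number of distinct labels still in use.
    return static_cast<int>(std::set<int>(label.begin(), label.end()).size());
}
```

On the first sample this model returns 3: "S 1" isolates email 1, "M 1 2" pulls it back into {0, 1, 2, 3}, and "S 3" then isolates email 3, leaving {0, 1, 2}, {3} and {4}.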


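This problem calls for virtual parent nodes. Suppose 1, 2, and 3 all have node 1 as their parent and we now need to delete node 1. Nodes 2 and 3 must stay under a common parent, so simply setting the parent of 1 to 1 is wrong: it already is 1, and nothing changes.

If instead 1, 2, and 3 all have parent 4, then deleting 1 just means giving 1 a new parent, 7. Nodes 2 and 3 still have parent 4, which is exactly what we want.

In more detail: an ordinary union-find initializes every node as its own parent. That does not work here, because nodes have to be deleted. For example, if 1, 2, and 3 all have parent 1 and we delete 1, the parent of 1 is still 1 and the parents of 2 and 3 are still 1 as well, so the number of sets stays at 1 when the correct answer is 2. So we give every node a virtual parent, that is, we no longer initialize each node as its own parent. For 1, 2, and 3 we can set the initial parents to 4, 5, and 6. If 1, 2, and 3 are later merged into one set, they all have parent 4; deleting node 1 then gives 1 the new parent 7, while 2 and 3 keep parent 4.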
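
The 1, 2, 3 example can be sketched in a few lines. This is a minimal illustration under assumptions rather than the submitted solution: `VirtualDsu` and its members are hypothetical names, the vector grows by one entry per deletion instead of being preallocated, and `find` uses path halving, which is a different compression scheme from the two-pass loop in the full program.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical sketch: real nodes are 0..n-1, and node i starts
// under its own virtual parent n+i.
struct VirtualDsu {
    int n;
    std::vector<int> parent;

    explicit VirtualDsu(int count) : n(count), parent(2 * count) {
        for (int i = 0; i < n; ++i) {
            parent[i] = n + i;      // real node points at its virtual parent
            parent[n + i] = n + i;  // virtual parent is its own root
        }
    }

    int find(int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];  // path halving
            x = parent[x];
        }
        return x;
    }

    void merge(int x, int y) {
        int rx = find(x), ry = find(y);
        if (rx != ry) parent[rx] = ry;
    }

    // A real node is never anyone's parent, so re-pointing x detaches only x.
    void remove(int x) {
        assert(x >= 0 && x < n);
        parent.push_back(static_cast<int>(parent.size()));  // fresh virtual root
        parent[x] = static_cast<int>(parent.size()) - 1;
    }

    // Number of distinct sets among the real nodes.
    int countSets() {
        std::set<int> roots;
        for (int i = 0; i < n; ++i) roots.insert(find(i));
        return static_cast<int>(roots.size());
    }
};
```

With `VirtualDsu d(4)`, merging 1 with 2 and 2 with 3 and then calling `d.remove(1)` leaves 2 and 3 under the same root while 1 sits alone under a new one. The key invariant is that only virtual nodes ever act as roots, so removing a real node cannot orphan the rest of its set.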

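#include <cstdio>
#include <cstring>
using namespace std;

// Constants instead of unparenthesized macros such as `100000+10`,
// which expand unexpectedly inside larger expressions.
const int MAXN = 100000 + 10;   // maximum number of emails
const int MAXM = 1000000 + 10;  // maximum number of operations

int n, m, total;                // n emails, m operations, total = next unused node id
int father[2 * MAXN + MAXM];    // real nodes, virtual parents, one spare node per operation
int flag[2 * MAXN + MAXM];      // flag[r] != 0 once root r has been counted

// Real nodes 0..n-1 hang off virtual parents n..2n-1;
// every other node starts as its own root.
void init()
{
    for (int i = 0; i < n; i++)
        father[i] = n + i;
    for (int i = n; i < 2 * n + m; i++)
        father[i] = i;
}

// Find the root of x, then point every node on the path directly at it.
int find(int x)
{
    int t = x;
    while (t != father[t])
        t = father[t];
    int i = x, j;
    while (i != t)
    {
        j = father[i];
        father[i] = t;
        i = j;
    }
    return t;
}

void merge(int x, int y)
{
    int fx = find(x);
    int fy = find(y);
    if (fx != fy)
        father[fx] = fy;
}

// Detach x by giving it a brand-new virtual parent.
// No node ever points at a real node, so the rest of the set is unaffected.
void del(int x)
{
    father[x] = total++;
}

int main()
{
    int a, b, cot = 0;
    char op[4];
    // Stop on the terminating "0 0" line or on end of input.
    while (scanf("%d%d", &n, &m) == 2 && (n || m))
    {
        total = 2 * n;
        init();
        for (int i = 0; i < m; i++)
        {
            // %s skips leading whitespace, so blank lines and line endings need no getchar().
            scanf("%3s", op);
            if (op[0] == 'M')
            {
                scanf("%d%d", &a, &b);
                merge(a, b);
            }
            else
            {
                scanf("%d", &a);
                del(a);
            }
        }
        // Count the distinct roots among the real nodes.
        memset(flag, 0, sizeof(flag));
        int num = 0;
        for (int i = 0; i < n; i++)
        {
            int r = find(i);
            if (!flag[r])
            {
                num++;
                flag[r] = 1;
            }
        }
        printf("Case #%d: %d\n", ++cot, num);
    }
    return 0;
}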
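
The size of the `father` and `flag` arrays can be sanity-checked with a little arithmetic. The constants below are hypothetical names introduced for this check only, and the 4-byte `int` is an assumption that holds on typical judges but is not guaranteed by the language.

```cpp
// Limits taken from the problem statement.
constexpr long long kMaxEmails = 100000;   // N <= 10^5
constexpr long long kMaxOps    = 1000000;  // M <= 10^6

// One real node and one initial virtual parent per email,
// plus at most one fresh virtual parent per "S" operation.
constexpr long long kMaxNodes = 2 * kMaxEmails + kMaxOps;

// Assumption: sizeof(int) == 4.
constexpr long long kBytesPerArray = kMaxNodes * 4;
constexpr long long kLimitBytes = 32768LL * 1024;  // 32768 K memory limit

static_assert(kMaxNodes == 1200000, "2N + M nodes in the worst case");
static_assert(2 * kBytesPerArray < kLimitBytes, "both arrays fit in the memory limit");
```

Under these assumptions each array takes about 4.8 MB, so the two together stay well below the 32768 K limit.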