hdu2473 Junk-Mail Filter: union-find + node deletion + path compression
2015-08-17 13:51
Description
Recognizing junk mails is a tough task. The method used here consists of two steps:
1) Extract the common characteristics from the incoming email.
2) Use a filter matching the set of common characteristics extracted to determine whether the email is a spam.
We want to extract the set of common characteristics from the N sample junk emails available at the moment, and thus having a handy data-analyzing tool would be helpful. The tool should support the following kinds of operations:
a) “M X Y”, meaning that we think that the characteristics of spam X and Y are the same. Note that the relationship defined here is transitive, so relationships (other than the one between X and Y) need to be created if they are not present at the moment.
b) “S X”, meaning that we think spam X had been misidentified. Your tool should remove all relationships that spam X has when this command is received; after that, spam X will become an isolated node in the relationship graph.
Initially no relationships exist between any pair of the junk emails, so the number of distinct characteristics at that time is N.
Please help us keep track of any necessary information to solve our problem.
Input
There are multiple test cases in the input file.
Each test case starts with two integers, N and M (1 ≤ N ≤ 10^5, 1 ≤ M ≤ 10^6), the number of email samples and the number of operations. M lines follow, each line is one of the two formats described above.
Two successive test cases are separated by a blank line. A case with N = 0 and M = 0 indicates the end of the input file, and should not be processed by your program.
Output
For each test case, please print a single integer, the number of distinct common characteristics, to the console. Follow the format as indicated in the sample below.
Sample Input
5 6
M 0 1
M 1 2
M 1 3
S 1
M 1 2
S 3
3 1
M 1 2
0 0
Sample Output
Case #1: 3
Case #2: 2
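Problem summary: there are two operations.
M X Y: X and Y belong to the same class.
S A: A no longer belongs to its current class; it is split off into a class of its own.
The question is how many classes there are at the end.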
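To make the semantics of the two operations concrete, here is a minimal brute-force sketch. It is not the approach used in this post: `NaiveFilter`, `merge`, `separate` and `classes` are hypothetical names, and each merge costs O(N), so it is probably only useful as a reference for checking a faster solution on small inputs.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Naive reference model (hypothetical helper, not the post's solution):
// label[i] is the class id of email i.
struct NaiveFilter {
    std::vector<int> label;
    int nextLabel;  // next unused class id

    explicit NaiveFilter(int n) : label(n), nextLabel(n) {
        for (int i = 0; i < n; ++i) label[i] = i;  // every email starts alone
    }

    // "M x y": relabel everything in x's class with y's label, O(N).
    void merge(int x, int y) {
        assert(0 <= x && x < (int)label.size());
        assert(0 <= y && y < (int)label.size());
        int from = label[x], to = label[y];
        if (from == to) return;
        for (int &l : label)
            if (l == from) l = to;
    }

    // "S x": x leaves its class and gets a fresh label of its own.
    void separate(int x) {
        assert(0 <= x && x < (int)label.size());
        label[x] = nextLabel++;
    }

    // Number of distinct classes currently present.
    int classes() const {
        return (int)std::set<int>(label.begin(), label.end()).size();
    }
};
```

Replaying the two sample cases through this model should give the sample answers, which is a cheap sanity check before moving to union-find.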
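Path compression is essentially memoization: it caches the root on the way back so that the next lookup is fast.
int findfather(int x)
{
    if(x!=father[x])
        father[x]=findfather(father[x]);
    return father[x];
}
If it is written the plain way instead:
    if(x==father[x]) return x;
    return findfather(father[x]);
then every query has to recurse all the way up to the root, which is very slow.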
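The recursive form can go very deep on a long chain. As a hedged alternative, the same compression can be done with two loops and no recursion. This is only a sketch: `findRoot` and `parent` are hypothetical names standing in for `findfather` and `father[]`.

```cpp
#include <cassert>
#include <vector>

// Iterative find with path compression (a sketch, not the post's code).
// parent[v] == v marks a root.
int findRoot(std::vector<int> &parent, int x) {
    assert(0 <= x && x < (int)parent.size());
    int root = x;
    while (parent[root] != root)   // pass 1: walk up to the root
        root = parent[root];
    while (parent[x] != root) {    // pass 2: point every node on the path at the root
        int next = parent[x];
        parent[x] = root;
        x = next;
    }
    return root;
}
```

After one call on a chain, every node that was on the path points directly at the root, so later queries on those nodes take a single step.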
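Node deletion:
A union-find with path compression cannot delete a node directly, because after compression other nodes may point straight at it as their parent or root. Deletion can instead be implemented by creating a new node that stands in for the deleted one.
An array real holds the actual node number of each element; deleting an element simply replaces its real value with a fresh number.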
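The idea can be sketched as a small self-contained structure. The names `DeletableDSU`, `id`, `unite`, `remove` and `same` are hypothetical (`id` plays the role of `real`), and `find` uses path halving rather than the recursive compression shown earlier, purely to keep the sketch free of recursion.

```cpp
#include <cassert>
#include <vector>

// Union-find with deletion via substitute nodes (illustrative sketch).
struct DeletableDSU {
    std::vector<int> parent;  // parent pointers over original and substitute nodes
    std::vector<int> id;      // id[x]: node that currently stands for element x

    explicit DeletableDSU(int n) : parent(n), id(n) {
        for (int i = 0; i < n; ++i) parent[i] = id[i] = i;
    }

    // Find the root of node v, halving the path on the way up.
    int find(int v) {
        while (parent[v] != v) {
            parent[v] = parent[parent[v]];
            v = parent[v];
        }
        return v;
    }

    // Merge the classes of elements x and y.
    void unite(int x, int y) {
        assert(0 <= x && x < (int)id.size() && 0 <= y && y < (int)id.size());
        parent[find(id[x])] = find(id[y]);
    }

    // Detach element x: give it a brand-new node that is its own root.
    // The old node stays in the forest so that other paths remain valid.
    void remove(int x) {
        assert(0 <= x && x < (int)id.size());
        id[x] = (int)parent.size();
        parent.push_back(id[x]);
    }

    bool same(int x, int y) { return find(id[x]) == find(id[y]); }
};
```

Each removal adds one node, so the forest can grow to N + M nodes in the worst case; that is presumably why the parent array has to be sized for both bounds together.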
For example, take this data:
5 6
M 0 1
M 1 2
M 1 3
S 1
M 1 2
S 3
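Let us trace the implementation and watch how each quantity changes. Here ff(real[i]) is short for findfather(real[i]); the first number on each findfather(i) line is the current node count ind, followed by findfather(i) for every node.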
5 6
real 0 1 2 3 4
findfather(i) 5 0 1 2 3 4
ff(real[i]) 0 1 2 3 4
M 0 1
real 0 1 2 3 4
findfather(i) 5 1 1 2 3 4
ff(real[i]) 1 1 2 3 4
M 1 2
real 0 1 2 3 4
findfather(i) 5 2 2 2 3 4
ff(real[i]) 2 2 2 3 4
M 1 3
real 0 1 2 3 4
findfather(i) 5 3 3 3 3 4
ff(real[i]) 3 3 3 3 4
S 1
real 0 5 2 3 4
findfather(i) 6 3 3 3 3 4 5
ff(real[i]) 3 5 3 3 4
M 1 2
real 0 5 2 3 4
findfather(i) 6 3 3 3 3 4 3
ff(real[i]) 3 3 3 3 4
S 3
real 0 5 2 6 4
findfather(i) 7 3 3 3 3 4 3 6
ff(real[i]) 3 3 3 6 4
Case #1: 3
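The ff(real[i]) rows of this trace can be reproduced with a short replay function. This is a sketch under assumptions: `Op` and `replay` are hypothetical names, and vectors are used here in place of fixed-size global arrays, but the merge direction (the root of x is attached under the root of y) follows the walkthrough.

```cpp
#include <cassert>
#include <vector>

struct Op { char kind; int x; int y; };  // hypothetical record: 'M' x y, or 'S' x (y unused)

// Recursive find with path compression.
static int findfather(std::vector<int> &father, int x) {
    if (x != father[x]) father[x] = findfather(father, father[x]);
    return father[x];
}

// Replays ops on n elements; after each op, records the row
// findfather(real[i]) for i = 0..n-1.
std::vector<std::vector<int>> replay(int n, const std::vector<Op> &ops) {
    std::vector<int> father(n), real(n);
    for (int i = 0; i < n; ++i) father[i] = real[i] = i;
    std::vector<std::vector<int>> rows;
    for (const Op &op : ops) {
        assert(op.kind == 'M' || op.kind == 'S');
        if (op.kind == 'M') {
            int a = findfather(father, real[op.x]);
            int b = findfather(father, real[op.y]);
            father[a] = b;                      // attach x's root under y's root
        } else {
            int fresh = (int)father.size();     // next unused node number
            real[op.x] = fresh;
            father.push_back(fresh);            // the substitute node is its own root
        }
        std::vector<int> row(n);
        for (int i = 0; i < n; ++i) row[i] = findfather(father, real[i]);
        rows.push_back(row);
    }
    return rows;
}
```

Counting the distinct values in the last row gives the number of classes for the case.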
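Here real[i] is the node that actually represents element i. A deleted node is not really removed; another node simply takes its place.
For example, after S 1 deletes 1, element 1 points to node 5.
The old node 1 still exists, but it no longer represents element 1. (Note that one extra node is created at this point: the new substitute node.)
M 1 2 merges the roots of the two elements' actual nodes.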
#include<cstdio>
#include<cstring>
// MSVC-only: enlarge the stack, since findfather recurses
#pragma comment(linker,"/STACK:102400000,102400000")
using namespace std;
int father[1100005];   // parent pointers: N original nodes plus at most one substitute node per operation
int real[100005];      // real[i]: node that currently represents email i
bool flag[1100005];    // flag[r]: root r has already been counted
int n,m,ind;           // ind: next unused node number
void init()
{
    int i;
    for(i=0;i<n;i++)
    {
        father[i]=i;
        real[i]=i;
    }
    ind=n;
}
// find the root of x, with path compression
int findfather(int x)
{
    if(x!=father[x])
        father[x]=findfather(father[x]);
    return father[x];
}
// merge the sets containing nodes x and y
void Uion(int x,int y)
{
    int a=findfather(x);
    int b=findfather(y);
    father[a]=b;
}
// "delete" email x by giving it a fresh node that is its own root
void delect(int x)
{
    real[x]=ind;
    father[ind]=ind;
    ind++;
}
int main()
{
    char ss;
    int a,b;
    int case1=1;
    // stop on "0 0" or when the input ends
    while(scanf("%d %d",&n,&m)==2&&(n!=0||m!=0))
    {
        init();
        int i,j;
        for(j=0;j<m;j++)
        {
            // the leading space skips newlines and blank lines before the command letter
            scanf(" %c",&ss);
            if(ss=='M')
            {
                scanf("%d %d",&a,&b);
                Uion(real[a],real[b]);
            }
            else if(ss=='S')
            {
                scanf("%d",&a);
                delect(a);
            }
        }
        // count the distinct roots among the emails' current nodes
        memset(flag,false,sizeof(flag));
        int cnt=0;
        for(i=0;i<n;i++)
        {
            int ance=findfather(real[i]);
            if(!flag[ance])
            {
                cnt++;
                flag[ance]=true;
            }
        }
        printf("Case #%d: %d\n",case1++,cnt);
    }
    return 0;
}