Etaoin Shrdlu 2010.3.2
2016-02-05 14:54
Etaoin Shrdlu
Time Limit:1000MS Memory Limit:65536K
Total Submit:27 Accepted:10
Description
The relative frequency of characters in natural language texts is very important for cryptography. However, the statistics vary for different languages. Here are the top 9 characters sorted by their relative frequencies for several common languages:
English: ETAOINSHR
German: ENIRSATUD
French: EAISTNRUL
Spanish: EAOSNRILD
Italian: EAIONLRTS
Finnish: AITNESLOK
Just as important as the relative frequencies of single characters are those of pairs of characters, so-called digrams. Given several text samples, calculate the digrams with the top relative frequencies.
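As a minimal sketch of what counting digrams means, the function below tallies every pair of adjacent characters in a string. The name countDigrams and the use of std::map are assumptions made for illustration, not part of the problem statement. A string of length L contains L - 1 overlapping digrams.

```cpp
#include <map>
#include <string>

// Tally every pair of adjacent characters in `text`.
// countDigrams is a hypothetical helper name used only for this sketch.
std::map<std::string, int> countDigrams(const std::string& text) {
    std::map<std::string, int> counts;
    for (std::size_t i = 0; i + 1 < text.size(); ++i) {
        ++counts[text.substr(i, 2)];  // digram starting at position i
    }
    return counts;
}
```

Digrams overlap, so a run such as "!!!!" contributes three occurrences of "!!" rather than two.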
Input
The input contains several test cases. Each starts with a number n on a separate line, denoting the number of lines of the test case. The input is terminated by n=0. Otherwise, 1<=n<=64, and there follow n lines, each with a maximal length of 80 characters.
The concatenation of these n lines, where the end-of-line characters are omitted, gives the text sample you have to examine. The text sample will contain printable ASCII characters only.
Output
For each test case generate 5 lines containing the top 5 digrams together with their absolute and relative frequencies. Output the latter rounded to a precision of 6 decimal places. If two digrams should have the same frequency, sort them in (ASCII) lexicographical order. Output a blank line after each test case.
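The ranking rule can be sketched as a sort with a two-level comparison: higher absolute frequency first, then ASCII order of the digram. The relative frequency is the count divided by the number of digrams in the sample, which is the text length minus one. The helper name topFive and its parameters are assumptions for illustration.

```cpp
#include <algorithm>
#include <cstdio>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Return up to five lines of the form "<digram> <count> <relative frequency>".
// `counts` maps each digram to its absolute frequency; `total` is the number
// of digrams in the sample (text length minus one).
// topFive is a hypothetical helper name used only for this sketch.
std::vector<std::string> topFive(const std::map<std::string, int>& counts, int total) {
    std::vector<std::pair<std::string, int>> items(counts.begin(), counts.end());
    // Higher count first; equal counts fall back to ASCII order of the digram.
    std::sort(items.begin(), items.end(),
              [](const std::pair<std::string, int>& a,
                 const std::pair<std::string, int>& b) {
                  if (a.second != b.second) return a.second > b.second;
                  return a.first < b.first;
              });
    std::vector<std::string> lines;
    for (std::size_t i = 0; i < items.size() && i < 5; ++i) {
        char buf[64];
        std::snprintf(buf, sizeof buf, "%s %d %.6f", items[i].first.c_str(),
                      items[i].second,
                      static_cast<double>(items[i].second) / total);
        lines.push_back(buf);
    }
    return lines;
}
```

For the first sample the joined text has 42 characters and therefore 41 digrams, so a digram that occurs 3 times gets the relative frequency 3/41, printed as 0.073171.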
Sample Input
2
Take a look at this!!
!!siht ta kool a ekaT
5
P=NP
 Authors: A. Cookie, N. D. Fortune, L. Shalom
 Abstract: We give a PTAS algorithm for MaxSAT and apply the PCP-Theorem [3]
 Let F be a set of clauses. The following PTAS algorithm gives an optimal
 assignment for F:
0
Sample Output
 a 3 0.073171
!! 3 0.073171
a  3 0.073171
 t 2 0.048780
oo 2 0.048780

 a 8 0.037209
or 7 0.032558
.  5 0.023256
e  5 0.023256
al 4 0.018605
Source
UML 2001
#include <cstdio>
#include <cstring>
#include <algorithm>
using namespace std;

#define MAXM 70000
#define MAXN (80+10)

struct node
{
    int num;
    char ch[3];   /* two characters plus '\0', needed by %s and strcmp */
} tab[MAXM];      /* renamed from hash, which can clash with std::hash */

int ls, n, tot;
char s[MAXN];

/* higher count first; equal counts in ASCII order */
bool cmp(const node &a, const node &b)
{
    if (a.num != b.num) return a.num > b.num;
    return strcmp(a.ch, b.ch) < 0;
}

/* count one occurrence of the digram (a, b) */
void add(char a, char b)
{
    node &e = tab[256 * a + b];
    e.num++;
    e.ch[0] = a;
    e.ch[1] = b;
}

int main()
{
    int i, j;
    char last;
    while (scanf("%d", &n) == 1 && n)
    {
        fgets(s, MAXN, stdin);          /* skip the rest of the line holding n */
        memset(tab, 0, sizeof(tab));
        tot = 0;
        last = 0;
        for (i = 1; i <= n; i++)
        {
            if (!fgets(s, MAXN, stdin)) break;
            ls = strcspn(s, "\r\n");    /* length without the line ending */
            if (ls == 0) continue;      /* an empty line adds nothing */
            if (last) add(last, s[0]);  /* digram spanning the line break */
            for (j = 0; j + 1 < ls; j++)
                add(s[j], s[j + 1]);
            tot += ls;
            last = s[ls - 1];
        }
        sort(tab, tab + MAXM, cmp);
        /* tot characters contain tot-1 digrams */
        for (i = 0; i < 5; i++)
            printf("%s %d %.6f\n", tab[i].ch, tab[i].num,
                   (double)tab[i].num / (double)(tot - 1));
        printf("\n");
    }
    return 0;
}