【DataStructure】Another usage of Map: Concordance

2014-08-14 01:06


Statement: This blog post was written by me, but most of the content is quoted from the book 【Data Structures with Java, Hubbard】.


【Description】

A concordance is a list of the words that appear in a text document, along with the numbers of the lines on which each word appears. It is just like the index of a book, except that it lists line numbers instead of page numbers. Concordances are useful for analyzing documents to find word frequencies and associations that are not evident from reading the document directly. This program builds a concordance for a text file. The run here uses a passage taken from Shakespeare's play Julius Caesar. The input text and the resulting concordance are shown in the 【Result】 section below.
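
To make the idea concrete, here is a minimal in-memory sketch of the same data structure. The class name `ConcordanceSketch` and the method `build` are illustrative names, not from the book, and the sketch stores line numbers in a `List<Integer>` rather than in a comma-separated string.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

// Illustrative sketch (not from the book): a concordance built from in-memory lines.
class ConcordanceSketch {

  // Maps each upper-cased word to the 1-based numbers of the lines it appears on.
  static Map<String, List<Integer>> build(String[] lines) {
    Map<String, List<Integer>> map = new HashMap<>();
    for (int i = 0; i < lines.length; i++) {
      StringTokenizer parser = new StringTokenizer(lines[i], ",.;:()-!?' ");
      while (parser.hasMoreTokens()) {
        String word = parser.nextToken().toUpperCase();
        // Create the list on first sight of the word, then record the line number.
        map.computeIfAbsent(word, k -> new ArrayList<>()).add(i + 1);
      }
    }
    return map;
  }

  public static void main(String[] args) {
    String[] lines = {
      "I come to bury Caesar, not to praise him.",
      "So let it be with Caesar."
    };
    for (Map.Entry<String, List<Integer>> entry : build(lines).entrySet()) {
      // The size of each list is the number of times the word occurs.
      System.out.println(entry + " (occurrences: " + entry.getValue().size() + ")");
    }
  }
}
```

With a list per word, the word frequency mentioned above is simply the length of the list, and the line numbers stay available as integers for further analysis.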


【Implement】

package com.albertshao.ds.map;

//  Data Structures with Java, Second Edition
//  by John R. Hubbard
//  Copyright 2007 by McGraw-Hill

import java.io.*;
import java.util.*;

public class Concordance {
  // Maps each upper-cased word to a comma-separated listing of its line numbers.
  private Map<String,String> map = new HashMap<String,String>();
  
  // Builds the concordance by scanning the given text file line by line.
  public Concordance(String file) {
    int lineNumber = 0;
    // try-with-resources closes the Scanner even if reading fails.
    try (Scanner input = new Scanner(new File(file))) {
      while (input.hasNextLine()) {
        String line = input.nextLine();
        ++lineNumber;
        // Punctuation, apostrophes, and spaces all act as word delimiters.
        StringTokenizer parser = new StringTokenizer(line,",.;:()-!?' ");
        while (parser.hasMoreTokens()) {
          String word = parser.nextToken().toUpperCase();
          String listing = map.get(word);
          if (listing == null) {
            listing = "" + lineNumber;      // first occurrence of this word
          } else {
            listing += ", " + lineNumber;   // append to the existing listing
          }
          map.put(word,listing);
        }
      }
    } catch(IOException e) {
      System.out.println(e);
    }
  }
  
  // Writes one "WORD=line, line, ..." entry per line to the given file.
  public void write(String file) {
    try (PrintWriter output = new PrintWriter(file)) {
      for (Map.Entry<String,String> entry : map.entrySet()) {
        output.println(entry);
      }
    } catch(IOException e) {
      System.out.println(e);
    }
  }
}


package com.albertshao.ds.map;

//  Data Structures with Java, Second Edition
//  by John R. Hubbard
//  Copyright 2007 by McGraw-Hill

public class TestConcordance {
  public static final String PATH = "D:\\machao\\DataStructure\\src\\com\\albertshao\\ds\\map\\";
  public static final String IN_FILE = "Shakespeare.txt";
  public static final String OUT_FILE = "Shakespeare.out";

  public static void main(String[] args) {
    Concordance c = new Concordance(PATH+IN_FILE);
    c.write(PATH+OUT_FILE);
  }
}
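
The delimiter string passed to `StringTokenizer` in `Concordance` decides what counts as a word. The following small sketch isolates that step so its effect can be seen on a single line. The class name `TokenizeDemo` and the method `words` are hypothetical names introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Illustrative helper (not from the book): shows how the delimiter set splits a line.
class TokenizeDemo {
  // Same delimiter string as the one used in Concordance.
  static final String DELIMITERS = ",.;:()-!?' ";

  // Returns the upper-cased tokens of one line, in order of appearance.
  static List<String> words(String line) {
    List<String> words = new ArrayList<>();
    StringTokenizer parser = new StringTokenizer(line, DELIMITERS);
    while (parser.hasMoreTokens()) {
      words.add(parser.nextToken().toUpperCase());
    }
    return words;
  }

  public static void main(String[] args) {
    // Line 8 of Shakespeare.txt; the apostrophe splits "answer'd" in two.
    System.out.println(words("And grievously hath Caesar answer'd it."));
    // prints [AND, GRIEVOUSLY, HATH, CAESAR, ANSWER, D, IT]
  }
}
```

Because the apostrophe is in the delimiter set, `answer'd` yields the two tokens ANSWER and D, and `Caesar's` yields CAESAR and S. This is why the result contains the entries D=8 and S=12. Removing `'` from the delimiter string would keep such words whole, so `Caesar's` would then be indexed separately from `Caesar`.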


【Result】

The content of Shakespeare.txt:
Friends, Romans, countrymen, lend me your ears!
I come to bury Caesar, not to praise him.
The evil that men do lives after them,
The good is oft interred with their bones;
So let it be with Caesar. The noble Brutus
Hath told you Caesar was ambitious;
If it were so, it was a grievous fault;
And grievously hath Caesar answer'd it.
Here, under leave of Brutus and the rest, --
For Brutus is an honourable man;
So are they all, all honourable men.
Come I to speak in Caesar's funeral.
He was my friend, faithful and just to me.
But Brutus says he was ambitious;
And Brutus is an honourable man.
He hath brought many captives home to Rome.
Whose ransoms did the general coffers fill:
Did this in Caesar seem ambitious?
When that the poor have cried, Caesar hath wept;
Ambition should be made of sterner stuff.
Yet Brutus says he was ambitious;
And Brutus is an honourable man.
You all did see that on the Lupercal
I thrice presented him with a kingly crown,
Which he did thrice refuse: was this ambition?
Yet Brutus says he was ambitious;
And, sure, is an honourable man.
I speak not to disprove what Brutus spoke,
But here I am to speak what I do know.
You all did love him once, not without cause.
What cause withholds you, then, to mourn for him?
O judgement! thou art fled to brutish beasts,
And men have lost their reason!


The result, that is, the content of Shakespeare.out:
GRIEVOUS=7
WHAT=28, 29, 31
KINGLY=24
REST=9
JUDGEMENT=32
SURE=27
CAUSE=30, 31
REFUSE=25
ME=1, 13
DO=3, 29
THEIR=4, 33
FUNERAL=12
NOT=2, 28, 30
YET=21, 26
CAESAR=2, 5, 6, 8, 12, 18, 19
LEAVE=9
THAT=3, 19, 23
COFFERS=17
HIM=2, 24, 30, 31
ARE=11
MADE=20
MY=13
CROWN=24
MOURN=31
FRIEND=13
THIS=18, 25
CAPTIVES=16
OFT=4
PRAISE=2
ROMANS=1
YOU=6, 23, 30, 31
HERE=9, 29
BURY=2
GRIEVOUSLY=8
WITHHOLDS=31
D=8
BEASTS=32
A=7, 24
O=32
LEND=1
WITHOUT=30
I=2, 12, 24, 28, 29, 29
SAYS=14, 21, 26
ANSWER=8
ON=23
CRIED=19
BUT=14, 29
STUFF=20
WEPT=19
ART=32
YOUR=1
S=12
OF=9, 20
AMBITIOUS=6, 14, 18, 21, 26
MANY=16
FLED=32
GENERAL=17
HE=13, 14, 16, 21, 25, 26
INTERRED=4
MEN=3, 11, 33
EVIL=3
FRIENDS=1
POOR=19
NOBLE=5
KNOW=29
WHOSE=17
LUPERCAL=23
BRUTISH=32
FAULT=7
THE=3, 4, 5, 9, 17, 19, 23
WERE=7
FOR=10, 31
THEY=11
THRICE=24, 25
AND=8, 9, 13, 15, 22, 27, 33
IF=7
UNDER=9
THEM=3
THEN=31
SEE=23
IN=12, 18
FILL=17
IS=4, 10, 15, 22, 27
ROME=16
IT=5, 7, 7, 8
WAS=6, 7, 13, 14, 21, 25, 26
ALL=11, 11, 23, 30
HAVE=19, 33
TOLD=6
LOST=33
ONCE=30
FAITHFUL=13
BRUTUS=5, 9, 10, 14, 15, 21, 22, 26, 28
AM=29
WITH=4, 5, 24
AN=10, 15, 22, 27
WHICH=25
HONOURABLE=10, 11, 15, 22, 27
TO=2, 2, 12, 13, 16, 28, 29, 31, 32
SPOKE=28
SHOULD=20
LIVES=3
BONES=4
BE=5, 20
AFTER=3
COUNTRYMEN=1
SPEAK=12, 28, 29
DID=17, 18, 23, 25, 30
AMBITION=20, 25
COME=2, 12
SEEM=18
REASON=33
BROUGHT=16
LOVE=30
STERNER=20
MAN=10, 15, 22, 27
WHEN=19
DISPROVE=28
PRESENTED=24
SO=5, 7, 11
HATH=6, 8, 16, 19
JUST=13
HOME=16
THOU=32
EARS=1
GOOD=4
LET=5
RANSOMS=17
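
The entries above appear in no particular order because `HashMap` does not define an iteration order, and a word that occurs twice on the same line is listed twice (for example, IT=5, 7, 7, 8). If alphabetical order and unique line numbers are preferred, one possible variant is sketched below. It is not the book's implementation: the class name `SortedConcordance` and its `build` method are assumptions made for illustration, and it swaps in `TreeMap` and `TreeSet` from `java.util` in place of `HashMap` and the string listing.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative variant (not from the book): sorted words, de-duplicated line numbers.
class SortedConcordance {

  // TreeMap keeps the words in alphabetical order; TreeSet drops repeated line numbers.
  static Map<String, TreeSet<Integer>> build(Iterable<String> lines) {
    Map<String, TreeSet<Integer>> map = new TreeMap<>();
    int lineNumber = 0;
    for (String line : lines) {
      ++lineNumber;
      StringTokenizer parser = new StringTokenizer(line, ",.;:()-!?' ");
      while (parser.hasMoreTokens()) {
        String word = parser.nextToken().toUpperCase();
        map.computeIfAbsent(word, k -> new TreeSet<>()).add(lineNumber);
      }
    }
    return map;
  }

  public static void main(String[] args) {
    // Lines 7 and 8 of Shakespeare.txt, numbered 1 and 2 in this small run.
    List<String> lines = Arrays.asList(
        "If it were so, it was a grievous fault;",
        "And grievously hath Caesar answer'd it.");
    build(lines).forEach((word, numbers) -> System.out.println(word + "=" + numbers));
  }
}
```

In this small run the word IT occurs twice on the first line and once on the second, yet it is reported once per line, and the entries come out starting with A, AND, ANSWER.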