
XVII Open Cup named after E.V. Pankratiev. Eastern Grand Prix. Problem G. Gmoogle: simulation, string processing, text search

2017-12-08 14:25
XVII Open Cup named after E.V. Pankratiev. Eastern Grand Prix.

Problem G. Gmoogle

Input file: standard input

Output file: standard output

Time limit: 1 second

Memory limit: 256 megabytes

You are hired to create an alpha version of a new search engine named GMoogle. The alpha version should work with content represented as a database of sentences:

 • Content is merged into a line S, consisting of characters 'a'-'z', 'A'-'Z', spaces, punctuation marks (".!?", quotes are not counted) and decimal digits.

 • If one of the characters ".!?" is present in S, it denotes the end of a sentence, except for one special case: if the first non-space character after '.' is a lowercase English letter, then it is an abbreviation sign and not the end of the sentence. For example, the string "I like tea in a 500 ml. cup" contains one sentence, but the strings "Cup is 500 ml. I want it" and "Cup is 500 ml. 500 ml is great for me" each contain two sentences.

 • The first non-space character after the end of a sentence is considered the first character of the new sentence.

 • A word is a contiguous sequence of characters 'a'-'z', 'A'-'Z', delimited by spaces, punctuation marks or the beginning/end of the sentence/string. It is guaranteed that digits cannot be neighbors of letters, i.e. sequences like "10ml" or "R2D2" are illegal.

 • S may contain sentences containing no words. It is guaranteed that S does not contain two or more of the characters ".!?" in a row.
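The sentence rules above can be sketched as a small splitter. This is only a minimal illustration under the stated rules, not the reference solution; the names isAbbreviationDot and splitSentences are hypothetical, and the choice to keep a sentence that consists of a lone terminator is an assumption.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: true if the '.' at position pos is an abbreviation
// mark, i.e. the first non-space character after it is a lowercase letter.
bool isAbbreviationDot(const std::string& s, std::size_t pos) {
    for (std::size_t j = pos + 1; j < s.size(); ++j) {
        if (s[j] == ' ') continue;
        return std::islower(static_cast<unsigned char>(s[j])) != 0;
    }
    return false;  // only spaces follow, so the sentence ends here
}

// Splits s into sentences; each sentence keeps its terminator and has no
// leading or trailing spaces.
std::vector<std::string> splitSentences(const std::string& s) {
    std::vector<std::string> sentences;
    std::string current;
    for (std::size_t i = 0; i < s.size(); ++i) {
        char c = s[i];
        if (current.empty() && c == ' ') continue;  // skip spaces before a sentence
        current += c;
        bool ends = c == '!' || c == '?' || (c == '.' && !isAbbreviationDot(s, i));
        if (ends) {
            sentences.push_back(current);
            current.clear();
        }
    }
    // The last sentence may lack a terminator; drop its trailing spaces.
    while (!current.empty() && current.back() == ' ') current.pop_back();
    if (!current.empty()) sentences.push_back(current);
    return sentences;
}
```

Calling splitSentences on "Cup is 500 ml. I want it" gives two sentences, while "I like tea in a 500 ml. cup" stays a single sentence because the character after the dot is lowercase.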

After the content is indexed, users make requests. Each request can be represented as a string q, consisting of one or more words (the definition of a word is given above). Words are separated by an arbitrary number of spaces (1 or more); leading and trailing spaces are possible.

Your program has to print all sentences from S in which all words from q are present (in any order). Words are considered equal if all the letters at the corresponding positions are the same (case insensitive, i.e. 'B' and 'b' are considered the same).
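Word extraction with case-insensitive comparison might look like the sketch below. The function name extractWords is hypothetical; it treats any maximal run of letters as a word and lowercases it, so that two words are equal exactly when their lowercased forms match.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Splits text into words (maximal runs of letters) and lowercases each one.
// Everything that is not a letter (spaces, digits, punctuation) is a delimiter.
std::vector<std::string> extractWords(const std::string& text) {
    std::vector<std::string> words;
    std::string word;
    for (char c : text) {
        unsigned char uc = static_cast<unsigned char>(c);
        if (std::isalpha(uc)) {
            word += static_cast<char>(std::tolower(uc));
        } else if (!word.empty()) {
            words.push_back(word);
            word.clear();
        }
    }
    if (!word.empty()) words.push_back(word);  // text may end with a word
    return words;
}
```

The same helper works for a request such as "  much   coffee " and for a sentence such as "I want 2 coffee if you have it.", since both follow the same word definition.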

Input

The first line of the input contains a non-empty line S, consisting of no more than 1000 characters. The next line contains one integer n (1 ≤ n ≤ 100), the number of requests. Then n requests q1, ..., qn follow, each on a separate line in the format described above. Note that leading and trailing spaces are allowed in S and qi.

Output

For each request q1, q2, ..., qn, print the request on a separate line. Then print the list of found sentences in the same order they appear in S, one sentence per line. Requests and answers are printed in quotes; answers are preceded by a single '-' and a single space; leading and trailing spaces must be eliminated. See the sample for clarification.
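The output layout can be sketched as a small formatting routine. The name formatResults is hypothetical, and the sketch assumes the caller passes the request and the sentences exactly as they should appear between the quotes; it does not decide how spaces are trimmed.

```cpp
#include <string>
#include <vector>

// Builds the output for one request: a header line with the request in
// quotes, then one "- "-prefixed quoted line per matching sentence.
std::string formatResults(const std::string& request,
                          const std::vector<std::string>& sentences) {
    std::string out = "Search results for \"" + request + "\":\n";
    for (const std::string& s : sentences) {
        out += "- \"" + s + "\"\n";
    }
    return out;
}
```

A request with no matching sentences produces only the header line, as for the last request in the sample.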

Example

standard input 

Hello everyone. I want 2 coffee if you have it. I like coffee very much.

4

HELLO

Coffee

much coffee

VoDka

standard output

Search results for "HELLO":

- "Hello everyone."

Search results for "Coffee":

- "I want 2 coffee if you have it."

- "I like coffee very much."

Search results for "much coffee":

- "I like coffee very much."

Search results for "VoDka":

Source

XVII Open Cup named after E.V. Pankratiev. Eastern Grand Prix.

My Solution

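Problem: simulate a search engine. A text is given, and each query consists of several words; for each query, output every sentence that contains all of the query words.

Simulation, string processing, text search

First preprocess the text into individual sentences, numbered 0, 1, 2, ..., and use a map<string, vector<int>> to build a mapping from each word to the sentences that contain it.

Then, for a single query, each of its words has a set of sentences, and the intersection of these sets is the answer.

The intersection is computed with a map<int, int> check that counts, for each sentence, how many of these sets it appears in. After one pass over check, the sentences whose count equals the number of words in the query form the intersection.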
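The counting-based intersection described above can be sketched in isolation as follows. The alias Index and the function matchAll are hypothetical names; the sketch assumes that each word's list contains every sentence number at most once, which is what makes the count comparison valid.

```cpp
#include <map>
#include <string>
#include <vector>

using Index = std::map<std::string, std::vector<int>>;  // word -> sentence ids

// Returns, in increasing order, the ids of sentences that contain every
// word in queryWords (words are expected to be lowercase already).
std::vector<int> matchAll(const Index& index,
                          const std::vector<std::string>& queryWords) {
    std::map<int, int> hits;  // sentence id -> number of query words found there
    for (const std::string& w : queryWords) {
        auto it = index.find(w);
        if (it == index.end()) continue;  // unknown word: no sentence gains a hit
        for (int id : it->second) ++hits[id];
    }
    std::vector<int> result;
    const int needed = static_cast<int>(queryWords.size());
    for (const auto& entry : hits) {
        if (entry.second == needed) result.push_back(entry.first);
    }
    return result;
}
```

Because std::map iterates in key order, the matching sentences come out in the order they appear in S without an extra sort.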

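Notes: 1. A sentence may contain the same word several times; when building the mapping, map a word to that sentence only once.

       2. When the first non-space character after '.' is a lowercase letter, this is not the end of the sentence.

       3. The last sentence of the text may have no punctuation mark and may be followed by many spaces; this needs to be handled.

       4. The text is deliberately split in two steps: first accumulate the current sentence, and only once it is certain that the sentence ends here, build the mapping from its words to that sentence.

       5. For both the word mapping and the queries, use isupper and tolower from cctype to convert everything to lowercase before comparing.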
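Notes 1 and 5 can be illustrated with a short indexing routine. This is a sketch rather than the code used in the solution; Index and indexSentence are hypothetical names, and it assumes sentences are indexed in increasing order of their number, so a duplicate entry could only be the last element of a word's list.

```cpp
#include <cctype>
#include <map>
#include <string>
#include <vector>

using Index = std::map<std::string, std::vector<int>>;  // word -> sentence ids

// Adds sentence number id to the list of every word in sentence,
// at most once per word, comparing words in lowercase.
void indexSentence(Index& index, const std::string& sentence, int id) {
    std::string word;
    auto flush = [&]() {
        if (word.empty()) return;
        std::vector<int>& ids = index[word];
        // Ids arrive in increasing order, so checking the last entry is enough.
        if (ids.empty() || ids.back() != id) ids.push_back(id);
        word.clear();
    };
    for (char c : sentence) {
        unsigned char uc = static_cast<unsigned char>(c);
        if (std::isalpha(uc)) word += static_cast<char>(std::tolower(uc));
        else flush();
    }
    flush();  // the sentence may end with a word
}
```

Indexing "Coffee, coffee and COFFEE." as sentence 0 records 0 only once under "coffee", which keeps the per-sentence counts in the query step correct.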

Time complexity: O(nlogn + k*qlogn)

Space complexity: O(n)

#include <bits/stdc++.h>
using namespace std;
string s, word, line;
vector<string> senc;          // sentences, in order of appearance
map<string, vector<int>> mp;  // lowercase word -> indices of sentences containing it
map<int, int> check;          // sentence index -> number of query words it contains
int main () {
#ifdef LOCAL
    freopen("g.txt", "r", stdin);
#endif // LOCAL

    getline(cin, s);
    int n, sz = s.size(), i, j, len, cnt = 0, k;
    // Ignore trailing spaces; the sz > 0 guard avoids reading s[-1].
    while(sz > 0 && s[sz-1] == ' '){
        sz--;
    }
    bool flag;
    for(i = 0; i < sz; i++){
        if(s[i] == '.' || s[i] == '!' || s[i] == '?'){
            flag = true;
            if(s[i] == '.'){
                // If the first non-space character after '.' is lowercase,
                // this is an abbreviation, not the end of the sentence.
                for(j = i + 1; j < sz; j++){
                    if(islower(s[j])){
                        flag = false;
                        break;
                    }
                    else if(s[j] != ' ' && s[j] != '\0'){
                        break;
                    }
                }
            }
            if(!flag){
                line += s[i];
                continue;
            }

            len = line.size();
            if(len != 0){
                // The sentence ends here: map each of its words to sentence cnt.
                for(j = 0; j < len; j++){
                    if(islower(line[j])){
                        word += line[j];
                    }
                    else if(isupper(line[j])){
                        word += tolower(line[j]);
                    }
                    else if(!word.empty()){
                        // Record each word at most once per sentence.
                        if(mp[word].empty() || mp[word].back() != cnt)
                            mp[word].push_back(cnt);
                        word.clear();
                    }
                }
                if(!word.empty()){
                    if(mp[word].empty() || mp[word].back() != cnt)
                        mp[word].push_back(cnt);
                    word.clear();
                }
                line += s[i];
                senc.push_back(line);
                line.clear();
                cnt++;
            }
        }
        else{
            // Skip spaces before the first character of a sentence.
            if(line.size() == 0 && (s[i] == ' ' || s[i] == '\0')){
                ;
            }
            else{
                line += s[i];
            }
        }
    }
    // The last sentence may have no terminating punctuation mark.
    len = line.size();
    if(len != 0){
        for(j = 0; j < len; j++){
            if(islower(line[j])){
                word += line[j];
            }
            else if(isupper(line[j])){
                word += tolower(line[j]);
            }
            else if(!word.empty()){
                if(mp[word].empty() || mp[word].back() != cnt)
                    mp[word].push_back(cnt);
                word.clear();
            }
        }
        if(!word.empty()){
            if(mp[word].empty() || mp[word].back() != cnt)
                mp[word].push_back(cnt);
            word.clear();
        }
        senc.push_back(line);
        line.clear();
        cnt++;
    }

    cin >> n;
    // Discard the rest of the line that holds n before reading the requests.
    cin.ignore(numeric_limits<streamsize>::max(), '\n');
    for(i = 0; i < n; i++){
        getline(cin, s);
        cout << "Search results for \"" << s << "\":\n";
        len = s.size();
        cnt = 0;  // number of words in this request
        for(j = 0; j < len; j++){
            if(islower(s[j])){
                word += s[j];
            }
            else if(isupper(s[j])){
                word += tolower(s[j]);
            }
            else if(!word.empty()){
                // Count one hit for every sentence containing this word.
                if(mp.find(word) != mp.end()){
                    sz = mp[word].size();
                    for(k = 0; k < sz; k++){
                        check[mp[word][k]]++;
                    }
                }
                cnt++;
                word.clear();
            }
        }
        if(!word.empty()){
            if(mp.find(word) != mp.end()){
                sz = mp[word].size();
                for(k = 0; k < sz; k++){
                    check[mp[word][k]]++;
                }
            }
            cnt++;
            word.clear();
        }

        // A sentence matches if it was hit once per word of the request.
        for(auto x = check.begin(); x != check.end(); x++){
            if((x->second) == cnt){
                cout << "- \"" << senc[x->first] << "\"\n";
            }
        }
        check.clear();
    }
}


  Thank you!

                                                                                                           
                                 ------from ProLights