
LeetCode 30. Substring with Concatenation of All Words

2015-12-29 16:33

Problem description:
You are given a string, s, and a list of words, words, that are all of the same length. Find all starting indices of substring(s) in s that are a concatenation of each word in words exactly once and without any intervening characters.

For example, given:

s: "barfoothefoobarman"
words: ["foo", "bar"]

You should return the indices: [0,9] (order does not matter).
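Problem analysis: Given a string s and a vector of strings words, concatenate the strings in words in any order. In the example, words has 2 elements, so the possible concatenations are "foobar" and "barfoo". Then check whether s contains "foobar" or "barfoo"; if it does, return the starting index of each occurrence in s.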
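Read literally, the analysis above already describes a brute-force reference: generate every ordering of words and search s for each concatenation. The sketch below does that with std::next_permutation. The name findSubstringNaive is made up for illustration and is not one of the solutions in this post; the running time grows factorially with words.size(), so it is only useful for checking the faster solutions on tiny inputs.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// Hypothetical brute-force reference: try every distinct ordering of the
// words and search s for each concatenation.
std::vector<int> findSubstringNaive(const std::string& s, std::vector<std::string> words) {
    if (words.empty() || words[0].empty()) return {};
    std::set<int> hits;  // ordered, de-duplicated start indices
    std::sort(words.begin(), words.end());  // next_permutation needs a sorted start
    do {
        std::string target;
        for (const std::string& w : words) target += w;
        // Collect every (possibly overlapping) occurrence of this ordering.
        for (std::size_t pos = s.find(target); pos != std::string::npos;
             pos = s.find(target, pos + 1))
            hits.insert(static_cast<int>(pos));
    } while (std::next_permutation(words.begin(), words.end()));
    return std::vector<int>(hits.begin(), hits.end());
}
```

For the example input, "barfoo" is found at index 0 and "foobar" at index 9.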

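Overview of the solutions
1. The approach used by most write-ups online; LeetCode run time 700 ms
2. My own approach; LeetCode run time 300 ms
3. The fastest approach; LeetCode run time 50 ms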

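Solution 1:
1. First build a map from the strings in words; each value is the number of times that string occurs in words.
2. Take each candidate substring of s in turn and look up each of its words in the table built in step 1. If a word is not found, the match fails and we move on to the next starting character. If it is found, count how many times that word has occurred in the current substring; if that count exceeds the count in the step 1 table, the match fails.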

The code:

vector<int> findSubstring3(string s, vector<string>& words) {
    vector<int> ret;
    // Guard against empty input: words[0] below would otherwise be undefined behavior.
    if (s.empty() || words.empty() || words[0].empty())
        return ret;

    map<string, int> dict, cur;
    int wordsLen = words.size();
    int wordLen = words[0].size();
    int sLen = s.size();

    // Build the reference table: word -> required count.
    for (int k = 0; k < wordsLen; k++)
        dict[words[k]]++;

    for (int i = 0; i <= sLen - wordLen * wordsLen; i++) {
        cur.clear();  // reset the per-window counts before each window
        int j;
        for (j = 0; j < wordsLen; j++) {
            string word = s.substr(i + j * wordLen, wordLen);  // j-th word of the window
            if (dict.find(word) == dict.end())  // not in the word list
                break;
            cur[word]++;
            if (dict[word] < cur[word])  // occurs more often than allowed
                break;
        }
        if (j == wordsLen)  // every word matched
            ret.push_back(i);
    }
    return ret;
}
LeetCode run time (as listed in the overview): 700 ms.

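Solution 2:
This removes the time Solution 1 wastes rebuilding a map for every window.
After the table for words is built, an array stores the value of each key.
While walking through a window, if a word is found and its value is not 0, the value for that key in the map is decremented by 1; otherwise the match fails. The values are restored from the array before the next window.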

The code:

vector<int> findSubstring2(string s, vector<string>& words) {
    vector<int> ret;
    // Reject empty input.
    if (words.empty() || s.empty() || words[0].size() == 0)
        return ret;

    int i, j;
    int sLen = s.size();
    int wordsLen = words.size();
    int wordLen = words[0].size();
    int wordsCharLen = wordsLen * wordLen;
    map<string, int> dict;
    map<string, int>::iterator dictIt;

    if (sLen < wordsCharLen)
        return ret;

    // Build the table for words.
    for (i = 0; i < wordsLen; i++) {
        dictIt = dict.find(words[i]);
        if (dictIt != dict.end())
            dictIt->second++;
        else
            dict.insert(pair<string, int>(words[i], 1));
    }

    // Snapshot dict's values in iteration order (a vector, so nothing has to be freed by hand).
    vector<int> dictItemNum;
    for (dictIt = dict.begin(); dictIt != dict.end(); dictIt++)
        dictItemNum.push_back(dictIt->second);

    // Check each window in turn.
    for (i = 0; i < sLen - wordsCharLen + 1; i++) {
        bool isMatch = true;
        // Restore the counts from the snapshot.
        for (j = 0, dictIt = dict.begin(); dictIt != dict.end(); dictIt++, j++)
            dictIt->second = dictItemNum[j];

        for (j = 0; j < wordsLen; j++) {
            dictIt = dict.find(s.substr(i + j * wordLen, wordLen));
            if (dictIt == dict.end() || dictIt->second == 0) {
                isMatch = false;  // unknown word, or this word is already used up
                break;
            } else {
                dictIt->second--;
            }
        }
        if (isMatch)
            ret.push_back(i);
    }
    return ret;
}
LeetCode run time (as listed in the overview): 300 ms.

Solution 3:

Looking closely at the intermediate steps of the first two solutions, a lot of work is repeated. Take

s: "barfoothefoobarman"
words: ["foo", "bar"]

When checking the substring of s that starts at index 0, "barfoo", and the one that starts at index 3, "foothe", the word "foo" is examined twice. Eliminating this kind of repeated work makes the algorithm faster.
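One way to see where the repetition comes from: windows whose start indices are congruent modulo the word length all cut s at the same word boundaries, so they share the same sequence of word-sized tokens. The small helper below (tokensAtOffset is a hypothetical name used only for this illustration) lists those tokens for one starting offset. Running a sliding window over each of the wordLen token sequences reads every token once per offset instead of once per window.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: split s into consecutive word-sized tokens,
// starting at the given offset. A trailing partial token is dropped.
std::vector<std::string> tokensAtOffset(const std::string& s,
                                        std::size_t offset,
                                        std::size_t wordLen) {
    std::vector<std::string> tokens;
    if (wordLen == 0) return tokens;
    for (std::size_t j = offset; j + wordLen <= s.size(); j += wordLen)
        tokens.push_back(s.substr(j, wordLen));
    return tokens;
}
```

For s = "barfoothefoobarman" and wordLen = 3, offset 0 gives bar, foo, the, foo, bar, man. The windows starting at index 0 and index 3 both contain the second token, "foo", which is exactly the repeated check described above.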

Also, the table can be built with unordered_map, which takes 20 ms less than map.

The code is copied from LeetCode Discuss:

vector<int> findSubstring4(string s, vector<string>& words) {
    vector<int> ret;
    int i, j;
    int sLen = s.size();
    int wordsLen = words.size();
    if (sLen <= 0 || wordsLen <= 0)
        return ret;

    // Build the dict for words.
    // unordered_map<string, int> dict;
    map<string, int> dict;
    for (i = 0; i < wordsLen; i++)
        dict[words[i]]++;

    // Scan the word-aligned substrings once per starting offset.
    int wordLen = words[0].size();
    for (i = 0; i < wordLen; i++) {
        int left = i;
        int count = 0;           // number of matched words in the current window
        map<string, int> tdict;  // word counts inside the current window
        for (j = i; j <= sLen - wordLen; j += wordLen) {
            string str = s.substr(j, wordLen);
            // Valid word: count it.
            if (dict.count(str)) {
                tdict[str]++;
                if (tdict[str] <= dict[str])
                    count++;
                else {
                    // The word is valid but now occurs too often. Shrink the window
                    // from the left until the surplus copy is gone, decrementing
                    // count for each needed word that is dropped on the way.
                    while (tdict[str] > dict[str]) {
                        string str1 = s.substr(left, wordLen);
                        tdict[str1]--;
                        if (tdict[str1] < dict[str1])
                            count--;
                        left += wordLen;
                    }
                }
                // come to a result
                if (count == wordsLen) {
                    ret.push_back(left);
                    // advance one word
                    tdict[s.substr(left, wordLen)]--;
                    count--;
                    left += wordLen;
                }
            }
            // not a valid word, reset all vars
            else {
                tdict.clear();
                count = 0;  // no window can span this word; continue after it
                left = j + wordLen;
            }
        }
    }
    return ret;
}
LeetCode run time (as listed in the overview): 50 ms.
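As a sketch of the unordered_map variant mentioned for Solution 3, the version below keeps the same sliding-window idea but is self-contained and slightly restructured: count tracks every word in the window, and the window is shrunk as soon as the newest word is over its quota. The name findSubstringWindow and the variables need and seen are my own, not from the LeetCode Discuss code, and no timing claim is made for this variant.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Sliding-window sketch using unordered_map (hypothetical name).
std::vector<int> findSubstringWindow(const std::string& s,
                                     const std::vector<std::string>& words) {
    std::vector<int> ret;
    if (s.empty() || words.empty() || words[0].empty()) return ret;
    const int sLen = static_cast<int>(s.size());
    const int wordsLen = static_cast<int>(words.size());
    const int wordLen = static_cast<int>(words[0].size());

    std::unordered_map<std::string, int> need;  // word -> required count
    for (const std::string& w : words) ++need[w];

    for (int i = 0; i < wordLen; ++i) {  // one pass per starting offset
        int left = i;    // start index of the current window
        int count = 0;   // number of words currently in the window
        std::unordered_map<std::string, int> seen;  // counts inside the window
        for (int j = i; j + wordLen <= sLen; j += wordLen) {
            const std::string word = s.substr(j, wordLen);
            auto it = need.find(word);
            if (it == need.end()) {  // unknown word: restart after it
                seen.clear();
                count = 0;
                left = j + wordLen;
                continue;
            }
            ++seen[word];
            ++count;
            // Drop words from the left until `word` is within its quota again.
            while (seen[word] > it->second) {
                --seen[s.substr(left, wordLen)];
                --count;
                left += wordLen;
            }
            // No word exceeds its quota here, so a full-size window is a match.
            if (count == wordsLen) ret.push_back(left);
        }
    }
    return ret;
}
```

Because no word in the window ever exceeds its quota after the while loop, count == wordsLen can only hold when the window contains exactly the required multiset. Results are grouped by starting offset, so they are not necessarily sorted, which the problem allows.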


