LeetCode 30. Substring with Concatenation of All Words
2015-12-29 16:33
Problem description:
You are given a string, s, and a list of words, words, that are all of the same length. Find all starting indices of substring(s) in s that is a concatenation of each word in words exactly once and without any intervening characters.
For example, given:
s: "barfoothefoobarman"
words: ["foo", "bar"]
You should return the indices: [0,9].
(order does not matter).
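Problem analysis: Given a string s and a vector of strings words, the strings in words may be concatenated in any order. In the example, words has 2 elements, so the possible concatenations are "foobar" and "barfoo". We then check whether s contains "foobar" or "barfoo"; if it does, we return the starting index of each occurrence in s.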
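The analysis above can be turned into code almost literally. The following is a minimal sketch, assuming a hypothetical helper named findByPermutations (it is not one of the three solutions in this post): it builds every distinct ordering of words with std::next_permutation and searches s for each one. The number of orderings grows factorially, so this is only meant to illustrate the problem statement on small inputs.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper for illustration only: tries every ordering of `words`.
// `words` is taken by value because it is sorted and permuted in place.
std::vector<int> findByPermutations(const std::string& s,
                                    std::vector<std::string> words) {
    std::set<int> hits;  // a set removes duplicate indices and keeps them sorted
    if (words.empty()) return {};
    std::sort(words.begin(), words.end());  // next_permutation needs a sorted start
    do {
        std::string target;
        for (const std::string& w : words) target += w;
        // Record every occurrence of this particular concatenation.
        for (std::string::size_type pos = s.find(target);
             pos != std::string::npos;
             pos = s.find(target, pos + 1)) {
            hits.insert(static_cast<int>(pos));
        }
    } while (std::next_permutation(words.begin(), words.end()));
    return std::vector<int>(hits.begin(), hits.end());
}
```

For the example input, the orderings tried are "barfoo" (found at index 0) and "foobar" (found at index 9), which gives the expected answer [0,9].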
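Overview of the solutions
1. The approach found most often online: LeetCode run time 700 ms
2. My own approach: LeetCode run time 300 ms
3. The fastest approach: LeetCode run time 50 ms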
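Solution 1:
1. First build a map from the strings in words; the value stored for each string is the number of times it occurs in words.
2. Take each substring (window) of s in turn and look up every word of the window in the table built in step 1. If a word is not found, the match fails and we move on to the next starting character. If it is found, count how many times that word has occurred in the window so far; if that count exceeds the count in the step 1 table, the match fails.
The code is as follows: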
#include <map>
#include <string>
#include <vector>
using namespace std;

vector<int> findSubstring3(string s, vector<string>& words) {
    map<string, int> dict, cur;
    vector<int> ret;
    if (words.empty())  // words[0] below requires a non-empty list
        return ret;
    int wordsLen = words.size();
    int wordLen = words[0].size();
    int sLen = s.size();
    for (int k = 0; k < wordsLen; k++)
        dict[words[k]]++;  // build the word-count table
    for (int i = 0; i <= sLen - wordLen * wordsLen; i++)
    {
        cur.clear();  // must be cleared before each window, because it changes while scanning
        int j;
        for (j = 0; j < wordsLen; j++)
        {
            string word = s.substr(i + j * wordLen, wordLen);  // extract the j-th word of the window
            if (dict.find(word) == dict.end())  // this word is not in the word list
                break;
            cur[word]++;
            if (dict[word] < cur[word])  // the word occurs too many times
                break;
        }
        if (j == wordsLen)  // all words of the window matched
            ret.push_back(i);
    }
    return ret;
}
[Figure: LeetCode run time for Solution 1]
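To make the second check in step 2 more concrete (a listed word that appears too often), here is a small sketch that tests a single window only. The helper name windowMatches and its signature are assumptions made for this illustration; the logic follows the two break conditions of Solution 1.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: does the window of s starting at `start` consist of
// every word in `words` exactly once?
bool windowMatches(const std::string& s, std::size_t start,
                   const std::vector<std::string>& words) {
    if (words.empty() || words[0].empty()) return false;
    const std::size_t wordLen = words[0].size();
    if (start > s.size() || s.size() - start < wordLen * words.size())
        return false;  // the window would run past the end of s

    std::map<std::string, int> dict;  // required count of each word
    for (const std::string& w : words) ++dict[w];

    std::map<std::string, int> cur;   // count of each word seen in this window
    for (std::size_t j = 0; j < words.size(); ++j) {
        const std::string word = s.substr(start + j * wordLen, wordLen);
        auto it = dict.find(word);
        if (it == dict.end()) return false;          // not a listed word
        if (++cur[word] > it->second) return false;  // listed word used too often
    }
    return true;
}
```

With s = "foofoobar" and words = ["foo", "bar"], the window at index 0 is "foofoo": both pieces are listed words, but "foo" is used twice, so the window is rejected. The window at index 3, "foobar", is accepted.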
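Solution 2:
This optimizes the time the first method wastes on rebuilding a table in memory for every window.
Once the table for words has been built, the value of each key is saved in an array.
Before each window is checked, the values in the map are restored from that array. While scanning the window, if a lookup succeeds and the value is not 0, the value for that key in the map is decremented by 1; otherwise the match fails.
The code is as follows: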
vector<int> findSubstring2(string s, vector<string>& words) {
    vector<int> ret;
    // handle empty input
    if (words.empty() || s.empty() || words[0].size() == 0)
        return ret;
    int i, j;
    int sLen = s.size();
    int wordsLen = words.size();
    int wordLen = words[0].size();
    int wordsCharLen = wordsLen * wordLen;
    map<string, int> dict;
    map<string, int>::iterator dictIt;
    if (sLen < wordsCharLen)
        return ret;
    // build the table for words
    for (i = 0; i < wordsLen; i++)
    {
        dictIt = dict.find(words[i]);
        if (dictIt != dict.end())
            dictIt->second++;
        else
            dict.insert(pair<string, int>(words[i], 1));
    }
    // save the values of dict in a vector (released automatically, unlike a raw new int[])
    vector<int> dictItemNum;
    for (dictIt = dict.begin(); dictIt != dict.end(); dictIt++)
        dictItemNum.push_back(dictIt->second);
    // compare window by window
    for (i = 0; i < sLen - wordsCharLen + 1; i++)
    {
        bool isMatch = true;
        // restore the original counts before checking this window
        for (j = 0, dictIt = dict.begin(); dictIt != dict.end(); dictIt++, j++)
            dictIt->second = dictItemNum[j];
        for (j = 0; j < wordsLen; j++)
        {
            dictIt = dict.find(s.substr(i + j * wordLen, wordLen));
            if (dictIt == dict.end() || dictIt->second == 0)
            {
                isMatch = false;
                break;
            }
            else
            {
                dictIt->second--;
            }
        }
        if (isMatch)
            ret.push_back(i);
    }
    return ret;
}
[Figure: LeetCode run time for Solution 2]
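One possible variation on the same idea, shown here only as a sketch: give every distinct word a slot number once, keep the allowed counts in a plain vector, and restore the per-window counts with a single vector assignment instead of walking the map. The function name findSubstringIndexed and the names id, quota and remaining are assumptions for this example, not part of the code above.

```cpp
#include <map>
#include <string>
#include <vector>

// Sketch of a Solution 2 variant: counts live in a vector indexed by word slot.
std::vector<int> findSubstringIndexed(const std::string& s,
                                      const std::vector<std::string>& words) {
    std::vector<int> ret;
    if (s.empty() || words.empty() || words[0].empty()) return ret;
    const std::size_t wordLen = words[0].size();
    const std::size_t total = wordLen * words.size();
    if (s.size() < total) return ret;

    std::map<std::string, int> id;  // word -> slot in `quota`
    std::vector<int> quota;         // allowed occurrences per slot
    for (const std::string& w : words) {
        auto inserted = id.insert({w, static_cast<int>(quota.size())});
        if (inserted.second) quota.push_back(1);   // first time we see this word
        else ++quota[inserted.first->second];      // repeated word
    }

    std::vector<int> remaining;
    for (std::size_t i = 0; i + total <= s.size(); ++i) {
        remaining = quota;  // restore the counts for this window
        bool ok = true;
        for (std::size_t j = 0; j < words.size() && ok; ++j) {
            auto it = id.find(s.substr(i + j * wordLen, wordLen));
            // Fail on an unknown word or on a word whose quota is used up.
            ok = it != id.end() && remaining[it->second]-- > 0;
        }
        if (ok) ret.push_back(static_cast<int>(i));
    }
    return ret;
}
```

Called with "barfoothefoobarman" and ["foo", "bar"], this returns [0, 9]. It still checks every starting index, so it shares the repeated work of the first two solutions.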
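Solution 3:
Looking closely at the intermediate steps of the first two solutions, there is a lot of repeated computation. For example, for
s: "barfoothefoobarman"
words: ["foo", "bar"]
when checking the first substring of s, "barfoo", and the fourth substring, "foothe", the word "foo" they share is examined twice. Finding a way to eliminate this kind of repeated computation makes the algorithm faster.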
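To see where the repetition comes from, it can help to cut s into consecutive words for every starting offset from 0 to wordLen-1. The sketch below does this with a hypothetical helper, splitByOffset (an illustration only, not code from the original post). Windows whose start indices differ by a multiple of wordLen read words from the same row, so a sliding window over each row only needs to extract each word once.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: row k holds the words of s read at positions
// k, k + wordLen, k + 2 * wordLen, ... (incomplete trailing pieces are dropped).
std::vector<std::vector<std::string>> splitByOffset(const std::string& s,
                                                    std::size_t wordLen) {
    std::vector<std::vector<std::string>> rows;
    if (wordLen == 0) return rows;
    for (std::size_t offset = 0; offset < wordLen; ++offset) {
        std::vector<std::string> row;
        for (std::size_t j = offset; j + wordLen <= s.size(); j += wordLen)
            row.push_back(s.substr(j, wordLen));
        rows.push_back(row);
    }
    return rows;
}
```

For "barfoothefoobarman" with wordLen 3, row 0 is bar, foo, the, foo, bar, man. The windows starting at index 0 ("barfoo") and index 3 ("foothe") both sit in this row and overlap on its second word, "foo".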
In addition, the table can be built with unordered_map, which takes 20 ms less than map.
The code is copied from LeetCode Discuss.
vector<int> findSubstring4(string s, vector<string> &words) {
    vector<int> ret;
    int i, j;
    int sLen = s.size();
    int wordsLen = words.size();
    if (sLen <= 0 || wordsLen <= 0)
        return ret;
    // initialize the dict of words
    //unordered_map<string, int> dict;
    map<string, int> dict;
    for (i = 0; i < wordsLen; i++)
        dict[words[i]]++;
    // traverse all substrings, one pass per starting offset
    int wordLen = words[0].size();
    for (i = 0; i < wordLen; i++)
    {
        int left = i;
        int count = 0;  // number of consecutive matched words in the window
        map<string, int> tdict;  // counts of the words in the current window
        for (j = i; j <= sLen - wordLen; j += wordLen)
        {
            string str = s.substr(j, wordLen);
            // if the word is valid, count it
            if (dict.count(str))  // the word is in dict
            {
                tdict[str]++;
                if (tdict[str] <= dict[str])
                    count++;
                else
                {
                    // the word matched but occurs too many times, so the window fails;
                    // drop words from the left, decrementing their counts,
                    // until the extra occurrence has been removed
                    while (tdict[str] > dict[str])
                    {
                        string str1 = s.substr(left, wordLen);
                        tdict[str1]--;
                        if (tdict[str1] < dict[str1])
                            count--;
                        left += wordLen;
                    }
                }
                // come to a result
                if (count == wordsLen) {
                    ret.push_back(left);
                    // advance one word
                    tdict[s.substr(left, wordLen)]--;
                    count--;
                    left += wordLen;
                }
            }
            // not a valid word, reset all vars
            else
            {
                tdict.clear();
                count = 0;
                // this word cannot match; continue with the words after it
                left = j + wordLen;
            }
        }
    }
    return ret;
}
[Figure: LeetCode run time for Solution 3]
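For reference, here is a compact sketch of the same sliding-window idea using unordered_map, as suggested above. It is a restructured variant, not the Discuss code: the name findSubstringHashed is an assumption, and count here means the number of words currently in the window rather than the matched-word counter used in findSubstring4.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Sketch: sliding window per starting offset, with hash maps for the counts.
std::vector<int> findSubstringHashed(const std::string& s,
                                     const std::vector<std::string>& words) {
    std::vector<int> ret;
    if (s.empty() || words.empty() || words[0].empty()) return ret;
    const int sLen = static_cast<int>(s.size());
    const int wordCount = static_cast<int>(words.size());
    const int wordLen = static_cast<int>(words[0].size());

    std::unordered_map<std::string, int> need;  // required count of each word
    for (const std::string& w : words) ++need[w];

    for (int offset = 0; offset < wordLen; ++offset) {
        std::unordered_map<std::string, int> seen;  // counts inside the window
        int left = offset;  // start index of the current window
        int count = 0;      // number of words in the current window
        for (int j = offset; j + wordLen <= sLen; j += wordLen) {
            const std::string word = s.substr(j, wordLen);
            auto it = need.find(word);
            if (it == need.end()) {  // unknown word: restart after it
                seen.clear();
                count = 0;
                left = j + wordLen;
                continue;
            }
            ++seen[word];
            ++count;
            // Shrink from the left until `word` is no longer over its quota.
            while (seen[word] > it->second) {
                --seen[s.substr(left, wordLen)];
                --count;
                left += wordLen;
            }
            // No word is over quota, so a full-size window is an exact match.
            if (count == wordCount) ret.push_back(left);
        }
    }
    return ret;
}
```

For "barfoothefoobarman" and ["foo", "bar"] this returns [0, 9]. Indices are produced offset by offset, so in general the result is not sorted, which is fine because the problem says the order does not matter.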