
187. Repeated DNA Sequences

2016-05-09 17:02

Problem

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: “ACGAATTCCG”. When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

For example,

Given s = “AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT”,

Return:

[“AAAAACCCCC”, “CCCCCAAAAA”].

Solution

An answer from the Discuss forum

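The ASCII codes of "A", "C", "G", and "T" are 65, 67, 71, and 84, which in binary are 01000001, 01000011, 01000111, and 01010100. Their lowest three bits (001, 011, 111, and 100) are all different, so three bits per letter are enough to tell the four letters apart. A 10-letter window therefore fits in 10 × 3 = 30 bits, which is why the solution masks its rolling key with 0x3FFFFFFF: shifting in a new letter pushes the oldest one out of the 30-bit window.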
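The encoding described above can be sketched in isolation. The helper names `nucleotideCode` and `packWindow` are hypothetical and do not appear in the Discuss answer; this is only a minimal illustration, assuming the input contains nothing but the letters A, C, G, and T.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: the low three bits of a nucleotide's ASCII code.
// 'A' -> 1 (001), 'C' -> 3 (011), 'G' -> 7 (111), 'T' -> 4 (100)
constexpr std::uint32_t nucleotideCode(char c) {
    return static_cast<std::uint32_t>(c) & 0x7u;
}

// Hypothetical helper: pack a window of letters into a 30-bit key,
// three bits per letter. If more than 10 letters are fed in, the mask
// drops the oldest ones, so only the last 10 letters remain in the key.
std::uint32_t packWindow(const std::string& window) {
    std::uint32_t key = 0;
    for (char c : window) {
        key = ((key << 3) | nucleotideCode(c)) & 0x3FFFFFFFu;
    }
    return key;
}
```

Because each letter maps to a distinct 3-bit code, two different 10-letter windows can never produce the same key, so the integer key can stand in for the substring when counting occurrences.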

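#include <string>
#include <unordered_map>
#include <vector>
using namespace std;

class Solution {
public:
    vector<string> findRepeatedDnaSequences(string s) {
        // 30-bit window key -> number of times seen (stops counting at 2)
        unordered_map<unsigned int, int> seen;
        vector<string> ret;
        // Unsigned, so the left shift cannot overflow a signed int.
        unsigned int key = 0;
        for (int i = 0; i < (int)s.length(); ++i) {
            // Shift in the low 3 bits of s[i]; the mask keeps the last 10 letters.
            key = ((key << 3) | (s[i] & 0x7)) & 0x3FFFFFFF;
            if (i < 9)
                continue;  // the first full 10-letter window ends at index 9
            if (seen.find(key) == seen.end())
                seen[key] = 1;
            else if (seen[key] == 1) {
                // Second occurrence: report this window exactly once.
                ret.push_back(s.substr(i - 9, 10));
                seen[key]++;
            }
        }
        return ret;
    }
};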
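
To sanity-check the bit-key solution, it can help to compare it against a plain baseline that uses each 10-letter substring itself as the hash key. The function name `findRepeatedBaseline` is hypothetical and this is not the Discuss answer, only a simple reference sketch that costs more memory and hashing time per window.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Baseline for cross-checking (hypothetical, not the Discuss answer):
// count every 10-letter substring directly and report it the moment
// it is seen for the second time.
std::vector<std::string> findRepeatedBaseline(const std::string& s) {
    std::unordered_map<std::string, int> count;
    std::vector<std::string> result;
    for (std::size_t i = 0; i + 10 <= s.size(); ++i) {
        std::string window = s.substr(i, 10);
        if (++count[window] == 2) {
            result.push_back(window);
        }
    }
    return result;
}
```

For the example input from the problem statement, this baseline yields "AAAAACCCCC" followed by "CCCCCAAAAA", which matches the expected answer.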