
[LeetCode 187] Repeated DNA Sequences

2015-12-02 08:43
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

For example,

Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",

Return:
["AAAAACCCCC", "CCCCCAAAAA"].
Company Tags: LinkedIn
Tags: Hash Table, Bit Manipulation


If you set bit manipulation aside, this problem comes down to essentially one line of code:

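#include <string>
#include <unordered_map>
#include <vector>
using namespace std;

class Solution {
public:
    vector<string> findRepeatedDnaSequences(string s) {
        unordered_map<string, int> mp;  // 10-letter window -> times seen so far
        int n = s.size();
        vector<string> res;
        for (int i = 0; i < n - 9; ++i) {
            // Report a window exactly once: on its second occurrence.
            if (mp[s.substr(i, 10)]++ == 1) res.push_back(s.substr(i, 10));
        }
        return res;
    }
};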


But it is slow. Then I saw the 8 ms code in the Discuss section. Bit manipulation problems always terrify me, which is annoying! Let me work through it properly:

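vector<string> findRepeatedDnaSequences(string s) {
    // One counter per 20-bit code (10 letters x 2 bits), kept on the heap
    // rather than in a 1 MB stack array.
    vector<unsigned char> hashMap(1 << 20, 0);
    vector<string> ans;
    int len = s.size(), hashNum = 0;
    if (len < 11) return ans;  // two distinct 10-letter windows need 11 letters
    // (c - 'A' + 1) % 5 maps A->1, C->3, G->2, T->0: a distinct 2-bit code each.
    for (int i = 0; i < 9; ++i)
        hashNum = hashNum << 2 | (s[i] - 'A' + 1) % 5;
    for (int i = 9; i < len; ++i) {
        // Shift in the next letter and keep the low 20 bits (the last 10 letters).
        hashNum = (hashNum << 2 | (s[i] - 'A' + 1) % 5) & 0xfffff;
        // Stop counting at 2 so the counter cannot wrap and report a window twice.
        if (hashMap[hashNum] < 2 && hashMap[hashNum]++ == 1)
            ans.push_back(s.substr(i - 9, 10));
    }
    return ans;
}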
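
To see why this works, it may help to isolate the encoding. The sketch below uses two hypothetical helpers, nucleotideCode and packWindow (neither name comes from the Discuss code), to show that each letter gets its own 2-bit code and that a 10-letter window therefore fits in 20 bits, which is why the table has 1048576 = 2^20 entries.

```cpp
#include <string>

// Hypothetical helper: 2-bit code for one nucleotide, using the same formula
// as the solution above.
int nucleotideCode(char c) {
    return (c - 'A' + 1) % 5;  // A -> 1, C -> 3, G -> 2, T -> 0
}

// Hypothetical helper: pack the 10 letters starting at `start` into 20 bits.
// Assumes start + 10 <= s.size() and that s contains only A, C, G, T.
int packWindow(const std::string& s, int start) {
    int code = 0;
    for (int i = start; i < start + 10; ++i)
        code = (code << 2 | nucleotideCode(s[i])) & 0xfffff;
    return code;
}
```

Sliding the window one letter to the right is then a shift, an OR, and a mask: (packWindow(s, 0) << 2 | nucleotideCode(s[10])) & 0xfffff equals packWindow(s, 1), because the mask drops the two bits of the letter that left the window. Since the mapping from window to code is one-to-one, two windows share a table slot only when they are the same sequence.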
Tags: leetcode