您的位置:首页 > 其它

levelDB跳表实现

2016-03-02 23:22 316 查看

跳表的原理就是利用随机性建立索引,加速搜索,并且简化代码实现难度。具体的跳表原理不再赘述,主要是看了levelDB有一些实现细节的东西,凸显自己写的实现不足之处。



去除冗余的key

template<typename Key, class Comparator>
struct SkipList<Key,Comparator>::Node {
explicit Node(const Key& k) : key(k) { }

Key const key;

// Accessors/mutators for links.  Wrapped in methods so we can
// add the appropriate barriers as necessary.
Node* Next(int n) {
assert(n >= 0);
// Use an 'acquire load' so that we observe a fully initialized
// version of the returned Node.
return reinterpret_cast<Node*>(next_
.Acquire_Load());
}
void SetNext(int n, Node* x) {
assert(n >= 0);
// Use a 'release store' so that anybody who reads through this
// pointer observes a fully initialized version of the inserted node.
next_
.Release_Store(x);
}

// No-barrier variants that can be safely used in a few locations.
Node* NoBarrier_Next(int n) {
assert(n >= 0);
return reinterpret_cast<Node*>(next_
.NoBarrier_Load());
}
void NoBarrier_SetNext(int n, Node* x) {
assert(n >= 0);
next_
.NoBarrier_Store(x);
}

private:
// Array of length equal to the node height.  next_[0] is lowest level link.
port::AtomicPointer next_[1];
};

这里使用一个Node节点表示所有相同key,不同高度的节点集合,仅保留了key和不同高度的向右指针,并且使用NewNode来动态分配随即高度的向右指针集合,而next_就指向这指针集合。这也是c/c++ tricky的地方。

#include <stdio.h>
struct Node {
char str[1];
};
int main() {
char* mem = new char[4];
for (int i = 0; i < 4; i++) {
mem[i] = i + '0';
}
Node* node = (Node*)mem;
char* const pstr = node->str;
for (int i = 0; i < 4; i++) {
printf("%c", pstr[i]);
}
return 0;
}

就像上面这个简单的sample,成员str可以作为指针指向从数组下标0开始的元素,并且不受申明时的限制,不局限于大小1,索引至分配的最大的内存地址。

简易随机数生成

uint32_t Next() {
static const uint32_t M = 2147483647L;   // 2^31-1
static const uint64_t A = 16807;  // bits 14, 8, 7, 5, 2, 1, 0
// We are computing
//       seed_ = (seed_ * A) % M,    where M = 2^31-1
//
// seed_ must not be zero or M, or else all subsequent computed values
// will be zero or M respectively.  For all other values, seed_ will end
// up cycling through every number in [1,M-1]
uint64_t product = seed_ * A;

// Compute (product % M) using the fact that ((x << 31) % M) == x.
seed_ = static_cast<uint32_t>((product >> 31) + (product & M));
// The first reduction may overflow by 1 bit, so we may need to
// repeat.  mod == M is not possible; using > allows the faster
// sign-bit-based test.
if (seed_ > M) {
seed_ -= M;
}
return seed_;
}

可以看到,他使用A和M对种子进行运算,达到一定数据范围内不会重复的数集,而里面对于(product % M),使用(product >> 31) + (product & M)进行运算优化,考虑右移和与操作的代价远小于取余操作。

简洁清晰的私有帮助方法,帮助寻找小于指定key的节点

template<typename Key, class Comparator>
typename SkipList<Key,Comparator>::Node*
SkipList<Key,Comparator>::FindLessThan(const Key& key) const {
Node* x = head_;
int level = GetMaxHeight() - 1;
while (true) {
assert(x == head_ || compare_(x->key, key) < 0);
Node* next = x->Next(level);
if (next == NULL || compare_(next->key, key) >= 0) {
if (level == 0) {
return x;
} else {
// Switch to next list
level--;
}
} else {
x = next;
}
}
}
内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签: