
Technical Details Worth Learning from the Redis Source Code, Part 02

2016-02-05 22:05
Hash function implementations in Redis:

Redis uses different hash functions for integer keys and string keys.

For integer keys, Redis uses Thomas Wang's 32 bit Mix Function, implemented as dict.c/dictIntHashFunction:

/* Thomas Wang's 32 bit Mix Function */
unsigned int dictIntHashFunction(unsigned int key)
{
    key += ~(key << 15);
    key ^=  (key >> 10);
    key +=  (key << 3);
    key ^=  (key >> 6);
    key += ~(key << 11);
    key ^=  (key >> 16);
    return key;
}
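To illustrate how a mixed value like this is typically consumed, here is a minimal sketch that maps an integer key to a bucket. The names `int_hash` and `bucket_index` are hypothetical, and the power-of-two table size with a bit mask is an assumption of this sketch, not something shown in the excerpt above.

```c
#include <assert.h>
#include <stdint.h>

/* Same mixing steps as dictIntHashFunction, written with uint32_t
 * so that the 32-bit width is explicit on every platform. */
static uint32_t int_hash(uint32_t key)
{
    key += ~(key << 15);
    key ^=  (key >> 10);
    key +=  (key << 3);
    key ^=  (key >> 6);
    key += ~(key << 11);
    key ^=  (key >> 16);
    return key;
}

/* Hypothetical helper: pick a bucket in a table whose size is a
 * power of two (an assumption of this sketch). */
static uint32_t bucket_index(uint32_t key, uint32_t table_size)
{
    /* The mask trick below only works for power-of-two sizes. */
    assert(table_size != 0 && (table_size & (table_size - 1)) == 0);
    return int_hash(key) & (table_size - 1);
}
```

Each of the six steps is invertible modulo 2^32, so two different 32-bit keys never produce the same mixed value; collisions only appear once the value is reduced to a bucket index.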


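I have not yet had time to study why this code works so well, and I will add an explanation here once I have. In the meantime, here are two links that look useful at first glance:

First, Thomas Wang's own page:
http://web.archive.org/web/20071223173210/http://www.concentric.net/~Ttwang/tech/inthash.htm
Second, a summary someone else wrote based on that page and other material:
http://blog.csdn.net/jasper_xulei/article/details/18364313

For string keys, Redis uses the MurmurHash2 algorithm and the djb algorithm: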

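MurmurHash2 is case-sensitive with respect to the key, and it produces different results on big-endian and little-endian machines.

dict.c/dictGenHashFunction in Redis is a C implementation of MurmurHash2: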

unsigned int dictGenHashFunction(const void *key, int len) {
    /* 'm' and 'r' are mixing constants generated offline.
     They're not really 'magic', they just happen to work well.  */
    uint32_t seed = dict_hash_function_seed;
    const uint32_t m = 0x5bd1e995;
    const int r = 24;

    /* Initialize the hash to a 'random' value */
    uint32_t h = seed ^ len;

    /* Mix 4 bytes at a time into the hash */
    const unsigned char *data = (const unsigned char *)key;

    while(len >= 4) {
        uint32_t k = *(uint32_t*)data;

        k *= m;
        k ^= k >> r;
        k *= m;

        h *= m;
        h ^= k;

        data += 4;
        len -= 4;
    }

    /* Handle the last few bytes of the input array  */
    switch(len) {
    case 3: h ^= data[2] << 16;
    case 2: h ^= data[1] << 8;
    case 1: h ^= data[0]; h *= m;
    };

    /* Do a few final mixes of the hash to ensure the last few
     * bytes are well-incorporated. */
    h ^= h >> 13;
    h *= m;
    h ^= h >> 15;

    return (unsigned int)h;
}
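The endianness dependence comes from the statement `uint32_t k = *(uint32_t*)data;`, which reads four bytes in the machine's native byte order. The sketch below contrasts a native load with a load that assembles the bytes explicitly in little-endian order. The helper names `load_native32` and `load_le32` are hypothetical and do not appear in dict.c.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Native-order load: the result depends on the machine's endianness,
 * just like the pointer cast in dictGenHashFunction. memcpy is used
 * here so the read is also safe for unaligned addresses. */
static uint32_t load_native32(const unsigned char *p)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}

/* Explicit little-endian load: the result is the same on every machine. */
static uint32_t load_le32(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

On a little-endian machine the two loads agree. On a big-endian machine `load_native32` returns the byte-swapped value, so a hash built on it differs for the same input bytes.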


Redis also uses the djb hash to implement a case-insensitive hash function, dict.c/dictGenCaseHashFunction:

unsigned int dictGenCaseHashFunction(const unsigned char *buf, int len) {
    unsigned int hash = (unsigned int)dict_hash_function_seed;

    while (len--)
        hash = ((hash << 5) + hash) + (tolower(*buf++)); /* hash * 33 + c */
    return hash;
}
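The effect of `tolower` is easiest to see in a standalone variant. This is a minimal sketch in which the seed is passed as a parameter instead of being read from the global `dict_hash_function_seed`. The name `djb_case_hash` is hypothetical, and the seed value 5381 mentioned below is the classic djb starting value, assumed here for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Case-insensitive djb-style hash: hash = hash * 33 + tolower(c).
 * The seed is an explicit parameter (an assumption of this sketch). */
static uint32_t djb_case_hash(const unsigned char *buf, size_t len, uint32_t seed)
{
    uint32_t hash = seed;

    assert(buf != NULL || len == 0);
    while (len--)
        hash = ((hash << 5) + hash) + (uint32_t)tolower(*buf++);
    return hash;
}
```

With a seed of 5381, hashing "FOO" and "foo" gives the same value because every byte is lowercased before it is mixed in, and hashing the single byte "a" gives 5381 * 33 + 97 = 177670.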


The three hash functions above (dictIntHashFunction, dictGenHashFunction, dictGenCaseHashFunction) are used in different parts of Redis to meet different hashing needs. Later posts will cover them in detail.