您的位置：首页 > 数据库 > Redis

Redis近似LRU算法优化

2016-11-25 18:43 423 查看

本文和大家分享的主要是redis中，从算法性能及模拟效果提升出发优化近似LRU算法的相关操作，希望通过本文的分享，对大家学习 redis
有所帮助。

Redis 3.0 LRU算法优化实现

Redis 3.0中主要做了如下优化：

· LRU 时钟的粒度从秒级提升为毫秒级

· 使用新的 API 来获取LRU替换时的采样样本

· 默认的LRU采样样本数从3提升为5

· 使用 eviction pool 来选取需要淘汰的key

　　提升 LRU时钟的粒度，主要是为了在测试LRU算法性能时，能够在更短的时间内获取结果，更新LRU时钟的方法也有所变化，如果LRU时钟的时间粒度高于
serverCron刷新的时间粒度，那么就主动获取最新的时间，否则使用server缓存的时间，

/* Macro used to obtain the current LRU clock.

* If the current resolution is lower than the frequency we refresh the

* LRU clock (as it should be in production servers) we return the

* precomputed value, otherwise we need to resort to a system call. */#define LRU_CLOCK() ((1000/server.hz <= LRU_CLOCK_RESOLUTION) ? server.lruclock : getLRUClock())

unsigned int getLRUClock(void) {

return (mstime()/LRU_CLOCK_RESOLUTION) & LRU_CLOCK_MAX;

}

　　在源码的 utils/lru 目录下有测试脚本，测试前需要把 src/redis.h 中的 REDIS_LRU_CLOCK_RESOLUTION 宏设置为1，即LRU时钟的分辨率为
1ms ，然后重新编译源码，执行方式如下，

ruby test-lru.rb > /tmp/lru.html

　　测试完成后会生成一个 html 页面，包含测试结果，以及一个图形化的插入淘汰流程

在 Redis 2.8 中每次选取淘汰样本时，都是调用 dictGetRandomKey 来随机获取一个key，会根据
maxmemory-samples 配置的大小，多次调用。这个流程在 Redis 3.0中被优化为一次调用获取指定数量的 key ，且不需要每次都调用随机函数，如下，

unsigned int dictGetSomeKeys(dict *d, dictEntry **des, unsigned int count) {

unsigned long j; /* internal hash table id, 0 or 1. */

unsigned long tables; /* 1 or 2 tables? */

unsigned long stored = 0, maxsizemask;

unsigned long maxsteps;

if (dictSize(d) < count) count = dictSize(d);

maxsteps = count*10;

/* Try to do a rehashing work proportional to 'count'. */

for (j = 0; j < count; j++) {

if (dictIsRehashing(d))

_dictRehashStep(d);

else

break;

}

tables = dictIsRehashing(d) ? 2 : 1;

maxsizemask = d->ht[0].sizemask;

if (tables > 1 && maxsizemask < d->ht[1].sizemask)

maxsizemask = d->ht[1].sizemask;

/* Pick a random point inside the larger table. */

unsigned long i = random() & maxsizemask;

unsigned long emptylen = 0; /* Continuous empty entries so far. */

while(stored < count && maxsteps--) {

for (j = 0; j < tables; j++) {

if (tables == 2 && j == 0 && i < (unsigned long) d->rehashidx) {

if (i >= d->ht[1].size) i = d->rehashidx;

continue;

}

if (i >= d->ht[j].size) continue; /* Out of range for this table. */

dictEntry *he = d->ht[j].table[i];

/* Count contiguous empty buckets, and jump to other

* locations if they reach 'count' (with a minimum of 5). */

if (he == NULL) {

emptylen++;

if (emptylen >= 5 && emptylen > count) {

i = random() & maxsizemask;

emptylen = 0;

}

} else {

emptylen = 0;

while (he) {

/* Collect all the elements of the buckets found non

* empty while iterating. */

*des = he;

des++;

he = he->next;

stored++;

if (stored == count) return stored;

}

}

}

i = (i+1) & maxsizemask;

}

return stored;

}

dictGetSomeKeys 会随机从db的某个起始位置开始，连续获取指定数量的
key，需要注意的是，如果 db 对应的字典正在做 rehash ，可能需要从两个 hashtable来获取key。如果需要根据某种分布来随机获取字典里面的key，这种采样方式可能是不合适的，但是如果只是为了随机获取一系列key来作为LRU算法的淘汰样本，这种方式是可行的。

　　采样性能的提升带来的好处就是，我们可以在不牺牲淘汰算法性能的情况下，提高采样的样本数，让Redis的近似LRU算法更接近于严格LRU算法，所以目前Redis把超过
maxmemory 后默认的采样样本数从 3 个提升到 5 个。

　　最后一个也是最重要的改进是，选取要淘汰 key 的流程。之前是每次随机选取 maxmemory-samples 个key，然后比较它们的
idle 时间，idle时间最久的key会被淘汰掉。在Redis 3.0中增加了一个
eviction pool 的结构，eviction pool是一个数组，保存了之前随机选取的key及它们的idle时间，数组里面的key按idle时间升序排序，当内存满了需要淘汰数据时，会调用
dictGetSomeKeys 选取指定的数目的key，然后更新到eviction pool里面，如果新选取的key的idle时间比eviction
pool里面idle时间最小的key还要小，那么就不会把它插入到eviction
pool里面，这个思路和 LIRS替换算法利用的每个块的历史信息思想有些类似，

eviction pool更新逻辑代码如下，

#define EVICTION_SAMPLES_ARRAY_SIZE 16void evictionPoolPopulate(dict *sampledict, dict *keydict, struct evictionPoolEntry *pool) {

int j, k, count;

dictEntry *_samples[EVICTION_SAMPLES_ARRAY_SIZE];

dictEntry **samples;

/* Try to use a static buffer: this function is a big hit...

* Note: it was actually measured that this helps. */

if (server.maxmemory_samples <= EVICTION_SAMPLES_ARRAY_SIZE) {

samples = _samples;

} else {

samples = zmalloc(sizeof(samples[0])*server.maxmemory_samples);

}

count = dictGetSomeKeys(sampledict,samples,server.maxmemory_samples);

for (j = 0; j < count; j++) {

unsigned long long idle;

sds key;

robj *o;

dictEntry *de;

de = samples[j];

key = dictGetKey(de);

/* If the dictionary we are sampling from is not the main

* dictionary (but the expires one) we need to lookup the key

* again in the key dictionary to obtain the value object. */

if (sampledict != keydict) de = dictFind(keydict, key);

o = dictGetVal(de);

idle = estimateObjectIdleTime(o);

/* Insert the element inside the pool.

* First, find the first empty bucket or the first populated

* bucket that has an idle time smaller than our idle time. */

k = 0;

while (k < MAXMEMORY_EVICTION_POOL_SIZE &&

pool[k].key &&

pool[k].idle < idle) k++;

if (k == 0 && pool[MAXMEMORY_EVICTION_POOL_SIZE-1].key != NULL) {

/* Can't insert if the element is < the worst element we have

* and there are no empty buckets. */

continue;

} else if (k < MAXMEMORY_EVICTION_POOL_SIZE && pool[k].key == NULL) {

/* Inserting into empty position. No setup needed before insert. */

} else {

/* Inserting in the middle. Now k points to the first element

* greater than the element to insert. */

if (pool[MAXMEMORY_EVICTION_POOL_SIZE-1].key == NULL) {

/* Free space on the right? Insert at k shifting

* all the elements from k to end to the right. */

memmove(pool+k+1,pool+k,

sizeof(pool[0])*(MAXMEMORY_EVICTION_POOL_SIZE-k-1));

} else {

/* No free space on right? Insert at k-1 */

k--;

/* Shift all elements on the left of k (included) to the

* left, so we discard the element with smaller idle time. */

sdsfree(pool[0].key);

memmove(pool,pool+1,sizeof(pool[0])*k);

}

}

pool[k].key = sdsdup(key);

pool[k].idle = idle;

}

if (samples != _samples) zfree(samples);

}

　　当选取的淘汰策略和LRU相关时（allkeys-lru或volatile-lru），
freeMemoryIfNeeded会调用 evictionPoolPopulate 来更新eviction pool，然后淘汰掉eviction pool里面的最后一个元素所对应的key，这样的选取淘汰key的方式的好处是：假设说新随机选取的key的访问时间可能比历史随机选取的key的访问时间还要新，但是在Redis
2.8中，新选取的key会被淘汰掉，这和LRU算法利用的访问局部性原理是相违背的，在Redis
3.0中，这种情况被避免了。

　　此外，如果某个历史选取的key的idle时间相对来说比较久，但是本次淘汰并没有被选中，因为出现了idle时间更久的key，那么在使用eviction
pool的情况下，这种idle时间比较久的key淘汰概率增大了，因为它在eviction
pool里面被保存下来，参与下轮淘汰，这个思路和访问局部性原理是契合的。

　　和 Redis 2.8 相比，改进的效果我们可以引用下图：

可以看到在前面1/2需要淘汰的key里面（
浅灰色的点）， Redis 3.0 残留下来的key明显比Redis 2.8少了很多，而且后面新插入的1/2的key里面（
绿色的点），Redis 3.0没有一个淘汰的key。

总结

Redis 3.0中对于 LRU 替换算法的优化，在只维护一个eviction pool带来的少量开销情况下，对算法效率的提升是比较明显的，效率的提升带来的是访问命中率的提升。同时，在目前3.4的unstable版本中我们也可以看见Redis计划实现
LFU 算法以支持更丰富的业务场景，阿里云Redis服务团队也会持续跟进。此外，对于
LIRS 这种基于LRU的改进算法，在不影响性能的前提下，我们也会研究在内核上做支持。

来源：云栖博客

内容来自用户分享和网络整理，不保证内容的准确性，如有侵权内容，可联系管理员处理

标签：

相关文章推荐

新的分享

章节导航