Redis精通系列——LFU演算法詳述(Least Frequently Used

本文已收錄于專欄

??《Redis精通系列》??

上千人點贊收藏，全套Redis學習資料，大廠必備技能！

1、簡介

2、實作方式

2.1 LRU實作方式

2.2 LFU實作方式

3、LFU使用

3.1 組態檔開啟LFU淘汰演算法

1、簡介

LRU有一個明顯的缺點，它無法正確的表示一個Key的熱度，如果一個key從未被訪問過，僅僅發生記憶體淘汰的前一會兒被用戶訪問了一下，在LRU演算法中這會被認為是一個熱key，
例如如下圖，keyA與keyB同時被set到Redis中，在記憶體淘汰發生之前，keyA被頻繁的訪問，而keyB只被訪問了一次，但是這次訪問的時間比keyA的任意一次訪問時間都更接近記憶體淘汰觸發的時間，如果keyA與keyB均被Redis選中進行淘汰，keyA將被優先淘汰，我想大家都不太希望keyA被淘汰吧，那么有沒有更好的的記憶體淘汰機制呢？當然有，那就是LFU，

LRU存在的問題.png

LFU(Least Frequently Used)是Redis 4.0 引入的淘汰演算法，它通過key的訪問頻率比較來淘汰key，重點突出的是Frequently Used，
?

LRU與LFU的區別：

LRU -> Recently Used，根據最近一次訪問的時間比較
LFU -> Frequently Used，根據key的訪問頻率比較

Redis4.0之后為maxmemory_policy淘汰策略添加了兩個LFU模式（LRU請看我上一篇文章）：

volatile-lfu：對有過期時間的key采用LFU淘汰演算法
allkeys-lfu：對全部key采用LFU淘汰演算法

2、實作方式

Redis分配一個字串的最小空間占用是19位元組，16位元組（物件頭）+3位元組（SDS基本欄位），Redis的記憶體淘汰演算法LRU/LFU均依靠其中的物件頭中的lru來實作，
Redis物件頭的記憶體結構：

typedef struct redisObject {
    unsigned type:4;        // 4 bits 物件的型別（zset、set、hash等）
    unsigned encoding:4;    // 4 bits 物件的存盤方式（ziplist、intset等）
    unsigned lru:24;        // 24bits 記錄物件的訪問資訊
    int refcount;            // 4 bytes 參考計數
    void *ptr;                // 8 bytes （64位作業系統），指向物件具體的存盤地址/物件body
}

Redis物件頭中的lru欄位，在LRU模式下和LFU模式下使用方式并不相同，
?

2.1 LRU實作方式

在LRU模式，lru欄位存盤的是key被訪問時Redis的時鐘server.lrulock（Redis為了保證核心單執行緒服務性能，快取了Unix作業系統時鐘，默認每毫秒更新一次，快取的值是Unix時間戳取模2^24），當key被訪問的時候，Redis會更新這個key的物件頭中lru欄位的值，
因此在LRU模式下，Redis可以根據物件頭中的lru欄位記錄的值，來比較最后一次key的訪問時間，
?

用Java代碼演示一個簡單的Redis-LRU演算法：

Redis物件頭

package com.lizba.redis.lru;

/**
 * <p>
 *      Redis物件頭
 * </p>
 *
 * @Author: Liziba
 * @Date: 2021/9/22 22:40
 */
public class RedisHead {

    /** 時間 */
    private Long lru;
    /** 具體資料 */
    private Object body;

    public RedisHead setLru(Long lru) {
        this.lru = lru;
        return this;
    }

    public RedisHead setBody(Object body) {
        this.body = body;
        return this;
    }


    public Long getLru() {
        return lru;
    }

    public Object getBody() {
        return body;
    }

}

Redis LRU實作代碼

package com.lizba.redis.lru;

import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

/**
 * <p>
 * Redis中LRU演算法的實作demo
 * </p>
 *
 * @Author: Liziba
 * @Date: 2021/9/22 22:36
 */
public class RedisLruDemo {

    /**
     * 快取容器
     */
    private ConcurrentHashMap<String, RedisHead> cache;
    /**
     * 初始化大小
     */
    private int initialCapacity;

    public RedisLruDemo(int initialCapacity) {
        this.initialCapacity = initialCapacity;
        this.cache = new ConcurrentHashMap<>(initialCapacity);
        ;
    }

    /**
     * 設定key/value 設定的時候更新LRU
     *
     * @param key
     * @param body
     */
    public void set(String key, Object body) {
        // 觸發LRU淘汰
        synchronized (RedisLruDemo.class) {
            if (!cache.containsKey(key) && cache.size() >= initialCapacity) {
                this.flushLruKey();
            }
        }
        RedisHead obj = this.getRedisHead().setBody(body).setLru(System.currentTimeMillis());
        cache.put(key, obj);
    }


    /**
     * 獲取key，存在則更新LRU
     *
     * @param key
     * @return
     */
    public Object get(String key) {

        RedisHead result = null;
        if (cache.containsKey(key)) {
            result = cache.get(key);
            result.setLru(System.currentTimeMillis());
        }

        return result;
    }


    /**
     * 清除LRU key
     */
    private void flushLruKey() {

        List<String> sortData = cache.keySet()
                .stream()
                .sorted(Comparator.comparing(key -> cache.get(key).getLru()))
                .collect(Collectors.toList());
        String removeKey = sortData.get(0);
        System.out.println( "淘汰 -> " + "lru : " + cache.get(removeKey).getLru() + " body : " + cache.get(removeKey).getBody());
        cache.remove(removeKey);
        if (cache.size() >= initialCapacity) {
            this.flushLruKey();
        }
        return;
    }


    /**
     *  獲取所有資料測驗用
     *
     * @return
     */
    public List<RedisHead> getAll() {
         return cache.keySet().stream().map(key -> cache.get(key)).collect(Collectors.toList());
    }


    private RedisHead getRedisHead() {
        return new RedisHead();
    }

}

測驗代碼

package com.lizba.redis.lru;

import java.util.Random;
import java.util.concurrent.TimeUnit;

/**
 * <p>
 *      測驗LRU
 * </p>
 *
 * @Author: Liziba
 * @Date: 2021/9/22 22:51
 */
public class TestRedisLruDemo {

    public static void main(String[] args) throws InterruptedException {

        RedisLruDemo demo = new RedisLruDemo(10);
        // 先加入10個key，此時cache達到容量，下次加入會淘汰key
        for (int i = 0; i < 10; i++) {
            demo.set(i + "", i);
        }
        // 隨機訪問前十個key，這樣可以保證下次加入時隨機淘汰
        for (int i = 0; i < 20; i++) {
            int nextInt = new Random().nextInt(10);
            TimeUnit.SECONDS.sleep(1);
            demo.get(nextInt + "");
        }
        // 再次添加5個key，此時每次添加都會觸發淘汰
        for (int i = 10; i < 15; i++) {
            demo.set(i + "", i);
        }

        System.out.println("-------------------------------------------");
        demo.getAll().forEach( redisHead -> System.out.println("剩余 -> " + "lru : " + redisHead.getLru() + " body : " + redisHead.getBody()));
    }

}

測驗結果

2.2 LFU實作方式

在LFU模式下，Redis物件頭的24bit lru欄位被分成兩段來存盤，高16bit存盤ldt(Last Decrement Time)，低8bit存盤logc(Logistic Counter)，

lru_24 bit.png

2.2.1 ldt(Last Decrement Time)

高16bit用來記錄最近一次計數器降低的時間，由于只有8bit，存盤的是Unix分鐘時間戳取模2^16，16bit能表示的最大值為65535（65535/24/60≈45.5），大概45.5天會折返（折返指的是取模后的值重新從0開始），
?

Last Decrement Time計算的演算法原始碼：

/* Return the current time in minutes, just taking the least significant
 * 16 bits. The returned time is suitable to be stored as LDT (last decrement
 * time) for the LFU implementation. */
// server.unixtime是Redis快取的Unix時間戳
// 可以看出使用的Unix的分鐘時間戳，取模2^16
unsigned long LFUGetTimeInMinutes(void) {
  return (server.unixtime/60) & 65535;
}

/* Given an object last access time, compute the minimum number of minutes
 * that elapsed since the last access. Handle overflow (ldt greater than
 * the current 16 bits minutes time) considering the time as wrapping
 * exactly once. */
unsigned long LFUTimeElapsed(unsigned long ldt) {
  // 獲取系統當前的LFU time
  unsigned long now = LFUGetTimeInMinutes();
  // 如果now >= ldt 直接取差值  
  if (now >= ldt) return now-ldt;
  // 如果now < ldt 增加上65535
  // 注意Redis 認為折返就只有一次折返，多次折返也是一次，我思考了很久感覺這個應該是可以接受的，本身Redis的淘汰演算法就帶有隨機性  
  return 65535-ldt+now;
}

2.2.2 logc(Logistic Counter)

低8位用來記錄訪問頻次，8bit能表示的最大值為255，logc肯定無法記錄真實的Rediskey的訪問次數，其實從名字可以看出存盤的是訪問次數的對數值，每個新加入的key的logc初始值為5（LFU_INITI_VAL），這樣可以保證新加入的值不會被首先選中淘汰；logc每次key被訪問時都會更新；此外，logc會隨著時間衰減，
?

2.2.3 logc 演算法調整

redis.conf 提供了兩個配置項，用于調整LFU的演算法從而控制Logistic Counter的增長和衰減，

lfu-log-factor 用于調整Logistic Counter的增長速度，lfu-log-factor值越大，Logistic Counter增長越慢，

Redis Logistic Counter增長的源代碼：

/* Logarithmically increment a counter. The greater is the current counter value
 * the less likely is that it gets really implemented. Saturate it at 255. */
uint8_t LFULogIncr(uint8_t counter) {
  // Logistic Counter最大值為255  
  if (counter == 255) return 255;
  // 取一個0~1的亂數r  
  double r = (double)rand()/RAND_MAX;
  // counter減去LFU_INIT_VAL （LFU_INIT_VAL為每個key的Logistic Counter初始值，默認為5）
  double baseval = counter - LFU_INIT_VAL;
  // 如果衰減之后已經小于5了，那么baseval < 0取0
  if (baseval < 0) baseval = 0;
  // lfu-log-factor在這里被使用
  // 可以看出如果lfu_log_factor的值越大，p越小
  // r < p的概率就越小，Logistic Counter增加的概率就越小（因此lfu_log_factor越大增長越緩慢）
  double p = 1.0/(baseval*server.lfu_log_factor+1);
  if (r < p) counter++;
  return counter;
}

如下是官網提供lfu-log-factor在不同值下，key隨著訪問次數的增加的Logistic Counter變化情況的資料：

lfu-decay-time 用于調整Logistic Counter的衰減速度，它是一個以分鐘為單位的數值，默認值為1,；lfu-decay-time值越大，衰減越慢，

Redis Logistic Counter衰減的源代碼：

/* If the object decrement time is reached decrement the LFU counter but
 * do not update LFU fields of the object, we update the access time
 * and counter in an explicit way when the object is really accessed.
 * And we will times halve the counter according to the times of
 * elapsed time than server.lfu_decay_time.
 * Return the object frequency counter.
 *
 * This function is used in order to scan the dataset for the best object
 * to fit: as we check for the candidate, we incrementally decrement the
 * counter of the scanned objects if needed. */
unsigned long LFUDecrAndReturn(robj *o) {
  // 獲取lru的高16位，也就是ldt
  unsigned long ldt = o->lru >> 8;  
  // 獲取lru的低8位，也就是logc  
  unsigned long counter = o->lru & 255;
  // 根據配置的lfu-decay-time計算Logistic Counter需要衰減的值
  unsigned long num_periods = server.lfu_decay_time ? LFUTimeElapsed(ldt) / server.lfu_decay_time : 0;
  if (num_periods)
    counter = (num_periods > counter) ? 0 : counter - num_periods;
  return counter;
}

2.2.4 LFU 優化

LFU 與 LRU 有一個共同點，當記憶體達到max_memory時，選擇key是隨機抓取的，因此Redis為了使這種隨機性更加準確，設計了一個淘汰池，這個淘汰池對于LFU和LRU算的都適應，只是淘汰池的排序演算法有區別而已，
Redis 3.0就對這一塊進行了優化（來自redis.io）：