Redis Cluster: Automated Installation, Scale-Out, and Scale-In
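I previously wrote about automating Redis Cluster installation with Python. Building a cluster purely from raw commands is quite tedious, which is why the official distribution provided the redis-trib.rb tool.
Although redis-trib.rb offers command-line tools for creating, checking, repairing, and rebalancing a cluster, I personally cannot accept it, because it does not let you define the master-slave relationships between the nodes of the cluster.
Take six nodes A, B, C, D, E, and F: when creating the cluster you need to state explicitly which nodes are masters, which are slaves, and how they pair up. Unfortunately, redis-trib.rb gives you no way to control this.
More often than not you need to specify exactly which machines act as masters and which act as slaves, and redis-trib.rb cannot be told which machines (instances) in the cluster become masters and which become slaves.
Using redis-trib.rb also means dealing with the Ruby environment dependency, so I prefer not to build clusters with it.
To quote the book 《Redis開發與運維》 (Redis Development and Operations):
If the nodes are deployed on different IP addresses, redis-trib.rb tries as far as possible to keep a master and its slave off the same machine, so it reorders the node list.
The order of the node list determines the master and slave roles: masters first, then slaves.
In other words, with redis-trib.rb you cannot fully control how the master and slave roles are assigned.
Starting with Redis 5.0, redis-cli --cluster implements cluster creation and the related cluster management functions itself, so neither redis-trib.rb nor a Ruby environment is needed any more.
Even so, working with raw commands is still fairly complex, especially in a complicated production environment: creating, scaling out, or scaling in a cluster by hand involves a long series of manual steps and brings some safety risks.
Automating cluster creation, scale-out, and scale-in is therefore worthwhile.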
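Test environment
The implementation here is based on python3 and built on the redis-cli --cluster command. It automates cluster creation, scale-out, and scale-in.
The test environment runs multiple instances on a single machine, eight nodes in total:
1. Automated cluster creation: six nodes (10001~10006) form a cluster of 3 masters (10001~10003) and 3 slaves (10004~10006).
2. Automated scale-out: add the new node 10007 as a master, and add 10008 as the slave of 10007.
3. Automated scale-in: the reverse of step 2, removing node 10007 and its slave 10008 from the cluster.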
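Creating the Redis cluster
Creating the cluster boils down to two groups of commands: one creates the cluster from the master nodes, the other adds a slave to each master in turn.
Along the way you have to look up the ID of each node, which makes doing it by hand rather tedious.
The main commands are: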
################# create cluster #################
redis-cli --cluster create 127.0.0.1:10001 127.0.0.1:10002 127.0.0.1:10003 -a ****** --cluster-yes
################# add slave nodes #################
redis-cli --cluster add-node 127.0.0.1:10004 127.0.0.1:10001 --cluster-slave --cluster-master-id 6164025849a8ff9297664fc835bc851af5004f61 -a ******
redis-cli --cluster add-node 127.0.0.1:10005 127.0.0.1:10002 --cluster-slave --cluster-master-id 64e634307bdc339b503574f5a77f1b156c021358 -a ******
redis-cli --cluster add-node 127.0.0.1:10006 127.0.0.1:10003 --cluster-slave --cluster-master-id 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a -a ******
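The two groups of commands above can be generated from an explicit master-to-slave pairing. The following is a minimal sketch, not the script used in this article: the helper names `addr`, `build_create_command`, and `build_add_slave_command` are hypothetical, and the commands are built as argument lists (the form `subprocess.run` accepts) rather than shell strings. Only flags that already appear in the commands above are used.

```python
def addr(node):
    """Format a node dict as host:port."""
    return "{0}:{1}".format(node["host"], node["port"])


def build_create_command(masters, password):
    """Argument list that creates a cluster from the master nodes only."""
    return (["redis-cli", "--cluster", "create"]
            + [addr(m) for m in masters]
            + ["-a", password, "--cluster-yes"])


def build_add_slave_command(slave, master, master_id, password):
    """Argument list that attaches one slave to one specific master."""
    return ["redis-cli", "--cluster", "add-node", addr(slave), addr(master),
            "--cluster-slave", "--cluster-master-id", master_id,
            "-a", password]


masters = [{"host": "127.0.0.1", "port": p} for p in (10001, 10002, 10003)]
slaves = [{"host": "127.0.0.1", "port": p} for p in (10004, 10005, 10006)]

create_cmd = build_create_command(masters, "******")
print(" ".join(create_cmd))

# The pairing is explicit: the i-th slave is attached to the i-th master.
# The node ID is the one shown for 127.0.0.1:10001 in this article.
add_cmd = build_add_slave_command(
    slaves[0], masters[0],
    "6164025849a8ff9297664fc835bc851af5004f61", "******")
print(" ".join(add_cmd))
```

Because the pairing is just the position in two lists, which instance becomes the slave of which master is decided entirely by the caller.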
Below is the log of the redis-cli --cluster commands, printed while the Python script creates the cluster:
[root@JD redis_install]# python3 create_redis_cluster.py
################# flush master/slave slots #################
################# create cluster #################
redis-cli --cluster create 127.0.0.1:10001 127.0.0.1:10002 127.0.0.1:10003 -a ****** --cluster-yes
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Performing hash slots allocation on 3 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
M: 6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001
   slots:[0-5460] (5461 slots) master
M: 64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002
   slots:[5461-10922] (5462 slots) master
M: 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003
   slots:[10923-16383] (5461 slots) master
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join.
>>> Performing Cluster Check (using node 127.0.0.1:10001)
M: 6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001
   slots:[0-5460] (5461 slots) master
M: 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003
   slots:[10923-16383] (5461 slots) master
M: 64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002
   slots:[5461-10922] (5462 slots) master
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
0
################# add slave nodes #################
redis-cli --cluster add-node 127.0.0.1:10004 127.0.0.1:10001 --cluster-slave --cluster-master-id 6164025849a8ff9297664fc835bc851af5004f61 -a ******
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 127.0.0.1:10004 to cluster 127.0.0.1:10001
>>> Performing Cluster Check (using node 127.0.0.1:10001)
M: 6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001
   slots:[0-5460] (5461 slots) master
M: 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003
   slots:[10923-16383] (5461 slots) master
M: 64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002
   slots:[5461-10922] (5462 slots) master
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 127.0.0.1:10004 to make it join the cluster.
Waiting for the cluster to join
>>> Configure node as replica of 127.0.0.1:10001.
[OK] New node added correctly.
0
redis-cli --cluster add-node 127.0.0.1:10005 127.0.0.1:10002 --cluster-slave --cluster-master-id 64e634307bdc339b503574f5a77f1b156c021358 -a ******
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 127.0.0.1:10005 to cluster 127.0.0.1:10002
>>> Performing Cluster Check (using node 127.0.0.1:10002)
M: 64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002
   slots:[5461-10922] (5462 slots) master
S: 026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:10004
   slots: (0 slots) slave
   replicates 6164025849a8ff9297664fc835bc851af5004f61
M: 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003
   slots:[10923-16383] (5461 slots) master
M: 6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 127.0.0.1:10005 to make it join the cluster.
Waiting for the cluster to join
>>> Configure node as replica of 127.0.0.1:10002.
[OK] New node added correctly.
0
redis-cli --cluster add-node 127.0.0.1:10006 127.0.0.1:10003 --cluster-slave --cluster-master-id 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a -a ******
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 127.0.0.1:10006 to cluster 127.0.0.1:10003
>>> Performing Cluster Check (using node 127.0.0.1:10003)
M: 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003
   slots:[10923-16383] (5461 slots) master
M: 64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
S: 23e1871c4e1dc1047ce567326e74a6194589146c 127.0.0.1:10005
   slots: (0 slots) slave
   replicates 64e634307bdc339b503574f5a77f1b156c021358
M: 6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
S: 026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:10004
   slots: (0 slots) slave
   replicates 6164025849a8ff9297664fc835bc851af5004f61
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 127.0.0.1:10006 to make it join the cluster.
Waiting for the cluster to join
>>> Configure node as replica of 127.0.0.1:10003.
[OK] New node added correctly.
0
################# cluster nodes info: #################
8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003@20003 myself,master - 0 1575947748000 53 connected 10923-16383
64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002@20002 master - 0 1575947748000 52 connected 5461-10922
23e1871c4e1dc1047ce567326e74a6194589146c 127.0.0.1:10005@20005 slave 64e634307bdc339b503574f5a77f1b156c021358 0 1575947746000 52 connected
6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001@20001 master - 0 1575947748103 51 connected 0-5460
026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:10004@20004 slave 6164025849a8ff9297664fc835bc851af5004f61 0 1575947749000 51 connected
9f265545ebb799d2773cfc20c71705cff9d733ae 127.0.0.1:10006@20006 slave 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 0 1575947749105 53 connected
[root@JD redis_install]#
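Looking up node IDs by hand is the tedious part, and the `cluster nodes` text at the end of the log already contains everything needed. As a rough sketch (the function name `parse_cluster_nodes` and the shape of the returned dict are assumptions for illustration, not part of the article's script), the text can be parsed into a lookup table keyed by host:port:

```python
def parse_cluster_nodes(text):
    """Parse CLUSTER NODES text into a dict keyed by host:port."""
    nodes = {}
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or malformed lines
        node_id, address, flags, master_id = fields[:4]
        host_port = address.split("@")[0]  # drop the cluster bus port
        nodes[host_port] = {
            "node_id": node_id,
            "flags": flags.split(","),
            "master_id": None if master_id == "-" else master_id,
            "slots": fields[8:],  # slot ranges follow the link state
        }
    return nodes


# Two lines taken from the cluster nodes output shown in this article.
sample = """\
6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001@20001 master - 0 1575947748103 51 connected 0-5460
026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:10004@20004 slave 6164025849a8ff9297664fc835bc851af5004f61 0 1575947749000 51 connected
"""

nodes = parse_cluster_nodes(sample)
print(nodes["127.0.0.1:10001"]["node_id"])
print(nodes["127.0.0.1:10004"]["master_id"])
```

With such a table, the value for --cluster-master-id can be looked up from the master's address instead of being copied by hand.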
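Scaling out the Redis cluster
Scaling out takes two main steps:
1. Add a master node, and add a slave for that master.
2. Reshard slots onto the newly added master.
The main commands are:
Add the master node to the cluster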
redis-cli --cluster add-node 127.0.0.1:10007 127.0.0.1:10001 -a ******
Add a slave for the new master
redis-cli --cluster add-node 127.0.0.1:10008 127.0.0.1:10007 --cluster-slave --cluster-master-id 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 -a ******
Reshard the slots
############################ execute reshard #########################################
redis-cli -a redis@password --cluster reshard 127.0.0.1:10001 --cluster-from 6164025849a8ff9297664fc835bc851af5004f61 --cluster-to 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
############################ execute reshard #########################################
redis-cli -a redis@password --cluster reshard 127.0.0.1:10002 --cluster-from 64e634307bdc339b503574f5a77f1b156c021358 --cluster-to 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
############################ execute reshard #########################################
redis-cli -a redis@password --cluster reshard 127.0.0.1:10003 --cluster-from 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a --cluster-to 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
################# cluster nodes info: #################
026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:10004@20004 slave 6164025849a8ff9297664fc835bc851af5004f61 0 1575960493000 64 connected
9f265545ebb799d2773cfc20c71705cff9d733ae 127.0.0.1:10006@20006 slave 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 0 1575960493849 66 connected
64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002@20002 master - 0 1575960494852 65 connected 6826-10922
23e1871c4e1dc1047ce567326e74a6194589146c 127.0.0.1:10005@20005 slave 64e634307bdc339b503574f5a77f1b156c021358 0 1575960492000 65 connected
4854375c501c3dbfb4e2d94d50e62a47520c4f12 127.0.0.1:10008@20008 slave 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 0 1575960493000 67 connected
8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003@20003 master - 0 1575960493000 66 connected 12288-16383
3645e00a8ec3a902bd6effb4fc20c56a00f2c982 127.0.0.1:10007@20007 myself,master - 0 1575960493000 67 connected 0-1364 5461-6825 10923-12287
6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001@20001 master - 0 1575960492848 64 connected 1365-5460
As the output shows, the new node has been assigned its slots and the scale-out succeeded.
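The figure 1365 in the reshard commands follows from simple arithmetic on the 16384 hash slots. A small sketch of that calculation, plus a check against the slot ranges the new master reports above (the helper names `slots_to_migrate` and `count_slots` are made up for this example):

```python
TOTAL_SLOTS = 16384


def slots_to_migrate(current_masters, added_masters):
    """Slots each existing master gives up so all masters end up roughly even."""
    before = TOTAL_SLOTS // current_masters
    after = TOTAL_SLOTS // (current_masters + added_masters)
    return before - after


def count_slots(ranges):
    """Count the slots in a list of 'start-end' (or single-slot) strings."""
    total = 0
    for r in ranges:
        start, _, end = r.partition("-")
        total += int(end or start) - int(start) + 1
    return total


per_master = slots_to_migrate(3, 1)
print(per_master)  # 16384 // 3 - 16384 // 4 = 5461 - 4096 = 1365

# Slot ranges held by 127.0.0.1:10007 in the cluster nodes output above.
new_master_slots = count_slots(["0-1364", "5461-6825", "10923-12287"])
print(new_master_slots)  # 3 * 1365 = 4095
```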
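Two points need attention when this is automated:
1. After each add-node (master or slave alike), sleep long enough (20 seconds here) for every node in the cluster to meet the new node. Otherwise the scale-out fails.
2. After each reshard onto the new node, sleep long enough (20 seconds here). Otherwise resharding slots from the next node causes the previous reshard to fail.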
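The script in this article uses fixed 20-second sleeps. One possible alternative, which the original script does not use, is to poll until a condition holds and give up after a timeout. The sketch below is only an illustration under that assumption: `wait_until` and `all_nodes_know` are hypothetical helpers, and the node views are faked with plain strings instead of real connections.

```python
import time


def wait_until(predicate, timeout=60.0, interval=1.0):
    """Call predicate() until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def all_nodes_know(node_views, node_id):
    """node_views maps an address to that node's CLUSTER NODES text."""
    return all(node_id in text for text in node_views.values())


# Stand-in for real connections: the second node only learns about
# the new master on the third poll.
new_id = "3645e00a8ec3a902bd6effb4fc20c56a00f2c982"
line = new_id + " 127.0.0.1:10007@20007 master - 0 0 67 connected"
views = {"127.0.0.1:10001": line, "127.0.0.1:10002": ""}
polls = {"count": 0}


def poll():
    polls["count"] += 1
    if polls["count"] == 3:
        views["127.0.0.1:10002"] = line
    return all_nodes_know(views, new_id)


ok = wait_until(poll, timeout=5.0, interval=0.01)
print(ok, polls["count"])
```

In a real run the predicate would query every node; the fixed sleep is simpler, while polling returns as soon as the cluster has converged.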
The whole process looks like this:
[root@JD redis_install]# python3 create_redis_cluster.py
#########################cleanup instance#################################
#########################add node into cluster#################################
redis-cli --cluster add-node 127.0.0.1:10007 127.0.0.1:10001 -a redis@password
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 127.0.0.1:10007 to cluster 127.0.0.1:10001
>>> Performing Cluster Check (using node 127.0.0.1:10001)
M: 6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
S: 9f265545ebb799d2773cfc20c71705cff9d733ae 127.0.0.1:10006
   slots: (0 slots) slave
   replicates 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a
M: 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
S: 026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:10004
   slots: (0 slots) slave
   replicates 6164025849a8ff9297664fc835bc851af5004f61
S: 23e1871c4e1dc1047ce567326e74a6194589146c 127.0.0.1:10005
   slots: (0 slots) slave
   replicates 64e634307bdc339b503574f5a77f1b156c021358
M: 64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 127.0.0.1:10007 to make it join the cluster.
[OK] New node added correctly.
0
redis-cli --cluster add-node 127.0.0.1:10008 127.0.0.1:10007 --cluster-slave --cluster-master-id 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 -a ******
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 127.0.0.1:10008 to cluster 127.0.0.1:10007
>>> Performing Cluster Check (using node 127.0.0.1:10007)
M: 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 127.0.0.1:10007
   slots: (0 slots) master
S: 026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:10004
   slots: (0 slots) slave
   replicates 6164025849a8ff9297664fc835bc851af5004f61
S: 9f265545ebb799d2773cfc20c71705cff9d733ae 127.0.0.1:10006
   slots: (0 slots) slave
   replicates 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a
M: 64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
S: 23e1871c4e1dc1047ce567326e74a6194589146c 127.0.0.1:10005
   slots: (0 slots) slave
   replicates 64e634307bdc339b503574f5a77f1b156c021358
M: 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
M: 6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 127.0.0.1:10008 to make it join the cluster.
Waiting for the cluster to join
>>> Configure node as replica of 127.0.0.1:10007.
[OK] New node added correctly.
0
#########################reshard slots#################################
############################ execute reshard #########################################
redis-cli -a redis@password --cluster reshard 127.0.0.1:10001 --cluster-from 6164025849a8ff9297664fc835bc851af5004f61 --cluster-to 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
############################ execute reshard #########################################
redis-cli -a redis@password --cluster reshard 127.0.0.1:10002 --cluster-from 64e634307bdc339b503574f5a77f1b156c021358 --cluster-to 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
############################ execute reshard #########################################
redis-cli -a redis@password --cluster reshard 127.0.0.1:10003 --cluster-from 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a --cluster-to 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
################# cluster nodes info: #################
026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:10004@20004 slave 6164025849a8ff9297664fc835bc851af5004f61 0 1575960493000 64 connected
9f265545ebb799d2773cfc20c71705cff9d733ae 127.0.0.1:10006@20006 slave 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 0 1575960493849 66 connected
64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002@20002 master - 0 1575960494852 65 connected 6826-10922
23e1871c4e1dc1047ce567326e74a6194589146c 127.0.0.1:10005@20005 slave 64e634307bdc339b503574f5a77f1b156c021358 0 1575960492000 65 connected
4854375c501c3dbfb4e2d94d50e62a47520c4f12 127.0.0.1:10008@20008 slave 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 0 1575960493000 67 connected
8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003@20003 master - 0 1575960493000 66 connected 12288-16383
3645e00a8ec3a902bd6effb4fc20c56a00f2c982 127.0.0.1:10007@20007 myself,master - 0 1575960493000 67 connected 0-1364 5461-6825 10923-12287
6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001@20001 master - 0 1575960492848 64 connected 1365-5460
[root@JD redis_install]#
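Scaling in the Redis cluster
In principle, scaling in is the reverse of scaling out.
The description of the command itself suggests as much: del-node host:port node_id # removes the given node and, on success, shuts down that node's service.
But scaling in should be just that: removing a master from the cluster (cluster forget nodeid) ought to be enough, so why shut the node down as well? For that reason this article does not scale in with redis-cli --cluster del-node and uses ordinary commands instead.
The custom scale-in here really consists of two steps:
1. Move the slots of the master being removed back to the other nodes in the cluster. This test shrinks four masters to three.
2. Run cluster forget master_node_id (and slave_node_id) on every node in the cluster in turn.
The commands actually executed are as follows: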
############################ execute reshard #########################################
redis-cli -a ****** --cluster reshard 127.0.0.1:10001 --cluster-from 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-to 6164025849a8ff9297664fc835bc851af5004f61 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
############################ execute reshard #########################################
redis-cli -a ****** --cluster reshard 127.0.0.1:10002 --cluster-from 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-to 64e634307bdc339b503574f5a77f1b156c021358 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
############################ execute reshard #########################################
redis-cli -a ****** --cluster reshard 127.0.0.1:10003 --cluster-from 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-to 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
{'host': '127.0.0.1', 'port': 10001, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10001, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10002, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10002, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10003, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10003, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10004, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10004, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10005, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10005, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10006, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10006, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
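The forget step is a plain cross product: every remaining node must forget every removed node ID. A short sketch of generating that plan as data (the helper name `build_forget_plan` is an assumption for illustration, and nothing is sent to Redis here):

```python
def build_forget_plan(remaining_nodes, removed_ids):
    """Return (address, command) pairs: each remaining node forgets each removed ID."""
    plan = []
    for node in remaining_nodes:
        address = "{0}:{1}".format(node["host"], node["port"])
        for node_id in removed_ids:
            plan.append((address, "cluster forget {0}".format(node_id)))
    return plan


# All six remaining nodes, masters and slaves alike.
remaining = [{"host": "127.0.0.1", "port": p} for p in range(10001, 10007)]
removed = [
    "3645e00a8ec3a902bd6effb4fc20c56a00f2c982",  # master 10007
    "4854375c501c3dbfb4e2d94d50e62a47520c4f12",  # slave 10008
]

plan = build_forget_plan(remaining, removed)
for address, command in plan:
    print(address, "--->", command)
```

For this test environment the plan has 6 x 2 = 12 entries, matching the twelve forget lines logged above.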
The complete output of the run is as follows:
[root@JD redis_install]# python3 create_redis_cluster.py
############################ execute reshard #########################################
redis-cli -a ****** --cluster reshard 127.0.0.1:10001 --cluster-from 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-to 6164025849a8ff9297664fc835bc851af5004f61 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
############################ execute reshard #########################################
redis-cli -a ****** --cluster reshard 127.0.0.1:10002 --cluster-from 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-to 64e634307bdc339b503574f5a77f1b156c021358 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
############################ execute reshard #########################################
redis-cli -a ****** --cluster reshard 127.0.0.1:10003 --cluster-from 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-to 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
{'host': '127.0.0.1', 'port': 10001, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10001, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10002, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10002, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10003, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10003, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10004, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10004, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10005, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10005, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10006, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10006, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
################# cluster nodes info: #################
23e1871c4e1dc1047ce567326e74a6194589146c 127.0.0.1:10005@20005 slave 64e634307bdc339b503574f5a77f1b156c021358 0 1575968426000 76 connected
026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:10004@20004 slave 6164025849a8ff9297664fc835bc851af5004f61 0 1575968422619 75 connected
6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001@20001 myself,master - 0 1575968426000 75 connected 0-5460
9f265545ebb799d2773cfc20c71705cff9d733ae 127.0.0.1:10006@20006 slave 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 0 1575968425000 77 connected
8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003@20003 master - 0 1575968427626 77 connected 10923-16383
64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002@20002 master - 0 1575968426000 76 connected 5461-10922
[root@JD redis_install]#
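That is not quite the end of the story: after scaling in, every remaining node in the cluster must successfully execute cluster forget master_node_id (and slave_node_id).
Otherwise the other nodes still hold heartbeat information for node 10007, and after more than one minute node 10007 (and its slave 10008), although already removed from the cluster, gets added back.
I ran into exactly this odd problem at first: because I had not run cluster forget on the slave nodes of the shrunken cluster, the removed nodes kept being added back.
See: http://www.redis.cn/commands/cluster-forget.html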
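According to the CLUSTER FORGET documentation, a forgotten node ID is placed on a ban list for 60 seconds so that gossip from other nodes does not immediately re-add it, which is why every remaining node has to forget the ID within that window. A rough way to spot the nodes that were missed is to look for the removed ID in each node's `cluster nodes` text. The helper name `nodes_still_knowing` is hypothetical and the views below are hand-written stand-ins rather than live output:

```python
def nodes_still_knowing(node_views, removed_id):
    """Return addresses whose CLUSTER NODES text still lists removed_id."""
    stale = []
    for address, text in sorted(node_views.items()):
        # The node ID is the first field of every CLUSTER NODES line.
        known_ids = {line.split()[0] for line in text.splitlines() if line.strip()}
        if removed_id in known_ids:
            stale.append(address)
    return stale


removed_id = "3645e00a8ec3a902bd6effb4fc20c56a00f2c982"
views = {
    # This master already ran cluster forget.
    "127.0.0.1:10001": (
        "6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001@20001 "
        "myself,master - 0 0 75 connected 0-5460\n"
    ),
    # This slave was skipped and still lists the removed master.
    "127.0.0.1:10004": (
        "026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:10004@20004 "
        "myself,slave 6164025849a8ff9297664fc835bc851af5004f61 0 0 75 connected\n"
        "3645e00a8ec3a902bd6effb4fc20c56a00f2c982 127.0.0.1:10007@20007 "
        "master - 0 0 67 connected\n"
    ),
}

stale = nodes_still_knowing(views, removed_id)
print(stale)
```

Any address returned by such a check is a node that can gossip the removed node back into the cluster.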
The complete code is as follows:
import os
import time
import redis


def create_redis_cluster(list_master_node,list_slave_node):
    print('################# flush master/slave slots #################')
    for node in list_master_node:
        current_conn = redis.StrictRedis(host=node["host"], port=node["port"], password=node["password"], decode_responses=True)
        current_conn.execute_command('flushall')
        current_conn.execute_command('cluster reset')
    for node in list_slave_node:
        current_conn = redis.StrictRedis(host=node["host"], port=node["port"], password=node["password"], decode_responses=True)
        #current_conn.execute_command('flushall')
        current_conn.execute_command('cluster reset')
    print('################# create cluster #################')
    master_nodes = ''
    for node in list_master_node:
        master_nodes = master_nodes + node["host"] + ':' + str(node["port"]) + ' '
    command = "redis-cli --cluster create {0} -a ****** --cluster-yes".format(master_nodes)
    print(command)
    msg = os.system(command)
    print(msg)
    time.sleep(5)
    print('################# add slave nodes #################')
    counter = 0
    for node in list_master_node:
        current_conn = redis.StrictRedis(host=node["host"], port=node["port"], password=node["password"], decode_responses=True)
        current_master_node = node["host"] + ':' + str(node["port"])
        current_slave_node = list_slave_node[counter]["host"] + ':' + str(list_slave_node[counter]["port"])
        myid = current_conn.cluster('myid')
        # the slave node comes first, the master node second
        command = "redis-cli --cluster add-node {0} {1} --cluster-slave --cluster-master-id {2} -a ****** ".format(current_slave_node,current_master_node,myid)
        print(command)
        msg = os.system(command)
        counter = counter + 1
        print(msg)
    # show cluster nodes info
    time.sleep(10)
    print("################# cluster nodes info: #################")
    cluster_nodes = current_conn.execute_command('cluster nodes')
    print(cluster_nodes)


# Return the number of slots each original master must give up after scale-out
def get_migrated_slot(list_master_node,n):
    migrated_slot_count = int(16384/len(list_master_node)) - int(16384/(len(list_master_node)+n))
    return migrated_slot_count


def redis_cluster_expansion(list_master_node,dict_master_node,dict_slave_node):
    new_master_node = dict_master_node["host"] + ':' + str(dict_master_node["port"])
    new_slave_node = dict_slave_node["host"] + ':' + str(dict_slave_node["port"])
    print("#########################cleanup instance#################################")
    new_master_conn = redis.StrictRedis(host=dict_master_node["host"], port=dict_master_node["port"], password=dict_master_node["password"], decode_responses=True)
    new_master_conn.execute_command('flushall')
    new_master_conn.execute_command('cluster reset')
    new_master_id = new_master_conn.cluster('myid')
    new_slave_conn = redis.StrictRedis(host=dict_slave_node["host"], port=dict_slave_node["port"], password=dict_slave_node["password"], decode_responses=True)
    new_slave_conn.execute_command('cluster reset')
    new_slave_id = new_slave_conn.cluster('myid')
    #new_slave_conn.execute_command('slaveof no one')

    # Check whether the new nodes already belong to the current cluster.
    # If they do and hold no slots, remove them first with cluster forget nodeid
    # (or abort with a warning, whichever you prefer).
    # Connect to any node of the cluster
    cluster_node_conn = redis.StrictRedis(host=list_master_node[0]["host"], port=list_master_node[0]["port"], password=list_master_node[0]["password"],decode_responses=True)
    dict_node_info = cluster_node_conn.cluster('nodes')
    '''dict_node_info format example :
    {
    '127.0.0.1:10008@20008': {'node_id': '1d10c3ce3b9b7f956a26122980827fe6ce623d22', 'flags': 'master', 'master_id': '-','last_ping_sent': '0', 'last_pong_rcvd': '1575599442000', 'epoch': '8', 'slots': [], 'connected': True},
    '127.0.0.1:10002@20002': {'node_id': '64e634307bdc339b503574f5a77f1b156c021358', 'flags': 'master', 'master_id': '-', 'last_ping_sent': '0', 'last_pong_rcvd': '1575599442000', 'epoch': '7', 'slots': [['5461', '10922']], 'connected': True},
    '127.0.0.1:10001@20001': {'node_id': '6164025849a8ff9297664fc835bc851af5004f61', 'flags': 'myself,master', 'master_id': '-', 'last_ping_sent': '0', 'last_pong_rcvd': '1575599438000', 'epoch': '6', 'slots': [['0', '5460']], 'connected': True},
    '127.0.0.1:10007@20007': {'node_id': '307f589ec7b1eb7bd65c680527afef1e30ce2303', 'flags': 'master', 'master_id': '-', 'last_ping_sent': '0', 'last_pong_rcvd': '1575599443599', 'epoch': '5', 'slots': [], 'connected': True},
    '127.0.0.1:10005@20005': {'node_id': '23e1871c4e1dc1047ce567326e74a6194589146c', 'flags': 'slave', 'master_id': '64e634307bdc339b503574f5a77f1b156c021358', 'last_ping_sent': '0', 'last_pong_rcvd': '1575599441000', 'epoch': '7', 'slots': [], 'connected': True},
    '127.0.0.1:10004@20004': {'node_id': '026f0179631f50ca858d46c2b2829b3af71af2c8', 'flags': 'slave', 'master_id': '6164025849a8ff9297664fc835bc851af5004f61', 'last_ping_sent': '0', 'last_pong_rcvd': '1575599440000', 'epoch': '6', 'slots': [], 'connected': True},
    '127.0.0.1:10006@20006': {'node_id': '9f265545ebb799d2773cfc20c71705cff9d733ae', 'flags': 'slave', 'master_id': '8b75325c59a7242344d0ebe5ee1e0068c66ffa2a', 'last_ping_sent': '0', 'last_pong_rcvd': '1575599442000', 'epoch': '8', 'slots': [], 'connected': True},
    '127.0.0.1:10003@20003': {'node_id': '8b75325c59a7242344d0ebe5ee1e0068c66ffa2a', 'flags': 'master', 'master_id': '-', 'last_ping_sent': '0', 'last_pong_rcvd': '1575599442599', 'epoch': '8', 'slots': [['10923', '16383']], 'connected': True}
    }
    '''
    dict_master_node_in_cluster = 0
    dict_slave_node_in_cluster = 0
    for key_node in dict_node_info:
        if new_master_node in key_node:
            dict_master_node_in_cluster = 1
            if len(dict_node_info[key_node]['slots']) > 0:
                print('error: ' +new_master_node + ' already existing in cluster and allotted slots,execute break......')
                return
        if new_slave_node in key_node:
            dict_slave_node_in_cluster = 1
            if len(dict_node_info[key_node]['slots']) > 0:
                print('error: ' +new_slave_node + ' already existing in cluster and allotted slots,execute break......')
                return

    if dict_master_node_in_cluster == 1:
        for master_node in list_master_node:
            key_node_conn = redis.StrictRedis(host=master_node["host"], port=master_node["port"],password=master_node["password"], decode_responses=True)
            print('warning: ' + new_master_node + ' already existing in cluster,cluster forget it......')
            forget_command = 'cluster forget {0}'.format(new_master_id)
            key_node_conn.execute_command(forget_command)
    if dict_slave_node_in_cluster == 1:
        for master_node in list_master_node:
            key_node_conn = redis.StrictRedis(host=master_node["host"], port=master_node["port"],password=master_node["password"], decode_responses=True)
            print('warning: ' + new_slave_node + ' already existing in cluster,forget it......')
            forget_command = 'cluster forget {0}'.format(new_slave_id)
            key_node_conn.execute_command(forget_command)

    print("#########################add node into cluster#################################")
    try:
        cluster_node = list_master_node[0]["host"] + ':' + str(list_master_node[0]["port"])
        # 1. the node to add comes first; the second node is any node already in the cluster
        add_node_command = " redis-cli --cluster add-node {0} {1} -a ****** ".format(new_master_node,cluster_node)
        print(add_node_command)
        print(os.system(add_node_command))
        time.sleep(20)
        # the slave node comes first, the master node second
        add_node_command = " redis-cli --cluster add-node {0} {1} --cluster-slave --cluster-master-id {2} -a ****** ".format(new_slave_node,new_master_node,new_master_id)
        print(add_node_command)
        print(os.system(add_node_command))
        time.sleep(20)
    except Exception as e:
        print('add new node error,the reason is:')
        print(e)

    print("#########################reshard slots#################################")
    migrated_slot_count = get_migrated_slot(list_master_node,1)
    for node in list_master_node:
        current_master_conn = redis.StrictRedis(host=node["host"], port=node["port"], password=node["password"], decode_responses=True)
        current_master_node = node["host"] + ':' + str(node["port"])
        current_master_node_id = current_master_conn.cluster('myid')
        '''
        example: scaling out from 3 masters to 4, each master migrates 1365 slots
        '''
        try:
            command = r'''redis-cli -a ****** --cluster reshard {0} --cluster-from {1} --cluster-to {2} --cluster-slots {3} --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1 '''.format(current_master_node,current_master_node_id,new_master_id,migrated_slot_count)
            print('############################ execute reshard #########################################')
            print(command)
            msg = os.system(command)
            time.sleep(20)
        except Exception as e:
            print('reshard slots error,the reason is:')
            print(e)

    print("################# cluster nodes info: #################")
    cluster_nodes = new_master_conn.execute_command('cluster nodes')
    print(cluster_nodes)


def redis_cluster_shrinkage(list_master_node,list_slave_node,dict_master_node,dict_slave_node):
    # Check whether the nodes to remove belong to the current cluster;
    # if they do not, exit
    cluster_node_conn = redis.StrictRedis(host=list_master_node[0]["host"], port=list_master_node[0]["port"], password=list_master_node[0]["password"],decode_responses=True)
    dict_node_info = cluster_node_conn.cluster('nodes')
    removed_master_node = dict_master_node["host"] + ':' + str(dict_master_node["port"])+'@'+str(dict_master_node["port"]+10000)
    removed_slave_node = dict_slave_node["host"] + ':' + str(dict_slave_node["port"])+'@'+str(dict_slave_node["port"]+10000)
    if not removed_master_node in dict_node_info.keys():
        print('Error:'+ str(removed_master_node) +' not in cluster,exiting')
        return
    if not removed_slave_node in dict_node_info.keys():
        print('Error:' + str(removed_slave_node) + ' not in cluster,exiting')
        return

    removed_master_conn = redis.StrictRedis(host=dict_master_node["host"], port=dict_master_node["port"], password=dict_master_node["password"], decode_responses=True)
    removed_master_id = removed_master_conn.cluster('myid')
    removed_slave_conn = redis.StrictRedis(host=dict_slave_node["host"], port=dict_slave_node["port"], password=dict_slave_node["password"], decode_responses=True)
    removed_slave_id = removed_slave_conn.cluster('myid')

    for node in list_master_node:
        current_master_conn = redis.StrictRedis(host=node["host"], port=node["port"], password=node["password"], decode_responses=True)
        current_master_node = node["host"] + ':' + str(node["port"])
        current_master_node_id = current_master_conn.cluster('myid')
        '''
        scaling in from 4 masters to 3: return the slots evenly to the three remaining masters
        '''
        try:
            command = r'''redis-cli -a ****** --cluster reshard {0} --cluster-from {1} --cluster-to {2} --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1 '''.\
                format(current_master_node, removed_master_id, current_master_node_id)
            print('############################ execute reshard #########################################')
            print(command)
            msg = os.system(command)
            time.sleep(10)
        except Exception as e:
            print('reshard slots error,the reason is:')
            print(e)

    removed_master_conn.execute_command('cluster reset')
    removed_slave_conn.execute_command('cluster reset')

    for master_node in list_master_node:
        master_node_conn = redis.StrictRedis(host=master_node["host"], port=master_node["port"],password=master_node["password"], decode_responses=True)
        forget_master_command = 'cluster forget {0}'.format(removed_master_id)
        forget_slave_command = 'cluster forget {0}'.format(removed_slave_id)
        print(str(master_node)+ '--->' + forget_master_command)
        print(str(master_node)+ '--->' + forget_slave_command)
        master_node_conn.execute_command(forget_master_command)
        master_node_conn.execute_command(forget_slave_command)
    for slave_node in list_slave_node:
        slave_node_conn = redis.StrictRedis(host=slave_node["host"], port=slave_node["port"], password=slave_node["password"], decode_responses=True)
        forget_master_command = 'cluster forget {0}'.format(removed_master_id)
        forget_slave_command = 'cluster forget {0}'.format(removed_slave_id)
        print(str(slave_node)+ '--->' +forget_master_command)
        print(str(slave_node)+ '--->' +forget_slave_command)
        slave_node_conn.execute_command(forget_master_command)
        slave_node_conn.execute_command(forget_slave_command)

    print("################# cluster nodes info: #################")
    cluster_nodes = cluster_node_conn.execute_command('cluster nodes')
    print(cluster_nodes)


if __name__ == '__main__':
    # master
    node_1 = {'host': '127.0.0.1', 'port': 10001, 'password': '******'}
    node_2 = {'host': '127.0.0.1', 'port': 10002, 'password': '******'}
    node_3 = {'host': '127.0.0.1', 'port': 10003, 'password': '******'}
    # slave
    node_4 = {'host': '127.0.0.1', 'port': 10004, 'password': '******'}
    node_5 = {'host': '127.0.0.1', 'port': 10005, 'password': '******'}
    node_6 = {'host': '127.0.0.1', 'port': 10006, 'password': '******'}
    # the number of masters and slaves must be the same
    list_master_node = [node_1, node_2, node_3]
    list_slave_node = [node_4, node_5, node_6]

    # automated cluster creation
    #create_redis_cluster(list_master_node,list_slave_node)

    # automated scale-out (node_1 and node_2 are reused here for the new master and its slave)
    node_1 = {'host': '127.0.0.1', 'port': 10007, 'password': '******'}
    node_2 = {'host': '127.0.0.1', 'port': 10008, 'password': '******'}
    redis_cluster_expansion(list_master_node,node_1,node_2)

    # automated scale-in
    #redis_cluster_shrinkage(list_master_node,list_slave_node,node_1,node_2)
Reference: https://www.cnblogs.com/zhoujinyi/p/11606935.html