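In the previous post we covered how to use CephFS; for a recap see https://www.cnblogs.com/qiuhom-1874/p/16758866.html. Today we look at scaling out the MDS component.
We know that the MDS is the daemon that runs to implement CephFS, and that it is mainly responsible for managing file system metadata. This means a client that accesses data through CephFS first contacts an MDS to look up metadata; the MDS reads it from the metadata pool and returns it to the client. In other words, only the MDS operates on the metadata pool, and the MDS is the sole entry point for metadata access in CephFS (file data itself is read from and written to the data pool by the clients directly).
So here is the problem: if the Ceph cluster runs only one MDS daemon and many clients access CephFS, that MDS is bound to become a bottleneck, so to improve CephFS performance we have to offer clients more than one MDS. How should the MDS be scaled? As mentioned before, the MDS manages file system metadata and stores it in a dedicated pool on the RADOS cluster, which turns the MDS from a stateful service into a stateless one. In principle, scaling the MDS is then just a matter of running more daemons. Because of the way file system metadata works, however, we cannot scale it the way we scale other stateless applications. Suppose two MDS daemons operate on the same file in the same pool at the same time: one deletes the file and the other modifies it, and when the results are merged the file system is corrupted. If two MDS daemons may operate on the same file, they have to synchronize and keep their data consistent, which is no different from replication. Read requests can be spread across several MDS daemons, but what about writes? When a client writes to a, b can only sync from a (or a has to write to b), so write requests are not spread at all, and the bottleneck remains as the number of clients grows.
To spread both the reads and the writes of a file system, the distributed file system community came up with namespace partitioning: the root tree of the file system and its hot subtrees are placed on different metadata servers for load balancing, which makes it possible for metadata handling to scale linearly. Put simply, each MDS is responsible for the metadata of one subtree only.
Metadata partitioning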

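Tip: a file system can be split into several subtrees, with each MDS responsible for just one of them, so that metadata reads and writes are spread across the MDS daemons.
Common metadata partitioning schemes
1. Static subtree partitioning: the administrator manually assigns a given subtree to a given metadata server. Mounting an NFS export on a directory is an example: attaching a subdirectory to another partition relieves the load on the current file system.
2. Static hash partitioning: there are several directories, and the directory a file lands in is not chosen by the administrator but computed from the file name, using consistent hashing or hash-then-modulo and so on; the file is stored wherever the result points. This relieves the load of any one subdirectory on the current file system.
3. Lazy hybrid partitioning: combines static hashing with the traditional file system approach.
4. Dynamic subtree partitioning: subtrees are reassigned dynamically according to the load on the file system. This is how CephFS implements multiple active MDS daemons. In multi-active MDS mode, CephFS splits the whole file system namespace into subtrees and assigns them to several MDS daemons. Writes and reads are balanced by different strategies, though: a directory with heavy write load is split into several subdirectories to spread the load, while a directory with heavy read load is replicated to balance the load. Subtree partitioning and migration decisions are a synchronized process: each MDS makes its own migration decision every 10 seconds, no MDS has a consistent view of the namespace, and the MDS cluster has no global scheduler that makes unified decisions. The MDS daemons exchange heartbeat (HB) messages and load status with one another to decide whether to migrate, how to partition the namespace, and whether a directory needs to be split into subtrees. Administrators can also configure how CephFS computes load, which influences the balancing decisions; CephFS currently supports decisions based on CPU load, on file system load, or on a mix of the two.
Dynamic subtree partitioning relies on shared storage to migrate hot spots between MDS daemons, so Ceph stores the MDS metadata in a dedicated pool on the backing RADOS cluster, and that pool is shared by all MDS daemons. An MDS does not go to RADOS for every metadata access; it keeps an in-memory cache of hot metadata, and metadata stays in memory until the related journal entries expire.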
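To make the difference between hash partitioning and subtree partitioning concrete, here is a minimal Python sketch. It is not how Ceph is implemented; the function names, the use of MD5, and the sample paths are assumptions made up for illustration. Hash partitioning maps a name to a server by arithmetic alone, while subtree partitioning looks for the longest subtree root that contains the path.

```python
import hashlib
from typing import Dict


def hash_partition(name: str, num_servers: int) -> int:
    """Static hash partitioning: hash the name, then take it modulo the server count."""
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_servers


def subtree_partition(path: str, subtree_owner: Dict[str, int]) -> int:
    """Subtree partitioning: the longest subtree root containing the path decides the owner.

    subtree_owner must contain "/" so that every absolute path has an owner.
    """
    best_root = "/"
    for root in subtree_owner:
        prefix = root.rstrip("/") + "/"
        inside = path == root or path.startswith(prefix)
        if inside and len(root) > len(best_root):
            best_root = root
    return subtree_owner[best_root]


# Hypothetical layout: rank 0 owns the root, rank 1 owns /home,
# rank 2 owns one hot build directory below /home.
owners = {"/": 0, "/home": 1, "/home/alice/build": 2}
print(subtree_partition("/home/alice/build/obj/a.o", owners))
print(subtree_partition("/home/bob/notes.txt", owners))
print(subtree_partition("/etc/hosts", owners))
print(hash_partition("notes.txt", 3))
```

With subtree partitioning, moving a hot directory to another server only means changing one entry in the owner table, which is roughly what makes the dynamic variant possible.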
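CephFS uses a metadata journal for fault tolerance
Metadata journal entries are streamed to the metadata journal in the CephFS metadata pool, in a way similar to LFS (Log-Structured File System) and WAFL (Write Anywhere File Layout). The journal can grow without bound so that entries are always written to RADOS sequentially, and the daemon is additionally able to trim redundant or irrelevant journal entries.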
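The idea of an append-only journal that is replayed for recovery and trimmed once entries are no longer needed can be sketched in a few lines of Python. This is only a toy model under our own assumptions (the class name, the tuple layout, and the "flushed up to" rule are invented here); the real MDS journal format is different.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class MetadataJournal:
    """Toy append-only journal: entries are (sequence number, key, value)."""
    entries: List[Tuple[int, str, str]] = field(default_factory=list)
    next_seq: int = 1

    def append(self, key: str, value: str) -> int:
        """Record an update at the tail of the journal and return its sequence number."""
        seq = self.next_seq
        self.entries.append((seq, key, value))
        self.next_seq += 1
        return seq

    def replay(self) -> Dict[str, str]:
        """Rebuild state by applying the entries in order; later entries win."""
        state: Dict[str, str] = {}
        for _, key, value in self.entries:
            state[key] = value
        return state

    def trim(self, flushed_up_to: int) -> int:
        """Drop entries whose changes are already stored elsewhere; return how many were dropped."""
        before = len(self.entries)
        self.entries = [e for e in self.entries if e[0] > flushed_up_to]
        return before - len(self.entries)


journal = MetadataJournal()
journal.append("/docs/a.txt", "size=10")
journal.append("/docs/a.txt", "size=20")
journal.append("/docs/b.txt", "size=5")
print(journal.replay())
print(journal.trim(2), "entries trimmed")
```

Writes only ever touch the tail of the log, which is what keeps them sequential; trimming is a separate step that runs later.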
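Multi MDS
Every CephFS file system has a human-readable name and an identifier called the FSCID, and by default each file system is configured with a single active MDS daemon. The maximum number of MDS daemons that may be active in an MDS cluster is set by the max_mds parameter, which controls how many ranks are available; its default is 1. A rank is a number that an active MDS daemon of a CephFS file system can hold; ranks run from 0 to max_mds-1. Each rank stands for one active ceph-mds daemon that manages the metadata of a directory subtree in the CephFS hierarchy, so max_mds = 1 means only rank 0 is available. A freshly started ceph-mds daemon holds no rank; the MON assigns one to it as needed. A ceph-mds daemon can hold only one rank at a time and releases it when the daemon stops; in other words, an assigned rank is exclusive. A rank is in one of three states. Up: the rank is held by a ceph-mds daemon. Failed: the rank is not held by any ceph-mds daemon. Damaged: the rank is damaged and its metadata is corrupted or missing; until the administrator fixes the problem and runs "ceph mds repaired" on it, a damaged rank cannot be assigned to any MDS daemon.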
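The rules in the paragraph above (ranks 0 to max_mds-1, one rank per daemon, three rank states) can be written down as a small Python model. This is a sketch for reasoning about the rules, not Ceph code; the class RankTable and its method names are hypothetical.

```python
from typing import Dict, List, Optional, Set


class RankTable:
    """Toy model of rank bookkeeping for one file system."""

    def __init__(self, max_mds: int = 1) -> None:
        self.max_mds = max_mds
        self.holder: Dict[int, Optional[str]] = {}  # rank -> daemon name
        self.damaged: Set[int] = set()

    def ranks(self) -> List[int]:
        """Available ranks run from 0 to max_mds - 1."""
        return list(range(self.max_mds))

    def state(self, rank: int) -> str:
        if rank in self.damaged:
            return "damaged"
        return "up" if self.holder.get(rank) else "failed"

    def assign(self, daemon: str) -> Optional[int]:
        """Give the daemon the lowest failed rank; damaged ranks are never handed out."""
        if daemon in self.holder.values():
            return None  # a daemon holds at most one rank
        for rank in self.ranks():
            if self.state(rank) == "failed":
                self.holder[rank] = daemon
                return rank
        return None

    def release(self, daemon: str) -> None:
        """A daemon that stops gives its rank back."""
        for rank, name in self.holder.items():
            if name == daemon:
                self.holder[rank] = None


table = RankTable(max_mds=2)
print(table.assign("ceph-mon02"), table.assign("ceph-mon01"), table.assign("ceph-mon03"))
print([table.state(r) for r in table.ranks()])
```

A third daemon gets no rank while both ranks are up, which matches the standby role described later in this post.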
Check the MDS status of the Ceph cluster
[root@ceph-admin ~]# ceph mds stat
cephfs-1/1/1 up {0=ceph-mon02=up:active}
[root@ceph-admin ~]#
Tip: the cluster currently has one MDS, running on node ceph-mon02, in the up:active state.
Deploy more MDS daemons
[root@ceph-admin ~]# ceph-deploy mds create ceph-mon01 ceph-mon03
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy mds create ceph-mon01 ceph-mon03
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : create
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f9478f34830>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] func : <function mds at 0x7f947918d050>
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] mds : [('ceph-mon01', 'ceph-mon01'), ('ceph-mon03', 'ceph-mon03')]
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy][ERROR ] ConfigError: Cannot load config: [Errno 2] No such file or directory: 'ceph.conf'; has `ceph-deploy new` been run in this directory?
[root@ceph-admin ~]# su - cephadm
Last login: Thu Sep 29 23:09:04 CST 2022 on pts/0
[cephadm@ceph-admin ~]$ ls
cephadm@ceph-mgr01 cephadm@ceph-mgr02 cephadm@ceph-mon01 cephadm@ceph-mon02 cephadm@ceph-mon03 ceph-cluster
[cephadm@ceph-admin ~]$ cd ceph-cluster/
[cephadm@ceph-admin ceph-cluster]$ ls
ceph.bootstrap-mds.keyring ceph.bootstrap-osd.keyring ceph.client.admin.keyring ceph-deploy-ceph.log
ceph.bootstrap-mgr.keyring ceph.bootstrap-rgw.keyring ceph.conf ceph.mon.keyring
[cephadm@ceph-admin ceph-cluster]$ ceph-deploy mds create ceph-mon01 ceph-mon03
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadm/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /bin/ceph-deploy mds create ceph-mon01 ceph-mon03
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : create
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f2c575ba7e8>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] func : <function mds at 0x7f2c57813050>
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] mds : [('ceph-mon01', 'ceph-mon01'), ('ceph-mon03', 'ceph-mon03')]
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.mds][DEBUG ] Deploying mds, cluster ceph hosts ceph-mon01:ceph-mon01 ceph-mon03:ceph-mon03
[ceph-mon01][DEBUG ] connection detected need for sudo
[ceph-mon01][DEBUG ] connected to host: ceph-mon01
[ceph-mon01][DEBUG ] detect platform information from remote host
[ceph-mon01][DEBUG ] detect machine type
[ceph_deploy.mds][INFO ] Distro info: CentOS Linux 7.9.2009 Core
[ceph_deploy.mds][DEBUG ] remote host will use systemd
[ceph_deploy.mds][DEBUG ] deploying mds bootstrap to ceph-mon01
[ceph-mon01][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.mds][ERROR ] RuntimeError: config file /etc/ceph/ceph.conf exists with different content; use --overwrite-conf to overwrite
[ceph-mon03][DEBUG ] connection detected need for sudo
[ceph-mon03][DEBUG ] connected to host: ceph-mon03
[ceph-mon03][DEBUG ] detect platform information from remote host
[ceph-mon03][DEBUG ] detect machine type
[ceph_deploy.mds][INFO ] Distro info: CentOS Linux 7.9.2009 Core
[ceph_deploy.mds][DEBUG ] remote host will use systemd
[ceph_deploy.mds][DEBUG ] deploying mds bootstrap to ceph-mon03
[ceph-mon03][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-mon03][WARNIN] mds keyring does not exist yet, creating one
[ceph-mon03][DEBUG ] create a keyring file
[ceph-mon03][DEBUG ] create path if it doesn't exist
[ceph-mon03][INFO ] Running command: sudo ceph --cluster ceph --name client.bootstrap-mds --keyring /var/lib/ceph/bootstrap-mds/ceph.keyring auth get-or-create mds.ceph-mon03 osd allow rwx mds allow mon allow profile mds -o /var/lib/ceph/mds/ceph-ceph-mon03/keyring
[ceph-mon03][INFO ] Running command: sudo systemctl enable ceph-mds@ceph-mon03
[ceph-mon03][WARNIN] Created symlink from /etc/systemd/system/ceph-mds.target.wants/ceph-mds@ceph-mon03.service to /usr/lib/systemd/system/ceph-mds@.service.
[ceph-mon03][INFO ] Running command: sudo systemctl start ceph-mds@ceph-mon03
[ceph-mon03][INFO ] Running command: sudo systemctl enable ceph.target
[ceph_deploy][ERROR ] GenericError: Failed to create 1 MDSs
[cephadm@ceph-admin ceph-cluster]$
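Tip: two errors occurred here. The first says that ceph.conf could not be found; the fix is to switch to the cephadm user and run ceph-deploy mds create from the ceph-cluster directory, where ceph.conf lives. The second says that the configuration file on the remote host differs from the local one; the fix is either to push the local configuration file to the cluster hosts, or to pull the configuration file from a cluster host and then redistribute it, and after that to deploy the MDS daemons again.
Compare the local configuration file with the one on a remote cluster host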
[cephadm@ceph-admin ceph-cluster]$ cat /etc/ceph/ceph.conf
[global]
fsid = 7fd4a619-9767-4b46-9cee-78b9dfe88f34
mon_initial_members = ceph-mon01
mon_host = 192.168.0.71
public_network = 192.168.0.0/24
cluster_network = 172.16.30.0/24
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
[cephadm@ceph-admin ceph-cluster]$ ssh ceph-mon01 'cat /etc/ceph/ceph.conf'
[global]
fsid = 7fd4a619-9767-4b46-9cee-78b9dfe88f34
mon_initial_members = ceph-mon01
mon_host = 192.168.0.71
public_network = 192.168.0.0/24
cluster_network = 172.16.30.0/24
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
[client]
rgw_frontends = "civetweb port=8080"
[cephadm@ceph-admin ceph-cluster]$
Tip: the configuration file on ceph-mon01 has an extra [client] section.
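Spotting such a difference by eye gets harder as the files grow. As a rough sketch, Python's standard configparser can report which sections exist on only one side; the helper names load and config_diff are our own, and the two strings are shortened copies of the files shown above.

```python
import configparser
from typing import Dict, List


def load(text: str) -> configparser.ConfigParser:
    """Parse INI-style text; interpolation is off so '%' in values is taken literally."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser


def config_diff(local_text: str, remote_text: str) -> Dict[str, List[str]]:
    """Return the section names that exist on only one side."""
    local, remote = load(local_text), load(remote_text)
    return {
        "only_local": sorted(set(local.sections()) - set(remote.sections())),
        "only_remote": sorted(set(remote.sections()) - set(local.sections())),
    }


LOCAL = """[global]
fsid = 7fd4a619-9767-4b46-9cee-78b9dfe88f34
mon_host = 192.168.0.71
"""

REMOTE = LOCAL + """[client]
rgw_frontends = "civetweb port=8080"
"""

print(config_diff(LOCAL, REMOTE))
```

This only compares section names; comparing individual keys would follow the same pattern with parser.items(section).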
Pull the configuration file from ceph-mon01 to the local host
[cephadm@ceph-admin ceph-cluster]$ ceph-deploy config pull ceph-mon01
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadm/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /bin/ceph-deploy config pull ceph-mon01
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : pull
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f966fb478c0>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] client : ['ceph-mon01']
[ceph_deploy.cli][INFO ] func : <function config at 0x7f966fd76cf8>
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.config][DEBUG ] Checking ceph-mon01 for /etc/ceph/ceph.conf
[ceph-mon01][DEBUG ] connection detected need for sudo
[ceph-mon01][DEBUG ] connected to host: ceph-mon01
[ceph-mon01][DEBUG ] detect platform information from remote host
[ceph-mon01][DEBUG ] detect machine type
[ceph-mon01][DEBUG ] fetch remote file
[ceph_deploy.config][DEBUG ] Got /etc/ceph/ceph.conf from ceph-mon01
[ceph_deploy.config][ERROR ] local config file ceph.conf exists with different content; use --overwrite-conf to overwrite
[ceph_deploy.config][ERROR ] Unable to pull /etc/ceph/ceph.conf from ceph-mon01
[ceph_deploy][ERROR ] GenericError: Failed to fetch config from 1 hosts
[cephadm@ceph-admin ceph-cluster]$ ceph-deploy --overwrite-conf config pull ceph-mon01
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadm/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /bin/ceph-deploy --overwrite-conf config pull ceph-mon01
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : True
[ceph_deploy.cli][INFO ] subcommand : pull
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7fa2f65438c0>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] client : ['ceph-mon01']
[ceph_deploy.cli][INFO ] func : <function config at 0x7fa2f6772cf8>
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.config][DEBUG ] Checking ceph-mon01 for /etc/ceph/ceph.conf
[ceph-mon01][DEBUG ] connection detected need for sudo
[ceph-mon01][DEBUG ] connected to host: ceph-mon01
[ceph-mon01][DEBUG ] detect platform information from remote host
[ceph-mon01][DEBUG ] detect machine type
[ceph-mon01][DEBUG ] fetch remote file
[ceph_deploy.config][DEBUG ] Got /etc/ceph/ceph.conf from ceph-mon01
[cephadm@ceph-admin ceph-cluster]$ ls
ceph.bootstrap-mds.keyring ceph.bootstrap-osd.keyring ceph.client.admin.keyring ceph-deploy-ceph.log
ceph.bootstrap-mgr.keyring ceph.bootstrap-rgw.keyring ceph.conf ceph.mon.keyring
[cephadm@ceph-admin ceph-cluster]$ cat ceph.conf
[global]
fsid = 7fd4a619-9767-4b46-9cee-78b9dfe88f34
mon_initial_members = ceph-mon01
mon_host = 192.168.0.71
public_network = 192.168.0.0/24
cluster_network = 172.16.30.0/24
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
[client]
rgw_frontends = "civetweb port=8080"
[cephadm@ceph-admin ceph-cluster]$
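Tip: if a local configuration file already exists, add the --overwrite-conf option to force it to be overwritten.
Push the local configuration file to all cluster hosts again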
[cephadm@ceph-admin ceph-cluster]$ ceph-deploy --overwrite-conf config push ceph-mon01 ceph-mon02 ceph-mon03 ceph-mgr01 ceph-mgr02
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadm/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /bin/ceph-deploy --overwrite-conf config push ceph-mon01 ceph-mon02 ceph-mon03 ceph-mgr01 ceph-mgr02
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : True
[ceph_deploy.cli][INFO ] subcommand : push
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7fcf983488c0>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] client : ['ceph-mon01', 'ceph-mon02', 'ceph-mon03', 'ceph-mgr01', 'ceph-mgr02']
[ceph_deploy.cli][INFO ] func : <function config at 0x7fcf98577cf8>
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.config][DEBUG ] Pushing config to ceph-mon01
[ceph-mon01][DEBUG ] connection detected need for sudo
[ceph-mon01][DEBUG ] connected to host: ceph-mon01
[ceph-mon01][DEBUG ] detect platform information from remote host
[ceph-mon01][DEBUG ] detect machine type
[ceph-mon01][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.config][DEBUG ] Pushing config to ceph-mon02
[ceph-mon02][DEBUG ] connection detected need for sudo
[ceph-mon02][DEBUG ] connected to host: ceph-mon02
[ceph-mon02][DEBUG ] detect platform information from remote host
[ceph-mon02][DEBUG ] detect machine type
[ceph-mon02][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.config][DEBUG ] Pushing config to ceph-mon03
[ceph-mon03][DEBUG ] connection detected need for sudo
[ceph-mon03][DEBUG ] connected to host: ceph-mon03
[ceph-mon03][DEBUG ] detect platform information from remote host
[ceph-mon03][DEBUG ] detect machine type
[ceph-mon03][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.config][DEBUG ] Pushing config to ceph-mgr01
[ceph-mgr01][DEBUG ] connection detected need for sudo
[ceph-mgr01][DEBUG ] connected to host: ceph-mgr01
[ceph-mgr01][DEBUG ] detect platform information from remote host
[ceph-mgr01][DEBUG ] detect machine type
[ceph-mgr01][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.config][DEBUG ] Pushing config to ceph-mgr02
[ceph-mgr02][DEBUG ] connection detected need for sudo
[ceph-mgr02][DEBUG ] connected to host: ceph-mgr02
[ceph-mgr02][DEBUG ] detect platform information from remote host
[ceph-mgr02][DEBUG ] detect machine type
[ceph-mgr02][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[cephadm@ceph-admin ceph-cluster]$
Deploy the MDS daemons again
[cephadm@ceph-admin ceph-cluster]$ ceph-deploy mds create ceph-mon01 ceph-mon03
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadm/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /bin/ceph-deploy mds create ceph-mon01 ceph-mon03
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : create
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7fc39019c7e8>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] func : <function mds at 0x7fc3903f5050>
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] mds : [('ceph-mon01', 'ceph-mon01'), ('ceph-mon03', 'ceph-mon03')]
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.mds][DEBUG ] Deploying mds, cluster ceph hosts ceph-mon01:ceph-mon01 ceph-mon03:ceph-mon03
[ceph-mon01][DEBUG ] connection detected need for sudo
[ceph-mon01][DEBUG ] connected to host: ceph-mon01
[ceph-mon01][DEBUG ] detect platform information from remote host
[ceph-mon01][DEBUG ] detect machine type
[ceph_deploy.mds][INFO ] Distro info: CentOS Linux 7.9.2009 Core
[ceph_deploy.mds][DEBUG ] remote host will use systemd
[ceph_deploy.mds][DEBUG ] deploying mds bootstrap to ceph-mon01
[ceph-mon01][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-mon01][WARNIN] mds keyring does not exist yet, creating one
[ceph-mon01][DEBUG ] create a keyring file
[ceph-mon01][DEBUG ] create path if it doesn't exist
[ceph-mon01][INFO ] Running command: sudo ceph --cluster ceph --name client.bootstrap-mds --keyring /var/lib/ceph/bootstrap-mds/ceph.keyring auth get-or-create mds.ceph-mon01 osd allow rwx mds allow mon allow profile mds -o /var/lib/ceph/mds/ceph-ceph-mon01/keyring
[ceph-mon01][INFO ] Running command: sudo systemctl enable ceph-mds@ceph-mon01
[ceph-mon01][WARNIN] Created symlink from /etc/systemd/system/ceph-mds.target.wants/ceph-mds@ceph-mon01.service to /usr/lib/systemd/system/ceph-mds@.service.
[ceph-mon01][INFO ] Running command: sudo systemctl start ceph-mds@ceph-mon01
[ceph-mon01][INFO ] Running command: sudo systemctl enable ceph.target
[ceph-mon03][DEBUG ] connection detected need for sudo
[ceph-mon03][DEBUG ] connected to host: ceph-mon03
[ceph-mon03][DEBUG ] detect platform information from remote host
[ceph-mon03][DEBUG ] detect machine type
[ceph_deploy.mds][INFO ] Distro info: CentOS Linux 7.9.2009 Core
[ceph_deploy.mds][DEBUG ] remote host will use systemd
[ceph_deploy.mds][DEBUG ] deploying mds bootstrap to ceph-mon03
[ceph-mon03][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-mon03][DEBUG ] create path if it doesn't exist
[ceph-mon03][INFO ] Running command: sudo ceph --cluster ceph --name client.bootstrap-mds --keyring /var/lib/ceph/bootstrap-mds/ceph.keyring auth get-or-create mds.ceph-mon03 osd allow rwx mds allow mon allow profile mds -o /var/lib/ceph/mds/ceph-ceph-mon03/keyring
[ceph-mon03][INFO ] Running command: sudo systemctl enable ceph-mds@ceph-mon03
[ceph-mon03][INFO ] Running command: sudo systemctl start ceph-mds@ceph-mon03
[ceph-mon03][INFO ] Running command: sudo systemctl enable ceph.target
[cephadm@ceph-admin ceph-cluster]$
Check the MDS status
[cephadm@ceph-admin ceph-cluster]$ ceph mds stat
cephfs-1/1/1 up {0=ceph-mon02=up:active}, 2 up:standby
[cephadm@ceph-admin ceph-cluster]$ ceph fs status cephfs
cephfs - 0 clients
======
+------+--------+------------+---------------+-------+-------+
| Rank | State | MDS | Activity | dns | inos |
+------+--------+------------+---------------+-------+-------+
| 0 | active | ceph-mon02 | Reqs: 0 /s | 18 | 17 |
+------+--------+------------+---------------+-------+-------+
+---------------------+----------+-------+-------+
| Pool | type | used | avail |
+---------------------+----------+-------+-------+
| cephfs-metadatapool | metadata | 59.8k | 280G |
| cephfs-datapool | data | 3391k | 280G |
+---------------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
| ceph-mon03 |
| ceph-mon01 |
+-------------+
MDS version: ceph version 13.2.10 (564bdc4ae87418a232fc901524470e1a0f76d641) mimic (stable)
[cephadm@ceph-admin ceph-cluster]$
Tip: there are now two MDS daemons in the standby state and one in the active state.
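If the summary line printed by ceph mds stat needs to be checked from a script, it can be picked apart with a regular expression. The Python sketch below assumes the line looks exactly like the Mimic output shown above; the format is not guaranteed to be stable across Ceph releases, and the function name parse_mds_stat is our own.

```python
import re
from typing import Dict, Tuple


def parse_mds_stat(line: str) -> Tuple[Dict[int, Tuple[str, str]], int]:
    """Return ({rank: (daemon, state)}, number of plain standby daemons)."""
    # Entries inside the braces look like "0=ceph-mon02=up:active".
    active = {
        int(rank): (name, state)
        for rank, name, state in re.findall(r"(\d+)=([^=,}]+)=([^,}]+)", line)
    }
    # The trailing ", 2 up:standby" part is absent when there is no standby;
    # the lookahead keeps "up:standby-replay" from being counted here.
    match = re.search(r"(\d+)\s+up:standby(?![-\w])", line)
    standby = int(match.group(1)) if match else 0
    return active, standby


print(parse_mds_stat("cephfs-1/1/1 up {0=ceph-mon02=up:active}, 2 up:standby"))
```

For anything beyond a quick check it is probably safer to ask the ceph command for machine-readable output than to rely on this text format.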
Managing ranks
To increase the number of active MDS daemons, the command format is: ceph fs set <fsname> max_mds <number>
[cephadm@ceph-admin ceph-cluster]$ ceph fs set cephfs max_mds 2
[cephadm@ceph-admin ceph-cluster]$ ceph fs status cephfs
cephfs - 0 clients
======
+------+--------+------------+---------------+-------+-------+
| Rank | State | MDS | Activity | dns | inos |
+------+--------+------------+---------------+-------+-------+
| 0 | active | ceph-mon02 | Reqs: 0 /s | 18 | 17 |
| 1 | active | ceph-mon01 | Reqs: 0 /s | 10 | 13 |
+------+--------+------------+---------------+-------+-------+
+---------------------+----------+-------+-------+
| Pool | type | used | avail |
+---------------------+----------+-------+-------+
| cephfs-metadatapool | metadata | 61.1k | 280G |
| cephfs-datapool | data | 3391k | 280G |
+---------------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
| ceph-mon03 |
+-------------+
MDS version: ceph version 13.2.10 (564bdc4ae87418a232fc901524470e1a0f76d641) mimic (stable)
[cephadm@ceph-admin ceph-cluster]$
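Tip: the number of ranks actually in the file system increases only when a spare daemon is available to take the new rank. A multi-active MDS setup still needs spare standby daemons for high availability, so max_mds should always be at least 1 less than the number of MDS daemons actually available.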
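The sizing rule in this tip is simple arithmetic, and a short Python sketch may help when planning a cluster. Both function names and the min_standby parameter are hypothetical; they only restate the rule above.

```python
def largest_safe_max_mds(total_mds_daemons: int, min_standby: int = 1) -> int:
    """Largest max_mds that still leaves min_standby daemons idle (never below 1)."""
    if total_mds_daemons < 1:
        raise ValueError("at least one MDS daemon is required")
    return max(1, total_mds_daemons - min_standby)


def effective_active(max_mds: int, total_mds_daemons: int) -> int:
    """Ranks are only filled when a daemon exists to hold them."""
    return min(max_mds, total_mds_daemons)


# Three daemons (ceph-mon01, ceph-mon02, ceph-mon03) as in this post.
print(largest_safe_max_mds(3))
print(effective_active(max_mds=4, total_mds_daemons=3))
```

With the three daemons deployed here the rule gives max_mds = 2, which is the value used above.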
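Reducing the number of active MDS daemons
Lowering max_mds only limits the creation of new ranks; it does not really affect the existing active MDS daemons or the ranks they hold. After lowering max_mds, the administrator therefore has to shut down the ranks that are no longer needed by hand. Command format: ceph mds deactivate {System:rank|FSID:rank|rank}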
[cephadm@ceph-admin ceph-cluster]$ ceph fs set cephfs max_mds 1
[cephadm@ceph-admin ceph-cluster]$ ceph fs status
cephfs - 0 clients
======
+------+----------+------------+---------------+-------+-------+
| Rank | State | MDS | Activity | dns | inos |
+------+----------+------------+---------------+-------+-------+
| 0 | active | ceph-mon02 | Reqs: 0 /s | 18 | 17 |
| 1 | stopping | ceph-mon01 | | 10 | 13 |
+------+----------+------------+---------------+-------+-------+
+---------------------+----------+-------+-------+
| Pool | type | used | avail |
+---------------------+----------+-------+-------+
| cephfs-metadatapool | metadata | 61.6k | 280G |
| cephfs-datapool | data | 3391k | 280G |
+---------------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
| ceph-mon03 |
+-------------+
MDS version: ceph version 13.2.10 (564bdc4ae87418a232fc901524470e1a0f76d641) mimic (stable)
[cephadm@ceph-admin ceph-cluster]$ ceph mds deactivate cephfs:1
Error ENOTSUP: command is obsolete; please check usage and/or man page
[cephadm@ceph-admin ceph-cluster]$ ceph fs status
cephfs - 0 clients
======
+------+--------+------------+---------------+-------+-------+
| Rank | State | MDS | Activity | dns | inos |
+------+--------+------------+---------------+-------+-------+
| 0 | active | ceph-mon02 | Reqs: 0 /s | 18 | 17 |
+------+--------+------------+---------------+-------+-------+
+---------------------+----------+-------+-------+
| Pool | type | used | avail |
+---------------------+----------+-------+-------+
| cephfs-metadatapool | metadata | 62.1k | 280G |
| cephfs-datapool | data | 3391k | 280G |
+---------------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
| ceph-mon03 |
| ceph-mon01 |
+-------------+
MDS version: ceph version 13.2.10 (564bdc4ae87418a232fc901524470e1a0f76d641) mimic (stable)
[cephadm@ceph-admin ceph-cluster]$
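Tip: ceph mds deactivate reports that the command is obsolete, yet rank 1 was still removed and ceph-mon01 went back to standby. The first status output already shows rank 1 in the stopping state right after max_mds was lowered, so on this version (Mimic 13.2.10) lowering max_mds appears to be what stops the extra rank.
Manually pinning a directory subtree to a rank
A CephFS cluster with multiple active MDS daemons runs a balancer that schedules the metadata load, and this is usually enough for most users. In some scenarios users need to override the dynamic balancer with an explicit mapping of metadata to particular ranks, in order to distribute application load across the cluster in a custom way. The mechanism provided for this is called an export pin, which is the extended attribute ceph.dir.pin on a directory. The attribute is set with: setfattr -n ceph.dir.pin -v RANK /PATH/TO/DIR. The attribute value (-v) is the rank the directory subtree is assigned to; the default is -1, which means the directory is not pinned. A directory inherits its export pin from the nearest parent that has one set, so pinning a directory affects all of its descendants.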
[cephadm@ceph-admin ceph-cluster]$ sefattr
-bash: sefattr: command not found
[cephadm@ceph-admin ceph-cluster]$ yum provides setfattr
Loaded plugins: fastestmirror
Repository epel is listed more than once in the configuration
Repository epel-debuginfo is listed more than once in the configuration
Repository epel-source is listed more than once in the configuration
Loading mirror speeds from cached hostfile
 * base: mirrors.aliyun.com
 * extras: mirrors.aliyun.com
 * updates: mirrors.aliyun.com
attr-2.4.46-13.el7.x86_64 : Utilities for managing filesystem extended attributes
Repo : base
Matched from:
Filename : /usr/bin/setfattr
[cephadm@ceph-admin ceph-cluster]$
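Tip: this requires the setfattr command on the system (note that the first attempt above mistyped it as sefattr); if it is missing, install the attr package.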
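The inheritance rule for export pins (the nearest pinned parent wins, -1 means "not pinned") can be modelled in a few lines of Python. This sketch does not touch a real file system; the dictionary of pins, the mount point /mnt/cephfs, and both function names are assumptions for illustration. The second helper only builds the setfattr command line quoted earlier in this section.

```python
import posixpath
from typing import Dict, List


def effective_pin(path: str, pins: Dict[str, int]) -> int:
    """Walk up from path to the root and return the first pin that is not -1."""
    current = posixpath.normpath(path)
    while True:
        pin = pins.get(current, -1)
        if pin != -1:
            return pin
        parent = posixpath.dirname(current)
        if parent == current:  # reached the top without finding a pin
            return -1
        current = parent


def setfattr_command(rank: int, directory: str) -> List[str]:
    """Build the argument list for: setfattr -n ceph.dir.pin -v RANK /PATH/TO/DIR"""
    return ["setfattr", "-n", "ceph.dir.pin", "-v", str(rank), directory]


# Hypothetical pins on a CephFS mounted at /mnt/cephfs.
pins = {"/mnt/cephfs/projects": 1, "/mnt/cephfs/projects/hot": 0}
print(effective_pin("/mnt/cephfs/projects/docs/readme", pins))
print(effective_pin("/mnt/cephfs/projects/hot/build", pins))
print(effective_pin("/mnt/cephfs/scratch", pins))
print(" ".join(setfattr_command(1, "/mnt/cephfs/projects")))
```

The argument list could be passed to subprocess.run on a host where CephFS is mounted; here it is only printed.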
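MDS failover
For redundancy, every CephFS file system should have a number of ceph-mds daemons in the standby state waiting to take over failed ranks. CephFS provides four options that control how a standby MDS daemon behaves.
1. mds_standby_replay: boolean. true means this MDS daemon continuously reads the metadata journal of a particular Up rank, so that it holds a warm metadata cache for that rank and speeds up failover when the rank fails. An Up rank can have only one replay daemon; any extras are automatically demoted to ordinary non-replay standbys.
2. mds_standby_for_name: this MDS daemon stands by only for the MDS daemon with the given name.
3. mds_standby_for_rank: this MDS daemon stands by only for the given rank and does not take over any other failed rank. Where there are several CephFS file systems, it can be combined with the option below to say which file system's rank is meant.
4. mds_standby_for_fscid: takes effect together with the value of mds_standby_for_rank. If mds_standby_for_rank is also set, the daemon stands by for that rank of the given FSCID; if it is not set, the daemon stands by for any rank of the given FSCID.
Configuring a standby MDS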

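Tip: this configuration makes the standby MDS on ceph-mon03 follow ceph-mon01 in real time; when ceph-mon01 fails, ceph-mon03 automatically takes over the rank that ceph-mon01 was responsible for.
Push the configuration to all cluster hosts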
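The configuration itself is not preserved in the text of this post, so the following is only a guess at what such a section could look like, based on the tip and on the options listed earlier. The Python sketch generates a per-daemon section with configparser; the section name mds.ceph-mon03 and the choice of the two options are assumptions, and the function name standby_section is hypothetical.

```python
import configparser
import io


def standby_section(standby: str, follow: str) -> str:
    """Render a ceph.conf-style section that makes `standby` a replay standby for `follow`.

    Assumption: the options are mds_standby_replay and mds_standby_for_name,
    as described in the option list in this post.
    """
    config = configparser.ConfigParser(interpolation=None)
    config.optionxform = str  # keep option names exactly as written
    config[f"mds.{standby}"] = {
        "mds_standby_replay": "true",
        "mds_standby_for_name": follow,
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


print(standby_section("ceph-mon03", "ceph-mon01"))
```

The rendered text would be appended to ceph.conf in the ceph-cluster directory before pushing it out; whether the original post used exactly these lines cannot be confirmed from the text.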
[cephadm@ceph-admin ceph-cluster]$ ceph-deploy --overwrite-conf config push ceph-mon01 ceph-mon02 ceph-mon03 ceph-mgr01 ceph-mgr02
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadm/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /bin/ceph-deploy --overwrite-conf config push ceph-mon01 ceph-mon02 ceph-mon03 ceph-mgr01 ceph-mgr02
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : True
[ceph_deploy.cli][INFO ] subcommand : push
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f03332968c0>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] client : ['ceph-mon01', 'ceph-mon02', 'ceph-mon03', 'ceph-mgr01', 'ceph-mgr02']
[ceph_deploy.cli][INFO ] func : <function config at 0x7f03334c5cf8>
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.config][DEBUG ] Pushing config to ceph-mon01
[ceph-mon01][DEBUG ] connection detected need for sudo
[ceph-mon01][DEBUG ] connected to host: ceph-mon01
[ceph-mon01][DEBUG ] detect platform information from remote host
[ceph-mon01][DEBUG ] detect machine type
[ceph-mon01][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.config][DEBUG ] Pushing config to ceph-mon02
[ceph-mon02][DEBUG ] connection detected need for sudo
[ceph-mon02][DEBUG ] connected to host: ceph-mon02
[ceph-mon02][DEBUG ] detect platform information from remote host
[ceph-mon02][DEBUG ] detect machine type
[ceph-mon02][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.config][DEBUG ] Pushing config to ceph-mon03
[ceph-mon03][DEBUG ] connection detected need for sudo
[ceph-mon03][DEBUG ] connected to host: ceph-mon03
[ceph-mon03][DEBUG ] detect platform information from remote host
[ceph-mon03][DEBUG ] detect machine type
[ceph-mon03][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.config][DEBUG ] Pushing config to ceph-mgr01
[ceph-mgr01][DEBUG ] connection detected need for sudo
[ceph-mgr01][DEBUG ] connected to host: ceph-mgr01
[ceph-mgr01][DEBUG ] detect platform information from remote host
[ceph-mgr01][DEBUG ] detect machine type
[ceph-mgr01][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.config][DEBUG ] Pushing config to ceph-mgr02
[ceph-mgr02][DEBUG ] connection detected need for sudo
[ceph-mgr02][DEBUG ] connected to host: ceph-mgr02
[ceph-mgr02][DEBUG ] detect platform information from remote host
[ceph-mgr02][DEBUG ] detect machine type
[ceph-mgr02][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[cephadm@ceph-admin ceph-cluster]$
Stop the MDS daemon on ceph-mon01 and see whether ceph-mon03 takes over
[cephadm@ceph-admin ceph-cluster]$ ceph fs status cephfs
cephfs - 0 clients
======
+------+--------+------------+---------------+-------+-------+
| Rank | State | MDS | Activity | dns | inos |
+------+--------+------------+---------------+-------+-------+
| 0 | active | ceph-mon02 | Reqs: 0 /s | 18 | 17 |
| 1 | active | ceph-mon01 | Reqs: 0 /s | 10 | 13 |
+------+--------+------------+---------------+-------+-------+
+---------------------+----------+-------+-------+
| Pool | type | used | avail |
+---------------------+----------+-------+-------+
| cephfs-metadatapool | metadata | 65.3k | 280G |
| cephfs-datapool | data | 3391k | 280G |
+---------------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
| ceph-mon03 |
+-------------+
MDS version: ceph version 13.2.10 (564bdc4ae87418a232fc901524470e1a0f76d641) mimic (stable)
[cephadm@ceph-admin ceph-cluster]$ ssh ceph-mon01 'systemctl stop ceph-mds@ceph-mon01.service'
Failed to stop ceph-mds@ceph-mon01.service: Interactive authentication required.
See system logs and 'systemctl status ceph-mds@ceph-mon01.service' for details.
[cephadm@ceph-admin ceph-cluster]$ ssh ceph-mon01 'sudo systemctl stop ceph-mds@ceph-mon01.service'
[cephadm@ceph-admin ceph-cluster]$ ceph fs status cephfs
cephfs - 0 clients
======
+------+--------+------------+---------------+-------+-------+
| Rank | State | MDS | Activity | dns | inos |
+------+--------+------------+---------------+-------+-------+
| 0 | active | ceph-mon02 | Reqs: 0 /s | 18 | 17 |
| 1 | rejoin | ceph-mon03 | | 0 | 3 |
+------+--------+------------+---------------+-------+-------+
+---------------------+----------+-------+-------+
| Pool | type | used | avail |
+---------------------+----------+-------+-------+
| cephfs-metadatapool | metadata | 65.3k | 280G |
| cephfs-datapool | data | 3391k | 280G |
+---------------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
+-------------+
MDS version: ceph version 13.2.10 (564bdc4ae87418a232fc901524470e1a0f76d641) mimic (stable)
[cephadm@ceph-admin ceph-cluster]$ ceph fs status cephfs
cephfs - 0 clients
======
+------+--------+------------+---------------+-------+-------+
| Rank | State | MDS | Activity | dns | inos |
+------+--------+------------+---------------+-------+-------+
| 0 | active | ceph-mon02 | Reqs: 0 /s | 18 | 17 |
| 1 | active | ceph-mon03 | Reqs: 0 /s | 10 | 13 |
+------+--------+------------+---------------+-------+-------+
+---------------------+----------+-------+-------+
| Pool | type | used | avail |
+---------------------+----------+-------+-------+
| cephfs-metadatapool | metadata | 65.3k | 280G |
| cephfs-datapool | data | 3391k | 280G |
+---------------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
+-------------+
MDS version: ceph version 13.2.10 (564bdc4ae87418a232fc901524470e1a0f76d641) mimic (stable)
[cephadm@ceph-admin ceph-cluster]$
Tip: after ceph-mon01 failed, ceph-mon03 automatically took over the rank that ceph-mon01 was responsible for.
Restoring ceph-mon01
[cephadm@ceph-admin ceph-cluster]$ ceph fs status
cephfs - 0 clients
======
+------+--------+------------+---------------+-------+-------+
| Rank | State | MDS | Activity | dns | inos |
+------+--------+------------+---------------+-------+-------+
| 0 | active | ceph-mon02 | Reqs: 0 /s | 18 | 17 |
| 1 | active | ceph-mon03 | Reqs: 0 /s | 10 | 13 |
+------+--------+------------+---------------+-------+-------+
+---------------------+----------+-------+-------+
| Pool | type | used | avail |
+---------------------+----------+-------+-------+
| cephfs-metadatapool | metadata | 65.3k | 280G |
| cephfs-datapool | data | 3391k | 280G |
+---------------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
+-------------+
MDS version: ceph version 13.2.10 (564bdc4ae87418a232fc901524470e1a0f76d641) mimic (stable)
[cephadm@ceph-admin ceph-cluster]$ ssh ceph-mon01 'sudo systemctl start ceph-mds@ceph-mon01.service'
[cephadm@ceph-admin ceph-cluster]$ ceph fs status
cephfs - 0 clients
======
+------+--------+------------+---------------+-------+-------+
| Rank | State | MDS | Activity | dns | inos |
+------+--------+------------+---------------+-------+-------+
| 0 | active | ceph-mon02 | Reqs: 0 /s | 18 | 17 |
| 1 | active | ceph-mon03 | Reqs: 0 /s | 10 | 13 |
+------+--------+------------+---------------+-------+-------+
+---------------------+----------+-------+-------+
| Pool | type | used | avail |
+---------------------+----------+-------+-------+
| cephfs-metadatapool | metadata | 65.3k | 280G |
| cephfs-datapool | data | 3391k | 280G |
+---------------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
| ceph-mon01 |
+-------------+
MDS version: ceph version 13.2.10 (564bdc4ae87418a232fc901524470e1a0f76d641) mimic (stable)
[cephadm@ceph-admin ceph-cluster]$ ssh ceph-mon03 'sudo systemctl restart ceph-mds@ceph-mon03.service'
[cephadm@ceph-admin ceph-cluster]$ ceph fs status
cephfs - 0 clients
======
+------+----------------+------------+---------------+-------+-------+
| Rank | State | MDS | Activity | dns | inos |
+------+----------------+------------+---------------+-------+-------+
| 0 | active | ceph-mon02 | Reqs: 0 /s | 18 | 17 |
| 1 | active | ceph-mon01 | Reqs: 0 /s | 10 | 13 |
| 1-s | standby-replay | ceph-mon03 | Evts: 0 /s | 0 | 3 |
+------+----------------+------------+---------------+-------+-------+
+---------------------+----------+-------+-------+
| Pool | type | used | avail |
+---------------------+----------+-------+-------+
| cephfs-metadatapool | metadata | 65.3k | 280G |
| cephfs-datapool | data | 3391k | 280G |
+---------------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
+-------------+
MDS version: ceph version 13.2.10 (564bdc4ae87418a232fc901524470e1a0f76d641) mimic (stable)
[cephadm@ceph-admin ceph-cluster]$
Tip: after ceph-mon01 comes back it does not preempt the rank; it simply becomes a standby. When ceph-mon03 is restarted or fails, ceph-mon01 in turn takes over the rank automatically.
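The takeover behaviour seen in this test (a standby fills the failed rank, and a returning daemon does not preempt) can be replayed with a small Python simulation. It is a simplified model under our own assumptions: the class MdsCluster is hypothetical, and the standby-replay state is not modelled, so a restarted daemon simply joins the standby list.

```python
from typing import Dict, List, Optional


class MdsCluster:
    """Toy failover model: ranks map to daemon names, the rest wait as standbys."""

    def __init__(self, ranks: Dict[int, str], standbys: List[str]) -> None:
        self.ranks: Dict[int, Optional[str]] = dict(ranks)
        self.standbys: List[str] = list(standbys)

    def stop(self, daemon: str) -> None:
        """Stopping a rank holder hands the rank to the first standby, if there is one."""
        if daemon in self.standbys:
            self.standbys.remove(daemon)
            return
        for rank, holder in self.ranks.items():
            if holder == daemon:
                self.ranks[rank] = self.standbys.pop(0) if self.standbys else None

    def start(self, daemon: str) -> None:
        """A starting daemon fills a failed rank if any; otherwise it waits. No preemption."""
        for rank, holder in self.ranks.items():
            if holder is None:
                self.ranks[rank] = daemon
                return
        self.standbys.append(daemon)


cluster = MdsCluster({0: "ceph-mon02", 1: "ceph-mon01"}, ["ceph-mon03"])
cluster.stop("ceph-mon01")   # ceph-mon03 takes over rank 1
cluster.start("ceph-mon01")  # ceph-mon01 returns as a standby
cluster.stop("ceph-mon03")   # ceph-mon01 takes rank 1 back
cluster.start("ceph-mon03")
print(cluster.ranks, cluster.standbys)
```

The sequence of calls mirrors the stop, start, and restart steps performed on the real cluster above.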
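Author: Linux-1874. Source: https://www.cnblogs.com/qiuhom-1874/. Copyright of this article is shared by the author and cnblogs (博客园). Reposting is welcome, but unless the author agrees otherwise this notice must be kept and a clear link to the original must be given on the article page; otherwise the right to pursue legal liability is reserved.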
