OpenShift Platform Component Monitoring
Docker
Docker is the most fundamental component of OpenShift. Docker health must be monitored on every master and node host. Each node should run the following checks:
Check Name | Description | Storage Driver | Sample Alerting Logic |
Docker Daemon | Check that docker is running on a system | devicemapper | `systemctl is-active docker` |
Docker Daemon | Check that docker is running on a system | overlay2 | `systemctl is-active docker` |
Docker Storage | Check that docker’s storage has adequate space. The command prints 1 when more than 30% of the data space is still available. | devicemapper | `echo "$(echo "$(docker info 2>/dev/null | awk '/Data Space Available/ {print $4}') / $(docker info 2>/dev/null | awk '/Data Space Total/ {print $4}')" | bc -l) > 0.3" | bc -l` |
Docker Storage | Check that docker’s storage has adequate space. This check assumes LV_Name is dockerlv and VG is dockervg; the command prints 1 when usage exceeds 70%. | overlay2 | `echo "$(df -h | awk '/dockervg-dockerlv/ {print $5}' | awk -F% '{print $1}') > 70" | bc` |
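Docker Metadata Storage | Check that docker’s metadata storage volume is not full. The command prints 1 when more than 30% of the metadata space is still available. | devicemapper | `echo "$(echo "$(docker info 2>/dev/null | awk '/Metadata Space Available/ {print $4}') / $(docker info 2>/dev/null | awk '/Metadata Space Total/ {print $4}')" | bc -l) > 0.3" | bc -l` |
Docker Metadata Storage | Check that docker’s metadata storage volume is not full | overlay2 | N/A with overlay2 |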
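The devicemapper checks above divide the "Space Available" figure by the "Space Total" figure without looking at the units that `docker info` prints after each number. The following is a minimal sketch of the same ratio check that also converts the units. The function name `docker_space_ok` is hypothetical, and the sketch assumes the decimal unit suffixes kB, MB, GB and TB appear as the last field of each line.

```shell
#!/bin/sh
# Hypothetical helper (not part of Docker or OpenShift).
# Reads `docker info` text on stdin and succeeds when
#   "<prefix> Space Available" / "<prefix> Space Total" > threshold.
# $1: prefix, "Data" or "Metadata"
# $2: threshold as a fraction (defaults to 0.3, the value used in the table)
docker_space_ok() {
    prefix=$1
    threshold=${2:-0.3}
    awk -v prefix="$prefix" -v threshold="$threshold" '
        # Convert a value with a decimal unit suffix to bytes.
        # Assumption: units are kB, MB, GB or TB; anything else is left as-is.
        function to_bytes(value, unit) {
            if (unit == "kB") return value * 1e3
            if (unit == "MB") return value * 1e6
            if (unit == "GB") return value * 1e9
            if (unit == "TB") return value * 1e12
            return value
        }
        # The match is case-sensitive, so "Data Space" does not match
        # the "Metadata Space" lines.
        index($0, prefix " Space Available:") { avail = to_bytes($(NF-1), $NF) }
        index($0, prefix " Space Total:")     { total = to_bytes($(NF-1), $NF) }
        # Exit 0 (healthy) only when a total was found and the ratio is above the threshold.
        END { exit !(total > 0 && avail / total > threshold) }
    '
}
```

A possible invocation is `docker info 2>/dev/null | docker_space_ok Data 0.3`. The exit status is 0 when enough space is free, which matches the convention of `systemctl is-active`.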
Nodes & Masters
Check Name | Description | Relevant Hosts | OCP Version | Sample Alerting Logic |
Etcd Service | Check that etcd is active | Masters | <= 3.9 | `systemctl is-active etcd` |
Etcd Service | Check that etcd is active | Masters | >= 3.10 | `oc get pods -n kube-system --no-headers -o=custom-columns=POD:.metadata.name,STATUS:.status.phase | grep -i "master-etcd" | grep -i "running" | if [ $(wc -l) -eq $(oc get pods -n kube-system --no-headers -o=custom-columns=POD:.metadata.name | grep etcd | wc -l) ]; then exit 0; else exit 1; fi` |
Etcd Storage | Check that the etcd volume is not too full. This check assumes the etcd storage (/var/lib/etcd) is provisioned with a separate logical volume. | Masters | <= 3.9 | `echo "$(lvs | awk '/etcd/ {print $4}') > 70" | bc` or `echo "$(df -h | awk '/etcd/ {print $5}' | awk -F% '{print $1}') > 70" | bc` |
Etcd Storage | Check that the etcd volume is not too full. This check assumes the etcd storage (/var/lib/etcd) is provisioned with a separate logical volume. | Masters | >= 3.10 | `echo "$(lvs | awk '/etcd/ {print $4}') > 70" | bc` or `echo "$(df -h | awk '/etcd/ {print $5}' | awk -F% '{print $1}') > 70" | bc` |
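Master API Service (single master) | Check that the Master API Service or pods are active | Masters | <= 3.9 | `systemctl is-active atomic-openshift-master` |
Master API Service (single master) | Check that the Master API Service or pods are active | Masters | >= 3.10 | Same as multi-master check. |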
Master API Service (multi-master) | Check that the Master API Service or pods are active | Masters | <= 3.9 | `systemctl is-active atomic-openshift-master-api` |
Master API Service (multi-master) | Check that the Master API Service or pods are active | Masters | >= 3.10 | `oc get pods -n kube-system --no-headers -o=custom-columns=POD:.metadata.name,STATUS:.status.phase | grep -i "master-api" | grep -i "running" | if [ $(wc -l) -eq $(oc get pods -n kube-system --no-headers -o=custom-columns=POD:.metadata.name | grep -i "master-api" | wc -l) ]; then exit 0; else exit 1; fi` |
Master Controllers Service (multi-master) | Check that the Master Controllers Service or pods are active | Masters | <= 3.9 | `systemctl is-active atomic-openshift-master-controllers` |
Master Controllers Service (multi-master) | Check that the Master Controllers Service or pods are active | Masters | >= 3.10 | `oc get pods -n kube-system --no-headers -o=custom-columns=POD:.metadata.name,STATUS:.status.phase | grep -i "master-controller" | grep -i "running" | if [ $(wc -l) -eq $(oc get pods -n kube-system --no-headers -o=custom-columns=POD:.metadata.name | grep -i "master-controller" | wc -l) ]; then exit 0; else exit 1; fi` |
Node Service | Check that the node service is active | All Nodes | <= 3.9 | `systemctl is-active atomic-openshift-node` |
Node Service | Check that the node service is active | All Nodes | >= 3.10 | `systemctl is-active atomic-openshift-node` |
Node Storage | Check that the node’s local data storage volume is not too full. This check assumes the node storage (/var/lib/origin) is provisioned with a separate logical volume. | All Nodes | <= 3.9 | `echo "$(lvs | awk '/origin/ {print $4}') > 70" | bc` or `echo "$(df -h | awk '/origin/ {print $5}' | awk -F% '{print $1}') > 70" | bc` |
Node Storage | Check that the node’s local data storage volume is not too full. This check assumes the node storage (/var/lib/origin) is provisioned with a separate logical volume. | All Nodes | >= 3.10 | `echo "$(lvs | awk '/origin/ {print $4}') > 70" | bc` or `echo "$(df -h | awk '/origin/ {print $5}' | awk -F% '{print $1}') > 70" | bc` |
OpenVSwitch Service | Check that the openvswitch service or pods are active | All Nodes | <= 3.9 | `systemctl is-active openvswitch` |
OpenVSwitch Service | Check that the openvswitch service or pods are active | All Nodes | >= 3.10 | `oc get pods -n openshift-sdn --no-headers -o=custom-columns=POD:.metadata.name,STATUS:.status.phase | grep -i "ovs-" | grep -i "running" | if [ $(wc -l) -eq $(oc get nodes --no-headers | wc -l) ]; then exit 0; else exit 1; fi` |
SDN Service | Check that all the SDN pods are active | All Nodes | <= 3.9 | N/A |
SDN Service | Check that all the SDN pods are active | All Nodes | >= 3.10 | `oc get pods -n openshift-sdn --no-headers -o=custom-columns=POD:.metadata.name,STATUS:.status.phase | grep -i "sdn-" | grep -i "running" | if [ $(wc -l) -eq $(oc get nodes --no-headers | wc -l) ]; then exit 0; else exit 1; fi` |
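The OCP >= 3.10 pod checks and the storage checks in this table repeat two patterns: "every pod matching a name is Running" and "a filesystem is more than 70% full". As a rough sketch, both can be wrapped in small functions that read command output on stdin. The names `all_pods_running` and `fs_over_threshold` are hypothetical, the first assumes the two-column `POD STATUS` output produced by the `-o=custom-columns` option used above, and the second assumes POSIX-format `df -P` output rather than `df -h`.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the oc client).

# Reads "POD STATUS" lines on stdin. Succeeds when at least one pod name
# matches the pattern in $1 and every matching pod is in phase Running.
# Unlike the table's one-liners, total and running pods are counted in a
# single pass, so `oc` only has to be called once.
all_pods_running() {
    awk -v pattern="$1" '
        $1 ~ pattern { total++; if ($2 == "Running") running++ }
        END { exit !(total > 0 && running == total) }
    '
}

# Reads `df -P` output on stdin. Succeeds (alert condition) when the
# filesystem mounted at $1 is more than $2 percent full (default 70).
fs_over_threshold() {
    awk -v mount="$1" -v limit="${2:-70}" '
        # Column 5 is the capacity such as "80%"; column 6 is the mount point.
        $6 == mount { sub(/%/, "", $5); used = $5 + 0; found = 1 }
        END { exit !(found && used > limit) }
    '
}
```

Possible invocations are `oc get pods -n kube-system --no-headers -o=custom-columns=POD:.metadata.name,STATUS:.status.phase | all_pods_running master-api` and `df -P | fs_over_threshold /var/lib/etcd 70`. Note that `fs_over_threshold` returns 0 when the volume is too full, mirroring the table's `> 70` expressions, while `all_pods_running` returns 0 when the pods are healthy.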
API Endpoints
Many OpenShift components expose HTTP endpoints for health checks and related operations. These endpoints should be monitored:
Check Name | Description | Sample Alerting Logic |
OpenShift Master API Server | Check the health of a master API Endpoint | `curl -s https://console.c1-ocp.myorg.com:8443/healthz | grep ok` |
Router | Check the health of the Router | `curl http://router.default.svc.cluster.local:1936/healthz | grep 200` |
Registry | Check the health of the Registry | `curl -I https://docker-registry.default.svc.cluster.local:5000/healthz | grep 200` |
Logging | Check the health of the EFK Logging Stack | Because of the various components and complexities involved, we recommend the OpenShift Logging health check script. |
Metrics | Check the health of the Metrics Stack | Because of the various components and complexities involved, we recommend the OpenShift Metrics health check script. |
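The curl checks above grep the response body or headers for a marker string. One alternative, sketched here under stated assumptions, is to compare the HTTP status code directly. The function name `endpoint_healthy` and the 10-second timeout are illustrative choices, not values from the original checks. TLS trust (for example passing `--cacert`) is left to the environment.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the endpoint answers with the expected
# HTTP status code.
# $1: URL to probe
# $2: expected status code (defaults to 200)
endpoint_healthy() {
    url=$1
    expected=${2:-200}
    # -s silences progress output, -o /dev/null discards the body, and
    # -w '%{http_code}' prints only the status code. --max-time bounds the
    # request so a hung endpoint fails instead of blocking the check
    # (the 10-second value is an assumption).
    status=$(curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$url")
    [ "$status" = "$expected" ]
}
```

A possible invocation is `endpoint_healthy https://console.c1-ocp.myorg.com:8443/healthz`, using the master API URL from the table. When curl cannot connect it reports status `000`, so the function fails in that case as well.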
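Author: Petter Liu
Source: http://www.cnblogs.com/wintersun/
Copyright of this article is shared by the author and cnblogs (博客園). Reposting is welcome, but unless the author agrees otherwise, this notice must be retained and a clearly visible link to the original article must appear on the page; otherwise the author reserves the right to pursue legal liability.
This article is also published on my independent blog, Petter Liu Blog.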
