I have a pod running RabbitMQ. Here is the deployment manifest:
apiVersion: v1
kind: Service
metadata:
  name: service-rabbitmq
spec:
  selector:
    app: service-rabbitmq
  ports:
    - port: 5672
      targetPort: 5672
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployment-rabbitmq
spec:
  selector:
    matchLabels:
      app: deployment-rabbitmq
  template:
    metadata:
      labels:
        app: deployment-rabbitmq
    spec:
      containers:
        - name: rabbitmq
          image: rabbitmq:latest
          volumeMounts:
            - name: rabbitmq-data-volume
              mountPath: /var/lib/rabbitmq
          resources:
            requests:
              cpu: 250m
              memory: 128Mi
            limits:
              cpu: 750m
              memory: 256Mi
      volumes:
        - name: rabbitmq-data-volume
          persistentVolumeClaim:
            claimName: rabbitmq-pvc
When I deploy it to a local cluster, the pod runs for a while and then crashes, so it ends up in a crash loop. Here are the logs I get from the pod:
$ kubectl logs deployment-rabbitmq-649b8479dc-kt9s4
2021-10-14 06:46:36.182390 00:00 [info] <0.222.0> Feature flags: list of feature flags found:
2021-10-14 06:46:36.221717 00:00 [info] <0.222.0> Feature flags: [ ] implicit_default_bindings
2021-10-14 06:46:36.221768 00:00 [info] <0.222.0> Feature flags: [ ] maintenance_mode_status
2021-10-14 06:46:36.221792 00:00 [info] <0.222.0> Feature flags: [ ] quorum_queue
2021-10-14 06:46:36.221813 00:00 [info] <0.222.0> Feature flags: [ ] stream_queue
2021-10-14 06:46:36.221916 00:00 [info] <0.222.0> Feature flags: [ ] user_limits
2021-10-14 06:46:36.221933 00:00 [info] <0.222.0> Feature flags: [ ] virtual_host_metadata
2021-10-14 06:46:36.221953 00:00 [info] <0.222.0> Feature flags: feature flag states written to disk: yes
2021-10-14 06:46:37.018537 00:00 [noti] <0.44.0> Application syslog exited with reason: stopped
2021-10-14 06:46:37.018646 00:00 [noti] <0.222.0> Logging: switching to configured handler(s); following messages may not be visible in this log output
2021-10-14 06:46:37.045601 00:00 [noti] <0.222.0> Logging: configured log handlers are now ACTIVE
2021-10-14 06:46:37.635024 00:00 [info] <0.222.0> ra: starting system quorum_queues
2021-10-14 06:46:37.635139 00:00 [info] <0.222.0> starting Ra system: quorum_queues in directory: /var/lib/rabbitmq/mnesia/rabbit@deployment-rabbitmq-649b8479dc-kt9s4/quorum/rabbit@deployment-rabbitmq-649b8479dc-kt9s4
2021-10-14 06:46:37.849041 00:00 [info] <0.259.0> ra: meta data store initialised for system quorum_queues. 0 record(s) recovered
2021-10-14 06:46:37.877504 00:00 [noti] <0.264.0> WAL: ra_log_wal init, open tbls: ra_log_open_mem_tables, closed tbls: ra_log_closed_mem_tables
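This log isn't very helpful; I can't find any error message in it. The only line that might be useful is "Application syslog exited with reason: stopped", but as far as I understand, it isn't an error. The event log doesn't help either: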
$ kubectl describe pods deployment-rabbitmq-649b8479dc-kt9s4
Name:         deployment-rabbitmq-649b8479dc-kt9s4
Namespace:    default
Priority:     0
Node:         docker-desktop/192.168.65.4
Start Time:   Thu, 14 Oct 2021 12:45:03 0600
Labels:       app=deployment-rabbitmq
              pod-template-hash=649b8479dc
              skaffold.dev/run-id=7af5e1bb-e0c8-4021-a8a0-0c8bf43630b6
Annotations:  <none>
Status:       Running
IP:           10.1.5.138
IPs:
  IP:           10.1.5.138
Controlled By:  ReplicaSet/deployment-rabbitmq-649b8479dc
Containers:
  rabbitmq:
    Container ID:   docker://de309f94163c071afb38fb8743d106923b6bda27325287e82bc274e362f1f3be
    Image:          rabbitmq:latest
    Image ID:       docker-pullable://rabbitmq@sha256:d8efe7b818e66a13fdc6fdb84cf527984fb7d73f52466833a20e9ec298ed4df4
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    0
      Started:      Thu, 14 Oct 2021 13:56:29 0600
      Finished:     Thu, 14 Oct 2021 13:56:39 0600
    Ready:          False
    Restart Count:  18
    Limits:
      cpu:     750m
      memory:  256Mi
    Requests:
      cpu:        250m
      memory:     128Mi
    Environment:  <none>
    Mounts:
      /var/lib/rabbitmq from rabbitmq-data-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-9shdv (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  rabbitmq-data-volume:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  rabbitmq-pvc
    ReadOnly:   false
  kube-api-access-9shdv:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason   Age                    From     Message
  ----     ------   ----                   ----     -------
  Normal   Pulled   23m (x6 over 50m)      kubelet  (combined from similar events): Successfully pulled image "rabbitmq:latest" in 4.267310231s
  Normal   Pulling  18m (x16 over 73m)     kubelet  Pulling image "rabbitmq:latest"
  Warning  BackOff  3m45s (x307 over 73m)  kubelet  Back-off restarting failed container
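What could be causing this crash loop?
Note:
rabbitmq-pvc is bound successfully. No problem there.
Update:
This answer suggests that RabbitMQ should be deployed as a StatefulSet, so I adjusted the manifest like this: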
apiVersion: v1
kind: Service
metadata:
  name: service-rabbitmq
spec:
  selector:
    app: service-rabbitmq
  ports:
    - name: rabbitmq-amqp
      port: 5672
    - name: rabbitmq-http
      port: 15672
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: statefulset-rabbitmq
spec:
  selector:
    matchLabels:
      app: statefulset-rabbitmq
  serviceName: service-rabbitmq
  template:
    metadata:
      labels:
        app: statefulset-rabbitmq
    spec:
      containers:
        - name: rabbitmq
          image: rabbitmq:latest
          volumeMounts:
            - name: rabbitmq-data-volume
              mountPath: /var/lib/rabbitmq/mnesia
          resources:
            requests:
              cpu: 250m
              memory: 128Mi
            limits:
              cpu: 750m
              memory: 256Mi
      volumes:
        - name: rabbitmq-data-volume
          persistentVolumeClaim:
            claimName: rabbitmq-pvc
The pod still goes into a crash loop, but the logs are slightly different.
$ kubectl logs statefulset-rabbitmq-0
2021-10-14 09:38:26.138224 00:00 [info] <0.222.0> Feature flags: list of feature flags found:
2021-10-14 09:38:26.158953 00:00 [info] <0.222.0> Feature flags: [x] implicit_default_bindings
2021-10-14 09:38:26.159015 00:00 [info] <0.222.0> Feature flags: [x] maintenance_mode_status
2021-10-14 09:38:26.159037 00:00 [info] <0.222.0> Feature flags: [x] quorum_queue
2021-10-14 09:38:26.159078 00:00 [info] <0.222.0> Feature flags: [x] stream_queue
2021-10-14 09:38:26.159183 00:00 [info] <0.222.0> Feature flags: [x] user_limits
2021-10-14 09:38:26.159236 00:00 [info] <0.222.0> Feature flags: [x] virtual_host_metadata
2021-10-14 09:38:26.159270 00:00 [info] <0.222.0> Feature flags: feature flag states written to disk: yes
2021-10-14 09:38:26.830814 00:00 [noti] <0.44.0> Application syslog exited with reason: stopped
2021-10-14 09:38:26.830925 00:00 [noti] <0.222.0> Logging: switching to configured handler(s); following messages may not be visible in this log output
2021-10-14 09:38:26.852048 00:00 [noti] <0.222.0> Logging: configured log handlers are now ACTIVE
2021-10-14 09:38:33.754355 00:00 [info] <0.222.0> ra: starting system quorum_queues
2021-10-14 09:38:33.754526 00:00 [info] <0.222.0> starting Ra system: quorum_queues in directory: /var/lib/rabbitmq/mnesia/rabbit@statefulset-rabbitmq-0/quorum/rabbit@statefulset-rabbitmq-0
2021-10-14 09:38:33.760365 00:00 [info] <0.290.0> ra: meta data store initialised for system quorum_queues. 0 record(s) recovered
2021-10-14 09:38:33.761023 00:00 [noti] <0.302.0> WAL: ra_log_wal init, open tbls: ra_log_open_mem_tables, closed tbls: ra_log_closed_mem_tables
The feature flags are now marked with [x], as seen above. There are no other notable changes, so I still need help.
! New Issue !
Head over here.
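Answer from a uj5u.com user:
The pod is being OOMKilled (see "Last State" and "Reason" in the kubectl describe output above). You need to assign more resources (memory) to the pod.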
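As a rough sketch of what "more memory" could look like, the fragment below raises the memory request and limit on the rabbitmq container from the StatefulSet manifest in the question. The 512Mi and 1Gi figures are assumptions chosen only for illustration; they do not come from the original post, and the right values depend on your workload and on how much memory the local node can spare.

```yaml
# Fragment of the StatefulSet pod template; only the resources block changes.
# The memory values are illustrative assumptions, not tested recommendations.
containers:
  - name: rabbitmq
    image: rabbitmq:latest
    resources:
      requests:
        cpu: 250m
        memory: 512Mi   # assumed value, raised from 128Mi
      limits:
        cpu: 750m
        memory: 1Gi     # assumed value, raised from 256Mi
```

After applying the change, the "Last State" reason in kubectl describe should no longer read OOMKilled if the new limit is high enough. If it still does, the limit probably needs to go higher.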
