实用干货丨如何使用Prometheus配置自定义告警规则
2013 年 3 月 30 日
Name: prometheus-demo-prometheus-operator-prometheus-0 Namespace: monitoring Priority: 0 Node: gke-c-7dkls-default-0-c6ca178a-gmcq/10.132.0.15 Start Time: Wed, 11 Mar 2020 18:06:47 +0000 Labels: app=prometheus controller-revision-hash=prometheus-demo-prometheus-operator-prometheus-5ccbbd8578 prometheus=demo-prometheus-operator-prometheus statefulset.kubernetes.io/pod-name=prometheus-demo-prometheus-operator-prometheus-0 Annotations: <none> Status: Running IP: 10.40.0.7 IPs: <none> Controlled By: StatefulSet/prometheus-demo-prometheus-operator-prometheus Containers: prometheus: Container ID: docker://360db8a9f1cce8d72edd81fcdf8c03fe75992e6c2c59198b89807aa0ce03454c Image: quay.io/prometheus/prometheus:v2.15.2 Image ID: docker-pullable://quay.io/prometheus/prometheus@sha256:914525123cf76a15a6aaeac069fcb445ce8fb125113d1bc5b15854bc1e8b6353 Port: 9090/TCP Host Port: 0/TCP Args: --web.console.templates=/etc/prometheus/consoles --web.console.libraries=/etc/prometheus/console_libraries --config.file=/etc/prometheus/config_out/prometheus.env.yaml --storage.tsdb.path=/prometheus --storage.tsdb.retention.time=10d --web.enable-lifecycle --storage.tsdb.no-lockfile --web.external-url=http://demo-prometheus-operator-prometheus.monitoring:9090 --web.route-prefix=/ State: Running Started: Wed, 11 Mar 2020 18:07:07 +0000 Last State: Terminated Reason: Error Message: caller=main.go:648 msg="Starting TSDB ..." level=info ts=2020-03-11T18:07:02.185Z caller=web.go:506 component=web msg="Start listening for connections" address=0.0.0.0:9090 level=info ts=2020-03-11T18:07:02.192Z caller=head.go:584 component=tsdb msg="replaying WAL, this may take awhile" level=info ts=2020-03-11T18:07:02.192Z caller=head.go:632 component=tsdb msg="WAL segment loaded" segment=0 maxSegment=0 level=info ts=2020-03-11T18:07:02.194Z caller=main.go:663 fs_type=EXT4_SUPER_MAGIC level=info ts=2020-03-11T18:07:02.194Z caller=main.go:664 msg="TSDB started" level=info ts=2020-03-11T18:07:02.194Z caller=main.go:734 msg="Loading configuration file" filename=/etc/prometheus/config_out/prometheus.env.yaml level=info ts=2020-03-11T18:07:02.194Z caller=main.go:517 msg="Stopping scrape discovery manager..." level=info ts=2020-03-11T18:07:02.194Z caller=main.go:531 msg="Stopping notify discovery manager..." level=info ts=2020-03-11T18:07:02.194Z caller=main.go:553 msg="Stopping scrape manager..." level=info ts=2020-03-11T18:07:02.194Z caller=manager.go:814 component="rule manager" msg="Stopping rule manager..." level=info ts=2020-03-11T18:07:02.194Z caller=manager.go:820 component="rule manager" msg="Rule manager stopped" level=info ts=2020-03-11T18:07:02.194Z caller=main.go:513 msg="Scrape discovery manager stopped" level=info ts=2020-03-11T18:07:02.194Z caller=main.go:527 msg="Notify discovery manager stopped" level=info ts=2020-03-11T18:07:02.194Z caller=main.go:547 msg="Scrape manager stopped" level=info ts=2020-03-11T18:07:02.197Z caller=notifier.go:598 component=notifier msg="Stopping notification manager..." level=info ts=2020-03-11T18:07:02.197Z caller=main.go:718 msg="Notifier manager stopped" level=error ts=2020-03-11T18:07:02.197Z caller=main.go:727 err="error loading config from \"/etc/prometheus/config_out/prometheus.env.yaml\": couldn't load configuration (--config.file=\"/etc/prometheus/config_out/prometheus.env.yaml\"): open /etc/prometheus/config_out/prometheus.env.yaml: no such file or directory"
Exit Code: 1 Started: Wed, 11 Mar 2020 18:07:02 +0000 Finished: Wed, 11 Mar 2020 18:07:02 +0000 Ready: True Restart Count: 1 Liveness: http-get http://:web/-/healthy delay=0s timeout=3s period=5s #success=1 #failure=6 Readiness: http-get http://:web/-/ready delay=0s timeout=3s period=5s #success=1 #failure=120 Environment: <none> Mounts: /etc/prometheus/certs from tls-assets (ro) /etc/prometheus/config_out from config-out (ro) /etc/prometheus/rules/prometheus-demo-prometheus-operator-prometheus-rulefiles-0 from prometheus-demo-prometheus-operator-prometheus-rulefiles-0 (rw) /prometheus from prometheus-demo-prometheus-operator-prometheus-db (rw) /var/run/secrets/kubernetes.io/serviceaccount from demo-prometheus-operator-prometheus-token-jvbrr (ro) prometheus-config-reloader: Container ID: docker://de27cdad7067ebd5154c61b918401b2544299c161850daf3e317311d2d17af3d Image: quay.io/coreos/prometheus-config-reloader:v0.37.0 Image ID: docker-pullable://quay.io/coreos/prometheus-config-reloader@sha256:5e870e7a99d55a5ccf086063efd3263445a63732bc4c04b05cf8b664f4d0246e Port: <none> Host Port: <none> Command: /bin/prometheus-config-reloader Args: --log-format=logfmt --reload-url=http://127.0.0.1:9090/-/reload --config-file=/etc/prometheus/config/prometheus.yaml.gz --config-envsubst-file=/etc/prometheus/config_out/prometheus.env.yaml State: Running Started: Wed, 11 Mar 2020 18:07:04 +0000 Ready: True Restart Count: 0 Limits: cpu: 100m memory: 25Mi Requests: cpu: 100m memory: 25Mi Environment: POD_NAME: prometheus-demo-prometheus-operator-prometheus-0 (v1:metadata.name) Mounts: /etc/prometheus/config from config (rw) /etc/prometheus/config_out from config-out (rw) /var/run/secrets/kubernetes.io/serviceaccount from demo-prometheus-operator-prometheus-token-jvbrr (ro) rules-configmap-reloader: Container ID: docker://5804e45380ed1b5374a4c2c9ee4c9c4e365bee93b9ccd8b5a21f50886ea81a91 Image: quay.io/coreos/configmap-reload:v0.0.1 Image ID: docker-pullable://quay.io/coreos/configmap-reload@sha256:e2fd60ff0ae4500a75b80ebaa30e0e7deba9ad107833e8ca53f0047c42c5a057 Port: <none> Host Port: <none> Args: --webhook-url=http://127.0.0.1:9090/-/reload --volume-dir=/etc/prometheus/rules/prometheus-demo-prometheus-operator-prometheus-rulefiles-0 State: Running Started: Wed, 11 Mar 2020 18:07:06 +0000 Ready: True Restart Count: 0 Limits: cpu: 100m memory: 25Mi Requests: cpu: 100m memory: 25Mi Environment: <none> Mounts: /etc/prometheus/rules/prometheus-demo-prometheus-operator-prometheus-rulefiles-0 from prometheus-demo-prometheus-operator-prometheus-rulefiles-0 (rw) /var/run/secrets/kubernetes.io/serviceaccount from demo-prometheus-operator-prometheus-token-jvbrr (ro) Conditions: Type Status Initialized True Ready True ContainersReady True PodScheduled True Volumes: config: Type: Secret (a volume populated by a Secret) SecretName: prometheus-demo-prometheus-operator-prometheus Optional: false tls-assets: Type: Secret (a volume populated by a Secret) SecretName: prometheus-demo-prometheus-operator-prometheus-tls-assets Optional: false config-out: Type: EmptyDir (a temporary directory that shares a pod's lifetime) Medium: SizeLimit: prometheus-demo-prometheus-operator-prometheus-rulefiles-0: Type: ConfigMap (a volume populated by a ConfigMap) Name: prometheus-demo-prometheus-operator-prometheus-rulefiles-0 Optional: false prometheus-demo-prometheus-operator-prometheus-db: Type: EmptyDir (a temporary directory that shares a pod's lifetime) Medium: SizeLimit: demo-prometheus-operator-prometheus-token-jvbrr: Type: Secret (a volume populated by a Secret) SecretName: demo-prometheus-operator-prometheus-token-jvbrr Optional: false QoS Class: Burstable Node-Selectors: <none> Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s node.kubernetes.io/unreachable:NoExecute for 300s Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal Scheduled 4m51s default-scheduler Successfully assigned monitoring/prometheus-demo-prometheus-operator-prometheus-0 to gke-c-7dkls-default-0-c6ca178a-gmcq Normal Pulling 4m45s kubelet, gke-c-7dkls-default-0-c6ca178a-gmcq Pulling image "quay.io/prometheus/prometheus:v2.15.2" Normal Pulled 4m39s kubelet, gke-c-7dkls-default-0-c6ca178a-gmcq Successfully pulled image "quay.io/prometheus/prometheus:v2.15.2" Normal Pulling 4m36s kubelet, gke-c-7dkls-default-0-c6ca178a-gmcq Pulling image "quay.io/coreos/prometheus-config-reloader:v0.37.0" Normal Pulled 4m35s kubelet, gke-c-7dkls-default-0-c6ca178a-gmcq Successfully pulled image "quay.io/coreos/prometheus-config-reloader:v0.37.0" Normal Pulling 4m34s kubelet, gke-c-7dkls-default-0-c6ca178a-gmcq Pulling image "quay.io/coreos/configmap-reload:v0.0.1" Normal Started 4m34s kubelet, gke-c-7dkls-default-0-c6ca178a-gmcq Started container prometheus-config-reloader Normal Created 4m34s kubelet, gke-c-7dkls-default-0-c6ca178a-gmcq Created container prometheus-config-reloader Normal Pulled 4m33s kubelet, gke-c-7dkls-default-0-c6ca178a-gmcq Successfully pulled image "quay.io/coreos/configmap-reload:v0.0.1" Normal Created 4m32s (x2 over 4m36s) kubelet, gke-c-7dkls-default-0-c6ca178a-gmcq Created container prometheus Normal Created 4m32s kubelet, gke-c-7dkls-default-0-c6ca178a-gmcq Created container rules-configmap-reloader Normal Started 4m32s kubelet, gke-c-7dkls-default-0-c6ca178a-gmcq Started container rules-configmap-reloader Normal Pulled 4m32s kubelet, gke-c-7dkls-default-0-c6ca178a-gmcq Container image "quay.io/prometheus/prometheus:v2.15.2" already present on machineNormal Started 4m31s (x2 over 4m36s) kubelet, gke-c-7dkls-default-0-c6ca178a-gmcq Started container prometheus