Monitoring Go Services with Prometheus and Grafana

April 22, 2009

Environment

CentOS 7.0

Prometheus 2.14.0

Grafana 6.5.2

Download and install Prometheus

wget https://github.com/prometheus/prometheus/releases/download/v2.14.0/prometheus-2.14.0.linux-386.tar.gz

tar -xavf prometheus-2.14.0.linux-386.tar.gz

Start

The extracted directory already contains a default configuration file, prometheus.yml, which can be used as-is to start the server.

./prometheus --config.file=prometheus.yml

Open the host's IP on port 9090 (IP:9090) in a browser to see the Prometheus UI.

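Metric types

Counter: a cumulative value that only ever increases (it returns to zero only when the process restarts). It records how many times something has happened so far, typically the number of requests served or errors encountered.

Gauge: like a dial on a dashboard, it holds an instantaneous value that can go up or down freely. It is typically used for memory usage, disk usage, the number of open files, and so on.

Histogram: samples observations and counts them in configurable buckets, so the distribution of values across ranges can be computed; the data is usually displayed as a histogram. It is typically used for request or response durations.

Summary: also samples observations over a period of time, but it reports the total count, the sum, and quantiles calculated directly in the client rather than derived from bucket ranges.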
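The difference between a counter and a gauge can be sketched in plain Go. The counter and gauge types below are hypothetical stand-ins written for illustration, not the client library's types; in the real client_golang library, Counter.Add panics on a negative value instead of returning an error.

```go
package main

import (
	"errors"
	"fmt"
)

// counter is a hypothetical stand-in for a Prometheus Counter:
// its value may only increase.
type counter struct{ value float64 }

// add increases the counter and rejects negative deltas.
func (c *counter) add(delta float64) error {
	if delta < 0 {
		return errors.New("counter cannot decrease")
	}
	c.value += delta
	return nil
}

// gauge is a hypothetical stand-in for a Prometheus Gauge:
// its value is a snapshot that may move in either direction.
type gauge struct{ value float64 }

func (g *gauge) set(v float64)     { g.value = v }
func (g *gauge) add(delta float64) { g.value += delta }

func main() {
	var requests counter
	_ = requests.add(3)
	err := requests.add(-1) // rejected: counters never go down
	fmt.Println(requests.value, err)

	var openFiles gauge
	openFiles.set(10)
	openFiles.add(-4) // allowed: gauges can go down
	fmt.Println(openFiles.value)
}
```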

Grafana

Download

wget https://dl.grafana.com/oss/release/grafana-6.5.2-1.x86_64.rpm

Install

sudo yum localinstall grafana-6.5.2-1.x86_64.rpm

Start

systemctl daemon-reload 
systemctl start grafana-server
systemctl status grafana-server

Configuration file

The service's environment file is /etc/sysconfig/grafana-server:

GRAFANA_USER=grafana
GRAFANA_GROUP=grafana
GRAFANA_HOME=/usr/share/grafana
LOG_DIR=/var/log/grafana
DATA_DIR=/var/lib/grafana
MAX_OPEN_FILES=10000
CONF_DIR=/etc/grafana
CONF_FILE=/etc/grafana/grafana.ini
RESTART_ON_UPGRADE=true
PLUGINS_DIR=/var/lib/grafana/plugins
PROVISIONING_CFG_DIR=/etc/grafana/provisioning
# Only used on systemd systems
PID_FILE_DIR=/var/run/grafana

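Access

Open IP:3000 in a browser. For the first login, both the username and the password are admin.

After logging in, you are prompted to create your first data source (create your first data source).

Then create a new dashboard.

Examples

The following examples show what this looks like in practice.

For the test code, see the example code.

Counter

This example monitors the number of RPC calls. A counter's value keeps accumulating.

Go code, key parts: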

//Create a new CounterVec
rpcCounter = prometheus.NewCounterVec(
    prometheus.CounterOpts{
        Name: "rpc_counter",
        Help: "RPC counts",
    },
    []string{"api"},
)
    
// Register the collector with the default registry
prometheus.MustRegister(rpcCounter)

//Add the given value to counter
rpcCounter.WithLabelValues("api_bookcontent").Add(float64(rand.Int31n(50)))
rpcCounter.WithLabelValues("api_chapterlist").Add(float64(rand.Int31n(10)))

Add the following under scrape_configs in the Prometheus configuration file:

  - job_name: 'req-monitor'
    static_configs:
      - targets: ['localhost:8082']
        labels:
          group: 'newgroup1'

Restart Prometheus

ps aux | grep prometheus
kill xxxx    # xxxx is the Prometheus PID; plain kill sends SIGTERM for a clean shutdown

./prometheus --config.file=prometheus.yml

Build the program (to run on Linux)

GOOS=linux go build

Run

./prometheus_rpc_http -listen-address=:8082 &

View it in Prometheus

Create a new dashboard in Grafana

The query is rate(rpc_counter[1m]), which gives the per-second average rate of increase of rpc_counter over the last minute.
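What rate() computes can be sketched in plain Go. This is a simplified illustration with made-up sample values: it divides the total increase by the elapsed time and treats a drop in value as a counter reset, but it leaves out the extrapolation to the window boundaries that Prometheus itself performs.

```go
package main

import "fmt"

// sample is one scraped counter value with its timestamp in seconds.
type sample struct {
	ts    float64
	value float64
}

// perSecondRate returns the average per-second increase across the samples.
// A drop in value is treated as a counter reset (the process restarted),
// so the value after the reset is counted as new increase.
func perSecondRate(samples []sample) float64 {
	if len(samples) < 2 {
		return 0
	}
	increase := 0.0
	for i := 1; i < len(samples); i++ {
		delta := samples[i].value - samples[i-1].value
		if delta < 0 { // counter reset
			delta = samples[i].value
		}
		increase += delta
	}
	elapsed := samples[len(samples)-1].ts - samples[0].ts
	if elapsed <= 0 {
		return 0
	}
	return increase / elapsed
}

func main() {
	// Hypothetical scrapes of rpc_counter taken 15 seconds apart.
	window := []sample{{0, 100}, {15, 130}, {30, 190}, {45, 220}, {60, 280}}
	// The counter grew by 180 in 60 seconds, i.e. 3 per second.
	fmt.Println(perSecondRate(window))
}
```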

The graph shows two lines, api="api_bookcontent" and api="api_chapterlist", which are exactly the label values set in the code through rpcCounter.WithLabelValues().

Gauge

Go code, key parts:

rpcReqSize = prometheus.NewGaugeVec(
    prometheus.GaugeOpts{
        Name: "rpc_req_size",
        Help: "RPC request size",
    },
    []string{"api"},
)
    
prometheus.MustRegister(rpcReqSize)

rpcReqSize.WithLabelValues("api_bookcontent").Set(float64(rand.Int31n(8000)))
rpcReqSize.WithLabelValues("api_chapterlist").Set(float64(rand.Int31n(5000)))

View it in Prometheus

Create a new dashboard in Grafana

Histogram

Go code, key parts:

httpReqDurationsHistogram = prometheus.NewHistogramVec(
    prometheus.HistogramOpts{
        Name: "http_req_durations_histogram",
        Help: "http req latency distributions.",
        // 4 buckets, starting from 0.1 and adding 0.5 between each bucket
        Buckets: prometheus.LinearBuckets(0.1, 0.5, 4),
    },
    []string{"http_req_histogram"},
)


prometheus.MustRegister(httpReqDurationsHistogram)

v := rand.Float64()
httpReqDurationsHistogram.WithLabelValues("booksvc_req").Observe(1.5 * v)

View it in Prometheus

The code defines 4 buckets, and the graph shows the four corresponding bucket series (le="0.1", le="0.6", le="1.1", le="1.6").
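How the four bounds arise, and why bucket values are cumulative, can be sketched in plain Go. The functions linearBuckets and cumulativeCounts are illustrative helpers written for this sketch (not the client library's code), and the observations are made up. The extra final element stands for the le="+Inf" bucket that Prometheus adds to every histogram.

```go
package main

import "fmt"

// linearBuckets returns count upper bounds: start, start+width, and so on.
func linearBuckets(start, width float64, count int) []float64 {
	bounds := make([]float64, count)
	for i := range bounds {
		bounds[i] = start
		start += width
	}
	return bounds
}

// cumulativeCounts returns, for each bound, how many observations are <= it.
// The final extra element plays the role of the le="+Inf" bucket.
func cumulativeCounts(observations, bounds []float64) []int {
	counts := make([]int, len(bounds)+1)
	for _, v := range observations {
		for i, le := range bounds {
			if v <= le {
				counts[i]++
			}
		}
		counts[len(bounds)]++ // every observation falls into +Inf
	}
	return counts
}

func main() {
	bounds := linearBuckets(0.1, 0.5, 4)
	// Hypothetical request durations in seconds.
	observations := []float64{0.05, 0.3, 0.7, 0.9, 1.2, 1.45}
	fmt.Printf("%.1f %v\n", bounds, cumulativeCounts(observations, bounds))
}
```

With these six observations the cumulative counts are 1, 2, 4, 6 and 6: each bucket includes everything counted by the smaller buckets before it.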

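Create a new dashboard in Grafana

The query is rate(http_req_durations_histogram_bucket[30s]),

which computes the per-second average rate of increase of each http_req_durations_histogram_bucket series over the last 30 seconds.

According to the values shown, responses within 0.1 s account for 1.3%, within 0.6 s for 17.3%, within 1.1 s for 34.7%, and within 1.6 s for 60%.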
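One way to read bucket values as shares of all requests is to divide each cumulative bucket by the le="+Inf" bucket, which counts every observation. The sketch below assumes this approach; the function name shareWithin and the input rates are invented for illustration and are not taken from the graph.

```go
package main

import "fmt"

// shareWithin returns the fraction of requests at or below each bucket bound,
// given cumulative bucket values whose last element is the le="+Inf" bucket.
func shareWithin(cumulative []float64) []float64 {
	shares := make([]float64, len(cumulative))
	if len(cumulative) == 0 {
		return shares
	}
	total := cumulative[len(cumulative)-1]
	if total == 0 {
		return shares // no requests observed
	}
	for i, c := range cumulative {
		shares[i] = c / total
	}
	return shares
}

func main() {
	// Hypothetical per-second rates for le=0.1, 0.6, 1.1, 1.6 and +Inf.
	rates := []float64{0.1, 0.8, 1.5, 2.0, 2.0}
	for i, s := range shareWithin(rates) {
		fmt.Printf("bucket %d: %.0f%%\n", i, s*100)
	}
}
```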

Summary

Go code, key parts:

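rpcDurations = prometheus.NewSummaryVec(
    prometheus.SummaryOpts{
        Name: "rpc_durations_seconds",
        Help: "RPC latency distributions.",
        // Each objective maps a quantile to its allowed absolute error,
        // e.g. the 0.9 quantile may be off by at most 0.01.
        Objectives: map[float64]float64{0.5: 0.05, 0.9: 0.01, 0.99: 0.001},
    },
    []string{"service"},
)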
    
prometheus.MustRegister(rpcDurations)

v = rand.Float64()
rpcDurations.WithLabelValues("user_rpc").Observe(v)

v = 0.5 + rand.Float64()
rpcDurations.WithLabelValues("book_rpc").Observe(v)

v = 1.0 + rand.Float64()
rpcDurations.WithLabelValues("bookshelf_rpc").Observe(v)

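View it in Prometheus

Create a new dashboard in Grafana

The query is rate(rpc_durations_seconds_sum[1m]), the per-second rate at which the total observed duration grows over the last minute.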
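A summary also exposes a rpc_durations_seconds_count series next to rpc_durations_seconds_sum. Dividing the growth of the sum by the growth of the count gives the mean duration per call, which is often easier to read than the raw rate of the sum. The sketch below shows that arithmetic in plain Go; the summarySample type and the scrape values are hypothetical.

```go
package main

import "fmt"

// summarySample holds the _sum and _count series of a summary at one scrape.
type summarySample struct {
	sum   float64 // total observed seconds so far
	count float64 // number of observations so far
}

// meanLatency returns the average duration of the observations
// made between two scrapes.
func meanLatency(prev, cur summarySample) float64 {
	n := cur.count - prev.count
	if n <= 0 {
		return 0 // no new observations (or a reset) in this interval
	}
	return (cur.sum - prev.sum) / n
}

func main() {
	// Hypothetical scrapes of rpc_durations_seconds_sum and _count.
	prev := summarySample{sum: 12.0, count: 20}
	cur := summarySample{sum: 18.0, count: 30}
	// 6 seconds spread over 10 calls.
	fmt.Println(meanLatency(prev, cur))
}
```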

演道网

About The Author

php
