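Kafka Stress Test
I. Test Objectives
This performance test stress-tests Kafka's ability to handle MQ messages on a single server in the production environment. It covers both writing MQ messages to Kafka and consuming them. Based on the results at 100,000, 1 million, and 10 million messages, it evaluates whether Kafka's performance meets the project's requirements. (The project expects Kafka to handle MQ messages on the order of 100 million or more.)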
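II. Test Scope and Method
2.1 Overview of the test scope
The tests use the test scripts bundled with Kafka, issuing write and consume requests to Kafka from the command line. They simulate writing and consuming MQ messages at different orders of magnitude, and the results are used to evaluate whether Kafka can handle 100 million or more messages.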
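2.2 Performance test scenario design
2.2.1 Kafka message write stress test
Test scenario | MQ messages | Target write rate (messages/s) | Record size (bytes)
---|---|---|---
Write test | 100,000 | 2000 | 1000
Write test | 1 million | 5000 | 1000
Write test | 10 million | 5000 | 1000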
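2.2.2 Kafka message consumption stress test
Test scenario | MQ messages consumed
---|---
Kafka message consumption test | 100,000
Kafka message consumption test | 1 million
Kafka message consumption test | 10 million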
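2.3 Brief description of the test method
2.3.1 Test objective
Verify the message write and consume capacity of Kafka on a single server, and use the results to evaluate whether the current Kafka cluster setup can handle message volumes on the order of 100 million.
2.3.2 Test method
On the server, use Kafka's bundled test scripts to simulate write requests of 100,000, 1 million, and 10 million messages, and observe Kafka's processing capacity at each scale, including messages produced per second, throughput, and message latency. The write tests create a topic named test_perf. Consume requests are then issued against that topic from the command line to observe Kafka's capacity when consuming different volumes of messages.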
Stress test commands:
Test item | Messages | Command
---|---|---
Write MQ messages | 100,000 | ./kafka-producer-perf-test.sh --topic test_perf --num-records 100000 --record-size 1000 --throughput 2000 --producer-props bootstrap.servers=10.150.30.60:9092
Write MQ messages | 1 million | ./kafka-producer-perf-test.sh --topic test_perf --num-records 1000000 --record-size 1000 --throughput 5000 --producer-props bootstrap.servers=10.150.30.60:9092
Write MQ messages | 10 million | ./kafka-producer-perf-test.sh --topic test_perf --num-records 10000000 --record-size 1000 --throughput 5000 --producer-props bootstrap.servers=10.150.30.60:9092
Consume MQ messages | 100,000 | ./kafka-consumer-perf-test.sh --broker-list localhost:9092 --topic test_perf --fetch-size 1048576 --messages 100000 --threads 1
Consume MQ messages | 1 million | ./kafka-consumer-perf-test.sh --broker-list localhost:9092 --topic test_perf --fetch-size 1048576 --messages 1000000 --threads 1
Consume MQ messages | 10 million | ./kafka-consumer-perf-test.sh --broker-list localhost:9092 --topic test_perf --fetch-size 1048576 --messages 10000000 --threads 1
Script directory: the bin directory of the Kafka installation on the server.
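The command lines differ only in message count and target rate, so they can be generated rather than typed by hand. The following is a minimal Python sketch under that assumption; the helper names producer_perf_cmd and consumer_perf_cmd are hypothetical, and the flags are only those that appear in the commands above. The record size of 1000 bytes follows the scenario table in 2.2.1.

```python
# Hypothetical helpers that assemble the perf-test command lines.
# Only flags already used in this report are emitted.

def producer_perf_cmd(num_records, record_size, throughput, bootstrap_servers):
    """Return the argv list for one kafka-producer-perf-test.sh run."""
    return [
        "./kafka-producer-perf-test.sh",
        "--topic", "test_perf",
        "--num-records", str(num_records),
        "--record-size", str(record_size),
        "--throughput", str(throughput),
        "--producer-props", f"bootstrap.servers={bootstrap_servers}",
    ]


def consumer_perf_cmd(messages, broker_list="localhost:9092"):
    """Return the argv list for one kafka-consumer-perf-test.sh run."""
    return [
        "./kafka-consumer-perf-test.sh",
        "--broker-list", broker_list,
        "--topic", "test_perf",
        "--fetch-size", "1048576",
        "--messages", str(messages),
        "--threads", "1",
    ]


# (message count, target messages per second) for the three write scenarios
WRITE_SCENARIOS = [(100_000, 2000), (1_000_000, 5000), (10_000_000, 5000)]

for count, rate in WRITE_SCENARIOS:
    print(" ".join(producer_perf_cmd(count, 1000, rate, "10.150.30.60:9092")))
    print(" ".join(consumer_perf_cmd(count)))
```

The argv lists could be passed to subprocess.run from the Kafka bin directory; the sketch only prints them so that it has no side effects.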
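III. Test Environment
3.1 Test machine configuration
Host | Quantity | Resources | Operating system
---|---|---|---
MQ message service/processing | 1 | Hardware: 1 core, 4 GB, 40 GB. Software: standalone Kafka (kafka_2.12-2.1.0) | ubuntu-16.04.5-server-amd64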
3.2 Test tools
Kafka stress test tool | Test scripts bundled with Kafka
---|---
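3.3 Setting up the test environment
Only a standalone Kafka instance is used here. For a quick setup, it runs with the bundled ZooKeeper.
Create a directory:
mkdir /opt/kafka_server_test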
dockerfile
FROM ubuntu:16.04
# Switch the apt sources to the Aliyun mirror
ADD sources.list /etc/apt/sources.list
ADD kafka_2.12-2.1.0.tgz /
# Install the JDK
RUN apt-get update && apt-get install -y openjdk-8-jdk --allow-unauthenticated && apt-get clean all
EXPOSE 9092
# Add the startup script
ADD run.sh .
RUN chmod 755 run.sh
ENTRYPOINT ["/run.sh"]
run.sh
#!/bin/bash
# Start the bundled ZooKeeper
cd /kafka_2.12-2.1.0
bin/zookeeper-server-start.sh config/zookeeper.properties &
# Start Kafka
sleep 3
bin/kafka-server-start.sh config/server.properties
sources.list
deb http://mirrors.aliyun.com/ubuntu/ xenial main restricted
deb http://mirrors.aliyun.com/ubuntu/ xenial-updates main restricted
deb http://mirrors.aliyun.com/ubuntu/ xenial universe
deb http://mirrors.aliyun.com/ubuntu/ xenial-updates universe
deb http://mirrors.aliyun.com/ubuntu/ xenial multiverse
deb http://mirrors.aliyun.com/ubuntu/ xenial-updates multiverse
deb http://mirrors.aliyun.com/ubuntu/ xenial-backports main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu xenial-security main restricted
deb http://mirrors.aliyun.com/ubuntu xenial-security universe
deb http://mirrors.aliyun.com/ubuntu xenial-security multiverse
The directory structure is as follows:
./
├── dockerfile
├── kafka_2.12-2.1.0.tgz
├── run.sh
└── sources.list
Build the image:
docker build -t kafka_server_test /opt/kafka_server_test
Start Kafka:
docker run -d -it kafka_server_test
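IV. Test Results
4.1 Notes on the results
This test stress-tests Kafka's message handling capacity. One server of the Kafka cluster is stress-tested as the MQ message service, with attention to whether message write latency meets requirements. The same server is also stress-tested for MQ message processing to verify Kafka's processing capacity.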
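4.2.1 Writing MQ messages
Test item | Total messages | Message size (bytes) | Target send rate (messages/s) | Actual write rate (messages/s) | 95th-percentile latency (ms)
---|---|---|---|---|---
Write MQ messages | 100,000 | 1000 | 2000 | 1999.84 | 1
Write MQ messages | 1 million | 1000 | 5000 | 4999.84 | 1
Write MQ messages | 10 million | 1000 | 5000 | 4999.99 | 1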
Stress test results
The Kafka container was started above. Check that it is running:
root@ubuntu:/opt# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
5ced2eb77349 kafka_server_test "/run.sh" 34 minutes ago Up 34 minutes 0.0.0.0:2181->2181/tcp, 0.0.0.0:9092->9092/tcp youthful_bhaskara
Enter Kafka's bin directory:
root@ubuntu:/opt# docker exec -it 5ced2eb77349 /bin/bash
root@5ced2eb77349:/# cd /kafka_2.12-2.1.0/
root@5ced2eb77349:/kafka_2.12-2.1.0# cd bin/
1. Results for writing 100,000 messages
Command:
./kafka-producer-perf-test.sh --topic test_perf --num-records 100000 --record-size 1000 --throughput 2000 --producer-props bootstrap.servers=localhost:9092
Output:
records sent, 1202.4 records/sec (1.15 MB/sec), 1678.8 ms avg latency, 2080.0 max latency.
records sent, 2771.8 records/sec (2.64 MB/sec), 1300.4 ms avg latency, 2344.0 max latency.
records sent, 2061.6 records/sec (1.97 MB/sec), 17.1 ms avg latency, 188.0 max latency.
records sent, 1976.6 records/sec (1.89 MB/sec), 10.0 ms avg latency, 177.0 max latency.
records sent, 2025.2 records/sec (1.93 MB/sec), 15.4 ms avg latency, 253.0 max latency.
records sent, 2000.8 records/sec (1.91 MB/sec), 6.1 ms avg latency, 163.0 max latency.
records sent, 1929.7 records/sec (1.84 MB/sec), 3.7 ms avg latency, 128.0 max latency.
records sent, 2072.0 records/sec (1.98 MB/sec), 14.1 ms avg latency, 163.0 max latency.
records sent, 2001.6 records/sec (1.91 MB/sec), 4.5 ms avg latency, 116.0 max latency.
records sent, 1997.602877 records/sec (1.91 MB/sec), 290.41 ms avg latency, 2344.00 ms max latency, 2 ms 50th, 1992 ms 95th, 2177 ms 99th, 2292 ms 99.9th.
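The first two intervals of this run show average latencies above one second, while later intervals settle into the low tens of milliseconds or below. A minimal Python sketch for separating the two phases is given here. It assumes the first two interval lines count as warm-up, which is a choice made for illustration and not something the tool reports; the function name interval_avg_latencies is hypothetical. The leading record counts are absent from the captured log, so the lines are matched from the rate onward.

```python
import re
from statistics import median

# Matches interval lines only; the final summary line has "ms max latency"
# and is therefore not matched by this pattern.
INTERVAL = re.compile(
    r"([\d.]+) records/sec \(([\d.]+) MB/sec\), "
    r"([\d.]+) ms avg latency, ([\d.]+) max latency"
)


def interval_avg_latencies(log_text):
    """Return the average latency (ms) of each interval line, in order."""
    return [float(m.group(3)) for m in INTERVAL.finditer(log_text)]


LOG_100K = """\
records sent, 1202.4 records/sec (1.15 MB/sec), 1678.8 ms avg latency, 2080.0 max latency.
records sent, 2771.8 records/sec (2.64 MB/sec), 1300.4 ms avg latency, 2344.0 max latency.
records sent, 2061.6 records/sec (1.97 MB/sec), 17.1 ms avg latency, 188.0 max latency.
records sent, 1976.6 records/sec (1.89 MB/sec), 10.0 ms avg latency, 177.0 max latency.
records sent, 2025.2 records/sec (1.93 MB/sec), 15.4 ms avg latency, 253.0 max latency.
records sent, 2000.8 records/sec (1.91 MB/sec), 6.1 ms avg latency, 163.0 max latency.
records sent, 1929.7 records/sec (1.84 MB/sec), 3.7 ms avg latency, 128.0 max latency.
records sent, 2072.0 records/sec (1.98 MB/sec), 14.1 ms avg latency, 163.0 max latency.
records sent, 2001.6 records/sec (1.91 MB/sec), 4.5 ms avg latency, 116.0 max latency.
"""

WARMUP_INTERVALS = 2  # assumption: treat the first two intervals as warm-up

latencies = interval_avg_latencies(LOG_100K)
steady = latencies[WARMUP_INTERVALS:]
print("warm-up avg latencies (ms):", latencies[:WARMUP_INTERVALS])
print("steady-state median avg latency (ms):", median(steady))
```

The median is used because the record count per interval is not available in the captured output, so a properly weighted mean cannot be computed from it.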
2. Results for writing 1 million messages
Command:
./kafka-producer-perf-test.sh --topic test_perf --num-records 1000000 --record-size 1000 --throughput 5000 --producer-props bootstrap.servers=localhost:9092
Output:
records sent, 2158.5 records/sec (2.06 MB/sec), 2134.9 ms avg latency, 2869.0 max latency.
records sent, 7868.4 records/sec (7.50 MB/sec), 1459.2 ms avg latency, 2815.0 max latency.
records sent, 4991.0 records/sec (4.76 MB/sec), 20.3 ms avg latency, 197.0 max latency.
records sent, 4972.3 records/sec (4.74 MB/sec), 61.8 ms avg latency, 395.0 max latency.
records sent, 4880.2 records/sec (4.65 MB/sec), 64.7 ms avg latency, 398.0 max latency.
records sent, 5085.9 records/sec (4.85 MB/sec), 17.7 ms avg latency, 180.0 max latency.
records sent, 5030.8 records/sec (4.80 MB/sec), 14.7 ms avg latency, 157.0 max latency.
records sent, 5056.0 records/sec (4.82 MB/sec), 1.4 ms avg latency, 58.0 max latency.
records sent, 5001.0 records/sec (4.77 MB/sec), 0.8 ms avg latency, 58.0 max latency.
records sent, 5002.0 records/sec (4.77 MB/sec), 0.6 ms avg latency, 25.0 max latency.
records sent, 5000.0 records/sec (4.77 MB/sec), 0.6 ms avg latency, 14.0 max latency.
records sent, 5002.0 records/sec (4.77 MB/sec), 0.6 ms avg latency, 19.0 max latency.
records sent, 5005.0 records/sec (4.77 MB/sec), 1.2 ms avg latency, 57.0 max latency.
records sent, 5003.0 records/sec (4.77 MB/sec), 1.3 ms avg latency, 55.0 max latency.
records sent, 5000.0 records/sec (4.77 MB/sec), 0.9 ms avg latency, 44.0 max latency.
records sent, 5003.0 records/sec (4.77 MB/sec), 0.6 ms avg latency, 49.0 max latency.
records sent, 4988.0 records/sec (4.76 MB/sec), 1.1 ms avg latency, 49.0 max latency.
records sent, 5014.0 records/sec (4.78 MB/sec), 0.8 ms avg latency, 44.0 max latency.
records sent, 5001.0 records/sec (4.77 MB/sec), 0.5 ms avg latency, 10.0 max latency.
records sent, 5009.8 records/sec (4.78 MB/sec), 0.5 ms avg latency, 25.0 max latency.
records sent, 5001.2 records/sec (4.77 MB/sec), 0.5 ms avg latency, 7.0 max latency.
records sent, 5002.0 records/sec (4.77 MB/sec), 0.5 ms avg latency, 49.0 max latency.
records sent, 5005.0 records/sec (4.77 MB/sec), 0.6 ms avg latency, 25.0 max latency.
records sent, 5006.0 records/sec (4.77 MB/sec), 0.5 ms avg latency, 14.0 max latency.
records sent, 5005.0 records/sec (4.77 MB/sec), 0.5 ms avg latency, 19.0 max latency.
records sent, 4976.1 records/sec (4.75 MB/sec), 0.6 ms avg latency, 14.0 max latency.
records sent, 5036.0 records/sec (4.80 MB/sec), 0.6 ms avg latency, 18.0 max latency.
records sent, 4999.8 records/sec (4.77 MB/sec), 0.5 ms avg latency, 14.0 max latency.
records sent, 4980.2 records/sec (4.75 MB/sec), 0.5 ms avg latency, 14.0 max latency.
records sent, 5026.0 records/sec (4.79 MB/sec), 0.5 ms avg latency, 14.0 max latency.
records sent, 5003.0 records/sec (4.77 MB/sec), 0.4 ms avg latency, 10.0 max latency.
records sent, 5000.0 records/sec (4.77 MB/sec), 0.5 ms avg latency, 16.0 max latency.
records sent, 5007.0 records/sec (4.78 MB/sec), 0.5 ms avg latency, 42.0 max latency.
records sent, 5001.0 records/sec (4.77 MB/sec), 0.5 ms avg latency, 24.0 max latency.
records sent, 5002.0 records/sec (4.77 MB/sec), 0.5 ms avg latency, 14.0 max latency.
records sent, 5009.0 records/sec (4.78 MB/sec), 0.5 ms avg latency, 10.0 max latency.
records sent, 5006.0 records/sec (4.77 MB/sec), 0.5 ms avg latency, 18.0 max latency.
records sent, 5001.0 records/sec (4.77 MB/sec), 0.4 ms avg latency, 6.0 max latency.
records sent, 5000.0 records/sec (4.77 MB/sec), 128.2 ms avg latency, 955.0 max latency.
records sent, 4999.375078 records/sec (4.77 MB/sec), 88.83 ms avg latency, 2869.00 ms max latency, 1 ms 50th, 327 ms 95th, 2593 ms 99th, 2838 ms 99.9th.
3. Results for writing 10 million messages
Command:
./kafka-producer-perf-test.sh --topic test_perf --num-records 10000000 --record-size 1000 --throughput 5000 --producer-props bootstrap.servers=localhost:9092
Output:
records sent, 1053.0 records/sec (1.00 MB/sec), 1952.7 ms avg latency, 3057.0 max latency.
records sent, 4173.8 records/sec (3.98 MB/sec), 4585.7 ms avg latency, 5256.0 max latency.
records sent, 9765.2 records/sec (9.31 MB/sec), 2621.9 ms avg latency, 4799.0 max latency.
…
records sent, 5000.8 records/sec (4.77 MB/sec), 0.6 ms avg latency, 79.0 max latency.
records sent, 4999.2 records/sec (4.77 MB/sec), 0.5 ms avg latency, 54.0 max latency.
records sent, 5003.0 records/sec (4.77 MB/sec), 0.5 ms avg latency, 19.0 max latency.
records sent, 4996.445029 records/sec (4.76 MB/sec), 310.11 ms avg latency, 22474.00 ms max latency, 1 ms 50th, 1237 ms 95th, 7188 ms 99th, 20824 ms 99.9th.
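The last line of each run is a summary that carries the latency percentiles. To compare the three runs side by side, the summary lines can be parsed as in the Python sketch below. The function name parse_summary and the dictionary keys are hypothetical; the three input strings are the summary lines from the outputs above, from the rate onward.

```python
import re

# Pattern for the final summary line of kafka-producer-perf-test.sh,
# as it appears in the captured output of this report.
SUMMARY = re.compile(
    r"([\d.]+) records/sec \(([\d.]+) MB/sec\), ([\d.]+) ms avg latency, "
    r"([\d.]+) ms max latency, (\d+) ms 50th, (\d+) ms 95th, "
    r"(\d+) ms 99th, (\d+) ms 99\.9th"
)

KEYS = ("records_per_sec", "mb_per_sec", "avg_ms", "max_ms",
        "p50_ms", "p95_ms", "p99_ms", "p999_ms")


def parse_summary(line):
    """Parse one summary line into a dict of floats keyed by KEYS."""
    match = SUMMARY.search(line)
    if match is None:
        raise ValueError("not a producer summary line")
    return dict(zip(KEYS, map(float, match.groups())))


SUMMARIES = {
    "100,000": "1997.602877 records/sec (1.91 MB/sec), 290.41 ms avg latency, "
               "2344.00 ms max latency, 2 ms 50th, 1992 ms 95th, 2177 ms 99th, 2292 ms 99.9th.",
    "1 million": "4999.375078 records/sec (4.77 MB/sec), 88.83 ms avg latency, "
                 "2869.00 ms max latency, 1 ms 50th, 327 ms 95th, 2593 ms 99th, 2838 ms 99.9th.",
    "10 million": "4996.445029 records/sec (4.76 MB/sec), 310.11 ms avg latency, "
                  "22474.00 ms max latency, 1 ms 50th, 1237 ms 95th, 7188 ms 99th, 20824 ms 99.9th.",
}

for label, line in SUMMARIES.items():
    s = parse_summary(line)
    print(f"{label}: p50={s['p50_ms']:.0f} ms, "
          f"p95={s['p95_ms']:.0f} ms, max={s['max_ms']:.0f} ms")
```

In all three runs the median latency is 1 to 2 ms while the 95th percentile is in the hundreds or thousands of milliseconds, which suggests that the tail is dominated by the slow intervals at the start of each run.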
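Parameters of the kafka-producer-perf-test.sh script (using the 1-million-message write as the example):
--topic: the topic name, test_perf in this example.
--num-records: the total number of messages to send, 1000000 in this example.
--record-size: the size of each record in bytes, 1000 in this example.
--throughput: the target upper limit on records sent per second, 5000 in this example.
--producer-props bootstrap.servers=localhost:9092: the producer configuration. This test uses one server of the cluster as the sender. The broker address and port come from the listeners setting in server.properties under Kafka's config directory (/usr/local/kafka/config for this project); the default port is 9092.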
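Interpreting the MQ message write results:
Taking the 1-million-message write as the example, on average 4.77 MB of data was written to Kafka per second, which is about 4999.375 messages per second. The average latency per write was 88.83 ms and the maximum latency was 2869 ms.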
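The MB/sec figure follows directly from the message rate and the record size. The short Python sketch below reproduces it, assuming 1 MB is counted as 1024 × 1024 bytes, which is consistent with the numbers reported in all three runs. The function name mb_per_sec is hypothetical.

```python
def mb_per_sec(records_per_sec, record_size_bytes):
    """Convert a message rate into MB/sec, with 1 MB = 1024 * 1024 bytes."""
    return records_per_sec * record_size_bytes / (1024 * 1024)


# Summary rates from the three write runs, all with 1000-byte records
for rate in (1997.602877, 4999.375078, 4996.445029):
    print(f"{rate} records/sec -> {mb_per_sec(rate, 1000):.2f} MB/sec")
```

For the 1-million-message run this gives 4999.375078 × 1000 / 1048576 ≈ 4.77 MB/sec, matching the summary line.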
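4.2.2 Consuming MQ messages
Test item | Total messages consumed | Data consumed (MB) | Data consumed per second (MB) | Messages consumed per second | Elapsed time (s)
---|---|---|---|---|---
Consume MQ messages | 100,000 | 95.36 | 137 | 143899.3 | 0.695
Consume MQ messages | 1 million | 953.66 | 177.19 | 185804.5 | 5.38
Consume MQ messages | 10 million | 9536.73 | 198.25 | 207878.6 | 48.11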
Stress test results
1. Results for consuming 100,000 messages
./kafka-consumer-perf-test.sh --broker-list localhost:9092 --topic test_perf --fetch-size 1048576 --messages 100000 --threads 1
Note: this script has no --zookeeper option; the reference link is wrong on this point.
The 100,000-message write must be run before the command above. Otherwise the command fails with the following error:
[2018-12-06 05:47:52,832] WARN [Consumer clientId=consumer-1, groupId=perf-consumer-19548] Error while fetching metadata with correlation id 18 : {test_perf=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
WARNING: Exiting before consuming the expected number of messages: timeout (10000 ms) exceeded. You can use the --timeout option to increase the timeout.
Normal output:
start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec, rebalance.time.ms, fetch.time.ms, fetch.MB.sec, fetch.nMsg.sec
2018-12-06 05:50:41:276, 2018-12-06 05:50:45:281, 95.3674, 23.8121, 100000, 24968.7890, 78, 3927, 24.2851, 25464.7313
2. Results for consuming 1 million messages
./kafka-consumer-perf-test.sh --broker-list localhost:9092 --topic test_perf --fetch-size 1048576 --messages 1000000 --threads 1
Output:
start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec, rebalance.time.ms, fetch.time.ms, fetch.MB.sec, fetch.nMsg.sec
2018-12-06 05:59:32:360, 2018-12-06 05:59:51:624, 954.0758, 49.5264, 1000421, 51932.1532, 41, 19223, 49.6320, 52042.9173
3. Results for consuming 10 million messages
./kafka-consumer-perf-test.sh --broker-list localhost:9092 --topic test_perf --fetch-size 1048576 --messages 10000000 --threads 1
Output:
start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec, rebalance.time.ms, fetch.time.ms, fetch.MB.sec, fetch.nMsg.sec
2018-12-06 06:35:54:143, 2018-12-06 06:38:05:585, 9536.9539, 72.5564, 10000221, 76080.8646, 39, 131403, 72.5779, 76103.4451
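Parameters of the kafka-consumer-perf-test.sh script:
--broker-list: the Kafka connection information, localhost:9092 in this example.
--topic: the topic name, test_perf in this example, that is, the messages written in 4.2.1.
--fetch-size: the amount of data per fetch, 1048576 in this example, that is, 1 MB.
--messages: the total number of messages to consume, 1000000 (1 million) in this example.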
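Taking the 1-million-message consumption as the example, 954.07 MB of data was consumed in total at 49.52 MB per second, and 1000421 messages were consumed in total at 51932.15 messages per second.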
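The per-second figures can be cross-checked against the start and end timestamps on the same output line. The Python sketch below does this for the 1-million-message run; the helper names parse_row and elapsed_seconds are hypothetical, and the timestamp format string is inferred from the captured output.

```python
from datetime import datetime

HEADER = ("start.time, end.time, data.consumed.in.MB, MB.sec, "
          "data.consumed.in.nMsg, nMsg.sec, rebalance.time.ms, "
          "fetch.time.ms, fetch.MB.sec, fetch.nMsg.sec")
ROW = ("2018-12-06 05:59:32:360, 2018-12-06 05:59:51:624, 954.0758, 49.5264, "
       "1000421, 51932.1532, 41, 19223, 49.6320, 52042.9173")


def parse_row(header, row):
    """Pair each column name with its value from one output line."""
    names = [h.strip() for h in header.split(",")]
    values = [v.strip() for v in row.split(",")]
    return dict(zip(names, values))


def elapsed_seconds(record):
    """Wall-clock duration between start.time and end.time, in seconds."""
    fmt = "%Y-%m-%d %H:%M:%S:%f"  # milliseconds follow the last colon
    start = datetime.strptime(record["start.time"], fmt)
    end = datetime.strptime(record["end.time"], fmt)
    return (end - start).total_seconds()


record = parse_row(HEADER, ROW)
seconds = elapsed_seconds(record)
mb_rate = float(record["data.consumed.in.MB"]) / seconds
msg_rate = int(record["data.consumed.in.nMsg"]) / seconds
print(f"elapsed: {seconds:.3f} s")
print(f"MB.sec recomputed: {mb_rate:.4f} (reported {record['MB.sec']})")
print(f"nMsg.sec recomputed: {msg_rate:.4f} (reported {record['nMsg.sec']})")
```

The run lasted 19.264 s, and dividing the totals by that duration reproduces the reported MB.sec and nMsg.sec values.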
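V. Analysis of Results
In general, with the write rate set to 5000 messages per second, a message latency of 1 ms or less is within the acceptable range and indicates that messages are written promptly.
When Kafka consumes MQ messages, a processing rate above 200,000 messages per second with 10 million pending messages is an ideal result.
From Kafka's capacity at 100,000, 1 million, and 10 million messages, one can assess whether the Kafka cluster service is able to handle message volumes on the order of 100 million.
This test ran on a single server, so network bandwidth is largely not a factor. The single-server results are therefore a valuable reference for assessing whether the cluster service will meet real-world needs after going live.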
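As a rough aid for the 100-million-message question, the measured rates can be extrapolated linearly. The Python sketch below is a back-of-the-envelope estimate only. It assumes the rates from the 10-million-message runs in section 4 hold unchanged at ten times the volume, which this test did not verify, and the write rate was throttled by --throughput 5000, so it reflects the configured cap and not Kafka's maximum write capacity. The function name seconds_to_process is hypothetical.

```python
def seconds_to_process(total_messages, messages_per_sec):
    """Time to handle total_messages at a constant rate (linear assumption)."""
    return total_messages / messages_per_sec


TARGET_MESSAGES = 100_000_000

# Rates taken from the 10-million-message runs in section 4
WRITE_RATE = 4996.445029    # records/sec, throttled by --throughput 5000
CONSUME_RATE = 76080.8646   # nMsg.sec from the consumer output

write_hours = seconds_to_process(TARGET_MESSAGES, WRITE_RATE) / 3600
consume_minutes = seconds_to_process(TARGET_MESSAGES, CONSUME_RATE) / 60

print(f"write 100 million messages: about {write_hours:.2f} hours")
print(f"consume 100 million messages: about {consume_minutes:.1f} minutes")
```

Under these assumptions, writing at the throttled rate would take roughly five and a half hours and consuming roughly 22 minutes on this single-core standalone instance.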