通过SQL查询zabbix历史数据
近期由于需要缩减机器的配置,以减少配置,需要导出数 zabbix 的些监控数据,以评估服务器的CPU和内存使用情况,做缩减。
美中不足的 zabbix 的web管理虽然可以查询指定时间的历史数据,但是没有导出报表的功能。
不过官方提供的 API,可以使用API导出。当然也可以查询数据库中的值,来做分析,目前网上也有多小工具,也是基于这两种方法来做的。
对于使用过zabbix的同学,我想应该都知道大概原理,不是通过agent或者snmp 收集数据,之后存储的数据中,同时在前台页面展示。
监控数据主要分为两类,一个是历史数据记录,也就是history之类的表,通过这个表可以查询到相关监控项目的数据。
另外一个就是趋势数据,是zabbix通过计算(数据来源为history_unit)。
获取相对应的主机ID
1SELECT hostid,host FROM hosts WHERE host = 'b-live-kafka-01';
2+--------+-----------------+
3| hostid | host |
4+--------+-----------------+
5| 10302 | b-live-kafka-01 |
6+--------+-----------------+
通过 hostid 查询到相当监控项目的id和名称,由于数据过多,使用 LIKE 过滤一下
1SELECT itemid, key_ FROM items WHERE hostid = '10302' AND key_ LIKE '%cpu%';
2+--------+------------------------------+
3| itemid | key_ |
4+--------+------------------------------+
5| 30869 | system.cpu.intr |
6| 30870 | system.cpu.load[all,avg15] |
7| 30871 | system.cpu.load[all,avg1] |
8| 30872 | system.cpu.load[all,avg5] |
9| 130324 | system.cpu.num |
10| 30873 | system.cpu.switches |
11| 30874 | system.cpu.util[,guest] |
12| 30875 | system.cpu.util[,guest_nice] |
13| 30876 | system.cpu.util[,idle] |
14| 30877 | system.cpu.util[,interrupt] |
15| 30878 | system.cpu.util[,iowait] |
16| 30879 | system.cpu.util[,nice] |
17| 30880 | system.cpu.util[,softirq] |
18| 30881 | system.cpu.util[,steal] |
19| 30882 | system.cpu.util[,system] |
20| 30883 | system.cpu.util[,user] |
21+--------+------------------------------+
2216 rows in set (0.00 sec)
可以看到有很多监控项,由于在历史记录的表中,通过监控项id来查找数据,获取cpu使用率的itemid
1SELECT itemid, key_ FROM items WHERE hostid = '10302' AND key_ = 'system.cpu.util[,idle]';
2+--------+------------------------+
3| itemid | key_ |
4+--------+------------------------+
5| 30876 | system.cpu.util[,idle] |
6+--------+------------------------+
71 row in set (0.00 sec)
通过 itemid 来查找监控记录,
使用 system.cpu.util[,idle], 这个是CPU空闲率,用100减去这个值,查询最前的20条数据
1SELECT FROM_UNIXTIME(clock) AS time,
2 100 - value_min AS Min,
3 100 - value_max AS Max,
4 100 - value_avg AS Avg
5FROM trends WHERE itemid = 30876
6ORDER BY clock DESC LIMIT 20;
7+---------------------+---------+--------+---------+
8| time | Min | Max | Avg |
9+---------------------+---------+--------+---------+
10| 2022-08-28 08:00:00 | 22.0083 | 3.0211 | 9.8478 |
11| 2022-08-28 07:00:00 | 20.9520 | 2.5756 | 7.8344 |
12| 2022-08-28 06:00:00 | 15.0965 | 2.9327 | 7.9170 |
13| 2022-08-28 05:00:00 | 12.4462 | 2.7913 | 5.7385 |
14| 2022-08-28 04:00:00 | 72.5317 | 3.4075 | 8.6970 |
15| 2022-08-28 03:00:00 | 26.3472 | 3.6292 | 7.8114 |
16| 2022-08-28 02:00:00 | 16.1904 | 3.4179 | 9.3516 |
17| 2022-08-28 01:00:00 | 19.1819 | 7.1658 | 10.9278 |
18| 2022-08-28 00:00:00 | 13.1924 | 3.9569 | 8.2620 |
19| 2022-08-27 23:00:00 | 15.2952 | 4.0299 | 7.3377 |
20| 2022-08-27 22:00:00 | 14.8313 | 5.8909 | 9.1118 |
21| 2022-08-27 21:00:00 | 15.3385 | 4.3806 | 7.8885 |
22| 2022-08-27 20:00:00 | 11.9476 | 4.3139 | 7.7731 |
23| 2022-08-27 19:00:00 | 14.0114 | 4.1177 | 9.1797 |
24| 2022-08-27 18:00:00 | 12.0893 | 4.7092 | 8.3353 |
25| 2022-08-27 17:00:00 | 19.0263 | 3.7427 | 11.2143 |
26| 2022-08-27 16:00:00 | 21.2013 | 9.8987 | 13.9991 |
27| 2022-08-27 15:00:00 | 17.9462 | 9.3643 | 12.9202 |
28| 2022-08-27 14:00:00 | 28.8963 | 7.9107 | 11.8968 |
29| 2022-08-27 13:00:00 | 22.9940 | 9.1483 | 14.9234 |
30+---------------------+---------+--------+---------+
3120 rows in set (0.00 sec)
指定时间范围查询。
比如最近30天的CPU的最大值,最小值,平均值,保留两位小数,通过表中 clock 列来限制范围,30天是 2592000 秒。
1SELECT ROUND(MIN(100 - value_min),2) AS Min,
2 ROUND(MAX(100 - value_max),2) AS Max,
3 ROUND(AVG(100 - value_avg),2) AS Avg
4FROM trends
5WHERE
6 itemid = 30876 AND clock >= UNIX_TIMESTAMP() - 2592000;
7+------+-------+-------+
8| Min | Max | Avg |
9+------+-------+-------+
10| 7.86 | 39.48 | 14.69 |
11+------+-------+-------+
121 row in set (0.00 sec)
组合SQL语句,合并查询,
主机名称为 b-live-kafka-01
1SELECT c.host, ROUND(MIN(100 - value_min),2) AS Min,
2 ROUND(MAX(100 - value_max),2) AS Max,
3 ROUND(AVG(100 - value_avg),2) AS Avg
4FROM trends a
5 JOIN items b ON a.itemid=b.itemid AND key_ = 'system.cpu.util[,idle]'
6 JOIN hosts c ON c.hostid=b.hostid AND host = 'b-live-kafka-01'
7WHERE a.clock >= UNIX_TIMESTAMP() - 2592000;
8+-----------------+------+-------+-------+
9| host | Min | Max | Avg |
10+-----------------+------+-------+-------+
11| b-live-kafka-01 | 7.86 | 39.48 | 14.67 |
12+-----------------+------+-------+-------+
至此,最近30天的数据,已经查询到了,内存使用率的方法也是一样,区别就是表不一样,CPU的在历史数据在趋势表(trends)中,内存的历史数据在history_uint。
查询最近30天的内存使用
zabbix 在数据库中,并没有存储内存的百分比,所以需要通过 已用内存 / 总内存 *100 来计算百分比使用率。
主机名还是为 b-live-kafka-01
1SELECT a.host, ROUND(Max / total_value * 100, 2) AS MEM FROM (
2 SELECT c.host,
3 MAX(a.value) AS Max
4 FROM history_uint a
5 join items b ON a.itemid = b.itemid
6 join hosts c ON c.hostid = b.hostid
7 WHERE b.key_ = 'vm.memory.size[available]'
8 AND c.host = 'b-live-kafka-01'
9 AND a.clock >= UNIX_TIMESTAMP() - (86400 * 30)
10) a JOIN (
11 SELECT c.host,
12 a.value AS total_value
13 FROM history_uint a
14 join items b ON a.itemid = b.itemid
15 join hosts c ON c.hostid = b.hostid
16 WHERE b.key_ = 'vm.memory.size[total]'
17 AND c.host = 'b-live-kafka-01'
18 ORDER BY clock DESC LIMIT 1
19) b ON a.host = b.host;
20+-----------------+-------+
21| host | MEM |
22+-----------------+-------+
23| b-live-kafka-01 | 32.96 |
24+-----------------+-------+
251 row in set (0.06 sec)
查询指定主机组内的主机使用率,时间也是最近30天
1SELECT FROM_UNIXTIME(t.clock) AS time,
2 grp.name AS GroupName,
3 h.host as Host,
4 100 - t.value_avg AS CpuAVG
5FROM hosts_groups a
6 JOIN hstgrp grp ON grp.groupid = a.groupid
7 JOIN items b on b.hostid = a.hostid
8 JOIN hosts h ON h.hostid = b.hostid
9 JOIN trends t ON t.itemid = b.itemid
10WHERE grp.name = 'common-group'
11 AND b.key_ = 'system.cpu.util[,idle]'
12 AND clock <= UNIX_TIMESTAMP() AND clock >= UNIX_TIMESTAMP() - 2592000
13 GROUP BY h.host,t.clock ORDER BY t.clock DESC LIMIT 5;
14+---------------------+-------------------+-------+---------+
15| time | GroupName | Host | CpuAVG |
16+---------------------+-------------------+-------+---------+
17| 2022-08-28 23:00:00 | common-group | zk01 | 3.0861 |
18| 2022-08-28 23:00:00 | common-group | rabbitmq01 | 5.0272 |
19| 2022-08-28 23:00:00 | common-group | monitor | 2.6925 |
20| 2022-08-28 23:00:00 | common-group | zk03 | 3.2457 |
21| 2022-08-28 23:00:00 | common-group | kafka01 | 3.5372 |
22+---------------------+-------------------+-------+---------+