安裝配置¶
原文: 安装配置
介紹¶
Node Exporter 是用於暴露 *NIX 主機指標的 Exporter,比如採集 CPU、內存、磁盤等信息。採用 Go 編寫,不存在任何第三方依賴,所以只需要下載解壓即可運行。
安装¶
由於 Node Exporter 是一個獨立的二進製文件,可以直接從 Prometheus 下載頁面 下載解壓運行:
$ wget https://github.com/prometheus/node_exporter/releases/download/v1.4.0-rc.0/node_exporter-1.4.0-rc.0.linux-amd64.tar.gz
$ tar -xvf node_exporter-1.4.0-rc.0.linux-amd64.tar.gz
node_exporter-1.4.0-rc.0.linux-amd64/
node_exporter-1.4.0-rc.0.linux-amd64/LICENSE
node_exporter-1.4.0-rc.0.linux-amd64/NOTICE
node_exporter-1.4.0-rc.0.linux-amd64/node_exporter
$ cd node_exporter-1.4.0-rc.0.linux-amd64 && ls -la
total 18604
drwxr-xr-x 2 dxlab dxlab 4096 Jul 28 00:19 .
drwxrwxr-x 6 dxlab dxlab 4096 Sep 25 07:32 ..
-rw-r--r-- 1 dxlab dxlab 11357 Jul 28 00:19 LICENSE
-rwxr-xr-x 1 dxlab dxlab 19024432 Jul 28 00:15 node_exporter
-rw-r--r-- 1 dxlab dxlab 463 Jul 28 00:19 NOTICE
配置¶
Node Exporter 下載解壓後,可以通過下面的命令運行:
結果:
ts=2022-09-24T23:35:46.243Z caller=node_exporter.go:182 level=info msg="Starting node_exporter" version="(version=1.4.0-rc.0, branch=HEAD, revision=73dabdfe9eb9a46a5349a486ebf4bb75e9660f29)"
ts=2022-09-24T23:35:46.243Z caller=node_exporter.go:183 level=info msg="Build context" build_context="(go=go1.18.4, user=root@eb4ea82c806e, date=20220727-16:14:50)"
ts=2022-09-24T23:35:46.244Z caller=diskstats_common.go:100 level=info collector=diskstats msg="Parsed flag --collector.diskstats.device-exclude" flag=^(ram|loop|fd|(h|s|v|xv)d[a-z]|nvme\d+n\d+p)\d+$
ts=2022-09-24T23:35:46.244Z caller=filesystem_common.go:111 level=info collector=filesystem msg="Parsed flag --collector.filesystem.mount-points-exclude" flag=^/(dev|proc|run/credentials/.+|sys|var/lib/docker/.+|var/lib/containers/storage/.+)($|/)
ts=2022-09-24T23:35:46.244Z caller=filesystem_common.go:113 level=info collector=filesystem msg="Parsed flag --collector.filesystem.fs-types-exclude" flag=^(autofs|binfmt_misc|bpf|cgroup2?|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|iso9660|mqueue|nsfs|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|selinuxfs|squashfs|sysfs|tracefs)$
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:108 level=info msg="Enabled collectors"
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=arp
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=bcache
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=bonding
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=btrfs
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=conntrack
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=cpu
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=cpufreq
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=diskstats
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=dmi
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=edac
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=entropy
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=fibrechannel
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=filefd
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=filesystem
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=hwmon
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=infiniband
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=ipvs
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=loadavg
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=mdadm
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=meminfo
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=netclass
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=netdev
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=netstat
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=nfs
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=nfsd
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=nvme
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=os
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=powersupplyclass
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=pressure
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=rapl
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=schedstat
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=selinux
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=sockstat
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=softnet
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=stat
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=tapestats
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=textfile
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=thermal_zone
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=time
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=timex
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=udp_queues
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=uname
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=vmstat
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=xfs
ts=2022-09-24T23:35:46.246Z caller=node_exporter.go:115 level=info collector=zfs
ts=2022-09-24T23:35:46.247Z caller=node_exporter.go:199 level=info msg="Listening on" address=:9100
ts=2022-09-24T23:35:46.247Z caller=tls_config.go:195 level=info msg="TLS is disabled." http2=false
從日誌上可以看出 node_exporter 監聽在 9100 端口上,默認的 metrics 接口通過 /metrics
端點暴露,
我們可以通過訪問 http://localhost:9100/metrics
來獲取監控指標數據:
該 metrics 接口數據就是一個標準的 Prometheus 監控指標格式,我們只需要將該端點配置到 Prometheus 中即可抓取該指標數據。為了了解 node_exporter 可配置的參數,我們可以使用 ./node_exporter -h
來查看幫助信息:
./node_exporter -h
--web.listen-address=":9100" # 監聽的端口,默認是9100
--web.telemetry-path="/metrics" # metrics的路徑,默認為/metrics
--web.disable-exporter-metrics # 是否禁用go、prome默認的metrics
--web.max-requests=40 # 最大並行請求數,默認40,設置為0時不限制
--log.level="info" # 日誌等級: [debug, info, warn, error, fatal]
--log.format=logfmt # 置日誌打印target和格式: [logfmt, json]
--version # 版本號
--collector.{metric-name} # 各個metric對應的參數
......
收集器¶
配置 Node Exporter 其中最重要的參數就是 --collector.<name>
,通過該參數可以啟用我們收集的功能模塊,node_exporter 會默認採集一些模塊,要禁用這些默認啟用的收集器可以通過 --no-collector.<name>
標誌來禁用,如果只啟用某些特定的收集器,基於先使用 --collector.disable-defaults
標誌禁用所有默認的,然後在通過指定具體的收集器 --collector.<name>
來進行啟用。
默認啟用收集器¶
下表列出了默認啟用的收集器:
Name | Description | OS |
---|---|---|
arp | Exposes ARP statistics from /proc/net/arp . | Linux |
bcache | Exposes bcache statistics from /sys/fs/bcache/ . | Linux |
bonding | Exposes the number of configured and active slaves of Linux bonding interfaces. | Linux |
btrfs | Exposes btrfs statistics | Linux |
boottime | Exposes system boot time derived from the kern.boottime sysctl. | Darwin, Dragonfly, FreeBSD, NetBSD, OpenBSD, Solaris |
conntrack | Shows conntrack statistics (does nothing if no /proc/sys/net/netfilter/ present). | Linux |
cpu | Exposes CPU statistics | Darwin, Dragonfly, FreeBSD, Linux, Solaris, OpenBSD |
cpufreq | Exposes CPU frequency statistics | Linux, Solaris |
diskstats | Exposes disk I/O statistics. | Darwin, Linux, OpenBSD |
dmi | Expose Desktop Management Interface (DMI) info from /sys/class/dmi/id/ | Linux |
edac | Exposes error detection and correction statistics. | Linux |
entropy | Exposes available entropy. | Linux |
exec | Exposes execution statistics. | Dragonfly, FreeBSD |
fibrechannel | Exposes fibre channel information and statistics from /sys/class/fc_host/ . | Linux |
filefd | Exposes file descriptor statistics from /proc/sys/fs/file-nr . | Linux |
filesystem | Exposes filesystem statistics, such as disk space used. | Darwin, Dragonfly, FreeBSD, Linux, OpenBSD |
hwmon | Expose hardware monitoring and sensor data from /sys/class/hwmon/ . | Linux |
infiniband | Exposes network statistics specific to InfiniBand and Intel OmniPath configurations. | Linux |
ipvs | Exposes IPVS status from /proc/net/ip_vs and stats from /proc/net/ip_vs_stats . | Linux |
loadavg | Exposes load average. | Darwin, Dragonfly, FreeBSD, Linux, NetBSD, OpenBSD, Solaris |
mdadm | Exposes statistics about devices in /proc/mdstat (does nothing if no /proc/mdstat present). | Linux |
meminfo | Exposes memory statistics. | Darwin, Dragonfly, FreeBSD, Linux, OpenBSD |
netclass | Exposes network interface info from /sys/class/net/ | Linux |
netdev | Exposes network interface statistics such as bytes transferred. | Darwin, Dragonfly, FreeBSD, Linux, OpenBSD |
netstat | Exposes network statistics from /proc/net/netstat . This is the same information as netstat -s . | Linux |
nfs | Exposes NFS client statistics from /proc/net/rpc/nfs . This is the same information as nfsstat -c . | Linux |
nfsd | Exposes NFS kernel server statistics from /proc/net/rpc/nfsd . This is the same information as nfsstat -s . | Linux |
nvme | Exposes NVMe info from /sys/class/nvme/ | Linux |
os | Expose OS release info from /etc/os-release or /usr/lib/os-release | any |
powersupplyclass | Exposes Power Supply statistics from /sys/class/power_supply | Linux |
pressure | Exposes pressure stall statistics from /proc/pressure/ . | Linux (kernel 4.20+ and/or CONFIG_PSI) |
rapl | Exposes various statistics from /sys/class/powercap . | Linux |
schedstat | Exposes task scheduler statistics from /proc/schedstat . | Linux |
selinux | Exposes SELinux statistics. | Linux |
sockstat | Exposes various statistics from /proc/net/sockstat . | Linux |
softnet | Exposes statistics from /proc/net/softnet_stat . | Linux |
stat | Exposes various statistics from /proc/stat . This includes boot time, forks and interrupts. | Linux |
tapestats | Exposes statistics from /sys/class/scsi_tape . | Linux |
textfile | Exposes statistics read from local disk. The --collector.textfile.directory flag must be set. | any |
thermal | Exposes thermal statistics like pmset -g therm . | Darwin |
thermal_zone | Exposes thermal zone & cooling device statistics from /sys/class/thermal . | Linux |
time | Exposes the current system time. | any |
timex | Exposes selected adjtimex(2) system call stats. | Linux |
udp_queues | Exposes UDP total lengths of the rx_queue and tx_queue from /proc/net/udp and /proc/net/udp6 . | Linux |
uname | Exposes system information as provided by the uname system call. | Darwin, FreeBSD, Linux, OpenBSD |
vmstat | Exposes statistics from /proc/vmstat . | Linux |
xfs | Exposes XFS runtime statistics. | Linux (kernel 4.4+) |
zfs | Exposes ZFS performance statistics. | FreeBSD, Linux, Solaris |
其它收集器¶
node_exporter 還實現了許多默認沒有啟用的收集器。造成這種情況的原因因不同的 collector 而異,可能包括:
Name | Description | OS |
---|---|---|
buddyinfo | Exposes statistics of memory fragments as reported by /proc/buddyinfo. | Linux |
cgroups | A summary of the number of active and enabled cgroups | Linux |
devstat | Exposes device statistics | Dragonfly, FreeBSD |
drbd | Exposes Distributed Replicated Block Device statistics (to version 8.4) | Linux |
ethtool | Exposes network interface information and network driver statistics equivalent to ethtool , ethtool -S , and ethtool -i . | Linux |
interrupts | Exposes detailed interrupts statistics. | Linux, OpenBSD |
ksmd | Exposes kernel and system statistics from /sys/kernel/mm/ksm . | Linux |
lnstat | Exposes stats from /proc/net/stat/ . | Linux |
logind | Exposes session counts from logind. | Linux |
meminfo_numa | Exposes memory statistics from /proc/meminfo_numa . | Linux |
mountstats | Exposes filesystem statistics from /proc/self/mountstats . Exposes detailed NFS client statistics. | Linux |
network_route | Exposes the routing table as metrics | Linux |
ntp | Exposes local NTP daemon health to check time | any |
perf | Exposes perf based metrics (Warning: Metrics are dependent on kernel configuration and settings). | Linux |
processes | Exposes aggregate process statistics from /proc . | Linux |
qdisc | Exposes queuing discipline statistics | Linux |
runit | Exposes service status from runit. | any |
slabinfo | Exposes slab statistics from /proc/slabinfo . Note that permission of /proc/slabinfo is usually 0400, so set it appropriately. | Linux |
supervisord | Exposes service status from supervisord. | any |
sysctl | Expose sysctl values from /proc/sys . Use --collector.sysctl.include(-info) to configure. | Linux |
systemd | Exposes service and system status from systemd. | Linux |
tcpstat | Exposes TCP connection status information from /proc/net/tcp and /proc/net/tcp6 . (Warning: the current version has potential performance issues in high load situations.) | Linux |
wifi | Exposes WiFi device and station statistics. | Linux |
zoneinfo | Exposes NUMA memory zone metrics. | Linux |
各個系統管理者可針對個自系統的特性與監控的目標來調整 collector 的啟用列表。
Prometheus 配置修改¶
Node Exporter 啟動完成後我們使用配置的方式在之前的 Prometheus 中新增一個 scrape_configs
的抓取任務,來獲取 node_exporter 暴露的指標,範例配置文件如下所示:
global:
scrape_interval: 5s # 抓取频率
scrape_configs:
- job_name: "prometheus"
static_configs:
- targets: ["localhost:9090"]
- job_name: "node-exporter" # 新增 node_exporter 任務
static_configs:
- targets: ["localhost:9100"]
...
...
修改完 prometheus.yml
後重新啟動 Prometheus,在 "Status -> Targets" 的網頁就可觀察到是否順利地抓取到 Node exporter 所暴露的指標: