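Deploying a Monitoring Platform with Prometheus + node_exporter + Alertmanager + Grafana
Contents
- Install and configure the Prometheus server
- Install and configure node_exporter
- Install Grafana for visualization
- Install and configure Alertmanager
Installing Prometheus
- OS: CentOS 7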
For security, the services are not run as root. They run under a dedicated prometheus user instead, which has to be created first:
$ groupadd prometheus
$ useradd -g prometheus -M -s /sbin/nologin prometheus
Download the Prometheus tarball:
$ wget https://github.com/prometheus/prometheus/releases/download/v2.14.0/prometheus-2.14.0.linux-amd64.tar.gz
Extract and install Prometheus:
$ tar xf prometheus-2.14.0.linux-amd64.tar.gz -C /srv/
$ cd /srv/
$ mv prometheus-2.14.0.linux-amd64/ prometheus
$ mkdir -pv /srv/prometheus/data
$ chown -R prometheus:prometheus /srv/prometheus
Create the systemd unit file /usr/lib/systemd/system/prometheus.service:
[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network-online.target

[Service]
User=prometheus
Restart=on-failure
ExecStart=/srv/prometheus/prometheus \
  --config.file=/srv/prometheus/prometheus.yml \
  --storage.tsdb.path=/srv/prometheus/data
ExecReload=/bin/kill -HUP $MAINPID

[Install]
WantedBy=multi-user.target
For the complete Prometheus unit file, see: prometheus.service
Edit the Prometheus configuration file /srv/prometheus/prometheus.yml:
global:
  scrape_interval: 15s
  evaluation_interval: 15s

alerting:
  alertmanagers:
    - static_configs:
        - targets: ["localhost:9093"]

rule_files:
  #- "alert.rules"

scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'node'
    scrape_interval: 10s
    static_configs:
      - targets: ['<monitored-host-1-ip>:9100','<monitored-host-2-ip>:9100']  # separate multiple hosts with commas
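As the number of monitored hosts grows, typing the comma-separated targets entry by hand becomes error-prone. The following is a minimal sketch, not part of the original setup, that renders that line from a list of host addresses. The function names and the sample addresses (taken from the 192.0.2.0/24 documentation range) are assumptions for illustration; 9100 is the node_exporter port used in this guide.

```python
def node_targets(hosts, port=9100):
    """Return the 'host:port' strings for a static_configs targets list."""
    return [f"{host}:{port}" for host in hosts]


def render_targets_line(hosts, port=9100):
    """Render the flow-style YAML line used in the 'node' job of prometheus.yml."""
    quoted = ",".join(f"'{target}'" for target in node_targets(hosts, port))
    return f"- targets: [{quoted}]"


if __name__ == "__main__":
    # Hypothetical host addresses; replace them with the real monitored hosts.
    print(render_targets_line(["192.0.2.10", "192.0.2.11"]))
```

The printed line can be pasted under static_configs in the node job.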
For the complete Prometheus configuration file, see: prometheus.yml
Start the service (run the commands in order):
$ systemctl daemon-reload
$ systemctl start prometheus.service
$ systemctl enable prometheus.service
$ systemctl status prometheus.service
Prometheus can reload its configuration without a restart:
$ systemctl reload prometheus.service
Once Prometheus is running, its web UI is available at http://localhost:9090.
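Installing and configuring node_exporter
To monitor a server's CPU, memory, disk, I/O and similar metrics, the node_exporter service has to be installed on each monitored machine.
First download the desired version from the node_exporter download page. This guide uses v0.17.0, the latest version at the time of writing.
$ wget https://github.com/prometheus/node_exporter/releases/download/v0.17.0/node_exporter-0.17.0.linux-amd64.tar.gz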
Extract and install node_exporter (run tar from the directory the tarball was downloaded to):
$ tar xf node_exporter-0.17.0.linux-amd64.tar.gz -C /srv/
$ cd /srv/
$ mv node_exporter-0.17.0.linux-amd64/ node_exporter
$ chown -R prometheus:prometheus /srv/node_exporter
Create the systemd unit file /usr/lib/systemd/system/node_exporter.service:
# Prometheus Node Exporter systemd unit
[Unit]
Description=Node Exporter
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
ExecStart=/srv/node_exporter/node_exporter

[Install]
WantedBy=default.target
For the complete node_exporter unit file, see: node_exporter.service
Start the node_exporter service:
$ systemctl daemon-reload
$ systemctl enable node_exporter
$ systemctl start node_exporter
$ systemctl status node_exporter
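Once the service is running, open http://<monitored-host-ip>:9100/metrics to check that node_exporter is exposing the host's metrics. If the metrics are returned correctly, node_exporter can then be integrated into Prometheus.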
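The same check can be scripted. The sketch below is an illustration rather than part of the original procedure: it downloads the plain-text metrics page and looks for metric names with the node_ prefix. The helper names are made up for this example, and the parser is deliberately simplified (it only extracts metric names and ignores values, labels and timestamps).

```python
import urllib.request


def fetch_metrics(url, timeout=5):
    """Download the plain-text metrics page from an exporter."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def metric_names(text):
    """Collect metric names from exposition-format text, skipping comments."""
    names = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and HELP/TYPE comments carry no sample
        # The name ends at the first '{' (labels) or the first space (value).
        names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return names


def looks_like_node_exporter(text):
    """Return True if at least one metric name carries the node_ prefix."""
    return any(name.startswith("node_") for name in metric_names(text))
```

A call such as looks_like_node_exporter(fetch_metrics("http://<monitored-host-ip>:9100/metrics")) should return True on a host where the exporter is running, with the placeholder replaced by a real address.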
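Edit the Prometheus configuration file /srv/prometheus/prometheus.yml and add the following:
scrape_configs:
  ...
  - job_name: 'node'
    scrape_interval: 10s
    static_configs:
      - targets: ['localhost:9100']
The configuration file shown earlier already contains this job, so it is repeated here only as a reminder.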
Reload the Prometheus service:
$ systemctl reload prometheus.service
Installing Grafana for visualization
First set up the Grafana yum repository by creating the file /etc/yum.repos.d/grafana.repo by hand:
[grafana]
name=grafana
baseurl=https://packages.grafana.com/oss/rpm
repo_gpgcheck=1
enabled=1
gpgcheck=1
gpgkey=https://packages.grafana.com/gpg.key
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
See also the official documentation: grafana
Grafana can now be installed with yum:
$ yum makecache
$ yum -y install grafana
When the installation finishes, enable and start the service:
$ systemctl enable grafana-server
$ systemctl start grafana-server
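Logging in to Grafana
Open http://localhost:3000 in a browser. The default username and password are admin/admin.
Adding a data source
On the home page after logging in, click "Configuration - Data Sources" to open the add data source page, and configure it as follows:
- Name: prometheus
- Type: prometheus
- URL: http://localhost:9090/
- Access: Server
Uncheck Default, leave everything else at its default value, and click "Add".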
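The same data source can also be created without the UI. The following is a hedged sketch that assumes Grafana's HTTP API endpoint POST /api/datasources with basic authentication, and assumes that the "Server" access mode in the form corresponds to the value "proxy" in the API. The function names are hypothetical, and the credentials are whatever the admin account currently uses.

```python
import base64
import json
import urllib.request


def datasource_payload(name="prometheus", url="http://localhost:9090/"):
    """Build a JSON body mirroring the form fields listed above."""
    return {
        "name": name,
        "type": "prometheus",
        "url": url,
        "access": "proxy",   # assumed API value for "Server" access in the UI
        "isDefault": False,  # the Default box is unchecked in this guide
    }


def create_datasource(grafana_url, user, password, payload, timeout=5):
    """POST the payload to Grafana's data source API using basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        f"{grafana_url.rstrip('/')}/api/datasources",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

For example, create_datasource("http://localhost:3000", "admin", "admin", datasource_payload()) would target the Grafana instance installed above, provided the default password has not been changed.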
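Importing a dashboard
Pick a dashboard on the Grafana website, for example https://grafana.com/dashboards/8919, and import it in one of two ways:
- Upload: download the dashboard JSON file and upload it from your machine.
- Grafana.com Dashboard: enter the dashboard link from the Grafana website (for example https://grafana.com/dashboards/1860) without downloading anything.
Then click Import.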
Deploying Alertmanager with DingTalk alerts
1. Download and install
$ wget https://github.com/prometheus/alertmanager/releases/download/v0.15.2/alertmanager-0.15.2.linux-amd64.tar.gz
$ tar zxf alertmanager-0.15.2.linux-amd64.tar.gz
$ mv alertmanager-0.15.2.linux-amd64 /srv/alertmanager
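Configuration file
Alertmanager has no built-in DingTalk support, so DingTalk alerts are delivered through its webhook receiver. DingTalk is strict about message format, which is why a plugin is used later to convert the messages.
$ vim /srv/alertmanager/alertmanager.yml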
global:
  resolve_timeout: 5m
route:
  receiver: webhook
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  group_by: [alertname]
  routes:
    - receiver: webhook
      group_wait: 10s
      match:
        team: node
receivers:
  - name: webhook
    webhook_configs:
      - url: http://localhost:8060/dingtalk/ops_dingding/send
        send_resolved: true
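To make the routing tree in this file easier to reason about, here is a simplified model of it in Python. It is only a sketch of the matching idea, assuming equality matching on the labels under match and first-match-wins among child routes; it is not Alertmanager's actual implementation, and the names ROUTE and pick_route are made up for this example.

```python
# The route section of alertmanager.yml, reduced to the fields used here.
ROUTE = {
    "receiver": "webhook",
    "group_wait": "30s",
    "routes": [
        {"receiver": "webhook", "group_wait": "10s", "match": {"team": "node"}},
    ],
}


def pick_route(labels, route=ROUTE):
    """Return (receiver, group_wait) for an alert with the given labels.

    The first child route whose match labels all equal the alert's labels
    wins. Otherwise the top-level route applies. A child route inherits
    any setting it does not define itself.
    """
    for child in route.get("routes", []):
        wanted = child.get("match", {})
        if all(labels.get(key) == value for key, value in wanted.items()):
            return (
                child.get("receiver", route["receiver"]),
                child.get("group_wait", route["group_wait"]),
            )
    return route["receiver"], route["group_wait"]
```

Under this model an alert labelled team=node is sent to webhook after a 10s group_wait, while any other alert falls through to the top-level route and waits 30s. An alert rule therefore has to set a team: node label to take the faster path.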
Start Alertmanager
$ cd /srv/alertmanager
$ nohup ./alertmanager --config.file=alertmanager.yml >alertmanager.log 2>&1 &
# Check that the port is listening:
$ netstat -anpt | grep 9093
Alert rules
Monitor whether hosts are up
$ cd /srv/prometheus
$ cat rules.yml
groups:
  - name: test-rule
    rules:
      - alert: HostStatus
        expr: up == 0
        for: 2m
        labels:
          status: warning
        annotations:
          summary: "{{$labels.instance}}: server is down"
          description: "{{$labels.instance}}: server is down"
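The for: 2m clause means the alert does not fire on the first failed scrape. The sketch below is a simplified model of that behaviour, not Prometheus's own code: the rule is evaluated once per evaluation_interval (15s in this guide), the alert is pending while up == 0 has held for less than two minutes, and it fires once the condition has held for the full period. The function name and the state labels are chosen for illustration.

```python
def alert_states(up_samples, interval=15, hold=120):
    """Yield (time, state) for each evaluation of `up == 0` with `for: 2m`.

    up_samples: the value of `up` seen at each evaluation, one value per
    `interval` seconds. States are 'inactive', 'pending' (condition true
    for less than `hold` seconds) and 'firing'.
    """
    active_since = None
    for index, up in enumerate(up_samples):
        now = index * interval
        if up == 0:
            if active_since is None:
                active_since = now  # first evaluation that saw the host down
            state = "firing" if now - active_since >= hold else "pending"
        else:
            active_since = None  # condition cleared, so the timer resets
            state = "inactive"
        yield now, state
```

Feeding it one healthy sample followed by nine failed ones, list(alert_states([1] + [0] * 9)), shows the alert staying pending from 15s through 120s and switching to firing at 135s, two minutes after the first failed evaluation.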
Updating the Prometheus configuration file
Update the alerting and rule_files sections of prometheus.yml.
rule_files can list more than one rule file.
Connecting DingTalk to the Prometheus Alertmanager webhook
Reference: http://theo.im/blog/2017/10/16/release-prometheus-alertmanager-webhook-for-dingtalk/
Plugin download: https://github.com/timonwong/prometheus-webhook-dingtalk
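Installation
Use the host IP instead of the hostname, so that the alert messages carry a URL that can be opened directly.
$ mkdir -p /usr/lib/golang/src/github.com/timonwong/
$ cd /usr/lib/golang/src/github.com/timonwong/
$ git clone https://github.com/timonwong/prometheus-webhook-dingtalk.git
$ cd prometheus-webhook-dingtalk
$ make
# Author's note: errors reported by make can be ignored.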
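Starting the plugin
The --ding.profile option takes the webhook URL of the DingTalk robot. If you have not added a DingTalk robot before, look up the steps online.
$ nohup ./prometheus-webhook-dingtalk --ding.profile="ops_dingding=https://oapi.dingtalk.com/robot/send?access_token=xxx" >dingding.log 2>&1 &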
Testing
Start the exporter again and the alert is reported as resolved.
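The DingTalk bridge can also be exercised without stopping a real exporter, by posting a hand-built notification to it. The sketch below assumes the payload shape of Alertmanager's webhook format (version "4") as documented upstream. The alert name TestAlert, the helper names and the sample instance address are hypothetical, and the target URL is the one configured in alertmanager.yml.

```python
import json
import urllib.request
from datetime import datetime, timezone


def build_test_payload(instance, status="firing"):
    """Build a payload shaped like an Alertmanager webhook notification."""
    labels = {"alertname": "TestAlert", "instance": instance}
    annotations = {"summary": f"{instance}: test notification"}
    return {
        "version": "4",
        "groupKey": '{}:{alertname="TestAlert"}',
        "status": status,
        "receiver": "webhook",
        "groupLabels": {"alertname": "TestAlert"},
        "commonLabels": labels,
        "commonAnnotations": annotations,
        "externalURL": "http://localhost:9093",
        "alerts": [{
            "status": status,
            "labels": labels,
            "annotations": annotations,
            "startsAt": datetime.now(timezone.utc).isoformat(),
            "endsAt": "0001-01-01T00:00:00Z",  # zero time: not resolved yet
            "generatorURL": "http://localhost:9090/graph",
        }],
    }


def post_payload(url, payload, timeout=5):
    """POST the payload as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.status
```

With the bridge running, post_payload("http://localhost:8060/dingtalk/ops_dingding/send", build_test_payload("192.0.2.10:9100")) should produce a message in the DingTalk group. Whether the bridge accepts every field exactly as written here has not been verified against this plugin version.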