
Problems Encountered When Integrating Nagios + Cacti, and How to Solve Them

2012-12-18 11:59
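7. Nagios daily health-check SMS
If you have no carrier SMS gateway channel, have the monitoring platform send one SMS every day at 16:00, whether or not anything has failed. This lets the administrator confirm that both the SMS alerting path and the nagios service are working.
The check is set up as follows:
7.1. Write the check script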
# cat /root/sh/nagios_check.sh
#!/bin/bash
# author: Kevin@cmcc.com.cn
# Check the nagios service and report the result by SMS.
nid=/usr/local/nagios/var/nagios.lock
if [ -f "$nid" ]
then
    # Lock file present: nagios is assumed to be running.
    /usr/local/nagios/libexec/sms/sendsms.sh 13800000000 "Nagios service is OK, don't worry!"
    echo "nagios service is ok"
else
    # Lock file missing: start nagios and report the restart.
    /etc/init.d/nagios start
    /usr/local/nagios/libexec/sms/sendsms.sh 13800000000 "Nagios service was restarted, it's OK now"
fi
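The lock-file test only shows that the file exists, so a stale nagios.lock left behind by a crash would still count as "OK". A minimal sketch of a stricter check follows. It assumes nagios.lock holds the daemon's PID, which is what the lock_file setting in nagios.cfg is documented to do, and the helper name nagios_is_running is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the PID stored in the lock file
# belongs to a live process.
nagios_is_running() {
    local lock_file=$1 pid
    [ -r "$lock_file" ] || return 1
    # Read the first line; tolerate a missing trailing newline.
    read -r pid < "$lock_file" || [ -n "$pid" ] || return 1
    # Reject empty or non-numeric content.
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac
    # kill -0 sends no signal; it only tests that the process exists and
    # that the caller may signal it (the cron job runs as root, so it can).
    kill -0 "$pid" 2>/dev/null
}
```

In nagios_check.sh the test would then read: if nagios_is_running "$nid".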
7.2. Add the cron job
# crontab -e    # add the following line:

00 16 * * * /root/sh/nagios_check.sh > /root/sh/nagios_check.log 2>&1
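Editing the crontab by hand is fine for a single host. If the job has to be added from a script, the sketch below appends it only when an identical line is not already there. The function name add_cron_entry is hypothetical; the schedule and script path are the ones from this section.

```shell
#!/bin/bash
# Hypothetical helper: read an existing crontab on stdin and print it with
# the given entry appended, unless an identical line is already present.
add_cron_entry() {
    local entry=$1 current
    current=$(cat)
    if [ -n "$current" ]; then
        printf '%s\n' "$current"
    fi
    # -F: fixed string, -x: the whole line must match, -q: no output.
    if ! printf '%s\n' "$current" | grep -Fxq -- "$entry"; then
        printf '%s\n' "$entry"
    fi
}

# Usage (as root):
#   entry='00 16 * * * /root/sh/nagios_check.sh > /root/sh/nagios_check.log 2>&1'
#   crontab -l 2>/dev/null | add_cron_entry "$entry" | crontab -
```

Because the function only filters text, it can be tried on a saved copy of the crontab before it is piped back into crontab -.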
7.3. Configure Fetion robot alerts
7.3.1. Add the following to commands.cfg:

#host-notify-by-sms
define command {
command_name host-notify-by-sms
command_line /usr/local/nagios/libexec/sms/sendsms.sh 13800000000 " ** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is AT: $DATE$ $HOSTSTATE$ ** "
}

#service-notify-by-sms
define command {
command_name service-notify-by-sms
command_line /usr/local/nagios/libexec/sms/sendsms.sh 13800000000 " *** $NOTIFICATIONTYPE$ $HOSTNAME$ $DATE$ $TIME$ $SERVICEDESC$ is $SERVICESTATE$ info:$SERVICEOUTPUT$ *** "
}
7.3.2. Add the following to contacts.cfg:

define contact{
contact_name sms-members
use sms-contact
alias Nagios Admin SMS
email admin@139.com
pager 13800000000
}

define contactgroup{
contactgroup_name admins
alias Nagios Administrators
members sms-members
}
7.3.3. Templates.cfg
define contact{
name sms-contact
service_notification_period 24x7
host_notification_period 24x7
service_notification_options w,u,c,r,f,s
host_notification_options d,u,r,f,s
service_notification_commands service-notify-by-sms
host_notification_commands host-notify-by-sms
register 0
}

7.3.4. Change the size of the graphs on the display page: /usr/local/nagios/etc/pnp/config.php
# vim /usr/local/nagios/etc/pnp/config.php
$conf['graph_width'] = "500";
$conf['graph_height'] = "100";
These two lines set the width and height of the graphs on the monitoring page (RRDTool graph image size).

8. Troubleshooting

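8.1. Error when changing a service from the web interface
The web pages let you, for example, reschedule a service check or disable its notifications. Submitting such a change often fails with the following error: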

Could not open command file '/usr/local/nagios/var/rw/nagios.cmd' for update!
The permissions on the external command file and/or directory may be incorrect. Read the FAQs on how to setup proper permissions.
An error occurred while attempting to commit your command for processing.
nagios.cfg has this to say about the file:

# EXTERNAL COMMAND FILE
# This is the file that Nagios checks for external command requests.
# It is also where the command CGI will write commands that are submitted
# by users, so it must be writeable by the user that the web server
# is running as (usually 'nobody'). Permissions should be set at the
# directory level instead of on the file, as the file is deleted every
# time its contents are processed.
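The key point is that the user Apache runs as must be able to write to the file. The permission should be set on the directory, because the file is deleted each time its contents have been processed.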

command_file=/usr/local/nagios/var/rw/nagios.cmd
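Adding apache, the user that Apache runs as, to the nagios group should have been enough.
It did not work here, so the rw directory and the files under it were changed to mode 777, after which the commands were accepted.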
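Mode 777 lets any local user submit external commands. A narrower alternative, sketched below, hands the directory to a group that the web server user belongs to and sets the setgid bit so that files created in it inherit that group. The function name restrict_cmd_dir is hypothetical. The group is a parameter because this article uses nagios, while the Nagios quickstart guides use a separate nagcmd group. Apache only picks up a new group membership after a restart, which may be why adding the user to the group appeared not to work.

```shell
#!/bin/bash
# Hypothetical helper: make the external command directory group-writable
# and setgid instead of world-writable.
restrict_cmd_dir() {
    local dir=$1 group=$2
    chgrp "$group" "$dir" || return 1
    # u+rwx,g+rwx: owner and group may create and remove nagios.cmd
    # g+s:         files created in the directory inherit its group
    # o-rwx:       no access for other local users
    chmod u+rwx,g+rwxs,o-rwx "$dir"
}

# Usage (as root; user and group names are assumptions for this setup):
#   usermod -a -G nagios apache
#   restrict_cmd_dir /usr/local/nagios/var/rw nagios
#   service httpd restart
```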
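8.2. Clicking the host or service links shows nothing
After nagios is installed the main page loads, but clicking the host or service links shows nothing. The Apache error log reports: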
[Wed Sep 01 17:31:32 2010] [error] [client 222.128.103.52] Premature end of script headers: status.cgi, referer: http://public.ipaddr/nagios/side.php
[Wed Sep 01 17:31:33 2010] [error] [client 222.128.103.52] (13)Permission denied: exec of '/usr/local/nagios/sbin/status.cgi' failed, referer: http://public.ipaddr/nagios/side.php

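Cause: SELinux is enabled (check with getenforce).
Solution: put SELinux into permissive mode:
setenforce 0
To make the change permanent, edit the setting in /etc/selinux/config and reboot.
Alternatively, without disabling SELinux or changing its mode permanently, relabel the CGI and web directories so that they work under SELinux in enforcing/targeted mode:
chcon -R -t httpd_sys_content_t /usr/local/nagios/sbin/
chcon -R -t httpd_sys_content_t /usr/local/nagios/share/
In this case, switching SELinux off was enough.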
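For the permanent change, the relevant line in /etc/selinux/config is the one starting with SELINUX=. A minimal sketch that rewrites just that line is shown below. The function name set_selinux_mode is hypothetical, and the file path is passed in so the function can be tried on a copy first.

```shell
#!/bin/bash
# Hypothetical helper: set the SELINUX= line in an SELinux config file.
# mode must be one of: enforcing, permissive, disabled.
set_selinux_mode() {
    local file=$1 mode=$2 tmp rc
    case $mode in
        enforcing|permissive|disabled) ;;
        *) echo "invalid mode: $mode" >&2; return 1 ;;
    esac
    tmp=$(mktemp) || return 1
    # Only the active SELINUX= line changes; comments and SELINUXTYPE= stay.
    if ! sed "s/^SELINUX=.*/SELINUX=$mode/" "$file" > "$tmp"; then
        rm -f "$tmp"
        return 1
    fi
    # Write back through the existing file so its owner and mode are kept.
    cat "$tmp" > "$file"
    rc=$?
    rm -f "$tmp"
    return "$rc"
}

# Usage (as root), followed by a reboot:
#   set_selinux_mode /etc/selinux/config permissive
```

The running system is not affected until the reboot; setenforce 0 covers the time until then.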
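8.3. With nagios 3.2.0 and later, opening http://ip/nagios after installation shows an error page

Solution: nagios 3.2.0 changed its pages from HTML to PHP, so PHP is a prerequisite for a fresh installation.
Running yum install php fixes it.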
8.4. The PNP sun icon is displayed, but clicking it produces the following error:
Initalising
Using /usr/local/nagios/share/perfdata/
RRDTool /usr/bin/rrdtool found.
RRDTool /usr/bin/rrdtool is executable
PHP Function proc_open is enabled
PHP Function fpassthru is enabled
PHP Function xml_parser_create is enabled
PHP zlib Support found.
PHP GD Support can’t found.

Solution:
# yum -y install php-gd
# service httpd restart
Click the sun icon again. If the graph page now loads instead of the error, PNP is working.

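8.5. After installing Nagios, the Status Map and Alert Histogram links do not open, and the error says that statusmap.cgi and histogram.cgi cannot be found.
Cause and solution:
Cause 1: gd-devel was not installed, so statusmap.cgi was not generated when Nagios was compiled.
Cause 2: Nagios was compiled first and gd-devel was installed afterwards, so again statusmap.cgi was not generated.
In both cases, install gd-devel and then recompile and reinstall Nagios.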
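Either way, the CGIs were never built. A quick way to confirm this before rebuilding is to look for the two files in the CGI directory. The sketch below uses a hypothetical function name, missing_gd_cgis, and assumes the /usr/local/nagios/sbin path that appears in section 8.2.

```shell
#!/bin/bash
# Hypothetical helper: print the names of the two CGIs from this section
# that are missing (or not executable) in the given directory.
# Exit status 0 means both are present.
missing_gd_cgis() {
    local dir=$1 cgi missing=0
    for cgi in statusmap.cgi histogram.cgi; do
        if [ ! -x "$dir/$cgi" ]; then
            echo "$cgi"
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
#   missing_gd_cgis /usr/local/nagios/sbin
```

If it prints anything, install gd-devel (yum -y install gd-devel), then re-run ./configure, make all and make install in the Nagios source directory.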

8.6. The Apache log shows the following errors:
# tail -f /etc/httpd/logs/error_log
[Fri Feb 18 19:07:18 2011] [notice] suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)
[Fri Feb 18 19:07:18 2011] [notice] Digest: generating secret for digest authentication ...
[Fri Feb 18 19:07:18 2011] [notice] Digest: done
[Fri Feb 18 19:07:18 2011] [notice] Apache/2.2.3 (CentOS) configured -- resuming normal operations
[Fri Feb 18 19:07:20 2011] [error] [client 127.0.0.1] Directory index forbidden by Options directive: /var/www/html/
[Fri Feb 18 19:07:42 2011] [error] [client 127.0.0.1] Directory index forbidden by Options directive: /var/www/html/
[Fri Feb 18 19:07:55 2011] [error] [client 127.0.0.1] Directory index forbidden by Options directive: /var/www/html/

The HTTP service check then reports a problem (here a 403 Forbidden WARNING instead of OK), as shown below:
# /usr/local/nagios/libexec/check_http -I localhost -w 15 -c 20 -t 30
HTTP WARNING: HTTP/1.1 403 Forbidden - 5240 bytes in 0.003 second response time |time=0.002991s;15.000000;20.000000;0.000000 size=5240B;;;0

Solution: create an index page so that Apache no longer returns 403 for the document root:
# echo -n none > /var/www/html/index.html
8.7. Compiling and installing ndoutils-1.4b7 fails with the following error:
#./db/installdb -ucacti -pcacti -d cacti
DBD::mysql::db do failed: Table 'cacti.nagios_dbversion' doesn't exist at ./db/installdb line 51.

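The command was invoked incorrectly. Run it as follows instead (the DBD error is still printed once, but the tables are then created):

# ./installdb -ucacti -pcacti -h localhost -d cacti    # add the -h localhost option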
DBD::mysql::db do failed: Table 'cacti.nagios_dbversion' doesn't exist at ./installdb line 51.
** Creating tables for version 1.4b7
Using mysql.sql for installation...
** Updating table nagios_dbversion
Done!
8.8. After installation, /usr/local/nagios/var/nagios.log shows the following error:
# tail -f /usr/local/nagios/var/nagios.log

[1298198680] Error: Could not safely copy module '/usr/local/nagios/bin/ndomod.o'. The module will not be loaded: No such file or directory
[1298202280] Auto-save of retention data completed successfully.

Cause: a step was missed during the earlier ndoutils-1.4b7 installation. Fix:
# mv /usr/local/nagios/bin/ndomod-3x.o /usr/local/nagios/bin/ndomod.o    # the missing step

A healthy log looks like this:

# tail -f /usr/local/nagios/var/nagios.log
[1298346735] Event broker module '/usr/local/nagios/bin/ndomod.o' deinitialized successfully.
[1298346735] Nagios 3.2.1 starting... (PID=13489)
[1298346735] Local time is Tue Feb 22 11:52:15 CST 2011
[1298346735] LOG VERSION: 2.0
[1298346735] ndomod: NDOMOD 1.4b9 (10-27-2009) Copyright (c) 2009 Nagios Core Development Team and Community Contributors
[1298346735] ndomod: Successfully connected to data sink. 0 queued items to flush.
[1298346735] Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.
[1298350335] Auto-save of retention data completed successfully.
[1298353935] Auto-save of retention data completed successfully.
[1298357535] Auto-save of retention data completed successfully.

8.9. Sometimes, after a reboot, the log shows the following errors:
# tail -f /usr/local/nagios/var/nagios.log

[1298439477] ndomod: Still unable to connect to data sink. 23512 items lost, 5000 queued items to flush.
[1298439493] ndomod: Still unable to connect to data sink. 23590 items lost, 5000 queued items to flush.

These errors usually mean that ndo2db has not been started. Start it manually:
# /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg    # start ndo2db

8.10. On the npc plugin page, the host icon is shown as a red cross:
Solution:

# cp -r /usr/local/nagios/share/images/logos/logo.gif /var/www/html/cacti/plugins/npc/logo.gif
Refresh the page and the icon is displayed correctly.

8.11. Clicking the sun icon produces the following error:
"Hostname is not set" is a message from PNP. PNP must be called as index.php?host=$HOSTNAME$&srv=$SERVICEDESC$ or index.php?host=$HOSTNAME$.
When the configuration was pushed out by a script, the macros were altered, and the generated file looked like this:
#define_host
define host {
name host-pnp
register 0
process_perf_data 1
# Incorrect (as generated): action_url /nagios/pnp/index.php?host=nagios.com.cn$
# Correct format:
action_url /nagios/pnp/index.php?host=$HOSTNAME$
}
#define_service
define service {
name srv-pnp
register 0
process_perf_data 1
# Incorrect (as generated): action_url /nagios/pnp/index.php?host=nagios.com.cn$&srv=$
# Correct format:
action_url /nagios/pnp/index.php?host=$HOSTNAME$&srv=$SERVICEDESC$
}
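The broken values are consistent with a shell expanding the macros while the file was being generated: bash sets HOSTNAME itself, so $HOSTNAME$ turns into the machine's host name followed by a stray $, and the unset SERVICEDESC expands to nothing. This is an inference from the output, not something the article states. Assuming the push script writes the file with a here-document, quoting the delimiter keeps the macros literal. The function name write_pnp_host_template is hypothetical.

```shell
#!/bin/bash
# Hypothetical generator for the PNP host template.
write_pnp_host_template() {
    # 'EOF' is quoted, so the shell performs no expansion inside the
    # here-document and $HOSTNAME$ reaches the file unchanged.
    cat <<'EOF'
define host {
    name               host-pnp
    register           0
    process_perf_data  1
    action_url         /nagios/pnp/index.php?host=$HOSTNAME$
}
EOF
}

# Usage: redirect the output into the template file that the script pushes.
#   write_pnp_host_template > host-pnp.cfg
```

The same applies to echo: single quotes keep $HOSTNAME$ intact, while double quotes let the shell expand it.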