NRPE Ansible Role
This Ansible role installs and configures NRPE plugins for monitoring various system and service metrics.
Features
- Deploys custom NRPE checks
- Configures sudoers for checks requiring root privileges
Supported Services
- load
- memory
- disk usage
- disk read-only
- network bandwidth
- dns
- docker
- exim mailqueue
- postfix mailqueue
- needrestart
- process age & zombies
- systemd specific services
- systemd failed services
- mysql
- postgresql
- redis
- kubernetes
  - etcd health
  - API server access
  - deployments
  - jobs & cronjobs
  - pki certs
  - pod restarts
  - pv & pvc
  - replicasets
- raid
  - mdadm
  - 3ware
Available Checks
The following checks are deployed to /usr/lib/nagios/plugins/ (or to the configured path):
- check_3ware
- check_cilium_health
- check_coredns_health
- check_disk_usage
- check_dns
- check_docker
- check_etcd_health
- check_eth
- check_exim_mailqueue
- check_k8s_apiserver_access
- check_k8s_deployments
- check_k8s_jobs_cronjobs
- check_k8s_pki_certs
- check_k8s_pod_restarts
- check_k8s_pv_pvc
- check_k8s_replicasets
- check_mdadm
- check_memory
- check_mysql_longqueries
- check_needrestart
- check_postfix_mailqueue
- check_postgresql
- check_proc_age
- check_redis_health
- check_rofs
- check_systemd_failed
- check_systemd_service
Role Variables
| Variable | Default | Related Check | Description |
|---|---|---|---|
| nrpe_allowed_hosts | 127.0.0.1,51.158.69.165,49.12.224.53 | NRPE Config | Allowed hosts to connect to NRPE daemon. |
| nrpe_load_warning | {{ ansible_processor_cores }} | check_load | Warning threshold for system load (1min, 5min, 15min). |
| nrpe_load_critical | {{ ansible_processor_cores * 2 }} | check_load | Critical threshold for system load. |
| nrpe_check_total_procs_warning | 500 | check_procs | Warning threshold for total processes count. |
| nrpe_check_total_procs_critical | 800 | check_procs | Critical threshold for total processes count. |
| nrpe_check_zombie_procs_warning | 5 | check_procs | Warning threshold for zombie processes. |
| nrpe_check_zombie_procs_critical | 10 | check_procs | Critical threshold for zombie processes. |
| nrpe_disk_usage_warning | 80 | check_disk_usage | Warning threshold for disk usage (%). |
| nrpe_disk_usage_critical | 90 | check_disk_usage | Critical threshold for disk usage (%). |
| nrpe_disk_inode_warning | 80 | check_disk_usage | Warning threshold for inode usage (%). |
| nrpe_disk_inode_critical | 90 | check_disk_usage | Critical threshold for inode usage (%). |
| nrpe_memory_warning | 80 | check_memory | Warning threshold for memory usage (%). |
| nrpe_memory_critical | 90 | check_memory | Critical threshold for memory usage (%). |
| nrpe_swap_warning | 70 | check_swap | Warning threshold for swap usage (%). |
| nrpe_swap_critical | 80 | check_swap | Critical threshold for swap usage (%). |
| nrpe_mailq_warning | 10 | check_postfix_mailqueue, check_exim_mailqueue | Warning threshold for mail queue size. |
| nrpe_mailq_critical | 20 | check_postfix_mailqueue, check_exim_mailqueue | Critical threshold for mail queue size. |
| nrpe_smtp_host | localhost | check_smtp | Host to check for SMTP service. |
| nrpe_bandwidth_warning | 12M | check_eth | Warning threshold for bandwidth usage. |
| nrpe_bandwidth_critical | 15M | check_eth | Critical threshold for bandwidth usage. |
| nrpe_postgresql_host | localhost | check_postgresql | PostgreSQL host. |
| nrpe_postgresql_port | 5432 | check_postgresql | PostgreSQL port. |
| nrpe_postgresql_user | nagios | check_postgresql | PostgreSQL user. |
| nrpe_postgresql_password | changeme_ | check_postgresql | PostgreSQL password. |
| nrpe_postgresql_backend_warning | 75 | check_postgresql | Warning threshold for backend connections (%). |
| nrpe_postgresql_backend_critical | 90 | check_postgresql | Critical threshold for backend connections (%). |
| nrpe_mysql_host | localhost | check_mysql_longqueries | MySQL host. |
| nrpe_mysql_user | nagios | check_mysql_longqueries | MySQL user. |
| nrpe_mysql_password | changeme_ | check_mysql_longqueries | MySQL password. |
| nrpe_mysql_longqueries_warning | 600 | check_mysql_longqueries | Warning threshold for long running queries (seconds). |
| nrpe_mysql_longqueries_critical | 1200 | check_mysql_longqueries | Critical threshold for long running queries (seconds). |
| nrpe_proc_age_warning | 400 | check_proc_age | Warning threshold for process age (seconds). |
| nrpe_proc_age_critical | 600 | check_proc_age | Critical threshold for process age (seconds). |
| nrpe_redis_memory_warning | 80 | check_redis_health | Warning threshold for Redis memory usage (%). |
| nrpe_redis_memory_critical | 90 | check_redis_health | Critical threshold for Redis memory usage (%). |
| nrpe_redis_connected_clients_warning | 200 | check_redis_health | Warning threshold for connected clients. |
| nrpe_redis_connected_clients_critical | 500 | check_redis_health | Critical threshold for connected clients. |
| nrpe_redis_hitrate_warning | 80 | check_redis_health | Warning threshold for cache hit rate (%). |
| nrpe_redis_hitrate_critical | 50 | check_redis_health | Critical threshold for cache hit rate (%). |
| nrpe_redis_fragments_warning | 1.5 | check_redis_health | Warning threshold for fragmentation ratio. |
| nrpe_redis_fragments_critical | 2.0 | check_redis_health | Critical threshold for fragmentation ratio. |
| nrpe_redis_replication_lag_warning | 10 | check_redis_health | Warning threshold for replication lag (seconds). |
| nrpe_redis_replication_lag_critical | 60 | check_redis_health | Critical threshold for replication lag (seconds). |
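
The variables above can also be set per inventory group rather than in the playbook. The following is a minimal sketch, not part of the role itself: the file path, the group name, and the `vault_nrpe_postgresql_password` variable are assumptions chosen for illustration. It shows one way to replace the `changeme_` default password with a value kept in a separate file encrypted with `ansible-vault`, alongside a few threshold overrides.

```yaml
# group_vars/database_servers.yml
# Hypothetical path and group name; adjust to your inventory layout.

# PostgreSQL connection settings used by check_postgresql.
nrpe_postgresql_host: localhost
nrpe_postgresql_port: 5432
nrpe_postgresql_user: nagios

# Assumption: vault_nrpe_postgresql_password is defined in a separate
# vault-encrypted vars file. The name is illustrative, not defined by the role.
nrpe_postgresql_password: "{{ vault_nrpe_postgresql_password }}"

# Alert slightly earlier than the defaults (75 / 90) on backend connections.
nrpe_postgresql_backend_warning: 70
nrpe_postgresql_backend_critical: 85

# For the hit rate, the critical default (50) is lower than the warning
# default (80), so keep critical below warning when overriding.
nrpe_redis_hitrate_warning: 85
nrpe_redis_hitrate_critical: 60
```

Keeping credentials in a vaulted file means the plain-text defaults never need to appear in the playbook or in version control.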
Example Playbooks
Basic Usage
---
- hosts: all
  roles:
    - nrpe
Custom Configuration
---
- hosts: database_servers
  roles:
    - role: nrpe
      vars:
        nrpe_allowed_hosts: '127.0.0.1,10.0.0.5'
        nrpe_load_warning: 2
        nrpe_load_critical: 4
        nrpe_memory_warning: 75
        nrpe_memory_critical: 85
        nrpe_disk_usage_warning: 70
        nrpe_disk_usage_critical: 85
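
The load thresholds default to expressions based on the `ansible_processor_cores` fact, which typically reports cores per physical processor rather than the total number of logical CPUs. On multi-socket or hyper-threaded hosts, a threshold based on `ansible_processor_vcpus` may be a better fit. The sketch below assumes that fact is available on the target hosts and that fact gathering is enabled. It is one possible override, not the role's own behaviour.

```yaml
---
- hosts: all
  gather_facts: true  # the expressions below rely on gathered facts
  roles:
    - role: nrpe
      vars:
        # Assumption: ansible_processor_vcpus is populated on these hosts.
        # Warn when load reaches the logical CPU count, go critical at twice that.
        nrpe_load_warning: "{{ ansible_processor_vcpus }}"
        nrpe_load_critical: "{{ ansible_processor_vcpus * 2 }}"
```

The Jinja2 expressions are quoted because a YAML value that starts with `{{` would otherwise be parsed as a mapping.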
License
MIT