README - add documentation
README.md
@@ -1,2 +1,151 @@
# NRPE Ansible Role

This Ansible role installs and configures NRPE plugins for monitoring various system and service metrics.

## Features

- Deploys custom NRPE checks
- Configures sudoers for checks requiring root privileges

## Supported Services

- load
- memory
- disk usage
- read-only filesystems
- network bandwidth
- DNS
- Docker
- Exim mail queue
- Postfix mail queue
- needrestart
- process age & zombies
- specific systemd services
- failed systemd services
- MySQL
- PostgreSQL
- Redis
- Kubernetes
  - etcd health
  - API server access
  - deployments
  - jobs & cronjobs
  - PKI certs
  - pod restarts
  - PV & PVC
  - ReplicaSets
- RAID
  - mdadm
  - 3ware

## Available Checks

The following checks are deployed to `/usr/lib/nagios/plugins/` (or to the configured plugin path):

- `check_3ware`
- `check_cilium_health`
- `check_coredns_health`
- `check_disk_usage`
- `check_dns`
- `check_docker`
- `check_etcd_health`
- `check_eth`
- `check_exim_mailqueue`
- `check_k8s_apiserver_access`
- `check_k8s_deployments`
- `check_k8s_jobs_cronjobs`
- `check_k8s_pki_certs`
- `check_k8s_pod_restarts`
- `check_k8s_pv_pvc`
- `check_k8s_replicasets`
- `check_mdadm`
- `check_memory`
- `check_mysql_longqueries`
- `check_needrestart`
- `check_postfix_mailqueue`
- `check_postgresql`
- `check_proc_age`
- `check_redis_health`
- `check_rofs`
- `check_systemd_failed`
- `check_systemd_service`

## Role Variables

| Variable | Default | Related Check | Description |
|----------|---------|---------------|-------------|
| `nrpe_allowed_hosts` | `127.0.0.1,51.158.69.165,49.12.224.53` | NRPE Config | Comma-separated list of hosts allowed to connect to the NRPE daemon. |
| `nrpe_load_warning` | `{{ ansible_processor_cores }}` | `check_load` | Warning threshold for system load (1min, 5min, 15min). |
| `nrpe_load_critical` | `{{ ansible_processor_cores * 2 }}` | `check_load` | Critical threshold for system load. |
| `nrpe_check_total_procs_warning` | `500` | `check_procs` | Warning threshold for total process count. |
| `nrpe_check_total_procs_critical` | `800` | `check_procs` | Critical threshold for total process count. |
| `nrpe_check_zombie_procs_warning` | `5` | `check_procs` | Warning threshold for zombie processes. |
| `nrpe_check_zombie_procs_critical` | `10` | `check_procs` | Critical threshold for zombie processes. |
| `nrpe_disk_usage_warning` | `80` | `check_disk_usage` | Warning threshold for disk usage (%). |
| `nrpe_disk_usage_critical` | `90` | `check_disk_usage` | Critical threshold for disk usage (%). |
| `nrpe_disk_inode_warning` | `80` | `check_disk_usage` | Warning threshold for inode usage (%). |
| `nrpe_disk_inode_critical` | `90` | `check_disk_usage` | Critical threshold for inode usage (%). |
| `nrpe_memory_warning` | `80` | `check_memory` | Warning threshold for memory usage (%). |
| `nrpe_memory_critical` | `90` | `check_memory` | Critical threshold for memory usage (%). |
| `nrpe_swap_warning` | `70` | `check_swap` | Warning threshold for swap usage (%). |
| `nrpe_swap_critical` | `80` | `check_swap` | Critical threshold for swap usage (%). |
| `nrpe_mailq_warning` | `10` | `check_postfix_mailqueue`, `check_exim_mailqueue` | Warning threshold for mail queue size. |
| `nrpe_mailq_critical` | `20` | `check_postfix_mailqueue`, `check_exim_mailqueue` | Critical threshold for mail queue size. |
| `nrpe_smtp_host` | `localhost` | `check_smtp` | Host to check for SMTP service. |
| `nrpe_bandwidth_warning` | `12M` | `check_eth` | Warning threshold for bandwidth usage. |
| `nrpe_bandwidth_critical` | `15M` | `check_eth` | Critical threshold for bandwidth usage. |
| `nrpe_postgresql_host` | `localhost` | `check_postgresql` | PostgreSQL host. |
| `nrpe_postgresql_port` | `5432` | `check_postgresql` | PostgreSQL port. |
| `nrpe_postgresql_user` | `nagios` | `check_postgresql` | PostgreSQL user. |
| `nrpe_postgresql_password` | `changeme_` | `check_postgresql` | PostgreSQL password. |
| `nrpe_postgresql_backend_warning` | `75` | `check_postgresql` | Warning threshold for backend connections (%). |
| `nrpe_postgresql_backend_critical` | `90` | `check_postgresql` | Critical threshold for backend connections (%). |
| `nrpe_mysql_host` | `localhost` | `check_mysql_longqueries` | MySQL host. |
| `nrpe_mysql_user` | `nagios` | `check_mysql_longqueries` | MySQL user. |
| `nrpe_mysql_password` | `changeme_` | `check_mysql_longqueries` | MySQL password. |
| `nrpe_mysql_longqueries_warning` | `600` | `check_mysql_longqueries` | Warning threshold for long-running queries (seconds). |
| `nrpe_mysql_longqueries_critical` | `1200` | `check_mysql_longqueries` | Critical threshold for long-running queries (seconds). |
| `nrpe_proc_age_warning` | `400` | `check_proc_age` | Warning threshold for process age (seconds). |
| `nrpe_proc_age_critical` | `600` | `check_proc_age` | Critical threshold for process age (seconds). |
| `nrpe_redis_memory_warning` | `80` | `check_redis_health` | Warning threshold for Redis memory usage (%). |
| `nrpe_redis_memory_critical` | `90` | `check_redis_health` | Critical threshold for Redis memory usage (%). |
| `nrpe_redis_connected_clients_warning` | `200` | `check_redis_health` | Warning threshold for connected clients. |
| `nrpe_redis_connected_clients_critical` | `500` | `check_redis_health` | Critical threshold for connected clients. |
| `nrpe_redis_hitrate_warning` | `80` | `check_redis_health` | Warning threshold for cache hit rate (%). |
| `nrpe_redis_hitrate_critical` | `50` | `check_redis_health` | Critical threshold for cache hit rate (%). |
| `nrpe_redis_fragments_warning` | `1.5` | `check_redis_health` | Warning threshold for fragmentation ratio. |
| `nrpe_redis_fragments_critical` | `2.0` | `check_redis_health` | Critical threshold for fragmentation ratio. |
| `nrpe_redis_replication_lag_warning` | `10` | `check_redis_health` | Warning threshold for replication lag (seconds). |
| `nrpe_redis_replication_lag_critical` | `60` | `check_redis_health` | Critical threshold for replication lag (seconds). |

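Because the database password variables default to the placeholder `changeme_`, they will usually need overriding. One possible approach, sketched here under assumptions, is to set them in group variables and pull the real values from an `ansible-vault` encrypted file. The file path, the `database_servers` group name, and the `vault_*` variable names are hypothetical and not part of this role:

```yaml
# group_vars/database_servers/vars.yml (hypothetical path and group name)
nrpe_postgresql_user: nagios
# vault_* variables are assumed to be defined in an ansible-vault
# encrypted file next to this one; the role does not provide them.
nrpe_postgresql_password: "{{ vault_nrpe_postgresql_password }}"
nrpe_mysql_user: nagios
nrpe_mysql_password: "{{ vault_nrpe_mysql_password }}"
```
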
## Example Playbooks

### Basic Usage

```yaml
---
- hosts: all
  roles:
    - nrpe
```

### Custom Configuration

```yaml
---
- hosts: database_servers
  roles:
    - role: nrpe
      vars:
        nrpe_allowed_hosts: '127.0.0.1,10.0.0.5'
        nrpe_load_warning: 2
        nrpe_load_critical: 4
        nrpe_memory_warning: 75
        nrpe_memory_critical: 85
        nrpe_disk_usage_warning: 70
        nrpe_disk_usage_critical: 85
```

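The same pattern should apply to the service-specific thresholds. The sketch below overrides some of the Redis variables listed under Role Variables; the `cache_servers` inventory group and the chosen values are illustrative assumptions. The defaults (warning `80`, critical `50`) suggest that the hit-rate check alerts when the value drops, so its critical threshold is kept below its warning threshold here:

```yaml
---
- hosts: cache_servers  # hypothetical inventory group
  roles:
    - role: nrpe
      vars:
        nrpe_redis_memory_warning: 70
        nrpe_redis_memory_critical: 85
        # Hit rate: lower is worse, so critical < warning (as in the defaults)
        nrpe_redis_hitrate_warning: 90
        nrpe_redis_hitrate_critical: 70
        nrpe_redis_replication_lag_warning: 5
        nrpe_redis_replication_lag_critical: 30
```
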
## License

MIT