A useful complement to your monitoring tools is the Prometheus Anomaly Detector, a service that detects anomalies in Prometheus metrics.
What does it offer? It helps the NOC and DevOps teams identify anomalies in monitoring data automatically and raises alerts based on what it detects (Smart Alerting), so the team hears about unusual behavior without having to watch every dashboard.
Manually checking every graph and data series in the monitoring system is challenging and unsustainable, whereas the Prometheus Anomaly Detector monitors system behavior continuously. Fixed-threshold alarms are also inadequate for many services, because a value that is normal for one service or time of day can be abnormal for another. The service therefore learns how each service normally behaves and adjusts the expected range used for anomaly detection accordingly.
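To make the contrast with fixed thresholds concrete, here is a minimal Python sketch. It is not the model the service uses: it swaps in a simple rolling mean and standard deviation band, and the sample values, window size, and multiplier `k` are all made up for illustration.

```python
from statistics import mean, stdev


def fixed_threshold_alerts(values, threshold):
    """Flag every point that exceeds one static threshold."""
    return [v > threshold for v in values]


def adaptive_band_alerts(values, window=5, k=3.0):
    """Flag points farther than k standard deviations from the mean
    of the preceding `window` points (a stand-in for a learned range)."""
    flags = []
    for i, v in enumerate(values):
        history = values[max(0, i - window):i]
        if len(history) < 2:
            flags.append(False)  # not enough history to estimate a band
            continue
        mu, sigma = mean(history), stdev(history)
        flags.append(abs(v - mu) > k * sigma)
    return flags


# Hypothetical request-rate samples for a low-traffic service,
# with an unusual jump at index 7.
series = [10, 11, 10, 12, 11, 10, 11, 30, 11, 10]

fixed = fixed_threshold_alerts(series, threshold=50)  # threshold sized for a busier service
adaptive = adaptive_band_alerts(series)

print("fixed threshold flags:", [i for i, f in enumerate(fixed) if f])
print("adaptive band flags:  ", [i for i, f in enumerate(adaptive) if f])
```

In this sketch the static threshold never fires because it was sized for a busier service, while the band derived from recent history flags the jump.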
Anomalies can include:
Timely detection and notification of anomalies can significantly reduce problem resolution time and improve the technical team’s performance during incidents.
The project source is available on GitHub. The Docker Compose file below makes the service easy to set up and manage. All environment variables are explained in the project’s GitHub repository, which also includes a comprehensive description of the data processing model.
version: "3.1"
services:
  pad:
    image: quay.io/aicoe/prometheus-anomaly-detector:latest
    ports:
      - 8080:8080
    environment:
      FLT_PROM_URL: "http://prometheus"
      FLT_RETRAINING_INTERVAL_MINUTES: 1
      FLT_METRICS_LIST: 'example promql query'
      APP_FILE: "app.py"
      FLT_DATA_START_TIME: "3d"
      FLT_ROLLING_TRAINING_WINDOW_SIZE: "15d"
A sample Prometheus rules file that alerts on the detector's output:
groups:
  - name: example
    rules:
      - record: job:http_inprogress_requests:sum
        expr: sum by (job) (http_inprogress_requests)
      - alert: transaction-anomaly
        expr: avg(transactions:rate5m) / avg(transactions:rate5m_prophet{value_type="yhat"}) * 100 - 100 > 10 or transactions:rate5m_prophet{value_type="anomaly"} == 1
        for: 10s
        labels:
          severity: page
        annotations:
          summary: "Anomaly detected on database transactions"
          message: "Database transaction rate is {{$value}}% higher than the normal value"
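The arithmetic in the alert expression can be traced with a short Python sketch. The function names and sample numbers are hypothetical; only the formula (actual divided by predicted, times 100, minus 100, compared against 10, or an anomaly flag equal to 1) is taken from the rule above.

```python
def percent_deviation(actual, predicted):
    """Mirror `actual / yhat * 100 - 100` from the rule expression.

    `predicted` is assumed to be non-zero.
    """
    return actual / predicted * 100 - 100


def should_alert(actual, predicted, anomaly_flag, max_deviation_pct=10):
    """Fire when the value is more than 10% above the prediction,
    or when the detector itself has marked the point as an anomaly."""
    return (
        percent_deviation(actual, predicted) > max_deviation_pct
        or anomaly_flag == 1
    )


# Hypothetical samples: (actual rate, predicted rate, anomaly flag)
samples = [(120, 100, 0), (105, 100, 0), (105, 100, 1), (60, 100, 0)]
for actual, predicted, flag in samples:
    print(actual, predicted, flag, "->", should_alert(actual, predicted, flag))
```

Note that the percentage branch only catches values above the prediction. A sharp drop, such as the last sample, triggers the alert only if the anomaly flag is set.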
These smart alarms notify you of ongoing anomalies. When building dashboards, it is advisable to display both the original metric and the anomaly detection data side by side. This lets you assess how reliable and accurate the predicted values are and see how closely the system’s current behavior follows the expected range.
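As a rough sketch of that dashboard layout, the Python helper below builds one PromQL query per panel series. The `_prophet` suffix and the `yhat` value type come from the alert rule above. The `yhat_lower` and `yhat_upper` value types are an assumption about how the detector labels its bounds, so check the metrics your deployment actually exposes before relying on them.

```python
def dashboard_queries(metric):
    """Return PromQL queries for plotting a metric next to its prediction.

    The bound labels (yhat_lower, yhat_upper) are assumed, not confirmed here.
    """
    predicted = f"{metric}_prophet"
    return {
        "actual": metric,
        "predicted": f'{predicted}{{value_type="yhat"}}',
        "lower bound": f'{predicted}{{value_type="yhat_lower"}}',
        "upper bound": f'{predicted}{{value_type="yhat_upper"}}',
    }


def within_expected_range(actual, lower, upper):
    """True when the observed value falls inside the predicted band."""
    return lower <= actual <= upper


for title, query in dashboard_queries("transactions:rate5m").items():
    print(f"{title}: {query}")
```

Plotting the four series on one panel shows at a glance whether the observed value stays inside the predicted band.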
There is far more to this topic than this service covers. Anomaly detection and AI in DevOps tooling, especially monitoring, is a broad field: many organizations have had significant success in it and have built similar products, and AI specialists can take the approach described here further.
By leveraging the Prometheus Anomaly Detector, you can elevate your monitoring capabilities, ensuring proactive incident management and optimized system performance.