tischrei 0618989a8a hc_ops
Reviewed-by: Gode, Sebastian <sebastian.gode@t-systems.com>
Co-authored-by: tischrei <tino.schreiber@t-systems.com>
Co-committed-by: tischrei <tino.schreiber@t-systems.com>
2024-02-22 14:55:55 +00:00

.. _epmon_overview:

============================
Endpoint Monitoring overview
============================

EpMon is a standalone Python-based process that targets every OTC service. It
finds the services in the service catalog and sends GET requests to the
configured endpoints.

Extensive tests, such as provisioning a server, give great coverage, but they
usually cannot be run very often and therefore leave gaps on the monitoring
timescale. To cover these gaps, the EpMon component sends GET requests to the
given URLs, relying on the API discovery of the OpenStack cloud (for example,
a GET request to ``/servers`` of the compute endpoint). Such requests are cheap
and can be performed in a loop, e.g. every 5 seconds. The latency of those
calls, as well as their return codes, is captured and sent to the metrics
storage.

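
The probing loop described above can be sketched roughly as follows. This is a
minimal illustration using only the Python standard library, not EpMon's actual
implementation: the names ``http_get_status``, ``probe`` and ``run_loop``, as
well as the ``emit`` callback that stands in for the metrics storage, are
assumptions made for this example.

.. code-block:: python

   import time
   import urllib.error
   import urllib.request


   def http_get_status(url, timeout=5.0):
       """Send one GET request and return the HTTP status code."""
       try:
           with urllib.request.urlopen(url, timeout=timeout) as response:
               return response.status
       except urllib.error.HTTPError as exc:
           # A 4xx/5xx answer is still a response with a return code.
           return exc.code


   def probe(url, fetch=http_get_status, clock=time.perf_counter):
       """Probe one URL; status is None when no response was received."""
       started = clock()
       try:
           status = fetch(url)
       except OSError:
           # URLError, timeouts and connection errors derive from OSError.
           status = None
       return {"url": url, "status": status, "latency_s": clock() - started}


   def run_loop(urls, emit, interval_s=5.0, rounds=None,
                sleep=time.sleep, fetch=http_get_status):
       """Probe every URL each interval_s seconds and pass results to emit.

       emit is a hypothetical stand-in for the metrics storage client.
       rounds=None means "run forever".
       """
       done = 0
       while rounds is None or done < rounds:
           for url in urls:
               emit(probe(url, fetch=fetch))
           done += 1
           sleep(interval_s)

Calling ``run_loop(urls, print, rounds=1)`` would print one result per URL. The
``fetch`` and ``sleep`` parameters are injectable only so that the sketch can
be exercised without network access.
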
Currently, the EpMon configuration is located in system-config (this will
change in the future, once CloudMon is in place):

https://github.com/opentelekomcloud-infra/system-config/blob/main/inventory/service/group_vars/apimon.yaml

It defines the HTTP query targets for every single OTC service.

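
To illustrate how per-service query targets might be combined with the service
catalog, here is a small sketch. The data layout is purely hypothetical and does
not reflect the real schema of ``apimon.yaml``: the ``catalog`` and ``config``
mappings, the example URLs and the ``build_targets`` helper are all assumptions
made for this example.

.. code-block:: python

   from urllib.parse import urljoin


   def build_targets(catalog, config):
       """Combine catalog endpoints with configured paths into full URLs.

       catalog: hypothetical mapping of service type -> endpoint base URL.
       config:  hypothetical mapping of service type -> list of query paths.
       Returns a list of (service type, URL) tuples.
       """
       targets = []
       for service, paths in config.items():
           base = catalog.get(service)
           if base is None:
               # Service is not in the catalog, so there is nothing to query.
               continue
           # Ensure exactly one trailing slash so urljoin appends the path.
           base = base.rstrip("/") + "/"
           for path in paths:
               targets.append((service, urljoin(base, path.lstrip("/"))))
       return targets


   # Hypothetical example data, not taken from the real configuration.
   catalog = {"compute": "https://compute.example.com/v2.1"}
   config = {"compute": ["/servers", "/flavors"], "dns": ["/zones"]}

   for service, url in build_targets(catalog, config):
       print(service, url)

Services that are configured but missing from the catalog (``dns`` in this
example) are skipped here. Whether EpMon skips or reports them is not stated in
this document.
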
The EpMon dashboard provides the general availability status of every service
definition from the service catalog:

.. image:: training_images/epmon_status_dashboard.jpg

Additionally, it provides further details for the endpoints, such as response
times, detected error codes, or missing responses.

.. image:: training_images/epmon_dashboard_details.jpg

EpMon findings are also reported to Alerta, and notifications are sent to the
dedicated Zulip topic "apimon_endpoint_monitoring".