Reviewed-by: gtema <artem.goncharov@gmail.com> Co-authored-by: Hasko, Vladimir <vladimir.hasko@t-systems.com> Co-committed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com>
Endpoint Monitoring overview
EpMon is a standalone Python-based process that targets every OTC service. It finds services in the service catalogs and sends GET requests to the configured endpoints.
Extensive tests such as provisioning a server give great coverage, but they usually cannot be run very often and therefore leave gaps in the monitoring timeline. To cover these gaps, the EpMon component sends GET requests to the given URLs, relying on the API discovery of the OpenStack cloud (for example, a GET request to /servers of the compute endpoint). Such requests are cheap and can be performed in a loop, e.g. every 5 seconds. The latency of those calls and their return codes are captured and sent to the metrics storage.
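The probing loop described above can be sketched roughly as follows. This is a minimal illustration, not the EpMon implementation: the function names probe and poll, the emit callback standing in for the metrics storage, and the use of the Python standard library are all assumptions made for this example.

```python
import http.client
import time
import urllib.error
import urllib.request
from typing import Callable, Iterable, Optional


def probe(url: str, timeout: float = 5.0) -> dict:
    """Send one GET request and record its return code and latency.

    Hypothetical helper; the name and result layout are assumptions.
    """
    started = time.monotonic()
    status: Optional[int] = None
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx answers still carry a return code worth recording.
        status = err.code
    except (OSError, http.client.HTTPException):
        # No usable response (refused connection, DNS failure, timeout).
        status = None
    latency_ms = (time.monotonic() - started) * 1000.0
    return {"url": url, "status": status, "latency_ms": latency_ms}


def poll(
    urls: Iterable[str],
    emit: Callable[[dict], None],
    interval: float = 5.0,
    rounds: Optional[int] = None,
) -> None:
    """Probe every URL in a loop and hand each result to emit.

    emit is a stand-in for whatever pushes data to the metrics storage.
    rounds=None means run until interrupted.
    """
    targets = list(urls)
    done = 0
    while rounds is None or done < rounds:
        for url in targets:
            emit(probe(url))
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval)
```

Calling poll(["https://example.invalid/servers"], print, interval=5.0) would print one result per target every 5 seconds; the URL here is a placeholder, not a real OTC endpoint. A status of None marks the "no response at all" case.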
Currently the EpMon configuration is located in system-config: https://github.com/opentelekomcloud-infra/system-config/blob/main/inventory/service/group_vars/apimon.yaml (this will change in the future once CloudMon is in place).
It defines the HTTP query targets for every single OTC service.
The EpMon dashboard provides the general availability status of every service definition from the service catalog.
Additionally, it provides further details for the endpoints, such as response times, detected error codes, or the absence of any response.
EpMon findings are also reported to Alerta, and notifications are sent to the dedicated Zulip topic "apimon_endpoint_monitoring".