Reviewed-by: Gode, Sebastian <sebastian.gode@t-systems.com> Co-authored-by: Hasko, Vladimir <vladimir.hasko@t-systems.com> Co-committed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com>
Endpoint Monitoring overview
EpMon is a standalone Python-based process that targets every OTC service. It looks up services in the service catalog and sends GET requests to the configured endpoints.
Extensive tests, such as provisioning a server, give great coverage, but they usually cannot be run very often and therefore leave gaps on the monitoring timescale. To cover these gaps, the EpMon component sends GET requests to the given URLs, relying on the API discovery of the OpenStack cloud (for example, a GET request to /servers of the compute endpoint). Such requests are cheap and can be performed in a loop, e.g. every 5 seconds. The latency and the return code of each call are captured and sent to the metrics storage.
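The probing loop described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not EpMon's actual implementation: the names probe and poll, the injectable opener/clock parameters, and the convention of recording status 0 when no HTTP response arrives are all assumptions made for this example.

```python
import time
import urllib.error
import urllib.request


def probe(url, timeout=5.0, opener=urllib.request.urlopen, clock=time.monotonic):
    """Send one GET request and return (status_code, latency_seconds).

    Assumption for this sketch: status 0 means "no HTTP response received".
    """
    start = clock()
    try:
        with opener(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as exc:
        # 4xx/5xx answers still carry a return code worth recording.
        status = exc.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, ...
        status = 0
    return status, clock() - start


def poll(urls, interval=5.0, rounds=1, emit=print, probe_fn=probe, sleep=time.sleep):
    """Probe every URL once per round and hand each result to `emit`."""
    for i in range(rounds):
        for url in urls:
            status, latency = probe_fn(url)
            emit(url, status, latency)
        if i < rounds - 1:
            sleep(interval)
```

Calling poll(["https://compute.example.invalid/servers"], rounds=3) would probe that (hypothetical) URL three times, five seconds apart, and print the status and latency of each call. A real probe would also need an authenticated session, which is omitted here.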
The EpMon configuration is currently located in stackmon-config (https://github.com/opentelekomcloud-infra/stackmon-config/blob/main/epmon/config.yaml) and defines the HTTP query targets (URLs) for every single OTC service.
A service entry in the OTC Service Catalog (https://git.tsi-dev.otc-service.com/ecosystem/service_catalog) is a prerequisite for a service to be queried by EpMon. If the service catalog contains multiple entries for a service, obsolete entries can be marked to be skipped. The EpMon config.yaml only defines the service queries; it does not say how and when to use them. The configuration matrix that assigns them to the different monitoring sources and targets is defined in: https://github.com/opentelekomcloud-infra/stackmon-config/blob/main/config.yaml
The following example shows the autoscaling service configuration in EpMon:
    as:
      service_type: as
      sdk_proxy: auto_scaling
      urls:
        - /
        - /scaling_group
        - /scaling_configuration
        - /scaling_policy
    as_swiss:
      service_type: as
      sdk_proxy: auto_scaling
      urls:
        - /
        - /scaling_group
        - /scaling_configuration
    as_skip_v1:
      service_type: asv1
      urls: []
The example contains three entries for the autoscaling service:
- "as" is the default entry and is used for public cloud regions.
- "as_swiss" is specific to Swisscloud.
- "as_skip_v1" marks a service entry that EpMon skips.
By default, EpMon queries all entries in the service catalog.
The mandatory parameter for every entry is "service_type". It must match the service_type of the corresponding entry in the service catalog.
Another important parameter is "sdk_proxy". This attribute identifies which otcextension module is used to execute the HTTP GET queries.
The most important parameter is "urls". It defines the list of URLs that are queried for the specific service. Because the service_type is known, the full URL does not need to be specified: only the path that follows the predefined endpoint URL from the service catalog is required.
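How a configured path might be combined with the endpoint from the service catalog can be illustrated with a small helper. The function name build_url and the endpoint used below are hypothetical; how EpMon or the SDK proxy actually joins the two parts is not stated here.

```python
def build_url(endpoint, path):
    """Append a configured path to a catalog endpoint with exactly one slash.

    Both arguments are illustrative; the endpoint would come from the
    service catalog and the path from the "urls" list of the EpMon config.
    """
    return endpoint.rstrip("/") + "/" + path.lstrip("/")


# Hypothetical catalog endpoint for service_type "as".
endpoint = "https://as.example.com/autoscaling-api/v1/project"
for path in ["/", "/scaling_group"]:
    print(build_url(endpoint, path))
```

Plain string concatenation is used on purpose: urllib.parse.urljoin treats a path with a leading slash as absolute and would drop the "/autoscaling-api/v1/project" part of the endpoint.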
If a specific service (or a specific service version) should be excluded from endpoint monitoring, it must be defined in the EpMon config with the "urls" parameter set to an empty list. This ensures that even the default queries from the service catalog are overridden by the empty list in this config. In the example above, the service type asv1 (an entry from the service catalog) is not queried by EpMon at all because its urls list is empty.
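The override and skip rules can be sketched as a small resolution step. This is an illustration under stated assumptions, not EpMon's code: the function resolve_queries, the "selected" list (standing in for the choice made by the configuration matrix), the default path "/" and the catalog service type "ecs" are all hypothetical.

```python
# The example configuration from above, written as a Python dict.
EPMON_CONFIG = {
    "as": {
        "service_type": "as",
        "sdk_proxy": "auto_scaling",
        "urls": ["/", "/scaling_group", "/scaling_configuration", "/scaling_policy"],
    },
    "as_swiss": {
        "service_type": "as",
        "sdk_proxy": "auto_scaling",
        "urls": ["/", "/scaling_group", "/scaling_configuration"],
    },
    "as_skip_v1": {"service_type": "asv1", "urls": []},
}


def resolve_queries(catalog_service_types, config, selected, default_urls=("/",)):
    """Return {service_type: [paths]} for the services that should be queried.

    `selected` names the config entries chosen for one environment;
    `default_urls` is a placeholder for the catalog's default queries.
    """
    overrides = {config[name]["service_type"]: config[name]["urls"] for name in selected}
    plan = {}
    for service_type in catalog_service_types:
        urls = list(overrides.get(service_type, default_urls))
        if urls:  # an empty list means: skip this service entirely
            plan[service_type] = urls
    return plan


plan = resolve_queries(["as", "asv1", "ecs"], EPMON_CONFIG, ["as", "as_skip_v1"])
print(plan)
```

With these inputs, "as" is queried with its four configured paths, "asv1" is absent from the plan because its list is empty, and "ecs" falls back to the placeholder default since it has no entry in the config.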
The collected response codes and response times are sent to Graphite for further processing by the Metrics Processor.
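How such a measurement could reach Graphite is sketched below, assuming Graphite's plaintext protocol ("path value timestamp" lines over TCP, port 2003 by default). The transport EpMon really uses is not described here, and the metric path, host name and function names are hypothetical.

```python
import socket
import time


def format_metric(path, value, timestamp):
    """Render one sample as a Graphite plaintext line: 'path value timestamp'."""
    return f"{path} {value} {int(timestamp)}\n"


def send_metrics(lines, host, port=2003, timeout=5.0):
    """Send already formatted lines to a Graphite (carbon) plaintext listener."""
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)


# Hypothetical metric name for the latency of GET /scaling_group.
line = format_metric("stackmon.epmon.as.scaling_group.latency_ms", 42.5, time.time())
# send_metrics([line], "graphite.example.invalid")  # needs a reachable listener
```

The send call is left commented out because it needs a reachable Graphite listener; format_metric alone shows the shape of the data that would be stored.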