Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com> Co-authored-by: Nils Magnus <magnus@linuxtag.org> Co-committed-by: Nils Magnus <magnus@linuxtag.org>
Monitoring coverage
Monitoring the cloud services of the OTC (which we call monitoring environments) from within the OTC is convenient and effective most of the time. In corner cases, however, the servers performing the actual monitoring (which we call monitoring zones) should also include external zones. Who monitors whom (and how) can be configured in a matrix definition:
https://github.com/opentelekomcloud-infra/stackmon-config/blob/main/config.yaml
Monitoring Environments
These targets are covered by the SD2 monitoring setup and are displayed in separate tabs (or on separate pages for the Swisscloud):
- eu-de,
- eu-nl, and
- eu-ch2 (Swisscloud).
Monitoring Zones
From these zones the monitoring probes are sent to the targets:
- Inside OTC (eu-de, eu-ch2)
- Outside OTC (Swisscloud)
Scope of monitoring
The SD2 is a special application of the more generic Stackmon project and utilizes several plugins to collect its metrics:
- HTTP-GET queries are sent to service API endpoints:
  - applies to all services from the service catalog,
  - multiple GET queries may be configured per service.
- Static Resources
  - not yet implemented in SD2 (projected for 1Q2024),
  - specific services,
  - availability of the resource or resource functionality.
- Global resources
  - not yet implemented in SD2 (projected for 2024),
  - OTC console,
  - OTC helpcenter,
  - OTC community portal,
  - OTC public website.
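The HTTP GET probing described in the first bullet can be illustrated with a minimal sketch. It is not the actual plugin code of Stackmon; the names `ProbeResult` and `probe`, the `opener` parameter, and the default timeout are assumptions made for illustration only. The sketch sends one GET request and records the status code and the latency:

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ProbeResult:
    url: str
    status: Optional[int]  # HTTP status code, or None if no response arrived
    seconds: float         # wall-clock duration of the request


def probe(url: str, timeout: float = 5.0,
          opener: Callable = urllib.request.urlopen) -> ProbeResult:
    """Send one GET request and record status code and latency.

    `opener` is a hypothetical hook that allows replacing the real
    network call, for example in tests.
    """
    start = time.monotonic()
    try:
        with opener(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as err:
        # A 4xx/5xx answer is still a response from the service.
        status = err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout and similar errors.
        status = None
    return ProbeResult(url, status, time.monotonic() - start)
```

Since several GET queries may be configured per service, a caller would typically invoke `probe` once per configured URL and store each `ProbeResult` as a separate metric.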
Example configuration of the monitoring matrix and covered services:
# Mapping of environments to test projects
- env: production_eu-de
  monitoring_zone: eu-de
  db_entry: apimon.apimon
  plugins:
    - name: apimon
      schedulers_inventory_group_name: schedulers
      executors_inventory_group_name: executors
      tests_project: apimon
      tasks:
        - scenario1_token.yaml
    - name: epmon
      epmon_inventory_group_name: epmon_de
      cloud_name: production_eu-de # env in zone has few creds. We need to pick one
      config_elements:
        - antiddos
        - antiddos_skip_bad_type
        - as
        - as_skip_v1
        - bms_skip
        - cce_skip_unver
        - cce
        - ces
        - ces_skip_v1
        - compute
        - css
        - cts_skip_unver
        - cts
        - data_protect_skip
        - database_skip
        - dcs
        - dcs_skip_v1
        - dds
        - deh
        - dis_skip_unver
        - dis
        - dms
        - dms_skip_v2
        - dns
        - dws
        - dws_skip_v1
        - identity
        - image
        - kms_skip_unver
        - kms
        - mrs
        - nat
        - network
        - object_skip
        - object_store
        - orchestration
        - rds_skip_unver
        - rds_skip_v1
        - rds
        - sdrs
        - sfsturbo
        - share
        - smn
        - smn_skip_v2
        - volume_skip_v2
        - volume
- env: production_eu-nl
  monitoring_zone: eu-de
  db_entry: apimon.apimon
  plugins:
    - name: apimon
      schedulers_inventory_group_name: schedulers
      executors_inventory_group_name: executors
      #epmons_inventory_group_name: epmons
      tests_project: apimon
      tasks:
        - scenario1_token.yaml
    - name: epmon
      epmon_inventory_group_name: epmon_de
      cloud_name: production_eu-nl # env in zone has few creds. We need to pick one
      config_elements:
        - antiddos
        - antiddos_skip_bad_type
        - as
        - as_skip_v1
        - bms_skip
        - cce_skip_unver
        - cce
        - ces
        - ces_skip_v1
        - compute
        - css
        - cts_skip_unver
        - cts
        - data_protect_skip
        - database_skip
        - dcs
        - dcs_skip_v1
        - dds
        - deh
        - dis_skip_unver
        - dis
        - dms
        - dms_skip_v2
        - dns
        - dws
        - dws_skip_v1
        - identity
        - image
        - kms_skip_unver
        - kms
        - mrs
        - nat
        - network
        - object_skip
        - object_store
        - orchestration
        - rds_skip_unver
        - rds_skip_v1
        - rds
        - sdrs
        - sfsturbo
        - share
        - smn
        - smn_skip_v2
        - volume_skip_v2
        - volume
Note that Service Managers or Engineers usually don't need to touch this configuration. Details should be negotiated with Platform Engineers.
The attribute env defines the target for monitoring (which region is to be monitored). The attribute monitoring_zone defines the source of monitoring (from which region the monitoring will be triggered).
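To make the relation between the two attributes concrete, the following sketch groups the monitored environments by the zone that probes them. It is only an illustration: the function name `targets_by_zone` is hypothetical, and the matrix is reduced to plain Python dictionaries holding the two attributes, with values taken from the example configuration.

```python
from collections import defaultdict

# Excerpt of the matrix, reduced to the two attributes discussed here.
matrix = [
    {"env": "production_eu-de", "monitoring_zone": "eu-de"},
    {"env": "production_eu-nl", "monitoring_zone": "eu-de"},
]


def targets_by_zone(entries):
    """Group monitored environments (targets) by the zone probing them (source)."""
    grouped = defaultdict(list)
    for entry in entries:
        grouped[entry["monitoring_zone"]].append(entry["env"])
    return dict(grouped)


print(targets_by_zone(matrix))
# {'eu-de': ['production_eu-de', 'production_eu-nl']}
```

In this excerpt both environments are probed from the same zone, eu-de.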
Note that this configuration covers not only the SD2 component but also the more generic Stackmon framework. The framework is plugin-based, so additional plugins can be added. Currently two plugins are enabled:
- apimon
- epmon
The ApiMon plugin triggers scenario-based Ansible playbooks that simulate customer use cases, including the creation of resources (POST requests). Currently only one scenario is enabled, covering token authorization (scenario1_token.yaml). Because the SD2 only evaluates HTTP GET metrics, other scenarios are not yet enabled. Playbooks are stored on GitHub at:
https://github.com/stackmon/apimon-tests/tree/main/playbooks
The EpMon plugin defines which service entries are used in which environment. Services that are not present in an environment have no entry in that environment's part of this configuration.
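One way to check which service entries one environment lacks compared to another is a simple set difference over the config_elements lists. The sketch below assumes the lists have already been read into Python; the function name `missing_entries` and the shortened sample lists are invented for illustration and do not reflect the real configuration.

```python
def missing_entries(reference, candidate):
    """Return config elements listed for `reference` but absent from `candidate`."""
    return sorted(set(reference) - set(candidate))


# Hypothetical, shortened lists; not taken from the real configuration.
env_a = ["compute", "dns", "mrs", "volume"]
env_b = ["compute", "dns", "volume"]

print(missing_entries(env_a, env_b))  # ['mrs']
```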