review of training material

Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com>
Co-authored-by: Nils Magnus <magnus@linuxtag.org>
Co-committed-by: Nils Magnus <magnus@linuxtag.org>
Nils Magnus 2023-10-12 18:02:41 +00:00 committed by zuul
parent f114248cfb
commit 6e2da0d05c
6 changed files with 277 additions and 200 deletions

File diff suppressed because it is too large Load Diff



@ -1,24 +1,26 @@
=========================
Status Dashboard Frontend
=========================
===========================
Status Dashboard 2 Frontend
===========================
Status Dashboard provides the status information of OTC cloud services across different regions.
The web based frontend of the SD2 provides public (and internal,
after authentication) status information of OTC cloud services
across all configured regions. It supports these features:
The following features are supported on Status Dashboard:
- Support of service health with 5 service statuses
- Authentication by OpenID connect
- Service categories - meta grouping of services into groups
- Regions - different services are existing in regions
- Incidents - entry about issues affecting certain regions and certain services
- Displays the service health through five service statuses.
- Authentication by OpenID connect (which in turn is connected
to the OTC LDAP directory).
- Services are grouped into categories.
- Regions - different services are available in different regions.
- Incidents - entries about issues affecting certain regions and
certain services.
- Support of all OTC environments
- built-in API support
- RSS notification
- SLA view on all services
- Incident history
- Incident data is available through an API.
- RSS notification (for the OTC mobile app and other integrations).
- SLA view of the services.
- Incident history.
Two Status Dashboard portals are available:
- public status dashboard: https://status.cloudmon.eco.tsi-dev.otc-service.com/
- hybrid status dashboard: https://status-ch2.cloudmon.eco.tsi-dev.otc-service.com/
@ -27,12 +29,15 @@ Service Health View
.. image:: training_images/sd2_frontend.jpg
From an architecture point of view, Status Dashboard is a
Flask-based web server that serves an API and renders web
content, backed by a PostgreSQL database. The project source is
available at https://github.com/stackmon/status-dashboard
From the architecture POV Status Dashboard is a flask based web server serving API and rendering web content with the postgresql as database.
Source can be found at https://github.com/stackmon/status-dashboard
Configuration of the status dashboard frontend is located at github: https://github.com/opentelekomcloud-infra/stackmon-config/blob/main/sdb_prod/catalog.yaml
The catalog yaml file contains definitions of service name, service type, service categories and regions.
The configuration of the Status Dashboard frontend is located
on GitHub: https://github.com/opentelekomcloud-infra/stackmon-config/blob/main/sdb_prod/catalog.yaml
The ``catalog.yaml`` file contains definitions of the
service name, service type, service categories and regions.
Example of AutoScaling service entry in SD catalog:
@ -57,6 +62,4 @@ SLA view https://status.cloudmon.eco.tsi-dev.otc-service.com/sla is calculated o
.. image:: training_images/sd2_sla.jpg
Details how to work with incidents can be found at :ref:`incidents <sd2_incidents>` page.
Details on how to work with incidents are described on the :ref:`incidents <sd2_incidents>` page.


@ -1,7 +1,7 @@
.. _sd2_flow:
SD2 Flow Process
================
SD2 Data Flow Process
=====================
.. image:: training_images/sd2_data_flow.svg
@ -9,18 +9,22 @@ SD2 Flow Process
:alt: sd2_data_flow
#. Service squad adds new data entries in github repository for
EpMOn (service URL queries),
adjusts flag and health metrics if required,
and adds service entry in SD catalog.
#. Cloudmon fetches public configuration from GitHub
and internal configuration (credentials, certs, keys,...) from local place and generate final configuration.
#. EpMon plugin is executed and triggers HTTP requests from defined configuration
#. Metrics from HTTP requests are collected by Statsd.
#. Collected metrics are stored in time-series database Graphite.
#. Metric Processor evaluates HTTP metrics from Graphite TSDB.
and generates new flag and health metrics based on defined rules and thresholds in configuration.
#. Status Dashboard changing service health semaphore light based on resulting health metrics from Metric Procesor.
#. Grafana uses metrics and statistics databases as the data sources for the
dashboards. The dashboard with various panels show the real-time status of
the platform. Grafana supports also historical views and trends.
#. The service squad adds new data entries to the GitHub repository
for EpMon (service URL queries), adjusts flag and health
metrics if required, and adds a service entry to the SD catalog.
#. Cloudmon fetches public configuration from GitHub and internal
configuration (credentials, certs, keys, ...) from a local
repository to generate the final configuration.
#. The EpMon plugin is executed and triggers HTTP requests as defined
by the configuration.
#. Metrics resulting from the HTTP requests are collected by Statsd.
#. Collected metrics are stored in the time series database Graphite.
#. The Metric Processor evaluates HTTP metrics from Graphite TSDB
and generates new flag and health metrics based on defined
rules and thresholds in configuration.
#. Status Dashboard changes the service health semaphore based on the
resulting health metrics from the Metric Processor.
#. Grafana uses metrics and statistics databases as the data
sources for the dashboards. The dashboards, with their various
panels, show the real-time status of the platform. Grafana also
supports historical views and trends.
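
The evaluation step in the flow above (HTTP metrics in, flag and health metrics out) can be illustrated with a minimal Python sketch. It is not the actual Metric Processor implementation. The names ``HttpSample``, ``evaluate_flags`` and ``health_from_flags``, the flag names, the threshold values and the three numeric health levels are all assumptions made for illustration. The real rules and thresholds are defined in the Cloudmon configuration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class HttpSample:
    """One HTTP request result, as EpMon might report it (hypothetical shape)."""
    status_code: int
    latency_ms: float


# Hypothetical thresholds; real values come from the configuration.
SLOW_LATENCY_MS = 500.0
MAX_FAILURE_RATIO = 0.5


def evaluate_flags(samples):
    """Derive boolean flag metrics from raw HTTP samples."""
    if not samples:
        # Assumption: no data at all is treated like an unreachable service.
        return {"api_down": True, "api_slow": False}
    failures = sum(1 for s in samples if s.status_code >= 500)
    failure_ratio = failures / len(samples)
    avg_latency = mean(s.latency_ms for s in samples)
    return {
        "api_down": failure_ratio >= MAX_FAILURE_RATIO,
        "api_slow": avg_latency > SLOW_LATENCY_MS,
    }


def health_from_flags(flags):
    """Map flags to a health value: 0 = healthy, 1 = degraded, 2 = outage."""
    if flags["api_down"]:
        return 2
    if flags["api_slow"]:
        return 1
    return 0


if __name__ == "__main__":
    samples = [HttpSample(200, 120.0), HttpSample(200, 900.0), HttpSample(200, 700.0)]
    flags = evaluate_flags(samples)
    print(flags, health_from_flags(flags))
```

Separating flag evaluation from the health mapping mirrors the two kinds of derived metrics named in the list: flags record individual rule outcomes, and the health value is what would drive the semaphore on the dashboard.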