docsportal/doc/source/internal/sd2_training/status_dashboard_frontend.rst
Hasko, Vladimir f114248cfb adding SD2 training content
Reviewed-by: Gode, Sebastian <sebastian.gode@t-systems.com>
Co-authored-by: Hasko, Vladimir <vladimir.hasko@t-systems.com>
Co-committed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com>
2023-10-04 10:07:42 +00:00

Status Dashboard Frontend

Status Dashboard provides status information for OTC cloud services across different regions.

The following features are supported on Status Dashboard:

  • Service health with 5 service statuses
  • Authentication via OpenID Connect
  • Service categories - meta grouping of services into groups
  • Regions - services can be available in different regions
  • Incidents - entries about issues affecting certain regions and certain services
  • Support of all OTC environments
  • Built-in API support
  • RSS notifications
  • SLA view of all services
  • Incident history

Two Status Dashboard portals are available:

  • public status dashboard: https://status.cloudmon.eco.tsi-dev.otc-service.com/
  • hybrid status dashboard: https://status-ch2.cloudmon.eco.tsi-dev.otc-service.com/

Service Health View

image

From an architecture point of view, Status Dashboard is a Flask-based web server that serves the API and renders web content, with PostgreSQL as the database. The source code can be found at https://github.com/stackmon/status-dashboard

The configuration of the Status Dashboard frontend is located on GitHub: https://github.com/opentelekomcloud-infra/stackmon-config/blob/main/sdb_prod/catalog.yaml

The catalog YAML file contains the definitions of service names, service types, service categories and regions.

Example of the Auto Scaling service entries in the SD catalog:

- attributes:
    category: Compute
    region: EU-DE
    type: as
  name: Auto Scaling
- attributes:
    category: Compute
    region: EU-NL
    type: as
  name: Auto Scaling
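The catalog excerpt above defines one entry per service and region. As a minimal sketch of how such entries could be grouped, the following Python snippet collects the regions of each service and the services of each category. It assumes the YAML has already been parsed into a list of dictionaries; the ``CATALOG`` constant and both helper functions are hypothetical names used for illustration only and are not taken from the Status Dashboard source.

```python
from collections import defaultdict

# Hypothetical in-memory form of the catalog excerpt above, i.e. the list of
# dictionaries a YAML parser would return for those two entries.
CATALOG = [
    {"attributes": {"category": "Compute", "region": "EU-DE", "type": "as"},
     "name": "Auto Scaling"},
    {"attributes": {"category": "Compute", "region": "EU-NL", "type": "as"},
     "name": "Auto Scaling"},
]


def regions_by_service(catalog):
    """Map (service type, service name) to the sorted regions it is defined in."""
    regions = defaultdict(set)
    for entry in catalog:
        attrs = entry["attributes"]
        regions[(attrs["type"], entry["name"])].add(attrs["region"])
    return {key: sorted(value) for key, value in regions.items()}


def services_by_category(catalog):
    """Map each service category to the sorted service names it contains."""
    services = defaultdict(set)
    for entry in catalog:
        services[entry["attributes"]["category"]].add(entry["name"])
    return {key: sorted(value) for key, value in services.items()}


if __name__ == "__main__":
    print(regions_by_service(CATALOG))
    print(services_by_category(CATALOG))
```

Because the same service appears once per region, a new region for an existing service is expressed by adding another entry with the same name and type and a different region attribute.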

SLA view

The SLA view (https://status.cloudmon.eco.tsi-dev.otc-service.com/sla) is calculated only from the "outage" service health status and provides a 6-month SLA history for each service.
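To illustrate what an SLA value derived only from outages could look like, the sketch below computes a monthly availability percentage from a list of outage intervals. The formula (share of the month not covered by outages), the function names and the assumption that outage intervals do not overlap are illustrative assumptions; the actual calculation used by Status Dashboard may differ.

```python
from datetime import datetime


def month_bounds(year, month):
    """Return the first instant of the month and of the following month."""
    start = datetime(year, month, 1)
    if month == 12:
        end = datetime(year + 1, 1, 1)
    else:
        end = datetime(year, month + 1, 1)
    return start, end


def monthly_availability(outages, year, month):
    """Return the availability in percent for one calendar month.

    ``outages`` is an iterable of (start, end) datetime pairs during which
    the service had the "outage" status. Intervals are assumed not to
    overlap; other health statuses are ignored entirely.
    """
    start, end = month_bounds(year, month)
    total = (end - start).total_seconds()
    down = 0.0
    for outage_start, outage_end in outages:
        # Count only the part of the outage that falls inside the month.
        overlap_start = max(outage_start, start)
        overlap_end = min(outage_end, end)
        if overlap_end > overlap_start:
            down += (overlap_end - overlap_start).total_seconds()
    return 100.0 * (total - down) / total


if __name__ == "__main__":
    # Hypothetical three-hour outage in October 2023.
    outages = [(datetime(2023, 10, 4, 10, 0), datetime(2023, 10, 4, 13, 0))]
    print(round(monthly_availability(outages, 2023, 10), 3))
```

Calling the function once per month for the last six months would yield a history comparable in shape to the one shown in the SLA view.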

image

Details on how to work with incidents can be found on the incidents <sd2_incidents> page.