Reviewed-by: Gode, Sebastian <sebastian.gode@t-systems.com> Co-authored-by: tischrei <tino.schreiber@t-systems.com> Co-committed-by: tischrei <tino.schreiber@t-systems.com>
.. _metric_databases:

================
Metric Databases
================

Metrics are stored in two different database types:

- Graphite time series database
- PostgreSQL relational database

Graphite
========

`Graphite <https://graphiteapp.org/>`_ is an open-source, enterprise-ready
time-series database. ApiMon, EpMon, and CloudMon data are stored in the
clustered Graphite TSDB. Metrics emitted by the monitoring processes are
collected by a row of statsd processes, which aggregate them to 10-second
precision.

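
As a rough illustration of the emitting side, the sketch below formats a single
measurement in the plain-text statsd line protocol (``<bucket>:<value>|<type>``).
The helper name ``format_statsd_line``, the bucket name, and the sample value are
assumptions made for this example; they are not taken from the ApiMon code.

.. code-block:: python

    # Hypothetical helper: formats one measurement as a statsd protocol line.
    STATSD_TYPES = {"c", "ms", "g"}  # counter, timer (milliseconds), gauge


    def format_statsd_line(bucket: str, value: float, metric_type: str) -> str:
        """Return '<bucket>:<value>|<type>' for a single measurement."""
        if metric_type not in STATSD_TYPES:
            raise ValueError(f"unsupported statsd metric type: {metric_type!r}")
        return f"{bucket}:{value}|{metric_type}"


    # Assumed bucket name; statsd adds the "stats.timers." prefix when flushing.
    line = format_statsd_line(
        "openstack.api.production_regA.awx.compute.GET.server.200", 153, "ms"
    )
    print(line)

Such a line is normally sent to a statsd process as a UDP datagram; statsd then
flushes the aggregated values to Graphite at its configured interval.
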
+---------------------+-----------------------------------------------------------------------------------------------+
| Parameter           | Value                                                                                         |
+=====================+===============================================================================================+
| Grafana Datasource  | apimon-carbonapi                                                                              |
+---------------------+-----------------------------------------------------------------------------------------------+
| Database type       | time series                                                                                   |
+---------------------+-----------------------------------------------------------------------------------------------+
| Main namespace      | stats                                                                                         |
+---------------------+-----------------------------------------------------------------------------------------------+
| Metric type         | OpenStack API metrics (including otcextensions) collecting response codes, latencies, methods |
|                     | ApiMon metrics (create_cce_cluster, delete_volume_eu-de-01, etc.)                             |
|                     | Custom metrics that can be created with tags in Ansible playbooks                             |
+---------------------+-----------------------------------------------------------------------------------------------+
| Database attributes | "timers", "counters", "environment name", "monitoring location", "service", "request method", |
|                     | "resource", "response code", "result", custom metrics, etc.                                   |
+---------------------+-----------------------------------------------------------------------------------------------+
| Result of API calls | attempted                                                                                     |
|                     | passed                                                                                        |
|                     | failed                                                                                        |
+---------------------+-----------------------------------------------------------------------------------------------+

.. image:: training_images/graphite_query.jpg

All metrics are stored under the "stats" namespace. Under "stats" there are
the following important metric types:

- counters
- timers
- gauges

Counters and timers have the following subbranches:

- apimon.metric → specific apimon metrics not gathered by the OpenStack API
  methods
- openstack.api → pure API request metrics

Every subbranch has the following further branches:

- environment name (production_regA, production_regB, etc.)

- monitoring location (production_regA, awx) - specification of the environment from which the metric is gathered

openstack.api
-------------

The OpenStack metrics branch is structured as follows:

- service (normally service_type from the service catalog, but sometimes differs slightly)

- request method (GET/POST/DELETE/PUT)

- resource (service resource, e.g. server, keypair, volume, etc.). Sub-resources are joined with "_" (e.g. cluster_nodes)

- response code - received response code

- count/upper/lower/mean/etc. - timer-specific metrics (available only under stats.timers.openstack.api.$environment.$zone.$service.$request_method.$resource.$status_code.{count,mean,upper,*})
- count/rate - counter-specific metrics (available only under stats.counters.openstack.api.$environment.$zone.$service.$request_method.$resource.$status_code.{count,rate})

- attempted - counter of attempted requests (only for counters)
- failed - counter of failed requests (no response received, connection problems, etc.) (only for counters)
- passed - counter of requests that received any response (only for counters)

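
The branch layout described in this section can be sketched as a small path
builder. This is only an illustration: the function name ``openstack_api_path``
and the sample values (``compute``, ``server``) are assumptions, while the order
of the path elements follows the description above.

.. code-block:: python

    # Hypothetical helper that assembles a Graphite path for the openstack.api branch.
    def openstack_api_path(kind, environment, zone, service, method,
                           resource, status_code, stat):
        """Join the path elements in the order documented for openstack.api."""
        if kind not in ("timers", "counters"):
            raise ValueError("kind must be 'timers' or 'counters'")
        parts = ["stats", kind, "openstack", "api", environment, zone,
                 service, method, resource, str(status_code), stat]
        return ".".join(parts)


    # Mean latency of successful GET requests on the (assumed) compute/server resource.
    print(openstack_api_path("timers", "production_regA", "awx",
                             "compute", "GET", "server", 200, "mean"))

Any element can be replaced with a Graphite wildcard such as ``*`` to match all
values at that level, for example all request methods or all response codes.
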
apimon.metric
-------------

- metric name (e.g. create_cce_cluster, delete_volume_eu-de-01, etc.) - complex metrics branch

- attempted/failed/failedignored/passed/skipped - counters for the corresponding operation results (this branch element represents the status of the corresponding Ansible task)

- $az - some metrics have the availability zone of the operation at this level. Since this information is not always available, this is a varying path

- curl - subtree for the curl type of metrics

- $name - short name of the host to be checked

The following metric paths are available:

- stats.timers.apimon.metric.$environment.$zone.**csm_lb_timings**.{public,private}.{http,https,tcp}.$az.__VALUE__ - timer values for the loadbalancer test
- stats.counters.apimon.metric.$environment.$zone.**csm_lb_timings**.{public,private}.{http,https,tcp}.$az.{attempted,passed,failed} - counter values for the loadbalancer test
- stats.timers.apimon.metric.$environment.$zone.**curl**.$host.{passed,failed}.__VALUE__ - timer values for the curl test
- stats.counters.apimon.metric.$environment.$zone.**curl**.$host.{attempted,passed,failed} - counter values for the curl test
- stats.timers.apimon.metric.$environment.$zone.**dns**.$ns_name.$host - timer values for the NS lookup test. $ns_name is the DNS server used to query the records
- stats.counters.apimon.metric.$environment.$zone.**dns**.$ns_name.$host.{attempted,passed,failed} - counter values for the NS lookup test

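
To show how such a path could be queried outside Grafana, the sketch below
builds a URL for the Graphite render API (``/render`` with the ``target``,
``from`` and ``format`` parameters). The base URL ``graphite.example.org`` and
the function name ``build_render_url`` are placeholders chosen for this
example, and it is an assumption that the endpoint behind the
``apimon-carbonapi`` datasource accepts the same request format.

.. code-block:: python

    from urllib.parse import urlencode


    def build_render_url(base_url: str, target: str, time_from: str = "-1h") -> str:
        """Build a Graphite render API URL that returns JSON for one target."""
        query = urlencode({"target": target, "from": time_from, "format": "json"})
        return f"{base_url.rstrip('/')}/render?{query}"


    # Passed and failed curl counters for all hosts, seen from the "awx" location.
    url = build_render_url(
        "https://graphite.example.org",  # placeholder host
        "stats.counters.apimon.metric.production_regA.awx.curl.*.{passed,failed}",
    )
    print(url)

The wildcard ``*`` and the ``{passed,failed}`` value list are expanded by
Graphite itself; ``urlencode`` only percent-encodes them for transport.
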
PostgreSQL
==========

The relational database stores ApiMon playbook scenario results, which provide
statistics about the most common service functionalities and use cases.
Queries against this database are used mainly on the Test Results dashboard
and the service-specific statistics dashboards.

+-------------------------------+-------------------------------------------------------------------------------------------------------------+
| Parameter                     | Value                                                                                                       |
+===============================+=============================================================================================================+
| Grafana Datasource            | apimon-pg                                                                                                   |
+-------------------------------+-------------------------------------------------------------------------------------------------------------+
| Database Type                 | relational                                                                                                  |
+-------------------------------+-------------------------------------------------------------------------------------------------------------+
| Database Table                | results_summary                                                                                             |
+-------------------------------+-------------------------------------------------------------------------------------------------------------+
| Metric type                   | ApiMon playbook result statistics                                                                           |
+-------------------------------+-------------------------------------------------------------------------------------------------------------+
| Database Fields               | "timestamp", "name", "job_id", "result", "duration", "result_task"                                          |
+-------------------------------+-------------------------------------------------------------------------------------------------------------+
| result field values           | 0 - success                                                                                                 |
|                               | 1 - ?                                                                                                       |
|                               | 2 - skipped                                                                                                 |
|                               | 3 - failed                                                                                                  |
+-------------------------------+-------------------------------------------------------------------------------------------------------------+
| result_task object parameters | "timestamp", "name", "job_id", "result", "duration", "action", "environment", "zone", "anonymized_response" |
+-------------------------------+-------------------------------------------------------------------------------------------------------------+

.. image:: training_images/postgresql_query.jpg

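
The kind of aggregation behind these dashboards can be sketched as follows. The
example uses an in-memory SQLite database as a stand-in for PostgreSQL so that
it runs without a server; the column types, the scenario names, and the sample
rows are invented for illustration, and the ``result_task`` field is left out.
Only the table name, the field names, and the documented ``result`` values come
from the table above.

.. code-block:: python

    import sqlite3

    # Documented result codes; the meaning of value 1 is not documented.
    RESULT_LABELS = {0: "success", 2: "skipped", 3: "failed"}


    def summarize_results(conn):
        """Count playbook results per scenario name and result code."""
        rows = conn.execute(
            "SELECT name, result, COUNT(*) FROM results_summary "
            "GROUP BY name, result ORDER BY name, result"
        ).fetchall()
        return [(name, RESULT_LABELS.get(result, f"unknown ({result})"), count)
                for name, result, count in rows]


    # SQLite stand-in with assumed column types and made-up sample rows.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE results_summary "
        "(timestamp TEXT, name TEXT, job_id TEXT, result INTEGER, duration REAL)"
    )
    conn.executemany(
        "INSERT INTO results_summary VALUES (?, ?, ?, ?, ?)",
        [
            ("2024-01-01T10:00:00", "scenario_a", "job-1", 0, 12.5),
            ("2024-01-01T10:05:00", "scenario_a", "job-2", 0, 11.0),
            ("2024-01-01T10:10:00", "scenario_a", "job-3", 3, 30.2),
            ("2024-01-01T10:15:00", "scenario_b", "job-4", 2, 0.4),
        ],
    )
    print(summarize_results(conn))

The same ``GROUP BY`` query should carry over to PostgreSQL with little or no
change, although the real column types may differ from the ones assumed here.
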