.. _test_scenarios:

==============
Test Scenarios
==============

The Executor role of each API-Monitoring environment is responsible for
executing individual jobs (scenarios). Those can be defined as Ansible
playbooks (which allows them to be pretty much anything) or in any other
executable form (such as a Python script). Since Ansible itself can execute
nearly anything, ApiMon can cover almost any use case. The only expectation is
that whatever is being done produces some form of metric for further analysis
and evaluation; otherwise there is no point in monitoring it. The scenarios are
collected in a `Github repository
<https://github.com/opentelekomcloud-infra/apimon-tests>`_ and updated in
real-time. In general the test jobs do not need to take care of generating
metric data themselves. Since the API related tasks in the playbooks rely on
the Python OpenStack SDK (and its OTC extensions), metric data is generated
automatically by the logging interface of the SDK ('openstack_api' metrics).
Those metrics are collected by statsd and stored in the :ref:`graphite TSDB
<metric_databases>`.
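
Because the metrics are emitted by the SDK itself, a scenario only needs a
cloud configuration that points the SDK at the statsd daemon. The following is
a minimal sketch of such a ``clouds.yaml``; it assumes the statistics-reporting
keys of openstacksdk (``metrics`` / ``statsd``), and all names and credentials
are placeholders::

  # Sketch only: statsd reporting for openstacksdk.
  # Host, port and the cloud entry are placeholders.
  metrics:
    statsd:
      host: 127.0.0.1   # statsd daemon collecting the 'openstack_api' metrics
      port: 8125
  clouds:
    otc:                # hypothetical cloud name referenced by the scenarios
      auth:
        auth_url: "<identity-endpoint>"
        username: "<username>"
        password: "<password>"
        project_name: "<project>"
        user_domain_name: "<domain>"

With such a configuration in place, every API call issued from a playbook task
or an SDK script is reported automatically, without any extra code in the
scenario itself.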

Additionally, metric data is also generated by the executor service, which
collects the playbook names, results and durations ('ansible_stats' metrics)
and stores them in the :ref:`postgresql relational database <metric_databases>`.

The playbooks with the monitoring scenarios are stored in a separate repository
on `Github <https://github.com/opentelekomcloud-infra/apimon-tests>`_ (the
location will change with the CloudMon replacement in the `future
<https://stackmon.github.io/>`_). The playbooks address the most common use
cases that end customers conduct with the cloud services.

The metrics generated by the Executor are described on the :ref:`Metric
Definitions <metrics_definition>` page.

In addition to the metrics generated and captured by a playbook, ApiMon also
captures the :ref:`stdout of the execution <logs>` and saves this log for
further analysis to OpenStack Swift storage, where the logs are kept under a
configurable retention policy.

New Test Scenario introduction
==============================

As already mentioned, the playbook scenarios are stored in a separate
repository on `Github <https://github.com/opentelekomcloud-infra/apimon-tests>`_.
Because the monitored environments differ from each other in location,
supported services, available flavors, etc., a monitoring configuration matrix
is required which defines the monitoring standard and scope for each
environment. Therefore, to enable a playbook in one of the monitored
environments (PROD EU-DE, EU-NL, PREPROD, Swisscloud), a further update is
required in the `monitoring matrix
<https://github.com/opentelekomcloud-infra/system-config/blob/main/inventory/service/group_vars/apimon.yaml>`_.
This will also change in the future once `StackMon
<https://stackmon.github.io/>`_ is in place.
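
Conceptually the matrix maps each playbook to the environments in which it
should be scheduled. The snippet below is only a hypothetical illustration of
such a mapping (the key names and the environment identifiers are made up); the
authoritative schema is the ``apimon.yaml`` file linked above::

  # Hypothetical illustration only - the real structure is defined by
  # inventory/service/group_vars/apimon.yaml in the system-config repository.
  test_matrix:
    scenario2_simple_ecs:
      environments:
        - production_eu-de
        - production_eu-nl
        - preprod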

Rules for Test Scenarios
========================

Ansible playbooks need to follow some basic regression testing principles to
ensure the sustainability of the endless execution of such scenarios:

- **OpenTelekomCloud and OpenStack collection**

  - When developing test scenarios use the available `Opentelekomcloud.Cloud
    <https://docs.otc-service.com/ansible-collection-cloud/>`_ or
    `Openstack.Cloud
    <https://docs.ansible.com/ansible/latest/collections/openstack/cloud/index.html>`_
    collections for native interaction with the cloud in Ansible.
  - In case a feature is not supported by the collections you can still use the
    script module and call a Python SDK script directly to invoke the required
    request towards the cloud (see the sketch after this list).

- **Unique names of resources**

  - Make sure that resources don't conflict with each other and are easily
    trackable by their unique names.

- **Teardown of the resources**

  - Make sure that deletion / cleanup of the resources is triggered even if
    some of the tasks in the playbook fail.
  - Make sure that deletion / cleanup is triggered in the right order.

- **Simplicity**

  - Do not over-complicate the test scenario. Use default auto-filled
    parameters wherever possible.

- **Only basic / core functions in scope of testing**

  - ApiMon is not supposed to validate the full service functionality. For such
    cases there is a dedicated team / framework within the QA responsibility.
  - Focus only on the core functions which are critical for the basic operation
    / lifecycle of the service.
  - The fewer functions you use, the lower the potential failure rate of the
    running scenario.

- **No hardcoding**

  - Every hardcoded parameter in a scenario can lead to a failure of the
    scenario's runs in the future when such a parameter changes.
  - Try to obtain all such parameters dynamically from the cloud directly.

- **Special tags for combined metrics**

  - In case you want to combine multiple tasks of a playbook into a single
    custom metric you can do so by using the tags parameter on the tasks (see
    the section on custom metrics below).
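
The script-module fallback mentioned above could look like the sketch below.
It is illustrative only: the script name ``check_feature.py`` and its argument
are hypothetical, and the task assumes that the SDK can resolve a cloud named
``otc`` from ``clouds.yaml``::

  # Sketch: calling a Python SDK script for a feature that the Ansible
  # collections do not cover. Script name and arguments are hypothetical.
  - name: Check feature not covered by the Ansible collections
    ansible.builtin.script: "check_feature.py --cloud otc"
    args:
      executable: python3
    tags:
      - "metric=check_feature"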

Custom metrics in Test Scenarios
================================

The OpenStack SDK and otcextensions (otcextensions covers services which are
out of scope of the OpenStack SDK and extends its functionality with the
services provided by OTC) support metric generation natively for every single
API call, and the ApiMon executor supports the collection of Ansible playbook
statistics, so every single scenario and task can store its result, duration
and name in the metric database.

But in some cases there is a need to provide a measurement over multiple tasks
which together represent an important aspect of the customer use case, for
example measuring the time and the overall result from the VM deployment until
the successful login via SSH. The results of the single tasks are stored as
metrics in the metric database, but it would be complicated to move this
processing logic into Grafana. Therefore the tags feature on task level
introduces the possibility to define such custom metrics.

In the following example (a snippet from `scenario2_simple_ecs.yaml
<https://github.com/opentelekomcloud-infra/apimon-tests/blob/master/playbooks/scenario2_simple_ecs.yaml>`_)
a custom metric stores the result of multiple tasks under the special metric
name create_server::

  - name: Create Server in default AZ
    openstack.cloud.server:
      auto_ip: false
      name: "{{ test_server_fqdn }}"
      image: "{{ test_image }}"
      flavor: "{{ test_flavor }}"
      key_name: "{{ test_keypair_name }}"
      network: "{{ test_network_name }}"
      security_groups: "{{ test_security_group_name }}"
    tags:
      - "metric=create_server"
      - "az=default"
    register: server

  - name: get server id
    set_fact:
      server_id: "{{ server.id }}"

  - name: Attach FIP
    openstack.cloud.floating_ip:
      server: "{{ server_id }}"
    tags:
      - "metric=create_server"
      - "az=default"

  - name: get server info
    openstack.cloud.server_info:
      server: "{{ server_id }}"
    register: server
    tags:
      - "metric=create_server"
      - "az=default"

  - set_fact:
      server_ip: "{{ server['openstack_servers'][0]['public_v4'] }}"
    tags:
      - "metric=create_server"
      - "az=default"

  - name: find servers by name
    openstack.cloud.server_info:
      server: "{{ test_server_fqdn }}"
    register: servers
    tags:
      - "metric=create_server"
      - "az=default"

  - name: Debug server info
    debug:
      var: servers

  # Wait for the server to really start and become accessible
  - name: Wait for SSH port to become active
    wait_for:
      port: 22
      host: "{{ server_ip }}"
      timeout: 600
    tags: ["az=default", "service=compute", "metric=create_server"]

  - name: Try connecting
    retries: 10
    delay: 1
    command: "ssh -o 'UserKnownHostsFile=/dev/null' -o 'StrictHostKeyChecking=no' linux@{{ server_ip }} -i ~/.ssh/{{ test_keypair_name }}.pem"
    tags: ["az=default", "service=compute", "metric=create_server"]