.. _test_scenarios:

==============
Test Scenarios
==============

The Executor role of each API-Monitoring environment is responsible for
executing individual jobs (scenarios). Those can be defined as Ansible
playbooks (which allows them to be pretty much anything) or in any other
executable form (such as a Python script). Since Ansible on its own can
execute nearly anything, ApiMon can do pretty much anything as well. The only
expectation is that whatever is being done produces some form of metric for
further analysis and evaluation; otherwise there is no point in monitoring.
The scenarios are collected in a `Github `_ repository and updated in real
time.

In general, the test jobs do not need to take care of generating metric data
themselves. Since the API-related tasks in the playbooks rely on the Python
OpenStack SDK (and its OTC extensions), metric data are generated
automatically by the logging interface of the SDK ('openstack_api' metrics).
Those metrics are collected by statsd and stored in the :ref:`graphite TSDB `.
Additionally, metric data are also generated by the executor service, which
collects the playbook names, results and durations ('ansible_stats' metrics)
and stores them in the :ref:`postgresql relational database `.

The playbooks with the monitoring scenarios are stored in a separate
repository on `Github `_ (the location will change with the CloudMon
replacement in the `future `_). The playbooks address the most common use
cases that end customers perform with the cloud services. The metrics
generated by the Executor are described on the :ref:`Metric Definitions `
page.

In addition to the metrics generated and captured by a playbook, ApiMon also
captures the :ref:`stdout of the execution ` and saves this log for additional
analysis to OpenStack Swift storage, where the logs are uploaded with a
configurable retention policy.
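As a rough illustration of the metric flow described above, a scenario can be
as small as the sketch below (the playbook layout and the keypair name are
illustrative only and not taken from the actual repository). Neither task
handles metrics explicitly: every API call issued by the modules is reported
by the SDK as an 'openstack_api' metric, and the executor records the playbook
name, result and duration as 'ansible_stats'::

    ---
    # Illustrative minimal scenario (not part of the ApiMon repository).
    # The SDK emits 'openstack_api' metrics for each API call these modules
    # make; the executor adds 'ansible_stats' for the playbook run itself.
    - name: Scenario example - keypair lifecycle
      hosts: localhost
      tasks:
        - name: Create keypair
          openstack.cloud.keypair:
            name: "apimon-example-keypair"
            state: present

        - name: Delete keypair
          openstack.cloud.keypair:
            name: "apimon-example-keypair"
            state: absent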
New Test Scenario introduction
==============================

As already mentioned, the playbook scenarios are stored in a separate
repository on `Github `_. Because the monitored environments differ from each
other in location, supported services, available flavors, etc., a monitoring
configuration matrix is required which defines the monitoring standard and
scope for each environment. Therefore, to enable a playbook in one of the
monitored environments (PROD EU-DE, EU-NL, PREPROD, Swisscloud), a further
update is required in the `monitoring matrix `_. This will also be subject to
change in the future once `StackMon `_ is in place.

Rules for Test Scenarios
========================

Ansible playbooks need to follow some basic regression testing principles to
ensure sustainability of the endless execution of such scenarios:

- **OpenTelekomCloud and OpenStack collections**

  - When developing test scenarios, use the available `Opentelekomcloud.Cloud `_
    or `Openstack.Cloud `_ collections for native interaction with the cloud
    from Ansible.
  - In case a feature is not supported by the collections, you can still use
    the script module and call a Python SDK script directly to invoke the
    required request towards the cloud.

- **Unique names of resources**

  - Make sure that resources do not conflict with each other and are easily
    trackable by their unique names.

- **Teardown of the resources**

  - Make sure that deletion / cleanup of the resources is triggered even if
    some of the tasks in the playbook fail (see the teardown sketch at the
    end of this page).
  - Make sure that deletion / cleanup is triggered in the right order.

- **Simplicity**

  - Do not over-complicate the test scenario.
  - Use default, auto-filled parameters wherever possible.

- **Only basic / core functions in scope of testing**

  - ApiMon is not supposed to validate full service functionality. For such
    cases there is a different team / framework within the QA responsibility.
  - Focus only on core functions which are critical for the basic operation /
    lifecycle of the service.
  - The fewer functions you use, the lower the potential failure rate of the
    running scenario, for whatever reason.

- **No hardcoding**

  - Every hardcoded parameter in a scenario can later lead to an outage of
    the scenario's run when such a parameter changes.
  - Try to obtain all such parameters dynamically from the cloud directly.

- **Special tags for combined metrics**

  - In case you want to combine multiple tasks of a playbook into a single
    custom metric, you can do so by using the tags parameter on the tasks.

Custom metrics in Test Scenarios
================================

The OpenStack SDK and otcextensions (otcextensions covers services which are
out of scope of the OpenStack SDK and extends its functionality with the
services provided by OTC) support metric generation natively for every single
API call, and the ApiMon executor supports collection of Ansible playbook
statistics, so every single scenario and task can store its result, duration
and name in the metric database. But in some cases there is a need to measure
multiple tasks which together represent an important aspect of the customer
use case, for example the time and overall result from the VM deployment
until a successful login via SSH. Single task results are stored as metrics
in the metric database, but it would be complicated to move the metric
processing logic to Grafana. Therefore the tags feature on task level
introduces the possibility to define custom metrics.

In the following example (snippet from `scenario2_simple_ece.yaml `_) a custom
metric stores the result of multiple tasks under the special metric name
create_server::

    - name: Create Server in default AZ
      openstack.cloud.server:
        auto_ip: false
        name: "{{ test_server_fqdn }}"
        image: "{{ test_image }}"
        flavor: "{{ test_flavor }}"
        key_name: "{{ test_keypair_name }}"
        network: "{{ test_network_name }}"
        security_groups: "{{ test_security_group_name }}"
      tags:
        - "metric=create_server"
        - "az=default"
      register: server

    - name: get server id
      set_fact:
        server_id: "{{ server.id }}"

    - name: Attach FIP
      openstack.cloud.floating_ip:
        server: "{{ server_id }}"
      tags:
        - "metric=create_server"
        - "az=default"

    - name: get server info
      openstack.cloud.server_info:
        server: "{{ server_id }}"
      register: server
      tags:
        - "metric=create_server"
        - "az=default"

    - set_fact:
        server_ip: "{{ server['openstack_servers'][0]['public_v4'] }}"
      tags:
        - "metric=create_server"
        - "az=default"

    - name: find servers by name
      openstack.cloud.server_info:
        server: "{{ test_server_fqdn }}"
      register: servers
      tags:
        - "metric=create_server"
        - "az=default"

    - name: Debug server info
      debug:
        var: servers

    # Wait for the server to really start and become accessible
    - name: Wait for SSH port to become active
      wait_for:
        port: 22
        host: "{{ server_ip }}"
        timeout: 600
      tags: ["az=default", "service=compute", "metric=create_server"]

    - name: Try connecting
      retries: 10
      delay: 1
      command: "ssh -o 'UserKnownHostsFile=/dev/null' -o 'StrictHostKeyChecking=no' linux@{{ server_ip }} -i ~/.ssh/{{ test_keypair_name }}.pem"
      tags: ["az=default", "service=compute", "metric=create_server"]
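Resource teardown example
=========================

Coming back to the teardown rule above: cleanup has to run even when earlier
tasks in the scenario fail. A minimal sketch of one way to achieve this with
Ansible's block / always keywords is shown below; the keypair name is
illustrative only and the snippet is not taken from the actual repository::

    # Sketch of a teardown-safe task layout (illustrative only). The tasks
    # in 'always' run regardless of whether the tasks in 'block' succeed,
    # so the created resource is cleaned up even after a failure.
    - name: Keypair lifecycle with guaranteed cleanup
      block:
        - name: Create keypair
          openstack.cloud.keypair:
            name: "apimon-teardown-example"
            state: present

        - name: Scenario tasks using the keypair go here
          debug:
            msg: "scenario body"

      always:
        - name: Delete keypair
          openstack.cloud.keypair:
            name: "apimon-teardown-example"
            state: absent
          ignore_errors: true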