Developers Guide

Developers Guide Overview

This document describes how to develop for and contribute to Krkn. Before you start, it is recommended that you read the following documents first:

  1. Krkn Main README
  2. List of all Supported Scenarios

Be sure to properly install Krkn. Then you can start developing Krkn. The following documents will help you get started:

  1. Add k8s functionality to krkn-lib
  2. Add a New Chaos Scenario using Plugin API: Adding a new scenario into krkn
  3. Test your changes

NOTE: All base Kubernetes functionality should be added to krkn-lib and called from Krkn

Once a scenario is added to krkn, changes will also be needed in krkn-hub and krknctl. See the steps below for help editing krkn-hub and krknctl.

Questions?

For any questions or further guidance, feel free to reach out to us in the #krkn channel of the Kubernetes Slack workspace. We’re happy to assist. Now, release the Krkn!

Follow Contribution Guide

Once you’re happy with your changes, follow the contribution guide on how to create your own branch and squash your commits

1 - Krkn-lib

Krkn-lib contains the base Kubernetes Python functions

PyPI

krkn-lib

Krkn Chaos and resiliency testing tool Foundation Library

Contents

The library contains the classes, models, and helper functions used in Krkn to interact with Kubernetes, OpenShift, and other external APIs. The goal of this library is to give developers the building blocks to build new chaos scenarios and to increase the testability and modularity of the Krkn codebase.

Packages

The library is subdivided into several packages under src/krkn_lib

  • ocp: Openshift Integration
  • k8s: Kubernetes Integration
  • elastic: Collection of ElasticSearch functions for posting telemetry
  • prometheus: Collection of prometheus functions for collecting metrics and alerts
  • telemetry:
    • k8s: Kubernetes Telemetry collection and distribution
    • ocp: Openshift Telemetry collection and distribution
  • models: Krkn shared data models
    • k8s: Kubernetes objects model
    • krkn: Krkn base models
    • telemetry: Telemetry collection model
    • elastic: Elastic model for data
  • utils: common functions

Documentation and Available Functions

The library documentation of available functions is here. The documentation is automatically generated by Sphinx from the reStructuredText docstring comments present in the code.
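For instance, Sphinx builds the function reference from docstrings written in this format. The helper below is purely illustrative (it is not part of krkn-lib); it only shows the reST fields Sphinx picks up:

```python
def scenario_summary(scenario_type: str, count: int) -> str:
    """
    Build a one-line summary of a chaos run.

    Illustrative helper only, not part of krkn-lib: it demonstrates the
    reStructuredText docstring fields that Sphinx renders into the docs.

    :param scenario_type: the scenario_type string from config.yaml
    :param count: how many times the scenario was executed
    :return: a human-readable summary line
    """
    return f"{scenario_type} executed {count} time(s)"


print(scenario_summary("pod_disruption_scenarios", 2))
# prints: pod_disruption_scenarios executed 2 time(s)
```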

Installation

Git

Clone the repository

git clone https://github.com/krkn-chaos/krkn-lib
cd krkn-lib

Install the dependencies

Krkn-lib uses poetry for its dependency management and packaging. To install the required packages, run:

$ pip install poetry
$ poetry install --no-interaction

Testing your changes

To see how you can configure and test your changes, see testing changes

2 - Adding scenarios via plugin api

Scenario Plugin API:

This API enables seamless integration of Scenario Plugins for Krkn. Plugins are automatically detected and loaded by the plugin loader, provided they extend the AbstractPluginScenario abstract class, implement the required methods, and adhere to the specified naming conventions.

Plugin folder:

The plugin loader automatically loads plugins found in the krkn/scenario_plugins directory, relative to the Krkn root folder. Each plugin must reside in its own directory and can consist of one or more Python files. The entry point for each plugin is a Python class that extends the AbstractPluginScenario abstract class and implements its required methods.

__init__ file

For the plugin to be properly found by the plugin API, there needs to be an __init__.py file in the plugin’s base folder.

AbstractPluginScenario abstract class:

This abstract class defines the contract between the plugin and krkn. It consists of two methods:

  • run(...)
  • get_scenario_types()

Most IDEs, such as IntelliJ PyCharm, can automatically suggest and implement the abstract methods defined in AbstractPluginScenario.

run(...)

    def run(
        self,
        run_uuid: str,
        scenario: str,
        krkn_config: dict[str, any],
        lib_telemetry: KrknTelemetryOpenshift,
        scenario_telemetry: ScenarioTelemetry,
    ) -> int:

This method represents the entry point of the plugin and the first method that will be executed.

Parameters:

  • run_uuid:
    • the UUID of the chaos run, generated by krkn for every single run
  • scenario:
    • the config file of the scenario currently being executed
  • krkn_config:
    • the full dictionary representation of config.yaml
  • lib_telemetry:
    • a composite object of all the krkn-lib objects and methods needed by a krkn plugin to run
  • scenario_telemetry:
    • the ScenarioTelemetry object of the scenario currently being executed

Return value:

Returns 0 if the scenario succeeds and 1 if it fails.

get_scenario_types():

    def get_scenario_types(self) -> list[str]:

Indicates the scenario types handled by the plugin, as specified in the config.yaml. For the plugin to be properly loaded, recognized, and executed, this method must be implemented and must return one or more strings matching the scenario_type strings set in the config.

Naming conventions:

A key requirement for developing a plugin that will be properly loaded by the plugin loader is following the established naming conventions. These conventions are enforced to maintain a uniform and readable codebase, making it easier to onboard new developers from the community.

plugin folder:

  • the plugin folder must be placed in the krkn/scenario_plugins folder starting from the krkn root folder
  • the plugin folder cannot contain the words
    • plugin
    • scenario

plugin file name and class name:

  • the plugin file containing the main plugin class must be named in snake case and must have the suffix _scenario_plugin:
    • example_scenario_plugin.py
  • the main plugin class must be named in capital camel case and must have the suffix ScenarioPlugin:
    • ExampleScenarioPlugin
  • the file name must match the class name in the respective syntax:
    • example_scenario_plugin.py -> ExampleScenarioPlugin
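Putting the contract and the naming conventions together, a minimal plugin might look like the following sketch. The AbstractPluginScenario stub and the example_scenarios type are stand-ins so the snippet is self-contained; a real plugin imports the actual abstract class from the krkn codebase and receives real telemetry objects:

```python
from abc import ABC, abstractmethod


# Stand-in for krkn's AbstractPluginScenario so this sketch runs on its own;
# a real plugin imports the actual class from the krkn codebase instead.
class AbstractPluginScenario(ABC):
    @abstractmethod
    def run(self, run_uuid, scenario, krkn_config, lib_telemetry, scenario_telemetry) -> int: ...

    @abstractmethod
    def get_scenario_types(self) -> list[str]: ...


# File: krkn/scenario_plugins/example/example_scenario_plugin.py
# (snake_case file name matching the CapitalCamelCase class name)
class ExampleScenarioPlugin(AbstractPluginScenario):
    def run(
        self,
        run_uuid: str,
        scenario: str,
        krkn_config: dict,
        lib_telemetry,
        scenario_telemetry,
    ) -> int:
        try:
            # load the scenario config file and perform the chaos actions here
            return 0  # scenario succeeded
        except Exception:
            return 1  # scenario failed

    def get_scenario_types(self) -> list[str]:
        # must match the scenario_type string(s) used in config.yaml
        return ["example_scenarios"]
```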

scenario type:

  • the scenario type must be unique among all the scenarios.

logging:

If your new scenario does not adhere to the naming conventions, an error log will be generated in the Krkn standard output, providing details about the issue:

2024-10-03 18:06:31,136 [INFO] πŸ“£ `ScenarioPluginFactory`: types from config.yaml mapped to respective classes for execution:
2024-10-03 18:06:31,136 [INFO]   βœ… type: application_outages_scenarios ➑️ `ApplicationOutageScenarioPlugin` 
2024-10-03 18:06:31,136 [INFO]   βœ… types: [hog_scenarios, arcaflow_scenario] ➑️ `ArcaflowScenarioPlugin` 
2024-10-03 18:06:31,136 [INFO]   βœ… type: container_scenarios ➑️ `ContainerScenarioPlugin` 
2024-10-03 18:06:31,136 [INFO]   βœ… type: managedcluster_scenarios ➑️ `ManagedClusterScenarioPlugin` 
2024-10-03 18:06:31,137 [INFO]   βœ… types: [pod_disruption_scenarios, pod_network_scenario, vmware_node_scenarios, ibmcloud_node_scenarios] ➑️ `NativeScenarioPlugin` 
2024-10-03 18:06:31,137 [INFO]   βœ… type: network_chaos_scenarios ➑️ `NetworkChaosScenarioPlugin` 
2024-10-03 18:06:31,137 [INFO]   βœ… type: node_scenarios ➑️ `NodeActionsScenarioPlugin` 
2024-10-03 18:06:31,137 [INFO]   βœ… type: pvc_scenarios ➑️ `PvcScenarioPlugin` 
2024-10-03 18:06:31,137 [INFO]   βœ… type: service_disruption_scenarios ➑️ `ServiceDisruptionScenarioPlugin` 
2024-10-03 18:06:31,137 [INFO]   βœ… type: service_hijacking_scenarios ➑️ `ServiceHijackingScenarioPlugin` 
2024-10-03 18:06:31,137 [INFO]   βœ… type: cluster_shut_down_scenarios ➑️ `ShutDownScenarioPlugin` 
2024-10-03 18:06:31,137 [INFO]   βœ… type: syn_flood_scenarios ➑️ `SynFloodScenarioPlugin` 
2024-10-03 18:06:31,137 [INFO]   βœ… type: time_scenarios ➑️ `TimeActionsScenarioPlugin` 
2024-10-03 18:06:31,137 [INFO]   βœ… type: zone_outages_scenarios ➑️ `ZoneOutageScenarioPlugin`

2024-09-18 14:48:41,735 [INFO] Failed to load Scenario Plugins:

2024-09-18 14:48:41,735 [ERROR] β›” Class: ExamplePluginScenario Module: krkn.scenario_plugins.example.example_scenario_plugin
2024-09-18 14:48:41,735 [ERROR] ⚠️ scenario plugin class name must start with a capital letter, end with `ScenarioPlugin`, and cannot be just `ScenarioPlugin`.

ExampleScenarioPlugin

The ExampleScenarioPlugin class included in the tests folder can be used as scaffolding for new plugins and is considered part of the documentation.

Adding CI tests

Depending on the complexity of the new scenario, it would be much appreciated if a CI test for the scenario were added to our GitHub Actions workflow that runs on each PR.
To add a test, follow the steps in the Adding to Krkn Test Suite section.

3 - Adding New Scenario to Krkn-hub

Adding/Editing a New Scenario to Krkn-hub

  1. Create folder with scenario name under krkn-hub

  2. Create generic scenario template with environment variables

    a. See scenario.yaml for example

    b. Almost all parameters should be set using a variable (these will be set in the env.sh file or through the command line environment variables)

  3. Add defaults for any environment variables in an “env.sh” file

    a. See env.sh for example

  4. Create a run.sh script to run the chaos scenario

    a. See run.sh for example

    b. Edit line 16 with your scenario yaml template

    c. Edit lines 17 and 23 with your yaml config location

  5. Create Dockerfile template

    a. See dockerfile template for example

    b. Lines to edit

     i. 12: replace "application-outages" with your folder name
    
     ii. 14: replace "application-outages" with your folder name
    
     iii. 17: replace "application-outages" with your scenario name
    
     iv. 18: replace description with a description of your new scenario
    
  6. Add service/scenario to docker-compose.yaml file following syntax of other services

  7. Point the dockerfile parameter in your docker-compose to the Dockerfile file in your new folder

  8. Add the folder name to the list of scenarios in build.sh

  9. Update the krkn website and main README with new scenario type

NOTE:

  1. If you added any main configuration variables or new sections be sure to update config.yaml.template
  2. Similar to above, also add the default parameter values to env.sh

4 - Adding New Scenario to Krknctl

Adding Scenario to Krknctl

Adding a New Scenario to Krknctl

For krknctl to find the parameters of a scenario, it uses a krknctl input JSON file. Once this file is added to krkn-hub, krknctl will be able to find it along with the details of how to run the scenario.

Add KrknCtl Input Json

This file defines every environment variable that krkn-hub sets up as a flag on the krknctl CLI command. There are a number of different variable types you can use, each with its own required fields. See below for examples of the different variable types.

An example krknctl-input.json file can be found here

Enum Type Required Key/Values

{
    "name": "<name>",
    "short_description":"<short-description>",
    "description":"<longer-description>",
    "variable":"<variable_name>", //this needs to match environment variable in krkn-hub
    "type": "enum",
    "allowed_values": "<value>,<value>",
    "separator": ",",
    "default":"", // any default value
    "required":"<true_or_false>" // true or false if required to set when running
}

String Type Required Key/Values

{
    "name": "<name>",
    "short_description":"<short-description>",
    "description":"<longer-description>",
    "variable":"<variable_name>", //this needs to match environment variable in krkn-hub
    "type": "string",
    "default": "", // any default value
    "required":"<true_or_false>" // true or false if required to set when running
}

Number Type Required Key/Values

{
    "name": "<name>",
    "short_description": "<short-description>",
    "description": "<longer-description>",
    "variable": "<variable_name>", //this needs to match environment variable in krkn-hub
    "type": "number",  // options: enum, string, number, file, file_base64
    "default": "", // any default value
    "required": "<true_or_false>" // true or false if required to set when running
}

File Type Required Key/Values

{
    "name": "<name>",
    "short_description":"<short-description>",
    "description":"<longer-description>",
    "variable":"<variable_name>", //this needs to match environment variable in krkn-hub
    "type":"file",  
    "mount_path": "/home/krkn/<file_loc>", // file location to mount to, using /home/krkn as the base has correct read/write locations
    "required":"<true_or_false>" // true or false if required to set when running
}

File Base 64 Type Required Key/Values

{
    "name": "<name>",
    "short_description":"<short-description>",
    "description":"<longer-description>",
    "variable":"<variable_name>", //this needs to match environment variable in krkn-hub
    "type":"file_base64",  
    "required":"<true_or_false>" // true or false if required to set when running
}
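Note that the annotated templates above are not valid JSON as written: real JSON does not allow // comments, so strip them from the actual krknctl-input.json. A quick sanity check of an entry can be sketched in Python; the sample values below (name, variable, and so on) are purely illustrative:

```python
import json

# Illustrative enum-type entry using the field names from the templates above;
# the concrete values are made up for the example.
entry = json.loads("""
{
    "name": "example-flag",
    "short_description": "Example flag",
    "description": "Longer description of the example flag",
    "variable": "EXAMPLE_FLAG",
    "type": "enum",
    "allowed_values": "True,False",
    "separator": ",",
    "default": "False",
    "required": "false"
}
""")

# Keys shown for every variable type in the templates above.
required_keys = {"name", "short_description", "description", "variable", "type"}
missing = required_keys - entry.keys()
assert not missing, f"missing keys: {missing}"

# The type must be one of the kinds documented above.
assert entry["type"] in {"enum", "string", "number", "file", "file_base64"}

# For enums, the default (when set) should be one of the allowed values.
if entry["type"] == "enum" and entry["default"]:
    allowed = entry["allowed_values"].split(entry["separator"])
    assert entry["default"] in allowed

print("entry looks valid")
```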

5 - Adding to Krkn Test Suite

This guide covers how to add both unit tests and functional tests to the krkn project. Tests are essential for ensuring code quality and preventing regressions.

Unit Tests

Unit tests in krkn are located in the tests/ directory and use Python’s unittest framework with comprehensive mocking to avoid requiring external dependencies like cloud providers or Kubernetes clusters.

Creating a Unit Test

1. File Location and Naming

Place your test file in the tests/ directory with the naming pattern test_<feature>.py:

tests/
β”œβ”€β”€ test_kubevirt_vm_outage.py
β”œβ”€β”€ test_ibmcloud_node_scenarios.py
β”œβ”€β”€ test_ibmcloud_power_node_scenarios.py
└── test_<your_feature>.py

2. Basic Test Structure

#!/usr/bin/env python3

"""
Test suite for <Feature Name>

IMPORTANT: These tests use comprehensive mocking and do NOT require any external
infrastructure, cloud credentials, or Kubernetes cluster. All API calls are mocked.

Test Coverage:
- Feature 1: Description
- Feature 2: Description

Usage:
    # Run all tests
    python -m unittest tests.test_<your_feature> -v

    # Run with coverage
    python -m coverage run -a -m unittest tests/test_<your_feature>.py -v

Assisted By: Claude Code
"""

import unittest
from unittest.mock import MagicMock, patch, Mock

# Import the classes you're testing
from krkn.scenario_plugins.<module> import YourClass


class TestYourFeature(unittest.TestCase):
    """Test cases for YourClass"""

    def setUp(self):
        """Set up test fixtures before each test"""
        # Mock environment variables if needed
        self.env_patcher = patch.dict('os.environ', {
            'API_KEY': 'test-api-key',
            'API_URL': 'https://test.example.com'
        })
        self.env_patcher.start()

        # Mock external dependencies
        self.mock_client = MagicMock()

        # Create instance to test
        self.instance = YourClass()

    def tearDown(self):
        """Clean up after each test"""
        self.env_patcher.stop()

    def test_success_scenario(self):
        """Test successful operation"""
        # Arrange: Set up test data
        expected_result = "success"

        # Act: Call the method being tested
        result = self.instance.your_method()

        # Assert: Verify the result
        self.assertEqual(result, expected_result)

    def test_failure_scenario(self):
        """Test failure handling"""
        # Arrange: Set up failure condition
        self.mock_client.some_method.side_effect = Exception("API Error")

        # Act & Assert: Verify exception is handled
        with self.assertRaises(Exception):
            self.instance.your_method()


if __name__ == '__main__':
    unittest.main()

3. Best Practices for Unit Tests

  • Use Comprehensive Mocking: Mock all external dependencies (cloud APIs, Kubernetes, file I/O)
  • Add IMPORTANT Note: Include a note in the docstring that tests don’t require credentials
  • Document Test Coverage: List what scenarios each test covers
  • Organize Tests by Category: Use section comments like # ==================== Core Tests ====================
  • Test Edge Cases: Include tests for timeouts, missing parameters, API exceptions
  • Use Descriptive Names: Test names should clearly describe what they test

4. Running Unit Tests

# Run all unit tests
python -m unittest discover -s tests -v

# Run specific test file
python -m unittest tests.test_your_feature -v

# Run with coverage
python -m coverage run -a -m unittest discover -s tests -v
python -m coverage report

Functional Tests

Functional tests in krkn are integration tests that run complete chaos scenarios against a real Kubernetes cluster (typically KinD in CI). They are located in the CI/tests/ directory.

Understanding the Functional Test Structure

CI/
β”œβ”€β”€ run.sh                          # Main test runner
β”œβ”€β”€ run_test.sh                     # Individual test executor
β”œβ”€β”€ config/
β”‚   β”œβ”€β”€ common_test_config.yaml     # Base configuration template
β”‚   └── <scenario>_config.yaml      # Generated configs per scenario
β”œβ”€β”€ tests/
β”‚   β”œβ”€β”€ common.sh                   # Common helper functions
β”‚   β”œβ”€β”€ functional_tests            # List of tests to run
β”‚   └── test_*.sh                   # Individual test scripts
└── out/
    └── <test_name>.out             # Test output logs

Adding a New Functional Test

Step 1: Create the Test Script

Create a new test script in CI/tests/ following the naming pattern test_<scenario>.sh:

#!/bin/bash
set -xeEo pipefail

source CI/tests/common.sh

trap error ERR
trap finish EXIT

function functional_test_<your_scenario> {
  # Set environment variables for the scenario
  export scenario_type="<scenario_type>"
  export scenario_file="scenarios/kind/<scenario_file>.yml"
  export post_config=""

  # Generate config from template with variable substitution
  envsubst < CI/config/common_test_config.yaml > CI/config/<your_scenario>_config.yaml

  # Optional: View the generated config
  cat CI/config/<your_scenario>_config.yaml

  # Run kraken with coverage
  python3 -m coverage run -a run_kraken.py -c CI/config/<your_scenario>_config.yaml

  # Success message
  echo "<Your Scenario> scenario test: Success"

  # Optional: Verify expected state
  date
  kubectl get pods -n <namespace> -l <label>=<value> -o yaml
}

# Execute the test function
functional_test_<your_scenario>

Step 2: Create or Reference Scenario File

Ensure your scenario YAML file exists in scenarios/kind/:

# scenarios/kind/<your_scenario>.yml
- id: my-chaos-scenario
  config:
    namespace: default
    label_selector: app=myapp
    # ... scenario-specific configuration

Step 3: Update GitHub Actions Workflow

If you want the test to run on pull requests, add it to .github/workflows/tests.yml:

- name: Setup Pull Request Functional Tests
  if: github.event_name == 'pull_request'
  run: |
    # ... existing tests ...
    echo "test_<your_scenario>" >> ./CI/tests/functional_tests

Functional Test Patterns

Pattern 1: Simple Scenario Test

Tests a single scenario execution:

function functional_test_simple {
  export scenario_type="pod_disruption_scenarios"
  export scenario_file="scenarios/kind/pod_simple.yml"
  export post_config=""

  envsubst < CI/config/common_test_config.yaml > CI/config/simple_config.yaml
  python3 -m coverage run -a run_kraken.py -c CI/config/simple_config.yaml

  echo "Simple scenario test: Success"
}

Pattern 2: Test with Setup/Teardown

Tests that require specific cluster state:

function functional_test_with_setup {
  # Setup: Deploy test workload
  kubectl apply -f CI/templates/test_workload.yaml
  kubectl wait --for=condition=ready pod -l app=test --timeout=300s

  # Run scenario
  export scenario_type="pod_disruption_scenarios"
  export scenario_file="scenarios/kind/pod_test.yml"
  envsubst < CI/config/common_test_config.yaml > CI/config/test_config.yaml
  python3 -m coverage run -a run_kraken.py -c CI/config/test_config.yaml

  # Verify state
  kubectl get pods -l app=test

  # Teardown
  kubectl delete -f CI/templates/test_workload.yaml

  echo "Test with setup: Success"
}

Pattern 3: Multi-Step Scenario Test

Tests that run multiple related scenarios:

function functional_test_multi_step {
  # Step 1: Initial disruption
  export scenario_type="node_scenarios"
  export scenario_file="scenarios/kind/node_stop.yml"
  envsubst < CI/config/common_test_config.yaml > CI/config/node_config.yaml
  python3 -m coverage run -a run_kraken.py -c CI/config/node_config.yaml

  # Wait for recovery
  sleep 30

  # Step 2: Follow-up disruption
  export scenario_file="scenarios/kind/node_start.yml"
  envsubst < CI/config/common_test_config.yaml > CI/config/node_config.yaml
  python3 -m coverage run -a run_kraken.py -c CI/config/node_config.yaml

  echo "Multi-step scenario test: Success"
}

Configuration Variables

The common_test_config.yaml uses environment variable substitution via envsubst. Common variables include:

  • $scenario_type: The chaos scenario plugin type (e.g., pod_disruption_scenarios)
  • $scenario_file: Path to the scenario YAML file
  • $post_config: Additional post-scenario configuration

Example usage in config:

kraken:
  chaos_scenarios:
    - $scenario_type:
        - $scenario_file
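envsubst simply replaces $variable references in the template with values from the environment. The same substitution can be illustrated in Python with string.Template (a stand-in to show what the shell tool does, not how the CI actually runs):

```python
import os
from string import Template

# Environment variables as a test script would export them.
os.environ["scenario_type"] = "pod_disruption_scenarios"
os.environ["scenario_file"] = "scenarios/kind/pod_simple.yml"

# A fragment of common_test_config.yaml before substitution.
template = """kraken:
  chaos_scenarios:
    - $scenario_type:
        - $scenario_file
"""

# Substitute placeholders from the environment, as envsubst would.
rendered = Template(template).substitute(os.environ)
print(rendered)
```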

Running Functional Tests

Run All Functional Tests

./CI/run.sh

This will:

  1. Create CI/out/ directory for logs
  2. Read test names from CI/tests/functional_tests
  3. Execute each test via CI/run_test.sh
  4. Generate results in CI/results.markdown

Run a Single Functional Test

./CI/run_test.sh test_<your_scenario> CI/results.markdown

View Test Results

cat CI/results.markdown

Example output:

Test                   | Result | Duration
-----------------------|--------|---------
test_pod              | Pass   | 0:2:15
test_your_scenario    | Pass   | 0:1:45

View Test Logs

cat CI/out/test_<your_scenario>.out

Error Handling

Functional tests use common error handling from CI/tests/common.sh:

trap error ERR    # Catches errors
trap finish EXIT  # Runs on script exit

# error() function handles exit codes:
# - Exit code 1: Error logged, test fails
# - Exit code 2: Expected exit, test passes (wraps to 0)

Best Practices for Functional Tests

  1. Use set -xeEo pipefail: Ensures errors are caught and commands are logged
  2. Source common.sh: Always include source CI/tests/common.sh for error handling
  3. Set Traps: Use trap error ERR and trap finish EXIT
  4. Verify State: Check cluster state before and after scenarios
  5. Clear Success Messages: Use descriptive success messages
  6. Coverage Integration: Run kraken with python3 -m coverage run -a
  7. Resource Cleanup: Clean up any resources created during the test
  8. Timeout Values: Use appropriate timeout values for kubectl wait commands

Example: Complete Functional Test

Here’s a complete example combining all the concepts:

#!/bin/bash
set -xeEo pipefail

source CI/tests/common.sh

trap error ERR
trap finish EXIT

function functional_test_my_scenario {
  # Setup: Deploy test application
  echo "Setting up test workload..."
  kubectl create namespace test-namespace || true
  kubectl apply -f CI/templates/my_test_app.yaml
  kubectl wait --for=condition=ready pod -l app=my-test-app -n test-namespace --timeout=300s

  # Configure scenario
  export scenario_type="pod_disruption_scenarios"
  export scenario_file="scenarios/kind/my_scenario.yml"
  export post_config=""

  # Generate config
  envsubst < CI/config/common_test_config.yaml > CI/config/my_scenario_config.yaml

  # Optional: Display config for debugging
  echo "Generated configuration:"
  cat CI/config/my_scenario_config.yaml

  # Run kraken scenario
  echo "Running chaos scenario..."
  python3 -m coverage run -a run_kraken.py -c CI/config/my_scenario_config.yaml

  # Verify expected state
  echo "Verifying cluster state..."
  kubectl get pods -n test-namespace -l app=my-test-app

  # Cleanup
  echo "Cleaning up..."
  kubectl delete namespace test-namespace --wait=false

  # Success
  echo "My scenario test: Success"
  date
}

# Execute the test
functional_test_my_scenario

Testing in CI

The GitHub Actions workflow (.github/workflows/tests.yml) runs functional tests:

  1. Pull Requests: Runs a subset of quick tests
  2. Main Branch: Runs all tests including integration scenarios

To add your test to CI:

- name: Setup Pull Request Functional Tests
  run: |
    echo "test_my_scenario" >> ./CI/tests/functional_tests

Debugging Failed Tests

When a functional test fails:

  1. Check the output log: cat CI/out/test_<name>.out
  2. Review the results: cat CI/results.markdown
  3. Check cluster state: kubectl get pods --all-namespaces
  4. Review kraken logs: Look for error messages in the output
  5. Verify configuration: Ensure variables are properly substituted in generated config

Summary

  • Unit Tests: Located in tests/, use comprehensive mocking, require no external dependencies
  • Functional Tests: Located in CI/tests/, run against real Kubernetes, test full scenarios
  • Test Execution: Unit tests via unittest, functional tests via CI/run.sh
  • Coverage: Both test types contribute to code coverage metrics
  • CI Integration: All tests run automatically in GitHub Actions

6 - Testing your changes

This page gives details about how to configure a kind cluster so you can run everything from krkn-lib (the lowest level of the krkn-chaos repos) up through krknctl (our highest level repo and the easiest way to run).

Configure Kind Testing Environment

  1. Install kind

  2. Create a cluster using the kind-config.yml file under the krkn-lib base folder

kind create cluster --wait 300s --config=kind-config.yml

Install Elasticsearch and Prometheus

To be able to run the full test suite you need to have Elasticsearch and Prometheus properly configured on the cluster

curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add stable https://charts.helm.sh/stable
helm repo update

Prometheus

Deploy prometheus on your cluster

kubectl create namespace monitoring
helm install \
--wait --timeout 360s \
kind-prometheus \
prometheus-community/kube-prometheus-stack \
--namespace monitoring \
--set prometheus.service.nodePort=30000 \
--set prometheus.service.type=NodePort \
--set grafana.service.nodePort=31000 \
--set grafana.service.type=NodePort \
--set alertmanager.service.nodePort=32000 \
--set alertmanager.service.type=NodePort \
--set prometheus-node-exporter.service.nodePort=32001 \
--set prometheus-node-exporter.service.type=NodePort \
--set prometheus.prometheusSpec.maximumStartupDurationSeconds=300

ElasticSearch

Set the Elasticsearch environment variables

export ELASTIC_URL="https://localhost"
export ELASTIC_PORT="9091"
export ELASTIC_USER="elastic"
export ELASTIC_PASSWORD="test"

Deploy elasticsearch on your cluster

helm install \
--wait --timeout 360s \
elasticsearch \
oci://registry-1.docker.io/bitnamicharts/elasticsearch \
--set master.masterOnly=false \
--set master.replicaCount=1 \
--set data.replicaCount=0 \
--set coordinating.replicaCount=0 \
--set ingest.replicaCount=0 \
--set service.type=NodePort \
--set service.nodePorts.restAPI=32766 \
--set security.elasticPassword=test \
--set security.enabled=true \
--set image.tag=7.17.23-debian-12-r0 \
--set security.tls.autoGenerated=true

Testing Changes in Krkn-lib

To be able to run all the tests in the krkn-lib suite, you’ll need to have Prometheus and Elasticsearch properly configured. See the steps above for details.

Install poetry

Using a virtual environment, install poetry and the krkn-lib requirements

$ pip install poetry
$ poetry install --no-interaction

Run tests

poetry run python3 -m coverage run -a -m unittest discover -v src/krkn_lib/tests/

Adding tests

Be sure that if you are adding any new functions or functionality you also add unit tests for them. We want to keep coverage above 80% in this repo since it’s our base functionality.

Testing Changes in Krkn

Unit Tests

Krkn unit tests are located in the tests/ directory and use Python’s unittest framework with comprehensive mocking. IMPORTANT: These tests do NOT require any external infrastructure, cloud credentials, or Kubernetes cluster - all dependencies are mocked.

Prerequisites

Install krkn dependencies in a virtual environment:

# Create and activate virtual environment
python3.11 -m venv chaos
source chaos/bin/activate

# Install requirements
pip install -r requirements.txt

Running Unit Tests

Run all unit tests:

python -m unittest discover -s tests -v

Run all unit tests with coverage:

python -m coverage run -a -m unittest discover -s tests -v
python -m coverage report

Run specific test file:

python -m unittest tests.test_kubevirt_vm_outage -v

Run specific test class:

python -m unittest tests.test_kubevirt_vm_outage.TestKubevirtVmOutageScenarioPlugin -v

Run specific test method:

python -m unittest tests.test_kubevirt_vm_outage.TestKubevirtVmOutageScenarioPlugin.test_successful_injection_and_recovery -v

Viewing Coverage Results

After running tests with coverage, generate an HTML report:

# Generate HTML coverage report
python -m coverage html

# View the report
open htmlcov/index.html  # macOS
xdg-open htmlcov/index.html  # Linux

Or view a text summary:

python -m coverage report

Example output:

Name                                                          Stmts   Miss  Cover
---------------------------------------------------------------------------------
krkn/scenario_plugins/kubevirt_vm_outage/...                   215     12    94%
krkn/scenario_plugins/node_actions/ibmcloud_node_scenarios.py  185      8    96%
---------------------------------------------------------------------------------
TOTAL                                                          2847    156    95%

Test Output

Unit test output shows:

  • Test names and descriptions
  • Pass/fail status for each test
  • Execution time
  • Any assertion failures or errors

Example output:

test_successful_injection_and_recovery (tests.test_kubevirt_vm_outage.TestKubevirtVmOutageScenarioPlugin)
Test successful deletion and recovery of a VMI using detailed mocking ... ok
test_injection_failure (tests.test_kubevirt_vm_outage.TestKubevirtVmOutageScenarioPlugin)
Test failure during VMI deletion ... ok
test_validation_failure (tests.test_kubevirt_vm_outage.TestKubevirtVmOutageScenarioPlugin)
Test validation failure when KubeVirt is not installed ... ok

----------------------------------------------------------------------
Ran 30 tests in 1.234s

OK

Adding Unit Tests

When adding new functionality, always add corresponding unit tests. See the Adding Tests to Krkn guide for detailed instructions.

Key requirements:

  • Use comprehensive mocking (no external dependencies)
  • Add “IMPORTANT” note in docstring about no credentials needed
  • Test success paths, failure paths, edge cases, and exceptions
  • Organize tests into logical sections
  • Aim for >80% code coverage

Functional Tests (if able to run scenario on kind cluster)

Configuring test Cluster

After creating a kind cluster with the steps above, create these test pods on your cluster

kubectl apply -f CI/templates/outage_pod.yaml
kubectl wait --for=condition=ready pod -l scenario=outage --timeout=300s
kubectl apply -f CI/templates/container_scenario_pod.yaml
kubectl wait --for=condition=ready pod -l scenario=container --timeout=300s
kubectl create namespace namespace-scenario
kubectl apply -f CI/templates/time_pod.yaml
kubectl wait --for=condition=ready pod -l scenario=time-skew --timeout=300s
kubectl apply -f CI/templates/service_hijacking.yaml
kubectl wait --for=condition=ready pod -l "app.kubernetes.io/name=proxy" --timeout=300s

Install Requirements

$ python3.11 -m venv chaos
$ source chaos/bin/activate
$ pip install -r requirements.txt

Run Tests

  1. Add prometheus configuration variables to the test config file
yq -i '.kraken.port="8081"' CI/config/common_test_config.yaml
yq -i '.kraken.signal_address="0.0.0.0"' CI/config/common_test_config.yaml
yq -i '.kraken.performance_monitoring="localhost:9090"' CI/config/common_test_config.yaml
  2. Add tests to the list of functional tests to run
echo "test_service_hijacking" > ./CI/tests/functional_tests
echo "test_app_outages" >> ./CI/tests/functional_tests
echo "test_container"      >> ./CI/tests/functional_tests
echo "test_pod" >> ./CI/tests/functional_tests
echo "test_namespace"      >> ./CI/tests/functional_tests
echo "test_net_chaos"      >> ./CI/tests/functional_tests
echo "test_time"           >> ./CI/tests/functional_tests
echo "test_cpu_hog" >> ./CI/tests/functional_tests
echo "test_memory_hog" >> ./CI/tests/functional_tests
echo "test_io_hog" >> ./CI/tests/functional_tests
  3. Run tests
./CI/run.sh

Results can be seen in ./CI/results.markdown

Adding Tests

Be sure that if you are adding any new scenario you also add tests for it, based on a 5-node (3 master, 2 worker) kind cluster. See more details on how to add functional tests here. The tests live here.

Testing Changes for Krkn-hub

Install Podman/Docker Compose

You can use either podman-compose or docker-compose for this step

NOTE: Podman might not work on Macs

pip3 install docker-compose

OR

To get the latest podman-compose features we need, use this installation command

pip3 install https://github.com/containers/podman-compose/archive/devel.tar.gz

Build Your Changes

  1. Run build.sh to create Dockerfiles for each scenario
  2. Edit the docker-compose.yaml file to point to your quay.io repository (optional; required if you want to push or are testing krknctl)
ex.) 
image: containers.krkn-chaos.dev/krkn-chaos/krkn-hub:chaos-recommender 

change to >

image: quay.io/<user>/krkn-hub:chaos-recommender
  3. Build your image(s) from the base krkn-hub directory

    Builds all images in docker-compose file

    docker-compose build
    

    Builds single image defined by service/scenario name

    docker-compose build <scenario_type>
    

    OR

    Builds all images in podman-compose file

    podman-compose build
    

    Builds single image defined by service/scenario name

    podman-compose build <scenario_type>
    

Push Images to your quay.io

Push all Images using docker-compose

docker-compose push

Push a single image using docker-compose

docker-compose push <scenario_type>

OR

Single Image (have to go one by one to push images through podman)

podman-compose push <scenario_type>

OR

podman push quay.io/<username>/krkn-hub:<scenario_type>

Run your new scenario

docker run -d -v <kube_config_path>:/root/.kube/config:Z quay.io/<username>/krkn-hub:<scenario_type>

OR

podman run -d -v <kube_config_path>:/root/.kube/config:Z quay.io/<username>/krkn-hub:<scenario_type>

See krkn-hub documentation for each scenario to see all possible variables to use

Testing Changes in Krknctl

Once you’ve created a krknctl-input.json file using the steps here, you’ll want to test those changes using the steps below. You will need either podman or docker installed, as well as a quay.io account.

Build and Push to personal Quay

First, build your krkn-hub changes and push them to your own quay repository for testing

Run Krknctl with Personal Image

Once you have your images in quay, you are all set to configure krknctl to look for these new images. Edit the krknctl config file found here and set the quay_org value to your quay username.

With these updates to your config, you’ll build your personal krknctl binary and you’ll be all set to start testing your new scenario and config options.

If any krknctl code changes are required, you’ll have to make the changes and rebuild the krknctl binary each time you test as well