Network Chaos Scenario

    This scenario introduces network latency, packet loss, and bandwidth restriction on a node's host network interface. Its purpose is to observe faults caused by random variations in the network.

    How to Run Network Chaos Scenarios

    Choose your preferred method to run network chaos scenarios:

    Example scenario files from scenarios-hub:

    Sample scenario config for egress traffic shaping
    network_chaos:                                    # Scenario to create an outage by simulating random variations in the network.
      duration: 300                                   # In seconds - duration network chaos will be applied.
      node_name:                                      # Comma separated node names on which scenario has to be injected.
      label_selector: node-role.kubernetes.io/master  # When node_name is not specified, a node with matching label_selector is selected for running the scenario.
      instance_count: 1                               # Number of nodes in which to execute network chaos.
      interfaces:                                     # List of interfaces on which to apply the network restriction.
      - "ens5"                                        # Each entry is the kernel name of a host network interface.
      execution: serial|parallel                      # Execute the egress options together as a single scenario (parallel) or as separate scenarios (serial).
      egress:
        latency: 500ms
        loss: 50%                                    # percentage
        bandwidth: 10mbit
      image: quay.io/krkn-chaos/krkn:tools
    
    Sample scenario config for ingress traffic shaping (using a plugin)
    - id: network_chaos
      config:
        node_interface_name:                            # Dictionary with key as node name(s) and value as a list of its interfaces to test
          ip-10-0-128-153.us-west-2.compute.internal:
            - ens5
            - genev_sys_6081
        label_selector: node-role.kubernetes.io/master  # When node_interface_name is not specified, nodes with a matching label_selector are selected for chaos injection
        instance_count: 1                               # Number of nodes matching the label selector on which to perform the action
        kubeconfig_path: ~/.kube/config                 # Path to the Kubernetes config file. If not specified, it defaults to ~/.kube/config
        execution_type: parallel                        # Execute the ingress options together as a single scenario (parallel) or as separate scenarios (serial).
        network_params:
            latency: 500ms
            loss: '50%'
            bandwidth: 10mbit
        wait_duration: 120
        test_duration: 60
        image: quay.io/krkn-chaos/krkn:tools
    

    Note: For ingress traffic shaping, ensure that your node doesn’t have any IFB interfaces already present. The scenario relies on creating IFBs to do the shaping, and they are deleted at the end of the scenario.
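
    The precondition in the note can be checked before a run. The following is a minimal sketch, not part of krkn: it assumes it runs on the node itself, that interfaces are visible under /sys/class/net, and that IFB devices use the kernel's default ifb prefix (ifb0, ifb1, ...). The function names are hypothetical.

```python
from pathlib import Path


def find_ifb_interfaces(interface_names):
    """Return the names that look like IFB devices (assumed prefix: 'ifb')."""
    return sorted(name for name in interface_names if name.startswith("ifb"))


def list_host_interfaces(sysfs_net="/sys/class/net"):
    """List interface names from sysfs; return [] if the path does not exist."""
    root = Path(sysfs_net)
    return [entry.name for entry in root.iterdir()] if root.is_dir() else []


if __name__ == "__main__":
    existing = find_ifb_interfaces(list_host_interfaces())
    if existing:
        print("IFB interfaces already present:", ", ".join(existing))
    else:
        print("No IFB interfaces found.")
```

    If the check reports any IFB interfaces, remove them (or pick another node) before starting the ingress scenario.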

    Steps
    • Pick the nodes on which to introduce the network anomaly, using either node_name or label_selector.
    • Verify the interface list on one of the nodes. If the user specifies no interface, use the interface with the default route as the test interface.
    • Set the traffic shaping config on the node's interface using tc and netem.
    • Wait for the configured duration.
    • Remove the traffic shaping config from the node's interface.
    • Remove the job that spawned the pod.
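
    To make the set and remove steps concrete, here is a hedged sketch of how the egress values from the sample config could map onto tc/netem commands. It only builds the argument lists and does not run them. The exact commands krkn issues may differ (for example, it may shape bandwidth with a different qdisc), and the helper names are hypothetical. Ingress shaping, which relies on IFB devices, is not covered.

```python
def netem_add_command(interface, latency=None, loss=None, bandwidth=None):
    """Build a tc argv that attaches a netem qdisc at the interface root."""
    if not (latency or loss or bandwidth):
        raise ValueError("at least one of latency, loss, bandwidth is required")
    cmd = ["tc", "qdisc", "add", "dev", interface, "root", "netem"]
    if latency:
        cmd += ["delay", latency]   # e.g. 500ms
    if loss:
        cmd += ["loss", loss]       # e.g. 50%
    if bandwidth:
        cmd += ["rate", bandwidth]  # e.g. 10mbit
    return cmd


def netem_del_command(interface):
    """Build a tc argv that removes the root qdisc from the interface."""
    return ["tc", "qdisc", "del", "dev", interface, "root"]


if __name__ == "__main__":
    print(" ".join(netem_add_command("ens5", "500ms", "50%", "10mbit")))
    print(" ".join(netem_del_command("ens5")))
```

    Running tc itself requires root privileges on the node, which is why the scenario performs these steps from a privileged pod.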

    How to Use Plugin Name

    Add the plugin name to the chaos_scenarios list in the config/config.yaml file:

    kraken:
        kubeconfig_path: ~/.kube/config                     # Path to kubeconfig
        ..
        chaos_scenarios:
            - network_chaos_scenarios:
                - scenarios/<scenario_name>.yaml
    

    Run

    python run_kraken.py --config config/config.yaml
    

    This scenario introduces network latency, packet loss, and bandwidth restriction in the egress traffic of a node's interface using tc and netem. For more information, refer to the following documentation.

    Run

    To have Cerberus monitor the cluster and pass or fail the scenario after chaos injection, refer to the docs. Start Cerberus before injecting the chaos, and set the CERBERUS_ENABLED environment variable so that the chaos injection container connects to it automatically.

    $ podman run \
      --name=<container_name> \
      --net=host \
      --pull=always \
      --env-host=true \
      -v <path-to-kube-config>:/home/krkn/.kube/config:Z \
      -d containers.krkn-chaos.dev/krkn-chaos/krkn-hub:network-chaos
    $ podman logs -f <container_name or container_id> # Streams Kraken logs
    $ podman inspect <container-name or container-id> \
      --format "{{.State.ExitCode}}" # Outputs the exit code, which can be considered the pass/fail result for the scenario
    
    $ docker run \
      -e <VARIABLE>=<value> \
      --net=host \
      --pull=always \
      -v <path-to-kube-config>:/home/krkn/.kube/config:Z \
      -d containers.krkn-chaos.dev/krkn-chaos/krkn-hub:network-chaos
    
    $ docker logs -f <container_name or container_id> # Streams Kraken logs
    $ docker inspect <container-name or container-id> \
      --format "{{.State.ExitCode}}" # Outputs the exit code, which can be considered the pass/fail result for the scenario
    

    Supported parameters

    The following environment variables can be set on the host running the container to tweak the scenario/faults being injected:

    Example if --env-host is used:

    export <parameter_name>=<value>
    

    Or on the command line, for example:

    -e <VARIABLE>=<value>
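
    As an illustration of the two options, the sketch below turns a dictionary of parameters into either export lines (for use with --env-host) or -e flags. The helper names are hypothetical, and the values are examples only.

```python
import shlex


def export_lines(params):
    """Render shell 'export NAME=value' lines, quoting values when needed."""
    return [f"export {name}={shlex.quote(str(value))}" for name, value in params.items()]


def env_flags(params):
    """Render ['-e', 'NAME=value', ...] arguments for podman/docker run."""
    flags = []
    for name, value in params.items():
        flags += ["-e", f"{name}={value}"]
    return flags


if __name__ == "__main__":
    params = {"DURATION": 60, "EGRESS": "{bandwidth: 100mbit}"}
    print("\n".join(export_lines(params)))
    print(env_flags(params))
```

    The flag list is meant to be passed as separate arguments (for example to subprocess.run), so values with spaces need no extra quoting there.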
    

    See the list of variables that apply to all scenarios here; they can be set in addition to these scenario-specific variables.

    Egress Scenarios
    Parameter | Description | Default
    --- | --- | ---
    DURATION | Duration in seconds during which network chaos will be applied. | 300
    IMAGE | Image used to disrupt network on a pod | quay.io/krkn-chaos/krkn:tools
    NODE_NAME | Node name to inject faults into when targeting a specific node; multiple node names can be given, separated by commas | ""
    LABEL_SELECTOR | When NODE_NAME is not specified, a node with matching label_selector is selected for running. | node-role.kubernetes.io/master
    INSTANCE_COUNT | Targeted instance count matching the label selector | 1
    INTERFACES | List of interfaces on which to apply the network restriction. | []
    EXECUTION | Execute the egress options together as a single scenario (parallel) or as separate scenarios (serial). | parallel
    EGRESS | Dictionary of values to set network latency (latency: 50ms), packet loss (loss: 0.02), bandwidth restriction (bandwidth: 100mbit) | {bandwidth: 100mbit}
    Ingress Scenarios
    Parameter | Description | Default
    --- | --- | ---
    DURATION | Duration in seconds during which network chaos will be applied. | 300
    IMAGE | Image used to disrupt network on a pod | quay.io/krkn-chaos/krkn:tools
    TARGET_NODE_AND_INTERFACE | Dictionary with key as node name(s) and value as a list of its interfaces to test. For example: {ip-10-0-216-2.us-west-2.compute.internal: [ens5]} | ""
    LABEL_SELECTOR | When NODE_NAME is not specified, a node with matching label_selector is selected for running. | node-role.kubernetes.io/master
    INSTANCE_COUNT | Targeted instance count matching the label selector | 1
    EXECUTION | Specifies whether to apply filters on interfaces one at a time or all at once. | parallel
    NETWORK_PARAMS | latency, loss and bandwidth are the three supported network parameters to alter for the chaos test. For example: {latency: 50ms, loss: '0.02'} | ""
    WAIT_DURATION | Ensure that it is at least about twice the test_duration | 300

    For example:

    $ podman run \
      --name=<container_name> \
      --net=host \
      --pull=always \
      --env-host=true \
      -v <path-to-custom-metrics-profile>:/home/krkn/kraken/config/metrics-aggregated.yaml \
      -v <path-to-custom-alerts-profile>:/home/krkn/kraken/config/alerts \
      -v <path-to-kube-config>:/home/krkn/.kube/config:Z \
      -d containers.krkn-chaos.dev/krkn-chaos/krkn-hub:network-chaos
    
    krknctl run network-chaos (optional: --<parameter>:<value> )
    

    You can also set any global variable listed here.

    Scenario specific parameters:

    Parameter | Description | Type | Default
    --- | --- | --- | ---
    --traffic-type | Selects the network chaos scenario type; can be ingress or egress | enum | ingress
    --image | Image used to disrupt network on a pod | string | quay.io/krkn-chaos/krkn:tools
    --duration | Duration in seconds during which network chaos will be applied. | number | 300
    --label-selector | When NODE_NAME is not specified, a node with matching label_selector is selected for running. | string | node-role.kubernetes.io/master
    --execution | Execute the egress options together as a single scenario (parallel) or as separate scenarios (serial). | enum | parallel
    --instance-count | Targeted instance count matching the label selector. | number | 1
    --node-name | Node name to inject faults into when targeting a specific node; multiple node names can be given, separated by commas | string |
    --interfaces | List of interfaces on which to apply the network restriction, e.g. [eth0,eth1,eth2] | string |
    --egress | Dictionary of values to set network latency (latency: 50ms), packet loss (loss: 0.02), bandwidth restriction (bandwidth: 100mbit), e.g. {bandwidth: 100mbit} | string | "{bandwidth: 100mbit}"
    --target-node-interface | Dictionary with key as node name(s) and value as a list of its interfaces to test. For example: {ip-10-0-216-2.us-west-2.compute.internal: [ens5]} | string |
    --network-params | latency, loss and bandwidth are the three supported network parameters to alter for the chaos test. For example: {latency: 50ms, loss: 0.02} | string |
    --wait-duration | Ensure that it is at least about twice the test_duration | number | 300
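
    The constraints stated for the ingress parameters (three supported network parameters, and a wait duration of at least about twice the test duration) can be checked before launching a run. This is a minimal sketch with a hypothetical helper name; it is not part of krkn or krknctl, and it treats "about twice" as a strict factor of two.

```python
SUPPORTED_NETWORK_PARAMS = {"latency", "loss", "bandwidth"}


def validate_ingress_params(network_params, wait_duration, test_duration):
    """Return a list of problems found; an empty list means the values look fine."""
    problems = []
    unknown = sorted(set(network_params) - SUPPORTED_NETWORK_PARAMS)
    if unknown:
        problems.append("unsupported network parameters: " + ", ".join(unknown))
    if wait_duration < 2 * test_duration:
        problems.append(
            f"wait duration {wait_duration}s is less than twice "
            f"the test duration {test_duration}s"
        )
    return problems


if __name__ == "__main__":
    # Values taken from the sample ingress scenario config.
    params = {"latency": "500ms", "loss": "50%", "bandwidth": "10mbit"}
    print(validate_ingress_params(params, wait_duration=120, test_duration=60))
```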

    To see all available scenario options

    krknctl run network-chaos --help