Network Chaos NG Scenario

This scenario introduces a new infrastructure to refactor and port the current implementation of the network chaos plugins; it also introduces a new scenario for node IP traffic filtering.

1 - Network Chaos API

AbstractNetworkChaosModule abstract module class

All the plugins must implement the AbstractNetworkChaosModule abstract class in order to be instantiated and run by the Network Chaos NG plugin. This abstract class defines two main abstract methods (a minimal implementation sketch follows the list):

  • run(self, target: str, kubecli: KrknTelemetryOpenshift, error_queue: queue.Queue = None) is the entry point of each Network Chaos module. If the module is configured to run in parallel, error_queue must not be None
    • target: the name of the resource (Pod, Node, etc.) that will be targeted by the scenario
    • kubecli: the KrknTelemetryOpenshift instance needed by the scenario to access the krkn-lib methods
    • error_queue: a queue used by the plugin to push the errors raised during the execution of parallel modules
  • get_config(self) -> (NetworkChaosScenarioType, BaseNetworkChaosConfig) returns the common subset of settings shared by all the scenarios (BaseNetworkChaosConfig) and the type of Network Chaos Scenario that is running (Pod Scenario or Node Scenario)
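A minimal sketch of a module implementing this interface is shown below. The import path of the abstract class and the DummyNetworkChaosModule name are assumptions for illustration only; check the krkn source tree for the real package layout.

import queue

from krkn_lib.telemetry.ocp import KrknTelemetryOpenshift
# NOTE: hypothetical import path, locate the actual one in the krkn repository
from krkn.scenario_plugins.network_chaos_ng.models import (
    AbstractNetworkChaosModule,
    BaseNetworkChaosConfig,
    NetworkChaosScenarioType,
)


class DummyNetworkChaosModule(AbstractNetworkChaosModule):
    def __init__(self, config: BaseNetworkChaosConfig):
        self.config = config

    def run(self, target: str, kubecli: KrknTelemetryOpenshift, error_queue: queue.Queue = None):
        try:
            # apply the network disruption to `target` (a node or pod name)
            # using kubecli to reach the cluster through krkn-lib
            pass
        except Exception as e:
            # when the module runs in parallel, errors are reported through the queue
            if error_queue is not None:
                error_queue.put(str(e))

    def get_config(self) -> (NetworkChaosScenarioType, BaseNetworkChaosConfig):
        return NetworkChaosScenarioType.Node, self.config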

BaseNetworkChaosConfig base module configuration

This is the base class that contains the common parameters shared by all the Network Chaos NG modules; a dataclass sketch follows the list.

  • id is the string name of the Network Chaos NG module
  • wait_duration if there is more than one network module config in the same config file, the plugin will wait wait_duration seconds before running the next one
  • test_duration the duration of the scenario in seconds
  • label_selector the selector used to target the resources
  • instance_count if greater than 0, randomly picks instance_count elements from the targets selected by the filters
  • execution if more than one target is selected by the selector, the scenario can target the resources either serially or in parallel
  • namespace the namespace where the scenario workloads will be deployed
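As a rough illustration, these parameters map naturally onto a Python dataclass. This is a hedged sketch inferred from the list above, not the actual class definition in krkn:

from dataclasses import dataclass


@dataclass
class BaseNetworkChaosConfig:
    id: str               # name of the Network Chaos NG module
    wait_duration: int    # seconds to wait before running the next module config
    test_duration: int    # duration of the scenario in seconds
    label_selector: str   # selector used to target the resources
    instance_count: int   # if > 0, number of targets picked at random
    execution: str        # "serial" or "parallel"
    namespace: str        # namespace where the scenario workloads are deployed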

2 - Node Network Filter

Overview

Creates iptables rules on one or more nodes to block incoming and/or outgoing traffic on one or more ports of the node network interface. It can be used to block network-based services connected to the node or to block inter-node communication.
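Conceptually, the filter amounts to iptables DROP rules like the ones assembled below. This is a hedged sketch of equivalent rules, not the module's actual implementation; the chains and match options are assumptions:

def build_filter_rules(port: int, interface: str, ingress: bool, egress: bool) -> list[str]:
    # Build iptables commands roughly equivalent to what the scenario applies.
    # Illustrative only: the real module may use different chains or options.
    rules = []
    if ingress:
        # drop traffic arriving on `interface` for the given destination port
        rules.append(f"iptables -A INPUT -i {interface} -p tcp --dport {port} -j DROP")
    if egress:
        # drop traffic leaving the node towards the given destination port
        rules.append(f"iptables -A OUTPUT -p tcp --dport {port} -j DROP")
    return rules


# e.g. blocking outgoing NFS traffic (port 2049), as in the EFS example below
print(build_filter_rules(2049, "eth0", ingress=False, egress=True))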

Configuration

- id: node_network_filter
  wait_duration: 300
  test_duration: 100
  label_selector: "kubernetes.io/hostname=ip-10-0-39-182.us-east-2.compute.internal"
  instance_count: 1
  execution: parallel
  namespace: 'default'
  # scenario specific settings
  ingress: false
  egress: true
  target: node
  interfaces: []
  ports:
    - 2049

For the common module settings, please refer to the documentation.

  • ingress: filters the incoming traffic on one or more ports. If set, one or more network interfaces must be specified
  • egress: filters the outgoing traffic on one or more ports
  • target: sets the type of resource to be targeted; values can be node or pod
  • interfaces: a list of network interfaces where the incoming traffic will be filtered
  • ports: the list of ports that will be filtered

Examples

AWS EFS (Elastic File System) disruption

- id: node_network_filter
  wait_duration: 300
  test_duration: 100
  label_selector: "node-role.kubernetes.io/worker="
  instance_count: 0
  execution: parallel
  namespace: 'default'
  # scenario specific settings
  ingress: false
  egress: true
  target: node
  interfaces: []
  ports:
    - 2049

This configuration will disrupt all the PVCs provided by the AWS EFS service to an OCP/K8s cluster. The service is essentially an elastic NFS service, so blocking the outgoing traffic on port 2049 on the worker nodes will make all the pods mounting the PVCs unable to read from and write to the mounted folders.
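One way to observe the disruption while the scenario runs is to attempt a write from a pod mounting an EFS-backed PVC. A hedged example: the pod name efs-app and the mount path /data are hypothetical and must be replaced with values from your cluster.

import subprocess

# "efs-app" and "/data" are hypothetical: substitute a pod mounting an
# EFS-backed PVC and its mount path from your own cluster
result = subprocess.run(
    ["kubectl", "exec", "efs-app", "--",
     "timeout", "5", "sh", "-c", "echo probe > /data/probe"],
    capture_output=True, text=True,
)
# while egress on port 2049 is dropped, the NFS write hangs and times out
print("write blocked" if result.returncode != 0 else "write succeeded")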

Etcd Split Brain

- id: node_network_filter
  wait_duration: 300
  test_duration: 100
  label_selector: "node-role.kubernetes.io/master="
  instance_count: 1
  execution: parallel
  namespace: 'default'
  # scenario specific settings
  ingress: false
  egress: true
  target: node
  interfaces: []
  ports:
    - 2379
    - 2380

This configuration will disrupt the etcd traffic on one of the master nodes. The targeted node will be isolated from the other nodes, causing the election of two etcd leaders: one is the isolated node, and the other will be elected between the two remaining nodes.
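To observe the effect from a surviving master while the scenario runs, the etcd member status can be queried from one of the remaining etcd pods. A hedged example: the pod name below is hypothetical; list the real ones with kubectl get pods -n openshift-etcd.

import subprocess

# "etcd-ip-10-0-0-1" is a hypothetical pod name: pick a real etcd pod
# from the openshift-etcd namespace
subprocess.run(
    ["kubectl", "exec", "-n", "openshift-etcd", "etcd-ip-10-0-0-1",
     "-c", "etcd", "--", "etcdctl", "endpoint", "status", "--cluster", "-w", "table"],
)
# the table reports an IS LEADER column per endpoint; during the isolation
# the surviving members' view diverges from the isolated node's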