Manage the Crosswork Cluster

Cluster Management Overview

The Cisco Crosswork platform uses a cluster architecture. The cluster distributes platform services across a unified group of virtual machine (VM) hosts, called nodes. The underlying software architecture distributes processing and traffic loads across the nodes automatically and dynamically. This architecture helps Cisco Crosswork respond to how you actually use the system, allowing it to perform in a scalable, highly available, and extensible manner.

A single Crosswork cluster consists of a minimum of three nodes, all operating in a hybrid configuration. These three hybrid nodes are mandatory for all Cisco Crosswork deployments. If you have more demanding scale requirements, you can add up to two worker nodes. For more information, see Deploy New Cluster Nodes.
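
The sizing rule above can be written as a small check. The following Python sketch is illustrative only: the function name and constants are assumptions made for this example and are not part of any Cisco Crosswork API. It reads the rule as at least three hybrid nodes and at most two worker nodes.

```python
from typing import List

MIN_HYBRID_NODES = 3  # mandatory hybrid nodes, per the text above
MAX_WORKER_NODES = 2  # optional worker nodes, per the text above


def check_cluster_size(hybrid: int, worker: int) -> List[str]:
    """Return a list of sizing problems; an empty list means the size is valid."""
    problems = []
    if hybrid < MIN_HYBRID_NODES:
        problems.append(
            f"need at least {MIN_HYBRID_NODES} hybrid nodes, found {hybrid}"
        )
    if not 0 <= worker <= MAX_WORKER_NODES:
        problems.append(
            f"worker nodes must be between 0 and {MAX_WORKER_NODES}, found {worker}"
        )
    return problems


if __name__ == "__main__":
    print(check_cluster_size(hybrid=3, worker=1))  # valid size, no problems
    print(check_cluster_size(hybrid=2, worker=3))  # two problems reported
```

A check like this may be useful when planning a deployment on paper, before you open the Deploy New VM Node window.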

Only users assigned to the admin role, or to a role with the required permissions, can access the full cluster configuration.

Check Cluster Health

Use the Crosswork Manager window to check the health of the cluster. To display this window, from the main menu, choose Administration > Crosswork Manager.

Figure 1. Crosswork Manager Window

The Crosswork Manager window gives you summary information about the status of the cluster nodes, the Platform Infrastructure, and the applications you have installed.

The top left section of the window provides details about the cluster while the top right provides details about overall cluster resource consumption. The bottom section breaks down the resource utilization by node, with a separate detail tile for each node. The window shows other details, including the IP addresses in use, whether each node is a hybrid or worker, and so on.

In the top-right corner, click the View more visualizations link to monitor system functions visually. For details, see Visually Monitor System Functions in Real Time.

Cluster Management

For details on the nodes in the cluster: On the Crosswork Summary tab, click the Crosswork Cluster tile. Cisco Crosswork displays a Cluster Management window like the one shown in the following figure.

Figure 2. Cluster Management Window

Attention


In some cases of manual installations, the Cluster Management window may not display the inventory details in the upper left corner of this screen correctly. In such cases, you need to manually import the cluster inventory file as described in Import Cluster Inventory. Failure to import the inventory can impact your ability to deploy additional nodes and manage the cluster properly.


VM Node Details

To see details for a single node: On the tile for the node, click the menu icon and choose View Details. The VM Node window displays the node details and the list of microservices running on the node.

Figure 3. VM Node Details Window

To restart a microservice, click the menu icon under the Action column and choose Restart.

For information on how to use the Crosswork Health tab, see Monitor Platform Infrastructure and Application Health.

Failed Nodes

  • If one of the hybrid nodes is faulty, along with one or more worker nodes and applications, try the Clean System Reboot procedure described in Cluster System Recovery.

  • If more than one hybrid node is faulty, follow the Redeploy and Restore procedure described in Cluster System Recovery.

Import Cluster Inventory

If you installed your cluster manually using the vCenter UI (without the cluster installer tool), you must import an inventory file (.tfvars file) into Cisco Crosswork so that it reflects the details of your cluster. The inventory file contains information about the VMs in your cluster along with the data center parameters.


Attention


Crosswork cannot deploy or remove VM nodes in your cluster until you complete this operation.



Note


Uncomment the "OP_Status" parameter in the cluster inventory file before you import it manually. Otherwise, the status of the VM incorrectly appears as "Initializing" even after the VM becomes functional.
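
As a rough illustration of this edit, the following Python sketch removes a leading comment marker from any line that mentions OP_Status. It assumes the parameter is commented out with # or //, which are comment markers in Terraform-style .tfvars files. The sample lines, the value shown, and the function name are hypothetical and do not come from a real Crosswork inventory file.

```python
import re

# Matches a line that is commented out with "#" or "//" and mentions OP_Status.
# Group 1 keeps the original indentation; group 2 is the uncommented content.
_COMMENTED_OP_STATUS = re.compile(r"^(\s*)(?:#|//)\s*(.*\bOP_Status\b.*)$")


def uncomment_op_status(text: str) -> str:
    """Return the text with commented-out OP_Status lines uncommented."""
    lines = []
    for line in text.splitlines():
        match = _COMMENTED_OP_STATUS.match(line)
        lines.append(match.group(1) + match.group(2) if match else line)
    # Preserve a trailing newline if the input had one.
    return "\n".join(lines) + ("\n" if text.endswith("\n") else "")


if __name__ == "__main__":
    # Hypothetical inventory fragment, for illustration only.
    sample = '  # OP_Status = 2\n  Name = "vm-1"\n'
    print(uncomment_op_status(sample))
```

Review the result in a text editor before importing it, because the real file may structure this parameter differently.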


Procedure


Step 1

From the main menu, choose Administration > Crosswork Manager.

Step 2

On the Crosswork Summary tab, click the Crosswork Cluster tile to display the Cluster Management window.

Step 3

Choose Actions > Import Cluster Inventory to display the Import Cluster Inventory dialog box.

Step 4

(Optional) Click Download sample template file to download and edit the template.

Step 5

Click Browse and select the cluster inventory file.

Step 6

Click Import to complete the operation.


Deploy New Cluster Nodes

After your Cisco Crosswork cluster is formed, you may need more nodes to meet your requirements. The following steps show how to deploy a new VM node:


Note


The Crosswork Summary window and the Cluster Management window display information about your cluster. While both windows display the status of the same cluster, there may be slight mismatches in the representation. This occurs because the Crosswork Summary window displays the node status based on Kubernetes, while the Cluster Management window also considers the node status in the data center.

An example of this mismatch is when a worker node deployment fails in the Crosswork UI due to insufficient data center resources. In this case, the status of the failed worker node is displayed as "degraded" in the Cluster Management window, while the same status appears as "down" in the Crosswork Summary window.


Before you begin

You must know the following:

  • Details about the Cisco Crosswork network configuration, such as the management IP address.

  • Details about the VMware host where you are deploying the new node, such as the data store and data VM interface IP address.

  • The type of node you want to add. Your cluster must have at least three hybrid nodes and can have up to two worker nodes.

  • If you installed your cluster manually, you must import the cluster inventory file to Cisco Crosswork before you can deploy a new node. For more information, see Import Cluster Inventory. The Deploy VM option will be disabled until you complete the import operation.

Procedure


Step 1

From the main menu, choose Administration > Crosswork Manager.

Step 2

On the Crosswork Summary tab, click the Crosswork Cluster tile to display the Cluster Management window.

Step 3

Choose Actions > Deploy VM to display the Deploy New VM Node window.

Figure 4. Deploy VM Node Window

Step 4

Enter the relevant values in the fields provided.

Step 5

Click Deploy. The system starts to provision the new node in VMware. Cisco Crosswork adds a tile for the new node in the Crosswork Manager window. The tile displays the progress of the deployment.

You can monitor the node deployment status by choosing Cluster Management > Actions > View Job History, or from the VMware user interface.

If you added the VM node using Cisco Crosswork APIs: On the newly added VM node tile, click the menu icon and choose Deploy to complete the operation.

Step 6

If you added this node to reduce a heavy load (utilization above 90%) on the existing nodes, you can rebalance the resources (see Rebalance Cluster Resources for details), or restart some processes to force the system to move them to the newly added node.


Rebalance Cluster Resources

As part of cluster management, Crosswork constantly monitors the resource utilization in each cluster node. If the CPU utilization in any of the nodes becomes high (by default, the "high" range is set as 90-100%), Crosswork triggers a notification prompting you to take action. You can then use the Rebalance feature to reallocate the resources between the existing VM nodes in your cluster.

If the other nodes in your cluster are also nearing full capacity, we recommend that you deploy a new worker node before attempting the Rebalance option, so that resources can be reallocated easily. For more information about adding a worker node, see Deploy New Cluster Nodes.
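
The decision described above can be sketched in a few lines of Python. This is a minimal illustration, not Crosswork's own logic. The 90% threshold is the documented default for the "high" range, while the 80% "near capacity" threshold, the function names, and all node names other than cw-tb2-cluster-01 are assumptions made for this example.

```python
from typing import Dict, List

HIGH_CPU_PERCENT = 90.0       # documented default start of the "high" range
NEAR_CAPACITY_PERCENT = 80.0  # hypothetical threshold, chosen for this sketch only


def high_cpu_nodes(cpu_by_node: Dict[str, float]) -> List[str]:
    """Return the names of nodes whose CPU utilization is in the high range."""
    return sorted(
        name for name, cpu in cpu_by_node.items() if cpu >= HIGH_CPU_PERCENT
    )


def suggest_action(cpu_by_node: Dict[str, float]) -> str:
    """Suggest a next step based on per-node CPU utilization percentages."""
    hot = high_cpu_nodes(cpu_by_node)
    if not hot:
        return "no action needed"
    others = [cpu for name, cpu in cpu_by_node.items() if name not in hot]
    # If no node has headroom, rebalancing alone has nowhere to move the load.
    if not others or all(cpu >= NEAR_CAPACITY_PERCENT for cpu in others):
        return "deploy a worker node, then rebalance"
    return "rebalance"


if __name__ == "__main__":
    usage = {
        "cw-tb2-cluster-01": 100.0,
        "node-02": 45.0,  # hypothetical node names and values
        "node-03": 50.0,
        "node-04": 30.0,
    }
    print(high_cpu_nodes(usage), suggest_action(usage))
```

The utilization values would come from the node tiles in the Cluster Management window. The sketch does not query Crosswork.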


Caution


Rebalancing can take 15 to 30 minutes, during which the Crosswork applications are unavailable. Once initiated, a rebalance operation cannot be canceled.


Before you begin

  • Crosswork must be in maintenance mode before rebalancing to ensure data integrity.

  • Any users logged in during the rebalancing will lose their sessions. Notify other users beforehand that you intend to put the system in maintenance mode for rebalancing, and give them a timeline to log out.

Procedure


Step 1

From the main menu, choose Administration > Crosswork Manager.

Step 2

On the Crosswork Summary tab, click the Crosswork Cluster tile to display the Cluster Management window.

This procedure uses a sample cluster (day0-control) with three hybrid nodes and one worker node. CPU utilization is high on one of the hybrid nodes (100% on cw-tb2-cluster-01), as shown in the following figure.

A banner below the cluster name warns you about resource overutilization on the cluster node and recommends adding more worker nodes.

Figure 5. Rebalance notification

On the tile for the node, you can click the menu icon and choose View Details to see more details.

Step 3

Click Rebalance, and the Rebalance Requirements are displayed. Read through the requirements and select the two check boxes once you are ready to start the rebalancing.

Figure 6. Rebalancing Requirements

Step 4

Click Rebalance to initiate the process. Crosswork begins to reallocate the resources in the overutilized VM node to the other nodes in the cluster.

A dialog box displays the status of the rebalancing. Wait for the process to complete.

Figure 7. Rebalancing Status

Step 5

After the rebalancing process is completed, you may see one of the following result scenarios:

  • Success scenario: A dialog box indicating a successful rebalancing operation is displayed. Follow the instructions in the dialog box to proceed.

    Figure 8. Rebalancing Result - Success
  • Failure scenario - scope available to add new worker nodes: A dialog box indicating rebalancing failure is displayed. In this case, the system prompts you to add a new worker node and try the rebalance process again.

    Figure 9. Rebalancing Result - Add new Worker node
  • Failure scenario - no scope to add new worker nodes: A dialog box indicating rebalancing failure is displayed. In this case, the system prompts you to contact the Cisco Technical Assistance Center (TAC) because new worker nodes cannot be added.

    Figure 10. Rebalancing Result - Add new Worker node
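
The three result scenarios above map to three follow-up actions. The Python sketch below shows that mapping. It assumes that "scope to add new worker nodes" means the cluster has fewer than the maximum of two worker nodes. That reading, and the function name, are assumptions for this example.

```python
MAX_WORKER_NODES = 2  # a cluster can have up to two worker nodes


def next_step_after_rebalance(succeeded: bool, worker_nodes: int) -> str:
    """Map a rebalance result to the follow-up action described above."""
    if succeeded:
        return "follow the instructions in the success dialog box"
    if worker_nodes < MAX_WORKER_NODES:
        # There is still room for another worker node.
        return "add a worker node and retry the rebalance"
    return "contact the TAC"


if __name__ == "__main__":
    print(next_step_after_rebalance(succeeded=False, worker_nodes=1))
```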

View and Edit Data Center Credentials

This section explains the procedure to view and edit the credentials for the data center (such as VMware vCenter) where Cisco Crosswork is deployed.

Before you begin

Ensure you have the current credentials for vCenter.


Note


If you have changed your vCenter password since Crosswork was originally deployed, you may need to update the stored credentials that Crosswork uses when deploying a new VM.


Procedure


Step 1

From the main menu, choose Administration > Crosswork Manager.

Step 2

On the Crosswork Summary tab, click the Crosswork Cluster tile to display the Cluster Management window.

Step 3

Choose Actions > View/Edit Data Center to display the Edit Data Center window.

The Edit Data Center window displays details of the data center.

Step 4

Use the Edit Data Center window to enter values for the Access fields (Address, Username, and Password).

Step 5

Click Save to save the data center credential changes.


View Job History

Use the Job History window to track the status of jobs, such as deploying a VM or importing cluster inventory.

Procedure


Step 1

From the main menu, choose Administration > Crosswork Manager.

Step 2

On the Crosswork Summary tab, click the Crosswork Cluster tile to display the Cluster Management window.

Step 3

Choose Actions > View Job History.

The Job History window displays a list of cluster jobs. You can filter or sort the Jobs list using the fields provided: Status, Job ID, VM ID, Action, and Users.

Step 4

Click any job to view it in the Job Details panel at the right.


Export Cluster Inventory

Use the cluster inventory file to monitor and manage your Cisco Crosswork cluster.

Procedure


Step 1

From the main menu, choose Administration > Crosswork Manager.

Step 2

On the Crosswork Summary tab, click the Crosswork Cluster tile to display the Cluster Management window.

Step 3

Choose Actions > Export Cluster Inventory.

Cisco Crosswork downloads the cluster inventory gzip file to your local directory.
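
To inspect the downloaded file without extracting it by hand, you could use a short script such as the following Python sketch. The internal layout of the exported file is not described here, so the sketch handles two possibilities: a gzip-compressed tar archive (it lists the member names) or a gzip-compressed text file (it returns the lines). The function name and the file path are hypothetical.

```python
import gzip
import tarfile
from typing import List


def list_inventory_contents(path: str) -> List[str]:
    """Return member names for a tar archive, or text lines for a plain gzip file."""
    if tarfile.is_tarfile(path):
        # tarfile detects gzip compression transparently with mode "r:*".
        with tarfile.open(path, "r:*") as archive:
            return archive.getnames()
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return handle.read().splitlines()


if __name__ == "__main__":
    # Hypothetical path; use the location where your browser saved the export.
    for entry in list_inventory_contents("cluster_inventory.tar.gz"):
        print(entry)
```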


Retry Failed Nodes

Node deployments with incorrect information can fail. After providing the correct details, you can retry the deployment.

Procedure


Step 1

From the main menu, choose Administration > Crosswork Manager.

Step 2

On the Crosswork Summary tab, click the Crosswork Cluster tile to display the Cluster Management window.

Figure 11. Cluster Management Window: Failed VM Deployment

Step 3

Click Retry on the failed node tile to display the Deploy New VM Node window.

Step 4

Provide corrected information in the fields provided.

Step 5

Click Deploy.


Erase Nodes

As an administrator, you can erase (that is, remove or delete) any failed or healthy node from the Cisco Crosswork cluster. Erasing a node removes the node reference from the Cisco Crosswork cluster and deletes it from the host VM.

The steps to erase a node are the same for both hybrid and worker nodes. However, the number of nodes you can erase and the timing of the erasure differ in each case:

  • The system must maintain three operational hybrid nodes at all times. If one of the hybrid nodes stops functioning, Crosswork attempts to compensate, but system performance and protection against further failures are severely impacted. In such cases, erase the faulty node and deploy a new hybrid node to replace it.

  • You can have up to two worker nodes. While you can erase all of them without consequences, we recommend that you erase and replace them one at a time.

  • If you are still having trouble after taking these steps, contact the Cisco Customer Experience team for assistance.


Warning


  • Erasing a node is a disruptive action and can block some processes until the action is completed. To minimize disruption, conduct this activity during a maintenance window only.

  • Removing worker and hybrid nodes places extra workload on the remaining nodes and can impact system performance. You are encouraged to contact the Cisco Customer Experience team before removing nodes.

  • While removing a Hybrid or Worker node, the Cisco Crosswork UI may become unreachable for 1-2 minutes, due to the relocation of the robot-ui pod to a new node.



Note


For a manually installed cluster, you must erase the VM from the Crosswork UI and then delete the VM from the data center (for example, vCenter).


Procedure


Step 1

From the main menu, choose Administration > Crosswork Manager.

Step 2

On the Crosswork Summary tab, click the Crosswork Cluster tile to display the Cluster Management window.

Step 3

On the tile for the node you want to remove, click the menu icon and select Erase to display the Erase VM Node dialog box.

Step 4

Click Erase again to confirm the action.

Note

 

A removed node will continue to be visible in the Grafana dashboard as an entry with only historical data.


Manage Maintenance Mode Settings

Maintenance mode provides a means for shutting down the Crosswork system temporarily. The maintenance mode shutdown is graceful. Crosswork synchronizes all application data before the shutdown.

It can take several minutes for the system to enter maintenance mode and to restart when maintenance mode is turned off. During these periods, users should not attempt to log in or use the Crosswork applications.

Before you begin


Attention


  • Make a backup of your Crosswork cluster before enabling the maintenance mode.

  • Notify other users that you intend to put the system in maintenance mode and give them a deadline to log out. The maintenance mode operation cannot be canceled once you initiate it.


Procedure


Step 1

To put Crosswork in maintenance mode:

  1. From the main menu, choose Administration > Settings > System Settings > Maintenance Mode.

  2. Drag the Maintenance slider to the right, or On position.

  3. Crosswork warns you that it is about to initiate a shutdown. Click Continue to confirm your choice.

    It can take several minutes for the system to enter maintenance mode. During that period, other users should not attempt to log in or use the Crosswork applications.

    Note

     

    If you want to reboot the cluster, wait 5 minutes after the system has entered maintenance mode before proceeding, so that the Cisco Crosswork database can sync.

Step 2

To restart Crosswork from maintenance mode:

  1. From the main menu, choose Administration > Settings > System Settings > Maintenance Mode.

  2. Drag the Maintenance slider to the left, or Off position.

    It can take several minutes for the system to restart. During this period, users should not attempt to log in or use the Crosswork applications.

    Note

     

    If a reboot or restore was performed while the system was in maintenance mode, the system boots up in maintenance mode and a pop-up window prompts you to turn maintenance mode off. If no prompt appears (even though the system was rebooted while in maintenance mode), toggle maintenance mode on and then off to allow the applications to function normally.


Cluster System Recovery

When System Recovery Is Needed


Caution


The methods explained in this topic may fail if you use a cluster profile consisting of only 3 hybrid VM nodes (and no worker nodes). The failure happens due to the lack of VM resiliency caused by the absence of worker nodes.


At some time during normal operations of your Cisco Crosswork cluster, you may find that you need to recover the entire system. This can be the result of one or more malfunctioning nodes, one or more malfunctioning services or applications, or a disaster that destroys the hosts for the entire cluster.

A functional cluster requires a minimum of three hybrid nodes. These hybrid nodes share the processing and traffic loads imposed by the core Cisco Crosswork management, orchestration, and infrastructure services. The hybrid nodes are highly available and able to redistribute processing loads among themselves, and to worker nodes, automatically.

The cluster can tolerate one hybrid node reboot (whether graceful or ungraceful). During the hybrid node reboot, the system is still functional, but degraded from an availability point of view. The system can tolerate any number of failed worker nodes, but again, system availability is degraded until the worker nodes are restored.

Cisco Crosswork generates alarms when nodes, applications, or services are malfunctioning. If you are experiencing system faults, examine the alarm and check the health of the individual node, application, or service identified in the alarm. You can use the features described in Cluster Management Overview to drill down on the source of the problem and, if it turns out to be a service fault, restart the problem service.

If you see alarms indicating that one hybrid node has failed, or that one hybrid node and one or more worker nodes have failed, start by attempting to reboot or replace (erase and then re-add) the failed nodes. If you are still having trouble after that, consider performing a clean system reboot.

The loss of two or more hybrid nodes is a double fault. Even if you replace or reboot the failed hybrid nodes, there is no guarantee that the system will recover correctly. There may also be cases where the entire system has degraded to a bad state. For such states, you can deploy a new cluster, and then recover the entire system using a recent backup taken from the old cluster.
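
The guidance in the preceding paragraphs can be summarized as a simple decision function. The following Python sketch is a reading aid under that interpretation, not an official decision tree. The function name and the wording of the returned strings are assumptions for this example.

```python
def recommended_recovery(failed_hybrid: int, failed_worker: int) -> str:
    """Suggest a recovery approach from the number of failed nodes of each type."""
    if failed_hybrid >= 2:
        # Double fault: replacing or rebooting nodes does not guarantee recovery.
        return "deploy a new cluster and restore from a recent backup"
    if failed_hybrid == 1:
        return (
            "reboot or replace the failed nodes; "
            "if problems persist, perform a clean system reboot"
        )
    if failed_worker > 0:
        # Worker failures are tolerated, but availability is degraded.
        return "restore the failed worker nodes"
    return "no node recovery needed"


if __name__ == "__main__":
    print(recommended_recovery(failed_hybrid=1, failed_worker=2))
```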


Important


  • VM shutdown is not supported on a 3 VM cluster that is running the Crosswork Network Controller solution. If a VM fails, the remaining two VMs cannot support all the pods being migrated from the failed VM. You must deploy additional worker nodes to enable the VM shutdown.

  • Reboot of one of the VMs is supported in a 3 VM cluster. In case of a reboot, the VM restore can take from 5 minutes (if the orch pod is not running in the rebooted VM) up to 25 minutes (if the orch pod is running in the rebooted VM).


The following two sections describe the steps to follow in each case.

Clean System Reboot (VMware)

Follow these steps to perform a clean system reboot:

  1. Put Crosswork in Maintenance mode. See Manage Maintenance Mode Settings for more details.

  2. Power down the VM hosting each node:

    1. Log in to the VMware vSphere Web Client.

    2. In the Navigator pane, right-click the VM that you want to shut down.

    3. Choose Power > Power Off.

    4. Wait for the VM status to change to Off.

  3. Repeat Step 2 for each of the remaining VMs, until all the VMs are shut down.

  4. Power up the VM hosting the first of your hybrid nodes:

    1. In the Navigator pane, right-click the VM that you want to power up.

    2. Choose Power > Power On.

    3. Wait for the VM status to change to On, then wait another 30 seconds before continuing.

  5. Repeat Step 4 for each of the remaining hybrid nodes, waiting 30 seconds after each VM powers on before starting the next. Then power up each of your worker nodes, again staggering them by 30 seconds.

  6. The time it takes to power on all the VMs varies with the performance characteristics of your hardware. After all VMs are powered on, wait a few minutes and then log in to Crosswork.

  7. Move Crosswork out of Maintenance mode. See Manage Maintenance Mode Settings for more details.


    Note


    If your Crosswork cluster is not in a healthy state, attempts to force maintenance mode will likely fail. Even if an attempt succeeds, application sync issues may still occur. In such cases, alarms are generated that list the failed services and the failure reasons. If you face this scenario, you can still proceed with the Redeploy and Restore method described in the following section.
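
The ordering in the clean system reboot steps above can be captured as a checklist generator. The following Python sketch only builds the list of steps in the documented order: all VMs off, then hybrid nodes on first, then worker nodes, with a 30-second stagger. It does not call any VMware or Crosswork API, and the function name and VM names are hypothetical.

```python
from typing import List

STAGGER_SECONDS = 30  # documented wait between powering on successive VMs


def clean_reboot_checklist(hybrid_vms: List[str], worker_vms: List[str]) -> List[str]:
    """Return the clean system reboot steps as an ordered checklist."""
    all_vms = hybrid_vms + worker_vms  # hybrid nodes always come first
    steps = ["Put Crosswork in maintenance mode"]
    for vm in all_vms:
        steps.append(f"Power off {vm} and wait for status Off")
    for vm in all_vms:
        steps.append(f"Power on {vm} and wait for status On")
        steps.append(f"Wait {STAGGER_SECONDS} seconds")
    steps.append("Wait a few minutes, then log in to Crosswork")
    steps.append("Move Crosswork out of maintenance mode")
    return steps


if __name__ == "__main__":
    # Hypothetical VM names, for illustration only.
    for number, step in enumerate(
        clean_reboot_checklist(["hybrid-1", "hybrid-2", "hybrid-3"], ["worker-1"]),
        start=1,
    ):
        print(f"{number}. {step}")
```

Printing the checklist before a maintenance window can help confirm that every VM is included and that the hybrid nodes are powered on before the worker nodes.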


Redeploy and Restore (VMware)

Follow these steps to redeploy and recover your system from a backup. Note that this method assumes you have taken periodic backups of your system before it needed recovery. For information on how to take backups, see Manage Cisco Crosswork Backup and Restore.

  1. Power down the VM hosting each node:

    1. Log in to the VMware vSphere Web Client.

    2. In the Navigator pane, right-click the VM that you want to shut down.

    3. Choose Power > Power Off.

    4. Wait for the VM status to change to Off.

    5. Repeat these steps as needed for the remaining nodes in the cluster.

  2. Once all the VMs are powered down, delete them:

    1. In the VMware vSphere Web Client Navigator pane, right-click the VM that you want to delete.

    2. Choose Delete from Disk.

    3. Wait for the VM status to change to Deleted.

    4. Repeat these steps as needed for the remaining VM nodes in the cluster.

  3. Deploy a new Cisco Crosswork cluster, as explained in Cisco Crosswork Network Controller 6.0 Installation Guide.

  4. Recover the system state to the newly deployed cluster, as explained in Restore Cisco Crosswork After a Disaster.

Collect Cluster Logs and Metrics

As an administrator, you can monitor or audit the components of your Cisco Crosswork cluster by collecting periodic logs and metrics for each cluster component. These components include the cluster as a whole, the individual nodes in the cluster, and the microservices running on each node.

Cisco Crosswork provides logs and metrics using the following showtech options:

  • Request All to collect both logs and metrics.

  • Request Metrics to collect only metrics.

  • Collect Logs to collect only logs.

  • View Showtech Jobs to view all showtech jobs.


    Note


    Showtech logs must be collected separately for each application.


Procedure


Step 1

From the main menu, choose Administration > Crosswork Manager.

Step 2

On the Crosswork Summary tab, click the Crosswork Cluster tile to display the Cluster Management window.

Step 3

To collect logs and metrics for the cluster, click Actions and select the showtech option that you want to perform.

Step 4

To collect logs and metrics for any node in the cluster:

  1. Click the node tile.

  2. Click Showtech Options and select the operation that you want to perform.

Step 5

To collect logs and metrics for the individual microservices running on the VM node, click the menu icon under the Actions column. Then select the showtech option that you want to perform.

Step 6

(Optional) Click View Showtech Jobs to view the status of your showtech jobs. The Showtech Requests window displays the details of the showtech jobs.

Figure 12. Showtech Requests window

Step 7

Click Publish to publish the showtech logs. The Enter Destination Server dialog box is displayed. Enter the relevant details and click Publish.

Figure 13. Destination Server window

Step 8

Click Details to view details of the showtech log publishing.