The documentation set for this product strives to use bias-free language. For the purposes of this documentation set, bias-free is defined as language that does not imply discrimination based on age, disability, gender, racial identity, ethnic identity, sexual orientation, socioeconomic status, and intersectionality. Exceptions may be present in the documentation due to language that is hardcoded in the user interfaces of the product software, language used based on RFP documentation, or language that is used by a referenced third-party product. Learn more about how Cisco is using Inclusive Language.
This chapter describes how to configure a high-availability (HA) environment in your Cisco DCNM OVA deployment for your Cisco Dynamic Fabric Automation (DFA) solution. It also includes details about the HA functionality for each of the applications bundled within the Cisco DCNM OVA.
This chapter includes the following sections:
Note For instructions about installing these applications with the Cisco DCNM OVA, see the “Installing the Cisco DCNM OVA” section.
To achieve HA for the applications that run on the Cisco DCNM OVA, you can run two virtual appliances: one in Active mode and the other in Standby mode.
Note This document refers to these appliances as OVA-A and OVA-B, respectively.
1. All applications run on both appliances.
The application data is either constantly synchronized or applications share a common database as applicable.
2. Only one set of applications, running on one of the two appliances, serves the client requests at any given time. Initially, these are the applications running on OVA-A. The applications continue to do so until one of the following happens:
– The application on OVA-A crashes.
– The operating system on OVA-A crashes.
– OVA-A is powered off for some reason.
3. At this point, the application running on the other appliance (OVA-B) takes over.
For the DCNM REST API and AMQP, this transition is handled by load-balancing software that hides the interface addresses of the appliances behind a virtual IP (VIP) address.
For LDAP, both nodes are configured as duplicates of each other. The LDAP clients (switches) are configured with primary and secondary LDAP IP addresses, so that if the active LDAP server fails, they contact the LDAP server running on the standby.
For DHCP, when the first node fails, the second node starts serving the IP addresses.
4. The existing connections to OVA-A are dropped and the new connections are routed to OVA-B.
This scenario demonstrates why one of the nodes (OVA-A) is initially referred to as the Active node and OVA-B is referred to as the Standby node.
The application-level and virtual machine (VM)-level switchover process is as follows.
An application-level failover can also be triggered manually. For instance, you might want to run AMQP on OVA-B and the rest of the applications on OVA-A. In that case, you can log in to the SSH terminal of OVA-A and stop AMQP by using the appmgr stop amqp command.
This failover triggers the same process that is described in the “Automatic Failover” section; subsequent requests to the AMQP Virtual IP address are redirected to OVA-B.
This section contains the following topics that describe the prerequisites for obtaining a high-availability (HA) environment.
You must deploy two standalone OVAs. When you deploy both OVAs, you must meet the following criteria:
After the OVA is powered up, verify that all the applications are up and running by using the appmgr status all command.
After all of the applications are up and running, stop the applications by using the appmgr stop all command.
Note When the OVA is started up for the first time, please wait for all the applications to run before you shut down any of the applications or power off the virtual appliance.
Note For instructions on deploying the Cisco DCNM OVA, see Chapter 2, “Installing Cisco DCNM OVA Management Software”.
The DCNM HA cluster needs a server that has both NFS and SCP capabilities. This server is typically a Linux server.
Note The server has to be in the enhanced fabric management network because the switches will use this server to download images and configurations.
Make sure that the exported directory is writable from both peers. The procedure to export the directory /var/lib/sharedarchive on a CentOS server follows; the steps will vary based on your environment.
Note You might need root privileges to execute these commands. If you are a nonroot user, run them with sudo.
[root@repository ~]# service nfs restart
The same folder /var/lib/sharedarchive can also be accessed through SCP with SCP credentials.
The /var/lib/sharedarchive *(rw,sync) entry in /etc/exports provides read-write permissions to all servers on /var/lib/sharedarchive. Refer to the CentOS documentation for information on restricting write permissions to specific peers.
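As a concrete sketch of the export setup, the commands below stage the exports entry against scratch paths so they can be dry-run without root privileges; on the real repository server you would write the same entry to /etc/exports and restart the NFS service.

```shell
# Sketch: stage the NFS exports entry for /var/lib/sharedarchive.
# Scratch paths are used so this can be dry-run as a nonroot user;
# substitute /etc/exports and /var/lib/sharedarchive on the real server.
exports_entry='/var/lib/sharedarchive *(rw,sync)'
demo_exports=/tmp/exports.demo
mkdir -p /tmp/sharedarchive.demo
echo "$exports_entry" > "$demo_exports"
grep -c 'rw,sync' "$demo_exports"    # prints 1 when the entry is in place
# On the real server, append the entry to /etc/exports and then run:
#   service nfs restart
```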
Two free IPv4 addresses are needed to set up VIP addresses. The first IP address will be used in the management access network; it should be in the same subnet as the management access (eth0) interface of the OVAs. The second IP address should be in the same subnet as enhanced fabric management (eth1) interfaces (switch/POAP management network).
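Before committing to the two VIP addresses, it can help to sanity-check that each candidate sits in the required subnet. The helper below is a hypothetical sketch (the function name and example addresses are not part of the product); it compares two IPv4 addresses under a given prefix length.

```shell
# Hypothetical helper: check that two IPv4 addresses share a subnet.
# Example addresses are placeholders; use your own eth0/eth1 values.
same_subnet() {   # usage: same_subnet A.B.C.D W.X.Y.Z PREFIXLEN
  local ip1=$1 ip2=$2 prefix=$3 n1=0 n2=0 o
  for o in $(echo "$ip1" | tr '.' ' '); do n1=$(( (n1 << 8) + o )); done
  for o in $(echo "$ip2" | tr '.' ' '); do n2=$(( (n2 << 8) + o )); done
  local mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  [ $(( n1 & mask )) -eq $(( n2 & mask )) ]
}
# e.g. candidate management VIP vs. the eth0 address of an OVA:
same_subnet 192.0.2.50 192.0.2.10 24 && echo "VIP is on the eth0 subnet"
```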
This section describes all of the Cisco DFA HA applications.
Cisco DCNM OVA has two interfaces: one that connects to the OVA management network and one that connects to the enhanced fabric management/DFA network. Virtual IP addresses are defined for both interfaces.
Only three Virtual IPs are defined:
Note Although the DCNM OVA in HA sets up a VIP, the VIP is intended for access to the DCNM REST API. For GUI access, we still recommend that you use the individual IP addresses of the DCNM HA peers and use the same addresses to launch the DCNM LAN/SAN Java clients.
See the following table for a complete list of DFA applications and their corresponding HA mechanisms.
The data center network management function is provided by the Cisco Prime Data Center Network Manager (DCNM) server. Cisco DCNM provides the setup, visualization, management, and monitoring of the data center infrastructure. Cisco DCNM can be accessed from your browser at http://[host/ip].
Note For more information about Cisco DCNM, see http://cisco.com/go/dcnm
Cisco DCNMs that run on both OVAs are configured in clustering and federated modes for HA.
You can enable automatic failover in the Cisco DCNM UI by choosing Admin > Federation. If you enable automatic failover and the Cisco DCNM that is running on OVA-A fails, the failover automatically moves only the fabrics and shallow-discovered LANs that are managed by OVA-A to OVA-B.
An OVA HA setup has two VIP addresses (one for each network) for the Cisco DCNM at the default HTTP port. These VIPs can be used for accessing the DCNM RESTful services on the OVA management network and the enhanced fabric management network. For example, external systems such as Cisco UCS Director can point to the VIP in the OVA management network and the request gets directed to the active Cisco DCNM. Similarly, the switches in an enhanced fabric management network access the VIP address on the enhanced fabric management network during the POAP process.
You can still connect directly to the Cisco DCNM real IP addresses and use them as you would with a DCNM in a clustered/federated setup.
Note We recommend that you use the VIP addresses only for accessing the DCNM REST API. To access the Cisco DCNM Web UI or the DCNM SAN/LAN thick clients, connect to the server’s real IP address.
For Cisco DCNM, we recommend that you have licenses on the first instance and a spare matching license on the second instance.
Enable the automatic failover option in the Cisco DCNM UI when an OVA HA pair is set up by choosing Admin > Federation. This setting ensures that if the DCNM that is running on OVA-A fails, all the fabrics and shallow-discovered LANs managed by DCNM-A are managed by DCNM-B automatically after a given time interval (usually about 5 minutes after the failure of the DCNM on OVA-A).
The Cisco DCNM VIP address still resides on OVA-A. The Representational State Transfer Web Services (REST) calls initially hit the VIP addresses on OVA-A and get redirected to the Cisco DCNM that is running on OVA-B.
When the Cisco DCNM on OVA-A comes up, the VIP address automatically redirects the REST requests to DCNM-A.
The VIP address that is configured for Cisco DCNM REST API on OVA-A can fail due to two reasons:
In both cases, the VIP address of Cisco DCNM automatically migrates to OVA-B. The only difference is which DCNM will be used after the failover.
– If a load-balancing software failure occurs, the VIP address on OVA-B directs the requests to DCNM-A.
– If an OVA-A failure occurs, the VIP address on OVA-B directs the requests to DCNM-B.
The automatic failover ensures that the ownership of all of the fabrics and shallow-discovered LANs managed by DCNM-A automatically change to DCNM-B.
When OVA-A is brought up and Cisco DCNM is running, the VIP addresses keep running on the Standby node. The failback of Virtual IP addresses from OVA-B to OVA-A occurs only in the following sequence.
3. OVA-B goes down or the load-balancing software fails on OVA-B.
RabbitMQ is the message broker that provides the Advanced Messaging Queuing Protocol (AMQP).
Note For more information about RabbitMQ, go to http://www.rabbitmq.com/documentation.html
Enabling HA on the OVA creates a VIP address in the OVA management network. Orchestration systems, such as vCloud Director, set their AMQP broker address to the VIP address.
Enabling the HA on the OVA also configures the RabbitMQ broker that runs on each node to be a duplicate of the broker that is running on the other node. Both OVAs act as “disk nodes” of a RabbitMQ cluster, which means that all the persistent messages stored in durable queues are replicated. The RabbitMQ policy ensures that all the queues are automatically replicated to all the nodes.
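For reference only, such a mirror-all policy is conventionally expressed in RabbitMQ with a command of the following shape (the policy name here is illustrative, and the HA setup applies the real policy for you, so there is no need to run this by hand):

```
rabbitmqctl set_policy ha-all ".*" '{"ha-mode":"all"}'
```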
If RabbitMQ-A fails, the VIP address on OVA-A redirects the subsequent AMQP requests to RabbitMQ-B.
When RabbitMQ-A comes up, the VIP address automatically starts directing the AMQP requests to RabbitMQ-A.
The VIP address configured for the AMQP broker on OVA-A can fail due to two reasons:
In both cases, the VIP address of the AMQP automatically migrates to OVA-B. The only difference is which AMQP broker will be used after the failover.
– In a load-balancing software failure, the VIP address on OVA-B directs the requests to RabbitMQ-A.
– In an OVA-A failure, the VIP address on OVA-B directs the requests to RabbitMQ-B.
When OVA-A is brought up and AMQP-A is running, the VIP addresses keep running on OVA-B (directing the requests to AMQP-A). The failback of the RabbitMQ VIP from OVA-B to OVA-A occurs only in the following sequence.
3. OVA-B goes down or the load-balancing software fails on OVA-B.
The OVA installs an LDAP server that serves as an asset database to the switches.
LDAP HA is achieved through OpenLDAP mirror mode replication. Each LDAP server that is running on one OVA becomes a duplicate of the LDAP server that is running on the other OVA.
Both LDAP IP addresses appear in the Cisco DCNM Web UI (Admin > DFA Settings) in the following order: LDAP-A, LDAP-B.
Cisco DCNM always attempts to write to LDAP-A first; if LDAP-A is unavailable, it writes to LDAP-B instead.
The data on LDAP-B eventually gets replicated to LDAP-A when it becomes available.
When you configure the asset databases, every switch is configured with multiple LDAP servers, as shown in the following example.
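A sketch of the switch-side configuration follows (verify the exact command syntax against your NX-OS release; the addresses reuse the example IPs cited later in this section):

```
fabric database type network
  server protocol ldap ip 10.77.247.148 vrf management
  server protocol ldap ip 10.77.247.147 vrf management
```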
The first active LDAP server that is configured in the switch becomes the Active LDAP server. The Active LDAP server is queried first for autoconfigurations.
For every read operation that the switch needs to perform, the Active LDAP server is contacted first, followed by the rest of the LDAP servers.
Use the show fabric database statistics command to find the Active LDAP server, which is marked by an asterisk (*) in the output.
In the previous example, during autoconfiguration, a leaf switch first queries 10.77.247.148, which is the active network database (indicated by “*n”). If that server is not available, the switch automatically contacts the second LDAP server configured as a network database (10.77.247.147 in this example).
This section describes the behavior when you use a remote LDAP server in an HA environment.
Cisco DCNM allows only two external LDAP servers that are assumed to be synchronized with each other.
The switch-to-LDAP interaction with a remote LDAP server is the same as when you use the OVA-packaged LDAP. The Active LDAP server is contacted first; if it is not reachable, the switch then attempts to read from the next available LDAP server.
The DHCP servers on both OVAs listen on the interface of the enhanced fabric management network. The native Internet Systems Consortium (ISC) DHCPD failover mechanism is used for HA, and the lease information is automatically synchronized using native code.
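For context, an ISC DHCPD failover pairing is declared with a stanza of roughly this shape in dhcpd.conf (illustrative values only; the HA setup generates the actual configuration, and you should not edit it by hand):

```
failover peer "dcnm-dhcp-ha" {
    primary;                  # "secondary;" on the peer OVA
    address 192.0.2.11;       # this node's eth1 address (placeholder)
    peer address 192.0.2.12;  # peer OVA's eth1 address (placeholder)
    mclt 3600;                # primary-only statements
    split 128;
}
```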
When a tenant host or virtual machine (VM) comes up, it sends a broadcast that is relayed by the leaf node. In such a scenario, the VM profiles should be configured with both relay addresses of OVA-A and OVA-B.
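As an illustration, the relay portion of such a profile would resemble the following (placeholder addresses; the actual profile syntax depends on your release):

```
! Both OVAs' enhanced fabric management (eth1) addresses as DHCP relays
interface vlan $vlanId
  ip dhcp relay address 192.0.2.11   ! OVA-A eth1 (placeholder)
  ip dhcp relay address 192.0.2.12   ! OVA-B eth1 (placeholder)
```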
Scope changes made through the Cisco DCNM UI ensure proper synchronization of scopes among the peers. We do not recommend manually editing the DHCP scope configuration file.
Note You must update the IP range for the default scope before creating a new scope; otherwise, DHCP will be unable to start. See the “Starting DHCP in an HA Setup” section for information on updating the IP range for the DHCP scope through the Cisco DCNM UI.
Because both of the OVAs in an HA environment are deployed identically, either one of them can be the Active peer. The other OVA would be the Standby peer. All of the configuration CLI commands in the following sections are executed from the secure shell (SSH) terminal.
Step 1 Log in to the SSH terminal of the OVA that you want to become the Active peer and enter the appmgr setup ha active command.
Step 2 Make sure that each prerequisite is in place and press y; if not all of the prerequisites are in place, press n to exit.
A prompt for the root password appears.
Step 3 Enter the administrative password created during OVA installation.
You will now be prompted for the management access interface (eth0 IP address) of the Standby peer.
Step 4 Enter the management IP address of the peer DCNM.
The Active OVA generates an authentication key pair and transfers the public key to the peer’s authorized keys file.
a. Enter the root password of the Standby peer when prompted.
All of the other network information needed from the Standby peer is automatically picked up by the Active peer and displayed for confirmation.
b. Ensure that it is the correct peer and press y to continue.
Step 5 Enter the VIP addresses for both the management access (eth0) and enhanced fabric management networks (eth1).
Make sure that the VIP addresses are currently not used by any other interfaces in their respective networks.
Step 6 Enter the database URL to set the database. The script uses a JDBC thin driver, so enter the URL in the corresponding JDBC thin driver format.
a. Enter the database password.
b. Enter the database password again for verification.
The script tries to do a sample query from the database to check the details entered. The Cisco DCNM schema and related data are loaded after you confirm that all the data are valid.
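As a hypothetical example, a JDBC thin-driver URL has the following shape (the host, port, and SID shown are placeholders, not values from this setup):

```
jdbc:oracle:thin:@192.0.2.20:1521:XE
```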
Step 7 Enter repository settings:
a. Enter an SCP/NFS repository IP address for the enhanced fabric management network.
b. Enter the IP/exported-directory location.
The script does a test mount and unmounts it shortly after. It is permanently mounted after user confirmation. Similar checks are done for SCP repository users.
c. You will have to enter the SCP password three times (twice for the script and the third time when the script does a test write on the repository).
d. Enter an NTP server IP address. This step is very important for all the applications that run on a cluster.
Step 8 A summary of the details entered is displayed. If you want to reenter the details, press n.
Once the HA setup is complete, you can check the HA role of the node by entering the appmgr show ha-role command.
Step 1 Log in to the SSH terminal of OVA-B and enter the appmgr setup ha standby command.
The Standby OVA generates an authentication key pair and transfers the public key to the peer’s authorized keys file.
a. Enter the root password of the Active peer when prompted.
All of the other network information entered during the Active OVA setup is automatically picked up by the Standby peer and displayed for confirmation.
b. Carefully check that it is the correct peer and press y to continue.
Once confirmed, OVA-B is configured to be a Standby peer, and the following message is displayed.
Note For information about updating the default POAP scopes and starting DHCP in an HA setup, see the “Starting DHCP in an HA Setup” section.
Step 3 Check the HA role of the node by entering the appmgr show ha-role command.
Step 1 Log in to the SSH terminal of the Active peer (OVA-A) and start all applications by entering the appmgr start all command.
Step 2 Wait for all the applications to start. Once all applications (except dhcpd) are up and running, go to the next procedure.
Note To start DHCP using HA, see the “Starting DHCP in an HA Setup” section.
Step 1 Log in to the SSH terminal of the Standby peer and start all applications by entering the appmgr start all command. Wait for all the applications to start.
Step 2 Once all applications (except dhcpd) are up and running, proceed to the next step.
Note To start DHCP in an HA setup, see the “Starting DHCP in an HA Setup” section.
In an HA setup, DHCPD is initially down. Use the following procedure to update the IP range for the POAP DHCP scope and bring up DHCP.
Note You must update the IP range for the default scope before creating the new scope, otherwise DHCP will be unable to start.
Step 1 Log in to Cisco DCNM web UI.
Step 2 On the menu bar, choose Config > POAP > DHCP Scope and enter a free IP address range for the default DHCP scope named enhanced_fabric_mgmt_scope.
DHCP is automatically started on both the OVAs.
Step 4 Verify that all applications are running by opening an SSH terminal session and entering the appmgr status all command.