Troubleshooting Cisco APIC-EM Single and Multi-Host

Use the following information to troubleshoot Cisco APIC-EM single-host and multi-host deployments:

Recovery Procedures for Cisco APIC-EM Node Failures

The following table describes the recommended procedures for resolving a Cisco APIC-EM single-host node failure.

Table 1  Single Host Recovery Procedures

Node Failure Scenario

Symptoms and Recovery Procedures

Power outage

In most cases, the node recovers automatically when power is restored. In rare situations, some of the Cisco APIC-EM services may not come up cleanly because of transient conditions. In such cases, perform the following steps to ensure that the node comes back online cleanly:

  1. If you have not already done so, restart the power on the failed host.

  2. Reset the host.

    See Resetting the Cisco APIC-EM.

Bad or faulty hardware

Perform the following steps to recover from a node failure scenario due to bad or faulty hardware:

  1. RMA the bad or faulty hardware.

  2. Install the replacement hardware.

    See the Cisco Application Policy Infrastructure Controller Enterprise Module Installation Guide.

  3. Install Cisco APIC-EM controller software on the new hardware.

    See the Cisco Application Policy Infrastructure Controller Enterprise Module Installation Guide.

  4. Restore your database backup using the controller's GUI.

    See the Cisco Application Policy Infrastructure Controller Enterprise Module Administrator Guide.

  5. Ensure that you have installed and enabled any applications that were previously running on the controller.

    See the Cisco Application Policy Infrastructure Controller Enterprise Module Administrator Guide.

  6. If applicable to your configuration, add the new host to the cluster.

    See Adding a New Host to a Multi-Host Cluster.

Controller software upgrade failure

In this case, to recover from the upgrade failure and return to the current Cisco APIC-EM version, perform the following steps:

  1. Restore your database backup using the controller's GUI.

    See the Cisco Application Policy Infrastructure Controller Enterprise Module Administrator Guide.

  2. Ensure that you have installed and enabled any applications that were previously running on the controller.

    See the Cisco Application Policy Infrastructure Controller Enterprise Module Administrator Guide.

The following table describes the recommended procedures for resolving a Cisco APIC-EM multi-host node failure.

Table 2  Multi-Host Recovery Procedures

Node Failure Scenario

Symptoms and Recovery Procedures

Power outage causing one or more of the cluster nodes to go down.

In most cases, the hosts rejoin the Cisco APIC-EM cluster on their own when power is restored. In rare situations, some of the Cisco APIC-EM services may not form the cluster with the existing Cisco APIC-EM hosts. In such cases, perform the following steps to ensure that the failed host joins the cluster:

  1. If you have not already done so, restart the power on the failed host.

  2. Reset the host.

    See Resetting the Cisco APIC-EM.

Note   

If the host does not come back up after a power outage, follow the procedure directly below for recovering from bad or faulty hardware.

Bad or faulty hardware on one of the cluster nodes.

In this case, first remove the faulty host from the cluster and then add the new host to the cluster. Perform the following steps:

  1. RMA the bad or faulty hardware.

  2. Install the replacement hardware.

    See the Cisco Application Policy Infrastructure Controller Enterprise Module Installation Guide.

  3. Install Cisco APIC-EM controller software on the new hardware.

    See the Cisco Application Policy Infrastructure Controller Enterprise Module Installation Guide.

  4. Restore your database backup using the controller's GUI.

    See the Cisco Application Policy Infrastructure Controller Enterprise Module Administrator Guide.

  5. Ensure that you have installed and enabled any applications that were previously running on the controller.

    See the Cisco Application Policy Infrastructure Controller Enterprise Module Administrator Guide.

  6. Add the new host to the cluster.

    See Adding a New Host to a Multi-Host Cluster.

Network connectivity issues between the cluster nodes.

In most cases, the nodes rejoin the Cisco APIC-EM cluster on their own when network connectivity is restored. In rare situations, some of the Cisco APIC-EM services may not form the cluster with the existing Cisco APIC-EM nodes. In such cases, perform the following step to ensure that the failed node joins the cluster:

  1. Reset the host.

    See Resetting the Cisco APIC-EM.

Controller software upgrade failure on one of the cluster hosts.

In this case, to recover from the upgrade failure and return to the current Cisco APIC-EM version, perform the following steps:

  1. Restore your database backup using the controller's GUI.

    See the Cisco Application Policy Infrastructure Controller Enterprise Module Administrator Guide.

  2. Ensure that you have installed and enabled any applications that were previously running on the controller.

    See the Cisco Application Policy Infrastructure Controller Enterprise Module Administrator Guide.

Hardware upgrade on one of the cluster nodes.

Gracefully shut down the host, upgrade the hardware (RAM, CPU, etc.), and restart the host.

See Shutting Down and Starting Up a Host in a Multi-Host Cluster.

Removing a Single Host from a Multi-Host Cluster

To troubleshoot an issue with a multi-host cluster, you may need to remove a single host from the cluster. This procedure describes how to remove one of the hosts running Cisco APIC-EM from a multi-host cluster. You use the Cisco APIC-EM configuration wizard to perform this procedure.


Note


The configuration wizard option to remove a host only appears if the host on which you are running the configuration wizard is part of a multi-host cluster. If the host is not part of a multi-host cluster, then the option to remove a host does not display. When performing this procedure, controller downtime occurs. For this reason, we recommend that you perform this procedure during a maintenance time period.


Before You Begin

You should have installed the Cisco APIC-EM on a multi-host cluster as described in the Cisco Application Policy Infrastructure Controller Enterprise Module Installation Guide.

You must perform this procedure on the single host that is to be removed from the multi-host cluster.

The multi-host cluster should still be operational.


    Step 1   Using a Secure Shell (SSH) client, log into the host (appliance, server, or virtual machine) with the IP address that you specified using the configuration wizard.
    Note   

    The IP address to enter for the SSH client is the IP address that you configured for the network adapter. This IP address connects the appliance to the external network.

    Step 2   When prompted, enter your Linux username ('grapevine') and password for SSH access.
    Step 3   Enter the following command to access the configuration wizard.
    $ config_wizard
    
    
    Note   

    The config_wizard command is in the PATH of the 'grapevine' user, and not the "root" user. Either run the command as the "grapevine" user, or fully qualify the command as the "root" user. For example: /home/grapevine/bin/config_wizard

    Step 4   Review the Welcome to the APIC-EM Configuration Wizard! screen and choose the option to remove the host from the cluster:
    • Remove this host from its APIC-EM cluster

    Step 5   A message appears with the following options:
    • [cancel]—Exit the configuration wizard.

    • [proceed]—Begin the process to remove this host from its cluster.

    Choose proceed>> to begin. After choosing proceed>>, the configuration wizard begins to remove this host from its cluster.
    Step 6   At the end of this process, run the configuration wizard again, either to configure the host as a new Cisco APIC-EM controller or to join it to a cluster.
    Important:

    If you want to use this host again, either as a standalone controller or within a cluster, you must run the configuration wizard again and reinstall the Cisco APIC-EM. Do not attempt to reuse this host in either role without reinstalling the Cisco APIC-EM.
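The note in Step 3 about where config_wizard is found can be sketched as a small helper. This is an illustrative sketch only: the function name wizard_cmd is hypothetical, and treating every non-grapevine user the same way as root is an assumption. The only details taken from this procedure are the 'grapevine' user and the path /home/grapevine/bin/config_wizard.

```shell
# Hypothetical helper: print how to invoke the configuration wizard
# for a given Linux user (defaults to the current user).
wizard_cmd() {
    user="${1:-$(id -un)}"
    if [ "$user" = "grapevine" ]; then
        # config_wizard is in the PATH of the 'grapevine' user.
        echo "config_wizard"
    else
        # Assumption: any other user (for example, root) needs the full path.
        echo "/home/grapevine/bin/config_wizard"
    fi
}

wizard_cmd grapevine
wizard_cmd root
```

The helper only prints the command; it does not start the wizard.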


    Removing a Faulted Host from a Multi-Host Cluster

    Perform the steps in the following procedure to remove a faulted or inoperative host (running Cisco APIC-EM) from a multi-host cluster. You use the Cisco APIC-EM configuration wizard to perform this procedure. A host becomes faulted when it can no longer participate in the cluster due to hardware or software issues.

    After following this procedure on a three-host cluster (moving from three hosts to two hosts), you lose high-availability protection against the loss of a host. After following this procedure on a two-host cluster, the cluster becomes inoperable until the second host is brought back up and added to the cluster.
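The effect of removing a faulted host, as described above, can be summarized in a short sketch. The function name state_after_removal is hypothetical. Cluster sizes other than two or three hosts are not covered by this section, so the sketch reports them as such rather than guessing.

```shell
# Hypothetical helper: describe the cluster after one faulted host is
# removed, given the number of hosts in the cluster before removal.
state_after_removal() {
    case "$1" in
        3) echo "operational on two hosts, without high-availability protection" ;;
        2) echo "inoperable until the second host is restored and added back" ;;
        *) echo "not covered by this section" ;;
    esac
}

state_after_removal 3
state_after_removal 2
```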


    Note


    The fact that the host becomes "faulted" results in replacement instances of the services on the faulted host being grown on the remaining hosts in the cluster. During the time period when the replacement instances are being grown and depending on the types of services being grown, certain Cisco APIC-EM functionality may not be available.


    Before You Begin

    You have installed the Cisco APIC-EM on a multi-host cluster following the procedure described in the Cisco Application Policy Infrastructure Controller Enterprise Module Installation Guide.

    You must perform this procedure on an active host in the multi-host cluster. You cannot perform this procedure on the faulted host that is to be removed from the multi-host cluster. A faulted host is displayed as red in the System Health tab view in the Home page of the controller's GUI.


    Note


    Always attempt first to bring the faulted host back online. If you determine that the faulted host can no longer participate in the cluster, try to remove it using the Remove this host from its APIC-EM cluster configuration wizard option (as described in the previous procedure). Follow this procedure, which uses the Remove a faulted host from this APIC-EM cluster configuration wizard option, only if that other option has been tried and failed to remove the host.



      Step 1   Using a Secure Shell (SSH) client, log into the host (appliance, server, or virtual machine) with the IP address that you specified using the configuration wizard.
      Note   

      The IP address to enter for the SSH client is the IP address that you configured for the network adapter. This IP address connects the appliance to the external network.

      Step 2   When prompted, enter your Linux username ('grapevine') and password for SSH access.
      Step 3   Enter the following command to access the configuration wizard.
      $ config_wizard
      
      
      Note   

      The config_wizard command is in the PATH of the 'grapevine' user, and not the "root" user. Either run the command as the "grapevine" user, or fully qualify the command as the "root" user. For example: /home/grapevine/bin/config_wizard.

      Step 4   Review the Welcome to the APIC-EM Configuration Wizard! screen and choose the option to forcibly remove the faulted host from the cluster:
      • Remove a faulted host from this APIC-EM cluster

      Step 5   A message appears with the following options:
      • <Remove IP Address from cluster>—Forcibly removes the faulted host (identified by its IP address) from the multi-host cluster.

      • <exit>—Exit the configuration wizard without removing the faulted host.

      Choose <Remove IP Address from cluster> to begin. After choosing <Remove IP Address from cluster>, the configuration wizard begins to remove this faulted host from its cluster.
      Step 6   At the end of this process, you must then either run the configuration wizard again to configure the host as a new controller or join the controller to a cluster.
      Important:

      If you want to use this host again, either as a standalone controller or within a cluster, you must run the configuration wizard again and reinstall the Cisco APIC-EM. Do not attempt to reuse this host in either role without reinstalling the Cisco APIC-EM.


      Resetting the Cisco APIC-EM

      You can troubleshoot a Cisco APIC-EM deployment by resetting the controller to the configuration values that were originally set when the configuration wizard was first run. A reset is helpful when the controller has entered an unstable state and other troubleshooting activities have not resolved the situation.


      Note


      In a multi-host environment, you need to perform this procedure on only a single host. After performing this procedure on a single host, the other two hosts will be automatically reset.


      Before You Begin

      You have installed the Cisco APIC-EM following the procedure described in the Cisco Application Policy Infrastructure Controller Enterprise Module Installation Guide.


        Step 1   Using a Secure Shell (SSH) client, log into the host (physical or virtual) with the IP address that you specified using the configuration wizard.
        Note   

        The IP address to enter for the SSH client is the IP address that you configured for the network adapter. This IP address connects the host to the external network.

        Step 2   When prompted, enter your Linux username ('grapevine') and password for SSH access.
        Step 3   Navigate to the bin directory on the Grapevine root. The bin directory contains the grapevine scripts.
        Step 4   Enter the reset_grapevine command at the prompt to run the reset grapevine script.
        $ reset_grapevine
        
        

        The reset_grapevine command returns the configuration settings back to values that you configured when running the configuration wizard for the first time. The configuration settings are saved to a .JSON file. This .JSON file is located at: /etc/grapevine/controller-config.json. The reset_grapevine command uses the data in the controller-config.json file to return to the earlier configuration settings, so do not delete this file. If you delete this file, you must run the configuration wizard again and reenter your configuration data.

        Important:

        The reset_grapevine command will terminate if the SSH connection is disconnected for any reason. To avoid this, we recommend that you run the reset_grapevine command in a tmux (terminal multiplexer) session. tmux is already installed on the controller. You can use the following tmux commands:

        tmux new -s session_name reset_grapevine

        Command to create a new tmux session that runs reset_grapevine.

        For example, you can enter the following command:

        tmux new -s session100 reset_grapevine

        tmux ls

        Command to view the list of tmux sessions.

        tmux attach -t session_name

        Command to attach to an existing tmux session. The attach command takes only the session name, not a command to run.

        For example, you can enter the following command:

        tmux attach -t session100

        To get more information about tmux, you can run the man tmux command.

        After entering the reset_grapevine command, you are then prompted to reenter your Grapevine password.

        Step 5   Enter your Grapevine password a second time.
        
        [sudo] password for grapevine:********
        
        

        You are then prompted to delete all virtual disks. The virtual disks are where the Cisco APIC-EM database resides. For example, data about devices that the controller discovered is saved on these virtual disks. If you enter yes (y), all of this data is deleted. If you enter no (n), the cluster comes up populated with your existing data once the reset procedure completes.

        Step 6   Enter n to prevent the deletion of all of the virtual disks.
        
        THIS IS A DESTRUCTIVE OPERATION
        Do you want to delete all VIRTUAL DISKS in your APIC-EM cluster? (y/n):n
        
        

        You are then prompted to delete all Cisco APIC-EM authentication timeout policies, user password policies, and user accounts other than the primary administrator account.

        Step 7   Enter n to prevent the deletion of all authentication timeout policies, user password policies, and user accounts other than the primary administrator account.
        
        THIS IS A DESTRUCTIVE OPERATION
        Do you want to delete authentication timeout policies, user password policies, 
        and Cisco APIC-EM user accounts other than the primary administrator account? (y/n): n
        
        

        You are then prompted to delete any imported certificates.

        Step 8   Enter n to prevent the deletion of any imported certificates.
        
        THIS IS A DESTRUCTIVE OPERATION
        Do you want to delete the imported certificates? (y/n): n
        
        

        You are then prompted to delete any backups.

        Step 9   Enter n to prevent the deletion of any backups.
        
        THIS IS A DESTRUCTIVE OPERATION
        Do you want to delete the backups? (y/n): n
        
        

        The controller then resets itself to the configuration values that were originally set when the configuration wizard was first run. When the controller finishes resetting, you are returned to a command prompt on the controller.

        Step 10   Using the Secure Shell (SSH) client, log out of the host.
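The tmux recommendation in Step 4 can be sketched as a helper that decides whether to start a new session or reattach to one that is already running. This is a sketch under assumptions: the function name reset_session_cmd and the session name are hypothetical, and tmux has-session (a standard tmux command that is not mentioned in this procedure) is used to test for an existing session. The helper only prints the command so that you can review it before running it.

```shell
# Hypothetical helper: print the tmux command to use for reset_grapevine.
# If the named session already exists, reattach to it; otherwise create
# a new session that runs reset_grapevine.
reset_session_cmd() {
    session="$1"
    if tmux has-session -t "$session" 2>/dev/null; then
        echo "tmux attach -t $session"
    else
        echo "tmux new -s $session reset_grapevine"
    fi
}

# Example: review the printed command, then run it yourself.
reset_session_cmd session100
```

Reattaching rather than starting a second session matters here because reset_grapevine is a destructive operation that should not be started twice.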

        Adding a New Host to a Multi-Host Cluster

        Perform the steps in this procedure to configure Cisco APIC-EM on your host and to join it to another, pre-existing host to create a cluster. Configuring the Cisco APIC-EM on multiple hosts to create a cluster is best practice for both high availability and scale.


        Caution


        • When joining a host to a cluster as described in the procedure below, there is no merging of the data on the two hosts. The data that currently exists on the host that is joining the cluster is erased and replaced with the data that exists on the cluster being joined.

        • When joining additional hosts to form a cluster, be sure to join only a single host at a time. Do not join multiple hosts at the same time, as doing so results in unexpected behavior.

        • Expect some service downtime when adding hosts to or removing hosts from a cluster, because the services are redistributed across the hosts.


        Before You Begin

        You must have performed the following prerequisites:

        • You must have either received a Cisco APIC-EM Controller Appliance with the Cisco APIC-EM pre-installed or you must have downloaded, verified, and installed the Cisco ISO image onto a second server or virtual machine.

        • You must have already configured Cisco APIC-EM on the first host (server or virtual machine) in your planned multi-host cluster following the steps in the previous procedure.

        • Additionally, you must have checked the controller's health on the first host using the SYSTEM HEALTH tab in the GUI. The SYSTEM HEALTH tab is directly accessible from the HOME page. For information about this procedure, see the Cisco Application Policy Infrastructure Controller Enterprise Module Administrator Guide.

        This procedure must be run on the second host that you are joining to the cluster. When joining the new host to the cluster, you must specify an existing host in the cluster to connect to.


        Note


        The Cisco APIC-EM multi-host configuration supports the following two workflows:

        • You first configure a single host running Cisco APIC-EM in your network. After performing this procedure, you then use the wizard to configure and join two additional hosts to form a cluster.

        • If you already have several single hosts configured with Cisco APIC-EM, you can use the configuration wizard to join two additional hosts to a single host to form a cluster.



          Step 1   Boot up the host.
          Step 2   Review the APIC-EM License Agreement screen that appears and choose either <view license agreement> to review the license agreement or accept>> to accept the license agreement and proceed with the deployment.
          Note   

          You will not be able to proceed without accepting the license agreement.

          After accepting the license agreement, you are then prompted to select a configuration option.

          Step 3   Review the Welcome to the APIC-EM Configuration Wizard! screen and choose one of the two displayed options to begin.
          • Create a new APIC-EM cluster

          • Add this host to an existing APIC-EM cluster

          For the multi-host deployment, click the Add this host to an existing APIC-EM cluster option.

          Step 4   Enter configuration values for the NETWORK ADAPTER #1 on the host.

          The configuration wizard discovers and prompts you to confirm values for the network adapter or adapters on your host. For example, if your host has two network adapters you are prompted to confirm configuration values for network adapter #1 (eth0) and network adapter #2 (eth1).

          Important:

          On Cisco UCS servers, the NIC labeled with number 1 would be the physical NIC. The NIC labeled with the number 2 would be eth1.

          Host IP address

          Enter a host IP address to use for the network adapter. This host IP address connects to the external network or networks.

          Note   

          The network adapters connect to the external network or networks. These external networks contain the network devices and NTP servers, and provide access to the northbound REST APIs and to the controller GUI.

          Netmask

          Enter the netmask for the network adapter's IP address.

          Later in this procedure, the following information will be discovered and copied from the cluster to the configuration file of this host:

          • Default Gateway IP address

          • DNS Servers

          • Static Routes

          Once satisfied with the controller network adapter settings, enter next>> to proceed. After entering next>>, the configuration wizard proceeds to validate the values you entered. After validation, you are then prompted to enter values for the APIC-EM CLUSTER SETTINGS.

          Step 5   Enter configuration values for the APIC-EM CLUSTER SETTINGS.

          Remote Host IP

          Enter the eth0 IP address of the pre-configured host that you are now joining to form a cluster.

          Note   

          If a virtual IP address has already been configured on another host for a multi-host cluster, you may also enter that IP address value. This field accepts either the IP address of a pre-configured host in the cluster or the virtual IP address of the cluster.

          Administrator Username

          Enter an administrator username.

          This is the administrator username on the pre-configured host that you are now joining to form a cluster.

          Administrator Password

          Enter an administrator password.

          This is the administrator password on the pre-configured host that you are now joining to form a cluster.

          For information about the requirements for an administrator password, see the Password Requirements section in Chapter 2, Securing the Cisco APIC-EM in the Cisco Application Policy Infrastructure Controller Enterprise Module Administrator Guide.

          Note   

          The administrator password is encrypted and hashed in the controller database.

          After configuring the administrator cluster settings, enter next>> to proceed. After entering next>>, the configuration wizard then proceeds to prepare the host to join the cluster.

          A message asks you to wait while the remote cluster is queried and data is retrieved.

          Step 6   Enter configuration values for the Virtual IP.
          Note   

          If you are joining the host to a cluster where the virtual IP has already been configured, then you will not be prompted for virtual IP configuration values. If you are joining the host to a cluster where a virtual IP has not yet been configured, then you will be prompted for virtual IP configuration values.

          Virtual IP

          Enter the virtual IP address to use for the network that the controller is directed to.

          Note   

          For additional information about virtual IP, see the Cisco Application Policy Infrastructure Controller Enterprise Module Installation Guide.

          Once satisfied with the virtual IP address settings, enter next>> to proceed. After entering next>>, the configuration wizard proceeds to validate the values you entered.

          Step 7   (Optional) Enter additional configuration values for the Virtual IP.

          The configuration wizard continues to discover any pre-existing configuration values on the hosts in the cluster. Depending on what the configuration wizard discovers, you may be prompted to enter additional configuration values. For example:

          • If eth1 was configured on a pre-existing host in the cluster, then you are prompted to enter the host IP address that was configured for eth1. You are also prompted for a VIP, if it has not yet been configured for this NIC.

          • If eth2 was configured on a pre-existing host in the cluster, then you are prompted to enter the host IP address that was configured for eth2. You are also prompted for a VIP, if it has not yet been configured for this NIC.

          • If eth3 was configured on a pre-existing host in the cluster, then you are prompted to enter the host IP address that was configured for eth3. You are also prompted for a VIP, if it has not yet been configured for this NIC.

          Note   

          This configuration wizard discovery process and prompting continues for the number of configured Ethernet ports in the cluster.

          Virtual IP

          Enter the virtual IP address to use for the network that the controller is directed to.

          IP address

          Enter an IP address to use for this network adapter. This IP address connects to the external network or networks.

          Note   

          The network adapters connect to the external network or networks. These external networks contain the network devices and NTP servers, and provide access to the northbound REST APIs and to the controller GUI.

          Once satisfied with the virtual IP address settings, enter next>> to proceed. After entering next>>, the configuration wizard proceeds to validate the values you entered.

          Step 8   A final message appears stating that the wizard is now ready to proceed to join the host to the cluster.

          The following options are available:

          • [back]—Review and verify or modify your configuration settings.

          • [cancel]—Discard your configuration settings and exit the configuration wizard.

          • [proceed]—Save your configuration settings and begin the process to join this host to the specified Cisco APIC-EM.

          Enter proceed>> to proceed. After entering proceed>>, the configuration wizard applies the configuration values that you entered above.

          Note   

          At the end of the configuration process, a successful configuration message appears.

          Step 9   Open your browser and enter an IP address to access the Cisco APIC-EM GUI.

          You can use the first displayed IP address of the Cisco APIC-EM GUI at the end of the configuration process.

          Note   

          The first displayed IP address can be used to access the Cisco APIC-EM GUI. The second displayed IP address accesses the network where the devices reside.

          Step 10   After entering the IP address in the browser, a message stating that "Your connection is not private" appears.

          Ignore the message and click the Advanced link.

          Step 11   After clicking the Advanced link, a message stating that the site’s security certificate is not trusted appears.

          Ignore the message and click the link.

          Note   

          This message appears because the controller uses a self-signed certificate. You will have the option to upload a trusted certificate using the controller GUI after installation completes.

          Step 12   In the Login window, enter the administrator username and password that you configured above and click the Log In button.

          What to Do Next

          Proceed to follow the same procedure described here to join the third and final host to the multi-host cluster.

          After configuring each host be sure to check the controller's health on the host using the SYSTEM HEALTH tab in the GUI. The SYSTEM HEALTH tab is directly accessible from the HOME page. For information about this procedure, see the Cisco Application Policy Infrastructure Controller Enterprise Module Administrator Guide.


          Note


          You can send feedback about the Cisco APIC-EM by clicking the Feedback icon ("I wish this page would....") at the lower right of each window in the GUI. Clicking on this icon opens an email. Use this email to send a comment on the current window or to send a request to the Cisco APIC-EM development team.


          Shutting Down and Starting Up a Host in a Multi-Host Cluster

          Perform the steps in this procedure to gracefully shut down and restart a host in a multi-host cluster.


          Note


          It is best practice to gracefully shut down a host before removing it from the multi-host cluster.


          Before You Begin

          You should have installed the Cisco APIC-EM on a multi-host cluster as described in the Cisco Application Policy Infrastructure Controller Enterprise Module Installation Guide.

          You must perform this procedure on the single host that is to be shut down.

          The multi-host cluster should still be operational.


            Step 1   Using a Secure Shell (SSH) client, log into the host (appliance, server, or virtual machine) with the IP address that you specified using the configuration wizard.
            Note   

            The IP address to enter for the SSH client is the IP address that you configured for the network adapter. This IP address connects the appliance to the external network.

            Step 2   When prompted, enter your Linux username ('grapevine') and password for SSH access.
            Step 3   Enter the following command to redeploy services off of this host and onto the other hosts in the multi-host cluster.
            $ grape host evacuate
            
            
            Step 4   Power off the host.
            Step 5   Proceed to perform any troubleshooting or maintenance operations on the host that you powered off.
            Step 6   Power the host back on.
            Step 7   If the host comes up and no error message appears, enter the following command on the host to enable services on it.
            $ grape host enable
            
            
            Note   

            If the host fails to come up, then proceed directly to step 9 below.

            Step 8   If the host comes up and no error message appears, enter the following additional command on the host to rebalance services between it and the other hosts in your multi-host cluster.
            $ grape instance rebalance
            
            
            Note   

            If the host comes up and no error message appears, you are finished with the procedure. If the host fails to come up, proceed directly to step 9 below.

            Step 9   Log into one of the other operational hosts (working hosts) in the multi-host cluster.
            Step 10   Enter the following command on the selected operational host.
            $ remove faulted node
            
            

            This command will remove the stale entries of the host that was shut down.

            Step 11   Run the configuration wizard on the selected operational node to trigger 'remove fault-node node'.
            $ config_wizard
            
            
            Step 12   The operational host then displays another selection, 'Revert to single-node'.
            Step 13   Select the 'Revert to single-node' option and wait until the operation completes.
            Step 14   Join the host back to the existing two-host cluster using the configuration wizard, as described in the procedure for adding a host to a multi-host cluster. For information, see Adding a New Host to a Multi-Host Cluster.
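
The branch in the procedure above (Steps 7 and 8 when the host comes up, Steps 9 through 14 when it does not) can be summarized in a short sketch. This is only an illustration: the function name `restart_followup` is hypothetical, the command strings are copied from the steps above, and nothing here runs any command on a host.

```python
def restart_followup(host_came_up: bool) -> list[str]:
    """Return the actions that remain after the host is powered back on (Step 6)."""
    if host_came_up:
        # Steps 7-8: both commands are entered on the restarted host.
        return [
            "restarted host: grape host enable",
            "restarted host: grape instance rebalance",
        ]
    # Steps 9-14: recovery is driven from one of the other operational hosts.
    return [
        "operational host: remove faulted node",
        "operational host: config_wizard, then select 'Revert to single-node'",
        "restarted host: rejoin the cluster using the configuration wizard",
    ]


for action in restart_followup(host_came_up=True):
    print(action)
```

Keeping the two paths in one function makes it explicit that `grape host enable` and `grape instance rebalance` are never entered when the host fails to come up.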

            Confirming the Multi-Host Cluster Configuration Values

            If you are experiencing issues with your multi-host cluster, then you can use the Cisco APIC-EM CLI to check the configuration values.

            Before You Begin

            You should have installed the Cisco APIC-EM following the procedure described in the Cisco Application Policy Infrastructure Controller Enterprise Module Installation Guide.


              Step 1   Using a Secure Shell (SSH) client, log into the host (physical or virtual) with the IP address that you specified using the configuration wizard.
              Note   

              The IP address to enter for the SSH client is the IP address that you configured for the network adapter. This IP address connects the host to the external network.

              Step 2   When prompted, enter your Linux username ('grapevine') and password for SSH access.
              Step 3   Enter the following command to display the multi-host configuration.
              $ grape root display
              
              

              Command output similar to the following should appear.

              
              ROOT                                   PROPERTY             VALUE
              ----------------------------------------------------------------------
              
              4cbe3972-9872-4771-800d-08c89463f1eb   hostname             root-1
              4cbe3972-9872-4771-800d-08c89463f1eb   interfaces           [{'interface': 'eth0', 'ip': '209.165.200.10', 'mac': '00:50:56:100:d2:14', 'netmask': '255.255.255.0'}, {'interface': 'eth1', 'ip': '209.165.200.10', 'mac': '00:50:56:95:5c:18', 'netmask': '255.255.255.0'}, {'interface': 'grape-br0', 'ip': '209.165.200.11', 'mac': 'ba:ed:c4:19:0d:77', 'netmask': '255.255.255.0'}]
              4cbe3972-9872-4771-800d-08c89463f1eb   is_alive             True
              4cbe3972-9872-4771-800d-08c89463f1eb   last_heartbeat       Wed Sep 09, 2015 11:02:52 PM (just now)
              
              4cbe3972-9872-4771-800d-08c89463f1eb   public_key           ssh-rsa                                                                                                             
              c2EAAAADAQABAAABAQDYlyCfidke3MTjGkzsTAu73MtG+lynFFvxWZ4xVIkDkhGC7KCs6XMhORMaABb6
              bU4EX/6osa4qyta4NYaijxjL6GL6kPkSBZiEKcUekHCmk1+H+Ypp5tc0wyvSpe5HtbLvPicLrXHHI/TS
              ...
              V44t+VvtFaLurG9+FW/ngZwGrR/grapevine@grapevine-root
              
              4cbe3972-9872-4771-800d-08c89463f1eb   root_id              4cbe3972-9872-4771-800d-08c89463f1eb
              4cbe3972-9872-4771-800d-08c89463f1eb   root_index           0
              4cbe3972-9872-4771-800d-08c89463f1eb   root_version         0.3.0.958.dev140-gda6a16
              4cbe3972-9872-4771-800d-08c89463f1eb   vm_password          ******
              (grapevine)
              
              #
              
              ROOT                                   PROPERTY             VALUE
              ----------------------------------------------------------------------
              
              4cbe3972-9872-4771-800d-08c89463f1eb   hostname             root-2
              4cbe3972-9872-4771-800d-08c89463f1eb   interfaces           [{'interface': 'eth0', 'ip': '209.165.200.101', 'mac': '00:50:56:100:d2:14', 'netmask': '255.255.255.0'}, {'interface': 'eth1', 'ip': '209.165.200.11', 'mac': '00:50:56:95:5c:18', 'netmask': '255.255.255.0'}, {'interface': 'grape-br0', 'ip': '209.165.200.11', 'mac': 'ba:ed:c4:19:0d:77', 'netmask': '255.255.255.0'}]
              4cbe3972-9872-4771-800d-08c89463f1eb   is_alive             True
              4cbe3972-9872-4771-800d-08c89463f1eb   last_heartbeat       Wed Sep 09, 2015 11:02:52 PM (just now)
              
              4cbe3972-9872-4771-800d-08c89463f1eb   public_key           ssh-rsa                                                                                                             
              c2EAAAADAQABAAABAQDYlyCfidke3MTjGkzsTAu73MtG+lynFFvxWZ4xVIkDkhGC7KCs6XMhORMaABb6
              bU4EX/6osa4qyta4NYaijxjL6GL6kPkSBZiEKcUekHCmk1+H+Ypp5tc0wyvSpe5HtbLvPicLrXHHI/TS
              ...
              V44t+VvtFaLurG9+FW/ngZwGrR/grapevine@grapevine-root
              
              4cbe3972-9872-4771-800d-08c89463f1eb   root_id              4cbe3972-9873-4771-800d-08c89463f1eb
              4cbe3972-9872-4771-800d-08c89463f1eb   root_index           0
              4cbe3972-9872-4771-800d-08c89463f1eb   root_version         0.3.0.958.dev140-gda6a16
              4cbe3972-9872-4771-800d-08c89463f1eb   vm_password          ******
              (grapevine)
              

              The following data is displayed by this command:

              • hostname—The configured hostname.

              • interfaces—The configured interface values, including Ethernet port, IP address, and netmask.

              • is_alive—Status of the host. True indicates a running host, False indicates a host that has shut down.

              • last_heartbeat—Date and time of last heartbeat message sent from the host.

              • public_key—Public key used by host.

              • root_id—Individual root identification number.

              • root_index—Individual root index number.

              • root_version—Software version of root.

              • vm_password—VMware vSphere password that is masked.

              Step 4   If any of the fields in the command output appear incorrect, enter the root cause analysis (rca) command.
              $ rca
              
              

              The rca command runs a root cause analysis script that creates a tar file that contains the following data:

              • Log files

              • Configuration files

              • Command output

              Note   

              For a multi-host deployment (three hosts), you need to perform this procedure and run the rca command on each of the three hosts.

              Step 5   Send the tar file created by the rca command to Cisco support for assistance in resolving your issue.

              For information about contacting Cisco support, see Contacting the Cisco Technical Assistance Center.
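
When the output of grape root display is long, it may be easier to check the is_alive values with a script than by eye. The following is a minimal sketch, assuming the output has been saved as text and that each root block begins with a hostname row, as in the sample output above. The function names are hypothetical, and SAMPLE is an abbreviated copy of that output with the second host's is_alive value changed to False for illustration.

```python
import re

# Property rows start with a root UUID, then a property name, then the value.
ROW = re.compile(
    r"^\s*([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})\s+(\S+)\s*(.*)$"
)


def parse_root_display(text: str) -> list[dict[str, str]]:
    """Collect one property dictionary per root block in the output."""
    roots: list[dict[str, str]] = []
    current: dict[str, str] = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # header, separator, or wrapped public-key line
        _, prop, value = match.groups()
        # Assumption: a hostname row marks the start of a new root block.
        if prop == "hostname" and current:
            roots.append(current)
            current = {}
        current[prop] = value.strip()
    if current:
        roots.append(current)
    return roots


def hosts_not_alive(roots: list[dict[str, str]]) -> list[str]:
    """Return the hostnames whose is_alive value is anything other than True."""
    return [r.get("hostname", "?") for r in roots if r.get("is_alive") != "True"]


SAMPLE = """\
ROOT                                   PROPERTY             VALUE
----------------------------------------------------------------------
4cbe3972-9872-4771-800d-08c89463f1eb   hostname             root-1
4cbe3972-9872-4771-800d-08c89463f1eb   is_alive             True
4cbe3972-9872-4771-800d-08c89463f1eb   root_index           0
ROOT                                   PROPERTY             VALUE
----------------------------------------------------------------------
4cbe3972-9872-4771-800d-08c89463f1eb   hostname             root-2
4cbe3972-9872-4771-800d-08c89463f1eb   is_alive             False
"""

roots = parse_root_display(SAMPLE)
print(hosts_not_alive(roots))  # ['root-2']
```

Any host reported by such a check is a candidate for running the rca command described in Step 4.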


              Changing the Settings in a Multi-Host Cluster

              To troubleshoot an issue with a multi-host cluster, you may need to change its configuration settings. This procedure describes how to change the Cisco APIC-EM external network settings, NTP server address, and/or password for the Linux grapevine user in a multi-host cluster. The external network settings that can be changed include:

              • Host IP address

              • Virtual IP address

              • DNS server

              • Default gateway

              • Static routes
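
Because changing these settings requires controller downtime, it can help to check the new address values for typing errors before starting. The following is a minimal sketch using Python's standard ipaddress module. The function name, the setting names, and the address values are hypothetical and are not taken from the configuration wizard; static routes are not covered.

```python
import ipaddress


def invalid_addresses(settings: dict[str, str]) -> list[str]:
    """Return the names of settings whose value is not a valid IP address."""
    bad = []
    for name, value in settings.items():
        try:
            ipaddress.ip_address(value)
        except ValueError:
            bad.append(name)
    return bad


# Illustrative values only; replace them with the addresses you plan to enter.
planned = {
    "host_ip": "209.165.200.10",
    "virtual_ip": "209.165.200.12",
    "dns_server": "209.165.200.1",
    "default_gateway": "209.165.200.256",  # deliberately invalid last octet
}
print(invalid_addresses(planned))  # ['default_gateway']
```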


              Note


              In order to change the external network settings, NTP server address, and/or the Linux grapevine user password in a multi-host deployment, you need to first break up the multi-host cluster. As a result, controller downtime occurs. For this reason, we recommend that you perform this procedure during a maintenance time period. For information about changing settings for a single host configuration, see Updating the Configuration Using the Wizard.


              Before You Begin

              You must have successfully configured the Cisco APIC-EM as a multi-host cluster using the configuration wizard, as described in the Cisco Application Policy Infrastructure Controller Enterprise Module Installation Guide.


                Step 1   Using a Secure Shell (SSH) client, log into one of the hosts in your cluster.

                Log in using the IP address that you specified using the configuration wizard.

                Note   

                The IP address to enter for the SSH client is the IP address that you configured for the network adapter. This IP address connects the appliance to the external network.

                Step 2   When prompted, enter your Linux username ('grapevine') and password for SSH access.
                Step 3   Enter the following command to access the configuration wizard.
                $ config_wizard
                
                
                Note   

                The config_wizard command is in the PATH of the 'grapevine' user, but not of the 'root' user. Either run the command as the 'grapevine' user, or specify its full path when running it as the 'root' user. For example: /home/grapevine/bin/config_wizard

                Step 4   Review the Welcome to the APIC-EM Configuration Wizard! screen and choose the option to remove the host from the cluster:
                • Remove this host from its APIC-EM cluster

                Step 5   A message appears with the following options:
                • [cancel]—Exit the configuration wizard.

                • [proceed]—Begin the process to remove this host from its cluster.

                Choose proceed>> to begin. After choosing proceed>>, the configuration wizard begins to remove this host from its cluster.

                At the end of this process, this host is removed from the cluster.

                Step 6   Repeat the above steps (steps 1-5) on a second host in the cluster.
                Note   

                You must repeat the above steps on each host in your cluster, until you only have a single host remaining. You must make your configuration changes on this final remaining host.

                Step 7   Using a Secure Shell (SSH) client, log into that final host in your cluster and run the configuration wizard.
                $ config_wizard
                
                

                After logging into the host, begin the configuration process.

                Step 8   Make any necessary changes to the configuration values for the external network settings, NTP server address, and/or password for the Linux grapevine user using the wizard.

                After making your configuration change(s), continue through the configuration process to the final message.

                Step 9   At the end of the configuration process, a final message appears stating that the wizard is now ready to proceed with applying the configuration.

                The following options are available:

                • [back]—Review and verify your configuration settings.

                • [cancel]—Discard your configuration settings and exit the configuration wizard.

                • [save & exit]—Save your configuration settings and exit the configuration wizard.

                • [proceed]—Save your configuration settings and begin applying them.

                Enter proceed>> to complete the installation. After entering proceed>>, the configuration wizard applies the configuration values that you entered above.

                Note   

                At the end of the configuration process, a CONFIGURATION SUCCEEDED! message appears.

                Step 10   Log into the other hosts in your multi-host cluster and use the configuration wizard to recreate the cluster.

                Refer to the Cisco Application Policy Infrastructure Controller Enterprise Module Installation Guide for information about this specific procedure.
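
The order of operations in this procedure (remove every host but one, reconfigure the remaining host, then recreate the cluster) can be sketched as follows. This is an illustration only: the function name and the host names are hypothetical, and the printed lines paraphrase the steps above rather than run any command.

```python
def breakup_plan(hosts: list[str]) -> tuple[list[str], str]:
    """Split the cluster hosts into those removed first and the final host to reconfigure."""
    if not hosts:
        raise ValueError("at least one host is required")
    return hosts[:-1], hosts[-1]


to_remove, final_host = breakup_plan(["host-1", "host-2", "host-3"])
for host in to_remove:
    # Steps 1-6: remove each of these hosts from the cluster in turn.
    print(f"{host}: run config_wizard and choose 'Remove this host from its APIC-EM cluster'")
# Steps 7-9: make the configuration changes on the one remaining host.
print(f"{final_host}: run config_wizard, change the settings, then choose proceed>>")
# Step 10: recreate the cluster from the hosts that were removed.
print("removed hosts: run config_wizard to recreate the cluster")
```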