Troubleshooting Cisco APIC-EM Multi-Host
The following information may be used to troubleshoot a Cisco APIC-EM multi-host configuration:
- Troubleshooting Cisco APIC-EM Multi-Host
- Confirming the Multi-Host Cluster Configuration Values
- Changing the Settings in a Multi-Host Cluster
- Removing a Single Host from a Multi-Host Cluster
- Removing a Faulted Host from a Multi-Host Cluster
Troubleshooting Cisco APIC-EM Multi-Host
The following table describes recommended actions to take to resolve a Cisco APIC-EM multi-host issue.
Symptom | Possible Cause | Recommended Action |
---|---|---|
Controller in a multi-host configuration appears to be in an unstable state. For example, applications are not running, or applications are inaccessible, and/or not appearing in the GUI. | Controller in unstable state, possibly due to error(s) in entering configuration values with the Cisco APIC-EM configuration wizard. | Log into the host, check the configuration values, and reenter any configuration values that are incorrect. References: |
Controller was working fine for a multi-host configuration, but after a period of time one of the hosts becomes erratic and unstable. | Possible failed service or services in the multi-host cluster. | Remove and then reattach unstable host from the multi-host cluster. References: |
Controller was working fine for a multi-host configuration, but after a period of time one of the hosts fails. | Possible failed service or services in the multi-host cluster. | Remove and then reattach failed and inoperable host from the multi-host cluster. References: |
Host fails due to a power outage. | Power to the server or appliance was inadvertently shut off. When the power returned to the server or appliance, the controller failed to restart properly. | Reset the controller on the host that experienced the power outage back to its previous configuration. References: |
Confirming the Multi-Host Cluster Configuration Values
If you are experiencing issues with your multi-host cluster, then you can use the Cisco APIC-EM CLI to check the configuration values.
You should have attempted to deploy the Cisco APIC-EM following the procedure described in the Cisco APIC-EM deployment guide.
Step 1 | Using a Secure Shell (SSH) client, log into the host (physical or virtual) with the IP address that you specified using the configuration wizard.
Step 2 | When prompted, enter your Linux username ('grapevine') and password for SSH access.
Step 3 | Enter the following command to display the multi-host configuration.
$ grape root display
Command output similar to the following should appear.
ROOT PROPERTY VALUE
----------------------------------------------------------------------
4cbe3972-9872-4771-800d-08c89463f1eb hostname root-1
4cbe3972-9872-4771-800d-08c89463f1eb interfaces [{'interface': 'eth0', 'ip': '209.165.200.10', 'mac': '00:50:56:100:d2:14', 'netmask': '255.255.255.0'},
{'interface': 'eth1', 'ip': '209.165.200.10', 'mac': '00:50:56:95:5c:18', 'netmask': '255.255.255.0'},
{'interface': 'grape-br0', 'ip': '209.165.200.11', 'mac': 'ba:ed:c4:19:0d:77', 'netmask': '255.255.255.0'}]
4cbe3972-9872-4771-800d-08c89463f1eb is_alive True
4cbe3972-9872-4771-800d-08c89463f1eb last_heartbeat Wed Sep 09, 2015 11:02:52 PM (just now)
4cbe3972-9872-4771-800d-08c89463f1eb public_key ssh-rsa c2EAAAADAQABAAABAQDYlyCfidke3MTjGkzsTAu73MtG+lynFFvxWZ4xVIkDkhGC7KCs6XMhORMaABb6
bU4EX/6osa4qyta4NYaijxjL6GL6kPkSBZiEKcUekHCmk1+H+Ypp5tc0wyvSpe5HtbLvPicLrXHHI/TS
...
V44t+VvtFaLurG9+FW/ngZwGrR/grapevine@grapevine-root
4cbe3972-9872-4771-800d-08c89463f1eb root_id 4cbe3972-9872-4771-800d-08c89463f1eb
4cbe3972-9872-4771-800d-08c89463f1eb root_index 0
4cbe3972-9872-4771-800d-08c89463f1eb root_version 0.3.0.958.dev140-gda6a16
4cbe3972-9872-4771-800d-08c89463f1eb vm_password ******
(grapevine)
#
ROOT PROPERTY VALUE
----------------------------------------------------------------------
4cbe3972-9872-4771-800d-08c89463f1eb hostname root-2
4cbe3972-9872-4771-800d-08c89463f1eb interfaces [{'interface': 'eth0', 'ip': '209.165.200.101', 'mac': '00:50:56:100:d2:14', 'netmask': '255.255.255.0'},
{'interface': 'eth1', 'ip': '209.165.200.11', 'mac': '00:50:56:95:5c:18', 'netmask': '255.255.255.0'},
{'interface': 'grape-br0', 'ip': '209.165.200.11', 'mac': 'ba:ed:c4:19:0d:77', 'netmask': '255.255.255.0'}]
4cbe3972-9872-4771-800d-08c89463f1eb is_alive True
4cbe3972-9872-4771-800d-08c89463f1eb last_heartbeat Wed Sep 09, 2015 11:02:52 PM (just now)
4cbe3972-9872-4771-800d-08c89463f1eb public_key ssh-rsa c2EAAAADAQABAAABAQDYlyCfidke3MTjGkzsTAu73MtG+lynFFvxWZ4xVIkDkhGC7KCs6XMhORMaABb6
bU4EX/6osa4qyta4NYaijxjL6GL6kPkSBZiEKcUekHCmk1+H+Ypp5tc0wyvSpe5HtbLvPicLrXHHI/TS
...
V44t+VvtFaLurG9+FW/ngZwGrR/grapevine@grapevine-root
4cbe3972-9872-4771-800d-08c89463f1eb root_id 4cbe3972-9873-4771-800d-08c89463f1eb
4cbe3972-9872-4771-800d-08c89463f1eb root_index 0
4cbe3972-9872-4771-800d-08c89463f1eb root_version 0.3.0.958.dev140-gda6a16
4cbe3972-9872-4771-800d-08c89463f1eb vm_password ******
(grapevine)
The following data is displayed by this command:
Step 4 | If any of the fields in the command output appear incorrect, enter the root cause analysis (rca) command.
$ rca
Step 5 | Send the tar file created by the rca command procedure to Cisco support for assistance in resolving your issue.
For information about contacting Cisco support, see Contacting the Cisco Technical Assistance Center.
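When the output of grape root display is long, it can help to scan it with a script before reading it by eye. The following Python sketch is illustrative only and is not part of Cisco APIC-EM. It assumes that the output has been saved as text and that it follows the three-column ROOT / PROPERTY / VALUE layout shown in Step 3. The names parse_root_display, hosts_not_alive, and SAMPLE are hypothetical, and the is_alive False row in SAMPLE is invented to exercise the check.

```python
import re

# One row of the assumed three-column layout: "<root UUID> <property> <value>".
ROW = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\s+(\w+)\s+(.*)$"
)


def parse_root_display(text):
    """Return one {property: value} dict per ROOT block in the saved output.

    Only the first line of a multi-line value (interfaces, public_key) is kept;
    continuation lines and prompt lines are ignored.
    """
    roots = []
    for line in text.splitlines():
        line = line.strip()
        if line.split()[:3] == ["ROOT", "PROPERTY", "VALUE"]:
            roots.append({})  # a header row starts a new host block
            continue
        match = ROW.match(line)
        if match and roots:
            prop, value = match.groups()
            roots[-1][prop] = value.strip()
    return roots


def hosts_not_alive(roots):
    """List hostnames whose is_alive value is anything other than 'True'."""
    return [
        root.get("hostname", "<unknown>")
        for root in roots
        if root.get("is_alive") != "True"
    ]


# Hypothetical, shortened sample; the False value is made up for illustration.
SAMPLE = """\
ROOT PROPERTY VALUE
----------------------------------------------------------------------
4cbe3972-9872-4771-800d-08c89463f1eb hostname root-1
4cbe3972-9872-4771-800d-08c89463f1eb is_alive True
(grapevine)
#
ROOT PROPERTY VALUE
----------------------------------------------------------------------
4cbe3972-9872-4771-800d-08c89463f1eb hostname root-2
4cbe3972-9872-4771-800d-08c89463f1eb is_alive False
"""

if __name__ == "__main__":
    print(hosts_not_alive(parse_root_display(SAMPLE)))
```

The sketch groups rows by header line rather than by the UUID in the first column, because the sample output above shows the same UUID for both hosts. Any host it reports is only a hint; the rca procedure in Step 4 remains the supported way to collect diagnostics.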
Changing the Settings in a Multi-Host Cluster
To troubleshoot an issue with a multi-host cluster, you may need to change its configuration settings. This procedure describes how to change the Cisco APIC-EM external network settings, NTP server address, and/or password for the Linux grapevine user in a multi-host cluster. The external network settings that can be changed include:
Note | To change the external network settings, NTP server address, and/or the Linux grapevine user password in a multi-host deployment, you must first break up the multi-host cluster. As a result, controller downtime occurs. For this reason, we recommend that you perform this procedure during a maintenance window. For information about changing settings for a single-host configuration, see Updating the Configuration Using the Wizard. |
You must have successfully configured the Cisco APIC-EM as a multi-host cluster using the configuration wizard, as described in the Cisco APIC-EM deployment guide.
Removing a Single Host from a Multi-Host Cluster
To troubleshoot an issue with a multi-host cluster, you may need to remove a single host from a multi-host cluster. This procedure describes how to remove one of the hosts running Cisco APIC-EM from a multi-host cluster. You use the Cisco APIC-EM configuration wizard to perform this procedure.
Note | The configuration wizard option to remove a host appears only if the host on which you are running the configuration wizard is part of a multi-host cluster. If the host is not part of a multi-host cluster, the option to remove a host is not displayed. Controller downtime occurs while you perform this procedure. For this reason, we recommend that you perform this procedure during a maintenance window. |
You should have deployed Cisco APIC-EM on a multi-host cluster as described in the Cisco APIC-EM deployment guide.
You must perform this procedure on the single host that is to be removed from the multi-host cluster.
Step 1 | Using a Secure Shell (SSH) client, log into the host (appliance, server, or virtual machine) with the IP address that you specified using the configuration wizard.
Step 2 | When prompted, enter your Linux username ('grapevine') and password for SSH access.
Step 3 | Enter the following command to access the configuration wizard.
$ config_wizard
Step 4 | Review the Welcome to the APIC-EM Configuration Wizard! screen and choose the option to remove the host from the cluster: Remove this host from its APIC-EM cluster
Step 5 | A message appears with the following options:
Step 6 | At the end of this process, you must either run the configuration wizard again to configure the host as a new Cisco APIC-EM or join the Cisco APIC-EM to a cluster.
If you want to use this host again, either as a standalone controller or within a cluster, you must run the configuration wizard again and reinstall the Cisco APIC-EM. Do not attempt to use this host again in either role without reinstalling the Cisco APIC-EM.
Removing a Faulted Host from a Multi-Host Cluster
Perform the steps in the following procedure to remove a faulted or inoperative host (running Cisco APIC-EM) from a multi-host cluster. You use the Cisco APIC-EM configuration wizard to perform this procedure. A host becomes faulted when it can no longer participate in the cluster due to hardware or software issues.
After you follow this procedure on a three-host cluster (moving from three hosts to two hosts), you lose high-availability protection against the loss of a host. After you follow this procedure on a two-host cluster, the cluster becomes inoperable until the second host is brought back up and added to the cluster.
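The two outcomes described above can be written down as a small pre-check to run before you start the procedure. The following Python sketch is illustrative only. It encodes just the two cases that this section states (three hosts and two hosts) and makes no claim about other cluster sizes. The function name state_after_removal is hypothetical.

```python
def state_after_removal(hosts_before):
    """Describe the cluster after one faulted host is removed.

    Covers only the two cases stated in this section; any other host
    count raises ValueError rather than guessing.
    """
    if hosts_before == 3:
        return "operable with two hosts, no high-availability protection"
    if hosts_before == 2:
        return "inoperable until the second host is brought back up and added"
    raise ValueError("this section describes only two-host and three-host clusters")


if __name__ == "__main__":
    for count in (3, 2):
        print(count, "->", state_after_removal(count))
```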
Note | When a host becomes "faulted", replacement instances of the services on the faulted host are grown on the remaining hosts in the cluster. While the replacement instances are being grown, and depending on the types of services being grown, certain Cisco APIC-EM functionality may not be available. |
You have deployed Cisco APIC-EM on a multi-host cluster following the procedure described in the Cisco APIC-EM deployment guide.
You must perform this procedure on an active host in the multi-host cluster. You cannot perform this procedure on the faulted host that is to be removed from the multi-host cluster. A faulted host is displayed as red in the System Health tab view in the Home page of the controller's GUI.
Note | You should always first attempt to bring the faulted host back online. After determining that the faulted host can no longer participate in the cluster, try to remove the faulted host using the Remove this host from its APIC-EM cluster configuration wizard option (as described in the previous procedure). Follow this procedure, and use the Remove a faulted host from this APIC-EM cluster configuration wizard option, only if that other option has been tried first and was unsuccessful in removing the host. |
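The order of attempts in the preceding note can be summarized as a simple decision function. The following Python sketch is illustrative only. The option strings are the configuration wizard option names quoted in the note, while the Action enum, the next_action function, and its two boolean parameters are hypothetical names introduced for this example.

```python
from enum import Enum


class Action(Enum):
    BRING_ONLINE = "Bring the faulted host back online"
    REMOVE_THIS_HOST = "Remove this host from its APIC-EM cluster"
    REMOVE_FAULTED_HOST = "Remove a faulted host from this APIC-EM cluster"


def next_action(recovery_failed, normal_removal_failed):
    """Return the next step to try, in the order given in the note.

    recovery_failed: True once attempts to bring the host back online have failed.
    normal_removal_failed: True once the 'Remove this host' option has been
    tried and did not remove the host.
    """
    if not recovery_failed:
        return Action.BRING_ONLINE
    if not normal_removal_failed:
        return Action.REMOVE_THIS_HOST
    return Action.REMOVE_FAULTED_HOST


if __name__ == "__main__":
    print(next_action(recovery_failed=True, normal_removal_failed=True).value)
```

The forced removal is returned only when both earlier attempts have failed, which matches the note's guidance that this procedure is a last resort.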
Step 1 | Using a Secure Shell (SSH) client, log into the host (appliance, server, or virtual machine) with the IP address that you specified using the configuration wizard.
Step 2 | When prompted, enter your Linux username ('grapevine') and password for SSH access.
Step 3 | Enter the following command to access the configuration wizard.
$ config_wizard
Step 4 | Review the Welcome to the APIC-EM Configuration Wizard! screen and choose the option to forcibly remove the faulted host from the cluster: Remove a faulted host from this APIC-EM cluster
Step 5 | A message appears with the following options:
Step 6 | At the end of this process, you must either run the configuration wizard again to configure the host as a new controller or join the controller to a cluster.
If you want to use this host again, either as a standalone controller or within a cluster, you must run the configuration wizard again and reinstall the Cisco APIC-EM. Do not attempt to use this host again in either role without reinstalling the Cisco APIC-EM.