Monitoring the Health of ESC Using REST API
ESC provides a REST API that third-party software can use to monitor the health of ESC and its services. Using the API, third-party software can query the health of ESC periodically to check whether ESC is in service. In response to the query, the API returns a status code and a message; see Table 1 for details. In an HA setup, the virtual IP (VIP) must be used as the monitoring IP. The return value reports the overall condition of the ESC HA pair. See Table 3 for details.
The REST API to monitor the health of ESC is as follows:
GET to https://<esc_vm_ip>:8060/esc/health
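As a minimal sketch of how a monitoring client might issue this query, the Python snippet below builds the GET request with the standard library. The function names, the timeout, and the use of an `Accept` header to select the response format are assumptions made for illustration; authentication and TLS certificate handling depend on the deployment and are omitted.

```python
import urllib.request

HEALTH_PORT = 8060           # port taken from the URL above
HEALTH_PATH = "/esc/health"  # path taken from the URL above


def build_health_request(esc_ip: str, accept: str = "application/xml") -> urllib.request.Request:
    """Build (but do not send) a GET request for the ESC health API."""
    url = f"https://{esc_ip}:{HEALTH_PORT}{HEALTH_PATH}"
    # Assumption: the response format is selected with the Accept header.
    return urllib.request.Request(url, headers={"Accept": accept}, method="GET")


def fetch_health(esc_ip: str, timeout: float = 10.0) -> bytes:
    """Send the request and return the raw response body."""
    request = build_health_request(esc_ip)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

In an HA setup, the VIP would be passed as `esc_ip`, for example `fetch_health("192.0.2.10")` with a placeholder address.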
The monitoring health API response with error conditions is as follows. The first block shows the general XML format, and the second block is an example XML response from an ESC Active/Active cluster:
<?xml version="1.0" encoding="UTF-8" ?>
<esc_health_report>
<status_code>{error status code}</status_code>
<message>{error message}</message>
</esc_health_report>
<?xml version="1.0" encoding="UTF-8" ?>
<esc_health_report>
<status_code>2010</status_code>
<message>ESC service is being provided. ESC AA cluster one or more node(s) not healthy</message>
<nodes>
<node>
<name>aa-esc-1.novalocal</name>
<status>HEALTHY</status>
<datacenter>dc1</datacenter>
<services>
<service>
<name>escmanager</name>
<status>running</status>
<is_expected>True</is_expected>
</service>
<service>
<name>elector</name>
<status>leader</status>
<is_expected>True</is_expected>
</service>
<service>
<name>drbd</name>
<status>active</status>
<is_expected>True</is_expected>
</service>
<service>
<name>pgsql</name>
<status>running</status>
<is_expected>True</is_expected>
</service>
...
</services>
</node>
<node>
<name>aa-esc-2.novalocal</name>
<status>HEALTHY</status>
<datacenter>dc1</datacenter>
<services>
<service>
<name>escmanager</name>
<status>running</status>
<is_expected>True</is_expected>
</service>
<service>
<name>elector</name>
<status>follower</status>
<is_expected>True</is_expected>
</service>
<service>
<name>drbd</name>
<status>standby</status>
<is_expected>True</is_expected>
</service>
<service>
<name>pgsql</name>
<status>stopped</status>
<is_expected>True</is_expected>
</service>
...
</services>
</node>
<node>
<name>aa-esc-3.novalocal</name>
<status>NOT_HEALTHY</status>
<datacenter>dc1</datacenter>
<services>
<service>
<name>escmanager</name>
<status>stopped</status>
<is_expected>False</is_expected>
</service>
<service>
<name>elector</name>
<status>follower</status>
<is_expected>True</is_expected>
</service>
<service>
<name>vimmanager</name>
<status>running</status>
<is_expected>True</is_expected>
</service>
...
</services>
</node>
</nodes>
</esc_health_report>
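A report like the one above can be reduced to the nodes and services that need attention. The following is a sketch only: the function name is hypothetical, and it assumes that `is_expected` set to `False` marks a service whose current state is not the expected one.

```python
import xml.etree.ElementTree as ET


def summarize_report(xml_text: str) -> dict:
    """Return the status code, message, and unhealthy nodes of a health report."""
    root = ET.fromstring(xml_text)
    unhealthy = {}
    for node in root.findall("./nodes/node"):
        if node.findtext("status") == "HEALTHY":
            continue
        # Assumption: is_expected == "False" flags a service in an unexpected state.
        unexpected = [
            service.findtext("name")
            for service in node.findall("./services/service")
            if service.findtext("is_expected") != "True"
        ]
        unhealthy[node.findtext("name")] = unexpected
    return {
        "status_code": root.findtext("status_code"),
        "message": root.findtext("message"),
        "unhealthy_nodes": unhealthy,
    }
```

Applied to the example above, the summary would list only `aa-esc-3.novalocal`, with `escmanager` as the service in an unexpected state.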
The monitoring health API supports both XML and JSON responses.
If the API response is successful, the response includes an additional field called stage.
<?xml version="1.0" encoding="UTF-8" ?>
<esc_health_report>
<status_code>{success status code}</status_code>
<stage>{Either INIT or READY}</stage>
<message>{success message}</message>
</esc_health_report>
The stage field takes one of two values, INIT or READY.
INIT: The initial stage, in which ESC accepts pre-provisioning requests such as configuring the config parameters or registering a VIM connector.
READY: ESC is ready for any kind of provisioning request, such as deploying or undeploying.
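A client that must wait before sending provisioning requests could check the stage field as sketched below. The function name is hypothetical, and the sketch assumes an XML response shaped like the success template above.

```python
import xml.etree.ElementTree as ET


def is_ready_for_provisioning(xml_text: str) -> bool:
    """Return True only when the health report carries stage READY."""
    root = ET.fromstring(xml_text)
    # findtext returns None when the stage element is absent,
    # so a report without a stage is treated as not ready.
    return root.findtext("stage") == "READY"
```

A report in the INIT stage, or one with no stage field at all, yields `False`.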
The status codes and messages below describe the health condition of ESC. Status codes in the 2000 series imply that ESC is operational. Status codes in the 5000 series imply that at least one ESC component is not in service.
Status Code | Message |
---|---|
2000 | ESC services are running. |
2010 | ESC services are being provided. ESC AA cluster one or more node(s) not healthy. |
2040 | ESC services running. VIM is configured, ESC initializing connection to VIM. |
5010 | ESC service, ESC_MANAGER is not running. |
5020 | ESC service, CONFD is not running. |
5030 | ESC service, MONA is not running. |
5040 | ESC service, VIM_MANAGER is not running. |
5060 | ESC service, ETSI is not running. |
5070 | Vim Connector IDs [vimId_1,vimId_2,...,vimId_N] are down. or 6 of 25 VIM Connectors are down. |
5080 | The NFVO service is not available. |
5090 | More than one ESC service (for example, confd and mona) are not running. |
5091 | One or more ESC services is not running and the NFVO service is not available. |
5092 | VIM Connector ID [vim-1] is down. The NFVO service is NOT available. |
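
The split between the 2000 and 5000 series can be applied in a monitoring client as sketched below. The function name and the returned labels are hypothetical; only the two series boundaries come from the text.

```python
def classify_status(status_code) -> str:
    """Map an ESC health status code to a coarse condition."""
    code = int(status_code)
    if 2000 <= code < 3000:
        # 2000 series: ESC is operational.
        return "operational"
    if 5000 <= code < 6000:
        # 5000 series: at least one ESC component is not in service.
        return "not_in_service"
    # Codes outside both series are not described in this section.
    return "unknown"
```

The function accepts the code as a string or an integer, because the XML and JSON examples in this section carry it as text.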
Status Code | Message |
---|---|
2000 | ESC services are running (Active-Active setup). |
2010 | ESC services are provided. In ESC Active/Active cluster one or more node(s) are not healthy. |
5000 | ESC services not being provided, ESC AA cluster not healthy |
Note |
ESC HA mode refers to ESC HA in DRBD setup only. For more information on the ESC HA setup, see the Cisco Elastic Services Controller Install Guide. |
The table below describes the status message for standalone ESC and HA with success and failure scenarios. For more information on ESC standalone and HA setup, see the Cisco Elastic Services Controller Install Guide.
Setup | Success | Partial Success | Failure |
---|---|---|---|
Standalone ESC | The response is collected from the monitoring health API and the status code is 2000. | NA | |
ESC in HA (Active-Standby) | The response is collected from the monitoring health API and the status code is 2000. | The response is collected from the monitoring health API and the status code is 2010. This indicates that the ESC standby node cannot connect to the ESC active node in ESC HA. However, this does not impact the ESC service to northbound. | |
ESC in HA (Active-Active) | The response is collected from the monitoring health API and the status code is 2000. | The response is collected from the monitoring health API and the status code is 2010. This indicates that the ESC services are being provided but one or more nodes are not healthy in the ESC AA cluster. This does not impact the ESC service to northbound. | |
ESC Health Monitor Enhancements
The ESC Health Monitor API is enhanced to:
- Determine the status of the ESC components.
- Provide a single point of contact for the SNMP agent to simplify the connectivity and authentication details.
The ESC monitor component hosts the Health Monitor API, which can be used to list the ESC components that are down. The Health Monitor uses both public and internal health URLs for each ESC component to determine its individual status. For example, the health monitor determines the VNFM status by querying the URL:
https://localhost:8252/etsi/health
Based on the responses from these URLs, the health monitor determines the status of the ESC components and returns a relevant status code and status message as part of the SNMP trap notifications.
ESC Health Monitor API for VIM Connector Status
The ESC Health Monitor API is extended to query the VIM connector details using the following URL:
http://<escmanager-host>:8088/escmanager/vims
The URL is executed against the active node in the ESC standalone and HA setup, and against every node in the ESC Active/Active setup.
The health monitor payload returns additional information to determine the binary status of all the configured VIM connectors. The status of the VIM connectors is either healthy or down.
To determine if a single VIM connector is healthy, the ESC Health Monitor API queries the VIM for which the VIM connector is defined. If the result has a CONNECTION_SUCCESSFUL internal status, then the VIM connector is healthy.
If the query fails, then the VIM connector is down.
Furthermore, the returned status message contains a comma-separated list of the specific VIM IDs that are down. The following example shows the payload that the ESC Health Monitor returns for two VIM connectors that are down:
{
"message": "VIM Connector IDs [vim-connector-site-1A, vim-connector-site-1C] are down.",
"status_code": "5070"
}
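A client could recover the individual connector IDs from such a payload as sketched below. The function name and the regular expression are assumptions based only on the message formats shown in this section. Messages that report a count instead of an ID list (for example, "6 of 25 VIM Connectors are down.") yield an empty list.

```python
import json
import re

# Matches "VIM Connector ID [..]" and "VIM Connector IDs [..]" in any letter case.
_ID_LIST = re.compile(r"VIM Connector IDs? \[([^\]]*)\]", re.IGNORECASE)


def down_vim_connectors(payload_text: str) -> list:
    """Extract the IDs of the down VIM connectors from a JSON health payload."""
    payload = json.loads(payload_text)
    match = _ID_LIST.search(payload.get("message", ""))
    if match is None:
        return []
    # Split the bracketed list on commas and drop surrounding whitespace.
    return [item.strip() for item in match.group(1).split(",") if item.strip()]
```

For the payload above, the function returns the two IDs `vim-connector-site-1A` and `vim-connector-site-1C`.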
For details on the SNMP trap notifications for the VIM connectors, see Monitoring the Health of ESC Using SNMP Trap Notifications.
The ESC Health Monitor does not monitor the VIM connector status by default. To enable this monitoring, see Enabling SNMP Traps for VIM and NFVO Monitoring in SNMP Trap Notifications.
ESC Health Monitor API for the NFVO Connectivity Status
The ESC Health Monitor API can determine the connectivity to the NFVO through an API that ESC provides for this query. The NFVO responds to the standard SOL003-defined API query. The URL is as follows:
https://<vnfm-host>:8252/etsi/nfvo/health
If the NFVO authenticates successfully and responds to the SOL003-defined API, then the NFVO is reachable and healthy.
The following example shows the payload that the ESC Health Monitor returns when the NFVO is configured but not reachable:
{
"message": "The NFVO service is NOT available.",
"status_code": "5080"
}
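Several status codes in the table of standalone and HA status codes mention the NFVO, so a client that cares only about NFVO reachability could test for any of them. The sketch below is an illustration with a hypothetical function name; the set of codes is taken from the messages for 5080, 5091, and 5092.

```python
import json

# Codes whose messages state that the NFVO service is not available.
NFVO_UNAVAILABLE_CODES = {"5080", "5091", "5092"}


def nfvo_unavailable(payload_text: str) -> bool:
    """Return True when a JSON health payload reports the NFVO as unavailable."""
    payload = json.loads(payload_text)
    # The status code is compared as text because the payload carries it as a string.
    return str(payload.get("status_code")) in NFVO_UNAVAILABLE_CODES
```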
The ESC Health Monitor does not monitor the NFVO connection status by default. To enable this monitoring, see Enabling SNMP Traps for VIM and NFVO Monitoring in SNMP Trap Notifications.
For information on the ETSI deployment, see the Cisco Elastic Services Controller ETSI NFV MANO User Guide.