About Telemetry
Collecting data for analysis and troubleshooting has always been an important part of monitoring the health of a network.
Cisco NX-OS provides several mechanisms, such as SNMP, CLI, and syslog, to collect data from a network. These mechanisms have limitations that restrict automation and scale. One limitation is the use of the pull model, in which the client originates every request for data from the network elements and the server sends data only when a client requests it. The pull model does not scale when there is more than one network management station (NMS) in the network. Because every request must be initiated from the client side, the model needs continual manual intervention, which makes it inefficient.
A push model continuously streams data out of the network and notifies the client. Telemetry enables the push model, which provides near-real-time access to monitoring data.
Telemetry Components and Process
Telemetry consists of four key elements:
-
Data Collection — Telemetry data is collected from the Data Management Engine (DME) database in branches of the object model specified using distinguished name (DN) paths. The data can be retrieved periodically (frequency-based) or only when a change occurs in any object on a specified path (event-based). You can use the NX-API to collect frequency-based data.
-
Data Encoding — The telemetry encoder encapsulates the collected data into the desired format for transporting.
NX-OS encodes telemetry data in either Google Protocol Buffers (GPB) or JSON format.
-
Data Transport — NX-OS transports telemetry data using HTTP for JSON encoding and the Google remote procedure call (gRPC) protocol for GPB encoding. The gRPC receiver supports message sizes greater than 4MB. (Telemetry data using HTTPS is also supported if a certificate is configured.)
Starting with Cisco NX-OS Release 9.2(1), UDP and secure UDP (DTLS) are supported as telemetry transport protocols. You can add destinations that receive UDP. The encoding for UDP and secure UDP can be GPB or JSON.
Use the following command to configure the UDP transport to stream data using a datagram socket in either JSON or GPB:
    destination-group num ip address xxx.xxx.xxx.xxx port xxxx protocol UDP encoding {JSON | GPB}
where num is a number between 1 and 4095.
Example for an IPv4 destination:
    destination-group 100 ip address 171.70.55.69 port 50001 protocol UDP encoding GPB
The UDP telemetry will be sent with the following header:
    typedef enum tm_encode_ {
        TM_ENCODE_DUMMY,
        TM_ENCODE_GPB,
        TM_ENCODE_JSON,
        TM_ENCODE_XML,
        TM_ENCODE_MAX,
    } tm_encode_type_t;

    typedef struct tm_pak_hdr_ {
        uint8_t  version;   /* 1 */
        uint8_t  encoding;
        uint16_t msg_size;
        uint8_t  secure;
        uint8_t  padding;
    } __attribute__ ((packed, aligned (1))) tm_pak_hdr_t;
The first 6 bytes of each UDP payload carry this header. To process UDP telemetry data successfully, handle the header in one of the following ways:
-
Read the header to determine which decoder (JSON or GPB) to use, if the receiver is meant to receive different types of data from multiple endpoints, or
-
Remove the header if the receiver expects only one encoding (JSON or GPB) and never the other.
Note
Depending on the receiving operating system and the network load, using the UDP protocol may result in packet drops.
-
Telemetry Receiver — A telemetry receiver is a remote management system or application that stores the telemetry data.
The GPB encoder stores data in a generic key-value format. The encoder requires metadata in the form of a compiled .proto file to translate the data into GPB format.
To receive and decode the data stream correctly, the receiver requires the .proto file that describes the encoding and the transport services. Decoding translates the binary stream into key-value string pairs.
A telemetry .proto file that describes the GPB encoding and gRPC transport is available on Cisco's GitHub: https://github.com/CiscoDevNet/nx-telemetry-proto
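On the receiver side, the 6-byte UDP header described under Data Transport could be handled along the following lines. This is a minimal sketch, not a reference implementation: the names TmEncode, TmPakHdr, split_datagram, and decode_payload are hypothetical, and the byte order of msg_size is an assumption (network order by default), because the header definition does not state it. Verify it against a captured packet before relying on that field. GPB payloads are returned undecoded, since decoding them needs classes generated from the .proto file.

```python
import json
import struct
from enum import IntEnum
from typing import NamedTuple, Tuple


class TmEncode(IntEnum):
    """Values follow the order of tm_encode_type_t in the header definition."""
    DUMMY = 0
    GPB = 1
    JSON = 2
    XML = 3


class TmPakHdr(NamedTuple):
    version: int
    encoding: TmEncode
    msg_size: int
    secure: int
    padding: int


HDR_LEN = 6  # uint8 + uint8 + uint16 + uint8 + uint8, packed


def split_datagram(datagram: bytes, byte_order: str = "!") -> Tuple[TmPakHdr, bytes]:
    """Split one UDP datagram into (header, payload).

    byte_order is a struct-module prefix. "!" (network order) is an
    ASSUMPTION; pass "<" if captures show msg_size is little-endian.
    """
    if len(datagram) < HDR_LEN:
        raise ValueError("datagram is shorter than the 6-byte header")
    version, encoding, msg_size, secure, padding = struct.unpack(
        byte_order + "BBHBB", datagram[:HDR_LEN]
    )
    # TmEncode() raises ValueError for an encoding value outside the enum.
    hdr = TmPakHdr(version, TmEncode(encoding), msg_size, secure, padding)
    return hdr, datagram[HDR_LEN:]


def decode_payload(hdr: TmPakHdr, payload: bytes):
    """Pick a decoder based on the header's encoding field."""
    if hdr.encoding == TmEncode.JSON:
        return json.loads(payload.decode("utf-8"))
    if hdr.encoding == TmEncode.GPB:
        # Raw bytes; parse them with classes generated from the .proto file.
        return payload
    raise ValueError(f"unsupported encoding: {hdr.encoding.name}")
```

A receiver would typically call split_datagram on each datagram read from a UDP socket bound to the configured destination port, then pass the result to decode_payload. A receiver that expects only one encoding can drop the header and skip the dispatch step.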
High Availability of the Telemetry Process
High availability of the telemetry process is supported with the following behaviors:
-
System Reload — During a system reload, any telemetry configuration and streaming services are restored.
-
Process Restart — If the telemetry process freezes or restarts for any reason, configuration and streaming services are restored when telemetry is restarted.