About Inband Network Telemetry
Inband Network Telemetry (INT) is a framework designed to monitor, collect, and report flows and network states in the data plane, without requiring intervention or work by the control plane. INT sources (applications, end-host networking stacks, hypervisors, NICs, send-side TORs, and so on) embed the monitored INT flow information in normal data packets. Each downstream device then adds its own details for the same information, based on the source information it receives.
Inband Network Telemetry provides per-packet network visibility with low overhead. The following figure shows the typical workflow for Inband Network Telemetry.
As part of the Inband Network Telemetry process, every data packet is inspected without interfering with other packet-processing logic in the switch data plane. Not all the components that are shown in the preceding figure are necessarily located in each INT switch. For example, the watchlist component can be in one switch, and the event detection and telemetry report components can be located in another switch.
A telemetry watchlist table specifies the flows to monitor. It performs a match on the packet headers and switch ports, and provides telemetry action parameters. Packets that match watchlist entries with telemetry actions are processed by the event detection logic. If a triggering event is detected, the switch generates a telemetry report and sends it to the monitor-collector. The report message includes the packet header and the switch metadata associated with the packet (for example, timestamp, ingress or egress ports, and queue depth or latency).
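The watchlist and event-detection steps described above can be sketched in a few lines of Python. This is an illustrative model only, not switch code: the field names, the wildcard convention (None matches anything), the first-match-wins rule, and the single queue-depth trigger are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    ingress_port: int
    egress_port: int
    queue_depth: int  # hypothetical unit, e.g. cells queued at egress


@dataclass(frozen=True)
class WatchlistEntry:
    # None acts as a wildcard for that field (assumption for this sketch).
    src_ip: Optional[str] = None
    dst_ip: Optional[str] = None
    ingress_port: Optional[int] = None
    queue_depth_threshold: int = 0  # hypothetical telemetry action parameter

    def matches(self, pkt: Packet) -> bool:
        return (
            self.src_ip in (None, pkt.src_ip)
            and self.dst_ip in (None, pkt.dst_ip)
            and self.ingress_port in (None, pkt.ingress_port)
        )


def process(pkt, watchlist):
    """Return a telemetry report dict, or None if no report is triggered."""
    for entry in watchlist:
        if not entry.matches(pkt):
            continue
        # Event detection: the only trigger modeled here is queue depth
        # exceeding the threshold supplied by the matching entry.
        if pkt.queue_depth > entry.queue_depth_threshold:
            return {
                "headers": {"src_ip": pkt.src_ip, "dst_ip": pkt.dst_ip},
                "metadata": {
                    "ingress_port": pkt.ingress_port,
                    "egress_port": pkt.egress_port,
                    "queue_depth": pkt.queue_depth,
                },
            }
        return None  # matched the watchlist, but no triggering event
    return None  # not a monitored flow
```

A packet that misses the watchlist never reaches event detection, which mirrors the ordering in the text: match first, then decide whether to report.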
Terminology
Following are terms used for this feature:
-
INT Header — A packet header that carries INT information.
-
INT Packet — A packet containing an INT Header.
-
INT Instruction — Instructions that are embedded in the INT header, indicating which INT Metadata (defined below) to collect at each INT switch. The collected data is written into the INT Header.
-
INT Source — A trusted entity that creates and inserts INT Headers into the packets it sends.
-
INT Sink — A trusted entity that extracts the INT Headers and collects the path state that is contained in the INT Headers. The INT Sink is responsible for removing INT Headers so that INT is transparent to upper layers.
-
INT Metadata — Information that an INT Source inserts into the INT Header. Examples of metadata are described in INT Collection Parameters.
-
INT Domain — A group of interconnected INT switches under the same administration.
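To make the relationship between these terms concrete, the following Python sketch models an INT Packet as a payload plus an INT Header that carries INT Instructions and the collected INT Metadata. The class names, the instruction bit values, and the per-hop dictionary layout are hypothetical and chosen for readability; they do not represent the on-wire encoding.

```python
from dataclasses import dataclass, field
from enum import IntFlag


class Instruction(IntFlag):
    # Hypothetical bit assignments, not the real header encoding.
    SWITCH_ID = 1
    INGRESS_PORT = 2
    EGRESS_PORT = 4
    QUEUE_OCCUPANCY = 8


@dataclass
class IntHeader:
    instructions: Instruction                      # which metadata to collect
    metadata: list = field(default_factory=list)   # one dict per INT switch


@dataclass
class IntPacket:
    payload: bytes
    int_header: IntHeader


def collect(header, switch_state):
    """Write the metadata requested by the instructions into the header."""
    hop = {}
    for flag in Instruction:
        if flag in header.instructions:
            hop[flag.name] = switch_state[flag.name]
    header.metadata.append(hop)
```

For example, a header created with `Instruction.SWITCH_ID | Instruction.QUEUE_OCCUPANCY` causes each call to `collect` to record only those two values, regardless of what else the switch state contains.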
INT Packet Flow
At the INT source, flows to be monitored are filtered using the flow watchlist. Various pieces of information on the filtered flows are extracted, based on the collect parameters enabled, such as ingress or egress port ID, ingress or egress timestamp, switch ID, and queue occupancy. The data packets are encapsulated with the INT information and sent toward the destination. The telemetry INT reports are then sent to the monitor-collector, based on whether the events are flow events, packet-drop events, or queue-congestion events.
Report Types
Following are the different types of reports used by INT:
-
Local flow reports — A general telemetry report, which is generated from flow events. Sent from the source or sink for host-to-host data flows matching the watchlist.
-
Drop reports — A general telemetry report, which is generated from drop events. Sent for certain supported drops. Every INT-enabled switch sends these reports to the monitor-collector.
-
Queue Congestion reports — A general telemetry report, which is generated from queue-related events. Sent for packets exceeding the queue depth or latency. Every INT-enabled switch sends these reports to the monitor-collector.
-
INT reports — A report that is specific to INT. Sent by the sink. When INT-encapsulated data packets are received on the sink fabric port, two reports are generated by the sink:
-
Local report for traffic arriving on a fabric port
-
INT report for data that is received from the source
About Flow Events
Each switch can play the role of endpoint for INT-enabled packets. The endpoint acts both as source and sink, where:
-
Source initiates INT operations by inserting a telemetry header into a packet and then instructing downstream network devices along the routing path to add desired telemetry information into the packet.
-
Sink extracts the embedded INT information from the received data packets and sends telemetry reports to the monitor-collector if triggering flow events are detected.
The following figure shows an example flow event for Inband Network Telemetry. Switches along the route path add switch metadata into the packet header based on the telemetry instructions that are carried in the telemetry header.
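The source, transit, and sink behavior can be illustrated with a small Python sketch. It is a simplified model under stated assumptions: packets and switches are plain dictionaries, the `telemetry` key stands in for the telemetry header, the source and sink also record their own hop, and the sink always produces a report (a real sink reports only when a triggering flow event is detected). All names are hypothetical.

```python
def source(packet, instructions, switch):
    """Insert a telemetry header, then record the source's own metadata."""
    # Copy the packet so the host's original is left untouched.
    packet = dict(packet, telemetry={"instructions": list(instructions), "hops": []})
    return transit(packet, switch)


def transit(packet, switch):
    """Append this switch's metadata for each instructed field."""
    header = packet["telemetry"]
    hop = {name: switch[name] for name in header["instructions"]}
    header["hops"].append(hop)
    return packet


def sink(packet, switch):
    """Record the sink's hop, strip the header, return (packet, report)."""
    packet = transit(packet, switch)
    header = packet.pop("telemetry")  # INT stays transparent to the receiver
    report = {"flow": (packet["src"], packet["dst"]), "path": header["hops"]}
    return packet, report


# Usage: a three-switch path that collects switch ID and queue depth.
original = {"src": "h1", "dst": "h2", "payload": b"data"}
p = source(original, ["switch_id", "queue_depth"], {"switch_id": 1, "queue_depth": 3})
p = transit(p, {"switch_id": 2, "queue_depth": 9})
delivered, report = sink(p, {"switch_id": 3, "queue_depth": 0})
```

After the sink runs, `delivered` no longer contains the telemetry header, and `report["path"]` holds one metadata entry per switch in path order.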
About Drop Packet and Queue Congestion Events
Each switch provides a report to the monitor-collector, based on the type of triggering event that occurred:
-
Dropped Packets: Drop reports are generated for most dropped packets. The exceptions are intentional drops, namely packets dropped on STP-blocked ports and packets dropped by ACL deny rules. This provides visibility into the impact of packet drops on user traffic.
-
Congested Queues: Queue congestion reports are generated for traffic entering a specific queue during a period of queue congestion. This provides visibility into the traffic causing and prolonging queue congestion.
The following figure shows an example of drop and queue congestion reports.
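The reporting rules for these two event types can be expressed as a short decision function. The sketch below is an assumption-laden illustration: the drop-reason labels are invented, congestion is modeled as queue depth above a single threshold (the text also allows latency), and the report types are represented as plain strings.

```python
# Hypothetical drop-reason labels. The text only states that drops on
# STP-blocked ports and ACL deny drops do not generate drop reports.
INTENTIONAL_DROPS = {"stp_blocked", "acl_deny"}


def reports_for(packet_event, queue_depth_threshold):
    """Return the report types a switch would send for one packet event."""
    reports = []
    reason = packet_event.get("drop_reason")
    if reason is not None and reason not in INTENTIONAL_DROPS:
        reports.append("drop")
    if packet_event.get("queue_depth", 0) > queue_depth_threshold:
        reports.append("queue_congestion")
    return reports
```

With a threshold of 100, an event such as `{"drop_reason": "acl_deny"}` yields no report, while `{"queue_depth": 150}` yields a queue congestion report.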
About Packet Postcards
Beginning in NX-OS release 9.2(3), packet postcards provide a different way of transmitting telemetry data to a monitor. With packet postcards, the host's data packet is sent to another host, just as in the traditional model of INT, which uses source and sink switches as INT endpoints. However, the way that telemetry data is processed is different.
-
In the traditional model, a host sends a flow, an INT source switch adds telemetry data to the host's packet, and the sink switch extracts the telemetry that was added at each hop in the flow. The INT sink switch sends all telemetry data for the flow at once, so a larger amount of data is transmitted to the monitor at one time.
-
With packet postcards, each switch in the INT domain can extract telemetry data and send it to the collector. The result is faster convergence, distributed packet processing, and less overhead than with a traditional model.
The following figure shows an example of packet postcards on a flow.
With packet postcards, the defined roles of source and sink switches become irrelevant because telemetry data is communicated at each switch.
-
If the packet or flow matches the watchlist conditions, the comparison is true, and event-detection logic generates the report and sends it to the monitor. Triggering events are detected based on the switch local information such as ingress or egress ports and queueing latency for the monitored flow. Unlike standard Inband Network Telemetry, a postcard switch never modifies the original data packets by adding INT metadata. After telemetry data is sent to the collector, the packet is then forwarded onto the network to either its next hop or destination.
-
If the packet or flow does not match the watchlist conditions, the comparison is false, and the event-detection logic does not process the telemetry data. The packet is forwarded by the switch, continuing along the route to its next hop. If it ingresses another switch in the INT domain, the packet is compared against that switch's watchlist, and a packet postcard is sent to the monitor-collector if necessary. The packet continues to be forwarded in this way until it reaches its eventual destination.
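The per-switch postcard behavior described in the two cases above can be sketched as follows. This is a minimal Python model with hypothetical names: each switch has its own watchlist (a set of source and destination pairs), every watchlist match is treated as a triggering event, and the collector is a plain list. The key property it demonstrates is that the data packet is forwarded unmodified at every hop.

```python
def postcard_switch(packet, switch, collector):
    """Process one packet at one switch; the packet is never modified."""
    flow = (packet["src"], packet["dst"])
    if flow in switch["watchlist"]:
        # Match: send a postcard built from switch-local information only.
        collector.append({"switch_id": switch["id"], "flow": flow, **switch["state"]})
    return packet  # forwarded unchanged toward the next hop


def forward(packet, path, collector):
    """Carry the packet across every switch on the path."""
    for switch in path:
        packet = postcard_switch(packet, switch, collector)
    return packet


# Usage: switch 2 does not monitor this flow, so it sends no postcard.
collector = []
pkt = {"src": "h1", "dst": "h2", "payload": b"x"}
path = [
    {"id": 1, "watchlist": {("h1", "h2")}, "state": {"queue_latency_us": 5}},
    {"id": 2, "watchlist": set(), "state": {"queue_latency_us": 7}},
    {"id": 3, "watchlist": {("h1", "h2")}, "state": {"queue_latency_us": 9}},
]
delivered = forward(pkt, path, collector)
```

Because each switch reports independently, the collector receives one small postcard per matching hop instead of one large report from a sink.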