THIS FIELD NOTICE IS PROVIDED ON AN "AS IS" BASIS AND DOES NOT IMPLY ANY KIND OF GUARANTEE OR WARRANTY, INCLUDING THE WARRANTY OF MERCHANTABILITY. YOUR USE OF THE INFORMATION ON THE FIELD NOTICE OR MATERIALS LINKED FROM THE FIELD NOTICE IS AT YOUR OWN RISK. CISCO RESERVES THE RIGHT TO CHANGE OR UPDATE THIS FIELD NOTICE AT ANY TIME.
Revision | Publish Date | Comments |
---|---|---|
1.4 | 24-Aug-23 | Updated the Upgrade Program Section |
1.3 | 23-Jun-23 | Updated the How To Identify Affected Products Section |
1.2 | 13-Dec-22 | Updated the Products Affected, Problem Description, Workaround/Solution, and Additional Information Sections |
1.1 | 14-Nov-22 | Updated the Background, Workaround/Solution, and Additional Information Sections |
1.0 | 13-Oct-22 | Initial Release |
Affected Product ID | Comments |
---|---|
N9K-C93180YC-FX3 | |
N9K-C93180YC-FX3= | Part Alternate |
N9K-C93108TC-FX-24 | |
N9K-C93108TC-FX-24= | Part Alternate |
N9K-C93180YC-FX-24 | |
N9K-C93180YC-FX-24= | Part Alternate |
N9K-C9364C-GX | |
N9K-C9364C-GX= | Part Alternate |
N9K-C93240YC-FX2 | |
N9K-C93240YC-FX2= | Part Alternate |
N9K-C9348GC-FXP | |
N9K-C9348GC-FXP= | Part Alternate |
N9K-C93108TC-FX | |
N9K-C93108TC-FX= | Part Alternate |
N9K-C9336C-FX2 | |
N9K-C9336C-FX2= | Part Alternate |
N9K-C93180YC-FX3S | |
N9K-C93180YC-FX3S= | Part Alternate |
APIC-SERVER-M3 | |
APIC-SERVER-M3= | Part Alternate |
APIC-SERVER-L3 | |
APIC-SERVER-L3= | Part Alternate |
N9K-C93180YC-FX | Fix On Fail PID |
N9K-C93180YC-FX= | Fix On Fail PID |
N9K-C93360YC-FX2 | Fix On Fail PID |
N9K-C93360YC-FX2= | Fix On Fail PID |
N9K-C93216TC-FX2 | Fix On Fail PID |
N9K-C93216TC-FX2= | Fix On Fail PID |
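For inventory triage, the Affected Product ID table can be turned into a small lookup. The following is a minimal sketch only: the function name `classify_pid` and the category labels are hypothetical, and the trailing `=` (Part Alternate) variants are folded into the base PID. A PID match does not mean a unit is impacted; only the Serial Number Validation Tool confirms that.

```python
# PIDs copied from the Affected Product ID table in this field notice.
# The "=" Part Alternate variants are handled by normalisation below.
FIX_ON_FAIL_PIDS = {
    "N9K-C93180YC-FX",
    "N9K-C93360YC-FX2",
    "N9K-C93216TC-FX2",
}
OTHER_LISTED_PIDS = {
    "N9K-C93180YC-FX3", "N9K-C93108TC-FX-24", "N9K-C93180YC-FX-24",
    "N9K-C9364C-GX", "N9K-C93240YC-FX2", "N9K-C9348GC-FXP",
    "N9K-C93108TC-FX", "N9K-C9336C-FX2", "N9K-C93180YC-FX3S",
    "APIC-SERVER-M3", "APIC-SERVER-L3",
}


def classify_pid(pid: str) -> str:
    """Return a hypothetical triage label for a Product ID.

    "fix-on-fail" : listed with the "Fix On Fail PID" comment
    "listed"      : listed without that comment
    "not-listed"  : not in the Affected Product ID table
    """
    base = pid.strip().upper().rstrip("=")  # drop Part Alternate suffix
    if base in FIX_ON_FAIL_PIDS:
        return "fix-on-fail"
    if base in OTHER_LISTED_PIDS:
        return "listed"
    return "not-listed"
```

A "listed" or "fix-on-fail" result is only a prompt to check the unit's serial number with the validation tool.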
Defect ID | Headline |
---|---|
CSCwb98743 | Some DIMMs failing at higher than expected rate |
A limited number of Dual In-line Memory Modules (DIMMs) shipped from Cisco are impacted by a known deviation in the memory supplier's manufacturing process. This deviation can result in a higher rate of failure.
In Revision 1.2 of this field notice, the Application Policy Infrastructure Controller (APIC) products were moved from Fix on Fail to Proactive DIMM replacement.
You must replace the DIMMs and update the older Cisco Integrated Management Controller (CIMC) BIOS (Version 4.1(3c) or earlier) in the same maintenance window.
DIMM manufacturers compose their DIMMs of multiple memory modules to reach the desired capacity. In this case, a manufacturing deviation in specific modules impacts 16GB DIMMs. This deviation was contained to a specific date range, and the DIMMs which use these chips were manufactured during the middle to end of 2020.
Since the discovery of this deviation, additional limits have been imposed on the manufacturing process to help prevent future DIMMs from experiencing this process variation.
Most DIMMs with this manufacturing deviation will exhibit persistent correctable memory errors. If left untreated, the DIMMs can eventually encounter an uncorrectable memory event. If encountered during runtime, uncorrectable errors will cause an unexpected switch reset.
Various DIMM Reliability, Availability, and Serviceability (RAS) features or even operating system features can mask the extent of these correctable errors. It is recommended to check your DIMMs for exposure using the Serial Number Validation Tool described in the Serial Number Validation section of this field notice. Only specific DIMMs are impacted by this issue.
Customers should replace the hardware DIMM to avoid the potential for unexpected switch/server failure. You can submit a replacement request using the order form after validating the Serial Number(s) as described in the How To Identify Affected Products section.
All Serial Numbers are the Switch or Server Serial Number, not the DIMM Serial Number.
DIMM Replacement
Cisco is offering field services free of charge for DIMM replacement through a qualified Cisco 3rd party Field Engineer.
After the replacement DIMMs have arrived onsite and you are ready to schedule the replacement(s), send an email to ciscodimmswap@centricsit.com to engage the field services team.
Impacted DIMMs can be identified based on the Product ID (PID) and the serial number. Use the Serial Number Validation Tool described in the Serial Number Validation section of this field notice to identify impacted products.
Note: Cisco recommends notifying our engineers for onsite service to schedule repairs or replace the device. See the Additional Information section.
Switch Logs That Contain Memory Errors
When running NX-OS Standalone, a unit that experiences this issue can show these messages in the syslogs on the device:
%DAEMON-3-SYSTEM_MSG: Location: SOCKET:0 CHANNEL:? DIMM:? [] - mcelog
%DEVICE_TEST-3-MCE_24HR_FAIL: Module 1 has exceeded MCE 24 hour correctable threshold of 100 with #### correctable errors within 24 hours.
%DAEMON-3-SYSTEM_MSG: corrected Socket memory error count exceeded threshold: #### in 24h - mcelog
or
%DAEMON-3-SYSTEM_MSG: MESSAGE : corrected DIMM memory error count exceeded threshold: #### in 24h - mcelog
%DAEMON-3-SYSTEM_MSG: MESSAGE_Location: /var/log/mcelog - mcelog
Note: Syslogs referencing the count of errors will be generated whenever 100 errors are corrected and the cumulative count of errors will be printed.
These errors indicate that correctable errors are being generated, which should not impact switch performance. If errors continue, an uncorrectable error can be experienced, and the device will undergo a kernel panic.
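The syslog messages listed above can be searched for in collected logs. The sketch below is one way to do that in Python, under stated assumptions: the regular expressions are derived only from the message text quoted in this field notice, the function name `max_correctable_count` is hypothetical, and real log lines may carry timestamps or host prefixes that vary by deployment (the patterns use `search`, so any prefix is ignored).

```python
import re

# Patterns built from the message formats quoted in this field notice.
# The "####" placeholder in the notice is assumed to be a decimal count.
MCE_PATTERNS = [
    re.compile(
        r"%DEVICE_TEST-3-MCE_24HR_FAIL: .*? with (\d+) correctable errors"
    ),
    re.compile(
        r"%DAEMON-3-SYSTEM_MSG: .*?corrected (?:Socket|DIMM) memory error "
        r"count exceeded threshold: (\d+) in 24h"
    ),
]


def max_correctable_count(lines):
    """Return the largest correctable-error count reported, or None.

    The notice states that the cumulative count is printed, so the
    largest value seen is the most recent total in the scanned lines.
    """
    counts = []
    for line in lines:
        for pattern in MCE_PATTERNS:
            match = pattern.search(line)
            if match:
                counts.append(int(match.group(1)))
                break  # one count per line is enough
    return max(counts) if counts else None
```

A result of `None` means none of the quoted messages were found in the scanned lines. It does not show that the DIMMs are unaffected, because RAS or operating system features can mask correctable errors.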
Switch Special Notes
This field notice covers onsite memory replacement. If your impacted switch/server has already failed, or has memory errors and is degraded, use the standard RMA replacement instead.
Onsite replacements are typically scheduled and performed during maintenance windows.
Some switches are originally assembled with 24GB of memory, consisting of one 16GB DIMM and one 8GB DIMM. For those devices, Cisco ships both DIMMs. When the switch cover is removed, both DIMMs are replaced even though the 8GB DIMM is known to be extremely reliable. This is a proactive decision made by the Cisco team.
Cisco highly recommends that you take advantage of the field service offering of a field engineer. If you decide to change the DIMMs on your own, note that some switch models have a more extensive process to gain access to the DIMM location.
Cisco highly recommends that a Cisco Field Engineer complete the replacement for switches with high screw counts.
This table lists each switch model and the number of screws involved.
Switch Model (PID) | DIMM Access Method | Number of Screws |
---|---|---|
N9K-C93180YC-FX3S | DIMM Door Access | 6 |
N9K-C93240YC-FX2 | DIMM Door Access | 6 |
N9K-C9364C-GX | DIMM Door Access | 6 |
N9K-C93180YC-FX-24* | DIMM Door Access | 6 |
N9K-C93180YC-FX3 | DIMM Door Access | 6 |
N9K-C93108TC-FX3P | DIMM Door Access | 5 |
N9K-C9336C-FX2 | Top Cover Access | 37 |
N9K-C9348GC-FXP | Top Cover Access | 35 |
N9K-C93108TC-FX | Top Cover Access | 33 |
N9K-C93108TC-FX-24* | Top Cover Access | 33 |
Additionally, switches designed with 8GB low-density DIMM memory are impacted, but the failure rate is extremely low. These products are fixed upon failure.
8GB and Supervisor Product IDs |
---|
N9K SUP-A+ |
N9K SUP-A+= |
N9K-C93360YC-FX2 |
N9K-C93360YC-FX2= |
N9K-C93216TC-FX2 |
N9K-C93216TC-FX2= |
N9K-C93180YC-FX |
N9K-C93180YC-FX= |
APIC Server Special Notes
The CIMC BIOS issue is noted in UCS field notice FN72272. With the older BIOS, the reported EC error counts can be higher than the actual EC error count, and Uncorrectable errors can also be seen.
Cisco provides a tool to verify whether a device is impacted by this issue. In order to check the device, enter the device's serial number in the Serial Number Validation Tool.
Note: For security reasons, you must click on the Serial Number Validation Tool link provided in this section to check the serial number for the device. Use of the Serial Number Validation Tool URL external to this field notice will fail.
A Form must be filled out for each separate Ship to Address.
The Upgrade Order Reference Number should be unique for each time the Form is filled out.
Please enter a valid AFFECTED Serial Number in the Form. If there is not enough space, enter additional serial numbers in the NOTES section.
Depending on material availability, both you and the Customer email address will receive a confirmation email with the Order# within 2-5 days. Some UMPIRE orders are proactive replacements and do NOT adhere to normal SLAs or Service Contracts.
NOTE: If your Ship to Address is in one of the following countries, please expect delays of up to 3 months depending on importation regulations: Argentina, Brazil, Colombia, Mexico, Venezuela, India, all countries in Asia (e.g. Singapore, Malaysia, Hong Kong, China, Vietnam, Korea, Thailand, Philippines), and all non-EU countries (e.g. UAE, Turkey). You will receive your Order# at that time. Thank you for your patience. This process benefits the customer by saving the cost of VAT/duty in these countries, which is very high.
If you were given a Sales Order Number for the shipment of your replacement parts, please refer to the SO Status Tool (Please note: you must have a CCO User ID and Password to access this site): https://cisco-apps.cisco.com/cisco/psn/commerce
If you were given an RMA Number for the shipment of your replacement parts, please refer to the "Service Order QuickSearch" Tool at the following location (Please note: you must have a CCO User ID and Password to access this site): https://ibpm.cisco.com/rma/home/
If you have not received an email with an Order# after 10 days, please send an email with your Request#(s) and Customer in the Subject line to: umpire-escalations@cisco.com
Note: Fields marked with an asterisk (*) are required fields.
1 For phone and fax, include 011 and the country code outside North America.
2 The serial number input field for each Product ID can hold up to 4,000 characters, including commas and white space. For longer lists of serial numbers, please submit additional requests.
3 For customers in Japan only: please enter the building and the floor in the address field. Also enter the contact person's name, telephone number, and e-mail address in the appropriate fields.
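Footnote 2 limits each serial number input field to 4,000 characters, including commas and white space, and asks for additional requests beyond that. A long serial number list can be split ahead of time. The following is a minimal sketch of that arithmetic. The function name `batch_serials` and the comma-only separator are assumptions, and the form itself may accept other separators.

```python
MAX_FIELD_CHARS = 4000  # limit stated in footnote 2 of this field notice


def batch_serials(serials, limit=MAX_FIELD_CHARS, sep=","):
    """Group serial numbers into strings that each fit within `limit`.

    Each returned string is one field's worth of input; separators
    count toward the limit, as the footnote says commas do.
    """
    batches = []
    current = ""
    for serial in serials:
        serial = serial.strip()
        if not serial:
            continue  # skip blank entries
        if len(serial) > limit:
            raise ValueError(f"serial number longer than {limit} characters")
        candidate = serial if not current else current + sep + serial
        if len(candidate) > limit:
            batches.append(current)  # current batch is full; start a new one
            current = serial
        else:
            current = candidate
    if current:
        batches.append(current)
    return batches
```

Each returned string corresponds to one request's serial number field. With 11-character serial numbers, a field holds 333 entries, since 333 × 12 − 1 = 3,995 characters.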
If you require further assistance, or if you have any further questions regarding this field notice, please contact the Cisco Systems Technical Assistance Center (TAC) by one of the following methods:
My Notifications—Set up a profile to receive email updates about reliability, safety, network security, and end-of-sale issues for the Cisco products you specify.