Thursday, August 5, 2021

Cisco 6880X - Part 4 (CSCux07070)

It's April 2020, and I'm still paying for the mistake I made 7 years ago when I decided to try out Cisco's 6880 platform. It's one of those nightmares that you can't seem to snap out of no matter how hard you try. I want to say a lot more s*** about it, but there is no point: we have fewer than ten left that need to be thrown out the window. One of the "things" (I don't even feel like calling it a switch anymore) started rebooting randomly -- 7 reboots in about a 12-hour period. Our team reached out to Cisco TAC, and TAC came to the conclusion that both power supplies needed to be RMA'd (they couldn't find any SEA log files on the system). How that was determined is beyond me. Since I wasn't on the TAC call, I didn't have a chance to ask why in the world both power supplies would need to be replaced. In any case, knowing the 6880 platform the way I do, I just didn't buy the power supply story, so I went looking. Nothing fancy: I just went through the system logs and noticed this right before every reboot:

5/-1: MAJ, DIAG_BU, online_diag_flush_pak_queue: flushed 1 packets when testing [1/-1/8]
1/-1: MAJ, DIAG_BU, online_diag_flush_pak_queue: flushed 1 packets when testing [1/-1/21]
1/-1: MAJ, DIAG_BU, online_diag_flush_pak_queue: flushed 1 packets when testing [1/-1/9]
1/-1: MAJ, DIAG_BU, online_diag_flush_pak_queue: flushed 1 packets when testing [1/-1/1]
5/-1: MAJ, GOLD, diag_publish_result[5/-1]: cpu limit hit, skip publishing result, test_id[38], testing_type[4]
5/-1: MAJ, GOLD, diag_publish_result[5/-1]: cpu limit hit, skip publishing result, test_id[39], testing_ty
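
If you have to repeat this check across several reboots, a few lines of Python can do the counting for you. This is only a rough sketch: the field layout (slot/-1: severity, facility, message) is my assumption based on the handful of lines quoted above, and the `summarize` helper is a made-up name, not anything that ships with the switch.

```python
import re
from collections import Counter

# Assumed layout, inferred only from the sample lines above:
#   <slot>/<-1>: <severity>, <facility>, <message text>
LINE_RE = re.compile(r"^\s*(\d+)/-?\d+:\s*(\w+),\s*(\w+),\s*(.*)$")

def summarize(lines):
    """Count DIAG_BU and GOLD messages per (slot, facility) pair."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group(3) in ("DIAG_BU", "GOLD"):
            counts[(int(m.group(1)), m.group(3))] += 1
    return counts

# The same lines as quoted above, including the truncated last one.
sample = [
    "5/-1: MAJ, DIAG_BU, online_diag_flush_pak_queue: flushed 1 packets when testing [1/-1/8]",
    "1/-1: MAJ, DIAG_BU, online_diag_flush_pak_queue: flushed 1 packets when testing [1/-1/21]",
    "1/-1: MAJ, DIAG_BU, online_diag_flush_pak_queue: flushed 1 packets when testing [1/-1/9]",
    "1/-1: MAJ, DIAG_BU, online_diag_flush_pak_queue: flushed 1 packets when testing [1/-1/1]",
    "5/-1: MAJ, GOLD, diag_publish_result[5/-1]: cpu limit hit, skip publishing result, test_id[38], testing_type[4]",
    "5/-1: MAJ, GOLD, diag_publish_result[5/-1]: cpu limit hit, skip publishing result, test_id[39], testing_ty",
]

for (slot, facility), n in sorted(summarize(sample).items()):
    print(f"slot {slot}: {n} x {facility}")
```

Feed it the saved log text from each reboot and compare the counts; if the same DIAG_BU and GOLD pairs show up every time, that is a pattern worth searching on.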

show mod:
--- ----- -------------------------------------- ------------------ -----------
  1   20  DCEF-X 16P SFP+ Multi-Rate             C6880-X-16P10G     
  5   20  6880-X 16P SFP+ Multi-Rate (Active)    C6880-X-SUP        

Just looking at those logs and doing a quick Google search landed me here:

High CPU due to Interrupt on C6880-X-16P10G