I have spent the last few weeks configuring our new Cisco Nexus 5596UP switches in two data centers. Using the configuration synchronization feature (also known as Switch Profile) seemed logical, as our new DC infrastructure design calls for Dual Homed FEXes with an Active/Passive NIC teaming topology. This scenario (like any Dual Homed one) requires almost all configuration to be identical on both switches in the vPC domain. Overall, I like this neat feature. In my humble opinion, Cisco should have come up with it years ago. It works like a charm if you are working on a clean deployment and follow Cisco guidelines. But… when it comes to migrating from Legacy configuration mode to Switch Profile mode, with both vPC domain switches already pre-configured separately… well, you’ll definitely face some issues! I personally spent a few days trying to solve one puzzle that drove me nuts!
Before reading further, I highly recommend that you familiarize yourself with my previous blog post and some official Cisco documents:
- vPC Domain Configuration Synchronization Guidelines (NExp.com.ua)
- Configuring Switch Profiles (Cisco Nexus 5500 Series NX-OS System Management Configuration Guide, Release 6.x)
- Troubleshooting Config-Sync Issues (Cisco Nexus 5000 Troubleshooting Guide)
Now, imagine that you have both vPC domain switches fully configured without using Config Sync mode. Nothing special. This is an absolutely normal scenario, especially for early deployments (e.g. NX-OS 5.x), when Config Sync mode had many limitations. Unfortunately (well, for you), the management/TA department has decided to implement this shiny feature to reduce the administrative overhead of future configuration tasks.
If you carefully follow Cisco’s guidelines while importing the configuration, you probably won’t have any problems. But… you may hit some odd issues, as I did, and spend crazy hours trying to understand what went wrong. Here’s a good example of my own…
I started importing the configuration and followed all recommendations. Everything seemed to be OK, with the exception of one single FEX interface that I wasn’t able to configure at all, no matter which configuration mode I used:
    N5K-01(config)# interface ethernet101/1/15
    N5K-01(config-if)# inherit port-profile PP-Servers
    Error: Command is not mutually exclusive

    N5K-01(config)# conf sync
    N5K-01(config-sync)# switch-profile vPC
    N5K-01(config-sync-sp)# interface ethernet101/1/15
    N5K-01(config-sync-sp-if)# inherit port-profile PP-Servers
    N5K-01(config-sync-sp-if)# verify
    Failed: Verify Failed

    N5K-01(config-sync-sp-if)# show switch-profile status
    switch-profile : vPC
    ----------------------------------------------------------
    Start-time: 633904 usecs after Mon Sep 30 09:37:15 2013
    End-time: 640502 usecs after Mon Sep 30 09:37:15 2013
    Profile-Revision: 17
    Session-type: Verify
    Session-subtype: -
    Peer-triggered: No
    Profile-status: Verify Failed
    Local information:
    ----------------
    Status: Verify Failure
    Error(s):
    Following commands failed mutual-exclusion checks:
    interface Ethernet101/1/15
    inherit port-profile PP-Servers
Both switches reported the same errors. If you have carefully read Cisco’s Config Sync Troubleshooting guide, you probably know that “Command is not mutually exclusive” or “commands failed mutual-exclusion checks” means that some configuration is not consistent across the two vPC domain switches and the two configuration modes. In particular, it often happens when you import the config from Global mode into Switch Profile (Config Sync) mode on one switch but forget to do the same on the other before re-enabling synchronization… But that was not my case! I checked the Global and Switch Profile configuration for this interface and found it to be identical (empty, to be more precise) on both switches:
    N5K-01# sh run include-switch-profile | section 101/1/15
    interface Ethernet101/1/15
    interface Ethernet101/1/15

    N5K-02# sh run include-switch-profile | section 101/1/15
    interface Ethernet101/1/15
    interface Ethernet101/1/15
So, what was wrong? I googled for days, read dozens of posts, and finally found one comment covering a slightly different issue. The idea behind it was that NX-OS maintains internal databases that represent the running configuration in an NX-OS-friendly structure… These databases exist for both configuration modes, Global and Switch Profile. As it turned out, NX-OS uses those structures for mutual-exclusion checks (and not only for that)! And, to make things worse… these databases can fall out of sync with the running configuration. That is what happened in my case: I checked the running-config and confirmed it looked OK, but had no idea about those internal structures, which didn’t reflect some of my recent changes. Whenever the Switch Profile feature behaves in a way you can’t explain, use the following commands to confirm that the internal databases are consistent with the running configuration for the part of the config you are troubleshooting:
- show system internal csm info switch-profile cfgd-db cmd-tbl
Display internal Switch Profile configuration database.
- show system internal csm info global-db cmd-tbl
Display internal Global configuration database.
These two commands will save you hours when troubleshooting odd Config Sync issues, so remember them well! The basic idea is that the contents of the internal databases must reflect the state of the running configuration. In my case, these internal databases had the following records:
    N5K-02# show system internal csm info switch-profile cfgd-db cmd-tbl | sec 101/1/15
    parent_seq_no= 0, seq_no= 1079, clone_seq_no= 0, cmd= 'interface Ethernet101/1/15'

    N5K-02# show system internal csm info global-db cmd-tbl | sec 101/1/15
    clone_seq_no= 0, cmd= 'interface Ethernet101/1/15'
    parent_seq_no= 811, seq_no= 812, clone_seq_no= 0, cmd= 'description Servers'
    parent_seq_no= 811, seq_no= 815, clone_seq_no= 0, cmd= 'duplex full'
    parent_seq_no= 811, seq_no= 814, clone_seq_no= 0, cmd= 'speed 1000'
    parent_seq_no= 811, seq_no= 813, clone_seq_no= 0, cmd= 'switchport access vlan 150'
Although only N5K-02’s output is shown here, both switches had identical records in their internal databases. This doesn’t look right if you compare it with the running-configuration contents shown a few paragraphs earlier: the Global database still holds four interface commands that the running configuration no longer contains.
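If you have many interfaces to check, this comparison can be scripted. Below is a minimal Python sketch, assuming you paste the cmd-tbl dump and the matching running-config section into the script as plain text, and that the dump lines look like the ones shown above (each record ending in cmd= '...'). The function names parse_cmd_tbl and stale_commands are my own and are not part of any Cisco tooling.

```python
import re

# Each record in the cmd-tbl dump carries the command text as: cmd= '...'
CMD_RE = re.compile(r"cmd=\s*'([^']*)'")


def parse_cmd_tbl(dump_text):
    """Return the commands recorded in a cmd-tbl dump, in order of appearance."""
    return [m.group(1).strip() for m in CMD_RE.finditer(dump_text)]


def stale_commands(dump_text, running_section):
    """Return commands present in the internal DB dump but absent from running-config."""
    running = {line.strip() for line in running_section.splitlines() if line.strip()}
    return [cmd for cmd in parse_cmd_tbl(dump_text) if cmd not in running]


# Sample data copied from the outputs in this post.
global_db = """\
clone_seq_no= 0, cmd= 'interface Ethernet101/1/15'
parent_seq_no= 811, seq_no= 812, clone_seq_no= 0, cmd= 'description Servers'
parent_seq_no= 811, seq_no= 815, clone_seq_no= 0, cmd= 'duplex full'
parent_seq_no= 811, seq_no= 814, clone_seq_no= 0, cmd= 'speed 1000'
parent_seq_no= 811, seq_no= 813, clone_seq_no= 0, cmd= 'switchport access vlan 150'
"""

running_section = """\
interface Ethernet101/1/15
interface Ethernet101/1/15
"""

for cmd in stale_commands(global_db, running_section):
    print("stale in global-db:", cmd)
```

With the sample data, the script lists the four leftover commands (description, duplex, speed and access VLAN). An empty result suggests the database and the running configuration agree for that interface.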
Then I recalled that these commands had been part of the Global configuration before I reset the port’s configuration to default! I had wanted to try a different method instead of the import/verify/commit sequence: reset the port’s configuration to default in Global configuration mode on both switches, then configure it on one of the switches via Switch Profile. I checked my PuTTY session log and found this output right after I applied the default configuration to the interface:
    Interface config wipeout failed for 0x2
    Interface config wipeout failed for 0x1f690000
Even though I see this error every time I wipe an interface’s configuration with the default interface command, I never paid special attention to it, mainly because I relied on the running-configuration contents, which always reflected my changes. Now it became obvious what this error means… It tells the network engineer that the configuration was not wiped out of the internal configuration structures! As a result, the switches were stuck in an internally inconsistent state.
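Since the message is so easy to overlook, it may be worth scanning saved session logs for it. Here is a small Python sketch under a few assumptions: the log is plain text, each default interface command appears on its own prompt line, and the failure messages follow it. The function name and the sample log are made up for illustration; only the error text itself comes from my session.

```python
import re

# The error text as it appeared in the session log.
WIPEOUT_RE = re.compile(r"Interface config wipeout failed for (0x[0-9a-fA-F]+)")
# A prompt line such as: N5K-01(config)# default interface ethernet101/1/15
DEFAULT_RE = re.compile(r"#\s*default interface\s+(\S+)", re.IGNORECASE)


def wipeout_failures_by_interface(log_text):
    """Map each defaulted interface to the wipeout failure codes logged after it."""
    failures = {}
    current = None
    for line in log_text.splitlines():
        match = DEFAULT_RE.search(line)
        if match:
            current = match.group(1)
            continue
        codes = WIPEOUT_RE.findall(line)
        if codes and current is not None:
            failures.setdefault(current, []).extend(codes)
    return failures


# Hypothetical log excerpt; the second interface is invented to show a clean case.
sample_log = """\
N5K-01(config)# default interface ethernet101/1/15
Interface config wipeout failed for 0x2
Interface config wipeout failed for 0x1f690000
N5K-01(config)# default interface ethernet101/1/16
"""

for interface, codes in wipeout_failures_by_interface(sample_log).items():
    print(interface, "->", ", ".join(codes))
```

Any interface that shows up in the result is a candidate for the internal database check described above.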
Luckily, there is a command to force internal database synchronization: resync-database. It has to be executed on both switches separately, from the root of Config Sync mode (not within a Switch Profile). A complete example follows:
    N5K-01(config-sync)# resync-database
    Re-synchronization of switch-profile db takes a few minutes...
    Re-synchronize switch-profile db completed successfully.

    N5K-02(config-sync)# resync-database
    Re-synchronization of switch-profile db takes a few minutes...
    Re-synchronize switch-profile db completed successfully.

    N5K-02# show system internal csm info switch-profile cfgd-db cmd-tbl | sec 101/1/15
    parent_seq_no= 0, seq_no= 1079, clone_seq_no= 0, cmd= 'interface Ethernet101/1/15'

    N5K-02# show system internal csm info global-db cmd-tbl | sec 101/1/15
    clone_seq_no= 0, cmd= 'interface Ethernet101/1/15'
As you can see, the configuration is now consistent between the running-configuration and the internal databases on both switches (only N5K-02 verification is shown). You can now apply the configuration to the problematic port via Switch Profile (or Global mode, if that is your requirement) and successfully commit all changes!
Hope this saves you some time!