When connecting a Cisco ACI fabric to HPE blade servers through HPE Virtual Connect (VC) modules, pay extra attention to VC tunnel networks. Both ACI and VC tunnel mode use internal forwarding mechanisms that differ from traditional Layer 2 MAC forwarding.
This post highlights one scenario in which users may see unexpected traffic behavior when connecting ACI to VC in tunnel mode, then discusses some design alternatives that resolve the issue.
A Virtual Connect tunnel network offers scalability and simple management. Instead of configuring individual VLANs to match the upstream switch and the downlink blade servers, a single Virtual Connect network can carry up to 4K VLANs transparently. Users only need to configure VLANs on the upstream switch and at the blade server OS/hypervisor level.
With tunnel networks, VC internally maintains one consolidated MAC forwarding table per tunnel network, rather than a separate MAC table for each user VLAN defined on the switch and server side.
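To make the difference concrete, here is a minimal Python sketch of the two table layouts. It is only a model for illustration, not how VC is actually implemented; the class names, port labels, and MAC address are all made up.

```python
class TunnelMacTable:
    """Model of a tunnel network: one table, keyed by MAC address only."""

    def __init__(self):
        self.entries = {}  # mac -> port

    def learn(self, vlan, mac, port):
        # The VLAN is ignored: every VLAN in the tunnel shares one table.
        self.entries[mac] = port

    def lookup(self, vlan, mac):
        return self.entries.get(mac)


class MappedMacTable:
    """Model of per-VLAN tables: one entry per (VLAN, MAC) pair."""

    def __init__(self):
        self.entries = {}  # (vlan, mac) -> port

    def learn(self, vlan, mac, port):
        self.entries[(vlan, mac)] = port

    def lookup(self, vlan, mac):
        return self.entries.get((vlan, mac))


MAC = "00:00:5e:00:53:01"  # made-up example address

tunnel, mapped = TunnelMacTable(), MappedMacTable()
for table in (tunnel, mapped):
    table.learn(10, MAC, "downlink")  # MAC seen on VLAN 10
    table.learn(11, MAC, "uplink")    # same MAC seen on VLAN 11

print(tunnel.lookup(10, MAC))  # uplink: the VLAN 11 entry replaced it
print(mapped.lookup(10, MAC))  # downlink: VLAN 10 keeps its own entry
```

In the consolidated table, the last place a MAC was seen wins regardless of VLAN, which is the property that matters in the rest of this post.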
This operating characteristic of VC tunnel networks must be taken into account when working with an ACI fabric. ACI uses the Bridge Domain (BD) as the Layer 2 broadcast boundary, and each BD can contain multiple End Point Groups (EPGs). At the access layer, users bind an encapsulation VLAN to the desired EPG to carry user traffic. As a result, in some ACI designs flooding can cross user VLANs (EPGs) when the EPGs are in the same BD.
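As a rough illustration of that last point, the sketch below groups encap VLANs by bridge domain. The EPG names, BD names, and VLAN IDs are hypothetical, and the model ignores everything about ACI except the EPG-to-BD relationship.

```python
from collections import defaultdict

# Hypothetical EPG bindings: EPG name -> (bridge domain, encap VLAN).
EPG_BINDINGS = {
    "Web": ("BD1", 10),
    "App": ("BD1", 11),
    "DB": ("BD2", 20),
}


def flood_domains(bindings):
    """Group encap VLANs by bridge domain, i.e. by flood domain."""
    domains = defaultdict(list)
    for bd, vlan in bindings.values():
        domains[bd].append(vlan)
    return {bd: sorted(vlans) for bd, vlans in domains.items()}


print(flood_domains(EPG_BINDINGS))  # {'BD1': [10, 11], 'BD2': [20]}
```

VLANs 10 and 11 land in the same flood domain because their EPGs share BD1, even though they look like separate VLANs from the server side.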
The following diagram shows one problematic scenario when connecting a VC tunnel network to ACI.
(Please note: the diagram illustrates only one use case that exposes this interop issue. Other cases may hit the same issue, so the key is to understand the nature of the problem.)
In the above topology, Virtual Connect has a single tunnel network defined and uses one uplink to connect to an ACI leaf node. This link carries two user VLANs, vlan 160 and vlan 161. On the ACI side, vlan-160 is the encap VLAN for EPG Mgmt and vlan-161 is the encap VLAN for EPG vMotion.
The ACI BD is set to flooding mode because the blade servers’ gateways are outside the ACI fabric.
In this setup, the blade servers will have connectivity issues. Here is how the problem arises.
1) The blade server sends an ARP broadcast request on vlan 160. The ARP packet travels through the VC tunnel network. VC records the blade server’s source MAC address as learned from the server downlink and forwards the packet out its uplink to ACI. So far so good.
2) The ACI fabric sees the ARP broadcast packet arrive on the access port in vlan 160 and maps it to EPG Mgmt. Because the BD is set to flood ARP packets, the packet is flooded inside the bridge domain and is in turn sent to all ports under EPG vMotion, since the two EPGs are in the same BD.
The following capture is from an ACI fabric access SPAN session capturing the egress direction of the access port. It shows the same ARP broadcast packet flooded back out of the port (in EPG vMotion).
3) The same ARP broadcast packet comes back over the same uplink. VC sees the original blade server source MAC address on this uplink. At this point, VC has learned the same MAC from both the server downlink port and the uplink port within its single tunnel network MAC forwarding table. Remember that VC tunnel mode consolidates the MAC table instead of maintaining one per VLAN. This in turn interrupts traffic for this server.
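The three steps can be replayed in a few lines of Python. This is a simplified sketch, assuming a learning table keyed by MAC only; the function name, port labels, and MAC address are invented for the example and do not come from VC itself.

```python
def trace_tunnel_learning(frames):
    """Replay (vlan, src_mac, ingress_port) frames through a table keyed
    by MAC only. Return the final table and the MAC moves observed."""
    table = {}  # mac -> port, shared by every VLAN in the tunnel
    moves = []
    for vlan, mac, port in frames:
        previous = table.get(mac)
        if previous is not None and previous != port:
            moves.append((mac, previous, port, vlan))
        table[mac] = port
    return table, moves


SERVER_MAC = "00:00:5e:00:53:01"  # made-up example address

frames = [
    (160, SERVER_MAC, "downlink"),  # step 1: ARP request from the server
    (161, SERVER_MAC, "uplink"),    # step 3: flooded copy returns from ACI
]

table, moves = trace_tunnel_learning(frames)
print(table[SERVER_MAC])  # uplink
print(len(moves))         # 1
```

After the second frame the single table associates the server's MAC with the uplink rather than the downlink, which is consistent with the traffic interruption described above.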
There are several resolutions to the above scenario.
1) Map ACI EPGs to BDs one to one to mimic traditional Layer 2 flooding behavior, so that flooded traffic in one EPG cannot reach other EPGs.
2) Use VC mapped mode networks. Mapped mode networks map to individual user VLANs, so they do not have this conflicting MAC learning issue: VC maintains a separate MAC forwarding table per user VLAN.
3) There is one promising ACI option. Although it doesn’t resolve this issue currently, I hope the feature can be enhanced in the future.
ACI release 1.1(1j) introduced a new BD flooding option called “Flood in Encapsulation”. It limits flooding inside the BD to the same EPG encapsulation VLAN. This is a promising solution because, with the option turned on, EPG Mgmt flooding traffic would not reach EPG vMotion even if the two EPGs share the same BD.
However, my tests show that even with this option set, flooding still crosses EPG encap VLANs. The Cisco ACI Fundamentals guide also specifically lists the protocols this option does not support. These include ARP, many common routing protocols such as OSPF/BGP/EIGRP, and multicast protocols such as PIM/IGMP. So until the option supports ARP packets, it is not a viable solution for this case.
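Resolutions 1 and 3 can be compared with a small Python model of the flood scope. This is a sketch of the behavior as described in this post, not of ACI internals: the BD names, the function, and the "other" protocol label (standing for any traffic outside the unsupported list) are assumptions made for illustration.

```python
# Protocols this post lists as not covered by Flood in Encapsulation.
UNSUPPORTED = {"arp", "ospf", "bgp", "eigrp", "pim", "igmp"}

# EPG name -> (bridge domain, encap VLAN). BD names are made up.
SHARED_BD = {"Mgmt": ("BD1", 160), "vMotion": ("BD1", 161)}
ONE_TO_ONE = {"Mgmt": ("BD-Mgmt", 160), "vMotion": ("BD-vMotion", 161)}


def flood_scope(ingress_vlan, protocol, flood_in_encap, epgs=SHARED_BD):
    """Return the encap VLANs that receive a flooded frame in this model."""
    bd = next(b for b, vlan in epgs.values() if vlan == ingress_vlan)
    if flood_in_encap and protocol.lower() not in UNSUPPORTED:
        return [ingress_vlan]  # contained within the ingress encap
    return sorted(vlan for b, vlan in epgs.values() if b == bd)


print(flood_scope(160, "arp", False))              # [160, 161]
print(flood_scope(160, "arp", True))               # [160, 161]
print(flood_scope(160, "other", True))             # [160]
print(flood_scope(160, "arp", False, ONE_TO_ONE))  # [160]
```

In this model, turning the option on does nothing for ARP, while giving each EPG its own BD keeps the ARP flood inside vlan 160.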
For readers who want to dig deeper into the VC tunnel mode MAC table, you can view the VC internal tunnel network MAC forwarding table using the following methods.
If VC is managed by HPE OneView, you can make a REST API call to the OneView appliance, as in the following captures. When the problem occurred, the blade server MAC address was learned from uplink LAG 25, which is VC module port X7. In the working scenario, the same MAC address is learned from a server downlink such as “downlink 3-a”.
The HP OneView PowerShell cmdlet “show-HPOVLogicalInterconnectMacTable” can also retrieve the VC MAC table.
The capture below shows the incorrect MAC address entry after VC received the original packet back on the uplink. You can see that LAG (Link Aggregation Group) 25 is on uplink X7.
The capture below shows the correct MAC address entry from server downlink “3-a” (server bay 3, first FlexNIC inside the physical link) before the issue.
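Once the table has been exported, spotting this condition is easy to script. The Python sketch below assumes each entry has already been reduced to a dict with "mac" and "interface" keys; those field names, the function, and the MAC address are hypothetical and are not the OneView response schema.

```python
def macs_on_uplinks(entries, server_macs):
    """Return server MACs that the table shows as learned on an uplink LAG."""
    suspects = []
    for entry in entries:
        on_uplink = entry["interface"].upper().startswith("LAG")
        if entry["mac"] in server_macs and on_uplink:
            suspects.append((entry["mac"], entry["interface"]))
    return suspects


SERVER_MAC = "00:00:5e:00:53:01"  # made-up example address

broken = [{"mac": SERVER_MAC, "interface": "LAG 25"}]
working = [{"mac": SERVER_MAC, "interface": "downlink 3-a"}]

print(macs_on_uplinks(broken, {SERVER_MAC}))   # one suspect on LAG 25
print(macs_on_uplinks(working, {SERVER_MAC}))  # []
```

A blade server MAC that shows up on a LAG instead of a downlink is the signature of the problem described in this post.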
If Virtual Connect is managed by the classic Virtual Connect Manager, you can use “show interconnect-mac-table” to retrieve MAC table information. The following capture only gives an idea of what the output looks like; it is not from the scenario and configuration described above.
The following capture shows the blade server connectivity issue we just discussed.