| posted in Cisco Networking | |
| by kannies on June 13th, 2012 | tags: BGP, Cisco, failover, HSRP, IP SLA |
Service providers are moving away from providing TDM-based point-to-point circuits, and we are now seeing more Metro Ethernet provisioned to the customer site.
This creates a problem. When your BGP peer becomes unreachable, the local FastEthernet interface on the CE stays up/up, because it is probably connected to some Layer 2 device rather than directly to the peer. BGP then only notices the failure when its hold timer expires, and the default hold time is 180 seconds, so the customer network could suffer a complete outage for up to 3 minutes. A customer who has been sold a 100M pipe with resilience is not going to be happy with that.
Here is the topology I am using for this example:

Gateways 1 & 2 have an iBGP neighborship over the f0/0 cross-link and provide a virtual default gateway using HSRP on the f0/1 LAN.
The LAN_HOST is not aware of routing and simply has a default route pointed to the HSRP address 192.168.1.1.
Gateway 1 is the primary router for inbound and outbound traffic. This is enforced using the following policies:
- Gateway 1 is the HSRP primary
- Gateway 1 has an inbound route map which sets the local preference to 150 for the prefix 198.77.64.40, which is also the gateway of last resort as defined by a static route. This manipulates outbound traffic.
- Gateway 1 and Gateway 2 both have an outbound route map which sets the MED to 50 and 200 respectively. This manipulates inbound traffic.
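Pulled from the full Gateway 1 config at the end of this post, the two route maps behind these policies look roughly like this. Gateway 2 uses an equivalent BACKUP route map that sets the metric to 200. The `!` comment lines are added here for explanation only.

```
! Outbound traffic: raise local preference on the prefix the default route points at
ip prefix-list INTERNET seq 5 permit 198.77.64.40/32
route-map INTERNET permit 10
 match ip address prefix-list INTERNET
 set local-preference 150
route-map INTERNET permit 20
!
! Inbound traffic: advertise the LAN to PE1 with the lower MED
ip prefix-list PRIMARY seq 5 permit 192.168.1.0/24
route-map PRIMARY permit 10
 match ip address prefix-list PRIMARY
 set metric 50
route-map PRIMARY permit 20
!
router bgp 100
 neighbor 10.1.1.253 route-map INTERNET in
 neighbor 10.1.1.253 route-map PRIMARY out
```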
Under normal conditions, the LAN can reach the Internet.

Here is an extended PING while PE1 experiences an outage.

That’s a long outage!
There are 2 problems:
- HSRP is monitoring the interface status and, even though PE1 went down, g1/0 on Gateway 1 didn’t.
- Because g1/0 didn’t go down, BGP didn’t tear down the peering to PE1. Instead it waited until the hold time expired, and during that time outbound and inbound traffic was being black-holed.
Now let’s speed things up.
Using IP SLA & BGP Failover
On Gateway 1 create an IP SLA process which starts PINGing the eBGP peer 10.1.1.253 every 5 seconds.
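A minimal sketch of this step, using the values from the Gateway 1 config listed at the end of the post:

```
ip sla 1
 icmp-echo 10.1.1.253
 frequency 5
ip sla schedule 1 life forever start-time now
```

The `schedule` line is what actually starts the probe. Without it the operation is defined but never runs.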

Next create an object which tracks this process. Use number 2 because Object 1 is used on the HSRP interface tracker.
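In the Gateway 1 config at the end of the post, the tracking object is a single line:

```
! Object 2 follows the state of IP SLA operation 1
track 2 rtr 1
```

The `rtr` keyword is what the IOS release used in this lab accepts. Newer IOS releases replace it with `ip sla`, so `track 2 ip sla 1` would be the likely equivalent there.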
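Next create a /32 static route to the peer, using the peer itself as the next hop, and tie it to the tracked object so that the route is only installed while the object is up. Because it is more specific, this /32 takes precedence over the /30 connected route when looking up the peer address.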
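As it appears in the Gateway 1 config at the end of the post, the tracked static route is:

```
! Installed only while track object 2 (the IP SLA probe) is up
ip route 10.1.1.253 255.255.255.255 GigabitEthernet1/0 10.1.1.253 track 2
```

When the probe fails, object 2 goes down and this /32 is withdrawn, leaving only the 10.1.1.252/30 connected route. The "after" routing table at the end of the post shows exactly that.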
Next create a prefix list to match this route.
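From the Gateway 1 config, the prefix list is:

```
ip prefix-list PEER_REACHABLE seq 5 permit 10.1.1.253/32
```

With no `ge` or `le` modifier this matches only the exact /32, so the 10.1.1.252/30 connected route does not match it.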
Then create a route-map which matches the prefix-list.
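Again taken from the Gateway 1 config, the route-map is just a wrapper around the prefix list:

```
route-map PEER_REACHABLE permit 10
 match ip address prefix-list PEER_REACHABLE
```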
Finally add the following neighbor statement under the BGP process which uses the route-map and the BGP failover feature.
The full command is: “neighbor 10.1.1.253 fall-over route-map PEER_REACHABLE”
With this in effect, the outage is shorter because the eBGP session on Gateway 1 is torn down as soon as the tracked route to the peer disappears, which purges any stale routes from the routing table.
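In context, the relevant part of the BGP process on Gateway 1 looks like this (from the full config at the end of the post):

```
router bgp 100
 neighbor 10.1.1.253 remote-as 200
 ! Drop the session as soon as no route permitted by PEER_REACHABLE leads to the peer
 neighbor 10.1.1.253 fall-over route-map PEER_REACHABLE
```

The route-map is what makes the feature useful here. Without it, the 10.1.1.252/30 connected route would still count as a valid route to the peer and the session would stay up until the hold timer expired.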

A final touch is to move the HSRP active role over to Gateway 2 as well, by tracking the same object we created for BGP failover. This avoids sub-optimal routing.
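The HSRP lines involved on Gateway 1, as they appear in the full config at the end of the post:

```
interface FastEthernet0/1
 standby 1 ip 192.168.1.1
 standby 1 priority 105
 standby 1 preempt
 ! Lose 10 priority points when the IP SLA probe to the eBGP peer fails
 standby 1 track 2 decrement 10
```

When object 2 goes down, Gateway 1 drops from priority 105 to 95. Gateway 2 has no explicit priority, so it sits at the HSRP default of 100, and because it is configured with `standby 1 preempt` it takes over as the active router.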
This at least removes a hop from our trace.

There is little more we can do on the customer’s network AS100 as the remaining failover delay exists on the service provider network AS200.
PE1 & PE2 peer using loopbacks learned via OSPF with the Next-Hop-Self option set. The default OSPF dead interval is 40 seconds on the broadcast segment between them (172.16.1.0/30 in the configs below). When the OSPF adjacency dies, the BGP next hop becomes unreachable and the associated routes are removed long before the BGP peering times out.
Just for giggles, changing the OSPF hello and dead intervals to 2 and 6 seconds respectively results in the following improved failover time.
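The timer change is applied on the PE1 to PE2 link, as shown in the PE configs at the end of the post:

```
interface FastEthernet2/0
 ip ospf hello-interval 2
 ip ospf dead-interval 6
```

Both ends of the link need the same hello and dead intervals, otherwise the OSPF adjacency will not form.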

We could just avoid all this headache and reduce the BGP hold timers in the first place, but that would be no fun.
I am open to constructive criticism from the senior forum members as to what better designs could be deployed in this scenario.
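For comparison, a sketch of what the timer-based alternative might look like on Gateway 1. The keepalive and hold values of 10 and 30 seconds are illustrative assumptions and were not part of this lab:

```
router bgp 100
 ! keepalive 10 s, hold time 30 s (example values only, not tested here)
 neighbor 10.1.1.253 timers 10 30
```

BGP peers use the lower of the two hold times offered when the session is established, and new timer values only take effect after the session is reset.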
I hope you enjoyed reading and it has been beneficial for you.
For completeness, please see below the configs for both Gateways and PEs.
Gateway 1
track 2 rtr 1
!
interface FastEthernet0/0
 ip address 192.168.255.1 255.255.255.252
 duplex auto
 speed auto
!
interface FastEthernet0/1
 ip address 192.168.1.253 255.255.255.0
 duplex full
 speed 100
 standby 1 ip 192.168.1.1
 standby 1 priority 105
 standby 1 preempt
 standby 1 track GigabitEthernet1/0
 standby 1 track 2 decrement 10
!
interface GigabitEthernet1/0
 ip address 10.1.1.254 255.255.255.252
 negotiation auto
!
router bgp 100
 no synchronization
 bgp log-neighbor-changes
 network 192.168.1.0
 neighbor 10.1.1.253 remote-as 200
 neighbor 10.1.1.253 fall-over route-map PEER_REACHABLE
 neighbor 10.1.1.253 route-map INTERNET in
 neighbor 10.1.1.253 route-map PRIMARY out
 neighbor 192.168.255.2 remote-as 100
 neighbor 192.168.255.2 next-hop-self
 no auto-summary
!
ip forward-protocol nd
ip route 10.1.1.253 255.255.255.255 GigabitEthernet1/0 10.1.1.253 track 2
ip route 0.0.0.0 0.0.0.0 198.77.64.40
no ip http server
no ip http secure-server
!
ip prefix-list INTERNET seq 5 permit 198.77.64.40/32
!
ip prefix-list PEER_REACHABLE seq 5 permit 10.1.1.253/32
!
ip prefix-list PRIMARY seq 5 permit 192.168.1.0/24
ip sla 1
 icmp-echo 10.1.1.253
 frequency 5
ip sla schedule 1 life forever start-time now
logging alarm informational
!
route-map PEER_REACHABLE permit 10
 match ip address prefix-list PEER_REACHABLE
!
route-map INTERNET permit 10
 match ip address prefix-list INTERNET
 set local-preference 150
!
route-map INTERNET permit 20
!
route-map PRIMARY permit 10
 match ip address prefix-list PRIMARY
 set metric 50
!
route-map PRIMARY permit 20
Gateway 2
interface FastEthernet0/0
 ip address 192.168.255.2 255.255.255.252
 duplex auto
 speed auto
!
interface FastEthernet0/1
 ip address 192.168.1.254 255.255.255.0
 duplex auto
 speed auto
 standby 1 ip 192.168.1.1
 standby 1 preempt
 standby 1 track GigabitEthernet1/0
!
interface GigabitEthernet1/0
 ip address 10.2.2.254 255.255.255.252
 negotiation auto
!
router bgp 100
 no synchronization
 bgp log-neighbor-changes
 network 192.168.1.0
 neighbor 10.2.2.253 remote-as 200
 neighbor 10.2.2.253 route-map BACKUP out
 neighbor 192.168.255.1 remote-as 100
 neighbor 192.168.255.1 next-hop-self
 no auto-summary
!
ip forward-protocol nd
ip route 0.0.0.0 0.0.0.0 198.77.64.40
no ip http server
no ip http secure-server
!
ip prefix-list BACKUP seq 5 permit 192.168.1.0/24
!
ip prefix-list INTERNET seq 5 permit 198.77.64.40/32
logging alarm informational
!
route-map BACKUP permit 10
 match ip address prefix-list BACKUP
 set metric 200
!
route-map BACKUP permit 20
PE1
interface Loopback0
 ip address 1.1.1.1 255.255.255.255
!
interface FastEthernet0/0
 no ip address
 shutdown
 duplex half
!
interface GigabitEthernet1/0
 ip address 10.1.1.253 255.255.255.252
 negotiation auto
!
interface FastEthernet2/0
 ip address 172.16.1.1 255.255.255.252
 ip ospf hello-interval 2
 ip ospf dead-interval 6
 duplex auto
 speed auto
!
interface FastEthernet2/1
 no ip address
 shutdown
 duplex auto
 speed auto
!
interface GigabitEthernet3/0
 no ip address
 shutdown
 negotiation auto
!
router ospf 1
 log-adjacency-changes
 network 1.1.1.1 0.0.0.0 area 0
 network 172.16.0.0 0.0.255.255 area 0
!
router bgp 200
 no synchronization
 bgp log-neighbor-changes
 neighbor 2.2.2.2 remote-as 200
 neighbor 2.2.2.2 update-source Loopback0
 neighbor 2.2.2.2 next-hop-self
 neighbor 10.1.1.254 remote-as 100
 no auto-summary
PE2
interface Loopback0
 ip address 2.2.2.2 255.255.255.255
!
interface FastEthernet0/0
 no ip address
 shutdown
 duplex half
!
interface GigabitEthernet1/0
 ip address 10.2.2.253 255.255.255.252
 negotiation auto
!
interface FastEthernet2/0
 ip address 172.16.1.2 255.255.255.252
 ip ospf hello-interval 2
 ip ospf dead-interval 6
 duplex auto
 speed auto
!
interface FastEthernet2/1
 no ip address
 shutdown
 duplex auto
 speed auto
!
interface GigabitEthernet3/0
 ip address 10.3.3.253 255.255.255.252
 negotiation auto
!
router ospf 1
 log-adjacency-changes
 network 2.2.2.2 0.0.0.0 area 0
 network 172.16.0.0 0.0.255.255 area 0
!
router bgp 200
 no synchronization
 bgp log-neighbor-changes
 neighbor 1.1.1.1 remote-as 200
 neighbor 1.1.1.1 update-source Loopback0
 neighbor 1.1.1.1 next-hop-self
 neighbor 10.2.2.254 remote-as 100
 neighbor 10.3.3.254 remote-as 40
 no auto-summary
Finally, here are the IP routing and BGP tables on Gateway 1 before and after a failover.
Gateway1#show ip route
Codes: C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route

Gateway of last resort is 198.77.64.40 to network 0.0.0.0

     10.0.0.0/8 is variably subnetted, 2 subnets, 2 masks
C       10.1.1.252/30 is directly connected, GigabitEthernet1/0
S       10.1.1.253/32 [1/0] via 10.1.1.253, GigabitEthernet1/0
     192.168.255.0/30 is subnetted, 1 subnets
C       192.168.255.0 is directly connected, FastEthernet0/0
C    192.168.1.0/24 is directly connected, FastEthernet0/1
     198.77.64.0/32 is subnetted, 1 subnets
B       198.77.64.40 [20/0] via 10.1.1.253, 00:01:04
S*   0.0.0.0/0 [1/0] via 198.77.64.40
Gateway1#show ip bgp sum
Gateway1#show ip bgp
BGP table version is 30, local router ID is 192.168.255.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
* i192.168.1.0      192.168.255.2            0    100      0 i
*>                  0.0.0.0                  0         32768 i
*> 198.77.64.40/32  10.1.1.253                    150      0 200 40 i
Gateway1#
*May 28 23:41:08.563: %TRACKING-5-STATE: 2 rtr 1 state Up->Down
*May 28 23:41:08.563: %BGP-5-ADJCHANGE: neighbor 10.1.1.253 Down Route to peer lost
Gateway1#
*May 28 23:41:09.631: %HSRP-5-STATECHANGE: FastEthernet0/1 Grp 1 state Active -> Speak
Gateway1#
*May 28 23:41:19.631: %HSRP-5-STATECHANGE: FastEthernet0/1 Grp 1 state Speak -> Standby
Gateway1#show ip route
Codes: C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route

Gateway of last resort is 198.77.64.40 to network 0.0.0.0

     10.0.0.0/30 is subnetted, 1 subnets
C       10.1.1.252 is directly connected, GigabitEthernet1/0
     192.168.255.0/30 is subnetted, 1 subnets
C       192.168.255.0 is directly connected, FastEthernet0/0
C    192.168.1.0/24 is directly connected, FastEthernet0/1
     198.77.64.0/32 is subnetted, 1 subnets
B       198.77.64.40 [200/0] via 192.168.255.2, 00:00:16
S*   0.0.0.0/0 [1/0] via 198.77.64.40
Gateway1#show ip bgp
BGP table version is 32, local router ID is 192.168.255.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
* i192.168.1.0      192.168.255.2            0    100      0 i
*>                  0.0.0.0                  0         32768 i
*>i198.77.64.40/32  192.168.255.2            0    100      0 200 40 i
