Cisco BGP Timers re-Explained

In this post I summarize the Cisco BGP timers that you will find useful in real life, because they determine basic convergence and route propagation throughout a BGP-routed network. I will present the timers on a live example as far as possible, including basic modification and verification commands.

This article is part of the CCIE category, which tries to put together common facts and references to Cisco documentation on various Routing & Switching topics.

As everything is easier to explain with examples, let me start with a small BGP topology on which we will walk through the timers.

Figure: Example BGP reference topology

This article will cover these basic timers:

KEEPALIVE + HOLD-DOWN
ADVERTISEMENT INTERVAL
SCAN-TIMER (including BGP NEXT-HOP TRACKING)

BGP KEEPALIVE and HOLD-DOWN

The first basic BGP timers are the keepalive and hold-down (hold time) intervals. By default, the keepalive timer is 60 seconds and the hold-down timer is 3 x keepalive, or 180 seconds. Once the peering between two peers is up, the router starts a hold-down timer counting up from 0 seconds. Every keepalive message received from the neighbor resets this timer back to 0 seconds. As you might imagine, failing to receive three keepalives in a row lets the hold-down timer reach 180 seconds, at which point the neighbor is considered down and the routes learned from it are flushed.
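The hold-down logic described above can be sketched in a few lines of Python. This is only a simplified model for illustration, not how IOS implements it; the function name hold_timer_expiry and its arguments are made up for this sketch, and times are plain seconds since the session came up.

```python
def hold_timer_expiry(arrivals, hold_time=180, start=0):
    """Return the time (in seconds) at which the neighbor is declared down.

    arrivals  -- times at which KEEPALIVE (or UPDATE) messages were received;
                 per RFC 4271 both message types restart the hold timer
    hold_time -- negotiated hold time in seconds (180 is the default above)
    start     -- time at which the session reached Established

    Assumption of this sketch: nothing else is received after the last
    entry in 'arrivals'.
    """
    last_reset = start
    for t in sorted(arrivals):
        if t - last_reset >= hold_time:
            break              # the timer had already expired before this message
        last_reset = t         # message arrived in time, timer restarts from 0
    return last_reset + hold_time


# Keepalives seen at 60 s and 120 s, then the neighbor goes silent:
print(hold_timer_expiry([60, 120]))   # 300
```

With the default timers, the neighbor is declared down 180 seconds after the last message that arrived in time, which corresponds to three missed keepalives.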

To verify the timers currently negotiated with a neighbor, issue the “show ip bgp neighbors” command, as in the example below.

R1#show ip bgp neighbors
BGP neighbor is 5.100.1.2,  remote AS 64513, external link
BGP version 4, remote router ID 5.100.1.2
BGP state = Established, up for 02:39:16
Last read 00:00:16, last write 00:00:16, hold time is 180, keepalive interval is 60 seconds

You can also very easily modify these basic timers on a per-neighbor (or per peer-group) basis with the command “neighbor X.X.X.X timers keepalive holddown [minimum holddown]”. The example below sets the keepalive to 20 seconds and the hold-down to 60 seconds on R1.

R1(config-router)# neighbor 5.100.1.2 timers 20 60

NOTE: New timers take effect only after a new BGP session is established, because they are negotiated during the OPEN message exchange, with the lower values of the two peers winning. Therefore, in our example I had to reset the BGP peering with the “clear ip bgp *” command; a soft reconfiguration with “clear ip bgp * soft” will not work.
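A minimal sketch of this negotiation, assuming the usual rule that the session uses the lower of the two hold times (this part follows RFC 4271) and that the keepalive becomes the smaller of the locally configured keepalive and one third of the negotiated hold time (this part is an assumption about typical Cisco behavior, not something taken from the outputs in this article). The function name negotiate_timers is hypothetical.

```python
def negotiate_timers(local_keepalive, local_hold, peer_hold):
    """Return (keepalive, hold) in seconds as used by the local router.

    local_keepalive, local_hold -- values from 'neighbor ... timers'
    peer_hold                   -- hold time received in the peer's OPEN

    Assumption: keepalive = min(configured keepalive, negotiated hold / 3).
    """
    hold = min(local_hold, peer_hold)
    if hold == 0:
        return 0, 0            # hold time 0 means no keepalives are sent
    keepalive = min(local_keepalive, hold // 3)
    return keepalive, hold


# R1 configured with "timers 20 60", peer still on the defaults (60/180):
print(negotiate_timers(20, 60, 180))   # (20, 60)
```

Because only the OPEN exchange carries the hold time, changing the configured values has no effect on a session that is already established.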

BGP “Minimum Hold-Down from Neighbor” option

You may have noticed one option in the BGP timers syntax that we have not mentioned yet. Let’s first set the R2 timers to a 5-second keepalive and a 15-second hold-down and reset the peering.

R2(config-router)#neighbor 5.100.1.1 timers 5 15
R2(config-router)#do show ip bgp neighbors
BGP neighbor is 5.100.1.1,  remote AS 64512, external link
BGP version 4, remote router ID 10.0.2.1
BGP state = Established, up for 00:00:44
Last read 00:00:03, last write 00:00:03, hold time is 15, keepalive interval is 5 seconds
Configured hold time is 15,keepalive interval is 5 seconds  Minimum holdtime from neighbor is 0 seconds

R1#show ip bgp neighbors
BGP neighbor is 5.100.1.2,  remote AS 64513, external link
BGP version 4, remote router ID 5.100.1.2
BGP state = Established, up for 00:01:33
Last read 00:00:03, last write 00:00:03, hold time is 15, keepalive interval is 5 seconds
 Configured hold time is 60,keepalive interval is 20 seconds  Minimum holdtime from neighbor is 0 seconds

You can see that, again, the smaller timers win the negotiation, and the session runs with a 5-second keepalive and a 15-second hold-down. Now look at the ranges that are possible for these parameters:

R2(config-router)#neighbor 5.100.1.1 timers ?
<0-65535>  Keepalive interval

R2(config-router)#neighbor 5.100.1.1 timers 5 ?
<0-65535>  Holdtime

You can imagine that ISPs would not like BGP on steroids, with timers set this aggressively. Therefore they can protect themselves by setting the minimum hold-down they are willing to accept. So let’s PROTECT R1, which does not want a hold-down timer of less than 40 seconds, and clear the session.

R1(config-router)#neighbor 5.100.1.2 timers 20 60 40
R1(config-router)#do clear ip bgp *
R1(config-router)#
*Mar  1 06:25:48.246: %BGP-5-ADJCHANGE: neighbor 5.100.1.2 Down User reset
*Mar  1 06:25:50.670: %BGP-3-NOTIFICATION: sent to neighbor 5.100.1.2 2/6 (unacceptable hold time) 0 bytes

As you can see, the peer negotiation failed and the session between the peers was NOT formed. You can set a protection limit for the hold-down timer negotiation if you want, but the price to pay is that the session will not come up when the other side proposes a lower value.
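The check that R1 performed here can be sketched as follows. It is a simplified model: the function name check_peer_hold_time is hypothetical, a hold time of 0 is not treated specially, and the only facts taken from outside this article are the RFC 4271 rules that hold times of 1 or 2 seconds must be rejected and that the rejection uses error code 2, subcode 6 (the "2/6" in the log above).

```python
# OPEN Message Error (2) / Unacceptable Hold Time (6), as seen in the log
UNACCEPTABLE_HOLD_TIME = (2, 6)


def check_peer_hold_time(peer_hold, min_hold=0):
    """Return None if the peer's hold time is acceptable, otherwise the
    (error code, subcode) of the NOTIFICATION that would be sent.

    peer_hold -- hold time received in the peer's OPEN message
    min_hold  -- locally configured minimum hold time (third 'timers' value)
    """
    if peer_hold in (1, 2):          # RFC 4271: always rejected
        return UNACCEPTABLE_HOLD_TIME
    if peer_hold < min_hold:         # below our configured protection limit
        return UNACCEPTABLE_HOLD_TIME
    return None


# R2 offers 15 seconds, R1 insists on at least 40 seconds:
print(check_peer_hold_time(15, min_hold=40))   # (2, 6)
```

Raising the R2 hold time to 40 seconds or more would make the same check pass and let the session establish.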

BGP ADVERTISEMENT TIMER

To keep the routing table as stable as possible, BGP introduced a minimum delay between updates sent to a neighbor. This interval is called the Advertisement Interval. On Cisco routers, the default advertisement interval is 30 seconds for eBGP peers and 0 seconds for iBGP peers.

Let me give you an example in our lab topology.

Imagine that we have just configured the BGP process on R1, and R1 is not yet configured to propagate its Loopback 1 and Loopback 2 subnets (10.0.1.0/24 and 10.0.2.0/24). The starting configuration of R1 is shown below, followed by the (empty) list of advertised routes.

R1(config-router)#do sh run | b router bgp
router bgp 64512
no synchronization
bgp log-neighbor-changes
neighbor 5.100.1.2 remote-as 64513
neighbor 5.100.1.2 ebgp-multihop 2
neighbor 5.100.1.2 update-source Loopback0
no auto-summary
!
R1(config-router)#do show ip bgp neighbor 5.100.1.2 advertised-routes
Total number of prefixes 0

Also on R2, the routing database is empty:

R2#show ip bgp
R2#

As you can see, BGP is not yet aware of the loopbacks, so let’s make it aware. We will first put network 10.0.1.0/24 into the BGP table and right afterwards 10.0.2.0/24.

Adding 10.0.1.0/24:

R1(config-router)#network 10.0.1.0 mask 255.255.255.0
R1(config-router)#do show ip bgp neighbor 5.100.1.2 advertised-routes
BGP table version is 10, local router ID is 5.100.1.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

Network          Next Hop            Metric LocPrf Weight Path
*> 10.0.1.0/24      0.0.0.0                  0         32768 i

Total number of prefixes 1

R1(config-router)#do show clock
*05:27:49.674 UTC Fri Mar 1 2002

The route is advertised immediately and seen on R2:

R2#show ip bgp
BGP table version is 10, local router ID is 5.100.1.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

Network          Next Hop            Metric LocPrf Weight Path
*> 10.0.1.0/24      5.100.1.1                0             0 64512 i
R2(config-router)#do show clock
*05:27:51.852 UTC Fri Mar 1 2002

Now, let’s quickly (in under 5 seconds) enter the second network, 10.0.2.0/24:

R1(config-router)#network 10.0.2.0 mask 255.255.255.0
R1(config-router)#do show ip bgp neighbor 5.100.1.2 advertised-routes
BGP table version is 11, local router ID is 5.100.1.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

Network          Next Hop            Metric LocPrf Weight Path
*> 10.0.1.0/24      0.0.0.0                  0         32768 i

Total number of prefixes 1
R1(config-router)#do show clock
*05:27:55.050 UTC Fri Mar 1 2002

You can also see that this route was NOT DELIVERED IMMEDIATELY TO R2, unlike the previous route.

R2#show ip bgp
BGP table version is 10, local router ID is 5.100.1.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

Network          Next Hop            Metric LocPrf Weight Path
*> 10.0.1.0/24      5.100.1.1                0             0 64512 i

R1(config-router)#do show clock
*05:27:57.122 UTC Fri Mar 1 2002

The reason for this is the 30-second Advertisement Interval between the eBGP peers mentioned above. The second network statement does not enter the BGP advertisements to R2 until 30 seconds after the last update was sent for the first network, 10.0.1.0/24. If this is correct, the first network was added at roughly “05:27:49.674 UTC Fri Mar 1 2002”, so the second network 10.0.2.0/24 should not appear before 05:28:19.
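The arithmetic behind this expectation can be written down as a small Python sketch. It models the advertisement interval in a deliberately simplified way (the first change goes out at once, later changes wait until the interval since the previous update has passed); real implementations are more involved. The function name update_send_times is made up, and the time of the second network statement is only approximated from the “show clock” output above.

```python
from datetime import datetime, timedelta

ADVERTISEMENT_INTERVAL = timedelta(seconds=30)   # eBGP default quoted above


def update_send_times(change_times, interval=ADVERTISEMENT_INTERVAL):
    """Return the times at which updates would be sent to the neighbor."""
    sends = []
    for t in sorted(change_times):
        if not sends:
            sends.append(t)                 # first change is sent immediately
        elif t <= sends[-1]:
            continue                        # rides along with a pending update
        else:
            # wait until the interval since the previous update has elapsed
            sends.append(max(t, sends[-1] + interval))
    return sends


first = datetime(2002, 3, 1, 5, 27, 49, 674000)   # 10.0.1.0/24 added
second = datetime(2002, 3, 1, 5, 27, 55)          # 10.0.2.0/24 added (approximate)

sends = update_send_times([first, second])
print(sends[1].time())   # 05:28:19.674000
```

Under this model the second prefix cannot leave R1 before 05:28:19.674, which agrees with the lab observation that it showed up shortly after 05:28:19.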

I repeatedly entered the combination of “show ip bgp neighbor 5.100.1.2 advertised-routes” and “show clock” using copy & paste and verified this. To avoid filling this blog post with outputs, I will show only the last two show commands.

R1(config-router)#do show ip bgp neighbor 5.100.1.2 advertised-routes
BGP table version is 11, local router ID is 5.100.1.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

Network          Next Hop            Metric LocPrf Weight Path
*> 10.0.1.0/24      0.0.0.0                  0         32768 i

Total number of prefixes 1
R1(config-router)#do show clock
*05:27:17.357 UTC Fri Mar 1 2002
R1(config-router)#do show ip bgp neighbor 5.100.1.2 advertised-routes
BGP table version is 11, local router ID is 5.100.1.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

Network          Next Hop            Metric LocPrf Weight Path
*> 10.0.1.0/24      0.0.0.0                  0         32768 i
*> 10.0.2.0/24      0.0.0.0                  0         32768 i

Total number of prefixes 2
R1(config-router)#do show clock
*05:28:21.974 UTC Fri Mar 1 2002

Also on the R2 side, this route appeared only after I saw it appear in the R1 advertised routes.

R2#show ip bgp
BGP table version is 11, local router ID is 5.100.1.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

Network          Next Hop            Metric LocPrf Weight Path
*> 10.0.1.0/24      5.100.1.1                0             0 64512 i
*> 10.0.2.0/24      5.100.1.1                0             0 64512 i
R2(config-router)#do show clock
*05:28:23.154 UTC Fri Mar 1 2002

As you can see, the advertisement timer is in effect: if your network produces a lot of prefixes that are entered into and withdrawn from the BGP table, the BGP process protects itself on eBGP sessions by limiting the number of updates over time. To verify the currently configured advertisement timer, look at the output of “show ip bgp neighbors”.

R1(config-router)#do show ip bgp neighbors
BGP neighbor is 5.100.1.2,  remote AS 64513, external link
BGP version 4, remote router ID 5.100.1.2
BGP state = Established, up for 02:39:16
Last read 00:00:16, last write 00:00:16, hold time is 180, keepalive interval is 60 seconds
Neighbor capabilities:
Route refresh: advertised and received(old & new)
Address family IPv4 Unicast: advertised and received
Message statistics:
InQ depth is 0
OutQ depth is 0
Sent       Rcvd
Opens:                  8          7
Notifications:          6          1
Updates:               11          0
Keepalives:           162        162
Route Refresh:          0          0
Total:                181        170
Default minimum time between advertisement runs is 30 seconds

The same output for an iBGP session would show the advertisement timer set to 0 seconds by default because, inside one AS, updates should be propagated immediately.

BGP SCAN-TIMER

In the old days, BGP routers considered walking the whole table of BGP prefixes to find the best route and validate the next-hop IP* a serious resource impact. And for most Internet routers it is still resource-consuming to go over the approximately 200 MB Internet routing table to select the best route to all destinations.

*In BGP, the next-hop is changed to the peering IP when a prefix is sent out on an eBGP session, but remains the same on iBGP sessions. This means, however, that if your internal IGP (OSPF, RIP, etc.) loses the route to the next-hop subnet, BGP should not select such a prefix: even if it has the best attributes, your router does not know which way to send the traffic, as the next-hop can be many L3 hops away.

The scan-timer was introduced in Cisco routers to walk the BGP prefix tables every 60 seconds (by default) and validate that we know an IGP route to the next-hop, or to compare the BGP prefix attributes for better routes. You can very easily modify this timer in both router bgp and address-family configuration.

The configuration is quite easy, you just use the command:

bgp scan-time [import] scanner-interval

To demonstrate this in our topology, you can simply leave the default 60 seconds configured, then remove the next-hop IP from the IGP/static routes and quickly afterwards re-enter the next-hop route. And … the BGP route would be reinstated immediately or, if you are fast enough, within 5 seconds. This is because the scan-timer has already been mostly replaced in its basic functions by something called BGP next-hop tracking, which I cover later in this article.

NOTE: In 12.2 IOS versions, the default scan-timer was 15 seconds as noted in 12.2 BGP Command Guide: bgp scan-time.

To cover usage examples that we cannot see in effect in our example BGP topology, because we are not running an MPLS VPN network that uses VPNv4 addresses, here are three easy-to-understand examples. They show the “bgp scan-time” command both with and without the [import] option.

In the following router configuration example, the scanning interval for next hop validation of IPv4 unicast routes for BGP routing tables is set to 20 seconds:

router bgp 100
 no synchronization
 bgp scan-time 20

In the following address family configuration example, the scanning interval for next hop validation of address family VPNv4 unicast routes for BGP routing tables is set to 45 seconds:

router bgp 150
 address-family vpnv4 unicast
  bgp scan-time 45

In the following address family configuration example, the scanning interval for importing address family VPNv4 routes into IP routing tables is set to 30 seconds:

router bgp 150
 address-family vpnv4 unicast
  bgp scan-time import 30

NOTE: The “bgp scan-time” command is ignored if your router has BGP next-hop tracking (NHT) enabled for the address family. This is the case in most newer IOS releases, where the scan-timer then remains at 60 seconds.

BGP Next-Hop Tracking (NHT)

BGP next-hop tracking is an event-based system introduced into Cisco’s BGP implementation to provide faster convergence in an on-demand fashion, rather than periodically based on the scan-timer. It monitors the Routing Information Base (RIB) for next-hop related changes for both eBGP- and iBGP-learned prefixes and reports such events to the BGP process.

This feature prevents black-hole routing during periods of IGP instability, when BGP would otherwise keep believing in a stale state of next-hop reachability for up to the 60-second period of the scan-timer.

Configuring NHT:

As mentioned, BGP next-hop tracking is active by default, so the only configuration options are:

Disabling NHT:
no bgp nexthop trigger enable

Modifying the internal event delay of the NHT event system:
bgp nexthop trigger delay delay-timer

Events triggered by NHT are processed with a delay of 5 seconds by default. This prevents the BGP process from being bombarded by events generated by unstable IGP behavior. A delay of 5 seconds is well suited to IGPs that converge fast, like OSPF. Reducing the delay can lead to instability for BGP sessions, while increasing it can be useful to save resources if you have a slower IGP.

NOTE: Both configuration options can be implemented separately under all BGP address-families.
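To get a feeling for what the trigger delay does, here is a rough Python sketch. It assumes a simple model in which the first next-hop event starts the delay timer and every further event arriving before the timer fires is handled in the same run; this model, the function name nht_run_times and the sample event times are illustrative assumptions, not a description of the actual IOS internals.

```python
def nht_run_times(event_times, delay=5.0):
    """Return the times (seconds) at which BGP would react to RIB events.

    event_times -- times at which next-hop related RIB changes are reported
    delay       -- trigger delay in seconds (5 is the default quoted above)
    """
    runs = []
    for t in sorted(event_times):
        if runs and t <= runs[-1]:
            continue                 # absorbed by the run already scheduled
        runs.append(t + delay)       # first event of a burst starts the timer
    return runs


# A flapping IGP route produces three events within 4 seconds, then one later:
print(nht_run_times([0, 1, 4, 20]))   # [5.0, 25.0]
```

In this model a burst of IGP flaps costs BGP a single reaction instead of one per event, which is the trade-off between fast convergence and protecting the BGP process that the delay is meant to tune.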

SUMMARY

BGP is a robust, stable and beautifully complex protocol. There is no chance a single blogpost can cover every situation, but I hope I covered all the basic timers. Please, feel free to comment if you see anything that is not correct, or you know of something more to the BGP Timers topic that should be added.

More BGP resources to read:
Cisco BGP Case Studies – Great series of small examples of BGP functionality with explanation
BGP Configuration Guide – Basic resource for configuring BGP on Cisco routers
BGP Next-Hop Tracking Feature Guide
BGP Advanced Feature Configuration Guide – If you search for “Default BGP Scanner Behavior”, you can find default scan time for ordinary BGP routes
