OSPF Split-Brain Problem and Designated Router (DR) Election

All the CCNA/CCNP guys out there learned in their studies that once an OSPF router is elected Designated Router (DR), it will not willingly lose this status to a new OSPF router that becomes active on the same broadcast network, even if the new router has a better priority. Well, this is not completely true. There is a situation where the DR simply has to willingly enter a limited form of the election process again, and possibly lose, to prevent issues in the network.

In this post, let me show you this problem, called OSPF split-brain, on a live example topology that anyone can build in the GNS3 simulator if they wish.

First, let’s summarize what most people out there know about the OSPF DR election, all of which is by no means incorrect.

OSPF election of Designated Router (DR)

  • The DR is the router with the highest priority; if priorities are equal, the highest router ID wins.
  • The Backup Designated Router (BDR) is the router with the second-highest priority/router ID.
  • All other routers in the common broadcast domain establish full adjacencies only with the DR and BDR.
  • DR and BDR elections are non-preemptive. This means that once a DR and BDR are established, they keep their status even when new routers with better priorities become active in the same broadcast domain. In other words, no new election is held for every OSPF router that shows up late.
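The first two rules can be sketched in a few lines of Python. This is only a simplified model of an election where nobody holds a role yet, not the full RFC procedure. The router names, IDs and priorities below are made up for illustration, and the sketch assumes the standard OSPF behaviour that a router with priority 0 never becomes DR or BDR.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass(frozen=True)
class Router:
    name: str
    router_id: str   # dotted-quad router ID, e.g. "1.1.1.1"
    priority: int    # 0 means "never DR/BDR"; otherwise higher is better


def rank(router):
    # Compare by priority first, then by router ID as a 32-bit number.
    return (router.priority, int(IPv4Address(router.router_id)))


def elect_from_scratch(routers):
    """Return (dr, bdr) for a segment where nobody claims a role yet."""
    eligible = sorted(
        (r for r in routers if r.priority > 0), key=rank, reverse=True
    )
    dr = eligible[0] if eligible else None
    bdr = eligible[1] if len(eligible) > 1 else None
    return dr, bdr


# Hypothetical routers, not the ones from the lab topology.
routers = [
    Router("A", "1.1.1.1", 1),
    Router("B", "2.2.2.2", 1),
    Router("C", "3.3.3.3", 5),
    Router("D", "4.4.4.4", 0),
]
dr, bdr = elect_from_scratch(routers)
print(dr.name, bdr.name)  # C B
```

C wins on priority, B beats A on router ID, and D is never considered. The non-preemptive rule is deliberately left out here, because it depends on which roles the routers already claim.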

Let’s start with an OSPF topology on which we will demonstrate an example of DR and BDR election:

OSPF topology

So, the question for the network-oriented reader is: which routers in this topology are going to be DR and BDR for the common 10.0.0.0/24 network?

The answer is that R1 will be DR and R3 will be BDR. This can be seen in the output of “show ip ospf neighbor” on any of the routers.

As already mentioned, the OSPF DR election is non-preemptive. This means that even if we change the priorities so that some other router has a better priority, no new election will start as long as the current DR is active. So let’s change the priority on R2 to a higher value.

OSPF Topology with Priority Change on R2

Now, let’s have a look at the election state. Despite the new priority for R2 (Neighbor ID 2.2.2.2), the DR/BDR have not changed:

NO CHANGE!
So up to this point, the theory most CCNA/CCNP people know is correct. Now let’s simulate a connection issue in our topology.

OSPF Broadcast Network Separated

I created this topology with two switches specifically so that the routers can be separated into pairs by losing the connection between the switches. The routers get no direct indication of this failure, because their own physical interfaces stay up. Before the separation, R1 was DR and R3 was BDR (because the election is non-preemptive).

When we separated the networks, the dead timers started to expire for the DR/BDR sessions on routers trapped on the wrong side of the network. From the perspective of our R4 router, it lost its connection to the old BDR (R3) and also to R2. Afterwards, R4 became the new BDR for the “left” part of the network.
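The dead-timer behaviour can be illustrated with a small sketch. It assumes the common broadcast-network defaults of a 10-second hello interval and a 40-second dead interval. The timestamps and the router IDs 1.1.1.1 and 3.3.3.3 are made up for illustration, and a real router runs a separate inactivity timer per neighbor rather than polling a list like this.

```python
from dataclasses import dataclass

HELLO_INTERVAL = 10   # seconds, common broadcast default (assumption)
DEAD_INTERVAL = 40    # seconds, conventionally 4 x hello (assumption)


@dataclass
class Neighbor:
    router_id: str
    last_hello: float  # time in seconds when the last Hello was received


def alive_neighbors(neighbors, now, dead_interval=DEAD_INTERVAL):
    """Keep only neighbors whose last Hello is newer than the dead interval."""
    return [n for n in neighbors if now - n.last_hello < dead_interval]


# R4's view: the inter-switch link fails around t=100, so Hellos from
# R2 and R3 stop arriving while R1 stays reachable on the same switch.
neighbors = [
    Neighbor("1.1.1.1", last_hello=100.0),
    Neighbor("2.2.2.2", last_hello=95.0),
    Neighbor("3.3.3.3", last_hello=98.0),
]

for now in (110.0, 120.0, 130.0, 140.0):
    neighbors[0].last_hello = now  # R1 keeps sending Hellos every 10 seconds
    neighbors = alive_neighbors(neighbors, now)
    print(now, [n.router_id for n in neighbors])
```

In this sketch R2 and R3 stay in the list until the check at t=140, which is why the routers only react to the split some time after the link between the switches goes down.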

On the right side, the DR/BDR statuses of R2 and R3 have also changed a little:

So despite having a smaller priority, R2 immediately changed its role from BDR to DR once it detected that the old DR (R1) was unreachable. R3 was elected as the new BDR.

In summary for “left” part of the network:

R1 – DR
R4 – BDR

In summary for “right” part of the network:

R2 – DR
R3 – BDR

Now, who can tell what happens when I put the networks back together? Who will be DR and who will be BDR? Will completely new elections start, with no router being preferred in any way? These are the questions I would like to answer in the rest of this article.

Two DR routers find each other on one Broadcast domain

This is an interesting result of how OSPF resolves this problem, which is called a “split-brain” issue. Let’s look at the output from the R3 router to see who won.

INTERESTING! The two DR routers have met. The DR that discovered another DR with a better priority willingly discarded its DR status and entered the election process. The same happened with the BDR routers: the two BDRs identified each other, and only R4 willingly discarded its BDR status. No complete election from scratch was held when we connected the two previously separated parts of the network together.

An explanation of what happened here can be found in the OSPFv2 specification, RFC 2328:

If we go to section 9.4, we find an explanation of how the DR and BDR are elected when one or more routers already declare themselves DR or BDR.
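The core of that section can be sketched in Python as follows. This is a simplified model under stated assumptions: it takes one global view of the segment instead of letting every router run the calculation on its own, and the router IDs 1.1.1.1, 3.3.3.3 and 4.4.4.4 as well as all priorities are made up, since the lab values are not given in the text. The important part is that routers already claiming a role are preferred over routers that merely have a better priority.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass
class Router:
    router_id: str
    priority: int
    claims_dr: bool = False    # lists itself as DR in its Hello packets
    claims_bdr: bool = False   # lists itself as BDR in its Hello packets


def rank(r):
    return (r.priority, int(IPv4Address(r.router_id)))


def best(candidates):
    return max(candidates, key=rank, default=None)


def elect(routers):
    """One pass of the DR/BDR calculation over routers with priority > 0."""
    eligible = [r for r in routers if r.priority > 0]
    # BDR: only routers not claiming DR; prefer those already claiming BDR.
    not_dr = [r for r in eligible if not r.claims_dr]
    bdr = best([r for r in not_dr if r.claims_bdr]) or best(not_dr)
    # DR: only routers already claiming DR; otherwise promote the BDR.
    dr = best([r for r in eligible if r.claims_dr]) or bdr
    return dr, bdr


def converge(routers):
    """Repeat the pass, updating the claims, until the roles stop changing."""
    while True:
        dr, bdr = elect(routers)
        new_claims = [(r is dr, r is bdr) for r in routers]
        if new_claims == [(r.claims_dr, r.claims_bdr) for r in routers]:
            return dr, bdr
        for r, (is_dr, is_bdr) in zip(routers, new_claims):
            r.claims_dr, r.claims_bdr = is_dr, is_bdr


# Two halves of a split network meet again (hypothetical priorities).
merged = [
    Router("1.1.1.1", priority=1, claims_dr=True),    # DR of the "left" half
    Router("2.2.2.2", priority=10, claims_dr=True),   # DR of the "right" half
    Router("3.3.3.3", priority=5, claims_bdr=True),   # BDR of the "right" half
    Router("4.4.4.4", priority=1, claims_bdr=True),   # BDR of the "left" half
]
dr, bdr = converge(merged)
print(dr.router_id, bdr.router_id)  # 2.2.2.2 3.3.3.3
```

With these assumed priorities, the two self-declared DRs are compared only against each other, and so are the two self-declared BDRs. The same function also reproduces the non-preemptive case: a newcomer with a high priority but no claim loses to a router that already lists itself as DR or BDR.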

 Summary

What I presented here is just part of the hidden “magic” the OSPF protocol can do in situations that you will not find in the basic CCNA/CCIP literature. But as you have just seen, OSPF is ready even for these situations, deep in the RFC standard. This means that you may already be running OSPF in blissful ignorance of the problems your network topology can cause for the protocol, while the protocol itself gives you a good night’s sleep just by being prepared for many more situations than you are aware of … and I personally appreciate my good night’s sleep.

Once again, thank you for reading, and if you liked this article, please share.

Peter

 

About Peter Havrila
