Tuesday, November 18, 2014

EIGRP Hello and Hold Timers

The Hello and Holddown timers can be a bit tricky to understand if you mostly work with, say, OSPF. So let's get down to it.

Both timers are defined per interface. In Classic Mode they are configured directly under the interface, whereas in Named Mode they are configured under the af-interface section of the EIGRP process.

The Hello timer defines how often the local router will send out a Hello packet. This information is not advertised to the neighbor and the Hello timer does not need to match between the neighbors for EIGRP to form adjacencies.

The Holddown timer defines how long the neighbor router will wait for a Hello packet from the local router. The way this works is that the Holddown is advertised in the Hello packet, so that the local router can instruct the neighbor how long it should wait for a Hello packet before tearing down the adjacency.
Hold Time advertised in an EIGRP Hello packet
So, because the Hello timer is only locally significant and the Holddown is the timeout that the neighboring router applies to its adjacency with the local router, there is no need for the timers to match between EIGRP neighbors - unlike OSPF where these timers must match.
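The behaviour described above can be sketched in a few lines of Python. This is only a simplified model, not how IOS implements it: the class name NeighborEntry and the timer values of 10 and 30 seconds are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class NeighborEntry:
    """Simplified state a router keeps for one EIGRP neighbor (hypothetical model)."""
    hold_time: float      # seconds, copied from the neighbor's last Hello
    last_hello_at: float  # local clock time when that Hello arrived

    def on_hello(self, now: float, advertised_hold: float) -> None:
        # Every Hello refreshes the timestamp and carries the hold time to use.
        self.last_hello_at = now
        self.hold_time = advertised_hold

    def expired(self, now: float) -> bool:
        # The adjacency is torn down once the advertised hold time has elapsed.
        return now - self.last_hello_at >= self.hold_time


# R2 advertises a hold time of 30 s (assumed value). R1 simply obeys it,
# no matter which Hello and Holddown values R1 itself is configured with.
r2_as_seen_by_r1 = NeighborEntry(hold_time=30, last_hello_at=0)
still_up = not r2_as_seen_by_r1.expired(now=29)
torn_down = r2_as_seen_by_r1.expired(now=30)
```

The point of the model is that the only timer R1 applies to R2 is the one R2 sent, which is why the values never need to match.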

Note: I do believe the best-practice recommendation would be to have the timers match on the peering devices - but it is a technical possibility to have them configured differently between the adjacent routers.

Friday, November 14, 2014

Maximum Transmission Unit (MTU)

I was asked this week by a colleague of mine to explain some things regarding a Maximum Transmission Unit (MTU) setting at a customer's site. The customer had told him that he had to set the MTU to 1472 on an endpoint device in order to accommodate Dot1Q VLAN tagging in his network (which I doubt he actually needed to do, but sometimes you need to choose your battles when explaining network stuff to people who are not well-versed in it).

As I was explaining the meaning and workings of MTU size and when you would want to change it versus when not to, I discovered I actually didn't have all the answers he sought after - mainly because I did not have much of the specifics from this particular setup we were discussing, but also because I didn't actually remember all of the header sizes of IP and Ethernet - I knew the theory of it, but it is hard to convey that without specific numbers, which I feel like I shouldn't be struggling with as a CCIE candidate. So, I sat down and looked it all up, threw some pings and other traffic around in the lab and captured some packets - and now I am writing this down so that I may gain an even better grasp of these concepts myself - and also not have to spend time explaining it to someone, when I can just send them a link to this post.

Alright, on with it!

Maximum Transmission Unit, or MTU, defines the maximum size of the packet that can be carried as the payload of an Ethernet frame on the network - the Ethernet encapsulation itself is not counted. In most cases, this would default to 1500 bytes. Testing the MTU is usually done using ICMP ping requests with a packet size set somewhere between 1300 and 1500 (depending on what you want to test) and the DF-bit set. The DF-bit means that the packet should not be fragmented. If it encounters a point on the path from A to B that requires fragmentation, an ICMP packet will be returned stating that fragmentation was needed, but the DF-bit was set. This reply is generated by the device that would otherwise have fragmented the packet had the DF-bit not been set. When testing the MTU size, you should know how the OS handles ICMP sizes.

Let's do a sample with Windows 7 pinging its local gateway, which is set with an IP MTU of 1500 on its interface.
R1#show ip interface GigabitEthernet1
GigabitEthernet1 is up, line protocol is up
  Internet address is 172.16.0.1/24
  <output omitted>
  MTU is 1500 bytes
  <output omitted>
Let's try a ping size of 1500 with the DF-bit set and see what happens.
Win7 ping size 1500 w/ df-bit set

Okay, that didn't work - but why?! We said 1500 in the length of the packet and that is what is allowed on the router interface. So why would the packet need to be fragmented? Well, simply put, the packet is too large for the host to send out its interface without fragmenting it. The reason is that Windows 7 takes the -l 1500 option as meaning 1500 bytes of payload - then it adds the various headers (we'll get to the headers in a second). This makes the packet too big to be transmitted without fragmentation.

Now, let's try lowering the length of the packet to 1472 and see if we can get a packet through.
Win7 ping size 1472 w/ df-bit set

So, the packet was sent and a reply was received. Now, why can we only send with 1472, even though the client and the network are configured for 1500 bytes?

If we were to replicate this on a router, pinging another device with the size set to 1500 it would work - as seen in the output below, where router R2 pings R1 on the same layer 2 subnet (172.16.0.0/24).
R2#ping 172.16.0.1 size 1500 df-bit repeat 1
Type escape sequence to abort.
Sending 1, 1500-byte ICMP Echos to 172.16.0.1, timeout is 2 seconds:
Packet sent with the DF bit set
!
Success rate is 100 percent (1/1), round-trip min/avg/max = 4/4/4 ms
The simple explanation here is that the packet length is interpreted differently by a router and a Windows 7 machine. The router takes the 1500 byte value as the size of the complete IP packet (that is, everything inside the Ethernet frame, excluding the Ethernet encapsulation itself), whereas the Windows machine takes it as 1500 bytes of payload - then comes the ICMP and layer 3 encapsulation and subsequently the layer 2 encapsulation.

Let's break down the packet that makes it through on the Windows machine.
Ping towards 172.16.0.1 with the df-bit set and a payload of 1472.
  • Payload: 1472 bytes
  • ICMP encapsulation: 8 bytes (1480 bytes)
  • IP encapsulation: 20 bytes (1500 bytes)
  • Ethernet encapsulation: 18 bytes (1518 bytes)
  • Transmit on wire
So, as seen above, the packet reaches 1500 bytes after it is encapsulated at layer 3 (IP). This is where the IP source and destination addresses are added (among other things). After that it is handed off to layer 2 and encapsulated in an Ethernet frame containing the source and destination MAC addresses. What you will most likely see when capturing packets is an Ethernet encapsulation of 14 bytes, which is not wrong, but as you can see above I wrote 18 bytes - the 4 extra bytes are from the FCS (frame check sequence), which most network cards strip before Wireshark gets to see the frame.
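The breakdown above is plain addition, so it is easy to check with a short Python sketch. The function names are made up for this example, and the IPv4 header is assumed to carry no options (20 bytes).

```python
ICMP_HEADER = 8   # ICMP echo header
IPV4_HEADER = 20  # IPv4 header without options (assumption)
ETH_HEADER = 14   # destination MAC + source MAC + EtherType
ETH_FCS = 4       # frame check sequence, usually not seen in captures


def windows_ping_sizes(payload: int) -> tuple[int, int]:
    """Return (IP packet size, Ethernet frame size) for a 'ping -l <payload>'."""
    ip_packet = payload + ICMP_HEADER + IPV4_HEADER
    frame = ip_packet + ETH_HEADER + ETH_FCS
    return ip_packet, frame


def fits_without_fragmentation(payload: int, ip_mtu: int = 1500) -> bool:
    """True if the resulting IP packet fits within the interface IP MTU."""
    ip_packet, _ = windows_ping_sizes(payload)
    return ip_packet <= ip_mtu


print(windows_ping_sizes(1472))          # (1500, 1518)
print(fits_without_fragmentation(1500))  # False, the IP packet would be 1528 bytes
```

This also shows why the router's "size 1500" ping works: there the 1500 already is the IP packet size, so nothing is added on top of it before the comparison with the MTU.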

Below I have a few MTU size examples. First we will look at a TCP example as seen in a Wireshark capture that doesn't include the Ethernet FCS (most network cards don't include this value).
Alright, next up we have the same example as above, but this time we include the FCS and the result is actually how packets today are sent (by default) on an Ethernet network.
The layer 2 frame size is now 1518 bytes and this is what counts as the frame sent on the wire.

Now, let's take it a little further and add 802.1Q VLAN tags into the mix. The packet, as seen above, is transmitted to a switch, that then transmits it to another switch over a trunk link - adding a VLAN tag to the 1518 byte long frame. This VLAN tag is worth 4 bytes, which now makes the frame a whopping 1522 bytes!
In 1998, the max-packet was changed, in 802.3, from 1518 to 1522 to allow for the 4 bytes VLAN tagging. So, the new standard size in Ethernet networks today is actually considered to be 1522 - sizes above that would be considered jumbo frames.

This would be a standard Ethernet network allowing for 1500 byte packets. However, the tale is not done quite yet. The Ethernet frame may be 1522 bytes (when using VLAN tagging), but the bytes transmitted on the wire for a packet include a little more than just the frame itself - this is unrelated to the MTU, but still a nice little fun-fact to know about.

Before an Ethernet frame is sent, the sender transmits a preamble and the SFD (start-of-frame delimiter), which let the receiver synchronize to the incoming signal and find the start of the frame. This is separate from CSMA/CD (Carrier Sense Multiple Access with Collision Detection), which is the access method half-duplex 802.3 networks use to check that the medium is free. The preamble and SFD take up 8 bytes on the wire. And finally, at the end of a transmission there is a silence called the Interframe Gap, which adds another 12 bytes. So, a frame of 1522 bytes becomes a total of 1542 bytes transmitted on the wire. Keep in mind that the preamble, SFD and Interframe Gap do not count towards the MTU size - neither does the Ethernet frame encapsulation including the FCS and VLAN tag (if any).
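To tie the numbers together, here is a small Python sketch of the same sums. The function names are made up for this example; the byte counts are the ones used in this post.

```python
ETH_HEADER_AND_FCS = 18  # 14 byte header + 4 byte FCS
DOT1Q_TAG = 4            # 802.1Q VLAN tag
PREAMBLE_AND_SFD = 8     # sent before every frame
INTERFRAME_GAP = 12      # idle time after every frame, counted in byte times


def frame_size(ip_packet: int, tagged: bool = False) -> int:
    """Size of the Ethernet frame carrying an IP packet of the given size."""
    return ip_packet + ETH_HEADER_AND_FCS + (DOT1Q_TAG if tagged else 0)


def wire_bytes(ip_packet: int, tagged: bool = False) -> int:
    """Bytes (and byte times) the frame occupies on the wire."""
    return frame_size(ip_packet, tagged) + PREAMBLE_AND_SFD + INTERFRAME_GAP


print(frame_size(1500))               # 1518
print(frame_size(1500, tagged=True))  # 1522
print(wire_bytes(1500, tagged=True))  # 1542
```

Note that the IP packet size (the MTU-relevant number) stays at 1500 in all three cases; only the overhead around it changes.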

Thursday, November 13, 2014

EIGRP Wide Metrics

Along with the EIGRP multi-af (named) mode also came the Wide Metrics of EIGRP. Let us do a short recap of the classic metrics before tackling the wide metrics.

The classic metrics consisted of K values 1 through 5 with the default settings listed below.
K1 = bandwidth = 1
K2 = load = 0
K3 = delay = 1
K4 = reliability = 0
K5 = MTU = 0
So, the above means that any K value set to 1 is used in the metric calculation - so by default only bandwidth and delay are used. These K values are used in a formula (shown at the bottom of this post) to scale the bandwidth, load, delay etc. and finally produce a number that is the actual metric for EIGRP.

In EIGRP Named Mode, the wide metrics were introduced along with a sixth K value. The delay was also changed from being measured in tens of microseconds to being measured in picoseconds. The new wide metrics, delay in picoseconds and K6 are only part of EIGRP multi-af mode - they are not available in classic mode.

The need for wide metrics, in short, is due to the classic metrics being unable to handle interfaces above 1 Gigabit properly, because neither the bandwidth nor the delay values allow for the granularity needed. So the wide metrics introduce a 64-bit metric calculation as opposed to the classic 32-bit.

The new 64-bit calculations introduced another issue - the metric in the RIB can only accommodate 4 bytes (32 bits) of data. This is solved by scaling the metric for the RIB using the metric rib-scale <1-255> command in the routing process address-family section.
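A rough Python sketch of the scaling idea is shown here. It assumes the RIB value is simply the wide metric divided by the rib-scale value, and it uses 128 as the scale, which I believe is the default but is worth verifying on your own platform. The wide metric used in the example is an arbitrary made-up number.

```python
RIB_METRIC_MAX = 2**32 - 1  # the RIB only has room for a 32-bit metric


def rib_metric(wide_metric: int, rib_scale: int = 128) -> int:
    """Scale a 64-bit wide metric down for the RIB (simplified assumption)."""
    if not 1 <= rib_scale <= 255:
        raise ValueError("rib-scale must be between 1 and 255")
    return wide_metric // rib_scale


wide = 2**36                 # hypothetical wide metric, too big for 32 bits
print(wide > RIB_METRIC_MAX)  # True
print(rib_metric(wide))       # 536870912, which fits in 32 bits
```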

Below is an excerpt of the Cisco documentation regarding the EIGRP metrics calculation formula.
Use this command to alter the default behavior of EIGRP routing and metric computation and to allow the tuning of the EIGRP metric calculation for a particular type of service (ToS). 
If k5 equals 0, the composite EIGRP metric is computed according to the following formula:
metric = [k1 * bandwidth + (k2 * bandwidth)/(256 – load) + k3 * delay + K6 * extended metrics] 
If k5 does not equal zero, an additional operation is performed:
metric = metric * [k5/(reliability + k4)] 
Scaled Bandwidth= 10^7/minimum interface bandwidth (in kilobits per second) * 256 
Delay is in tens of microseconds for classic mode and pico seconds for named mode. In classic mode, a delay of hexadecimal FFFFFFFF (decimal 4294967295) indicates that the network is unreachable. In named mode, a delay of hexadecimal FFFFFFFFFFFF (decimal 281474976710655) indicates that the network is unreachable.  
Reliability is given as a fraction of 255. That is, 255 is 100 percent reliability or a perfectly stable link. 
Load is given as a fraction of 255. A load of 255 indicates a completely saturated link.
A thing to remember is, that if you change the K value weighting on one router, you will have to change it on all routers in the EIGRP network. The K values must match!
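As a worked example, the classic-mode calculation can be sketched in Python as below. This is my own simplified reading of the formula with the usual 256 multiplier applied at the end; the function name, the parameter names and the integer rounding are assumptions, so the results may differ slightly from IOS for non-default K values.

```python
def classic_metric(min_bw_kbps: int, total_delay_usec: int,
                   load: int = 1, reliability: int = 255,
                   k1: int = 1, k2: int = 0, k3: int = 1,
                   k4: int = 0, k5: int = 0) -> int:
    """Classic EIGRP composite metric (simplified sketch)."""
    bandwidth = 10**7 // min_bw_kbps  # inverse of the slowest link on the path
    delay = total_delay_usec // 10    # cumulative delay in tens of microseconds
    metric = k1 * bandwidth + (k2 * bandwidth) // (256 - load) + k3 * delay
    if k5 != 0:
        # Only applied when K5 is non-zero, as in the excerpt above.
        metric = metric * k5 // (reliability + k4)
    return 256 * metric


# Default K values, Gigabit links (1,000,000 kbps, 10 usec delay each):
print(classic_metric(1_000_000, 10))  # 2816, one hop
print(classic_metric(1_000_000, 20))  # 3072, two hops
# FastEthernet (100,000 kbps, 100 usec delay):
print(classic_metric(100_000, 100))   # 28160
```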

Wednesday, November 12, 2014

Configure Replace

If you plan on loading a lot of different topologies into your lab - be it physical or logical - you may want to consider learning the configure replace command to speed things up a bit. What this command can do for you in the lab is to cut down on the time it takes to "reset to zero" by having a base configuration for your router/switch, that you can then use to overwrite any changes done when tampering with different technologies or lab assignments. What I used to do was to make sure not to write anything to the config and then when I needed to reset the lab I would reload the devices and wait the excruciatingly long time it took for them to reload (this was with my hardware lab consisting of a couple of 2600 routers). When I started building a lab setup for my CCIE I knew I had to find a way to change configs faster and I found that this command would help me do just that.

This is how I set up my lab routers (the routers I use are CSR1000v - Cisco's virtual cloud routers).
You boot up your router and configure the things you want to be configured just about always.
enable
configure terminal
!
hostname R1
!
logging buffered 8192
!
no aaa new-model
!
no ip domain lookup
!
no ip http server
no ip http secure-server
!
line con 0
 exec-timeout 0
 logging synchronous
!
end
When you are done setting up the very basics  - save the running-config to a file on flash.
copy running-config flash:/config/base.conf
Now, whenever you have configured anything on the router - like, say, some DMVPN or EIGRP configuration - and you want to reload it to the base to start another lab, this is what you do.

From the privileged exec mode enter the following command
R1#configure replace flash:/config/base.conf force
Total number of passes: 1
Rollback Done
R1#
*Nov 12 10:30:42.186: Rollback:Acquired Configuration lock.
R1#
This makes the running-config identical to that of the base.conf file saved earlier. The "Total number of passes: 1" indicates it took 1 pass of the config to make it identical. The number of passes it takes depends on how much the running-config and the base.conf file differ.

There is a bit more to the command if you want to use it outside of the lab, but this just about covers what you may want in a lab environment.

Monday, November 10, 2014

EIGRP Packets

EIGRP communicates using IP Protocol 88 and uses the following packet types:
  • Hello/Ack
  • Update
  • Query
  • Reply
  • SIA-query
  • SIA-reply
Hello packet
Opcode = 5
The Hello packet is used to automatically form adjacencies with neighboring routers. This is done by sending a message to the multicast address 224.0.0.10 or FF02::A. Since this packet type is unreliable the sequence number will be set to 0. The Hello packet includes the router's K values and these must match between neighbors for adjacencies to form - this ensures a consistent metric calculation throughout the network. It also includes the Holdtime, which by default is set to 15 seconds - 3 times the default Hello interval. The Hello packet is also used to send Ack messages if the acknowledgment is not able to "piggyback" on another Update, Query or Reply packet.

Update packet
Opcode = 1
The Update packet is used to exchange routing information between neighbors. It is sent as unicast to new adjacencies to inform them of the full topology and afterwards as multicast to all adjacent neighbors when a topology change occurs (such as a change of metric). The first Update packet exchange between neighbors will have the Init flag set - this instructs the neighboring router to advertise all routes. The next Update packet exchange will contain the actual routes. Updates are subsequently only sent when triggered by an event and the contents of the update will be that of the changed network - not the full routing information.

Query packet
Opcode = 3
A query packet is sent in response to a route going into the active state and requests an alternate path to the affected network from the adjacent routers. This is done by sending a Query packet containing the affected network(s) with an infinite metric.
Each destination in a Query packet will be processed by DUAL on the receiving end and a reply will be sent only after the entire Query packet has been processed.

Reply packet
Opcode = 4
The Reply packet is a response to a Query packet and is acknowledged immediately, when received, and then processed by DUAL on the receiving end. If the destination network requested in a Query packet is not in the topology table, the Reply packet will state an infinite metric in its response.

SIA-Query
Opcode = 10
The SIA-query packet is sent if a Query packet has gone unanswered for 90 seconds (by default). The SIA-query requests the router to respond with whether it is still working on processing the original Query request or not. It is sent as unicast to the neighbors that have yet to reply to a Query packet. Upon receiving an SIA-query, the router must immediately send an ack, before processing the contents of the SIA-query.

SIA-reply
Opcode = 11
The SIA-reply packet is sent in response to an SIA-query, if the receiving router is still active for the destination network specified in the SIA-query. This is done by responding with the Active flag set in the response.
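For quick reference, the opcodes listed above can be collected in a small Python lookup. The enum and the describe helper are made up for this example; marking Update, Query, Reply and the two SIA packets as reliable follows the descriptions above, where only the Hello/Ack is sent unreliably.

```python
from enum import IntEnum


class EigrpOpcode(IntEnum):
    UPDATE = 1
    QUERY = 3
    REPLY = 4
    HELLO = 5  # also used for Acks
    SIA_QUERY = 10
    SIA_REPLY = 11


# Packet types that are acknowledged by the receiver.
RELIABLE = frozenset({
    EigrpOpcode.UPDATE, EigrpOpcode.QUERY, EigrpOpcode.REPLY,
    EigrpOpcode.SIA_QUERY, EigrpOpcode.SIA_REPLY,
})


def describe(opcode: int) -> str:
    """Turn a numeric opcode into a short description."""
    op = EigrpOpcode(opcode)  # raises ValueError for unknown opcodes
    delivery = "reliable" if op in RELIABLE else "unreliable"
    return f"{op.name} ({delivery})"


print(describe(5))   # HELLO (unreliable)
print(describe(10))  # SIA_QUERY (reliable)
```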

EIGRP Feasible Successor Routes

EIGRP uses the terms successor and feasible successor to denote the best path and backup path for a particular route. The successor is the route with the lowest Computed Distance (CD). The CD is calculated by taking the advertising router's Reported Distance (RD), which is that neighbor's CD to reach the particular subnet, and then adding the cost for the local router to reach the advertising router.

There is also a term known as Feasible Distance (FD). The FD is the lowest metric observed for a given route since the last time it went from active to passive. The FD is used when converging the topology to avoid temporary routing loops by comparing the FD with the RD of a given path. A path is only considered loop-free if its RD is lower than the FD. If the RD is equal to or higher than the FD it means that there is a possibility that the route could point back to the local router and would therefore cause a temporary routing loop to occur if used. Note that it is only a possibility that it would be a loop and EIGRP is designed with this in mind: it will rather black-hole traffic than cause a temporary routing loop.
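The FD versus RD comparison is simple enough to show as a Python sketch. The neighbor names and RD values below are hypothetical and only serve to show the three possible outcomes; the check itself is the strict "RD lower than FD" comparison.

```python
def is_feasible(reported_distance: int, feasible_distance: int) -> bool:
    """Feasibility check: the neighbor must be strictly closer than our FD."""
    return reported_distance < feasible_distance


fd = 3072  # hypothetical feasible distance for the route

# Hypothetical neighbors and the RD each one advertises for the route.
candidates = {
    "neighbor A": 2816,  # lower than FD
    "neighbor B": 3072,  # equal to FD
    "neighbor C": 3584,  # higher than FD
}

feasible = [name for name, rd in candidates.items() if is_feasible(rd, fd)]
print(feasible)  # ['neighbor A']
```

Neighbor B is the interesting case: a neighbor that is exactly as far away as the local router could be routing through the local router, so it is not accepted.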

So, let's look at an example. Below is a diagram of an EIGRP network with 3 routers R1 through R3.
EIGRP topology with IP address notations
And below here we have the same topology with the EIGRP metric noted for R3 to reach the network 172.16.1.0/24
EIGRP topology with metric notations
Note: the composite metric is calculated by taking the lowest bandwidth (in kilobits per second) on the path and the cumulative delay (in 10s of microseconds) on the path in a formula that looks like this: 256*(BW+DLY), where BW is 10^7 divided by the lowest bandwidth and DLY is the cumulative delay.

From the perspective of router R3, the network 172.16.1.0/24 is reachable via the successor route 10.0.0.5 with a Computed Distance of 3072 and through a second path via 172.16.2.1 with a Computed Distance of 3328.

Below is the topology output as seen on R3.
R3#show ip eigrp topology all-links
EIGRP-IPv4 Topology Table for AS(1)/ID(172.16.2.2)
<output omitted>
P 172.16.1.0/24, 1 successors, FD is 3072, serno 37
        via 10.0.0.5 (3072/2816), GigabitEthernet1.13
        via 172.16.2.1 (3328/3072), GigabitEthernet1.23
The output states "1 successors" meaning only one route has the lowest metric and will be entered into the routing table. The Computed Distance and the Reported Distance can be seen in the output in the parentheses (CD/RD). Keep in mind that the all-links keyword lists every known path, feasible or not. The path via 172.16.2.1 reports an RD of 3072, which is equal to the FD rather than lower than it, so strictly speaking it does not pass the feasibility check - a path only counts as a feasible successor when its RD is lower than the FD.

A feasible successor acts as a backup route for the network in case the current best route, directly through R1, should fail - effectively enabling EIGRP to fail over to the backup route without having to mark the route as active and send out queries to its adjacent neighbors.

This feature also allows EIGRP to do unequal cost load distribution. It can do this because it knows of a loop-free path to the destination with a higher cost than the best path. So, by setting the variance under the routing process, you can influence how much the cost of the best path and the lesser path may vary for them to be used in unequal load distribution. Again, I don't think unequal cost load distribution is a desired feature in most networks - otherwise it would have been a feature of some other routing protocol by now.
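How the variance interacts with the feasibility check can be sketched in Python like this. The next-hop addresses and metrics are hypothetical, and the exact comparison at the boundary (lower than versus lower than or equal to variance times the best metric) is an assumption on my part, so the example stays away from that edge.

```python
def usable_paths(paths, feasible_distance, variance=1):
    """Pick the next hops EIGRP could install for one route (simplified sketch).

    paths: list of (next_hop, computed_distance, reported_distance) tuples.
    """
    best = min(cd for _, cd, _ in paths)
    chosen = []
    for next_hop, cd, rd in paths:
        is_successor = cd == best
        feasible = rd < feasible_distance     # loop-free check comes first
        within_variance = cd < variance * best
        if is_successor or (feasible and within_variance):
            chosen.append(next_hop)
    return chosen


# Hypothetical topology table entries for one prefix, FD = 3072.
paths = [
    ("192.0.2.1", 3072, 2816),  # successor
    ("192.0.2.5", 4096, 2900),  # feasible successor, worse metric
    ("192.0.2.9", 3328, 3072),  # RD equals FD, not feasible
]

print(usable_paths(paths, 3072))              # ['192.0.2.1']
print(usable_paths(paths, 3072, variance=2))  # ['192.0.2.1', '192.0.2.5']
```

Notice that 192.0.2.9 is never used even though its metric is closer to the best one, because the variance only widens the choice among paths that are already known to be loop-free.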

Forming adjacencies in EIGRP with primary addresses on different subnets

In EIGRP it is best to only use the network statement for the primary address of an interface if adjacencies are required to form properly on that link - the primary address being the one configured on an interface without the secondary keyword at the end.

What you can do in EIGRP is enable it using a network command that matches a secondary address of an interface, but that will only make the process run on that interface - the process will still use the primary address to form adjacencies with any peers. Below is the configuration of two routers, R1 and R2. They are connected through a switch on interface GigabitEthernet1.

Router R1 configuration
R1(config)#interface GigabitEthernet1
R1(config-if)#ip address 10.0.0.1 255.255.255.0
R1(config-if)#ip address 172.16.0.1 255.255.255.0 secondary
R1(config-if)#no shutdown
R1(config-if)#exit
R1(config)#router eigrp 1
R1(config-router)#no auto-summary
R1(config-router)#network 10.0.0.1 0.0.0.0
Router R2 configuration
R2(config)#interface GigabitEthernet1
R2(config-if)#ip address 172.16.0.2 255.255.255.0
R2(config-if)#ip address 10.0.0.2 255.255.255.0 secondary
R2(config-if)#no shutdown
R2(config-if)#exit
R2(config)#router eigrp 1
R2(config-router)#no auto-summary
R2(config-router)#network 10.0.0.2 0.0.0.0
The output of a show ip eigrp neighbors command on both routers shows some confusing information.

EIGRP adjacency table on R1

EIGRP adjacency table on R2

R1 shows a neighbor of 172.16.0.2 on interface Gi1 and R2 shows a neighbor of 10.0.0.1 on interface Gi1.

The addresses that form the neighborship are the primary addresses on the interfaces, where EIGRP is enabled. On R1 the primary address is 10.0.0.1 and on R2 the primary address is 172.16.0.2. The primary address is the address that is used for EIGRP messages regardless of which secondary address is used to enable the EIGRP process on an interface.

I want to point out here, that I actually expected some error messages to pop up and adjacencies to fail between the two routers, but it appears that these routers running IOS 15.4 (they're actually CSR1000v routers) are more capable than the routers I ran this type of scenario on back when I was studying for my CCNP ROUTE exam.

Nonetheless, this type of configuration will most likely not work on routers running older versions of IOS and in the case here, where the adjacencies do form, it gives a somewhat confusing output from the show ip eigrp neighbors command and also in the next-hop addresses of the routing tables.

So, in short, be sure to use the same common subnet for the primary addresses of EIGRP enabled interfaces - mismatched primary addresses may work, but at best they give confusing output when the neighbor that forms is on a subnet that is not defined in a network command under the router process.

EIGRP Overview

Back when I took my CCNA and subsequently my CCNP, I really liked working with EIGRP. I found it easier to configure and understand than OSPF, but still with a lot more stability and features than RIPv2. I haven't ever come across an installation where it was used in the real world, though, so my experience with EIGRP is only from the lab and a short period where I used it for my own routing protocol at home (with a DMVPN connection to a few other routers over the big web).

Cisco opened up the protocol to the public back in 2013 - but I haven't heard of any vendor supporting it yet. It was also not released in full - Cisco keeps all the fancy features locked up tight. I hope the protocol will gain some traction, but I do not think it is going to happen any time soon.

Quick sidenote: EIGRP got a "facelift" in IOS release 15.0(1)M, which introduced a new CLI structure for configuring EIGRP parameters. This new method was called "Named Mode" and the previous method of configuring EIGRP was retroactively renamed Classic Mode. The new Named Mode collects all the configuration elements of EIGRP under the process configuration - no more EIGRP interface sub-commands and stuff like that. I will have a separate post about the new Named Mode soon.

This is to be a (somewhat) quick overview of Cisco's routing protocol the Enhanced Interior Gateway Routing Protocol (EIGRP).
  • Classless distance vector routing protocol (sometimes referred to as a hybrid routing protocol)
  • Cisco proprietary
  • IETF Draft draft-savage-eigrp-02
  • Uses IP protocol 88
  • Sends Hello messages
    • Used to form neighbor adjacencies
    • Used as a keepalive between neighbors
    • Default Hello interval is 5 seconds
      • Default on slow (1544kbps and slower) NBMA link is 60 seconds
    • Hello messages are sent unreliably
  • Uses a Holddown timer
    • Default Holddown timer is set to 15 seconds
      • Default on slow NBMA link is 180 seconds
  • Sends partial and full updates
    • Updates are triggered
    • Uses reliable transport protocol (RTP)
  • Uses multicast address 224.0.0.10 for IPv4 and FF02::A for IPv6
    • Retransmissions are sent to each neighbor's unicast address
  • Default administrative distance
    • Internal: 90
    • External: 170
  • Uses a composite metric
    • Defaults to using bandwidth and delay to determine the best path
    • The composite metric can be weighted by tuning the K values 1 through 5
    • The K values must match on all routers
  • Supports a maximum hop count of 255 with the default set to 100
    • The hop count is mainly used as a loop-prevention mechanism
  • EIGRP defaults to using a maximum of 50% of the bandwidth on a link for exchanging hello and updates
    • This can be tuned using the interface level sub-command ip bandwidth-percent eigrp <as#> <percent>
  • Supports authentication using MD5 (SHA is supported when using Named Mode)
  • Supports route tags
  • Supports next-hop advertisement
  • Supports manual route summarization in any arbitrary point in the network
  • Supports IPv4 and IPv6
  • Supports unequal cost load-sharing
  • Supports split-horizon with poison reverse
  • Uses Diffusing Update Algorithm (DUAL) to control diffusing computations of the topology
Things that have to match for adjacencies to form in EIGRP:
  • Authentication (if used)
  • K values
  • Autonomous System (AS) number
  • Primary addresses on interfaces configured in the same common subnet
The last item warrants a little more explanation and I have made a post that goes into more detail regarding this point: Forming adjacencies in EIGRP with primary addresses on different subnets.

The EIGRP composite metric is calculated using these five K values:
  1. Bandwidth
  2. Load
  3. Delay
  4. Reliability
  5. Maximum Transmission Unit (MTU)
By default, EIGRP uses only K values 1 and 3. This means that bandwidth and delay are the only values used in the composite metric. When manually tweaking EIGRP metrics it is recommended only to use the delay because the bandwidth is also used by other features such as QoS - whereas delay is only used by EIGRP.

EIGRP uses passive to show a route as stable and active to show a route that is in trouble - meaning a route that has been lost and it is now actively trying to find a new path to the network.

When a router loses reachability to a network it will send out queries to its adjacent neighbors to see if they have a path to the lost network. When this happens, the route is marked as active until replies are heard back from all the neighbors queried or the active timer runs out.
If a reply is not heard within 90 seconds, the local router will send an SIA-query (SIA meaning stuck-in-active) in an attempt to ascertain the reason for the missing reply - or more specifically, is the neighbor still working on the query request or did it not receive the initial query at all. Failure to respond to the SIA-query will result in the local router deleting routes through the non-responsive neighbor and resetting the adjacency. If the neighbor responds to the SIA-query, the active timer will be reset and another SIA-query will be sent again at half the active timer (90 seconds). This allows for an extension of the active timer if the reason behind the slowdown is the neighbors waiting for the active process to complete. A maximum of 3 SIA-queries will be sent before the neighbor adjacency will be reset.

EIGRP supports a graceful shutdown function, where the router sends a hello packet to its neighbor with all the K values set to 255. This happens when an interface running EIGRP is shut down or the EIGRP process itself is shut down. It enables the router to signal its neighbors to terminate the adjacency and allows the neighbors to immediately start finding an alternate route to the networks advertised by the router shutting down, instead of having to wait for the holddown timer to expire.

Below are a few nifty show commands for EIGRP.
The command show ip eigrp traffic gives statistics on packets sent and received for the EIGRP process.
The command show ip eigrp neighbors gives a view of the EIGRP adjacency table.
The command show ip route eigrp shows the EIGRP routes currently installed into the routing table.
The command show ip eigrp timers gives a view of the current hello and holddown timers for the EIGRP enabled interfaces.
The command show ip eigrp topology will show the EIGRP topology and will also display the EIGRP process router-id. With the keyword all-links it is possible to view the entire topology as advertised by neighbors (including feasible successor links).

Sunday, November 2, 2014

RIPng Overview

So, the eggheads of the networking industry couldn't bear an IP protocol without RIP and so we get RIPng for our IPv6 networks. The "ng" stands for Next Generation, but it is sometimes referred to as RIPv6 or IPv6 RIP.

Below are listed some of the facts of RIPng. The summary is based on the defaults of Cisco's implementation of RIPng and there are only a few differences compared to RIPv2 for IPv4.

  • Defined in RFC 2080
  • Runs on port UDP/521 (not 520 to avoid clashing with IPv4 RIP configurations)
  • Sends updates to multicast address FF02::9
  • Metric is still based on hop count with 15 being the maximum and 16 being infinity (unreachable)
    • Unlike RIPv2, the sending router does not increment the hop count in advertised routes before sending it out to its neighbors. Instead, it does the most logical thing and advertises what it has in its routing table. The receiving router is responsible for incrementing the entries before entering it into its own routing table
  • Default administrative distance is 120
  • It is driven mainly by timers
  • Update timer: 30 sec.
    • sends out the entire routing table every 30 sec. on RIP enabled interfaces (routes affected by the split horizon rule are excluded from the update)
    • Triggered updates occur when a route change occurs and an update, including only the changed route, is sent out. Regular updates are unaffected by this and are still sent per the update timer interval
    • Cisco uses a jitter variable to avoid update synchronization, just like in RIPv2, but I am unclear on the specific details - I can only assume they function the same way.
  • Expiration timer: 180 sec.
    • Similar to RIPv2 invalid timer. The expiration timer tracks the validity of a specific route. It resets to 0 whenever a route is received in an update and a route is considered invalid if the route is not received within 180 seconds.
    • After the expiration timer expires the route is advertised with a metric of 16 (unreachable) until it is purged from the routing table
  • Holddown timer: 0 sec.
    • Cisco defaults to not using the holddown timer in RIPng
  • Garbage collection timer: 120 sec.
    • Unlike RIPv1 and v2, the RIPng garbage collection timer starts counting after the specific route's expiration timer is exceeded
    • The route is advertised with a metric of 16 (unreachable) for 120 seconds - after which the route is purged from the routing table
  • An update message can contain as many entries as the MTU size allows (unlike RIPv1 and v2, which only allows for 25 entries per update message)
  • Does not natively support authentication
    • It uses IPv6 built-in authentication features (I will find the time to do a post on that on a later date)
  • RIPng is able to tag routes being redistributed into the routing process
  • Unlike RIPv1 and v2, RIPng supports multiple instances running on the same router
    • Cisco uses named instances, where instance names are locally significant and do not have to match between routers
      • Use the global configuration command ipv6 router rip <instance name> to enter the general process configuration mode
      • Use the interface sub-command ipv6 rip <instance name> enable to enable RIPng on a specific interface
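To make the timer behaviour above a bit more tangible, here is a small Python sketch of how a single RIPng route ages with the Cisco defaults listed above. It is only my own illustration of the bullets, not how IOS implements it; the function names are made up, and I am assuming the boundaries fall exactly on 180 and 300 seconds.

```python
# Cisco RIPng defaults from the list above (seconds).
EXPIRATION = 180
GARBAGE_COLLECTION = 120
INFINITY = 16  # metric meaning "unreachable"


def ripng_route_state(seconds_since_update, metric):
    """Return (state, advertised_metric) for a route last refreshed N seconds ago."""
    if seconds_since_update < EXPIRATION:
        return ("valid", metric)
    if seconds_since_update < EXPIRATION + GARBAGE_COLLECTION:
        # Expired: still in the table, but advertised as unreachable.
        return ("expired", INFINITY)
    # Garbage collection timer ran out: the route is purged.
    return ("purged", None)


def received_metric(advertised_metric):
    """The receiver adds one hop to what the sender advertised, capped at 16."""
    return min(advertised_metric + 1, INFINITY)


if __name__ == "__main__":
    for age in (0, 179, 180, 299, 300):
        print(age, ripng_route_state(age, metric=2))
    print("metric 1 from neighbor is installed as", received_metric(1))
```

With the holddown timer at 0 there is no extra state in between: a route is either valid, expired (advertised with 16) or gone.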
Below is a packet capture of a RIPng update message (also known as a response) sent from the IPv6 link-local address of FE80::13:3 to the IPv6 multicast address of FF02::9.

RIPng response message, including two prefixes
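The point about an update holding as many entries as the MTU allows can be worked out with a bit of arithmetic. The Python sketch below uses the sizes from RFC 2080 (4 byte RIPng header, 20 byte route table entry) and assumes a plain IPv6 header with no extension headers; the function name is my own.

```python
IPV6_HEADER = 40   # fixed IPv6 header, no extension headers assumed
UDP_HEADER = 8
RIPNG_HEADER = 4   # command, version and two reserved bytes
RTE_SIZE = 20      # prefix (16) + route tag (2) + prefix length (1) + metric (1)


def max_ripng_entries(mtu):
    """Number of route table entries that fit into one RIPng update."""
    payload = mtu - IPV6_HEADER - UDP_HEADER - RIPNG_HEADER
    return max(payload // RTE_SIZE, 0)


if __name__ == "__main__":
    for mtu in (1280, 1500):
        print(f"MTU {mtu}: {max_ripng_entries(mtu)} entries per update")
```

So on a standard 1500 byte Ethernet MTU a single update can carry 72 entries, compared to the fixed 25 of RIPv1 and v2.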

Saturday, November 1, 2014

RIPv2 Overview

RIPv2 is a simple routing protocol - at least when compared to other major routing protocols like OSPF and BGP. The pros of using RIPv2 in a network are the ease of configuration and maintenance of the routes, and the network traffic generated by the protocol is somewhat less than that of other routing protocols. Also, most vendor equipment supports RIPv2, so it is a fairly common protocol to come by in smaller routing environments. The cons, however, are the protocol's slow convergence time and lack of scalability.

Below are listed some of the quick hard facts of RIPv2. The below summary is based on the defaults of Cisco's implementation of RIPv2 - timers and the likes can vary from vendor to vendor.

  • Defined in RFC 2453
  • Supports classless routing operation (unlike RIPv1 which is only classful)
  • Runs on port UDP/520
  • Sends updates to multicast address 224.0.0.9
    • can be changed to unicast with the neighbor <ip address> command under the routing process
    • can be changed to broadcast with the ip rip v2-broadcast command under a given interface
  • Metric is based on hop count with 15 being the maximum and 16 being infinity (unreachable)
    • The sending router increments the hop count in advertised routes before sending them out to its neighbors. Thus, receiving routers do not increment the metric, but enter it directly into their routing tables
  • Default administrative distance is 120
  • It is driven mainly by timers
  • Update timer: 30 sec.
    • sends out the entire routing table every 30 sec. on RIP enabled interfaces (routes affected by the split horizon rule are excluded from the update)
    • Triggered updates occur when a route change occurs and an update, including only the changed route, is sent out. Regular updates are unaffected by this and are still sent per the update timer interval
    • Cisco implements a RIP_JITTER variable that randomly subtracts 0-15% from the 30 second timer to avoid the synchronization of routing updates with its neighbors. This changes the effective update interval to between 25.5 and 30 seconds
  • Invalid timer: 180 sec.
    • The invalid timer tracks the validity of a specific route. It resets to 0 whenever a route is received in an update and a route is considered invalid after 180 sec.
    • After the invalid timer expires the route is advertised with a metric of 16 (unreachable) until it is purged from the routing table
  • Holddown timer: 180 sec.
    • After the invalid timer expires the holddown timer starts counting. Even if a valid route is received in an update, it will not be entered into the routing table until the holddown timer expires or the route is flushed from the routing table (either manually or by the timer)
  • Flush after timer: 240 sec.
    • The flush after timer (or garbage collection timer) starts counting after the last routing update is received - effectively counting alongside the invalid timer and subsequently the holddown timer
    • By default, 60 seconds after the invalid timer expires (and only 60 seconds into the holddown timer) the flush after timer runs out and the route is purged from the routing table
  • No neighbor adjacency is formed and no hello packets are sent
  • An update message can contain up to 25 entries
  • Supports authentication using plain-text or MD5 hashing
  • RIPv2 is able to tag routes being redistributed into the routing process
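The way the invalid, holddown and flush timers overlap is easier to see with numbers. Below is a small Python sketch based purely on the Cisco defaults listed above; the function names are made up for the illustration, and the 15% jitter figure is the one quoted in the list.

```python
# Cisco RIPv2 defaults from the list above (seconds).
UPDATE = 30
INVALID = 180
HOLDDOWN = 180
FLUSH = 240


def ripv2_route_state(seconds_since_update):
    """State of a route that was last refreshed N seconds ago."""
    if seconds_since_update < INVALID:
        return "valid"
    if seconds_since_update < FLUSH:
        # Advertised with metric 16; new information is ignored during holddown.
        return "invalid, in holddown"
    return "flushed"


def holddown_seconds_used():
    """Holddown starts when the invalid timer expires; the flush timer cuts it short."""
    return min(HOLDDOWN, FLUSH - INVALID)


def jittered_update_range(update=UPDATE, max_jitter_percent=15):
    """Shortest and longest effective update interval with the jitter applied."""
    return (update - update * max_jitter_percent / 100, update)


if __name__ == "__main__":
    for age in (0, 179, 180, 239, 240):
        print(age, ripv2_route_state(age))
    print("holddown effectively lasts", holddown_seconds_used(), "seconds")
    print("update interval range:", jittered_update_range())
```

With the defaults, the holddown timer never gets to run its full 180 seconds, because the route is flushed 60 seconds into it.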
Below is a packet capture of a RIPv2 update (also known as a response) sent from 10.0.0.2 to the multicast address 224.0.0.9. Worth noting is that the packet is filled with 25 network entries, all with a metric of 1 except for the first entry, which is advertised with a metric of 16 due to a simulated network failure (interface shutdown) on the source router 10.0.0.2. Not shown in the screen dump is the subsequent packet sent from the router 10.0.0.2, which includes two additional subnets.

RIPv2 response message, including 25 entries
If we look deeper into one of the entries we can see that the entry includes a subnet mask (which is not included in RIPv1) making it possible to advertise variable length subnet masks (VLSM) - making it a classless routing protocol. Also, note the next-hop being set to all 0's, which means that the next-hop address is assumed to be the same as the source of the update (10.0.0.2).

RIPv2 response message entry
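For those who like to see the bytes, here is a Python sketch that decodes one 20 byte RIPv2 route entry as laid out in RFC 2453 (address family, route tag, IP address, subnet mask, next hop, metric). The sample entry is hypothetical and not taken from the capture above; the function name is my own.

```python
import ipaddress
import struct

RIP_ENTRY_FORMAT = "!HH4s4s4sI"  # AFI, route tag, address, mask, next hop, metric


def decode_rip_entry(entry, source):
    """Decode one RIPv2 route entry; 'source' is the sender of the update."""
    afi, tag, addr, mask, next_hop, metric = struct.unpack(RIP_ENTRY_FORMAT, entry)
    network = ipaddress.ip_network(
        f"{ipaddress.ip_address(addr)}/{ipaddress.ip_address(mask)}"
    )
    hop = ipaddress.ip_address(next_hop)
    if int(hop) == 0:
        # Next hop 0.0.0.0 means "use the source of the update".
        hop = ipaddress.ip_address(source)
    return {
        "afi": afi,
        "tag": tag,
        "network": str(network),
        "next_hop": str(hop),
        "metric": metric,
    }


if __name__ == "__main__":
    # Hypothetical entry: 10.0.1.0/255.255.255.0, next hop 0.0.0.0, metric 1.
    sample = struct.pack(
        RIP_ENTRY_FORMAT, 2, 0, bytes([10, 0, 1, 0]), bytes([255, 255, 255, 0]), bytes(4), 1
    )
    print(decode_rip_entry(sample, source="10.0.0.2"))
```

The subnet mask field is what RIPv1 lacks; without it the receiver would have to guess the mask from the address class.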
A short explanation of split-horizon is warranted. Split-horizon simply states that a router should not advertise a route out of the same interface that it learned it from. This is to prevent routing loops from occurring, but it can in some cases cause issues with routes not being sent to valid neighbors on DMVPN interfaces for example.
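The rule itself is simple enough to sketch in a few lines of Python. The routing table, prefixes and interface names below are made up; the point is only to show why a DMVPN hub with split-horizon enabled does not pass one spoke's routes on to the other spokes, since they are all reached through the same tunnel interface.

```python
def routes_to_advertise(routing_table, out_interface, split_horizon=True):
    """routing_table is a list of (prefix, learned_on_interface) tuples."""
    return [
        prefix
        for prefix, learned_on in routing_table
        if not (split_horizon and learned_on == out_interface)
    ]


if __name__ == "__main__":
    # Hypothetical hub router: two spoke prefixes learned over the same tunnel.
    hub_table = [
        ("192.168.1.0/24", "Tunnel0"),        # from spoke 1
        ("192.168.2.0/24", "Tunnel0"),        # from spoke 2
        ("10.0.0.0/24", "FastEthernet0/0"),   # learned on the LAN side
    ]
    print("split-horizon on: ", routes_to_advertise(hub_table, "Tunnel0"))
    print("split-horizon off:", routes_to_advertise(hub_table, "Tunnel0", split_horizon=False))
```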

Basic Multi Layer Switch (MLS) configuration

Okay, so a fairly common thing would be to have a multi layer switch in your network - at least when your network becomes larger than what can usually be plugged into a switch or two.

It may look something like this: some beefy core MLS that can push packets real quick, some less beefy, but still awesome, distribution MLS and finally some relatively inexpensive layer 2 switches to connect your clients, printer, access points, servers and whatever else you may have need of connecting to the network.

Core-Distribution-Access Diagram
For the sake of simplicity we will just assume the access layer switches are configured and function only at layer 2. In this example we will focus solely on the distribution switches/routers (MLS).

Let's start out by getting the layer 2 functions working. We configure our VTP mode to be transparent and then create 4 VLANs (10, 20, 30 and 40).
DIST-SW-01(config)#vtp mode transparent
Setting device to VTP Transparent mode for VLANS.
DIST-SW-01(config)#vlan 10,20,30,40
The same configuration is done on DIST-SW-02, but not shown here to keep this example somewhat brief and manageable.

Then we create the layer 2 port-channel between the two distribution switches and configure it as a trunk port for all VLANs.
DIST-SW-01(config)#interface range FastEthernet0/23-24
DIST-SW-01(config-if-range)#switchport trunk encapsulation dot1q
DIST-SW-01(config-if-range)#switchport mode trunk
DIST-SW-01(config-if-range)#switchport nonegotiate
DIST-SW-01(config-if-range)#channel-group 1 mode active
Before we configure any of the layer 3 features we will need to enable IP routing on the two distribution switches. You can enable IP interfaces and configure redundancy protocols, but what you will experience is a complete lack of routing ability if this command is absent.
DIST-SW-01(config)#ip routing
DIST-SW-02(config)#ip routing
Secondly, we will configure a first hop redundancy protocol (FHRP) for the clients connected to the access switches to use as their default gateways. We will make distribution switch 1 the active forwarder for odd-numbered VLANs and distribution switch 2 the active forwarder for even-numbered VLANs.

Distribution switches with VLANS and HSRP configured
Configuration on Distribution Switch 1
interface Vlan10
 ip address 10.0.10.1 255.255.255.0
 standby 10 ip 10.0.10.254
 standby 10 priority 110
 standby 10 name VLAN_10
!
interface Vlan20
 ip address 10.0.20.1 255.255.255.0
 standby 20 ip 10.0.20.254
 standby 20 name VLAN_20
!
interface Vlan30
 ip address 10.0.30.1 255.255.255.0
 standby 30 ip 10.0.30.254
 standby 30 priority 110
 standby 30 name VLAN_30
!
interface Vlan40
 ip address 10.0.40.1 255.255.255.0
 standby 40 ip 10.0.40.254
 standby 40 name VLAN_40
Configuration on Distribution Switch 2
interface vlan 10
 ip address 10.0.10.2 255.255.255.0
 standby 10 ip 10.0.10.254
 standby 10 name VLAN_10
!
interface vlan 20
 ip address 10.0.20.2 255.255.255.0
 standby 20 priority 110
 standby 20 preempt
 standby 20 ip 10.0.20.254
 standby 20 name VLAN_20
!
interface vlan 30
 ip address 10.0.30.2 255.255.255.0
 standby 30 ip 10.0.30.254
 standby 30 name VLAN_30
!
interface vlan 40
 ip address 10.0.40.2 255.255.255.0
 standby 40 priority 110
 standby 40 preempt
 standby 40 ip 10.0.40.254
 standby 40 name VLAN_40
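To double-check which switch should end up active for each group, here is a small Python sketch of the HSRP election rule (highest priority wins, highest interface IP address breaks a tie, default priority is 100). The priorities and addresses are taken from the two configurations above; the function name is my own. Note that the Distribution Switch 1 configuration above does not include preempt, so the result assumes both switches come up at the same time.

```python
import ipaddress

DEFAULT_PRIORITY = 100  # used when no "standby <group> priority" is configured

# (priority or None, SVI address) per switch, from the configurations above.
GROUPS = {
    10: {"DIST-SW-01": (110, "10.0.10.1"), "DIST-SW-02": (None, "10.0.10.2")},
    20: {"DIST-SW-01": (None, "10.0.20.1"), "DIST-SW-02": (110, "10.0.20.2")},
    30: {"DIST-SW-01": (110, "10.0.30.1"), "DIST-SW-02": (None, "10.0.30.2")},
    40: {"DIST-SW-01": (None, "10.0.40.1"), "DIST-SW-02": (110, "10.0.40.2")},
}


def expected_active(routers):
    """Return the name of the router that should win the election."""
    def rank(name):
        priority, ip = routers[name]
        if priority is None:
            priority = DEFAULT_PRIORITY
        return (priority, int(ipaddress.ip_address(ip)))
    return max(routers, key=rank)


if __name__ == "__main__":
    for group, routers in sorted(GROUPS.items()):
        print(f"HSRP group {group}: expected active is {expected_active(routers)}")
```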
Because we use FHRP with the default gateway set differently for the odd- and even-numbered VLANs, we want to make sure that the spanning-tree configuration chooses the matching root bridge for those VLANs as well - making Distribution Switch 1 the root for odd-numbered VLANs and Distribution Switch 2 the root for even-numbered VLANs.
DIST-SW-01(config)#spanning-tree vlan 1,10,30 priority 4096
DIST-SW-01(config)#spanning-tree vlan 20,40 priority 8192
DIST-SW-02(config)#spanning-tree vlan 1,10,30 priority 8192
DIST-SW-02(config)#spanning-tree vlan 20,40 priority 4096
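The root election that these priorities influence can be sketched in Python as well: the bridge with the lowest bridge ID (priority first, then MAC address) becomes root for the VLAN. The priorities are the ones configured above; the MAC addresses are made up, and I am assuming per-VLAN spanning tree with the extended system ID, where the VLAN number is added to the configured priority.

```python
# Priorities from the spanning-tree commands above; MAC addresses are hypothetical.
BRIDGES = {
    "DIST-SW-01": {
        "mac": "0011.2233.4401",
        "priority": {10: 4096, 20: 8192, 30: 4096, 40: 8192},
    },
    "DIST-SW-02": {
        "mac": "0011.2233.4402",
        "priority": {10: 8192, 20: 4096, 30: 8192, 40: 4096},
    },
}


def bridge_id(name, vlan):
    """Bridge ID as a (priority, mac) tuple; the lowest tuple wins the election."""
    bridge = BRIDGES[name]
    # Extended system ID: the VLAN number is added to the configured priority.
    priority = bridge["priority"][vlan] + vlan
    mac = int(bridge["mac"].replace(".", ""), 16)
    return (priority, mac)


def root_bridge(vlan):
    return min(BRIDGES, key=lambda name: bridge_id(name, vlan))


if __name__ == "__main__":
    for vlan in (10, 20, 30, 40):
        print(f"VLAN {vlan}: root is {root_bridge(vlan)}, bridge ID {bridge_id(root_bridge(vlan), vlan)}")
```

The result lines up with the HSRP design: the switch that is meant to be the active gateway for a VLAN is also the spanning-tree root for that VLAN.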
Now we will configure the trunk ports from the distribution layer to the access layer. Again, we assume that the access switches are already configured appropriately for this scenario.
DIST-SW-01(config)#interface range fa0/19 , fa0/21
DIST-SW-01(config-if-range)#switchport trunk encapsulation dot1q
DIST-SW-01(config-if-range)#switchport mode trunk
DIST-SW-01(config-if-range)#switchport nonegotiate
The same commands are issued on DIST-SW-02.

Verify the configuration by examining the output of the commands shown in the sections below:

VTP and VLAN configuration
DIST-SW-01#show vtp status
DIST-SW-01#show vlan brief
EtherChannel (port-channel) configuration
DIST-SW-01#show etherchannel summary
DIST-SW-01#show etherchannel 1 detail
HSRP (standby) configuration
DIST-SW-01#show standby brief
DIST-SW-01#show standby vlan [10 | 20 | 30 | 40]
Spanning-tree configuration
DIST-SW-01#show spanning-tree vlan [10 | 20 | 30 | 40]
DIST-SW-01#show spanning-tree root
Switchport trunk configuration
DIST-SW-01#show interfaces trunk
DIST-SW-01#show interfaces [port-channel 1 | fa0/19 | fa0/21] trunk
DIST-SW-01#show interfaces [port-channel 1 | fa0/19 | fa0/21] switchport

Now we should be able to reach our FHRP default gateways from a client connected to VLAN 10, 20, 30 or 40 on the access switches. Only VLANs 10 and 20 are shown here, as they should show connectivity through DIST-SW-01 and DIST-SW-02 respectively. Notice that the trace to 8.8.8.8 fails at 10.0.100.10 because that router doesn't have a route towards the destination - what matters here is that the trace goes through 10.0.10.1 and 10.0.20.2 even though the default gateway is set to .254.

Trace and ARP table on VLAN 10

Trace and ARP table on VLAN 20

Note: on some Catalyst multi layer switches, like the Catalyst 3560 used in this example, some commands may be unavailable if the Switch Database Management (SDM) template is configured to not support the configuration you are attempting.

To troubleshoot issues like these you must first verify that the commands you are trying to configure are supported by referring to the documentation for the specific platform. Secondly, make sure the image and licensing are correct. The command show version gives you information about the platform and the image you are on, and some information on the licensing (only on some platforms/IOS versions).

If the image and licensing are in order, but you still cannot input the desired commands, you may be using the incorrect SDM template. Check the currently used template with this command:
DIST-SW-01#show sdm prefer
 The current template is "desktop default" template.
 The selected template optimizes the resources in
 the switch to support this level of features for
 8 routed interfaces and 1024 VLANs.
  number of unicast mac addresses:                  6K
  number of IPv4 IGMP groups + multicast routes:    1K
  number of IPv4 unicast routes:                    8K
  number of directly-connected IPv4 hosts:          6K
  number of indirect IPv4 routes:                   2K
  number of IPv4 policy based routing aces:         0
  number of IPv4/MAC qos aces:                      0.5K
  number of IPv4/MAC security aces:                 1K
This will display some of the maximums of the current SDM template in use. With this specific template I am unable to configure any policy based routing aces - meaning I cannot configure policy based routing (PBR).
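If you have a number of switches to check, the same output can be parsed with a few lines of Python. This is just a sketch that works on output formatted like the one shown above; the function names are my own, and the sample text is a shortened copy of that output.

```python
import re

SAMPLE_OUTPUT = """\
 The current template is "desktop default" template.
  number of unicast mac addresses:                  6K
  number of IPv4 unicast routes:                    8K
  number of IPv4 policy based routing aces:         0
  number of IPv4/MAC qos aces:                      0.5K
"""


def parse_sdm_limits(output):
    """Return a dict mapping each 'number of ...' label to its value as a string."""
    limits = {}
    for line in output.splitlines():
        match = re.match(r"\s*number of (.+?):\s+(\S+)\s*$", line)
        if match:
            limits[match.group(1)] = match.group(2)
    return limits


def pbr_supported(output):
    """PBR needs a non-zero number of policy based routing aces in the template."""
    limits = parse_sdm_limits(output)
    return limits.get("IPv4 policy based routing aces", "0") != "0"


if __name__ == "__main__":
    print(parse_sdm_limits(SAMPLE_OUTPUT))
    print("PBR possible with this template:", pbr_supported(SAMPLE_OUTPUT))
```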

If I had to do PBR on this MLS I would have to change the SDM template. The below command shows how that would be done - bear in mind that you cannot fine tune anything in the SDM templates; they come pre-configured.
DIST-SW-01(config)#sdm prefer ?
  access              Access bias
  default             Default bias
  dual-ipv4-and-ipv6  Support both IPv4 and IPv6
  routing             Unicast bias
  vlan                VLAN bias
DIST-SW-01(config)#sdm prefer routing
Also, the switch will need to be reloaded for the new template to take effect.