IP Routing and Switching: Maximum Transmission Unit (MTU)

This week a colleague asked me to explain a Maximum Transmission Unit (MTU) setting at a customer's site. The customer had told him that he had to set the MTU to 1472 on an endpoint device in order to accommodate Dot1Q VLAN tagging in the network (which I doubt he actually needed to do, but sometimes you need to choose your battles when explaining network stuff to people who are not well-versed in it).

As I was explaining what the MTU size means, how it works, and when you would and would not want to change it, I discovered that I didn't have all the answers he was after. That was mainly because I lacked the specifics of the setup we were discussing, but also because I didn't remember all of the IP and Ethernet header sizes. I knew the theory, but it is hard to convey without specific numbers, and I feel I shouldn't be struggling with those as a CCIE candidate. So I sat down and looked it all up, threw some pings and other traffic around in the lab, and captured some packets. Now I am writing it down so that I gain an even better grasp of these concepts myself, and so that next time I can just send someone a link to this post instead of explaining it all again.

Alright, on with it!

Maximum Transmission Unit, or MTU, defines the largest packet that can be sent over a link without being fragmented. On Ethernet this is the payload of the frame (the IP packet), not the frame itself, and in most cases it defaults to 1500 bytes. Testing the MTU is usually done using ICMP ping requests with a packet size set somewhere between 1300 and 1500 (depending on what you want to test) and the DF-bit set. The DF-bit (Don't Fragment) means that the packet must not be fragmented. If it reaches a point on the path from A to B that would require fragmentation, the packet is dropped and an ICMP message is returned stating that fragmentation was needed, but the DF-bit was set. This reply is generated by the device that would otherwise have fragmented the packet. When testing the MTU size, you should know how the OS you are pinging from interprets the ping size.
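To make the DF-bit behaviour concrete, here is a minimal sketch that models a path as a list of hops, each with its own MTU, and reports which hop would send back the "fragmentation needed" message. The hop names and MTU values are made up for illustration and are not taken from the lab.

```python
from typing import Optional, Sequence, Tuple

def first_blocking_hop(packet_size: int,
                       path: Sequence[Tuple[str, int]]) -> Optional[str]:
    """Return the first hop whose MTU is smaller than the packet.

    With the DF-bit set, that hop drops the packet and answers with an
    ICMP "fragmentation needed" message. None means the packet fits
    through every hop on the path.
    """
    for name, mtu in path:
        if packet_size > mtu:
            return name
    return None

# Hypothetical path: names and MTU values are assumptions for illustration.
path = [("R1", 1500), ("tunnel", 1400), ("R3", 1500)]

print(first_blocking_hop(1500, path))  # tunnel
print(first_blocking_hop(1400, path))  # None
```

Lowering the test size until the function returns None is the same thing you do by hand when you step a ping size down to find the path MTU.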

Let's do a sample with Windows 7 pinging its local gateway, which is set with an IP MTU of 1500 on its interface.

R1#show ip interface GigabitEthernet1
GigabitEthernet1 is up, line protocol is up
Internet address is 172.16.0.1/24
<output omitted>
MTU is 1500 bytes
<output omitted>

Let's try a ping size of 1500 with the DF-bit set and see what happens.

[Screenshot: Win7 ping size 1500 w/ df-bit set]

Okay, that didn't work - but why? We asked for a packet length of 1500, and that is what is allowed on the router interface. So why would the packet need to be fragmented? Simply put, the packet is too large for the host to send out of its interface without fragmenting it. The reason is that Windows 7 takes -l 1500 as meaning 1500 bytes of payload and then adds the various headers on top (we'll get to the headers in a second). This makes the packet too big to be transmitted without fragmentation.

Now, let's try lowering the length of the packet to 1472 and see if we can get a packet through.

[Screenshot: Win7 ping size 1472 w/ df-bit set]

So, the packet was sent and a reply was received. Now, why can we only send with 1472, even though the client and the network are configured for 1500 bytes?

If we were to replicate this on a router, pinging another device with the size set to 1500, it would work, as seen in the output below, where router R2 pings R1 on the same subnet (172.16.0.0/24).

R2#ping 172.16.0.1 size 1500 df-bit repeat 1
Type escape sequence to abort.
Sending 1, 1500-byte ICMP Echos to 172.16.0.1, timeout is 2 seconds:
Packet sent with the DF bit set
!
Success rate is 100 percent (1/1), round-trip min/avg/max = 4/4/4 ms

The simple explanation here is that the packet length is interpreted differently by a router and a Windows 7 machine. The router takes the 1500-byte value as the size of the complete IP packet, headers included (but excluding the Ethernet encapsulation), whereas the Windows machine takes it as 1500 bytes of payload, on top of which come the layer 3 encapsulation and subsequently the layer 2 encapsulation.
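The difference between the two interpretations is easy to capture in a small conversion helper. This is only a sketch: the function names are my own, and it assumes a 20-byte IPv4 header without options and an 8-byte ICMP header, the same figures used in the breakdown further down.

```python
IP_HEADER = 20    # IPv4 header without options (assumption)
ICMP_HEADER = 8   # ICMP echo header

def ios_size_to_windows_payload(ip_packet_size: int) -> int:
    """IOS 'size' is the whole IP packet; Windows '-l' is payload only."""
    return ip_packet_size - IP_HEADER - ICMP_HEADER

def windows_payload_to_ios_size(payload: int) -> int:
    """The IP packet size that results from a given Windows '-l' value."""
    return payload + ICMP_HEADER + IP_HEADER

print(ios_size_to_windows_payload(1500))  # 1472
print(windows_payload_to_ios_size(1500))  # 1528
```

The second result shows why the first Windows ping failed: a payload of 1500 turns into a 1528-byte IP packet, which does not fit in a 1500-byte MTU.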

Let's break down the packet that makes it through on the Windows machine.
Ping towards 172.16.0.1 with the df-bit set and a payload of 1472.

Payload: 1472 bytes
ICMP encapsulation: 8 bytes (1480 bytes)
IP encapsulation: 20 bytes (1500 bytes)
Ethernet encapsulation: 18 bytes (1518 bytes)
Transmit on wire
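The same breakdown can be reproduced as a running total, which makes it easy to plug in other payload sizes. A minimal sketch, using only the sizes listed above (the Ethernet figure is the 14-byte header plus the 4-byte FCS):

```python
# Sizes taken from the breakdown above; IPv4 without options is assumed.
LAYERS = [
    ("Payload", 1472),
    ("ICMP encapsulation", 8),
    ("IP encapsulation", 20),
    ("Ethernet encapsulation", 18),  # 14-byte header + 4-byte FCS
]

def running_totals(layers):
    """Yield (name, added_bytes, total_so_far) for each layer in order."""
    total = 0
    for name, size in layers:
        total += size
        yield name, size, total

for name, size, total in running_totals(LAYERS):
    print(f"{name}: {size} bytes ({total} bytes)")
```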

So, as seen above, the packet reaches 1500 bytes after it is encapsulated at layer 3 (IP). This is where the IP source and destination addresses are added (among other things). After that it is handed off to layer 2 and encapsulated in an Ethernet frame containing the source and destination MAC addresses. What you will most likely see when capturing packets is an Ethernet header of 14 bytes, which is not wrong, but as you can see above I wrote 18 bytes. The 4 extra bytes are the FCS (frame check sequence), which most network cards strip off before Wireshark gets to see the frame.

Below I have a few MTU size examples. First we will look at a TCP example as seen in a Wireshark capture that doesn't include the Ethernet FCS (most network cards don't include this value).

Alright, next up we have the same example as above, but this time we include the FCS, and the result is how frames are actually sent (by default) on an Ethernet network today.

The layer 2 frame size is now 1518 bytes, and this is what counts as the frame sent on the wire.
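The arithmetic behind the two examples can be sketched as follows. The 1460-byte TCP payload is my assumption of a full-size segment on a 1500-byte MTU with no TCP or IP options; the captures themselves may have shown different values.

```python
TCP_PAYLOAD = 1460  # assumed: full-size segment on a 1500-byte MTU
TCP_HEADER = 20     # TCP header without options
IP_HEADER = 20      # IPv4 header without options
ETH_HEADER = 14     # destination MAC + source MAC + EtherType
ETH_FCS = 4         # frame check sequence

ip_packet = TCP_PAYLOAD + TCP_HEADER + IP_HEADER  # what the MTU limits
captured_frame = ip_packet + ETH_HEADER           # typical Wireshark view, no FCS
wire_frame = captured_frame + ETH_FCS             # the complete Ethernet frame

print(ip_packet, captured_frame, wire_frame)      # 1500 1514 1518
```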

Now, let's take it a little further and add 802.1Q VLAN tags into the mix. The frame, as seen above, is transmitted to a switch, which then forwards it to another switch over a trunk link, adding a VLAN tag to the 1518-byte frame. This VLAN tag is 4 bytes long, which now makes the frame a whopping 1522 bytes!

In 1998, the maximum frame size in 802.3 was raised from 1518 to 1522 bytes (IEEE 802.3ac) to allow for the 4-byte VLAN tag. So the standard maximum frame size in Ethernet networks today is actually considered to be 1522 bytes. Frames above that are oversized: slightly larger ones are often called baby giants, and much larger ones jumbo frames.
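The point worth noting is that the VLAN tag grows the frame, not the IP packet, so the 1500-byte MTU stays untouched. A small sketch of that calculation (the function name is my own):

```python
ETH_HEADER = 14  # destination MAC + source MAC + EtherType
ETH_FCS = 4      # frame check sequence
DOT1Q_TAG = 4    # one 802.1Q VLAN tag

def frame_size(ip_packet_size: int, vlan_tags: int = 0) -> int:
    """Ethernet frame size for an IP packet carrying `vlan_tags` 802.1Q tags."""
    return ip_packet_size + ETH_HEADER + DOT1Q_TAG * vlan_tags + ETH_FCS

print(frame_size(1500))               # 1518, untagged
print(frame_size(1500, vlan_tags=1))  # 1522, tagged on a trunk
```

In both cases the first argument, the IP packet, is still 1500 bytes, which is why an endpoint normally has no reason to lower its MTU just because a trunk somewhere in the path adds a tag.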

This would be a standard Ethernet network allowing for 1500-byte packets. However, the tale is not quite done yet. The Ethernet frame may be 1522 bytes (when using VLAN tagging), but the bytes transmitted on the wire for each frame include a little more than just the frame itself. This is unrelated to the MTU, but still a nice little fun fact to know.

Before an Ethernet frame is sent, the sender transmits a preamble and the SFD (start-of-frame delimiter), which let the receiver synchronize and find the start of the frame. This is not the same thing as CSMA/CD (Carrier Sense Multiple Access with Collision Detection), which is the method half-duplex 802.3 networks use to make sure the coast is clear before sending. The preamble and SFD take up 8 bytes on the wire. Finally, at the end of a transmission there is a silence called the Interframe Gap, which adds another 12 bytes. So a frame of 1522 bytes becomes a total of 1542 bytes transmitted on the wire. Keep in mind that the preamble, SFD and Interframe Gap do not count towards the MTU size, and neither does the Ethernet frame encapsulation, including the FCS and the VLAN tag (if any).
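As a quick check of those numbers, here is a sketch that adds the per-frame overhead to a given frame size. The 8 bytes are split into a 7-byte preamble and a 1-byte SFD; the function name is my own.

```python
PREAMBLE = 7         # bytes of preamble
SFD = 1              # start-of-frame delimiter
INTERFRAME_GAP = 12  # idle time after the frame, counted in byte times

def bytes_on_wire(frame_size: int) -> int:
    """Total wire occupancy for one Ethernet frame, overhead included."""
    return PREAMBLE + SFD + frame_size + INTERFRAME_GAP

print(bytes_on_wire(1522))  # 1542, VLAN-tagged frame
print(bytes_on_wire(1518))  # 1538, untagged frame
```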


Friday, November 14, 2014
