IP Routing and Switching

Sunday, November 29, 2015

Pearson Vue cert holder information stolen

So, this week Pearson Vue announced that they had been the victim of a malicious intrusion into their certification system - specifically their Credential Manager System called Pearson Vue Credential Manager (PCM).

Pearson Vue delivers the Cisco exams and holds information on any individual who is Cisco certified through their system. So anyone with a Cisco certification may be affected by this attack and may have had their information stolen. This information includes name, address, e-mail and phone numbers as well as employment information.

They stress that they believe the Credential Manager System to be the only system affected and that their parent company (Pearson) as well as their website and testing system were not compromised.

It is not immediately clear what is stored in their PCM system, but I guess it is safe to assume that logging in to Pearson Vue and checking the account information stored there should give a good indication of what is probably stored in the PCM system and therefore may have been compromised.
You can log in from this link: http://www.pearsonvue.com/cisco/

For more information on this issue I will refer to the links below:

Pearson Vue's statement from the 23rd of Nov. 2015: http://home.pearsonvue.com/About-Pearson-VUE/Press-Room/2015/Public-Statement-Regarding-Pearson-Credential-Mana.aspx

Pearson Vue's FAQ site regarding the above Public Statement: http://www.pearsonvue.com/faqs/PCM_faqs.asp

Sunday, April 19, 2015

BGP Multi-Exit Discriminator

Multi-Exit Discriminator (MED) is also known as the metric of BGP - it is labeled Metric in the show ip bgp output. It is an optional nontransitive BGP Path Attribute (PA), meaning that BGP implementations are not required to support it and that it is not advertised beyond the AS it is sent to. A lower MED is preferred over a higher one. Cisco defaults to a MED value of 0 - so by default it is set to the most preferred value.

Now, what the MED allows you to do is this: tell a neighboring AS which external path is the best for certain prefixes.

For the example I will use the below topology with AS 123 connecting to AS 45.

BGP MED topology

The networks 40.0.0.0 /8 and 50.0.0.0 /8 are advertised by both R4 and R5 to their eBGP neighbors in AS 123. What MED can do is tell the routers in AS 123 to prefer one path over the other. In this example we want R4 to be the preferred path for network 40.0.0.0 /8 and R5 for network 50.0.0.0 /8.

This could be a scenario with an enterprise connecting redundantly to the same ISP. The ISP then advertises the reachability information with a MED value to have the customer prefer R4 for the network 40.0.0.0 /8 and R5 for the network 50.0.0.0 /8.

Below is the pertinent configuration on R4.

R4#show running-config | section router bgp
router bgp 45
bgp router-id 4.4.4.4
bgp log-neighbor-changes
network 40.0.0.0
network 50.0.0.0
neighbor PGROUP peer-group
neighbor PGROUP remote-as 45
neighbor PGROUP update-source Loopback0
neighbor 5.5.5.5 peer-group PGROUP
neighbor 140.1.14.1 remote-as 123
neighbor 140.1.14.1 route-map RMAP_SET_MED out
R4#show running-config | section route-map
<output omitted>
route-map RMAP_SET_MED permit 10
match ip address prefix-list PFX_40
set metric 50
route-map RMAP_SET_MED permit 20
match ip address prefix-list PFX_50
set metric 100
route-map RMAP_SET_MED permit 99

And below is the configuration done on R5 - almost identical except for the peerings and route-map configuration.

R5#show running-config | section router bgp
router bgp 45
bgp router-id 5.5.5.5
bgp log-neighbor-changes
network 40.0.0.0
network 50.0.0.0
neighbor PGROUP peer-group
neighbor PGROUP remote-as 45
neighbor PGROUP update-source Loopback0
neighbor 4.4.4.4 peer-group PGROUP
neighbor 140.1.25.2 remote-as 123
neighbor 140.1.25.2 route-map RMAP_SET_MED out
R5#show running-config | section route-map
<output omitted>
route-map RMAP_SET_MED permit 10
match ip address prefix-list PFX_40
set metric 100
route-map RMAP_SET_MED permit 20
match ip address prefix-list PFX_50
set metric 50
route-map RMAP_SET_MED permit 99

The route-map is configured to match on the given networks and set the metric (MED). The route-map is then configured on the eBGP peer in the outbound direction.
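
The route-map logic on R4 can be sketched in a few lines of Python. This is only a rough model and not how IOS implements it: the prefix-list contents are my assumption (the post does not show them), and the function and variable names are made up for the illustration.

```python
import ipaddress

# Hypothetical model of R4's RMAP_SET_MED: (sequence, prefix to match, MED to set).
# The prefix-list contents are assumed; the post omits them.
RMAP_SET_MED_R4 = [
    (10, "40.0.0.0/8", 50),
    (20, "50.0.0.0/8", 100),
    (99, None, None),  # empty permit clause: matches everything, changes nothing
]

def apply_route_map(route_map, prefix, med=0):
    """Return the MED a prefix is advertised with, or None if it is filtered."""
    net = ipaddress.ip_network(prefix)
    for _seq, match, set_med in sorted(route_map, key=lambda entry: entry[0]):
        if match is None or net == ipaddress.ip_network(match):
            # First matching clause wins; later clauses are not evaluated.
            return med if set_med is None else set_med
    return None  # implicit deny at the end of every route-map
```

Without the empty permit 99 clause, any prefix not matched by sequence 10 or 20 would hit the implicit deny and not be advertised at all.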

The result can be seen on R3.

R3#show ip bgp | begin Network
Network Next Hop Metric LocPrf Weight Path
*>i 40.0.0.0 140.1.14.4 50 100 0 45 i
*>i 50.0.0.0 140.1.25.5 50 100 0 45 i
*> 90.0.0.0 0.0.0.0 0 32768 i
*> 91.0.0.0 0.0.0.0 0 32768 i

The routes entered into R3's BGP table both have a metric of 50. But what about the routes that should be marked with 100? Well, R1 and R2 aren't advertising those routes because they are not best routes. The result here is that R3 will send packets destined for 40.0.0.0 /8 to router R1 and packets destined for 50.0.0.0 /8 to R2.

This is the BGP table on R1 and R2.

R1#show ip bgp | begin Network
Network Next Hop Metric LocPrf Weight Path
*> 40.0.0.0 140.1.14.4 50 0 45 i
*>i 50.0.0.0 140.1.25.5 50 100 0 45 i
* 140.1.14.4 100 0 45 i
*>i 90.0.0.0 3.3.3.3 0 100 0 i
*>i 91.0.0.0 3.3.3.3 0 100 0 i
R2#show ip bgp | begin Network
Network Next Hop Metric LocPrf Weight Path
*>i 40.0.0.0 140.1.14.4 50 100 0 45 i
* 140.1.25.5 100 0 45 i
*> 50.0.0.0 140.1.25.5 50 0 45 i
*>i 90.0.0.0 3.3.3.3 0 100 0 i
*>i 91.0.0.0 3.3.3.3 0 100 0 i

Both R1 and R2 have a single route in their BGP tables with a metric of 100 - this is the route received directly from their eBGP neighbor. It is not selected as best and is therefore not advertised to their iBGP neighbors, which is why R3 only sees one route per prefix.
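
To make the selection on R1 a bit more tangible, here is a heavily reduced sketch of the best-path decision in Python. It only models the steps that matter in this lab (AS path length, then MED, then eBGP over iBGP) and assumes weight, local preference and origin are equal and that all paths come from the same neighboring AS, which is the case here. The class and function names are my own.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Path:
    next_hop: str
    med: int
    as_path: tuple
    learned_via: str  # "ebgp" or "ibgp"

def best_path(paths):
    """Reduced best-path sketch: shorter AS path, then lower MED, then eBGP over iBGP."""
    return min(paths, key=lambda p: (len(p.as_path), p.med, p.learned_via != "ebgp"))

def advertised_to_ibgp(paths):
    """Only the best path is advertised, and an iBGP-learned best path is not passed on to iBGP peers."""
    best = best_path(paths)
    return best if best.learned_via == "ebgp" else None

# R1's two paths for 50.0.0.0/8, taken from the show ip bgp output on R1.
r1_paths_50 = [
    Path("140.1.25.5", 50, (45,), "ibgp"),
    Path("140.1.14.4", 100, (45,), "ebgp"),
]
# R1's only path for 40.0.0.0/8.
r1_paths_40 = [Path("140.1.14.4", 50, (45,), "ebgp")]
```

The point of the sketch is the ordering: MED is compared before the eBGP-over-iBGP step, so on R1 the iBGP path with MED 50 beats the eBGP path with MED 100, and the losing eBGP path is never advertised to R2 or R3.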

Some good reading from Cisco, including a description of the bgp deterministic-med and bgp always-compare-med features, can be found here: http://www.cisco.com/c/en/us/support/docs/ip/border-gateway-protocol-bgp/13759-37.html

BGP Confederation

Confederation is another solution to the iBGP full-mesh requirement. It is a more comprehensive change to the network than implementing route reflectors because it requires that the BGP process is replaced in the configuration. This poses a problem with BGP because it is kind of the highlander of routing protocols - there can be only one (process defined per router). So, to change the process you cannot really do a smooth transition from one to the other on the same router.

The way confederations work is by dividing an iBGP network into multiple sub-ASes and running eBGP between these sub-ASes. Outside of the confederation, all routers see the same AS, but inside the AS the confederation routers are aware of the confederation configuration due to the use of the AS_CONFED_SEQ and AS_CONFED_SET segments carried in the AS_PATH PA. These are used much like the regular AS_SEQ and AS_SET, but they are not propagated outside the actual AS.
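
As a rough illustration of how the confederation segments behave, the Python sketch below models an AS_PATH as a list of (segment type, ASNs) pairs. The function names are made up, AS_SET merging is ignored, and the behavior is my reading of the confederation rules rather than anything taken from IOS.

```python
# Each AS_PATH segment is (segment_type, [ASNs]); type names follow the abbreviations used in this post.
CONFED_TYPES = {"AS_CONFED_SEQ", "AS_CONFED_SET"}

def advertise_to_confed_peer(as_path, my_sub_as):
    """Crossing a sub-AS boundary: prepend our sub-AS to the leading AS_CONFED_SEQ."""
    if as_path and as_path[0][0] == "AS_CONFED_SEQ":
        return [("AS_CONFED_SEQ", [my_sub_as] + as_path[0][1])] + as_path[1:]
    return [("AS_CONFED_SEQ", [my_sub_as])] + as_path

def advertise_outside_confed(as_path, confed_id):
    """Leaving the confederation: drop confed segments, prepend the confederation identifier."""
    public = [seg for seg in as_path if seg[0] not in CONFED_TYPES]
    if public and public[0][0] == "AS_SEQ":
        return [("AS_SEQ", [confed_id] + public[0][1])] + public[1:]
    return [("AS_SEQ", [confed_id])] + public

# A route originated inside sub-AS 65123, sent across the sub-AS boundary,
# and finally advertised to a regular eBGP peer outside AS 100.
inside = advertise_to_confed_peer([], 65123)
outside = advertise_outside_confed(inside, 100)
```

Inside the confederation the path carries the sub-AS number, while the external peer only ever sees AS 100.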

Below is a network topology with AS 100 being the AS that external BGP peers will peer with, but inside of AS 100 there are two confederation ASes (sub-ASes): sub-AS 65123 with routers R1, R2 and R3 in a fully-meshed iBGP configuration, and sub-AS 65045 with routers R4 and R5. R3 and R4 connect the two sub-ASes.

BGP Confederation Network Topology

Peerings between R3 and R4 work much like regular eBGP peerings, but because they are confederation peers, the next-hop stays unchanged in updates between them. The routers R5 and R6 are also eBGP peers, but regular eBGP.

Below is the pertinent configuration for R3 and R4.

R3#show running-config | section router bgp
router bgp 65123
bgp router-id 3.3.3.3
bgp log-neighbor-changes
bgp confederation identifier 100
bgp confederation peers 65045
neighbor 10.0.12.1 remote-as 65123
neighbor 10.0.32.2 remote-as 65123
neighbor 10.0.34.4 remote-as 65045
R4#show running-config | section router bgp
router bgp 65045
bgp router-id 4.4.4.4
bgp log-neighbor-changes
bgp confederation identifier 100
bgp confederation peers 65123
neighbor 10.0.34.3 remote-as 65123
neighbor 10.0.45.5 remote-as 65045

Even though the main AS is 100, the BGP process is configured with the ASN of the sub-AS - this is what makes confederations so difficult to configure on production equipment. The main AS is configured using the bgp confederation identifier command. Note that the bgp confederation peers command is only needed on the confederation routers connecting between sub-ASes.

Let's have a look at the BGP configuration on R2 for reference.

R2#show running-config | section router bgp
router bgp 65123
bgp router-id 2.2.2.2
bgp log-neighbor-changes
bgp confederation identifier 100
neighbor 10.0.12.1 remote-as 65123
neighbor 10.0.23.3 remote-as 65123

Notice that the only confederation command is the bgp confederation identifier command; everything else is just iBGP.

The upside of this configuration is that the BGP routers only need full-mesh peering with routers in the same sub-AS. In large-scale deployments of BGP, this may significantly decrease the amount of configuration as well as control-plane traffic. You can combine confederations with route reflector configuration as well to make it even more manageable.
But the downside is that this is for LARGE deployments, and it is in no way an easy task to migrate to a confederation configuration when a network grows to a size that could leverage this feature - it is much easier just to configure route reflectors and call it a day.
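
To put some numbers on the full-mesh savings, here is a small Python sketch. A full mesh of n routers needs n(n-1)/2 sessions; the 100-router example is purely hypothetical, and the helper names are my own.

```python
def full_mesh_sessions(n):
    """Number of iBGP sessions in a full mesh of n routers."""
    return n * (n - 1) // 2

def confederation_sessions(sub_as_sizes, inter_sub_as_links):
    """Full mesh inside each sub-AS plus the confederation eBGP sessions between sub-ASes."""
    return sum(full_mesh_sessions(n) for n in sub_as_sizes) + inter_sub_as_links

# The lab above: 5 routers in AS 100, split into sub-ASes of 3 and 2 with one link between them.
lab_full_mesh = full_mesh_sessions(5)
lab_confed = confederation_sessions([3, 2], 1)

# Hypothetical large network: 100 routers, split into 10 sub-ASes of 10 chained by 9 links.
big_full_mesh = full_mesh_sessions(100)
big_confed = confederation_sessions([10] * 10, 9)
```

In the lab the difference is 10 sessions versus 5, which is hardly worth the trouble. With 100 routers it is 4950 versus 459, which is where confederations start to make sense.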

Friday, April 17, 2015

The mystery of the disappearing traceroute timestamp

Ever done a traceroute from one Cisco router to another and wondered why the second slot, on the final hop, shows an '*' instead of a timestamp like the hops above it? Take for example the trace output below.

R1#traceroute 172.16.6.1 source GigabitEthernet 1.1
Type escape sequence to abort.
Tracing the route to 172.16.6.1
VRF info: (vrf in name/id, vrf out name/id)
1 10.0.12.2 3 msec 1 msec 0 msec
2 10.0.23.3 1 msec 1 msec 1 msec
3 10.0.34.4 1 msec 1 msec 0 msec
4 10.0.45.5 6 msec 1 msec 1 msec
5 10.0.56.6 2 msec * 3 msec

Notice how only the final hop has a timeout (denoted by the '*') in the second portion - none of the previous hops had this.

Well, without spending as much time writing this post as I did finding out why this was, I will give a very short view of what causes this behaviour.

The way traces (in Cisco IOS) work is by sending UDP packets to the destination host, incrementing the destination port number with each packet and the TTL value with each hop. The packets will keep getting a "TTL exceeded in transit" reply from each hop until the TTL value is increased to the number of hops it takes to reach the destination network. When the TTL value is finally high enough for a packet to reach the destination router without expiring, that router will respond with an ICMP type 3 (code 3) message - a destination unreachable (port unreachable) message.
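
The probing logic described above can be sketched in Python like this. It is a simulation only (no packets are sent), and the starting port of 33434 and the three probes per hop are assumptions on my part based on the common traceroute defaults.

```python
def traceroute(hops, base_port=33434, probes_per_hop=3):
    """Simulate UDP traceroute probes.

    hops: ordered list of router addresses, ending with the destination router.
    Returns one (ttl, dst_port, responder, icmp_reply) tuple per probe.
    """
    results = []
    port = base_port
    for ttl in range(1, len(hops) + 1):
        responder = hops[ttl - 1]
        is_destination = ttl == len(hops)
        # Transit routers expire the TTL; the destination has no listener on the UDP port.
        reply = ("port unreachable (type 3, code 3)" if is_destination
                 else "TTL exceeded (type 11, code 0)")
        for _ in range(probes_per_hop):
            results.append((ttl, port, responder, reply))
            port += 1  # every probe uses the next destination port
    return results

path = ["10.0.12.2", "10.0.23.3", "10.0.34.4", "10.0.45.5", "10.0.56.6"]
probes = traceroute(path)
```

Only the probes for the last hop are answered with a port unreachable; every hop before that answers with TTL exceeded.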

Below is the packet capture from the router with the address 10.0.56.6.

Packet capture on R6

There are the three UDP packets sourced from the inside interface of R1 (172.16.1.1) going to the inside interface of R6 (172.16.6.1), but there are only two ICMP unreachables sent in reply.

So, the problem is with an IOS default setting for ICMP unreachables. It can be found using the show run all command with a little filtering (to leave out all the unrelated sections).

R6#show running-config all | include ip icmp
ip icmp rate-limit unreachable 500
ip icmp redirect subnet

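The culprit is the ip icmp rate-limit unreachable 500 command. The value is in milliseconds, so R6 sends at most one ICMP unreachable every 500 ms. The first probe gets its reply, the second arrives within the same 500 ms window and is silently ignored, and by the time the third probe is sent (after the second has timed out) the window has passed. This setting can easily be tweaked to allow the traceroute to display correctly.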

R6#configure terminal
R6(config)#no ip icmp rate-limit unreachable

Now the trace will display properly - so long as the destination is on R6.

R1#traceroute 172.16.6.1 source GigabitEthernet 1.1
Type escape sequence to abort.
Tracing the route to 172.16.6.1
VRF info: (vrf in name/id, vrf out name/id)
1 10.0.12.2 3 msec 0 msec 0 msec
2 10.0.23.3 1 msec 0 msec 1 msec
3 10.0.34.4 1 msec 0 msec 1 msec
4 10.0.45.5 1 msec 1 msec 1 msec
5 10.0.56.6 2 msec 1 msec 2 msec

Now, you may wonder why it is only the final destination that has this issue and not every single hop. Well, only the final hop sends an ICMP unreachable - the rest send TTL expired.
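
The effect of the rate limit on the three final-hop probes can be modeled with a few lines of Python. This is a simplified model (at most one unreachable per interval) and not the actual IOS implementation; the probe arrival times are assumed, with the third probe only leaving R1 after the second one has timed out.

```python
def unreachable_replies(probe_arrival_ms, interval_ms=500):
    """Return, per probe, whether the router answers with an ICMP unreachable.

    Simplified model of 'ip icmp rate-limit unreachable 500':
    at most one unreachable is sent per interval.
    """
    replies = []
    last_sent = None
    for t in probe_arrival_ms:
        if last_sent is None or t - last_sent >= interval_ms:
            replies.append(True)
            last_sent = t
        else:
            replies.append(False)  # suppressed, shown as '*' in the trace
    return replies

# Assumed timing: probe 2 follows probe 1 almost immediately,
# probe 3 is only sent after probe 2 has timed out (roughly 3 seconds later).
arrivals = [0, 2, 3002]
with_limit = unreachable_replies(arrivals)
without_limit = unreachable_replies(arrivals, interval_ms=0)
```

With the default interval the pattern is reply, timeout, reply, which lines up with the "2 msec * 3 msec" seen on the last hop. With the limit removed all three probes are answered.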

Although the traceroute now displays properly, I must point out that the default setting is a default for a reason. You most likely do not wish to turn this off on production equipment unless you have a VERY good reason to; it prevents DoS attacks from trashing the router by having it respond with copious amounts of ICMP unreachable packets - so best leave it as is and just live with the one missing reply in the traces.

BGP Route Filtering using AS_PATH list

When using BGP, it is possible to filter out routes using a special kind of access-list called an as-path access-list. This allows for matching based on the AS_PATH PA, which includes the AS_SEQ, AS_SET as well as the AS_CONFED_SEQ and AS_CONFED_SET.

The difficult part of this is not the configuration. You basically create an access-list where deny statements match the paths to be filtered and permit statements match the paths to be allowed. This access-list is then applied to a neighbor under the BGP router process in either the inbound or outbound direction.

The difficulty with as-path access-lists is the matching logic, which is based on regular expressions (regex). Not all people are familiar with regex, but if you have used grep, for example, you may be quite comfortable with the syntax. I want to make it clear that I am in no way a regex pro and I am in no way certain that the syntax below is the best way to accomplish the tasks given, but it works in this very limited scenario.

Below is the topology used for the BGP as-path access-list configuration.

AS-Path access-list filtering topology

Before changing anything, we take a look at the BGP table of R4.

R4#show ip bgp | begin Network
Network Next Hop Metric LocPrf Weight Path
* 20.0.0.0/24 10.0.14.1 0 1 2 i
*> 10.0.34.3 0 3 2 i
* 21.0.0.0/24 10.0.14.1 0 1 2 i
*> 10.0.34.3 0 3 2 i
* 22.0.0.0/24 10.0.14.1 0 1 2 i
*> 10.0.34.3 0 3 2 i
* 23.0.0.0/24 10.0.14.1 0 1 2 i
*> 10.0.34.3 0 3 2 i
* 30.0.0.0/24 10.0.14.1 0 1 2 3 i
*> 10.0.34.3 0 0 3 i
* 31.0.0.0/24 10.0.14.1 0 1 2 3 i
*> 10.0.34.3 0 0 3 i
* 32.0.0.0/24 10.0.14.1 0 1 2 3 i
*> 10.0.34.3 0 0 3 i
* 33.0.0.0/24 10.0.14.1 0 1 2 3 i
*> 10.0.34.3 0 0 3 i
*> 40.0.0.0/24 0.0.0.0 0 32768 i
*> 41.0.0.0/24 0.0.0.0 0 32768 i
*> 42.0.0.0/24 0.0.0.0 0 32768 i
*> 43.0.0.0/24 0.0.0.0 0 32768 i

There are multiple entries for the networks advertised on R3 and R2. What we want to do is this:

Filter out routes on R4 received from R1 where the AS path is longer than 2 hops.
Filter out routes on R4 received from R3 where the last AS in the path is AS 2.

We will create two as-path access-lists and apply them for each neighbor (R1 and R3) in the inbound direction. But first we need to figure out the regex that fits the criteria. One way to do that is by filtering the BGP table using the regexp keyword with the regular expression to fine tune the matching without applying a routing policy to the traffic.

Below we try out a regex string for the first requirement.

R4#show ip bgp regexp ^1_.+_.+$
<output omitted>
Network Next Hop Metric LocPrf Weight Path
* 30.0.0.0/24 10.0.14.1 0 1 2 3 i
* 31.0.0.0/24 10.0.14.1 0 1 2 3 i
* 32.0.0.0/24 10.0.14.1 0 1 2 3 i
* 33.0.0.0/24 10.0.14.1 0 1 2 3 i

Below is a breakdown of the regular expression and what it does in this context.

The caret (^) forces the string to match from the beginning of the path. Without it, it would mean that any number of ASes could precede AS 1.

The 1 followed by the underscore (_) matches on exactly the AS of 1 with the underscore being any delimiter - in this case a space. Without the underscore this would match AS 1, 12, 1234 and so on.

The dot (.) means any character and the plus (+) means one or more instances of the preceding character (which would be any character). So, at least one character of any kind must be present in each of the last two sections.

The underscore (_) again means that there must be another delimiter.

This sequence is repeated once more, but without the trailing underscore (_) to make sure it can match any paths that may be longer than just three.

The final metacharacter is the almighty dollar sign ($), which means end of string - in this case the end of the AS path.

What we get should match a path coming from AS 1 with at least 2 more ASes following it. We could have used .+ for all three ASes if we wanted to be able to reuse the matching logic on paths that do not start with AS 1, but in this case I chose not to.
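
If you want to play with the expression off-box, it can be approximated with Python's re module. The catch is the underscore, which is a Cisco-specific metacharacter; the translation below (a delimiter character or the start/end of the string) is my approximation of its behavior, and the helper names are made up.

```python
import re

# Assumption: Cisco's "_" matches a delimiter (space, comma, braces, parentheses)
# or the start/end of the string. This is an approximation for Python's re module.
DELIM = r"(?:^|$|[ ,{}()])"

def cisco_to_python(pattern):
    """Translate the Cisco underscore metacharacter into a Python regex group."""
    return pattern.replace("_", DELIM)

def matches(pattern, as_path):
    """True if the AS path (as shown in the Path column) matches the Cisco-style regex."""
    return re.search(cisco_to_python(pattern), as_path) is not None

first_requirement = "^1_.+_.+$"
results = {path: matches(first_requirement, path) for path in ("1 2", "1 2 3", "3 2", "12 2 3")}
```

Only "1 2 3" matches. "1 2" is too short, "3 2" does not start with AS 1, and "12 2 3" is rejected because the underscore demands a delimiter right after the 1.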

Now onto testing a string for the second requirement - matching any AS path where the final AS is exactly 2.

R4#show ip bgp regexp ^.*_2$
<output omitted>
Network Next Hop Metric LocPrf Weight Path
* 20.0.0.0/24 10.0.34.3 0 3 2 i
*> 10.0.14.1 0 1 2 i
* 21.0.0.0/24 10.0.34.3 0 3 2 i
*> 10.0.14.1 0 1 2 i
* 22.0.0.0/24 10.0.34.3 0 3 2 i
*> 10.0.14.1 0 1 2 i
* 23.0.0.0/24 10.0.34.3 0 3 2 i
*> 10.0.14.1 0 1 2 i

Here we just do the .* matching after the start-of-string metacharacter and finish off with the _2 just before the end-of-string metacharacter. The asterisk (*) works almost identically to the plus (+) used previously, but with the difference that it can match zero instances of the preceding character - so it can essentially match on nothing as well.

So, we now have the two regex strings we need to create the as-path access-lists.

R4#configure terminal
R4(config)#ip as-path access-list 1 deny ^1_.+_.+$
R4(config)#ip as-path access-list 1 permit .*
R4(config)#ip as-path access-list 2 deny ^.*_2$
R4(config)#ip as-path access-list 2 permit .*
R4(config)#router bgp 4
R4(config-router)#neighbor 10.0.14.1 filter-list 1 in
R4(config-router)#neighbor 10.0.34.3 filter-list 2 in
R4(config-router)#end
R4#clear ip bgp * in

With as-path access-list filtering, what is denied by the access-list is what is filtered and what is permitted is left unfiltered. The second line of both as-path access-lists is akin to the permit any of regular ip access-lists. It changes the implied deny any to an explicit permit any instead.
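
The first-match logic of the two as-path access-lists, including the implied deny at the end, can be sketched in Python as below. As before, the underscore translation is my approximation of the Cisco behavior, and the names are invented for the example.

```python
import re

# Assumed approximation of Cisco's "_" metacharacter for Python's re module.
DELIM = r"(?:^|$|[ ,{}()])"

def acl_permits(acl, as_path):
    """Evaluate an as-path access-list top-down; the first matching entry decides."""
    for action, pattern in acl:
        if re.search(pattern.replace("_", DELIM), as_path):
            return action == "permit"
    return False  # implied deny when nothing matches

ACL_1 = [("deny", "^1_.+_.+$"), ("permit", ".*")]  # applied inbound from R1
ACL_2 = [("deny", "^.*_2$"), ("permit", ".*")]     # applied inbound from R3

from_r1 = {path: acl_permits(ACL_1, path) for path in ("1 2", "1 2 3")}
from_r3 = {path: acl_permits(ACL_2, path) for path in ("3 2", "3")}
```

From R1 the "1 2" routes survive and the "1 2 3" routes are dropped; from R3 it is the other way around for "3" and "3 2". Leave out the permit .* line and everything else falls into the implied deny.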

This is now the BGP table of R4 after the filtering is done.

R4#show ip bgp | begin Network
Network Next Hop Metric LocPrf Weight Path
*> 20.0.0.0/24 10.0.14.1 0 1 2 i
*> 21.0.0.0/24 10.0.14.1 0 1 2 i
*> 22.0.0.0/24 10.0.14.1 0 1 2 i
*> 23.0.0.0/24 10.0.14.1 0 1 2 i
*> 30.0.0.0/24 10.0.34.3 0 0 3 i
*> 31.0.0.0/24 10.0.34.3 0 0 3 i
*> 32.0.0.0/24 10.0.34.3 0 0 3 i
*> 33.0.0.0/24 10.0.34.3 0 0 3 i
*> 40.0.0.0/24 0.0.0.0 0 32768 i
*> 41.0.0.0/24 0.0.0.0 0 32768 i
*> 42.0.0.0/24 0.0.0.0 0 32768 i
*> 43.0.0.0/24 0.0.0.0 0 32768 i

We have now accomplished the two requirements stated earlier using as-path access-list filtering.

To summarise: as-path access-list filtering is done using regular expressions. You can test out regular expressions using the show ip bgp regexp <regex string> command. The as-path access-list is applied using the neighbor <ip address> filter-list <as-path acl #> in|out command under the BGP router process. The regex examples in this post are quite simple - regex is a VERY powerful tool for matching and can be used for much more than just BGP.

Tuesday, April 14, 2015

BGP Route Filtering using Aggregate-address

Aggregating routes is useful to reduce the size of the routing tables by grouping several routes into fewer, less specific routes. In this example, I will be doing aggregation on router R1 for the subnets contained within 20.0.0.0 /6, meaning the address range 20.0.0.0 - 23.255.255.255, which are advertised to R1 by R2 over an eBGP peering.

BGP filtering w/ aggregate-address

Below is shown the BGP table on R3, before any aggregation is configured on R1.

R3#show ip bgp | include 2(0|1|2|3).0.0.0
*> 20.0.0.0/24 10.0.13.1 0 1 2 i
*> 21.0.0.0/24 10.0.13.1 0 1 2 i
*> 22.0.0.0/24 10.0.13.1 0 1 2 i
*> 23.0.0.0/24 10.0.13.1 0 1 2 i

Before the aggregation we see the individual /24 routes as advertised by R2. Now we configure the router R1 to create an aggregate-address using the keyword summary-only and see what happens.

R1#configure terminal
R1(config)#router bgp 1
R1(config-router)#aggregate-address 20.0.0.0 252.0.0.0 summary-only
R1(config-router)#end
R1#show ip bgp
BGP table version is 28, local router ID is 1.1.1.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
x best-external, a additional-path, c RIB-compressed,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
s> 20.0.0.0/24 10.0.12.2 0 0 2 i
*> 20.0.0.0/6 0.0.0.0 32768 i
s> 21.0.0.0/24 10.0.12.2 0 0 2 i
s> 22.0.0.0/24 10.0.12.2 0 0 2 i
s> 23.0.0.0/24 10.0.12.2 0 0 2 i
<output omitted>
R1#show ip route bgp | incl 2(0|1|2|3).0.0.0
B 20.0.0.0/6 [200/0], 00:03:04, Null0
20.0.0.0/24 is subnetted, 1 subnets
B 20.0.0.0 [20/0] via 10.0.12.2, 00:03:04
21.0.0.0/24 is subnetted, 1 subnets
B 21.0.0.0 [20/0] via 10.0.12.2, 00:03:04
22.0.0.0/24 is subnetted, 1 subnets
B 22.0.0.0 [20/0] via 10.0.12.2, 00:03:04
23.0.0.0/24 is subnetted, 1 subnets
B 23.0.0.0 [20/0] via 10.0.12.2, 00:03:04

The command is quite simple - just a single line and the /6 network matches the networks 20.0.0.0, 21.0.0.0, 22.0.0.0 and 23.0.0.0, which are now considered component networks of the aggregate address. The summary-only keyword instructs R1 to only advertise the summary address and none of the component subnets. The show ip bgp output shows the /24 subnets from R2 as suppressed (indicated by the "s" in the left-most column). But the routes are still in R1's routing table. Also, note the administrative distance of 200 for the summary route injected into the routing table by the aggregate-address command.
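
If the /6 math is not obvious, Python's ipaddress module can be used to double-check which networks the aggregate covers. This is just a sanity check on the addressing; the 24.0.0.0/24 prefix is a made-up counter-example.

```python
import ipaddress

# The aggregate exactly as typed on R1: network plus dotted-decimal mask.
aggregate = ipaddress.ip_network("20.0.0.0/252.0.0.0")

first_address = aggregate.network_address
last_address = aggregate.broadcast_address

candidates = ["20.0.0.0/24", "21.0.0.0/24", "22.0.0.0/24", "23.0.0.0/24", "24.0.0.0/24"]
# A prefix is a component of the aggregate if it is a subnet of it.
components = [p for p in candidates if ipaddress.ip_network(p).subnet_of(aggregate)]
```

The mask 252.0.0.0 is a /6, the range runs from 20.0.0.0 to 23.255.255.255, and the four /24 networks from R2 fall inside it while 24.0.0.0/24 does not.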

But what about the BGP table and routing table on R3 now that R1 is suppressing the /24 subnets?

R3#show ip bgp | include 2(0|1|2|3).0.0.0
*> 20.0.0.0/6 10.0.13.1 0 0 1 i
R3#show ip route bgp | include 2(0|1|2|3).0.0.0
B 20.0.0.0/6 [20/0] via 10.0.13.1, 00:08:09

Well, as expected the /24 routes are gone and they are replaced by the single /6 summary route. Notice that the AS path in the BGP table no longer shows AS 2 in the path list. This could be something you wanted to do intentionally to perhaps hide some networks' AS path from upstream routers.

It is also possible to use a route-map as a so-called suppress-map instead of the summary-only keyword. This will allow you to select which component subnets are advertised along with the summary route and which are filtered. Below we configure a suppress map on R1 to allow the network 23.0.0.0 /24 along with the 20.0.0.0 /6 summary route.

R1#configure terminal
R1(config)#ip access-list extended ACL_SUPPRESS_MAP
R1(config-ext-nacl)#deny ip host 23.0.0.0 host 255.255.255.0
R1(config-ext-nacl)#permit ip any any
R1(config-ext-nacl)#exit
R1(config)#route-map RMAP_SUPPRESS_MAP permit 10
R1(config-route-map)#match ip address ACL_SUPPRESS_MAP
R1(config-route-map)#exit
R1(config)#router bgp 1
R1(config-router)#no aggregate-address 20.0.0.0 252.0.0.0 summary-only
R1(config-router)#aggregate-address 20.0.0.0 252.0.0.0 suppress-map RMAP_SUPPRESS_MAP

Note that the access-list denies the subnet that is to be allowed and permits the routes that are to be filtered out. It would be reverse if the route-map used a deny statement instead of a permit, but that is just making things more complicated than they have to be.
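
The somewhat backwards logic of the suppress-map can be sketched in Python. The model assumes a route-map with a single permit clause, as configured above, and represents the ACL as a simple ordered list; the names are my own.

```python
def suppressed_components(components, acl):
    """Return the component prefixes that the suppress-map suppresses.

    acl: ordered (action, prefix or "any") entries. The route-map has a single
    permit clause, so a component is suppressed exactly when the ACL permits it.
    """
    suppressed = []
    for prefix in components:
        for action, match in acl:
            if match == "any" or match == prefix:
                if action == "permit":
                    suppressed.append(prefix)
                break  # first matching ACL entry decides
    return suppressed

ACL_SUPPRESS_MAP = [("deny", "23.0.0.0/24"), ("permit", "any")]
components = ["20.0.0.0/24", "21.0.0.0/24", "22.0.0.0/24", "23.0.0.0/24"]

suppressed = suppressed_components(components, ACL_SUPPRESS_MAP)
advertised = ["20.0.0.0/6"] + [p for p in components if p not in suppressed]
```

The ACL deny means "do not suppress", so 23.0.0.0/24 is the only component that escapes and gets advertised next to the aggregate.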

Now we should be able to see the result on R3.

R3#show ip bgp | include 2(0|1|2|3).0.0.0
*> 20.0.0.0/6 10.0.13.1 0 0 1 i
*> 23.0.0.0/24 10.0.13.1 0 1 2 i
R3#show ip route bgp | include 2(0|1|2|3).0.0.0
B 20.0.0.0/6 [20/0] via 10.0.13.1, 00:05:56
23.0.0.0/24 is subnetted, 1 subnets
B 23.0.0.0 [20/0] via 10.0.13.1, 00:06:26

As expected, the 23.0.0.0 /24 network is now in the BGP table and the routing table. Note that the 23.0.0.0 /24 subnet shows AS 2 in the path now, but the aggregate address still omits it (because it is originated by R1 - not R2).

Now, there is a slight problem with the aggregation configuration that may affect routes on R2. R1 will send the aggregate route to R2 as well and R2 will enter it into the BGP table and routing table.

R2#show ip bgp | include 2(0|1|2|3)
BGP table version is 34, local router ID is 22.0.0.1
*> 20.0.0.0/24 0.0.0.0 0 32768 i
*> 20.0.0.0/6 10.0.12.1 0 0 1 i
*> 21.0.0.0/24 0.0.0.0 0 32768 i
*> 22.0.0.0/24 0.0.0.0 0 32768 i
*> 23.0.0.0/24 0.0.0.0 0 32768 i
R2#show ip bgp | include 2(0|1|2|3).0.0.0
*> 20.0.0.0/24 0.0.0.0 0 32768 i
*> 20.0.0.0/6 10.0.12.1 0 0 1 i
*> 21.0.0.0/24 0.0.0.0 0 32768 i
*> 22.0.0.0/24 0.0.0.0 0 32768 i
*> 23.0.0.0/24 0.0.0.0 0 32768 i
R2#show ip route | include 2(0|1|2|3).0.0.0
B 20.0.0.0/6 [20/0] via 10.0.12.1, 1d00h
20.0.0.0/8 is variably subnetted, 2 subnets, 2 masks
C 20.0.0.0/24 is directly connected, Loopback0
21.0.0.0/8 is variably subnetted, 2 subnets, 2 masks
C 21.0.0.0/24 is directly connected, Loopback1
22.0.0.0/8 is variably subnetted, 2 subnets, 2 masks
C 22.0.0.0/24 is directly connected, Loopback2
23.0.0.0/8 is variably subnetted, 2 subnets, 2 masks
C 23.0.0.0/24 is directly connected, Loopback3

This would cause R2 to send traffic to R1 if one of its locally connected interfaces went down - this may cause some undesired effects.
What if we would just like to have R2 discard the aggregate, as it is indeed based on the subnets originating from R2? Well, we just tell router R1 to populate the AS_SET of the AS_PATH PA with the as-set keyword.

Below we remove the previous aggregate using the suppress-map and reapply the summary-only, but now with the as-set keyword as well.

R1#configure terminal
R1(config)#router bgp 1
R1(config-router)#no aggregate-address 20.0.0.0 252.0.0.0 suppress-map RMAP_SUPPRESS_MAP
R1(config-router)#aggregate-address 20.0.0.0 252.0.0.0 summary-only as-set

As we can see below, R1 is advertising the summary route to R2.

R1#show ip bgp neighbor 10.0.12.2 advertised-routes | include (2(0|1|2|3))|Network
Network Next Hop Metric LocPrf Weight Path
*> 20.0.0.0/6 0.0.0.0 100 32768 2 i

But R2 is not installing it into its BGP table - because the update contains its own ASN in the path and therefore is discarded. This can be verified on R2 by enabling debugging and doing an inbound route refresh.

R2#debug ip bgp updates in
BGP updates debugging is on (inbound) for address family: IPv4 Unicast
R2#clear ip bgp * in
<output omitted>
*Apr 15 20:35:55.144: BGP(0): 10.0.12.1 rcv UPDATE w/ attr: nexthop 10.0.12.1, origin i, metric 0, aggregated by 1 1.1.1.1, originator 0.0.0.0, merged path 1 2, AS_PATH , community , extended community , SSA attribute
*Apr 15 20:35:55.144: BGPSSA ssacount is 0
*Apr 15 20:35:55.144: BGP(0): 10.0.12.1 rcv UPDATE about 20.0.0.0/6 -- DENIED due to: AS-PATH contains our own AS;
<output omitted>

R2 now rejects the aggregate update due to "AS-Path contains our own AS".
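
The loop check R2 performs here is simple enough to sketch in Python. The AS_PATH is modeled as a list of (segment type, ASNs) pairs; the exact segment layout of the update with as-set is my assumption, based on the "merged path 1 2" in the debug output.

```python
def accept_update(local_as, as_path_segments):
    """Reject the update if our own ASN appears in any AS_PATH segment (AS_SEQ or AS_SET)."""
    return all(local_as not in asns for _segment_type, asns in as_path_segments)

# Aggregate from R1 without as-set: only AS 1 in the path.
without_as_set = [("AS_SEQ", [1])]
# Aggregate from R1 with as-set: the ASNs of the component routes are carried along.
with_as_set = [("AS_SEQ", [1]), ("AS_SET", [2])]

r2_accepts_plain = accept_update(2, without_as_set)
r2_accepts_as_set = accept_update(2, with_as_set)
```

Without as-set R2 happily accepts the aggregate; with as-set its own AS 2 is in the path and the update is denied. A router in another AS, like R3, still accepts it.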

Sunday, April 12, 2015

BGP Communities

Communities in BGP are kind of like peer groups, but for routes. They allow for the grouping of routes that share the same characteristics and that need to be treated the same throughout the network. There are well-known BGP communities that may instruct a router to treat the routes a certain way, but you can also make up your own community and do with it as you will.

These are the well-known communities and what they instruct a router to do:

none: strip any community that may currently be applied to a route (leaving it with no community set)
no-export: do not advertise the route to an eBGP peer
no-advertise: do not advertise the route to any peer
local-AS: keep the route contained within a confederation subAS
internet: advertise route to all peers

Going through the list, the community none is actually not a community, but more a means of stripping any current setting off an incoming or outgoing route - the route is not marked with a community of none.

According to the RFC (RFC 1997), there are only three well-known communities: NO_EXPORT, NO_ADVERTISE and NO_EXPORT_SUBCONFED. The latter is the community known, in Cisco-speak, as local-AS. The internet community, however, is not a "real" community - it is used much like the any keyword in ip access-lists. This is what is stated in RFC 1997 regarding the internet community:

By default, all destinations belong to the general Internet community.

So, we are indeed left with only three actual well-known communities.
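
How a router is expected to treat the three well-known communities can be sketched in Python. The numeric values are the ones listed in RFC 1997; the peer type labels and the function name are made up for the sketch, and a stand-alone AS is treated as a confederation of one.

```python
# Well-known community values from RFC 1997.
NO_EXPORT = 0xFFFFFF01
NO_ADVERTISE = 0xFFFFFF02
NO_EXPORT_SUBCONFED = 0xFFFFFF03  # "local-AS" in Cisco terms

def may_advertise(communities, peer_type):
    """Decide whether a route may be advertised to a peer.

    peer_type: "ibgp", "confed-ebgp" (a peer in another sub-AS of the same
    confederation) or "ebgp" (a peer outside the AS/confederation).
    """
    if NO_ADVERTISE in communities:
        return False  # not advertised to any peer
    if NO_EXPORT_SUBCONFED in communities and peer_type != "ibgp":
        return False  # stays inside the local (sub-)AS
    if NO_EXPORT in communities and peer_type == "ebgp":
        return False  # stays inside the AS/confederation
    return True
```

For example, a no-export route may still cross to another sub-AS inside the confederation, whereas a local-AS route may not.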

Below is the topology used to showcase a few uses of BGP communities.

BGP Communities Network Topology

To allow the use of communities, a router must first set the send-community parameter for the specific neighbor it wishes to advertise communities to. Then, a router somewhere will have to actually set the community value for specific routes.

In the following example, I will have R6 and R7 add the no-export community to the 10.1.6.0 /24 and 10.1.66.0 /24 networks and the no-advertise community to the 10.1.7.0 /24 and 10.1.77.0 /24 networks. Then we see how far the routes go through the BGP network.

This is the configuration on routers R6 and R7 (the configuration is shown on R6 and it is only slightly different from that on R7).

R6#configure terminal
R6(config)#ip prefix-list PFX_R6_PREFIXES seq 5 permit 10.1.6.0/24
R6(config)#ip prefix-list PFX_R6_PREFIXES seq 10 permit 10.1.66.0/24
R6(config)#ip prefix-list PFX_R7_PREFIXES seq 5 permit 10.1.7.0/24
R6(config)#ip prefix-list PFX_R7_PREFIXES seq 10 permit 10.1.77.0/24
R6(config)#!
R6(config)#route-map RMAP_COMMUNITIES permit 10
R6(config-route-map)#match ip address prefix-list PFX_R6_PREFIXES
R6(config-route-map)#set community no-export
R6(config-route-map)#route-map RMAP_COMMUNITIES permit 20
R6(config-route-map)#match ip address prefix-list PFX_R7_PREFIXES
R6(config-route-map)#set community no-advertise
R6(config-route-map)#route-map RMAP_COMMUNITIES permit 99
R6(config-route-map)#exit
R6(config)#!
R6(config)#router bgp 67
R6(config-router)#neighbor 10.0.16.1 send-community
R6(config-router)#redistribute connected route-map RMAP_COMMUNITIES
R6(config-router)#end
R6#clear ip bgp * out

The thing that differs between R6 and R7 is that R6 uses the route-map to apply the communities when redistributing the connected routes into BGP - this means it will apply to any future BGP peers with send-community enabled - whereas R7 applies the route-map to the specific neighbor R1 in the outbound direction. This is done with the command neighbor 10.0.17.1 route-map RMAP_COMMUNITIES out under the BGP router process.

Let's verify on R1 that the routes received have the proper community set. If done properly, the routes matched with the prefix-list PFX_R6_PREFIXES should be no-export and the PFX_R7_PREFIXES should be no-advertise.

R1#show ip bgp community no-export
BGP table version is 11, local router ID is 1.1.1.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
x best-external, a additional-path, c RIB-compressed,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 10.1.6.0/24 10.0.17.7 0 0 67 ?
*> 10.1.66.0/24 10.0.17.7 0 0 67 ?
R1#show ip bgp community no-advertise
BGP table version is 11, local router ID is 1.1.1.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
x best-external, a additional-path, c RIB-compressed,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 10.1.7.0/24 10.0.17.7 0 0 67 ?
*> 10.1.77.0/24 10.0.17.7 0 0 67 ?

It seems like some routes from R6 might be missing here. Let's have a look at what we receive from R6.

R1#show ip bgp neighbors 10.0.16.6 routes
Total number of prefixes 0

Nothing. Let's look at what is advertised from R6.

R6#show ip bgp neighbors 10.0.16.1 advertised-routes
Total number of prefixes 0

Again, nothing. Now is the time to check the config for any typos or missing portions. Well, it turns out that everything is configured with no typos or missing portions - and the router is doing exactly as instructed.

R6#show ip bgp community no-advertise
BGP table version is 9, local router ID is 6.6.6.6
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
x best-external, a additional-path, c RIB-compressed,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 10.1.7.0/24 0.0.0.0 0 32768 ?
*> 10.1.77.0/24 0.0.0.0 0 32768 ?
R6#show ip bgp community no-export
BGP table version is 9, local router ID is 6.6.6.6
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
x best-external, a additional-path, c RIB-compressed,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 10.1.6.0/24 0.0.0.0 0 32768 ?
*> 10.1.66.0/24 0.0.0.0 0 32768 ?

The problem here is that the community is set when the routes are entered into the BGP table, and this makes BGP on R6 itself honor the community by not advertising the routes to R1 - as instructed. Let's just change the config on R6 to match that of R7.

R6#configure terminal
R6(config)#router bgp 67
R6(config-router)#no redistribute connected route-map RMAP_COMMUNITIES
R6(config-router)#redistribute connected route-map RMAP_BGP_REDIST_CONN
R6(config-router)#neighbor 10.0.16.1 route-map RMAP_COMMUNITIES out
R6(config-router)#end
R6#clear ip bgp * out

Now let's have a look at R1 again.

R1#show ip bgp community no-advertise
BGP table version is 17, local router ID is 1.1.1.1
<output omitted>
Network Next Hop Metric LocPrf Weight Path
* 10.1.7.0/24 10.0.16.6 0 0 67 ?
*> 10.0.17.7 0 0 67 ?
* 10.1.77.0/24 10.0.16.6 0 0 67 ?
*> 10.0.17.7 0 0 67 ?
R1#show ip bgp community no-export
BGP table version is 17, local router ID is 1.1.1.1
<output omitted>
Network Next Hop Metric LocPrf Weight Path
* 10.1.6.0/24 10.0.16.6 0 0 67 ?
*> 10.0.17.7 0 0 67 ?
* 10.1.66.0/24 10.0.16.6 0 0 67 ?
*> 10.0.17.7 0 0 67 ?

Now we see two entries for each prefix - which we should, as there are two routers (R6 and R7) that advertise the networks.

This issue would also have been encountered if the network statement had been used instead of the redistribution of connected routes. The example I tested in my lab had the following configuration done on R6 instead of the redistribution configuration from earlier.

R6(config)#route-map RMAP_COMMUNITY_NO_ADVERTISE
R6(config-route-map)#set community no-advertise
R6(config-route-map)#route-map RMAP_COMMUNITY_NO_EXPORT
R6(config-route-map)#set community no-export
R6(config-route-map)#exit
R6(config)#router bgp 67
R6(config-router)#network 10.1.6.0 mask 255.255.255.0 route-map RMAP_COMMUNITY_NO_EXPORT
R6(config-router)#network 10.1.66.0 mask 255.255.255.0 route-map RMAP_COMMUNITY_NO_EXPORT
R6(config-router)#network 10.1.7.0 mask 255.255.255.0 route-map RMAP_COMMUNITY_NO_ADVERTISE
R6(config-router)#network 10.1.77.0 mask 255.255.255.0 route-map RMAP_COMMUNITY_NO_ADVERTISE

Again, the result here would be the same as with the redistribution - the community would be set before the routes enter R6's BGP table, and R6 would do as instructed and not advertise the routes to R1.
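
The rule that tripped up R6 can be written down in a few lines of Python. This is only a rough sketch of how the two well-known communities used in this post are honored, not a description of how IOS implements it. The names Peer and may_advertise are made up for the illustration, and confederations and the local-AS community are left out.

```python
from enum import Enum


class Peer(Enum):
    IBGP = "ibgp"
    EBGP = "ebgp"


NO_EXPORT = "no-export"
NO_ADVERTISE = "no-advertise"


def may_advertise(communities, peer):
    """Return True if a route carrying these communities may be sent to the peer."""
    if NO_ADVERTISE in communities:
        return False  # never advertised to any peer, iBGP or eBGP
    if NO_EXPORT in communities and peer is Peer.EBGP:
        return False  # stays inside the local AS
    return True


# R6 (AS 67) to R1 (AS 123) is an eBGP session.
print(may_advertise({NO_EXPORT}, Peer.EBGP))     # False
print(may_advertise({NO_ADVERTISE}, Peer.EBGP))  # False
print(may_advertise(set(), Peer.EBGP))           # True
# Inside an AS, no-export routes still flow between iBGP peers.
print(may_advertise({NO_EXPORT}, Peer.IBGP))     # True
```

The first two calls are the R6 situation: the community is already on the route in R6's own BGP table, so R6 applies the rule to itself. With the route-map applied outbound towards the neighbor instead, the route in R6's table carries no community, which is the third call.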

Now on to the configuration on R1. We want the networks from AS 67 to be treated according to the community that is set by R6 and R7, so we will have to send the community along to our iBGP peers R2 and R3.

R1#configure terminal
R1(config)#router bgp 123
R1(config-router)#neighbor PGROUP_AS_123 send-community

This will have to be done on R2 and R3 as well to ensure the community is sent between all peers in AS 123.
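
To see why send-community matters here, the following Python sketch models what a neighbor receives with and without it. It is a simplified illustration under the assumption that communities are dropped from the update unless send-community is configured for the neighbor; the Route class and the two helper functions are hypothetical names, not anything from IOS.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Route:
    prefix: str
    communities: frozenset = frozenset()


def send_update(route, send_community):
    """Model what the neighbor receives: communities survive only with send-community."""
    if send_community:
        return route
    return replace(route, communities=frozenset())


def leaks_to_ebgp(route):
    """A route that does not carry no-export may be passed on to an eBGP peer."""
    return "no-export" not in route.communities


# 10.1.6.0/24 as it sits in R1's BGP table, tagged no-export by AS 67.
r1_route = Route("10.1.6.0/24", frozenset({"no-export"}))

# What R2 would hold if R1 lacked send-community towards it, and with it.
without_sc = send_update(r1_route, send_community=False)
with_sc = send_update(r1_route, send_community=True)

print(leaks_to_ebgp(without_sc))  # True
print(leaks_to_ebgp(with_sc))     # False
```

In other words, without send-community on R1 the no-export tag would be lost on the way to R2, and R2 would have no reason to hold the route back from R4.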

Now we should see all four AS 67 subnets in R1's BGP table, two of them in the BGP tables of routers R2 and R3, and finally we should not see any of them in the BGP tables of R4 and R5.

R1#show ip bgp | begin Network
Network Next Hop Metric LocPrf Weight Path
*>i 10.1.4.0/24 10.0.24.4 0 100 0 4 i
*>i 10.1.5.0/24 10.0.35.5 0 100 0 5 i
* 10.1.6.0/24 10.0.16.6 0 0 67 ?
*> 10.0.17.7 0 0 67 ?
* 10.1.7.0/24 10.0.16.6 0 0 67 ?
*> 10.0.17.7 0 0 67 ?
*>i 10.1.44.0/24 10.0.24.4 0 100 0 4 i
*>i 10.1.55.0/24 10.0.35.5 0 100 0 5 i
* 10.1.66.0/24 10.0.16.6 0 0 67 ?
*> 10.0.17.7 0 0 67 ?
* 10.1.77.0/24 10.0.16.6 0 0 67 ?
*> 10.0.17.7 0 0 67 ?

R2#show ip bgp | begin Network
Network Next Hop Metric LocPrf Weight Path
*> 10.1.4.0/24 10.0.24.4 0 0 4 i
*>i 10.1.5.0/24 10.0.35.5 0 100 0 5 i
*>i 10.1.6.0/24 10.0.17.7 0 100 0 67 ?
*> 10.1.44.0/24 10.0.24.4 0 0 4 i
*>i 10.1.55.0/24 10.0.35.5 0 100 0 5 i
*>i 10.1.66.0/24 10.0.17.7 0 100 0 67 ?

R3#show ip bgp | begin Network
Network Next Hop Metric LocPrf Weight Path
*>i 10.1.4.0/24 10.0.24.4 0 100 0 4 i
*> 10.1.5.0/24 10.0.35.5 0 0 5 i
*>i 10.1.6.0/24 10.0.17.7 0 100 0 67 ?
*>i 10.1.44.0/24 10.0.24.4 0 100 0 4 i
*> 10.1.55.0/24 10.0.35.5 0 0 5 i
*>i 10.1.66.0/24 10.0.17.7 0 100 0 67 ?

R4#show ip bgp | begin Network
Network Next Hop Metric LocPrf Weight Path
*> 10.1.4.0/24 0.0.0.0 0 32768 i
*> 10.1.5.0/24 10.0.24.2 0 123 5 i
*> 10.1.44.0/24 0.0.0.0 0 32768 i
*> 10.1.55.0/24 10.0.24.2 0 123 5 i

R5#show ip bgp | begin Network
Network Next Hop Metric LocPrf Weight Path
*> 10.1.4.0/24 10.0.35.3 0 123 4 i
*> 10.1.5.0/24 0.0.0.0 0 32768 i
*> 10.1.44.0/24 10.0.35.3 0 123 4 i
*> 10.1.55.0/24 0.0.0.0 0 32768 i

And, just as expected, the routes 10.1.6.0 /24 and 10.1.66.0 /24 are sent from R1 to R2 and R3, but the routes 10.1.7.0 /24 and 10.1.77.0 /24 are not advertised beyond R1. The routers R4 and R5 receive none of the routes originating from AS 67 from any of their peers.

Let's say we want to undo the no-export set by the administrators of AS 67. We can do this on R2 by removing the no-export community from routes received from R1, which should allow R2 to advertise the two subnets (10.1.6.0 /24 and 10.1.66.0 /24) to R4.

R2(config)#ip community-list expanded CML_NO_EXPORT permit no-export
R2(config)#route-map RMAP_CLEAR_NO_EXPORT
R2(config-route-map)#match community CML_NO_EXPORT
R2(config-route-map)#set community none
R2(config-route-map)#exit
R2(config)#router bgp 123
R2(config-router)#neighbor 1.1.1.1 route-map RMAP_CLEAR_NO_EXPORT in
R2(config-router)#end
R2#clear ip bgp 1.1.1.1 in

The configuration uses a route-map to match routes carrying the community "no-export" and remove their communities with set community none, and this is applied inbound for the peer 1.1.1.1 (R1). The BGP table on R4 should tell us whether it works or not.

R4#show ip bgp | begin Network
Network Next Hop Metric LocPrf Weight Path
*> 10.1.4.0/24 0.0.0.0 0 32768 i
*> 10.1.5.0/24 10.0.24.2 0 123 5 i
*> 10.1.6.0/24 10.0.24.2 0 123 67 ?
*> 10.1.44.0/24 0.0.0.0 0 32768 i
*> 10.1.55.0/24 10.0.24.2 0 123 5 i
*> 10.1.66.0/24 10.0.24.2 0 123 67 ?

The routes are in R4's BGP table because R2 strips the no-export community from those routes, allowing R2 to advertise the routes to R4.
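
As a rough model of what RMAP_CLEAR_NO_EXPORT does to each route received from R1, here is a small Python sketch. It assumes the route-map consists of the single permit clause configured above; the function name and the community value 67:100 are made up for the illustration.

```python
def rmap_clear_no_export(communities):
    """Single permit clause: match no-export, then set community none.

    Returns the new community set, or None when the route is not permitted.
    """
    if "no-export" in communities:
        return set()  # 'set community none' removes every community on the route
    return None       # no clause matched: implicit deny at the end of the route-map


print(rmap_clear_no_export({"no-export"}))            # set()
print(rmap_clear_no_export({"no-export", "67:100"}))  # set()
print(rmap_clear_no_export({"67:100"}))               # None
```

Two things are worth noticing. Any other community on a matching route is removed together with no-export, and a route from R1 that does not carry no-export would be filtered by the implicit deny. In this lab R1 only sends the two no-export routes to R2, so neither side effect shows up in the tables above.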

Community lists are much like access-lists in that they come in two variants, standard and expanded (rather than extended), where the expanded variant allows for the use of regular expressions. For my own sake, I think I will have to do a post on using regular expressions at some point in the near future.
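
Until then, the difference between the two variants can be approximated in Python. This is only a sketch: it assumes a standard entry matches when all of its listed communities are present on the route, and that an expanded entry is a regular expression searched in the space-separated community string. Python's re module is not the IOS regex engine (there is no "_" metacharacter, for example), and the community 67:100 and both function names are hypothetical.

```python
import re


def standard_match(entry, communities):
    """Standard list entry: every listed community must be present on the route."""
    return set(entry) <= set(communities)


def expanded_match(pattern, communities):
    """Expanded list entry: regex searched in the space-separated community string."""
    return re.search(pattern, " ".join(communities)) is not None


route = ["67:100", "no-export"]

print(standard_match(["no-export"], route))  # True
print(expanded_match(r"no-export", route))   # True
print(expanded_match(r"^67:", route))        # True
print(expanded_match(r"^65000:", route))     # False
```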