not announcing IXP IPv6 peering lan prefixes in global BGP table possibly breaks PMTUD

25 Jul 2011

Hello,

We had an IPv6 path MTU discovery issue last week and I would like to 
discuss possible solutions here.

The problem is the combination of IXP IPv6 peering LAN prefixes not being 
announced in the global BGP table and loose uRPF being active at the 
border of a network. I made the following traceroutes and pings after 
deactivating loose uRPF at our border. Before that I did not see any 
packets from the LINX peering LAN address 2001:7f8:4::d1c:1, because its 
prefix is not announced in the global BGP table and is therefore not 
routable. With loose uRPF active, the traceroute ended after 
2001:450:2001:800e::2.

chris@router> traceroute 2a01:e0c:1:1599::1 source 2001:x:x:x::2
traceroute6 to 2a01:e0c:1:1599::1 (2a01:e0c:1:1599::1) from 2001:x:x:x::2, 64 hops max, 12 byte packets
[...]
2  2001:450:2001:800e::2 (2001:450:2001:800e::2)  0.793 ms  0.611 ms  0.604 ms
3  2001:7f8:4::d1c:1 (2001:7f8:4::d1c:1)  25.692 ms  227.307 ms  18.160 ms
4  2001:1900:5:2::12e (2001:1900:5:2::12e)  26.565 ms  26.310 ms  26.248 ms
5  2a01:e00:2:9::1 (2a01:e00:2:9::1)  26.746 ms  28.774 ms  40.343 ms
[...]

An ICMPv6 echo request larger than 1410 bytes (1450 bytes including the 
40-byte IPv6 header) shows that there is a smaller MTU between two 
routers in the Level3 backbone:

PING www.free.fr(www.free.fr) 1403 data bytes
...
From 2001:7f8:4::d1c:1 icmp_seq=86 Packet too big: mtu=1450
1411 bytes from www.free.fr: icmp_seq=87 ttl=53 time=41.5 ms
[...]
2001:7f8:4::d1c:1 seems to be a Level3 router at LINX. The next router, 
2001:1900:5:2::12e, also has an IP address from the Level3 IPv6 allocation. 
There seems to be an MTU of 1450 bytes between those two routers. The 
router at LINX sends its ICMP "Packet too big" message with the source 
address of the interface on which it sees the route back to the original 
sender. That interface is on the LINX peering LAN, whose prefix is 
currently not announced in BGP. We use loose uRPF at the border to drop 
all packets from source addresses that are not globally routed, so the 
ICMP "Packet too big" message gets lost and path MTU discovery is broken. 
Communication with big packets is not possible.
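
The sizes in the ping output above can be checked with a little 
arithmetic. The following is only a sketch: it assumes a plain IPv6 
packet without extension headers (40-byte IPv6 header, 8-byte ICMPv6 echo 
header), and the function names are made up for illustration.

```python
IPV6_HEADER = 40    # fixed IPv6 header size in bytes
ICMPV6_HEADER = 8   # ICMPv6 echo request/reply header size in bytes


def ipv6_packet_size(data_bytes: int) -> int:
    """Total IPv6 packet size for a ping carrying data_bytes of payload."""
    return data_bytes + ICMPV6_HEADER + IPV6_HEADER


def max_ping_payload(path_mtu: int) -> int:
    """Largest ping payload that still fits into path_mtu unfragmented."""
    return path_mtu - IPV6_HEADER - ICMPV6_HEADER


# The ping above used 1403 data bytes, one byte more than fits in 1450.
print(ipv6_packet_size(1403))   # 1451
print(max_ping_payload(1450))   # 1402
```

With 1403 data bytes the packet is one byte larger than the reported MTU 
of 1450, which matches the "Packet too big" message in the output.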

Some IXPs have decided, for various reasons, not to announce their 
peering LAN prefixes, but in combination with loose uRPF this leads to 
problems like this one. I would like to discuss the best current practice 
and possible solutions here.
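
To make the interaction concrete, here is a minimal sketch of a loose 
uRPF check: a packet is accepted if any route covers its source address, 
regardless of the interface. It is a simplified model, not a router 
implementation, and it ignores default routes. The routing table entries 
and the /48 used for the LINX peering LAN are assumptions chosen for 
illustration only.

```python
import ipaddress


def loose_urpf_accepts(source: str, routing_table: list[str]) -> bool:
    """Return True if some route in routing_table covers the source address."""
    src = ipaddress.ip_address(source)
    return any(src in ipaddress.ip_network(prefix) for prefix in routing_table)


# Hypothetical excerpt of a global table without the IXP peering LAN.
routing_table = ["2001:450::/32", "2001:1900::/32", "2a01:e00::/26"]

# "Packet too big" sourced from the LINX peering LAN address: dropped.
print(loose_urpf_accepts("2001:7f8:4::d1c:1", routing_table))    # False

# A source inside an announced prefix passes the check.
print(loose_urpf_accepts("2001:1900:5:2::12e", routing_table))   # True

# With a covering route present (assumed /48), the same packet passes.
with_ixp_route = routing_table + ["2001:7f8:4::/48"]
print(loose_urpf_accepts("2001:7f8:4::d1c:1", with_ixp_route))   # True
```

The last case shows why any route covering the peering LAN, whether 
static or learned via BGP, is enough to let the ICMP message through.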

Possible solutions from my point of view:

1) Do not activate loose uRPF at the border of any network.

2) Any network where loose uRPF is configured at the border has to
    configure static routes for the IXP ranges of every RIR and
    redistribute them in the IGP so there is a valid route for
    loose uRPF checks.

3) IXPs announce their peering LAN prefixes in the global BGP table
    and make loose uRPF work for the rest of the world. Members of
    the IXPs should possibly filter BGP announcements of the IXP
    peering LAN prefixes from external peers when they do not use
    "next-hop-self" in iBGP within their network.

4) Remove tunnels, use native IPv6. There will always be links with
    an MTU lower than 1500 bytes (access), so this is possibly not the
    best solution.

5) ?
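
The filter mentioned in option 3 could look roughly like the sketch 
below: an IXP member rejects announcements of the peering LAN prefix, or 
any more-specific of it, when they arrive from external peers. This is a 
model of the policy logic only, not router configuration. The /48 for 
the peering LAN and the function name are assumptions.

```python
import ipaddress

# Assumed prefix for the IXP peering LAN (illustrative value only).
IXP_PEERING_LAN = ipaddress.ip_network("2001:7f8:4::/48")


def accept_from_external_peer(prefix: str) -> bool:
    """Reject the peering LAN prefix and its more-specifics, accept the rest."""
    net = ipaddress.ip_network(prefix)
    if net.version != IXP_PEERING_LAN.version:
        return True  # different address family, not affected by this filter
    return not net.subnet_of(IXP_PEERING_LAN)


print(accept_from_external_peer("2001:7f8:4::/48"))   # False
print(accept_from_external_peer("2001:7f8:4::/64"))   # False
print(accept_from_external_peer("2a01:e00::/26"))     # True
```

Such a member would still reach the peering LAN through its own 
connected route, so rejecting the externally learned announcement does 
not affect the uRPF check on that member's border.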

Regards from Berlin,

Chris

Christian Seitz
