Link aggregation for anchors? - ripe-atlas - mailman.ripe.net

Link aggregation for anchors?

Tore Anderson

18 Jul 2017

7:48 a.m.

Hi,

I was wondering if it is possible to connect our anchor in a fault tolerant fashion using 802.3ad link aggregation?

This would allow us to perform serialised maintenance on the upstream switches without ever offlining the anchor.

Tore

Robert Kisteleki

19 Jul 2017

2:35 p.m.

On 2017-07-18 7:48, Tore Anderson wrote:
> Hi,
>
> I was wondering if it is possible to connect our anchor in a fault tolerant fashion using 802.3ad link aggregation?
>
> This would allow us to perform serialised maintenance on the upstream switches without ever offlining the anchor.
>
> Tore

Hi Tore,

Thanks for the suggestion; we'll look into this to see how easy or difficult it would be.

In the meantime, it'd be useful to know if others are interested in this as well...? Just to be able to check the expected amount of work against the demand.

Cheers,
Robert

Gert Doering

2:49 p.m.

Hi,

On Wed, Jul 19, 2017 at 02:35:14PM +0200, Robert Kisteleki wrote:
> In the meantime, it'd be useful to know if others are interested in this as well...? Just to be able to check the expected amount of work against the demand.

*second*

(active/passive linux bonding would work well for us, while LACP wouldn't due to conscious design decision to de-couple control planes of primary/secondary switches)

Gert Doering
        -- NetMaster
--
have you enabled IPv6 on something today...?

SpaceNet AG                        Vorstand: Sebastian v. Bomhard
Joseph-Dollinger-Bogen 14          Aufsichtsratsvors.: A. Grundner-Culemann
D-80807 Muenchen                   HRB: 136055 (AG Muenchen)
Tel: +49 (0)89/32356-444           USt-IdNr.: DE813185279

Tore Anderson

20 Jul 2017

7:44 a.m.

* Gert Doering <gert@space.net>

> (active/passive linux bonding would work well for us, while LACP wouldn't due to conscious design decision to de-couple control planes of primary/secondary switches)

Active/passive fail-over à la Linux bonding would work for me too. The biggest disadvantage of that is that you waste half your available bandwidth, but that probably isn't a big deal for the Atlas Anchors.

It is quite possible to create a setup that does 802.3ad if an LACP neighbour is detected, falling back on active/passive fail-over if not. That said, you do lose most of the error detection capabilities of LACP that way. Quite possibly not worth the engineering effort if it's not already implemented in whatever software you're using.

I'd rather you spent that time implementing LLDP support, come to think of it. (That would be useful on the non-Anchor probes as well.)

Tore
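The fallback described above (802.3ad when an LACP neighbour is present, active/passive otherwise) could be sketched roughly as follows. This is only an illustration, not something the Atlas anchors are known to run. It assumes the input is the text of a Linux bonding status file read while the bond is in 802.3ad mode, and it assumes that an all-zero "Partner Mac Address" means no LACP neighbour answered. The function name and the sample strings are hypothetical.

```python
import re

# Assumption: an all-zero partner MAC means no LACP neighbour was detected.
NO_PARTNER = "00:00:00:00:00:00"


def choose_bonding_mode(status_text: str) -> str:
    """Return '802.3ad' if an LACP partner is visible, else 'active-backup'."""
    match = re.search(r"Partner Mac Address:\s*([0-9A-Fa-f:]{17})", status_text)
    if match and match.group(1).lower() != NO_PARTNER:
        return "802.3ad"
    # No partner line, or an all-zero partner: fall back to active/passive.
    return "active-backup"


# Hypothetical status excerpts, for illustration only.
with_partner = "Active Aggregator Info:\n\tPartner Mac Address: 00:11:22:33:44:55\n"
without_partner = "Active Aggregator Info:\n\tPartner Mac Address: 00:00:00:00:00:00\n"

print(choose_bonding_mode(with_partner))     # 802.3ad
print(choose_bonding_mode(without_partner))  # active-backup
```

A check like this only says whether a neighbour speaks LACP at all; as noted above, falling back to active/passive gives up most of LACP's own error detection.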

Gert Doering

7:52 a.m.

Hi,

On Thu, Jul 20, 2017 at 07:44:46AM +0200, Tore Anderson wrote:
> * Gert Doering <gert@space.net>
> > (active/passive linux bonding would work well for us, while LACP wouldn't due to conscious design decision to de-couple control planes of primary/secondary switches)
>
> Active/passive fail-over à la Linux bonding would work for me too. The biggest disadvantage of that is that you waste half your available bandwidth, but that probably isn't a big deal for the Atlas Anchors.

Not really, given a GigE uplink and just a few Mbit/s in use :-)

> It is quite possible to create a setup that does 802.3ad if an LACP neighbour is detected, falling back on active/passive fail-over if not. That said, you do lose most of the error detection capabilities of LACP that way. Quite possibly not worth the engineering effort if it's not already implemented in whatever software you're using.

Linux bonding can do ARP probing on both member interfaces, which does the most important "real world" part of the error detection - "will this port work for me to reach the default gateway?" (so you can even fail over on an uplink outage).

> I'd rather you spent that time implementing LLDP support, come to think of it. (That would be useful on the non-Anchor probes as well.)

Minimalistic LLDP support would also be nice ("device, port").

Gert Doering
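The ARP probing idea mentioned above can be pictured with a small model: each member port periodically probes the default gateway, and a port only counts as usable while a reply has been seen recently. This is a simplified sketch, not the bonding driver's actual logic. The function name, the timestamps and the threshold of two missed intervals are assumptions chosen for illustration.

```python
def usable_ports(last_reply: dict, now: float, interval: float,
                 max_missed: int = 2) -> list:
    """Return the ports whose latest gateway ARP reply is recent enough.

    last_reply maps a port name to the time (in seconds) of its latest
    ARP reply, or None if no reply was ever seen on that port.
    """
    # Assumption: a port is distrusted after max_missed silent intervals.
    deadline = interval * max_missed
    return sorted(
        port for port, seen in last_reply.items()
        if seen is not None and now - seen <= deadline
    )


# Hypothetical situation: eth1 still has link, but its path to the
# gateway has been silent for six seconds.
replies = {"eth0": 9.5, "eth1": 4.0}
print(usable_ports(replies, now=10.0, interval=1.0))  # ['eth0']
```

The point of the model is that a port can have link and still be excluded, which is what allows fail-over on an outage further upstream.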

Tore Anderson

8:24 a.m.

* Gert Doering <gert@space.net>

> Linux bonding can do ARP probing on both member interfaces, which does the most important "real world" part of the error detection - "will this port work for me to reach the default gateway?" (so you can even failover on an uplink outage).

I've seen that, but it seems a bit of a hack to me, one that I'd be reluctant to see implemented in all the Anchors. I'd rather go without those X pps of extra broadcast traffic on my network, to be honest. There's also the risk that some networks rate-limit ARP and/or broadcast traffic in general, which would make the approach unreliable.

Further, it is a layering violation. In particular, if there's an IPv4 outage and ARPs go unanswered on one or more interfaces, that shouldn't have any impact on IPv6 whatsoever.

Active/passive (using link down events to trigger fail-over) is probably the way to go. KISS and all that.

Tore
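To make the link-event approach concrete, here is a toy model of active/passive fail-over driven purely by link up/down events. It is a sketch under stated assumptions, not the behaviour of any particular bonding implementation: the class name, the port names and the choice to stay on the current port when the preferred one returns are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ActiveBackupBond:
    """Toy model of active/passive fail-over driven only by link state."""
    members: list                       # port names, in order of preference
    link_up: dict = field(default_factory=dict)
    active: Optional[str] = None

    def __post_init__(self):
        # Assumption: every member starts with link, first member is active.
        self.link_up = {name: True for name in self.members}
        self.active = self.members[0]

    def link_event(self, port: str, up: bool) -> None:
        """Record a link up/down event and re-elect the active port if needed."""
        self.link_up[port] = up
        if self.active is None or not self.link_up[self.active]:
            # Pick the first member that has link, or None if all are down.
            self.active = next((p for p in self.members if self.link_up[p]), None)


# Serialised maintenance on two upstream switches, one port per switch.
bond = ActiveBackupBond(["eth0", "eth1"])
bond.link_event("eth0", False)   # first switch goes down
print(bond.active)               # eth1
bond.link_event("eth0", True)    # first switch is back; no switch-over needed
bond.link_event("eth1", False)   # second switch goes down
print(bond.active)               # eth0
```

Because only link state is consulted, nothing is broadcast on the network and an IPv4-only problem cannot influence the choice of port.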

Philip Homburg

10:48 a.m.

On 2017/07/20 7:44, Tore Anderson wrote:
> I'd rather you spent that time implementing LLDP support, come to think of it. (That would be useful on the non-Anchor probes as well.)

I have a ticket open for LLDP on regular probes, though no time frame has been assigned for when it should be done. I assume the same code would work for anchors, but I haven't looked into that.
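As a rough picture of what minimal "device, port" LLDP support involves on the receiving side, the sketch below walks the TLVs of an LLDP data unit and picks out the chassis ID, port ID and system name. It is not the probe or anchor code, which this thread does not show. The TLV type numbers follow IEEE 802.1AB; the function names and the sample values are hypothetical.

```python
import struct


def parse_lldp_tlvs(payload: bytes) -> dict:
    """Extract chassis ID, port ID and system name from an LLDPDU.

    payload is the frame content after the LLDP EtherType (0x88cc).
    """
    info = {}
    offset = 0
    while offset + 2 <= len(payload):
        # Each TLV header is 16 bits: 7-bit type, 9-bit length.
        (header,) = struct.unpack_from("!H", payload, offset)
        tlv_type, length = header >> 9, header & 0x1FF
        value = payload[offset + 2: offset + 2 + length]
        offset += 2 + length
        if tlv_type == 0:                        # End of LLDPDU
            break
        if tlv_type == 1 and len(value) >= 2:    # Chassis ID: subtype + ID
            info["chassis_subtype"], info["chassis_id"] = value[0], value[1:]
        elif tlv_type == 2 and len(value) >= 2:  # Port ID: subtype + ID
            info["port_subtype"], info["port_id"] = value[0], value[1:]
        elif tlv_type == 5:                      # System Name
            info["system_name"] = value.decode("utf-8", "replace")
    return info


def tlv(tlv_type: int, value: bytes) -> bytes:
    """Build one TLV; helper used only to construct the sample below."""
    return struct.pack("!H", (tlv_type << 9) | len(value)) + value


# Hypothetical LLDPDU as an upstream switch might send it.
sample = (
    tlv(1, b"\x04" + bytes.fromhex("001122334455"))  # chassis ID, MAC address
    + tlv(2, b"\x05" + b"ge-0/0/1")                  # port ID, interface name
    + tlv(3, struct.pack("!H", 120))                 # TTL in seconds
    + tlv(5, b"switch-a")                            # system name
    + tlv(0, b"")                                    # end of LLDPDU
)

info = parse_lldp_tlvs(sample)
print(info["system_name"], info["port_id"].decode())  # switch-a ge-0/0/1
```

Receiving is only half of it: for the upstream switch to show which port an anchor or probe is attached to, the device would also need to transmit its own LLDP frames.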
