Hello,
I am trying to run a probe (#73) behind an IPv6 NAT. The probe gets an IP in a non-standard prefix, 66:113::/64, which is then NATed to a single GUA IPv6 address. This worked fine for a few months, but in April 2018 problems started: the probe would connect to Atlas for 5-10 minutes, then disconnect for about 20 minutes, then connect again, alternating over and over.
However, if IPv6 RAs are enabled some time after the probe is powered on (i.e. after the controller connection has already been established over IPv4), things seem to work fine: the controller stays connected over IPv4 for days, and various measurements are performed over IPv6 too. So it seems that only the controller connection is the problem.
I fully expect that you didn't account for such a setup in the controller infrastructure or possibly various "local IP is valid" checks and whatnot, but why did this work fine before? Has there been any change on your side in April that would break this kind of setup? -- With respect, Roman
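As an aside on what "non-standard prefix" means here, the sketch below uses Python's standard `ipaddress` module to check a prefix against the global unicast range (2000::/3) and the unique local range (fc00::/7). The function name `classify_prefix` and the labels it returns are made up for illustration; the only prefix taken from the message above is 66:113::/64.

```python
import ipaddress

# Well-known IPv6 ranges: global unicast (RFC 4291) and unique local (RFC 4193).
GLOBAL_UNICAST = ipaddress.ip_network("2000::/3")
UNIQUE_LOCAL = ipaddress.ip_network("fc00::/7")


def classify_prefix(prefix: str) -> str:
    """Return a rough label for an IPv6 prefix (hypothetical helper)."""
    net = ipaddress.ip_network(prefix)
    if net.subnet_of(GLOBAL_UNICAST):
        return "inside 2000::/3 (GUA range)"
    if net.subnet_of(UNIQUE_LOCAL):
        return "unique local (ULA)"
    return "outside GUA and ULA space"


if __name__ == "__main__":
    # 66:113::/64 is the prefix from the message; fd00:113::/64 is only an
    # example of what a ULA-based equivalent could look like.
    for p in ("66:113::/64", "fd00:113::/64"):
        print(p, "->", classify_prefix(p))
```

The prefix from the message falls in neither range, which is presumably why any "is this local address valid" style check could treat it unexpectedly.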
On Tue, Jun 12, 2018 at 3:42 AM Roman Mamedov <rm@romanrm.net> wrote:
I fully expect that you didn't account for such a setup in the controller infrastructure or possibly various "local IP is valid" checks and whatnot, but why did this work fine before? Has there been any change on your side in April that would break this kind of setup?
Many would argue that the setup is already broken in many ways and that this is just one bit of breakage that you happened to notice.
On 12/06/18 10:41, Lorenzo Colitti wrote:
On Tue, Jun 12, 2018 at 3:42 AM Roman Mamedov <rm@romanrm.net> wrote:
I fully expect that you didn't account for such a setup in the controller infrastructure or possibly various "local IP is valid" checks and whatnot, but why did this work fine before? Has there been any change on your side in April that would break this kind of setup?
Many would argue that the setup is already broken in many ways and that this is just one bit of breakage that you happened to notice.
So what is your recommendation for IPv6 multi-homing without BGP? -- James Andrewartha Network & Projects Engineer Christ Church Grammar School Claremont, Western Australia Ph. (08) 9442 1757 Mob. 0424 160 877
On Tue, Jun 12, 2018 at 11:44 AM James Andrewartha <jandrewartha@ccgs.wa.edu.au> wrote:
Many would argue that the setup is already broken in many ways and that this is just one bit of breakage that you happened to notice.
So what is your recommendation for IPv6 multi-homing without BGP?
For the home, homenet protocols. For small enterprises, draft-ietf-v6ops-conditional-ras, which is about to become an RFC. For larger enterprises, draft-ietf-rtgwg-enterprise-pa-multihoming.
On 12/06/18 10:47, Lorenzo Colitti wrote:
On Tue, Jun 12, 2018 at 11:44 AM James Andrewartha <jandrewartha@ccgs.wa.edu.au> wrote:
Many would argue that the setup is already broken in many ways and that this is just one bit of breakage that you happened to notice.
So what is your recommendation for IPv6 multi-homing without BGP?
For the home, homenet protocols.
RFC7368?
For small enterprises, draft-ietf-v6ops-conditional-ras, which is about to become an RFC. For larger enterprises, draft-ietf-rtgwg-enterprise-pa-multihoming
And what's the implementation support for these protocols like? Hmm, let's read draft-ietf-rtgwg-enterprise-pa-multihoming:
How a host should make good decisions about source address selection in a multihomed site is not a solved problem. We do not attempt to solve this problem in this document.
This is followed by a discussion of possible ways it might work if the routers can react to network changes by sending new RAs; I would love to know if there are any implementations that can do this. -- James Andrewartha Network & Projects Engineer Christ Church Grammar School Claremont, Western Australia Ph. (08) 9442 1757 Mob. 0424 160 877
On Tue, Jun 12, 2018 at 12:13 PM James Andrewartha <jandrewartha@ccgs.wa.edu.au> wrote:
This is followed by a discussion of possible ways it might work if the routers can react to network changes by sending new RAs; I would love to know if there are any implementations that can do this.
The more customers ask for it, the sooner and more widely it will be deployed. The alternative is to increase costs for application developers by requiring them to implement complex and brittle NAT traversal code, and to impose on users the resulting burden of flakier connectivity and the battery impact of NAT keepalives.
On 2018/06/11 20:42, Roman Mamedov wrote:
I fully expect that you didn't account for such a setup in the controller infrastructure or possibly various "local IP is valid" checks and whatnot, but why did this work fine before? Has there been any change on your side in April that would break this kind of setup?
I started a TCP traceroute on your probe toward its controller, but it consistently fails after hop 2. I have no idea if that is related to your NAT box.
https://atlas.ripe.net/measurements/14364300/
In general it is worth noting that Atlas has support for IPv4 NAT, such as keeping track of the public address of a probe. This support is not implemented for IPv6. So if you put a probe behind some kind of IPv6 NAT, the results may be confusing to other Atlas users. Philip
Hello,
Sorry for the noise; it appears the issue was due to a local misconfiguration after all. The GUA IP used for NAT66 was being removed from its interface and instantly re-added every time the DHCPv6 client renewed its lease from the ISP, which was every 30 minutes. Not surprisingly, that breaks all NAT66 mappings made from/to that IP.
As also noted in my private E-Mail to Stephen, one issue highlighted here is the lack of notification about such issues with the probe. For instance, I didn't notice that anything was wrong until almost a month later (and by then it was already difficult to trace these issues back to the configuration changes that I had made earlier), as there is no E-Mail notification for probe connection flapping, only for periods of disconnection lasting 60+ minutes. So perhaps it would be a good idea to consider adding more kinds of E-Mail notifications to Atlas in the future.
If anyone is curious as to why I use a NAT66 setup in the first place: my ISP only provides a /64, and that's already "spent" on my primary VLAN. I don't want to subnet the /64 further and have to run DHCPv6 everywhere. For security reasons I want to place the probe into its own separate VLAN. So that one, as well as numerous other guest/VM/untrusted VLANs, get their own 66:xxxx::/64s and are then NATed into the router's own IPv6 address.
Let me know if you have any better suggestions for these network conditions, aside from "Get a Different ISP" (only one ISP in my area provides IPv6) or "Demand More IPs" from the current ISP (very unlikely to be successful; they allocate only /64s even to business-class connections, and on residential those are also *dynamic*). -- With respect, Roman
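For anyone chasing a similar problem, the failure mode described above (an address deleted and immediately re-added on every lease renewal) can be spotted from a log of address events. The following is a minimal sketch only: the `AddrEvent` and `Bounce` types, the `find_bounces` function, and the 5-second window are all hypothetical, and how the events are collected on the router is deliberately left out.

```python
from typing import Dict, Iterable, List, NamedTuple


class AddrEvent(NamedTuple):
    ts: float      # timestamp in seconds
    action: str    # "add" or "del"
    address: str   # the IPv6 address the event refers to


class Bounce(NamedTuple):
    address: str
    removed_at: float
    readded_at: float


def find_bounces(events: Iterable[AddrEvent], max_gap: float = 5.0) -> List[Bounce]:
    """Find addresses that were deleted and re-added within max_gap seconds."""
    last_removed: Dict[str, float] = {}
    bounces: List[Bounce] = []
    for ev in sorted(events, key=lambda e: e.ts):
        if ev.action == "del":
            last_removed[ev.address] = ev.ts
        elif ev.action == "add" and ev.address in last_removed:
            removed_at = last_removed.pop(ev.address)
            if ev.ts - removed_at <= max_gap:
                bounces.append(Bounce(ev.address, removed_at, ev.ts))
    return bounces


if __name__ == "__main__":
    # Synthetic log: the address is bounced at each 30-minute lease renewal.
    # 2001:db8::1 is a documentation address, not the real one.
    log = [
        AddrEvent(0.0, "add", "2001:db8::1"),
        AddrEvent(1800.0, "del", "2001:db8::1"),
        AddrEvent(1800.2, "add", "2001:db8::1"),
        AddrEvent(3600.0, "del", "2001:db8::1"),
        AddrEvent(3600.1, "add", "2001:db8::1"),
    ]
    for b in find_bounces(log):
        print(f"{b.address} bounced at t={b.removed_at:.0f}s")
```

Even a sub-second bounce like this is enough to invalidate the NAT66 state tied to that address, so a periodic pattern in the output is a hint worth following up.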
Dear Roman,
As also noted in my private E-Mail to Stephen, one issue highlighted here is the lack of notification about such issues with the probe. For instance, I didn't notice that anything was wrong until almost a month later (and by then it was already difficult to trace these issues back to the configuration changes that I had made earlier), as there is no E-Mail notification for probe connection flapping, only for periods of disconnection lasting 60+ minutes. So perhaps it would be a good idea to consider adding more kinds of E-Mail notifications to Atlas in the future.
You can set a pretty aggressive (down to 10 minutes) notification threshold on the probe status page. I do not recommend this in general, since it's going to be way too noisy, but it may come in handy if you're testing something. Regards, Robert
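To make the effect of the threshold concrete, here is a rough sketch of how a disconnection threshold interacts with the flapping pattern reported in this thread (connected for 5-10 minutes, disconnected for about 20). It is not how Atlas implements its notifications; the `Session` type and both function names are assumptions for illustration, and the session times are invented to match the reported pattern.

```python
from typing import List, Tuple

# (connected_at, disconnected_at) in minutes; a hypothetical representation.
Session = Tuple[float, float]


def gaps_between(sessions: List[Session]) -> List[float]:
    """Return the length of each disconnected period between sessions."""
    ordered = sorted(sessions)
    return [nxt[0] - cur[1] for cur, nxt in zip(ordered, ordered[1:])]


def alerts_fired(sessions: List[Session], threshold_min: float) -> int:
    """Count disconnected periods that last at least threshold_min minutes."""
    return sum(1 for gap in gaps_between(sessions) if gap >= threshold_min)


if __name__ == "__main__":
    # Roughly 7-8 minutes connected, then about 20 minutes disconnected.
    flapping = [(0, 8), (28, 35), (55, 62), (82, 90)]
    print("gaps:", gaps_between(flapping))
    print("alerts at 60 min:", alerts_fired(flapping, 60))
    print("alerts at 10 min:", alerts_fired(flapping, 10))
```

With a 60-minute threshold none of the 20-minute gaps trigger anything, which matches the month of unnoticed flapping; a 10-minute threshold would flag every gap, which is also why it is noisy for general use.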
participants (5)
- James Andrewartha
- Lorenzo Colitti
- Philip Homburg
- Robert Kisteleki
- Roman Mamedov