Hi Matthieu,

On Wed, Jul 12, 2017 at 09:44:15PM +0200, Matthieu Herrb wrote:
Here at Tetaneutral.net we've been hosting a v1 Atlas probe since 2012 or so. A few weeks ago it started being marked as unconnected on the RIPE Atlas page: https://atlas.ripe.net/probes/523/
Since it's a v1 probe, I can only power cycle it to try to reconnect it, which I've done a number of times now, without success.
It's still answering ping requests on both its v4 and v6 addresses, and I can see some traceroute-like traffic going out of it.
It's marked as "firewall problems suspected", but we don't do any firewalling on it. On IPv4 it's behind a 1:1 NAT with all ports open and forwarded in both directions, and on IPv6 it's routed without any filtering. Also, we haven't changed anything in our infrastructure around the time it became disconnected.
All of this is similar to my past experience with v1 probe ID 114, including the reported FW version 4770 (according to the RIPE Atlas portal) at the time of the outage.
I'm starting to suspect a firmware upgrade problem like the one mentioned by Alun Davis in https://www.ripe.net/ripe/mail/archives/ripe-atlas/2017-June/003329.html
Any suggestions before I send it back and ask for a new one?
I tried only a few (two?) power cycles, to no avail. Finally, I got around to snooping on the probe's network activity during a reboot. Following the last power cycle, v1 probe 114:

- acquired a public v4 IP address (DHCP) and a v6 IP address (SLAAC), recursive name server IPs, etc.;
- requested a translation of ntp.atlas.ripe.net and got back responses (CNAME ntp.ripe.net., A 193.0.0.229);
- synced time from ntp.ripe.net (193.0.0.229, port 123, NTPv3);
- requested a translation of U19.M${probe-mac-address}.sos.atlas.ripe.net. (both AAAA and A records) and got back v6 and v4 addresses in response (2001:67c:2e8:11::c100:1337 and 193.0.19.55);
- sent an ICMP echo request to the SOS host (193.0.19.55) and got back a reply;
- requested a translation of ctr-ams07.atlas.ripe.net and got back an answer (v6 2001:67c:2e8:11::c100:1373);
- talked to the controller 2001:67c:2e8:11::c100:1373 on TCP port 443, likely initiating a new FW download;
- successfully rebooted into the newest FW, 4790.

Luckily, that was the end of the outage for ID 114. I hope your 523 can get back online as well, and that v1 probes will keep running at least until the v4 version is available as a replacement.

Kind regards,
Martin
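
P.S. In case it helps to rule out the network path, here is a rough Python sketch that repeats the DNS and TCP parts of the sequence above from another host on the same segment as the probe. It is only an approximation: the function names are my own, it does not cover DHCP/SLAAC, NTP or the ICMP echo, and the controller name is the one probe 114 happened to use, so your probe may be directed to a different one.

```python
import socket


def resolve(name, family=socket.AF_UNSPEC):
    """Return the sorted addresses `name` resolves to, or [] if lookup fails."""
    try:
        infos = socket.getaddrinfo(name, None, family, socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def tcp_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_probe_path(ntp="ntp.atlas.ripe.net",
                     controller="ctr-ams07.atlas.ripe.net"):
    """Repeat the name lookups and the controller connection seen in the capture.

    The host names are taken from the capture of probe 114 (an assumption
    for any other probe).
    """
    return {
        "ntp_resolves": bool(resolve(ntp)),
        "controller_v6": resolve(controller, socket.AF_INET6),
        "controller_tcp_443": tcp_reachable(controller, 443),
    }


if __name__ == "__main__":
    for step, result in check_probe_path().items():
        print(f"{step}: {result}")
```

If every step succeeds from a neighbouring host but the probe still stays disconnected, that would point away from filtering and towards the probe itself.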