Hi all,

I have a probe that's been down for several days now. I did contact RIPE about it, but support has been superficial. I also haven't yet found the level of technical detail I need in the information on the website or in this list's archives.

The 8GB drive, when reformatted on a Windows box with FAT32 as per support instructions, surprisingly only comes out with a 1GB partition. Inspecting the drive under Linux, I find multiple partitions, with only the first containing a FAT32 filesystem:

    Disk /dev/sdc: 7927 MB, 7927234560 bytes
    244 heads, 62 sectors/track, 1023 cylinders, total 15482880 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0xc3072e18

       Device Boot      Start         End      Blocks   Id  System
    /dev/sdc1            2048     2099199     1048576    b  W95 FAT32
    /dev/sdc2         2099200     4196351     1048576   83  Linux
    /dev/sdc3         4196352     6293503     1048576   83  Linux

At first glance, the others contain ext2 filesystems

    # fsck -N /dev/sdc?
    fsck from util-linux 2.20.1
    [/sbin/fsck.vfat (1) -- /dev/sdc1] fsck.vfat /dev/sdc1
    [/sbin/fsck.ext2 (2) -- /dev/sdc2] fsck.ext2 /dev/sdc2
    [/sbin/fsck.ext2 (3) -- /dev/sdc3] fsck.ext2 /dev/sdc3

but on further inspection they do not contain a valid magic number in their superblocks, so they are effectively unformatted. I would therefore suspect that they are not being used by the probe.

Now the question: What is the correct layout supposed to look like (primary/secondary, size, ID, FS-type)? I would like to get my probe back up and running.

Bonus question: What can I expect to find in terms of files/data on the drive that would tell me whether the probe was working correctly until I cut power and removed the drive?

Thanks,
Michael
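For readers who want to repeat the superblock check described above, here is a minimal sketch. It relies on the documented ext2/3/4 layout: the superblock starts at byte offset 1024 and its magic field sits 56 bytes into it, so a formatted partition holds the bytes 53 ef (0xEF53, little-endian) at offset 1080. The function names `ext_magic` and `has_ext_magic` and the scratch image file are assumptions made for illustration; on a real system you would pass a partition such as /dev/sdc2 instead.

```shell
#!/bin/sh
# ext_magic TARGET: print the two bytes at offset 1080 of a file or
# block device as hex. ext2/3/4 store 53 ef there (0xEF53, little-endian).
ext_magic() {
    dd if="$1" bs=1 skip=1080 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n'
}

# has_ext_magic TARGET: succeed only if the ext magic number is present.
has_ext_magic() {
    [ "$(ext_magic "$1")" = "53ef" ]
}

# Demo on a scratch image file (a hypothetical stand-in for /dev/sdc2).
img=$(mktemp)
dd if=/dev/zero of="$img" bs=1024 count=4 2>/dev/null
if has_ext_magic "$img"; then echo "ext magic found"; else echo "no ext magic"; fi

# Write the two magic bytes by hand (octal 123 = 0x53, octal 357 = 0xEF)
# to show the positive case.
printf '\123\357' | dd of="$img" bs=1 seek=1080 conv=notrunc 2>/dev/null
if has_ext_magic "$img"; then echo "ext magic found"; else echo "no ext magic"; fi
rm -f "$img"
```

A missing magic number does not by itself prove that a partition is unused; it only shows that there is no plain ext filesystem at that location.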
Hi, On Wed, Mar 30, 2016 at 11:14:11AM +0200, Michael Ionescu wrote:
> Now the question: What is the correct layout supposed to look like (primary/secondary, size, ID, FS-type)? I would like to get my probe back up and running.
    dd if=/dev/zero of=/dev/<usbstick> bs=1m count=1

Just zero out the partition sector and the probe will re-initialize everything (boot up without the flash stick, re-insert the emptied flash stick, wait an hour or so).

This REALLY needs to go into the FAQ that is sent with every "hey, your probe is down!" mail. Plus, "we can hear its SOS requests, and it needs a flash replacement".

RIPE NCC folks, are you listening? This is a major annoyance.

Gert Doering
        -- NetMaster
--
have you enabled IPv6 on something today...?

SpaceNet AG                        Vorstand: Sebastian v. Bomhard
Joseph-Dollinger-Bogen 14          Aufsichtsratsvors.: A. Grundner-Culemann
D-80807 Muenchen                   HRB: 136055 (AG Muenchen)
Tel: +49 (0)89/32356-444           USt-IdNr.: DE813185279
Thanks Gert, that helped! The probe was back up after only a couple of minutes.

Until the FAQ has been amended, the following might help others.

To reset the USB thumb drive:

Under Windows:
    https://tails.boum.org/doc/first_steps/reset/windows/index.en.html
    (stop after the "clean" step)

Under Linux:
    dd if=/dev/zero of=/dev/<usbstick> bs=512 count=1

Plug the thumb drive back into the probe AFTER the probe has rebooted.

The SOS history (on the probe's RIPE Atlas web page, Network tab) should show entries with "NO USB" after the reboot and then regular entries with no errors once the thumb drive has been plugged back in.

Michael
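The Linux reset step can be rehearsed safely on an image file before pointing dd at a real device. The sketch below is an illustration under assumptions: the helper names `wipe_first_sector` and `mbr_signature` and the scratch image are invented here, and the check depends only on the fact that a sector holding an MBR partition table ends in the bytes 55 aa at offset 510. As far as I know, GNU dd spells a one-mebibyte block size `bs=1M`, while the lowercase `bs=1m` is the BSD spelling; the `bs=512 count=1` form avoids the difference.

```shell
#!/bin/sh
# wipe_first_sector TARGET: overwrite the first 512 bytes with zeros.
# conv=notrunc keeps the rest of an image file intact; it makes no
# difference on a block device.
wipe_first_sector() {
    dd if=/dev/zero of="$1" bs=512 count=1 conv=notrunc 2>/dev/null
}

# mbr_signature TARGET: print the two bytes at offset 510 as hex.
# A sector containing an MBR partition table ends in 55 aa.
mbr_signature() {
    dd if="$1" bs=1 skip=510 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n'
}

# Demo on a scratch image (a hypothetical stand-in for /dev/<usbstick>).
img=$(mktemp)
dd if=/dev/zero of="$img" bs=512 count=8 2>/dev/null
# Fake an MBR signature (octal 125 = 0x55, octal 252 = 0xAA).
printf '\125\252' | dd of="$img" bs=1 seek=510 conv=notrunc 2>/dev/null
echo "before: $(mbr_signature "$img")"   # prints "before: 55aa"
wipe_first_sector "$img"
echo "after:  $(mbr_signature "$img")"   # prints "after:  0000"
rm -f "$img"
```

When running this against a real stick, double-check the device name first (for example with lsblk), since dd overwrites whatever it is pointed at without asking.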
Hi Michael,
> but on further inspections do not contain a valid magic number in their superblocks, so they are effectively unformatted. I would therefore suspect that they are not being used by the probe.
>
> Now the question: What is the correct layout supposed to look like (primary/secondary, size, ID, FS-type)? I would like to get my probe back up and running.
In general, the probe doesn't care what is on the USB stick, so wiping or formatting the stick is not needed.

There is at the moment one rare exception: the filesystem can become so corrupt that fsck hangs. In that case, wiping the entire USB stick will help.

To get the probe to reinitialise the USB stick, the correct procedure is to power on the probe without the USB stick and then insert the USB stick after a few minutes (10 to be safe).

The procedure suggested by Gert Doering will also work, but should not be necessary.

The filesystems on the USB stick are encrypted; that is why you can't find any valid magic numbers.
> Bonus question: What can I expect to find in point of files/data on the drive, that would tell me whether the probe was working correctly until I cut power and removed the drive?
You cannot find anything on the drive. Only the probe has the encryption keys.

Currently the probes use an ext2 filesystem, because it is simple and seemed to work well enough. It held up well when we artificially power-cycled the probe for a period of time. Of course, reality is worse. At some point I'll try to experiment with adding journaling to see if it makes a difference.

Philip
Hi, On Wed, Mar 30, 2016 at 01:50:33PM +0200, Philip Homburg wrote:
> Currently the probes use an ext2 filesystem, because it is simple and seemed to work well enough. It worked well in artificially power cycling the probe for a period of time. Of course, reality is worse.
This might actually be the reason for the flash issues. From what I could gather, these USB stick controllers understand FAT well enough to do wear-leveling for it, but are not as smart as SSDs, which handle "arbitrary block access patterns" properly. So using non-FAT filesystems will make USB sticks (and SD cards, for that matter) wear out much faster.

Gert Doering
        -- NetMaster
Hi Philip,

thanks for clearing this up.

It would be really great to take some of the guesswork out of these cases by adding one or two checks regarding the thumb drive and filesystem status, and transmitting negative results via descriptive SOS messages and/or system status tags.

On my probe the USB problems led to a system tag "Firewall problem suspected". This is quite misleading. Given your description, I suspect the FS issue impeded the probe so deeply that it wasn't able to call home at all anymore.

To my understanding, probes should be fire-and-forget as much as possible, so I think any fsck should be run in a way that would not impede an otherwise functioning probe.

Michael

On 30.03.2016 13:50, Philip Homburg wrote:
> Hi Michael,
> [...]
> In general, the probe doesn't care what is on the USB stick. So wiping or formatting the stick is not needed.
>
> There is at the moment one rare exception. The filesystem can go corrupt to the point the fsck hangs.
> [...]
> Philip
Hi,

I need to follow up on this once more...

The same probe had been working again for several weeks after it reformatted its drive, but I have now found the drive to be read-only. After replacing the drive with a new one, the probe is once again fine.

While troubleshooting I found that the probe does not connect as long as there is no functioning drive present. This makes troubleshooting unnecessarily difficult. I would suggest adding a couple of (SOS?) messages that would be sent regardless of the presence of a drive, such as:

- detected no USB drive on bootup
- detected insertion of USB drive
- detected removal of USB drive
- detected read-only USB drive
- Starting fsck on USB drive
- Ended fsck on USB drive (possibly indicating success/failure)
- Starting to reformat USB drive
- Ended reformatting USB drive (possibly indicating success/failure)

and possibly something that would point towards a corrupt FS, if that is easily detectable, for instance by checking the integrity of a known file within the FS.

Just my 2ct.

Michael

On 30.03.2016 13:50, Philip Homburg wrote:
> Hi Michael,
> [...]
> In general, the probe doesn't care what is on the USB stick. So wiping or formatting the stick is not needed.
>
> There is at the moment one rare exception. The filesystem can go corrupt to the point the fsck hangs.
> [...]
> Philip
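Two of the checks proposed in the message above (no drive present, drive read-only) can be sketched in a few lines of shell. This is purely illustrative and says nothing about how the probe firmware actually works: the function name `drive_status`, the idea of testing a mount point directory, and the "USB drive writable" message are all assumptions; only the two "detected ..." texts are borrowed from the proposal.

```shell
#!/bin/sh
# drive_status DIR: print one status line for the storage mounted at DIR.
# DIR is a hypothetical mount point; this does not reflect probe firmware.
drive_status() {
    if [ ! -d "$1" ]; then
        echo "detected no USB drive"
        return
    fi
    # Try to create and remove a scratch file to detect read-only storage.
    # The subshell keeps a failed redirection from aborting the script.
    probe_file="$1/.rw_test.$$"
    if ( : > "$probe_file" ) 2>/dev/null && rm -f "$probe_file"; then
        echo "USB drive writable"
    else
        echo "detected read-only USB drive"
    fi
}

# Demo with a temporary directory standing in for the mount point.
mnt=$(mktemp -d)
drive_status "$mnt"
drive_status "$mnt/not-there"
rm -rf "$mnt"
```

A write test like this catches a drive that the kernel has remounted read-only after I/O errors, but it would not by itself detect the hanging-fsck case, which would need a timeout around the check.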
participants (3)
- Gert Doering
- Michael Ionescu
- Philip Homburg