Re: [atlas] One-off measurements not terminating

30 Dec 2019

An update: I was able to ‘delete’ my stuck measurements via the API, so they’re stopped now and I’m back up and running for the moment.

I also added an API call to my code that ‘deletes’ each measurement as soon as its results have been picked up. I hoped that would make this fix sustainable, but so far it doesn’t seem to have any effect. Perhaps a longer delay is required between creating the measurement and sending the ‘delete’ command?
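
For anyone who wants to try the same workaround, here is a minimal sketch of the ‘delete’ call with a growing delay between attempts. It assumes that a measurement is stopped with an HTTP DELETE to the v2 measurements endpoint and that the API key goes in an "Authorization: Key ..." header. The function names and the delay values (0, 30 and 120 seconds) are my own guesses for illustration, not anything documented, and I have not confirmed that a longer delay actually helps.

```python
import time
import urllib.error
import urllib.request

# Assumed base URL of the Atlas v2 measurements endpoint.
ATLAS_MEASUREMENTS = "https://atlas.ripe.net/api/v2/measurements"


def build_stop_request(msm_id, api_key):
    """Build (but do not send) the DELETE request that asks Atlas to stop a measurement."""
    return urllib.request.Request(
        f"{ATLAS_MEASUREMENTS}/{msm_id}/",
        headers={"Authorization": f"Key {api_key}"},
        method="DELETE",
    )


def stop_with_retries(msm_id, api_key, delays=(0, 30, 120),
                      send=urllib.request.urlopen, sleep=time.sleep):
    """Send the stop request, sleeping for delays[i] seconds before attempt i+1.

    Returns the number of the attempt that succeeded. The delays are
    hypothetical; tune them to whatever turns out to work.
    """
    last_error = None
    for attempt, delay in enumerate(delays, start=1):
        sleep(delay)
        try:
            with send(build_stop_request(msm_id, api_key)):
                return attempt
        except urllib.error.URLError as err:  # also covers HTTP error statuses
            last_error = err
    raise RuntimeError(f"could not stop measurement {msm_id}") from last_error
```

Calling stop_with_retries(23722197, "YOUR_API_KEY") would then make up to three attempts. Note that this only retries when the request itself is rejected; if Atlas accepts the DELETE but the measurement keeps running, a retry will not reveal that.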

Thanks,
Steve
On Dec 28, 2019, at 3:20 PM, Steve Gibbard <scg@gibbard.org> wrote:
Hi Atlas folks,
I hope you’re having a good holiday season.  Sorry to interrupt it by complaining about issues.
On Christmas Eve my time (early Christmas morning your time) there was an Atlas issue where any attempt at reading measurements failed with an HTTP 500 status error.  That appears to have been fixed on Christmas Day (a really big thank you to whoever worked on that).  Since then, though, most of the one-off measurements we’ve created have delivered results very quickly, but none of the measurements created since 17:00 UTC on 2019-12-25 have stopped running.  As shown in the Atlas portal:
23722197  Traceroute  www.globaltraceroute.com (AS13335)  Test Traceroute  1  one-off  2019-12-25 22:24  Never
23722089  Traceroute  archive.ubuntu.com (AS41231)        Test Traceroute  1  one-off  2019-12-25 19:16  Never
23722088  Traceroute  sps.prima.com.ar (AS10318)          Test Traceroute  1  one-off  2019-12-25 19:14  Never
23721915  Traceroute  www.globaltraceroute.com (AS13335)  Test Traceroute  1  one-off  2019-12-25 17:00  Never
And so on for every measurement between then and now.
Previously, the typical one-off measurement was listed with start and stop times less than 10 minutes apart.
When a user has 100 measurements running concurrently, creation of new measurements fails, which is happening for me now.
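
To make the limit concrete, here is a rough sketch of the check a client could run before creating a new measurement. It assumes each measurement record returned by the API carries a "status" object with a numeric "id", and that ids 0, 1 and 2 (Specified, Scheduled, Ongoing) are the ones that count against the limit; both of those are my assumptions, and the helper names are made up for illustration. The figure of 100 is the limit I am running into.

```python
MAX_CONCURRENT = 100           # concurrent-measurement limit described above
ACTIVE_STATUS_IDS = {0, 1, 2}  # assumed: Specified, Scheduled, Ongoing


def active_measurements(measurements):
    """Return the ids of measurements that have not yet stopped."""
    return [m["id"] for m in measurements
            if m["status"]["id"] in ACTIVE_STATUS_IDS]


def can_create(measurements, wanted=1):
    """True if creating `wanted` more measurements would stay within the limit."""
    return len(active_measurements(measurements)) + wanted <= MAX_CONCURRENT


# Example records, shaped the way I assume the API returns them.
sample = [
    {"id": 23722197, "status": {"id": 2, "name": "Ongoing"}},
    {"id": 23721000, "status": {"id": 4, "name": "Stopped"}},
]
print(active_measurements(sample))  # only the first record is still active
print(can_create(sample))
```

With every one-off since 2019-12-25 17:00 UTC stuck in a running state, a check like this returns False once 100 of them have accumulated.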
If somebody could take a look at this, I’d really appreciate it.
Thanks,
Steve