Proposal: Remove support for non-public measurements [ONLY-PUBLIC]
Dear RIPE Atlas users,
We recently published a RIPE Labs article containing a few proposals: https://labs.ripe.net/author/kistel/five-proposals-for-a-better-ripe-atlas/. We'd like to encourage you to express your comments about this proposal (if you'd like to share them) here.
Regards,
Robert Kisteleki
For the RIPE Atlas team
Hello,
From the linked page:
A total of 173 users scheduled at least one, 81 users have at least two, one specific user scheduled 91.5% of all of these.
That is surprising. What do those numbers look like if you zoom out to the past 6/12/24 months? If you can count on one hand the number of users using >90% of the private measurements over a longer timeframe than two weeks, then I submit that the choice is clear. Cheers, Alex
On 15. 12. 22 6:57, Alexander Burke via ripe-atlas wrote:
Hello,
From the linked page:
A total of 173 users scheduled at least one, 81 users have at least two, one specific user scheduled 91.5% of all of these.
That is surprising. What do those numbers look like if you zoom out to the past 6/12/24 months?
If you can count on one hand the number of users using >90% of the private measurements over a longer timeframe than two weeks, then I submit that the choice is clear.
I concur. Ad:
Cons
Users who were specifically using RIPE Atlas because of this feature will stop using the service. Other users may reduce / change their use of the service and perhaps ultimately disengage completely. As a result we could lose some connected probes, as their hosts no longer see value in keeping them connected.
I suppose with such a small set of users of this feature, it should be possible to intersect the set of users with the set of probe owners and see how many probes would be affected in the worst case, that is, if all of those users withdrew all of their probes. If it is not a _significant_ part of the network, I would say ... "Nuke it from orbit."
-- Petr Špaček
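The worst-case check suggested above amounts to a set intersection. A minimal sketch, assuming hypothetical inputs: a set of user IDs who scheduled non-public measurements and a mapping from probe ID to the hosting user's ID. None of these names come from the RIPE Atlas API, and the sample data is made up purely for illustration.

```python
def probes_at_risk(private_users, probe_owner_by_id):
    """Probe IDs hosted by users who scheduled non-public measurements."""
    return {
        probe_id
        for probe_id, owner in probe_owner_by_id.items()
        if owner in private_users
    }


def worst_case_share(private_users, probe_owner_by_id):
    """Fraction of all probes lost if every such user withdrew every probe."""
    if not probe_owner_by_id:
        return 0.0
    at_risk = probes_at_risk(private_users, probe_owner_by_id)
    return len(at_risk) / len(probe_owner_by_id)


if __name__ == "__main__":
    # Hypothetical data: two users of the feature, four connected probes.
    users = {"u1", "u2"}
    owners = {1: "u1", 2: "u3", 3: "u2", 4: "u4"}
    print(sorted(probes_at_risk(users, owners)))
    print(worst_case_share(users, owners))
```

Whether the resulting share counts as "significant" is a policy judgment that the sketch does not try to answer.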
The “one specific user” is www.globaltraceroute.com, an Atlas front end I created a few years ago that does instant one-off traceroutes, pings, and DNS lookups. It has become a well-used operational troubleshooting tool. It operates as a free service, with operational costs slowly draining the bank account of my mostly defunct consulting firm, and Atlas credits coming from a couple of probes I operate plus some other donors.
What I think is worth considering here:
Atlas, and the RIPE NCC, have two fairly separate constituencies: researchers and operators.
In the research community, there’s an expectation that data be made available so that others can review it, judge the validity of research results, do their own work based on it, etc. Atlas, in its native form with long running repeatable measurements, gets a lot of research use. It is useful to have those datasets available to the public.
The operations use case for Global Traceroute is different. It’s generally either “I set up a new CDN POP, and want to make sure the networks it’s supposed to serve are getting there on a direct path,” or “some other source is telling me my performance is bad from ISP X, and I want to know why.” Instead of calling the other ISP’s NOC or trying to track down a random customer to help troubleshoot, they can do a one-off traceroute, see where the traffic is going, and hopefully figure out what to adjust or who to talk to to fix the situation.
Making those operational troubleshooting results public may not be worthwhile. The results themselves, being one-offs, are not something it would be all that interesting to track over time. If anybody does want a one-off traceroute to a particular target, they can go get it themselves. It is pretty obvious who is doing the traceroutes. If there are a bunch of traceroutes to a certain CDN operator’s services, they almost always come from that CDN operator’s corporate network, so it does show who is concerned about network performance issues in certain regions at certain times. That info might be potentially interesting: color for a news story about an outage saying “as the outage unfolded, engineers from company X unleashed a series of measurements to see paths into their network from region Y.” Given the fear a lot of companies have about releasing internal information to the public, I worry that that would have a “chilling effect” on use of the service.
So, I think Global Traceroute and Atlas are together a useful operational tool. I think it’s made more useful by setting is_public to false in the query. I’d really like to be able to continue proxying non-public one-off measurements into Atlas.
Thanks,
Steve
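For readers unfamiliar with the flag mentioned above, here is a rough sketch of what a non-public one-off traceroute request body might look like. Only is_public comes from this thread; the other field names ("definitions", "probes", "is_oneoff", and so on) follow my recollection of the RIPE Atlas measurement-creation API and should be checked against the official documentation. The helper name and the target are hypothetical, and nothing is sent over the network.

```python
import json


def build_oneoff_traceroute(target, probe_count=5, is_public=False):
    """Build (but do not send) a one-off traceroute request body.

    Field names other than is_public are assumptions, not confirmed
    by this thread.
    """
    return {
        "definitions": [
            {
                "type": "traceroute",
                "af": 4,
                "target": target,
                "protocol": "ICMP",
                "description": f"One-off traceroute to {target}",
                "is_public": is_public,  # the flag under discussion
            }
        ],
        # Ask for a handful of probes from anywhere in the world.
        "probes": [{"requested": probe_count, "type": "area", "value": "WW"}],
        "is_oneoff": True,
    }


if __name__ == "__main__":
    print(json.dumps(build_oneoff_traceroute("example.net"), indent=2))
```

If the proposal were adopted, a front end like this would presumably have to drop the flag or set it to true.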
Hi,
One idea could be that "small" tests can be non-public, but when the setup passes some limits it has to be public. I understand your /business needs/ versus the public interest.
The definition of a "small" test could be something like:
- maximum 20 probes
- runs for maximum 60 minutes
- is repeated maximum 10 times (if you run every 300sec you have to limit the endtime to 50 minutes)
With such limits you can troubleshoot things (non-publicly) but you can't build your monitoring system on top of that.
Regards,
// mem
Hi,
The definition of a "small" test could be something like; - maximum 20 probes - runs for maximum 60 minutes - is repeated maximum 10 times (if you run every 300sec you have to limit the endtime to 50 minutes)
What would you propose the API to do if this rule is violated? Should it outright refuse the measurement, or should it silently turn on the public flag (silently, because even if we emit a warning, it is probably never seen by a human...)?
With such limits you can troubleshoot things (non-publicly) but you can't build your monitoring system on top of that.
I guess this hinges on what level of support (legal, technical or other) we're aiming for when it comes to others building services on top of RIPE Atlas. Regards, Robert
On 2022-12-16 at 12:56, Robert Kisteleki wrote:
Hi,
The definition of a "small" test could be something like; - maximum 20 probes - runs for maximum 60 minutes - is repeated maximum 10 times (if you run every 300sec you have to limit the endtime to 50 minutes)
What would you propose the API to do if this rule is violated? Should it outright refuse the measurement, or should it silently turn on the public flag (silently, because even if we emit a warning, it is probably never seen by a human...)?
I think the safest thing would be to refuse with an error message. Just overriding a setting with only a warning is probably not enough, as you wrote.
With such limits you can troubleshoot things (non-publicly) but you can't build your monitoring system on top of that.
I guess this hinges on what level of support (legal, technical or other) we're aiming for when it comes to others building services on top of RIPE Atlas.
I really love RIPE Atlas and all the possibilities that come with an open and mature platform. I think as much as possible should be open and reusable by others. However, I understand that some people in some cases have other wishes. Regards, // mem
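The "small test" limits and the refuse-with-an-error behaviour discussed above could be sketched as follows. This is only an illustration under stated assumptions: the class and function names are hypothetical, the thresholds are the ones floated in this thread (20 probes, 60 minutes, 10 runs), and nothing here reflects how the RIPE Atlas backend actually validates requests.

```python
from dataclasses import dataclass

# Thresholds as proposed in the thread; not an actual RIPE Atlas policy.
MAX_PROBES = 20
MAX_DURATION_S = 60 * 60
MAX_RUNS = 10


class MeasurementRejected(ValueError):
    """Raised instead of silently flipping the measurement to public."""


@dataclass
class MeasurementRequest:
    probes: int
    interval_s: int  # seconds between runs; 0 for a one-off
    duration_s: int  # time from start to stop; 0 for a one-off
    is_public: bool


def planned_runs(req):
    """Number of runs implied by duration and interval (one-off counts as 1)."""
    if req.interval_s <= 0 or req.duration_s <= 0:
        return 1
    # Ceiling division without floats.
    return max(1, -(-req.duration_s // req.interval_s))


def validate(req):
    """Accept public requests as-is; refuse oversized non-public ones."""
    if req.is_public:
        return
    problems = []
    if req.probes > MAX_PROBES:
        problems.append(f"{req.probes} probes exceeds {MAX_PROBES}")
    if req.duration_s > MAX_DURATION_S:
        problems.append(f"duration {req.duration_s}s exceeds {MAX_DURATION_S}s")
    if planned_runs(req) > MAX_RUNS:
        problems.append(f"{planned_runs(req)} runs exceeds {MAX_RUNS}")
    if problems:
        raise MeasurementRejected(
            "non-public measurement too large: " + "; ".join(problems)
        )
```

With these numbers, a 300-second interval over 50 minutes gives 10 runs and passes, while the same interval over 60 minutes gives 12 runs and is refused, which matches the example given with the proposal.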
On 15. 12. 22 19:41, Steve Gibbard wrote:
I worry that that would have a “chilling effect” on use of the service.
I hear your concerns, and have a proposal for how to quantify them:
- First, amend the page to say that all the data are public. (Possibly also switch the flag, but that can be a separate step.)
- Second, observe what has changed in the usage pattern.
- Third, evaluate.
That way we don't need to stay in limbo over hypothetical situations but get real data.
Side note about usefulness of one-off measurement history: I think it _might_ be interesting for anyone doing a study on any given outage, or even a study about optimization practices over time. For example, the DNS community has a service called DNSViz which does just one-off measurements, and yet researchers come and write papers based on data from DNSViz.
HTH.
-- Petr Špaček
Hi, On Thu, Dec 15, 2022 at 10:41:42AM -0800, Steve Gibbard wrote:
Atlas, and the RIPE NCC, have two fairly separate constituencies: researchers and operators.
This.
Operators (like me) are willing to host Atlas anchors and probes, and thus contribute to the system.
I might be troubleshooting something in our network where I have no interest in making the results public. So I value the option to have non-public measurements.
There's no "right to see all measurements" here - if someone wants to see something, they are free to run their own measurements with their own credits. What I do with my credits (which do not come for free) and who can see the results should be my decision.
Gert Doering -- NetMaster
--
have you enabled IPv6 on something today...?
SpaceNet AG Vorstand: Sebastian v. Bomhard, Michael Emmer
Joseph-Dollinger-Bogen 14 Aufsichtsratsvors.: A. Grundner-Culemann
D-80807 Muenchen HRB: 136055 (AG Muenchen)
Tel: +49 (0)89/32356-444 USt-IdNr.: DE813185279
Hi, On Fri, Dec 16, 2022 at 07:13:25PM +0100, Gert Doering wrote:
There's no "right to see all measurements" here - if someone wants to see something, they are free to run their own measurements with their own credits. What I do with my credits (which do not come for free) and who can see the results should be my decision.
... I could see a compromise here, making non-public measurements require more credits, or so...
Gert Doering -- NetMaster
Sorry Gert, but I strongly disagree.
I might be troubleshooting something in our network where I have no interest in making the results public.
RIPE Atlas is primarily designed for INTERnet measurements, not Intranet: to see how things look from other networks. If you want to troubleshoot something inside your own network and need privacy, you can do it the classic way, by using your internal test equipment or monitoring.
There's no "right to see all measurements" here
I highly appreciate the RIPE NCC's efforts for maximum transparency and open data, and I hope this will never change. This is what Atlas was intended to be (as far as I can tell).
if someone wants to see something, they are free to run their own measurements with their own credits.
On this point, I disagree as well. It shouldn't be necessary to run the same measurement multiple times. For what reason? It would be a waste of credits, time and resources. Also, keep in mind that this redundant data has to be stored somewhere.
If one has a problem with open-data philosophy, he shouldn't participate in this project.
_There's only one exception_ I could agree with: measurements can be private if you only deploy your own probes/anchors for that measurement. But if you decide to use other people's probes, the results should be public.
BR,
Simon
Hi, On Fri, Dec 16, 2022 at 08:07:14PM +0100, ripe.net@toppas.net wrote:
Sorry Gert, but i strongly disagree.
I might be troubleshooting something in our network where I have no interest in making the results public.
RIPE Atlas is primarily designed for INTERnet measurements, not Intranet. To see, how things look like from other networks. If you want to troubleshoot something inside your own network and need privacy, you can do it the classic way, by using your internal test-equipment or monitoring.
I am an ISP. So I *must* see how things look like from the outside - but in case I've done stupid things to my own networks before, I might not want the world to see that.
There's no "right to see all measurements" here
I highly appreciate RIPE NCCs efforts for maximum transparency and open-data, and i hope this will never change. This is what Atlas was intended to be (as far as i can tell).
All my probes do public and open measurements "for everyone", and they are happy to do measurements for you. And I'm happy to bear the costs for that (an Atlas Anchor costs real money).
if someone wants to see something, they are free to run their own measurements with their own credits.
In this point, i disagree as well. It shouldn't be necessary to run the same measurement multiple times. For what reason? It would be a waste of credits, time and resources. Also, keep in mind that this redundant data has to be stored somewhere.
You wouldn't run the same measurements that I want to do, if I have to do some troubleshooting.
If one has a problem with open-data philosophy, he shouldn't participate in this project.
_There's only one exception_ i could agree with: measurements can be private, if you only deploy your own probes/anchors for that measurement. But if you decide to use other peoples probes, the results should be public.
This is a valid opinion for a network researcher, I guess. For those that actually run the show here, we might see things in a different light... (as I'm not the only one explaining the desire for private measurements).
Gert Doering -- NetMaster
i do not understand the big fuss here. so a teensie fraction of probes are not public. big deal. some weeks we seem to want to be maxed out regulators and keepers of the total moral truth. the net is complex and rich. this is a good thing. randy
On Fri, Dec 16, 2022 at 02:57:40PM -0800, Randy Bush <randy@psg.com> wrote a message of 12 lines which said:
i do not understand the big fuss here. so a teensie fraction of probes are not public. big deal.
Apparently, the problem is that the cost of private measurements is not zero: it makes the Atlas backend code much more complex, with added security issues (ensuring that the private measurements stay private). I'm sure we all agree that complexity is something to be reduced. Also, this is not just operators vs. researchers. We are an operator (.fr name registry) and we never use private measurements.
RIPE Atlas is primarily designed for INTERnet measurements, not Intranet.
Emphasis on the "primarily": that doesn't directly mean private measurements shouldn't exist.
If one has a problem with open-data philosophy, he shouldn't participate in this project.
The existence of a token system already incentivizes contributing, even without making good public measurements that contribute open data back.
It shouldn't be necessary to run the same measurement multiple times. For what reason?
Because things change in the wild?
There's only one exception
Well... So in addition to the lack of a good basis in principle, non-private measurements might also pose an unnecessary additional information leak. In addition to that, many measurements are just absolutely useless for others. I am also certain there are people who only contribute (by hosting a probe) for that reason: running a few private measurements.
Hello,
I support Gert here: network operators (LIRs) can have valid reasons to make some of their measurements non-public. So I don't support removal of this feature. It's a bad idea...
If some probe host has a problem with that, why not mark such probes as not available for private measurements (this can be implemented easily)? And I think only a minority of probes will be marked like that. The majority of hosts will not care at all...
And keep in mind that Atlas is funded by LIRs and their money. All the big-data infrastructure (and also the making of hardware probes) costs real (and not small) money. The existence of private measurements might be one reason why LIRs allow spending money on this useful project. Probe hosting is only a small piece of the expenses within this project...
- Daniel
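The per-probe opt-out suggested above could look roughly like this. It is a sketch under assumptions: the allows_private flag and all names here are hypothetical and do not exist in RIPE Atlas today, and how easy it would be to implement in the real backend is a separate question.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    probe_id: int
    allows_private: bool = True  # hypothetical host-controlled opt-out


def eligible_probes(probes, is_public):
    """Probes that may take part in a measurement.

    Public measurements may use every probe; non-public ones skip
    probes whose hosts opted out.
    """
    if is_public:
        return list(probes)
    return [p for p in probes if p.allows_private]


if __name__ == "__main__":
    pool = [Probe(1), Probe(2, allows_private=False), Probe(3)]
    print([p.probe_id for p in eligible_probes(pool, is_public=False)])
```

The filter applies only at probe-selection time, so hosts who do not care need not change anything.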
I agree with Gert (and Daniel): private measurements should be retained in whatever way is possible.
The RIPE Atlas project and infrastructure are a significant benefit to network operators, as well as to researchers (and the Internet as a whole). The points Gert mentions are valid and demonstrate a significant operator use case.
In addition to what Robert mentioned in the article, with the increasing use of 'edge connectivity', CDN, and anycast, operators need to test against the IP-unicast foundation of services - something they do not necessarily want 'the Internet' to know about, and something they certainly don't want black-hats to target with DDoS. Being able to run measurements against 'hidden IP addresses' to validate connectivity (for an Internet-facing service) in this way, and being fairly confident that measurements aren't highlighted in public data, can be important.
Might it be possible to reassess the current method of filtering non-public measurements, to perhaps simplify it?
Thanks
Ivan
-- Ivan Beveridge <ivan.beveridge@dreamtime.org>
i asked this quietly before, but let me be more direct. how much private data is there actually? what percentage of the stored data is private? randy
Hi, On Fri, Dec 30, 2022 at 08:42:18AM -0800, Randy Bush wrote:
i asked this quietly before, but let me be more direct.
how much private data is there actually? what percentage of the stored data is private?
Indeed, this is a good question. If it turns out that there is a lone private measurement from me, or "in the sub-percent range", I'm willing to let myself be convinced in the name of "make the measurement backend more simple = robust"...
Gert Doering -- NetMaster
Hello,
I seem to remember that a typical non-public measurement is a one-off, and as such the data volume collected and stored is likely to be relatively low. I don't have the numbers readily available to me at this moment, but we can certainly dig up recent statistics about this.
Cheers,
Robert
hi robert, and thanks
I seem to remember that a typical non-public measurement is a one-off, and as such the data volume collected and stored is likely to be relatively low.
interesting
I don't have the numbers readily available to me at this moment, but we can certainly dig up recent statistics about this.
i did not intend to create a lot of work for you. but, as a measurement person, i am a bit fond of basing decisions on facts. randy
Hi, On 12/31/22 18:37, Randy Bush wrote:
hi robert,
and thanks
I seem to remember that a typical non-public measurement is a one-off, and as such the data volume collected and stored is likely to be relatively low.
interesting
There is an old (late 2016) labs post worth referring to for this discussion [1].
In 2020 there was a survey of the users of private measurements [2], which was intended to get feedback on their actual use-cases. I seem to recall a very small set of respondents from an already small set of surveyees. The results are probably lost in the ether.
I feel there's an aspect of the broader question that is rarely touched upon in these discussions: would user behaviour actually change if non-public measurements went away? I'm never sure how many use-cases are truly secret.
Cheers,
S.
[1] https://labs.ripe.net/author/kistel/non-public-measurements-in-ripe-atlas/
[2] https://www.ripe.net/participate/forms/apply/ripe-atlas-private-measurements...
On 31. 12. 22 14:30, Róbert Kisteleki wrote:
Hello,
I seem to remember that a typical non-public measurement is a one-off, and as such the data volume collected and stored is likely to be relatively low. I don't have the numbers readily available to me at this moment, but we can certainly dig up recent statistics about this.
Stats would be interesting. Another thing to consider is that lots of one-off measurements might mean "not much data" and at the same time "inflating database indices a lot" (or not, depending on the schema). Petr Špaček
I might be troubleshooting something in our network where I have no interest in making the results public. So I value the option to have non-public measurements.
There's no "right to see all measurements" here - if someone wants to see something, they are free to run their own measurements with their own credits. What I do with my credits (which do not come for free) and who can see the results should be my decision.
The problem with non-public measurements for the RIPE NCC (at least, when I was still working there, but I doubt the situation has changed) is that keeping track of whether a measurement result is public or not causes significant code complexity. So the question is really whether the relatively small amount of use that is made of non-public measurements is worth the cost to RIPE NCC members for the ongoing maintenance of this feature.
Without non-public measurements, measurement results could be stored anywhere, and meta-data of measurements could be stored anywhere. With non-public measurements, some measurement results have to be stored such that they are not generally accessible. Of course, API endpoints do need access to that data. At some point there is just too much code that needs to know about non-public measurements.
If you can count on one hand the number of users using >90% of the private measurements over a longer timeframe than two weeks, then I submit that the choice is clear.
This information is somewhat harder to extract, but we'll try and get back to you! Cheers, Robert
participants (15)
- Alexander Burke
- Daniel Suchy
- Gert Doering
- Ivan Beveridge
- Magnus Sandberg
- Petr Špaček
- Philip Homburg
- Randy Bush
- ripe.net@toppas.net
- Robert Kisteleki
- Róbert Kisteleki
- Stephane Bortzmeyer
- Stephen Strowes
- Steve Gibbard
- Taavi Eomäe