Colleagues

[Apologies for the length of this email...]

The chairs would like to suggest creating a new NWI for the "geofeed:" attribute, and suggest the following draft Problem statement and Solution definition. If there is agreement on this, the RIPE NCC will do an impact assessment, including a legal review and a summary of what the other RIRs are doing. There has already been quite a discussion in the WG on this issue, with a lot of support. So hopefully we can reach a consensus quickly, at least on setting up the NWI and starting the impact assessment. Having read the latest draft IETF docs, there are some outstanding questions. Comments and changes are welcome...

cheers
denis
co-chair DB-WG

Problem statement

Associating an approximate physical location with an IP address has proven to be a challenge to solve within the current constraints of the RIPE Database. Over the years the community has chosen to consider addresses in the RIPE Database to relate to entities in the assignment process itself, not the subsequent actual use of IP addresses after assignment.

The working group is asked to consider whether the RIPE Database can be used as a springboard for parties wishing to correlate geographical information with IP addresses, by allowing structured references in the RIPE Database towards information outside the RIPE Database which potentially helps answer Geo IP Location queries.

The IETF is currently discussing an update to RPSL to add a new attribute, "geofeed: url". The url will reference a csv file containing location data. Some users have already started to make use of this feature via "remarks: geofeed: url". It is never a good idea to try to overload structured data into the free format "remarks:" attribute. This has been done in the past, for example with abuse contact details before we introduced the "abuse-c:" attribute. There is no way to regulate what database users put into "remarks:" attributes. So even if the new "geofeed:" attribute is not agreed, the url data will still be included in the RIPE Database.

Currently there are 24,408 INETNUM and 516,354 INET6NUM objects containing a "remarks: geofeed: url" attribute in the database. These have 7,731 distinct values in the INETNUMs and 1,045 distinct values in the INET6NUMs.

Solution definition

Implement a new "geofeed:" attribute according to the IETF's definition. Although the IETF has not yet concluded discussions on this attribute, we can still implement it in the RIPE Database RPSL data definition. The RIPE Database already has many local differences to the RPSL standard. As expressed in the Problem statement, users are already using the geofeed data by overloading the "remarks:" attribute. That is a dirty hack which should be avoided.

An invalidly formatted url will be a syntax error.

The RIPE NCC will perform a one-time conversion of the existing data, converting "remarks: geofeed: url" to "geofeed: url".

If an update then contains a "remarks: geofeed: url" attribute, the update will be successful and the response should include an appropriate Warning message. At some point in the (near) future, this could be changed to an update failure as a syntax error.

An update containing both a "geofeed:" and a "remarks: geofeed:" attribute, or more than one "remarks: geofeed:", will be a syntax error.

The resource holder should be able to create, modify and delete the "geofeed:" attribute in allocation objects.
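For illustration, an address object carrying the proposed attribute, and the RFC8805 csv file it references, might look something like this (the prefix, url and locations are made up; object abridged):

    inetnum:    192.0.2.0 - 192.0.2.255
    netname:    EXAMPLE-NET
    geofeed:    https://example.com/geofeed.csv

    # geofeed.csv (RFC8805 layout: prefix,country,region,city,postal_code)
    192.0.2.0/25,NL,NL-NH,Amsterdam,
    192.0.2.128/25,NL,NL-ZH,Rotterdam,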
Questions:

-Should the database software do any checks on the existence/reachability of the url as part of the update, with an error if the check fails?

-Should the RIPE NCC do any periodic repeat checks on the continued existence/reachability of the url?

-Should the RIPE NCC do any periodic checks on the content structure of the csv file referenced by the url?

-Should the Solution definition define how this will be adopted into RDAP, or should we simply ask the RIPE NCC to define this in their impact assessment?

-The RIPE Database contains hierarchical address space objects. Should it be acceptable for "geofeed:" attributes to exist at multiple levels within a specific hierarchy?

-Suppose a geofeed csv file referenced by a /16 INETNUM object contains location data for the whole /16. Then a more specific /24 INETNUM object references another geofeed csv file that contains conflicting location data for this /24. Should this be a concern for the RIPE Database?

-Should geofeed data be inherited? If you query for a /24 that does not contain a "geofeed:" attribute, but a less specific /16 does contain a "geofeed:" attribute, should this data be returned? In other words, could it be used in a similar way to "abuse-c:"?

-Thinking ahead to how people will actually deploy this data and what shortcuts they could make. It is said that when reading a geofeed csv file, consumers of the data should ignore all data within that file not directly concerning the address space queried in the RIPE Database. Could you therefore create a single csv file with location data for all your address space and reference the same file in all your RIPE Database address objects? The address space owner could rely on the data consumer to pick out the correct piece of data for the relevant address space. The manager of the csv file then only has to work with one file. If this is possible and does happen (which the IETF doc 'Finding geofeeds' seems to suggest is possible for unsigned geofeed data), would it therefore make sense to apply "geofeed:" hierarchically as with "abuse-c:"? Allow a single, default "geofeed:" attribute in the ORGANISATION object to be applied to all that organisation's address space, with the option of specific localised "geofeed:" attributes in address space objects. That could be a neater solution, and easier to set up, than applying thousands of references to the same geofeed file at a more specific level in the database.

-Relating to the above 3 questions, should geofeed data only be considered applicable if returned by a specific geofeed locator application which takes into account the hierarchical nature of address space in the database? Otherwise, do the standard database query mechanisms have to take into account this hierarchy and locate the most specific "geofeed:" attribute from the less specific objects?

-Should/could the RIPE Database return the csv file as part of the query? If so, should the file be cached (for how long?) to avoid too many downloads?

-Should we only allow HTTPS urls? (Which the IETF doc 'Finding geofeeds' seems to suggest)

-Should the RIPE NCC go ahead and implement this now, with our own set of RIPE rules? Or should we try to coordinate this and agree a set of common rules between all the RIRs before any deployment in the RIPE Database?
-For the legal review, there are 2 statements in the IETF doc 'Finding geofeeds' which may be of concern:

* [RFC8805] geofeed data may reveal the approximate location of an IP address, which might in turn reveal the approximate location of an individual user. Unfortunately, [RFC8805] provides no privacy guidance on avoiding or ameliorating possible damage due to this exposure of the user. In publishing pointers to geofeed files as described in this document the operator should be aware of this exposure in geofeed data and be cautious. All the privacy considerations of [RFC8805] Section 4 apply to this document.

* It is significant that geofeed data may have finer granularity than the inetnum: which refers to them.

It is clear that the RIPE NCC cannot prevent this data being referenced by objects in the RIPE Database. It is already being referenced from "remarks:" attributes. Perhaps the RIPE NCC should require (as part of their service agreement) that its members obtain written consent from their customers to publish this location data, or at least inform the customers in writing that it will be published.

Also, although RFC8805 says postcode is deprecated, it is still provided for in the csv files. So anyone can still enter location data at this level of detail.

-The IETF doc 'Finding geofeeds' suggests that geofeed information 'will be' available in bulk accessed whois data. In view of the privacy concerns above, is this likely?

-The IETF doc 'Finding geofeeds' says "To minimize the load on RIR whois [RFC3912] services, use of the RIR's FTP [RFC0959] services SHOULD be the preferred access." Is the RIPE NCC expected to download all the geofeed files and make them available through their FTP service?

-The IETF doc 'Finding geofeeds' states that consumers of the geofeed data MUST NOT access this data in real time via the RPSL servers 'too frequently' or at 'magic times like midnight'. Some users will do whatever they want to do if they are able to, regardless of any statements to the contrary. Should the RIPE NCC enforce such access rules by some means?

References

The IETF doc 'Finding geofeeds':
https://datatracker.ietf.org/doc/draft-ietf-opsawg-finding-geofeeds/?include...

geofeed file format:
https://www.rfc-editor.org/rfc/rfc8805.html
Hi Denis, I have CCed Randy Bush as I thought he might be able to clarify what was meant by the following:
To minimize the load on RIR whois [RFC3912] services, use of the RIR's FTP [RFC0959] services SHOULD be the preferred access. This also provides bulk access instead of fetching with a tweezers.
I think one of the most important things in general here is seeing what is within the scope of the db-wg to decide and what should probably be defined by the IETF spec. And also to try to get some kind of implementation out as quickly as possible while still doing it properly. Because as you mention, the remarks format appears to be used quite a bit and I imagine it probably grows in use at a decent rate. (I have no data to back this up though, it is just a guess) I have responded to some of the questions below.
Problem statement
Associating an approximate physical location with an IP address has proven to be a challenge to solve within the current constraints of the RIPE Database. Over the years the community has chosen to consider addresses in the RIPE Database to relate to entities in the assignment process itself, not the subsequent actual use of IP addresses after assignment.
The working group is asked to consider whether the RIPE Database can be used as a springboard for parties wishing to correlate geographical information with IP addresses by allowing structured references in the RIPE Database towards information outside the RIPE Database which potentially helps answer Geo IP Location queries
The IETF is currently discussing an update to RPSL to add a new attribute "geofeed: url". The url will reference a csv file containing location data. Some users have already started to make use of this feature via the "remarks: geofeed: url". It is never a good idea to try to overload structured data into the free format "remarks:" attribute. This has been done in the past, for example with abuse contact details before we introduced the "abuse-c:" attribute. There is no way to regulate what database users put into "remarks:" attributes. So even if the new "geofeed:" attribute is not agreed, the url data will still be included in the RIPE Database.
Currently there are 24,408 INETNUM and 516,354 INET6NUM objects containing a "remarks: geofeed: url" attribute in the database. These have 7,731 distinct values in the INETNUMs and 1,045 distinct values in the INET6NUMs.
Solution definition
Implement a new "geofeed:" attribute according to the IETF's definition. Although the IETF has not yet concluded discussions on this attribute we can still implement it in the RIPE Database RPSL data definition. The RIPE Database already has many local differences to the RPSL standard. As expressed in the Problem statement, users are already using the geofeed data by overloading the "remarks:" attribute. That is a dirty hack which should be avoided.
An invalidly formatted url will be a syntax error.
The RIPE NCC will perform a one-time conversion of the existing data, converting "remarks: geofeed: url" to "geofeed: url".
I think this might need more careful consideration as all software consuming this data might not support this instantly as it is still a draft. Converting while still keeping both is totally fine with me. (as in adding the geofeed: url based on the remarks: geofeed: url)
If an update then contains a "remarks: geofeed: url" attribute, the update will be successful and the response should include an appropriate Warning message. At some point in the (near) future, this could be changed to an update failure as a syntax error.
An update containing both a "geofeed:" and a "remarks: geofeed:" attribute, or more than one "remarks: geofeed:", will be a syntax error.
I don't agree with this. I don't think there should be any syntax checking on remarks. If it is just a warning, that is fine, but I am not keen on the idea of having syntax errors on free-form data like remarks. Only one "geofeed:" attribute should be allowed, but beyond warnings, remarks should not be syntax-checked.
The resource holder should be able to create, modify, delete the "geofeed:" attribute in allocation objects.
I feel this might be a bit vague. It would be worth clarifying that it is inet(6)num objects (and potentially organisation objects) that can have this attribute, and that it is set by the maintainer of the object, provided that it fulfils the syntax requirement and any reachability requirements, just like domain objects. This is just to clarify that it would be purely a database thing and wouldn't be done via the LIR Portal or similar. And also that if the mnt-by was delegated, that maintainer would have the authorisation to change it.
Questions:
-Should the database software do any checks on the existence/reachability of the url as part of the update with an error if the check fails?
I would say yes as this is not a new concept to the DB as I believe this is already done with domain objects.
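Just to make this concrete, a rough sketch of the kind of check I have in mind (purely illustrative, not how the existing domain object checks are implemented):

    # Sketch: probe a geofeed URL with an HTTP HEAD request.
    import urllib.request
    import urllib.error

    def geofeed_reachable(url, timeout=10.0):
        # Some servers reject HEAD; a real check might fall back to GET.
        req = urllib.request.Request(url, method="HEAD")
        try:
            with urllib.request.urlopen(req, timeout=timeout) as resp:
                return 200 <= resp.status < 300
        except (urllib.error.URLError, ValueError):
            return False

An update could then fail (or just warn) when geofeed_reachable() returns False.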
-Should the RIPE NCC do any periodic repeat checks on the continued existence/reachability of the url?
I would say that checking once a month or so could be fine, as long as it just results in a nudge email. Don't enforce it, but nudge people if it is down.
-Should the RIPE NCC do any periodic checks on the content structure of the csv file referenced by the url?
I don't have a strong opinion either way here, but I feel like that is not really something the NCC is responsible for checking. If the NCC should check, then my comments about the repeated reachability checks above apply here too.
-Should the Solution definition define how this will be adopted into RDAP or should we simply ask the RIPE NCC to define this in their impact assessment?
I don't mind either way.
-The RIPE Database contains hierarchical address space objects. Should it be acceptable for "geofeed:" attributes to exist at multiple levels within a specific hierarchy?
In my opinion, yes. I have not thought too deeply about this yet but currently I can't think of a good reason not to.
-Suppose a geofeed csv file referenced by a /16 INETNUM object contains location data for the whole /16. Then a more specific /24 INETNUM object references another geofeed csv file that contains conflicting location data for this /24. Should this be a concern for the RIPE Database?
The most specific geofeed should be returned in my opinion.
-Should geofeed data be inherited? If you query for a /24 that does not contain a "geofeed:" attribute, but a less specific /16 does contain a "geofeed:" attribute, should this data be returned? In other words could it be used in a similar way to "abuse-c:"?
I think it should be handled like abuse-c. I have not thought too much about what complications this might have yet, but currently I can't see any issues with it.
-Thinking ahead to how people will actually deploy this data and what shortcuts they could make. It is said that when reading a geofeed csv file, consumers of the data should ignore all data within that file not directly concerning the address space queried in the RIPE Database. Could you therefore create a single csv file with location data for all your address space and reference the same file in all your RIPE Database address objects? The address space owner could rely on the data consumer to pick out the correct piece of data for the relevant address space. The manager of the csv file then only has to work with one file. If this is possible and does happen (which the IETF doc 'Finding geofeeds' seems to suggest is possible for unsigned geofeed data), would it therefore make sense to apply "geofeed:" hierarchically as with "abuse-c:"? Allow a single, default "geofeed:" attribute in the ORGANISATION object to be applied to all that organisation's address space, with the option of specific localised "geofeed:" attributes in address space objects. That could be a neater solution, and easier to set up, than applying thousands of references to the same geofeed file at a more specific level in the database.
I am well aware that this kind of stuff happens in practice even with IP space between different RIRs (like ARIN and RIPE NCC space in the same CSV). I feel like the NCC shouldn't really be concerned about the data in the CSV file but rather just about publishing the URL to it.
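To make the 'pick out the correct piece of data' part concrete, the filtering on the consumer side could be as simple as this sketch (file name and queried prefix are hypothetical; assumes the RFC8805 csv layout):

    # Sketch: keep only the geofeed rows inside the queried address space.
    import csv
    import ipaddress

    def relevant_rows(path, queried_prefix):
        queried = ipaddress.ip_network(queried_prefix)
        with open(path, newline="") as f:
            for row in csv.reader(f):
                if not row or row[0].startswith("#"):
                    continue  # skip blank lines and comments
                try:
                    entry = ipaddress.ip_network(row[0].strip())
                except ValueError:
                    continue  # skip malformed rows
                # Ignore everything not covered by the queried inet(6)num.
                if entry.version == queried.version and entry.subnet_of(queried):
                    yield row

    for row in relevant_rows("geofeed.csv", "192.0.2.0/24"):
        print(row)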
-Relating to the above 3 questions, should geofeed data only be considered applicable if returned by a specific geofeed locator application which takes into account the hierarchical nature of address space in the database? Otherwise, do the standard database query mechanisms have to take into account this hierarchy and locate the most specific "geofeed:" attribute from the less specific objects?
I don't quite understand the question here? Could you please clarify?
-Should/could the RIPE Database return the csv file as part of the query? If so should the file be cached (for how long?) to avoid too many downloads?
I don't see a reason for this, especially as I could imagine these lists being huge in some cases. It feels like it might be opening the NCC up to unnecessary liability considering the privacy concerns. (clarification: I don't think the database should return the CSV)
-Should we only allow HTTPS urls? (Which the IETF doc 'Finding geofeeds' seems to suggest)
The current draft seems pretty clear on this to me, and it also makes sense to me, so I would say yes. And if not, then it should be restricted to only http and https (aka not ftp or any other protocols).
-Should the RIPE NCC go ahead and implement this now, with our own set of RIPE rules? Or should we try to coordinate this and agree a set of common rules between all the RIRs before any deployment in the RIPE Database?
I would say that the RIPE NCC should implement this quickly, as the remarks method already has quite a bit of use and it will probably keep growing quickly. Especially as this seems fairly basic in terms of the frontend; how the URLs are validated could change in the future without the WHOIS format changing. With regard to the potential list on the RIR FTP, that should maybe be coordinated between the RIRs or specified in the IETF spec.
-For the legal review, there are 2 statements in the IETF doc 'Finding geofeeds' which may be of concern:

* [RFC8805] geofeed data may reveal the approximate location of an IP address, which might in turn reveal the approximate location of an individual user. Unfortunately, [RFC8805] provides no privacy guidance on avoiding or ameliorating possible damage due to this exposure of the user. In publishing pointers to geofeed files as described in this document the operator should be aware of this exposure in geofeed data and be cautious. All the privacy considerations of [RFC8805] Section 4 apply to this document.

* It is significant that geofeed data may have finer granularity than the inetnum: which refers to them.
It is clear that the RIPE NCC cannot prevent this data being referenced by objects in the RIPE Database. It is already being referenced from "remarks:" attributes. Perhaps the RIPE NCC should require (as part of their service agreement) that its members obtain written consent from their customers to publish this location data, or at least inform the customers in writing that it will be published.
Also, although RFC8805 says postcode is deprecated, it is still provided for in the csv files. So anyone can still enter location data at this level of detail.
I am not a lawyer by any means, but I don't see this necessarily being an issue as long as the NCC just links to URLs provided by resource holders. And with regards to it being part of the service agreement, I feel like that would be very complicated when you consider PI resources etc. I think trying to get written consent from resource holders' customers should absolutely be avoided if possible. I don't see why anyone would actually be doing this kind of stuff, and it seems like it would probably be rare for it to happen without the customer's consent. (as in putting it down to a very specific place)
-The IETF doc 'Finding geofeeds' suggests that geofeed information 'will be' available in bulk accessed whois data. In view of the privacy concerns above, is this likely?
Can you clarify where this is mentioned? Is it part of the quote below?
-The IETF doc 'Finding geofeeds' says "To minimize the load on RIR whois [RFC3912] services, use of the RIR's FTP [RFC0959] services SHOULD be the preferred access." Is the RIPE NCC expected to download all the geofeed files and make them available through their FTP service?
I don't quite interpret it like that, I rather interpret it as the RIPE NCC (and other RIRs) publishing a list of all prefixes and their geofeed URL. I imagine it like the delegated file, but just prefixes and geofeed URLs. But I will say it seems a bit unclear, maybe Randy Bush or one of the other authors could comment on the intention here.
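Something like this is what I picture for that list (an entirely hypothetical layout, just to illustrate the idea):

    # prefix|geofeed_url -- one line per inet(6)num with a geofeed reference
    192.0.2.0/24|https://example.com/geofeed.csv
    2001:db8::/32|https://example.net/v6-geofeed.csv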
-The IETF doc 'Finding geofeeds' states that consumers of the geofeed data MUST NOT access this data in real time via the RPSL servers 'too frequently' or at 'magic times like midnight'. Some users will do whatever they want to do if they are able to do, regardless of any statements to the contrary. Should the RIPE NCC enforce such access rules by some means?
For access via WHOIS, I would say no as that would probably be way too complicated. If a list like I suggested above was to be implemented, then I guess it could be implemented to make sure people didn't pull it down every 5 minutes as it would probably be a pretty large file.
References
The IETF doc 'Finding geofeeds': https://datatracker.ietf.org/doc/draft-ietf-opsawg-finding-geofeeds/?include...
geofeed file format: https://www.rfc-editor.org/rfc/rfc8805.html
-Cynthia
I have CCed Randy Bush as I thought he might be able to clarify what was meant by the following:
To minimize the load on RIR whois [RFC3912] services, use of the RIR's FTP [RFC0959] services SHOULD be the preferred access. This also provides bulk access instead of fetching with a tweezers.
I think one of the most important things in general here is seeing what is within the scope of the db-wg to decide and what should probably be defined by the IETF spec. And also to try to get some kind of implementation out as quickly as possible while still doing it properly. Because as you mention, the remarks format appears to be used quite a bit and I imagine it probably grows in use at a decent rate. (I have no data to back this up though, it is just a guess)
massimo is more aware of the details of current deployment and how that affects the rirs. i suspect the load of the remarks: versions would be the same as that of geofeed:. the suggestion to use bulk fetch as opposed to object-by-object seems simple and the benefits should be pretty obvious i would think. see massimo's existing tooling (cited in the internet-draft) for implementation.

>> -The IETF doc 'Finding geofeeds' says "To minimize the load on RIR whois [RFC3912] services, use of the RIR's FTP [RFC0959] services SHOULD be the preferred access." Is the RIPE NCC expected to download all the geofeed files and make them available through their FTP service?
I don't quite interpret it like that, I rather interpret it as the RIPE NCC (and other RIRs) publishing a list of all prefixes and their geofeed URL. I imagine it like the delegated file, but just prefixes and geofeed URLs.
But I will say it seems a bit unclear, maybe Randy Bush or one of the other authors could comment on the intention here.
again, see the actual implementation referenced in the internet-draft
For access via WHOIS, I would say no as that would probably be way too complicated.
yep
The IETF doc 'Finding geofeeds': https://datatracker.ietf.org/doc/draft-ietf-opsawg-finding-geofeeds/
randy
---
randy@psg.com
`gpg --locate-external-keys --auto-key-locate wkd randy@psg.com`
signatures are back, thanks to dmarc header butchery
Hi all, On 06/04/2021 19:23, Randy Bush wrote:
I have CCed Randy Bush as I thought he might be able to clarify what was meant by the following:
To minimize the load on RIR whois [RFC3912] services, use of the RIR's FTP [RFC0959] services SHOULD be the preferred access. This also provides bulk access instead of fetching with a tweezers.
I think one of the most important things in general here is seeing what is within the scope of the db-wg to decide and what should probably be defined by the IETF spec. And also to try to get some kind of implementation out as quickly as possible while still doing it properly. Because as you mention, the remarks format appears to be used quite a bit and I imagine it probably grows in use at a decent rate. (I have no data to back this up though, it is just a guess)
massimo is more aware of the details of current deployment and how that affects the rirs. i suspect the load of the remarks: versions would be the same as that of geofeed:. the suggestion to use bulk fetch as opposed to object-by-object seems simple and the benefits should be pretty obvious i would think. see massimo's existing tooling (cited in the internet-draft) for implementation.
At the moment there are 1350 prefixes with geofeeds coming from remarks/comments, mostly RIPE and ARIN. This number more than doubled in the last ~4 months.

If a service is interested in importing all the geofeeds, the generic daily whois dumps produced by the RIRs are perfect, including the public anonymized ones, since we just need the remarks. The implementation [1] uses such dumps. To the best of my knowledge, the geo providers supporting our draft are also using such dumps (apparently some were already familiar with them).

Ciao,
Massimo

[1] https://github.com/massimocandela/geofeed-finder
Hi Massimo

Your data does not match the data I got from the RIPE NCC...

On Wed, 7 Apr 2021 at 01:37, Massimo Candela via db-wg <db-wg@ripe.net> wrote:
Hi all,
At the moment there are 1350 prefixes with geofeeds coming from remarks/comments, mostly RIPE and ARIN. This number more than doubled in the last ~4 months.
From the RIPE NCC:
Currently there are 24,408 INETNUM and 516,354 INET6NUM objects containing a "remarks: geofeed: url" attribute in the database. These have 7,731 distinct values in the INETNUMs and 1,045 distinct values in the INET6NUMs.

cheers
denis
co-chair DB-WG
If a service is interested in importing all the geofeeds, the generic daily whois dumps produced by the rirs are perfect---including the public anonymized ones, since we just need the remarks. The implementation [1] uses such dumps. To the best of my knowledge, the geo providers supporting our draft are also using such dumps (apparently some were already familiar with them).
Ciao, Massimo
Hi Denis, On 07/04/2021 02:02, denis walker wrote:
Your data does not match the data I got from the RIPE NCC...
From the RIPE NCC:
Currently there are 24,408 INETNUM and 516,354 INET6NUM objects containing a "remarks: geofeed: url" attribute in the database. These have 7,731 distinct values in the INETNUMs and 1,045 distinct values in the INET6NUMs.
I cannot reproduce what you did. Even if I just "grep -i geofeed" in ripe.db.inetnum.gz from the ripe ncc ftp [1], I obtain only 132 items. And 39 in ripe.db.inet6num.gz. The same if I use the complete dump [2].

Is the data in the FTP wrong? Am I doing something wrong?

Ciao,
Massimo

[1] https://ftp.ripe.net/ripe/dbase/split/
[2] https://ftp.ripe.net/ripe/dbase/ripe.db.gz
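PS: in case it helps to reproduce the numbers, this is roughly the Python equivalent of that grep (it counts matching lines, like grep | wc -l; assumes the dumps are Latin-1 encoded):

    # Sketch: count lines mentioning geofeed in a RIPE split dump.
    import gzip

    count = 0
    with gzip.open("ripe.db.inetnum.gz", "rt", encoding="latin-1") as f:
        for line in f:
            if "geofeed" in line.lower():
                count += 1
    print(count)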
Hi Massimo

I just checked the numbers Ed gave me and I misread the message. These are the numbers of objects with a "geoloc:" attribute, not geofeed :(

cheers
denis
co-chair DB-WG

On Wed, 7 Apr 2021 at 02:56, Massimo Candela <massimo@us.ntt.net> wrote:
Hi Denis,
On 07/04/2021 02:02, denis walker wrote:
Your data does not match the data I got from the RIPE NCC...
From the RIPE NCC:
Currently there are 24,408 INETNUM and 516,354 INET6NUM objects containing a "remarks: geofeed: url" attribute in the database. These have 7,731 distinct values in the INETNUMs and 1,045 distinct values in the INET6NUMs.
I cannot reproduce what you did. Even if I just "grep -i geofeed" in ripe.db.inetnum.gz from the ripe ncc ftp [1], I obtain only 132 items. And 39 in ripe.db.inet6num.gz. The same if I use the complete dump [2].
Is the data in the FTP wrong? Am I doing something wrong?
Ciao, Massimo
[1] https://ftp.ripe.net/ripe/dbase/split/ [2] https://ftp.ripe.net/ripe/dbase/ripe.db.gz
Colleagues

The chairs agree that there is a consensus to set up an NWI to create the "geoloc:" attribute in the RIPE Database. We therefore ask the RIPE NCC to set up "NWI-13 Create a "geoloc:" attribute in the RIPE Database", using the 'Problem statement' below. After the RIPE NCC completes its impact analysis we can finalise the 'Solution definition'. The RIPE NCC can address any of the questions raised in this discussion that they feel are relevant to the basic creation of this attribute.

cheers
denis
co-chair DB-WG

Problem statement

Associating an approximate physical location with an IP address has proven to be a challenge to solve within the current constraints of the RIPE Database. Over the years the community has chosen to consider addresses in the RIPE Database to relate to entities in the assignment process itself, not the subsequent actual use of IP addresses after assignment.

The working group is asked to consider whether the RIPE Database can be used as a springboard for parties wishing to correlate geographical information with IP addresses, by allowing structured references in the RIPE Database towards information outside the RIPE Database which potentially helps answer Geo IP Location queries.

The IETF is currently discussing an update to RPSL to add a new attribute, "geofeed: url". The url will reference a csv file containing location data. Some users have already started to make use of this feature via "remarks: geofeed: url". It is never a good idea to try to overload structured data into the free format "remarks:" attribute. This has been done in the past, for example with abuse contact details before we introduced the "abuse-c:" attribute. There is no way to regulate what database users put into "remarks:" attributes. So even if the new "geofeed:" attribute is not agreed, the url data will still be included in the RIPE Database.

Currently there are 24,408 INETNUM and 516,354 INET6NUM objects containing a "geoloc:" attribute in the database. These have 7,731 distinct values in the INETNUMs and 1,045 distinct values in the INET6NUMs. There are about 150 objects in the RIPE Database with a "remarks: geoloc url" attribute.

On Wed, 7 Apr 2021 at 04:29, denis walker <ripedenis@gmail.com> wrote:
Hi Massimo
I just checked the numbers Ed gave me and I misread the message. These are the numbers of objects with a "geoloc:" attribute not geofeed :(
cheers denis co-chair DB-WG
On Wed, 7 Apr 2021 at 02:56, Massimo Candela <massimo@us.ntt.net> wrote:
Hi Denis,
On 07/04/2021 02:02, denis walker wrote:
Your data does not match the data I got from the RIPE NCC...
From the RIPE NCC:
Currently there are 24,408 INETNUM and 516,354 INET6NUM objects containing a "remarks: geofeed: url" attribute in the database. These have 7,731 distinct values in the INETNUMs and 1,045 distinct values in the INET6NUMs.
I cannot reproduce what you did. Even if I just "grep -i geofeed" in ripe.db.inetnum.gz from the ripe ncc ftp [1], I obtain only 132 items. And 39 in ripe.db.inet6num.gz. The same if I use the complete dump [2].
Is the data in the FTP wrong? Am I doing something wrong?
Ciao, Massimo
[1] https://ftp.ripe.net/ripe/dbase/split/ [2] https://ftp.ripe.net/ripe/dbase/ripe.db.gz
Colleagues

** corrected version getting the attribute names right **

The chairs agree that there is a consensus to set up an NWI to create the "geofeed:" attribute in the RIPE Database. We therefore ask the RIPE NCC to set up "NWI-13 Create a "geofeed:" attribute in the RIPE Database", using the 'Problem statement' below. After the RIPE NCC completes its impact analysis we can finalise the 'Solution definition'. The RIPE NCC can address any of the questions raised in this discussion that they feel are relevant to the basic creation of this attribute.

cheers
denis
co-chair DB-WG

Problem statement

Associating an approximate physical location with an IP address has proven to be a challenge to solve within the current constraints of the RIPE Database. Over the years the community has chosen to consider addresses in the RIPE Database to relate to entities in the assignment process itself, not the subsequent actual use of IP addresses after assignment.

The working group is asked to consider whether the RIPE Database can be used as a springboard for parties wishing to correlate geographical information with IP addresses, by allowing structured references in the RIPE Database towards information outside the RIPE Database which potentially helps answer Geo IP Location queries.

The IETF is currently discussing an update to RPSL to add a new attribute, "geofeed: url". The url will reference a csv file containing location data. Some users have already started to make use of this feature via "remarks: geofeed: url". It is never a good idea to try to overload structured data into the free format "remarks:" attribute. This has been done in the past, for example with abuse contact details before we introduced the "abuse-c:" attribute. There is no way to regulate what database users put into "remarks:" attributes. So even if the new "geofeed:" attribute is not agreed, the url data will still be included in the RIPE Database.

Currently there are 24,408 INETNUM and 516,354 INET6NUM objects containing a "geoloc" attribute in the database. These have 7,731 distinct values in the INETNUMs and 1,045 distinct values in the INET6NUMs. There are about 150 objects in the RIPE Database with a "remarks: geoloc url" attribute.

On Mon, 12 Apr 2021 at 17:56, denis walker <ripedenis@gmail.com> wrote:
Colleagues
The chairs agree that there is a consensus to set up an NWI to create the "geoloc:" attribute in the RIPE Database. We therefore ask the RIPE NCC to set up "NWI-13 Create a "geoloc:" attribute in the RIPE Database", using the 'Problem statement' below. After the RIPE NCC completes its impact analysis we can finalise the 'Solution definition'. The RIPE NCC can address any of the questions raised in this discussion that they feel are relevant to the basic creation of this attribute.
cheers denis co-chair DB-WG
Problem statement
Associating an approximate physical location with an IP address has proven to be a challenge to solve within the current constraints of the RIPE Database. Over the years the community has chosen to consider addresses in the RIPE Database to relate to entities in the assignment process itself, not the subsequent actual use of IP addresses after assignment.
The working group is asked to consider whether the RIPE Database can be used as a springboard for parties wishing to correlate geographical information with IP addresses by allowing structured references in the RIPE Database towards information outside the RIPE Database which potentially helps answer Geo IP Location queries
The IETF is currently discussing an update to RPSL to add a new attribute "geofeed: url". The url will reference a csv file containing location data. Some users have already started to make use of this feature via the "remarks: geofeed: url". It is never a good idea to try to overload structured data into the free format "remarks:" attribute. This has been done in the past, for example with abuse contact details before we introduced the "abuse-c:" attribute. There is no way to regulate what database users put into "remarks:" attributes. So even if the new "geofeed:" attribute is not agreed, the url data will still be included in the RIPE Database.
Currently there are 24,408 INETNUM and 516,354 INET6NUM objects containing a "geoloc:" attribute in the database. These have 7,731 distinct values in the INETNUMs and 1,045 distinct values in the INET6NUMs. There are about 150 objects in the RIPE Database with a "remarks: geoloc url" attribute.
On Wed, 7 Apr 2021 at 04:29, denis walker <ripedenis@gmail.com> wrote:
Hi Massimo
I just checked the numbers Ed gave me and I misread the message. These are the numbers of objects with a "geoloc:" attribute not geofeed :(
cheers denis co-chair DB-WG
On Wed, 7 Apr 2021 at 02:56, Massimo Candela <massimo@us.ntt.net> wrote:
Hi Denis,
On 07/04/2021 02:02, denis walker wrote:
Your data does not match the data I got from the RIPE NCC...
From the RIPE NCC:
Currently there are 24,408 INETNUM and 516,354 INET6NUM objects containing a "remarks: geofeed: url" attribute in the database. These have 7,731 distinct values in the INETNUMs and 1,045 distinct values in the INET6NUMs.
I cannot reproduce what you did. Even if I just "grep -i geofeed" in ripe.db.inetnum.gz from the ripe ncc ftp [1], I obtain only 132 items. And 39 in ripe.db.inet6num.gz. The same if I use the complete dump [2].
Is the data in the FTP wrong? Am I doing something wrong?
Ciao, Massimo
[1] https://ftp.ripe.net/ripe/dbase/split/ [2] https://ftp.ripe.net/ripe/dbase/ripe.db.gz
Hi Denis,

I've added NWI-13 to the Numbered Work Items page, with a link to the Problem statement below:
https://www.ripe.net/manage-ips-and-asns/db/numbered-work-items

I'll get to work on an impact analysis.

Regards
Ed Shryane
RIPE NCC
On 12 Apr 2021, at 17:59, denis walker via db-wg <db-wg@ripe.net> wrote:
Colleagues
** corrected version getting the attribute names right **
The chairs agree that there is a consensus to set up an NWI to create the "geofeed:" attribute in the RIPE Database. We therefore ask the RIPE NCC to set up "NWI-13 Create a "geofeed:" attribute in the RIPE Database", using the 'Problem statement' below. After the RIPE NCC completes its impact analysis we can finalise the 'Solution definition'. The RIPE NCC can address any of the questions raised in this discussion that they feel are relevant to the basic creation of this attribute.
cheers denis co-chair DB-WG
Problem statement
Associating an approximate physical location with an IP address has proven to be a challenge to solve within the current constraints of the RIPE Database. Over the years the community has chosen to consider addresses in the RIPE Database to relate to entities in the assignment process itself, not the subsequent actual use of IP addresses after assignment.
The working group is asked to consider whether the RIPE Database can be used as a springboard for parties wishing to correlate geographical information with IP addresses by allowing structured references in the RIPE Database towards information outside the RIPE Database which potentially helps answer Geo IP Location queries
The IETF is currently discussing an update to RPSL to add a new attribute "geofeed: url". The url will reference a csv file containing location data. Some users have already started to make use of this feature via the "remarks: geofeed: url". It is never a good idea to try to overload structured data into the free format "remarks:" attribute. This has been done in the past, for example with abuse contact details before we introduced the "abuse-c:" attribute. There is no way to regulate what database users put into "remarks:" attributes. So even if the new "geofeed:" attribute is not agreed, the url data will still be included in the RIPE Database.
Currently there are 24,408 INETNUM and 516,354 INET6NUM objects containing a "geoloc" attribute in the database. These have 7,731 distinct values in the INETNUMs and 1,045 distinct values in the INET6NUMs. There are about 150 objects in the RIPE Database with a "remarks: geoloc url" attribute.
On Mon, 12 Apr 2021 at 17:56, denis walker <ripedenis@gmail.com> wrote:
Colleagues
The chairs agree that there is a consensus to set up an NWI to create the "geoloc:" attribute in the RIPE Database. We therefore ask the RIPE NCC to set up "NWI-13 Create a "geoloc:" attribute in the RIPE Database", using the 'Problem statement' below. After the RIPE NCC completes its impact analysis we can finalise the 'Solution definition'. The RIPE NCC can address any of the questions raised in this discussion that they feel are relevant to the basic creation of this attribute.
cheers denis co-chair DB-WG
Problem statement
Associating an approximate physical location with an IP address has proven to be a challenge to solve within the current constraints of the RIPE Database. Over the years the community has chosen to consider addresses in the RIPE Database to relate to entities in the assignment process itself, not the subsequent actual use of IP addresses after assignment.
The working group is asked to consider whether the RIPE Database can be used as a springboard for parties wishing to correlate geographical information with IP addresses by allowing structured references in the RIPE Database towards information outside the RIPE Database which potentially helps answer Geo IP Location queries
The IETF is currently discussing an update to RPSL to add a new attribute "geofeed: url". The url will reference a csv file containing location data. Some users have already started to make use of this feature via the "remarks: geofeed: url". It is never a good idea to try to overload structured data into the free format "remarks:" attribute. This has been done in the past, for example with abuse contact details before we introduced the "abuse-c:" attribute. There is no way to regulate what database users put into "remarks:" attributes. So even if the new "geofeed:" attribute is not agreed, the url data will still be included in the RIPE Database.
Currently there are 24,408 INETNUM and 516,354 INET6NUM objects containing a "geoloc:" attribute in the database. These have 7,731 distinct values in the INETNUMs and 1,045 distinct values in the INET6NUMs. There are about 150 objects in the RIPE Database with a "remarks: geoloc url" attribute.
On Wed, 7 Apr 2021 at 04:29, denis walker <ripedenis@gmail.com> wrote:
Hi Massimo
I just checked the numbers Ed gave me and I misread the message. These are the numbers of objects with a "geoloc:" attribute not geofeed :(
cheers denis co-chair DB-WG
On Wed, 7 Apr 2021 at 02:56, Massimo Candela <massimo@us.ntt.net> wrote:
Hi Denis,
On 07/04/2021 02:02, denis walker wrote:
Your data does not match the data I got from the RIPE NCC...
From the RIPE NCC:
Currently there are 24,408 INETNUM and 516,354 INET6NUM objects containing a "remarks: geofeed: url" attribute in the database. These have 7,731 distinct values in the INETNUMs and 1,045 distinct values in the INET6NUMs.
I cannot reproduce what you did. Even if I just "grep -i geofeed" in ripe.db.inetnum.gz from the ripe ncc ftp [1], I obtain only 132 items. And 39 in ripe.db.inet6num.gz. The same if I use the complete dump [2].
Is the data in the FTP wrong? Am I doing something wrong?
Ciao, Massimo
[1] https://ftp.ripe.net/ripe/dbase/split/ [2] https://ftp.ripe.net/ripe/dbase/ripe.db.gz
Hello Denis, Colleagues,

Following is the impact analysis for the implementation of the "geofeed:" attribute in the RIPE database, based on the problem statement below and the draft RFC:
https://tools.ietf.org/html/draft-ymbk-opsawg-finding-geofeeds

I will ask our Legal team to conduct a full impact analysis of the implementation plan.

Please reply with corrections or suggestions.

Regards
Ed Shryane
RIPE NCC

Impact Analysis for Implementing the "geofeed:" Attribute
============================================================

"geoloc:" Attribute
----------------------
Implementing the "geofeed:" attribute does not affect the "geoloc:" attribute. No decision has been taken on the future of the "geoloc:" attribute, a review can be done at a later date.

"remarks:" Attribute
-----------------------
Existing "remarks:" attributes in INETNUM or INET6NUM object types containing a "geofeed: url" value will not be automatically converted to a "geofeed:" attribute.

The implementation will validate that an INETNUM or INET6NUM object may contain at most a single geofeed reference, either a "remarks:" attribute *or* a "geofeed:" attribute. More than one will result in an error on update.

Any "remarks:" attributes in other object types will not be validated for geofeed references.

"geofeed:" Attribute
-----------------------
The "geofeed:" attribute will be added to the INETNUM and INET6NUM object types. It will be an optional, singly occurring attribute.

The attribute value must consist only of a well-formed URL. Any other content in the attribute value will result in a syntax error.

"geofeed:" URL
-----------------
The URL in the "geofeed:" attribute will be validated that it is well-formed (i.e. syntactically correct).

The URL must use the ASCII character set only (in the RIPE database, non-Latin-1 characters will be substituted with a '?' character).

Non-ASCII characters in the URL host name must be converted to ASCII using Punycode in advance (before updating the RIPE database).

Non-ASCII characters in the URL path must be converted using Percent-encoding in advance.

Only the HTTPS protocol is supported in the URL, otherwise an error is returned.

The reachability of the URL will not be checked. The content of the URL will not be validated.

Database dump and Split files
----------------------------------
The "geofeed:" attribute will be included in the nightly database dump and split files.

NRTM
--------
The "geofeed:" attribute will be included in INETNUM and INET6NUM objects in the NRTM stream.

Whois Queries
-----------------
The "geofeed:" attribute will appear by default in (filtered) INETNUM and INET6NUM objects in Whois query responses, no additional query flag will be needed.

RDAP
-------------
The "geofeed:" attribute will not appear in RDAP responses. A separate RDAP profile will be needed to extend the response format to include geofeed. This can be implemented at a later date.

Documentation
---------------
The RIPE database documentation will be updated, including the inet(6)num object templates and attribute description (with a reference to the IETF draft document).

Other RIRs
-------------
There is currently no coordinated plan to implement "geofeed:" across regions. Other RIRs may implement "geofeed:" at a later date.

Legal Review
---------------
An initial review by the RIPE NCC Legal team found that geofeed data may qualify as personal data, and before introducing the "geofeed:" attribute a full impact analysis of its implementation would have to be conducted by the RIPE NCC.

-----
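As an illustration of the encoding rules above, this is roughly the preparation a maintainer would do before submitting a URL (a sketch with made-up values, not the Whois implementation itself):

    # Sketch: pre-encode a geofeed URL so it is HTTPS and pure ASCII.
    from urllib.parse import urlsplit, urlunsplit, quote

    def prepare_geofeed_url(url):
        parts = urlsplit(url)
        if parts.scheme != "https":
            raise ValueError("only HTTPS URLs are accepted")
        # Non-ASCII host name -> Punycode; non-ASCII path -> percent-encoding.
        # (Assumes a host with no port/userinfo; query/fragment left as-is.)
        host = parts.hostname.encode("idna").decode("ascii")
        path = quote(parts.path, safe="/")
        prepared = urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))
        prepared.encode("ascii")  # must now be pure ASCII
        return prepared

    # "https://bücher.example/städte.csv"
    # -> "https://xn--bcher-kva.example/st%C3%A4dte.csv"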
On 12 Apr 2021, at 17:59, denis walker via db-wg <db-wg@ripe.net> wrote:
Colleagues
** corrected version getting the attribute names right **
The chairs agree that there is a consensus to set up an NWI to create the "geofeed:" attribute in the RIPE Database. We therefore ask the RIPE NCC to set up "NWI-13 Create a "geofeed:" attribute in the RIPE Database", using the 'Problem statement' below. After the RIPE NCC completes its impact analysis we can finalise the 'Solution definition'. The RIPE NCC can address any of the questions raised in this discussion that they feel are relevant to the basic creation of this attribute.
cheers denis co-chair DB-WG
Problem statement
Associating an approximate physical location with an IP address has proven to be a challenge to solve within the current constraints of the RIPE Database. Over the years the community has chosen to consider addresses in the RIPE Database to relate to entities in the assignment process itself, not the subsequent actual use of IP addresses after assignment.
The working group is asked to consider whether the RIPE Database can be used as a springboard for parties wishing to correlate geographical information with IP addresses by allowing structured references in the RIPE Database towards information outside the RIPE Database which potentially helps answer Geo IP Location queries
The IETF is currently discussing an update to RPSL to add a new attribute "geofeed: url". The url will reference a csv file containing location data. Some users have already started to make use of this feature via the "remarks: geofeed: url". It is never a good idea to try to overload structured data into the free format "remarks:" attribute. This has been done in the past, for example with abuse contact details before we introduced the "abuse-c:" attribute. There is no way to regulate what database users put into "remarks:" attributes. So even if the new "geofeed:" attribute is not agreed, the url data will still be included in the RIPE Database.
Currently there are 24,408 INETNUM and 516,354 INET6NUM objects containing a "geoloc" attribute in the database. These have 7,731 distinct values in the INETNUMs and 1,045 distinct values in the INET6NUMs. There are about 150 objects in the RIPE Database with a "remarks: geoloc url" attribute.
On Mon, 12 Apr 2021 at 17:56, denis walker <ripedenis@gmail.com> wrote:
Colleagues
The chairs agree that there is a consensus to set up an NWI to create the "geoloc:" attribute in the RIPE Database. We therefore ask the RIPE NCC to set up "NWI-13 Create a "geoloc:" attribute in the RIPE Database", using the 'Problem statement' below. After the RIPE NCC completes its impact analysis we can finalise the 'Solution definition'. The RIPE NCC can address any of the questions raised in this discussion that they feel are relevant to the basic creation of this attribute.
cheers denis co-chair DB-WG
Problem statement
Associating an approximate physical location with an IP address has proven to be a challenge to solve within the current constraints of the RIPE Database. Over the years the community has chosen to consider addresses in the RIPE Database to relate to entities in the assignment process itself, not the subsequent actual use of IP addresses after assignment.
The working group is asked to consider whether the RIPE Database can be used as a springboard for parties wishing to correlate geographical information with IP addresses by allowing structured references in the RIPE Database towards information outside the RIPE Database which potentially helps answer Geo IP Location queries
The IETF is currently discussing an update to RPSL to add a new attribute "geofeed: url". The url will reference a csv file containing location data. Some users have already started to make use of this feature via the "remarks: geofeed: url". It is never a good idea to try to overload structured data into the free format "remarks:" attribute. This has been done in the past, for example with abuse contact details before we introduced the "abuse-c:" attribute. There is no way to regulate what database users put into "remarks:" attributes. So even if the new "geofeed:" attribute is not agreed, the url data will still be included in the RIPE Database.
Currently there are 24,408 INETNUM and 516,354 INET6NUM objects containing a "geoloc:" attribute in the database. These have 7,731 distinct values in the INETNUMs and 1,045 distinct values in the INET6NUMs. There are about 150 objects in the RIPE Database with a "remarks: geoloc url" attribute.
On Wed, 7 Apr 2021 at 04:29, denis walker <ripedenis@gmail.com> wrote:
Hi Massimo
I just checked the numbers Ed gave me and I misread the message. These are the numbers of objects with a "geoloc:" attribute not geofeed :(
cheers denis co-chair DB-WG
On Wed, 7 Apr 2021 at 02:56, Massimo Candela <massimo@us.ntt.net> wrote:
Hi Denis,
On 07/04/2021 02:02, denis walker wrote:
Your data does not match the data I got from the RIPE NCC...
From the RIPE NCC:
Currently there are 24,408 INETNUM and 516,354 INET6NUM objects containing a "remarks: geofeed: url" attribute in the database. These have 7,731 distinct values in the INETNUMs and 1,045 distinct values in the INET6NUMs.
I cannot reproduce what you did. Even if I just "grep -i geofeed" in ripe.db.inetnum.gz from the ripe ncc ftp [1], I obtain only 132 items. And 39 in ripe.db.inet6num.gz. The same if I use the complete dump [2].
Is the data in the FTP wrong? Am I doing something wrong?
Ciao, Massimo
[1] https://ftp.ripe.net/ripe/dbase/split/ [2] https://ftp.ripe.net/ripe/dbase/ripe.db.gz
Hi Edward,

Perfect! Thanks.

On 04/05/2021 22:35, Edward Shryane via db-wg wrote:
Hello Denis, Colleagues,
Following is the impact analysis for the implementation of the "geofeed:" attribute in the RIPE database, based on the problem statement below and the draft RFC: https://tools.ietf.org/html/draft-ymbk-opsawg-finding-geofeeds
I will ask our Legal team to conduct a full impact analysis of the implementation plan.
Please reply with corrections or suggestions.
Regards Ed Shryane RIPE NCC
Impact Analysis for Implementing the "geofeed:" Attribute ============================================================
"geoloc:" Attribute ---------------------- Implementing the "geofeed:" attribute does not affect the "geoloc:" attribute. No decision has been taken on the future of the "geoloc:" attribute, a review can be done at a later date.
"remarks:" Attribute ----------------------- Existing "remarks:" attributes in INETNUM or INET6NUM object types containing a "geofeed: url" value will not be automatically converted to a "geofeed:" attribute.
The implementation will validate that an INETNUM or INET6NUM object may contain at most a single geofeed reference, either a "remarks:" attribute *or* a "geofeed:" attribute. More than one will result in an error on update.
Any "remarks:" attributes in other object types will not be validated for geofeed references.
"geofeed:" Attribute ----------------------- The "geofeed:" attribute will be added to the INETNUM and INET6NUM object types. It will be an optional, singly occurring attribute.
The attribute value must consist only of a well-formed URL. Any other content in the attribute value will result in a syntax error.
"geofeed:" URL ----------------- The URL in the "geofeed:" attribute will be validated that it is well-formed (i.e. syntactically correct).
The URL must use the ASCII character set only (in the RIPE database, non-Latin-1 characters will be substituted with a '?' character).
Non-ASCII characters in the URL host name must be converted to ASCII using Punycode in advance (before updating the RIPE database).
Non-ASCII characters in the URL path must be converted using Percent-encoding in advance.
Only the HTTPS protocol is supported in the URL, otherwise an error is returned.
The reachability of the URL will not be checked. The content of the URL will not be validated.
Database dump and Split files ---------------------------------- The "geofeed:" attribute will be included in the nightly database dump and split files.
NRTM -------- The "geofeed:" attribute will be included in INETNUM and INET6NUM objects in the NRTM stream.
Whois Queries ----------------- The "geofeed:" attribute will appear by default in (filtered) INETNUM and INET6NUM objects in Whois query responses, no additional query flag will be needed.
RDAP ------------- The "geofeed:" attribute will not appear in RDAP responses. A separate RDAP profile will be needed to extend the response format to include geofeed. This can be implemented at a later date.
Documentation --------------- The RIPE database documentation will be updated, including the inet(6)num object templates and attribute description (with a reference to the IETF draft document).
Other RIRs ------------- There is currently no coordinated plan to implement "geofeed:" across regions. Other RIRs may implement "geofeed:" at a later date.
Legal Review --------------- An initial review by the RIPE NCC Legal team found that geofeed data may qualify as personal data, and before introducing the "geofeed:" attribute a full impact analysis of its implementation would have to be conducted by the RIPE NCC.
-----
On 12 Apr 2021, at 17:59, denis walker via db-wg <db-wg@ripe.net> wrote:
Colleagues
** corrected version getting the attribute names right **
The chairs agree that there is a consensus to set up an NWI to create the "geofeed:" attribute in the RIPE Database. We therefore ask the RIPE NCC to set up NWI-13, "Create a 'geofeed:' attribute in the RIPE Database", using the 'Problem statement' below. After the RIPE NCC completes its impact analysis we can finalise the 'Solution definition'. The RIPE NCC can address any of the questions raised in this discussion that they feel are relevant to the basic creation of this attribute.
cheers denis co-chair DB-WG
Problem statement
Associating an approximate physical location with an IP address has proven to be a challenge to solve within the current constraints of the RIPE Database. Over the years the community has chosen to consider addresses in the RIPE Database to relate to entities in the assignment process itself, not the subsequent actual use of IP addresses after assignment.
The working group is asked to consider whether the RIPE Database can be used as a springboard for parties wishing to correlate geographical information with IP addresses by allowing structured references in the RIPE Database towards information outside the RIPE Database which potentially helps answer Geo IP Location queries
The IETF is currently discussing an update to RPSL to add a new attribute "geofeed: url". The url will reference a csv file containing location data. Some users have already started to make use of this feature via the "remarks: geofeed: url". It is never a good idea to try to overload structured data into the free format "remarks:" attribute. This has been done in the past, for example with abuse contact details before we introduced the "abuse-c:" attribute. There is no way to regulate what database users put into "remarks:" attributes. So even if the new "geofeed:" attribute is not agreed, the url data will still be included in the RIPE Database.
Currently there are 24,408 INETNUM and 516,354 INET6NUM objects containing a "geoloc:" attribute in the database. These have 7,731 distinct values in the INETNUMs and 1,045 distinct values in the INET6NUMs. There are about 150 objects in the RIPE Database with a "remarks: geoloc url" attribute.
On Wed, 7 Apr 2021 at 04:29, denis walker <ripedenis@gmail.com> wrote:
Hi Massimo
I just checked the numbers Ed gave me and I misread the message. These are the numbers of objects with a "geoloc:" attribute not geofeed :(
cheers denis co-chair DB-WG
On Wed, 7 Apr 2021 at 02:56, Massimo Candela <massimo@us.ntt.net> wrote:
Hi Denis,
On 07/04/2021 02:02, denis walker wrote:
Your data does not match the data I got from the RIPE NCC...
From the RIPE NCC:
Currently there are 24,408 INETNUM and 516,354 INET6NUM objects containing a "remarks: geofeed: url" attribute in the database. These have 7,731 distinct values in the INETNUMs and 1,045 distinct values in the INET6NUMs.
I cannot reproduce what you did. Even if I just "grep -i geofeed" in ripe.db.inetnum.gz from the ripe ncc ftp [1], I obtain only 132 items. And 39 in ripe.db.inet6num.gz. The same if I use the complete dump [2].
Is the data in the FTP wrong? Am I doing something wrong?
Ciao, Massimo
[1] https://ftp.ripe.net/ripe/dbase/split/ [2] https://ftp.ripe.net/ripe/dbase/ripe.db.gz
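For anyone who wants to reproduce such a count themselves, a rough Python sketch along the lines of Massimo's grep (counting matching lines in the split inetnum dump, not distinct objects) might be:

import gzip
import urllib.request

# Count "remarks:" lines mentioning geofeed in the split inetnum dump.
# Note: this counts matching lines, much like "grep -i geofeed",
# not distinct objects. The dump file is large.
URL = "https://ftp.ripe.net/ripe/dbase/split/ripe.db.inetnum.gz"

path, _ = urllib.request.urlretrieve(URL)
count = 0
with gzip.open(path, "rt", encoding="latin-1") as f:
    for line in f:
        if line.lower().startswith("remarks:") and "geofeed" in line.lower():
            count += 1
print(count)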
Hi Ed, This looks good to me :) -Cynthia
On 05/05/2021 21:11, Cynthia Revström via db-wg wrote:

I'd like to ask for a clarification to section 4 specifically:

"Both the address ranges of the signing certificate and of the inetnum: MUST cover all prefixes in the geofeed file; and the address range of the signing certificate must cover that of the inetnum:. An address range A 'covers' address range B if the range of B is identical to or a subset of A. 'Address range' is used here because inetnum: objects and RPKI certificates need not align on CIDR prefix boundaries, while those of geofeed lines must."

What if you have a /16 as recorded by inetnum: as well as an RPKI certificate for that /16 but within the /16 there is a /24 that has been assigned to some other ASN? Can you publish a geofeed file for the /16?

What if there is no inetnum: listed for that /24 yet in the global BGP tables there is an announcement of that /24 from a different ASN - would you still accept the geofeed announcement for the /16 based on inetnum: and RPKI cert?

Regards, Hank
Hi Hank, On 06/05/2021 07:18, Hank Nussbacher via db-wg wrote:
What if you have a /16 as recorded by inetnum: as well as an RPKI certificate for that /16 but within the /16 there is a /24 that has been assigned to some other ASN? Can you publish a geofeed file for the /16?
What if there is no inetnum: listed for that /24 yet in the global BGP tables there is an announcement of that /24 from a different ASN - would you still accept the geofeed announcement for the /16 based on inetnum: and RPKI cert?
ASNs and BGP announcements do not come into play. If a /16 inetnum has a geofeed link, the file it points to can specify entries covering the /16. If a /24 inetnum with a geofeed link exists, this takes priority for that /24 portion. The (optional) RPKI signature can be used -after- the inetnum hierarchy is resolved, to verify ownership of the prefix. Ciao, Massimo
Hi Massimo

Does this mean geofeed values are only meaningful when used as a collection and not individually? Take this example with a /16 and a more specific /24 both having a geofeed attribute. If I query for an address within the /16 but outside of the more specific /24 I will get the INETNUM object for the /16 with its geofeed attribute. But the data contained in the referenced file may not be correct for the more specific /24 range which has its own referenced file.

So for anyone using this data, do you have to query for all objects containing a geofeed attribute and download all the referenced files to be sure of having correct information? Having downloaded all the data you would then have to correlate the data to check for overlapping values, discarding the less specific data for the /24 in this example. Is this how it works or have I missed something?

cheers denis co-chair DB-WG
Hi Denis, On 06/05/2021 23:17, denis walker wrote:
Does this mean geofeed values are only meaningful when used as a collection and not individually? Take this example with a /16 and a more specific /24 both having a geofeed attribute. If I query for an address within the /16 but outside of the more specific /24 I will get the INETNUM object for the /16 with its geofeed attribute. But the data contained in the referenced file may not be correct for the more specific /24 range which has its own referenced file.
The fact that you queried a specific IP doesn't allow you to conclude anything about the entire prefix; the same goes for an inetnum and its more specifics. This is also the case if you provide geofeed files to a geolocation provider directly without going through the whois. In that case they will have to validate the boundaries. In the address space there is no way to assign a property to an entire prefix without checking for more specifics.
So for anyone using this data do you have to query for all objects containing a geofeed attribute and download all the referenced files to be sure of having correct information? Having downloaded all the data you would then have to correlate the data to check for overlapping values, discarding the less specific data for the /24 in this example. Is this how it works or have I missed something?
The process is quite simple. If you are interested in a specific IP, you can just query for the IP, retrieve the geofeed and read the location. If you want to get everything:

(1) parse the bulk whois data and get all the inetnums having a geofeed;
(2) download all the geofeed files and remove all the prefixes not contained in the parent inetnum;
(3) accept all geofeed entries which are coming from the most specific parent inetnum.

The draft provides more details. Example of point 3:

parent-inetnum: 1.2.0.0/16
geofeed-entry: 1.2.3.5,IT,IT-RM,Rome,

parent-inetnum: 1.2.3.0/24
geofeed-entry: 1.2.3.5,IT,IT-MI,Milan,

1.2.3.5 is in Milan due to the more specific parent inetnum.

Ciao, Massimo
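A minimal Python sketch of this most-specific-wins rule, using Massimo's hypothetical /16 and /24 example (the data structure is invented for illustration; a real consumer would build it from the bulk whois data and the downloaded geofeed files):

import ipaddress

# Hypothetical data built from steps 1 and 2: each inetnum prefix maps to
# the geofeed entries its referenced file contains for addresses it covers.
geofeeds = {
    ipaddress.ip_network("1.2.0.0/16"): {"1.2.3.5": "IT,IT-RM,Rome"},
    ipaddress.ip_network("1.2.3.0/24"): {"1.2.3.5": "IT,IT-MI,Milan"},
}

def locate(ip_str):
    """Step 3: the most specific covering inetnum with an entry wins."""
    ip = ipaddress.ip_address(ip_str)
    candidates = [
        (net.prefixlen, entries[ip_str])
        for net, entries in geofeeds.items()
        if ip in net and ip_str in entries
    ]
    # Longest prefix (most specific parent inetnum) wins.
    return max(candidates)[1] if candidates else None

print(locate("1.2.3.5"))  # IT,IT-MI,Milan -- the /24 overrides the /16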
On 06/05/2021 21:27, Massimo Candela via db-wg wrote:
Hi Hank,
On 06/05/2021 07:18, Hank Nussbacher via db-wg wrote:
What if you have a /16 as recorded by inetnum: as well as an RPKI certificate for that /16 but within the /16 there is a /24 that has been assigned to some other ASN? Can you publish a geofeed file for the /16?
What if there is no inetnum: listed for that /24 yet in the global BGP tables there is an announcement of that /24 from a different ASN - would you still accept the geofeed announcement for the /16 based on inetnum: and RPKI cert?
ASNs and BGP announcements do not come into play.
That is what I too would have thought. But Google, which wrote RFC8805, sees it differently. I have submitted to Google via their ISP Portal the following geo-feed file: http://noc.ilan.net.il/GGC/iucc-geo-feed-for-google.csv

Google accepted 15 of 16 prefixes from AS378 but *rejected* 128.139.0.0/16 since there is a BGP announcement from AS8551 for 128.139.194.0/24. Google's solution of "split the ranges in the feed to not include the subranges announced by other ASNs" seems to interpret RFC8805 differently than previous RFCs. From RFC8805 section 3.2:

"A consumer should only trust geolocation information for IP addresses or prefixes for which the publisher has been verified as administratively authoritative. All other geolocation feed entries should be ignored and logged for further administrative review."

AS8551 cannot be administratively authoritative for 128.139.194.0/24 simply because they find such a prefix in the global BGP table. Common whois checks (whois.ripe.net) determine who is authoritative for 128.139.0.0/16 as well as for 128.139.194.0/24. It would appear, based on Google's interpretation and implementation of RFC8805, that they nullify the RIR registry mechanism and base administrative authority solely on what appears in the BGP table. I believe this is incorrect and I have emailed the Google authors of RFC8805 and await a response from them.

I am wondering whether others who will implement RFC8805 will also use BGP routing tables as authoritative for geo-location checks.

Regards, Hank
If a /16 inetnum has a geofeed link, the file it points to can specify entries covering the /16. If a /24 inetnum with a geofeed link exists, this takes priority for that /24 portion. The (optional) RPKI signature can be used -after- the inetnum hierarchy is resolved, to verify ownership of the prefix.
Ciao, Massimo
Hi Hank, On 07/05/2021 07:17, Hank Nussbacher via db-wg wrote:
From RFC8805 section 3.2:
A consumer should only trust geolocation information for IP addresses or prefixes for which the publisher has been verified as administratively authoritative.
I am wondering whether others who will implement RFC8805 will also use BGP routing tables as authoritative for geo-location checks.
You started a good topic. The main goal of RFC8805 is to describe a file format; the validation of resource ownership is up to the consumer. At least that's my interpretation. At the moment, this is done in various ways (including some "original" ones). However, what we are proposing builds on top of RFC8805 exactly with the goal of providing a defined and easy way to correctly retrieve such data. Please see https://tools.ietf.org/html/draft-ietf-opsawg-finding-geofeeds-06 Open implementations, such as [1], will further ease this process. Ciao, Massimo [1] https://github.com/massimocandela/geofeed-finder
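As a rough sketch of what a consumer does with an RFC8805 file once it has found the URL (the URL here is a placeholder, and real tools such as geofeed-finder also perform boundary and signature validation):

import csv
import io
import urllib.request

# Placeholder URL; a real consumer would take it from the "geofeed:"
# attribute (or "remarks: geofeed:") of the covering inetnum.
URL = "https://example.net/geofeed.csv"

def fetch_geofeed(url):
    """Yield (prefix, country, region, city) tuples from an RFC8805 CSV,
    skipping comment and blank lines."""
    with urllib.request.urlopen(url) as resp:
        text = resp.read().decode("utf-8")
    for row in csv.reader(io.StringIO(text)):
        if not row or row[0].lstrip().startswith("#"):
            continue  # comment or empty line
        prefix, country, region, city = (row + [""] * 4)[:4]
        yield prefix.strip(), country.strip(), region.strip(), city.strip()

for entry in fetch_geofeed(URL):
    print(entry)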
I am using https://apps.db.ripe.net/db-web-ui/fulltextsearch to do a full text search to return just the inetnum records, but if I get 800 hits I see only 10 entries per screen. How can I see all 800 hits at once? Thanks for any clue provided. Regards, Hank
Hi Hank, The webpage is restricted to a maximum of 10 pages with 10 matches on each page, but if you don't mind reading XML or JSON you can make the call directly:

XML: https://apps.db.ripe.net/db-web-ui/api/rest/fulltextsearch/select?facet=true&hl=true&q=(TEST)&start=0&rows=800
JSON: https://apps.db.ripe.net/db-web-ui/api/rest/fulltextsearch/select.json?facet=true&hl=true&q=(TEST)&start=0&rows=100

In this example I'm searching for "TEST". Add a "rows" query parameter to limit matches, and ".json" to the end of the path to specify a JSON response.

The upcoming Whois 1.101 release exposes this API directly via https://rest.db.ripe.net/ and we will document it fully. Also we plan to improve the query page in the coming months; we will look at integrating full text search and allowing more results.

Regards Ed Shryane RIPE NCC
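For example, the JSON call above could be scripted like this (the response structure is not asserted here, only fetched; note Ed's follow-up below that the backend currently caps results at 100):

import json
import urllib.parse
import urllib.request

# Sketch of the JSON full text search call described above. The backend
# currently caps results at 100, whatever "rows" says.
params = urllib.parse.urlencode(
    {"facet": "true", "hl": "true", "q": "(TEST)", "start": 0, "rows": 100}
)
url = ("https://apps.db.ripe.net/db-web-ui/api/rest/fulltextsearch/"
       "select.json?" + params)
with urllib.request.urlopen(url) as resp:
    result = json.load(resp)
print(list(result))  # inspect the top-level structure of the response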
Hi Hank, I spoke too soon, Whois still restricts the maximum results to 100. I'll investigate how to retrieve more results and get back to you. Regards Ed
Hi Hank,
Apologies for my delay in replying, I wanted to be sure, but there is currently a hard limit of 100 results in the backend API. We introduced this limit because the full text search engine is much more resource intensive than the database, and it was possible to crash Whois by querying for too many objects. We set a relatively low limit as we found that most users only requested the first page of results. The full text search API is currently only used by the DB web application, and we haven't opened it yet for general use. When we improve the query page in the upcoming months, we will also review the backend API, open it for general use and increase the query limit. Regards Ed Shryane RIPE NCC
On 18/06/2021 00:38, Edward Shryane wrote:

This helps as a start. I had played with the standard search: http://rest.db.ripe.net/search?type-filter=inetnum&query-string=TEST

but as per the documentation on github: https://github.com/RIPE-NCC/whois/wiki/WHOIS-REST-API-search

"type-filter Optional. If specified the results will be filtered by object-type, multiple type-filters can be specified."

it doesn't seem to work, since I am getting object-types other than inetnum (such as role or person). Unless I am doing something wrong. Can you clue me in to the proper syntax? Will type-filter=inetnum work on fulltextsearch as well?

Incidentally, the first and second examples on that RIPE github page: http://rest.db.ripe.net/search?inverse-attribute=org&type-filter=inetnum&source=ripe&query-string=ORG-NCC1-RIPE

result in:

ERROR:101: no entries found
No entries found in source %s.

Seems like source=ripe is causing issues.

Thanks, Hank
Hi Hank,
On 18 Jun 2021, at 08:32, Hank Nussbacher <hank@interall.co.il> wrote:
On 18/06/2021 08:36, Hank Nussbacher wrote:
As I delve deeper I am more confused. I went to: https://www.ripe.net/manage-ips-and-asns/db/support/querying-the-ripe-database#4--query-multiple-databases-with-the-global-resource-service and wanted to search not just RIPE but all RIRs.
This section of the documentation is not clear. There are two ways to search all databases: (1) The "--resource" flag, you don't need to specify all database names, but the search key must be a resource (i.e., an IP prefix/range or an AS number). (2) The "--sources" flag, you do need to specify each database name, and the search key can be a text string (e.g. an organisation name).
So I started with a simple search like: https://apps.db.ripe.net/db-web-ui/query?bflag=true&dflag=false&rflag=true&searchtext=marlink&source=RIPE&types=organisation
1 hit. Very nice. Easy peasy.
The query page searches the RIPE database only by default (the "--sources RIPE" query flag is always specified), and doesn't use the "--resource" flag.
So I now try all RIRs: https://apps.db.ripe.net/db-web-ui/query?bflag=true&dflag=false&rflag=true&searchtext=marlink&source=GRS&types=organisation

No hits. Shouldn't I have had at least 1 hit from RIPE for this search?
Choosing the "Search resource objects in all available databases" radio button on the query page, forces the "--resource" flag, and not the "--sources" flag. So you can't search all databases for a text string, only for a resource, because the "--resources" flag is used for all databases, and not the ""--sources" flag. As a workaround, you can include the "--sources" flag yourself in the input field, e.g. https://apps.db.ripe.net/db-web-ui/query?searchtext=marlink&types=organisation&rflag=true&source=RIPE&source=APNIC-GRS&source=AFRINIC-GRS&source=ARIN-GRS&source=LACNIC-GRS&bflag=true To be clear, we should improve this behaviour to allow searching all databases for text, not only for a resource. Regards Ed Shryane RIPE NCC
Regards, Hank
Hi Hank, You can also use the "-a" or "--all-sources" flag instead of enumerating all possible sources with multiple "--sources" flags. Regards Ed
Hi Hank,
On 18 Jun 2021, at 07:36, Hank Nussbacher <hank@interall.co.il> wrote:
On 18/06/2021 00:38, Edward Shryane wrote:
This helps as a start. I had played with standard search: http://rest.db.ripe.net/search?type-filter=inetnum&query-string=TEST but as per the documentation in github: https://github.com/RIPE-NCC/whois/wiki/WHOIS-REST-API-search with: type-filter Optional. If specified the results will be filtered by object-type, multiple type-filters can be specified.
doesn't seem to work since I am getting object-types other than inetnum (such as role or person). Unless I am doing something wrong. Can you clue me in to the proper syntax? Will type-filter=inetnum work on fulltextsearch as well?
Related objects are returned by default (this is also the behaviour on port 43: "-T inetnum TEST" will return related person/role objects). You need to add the "-r" flag to switch *off* referenced object lookup: "-T inetnum -r TEST" Or using the REST API, add the "flags=r" query parameter: http://rest.db.ripe.net/search?type-filter=inetnum&query-string=TEST&flags=r
Incidentally, the first and second examples on that RIPE github page: http://rest.db.ripe.net/search?inverse-attribute=org&type-filter=inetnum&source=ripe&query-string=ORG-NCC1-RIPE results in: ERROR:101: no entries found No entries found in source %s. Seems like source=ripe is causing issues.
The example uses the RIPE NCC organisation with org-type: RIR, that isn't referenced from any inetnum resources, which isn't very useful! A better example is RIPE NCC with org-type: LIR (ORG-RIEN1-RIPE) that *does* return inetnums: http://rest.db.ripe.net/search?inverse-attribute=org&type-filter=inetnum&source=ripe&query-string=ORG-RIEN1-RIPE I'll correct the examples, thanks for pointing it out. Regards Ed Shryane RIPE NCC
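A small script exercising the two corrected queries above might look like this (assuming only the documented parameters shown in Ed's URLs):

import urllib.parse
import urllib.request

# Exercise the two corrected queries above against the REST API.
BASE = "http://rest.db.ripe.net/search"

def search(params: dict) -> str:
    url = BASE + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8")

# Only inetnum objects, with referenced person/role lookup switched off:
inetnums_only = search(
    {"type-filter": "inetnum", "query-string": "TEST", "flags": "r"}
)

# Inetnums referencing the RIPE NCC LIR organisation:
by_org = search({
    "inverse-attribute": "org",
    "type-filter": "inetnum",
    "source": "ripe",
    "query-string": "ORG-RIEN1-RIPE",
})
print(inetnums_only[:200])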
We are working on a PoC in regards to DR with AWS. We are doing BYOIP and were asked to create an ROA record, which I can easily understand. But AWS also requests an X.509 certificate as per: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-byoip.html which needs to be added to a new "descr:" tag.

They state in their page: "When you provision an address range for use with AWS, you are confirming that you control the address range and are authorizing Amazon to advertise it. We also verify that you control the address range through a signed authorization message. This message is signed with the self-signed X.509 key pair that you used when updating the RDAP record with the X.509 certificate. AWS requires a cryptographically signed authorization message that it presents to the RIR. The RIR authenticates the signature against the certificate that you added to RDAP, and checks the authorization details against the ROA."

Why isn't creating an ROA proof enough that I control the address range? Why are 2 forms of authentication needed (ROA & X.509)? What will happen to the pollution of the "descr:" tag if others like Azure and GCP decide on something similar? Should the community form a standard rather than let the "descr:" field become polluted?

Regards, Hank
Dear Hank, On Tue, Apr 12, 2022 at 06:23:45PM +0300, Hank Nussbacher via db-wg wrote:
Why isn't creating an ROA proof enough that I control the address range?
The extra step is needed because the ROA could also have been created by another entity attempting to BYOIP some space into a provider! Imagine there being two separate accounts (unrelated to each other), each instructs the cloud provider "10.0.0.0/24 is my prefix, can you originate it on my behalf, I create a ROA". The cloud provider has to figure out which of the two accounts is truthful, and which is attempting to trick the cloud provider into announcing that space and routing it to the fraudulent account holder's virtual resources.
Why 2 forms of authentication needed (ROA & X.509)? What will happen to the pollution of the descr tag if others like Azure and GCP decide on something similar?
I'm not concerned about pollution, but indeed it doesn't look super pretty.
Should the community form a standard rather than let the descr field become polluted?
I have good news! Work already is in progress to help with challenges like the above one! :-)

A concept known as "RSC" (RPKI Signed Checklists) might be useful to restructure cloud onboarding procedures for BYOIP:

1) request the IP holder to create a ROA. This way the cloud provider knows they are authorized to originate space.
2) tell the IP holder a secret random string.
3) request the IP holder to create a RSC (this probably would happen via the RIR RPKI dashboard) in which the IP space and the random string are bound to each other.
4) the RSC is uploaded/emailed to the cloud provider, who can verify the signature. This way the cloud provider knows that whoever tried to initiate the BYOIP process also has access to a keypair capable of making attestations.

RSC objects are *not* distributed through the global RPKI repository system; this is neat because this way the RSC concept does not impose a burden on it.

The Internet-Draft is heading towards IETF "Working Group Last Call" https://datatracker.ietf.org/doc/html/draft-ietf-sidrops-rpki-rsc

Kind regards, Job
Thanks for the extensive note Denis, thanks Cynthia for being first-responder. I wanted to jump in on a specific subthread. On Tue, Apr 06, 2021 at 06:38:29PM +0200, Cynthia Revström via db-wg wrote:
Questions:
-Should the database software do any checks on the existence/reachability of the url as part of the update with an error if the check fails?
I would say yes as this is not a new concept to the DB as I believe this is already done with domain objects.
I disagree on this one point: what is the RIPE DB supposed to do when it discovers one state or another? Should the URIs be probed from many vantage points to compare? Once you try to monitor if something is up or down it can quickly become complicated. The 'geofeed:' attribute value references content outside the RIPE DB, which means the RIPE DB software should not be crawling it. All RIPE NCC's DB software needs to check is whether the string's syntax conforms to the HTTPS URI scheme.
-Should the RIPE NCC do any periodic repeat checks on the continued existence/reachability of the url?
I would say that checking once a month or so could be fine, as long as it just results in a nudge email. Like don't enforce it, but nudge people if it is down.
It seems an unnecessary burden for RIPE NCC's business to check whether a given website is up or down. What is such nudging supposed to accomplish? It might end up being busy work if done by an individual RIR.
-Should the RIPE NCC do any periodic checks on the content structure of the csv file referenced by the url?
I don't have a strong opinion either way here but I feel like that is not really something the NCC is responsible for checking. But if the NCC should check then my comments about the repeat reachability checks above apply here too.
The RIPE NCC should not check random URIs, they are not the GeoIP police ;-) Kind regards, Job
Hi Job,

I just want to clarify my stance here. With regards to the verification of the URL, my opinion was that it might be helpful to prevent typos etc., and unless the geofeed attribute is updated, it doesn't need to be validated imo.

This is also not a very "strong opinion", it was more my initial thoughts on the thing, but I don't really care that much if there are reasons to do it another way.

-Cynthia
Hi guys

I've changed the subject as it goes a bit off topic, becomes more general and reaches out beyond just the DB-WG. I've been going to say this for a while but never got round to it until now. Apologies for saying it in response to your email Job, but it's not directed at you.

There are two phrases that frustrate me every time I see them used: "The RIPE NCC is not the 'xyz' police" and "It's not the job of the RIPE NCC to do 'abc'". These are just dramatised ways of saying no to something. But the drama doesn't really add anything. No one is expecting the RIPE NCC to investigate any crimes or arrest anyone. They are not the 'geoip police', the 'internet police', the 'abuse police'. So what are they?

I think everyone would agree that what the RIPE NCC does today is not the same as what they did when they first started in business. So the job that they do has changed. Their role or mandate has grown, expanded, contracted, moved sideways, diversified, etc. Every time they started to do something different or new, it could have been said (and maybe was said) that it was not their job to do that. But they are doing it now anyway. So I would rather turn these infamous statements round and be positive instead of negative. Let's stop saying what it's not their job to do and ask if it is, or should/could it be, their job to do something helpful or beneficial.

The internet technical infrastructure is like a whole ecosystem now. Lots of different elements all working together and managed or controlled by large numbers of organisations. If anyone wants to have a good life in this cyber world, all parts of this ecosystem need to be operating well. Many of these elements have no checks or monitoring. They run on trust. Trust is hard to build and easy to lose. Once people lose trust in one element they start to call it a swamp, say it's inaccurate, useless, needs to be replaced. These comments have often been made about the RIPE Database as a whole, often by people partly responsible for its content. It's also been said about parts of the content like abuse contacts. It could end up being said about geofeed data.

One of the reasons people use to justify these infamous statements is the cost or complexity of doing something. They think doing checks needs FTEs sitting behind desks doing laborious tasks. That costs money for the members. They forget this is the 21st century. We have learned now how to use computers to do these tasks for us.

Abuse contact checking is a good example. Every proposal to do anything in this area is repeatedly hit with these infamous statements and more. Perhaps because the technical checks now being done are done the wrong way. If an email address fails the checks it triggers manual intervention requiring an FTE to schedule an ARC with the resource holder and follow up discussions. This should be fully automated. If a monthly check fails, software should send an email to the registered contact for the resource holder. If n monthly checks fail, the ORGANISATION object in the RIPE Database should be tagged as having an invalid abuse contact. That information should be available for anyone to see. Public disclosure can be the penalty for failing to handle abuse. People can then make informed decisions.

How does this affect geofeed? The same principles apply here. What we have now is a handful of companies providing geolocation data. I am sure they put a lot of effort into ensuring their data is accurate.
This geofeed attribute will delegate this information process out to thousands of organisations. Some of these will put a lot of effort into ensuring their data is valid and accurate. Some may put less effort in, especially over time. If a proportion of this data starts to degrade over time, is shown to be inaccurate or syntactically invalid, trust in the whole system dies. If checks and tests can be done to validate the data in any way, it may help to keep it up to date and accurate.

If each RIR maintains a list of geofeed urls in a file on the FTP site, each RIR can check availability of those urls each month for all the RIRs' lists. I don't know if checks from 5 locations is enough. Maybe a third party system can be used for the 'is it up' check? Any repeated failures can be notified to the resource holder's contact. If each RIR downloads the files for their region they can check the syntax, check for conflicting data in multiple files within a hierarchy, etc. Any failures can be reported to the contact. All of this can be automated. If any repeated errors are not fixed, the geofeed data in the RIPE Database can again be tagged as invalid or suspect. When anyone accesses this data it comes with a red flag. It is up to them if they will trust any of that data file.

For both abuse contacts and geofeed, a system can be set up for (trusted) users to report problems. Maybe abuse contacts that are valid but never resolve any reported issues. Or geofeed data that is known to be inaccurate. By adding appropriate tags to the meta data in the RIPE Database, which can be publicly viewed, this becomes a reputational system. Overall it would improve the quality of data available in or through the RIPE Database, which improves the value of the services. There may be other elements in the database that could benefit from this type of tagging and reporting.

I see the RIPE NCC as being in a good position to do these types of checks and tests. It would not be the RIPE Database software doing the checks, but an additional RIPE NCC service. Minimal costs with fully automated checks can give added benefits. I think it is their job to do this for the good of the internet.

cheers denis co-chair DB-WG
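A minimal Python sketch of the kind of automated monthly check denis describes; the failure threshold, the structural check and the notification step are all assumptions for illustration, not an agreed design:

import csv
import io
import urllib.request

# Illustrative only: the threshold and notification are assumptions.
FAILURE_THRESHOLD = 3  # consecutive monthly failures before tagging

def check_geofeed(url: str) -> bool:
    """Return True if the geofeed URL is reachable and parses as CSV."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            text = resp.read().decode("utf-8", errors="replace")
        list(csv.reader(io.StringIO(text)))  # structural check only
        return True
    except Exception:
        return False

def monthly_run(feeds: dict) -> None:
    """feeds maps geofeed URL -> consecutive failure count so far."""
    for url, failures in feeds.items():
        if check_geofeed(url):
            feeds[url] = 0
        else:
            feeds[url] = failures + 1
            if feeds[url] >= FAILURE_THRESHOLD:
                # In this proposal: notify the resource holder's contact
                # and tag the data as suspect in the database.
                print("tag as suspect and notify holder:", url)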
Thanks for the extensive note Denis, thanks Cynthia for being first-responder. I wanted to jump in on a specific subthread.
On Tue, Apr 06, 2021 at 06:38:29PM +0200, Cynthia Revström via db-wg wrote:
Questions:
-Should the database software do any checks on the existence/reachability of the url as part of the update with an error if the check fails?
I would say yes; this is not a new concept to the DB, as I believe this is already done with domain objects.
I disagree on this one point: what is the RIPE DB supposed to do when it discovers one state or another? Should the URIs be probed from many vantage points to compare? Once you try to monitor whether something is up or down, it can quickly become complicated.
The 'geofeed:' attribute value references content outside the RIPE DB, which means the RIPE DB software should not be crawling it.
All the RIPE NCC's DB software needs to check is whether the string's syntax conforms to the HTTPS URI scheme.
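[As a concrete illustration, such a syntax-only check could be as small as the following Python sketch. Exactly which rules "conforms to the HTTPS URI scheme" implies in practice is an assumption here, not something specified in the thread.]

    from urllib.parse import urlparse

    def is_valid_geofeed_url(value):
        # Syntax only: a well-formed URL using the https scheme with a
        # host part. No network access, no content check.
        try:
            parts = urlparse(value)
        except ValueError:
            return False
        return parts.scheme == "https" and bool(parts.netloc)

    # is_valid_geofeed_url("https://example.net/geofeed.csv") -> True
    # is_valid_geofeed_url("http://example.net/geofeed.csv")  -> False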
-Should the RIPE NCC do any periodic repeat checks on the continued existence/reachability of the url?
I would say that checking once a month or so could be fine, as long as it just results in a nudge email. Don't enforce it, but nudge people if it is down.
It seems an unnecessary burden on the RIPE NCC's business to check whether a given website is up or down. What is such nudging supposed to accomplish? It might end up being busy work if done by an individual RIR.
-Should the RIPE NCC do any periodic checks on the content structure of the csv file referenced by the url?
I don't have a strong opinion either way here, but I feel that this is not really something the NCC is responsible for checking. If the NCC should check, though, then my comments about the repeat reachability checks above apply here too.
The RIPE NCC should not check random URIs, they are not the GeoIP police ;-)
Kind regards,
Job
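[A hypothetical sketch of the fully automated monthly check loop Denis describes above, in Python using only the standard library. The threshold, the notification and the tagging are placeholders for illustration, not anything the RIPE NCC actually runs.]

    import urllib.request

    FAILURE_THRESHOLD = 3  # 'n' consecutive monthly failures before tagging

    def notify_contact(url):
        print("nudge email to registered contact about", url)  # placeholder

    def tag_as_suspect(url):
        print("publicly tag geofeed data for", url, "as suspect")  # placeholder

    def monthly_check(url, failures):
        # One monthly probe: reset the counter on success; on failure,
        # nudge the contact and, after n failures in a row, tag the data.
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                ok = resp.status < 400
        except OSError:
            ok = False
        if ok:
            failures[url] = 0
            return
        failures[url] = failures.get(url, 0) + 1
        notify_contact(url)
        if failures[url] >= FAILURE_THRESHOLD:
            tag_as_suspect(url)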
Hi Denis,

Apologies if this email comes off as rude or harsh; that isn't my intention, but I am not quite sure how else to phrase it.

So when I say things like this, it's not because of cost or anything like that. I say it because I don't think validating the CSV is something that would be a benefit.

Abuse contacts are validated and required (except for some legacy resources, iirc) because they are important in order to report abuse. Having a geofeed service is not a requirement, and additionally I would think that the data consumer would almost always be software dedicated to this. If so, then that software can easily validate the CSV data itself.

Considering those two points, I see abuse contacts and geofeed as very different things.

-Cynthia

On Wed, Apr 7, 2021, 00:45 denis walker <ripedenis@gmail.com> wrote:
[...]
Hi Cynthia

I don't take criticism personally. I am known for my wild ideas... occasionally I come up with a good one :) But let's take another look at this geofeed. Any service offered by or through the RIPE Database should be a high-quality and reliable service. It doesn't matter if it is an essential service or not. The reputation of the RIPE Database itself rests on the quality of the services derived from it.

You say most consumers of this geofeed data will be software capable of validating the csv file. What will this software do when it finds invalid data? Just ignore it? Will this software know who to report data errors to? Will it have any means to follow up on reported errors? Will anyone notice if the quality of data deteriorates over time? What will be done with data that is considered to be inaccurate (by mistake or deliberate intention)? Will it be reported? Will there be any follow-up, or will it just be discarded?

Services like geofeed are good ideas. But if the data quality or accessibility deteriorates over time it becomes useless, or even misleading. That is why I believe centralised validation, testing and reporting are helpful. I think the RIRs are well positioned to do these tasks and should do more of them. Abuse contacts and geofeed are different things... but they are both secondary services facilitated by the RIPE Database and both need to be trustworthy.

cheers
denis
co-chair DB-WG

On Wed, 7 Apr 2021 at 08:20, Cynthia Revström <me@cynthia.re> wrote:
[...]
Hi Denis,

This message is in response to several messages in the discussion.

In brief: I have seen network operators distraught because their network was misclassified as being in the wrong geography for the services their customers needed to access, and they had no way to fix that situation. I feel that publishing geofeed data in the RIPE Database would be a good thing to do, as it helps network operators share data in a structured way and should reduce the overall amount of pain from misclassified networks.

I personally would like to see agreement on your draft problem statement, and some feedback from the RIPE NCC, before focusing on some of the more detailed questions you raised.

I also agree with you that accurate and reliable data is important. But...

On Wed, Apr 7, 2021 at 7:19 AM denis walker via db-wg <db-wg@ripe.net> wrote:
[...]
You say most consumers of this geofeed data will be software capable of validating the csv file. What will this software do when it finds invalid data? Just ignore it? Will this software know who to report data errors to? Will it have any means to follow up on reported errors?
I would have thought that anyone implementing a parser for this data would also be able to query the database for a tech-c and report validation failures. Based on my previous interactions with network operators who have suffered misclassification, I am confident that there is a strong incentive for networks to publish well-formatted, accurate data and to fix any errors quickly. That said, there are many possible ways to reduce the risk of badly formatted data. For instance, the RIPE NCC could offer a tool to create the relevant files to be published, either through the LIR Portal or as a standalone tool. This is why I'd like to see feedback from the RIPE NCC ahead of an implementation discussion.
Services like geofeed are good ideas. But if the data quality or accessibility deteriorates over time it becomes useless, or even misleading. That is why I believe centralised validation, testing and reporting are helpful. I think the RIRs are well positioned to do these tasks and should do more of them.
I agree with you that defining what data means and keeping it accurate is important. But in the case of geo data, could the RIPE NCC validate the content as well as the data structures? I'd have thought that the publishers and the users of the data would be in the best position to do that. Am I wrong? Kind regards, Leo
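[Leo's point about a parser querying the database for a tech-c could look roughly like this in practice. A sketch in Python against the RIPE DB REST search interface at rest.db.ripe.net; the endpoint and the JSON response layout are assumptions to verify against the current API documentation.]

    import json
    import urllib.parse
    import urllib.request

    def tech_c_handles(resource):
        # Query the search interface for the objects covering 'resource'
        # and collect any tech-c handles, so a geofeed consumer knows who
        # to report validation failures to.
        url = ("https://rest.db.ripe.net/search?query-string="
               + urllib.parse.quote(resource))
        req = urllib.request.Request(url, headers={"Accept": "application/json"})
        with urllib.request.urlopen(req, timeout=10) as resp:
            data = json.load(resp)
        return [attr["value"]
                for obj in data.get("objects", {}).get("object", [])
                for attr in obj.get("attributes", {}).get("attribute", [])
                if attr.get("name") == "tech-c"]

    # e.g. tech_c_handles("193.0.0.0/21") might return a list of role handles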
Hi,

I just wanted to clarify my stance on validation a bit more.

I am totally against trying to validate the data itself; that is not what the NCC is supposed to do. Validating the format of the CSV might be okay, but honestly anything beyond validating that the URL does not return a 404 Not Found is a bit too much in my opinion.

I also agree with Leo's points with regards to fixing the data; I believe that the data publishers have a pretty strong incentive for the data to be accurate. And as Leo also mentions, the tech-c and/or admin-c contacts are also published, so finding a reporting mechanism for issues would not be very difficult.

With regards to misformatted data: if I were writing a parser I would probably just ignore that entry, log the error and report it to an engineer, who can then forward it to the admin contact if they determine it to be a real issue.

In order not to delay this indefinitely: while it shouldn't be rushed, I am not sure how realistic this issue is or how much harm it would cause anyone. Also, how much validation is done could be changed in the future if this is shown to be an actual real-world problem.

-Cynthia

On Wed, Apr 7, 2021 at 10:58 PM Leo Vegoda <leo@vegoda.org> wrote:
[...]
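[To make the skip-and-log behaviour Cynthia describes concrete, a consumer-side parser might look like the sketch below, in Python. The five-field line layout (prefix, country, region, city, postal code) follows the geofeed CSV format in the draft and should be checked against the final RFC.]

    import csv
    import ipaddress
    import logging

    def parse_geofeed(text):
        # Parse geofeed lines: prefix,country,region,city[,postal].
        # Invalid entries are logged and skipped rather than failing
        # the whole file.
        entries = []
        for row in csv.reader(text.splitlines()):
            if not row or row[0].lstrip().startswith("#"):
                continue  # blank line or comment
            try:
                prefix = ipaddress.ip_network(row[0].strip())
            except ValueError:
                logging.warning("skipping malformed entry: %r", row)
                continue
            tail = [f.strip() for f in row[1:5]]
            tail += [""] * (4 - len(tail))  # pad missing optional fields
            entries.append((prefix, *tail))
        return entries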
The Geofeed: field is a URL. It points to a resource. The semantic content of the resource should not be checked; what matters is that the URL is not a 404 at the time of publication. If you want to check it isn't a 404 after that, it's like lame-delegation checks: good to do, not strictly essential in the role of Whois/RPSL. If you want to check the semantic intent of the .csv geo data, that's not db-wg.

This work is important. It substantially improves the STEERAGE to find the delegates' assertions about geo for their INRs. This is sufficiently high value in itself that it's worth doing. Checking the integrity of what they say goes beyond the role of a steerage/directory function. (my opinion)

cheers
-G

On Thu, Apr 8, 2021 at 8:19 AM Cynthia Revström via db-wg <db-wg@ripe.net> wrote:
[...]
Hi Denis,

I have so far not seen anyone (other than you) suggest doing anything more than checking that the URL is valid and doesn't 404. The people who have so far commented on this are: me, Job, George Michaelson, Leo Vegoda.

Could we consider creating an NWI with a reduced scope?

Let me start by phrasing a question like this: does anyone object to validating the value as a valid HTTPS URL that returns a 200 status code upon creation? If yes, is it too much validation or too little?

-Cynthia

On Thu, Apr 8, 2021 at 12:29 AM George Michaelson <ggm@algebras.org> wrote:
[...]
I'd say rather than requiring a 200, accept any 2xx, allowing for 30x redirection, HTTP->HTTPS uplift and other things. And gzip compression. So, basically, completion of a data exchange. Probably in the spirit of what you meant. As long as that's what "200" means, I'd be fine!

cheers
-G

On Thu, Apr 8, 2021 at 8:42 AM Cynthia Revström via db-wg <db-wg@ripe.net> wrote:
[...]
Yeah, that's a good point. I guess "non-error status code" rather than "200 status code".

-Cynthia

On Thu, Apr 8, 2021 at 12:47 AM George Michaelson <ggm@algebras.org> wrote:
[...]
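[The check Cynthia and George converge on (follow redirects, accept completion of a data exchange rather than a literal 200) could be expressed like this. A sketch in Python: urllib follows redirects by default and raises on 4xx/5xx. HEAD is used here to avoid downloading the file, which is an assumption; some servers may insist on GET.]

    import urllib.error
    import urllib.request

    def url_completes(url):
        # Follow 30x redirects (urllib's default), advertise gzip, and
        # treat any non-error final status as a completed exchange.
        req = urllib.request.Request(url, method="HEAD",
                                     headers={"Accept-Encoding": "gzip"})
        try:
            with urllib.request.urlopen(req, timeout=10):
                return True
        except urllib.error.URLError:
            return False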
Could we consider creating an NWI with a reduced scope?
as an exercise, how minimal can we get?

randy
Hi Randy,
On 8 Apr 2021, at 13:54, Randy Bush via db-wg <db-wg@ripe.net> wrote:
Could we consider creating an NWI with a reduced scope?
as an exercise, how minimal can we get?
randy
Given the draft RFC: https://datatracker.ietf.org/doc/draft-ietf-opsawg-finding-geofeeds/?include...

I suggest the following minimal Solution Definition for an NWI:

- Implement support for an optional, single "geofeed:" attribute in inetnum and inet6num object types.
- Validate that there is a maximum of either one "remarks: Geofeed" or one "geofeed:" attribute per object.
- Validate that the "geofeed:" URL is well-formed and specifies the HTTPS protocol.
- Include the "geofeed:" attribute in database dumps and split files.

And inversely, what we could leave out (to simplify the implementation):

- Do not support non-ASCII values in URL domain names or paths (these must be converted beforehand).
- Do not migrate (re-write) "remarks: Geofeed" values as "geofeed:" attributes.
- Do not validate that the URL is reachable (available) and do not validate the content.

Is this enough to satisfy the draft requirements? Is it enough to be useful for the DB-WG?

Regards
Ed Shryane
RIPE NCC
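[Ed's second bullet, at most one "remarks: Geofeed" or "geofeed:" per object, is simple to express as a sketch in Python. The exact "remarks: Geofeed" spelling is taken from the draft, and the matching rule here is an assumption.]

    def count_geofeed_refs(rpsl_object):
        # Count geofeed references: either a proper "geofeed:" attribute
        # or a "remarks:" attribute whose value starts with "Geofeed ".
        n = 0
        for line in rpsl_object.splitlines():
            if line.startswith("geofeed:"):
                n += 1
            elif line.startswith("remarks:"):
                value = line[len("remarks:"):].strip()
                if value.startswith("Geofeed "):
                    n += 1
        return n

    def at_most_one_geofeed(rpsl_object):
        return count_geofeed_refs(rpsl_object) <= 1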
Hi Ed,

This seems like a good implementation to me. However, I don't think it's a good idea to limit the values on the "remarks:" attribute in this way, as this could cause unwanted side effects with, for example, messages left on objects for other network operators.

Also:

"Do not support non-ASCII values in URL domain names or paths (these must be converted beforehand)"

Do you mean by this not supporting non-ASCII entirely? Or having, for example, the web interface convert IDNs to punycode, and having this listed on the object? If the latter, and remarks can remain free-form, I'd say let's implement.

Cheers,
Jori Vanneste
FOD11-RIPE

-------- Original Message --------
On Apr 8, 2021, 2:27 PM, Edward Shryane via db-wg <db-wg@ripe.net> wrote:
[...]
Hi Jori,
On 8 Apr 2021, at 14:42, Tyrasuki <tyrasuki@pm.me> wrote:
Hi Ed,
This seems like a good implementation to me.
However, I don't think it's a good idea to limit the values on the "remarks:" attribute in this way, as this could cause unwanted side effects with, for example, messages left on objects for other network operators.
Given that the draft states "Any particular inetnum: object MAY have, at most, one geofeed reference, whether a remarks: or a proper geofeed: attribute when one is defined.", do we enforce this by validating that there is only one "remarks: Geofeed" value (or "geofeed:") in the object?
Also:
"Do not support non-ASCII values in URL domain names or path (these must be converted beforehand)"
Do you mean by this not supporting non-ASCII entirely? Or having, for example, the web interface convert IDNs to punycode, and having this listed on the object?
The RIPE Database uses the Latin-1 character set, so non-ASCII characters in IDN domain names or in the URL path will be substituted with a '?' character by default. We could support non-ASCII values by automatically converting them (as we do with non-ASCII domains in email addresses).
If the latter, and remarks can remain free-form, I'd say let's implement.
Cheers, Jori Vanneste FOD11-RIPE
Regards Ed
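[Ed's suggested conversion could work roughly like this for hostnames. A sketch using Python's built-in 'idna' codec (IDNA 2003); how the NCC actually converts non-ASCII email domains today is not specified in the thread, so treat the details as assumptions. Percent-encoding of a non-ASCII path is left out.]

    from urllib.parse import urlsplit, urlunsplit

    def to_ascii_url(url):
        # Convert an IDN hostname to punycode so the URL fits within the
        # database's Latin-1/ASCII constraints.
        parts = urlsplit(url)
        host = parts.hostname.encode("idna").decode("ascii")
        netloc = host if parts.port is None else f"{host}:{parts.port}"
        return urlunsplit((parts.scheme, netloc, parts.path,
                           parts.query, parts.fragment))

    # to_ascii_url("https://bücher.example/geofeed.csv")
    #   -> "https://xn--bcher-kva.example/geofeed.csv"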
Hi Ed,

On 4/8/2021 3:36 PM, Edward Shryane via db-wg wrote:

Given that the draft states "Any particular inetnum: object MAY have, at most, one geofeed reference, whether a remarks: or a proper geofeed: attribute when one is defined.", do we enforce this by validating that there is only one "remarks: Geofeed" value (or "geofeed:") in the object?

My apologies, I think I missed this section in the draft. Thanks for clarifying the reason to me.

The RIPE Database uses the Latin-1 character set, so non-ASCII characters in IDN domain names or in the URL path will be substituted with a '?' character by default. We could support non-ASCII values by automatically converting them (as we do with non-ASCII domains in email addresses).

That sounds like a good approach to me, thanks for clarifying. :)

I think this is indeed a good starting ground for a minimal NWI, and would like to see where this goes.

Best regards,
Jori Vanneste
FOD11-RIPE
Hello Jori, Edward and all.

I apologise for resurrecting this very old thread.

We are using the files in the RIPE DB for creating our own geolocation DB. It's straightforward to get country-level geo-ip classification. We are now looking into city-level geo-ip information, and I have just come across this old thread about "geofeed".

It would be great to know whether this geofeed is already implemented in the data on the RIPE FTP site.

Thank you very much.

With kind regards.
Arcadius,

On Thu, 8 Apr 2021 at 15:18, Jori Vanneste via db-wg <db-wg@ripe.net> wrote:
[...]
--
Arcadius Ahouansou
Menelic Ltd | Applied Knowledge Is Power
Office: +441444702101
Mobile: +447908761999
Menelic Ltd: menelic.com
SmartLobby: SmartLobby.co <https://smartlobby.co/>
Hosted Apache Solr Services: solrfarm.com
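[For context, the country-level approach Arcadius describes can be done with a few lines over the inetnum split file. A sketch in Python; the file name and the blank-line-between-objects layout are assumptions, and the caveats Denis raises below about what "country:" actually means still apply.]

    import gzip
    import ipaddress

    def country_ranges(path="ripe.db.inetnum.gz"):
        # Walk the split file object by object, pairing each inetnum
        # range with the first country: attribute seen in that object.
        ranges = []
        start = end = country = None
        with gzip.open(path, "rt", encoding="latin-1") as f:
            for line in f:
                if line.startswith("inetnum:"):
                    value = line.split(":", 1)[1].strip()
                    start, _, end = value.partition(" - ")
                    country = None
                elif line.startswith("country:") and country is None:
                    country = line.split(":", 1)[1].strip()
                elif not line.strip() and start:
                    ranges.append((ipaddress.ip_address(start),
                                   ipaddress.ip_address(end), country))
                    start = None
        return ranges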
Hi Arcadius

If you download the inetnum split file from the RIPE FTP site you will see there are already some "geofeed:" attributes in there, as well as some still using the earlier "remarks: geofeed" option.

denis$ zgrep "^geofeed:" ~/Desktop/ripe.db.inetnum.gz | wc -l
289
denis$ zgrep "^remarks:\s*geofeed" ~/Desktop/ripe.db.inetnum.gz | wc -l
13

Although among those 289 there are only 161 unique organisations. There are still a lot of people using the (old) "geoloc:" attribute.

denis$ zgrep "^geoloc:" ~/Desktop/ripe.db.inetnum.gz | wc -l
35453

You mentioned getting country-level geo-ip: are you using the "country:" attribute? If so, which one? The ones in the INET(6)NUM objects are undefined to anyone other than the resource holder. The only one that is defined is the "country:" attribute in the referenced ORGANISATION object for allocations and PI assignments. This is the country the resource holder is legally based in.

cheers
denis
co-chair DB-WG

On Tue, 12 Apr 2022 at 19:09, Arcadius Ahouansou via db-wg <db-wg@ripe.net> wrote:
[...]
Hello Denis and All.

Thank you very much for your reply.

For instance, with an entry like the one shown below, we will be using NL as the country. As you rightly said, that is the country of the organisation. So we are working on moving to something more accurate, at the city level.

    % Tags relating to '156.150.0.0 - 156.150.255.255'
    % RIPE-REGISTRY-RESOURCE
    inetnum:        192.68.230.0 - 192.68.230.255
    netname:        Atos-MEV
    country:        NL
    org:            ORG-OB2-RIPE
    admin-c:        DUMY-RIPE
    tech-c:         DUMY-RIPE
    status:         LEGACY
    notify:         Global-IPcoordinator@atos.net
    mnt-by:         RIPE-NCC-LEGACY-MNT
    mnt-routes:     MNT-VALUESOLUTIONS
    mnt-by:         GIPC-ORIGIN-MNT
    created:        1970-01-01T00:00:00Z
    last-modified:  2017-08-24T08:32:00Z
    source:         RIPE
    remarks:        ****************************
    remarks:        * THIS OBJECT IS MODIFIED
    remarks:        * Please note that all data that is generally regarded as personal
    remarks:        * data has been removed from this object.
    remarks:        * To view the original object, please query the RIPE Database at:
    remarks:        * http://www.ripe.net/whois
    remarks:        ****************************

Thank you very much.

With best regards
Arcadius
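For illustration, country-level extraction of the kind Arcadius describes might be sketched like this in Python, assuming a local copy of the gzipped inetnum split file (the file path is a placeholder):

    import gzip
    from collections import Counter

    def inetnum_countries(path="ripe.db.inetnum.gz"):
        """Yield (inetnum range, country) pairs from an RPSL split file.

        Objects in the dump are separated by blank lines; this minimal
        sketch takes only the first "country:" attribute per object and
        ignores RPSL continuation lines and comments.
        """
        rng, country = None, None
        with gzip.open(path, "rt", encoding="latin-1") as f:
            for line in f:
                line = line.rstrip("\n")
                if not line:  # blank line ends the current object
                    if rng and country:
                        yield rng, country
                    rng, country = None, None
                elif line.startswith("inetnum:"):
                    rng = line.split(":", 1)[1].strip()
                elif line.startswith("country:") and country is None:
                    country = line.split(":", 1)[1].strip()
            if rng and country:  # last object if the file lacks a trailing blank line
                yield rng, country

    # Example: count objects per country code
    counts = Counter(c for _, c in inetnum_countries())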
--
Arcadius Ahouansou
Menelic Ltd | Applied Knowledge Is Power
Office: +441444702101 Mobile: +447908761999
Menelic Ltd: menelic.com
SmartLobby: SmartLobby.co
Hosted Apache Solr Services: solrfarm.com
On Thu, Apr 08, 2021 at 02:27:13PM +0200, Edward Shryane via db-wg wrote:
On 8 Apr 2021, at 13:54, Randy Bush via db-wg <db-wg@ripe.net> wrote:
Could we consider creating an NWI with a reduced scope?
as an exercise, how minimal can we get?
Given the draft RFC: https://datatracker.ietf.org/doc/draft-ietf-opsawg-finding-geofeeds/?include...
I suggest the following minimal Solution Definition for an NWI:
- Implement support for an optional, single "geofeed:" attribute in inetnum and inet6num object types.
- Validate there is a maximum of either one "remarks: Geofeed" or one "geofeed:" attribute per object.
- Validate the "geofeed:" URL is well-formed and specifies the HTTPS protocol.
- Include the "geofeed:" attribute in database dumps and split files.
And inversely, what we could leave out (to simplify the implementation):
- Do not support non-ASCII values in URL domain names or path (these must be converted beforehand).
- Do not migrate (re-write) "remarks: Geofeed" values as "geofeed:" attributes.
- Do not validate that the URL is reachable (available) and do not validate the content.
Is this enough to satisfy the draft requirements? Is it enough to be useful for the DB-WG?
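For illustration, the two validation rules in Ed's first list (at most one geofeed reference per object, and a well-formed HTTPS URL) might be sketched like this, assuming object attributes are available as (name, value) pairs:

    from urllib.parse import urlparse

    def validate_geofeed(attributes):
        """Check the two proposed syntax rules on a parsed object.

        `attributes` is a list of (name, value) pairs; returns a list of
        error strings, empty if the object passes. Matching on the
        "remarks:" value is done case-insensitively here, which is an
        assumption rather than something the draft spells out.
        """
        errors = []
        geofeeds = [v for n, v in attributes if n == "geofeed"]
        remarks = [v for n, v in attributes
                   if n == "remarks" and v.lower().startswith("geofeed:")]
        if len(geofeeds) + len(remarks) > 1:
            errors.append("at most one geofeed reference is allowed")
        for value in geofeeds:
            url = urlparse(value)
            if url.scheme != "https" or not url.netloc:
                errors.append(f"not a well-formed HTTPS URL: {value}")
        return errors

A real implementation would hook a check like this into the update pipeline and report the errors in the update response.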
I think the above is a great start to get a 'minimal viable geofeed' going! The first priority should be to move things over from 'remarks:' to a dedicated attribute.

Kind regards,
Job
Guys

I don't see the issue of what, if anything, should be validated as a show stopper for introducing the "geofeed:" attribute. This is my idea of utilising the RIRs to improve the value of services with increased validation. That's why I changed the subject line and started it as a different thread. We can come back to this later. Apologies for taking you down a side road, but at least I got some initial feelings from you on this more general issue.

cheers
denis
co-chair DB-WG

On Thu, 8 Apr 2021 at 00:28, George Michaelson via db-wg <db-wg@ripe.net> wrote:
The Geofeed: field is a URL.
It points to a resource.
The semantic content of the resource should not be checked; what matters is that the URL is not a 404 at the time of publication.
If you want to check it isn't a 404 after that, it's like Lame checks: good to do, not strictly essential in the role of Whois/RPSL.
If you want to check the semantic intent of the .csv geo data, that's not db-wg.
This work is important. It substantially improves the STEERAGE to find the delegates' assertions about geo for their INR. This is sufficiently high value in itself that it's worth doing. Checking the integrity of what they say goes beyond the role of a steerage/directory function.
(my opinion)
cheers
-G
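For illustration, the publication-time check George describes (the URL answers with something other than a 404; the content is deliberately not inspected) might be sketched like this:

    import urllib.request
    import urllib.error

    def geofeed_reachable(url, timeout=10):
        """Return True if the geofeed URL answers with a non-404 status.

        A HEAD request is used so the CSV body is never downloaded and
        its semantic content is never examined.
        """
        req = urllib.request.Request(url, method="HEAD")
        try:
            with urllib.request.urlopen(req, timeout=timeout) as resp:
                return resp.status != 404
        except urllib.error.HTTPError as e:
            return e.code != 404
        except urllib.error.URLError:
            return False  # an unreachable host counts as a failed check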
On Thu, Apr 8, 2021 at 8:19 AM Cynthia Revström via db-wg <db-wg@ripe.net> wrote:
Hi,
I just wanted to clarify my stance on validation a bit more.
I am totally against trying to validate the data itself, that is not what the NCC is supposed to do. Validating the format of the CSV might be okay but honestly anything beyond validating that it is not a 404 not found is a bit too much in my opinion.
I also agree with Leo's points with regards to fixing the data, I believe that the data publishers have a pretty strong incentive to have the data be accurate. And as Leo also mentions, the tech-c and/or admin-c contacts are also published so finding a reporting mechanism for issues would not be very difficult.
And with regards to misformatted data, yeah I would probably just ignore that entry if I was writing a parser and log the error and report it to an engineer who can then forward it to the admin contact if they determine it to be a real issue.
In order not to delay this indefinitely: while it shouldn't be rushed, I am also not sure how realistic this issue is, or how much harm it would cause anyone.
Also, how much validation is done could be changed in the future if this is shown to be an actual real-world problem.
-Cynthia
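A sketch of the tolerant parsing Cynthia describes, assuming the RFC 8805 geofeed CSV layout (prefix, country, region, city, postal code); malformed entries are logged and skipped rather than failing the whole file:

    import csv
    import ipaddress
    import logging

    log = logging.getLogger("geofeed")

    def parse_geofeed(lines):
        """Yield (network, country, region, city) from geofeed CSV lines."""
        for lineno, row in enumerate(csv.reader(lines), start=1):
            if not row or row[0].lstrip().startswith("#"):
                continue  # blank lines and '#' comments are allowed
            try:
                net = ipaddress.ip_network(row[0].strip())
                # Pad short rows so country/region/city unpack cleanly
                country, region, city = (row + ["", "", ""])[1:4]
                yield net, country.strip(), region.strip(), city.strip()
            except ValueError as e:
                log.error("line %d: skipping malformed entry (%s)", lineno, e)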
On Wed, Apr 7, 2021 at 10:58 PM Leo Vegoda <leo@vegoda.org> wrote:
Hi Denis,
This message is in response to several messages in the discussion.
In brief: I have seen network operators distraught because their network was misclassified as being in the wrong geography for the services their customers needed to access and they had no way to fix that situation. I feel that publishing geofeed data in the RIPE Database would be a good thing to do as it helps network operators share data in a structured way and should reduce the overall amount of pain from misclassified networks.
I personally would like to see an agreement on your draft problem statement and some feedback from the RIPE NCC before focusing on some of the more detailed questions you raised.
I also agree with you that accurate and reliable data is important. But...
On Wed, Apr 7, 2021 at 7:19 AM denis walker via db-wg <db-wg@ripe.net> wrote:
[...]
You say most consumers of this geofeed data will be software capable of validating the csv file. What will this software do when it finds invalid data? Just ignore it? Will this software know who to report data errors to? Will it have any means to follow up on reported errors?
I would have thought that anyone implementing a parser for this data would also be able to query the database for a tech-c and report validation failures. Based on my previous interactions with the network operators who have suffered misclassification, I am confident that there is a strong incentive for networks to publish well formatted accurate data and to fix any errors quickly.
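As an illustration, the tech-c lookup Leo describes might be sketched like this, assuming the standard whois service on whois.ripe.net port 43 (the example prefix in the comment is a placeholder):

    import socket

    def tech_c_contacts(resource, server="whois.ripe.net", port=43):
        """Query whois for a resource and return its tech-c handles."""
        with socket.create_connection((server, port), timeout=10) as s:
            s.sendall((resource + "\r\n").encode("ascii"))
            chunks = []
            while data := s.recv(4096):  # server closes when done
                chunks.append(data)
        text = b"".join(chunks).decode("latin-1")
        return [line.split(":", 1)[1].strip()
                for line in text.splitlines()
                if line.startswith("tech-c:")]

    # e.g. tech_c_contacts("192.0.2.0/24")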
That said, there are many possible ways to reduce the risk of badly formatted data. For instance, the RIPE NCC could offer a tool to create the relevant files to be published through the LIR Portal or as a standalone tool. This is why I'd like to see feedback from the RIPE NCC ahead of an implementation discussion.
Services like geofeed are good ideas. But if the data quality or accessibility deteriorates over time, it becomes anything from useless to misleading. That is why I believe centralised validation, testing and reporting are helpful. I think the RIRs are well positioned to do these tasks and should do more of them.
I agree with you that defining what data means and keeping it accurate is important. But in the case of geo data, could the RIPE NCC validate the content as well as the data structures? I'd have thought that the publishers and the users of the data would be in the best position to do that. Am I wrong?
Kind regards,
Leo
Hi Denis, On Thu, Apr 08, 2021 at 12:55:32AM +0200, denis walker via db-wg wrote:
I don't see the issue of what, if anything, should be validated as a show stopper for introducing the "geofeed:" attribute. This is my idea of utilising the RIRs to improve the value of services with increased validation. That's why I changed the subject line and started it as a different thread. We can come back to this later. Apologies for taking you down a side road, but at least I got some initial feelings from you on this more general issue.
I recognize value in your suggestions on RIR-driven validation, and definitely agree on the desired outcomes, but I'm not convinced it's the RIRs that should take on (as a permanent task) doing outreach about 'broken geofeeds'. We should keep in mind that 'geofeed:' is a new utility for this industry and we (as a collective of producers & consumers) have yet to see how things will work out exactly.

One thing stood out to me in what Denis wrote: "This geofeed attribute will delegate this information process out to thousands of organisations." (src: https://www.ripe.net/ripe/mail/archives/db-wg/2021-April/006893.html)

While indeed globally thousands of organizations are now being guided and enabled towards populating geofeeds, this in itself is not an indicator that a decentralized approach will 'overtake' the current market for "GeoIP information". It doesn't strike me as unfeasible that the likes of MaxMind, IPinfo.info, or any other GeoIP aggregators will take on the role of 'patrolling' the feeds and providing tools/notifications to those who appear to publish broken information.

The (commercial) 'GeoIP market' is far more advanced than a mere IP Address <> Geographical location mapping. Another layer of advancement that exists: customers of GeoIP information oftentimes will correlate the purchased GeoIP information with their own internal records on fraud and other activities of interest, and also acquire GeoIP from multiple (possibly even non-database) sources. Some companies measure latency to help approximate geography.

To me it seems unlikely the 'geofeed' mechanism will 'wipe out' the existing market; rather, geofeeds might be a (significant!) enhancement for existing practises. I do think that 'geofeed:' in some ways democratizes the market, in the sense that an industry-standard publication mechanism and easier access to this type of data means that more people can cheaply acquire GeoIP information, which means that existing GeoIP providers will have to step up their game ... which is a positive! :)

I think we should only ask the RIRs 'to do something' when it has become clear the industry itself is unable to organize it themselves. RIPE Atlas probably is a great example of something only an RIR could've pulled off.

Kind regards,
Job

ps. An example where data quality urgently needed to increase were RPKI ROAs in the 2019-2020 time frame. To help the BGP Default-Free Zone get rid of 'RPKI Invalid' BGP announcements, it was not the RIRs that 'did the cleanup'; it was the efforts of individuals such as 'nusenu_', Anurag Bhatia, Massimo Candela's BGPAlerter, and hundreds of network operators actually deploying RPKI ROV that resulted in significant industry-wide cleanup. RIPE NCC indeed does have a 'wrong-ROA' alerting mechanism, but it only applies to RIPE-managed space and (imho) is too crude an alerting mechanism to be useful in most corporate contexts.

pps. Another example of global cleanup led by individuals is Jared Mauch's "Open Resolver Project", for which he was awarded the M3AAWG J.D. Falk Award.
participants (12)

- Arcadius Ahouansou
- Cynthia Revström
- denis walker
- Edward Shryane
- George Michaelson
- Hank Nussbacher
- Job Snijders
- Jori Vanneste
- Leo Vegoda
- Massimo Candela
- Randy Bush
- Tyrasuki