/24 prefix "hijackability" metric (defining "better than avg AS")
Hi,

I'm currently estimating how "vulnerable" certain IP addresses are to BGP hijacking.
To do that, I put them into different categories (multiple can apply):

a) RPKI validity state is "NotFound" (no ROA) and IP located in a prefix shorter than /24 (IPv4) or /48 (IPv6)
b) Valid ROA but weak maxlength
c) Valid ROA with proper maxlength
d) is announced in a /24 prefix (IPv4) or /48 (IPv6)
e) = (c) + (d)

In addition to the distinction of prefix length (/24 vs. </24) I'd like to subcategorize /24 prefixes into

- /24 prefix located in "well" connected AS (attacker's BGP visibility is presumed lower than the authentic AS visibility)
- /24 prefix located in "poorly" connected AS (better for the attacker)

The question is: What is the threshold and metric to tell these two apart?

I'm having 3 approaches in mind and wanted to hear if you have any preferences, opinions or other approaches:

Approach 1:
-----------
If avg AS PATH length as provided by [1] is <2 in more than 50% of given locations and DE-CIX and AMS-IX is among them, then consider it a "well connected AS"

Approach 2:
-----------
Use CAIDA's AS rank data and define the top 50% ASes as "well" connected

Approach 3:
-----------
define "well connected" as avg AS PATH as seen in [1] is shorter than the global avg. AS PATH length (defined in [2])

Also: If there are already well established metrics for "well connected" AS I'd be happy to hear about them.

Currently I'm leaning towards approach 1 as it is probably the strictest and most conservative approach. I also might compare the results of all 3 approaches.

thanks!
nusenu

[1] https://stat.ripe.net/docs/data_api#AsPathLength
[2] http://thyme.rand.apnic.net/current/data-summary (the mean value would actually be more interesting than the avg)

Because it is hard to collect ROV data and the list on https://rov.rpki.net is still short I do not try to include a ROV metric (yet).
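As an illustration of categories (a) to (e) above, here is a minimal Python sketch. It is not the script used for the analysis. The function name `categorize`, the `rpki_state` strings and the `roa_is_minimal` flag are assumptions made for this example; how "weak" versus "proper" maxlength is actually decided is left to the caller.

```python
import ipaddress
from typing import Optional, Set

# Prefix lengths named in the thread: /24 for IPv4 and /48 for IPv6.
LONGEST_LEN = {4: 24, 6: 48}


def categorize(announced_prefix: str, rpki_state: str,
               roa_is_minimal: Optional[bool] = None) -> Set[str]:
    """Return the set of categories (a-e) that apply to one announced prefix.

    rpki_state is assumed to be "NotFound", "Valid" or "Invalid".
    roa_is_minimal is a hypothetical flag: True for a "proper" maxlength,
    False for a "weak" one, None when there is no valid ROA.
    """
    net = ipaddress.ip_network(announced_prefix)
    limit = LONGEST_LEN[net.version]
    cats: Set[str] = set()
    # (a) no ROA and the covering prefix is shorter than /24 (or /48)
    if rpki_state == "NotFound" and net.prefixlen < limit:
        cats.add("a")
    # (b) valid ROA, weak maxlength
    if rpki_state == "Valid" and roa_is_minimal is False:
        cats.add("b")
    # (c) valid ROA, proper maxlength
    if rpki_state == "Valid" and roa_is_minimal is True:
        cats.add("c")
    # (d) announced as a /24 (or /48)
    if net.prefixlen == limit:
        cats.add("d")
    # (e) both (c) and (d)
    if {"c", "d"} <= cats:
        cats.add("e")
    return cats
```

Because multiple categories can apply, the function returns a set rather than a single label; the example prefixes used for testing come from documentation and private address ranges.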
On Tue, Aug 14, 2018 at 07:58:00PM +0000, nusenu wrote:
> I'm currently estimating how "vulnerable" certain IP addresses are to BGP hijacking.
> To do that, I put them into different categories (multiple can apply):
>
> a) RPKI validity state is "NotFound" (no ROA) and IP located in a prefix shorter than /24 (IPv4) or /48 (IPv6)
> b) Valid ROA but weak maxlength
> c) Valid ROA with proper maxlength
> d) is announced in a /24 prefix (IPv4) or /48 (IPv6)
> e) = (c) + (d)
Interesting approach! This is the first time I've seen someone phrase it this formally, but you are correct I think.
> In addition to the distinction of prefix length (/24 vs. </24) I'd like to subcategorize /24 prefixes into
> - /24 prefix located in "well" connected AS (attacker's BGP visibility is presumed lower than the authentic AS visibility)
> - /24 prefix located in "poorly" connected AS (better for the attacker)
>
> The question is: What is the threshold and metric to tell these two apart?
>
> I'm having 3 approaches in mind and wanted to hear if you have any preferences, opinions or other approaches:
>
> Approach 1:
> -----------
> If avg AS PATH length as provided by [1] is <2 in more than 50% of given locations and DE-CIX and AMS-IX is among them, then consider it a "well connected AS"
I'd try to avoid building in a bias towards institutions such as AMS-IX or DE-CIX.
> Approach 2:
> -----------
> Use CAIDA's AS rank data and define the top 50% ASes as "well" connected
Perhaps create multiple buckets, in order of 'weakness':

- single-homed stub network
- dual-homed stub network
- bottom 50% ASNs on CAIDA's AS rank
- top 50% ASNs on CAIDA's AS rank
- ASNs in the top 100 on CAIDA's AS rank
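A minimal sketch of how these buckets could be assigned, assuming the provider count, customer count, AS rank and total number of ranked ASes are already available from some dataset. The function name, the parameters and the precedence between the stub checks and the rank checks are assumptions for illustration; the thread does not specify them.

```python
# Bucket names in order of 'weakness' (index 0 = weakest).
BUCKETS = [
    "single-homed stub",
    "dual-homed stub",
    "bottom 50% by AS rank",
    "top 50% by AS rank",
    "top 100 by AS rank",
]


def weakness_bucket(provider_count: int, customer_count: int,
                    as_rank: int, total_ranked_ases: int) -> int:
    """Return the index into BUCKETS for one AS.

    Assumption: an AS with no customers is treated as a stub, and the
    stub checks take precedence over the rank-based buckets.
    """
    is_stub = customer_count == 0
    if is_stub and provider_count == 1:
        return 0
    if is_stub and provider_count == 2:
        return 1
    if as_rank <= 100:
        return 4
    if as_rank <= total_ranked_ases / 2:
        return 3
    return 2
```

A stub with three or more providers falls through to the rank-based buckets here; whether that is the desired behaviour is a design choice the list above leaves open.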
> Approach 3:
> -----------
> define "well connected" as avg AS PATH as seen in [1] is shorter than the global avg. AS PATH length (defined in [2])
>
> Also: If there are already well established metrics for "well connected" AS I'd be happy to hear about them.
>
> Currently I'm leaning towards approach 1 as it is probably the strictest and most conservative approach. I also might compare the results of all 3 approaches.
Perhaps you can try approach 1, approach 2 and approach 2bis and compare the results. Since we know we are *not* seeing tons of BGP paths (the collectors only receive what they receive, which is a subset of all BGP data), the trick is to find a proxy metric through which you can attempt to model reality.
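For comparing results, the two path-length based approaches (1 and 3) could be sketched as below. This assumes the per-location average AS path lengths have already been fetched and put into a dict; the location labels, the function names and the `global_avg` parameter are placeholders, and taking the mean across locations for approach 3 is an assumption, since the thread does not say how the per-location values would be combined.

```python
from statistics import mean
from typing import Dict, Iterable


def approach_1(avg_len_by_location: Dict[str, float],
               required: Iterable[str] = ("DE-CIX", "AMS-IX"),
               max_len: float = 2.0,
               min_share: float = 0.5) -> bool:
    """Well connected if the avg path length is < max_len in more than
    min_share of the locations, and all required locations are among them."""
    if not avg_len_by_location:
        return False
    short = {loc for loc, length in avg_len_by_location.items()
             if length < max_len}
    share = len(short) / len(avg_len_by_location)
    return share > min_share and all(loc in short for loc in required)


def approach_3(avg_len_by_location: Dict[str, float],
               global_avg: float) -> bool:
    """Well connected if the mean of the per-location averages is shorter
    than the global average path length supplied by the caller."""
    if not avg_len_by_location:
        return False
    return mean(avg_len_by_location.values()) < global_avg
```

Running both functions over the same set of ASes and counting where they disagree would be one simple way to compare the approaches.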
> Because it is hard to collect ROV data and the list on https://rov.rpki.net is still short I do not try to include a ROV metric (yet).
I'm sad to have to report that the listing on rov.rpki.net is presented in a misleading way, and useless for your purpose. I'd ignore any data from rov.rpki.net entirely for now.

Kind regards,

Job
On Sep 22, 2018, at 9:57 AM, Job Snijders <job@instituut.net> wrote:
> On Tue, Aug 14, 2018 at 07:58:00PM +0000, nusenu wrote:
>> I'm currently estimating how "vulnerable" certain IP addresses are to BGP hijacking.
>> To do that, I put them into different categories (multiple can apply):
>>
>> a) RPKI validity state is "NotFound" (no ROA) and IP located in a prefix shorter than /24 (IPv4) or /48 (IPv6)
>> b) Valid ROA but weak maxlength
>> c) Valid ROA with proper maxlength
Are “weak” and “proper” defined in terms of presence or absence in the global routing update database?
>> d) is announced in a /24 prefix (IPv4) or /48 (IPv6)
>> e) = (c) + (d)
>
> Interesting approach! This is the first time I've seen someone phrase it this formally, but you are correct I think.
You say ‘estimating how “vulnerable”’, so this is an ordering, right? (a) is most vulnerable?

I’m wondering how this vulnerability order applies to IRR route objects as well. Could you see expanding this vulnerability estimate to include (and what would be the order of vulnerability)

(1) No route object found in any IRR
(2) Route object found in an IRR hosted by an RIR
(3) Route object found, but self-declares as “proxy registration”
(4) Route object found, multiplied by age
(5) Multiple route objects found, with differing prefix length, in a single IRR database
(6) Multiple route objects found, with differing prefix lengths, in different IRR databases
>> In addition to the distinction of prefix length (/24 vs. </24) I'd like to subcategorize /24 prefixes into
>> - /24 prefix located in "well" connected AS (attacker's BGP visibility is presumed lower than the authentic AS visibility)
>> - /24 prefix located in "poorly" connected AS (better for the attacker)
I recall presentations at RIPE of prefix mis-originations directed at certain desirable targets at large IXs. So not sure how much faith to put into connectedness or AS-rank.

--Sandy
Sandra Murphy wrote:
> On Tue, Aug 14, 2018 at 07:58:00PM +0000, nusenu wrote:
>> I'm currently estimating how "vulnerable" certain IP addresses are to BGP hijacking.
>> To do that, I put them into different categories (multiple can apply):
>>
>> a) RPKI validity state is "NotFound" (no ROA) and IP located in a prefix shorter than /24 (IPv4) or /48 (IPv6)
>> b) Valid ROA but weak maxlength
>> c) Valid ROA with proper maxlength
>
> Are “weak” and “proper” defined in terms of presence or absence in the global routing update database?
I probably should have used the same wording as the related Internet-Draft uses:

weak: a "loose ROA"
proper: a "minimal ROA"

as described in:
https://datatracker.ietf.org/doc/draft-ietf-sidrops-rpkimaxlen
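To make the loose/minimal distinction concrete, here is a rough sketch. It assumes a single ROA entry (prefix plus maxLength) and a list of prefixes that the ROA's origin AS actually announces, and it treats the ROA as "minimal" only when every prefix it authorizes is announced. The function name and inputs are hypothetical; the draft itself remains the authoritative definition.

```python
import ipaddress
from typing import Iterable


def is_minimal_roa(roa_prefix: str, max_length: int,
                   announced: Iterable[str]) -> bool:
    """Return True if every prefix authorized by the ROA is announced.

    announced is assumed to contain only prefixes originated by the
    AS named in the ROA. A False result corresponds to a "loose ROA".
    """
    roa_net = ipaddress.ip_network(roa_prefix)
    if max_length < roa_net.prefixlen:
        raise ValueError("maxLength must not be shorter than the ROA prefix")
    # A ROA for a /p with maxLength m authorizes 1 + 2 + ... + 2^(m-p)
    # distinct prefixes, i.e. 2^(m-p+1) - 1.
    authorized_count = 2 ** (max_length - roa_net.prefixlen + 1) - 1
    # Announced prefixes that fall under the ROA and respect maxLength.
    covered = {
        net for net in map(ipaddress.ip_network, announced)
        if net.version == roa_net.version
        and net.prefixlen <= max_length
        and net.subnet_of(roa_net)
    }
    return len(covered) == authorized_count
```

For example, a ROA for a /16 with maxLength 24 where only the /16 is announced comes out as loose, because an attacker could announce any of the unannounced more-specifics with the authorized origin.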
> You say ‘estimating how “vulnerable”’, so this is an ordering, right? (a) is most vulnerable?
correct, my assumption is that (a) is most vulnerable.
> I’m wondering how this vulnerability order applies to IRR route objects as well.
I also looked at IRR coverage [1] but I only considered RIPE's IRR because most prefixes I analyzed were from the RIPE region and RIPE has the best data quality/authorization checks.

[1] Figure 6: https://medium.com/@nusenu/how-vulnerable-is-the-tor-network-to-bgp-hijackin...

kind regards,
nusenu

-- 
https://twitter.com/nusenu_
https://mastodon.social/@nusenu
Job Snijders wrote:
> On Tue, Aug 14, 2018 at 07:58:00PM +0000, nusenu wrote:
>> I'm currently estimating how "vulnerable" certain IP addresses are to BGP hijacking.
>> To do that, I put them into different categories (multiple can apply):
>>
>> a) RPKI validity state is "NotFound" (no ROA) and IP located in a prefix shorter than /24 (IPv4) or /48 (IPv6)
>> b) Valid ROA but weak maxlength
>> c) Valid ROA with proper maxlength
>> d) is announced in a /24 prefix (IPv4) or /48 (IPv6)
>> e) = (c) + (d)
>
> Interesting approach! This is the first time I've seen someone phrase it this formally, but you are correct I think.
thanks for the feedback, I'm glad it made some sense.

context: I wrote that email while putting together this post:
https://medium.com/@nusenu/how-vulnerable-is-the-tor-network-to-bgp-hijackin...
(specifically the "what properties do we consider?" section)

In the end I went ahead with "Approach 2" and used the following definition:

'we consider all ASes with an AS rank <= 10000 to be “better connected than the attacking AS”'

which split the /24 prefixes I looked at in about half (10 vs. 9 as seen in Figure 3).

kind regards,
nusenu

-- 
https://twitter.com/nusenu_
https://mastodon.social/@nusenu
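The rank-threshold definition above could be applied roughly as follows. This is a sketch, not the code behind the post: it assumes a mapping from each /24 prefix to the AS rank of its origin AS has already been built, and the function name and the example ranks are made up for illustration.

```python
from typing import Dict, List, Tuple

# Threshold quoted in the message above.
RANK_THRESHOLD = 10000


def split_by_rank(prefix_to_rank: Dict[str, int],
                  threshold: int = RANK_THRESHOLD
                  ) -> Tuple[List[str], List[str]]:
    """Split prefixes into (better connected, not better connected)
    according to the AS rank of their origin AS."""
    better: List[str] = []
    other: List[str] = []
    for prefix, rank in prefix_to_rank.items():
        (better if rank <= threshold else other).append(prefix)
    return better, other


# Hypothetical input: documentation prefixes with invented ranks.
example = {"192.0.2.0/24": 350, "198.51.100.0/24": 10000, "203.0.113.0/24": 42000}
better, other = split_by_rank(example)
```

A fixed rank cutoff is simple to reproduce, at the cost of the caveat raised earlier in the thread that rank alone may not capture how exposed a prefix is at large IXs.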