Re: [address-policy-wg] IPv6 PI resource question!
Hi, I'm new to the list but would like to comment on this subject. I'm also trying to obtain IPv6 PI for a webhosting customer of ours, which seems to be a big problem. I do not understand why everyone claims that 'the routing table growth' is the reason not to allow too much IPv6 PI. Those currently holding IPv4 PI would of course require IPv6 PI, because they still want to be independent. If they need to become a LIR to get IPv6 PA, the routing-table issue is not solved: a prefix is still entered into the same global routing table. I would suggest changing the IPv6 PI policy to make it clearer how hosting/ISP companies can get IPv6 PI, by requiring that the equipment on which the IPv6 addresses are configured is installed in rooms accessible by that company. Multi-homing is also a good reason. I'm not sure if we should delete that reason. This makes the company more aware that they are really independent. This suggestion solves two things:
- Large broadband ISPs with CPE in office buildings, homes etc. do not fit into this policy. The equipment is behind other companies' doors. This forces these companies to assign blocks to their customers from PA space, as it was meant to be.
- Small ISPs/hosting companies with a few racks installed at one or more DCs (which is what we are talking about in these postings) do fit into this policy and will be independent of their internet suppliers. Of course, they should request another PI (or get assigned PA from a LIR) for customers who buy dedicated racks from them and receive their own key to the lock.
This is exactly how we would like to see it. What do other people on the list think about this? regards, Igor Ybema
On 16/02/2011 19:08, Igor Ybema wrote:
I do not understand why everyone claims that 'the routing table growth' is the reason not to allow too much IPv6 PI.
This is not really relevant to address-policy-wg, but it probably does need some explanation, because there are probably people on the mailing list who may not understand why some people get so upset about the issue. When a packet passes through a router, the router needs to decide where to send that packet. The way it works is that the router examines the destination IP address of the packet and performs a search through the entire routing table to see what the next hop address of that packet should be. The component on a router which does this lookup is called a "lookup engine", and depending on how big the router is, it will be implemented in one of a couple of different ways. On a low-end router (e.g. Juniper J series or Cisco 7200/7300), this is done on the main CPU. As this is a general-purpose CPU, a router of this form will only be able to forward a certain number of packets per second before the CPU gets too busy to handle any more. A router of this form is generally referred to as a "software router". Higher-end routers (e.g. Juniper MX / M / T series, or Cisco GSR/CRS/7600) use dedicated hardware lookup engines, and in a chassis + blade system, these lookup engines will often be located on the line cards themselves. The advantage of this is that you can achieve _much_ higher throughput on the router. The disadvantage is that dedicated hardware of this form tends to be very expensive. A router of this form is generally referred to as a "hardware router". One of the more common hardware components for performing IP address lookups is called TCAM - ternary content addressable memory. It performs a similar function to an associative array in a language like PHP, except it's implemented in hardware. It's ridiculously fast and ridiculously expensive, and because it's so expensive, router manufacturers tend not to put large quantities into their lookup engines.
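As a rough illustration of the lookup described above, here is a minimal Python sketch of a TCAM-style table. It is not how any vendor implements it: the function names, the route table and the next-hop labels are all made up for the example, it handles IPv4 only, and a real TCAM compares every entry in parallel in hardware rather than looping over them.

```python
import ipaddress

def make_entry(prefix, next_hop):
    """Turn a prefix into a (value, mask, length, next_hop) ternary entry."""
    net = ipaddress.ip_network(prefix)
    return (int(net.network_address), int(net.netmask), net.prefixlen, next_hop)

def build_table(routes):
    """Order entries longest prefix first, so the first hit is the best match."""
    entries = [make_entry(prefix, nh) for prefix, nh in routes.items()]
    return sorted(entries, key=lambda entry: entry[2], reverse=True)

def lookup(table, address):
    """Return the next hop for an IPv4 address, or None if nothing matches."""
    addr = int(ipaddress.ip_address(address))
    # Software walks the entries one by one; a TCAM checks them all at once.
    for value, mask, _length, next_hop in table:
        if addr & mask == value:
            return next_hop
    return None

# Hypothetical routes using documentation prefixes (RFC 5737).
ROUTES = {
    "0.0.0.0/0": "upstream",
    "192.0.2.0/24": "peer-a",
    "192.0.2.128/25": "customer-b",
}
TABLE = build_table(ROUTES)

print(lookup(TABLE, "192.0.2.200"))   # matched by the /25
print(lookup(TABLE, "192.0.2.5"))     # matched by the /24
print(lookup(TABLE, "198.51.100.1"))  # falls through to the default route
```

Each entry in TABLE corresponds to one occupied slot, which is why the number of prefixes, not the number of addresses, is what exhausts the hardware.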
So, a router vendor like Cisco might create a router with a lookup engine which has enough TCAM for 256,000 IPv4 prefixes (e.g. C7600/SUP7203B), or 500,000 entries (e.g. ASR1001), or they might make a line card with its own dedicated lookup engine which can handle 1,000,000 IPv4 prefixes (e.g. ASR9000, Brocade XMR). While you get very good performance from these lookup engines, you are also constrained by the fact that if the number of prefixes on your router exceeds the number of TCAM slots, then your router will either drop the packets or else do the lookup on the route-processor CPU. I.e. the moment you hit 1,000,001 prefixes, your €200,000 C7600 with 20 x 10G ports will turn into a software router with the performance of a C7200VXR. This is generally considered to be a Bad Thing. If there are too many routes on the internet, the capacity of these routers will be exceeded, and they will need to be replaced by devices with more lookup capacity. Replacing one or two routers like this for a small service provider is expensive, but if you have to perform a forklift upgrade on a continental or global infrastructure, you may be talking about hundreds of millions of € / $ / £ worth of investment. Getting back to IPv6 PI assignments, the reason that people are so upset about them is that an IPv6 prefix can take up to 4 times as much TCAM as an IPv4 prefix. I.e. it fills up your router's routing slots up to 4 times faster. So, it will cause serious problems in future years if there is lots of IPv6 PI assigned to lots of end-users. Nick
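The slot arithmetic above can be sketched in a few lines of Python. The factor of 4 is the worst case quoted in the message; the capacity and route counts below are invented round numbers for illustration, not measurements of any real router or of the real routing table.

```python
def tcam_slots_used(ipv4_prefixes, ipv6_prefixes, slots_per_ipv6=4):
    """Slots consumed, assuming one slot per IPv4 prefix and
    slots_per_ipv6 slots per IPv6 prefix (worst case from the message)."""
    return ipv4_prefixes + ipv6_prefixes * slots_per_ipv6

def fits(capacity, ipv4_prefixes, ipv6_prefixes, slots_per_ipv6=4):
    """True if the whole table still fits in hardware."""
    return tcam_slots_used(ipv4_prefixes, ipv6_prefixes, slots_per_ipv6) <= capacity

def max_ipv6_prefixes(capacity, ipv4_prefixes, slots_per_ipv6=4):
    """How many IPv6 prefixes fit next to a given IPv4 table."""
    return max(0, capacity - ipv4_prefixes) // slots_per_ipv6

# Hypothetical line card with 1,000,000 slots and 350,000 IPv4 routes.
CAPACITY = 1_000_000
IPV4_ROUTES = 350_000

print(tcam_slots_used(IPV4_ROUTES, 100_000))     # slots used with 100k IPv6 routes
print(fits(CAPACITY, IPV4_ROUTES, 200_000))      # does 200k IPv6 routes still fit?
print(max_ipv6_prefixes(CAPACITY, IPV4_ROUTES))  # IPv6 headroom on this card
```

Under these assumptions the card that holds a million IPv4 prefixes has room for well under a fifth of that number of IPv6 prefixes once the IPv4 table is loaded.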
Nick, Your explanation of why large routing tables are unwanted notwithstanding, I fail to see the point in the current discussion with regard to PI space. Our customers want independence for a number of reasons. With IPv4 they could get PI space and have that independence. With IPv6 the rules are currently different and they cannot get PI space. As has been suggested on this list many times, a workaround is to become a LIR and get your own PA space. In the end these customers *will* generate new routes. The artificial hurdle now in place is in effect only slowing down IPv6 adoption in the case of our customers; it will not stop them from achieving their goal of being independent. Jasper
* Nick Hilliard:
One of the more common hardware components for performing IP address lookups is called TCAM - ternary content addressable memory. It performs a similar function to an associative array in a language like PHP, except it's implemented in hardware. It's ridiculously fast and ridiculously expensive, and because it's so expensive, router manufacturers tend not to put large quantities into their lookup engines.
Very nice explanation, thanks! Cost is probably not that much of an issue. I suspect that the bits-per-Watt number is still extremely poor for current TCAMs (data sheets do not quote them), which means that you can only use a limited number of them in parallel. Especially for IPv6, alternative implementations are probably quite competitive, such as hardware-assisted tries or hybrid schemes relying mostly on DRAM. You can also take a few shortcuts if you encode the current IPv6 addressing architecture in silicon. -- Florian Weimer <fweimer@bfk.de> BFK edv-consulting GmbH http://www.bfk.de/ Kriegsstraße 100 tel: +49-721-96201-1 D-76133 Karlsruhe fax: +49-721-96201-99
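For readers unfamiliar with the trie alternative mentioned above, here is a minimal software sketch of a one-bit-per-level binary trie doing longest-prefix match on IPv6. It is only meant to show the data structure; hardware-assisted tries typically consume several bits per step, and the class names, prefixes and next-hop labels here are made up for the example.

```python
import ipaddress

class TrieNode:
    """One node per prefix bit; next_hop is set where a route ends."""
    __slots__ = ("children", "next_hop")

    def __init__(self):
        self.children = [None, None]
        self.next_hop = None

class PrefixTrie:
    def __init__(self, bits=128):
        self.bits = bits          # 128 for IPv6, 32 for IPv4
        self.root = TrieNode()

    def insert(self, prefix, next_hop):
        net = ipaddress.ip_network(prefix)
        value = int(net.network_address)
        node = self.root
        # Walk (and create) one node per bit of the prefix length.
        for i in range(net.prefixlen):
            bit = (value >> (self.bits - 1 - i)) & 1
            if node.children[bit] is None:
                node.children[bit] = TrieNode()
            node = node.children[bit]
        node.next_hop = next_hop

    def lookup(self, address):
        value = int(ipaddress.ip_address(address))
        node = self.root
        best = node.next_hop      # default route, if one was inserted
        for i in range(self.bits):
            bit = (value >> (self.bits - 1 - i)) & 1
            node = node.children[bit]
            if node is None:
                break
            if node.next_hop is not None:
                best = node.next_hop  # remember the longest match so far
        return best

# Hypothetical routes inside the IPv6 documentation prefix.
trie = PrefixTrie()
trie.insert("2001:db8::/32", "lir-a")
trie.insert("2001:db8:1234::/48", "pi-b")

print(trie.lookup("2001:db8:1234::1"))  # covered by the /48
print(trie.lookup("2001:db8:ffff::1"))  # only the /32 matches
```

Memory use here grows with the number of distinct prefix bits stored rather than with a fixed slot width, which is one reason such schemes can sit in ordinary DRAM.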
On 16 Feb 2011, at 19:08, Igor Ybema wrote:
I'm also trying to receive IPv6 PI for a webhosting customer of us which seems to be a big problem.
Well perhaps you could explain to us why you or your customer feel they need IPv6 PI space for webhosting. I'm struggling to see what the justification might be for any sort of PI space for that.
I do not understand why everyone claims that 'the routing table growth' is the reason not to allow too much IPv6 PI.
There's no reason to repeat the same mistakes with IPv6 allocation policy as we did with IPv4. Doling out lots of non-aggregatable IPv6 PI allocations will just give us a re-run of today's IPv4 routing table scalability horrors. Except with much longer prefixes => burning even more router memory and CPU. IPv6 gives us the potential to waste more space on PI vanity projects. That doesn't mean we should do so just because we can. Just think of the number of edge devices that are likely to get IPv6 addresses and have suppliers/vendors managing those edge address assignments: consumer electronics gadgets, RFID tags, phones, utility meters, lightbulbs, sensor networks, etc, etc. If these suppliers/vendors head down the PI path, life will get unpleasant. Unless you are a Cisco, Juniper or semiconductor shareholder. We now know a great deal more about how address policies affect route deaggregation than we did back in the good old days. Since IPv6 allocation policy is essentially starting from a blank sheet of paper, it shouldn't be encumbered with all the cruft and bad ideas that IPv4 policies picked up along the path to their eventual oblivion. This does of course mean that we can (and probably will) make new mistakes with IPv6 address allocation. It should not mean repeating the earlier ones. Please also bear in mind that there will be even more IPv4 deaggregation as the run-out starts to bite. So let's not make the router problems even worse by facilitating an explosion in the IPv6 routing table as that other problem gathers momentum.
Those currently having IPv4 PI would of course require IPv6 PI because they still want to be indepentent. If they need to be a LIR to get IPv6 PA the problem is not solved for the routing-table issue. A prefix is still entered into the same global routing-table.
This is true but misses the point, Igor. Consider an LIR which doles out a /80 (say) to each of its customers that buys its web hosting services. These prefixes can be aggregated behind a single /64 (or whatever) => 1 routing table entry. If each of those customers got its own IPv6 PI space (why?), that's N routing table entries, all with longer prefixes, etc, etc. I'm unconvinced that "wanting to be independent" (of what?) for someone who's only needing some web hosting is a justification for PI space.
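The 1-versus-N comparison above can be reproduced with Python's standard ipaddress module. This is only a counting sketch: the function name, the customer count of 100 and the prefixes (all taken from the IPv6 documentation range) are assumptions for the example, and the /48s merely stand in for unrelated PI assignments.

```python
import ipaddress
from itertools import islice

def announced_routes(customer_prefixes, covering_prefix=None):
    """Routes seen in the global table: a single covering aggregate if
    every customer prefix lies inside it, otherwise one route per prefix."""
    nets = [ipaddress.ip_network(p) for p in customer_prefixes]
    if covering_prefix is not None:
        cover = ipaddress.ip_network(covering_prefix)
        if all(net.subnet_of(cover) for net in nets):
            return [cover]
    return nets

# PA case: 100 customers, each given a /80 carved out of the LIR's /64.
lir_block = ipaddress.ip_network("2001:db8::/64")
pa_customers = [str(n) for n in islice(lir_block.subnets(new_prefix=80), 100)]

# PI case: 100 customers, each with an independent /48 and no common aggregate.
pi_customers = [f"2001:db8:{i:x}::/48" for i in range(100)]

print(len(announced_routes(pa_customers, "2001:db8::/64")))  # PA: aggregated
print(len(announced_routes(pi_customers)))                   # PI: one each
```

The PA customers disappear behind one announcement, while every PI holder adds its own entry to everyone else's table.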
On 17/02/2011 09:54, Jim Reid wrote:
Well perhaps you could explain to us why you or your customer feel they need IPv6 PI space for webhosting. I'm struggling to see what the justification might be for any sort of PI space for that.
For exactly the same reason that you want your own IP address space for customer DSL access, or generic ISP service, or whatever. Even for very small businesses, it's still a bad business idea to shackle yourself to a single service provider. What if that relationship goes bad? Your business can go down the drain. Nick
Hi Jim,
I'm unconvinced that "wanting to be independent" (of what?) for someone who's only needing some web hosting is a justification for PI space.
You made a nice speech about routing table size and de-aggregation, and about not understanding why someone would want to be independent. I don't see it like that. The policy as it currently stands doesn't address your points. Let's take the following cases. A customer wants to be independent. They ask for PI but don't want to multi-home. They want to be independent of their provider's IP addresses; they basically want their own. As an example, take a webhosting company with some shared webservers for shared webhosting and VPSs on their own infrastructure, OR a city with x number of desktops that may be merged with other cities into a single larger city (this is happening a lot in The Netherlands currently). Those cities want unique IPs without having to renumber every time the government decides to merge multiple smaller cities into a larger one. As the policy is currently, neither will get IPv6. They could get IPv4 PI and that's it. This same webhoster OR city applies for a LIR membership. They get a /32 of IPv6 and a /21 of IPv4, no questions asked. This demonstrates exactly my point: this policy flaw isn't about de-aggregation, routing table size or anything else. It's a money issue. If you pay for a LIR membership, you get everything you want. If you want a PI prefix that is cheaper than a LIR membership, and you don't want to buy your way into the community, you need to change your infrastructure setup in order to comply. As was also suggested yesterday: increase the maintenance cost for a /48 IPv6 PI object from 50 euro to 200 or 400 euro per year, to avoid pet projects at home behind a DSL line, and get rid of the multi-homing requirement. I've actually seen multiple cities have IPs denied by their current provider, get fed up with it, apply for a LIR membership and decide to do it themselves. Regards, Erik Bais
Hello,
As suggested yesterday as well, increase the cost for a /48 IPv6 PI object from 50 Euro to a 200 or 400 euro maintenance cost per year to avoid pet projects at home behind a DSL line and get rid of the multi-homing requirement.
I support this too. It shouldn't be necessary to apply for a LIR membership just to be independent from your upstream provider. This text should also be added to the IPv6 policy, as in IPv4: "IP addresses used solely for the connection of an End User to a service provider (e.g. point-to-point links) are considered part of the service provider's infrastructure." An application for IPv6 PI by one of our end users was rejected because the IPRA considered the IP addresses of their shared hosting web servers to be assignments to other end users. Most of the websites on the internet are hosted by shared hosting providers. These companies are running hundreds of websites on the same IP address, and it's impossible for these providers to use IPv6 if their end users must apply for their own IPv6 assignments. It also doesn't make sense that a colocation customer with 1 server has to get their own assignment because the hosting provider wants to be independent. To make the transition to IPv6 easier, we must allow hosting providers to use IPv6 PI for their hosting services. Most of them have IPv4 PI today, and don't want a PA allocation for their IPv6 needs either. -- Best regards, Vegar Løvås Rent a Rack AS
* Jim Reid:
There's no reason to repeat the same mistakes with IPv6 allocation policy as we did with IPv4. Doling out lots of non-aggregatable IPv6 PI allocations will just give us a re-run of today's IPv4 routing table scalability horrors.
With IPv4, you can aggregate routes before installing them into the FIB (because only a few of the routing decisions are semantically meaningful), at least if the CPU which computes the FIB is a bit more beefy than what is built into your smartphone. If your vendor still isn't doing that, it might make sense to consider switching vendors, or at least use it as a negotiating tool for getting better deals on upgrades. So IPv4 isn't that bad, even technically, if you think about it. But it turns out that the current IPv6 allocation practice prevents running with aggregated FIBs: there's a hole after each PA allocation. The hole is visible because you're expected to generate ICMP unreachables for packets targeted there, so you can't lump two PA prefixes together, even if they share the same next hop. This is yet another case of premature optimization gone wrong. IPv6 history is full of well-meaning optimization attempts, but curiously few actually are an improvement over IPv4, and some are downright harmful (like the header layout "for optimized forwarding"). -- Florian Weimer <fweimer@bfk.de> BFK edv-consulting GmbH http://www.bfk.de/ Kriegsstraße 100 tel: +49-721-96201-1 D-76133 Karlsruhe fax: +49-721-96201-99
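A minimal Python sketch of the FIB aggregation described above, under the strict rule that only two sibling prefixes with the same next hop may be merged into their parent, so that no unrouted hole is ever covered. The function name, prefixes and next-hop labels are invented for the example, and real implementations are considerably more elaborate.

```python
import ipaddress

def aggregate_fib(rib):
    """Merge sibling prefixes that share a next hop into their parent.

    rib maps prefix strings to next hops. Address space that is not in
    the RIB stays unrouted, i.e. holes are never papered over."""
    fib = {ipaddress.ip_network(p): nh for p, nh in rib.items()}
    changed = True
    while changed:
        changed = False
        # Work from the longest prefixes upwards.
        for net in sorted(fib, key=lambda n: n.prefixlen, reverse=True):
            if net not in fib or net.prefixlen == 0:
                continue
            parent = net.supernet()
            left, right = parent.subnets()
            if (left in fib and right in fib
                    and fib[left] == fib[right] and parent not in fib):
                next_hop = fib.pop(left)
                fib.pop(right)
                fib[parent] = next_hop
                changed = True
    return fib

# Two adjacent halves with the same next hop collapse into one entry.
ipv4_rib = {
    "198.51.100.0/25": "A",
    "198.51.100.128/25": "A",
    "203.0.113.0/24": "B",
}

# Stand-ins for two PA allocations with an unallocated hole between them
# (2001:db8:1::/48 is missing), both reached via the same next hop.
ipv6_rib = {
    "2001:db8:0::/48": "A",
    "2001:db8:2::/48": "A",
}

print(len(aggregate_fib(ipv4_rib)))  # the two /25s merge
print(len(aggregate_fib(ipv6_rib)))  # the hole prevents any merge
```

The IPv6 case keeps both entries because merging them would also cover the hole, and packets sent there are supposed to be answered with an ICMP unreachable rather than forwarded.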
So IPv4 isn't that bad, even technically, if you think about it. But it turns out that the current IPv6 allocation practice prevents running with aggregated FIBs: there's a hole after each PA allocation. The hole is visible because you're expected to generate ICMP unreachables for packets targeted there, so you can't lump two PA prefixes together, even if they share the same next hop. This is yet another case of premature optimization gone wrong.
This would mean that one ISP de-aggregating their /32 won't cause many problems. Those could be auto-aggregated in the FIB. - Sander
On 03/03/2011 09:30, Florian Weimer wrote:
With IPv4, you can aggregate routes before installing them into the FIB (because only few of the routing decisions are semantically meaningful), at least if the CPU which computes the FIB is a bit more beefy than what is built into your smartphone. If your vendor still isn't doing that, it might make sense to consider switching vendors, or at least use it as a negotiating tool for getting better deals on upgrades.
You can certainly do this with lookup engines. Problem is that if you do it, you're breaking the deterministic rib->fib relationship model and replacing it with a nondeterministic system, which will probably work very well in almost all situations, but which has corner cases which fail catastrophically once you run out of lookup engine bits. Looking at it another way, automatic route aggregation creates overcommit between the RIB and the FIB. If that overcommit charge is exercised, things will break horribly. Nick
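The overcommit concern can be made concrete with a small Python sketch. It counts how many FIB entries remain when a block of adjacent, equal-length prefixes is aggregated by merging sibling blocks that share a next hop. The function name and the figure of 1024 routes are assumptions for the example; the point is only that the same RIB size can need very different amounts of FIB.

```python
def fib_entries(next_hops):
    """FIB entries needed for 2**k adjacent equal-length prefixes after
    merging sibling blocks whose routes all share one next hop.

    next_hops[i] is the next hop of the i-th prefix in address order."""
    count = len(next_hops)
    if count == 0 or count & (count - 1):
        raise ValueError("expected a power-of-two number of prefixes")
    if len(set(next_hops)) == 1:
        return 1  # the whole block collapses into one aggregate
    half = count // 2
    return fib_entries(next_hops[:half]) + fib_entries(next_hops[half:])

RIB_SIZE = 1024  # hypothetical block of 1024 adjacent prefixes

all_same = ["A"] * RIB_SIZE
one_moved = ["A"] * (RIB_SIZE - 1) + ["B"]
alternating = ["A", "B"] * (RIB_SIZE // 2)

print(fib_entries(all_same))     # best case: a single entry
print(fib_entries(one_moved))    # one route changes next hop
print(fib_entries(alternating))  # worst case: nothing can be merged
```

A lookup engine sized for the first two cases would be overrun by the third even though the RIB holds exactly the same number of routes, which is the overcommit being described.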
* Nick Hilliard:
You can certainly do this with lookup engines. Problem is that if you do it, you're breaking the deterministic rib->fib relationship model and replacing it with a nondeterministic system, which will probably work very well in almost all situations, but which has corner cases which fail catastrophically once you run out of lookup engine bits.
I'm not convinced. The TCAM data structures no longer have any direct relationship with the RIB either (and the compilation process has not always been bug-free). Aggregation is just a table transformation that could be performed in a transparent fashion. It's an interesting question whether you can implement efficient deletes with aggregation. But then, you could fake that with a brute-force approach.
Looking at it another way, automatic route aggregation creates overcommit between the RIB and the FIB. If that overcommit charge is exercised, things will break horribly.
I think the point is to push it so far away that you won't hit it. There are always some limits. The important question is whether you encounter them during (reasonable) life-time of the device. In any case, I would expect that current TCAM-based implementations could be exhausted before their nominal capacity is reached with a carefully crafted list of routes. -- Florian Weimer <fweimer@bfk.de> BFK edv-consulting GmbH http://www.bfk.de/ Kriegsstraße 100 tel: +49-721-96201-1 D-76133 Karlsruhe fax: +49-721-96201-99
participants (8)
- Erik Bais
- Florian Weimer
- Igor Ybema
- Jasper Jans
- Jim Reid
- Nick Hilliard
- Sander Steffann
- Vegar Løvås