On 6-dec-2005, at 20:10, Oliver Bartels wrote:
I can't imagine what such a layer would look like...
Clustering all PI prefixes originating at the same AS to form a single "super-prefix" makes policy processing a lot easier, because it only needs to be done once for the whole block.
I'm not sure I understand the "superprefix" but obviously a lot of work that now happens per-prefix in BGP should happen per-AS. But that's mostly moot in IPv6 as we should never reach the numbers of prefixes per AS that we see in IPv4.
With as few as 256 MByte of DDRAM plus a 64K TCAM chip it is possible to handle a maximum of 8 million forwarding entries at full 128-bit resolution
I guess that means you throw everything in the TCAM first to get from 8M to about 125 entries and then look those up in a tree or hash table? Obviously it's possible to build architectures that allow fast forwarding with big tables. However, this doesn't come free: it takes more iron to do this, and also more power. TCAMs suck down the juice like a depressed alcoholic. This is bad for your design (both the box itself and the datacenter), your wallet and the environment.
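I don't know the internals of the design, but the two-stage idea I'm guessing at can be sketched in a few lines of Python. Everything here is made up for illustration (prefixes, bucket names, interfaces); a real engine does this in hardware, not with dicts:

```python
import ipaddress

# Stage 1: a small TCAM-like table of coarse prefixes; the longest
# matching one selects a bucket. (Hypothetical entries.)
tcam = {
    ipaddress.ip_network("2001:db8::/32"): "bucket-A",
    ipaddress.ip_network("2001:db8:ff00::/40"): "bucket-B",
}

# Stage 2: per-bucket tables of fine-grained routes in ordinary RAM.
buckets = {
    "bucket-A": {ipaddress.ip_network("2001:db8:1::/48"): "if-1"},
    "bucket-B": {ipaddress.ip_network("2001:db8:ff01::/48"): "if-2"},
}

def lookup(dst):
    addr = ipaddress.ip_address(dst)
    # TCAM stage: longest matching coarse prefix picks the bucket.
    matches = [n for n in tcam if addr in n]
    if not matches:
        return None
    bucket = tcam[max(matches, key=lambda n: n.prefixlen)]
    # RAM stage: longest match within the (now small) bucket.
    fine = [n for n in buckets[bucket] if addr in n]
    if not fine:
        return None
    return buckets[bucket][max(fine, key=lambda n: n.prefixlen)]

print(lookup("2001:db8:1::1"))     # if-1
print(lookup("2001:db8:ff01::1"))  # if-2
```

The point stands either way: the second stage only works because the first stage narrows 8M entries down to a handful, and that first stage is the part that burns the power.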
I personally just received a patent on such router hardware concept.
So big routing tables are good business for you, then?
Sure. But trying to aggregate on network topology is never going to work for two very simple reasons:
1. It changes all the time
The same is true for geographical aggregation.
I guess you live in California or another place that is plagued by frequent earthquakes...
Geographical aggregation would require free transit, otherwise it is not compatible with the ISP's business models.
The point is to keep the aggregation inside the ISP network. Tier-1 ISPs would still have a full routing table, but rather than have a copy in each router, it's distributed over the network. So there is no free transit requirement.
country boundaries.
Such boundaries are artificial; the EU tries to avoid them.
The idea behind aggregation is that you can move up or down. If country borders get in the way, drill down a bit and start looking at provinces or cities. In our design there are potentially 64k distinct areas (mostly cities) so if you want, you can have a route for each of those in your routing table and never run into country borders.
2. You can't express a topology with loops in it in an addressing hierarchy
Avoiding loops is the job of the routing protocol, not the topology.
??? Are we talking about spanning tree now? Loops in the topology are good. You can't remove them. Routing is also dynamic, BTW.
Distance is actually very important. It's very hard to do decent high speed file transfer on out of the box OSes and applications with high delay. Also, it often makes sense to backhaul traffic over SOME distance, but that doesn't mean it also makes sense to backhaul it over even larger distances. I.e., even if a link to New York is cheap, you don't want to go over Palo Alto.
If I were located in Seattle, Palo Alto might be an alternative way point, as well as Chicago or even Dallas.
Of course. But we're in Europe. If you're in Seattle you'll see a lot of your traffic to other people in Seattle flow through Palo Alto. That's normal, because it's not economical to peer with everyone everywhere. It's not so cool when intra-Seattle traffic starts to flow through Miami.
What you are trying is:
Map a two-and-a-half-dimensional world onto a one-dimensional address range. This won't work, mathematically. Dimensions can only be replaced by dimensions.
Ah, but we're not mathematicians but engineers. In software, you have one dimensional memory. Still, you can have multidimensional arrays.
Ask a database programmer how difficult it is to implement a "within 10 km of some place" search on a database, and ask them about the algorithms in use.
Easy: select everything that's in a 20x20 km grid around the center point and then do the real distance calculation on everything in that grid. Obviously you'll select stuff that's at x+8, y+8, about 11 km from the center, but that's only true for a relatively small part of the intermediate results.
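In (hypothetical) code, with made-up place names and kilometer coordinates relative to the center:

```python
import math

# Places as (x, y) offsets in km from some origin. All invented.
places = {
    "B": (8.0, 8.0),    # ~11.3 km away: inside the box, outside the circle
    "C": (5.0, 3.0),    # ~5.8 km away: a real hit
    "D": (15.0, 0.0),   # outside the box, never reaches the distance test
}

def within_10km(center, places):
    cx, cy = center
    # Stage 1: cheap 20x20 km bounding box (what an index answers quickly).
    candidates = {name: (x, y) for name, (x, y) in places.items()
                  if abs(x - cx) <= 10 and abs(y - cy) <= 10}
    # Stage 2: exact distance test, but only on the small candidate set.
    return [name for name, (x, y) in candidates.items()
            if math.hypot(x - cx, y - cy) <= 10]

print(within_10km((0.0, 0.0), places))  # ['C']
```

The corner cases like B are filtered out in the second pass; the bounding box just keeps that pass small.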
What they try is interleaving the West-East (X) and North-South (Y) coordinates bitwise in the search key and handle overruns by exceptions,
That sounds like Tony Hain's geographical addressing. The variant Michel Py and I came up with is based on administrative borders such as countries, so you already have one dimension: the alphabet. (Ok, not entirely how it works, but still.)
However this requires a _significant_ exception handling effort, nothing someone would like to implement in a fast forwarding engine for packet routing.
Geography is long gone by the time we're forwarding packets.
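For what it's worth, the bitwise interleaving Oliver describes sounds like Z-order (Morton) encoding. A minimal sketch, assuming 16-bit coordinates (the function and values are purely illustrative):

```python
def interleave(x, y, bits=16):
    """Interleave the bits of x and y into one Morton (Z-order) key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)      # x bits at even positions
        key |= ((y >> i) & 1) << (2 * i + 1)  # y bits at odd positions
    return key

print(interleave(3, 5))  # 39
print(interleave(7, 0))  # 21
print(interleave(8, 0))  # 64: key jumps when x crosses a power of two
```

That jump from 21 to 64 for two adjacent x values is exactly the kind of overrun that needs the exception handling he mentions.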
Today, IPv4 routing works, but it has come close to the edge of the cliff twice (in the early 1990s, just before CIDR, routing tables were too large; in the late 1990s, flapping cost too much CPU) and it's still pushing towards that edge, which we can't see clearly but know is out there somewhere.
It works. Period.
Hm, if you only discern "works" / "doesn't work" it's hard to say anything about the routing system... Some quantitative and qualitative analysis would be helpful.
And it will continue to work, because of the economic pressure. Engineers have found a solution, thus: Don't worry.
Guess what. I'm an engineer. I'm working on this stuff. And I'm saying: when de facto unlimited PI is allowed, it may not mean the end of the internet, but it's certainly reason to worry. Of course things will continue to work. However, they'll be less reliable and more expensive.
So because you can't prove that you're right I should just believe you without proof?
Yes, because the theory of computer science gives you the proof that there are theorems in this world which can't be proven.
There are also many theorems that turn out to be false. Proof is a pretty good method to avoid those. If we can't have proof, we'll need to use less reliable methods to avoid them. Just accepting anything is not the solution.
The scenario that de facto unlimited PI in IPv6 will make routing tables so large that it becomes problematic in some way or another is entirely reasonable, on the other hand.
Current experience lets us make the reasonable and responsible assumption that an IPv6 routing table would not grow much beyond the current IPv4 table, whereas current technology permits tables of 10 to 100 times that size.
Today, people sometimes deaggregate a /16 into /24s. That's bad: 255 unnecessary routes. What if they do the same thing with a /32 in IPv6, announcing /48s? 65535 unnecessary routes. That will probably kill most existing IPv6 routers today. 10 times the current table is 1.75M routes; 100 times is 17.5M. The former is probably doable for IPv4 on some extremely high end boxes, but I'm not sure how those would handle real issues such as flapping, lots of full feeds and so on. I don't believe the latter exists or will exist in the foreseeable future, at least not in a way that anyone can afford to actually use. Even those 1.75M-route boxes will be very expensive and only affordable by the largest networks. Don't forget you and I all pay for their hardware, directly or indirectly.

Iljitsch

--
I've written another book! http://www.runningipv6.net/