Happy Eyeballs bias

Sander Steffann

26 Oct 2016

1:54 p.m.

Hi,

Very good presentation on Happy Eyeballs at the WG session today. It might indeed be true that lowering the HE "head start" for IPv6 from 300 ms to 150 ms would cause a slight increase in performance. However, as Geoff Huston pointed out, HE is not about getting the best performance. It is meant to move traffic to IPv6 (so that IPv4 traffic goes down and we can tell when we can turn off IPv4) without breaking the user experience when IPv6 is broken.

It was (and still is) an important argument for making website owners comfortable enough to actually deploy IPv6 without worrying too much about breaking the client experience. But it also hides IPv6 brokenness, which removes an incentive for website owners to actually fix IPv6 errors and causes IPv6 to be taken less seriously...

So, how about we go the other way. We want IPv6 to be taken more seriously. What if we change the algorithm the other way over time: give IPv6 more and more of a head start. That way IPv6 stability and performance become more important over time, without causing brokenness. Something like:

  HE head start = 300 + (months after 2017-01-01) * 30

That would provide some incentive to make sure that IPv6 is properly deployed and managed.

Cheers,
Sander
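Read literally, the proposed formula can be sketched in a few lines of Python. This is only an illustration: the function and constant names are made up here, the unit is assumed to be milliseconds (matching the 300 ms figure in the message), "months" is assumed to mean whole calendar months, and clamping dates before 2017-01-01 to the 300 ms base is an assumption the message does not state.

```python
from datetime import date

# Values taken from the proposal in the message above.
BASE_HEAD_START_MS = 300
MONTHLY_INCREMENT_MS = 30
EPOCH = date(2017, 1, 1)


def months_after_epoch(today: date, epoch: date = EPOCH) -> int:
    """Whole calendar months elapsed since the epoch.

    Assumption: dates before the epoch count as zero months.
    """
    months = (today.year - epoch.year) * 12 + (today.month - epoch.month)
    return max(months, 0)


def head_start_ms(today: date) -> int:
    """IPv6 head start in milliseconds under the proposed linear schedule."""
    return BASE_HEAD_START_MS + months_after_epoch(today) * MONTHLY_INCREMENT_MS
```

Under these assumptions the head start grows by 360 ms per year, so it more than doubles within the first twelve months.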

Philip Homburg

26 Oct 2016

2:27 p.m.

In your letter dated Wed, 26 Oct 2016 15:54:52 +0200 you wrote:

...

> So, how about we go the other way. We want IPv6 to be taken more seriously. What about if we change the algorithm the other way over time: give IPv6 more and more of a head start. That way IPv6 stability and performance become more important over time, without causing brokenness. Something like:
>
>   HE head start = 300 + (months after 2017-01-01) * 30
>
> That would provide some incentive to make sure that IPv6 is properly deployed and managed.

Looking at this from an operating system perspective...

As an experiment I implemented a fully dynamic version of Happy Eyeballs in my toy OS. It keeps long-term statistics about the performance of v4 and v6 and gives v6 a small head start, which adds a small positive preference for v6. But if IPv4 is really much better than IPv6, it will not bother with IPv6 at all.

I don't see any reason why any system code would implement what you suggest. Basically, in a situation where IPv6 is broken, your suggestion would make the user experience worse and worse. For the user there would be a simple way out of this mess: just disable IPv6 and performance is back to normal.

My suggestion: try to get IPv6 to 80% or more (at least make sure that IPv6 from content providers is almost universal), and then have eyeball networks stop investing in IPv4. When IPv6 support is the default, people will notice that some sites have bad performance, and that may be because their IPv6 support is just not there.
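The message gives no implementation details for the dynamic variant, so the following is only a guess at its general shape, not the author's code. It assumes an exponentially weighted moving average of connect times per address family, a fixed small head start for IPv6, and a ratio threshold beyond which IPv6 is skipped. All names and default values (`head_start_ms=50.0`, `give_up_ratio=3.0`, `alpha=0.1`) are hypothetical.

```python
from dataclasses import dataclass
from typing import NamedTuple, Optional


@dataclass
class FamilyStats:
    """Long-term connect-time average for one address family (EWMA)."""
    alpha: float = 0.1               # weight of the newest sample (assumed value)
    avg_ms: Optional[float] = None   # None until the first sample arrives

    def record(self, connect_ms: float) -> None:
        if self.avg_ms is None:
            self.avg_ms = connect_ms
        else:
            self.avg_ms = (1 - self.alpha) * self.avg_ms + self.alpha * connect_ms


class Plan(NamedTuple):
    try_v6: bool
    v6_head_start_ms: float


class DynamicHappyEyeballs:
    """Hypothetical sketch: prefer IPv6 slightly, skip it when it is much worse."""

    def __init__(self, head_start_ms: float = 50.0, give_up_ratio: float = 3.0) -> None:
        self.head_start_ms = head_start_ms
        self.give_up_ratio = give_up_ratio
        self.stats = {"v4": FamilyStats(), "v6": FamilyStats()}

    def record(self, family: str, connect_ms: float) -> None:
        """Feed one observed connect time for family 'v4' or 'v6'."""
        self.stats[family].record(connect_ms)

    def plan(self) -> Plan:
        v4 = self.stats["v4"].avg_ms
        v6 = self.stats["v6"].avg_ms
        # Only give up on IPv6 when both families have history and
        # IPv6 is worse than IPv4 by more than the configured ratio.
        if v4 is not None and v6 is not None and v6 > self.give_up_ratio * v4:
            return Plan(try_v6=False, v6_head_start_ms=0.0)
        return Plan(try_v6=True, v6_head_start_ms=self.head_start_ms)
```

One limitation of this sketch: once IPv6 is skipped, no new IPv6 samples arrive, so the average never recovers. A real implementation would presumably need occasional re-probing.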

JORDI PALET MARTINEZ

4:56 p.m.

But now we have most of the content providers, CDNs, social networks, etc., already supporting IPv6. This is probably more than 50% of the traffic.

If we "deprecate" Happy Eyeballs now, it will probably take six months on average, I guess even a year, to get it dropped from browsers, OSs, etc., and meanwhile IPv6 support keeps growing.

When people start experiencing problems, the cause is a bad IPv6 deployment at either the content provider or the ISP, and they need to FIX IT, not ignore it because nobody notices.

Regards,
Jordi

**********************************************
IPv4 is over
Are you ready for the new Internet?
http://www.consulintel.es
The IPv6 Company

Philip Homburg

6:19 p.m.

...

> But now we have most of the content providers, CDNs, social networks, etc., already supporting IPv6. This is probably more than 50% of the traffic.
>
> If we deprecate Happy Eyeballs now, it will probably take six months on average, I guess even a year, to get it dropped from browsers, OSs, etc., and meanwhile IPv6 support keeps growing.
>
> When people start experiencing problems, the cause is a bad IPv6 deployment at either the content provider or the ISP, and they need to FIX IT, not ignore it because nobody notices.

I doubt that any software vendor will make that change. You cannot go from something that sort of deals with failure to something that breaks. Of course, users will notice right after a software upgrade, so they know who to blame.

Early on with RIPE Atlas we had an issue with IPv6 PMTU blackholes, and I had the same attitude: don't do anything on our end, just tell probe hosts to fix their connections. That didn't work out at all, so now we just clamp the MSS on our end.

Next, if you did this, users would quickly find out that disabling IPv6 solves the issue. There is already too much 'when in doubt, disable IPv6, it is useless anyway' going on.

Finally, it is nice to think ahead, but maybe thinking about getting rid of IPv4 is a bit premature? A couple of months ago, my home network had no IPv4. Time to see what my normal setup would look like on IPv6-only. Facebook worked, so at least I could tell my friends that I was still alive :-) My home network is a bit more complex than that of the average consumer, so as I already expected, Google was completely dead in the water. And of course, most of the other websites that I frequently visit were unreachable.

So maybe we should wait until going IPv6-only is actually realistic, and until then think about minimizing IPv4 traffic and not so much about breaking stuff when IPv6 doesn't work as expected. Any ISP can use flow data to figure out which content providers generate a lot of IPv4 traffic and talk to them about improving their IPv6 support. Any sensible Happy Eyeballs implementation will then prefer IPv6, and IPv4 traffic will drop.
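For readers unfamiliar with the MSS clamping mentioned in the message above, the arithmetic behind it can be sketched as follows. The message does not say which value RIPE Atlas clamps to; the example assumes the IPv6 minimum link MTU of 1280 bytes, a 40-byte IPv6 header and a 20-byte TCP header without options. The function name is made up for illustration.

```python
IPV6_HEADER_BYTES = 40   # fixed IPv6 header, no extension headers
TCP_HEADER_BYTES = 20    # TCP header without options
IPV6_MIN_MTU = 1280      # minimum link MTU every IPv6 link must support


def clamped_mss(path_mtu: int, advertised_mss: int) -> int:
    """Lower an advertised TCP MSS so that segments fit within path_mtu.

    A clamp never raises the MSS; it only caps it.
    """
    ceiling = path_mtu - IPV6_HEADER_BYTES - TCP_HEADER_BYTES
    return min(advertised_mss, ceiling)
```

Clamping to the minimum MTU (`clamped_mss(IPV6_MIN_MTU, ...)`) caps the MSS at 1220 bytes, which sidesteps PMTU blackholes at the cost of slightly smaller segments.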

Mikael Abrahamsson

3:04 p.m.

On Wed, 26 Oct 2016, Sander Steffann wrote:

...

> HE head start = 300 + (months after 2017-01-01) * 30

I don't believe in this. Trying to deploy something by severely degrading the customer experience in the failure case is worse than just failing completely.

It would be better to keep it at 300 ms (still a significant penalty), but instead recommend that OS vendors add some kind of heuristic that flags to the user that their IPv6 connectivity is degraded and offers to fault-find it... Or let's invent some kind of telemetry through which these kinds of breakages can be reported to the OS vendor, so they can contact the ISP and alert them to the breakage?

Also, we still have the problem with PMTU blackhole detection and mitigation. Why isn't this turned on more?

--
Mikael Abrahamsson    email: swmike@swm.pp.se

Sander Steffann

3:19 p.m.

Hi,

...

On 26 Oct 2016, at 17:04, Mikael Abrahamsson <swmike@swm.pp.se> wrote:

> On Wed, 26 Oct 2016, Sander Steffann wrote:
>
>> ...
>> HE head start = 300 + (months after 2017-01-01) * 30
>
> I don't believe in this. Trying to deploy something by severely degrading the customer experience in the fail case is worse than just failing it completely.

Fair enough

...

> It would be better to keep it at 300 ms (still significant penalty), but instead recommend the OS vendor to install some kind of heuristic to flag for the user somehow that their IPv6 connectivity is degraded, and offer to fault find it... or let's invent some kind of telemetry where these kinds of breakages can be reported to the OS vendor so they can contact the ISP and alert them to the breakage?

I would be very interested in telemetry, but it should also go to the website owner. I see too many unspecified, localhost, 6to4 and IPv4-mapped addresses in DNS without the website owner ever even noticing that their setup is broken. Which is why I suggested a more gradual approach than letting it fail hard, but maybe that is what we need at some point.
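The four kinds of bogus AAAA content listed above can be detected with Python's standard `ipaddress` module. The sketch below only classifies an address string; fetching the AAAA records from DNS is left out, and the function name and label strings are made up for illustration.

```python
import ipaddress
from typing import Optional


def suspicious_aaaa(address: str) -> Optional[str]:
    """Return a label if the address is one of the broken kinds, else None.

    Raises ValueError (ipaddress.AddressValueError) if the string
    is not a valid IPv6 address.
    """
    addr = ipaddress.IPv6Address(address)
    if addr.is_unspecified:            # ::
        return "unspecified"
    if addr.is_loopback:               # ::1
        return "localhost"
    if addr.ipv4_mapped is not None:   # ::ffff:a.b.c.d
        return "ipv4-mapped"
    if addr.sixtofour is not None:     # 2002::/16
        return "6to4"
    return None
```

For example, `suspicious_aaaa("2002:c000:201::1")` reports a 6to4 address, while a documentation-range address such as `2001:db8::1` passes as unremarkable.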

...

> Also, we still have the problem with PMTU blackhole detection and mitigation. Why isn't this turned on more?

Another good question.

Cheers,
Sander

Philip Homburg

3:27 p.m.

In your letter dated Wed, 26 Oct 2016 17:04:18 +0200 (CEST) you wrote:

...

> but instead recommend the OS vendor to install some kind of heuristic to flag for the user somehow that their IPv6 connectivity is degraded, and offer to fault find it... or let's invent some kind of telemetry where these kinds of breakages can be reported to the OS vendor so they can contact the ISP and alert them to the breakage?

I wonder: if a host has a global IPv6 address that is not derived from any kind of transition technology or tunnel, and setting up a TCP connection is either slow or fails, then what percentage is due to an issue close to the host and what percentage is due to an issue close to the target? I.e., if IPv6 is broken, is there any reason to believe it is due to ISP-provided services often enough that it would be worth reporting in a roundabout way?

...

> Also, we still have the problem with PMTU blackhole detection and mitigation. Why isn't this turned on more?

I really don't understand this. I guess 'everybody' doing major operating systems still lives in an IPv4-only world with MSS clamping.

Bajpai, Vaibhav

3:38 p.m.

...

On 26 Oct 2016, at 17:27, Philip Homburg <pch-ripeml@u-1.phicoh.com> wrote:

> I wonder, if a host has a global IPv6 address that is not derived from any kind of transition technology or tunnel, and setting up a TCP connection is either slow or fails, then what percentage is due to an issue close to the host and what percentage close to the target.

This is a very interesting question, but answering it requires profiling a large number of destinations and, at the same time, a large number of sources, and collecting this dataset over a long period of time.

--
Vaibhav

===================================
Vaibhav Bajpai
www.vaibhavbajpai.com

Postdoctoral Researcher
Jacobs University Bremen, Germany
===================================

Jen Linkova

3:43 p.m.

On Wed, Oct 26, 2016 at 5:27 PM, Philip Homburg <pch-ripeml@u-1.phicoh.com> wrote:

...

> I wonder, if a host has a global IPv6 address that is not derived from any kind of transition technology or tunnel, and setting up a TCP connection is either slow or fails, then what percentage is due to an issue close to the host and what percentage close to the target.
>
> I.e., if IPv6 is broken is there any reason to believe it is often enough due to ISP provided services that it would be worth reporting it in a roundabout way.

I'd say that it is much more likely to be broken close to clients, at least for Alexa web sites.

--
SY, Jen Linkova aka Furry

Bajpai, Vaibhav

5:06 p.m.

...

On 26 Oct 2016, at 17:43, Jen Linkova <furry13@gmail.com> wrote:

> On Wed, Oct 26, 2016 at 5:27 PM, Philip Homburg <pch-ripeml@u-1.phicoh.com> wrote:
>
>> ...
>> I wonder, if a host has a global IPv6 address that is not derived from any kind of transition technology or tunnel, and setting up a TCP connection is either slow or fails, then what percentage is due to an issue close to the host and what percentage close to the target.
>>
>> I.e., if IPv6 is broken is there any reason to believe it is often enough due to ISP provided services that it would be worth reporting it in a roundabout way.
>
> I'd say that it is much more likely to be broken close to clients, at least for Alexa web sites.

We cannot generalise this without empirical data. There is also v6 brokenness (or slowness) in Alexa websites that are not hosted by large CDNs.

--
Vaibhav

===================================
Vaibhav Bajpai
www.vaibhavbajpai.com

Postdoctoral Researcher
Jacobs University Bremen, Germany
===================================

Torbjörn Eklöv

27 Oct 2016

6:47 a.m.

...

On 26 Oct 2016, at 19:06, Bajpai, Vaibhav <v.bajpai@jacobs-university.de> wrote:

...
> On 26 Oct 2016, at 17:43, Jen Linkova <furry13@gmail.com> wrote:
>
>> On Wed, Oct 26, 2016 at 5:27 PM, Philip Homburg <pch-ripeml@u-1.phicoh.com> wrote:
>>
>>> ...
>>> I wonder, if a host has a global IPv6 address that is not derived from any kind of transition technology or tunnel, and setting up a TCP connection is either slow or fails, then what percentage is due to an issue close to the host and what percentage close to the target.
>>>
>>> I.e., if IPv6 is broken is there any reason to believe it is often enough due to ISP provided services that it would be worth reporting it in a roundabout way.
>>
>> I'd say that it is much more likely to be broken close to clients, at least for Alexa web sites.
>
> We cannot generalise this without empirical data.
>
> There is also v6 brokenness (or slowness) in Alexa websites that are not hosted by large CDNs.

Every category of web server has problems with IPv6. Look at https://www.myndighetermedipv6.se/ -> "Authorities with AAAA in its www..": more than 10% do not work over IPv6 today.

Another common IPv6 problem is a firewall or load balancer where HTTP works and HTTPS does not, while over IPv4 HTTPS does work:

https://ipv6alizer.se/?address=https://www.gavle.se
https://ipv6alizer.se/?address=http://www.gavle.se

(The main mission of https://ipv6alizer.se is the PTB problem, but it also tells us when a site doesn't work at all.)

Why don't the owners see these problems? Their users don't have IPv6 on the inside, and users are the best monitors.

/Tobbe

Torbjörn Eklöv | Interlan Gefle AB
Norra Kungsgatan 5, 803 20 Gävle
Switchboard: 026-18 50 00 | Direct: 070-683 51 75
http://www.dnssecandipv6.se

"Ever since I can remember I always wanted to use IPv6. To me that was better than being president of the United States. To use IPv6 was to own the world."

Jen Linkova

26 Oct 2016

3:39 p.m.

On Wed, Oct 26, 2016 at 3:54 PM, Sander Steffann <sander@steffann.nl> wrote:

...

> So, how about we go the other way. We want IPv6 to be taken more seriously. What about if we change the algorithm the other way over time: give IPv6 more and more of a head start. That way IPv6 stability and performance become more important over time, without causing brokenness. Something like:
>
>   HE head start = 300 + (months after 2017-01-01) * 30
>
> That would provide some incentive to make sure that IPv6 is properly deployed and managed.

Well, if you keep increasing the timeout you'll eventually make the failover time worse than it would have been w/o HE. Basically, the timeout should be long enough to keep IPv6 preferred in most 'non-broken' cases, but short enough that users do not notice the failover if IPv6 is broken.

--
SY, Jen Linkova aka Furry
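How soon the growing head start would overtake a plain, non-HE failover depends on how long that failover takes, and the thread gives no figure for it. The sketch below therefore takes it as a parameter. The function name and the `plain_failover_ms` argument are hypothetical; only the constants 300 and 30 come from the quoted proposal.

```python
import math

# Constants from the proposed schedule quoted above.
BASE_HEAD_START_MS = 300
MONTHLY_INCREMENT_MS = 30


def months_until_head_start_reaches(plain_failover_ms: float) -> int:
    """Months after 2017-01-01 at which the head start is at least
    as long as the given failover time without Happy Eyeballs."""
    extra = plain_failover_ms - BASE_HEAD_START_MS
    if extra <= 0:
        return 0
    return math.ceil(extra / MONTHLY_INCREMENT_MS)
```

For instance, if a system's plain failover took 3 seconds (an illustrative value, not a measured one), `months_until_head_start_reaches(3000)` gives 90 months.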
