New on RIPE Labs: Further Virtualisation Testing for RIPE Atlas Anchor

Dear colleagues,

Please find a new article on RIPE Labs: "Further Virtualisation Testing for the RIPE Atlas Anchor"

https://labs.ripe.net/Members/romeo_zwart/further-virtualisation-testing-for...

This report is a follow-up to an earlier article, in which we detailed our investigations into the use of virtualisation technology as a suitable means to provide other services via the RIPE Atlas anchor hardware, beyond its primary anchor function.

Kind regards,
Mirjam Kuehne
RIPE NCC

hi mirjam and atlas crew,

as you know, i am interested in this type of calibration work for wider application than atlas anchors. so please excuse my poking at it.

precious little information about
o network configs (natted/bridged/...)
o other loads on network, processors, ...
o what hvisors were tested in the tables, and no differentiation between them?

i am also confused between testing of openvz, kvm (on which kernel(s)?), vmware, ...

randy

Hi Randy,

On 13/09/13 20:10, Randy Bush wrote:
hi mirjam and atlas crew,
as you know, i am interested in this type of calibration work for wider application than atlas anchors. so please excuse my poking at it.
Thanks for your questions, and good to see that you have an interest. I'll try to clear up the confusion.

Previously (last year), work was done with three virtualisation technologies: OpenVZ, VMware and KVM. A summary report on that work was published earlier this year [1], with details described here [2]. Based on that first phase, we concluded that OpenVZ was the most promising of the investigated alternatives. We followed up with some more work, this time looking only at OpenVZ. The latest summary article on Labs (announced by Mirjam this Friday) refers to that follow-up work with OpenVZ, so no VMware or KVM there.

With regard to your other questions:
precious little information about
o network configs (natted/bridged/...)

All tests were done with a bridged setup. VMware tests in the first phase of the project used the VMware VMXnet3 driver, KVM tests used virtio. OpenVZ tests were done with 'plain' Linux virtual interfaces. This was described in the phase I report mentioned above (see [2], p. 29).

o other loads on network, processors, ...

In the last series of tests (OpenVZ only), the test systems were only loaded with ping tests as described. However, the switch that connected the test systems was also connecting other systems. In the earlier tests, many combinations of cpu/disk/io/network loads were tested. See [2] for an extensive description and discussion of that.
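For a concrete picture, the ping load was of roughly this shape (a minimal sketch, not our actual test harness; the target address below is a placeholder):

    #!/usr/bin/env python
    # Rough sketch of the kind of ping-based RTT sampling used to load and
    # measure the test systems; not the actual harness used in the tests.
    import re
    import subprocess

    TARGET = "192.0.2.1"  # placeholder address (RFC 5737 documentation range)
    COUNT = 100

    out = subprocess.check_output(
        ["ping", "-c", str(COUNT), "-i", "0.2", TARGET],
        universal_newlines=True)

    # One RTT sample per echo reply, e.g. "time=0.213 ms"
    rtts = [float(m.group(1)) for m in re.finditer(r"time=([\d.]+) ms", out)]
    print("%d samples, min/avg/max %.3f/%.3f/%.3f ms"
          % (len(rtts), min(rtts), sum(rtts) / len(rtts), max(rtts)))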
o what hvisors were tested in the tables, and no differentiation between them?
The last report was only about OpenVZ, so indeed no differentiation. The earlier work investigated OpenVZ, VMware and KVM - lots of differentiation there. :) Details, again, in [2].
i am also confused between testing of openvz, kvm (on which kernel(s)?), vmware, ...
Please see my note above. Hope this answers your questions. Let me know if you have more.

Best regards,
Romeo

[1] https://labs.ripe.net/Members/romeo_zwart/ripe-atlas-anchor-to-ripe-ncc-serv...
[2] https://labs.ripe.net/Members/romeo_zwart/LuigiCorselloAtlasanchorvirtualisa...

https://labs.ripe.net/Members/romeo_zwart/LuigiCorselloAtlasanchorvirtualisa...
nice masters-level work, which i presume it was. of course, the devil is in the details, and he seemed to be dealing with a lot of devils.

for me, interested in timing, the following was the most disappointing:

The "accurate timekeeping" requirement, for the way it was defined, has no unconditional winner. Carefully configured NTP with stable peers is the minimal condition for stable timekeeping, however a ceiling of "1ms" set during the research may be too strict to be respected at all times by a virtualization technology and maybe even by physical PC hardware.

while he seemed to have explored the standard kvm hack of the guest booting with no-kvmclock while running ntpd in the guest (for which the net of a thousand lies makes a very clear case), section 4.3.3 (KVM) does not make clear that this configuration is what was evaluated. in general, i worry that he discovered ntp during the project and may not have configured the DUTs as well as they could have been.

the researcher was brutally honest about details, limits, what was unexplored, ... you gotta love

The role and importance of the Network Time Protocol to guarantee an accurate timekeeping has been progressively "discovered" during the analysis phase of this research.

when ntp's precision is not really all that great, hence the gps clocks for ttm. i would greatly love to see this researcher explore ttm time precision and accuracy before ttm is retired, so we can compare to previous analyses.

i was worried by

In fact, the first TTM node for time monitoring (tt97.ripe.net) resided in a different VLAN than the POC and had an average delay of about 1100μsec! That was certainly not good enough to measure sub millisecond time accuracy and was replaced by tt999.ripe.net installed in the same VLAN as the POC.

ntp is designed to be pretty insensitive to rtt. so this either scares the hell outta me about ttm, is a misunderstanding, or a miscommunication. i would love to see more analysis of this.

side note: it would have been nice if he had run openvz and kvm on something less paleolithic than centos.

but my compliments to the researcher and to the ncc for funding him. useful work.

randy
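p.s. to be concrete about why rtt magnitude alone shouldn't matter: the standard ntp on-wire calculation (rfc 5905) cancels symmetric path delay, so only asymmetry biases the offset estimate. a toy sketch, timestamps invented by me, not taken from the report:

    # Why a large-but-symmetric RTT doesn't by itself hurt NTP: the on-wire
    # offset calculation cancels symmetric path delay. Timestamps below are
    # invented for illustration. t1/t4 are on the client clock, t2/t3 on
    # the server clock, all in seconds.

    def ntp_offset_delay(t1, t2, t3, t4):
        offset = ((t2 - t1) + (t3 - t4)) / 2.0  # symmetric delay cancels here
        delay = (t4 - t1) - (t3 - t2)           # round trip minus server hold time
        return offset, delay

    # Client 5 ms behind the server, 1100 usec one-way delay each direction,
    # 100 usec server hold time:
    t1 = 0.0000   # client sends
    t2 = 0.0061   # server receives: 0.0011 path + 0.0050 clock offset
    t3 = 0.0062   # server replies
    t4 = 0.0023   # client receives: 2 * 0.0011 path + 0.0001 hold

    off, dly = ntp_offset_delay(t1, t2, t3, t4)
    print("offset %.4f s, delay %.4f s" % (off, dly))  # offset 0.0050, delay 0.0022

make the two path legs unequal and the offset error is half the asymmetry; the 1100 usec average delay by itself is harmless.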

Hi Randy,

On 13/09/16 01:59, Randy Bush wrote:
https://labs.ripe.net/Members/romeo_zwart/LuigiCorselloAtlasanchorvirtualisa...
nice masters-level work, which i presume it was. of course, the devil is in the details, and he seemed to be dealing with a lot of devils.
Thanks for taking the time to review. I will forward your comments; I am sure the researcher will be pleased with your constructive criticism.

[...]
i was worried by
In fact, the first TTM node for time monitoring (tt97.ripe.net) resided in a different VLAN than the POC and had an average delay of about 1100μsec! That was certainly not good enough to measure sub millisecond time accuracy and was replaced by tt999.ripe.net installed in the same VLAN as the POC.
ntp is designed to be pretty insensitive to rtt. so this either scares the hell outta me about ttm, is a misunderstanding, or miscommunication. i would love to see more analysis of this.
This is a misunderstanding and probably only reflects a very early view of the researcher on the topic. I don't believe it was intended as a comment on the quality of the TTM mechanism. :)

WRT your other question about more work along these lines on TTM: I think it's unlikely we will follow this up further. The researcher has moved on to greener fields in the meantime, and a lot of work on TTM was done in the early 2000s and before, see e.g. [1], although that particular reference is not specifically looking at TTM accuracy.
side note: it would have been nice if he had run openvz and kvm on something less paleolithic than centos.
but my compliments to the researcher and to the ncc for funding him. useful work.
Thanks for that comment. I will pass it on. :)
randy
[1] http://www-nas.et.tudelft.nl/people/Piet/papers/PAM2002.pdf