Notes from the TTM workshop @RIPE63
Dear colleagues,

please find attached the notes from the brainstorm meeting during RIPE63 in Vienna, where a room full of people brought up their experiences and wishes for the future of the Test Traffic Measurement (TTM) service. My apologies for the delay.

I am attaching the full, raw notes for the historical record and as a basis for further discussion. However, you may prefer the summarized list (below).

Please add any comments and ideas. We are working on the evolution plan and road-map, and all your feedback is appreciated to guide us.

And finally, a poetic illustration of one of the concepts suggested at the meeting.

Regards,
Vesna Manojlovic

Feature requests and updates to TTM (hardware and software)

* Improve interface so that non-technical people can get reports (two modes). Make it easier to see the data in general
* Changes to the API to do your own presentations, reports, and integration into monitoring systems; more raw data
* Useful alarms which are configurable
* Multiple ethernet, insight into layer 3, TTM boxes talking to each other
* Antenna sharing
* Use taildaemon - share tailport amongst TTM users
* TCP port randomisation
* User-defined measurements
* Expanded information: where is the bottleneck, performance problems between two points
* Adjustment of sampling, higher sampling rate
* Open-source the software

New uses for TTM boxes, data

* Combine Atlas and TTM into an active measurement services system. TTM becomes an anchor and target - a sort of powerful Atlas probe
* Put data as quickly as possible in a central database and make it accessible, correlate with RIS

--
Vesna Manojlovic            BECHA@ripe.net
Senior Community Builder    +31205354444
for Measurements Tools      RIPE NCC http://ripe.net

Notes from TTM workshop, 1.11.2011, Vienna, RIPE63

Chair: Daniel Karrenberg, Chief Scientist, RIPE NCC

Daniel: TTM is getting a bit old, and there are other active measurement structures: DNSMON, Atlas.
TTM has been a bit neglected; we want to re-establish contact with hosts, and see how TTM fits into other activities. We already did a mini-survey and published the results: https://labs.ripe.net/Members/dfk/ripe-ttm-user-survey-results

Let's do an around-the-table introduction, to see how people are using it and to get your view on how TTM should evolve.

Sean, RIPE NCC: I maintain the system; I can answer questions about current operation.

Jari Miettinen, FUNET: hosting box #34. Main purpose is time source (seconded by George from APNIC). Data should be made easier to use; better reports for non-techies. Two paths: one for technical people and one for others.

Antonio Moreiras, NIC.br: hosts 5 boxes. Using TTM to see how well the Brazilian Internet is connected with the rest of the world. Wants TTM continued; we like it. Evolution: easy way to see the data (have done some things themselves, such as a world map); real-time data very useful. Uses one-way delay measurements, which are not available with other tools like Atlas. Feature request: use of taildaemon - share tailport amongst TTM users. Own software does not scale, cannot be used for more probes. (During RIPE63, Antonio gave presentations on their TTM use: http://ripe63.ripe.net/presentations/170-ttm-real-time-nic-br.pdf http://ripe63.ripe.net/presentations/45-ipv6-ipv4-latencies-ripe.pdf)

Yoshinobu Matsuzaki, IIJ: We have 2 boxes, in Japan and USA. Also used as time source. Doing DNS anycast measurements, and using one-way delay measurements.

Martin Pels, AMS-IX: They had 4 boxes; used to use them to measure one-way delays on the switching platform. The platform evolved, and the infrastructure is now too complex for existing TTM. Now using it as a time source for other measurement boxes. Useful addition would be: TCP source port randomisation (a way to better define the measurements from fixed source ports?), and would like more insight into layer 3 specifics (seconded by someone).
Sebastian Castro, .NZ (#182): Interested in raw data, to build own system on that. Boxes around the world useful to test visibility, monitor DNS anycast. Additional feature request: ability to run small DNS tests on other boxes, and collect the data. (User Defined Measurements)

Nigel Titley, Easynet: used the requirement for pretty graphs as the original argument. It took 2 years to install GPS :( By the time it was running, the management directive had gone away. Using it as a time source. Hoping we can get something out of TTM + Atlas to get pretty graphs to send to customers.

Thomas Schmid, DFN: Room for improvement in presentation of data. What is the vision of the project? I would like to get information about where the bottleneck is, performance problems between A and B, pinpointing info, possibly reroute traffic accordingly. Would like to see more correlation of the data: traceroutes and one-way delay measurements. (DFK translated this into "automatic troubleshooting system".) An alarming system would be nice...

George Michaelson, APNIC: We have one box in APNIC infrastructure, and sponsored 10 boxes in the AP region. We are using it as a time source, but it was a huge value added in the AP region - people were hugely supportive of this, because there is no other infrastructure to provide accurate time. Also used for "up or down" tests; that's useful (also for PR).

Christian Panigl, ACOnet: no regular use, not integrated into operational cockpit. Happy to provide it as a service for others to use the data.

Wilfried Woeber, ACOnet: have had it many years, doesn't require much care and feeding, it's just there (close to the core). Most recent useful use was looking at the path between endpoints/various TTM boxes (in Asia in particular) and examining it from both ends. The interesting thing is to have a one-way look onto infrastructure (not necessarily one-way delay measurement). To be able to pick out strategic boxes and see what they pass on about our network.
Presentation: I think the approach Atlas takes is much more modern and user-friendly, but if you want to dig into a particular case, TTM gives good info, though you have to pull it together yourself. TTM is a good thing to have.

Geoff Huston, APNIC: I find the time source a valuable use. We are also involved with Atlas, and think there's some benefit in folding them together. Would like to see development that makes them targets for Atlas nodes (many people agreed).

Dave Wilson, HEANET: does get used from time to time, but it is very occasional. It has been invaluable for some specific troubleshooting; the rest of the time it isn't used. We (at GEANT) are inventing some of the same wheels RIPE NCC is. Nothing actually gets deployed and used unless it is part of a regular system, therefore it must have an API for someone to want to do development. TTM - I know how to interpret it, but the visualisation shows what data is there, not the info we need. Third, aware that RIPE projects are competing with the projects we are doing - and everyone is losing. We should work together.

Brian Nisbet, HEANET: It is very old, visualization is old. It was used operationally originally, but not now, because people don't know the interface and aren't interested. Can't persuade people to use very useful data. And we use it as time source :)

Fredrik Pettai, NORDUNET: used as a time source. Future use: things mentioned by Thomas, Dave and Brian. We don't use it because the interface is old; alarms are not that useful, so turned off. Don't know how to make alarms more useful. Most TTM boxes are centrally placed with good connectivity, so there are no useful alarms produced.

Wolfgang Tremmel, DE-CIX: We have 12 boxes, and we are *not* using any as a time source! Use as measurement and proof for SLA with customers. Measure delay and packet loss between customer and switches. Room for improvement: a lot. I am not the guy running it; he is not here until Thursday.
Would like hardware improvement: as many NICs (Ethernet interfaces) per box as possible. Antenna sharing - running a cable to the roof requires tons of permissions, so sharing between 2-3 boxes in the same building would be a great improvement. Interfaces/API: would like to see an API based on a modern technique like SOAP. Adjustment of sampling, like a higher sampling rate. For boxes in the same city, want many more packets; adjustment of IP source and destination wanted (user control of measurements (packet parameters)). Any plan to open-source the software?

Stephan Wahl, ECIX: API is very important. Second Geoff - onion model with TTM and Atlas. Small Atlas probes could help fill in white spots in the TTM map, adding granularity. TTM shows well-connected nodes; interested in access layer. Time source - there are other options which are cheaper than TTM.

Sascha Pollok, IPHH: we mainly use it as a reference for good quality traffic. Put where customer service is connected, so we can graph things and show it to customers to show it isn't their fault. Independent source. Fancy graphs, API to generate graphs wanted - should not require customers to go to RIPE for this. Web interface looks rather old-fashioned. Co-workers don't like to use the web interface. Not quite sure if TTM is already providing this: alarms... (whatever it was is very rudimentary, says Daniel). Atlas integration = cool. Additional traffic to Atlas hosts would be quite small, depending on # of TTM boxes. Today we can send customers to DNSMON graphs.

Christian Kaufmann: Here as MAT-wg co-chair.

Randy Bush, IIJ: Atlas probes seem to be running behind NATs. When people talk about having them play together, with my operator's hat on I think it could be useful to tell customers to get Atlas probes, which will allow them to see how their link to us is doing: how your reachability and performance is. That would make Atlas probes useful in a customer relationship.
Vaclav Novak, CESNET: use it as time source.

Luisa Villa, LACNIC: here just to bring feedback back to LACNIC.

Serge Krashakov, Chernogolovka (tt143): not using it intensively. Not quite sure we should continue using it. Main problem ... internal connectivity cannot be tested. ... and lack of Russian TTM nodes.

Matthias Kluth (TTM #101 - HeliNET): found it useful to have an independent view on the network. Integration with Atlas.

Daniel Karrenberg summarized: TTM is mostly used as time source. Data presentation is deficient: dated, time lag too big, presentation needs to be adaptable to specific users, correlate the data. Call for an API in order to do your own presentation and integrate it into cockpits and monitoring - and produce "pretty graphs". Demand for useful alarms that are configurable. One-way delay measurements also appreciated. General positive attitude toward integration with other active measurement products, in particular Atlas. Target idea is good. Couple of requirements from the IXP crowd - divided: someone else is using a different commercial product, some would like to see TTM have multiple ethernet, insight into layer 3, TTM boxes talking to each other, etc.

Christian Panigl: Might be interesting to offer a TTM beacon and let people who are interested in measuring their so-called broadband measure against a regional or national exchange service. (Daniel: put something in a place so well connected that it can't be the bottleneck.)

Daniel: the last mile won't be the problem of the future.

Wilfried: Atlas is much more modern and compatible with today's expectations, in that it includes upper-layer services, like trend graphs. TTM was built on purpose to just do lower layer, but these days we would rather have something which gives us a broader picture (e.g., what is the quality over time from this vantage point when the infrastructure might change).

Daniel: telling personal current vision on how to evolve this. Interested in onion model.
Not Atlas and TTM, just an active measurement service. Morph TTM into anchor/target sites that can do two things (powerful Atlas probes for active measurements, with user controllability; also use Geoff's suggestion to make them targets). Atlas probes can now do everything that TTM can do, except the one-way delay. Might be room for a third size in between. I would like to do that, but I fear we will lose focus if we do it immediately. Personally I'm not so convinced that RIPE NCC should expend a lot of energy in making it very, very suitable to measure metropolitan area stuff. I need to know that's in the interest of the community. I'd like to be convinced. But quite frankly, there are commercial solutions for that. There is a conflict here. Put data with as little lag as possible into a central database to make it accessible. Needs to be distributed to remove bottleneck/point of failure. Put data together, correlate with RIS, ... Also expend energy to get data opened up as much as possible, not just through an API, making it possible to combine with other initiatives to bring things together; benefit of more raw data, enable many people working on presentations.

Wolfgang: with all those requirements there is one thing that makes up most of the complexity, which is also the differentiator between this and Atlas: one-way measurements. Some people are using it, some find it not so useful. What priority does this have to you? Would a round-trip be as useful?

There was disagreement with this proposal: one-way measurements are the core.

Dave: want to correlate with RIS, just don't want to fall into the trap of "here's what we have, how can we use it?"

Daniel: data gathering and presentation have been split deliberately within RIPE NCC. Idea is to have two groups, one that creates the database, and one that tries to get something out of it. (Wolfgang's and Robert's teams)

Antonio Moreiras: concerned that people are using TTM as primary timekeeping.
Daniel: thank you for being here; for any details, get in touch with Ann and Vesna. Their role is to stand up for users. You can talk to developers as well, but Ann and Vesna are user advocates, make use of that.

"The Future of TTM" workshop, RIPE63, Vienna
Tuesday, 1.11.11., 1PM, Schonberg room (lunch meeting in a separate room)

List of participants:

Name                  Organisation       TTM box
Wilfried Woeber       ACOnet             #73
Christian Panigl      ACOnet             #73
Stefan Wahl           Peering.eu         n/a
Nigel Titley          Easynet / NCC EB   #154
George Michaelson     APNIC              #157
Jari Miettinen        FUNET              #34
Thomas Schmid         DFN                #77
Christian Kaufmann    MAT-wg chair       n/a
Wolfgang Tremmel      De-CIX             Many
Brian Nisbet          HEAnet             #35
Matthias Kluth        HeLi NET           #101
Sebastian Castro      .nz registry       #182
Martin Pels           AMS-IX             #121
Serge Krashakov       Chernogolovka      #153
Sascha Pollok         IPHH               #162
Fredrik Pettai        NORDUnet           #8
Vaclav Novak          CESNET             #133
Antonio Moreiras      NIC.br             #159
Yoshinobu Matsuzaki   IIJ                #103,#129
Geoff Huston          APNIC              n/a
Luisa Villa           LACNIC             n/a
Dave Wilson           HEANET             #35
Randy Bush            IIJ                #129

RIPE NCC staff:

Ann Barcomb
Daniel Karrenberg
Erik Romijn
Sean McAvoy
Vesna Manojlovic
Wolfgang Nagele