Pre-PDP discussion: "All published documents and PDPs are maintained with git"

Dear all, this is the fourth suggestion: All RIPE documents are made available by means of a git repository, both via anonymous pull and via a web interface. The read-only access is open to everyone, RIPE member or not. Within RIPE, this git repository is the canonical place for all published documents. All published documents are copied from it. All PDPs will be maintained as a separate branch. Every new version of a PDP is an update within that branch. The branch will either be merged back into master or left unmerged but intact, depending on if they are accepted or not. Old documents and PDPs, if still on record, will be put into separate branches that reflect correct history. The master branch is then merged into old history, creating one single history. This will happen one year after this policy comes into effect at the latest. If the proposal "All PDP emails, documents and websites should come with unified diff" is accepted, all diffs will be generated from git. Rationale: git is here to stay, it's highly efficient, allows off-line work, and has generally won the fight for version control systems in the foreseeable future. Not relying on technical tools which were created with ever-changing text files in mind is inefficient at best. I offer to help maintain said git repository pro bono for at least 12 months or until the history has been merged, whichever is later. Richard PS: Think of those five emails as a patchset ;)

Hi Richard,
All RIPE documents are made available by means of a git repository, both via anonymous pull and via a web interface. The read-only access is open to everyone, RIPE member or not. Within RIPE, this git repository is the canonical place for all published documents. All published documents are copied from it.
I don't know how useful this is, as RIPE documents don't change once they are published. If a revision of a RIPE document is published it gets a new number. So I don't think a version control system will be useful here. Cheers, Sander

On Fri, Mar 15, 2013 at 8:56 PM, Sander Steffann <sander@steffann.nl> wrote:
I don't know how useful this is, as RIPE documents don't change once they are published. If a revision of a RIPE document is published it gets a new number. So I don't think a version control system will be useful here.
While that is true, this will still allow anyone to see how PDPs progressed and resulted in new documents. Also, I don't think the power of a single, canonical resource that you can easily sync to your local machines and which you _know_ is up to date should be underestimated. We shouldn't discard using a tool just because we won't need all features it provides. Thanks for your quick feedback, Richard

While that is true, this will still allow anyone to see how PDPs progressed and resulted in new documents. Also, I don't think the power of a single, canonical resource that you can easily sync to your local machines and which you _know_ is up to date should be underestimated.
Whilst understanding the limits and overheads of rsync, I'd have thought there is enough experience of the software within the NCC (think RPKI) to be able to set up an rsync server for the documents fairly trivially. Cheers, Rob

On Fri, Mar 15, 2013 at 9:07 PM, Rob Evans <rhe@nosc.ja.net> wrote:
Whilst understanding the limits and overheads of rsync, I'd have thought there is enough experience of the software within the NCC (think RPKI) to be able to set up an rsync server for the documents fairly trivially.
There are several different technologies which provide some or all of the features git offers. Yet, the package as a whole is unmatched, imo. I am well aware that this is the weakest of all proposals as it specifies a specific piece of software that should be used. I still think it would be a Very Good Thing, but if this one fails and the others come through, I will be happy. -- Richard

Hi,
I don't know how useful this is, as RIPE documents don't change once they are published. If a revision of a RIPE document is published it gets a new number. So I don't think a version control system will be useful here.
While that is true, this will still allow anyone to see how PDPs progressed and resulted in new documents.
Now, if *that* history is maintained then I see the value. Seeing that a RIPE document is based on an older document, seeing which proposal changed it (from the commit message) and seeing the differences between them might be useful for some. It might even be useful to maintain this for policy proposals to keep track of the proposed changes. The final version of the policy proposal could then be copied to the ripe-documents section. Hmmm. It might be much more than we need, and rsync might be enough for what we need, but I can also see some benefits now. Thanks, Sander

Hi,
Hmmm. It might be much more than we need, and rsync might be enough for what we need, but I can also see some benefits now.
Just to be clear: I don't think your ideas should go through the PDP. This is procedural stuff for the NCC, not a policy. But the NCC Services WG seems the right place to discuss these new services and/or procedures. I'll leave the details of how to proceed to the NCCS chairs :-) Cheers, Sander

On Fri, Mar 15, 2013 at 9:16 PM, Sander Steffann <sander@steffann.nl> wrote:
Just to be clear: I don't think your ideas should go through the PDP. This is procedural stuff for the NCC, not a policy. But the NCC Services WG seems the right place to discuss these new services and/or procedures. I'll leave the details of how to proceed to the NCCS chairs :-)
To be honest, I don't care in the least if they do not result in PDPs. I want to see change within RIPE; I don't care about the name tag that's been attached to the means of change. Richard

On Fri, Mar 15, 2013 at 9:12 PM, Sander Steffann <sander@steffann.nl> wrote:
Now, if *that* history is maintained then I see the value.
Of course it would be.
Seeing that a RIPE document is based on an older document, seeing which proposal changed it (from the commit message) and seeing the differences between them might be useful for some.
And visualizing those differences is laughably easy with git.
It might even be useful to maintain this for policy proposals to keep track of the proposed changes. The final version of the policy proposal could then be copied to the ripe-documents section.
That is the basic goal of this proposal. All changes during a PDP process are simply one patch for every update. Once the PDP changing document 123 gets approved and results in 124, the final commit in the branch is a simple `git mv 123 124` and is then merged into master.
Hmmm. It might be much more than we need, and rsync might be enough for what we need, but I can also see some benefits now.
I have noticed this particular proposal growing on people over time several times over the few months I bounced it around on IRC. Richard
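
The branch-per-proposal workflow described in this message can be sketched roughly as follows. This is only an illustration under stated assumptions: the file names `ripe-123.txt` and `ripe-124.txt`, the branch name `pdp-example`, the commit identity and all document text are placeholders, and the repository is a throwaway created in a temporary directory.

```shell
set -eu
# Throwaway repository standing in for the document repository (assumption).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # use "master" regardless of local defaults
git config user.name "Example Editor"     # placeholder identity
git config user.email "editor@example.org"

# master holds the published document.
printf 'Section 1\nOriginal policy text.\n' > ripe-123.txt
git add ripe-123.txt
git commit -q -m "Publish ripe-123"

# One branch per proposal; every proposal version is one commit on it.
git checkout -q -b pdp-example
printf 'Section 1\nRevised policy text, version 1.\n' > ripe-123.txt
git commit -q -am "Proposal v1"
printf 'Section 1\nRevised policy text, version 2.\n' > ripe-123.txt
git commit -q -am "Proposal v2"

# Accepted: the final commit only renames the file, then the branch is merged.
git mv ripe-123.txt ripe-124.txt
git commit -q -m "Accepted: ripe-123 becomes ripe-124"
git checkout -q master
git merge -q --no-ff -m "Merge accepted proposal" pdp-example

git --no-pager log --oneline --graph
```

A rejected proposal would simply skip the last three steps, leaving `pdp-example` unmerged but still present in the repository.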

* Sander Steffann
All RIPE documents are made available by means of a git repository, both via anonymous pull and via a web interface. The read-only access is open to everyone, RIPE member or not. Within RIPE, this git repository is the canonical place for all published documents. All published documents are copied from it.
I don't know how useful this is, as RIPE documents don't change once they are published. If a revision of a RIPE document is published it gets a new number. So I don't think a version control system will be useful here.
I can see one particular advantage of using a version control system - the "git blame" command, which will show when a particular sentence or paragraph was added or last modified. There's several times I've read RIPE documents and wondered along the lines of: «Where did this come from? What was the rationale behind it?» I rarely go through the trouble to find out. That said, while I agree that Git is a very good tool, I'm reluctant to micro-manage the NCC by saying «you MUST use Git» (or any other particular tool for that matter). The current system can be improved a whole lot without requiring a version control system, I like to point to the IETF's tools page, which shows all the revisions of a draft leading up to its publication as RFC, and where you can generate both in-line and side-by-side diffs on the fly. One example: http://tools.ietf.org/html/draft-ietf-v6ops-ipv6-cpe-router-09 In any case, for maximum use of any form of version control system, you'd have to consider that even though e.g. ripe-582 is a new version of ripe-577 which in turn is a new version of ripe-553 and so on, the history of ripe-582 alone isn't nearly as interesting as the history of ripe-582 plus all preceding versions combined. So either the file name needs to stay the same between versions (unlike today), or you'd have to import the entire history of the previous version into whatever you're starting the PDP with for the proposed new version. I don't know if Git allows for this. -- Tore Anderson
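
On the closing question of whether history can survive a change of file name: Git can follow a file across a rename when asked to, so a rough sketch of the "blame" use case might look like the following. The document text, commit messages and identity are placeholders; only the file names `ripe-553.txt` and `ripe-577.txt` echo the numbers mentioned above. The rename is kept in its own commit here so that rename detection is unambiguous.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Editor"     # placeholder identity
git config user.email "editor@example.org"

# First published version (placeholder text).
printf 'First paragraph.\nSecond paragraph.\nThird paragraph.\n' > ripe-553.txt
git add ripe-553.txt
git commit -q -m "Publish ripe-553"

# A proposal changes one paragraph.
printf 'First paragraph.\nSecond paragraph, revised.\nThird paragraph.\n' > ripe-553.txt
git commit -q -am "Proposal: revise second paragraph"

# The new version gets a new number: a pure rename.
git mv ripe-553.txt ripe-577.txt
git commit -q -m "Publish ripe-577 (updates ripe-553)"

# History of the new file name, continued across the rename.
git --no-pager log --follow --oneline -- ripe-577.txt

# Per-line origin; lines are attributed to commits made under the old name.
git --no-pager blame ripe-577.txt
```

Here `git log --follow` lists all three commits even though only the last one touched `ripe-577.txt`, and `git blame` attributes the first line to the original publication.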

Hi,
That said, while I agree that Git is a very good tool, I'm reluctant to micro-manage the NCC by saying «you MUST use Git» (or any other particular tool for that matter).
I agree. If we want something then we should specify *what* we want and give the NCC the freedom to choose an implementation that fits both the needs of the community and of the NCC itself. I personally would like: - a word-by-word diff (not line-by-line) - being able to see the diff between a RIPE document and any RIPE document version that preceded it - being able to see the history of a document with or without the policy proposal versions in between - for every RIPE document change I would like to see what caused it (policy proposal, cosmetic surgery, RIPE NCC, etc) - being able to see the history even for policy proposals that haven't (yet) become RIPE documents - a pony (optional) And we have to remember that not all RIPE documents are policy documents. Working Groups and the RIPE NCC also publish other types of RIPE documents. I suggest we see what we want, and then ask the NCC to give feedback on how they see it, if it is doable etc. Sander

On Fri, Mar 15, 2013 at 10:35 PM, Sander Steffann <sander@steffann.nl> wrote:
I agree. If we want something then we should specify *what* we want and give the NCC the freedom to choose an implementation that fits both the needs of the community and of the NCC itself.
I would not mind changing the proposal that way; I am confident git would win, anyway. Using anything else today would be weird at best.
I personally would like: - a word-by-word diff (not line-by-line)
See my initial email. word-by-word can be ambiguous. PDPs must be explicit and without doubt of the underlying data. Offering word-by-word as an optional extra to line-by-line diffs would be trivial, though. And you could even do so on your own client, have it color the diff the way you want, etc etc.
- being able to see the diff between a RIPE document and any RIPE document version that preceded it
Trivial on CLI, may not be doable on web without minor updates to gitweb/gitolite. You will need both filenames.
- being able to see the history of a document with or without the policy proposal versions in between
Trivial on CLI, may not be doable on web without minor updates to gitweb/gitolite. You will need specific commits or, better, tags. Every finished PDP can simply be tagged.
- for every RIPE document change I would like to see what caused it (policy proposal, cosmetic surgery, RIPE NCC, etc)
That's what the commit message or the tag message is for. Plus, _every single copy_ of the document set (i.e. git repo) would carry all that info.
- being able to see the history even for policy proposals that haven't (yet) become RIPE documents
Trivial. `git pull; gitk --all`
- a pony (optional)
Sadly, you missed the time window in which every Ikea offered horse köttbullar.
And we have to remember that not all RIPE documents are policy documents. Working Groups and the RIPE NCC also publish other types of RIPE documents.
True. Maybe focus on policy documents for now?
I suggest we see what we want, and then ask the NCC to give feedback on how they see it, if it is doable etc.
Should we do that in this early phase or later? I would prefer early. Richard
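
The "trivial on CLI" answers above can be illustrated with a small sketch. It assumes two hypothetical document versions stored under different file names (`ripe-example-1.txt`, `ripe-example-2.txt`), each marked with an annotated tag (`pub-1`, `pub-2`); all names and text are invented for the example.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Editor"     # placeholder identity
git config user.email "editor@example.org"

# First version, tagged with a message saying what caused it.
printf 'Allocations are valid for one year.\nRequests are evaluated within five working days.\n' > ripe-example-1.txt
git add ripe-example-1.txt
git commit -q -m "Publish first version"
git tag -a -m "Result of a hypothetical first proposal" pub-1

# Second version under a new file name, tagged again.
git mv ripe-example-1.txt ripe-example-2.txt
printf 'Allocations are valid for two years.\nRequests are evaluated within five working days.\n' > ripe-example-2.txt
git commit -q -am "Publish second version"
git tag -a -m "Result of a hypothetical second proposal" pub-2

# Line-by-line diff between the two documents; both file names are needed.
git --no-pager diff pub-1:ripe-example-1.txt pub-2:ripe-example-2.txt

# The same comparison word by word, as an optional extra view.
git --no-pager diff --word-diff=plain pub-1:ripe-example-1.txt pub-2:ripe-example-2.txt

# Tags with the first line of their messages.
git --no-pager tag -n1
```

The word-by-word view marks removed words as `[-...-]` and added words as `{+...+}`, while the default output stays a plain unified diff.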

Hi Richard,
Trivial on CLI, may not be doable on web without minor updates to gitweb/gitolite. You will need both filenames.
Please stop thinking in solutions, and start thinking in requirements. Oh, and add two other requirements: - No CLI, knowing of filenames or such things must be necessary - A way must be provided for users to clone the RIPE document repository and perform the same analysis as the RIPE NCC offers on their own system Cheers, - Sander

On Fri, Mar 15, 2013 at 10:54 PM, Sander Steffann <sander@steffann.nl> wrote:
Please stop thinking in solutions, and start thinking in requirements.
I was answering your specific questions, but I see your point.
- No CLI, knowing of filenames or such things must be necessary
CLI offers this, you are not forced to use CLI to arrive at this. If you know you want to compare ripe-XXX and ripe-YYY, you have the file names. How could you compare two files without knowing their file names?
- A way must be provided for users to clone the RIPE document repository and perform the same analysis as the RIPE NCC offers on their own system
That is one of the pivotal points of my proposal. Please note the anonymous pull I explicitly mention in the proposal. When I say "CLI", I mean "CLI on your own computer on a fully offline clone of all data". -- Richard

Hi,
- No CLI, knowing of filenames or such things must be necessary
CLI offers this, you are not forced to use CLI to arrive at this.
Which CLI offers this? Remember: we are not talking about specific implementations of a system like git, we are defining requirements :-)
If you know you want to compare ripe-XXX and ripe-YYY, you have the file names. How could you compare two files without knowing their file names?
For example with links on the webpage of ripe-XXX. The web page of ripe-554 contains a link 'Updates: ripe-501'. Adding a link 'Compare to ripe-501' would be very useful.
- A way must be provided for users to clone the RIPE document repository and perform the same analysis as the RIPE NCC offers on their own system
That is one of the pivotal points of my proposal. Please note the anonymous pull I explicitly mention in the proposal. When I say "CLI", I mean "CLI on your own computer on a fully offline clone of all data".
I understand, but we need to specify that without specifying the tool to do it with. That's how you build requirements ;-) Cheers, Sander

On Fri, Mar 15, 2013 at 11:04 PM, Sander Steffann <sander@steffann.nl> wrote:
Which CLI offers this? Remember: we are not talking about specific implementations of a system like git, we are defining requirements :-)
I was offering solutions within the context of what you asked for, mainly to show that git can do everything you wanted. In requirement-mode, my answer is "yes, I agree" to all your points.
For example with links on the webpage of ripe-XXX. The web page of ripe-554 contains a link 'Updates: ripe-501'. Adding a link 'Compare to ripe-501' would be very useful.
Agreed.
I understand, but we need to specify that without specifying the tool to do it with. That's how you build requirements ;-)
Again, I was still in answer mode. My reference to anonymous pull is equivalent to the requirement "everyone on earth should have easy access to the raw data". I will add some more: * Easy, quick and reliable way to get all updates * Easy, quick and reliable way to verify local data is OK * Full access to history while off line Richard
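
If git were the chosen tool, the three added requirements might map onto commands roughly as in the sketch below. It is an assumption-laden illustration: a local directory named `canonical` stands in for the public repository (a real setup would use a URL), and `reader` is a hypothetical user's clone.

```shell
set -eu
work=$(mktemp -d)

# Stand-in for the canonical, publicly readable repository (assumption).
git init -q "$work/canonical"
cd "$work/canonical"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Editor"     # placeholder identity
git config user.email "editor@example.org"
printf 'Policy text, first version.\n' > policy.txt
git add policy.txt
git commit -q -m "Publish first version"

# A reader takes a full copy once.
git clone -q "$work/canonical" "$work/reader"

# Later, a new version is published.
printf 'Policy text, second version.\n' > policy.txt
git commit -q -am "Publish second version"

# Get all updates: only fast-forwards are accepted.
git -C "$work/reader" pull -q --ff-only

# Verify that the local data is OK: checks object integrity and connectivity.
git -C "$work/reader" fsck --full

# Full history is available from the local clone, without network access.
git -C "$work/reader" --no-pager log --oneline
```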

On Fri, Mar 15, 2013 at 10:24 PM, Tore Anderson <tore@fud.no> wrote:
I can see one particular advantage of using a version control system - the "git blame" command, which will show when a particular sentence or paragraph was added or last modified. There's several times I've read RIPE documents and wondered along the lines of: «Where did this come from? What was the rationale behind it?» I rarely go through the trouble to find out.
For example: Re-allocated blocks will be signed to establish the current allocation owner. ;)
That said, while I agree that Git is a very good tool, I'm reluctant to micro-manage the NCC by saying «you MUST use Git» (or any other particular tool for that matter).
I am painfully aware of that. The community is free to shoot this down, but if it's accepted, it would be possible to change this. _Especially_ once there's a yearly list of services through which RIPE NCC can give feedback on if they want to get rid of it and replace it with VCS 3.0 in a decade.
The current system can be improved a whole lot without requiring a version control system, I like to point to the IETF's tools page, which shows all the revisions of a draft leading up to its publication as RFC, and where you can generate both in-line and side-by-side diffs on the fly. One example:
http://tools.ietf.org/html/draft-ietf-v6ops-ipv6-cpe-router-09
Submitting Internet Drafts is a different valley of pain, but the end user interface is somewhat decent. Still, RIPE NCC would need to either ask for that code or code a similar system themselves. Gitweb and gitolite are free and available today. Plus, IETF lacks any offline capabilities.
In any case, for maximum use of any form of version control system, you'd have to consider that even though e.g. ripe-582 is a new version of ripe-577 which in turn is a new version of ripe-553 and so on, the history of ripe-582 alone isn't nearly as interesting as the history of ripe-582 plus all preceding versions combined.
Internally git tracks changesets, not files. Thus, there is no `git cp` which allows you to trace the forking of a file directly. That being said, git log supports --find-copies which does the same on the fly. Also, just like RFCs, documents carry inline information about what other forms they update.
So either the file name needs to stay the same between versions (unlike today), or you'd have to import the entire history of the previous version into whatever you're starting the PDP with for the proposed new version. I don't know if Git allows for this.
If you have an existing repository, importing the old data into the past of the repo will change history, resulting in issues for everyone who cloned from that repo. There are two ways to mitigate this: 1) Recreate full history before starting to really use the main repo. Optionally have a scratch repo for new PDPs which can be rebased on top of the historic repo once it's done. This leaves you with a sparkling clean history, but is a lot of work up front and means people need to actively migrate away from the temporary repo. 2) Use main repo without full history from day one. Once history has been recreated, merge current master into the history and make that the new master. You would have one clear spot in your history where you would always be able to see that two repos were merged. But the actual impact is negligible and work up front minimal. Richard
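
Option 2 can be sketched as follows, as an illustration rather than a tested migration plan. Two assumptions apply: the repository and file names are invented, and the `--allow-unrelated-histories` flag is what current Git versions (2.9 and later) require for joining two histories without a common ancestor. The sketch merges the reconstructed history into master rather than the other way round; parent order aside, the resulting graph is the same.

```shell
set -eu
work=$(mktemp -d)

# "main": the repository in use from day one, without the old documents.
git init -q "$work/main"
cd "$work/main"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Editor"     # placeholder identity
git config user.email "editor@example.org"
printf 'Current policy text.\n' > ripe-current.txt
git add ripe-current.txt
git commit -q -m "Publish current document"
old_tip=$(git rev-parse master)

# "history": the separately reconstructed archive of old documents.
git init -q "$work/history"
cd "$work/history"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Editor"
git config user.email "editor@example.org"
printf 'Old policy text.\n' > ripe-old.txt
git add ripe-old.txt
git commit -q -m "Import historic document"

# Join the two histories with a single, clearly visible merge commit.
cd "$work/main"
git fetch -q "$work/history" master:history
git merge -q --allow-unrelated-histories -m "Join reconstructed history" history

git --no-pager log --oneline --graph

# The previous tip is still an ancestor, so existing clones can fast-forward.
git merge-base --is-ancestor "$old_tip" master
```

Because the old tip of master remains an ancestor of the new one, nobody who cloned earlier has their history rewritten; they only receive one additional merge commit.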

On 15 Mar 2013, at 21:42, Richard Hartmann <richih.mailinglist@gmail.com> wrote:
That said, while I agree that Git is a very good tool, I'm reluctant to micro-manage the NCC by saying «you MUST use Git» (or any other particular tool for that matter).
I am painfully aware of that. The community is free to shoot this down, but if it's accepted, it would be possible to change this.
Boiling the ocean is also possible. In theory anyway... That doesn't make it a good idea or a sensible use of resources. Policy development is already sclerotic. Now imagine just how difficult it will be for the community to reach consensus on a policy to replace last year's shiny version control fad with next year's model. The potential for rat-holing and shed painting will be off-the-scale scary. Let's not forget backwards compatibility issues and having to keep the old platform(s) running because people can't or won't change the stuff they already use and rely on. Locking in the PDP to particular tools or document formats is also stunningly unwise. It's the sort of thing the ITU does. A policy stating "the NCC must use git" is no different from one which states "the NCC must only use vi" or "the NCC must only write code in Java". I'll repeat what I previously said. First identify the problem that needs solving. Then come up with agreed requirements. Once that's done, trust the NCC to choose the right tools and platforms in consultation with the community to deliver the desired outcome(s). Personally, I think all that's needed is to have proposal versions published in a neutral format -- probably plain text -- which allows for the files to be imported into whatever version control tools and document management systems each individual chooses for themself and presumably fits their needs. Whatever the IETF is doing for RFCs and I-Ds might well be good enough.

I agree with what Jim says. - Job On Mar 15, 2013, at 11:32 PM, Jim Reid <jim@rfc1035.com> wrote:
On 15 Mar 2013, at 21:42, Richard Hartmann <richih.mailinglist@gmail.com> wrote:
That said, while I agree that Git is a very good tool, I'm reluctant to micro-manage the NCC by saying «you MUST use Git» (or any other particular tool for that matter).
I am painfully aware of that. The community is free to shoot this down, but if it's accepted, it would be possible to change this.
Boiling the ocean is also possible. In theory anyway... That doesn't make it a good idea or a sensible use of resources.
Policy development is already sclerotic. Now imagine just how difficult it will be for the community to reach consensus on a policy to replace last year's shiny version control fad with next year's model. The potential for rat-holing and shed painting will be off-the-scale scary. Let's not forget backwards compatibility issues and having to keep the old platform(s) running because people can't or won't change the stuff they already use and rely on.
Locking in the PDP to particular tools or document formats is also stunningly unwise. It's the sort of thing the ITU does. A policy stating "the NCC must use git" is no different from one which states "the NCC must only use vi" or "the NCC must only write code in Java".
I'll repeat what I previously said. First identify the problem that needs solving. Then come up with agreed requirements. Once that's done, trust the NCC to choose the right tools and platforms in consultation with the community to deliver the desired outcome(s).
Personally, I think all that's needed is to have proposal versions published in a neutral format -- probably plain text -- which allows for the files to be imported into whatever version control tools and document management systems each individual chooses for themself and presumably fits their needs. Whatever the IETF is doing for RFCs and I-Ds might well be good enough.

On Fri, Mar 15, 2013 at 8:13 PM, Richard Hartmann <richih.mailinglist@gmail.com> wrote:
All PDPs will be maintained as a separate branch. Every new version of a PDP is an update within that branch. The branch will either be merged back into master or left unmerged but intact, depending on if they are accepted or not.
To make this point clear: I am not saying that a proposer should be forced to know git. If RIPE NCC receives the proposal/update as simple text file, it should import that data into git (retaining author information) and commit the result. Or nudge the proposer into the right direction if need be. Again, I will help with the git side of things for at least 12 months if this proposal is accepted. Richard
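
Importing a proposal received as a plain text file while retaining author information could look roughly like this. The proposer name, the committer identity, the file name `proposal.txt` and its content are all hypothetical.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
# The person doing the import is recorded as committer (placeholder identity).
git config user.name "Example Secretariat"
git config user.email "secretariat@example.org"

# Text as received from the proposer, e.g. by email.
printf 'Proposal text as submitted.\n' > proposal.txt
git add proposal.txt

# --author records the proposer separately from the committer.
git commit -q --author="Pat Proposer <pat@example.org>" -m "Proposal v1 as received"

git --no-pager log --format='author:    %an <%ae>%ncommitter: %cn <%ce>'
```

The proposer never has to touch git; the commit still shows who wrote the text and who imported it.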

On 15 Mar 2013, at 21:23, Richard Hartmann <richih.mailinglist@gmail.com> wrote:
If RIPE NCC receives the proposal/update as simple text file, it should import that data into git (retaining author information) and commit the result
NO! A million times no. First, git may well be the flavour-of-the-month as a version control repository/system today. It won't be in the future. What happens then? Next, this could force EVERYONE to use some git client. The NCC's supposed to be neutral. It should not get into the game of requiring the community to use specific tools or software. Just because some people have a git-shaped hammer doesn't mean every problem in the area of version control and document management has to be made to look like a git-shaped nail. You also appear to be defining outcomes before the requirements are agreed. Or even clear. Let's first understand what problem needs solved and then decide what's the best way to solve them.

On Fri, Mar 15, 2013 at 10:46 PM, Jim Reid <jim@rfc1035.com> wrote:
First, git may well be the flavour-of-the-month as a version control repository/system today. It won't be in the future. What happens then?
What always happens: We migrate. I strongly disagree with your assessment that it's "of the month", though. Most if not all major FLOSS projects (which allow GPL software) use git or are migrating to it. Even Microsoft is starting to support it. git will not go away for a long time. And if it does, it will have provided real benefit to everyone within RIPE.
Next, this could force EVERYONE to use some git client. The NCC's supposed to be neutral. It should not get into the game of requiring the community to use specific tools or software.
No, it would not. The whois data is maintained within a SQL DB. I don't need SQL to access whois information. Just because data is stored in a specific canonical place does not mean it's the only place. Point in case: Do you know what RIPE NCC's canonical policy document storage is? Do you care as long as you get to access the files via www, ftp, etc?
Just because some people have a git-shaped hammer doesn't mean every problem in the area of version control and document management has to be made to look like a git-shaped nail.
While that statement is true in and as of itself, I fail to see how it relates to the discussion at hand.
You also appear to be defining outcomes before the requirements are agreed. Or even clear. Let's first understand what problem needs solved and then decide what's the best way to solve them.
That's what we are doing. To arrive at that, we need something to base our discussion on. If I hadn't proposed the above, we would not be having this conversation. Richard

On 15 Mar 2013, at 21:55, Richard Hartmann <richih.mailinglist@gmail.com> wrote:
You also appear to be defining outcomes before the requirements are agreed. Or even clear. Let's first understand what problem needs solved and then decide what's the best way to solve them.
That's what we are doing. To arrive at that, we need something to base our discussion on. If I hadn't proposed the above, we would not be having this conversation.
Your emails on this topic are NOT doing that at all. They're doing the very opposite. You've already decided that git is the solution to everything. As yet it's not clear what problem needs fixing. [Or if there's a consensus that this problem does indeed need fixing.] I suggest you focus on that.

On Fri, Mar 15, 2013 at 11:13 PM, Jim Reid <jim@rfc1035.com> wrote:
Your emails on this topic are NOT doing that at all. They're doing the very opposite. You've already decided that git is the solution to everything. As yet it's not clear what problem needs fixing. [Or if there's a consensus that this problem does indeed need fixing.] I suggest you focus on that.
I stated initially that I was aware that this is the weakest of my proposals. I also agreed to switch over to defining requirements. Do I still expect git to be the solution? Yes. Do I want to dive into a meta-discussion about points I already conceded during a fast-paced discussion? No. Richard

On 15 Mar 2013, at 21:55, Richard Hartmann <richih.mailinglist@gmail.com> wrote:
On Fri, Mar 15, 2013 at 10:46 PM, Jim Reid <jim@rfc1035.com> wrote:
First, git may well be the flavour-of-the-month as a version control repository/system today. It won't be in the future. What happens then?
What always happens: We migrate. I strongly disagree with your assessment that it's "of the month", though.
I exaggerated (a little) for effect. So shoot me. I'm an Old Fart and have lived through far too much of new version control software snake oil and hype. IMO in a few years git will be as dead as SCCS, RCS, CVS, and subversion are. YMMV.
Most if not all major FLOSS projects (which allow GPL software) use git or are migrating to it. Even Microsoft is starting to support it. git will not go away for a long time. And if it does, it will have provided real benefit to everyone within RIPE.
By this argument, we'd do Track Changes in Word because that's what "everybody does".
Point in case: Do you know what RIPE NCC's canonical policy document storage is? Do you care as long as you get to access the files via www, ftp, etc?
Just because some people have a git-shaped hammer doesn't mean every problem in the area of version control and document management has to be made to look like a git-shaped nail.
While that statement is true in and as of itself, I fail to see how it relates to the discussion at hand.
Because you strongly suggested git was the way to fix something before it's clear what needs fixing or what the requirements are.
You also appear to be defining outcomes before the requirements are agreed. Or even clear. Let's first understand what problem needs solved and then decide what's the best way to solve them.
That's what we are doing. To arrive at that, we need something to base our discussion on. If I hadn't proposed the above, we would not be having this conversation.
So let's stop this meta-discussion and get back to what should be getting discussed: a clear problem statement and identification of requirements.

Dear all, this is the updated version of the fourth proposal: All policy documents by RIPE NCC will be published in one canonical place which is freely accessible to the public. All other copies will be copied from this one canonical place. All PDPs and their versions should be archived, no matter whether they are successful or not. A clear way to discern whether PDPs were eventually accepted or not should be provided. An existing local copy should be easy to bring up to date with the current canonical version and to verify against local defects. All policy documents, their predecessors, and the PDPs leading to their existence will be archived and can easily be diffed on the RIPE NCC website. Said web system should allow successive policy documents to be diffed without the PDPs in between, as well. All lines of all policy proposals and PDPs should be traceable to their first occurrence. Rationale and links to relevant discussions should be published along with the policy documents and PDPs. Historic versions should be imported in this system within one year after this proposal becomes policy. If the proposal "All PDP emails, documents and websites should come with unified diff" is accepted, all diffs will be generated from this system. A pony for Sander. Again, if this proposal ends up using git, I am willing to help maintain this system for at least a year. Richard

participants (6)
- Jim Reid
- Job Snijders
- Richard Hartmann
- Rob Evans
- Sander Steffann
- Tore Anderson