Urgent release of RIPE Database software
Dear colleagues,

We were recently made aware of some issues with the documentation related to the REST API. Fixing these issues requires a new release of the Database software, as some parts of the documentation are generated from the active code.

The documentation is vital for anyone wanting to use the service. We therefore believe it is necessary to make a new release containing this fix only and deploy it directly to production later today.

In this particular case we don't believe it is necessary to put the release in test beforehand. The only fix is for the documentation, and this is preventing users from using the service, so it's quite urgent. This fix should have no impact on any other part of the RIPE Database service.

If you have any questions then please let us know.

Kind regards,

Johan Åhlén
Assistant Manager Database
RIPE NCC
Dear Johan,

thanks a lot for the new, much more transparent mode of software version management for the RIPE Database service. This is a huge improvement.

Nevertheless I think that we need continued discussion in the community to improve mutual understanding of considerations and impact.

Your message today brings up a few considerations from me (a bit more tuned to the general process than the specific case).

> We were recently made aware of some issues with the documentation
> related to the REST API. To fix these issues requires a new release of
> the Database software as some parts of the documentation are generated
> from the active code. The documentation is vital for anyone wanting to
> use the service. We therefore believe it is necessary to make a new
> release containing this fix only and deploy it directly to production
> later today.

The problem description does NOT read like there IS ACTUAL SERVICE IMPACT at the moment; looking at the RIPE NCC service status page seems to confirm this, since it reports "no known problems" and no service announcements either.

Doing a software version change with less than a day's advance notice certainly can be appropriate as an "emergency update" to deal with some service impact. How much harm is done if the fix to the documentation is done a few days later, while a note is posted indicating that certain errors will be fixed in the near future?

> In this particular case we don't believe it is necessary to put the
> release in test beforehand. The only fix is for the documentation and
> this is preventing users from using the service, so it's quite urgent.
> This fix should have no impact on any other part of the RIPE Database
> service.

I am tempted to believe your assessment; I don't have access to the changes or the resources to do my own full assessment. As I did experience 3 service-impacting bugs within 12 months with UNannounced version changes, I conclude that I would rather see than believe. I certainly do see a need to watch more carefully how software behaves after ANY change, and a non-zero probability of some surprise (that could result in service impact for me).

Note: knowing the schedule of upcoming changes some time in advance can be more important than actually having the ability to run tests (for example, due to IETF timing I could not run tests against the test server with 1.67.4, but we did schedule preparation of critical production runs in a way that protected us against potential surprises due to the software change).

> If you have any questions then please let us know.
>
> Kind regards,
>
> Johan Åhlén
> Assistant Manager Database
> RIPE NCC

Best regards,
Ruediger

Ruediger Volk
Deutsche Telekom AG -- Internet Backbone Engineering
E-Mail: rv@NIC.DTAG.DE
Dear Ruediger,

Thank you for the input. As you are all aware, the release procedure is new and does require some tuning, so in that respect any suggestions on how we can improve it are welcome.

In this particular situation our assessment is that the broken links in the documentation do indeed have a service impact. For some users this could be critical, as they may not be able to use the REST API service. At least one case was reported to us where development against the API has come to a halt because of this.

Perhaps in this case we can hold the deployment of the fix until the working group advises us on how to proceed.

Kind regards,

Johan Åhlén
Assistant Database Manager
RIPE NCC

On 15 Aug 2013, at 14:52, "Ruediger Volk, Deutsche Telekom Technik - FMED-41.." <rv@NIC.DTAG.DE> wrote:
> [...]
Hi,

Any reason not to deploy on the test environment, wait 24 hours, and then proceed to roll out on the production environment?

Kind regards,

Job

On Aug 15, 2013, at 3:27 PM, Johan Åhlén <jahlen@ripe.net> wrote:
> [...]

-- AS5580 - Atrato IP Networks
That seems like a reasonable approach to me.

Tim Garrison
Software Engineer III
SoftLayer, an IBM Company
315 Capitol Street Suite 205, Houston, TX 77002
281.714.4213 direct | 713.540.4325 mobile | 281.714.4657 fax | tgarrison@softlayer.com

On 08/15/2013 08:32 AM, Job Snijders wrote:
> [...]
Dear all,

The release containing the fix is now available in the TEST Database and there's a service announcement on the web page.

If there are no objections, we will keep this release in the TEST Database for 24 hours and put it into production on the RIPE Database tomorrow at 16:00. Please go ahead and test your applications against the version in the TEST Database.

The URL for the REST API in TEST is:

http(s)://rest-test.db.ripe.net

Kind regards,

Johan Åhlén
Assistant Manager Database
RIPE NCC

On 15 Aug 2013, at 15:37, Tim Garrison <tgarrison@softlayer.com> wrote:
> [...]
I would like to report that there still appear to be some broken links in the documentation. For example, on https://rest-test.db.ripe.net/api-doc/path__create.html there is a link to the "whois-resources" element, which points to https://rest-test.db.ripe.net/api-doc/el_ns0_whois-resources.html . This link produces a 404.

Tim Garrison
Software Engineer III
SoftLayer, an IBM Company
315 Capitol Street Suite 205, Houston, TX 77002
281.714.4213 direct | 713.540.4325 mobile | 281.714.4657 fax | tgarrison@softlayer.com

On 08/15/2013 09:10 AM, Johan Åhlén wrote:
> [...]
Dear Tim,

Instead of quickly fixing this issue and creating a new release today, we've decided to halt the deployment of any new release for now.

This is very embarrassing and we're very sorry for this. We obviously put too much trust in the framework that generates the documentation, and we don't have the proper mechanisms in place to validate the correctness of the generated documentation. We will give this our fullest attention in the coming days and expect to have a working solution sometime next week.

In the meantime I hope the existing documentation in the TEST Database contains the information you need to proceed with your project. If you have any questions about how the API works then please contact us directly and we'll gladly help you out.

Kind regards,

Johan Åhlén
Assistant Manager Database
RIPE NCC

On 15 Aug 2013, at 17:20, Tim Garrison <tgarrison@softlayer.com> wrote:
> [...]
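The failure discussed in this thread, a generated documentation page linking to a target that returns a 404, is the kind of error an automated link check might catch before a release leaves the TEST Database. What follows is only a minimal sketch in Python under stated assumptions, not the RIPE NCC's actual tooling: the names `LinkCollector`, `extract_links`, `http_status` and `find_broken_links` are hypothetical, and the check only looks at `<a href>` targets on a single page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag
import urllib.error
import urllib.request


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag seen in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return absolute, fragment-free http(s) link targets found in html."""
    collector = LinkCollector()
    collector.feed(html)
    links = []
    for href in collector.hrefs:
        # Resolve relative links against the page URL and drop "#fragment".
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if absolute.startswith(("http://", "https://")) and absolute not in links:
            links.append(absolute)
    return links


def http_status(url):
    """Fetch url and return its HTTP status code (needs network access)."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx; the code is what we want.
        return err.code


def find_broken_links(html, base_url, status_of=http_status):
    """Return (url, status) pairs for links whose status is 400 or above."""
    broken = []
    for url in extract_links(html, base_url):
        status = status_of(url)
        if status >= 400:
            broken.append((url, status))
    return broken
```

The `status_of` parameter is injectable so that the check can be exercised with a stub (for example a dictionary's `get` method) without touching the network; the default `http_status` performs real requests and does not handle connection-level errors.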
Hi Johan!

first of all, thanks for the openness and announcements on the list(s).

As my question is post-factum, I am not including the list(s), but I'd still like to collect some feedback from those "involved":

Do I understand the stated facts correctly: that the software (the API) was working as intended, but the (auto-)generated documentation was in error?

If this is the case, then I (personally, not wearing my WG co-chair hat) would strongly lean towards Rüdiger's position. In that case I would rather issue an alert regarding the documentation error and a correction. The approach as proposed, to still go through the TEST cycle, was the correct one, imho.

Btw, I do not easily agree to label the problem as "service impacting" if the software is correct but the documentation is in error. If it were the other way 'round, then yes, the label would be correct in my opinion.

Johan Åhlén wrote:
> [...]

Regards,
Wilfried.
Dear Wilfried,
In retrospect, we may have overestimated the severity of the situation. Based on Tim's message to the list, we made the initial assessment that this was impacting his operations and also anyone else wanting to use the new REST API. Without the proper documentation in place it would be very difficult to migrate to the new API.
At the point of the first announcement, we did not have a workaround available, so we considered it necessary to update the Production Database as soon as possible. The documentation was part of the binary release, so any changes to it required a new release in production.
We agreed with the approach suggested on the list to put the software in the TEST Database before going to production. The test period was set to 24 hours, as a result of further feedback from the list. This proved beneficial, as we immediately found new bugs. At that point, we decided to halt the release process and conduct further investigation.
The severity and urgency of this issue are now lower than initially estimated, partly because we now have a workaround in place: use the TEST Database documentation.
This was the first time we applied the new release procedure for this type of incident, and it has provided us with valuable feedback as to how to proceed in the future, should a similar situation arise.
I hope this clarifies the steps that were taken - and why.
Kind regards,
Johan Åhlén Assistant Manager Database RIPE NCC
On 21 Aug 2013, at 14:14, Wilfried Woeber <Woeber@CC.UniVie.ac.at> wrote:
Hi Johan!
First of all, thanks for the openness and announcements on the list(s).
As my question is post-factum, I am not including the list(s), but I'd still like to collect some feedback from those "involved":
Do I understand the stated facts correctly: that the software (the API) was working as intended, but the (auto-)generated documentation was in error?
If this is the case, then I (personally, not wearing my WG co-chair hat) would strongly lean towards Rüdiger's position. In that case I would rather issue an alert regarding the documentation error and a correction.
The approach as proposed to still go through the TEST cycle was the correct one, imho.
Btw, I do not easily agree to label the problem as "service impacting" if the software is correct but the documentation is in error. If it were the other way 'round, then yes, the label would be correct in my opinion.
Johan Åhlén wrote:
Dear Tim,
Instead of quickly fixing this issue and creating a new release today we've decided to halt the deployment of any new release for now.
This is very embarrassing and we're very sorry for this. We obviously put too much trust in the framework that generates the documentation and we don't have the proper mechanisms in place to validate the correctness of the generated documentation. We will give this our fullest attention in the coming days and expect to have a working solution sometime next week.
In the meantime I hope the existing documentation in TEST Database contains the information you need to proceed with your project. If you have any questions about how the API works then please contact us directly and we'll gladly help you out.
Kind regards,
Johan Åhlén Assistant Manager Database RIPE NCC
Regards, Wilfried.
Hi,
Any reason not to deploy on test environment, wait 24 hours, proceed to roll out on production environment?
Hi Job, Johan,
I'm happy to see careful attention, improvements, and tuning for the new process. That needs to continue and I'd expect to have some discussion in Athens...
It would seem to me: deploying the fix to the test environment would already solve the service impact (that Johan sees) ... if we can point potential readers of the documentation there... So sidestepping the test environment actually delays fixing the specific problem - while increasing risks for the production use...
My take: no need and justification for rushing change on the production system.
Kind regards, Ruediger
Ruediger Volk
Deutsche Telekom AG -- Internet Backbone Engineering
E-Mail: rv@NIC.DTAG.DE
participants (6)
- Job Snijders
- Johan Åhlén
- Ruediger Volk, Deutsche Telekom Technik - FMED-41..
- Ruediger Volk, Deutsche Telekom Technik - FMED-41..
- Tim Garrison
- Wilfried Woeber