(temporary) solution for the database index problems ?!?!
Dear all,
Over the last days I did some further research with Benoit Grange and Willi Huber (they both run SUN OS, so we could use each other's binaries) on the recent indexing problems. This morning I found out that the problems disappear when you don't use the '-p' prime option in netdbm. This option adds dummy objects to the database to make the lookups faster.
So for the temporary solution:
Do *NOT* use the '-p' option in netdbm !!!
I am not sure yet whether there is a bug in the code or whether certain 'dbm' versions cause the problems. I will investigate further ...
Kind regards,
David Kessens RIPE NCC Database software maintainer
RIPE Database Manager <ripe-dbm@ripe.net> writes:
*
* Dear all,
*
* The last days I did some further research with Benoit Grange and Willi
* Huber (they both run SUN OS, so we could use each other's binaries) on the
* recent indexing problems. This morning I found out that the problems
* disappear when you don't use the '-p' prime option in netdbm. This option
* adds dummy objects to the database to make the lookups faster.
*
* So for the temporary solution:
*
* Do *NOT* use the '-p' option in netdbm !!!
*
* I am not sure yet if there is a bug in the code or that certain 'dbm'
* versions cause the problems. I will investigate further ...
Why didn't anyone ask me? It is definitely a DBM bug, or rather the way DBM grows its datafiles. It basically is rather stupid when it needs more space and will take an increasing chunk of disk space for the index. Not using the -p option will only be a temporary measure, only because you make the database smaller (no prime entries) but you do make it flatter. Try to build or run a perl with sdbm or gdbm. These two have better size control than the standard dbm (esp. gdbm) but are slower when indexing.
-Marten
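To compare how much disk space the index takes under different dbm flavours, or with and without '-p', a small helper along these lines could be run after each indexing pass. This is only a sketch and not part of the RIPE code. The function name `index_footprint` and the suffix list are assumptions made for illustration (classic dbm-style packages usually write `.pag` and `.dir` files, and the BSD DB build discussed in this thread produced `.db` files). It reports both the apparent size and the allocated size, since the two can differ when an index file is sparse.

```python
import os

# Assumed suffixes: .pag/.dir for classic dbm-style packages, .db for a BSD DB build.
INDEX_SUFFIXES = (".pag", ".dir", ".db")


def index_footprint(base, suffixes=INDEX_SUFFIXES):
    """Return {path: (apparent_bytes, allocated_bytes)} for the index files that exist."""
    report = {}
    for suffix in suffixes:
        path = base + suffix
        if not os.path.exists(path):
            continue
        st = os.stat(path)
        # st_blocks is in 512-byte units on POSIX systems; where the attribute
        # is missing, fall back to the apparent size.
        blocks = getattr(st, "st_blocks", None)
        allocated = blocks * 512 if blocks is not None else st.st_size
        report[path] = (st.st_size, allocated)
    return report


if __name__ == "__main__":
    # "ripe.db.save" is only an example base name.
    for path, (apparent, allocated) in index_footprint("ripe.db.save").items():
        print(path, apparent, allocated)
```

Running it once per configuration and comparing the totals gives a quick check on whether the index, rather than the data file, is what fills the file system.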
Hi Marten,
Marten Terpstra <marten@BayNetworks.com> writes:
*
* Why didn't anyone ask me? It is definitely a DBM bug, or rather the
The first report came from Tony Bates (you probably still remember him ;-)) and most of the discussion was done on the 'db-wg' mailing list (true, this subject belongs more on rr-impl). We more or less (wrongly?) assumed that you were on that mailing list and that if Tony had this problem it was something new ...
* way DBM grows its datafiles. It basically is rather stupid when it needs
* more space and will take an increasing chunk of disk space for the
* index. Not using the -p option will only be a temporary measure, only
* because you make the database smaller (no prime entries) but you do
* make it flatter. Try and make or run a perl with sdbm or gdbm. These
* two have better size control than the standard dbm (esp gdbm) but are
* slower when indexing.
This was also my conclusion, but we would like to know for sure that it is the dbm package, since this problem appeared within *one* week with several people using completely different sizes of filesystems... I don't like to maintain a timebomb that will go off some other day ... I had better be sure about this.
I hear that most people recommend the Berkeley db package so I am experimenting with that one ... (Check the perl5 man pages 'man AnyDBM_File' and you know why ;-)).
Kind regards,
David Kessens RIPE NCC Database software maintainer
* The first report came from Tony Bates (You probably still remember him
* ;-)) and most of the discussion was done on the 'db-wg' mailing list
* (true this subject belongs more on rr-impl). We more or less (wrongly?)
* assumed that you were on that mailing list and that if Tony had this
* problem it was something new ...
Well, I seem to have dropped off most mailing lists.... Please put me back on db-wg...
* * way DBM grows its datafiles. It basically is rather stupid when it needs
* * more space and will take an increasing chunk of disk space for the
* * index. Not using the -p option will only be a temporary measure, only
* * because you make the database smaller (no prime entries) but you do
* * make it flatter. Try and make or run a perl with sdbm or gdbm. These
* * two have better size control than the standard dbm (esp gdbm) but are
* * slower when indexing.
*
* This was also my conclusion, but we would like to know for sure that it
* is the dbm package since this problem appeared within *one* week with
* several people using completely different sizes of filesystems... I don't
* like to maintain a timebomb that will go off some other day ... I better
* be sure about this.
If they all mirror the RIPE DB, they will all see the problem at about the same time. Trust me, it is dbm. Tony and I tried indexing with a lot of other versions of dbm when we wrote the code and saw exactly the same behaviour for the default dbm package (try indexing the Internic database without the -p option.....)
* I hear that most people recommend the Berkeley db package so I am
* experimenting with that one ... (Check the perl5 man pages 'man
* AnyDBM_File' and you know why ;-)).
Yup. Perhaps it is time to move to perl 5; that will take a bit of rewriting though....
-Marten
In message <9509281346.AA21955@class.engeast>, Marten Terpstra writes:
Yup. Perhaps it is time to move to perl 5, that will take a bit of rewriting though....
-Marten
Don't do that. There are still bugs in perl5 and this is too important. Just relink perl4. BSD DB has ndbm emulation, so just build the library and put it in the perl Makefile.
Curtis
Without the '-p' option the script completed without 'file system full'. So we again have some time to go.
Willi Huber SWITCH
In message <9509290728.AA05326@ncc.ripe.net>, Willi Huber writes:
Without '-p' option the script has completed without 'file system full'. So we have again some time to go.
Willi Huber SWITCH
We get much smaller files. At David Kessens' request I tried this with -p to see if it made any difference. It doesn't. It appears the difference is simply BSD DB vs DBM.
I normally use -cfMV with the code I run. I cleaned the directory and then added "p", yielding -pcfMV. The results were:
[curtis@figaro.ans.net 5] ls -ltr test
total 51856
-rw-r----- 1 curtis staff 22044885 Sep 29 00:27 ripe.db.save
-rw-r----- 1 curtis staff 16384 Sep 29 18:08 ripe.db.save.db
-rw-r----- 1 curtis staff 4448256 Sep 29 18:08 ripe.db.save.cl.db
[curtis@figaro.ans.net 6] du -s test
51858 test
We are running modified code (some of the flags are not in the RIPE distributed code, but all our code recently went to David). Anyone can get the same effect by relinking with BSD DB and looking for the few places in the .pl files where .pag or .dir is found.
Curtis
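For anyone who wants to locate those few places before relinking, a scan like the following may help. It is only a sketch: the helper name `find_dbm_suffixes` is made up for illustration, and it does nothing more than list the lines in `*.pl` files that mention a literal `.pag` or `.dir` suffix. Which of those lines actually need changing is left to the reader.

```python
import re
from pathlib import Path

# Literal ".pag" or ".dir" not followed by a word character,
# so ".dirname" or ".page" are not reported.
SUFFIX_RE = re.compile(r"\.(?:pag|dir)\b")


def find_dbm_suffixes(root):
    """Yield (file name, line number, line text) for matching lines in *.pl files under root."""
    for path in sorted(Path(root).rglob("*.pl")):
        # latin-1 decodes any byte sequence, which is convenient for old source files
        with open(path, encoding="latin-1") as handle:
            for lineno, line in enumerate(handle, start=1):
                if SUFFIX_RE.search(line):
                    yield path.name, lineno, line.rstrip("\n")


if __name__ == "__main__":
    # "." is a placeholder for the directory that holds the .pl files.
    for name, lineno, text in find_dbm_suffixes("."):
        print(f"{name}:{lineno}: {text}")
```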
Dear all,
Curtis Villamizar <curtis@ans.net> writes:
*
* We get much smaller files. At David Kessens' request I tried this
* with -p to see if it made any difference. It doesn't. It appears the
* difference is simply BSD DB vs DBM.
*
* I normally use -cfMV with the code I run. I cleaned the directory and
* then added "p" yielding -pcfMV. The results were:
*
* [curtis@figaro.ans.net 5] ls -ltr test
* total 51856
* -rw-r----- 1 curtis staff 22044885 Sep 29 00:27 ripe.db.save
* -rw-r----- 1 curtis staff 16384 Sep 29 18:08 ripe.db.save.db
* -rw-r----- 1 curtis staff 4448256 Sep 29 18:08 ripe.db.save.cl.db
* [curtis@figaro.ans.net 6] du -s test
* 51858 test
*
* We are running modified code (some of the flags are not in the RIPE
* distributed code, but all our code recently went to David). Anyone
* can get the same effect relinking BSD DB and looking for the few
* places in the .pl files where .pag or .dir is found.
I have received the code from Curtis. Since it looks like BSD DB is a fast and more reliable database package, I certainly want to support it. Support for BSD DB will be in the next distribution.
Kind regards,
David Kessens RIPE NCC Database software maintainer
participants (4)
- Curtis Villamizar
- Marten Terpstra
- RIPE Database Manager
- Willi Huber