Dear all,

lately I get messages from the RIPE database when querying it with whois that there are too many hits, e.g.:

    jade$ whois -h whois.ripe.net "Stefan Anders"
    The index search returned too many hits. Please refine your search key.

I had a look at the database and there are exactly three entries which contain the string "Stefan Anders":

    jade$ grep "Stefan Anders" ripe.db.pn
    *pn: Stefan Andersson
    *pn: Stefan Anders
    *pn: K A Stefan Andersson

Does not seem to be a lot, does it? Is this an error of the database software or is it an indexing problem (there have been other problems reported on the indexing software in the last weeks...)?

Regards

Joachim
_______________________________________________________________________________
Dr. Joachim Schmitz                        schmitz@rus.uni-stuttgart.de
DFN Network Operation Center
Rechenzentrum der Universitaet Stuttgart   ++ 711 685 5576 voice
Allmandring 30                             ++ 711 678 7626 FAX
D-70550 Stuttgart                          FRG (Germany)
_______________________________________________________________________________
Dear Joachim,
Joachim Schmitz writes :
lately I get messages from the RIPE database when querying it with whois that there are too many hits, e.g:
    jade$ whois -h whois.ripe.net "Stefan Anders"
    The index search returned too many hits. Please refine your search key.
As I explained to you before (when you asked why the database software couldn't find "Frank Simon"), the database software treats the arguments as a list of separate search keys, not as one key. It is then easy to see that 'Anders' and 'Stefan' each give too many hits in the database:

    $ grep \*pn: ripe.db.pn | grep -iw Stefan | wc -l
    149
    $ grep \*pn: ripe.db.pn | grep -iw Anders | wc -l
    254
I had a look at the database and there are exactly three entries which contain the string "Stefan Anders":
    jade$ grep "Stefan Anders" ripe.db.pn
    *pn: Stefan Andersson
    *pn: Stefan Anders
    *pn: K A Stefan Andersson
Does not seem to be a lot, does it? Is this an error of the database software or is it an indexing problem (there have been other problems reported on the indexing software in the last weeks...)?
So, no, it's not an error or an indexing problem. It is a design issue. Changing this behavior can be done, but it might cause unwanted effects for people (or tools) that rely on this feature to look up several objects at the same time.

Kind regards,

David Kessens
RIPE NCC database software maintainer
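The behavior described above can be illustrated with a small sketch. This is not the RIPE database code: the function names, the word-level index, and the value of MAX_HITS are assumptions made for illustration, and the real hit limit is not stated in this thread. The sketch only shows why a query of two common words fails even though very few records contain both.

```python
# Illustrative sketch only; names and the limit below are hypothetical.
MAX_HITS = 100  # assumed limit, the real value is not given in this thread


def build_index(names):
    """Map each lowercased whole word to the set of record numbers using it."""
    index = {}
    for record_id, name in enumerate(names):
        for word in name.lower().split():
            index.setdefault(word, set()).add(record_id)
    return index


def lookup(index, query, max_hits=MAX_HITS):
    """Look up every word of the query as its own key, never as one phrase."""
    results = {}
    for key in query.lower().split():
        hits = index.get(key, set())
        if len(hits) > max_hits:
            # One over-full key is enough to reject the whole query.
            raise LookupError("too many hits for key %r" % key)
        results[key] = hits
    return results
```

With a deliberately small limit, `lookup(index, "Stefan Anders", max_hits=2)` fails as soon as "stefan" alone matches more than two records, no matter how few records contain both words.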
The problem really is that Stefan is a frequently used first name, and so is Anders, maybe not in Germany but elsewhere (up north). :-)

Of course the real problem (as opposed to "the problem really") is the indexing algorithm, which uses the DBM library for speed. In the current DBM implementation there is a maximum data size, which limits the number of objects a key can reference. Normally this does not pose a problem, as there are only a small number of people whose first name and surname are both common. There are exceptions, especially where first names are used as surnames.

Please use handles.

We plan to use a different DBM implementation without this limit as soon as possible and after thorough compatibility testing.

Daniel
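To make the size limit concrete, here is a rough sketch of how a cap on the stored value translates into a cap on references per key. The 1024-byte limit and the 4-byte reference format are assumptions chosen for the arithmetic; the thread does not say what the actual DBM limit is or how the index encodes its references.

```python
import struct

# Both figures are assumptions for illustration, not values from the thread.
MAX_VALUE_BYTES = 1024            # assumed maximum size of one DBM value
REF_FORMAT = "<I"                 # assumed encoding: one 4-byte offset per object
REF_SIZE = struct.calcsize(REF_FORMAT)


def max_refs_per_key(max_value_bytes=MAX_VALUE_BYTES, ref_size=REF_SIZE):
    """Number of object references that fit into a single value."""
    return max_value_bytes // ref_size


def pack_refs(offsets, max_value_bytes=MAX_VALUE_BYTES):
    """Pack object offsets into one value, refusing lists that do not fit."""
    value = b"".join(struct.pack(REF_FORMAT, off) for off in offsets)
    if len(value) > max_value_bytes:
        raise OverflowError("key references too many objects for one value")
    return value
```

Under these assumed figures a key could reference at most 256 objects. A name as common as "Stefan" would exceed a limit of this kind long before a handle, which identifies a single object, ever could.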
Participants (3): Daniel Karrenberg, David.Kessens@ripe.net, Schmitz@rus.uni-stuttgart.de