Deal with "non-breaking-space"-characters in updates

7 Jul 2015

Summary:
Currently the whois database accepts "non-breaking-space"-characters in updates. We propose that whois replaces these "non-breaking-space"-characters with regular spaces before storing the object.

Context:
Currently the whois database accepts updates with "non-breaking-space"-characters as part of attribute values.
This behaviour is acceptable in itself, since the "non-breaking-space"-character is part of the "latin1"-character-set which is supported by whois.
The whois database treats these "non-breaking-space"-characters as regular spaces and considers the object to be syntactically correct.
However, the object is stored exactly as it was received from the client, including the non-breaking spaces.

Problem:
We think most of these "non-breaking-space"-characters were intended to be regular spaces, but ended up mangled due to copy and paste.
Because the object is stored as received, a query for such an object returns the original object, including the non-breaking spaces.
This is inconvenient, since most clients cannot handle this "exceptional" character. In many clients it will end up as something like a "?"-character.
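
A minimal sketch of how the "?" can appear, assuming a client that decodes the latin1 response and then re-encodes it as plain ASCII with replacement. The attribute value is made up for illustration, and real clients may fail in other ways.

```python
# Hypothetical attribute value containing a non-breaking space (U+00A0).
stored = "remarks:        Example\u00a0Network".encode("latin-1")

# In latin1 the non-breaking space is the single byte 0xA0.
assert b"\xa0" in stored

# An ASCII-only client cannot represent that character and substitutes "?".
shown = stored.decode("latin-1").encode("ascii", errors="replace").decode("ascii")
print(shown)  # remarks:        Example?Network
```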

Alternative solutions:
-1- Let whois software accept the update but convert "non-breaking-space"-characters into regular spaces before storing.
-2- Consider requests containing "non-breaking-space"-characters as syntactically incorrect, and do not accept updates containing them.
-3- Keep current solution: You get what you asked for.
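
The first two options can be sketched as follows. This is only an illustration in Python, not the whois server's actual code: the function names `normalise_nbsp` and `validate_no_nbsp` and the sample attribute value are hypothetical.

```python
NBSP = "\u00a0"  # non-breaking space, byte 0xA0 in latin1


def normalise_nbsp(value: str) -> str:
    """Option 1: replace every non-breaking space with a regular space."""
    return value.replace(NBSP, " ")


def validate_no_nbsp(value: str) -> None:
    """Option 2: reject the value if it contains a non-breaking space."""
    if NBSP in value:
        raise ValueError("attribute value contains a non-breaking space")


# Made-up update line as it might arrive from a client, decoded from latin1.
raw = b"address:        Example\xa0Street\xa01".decode("latin-1")

print(normalise_nbsp(raw))  # address:        Example Street 1
```

Option 1 leaves the update path unchanged for the client, while option 2 pushes the clean-up back to the submitter. Option 3 corresponds to storing `raw` untouched.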

Some additional statistics that can be used to understand the size of the problem:
Approximately 3,000 objects (out of 8,000,000) contain "non-breaking-space"-characters, mostly in "remarks"-, "description"- and "address"-attributes.

Marc Grol
Member of RIPE's whois database team
mgrol@ripe.net
+31648928856
