Hi Nick

"would it be possible to work towards a day in the future when 
the ripedb could formally support utf8 across multiple different field 
types, not just email addresses?"

The RIPE NCC has looked at the technical side of using UTF-8 in the RIPE Database several times, but it has never moved forward because of social, political, registry and legal issues that cannot be resolved at the technical level:

-Should we allow all, some or none of the data content to be in non-Latin characters?
-Is there some data content that must be in Latin characters (for the correct functioning of the registry), some that can be duplicated in Latin and local characters, and some that can be in any format chosen by the resource holder?
-If some data is duplicated, how will the duplicate sets be verified as consistent (do they need to contain the same data)?
-Who needs to be able to interpret which parts of the data content, and for what purpose?
-Who can make these decisions? (This is the bit that has been missing ever since the subject of UTF-8 was first raised many years ago.)

Punycode can probably be implemented for IDNs in a matter of weeks; full implementation of UTF-8 may take years.
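
To illustrate the client-side conversion Benedikt mentions, here is a minimal sketch using Python's built-in "idna" codec. This is not how the RIPE Database or any existing client does it; the function names are made up for illustration, and the codec follows the older IDNA 2003 rules, so it should be read as an example only.

```python
# Hypothetical helpers: convert IDN domains between Unicode and Punycode (ACE) form.
# Uses Python's standard "idna" codec, which implements IDNA 2003.

def domain_to_ascii(domain: str) -> str:
    """Return the ASCII (xn--) form of a possibly internationalised domain."""
    return domain.encode("idna").decode("ascii")


def domain_to_unicode(domain: str) -> str:
    """Return the Unicode form of a domain that may contain xn-- labels."""
    return domain.encode("ascii").decode("idna")


def email_to_ascii(address: str) -> str:
    """Convert only the domain part of an email address; the local part is untouched."""
    local, sep, domain = address.rpartition("@")
    if not sep:
        raise ValueError("not an email address: missing '@'")
    return f"{local}@{domain_to_ascii(domain)}"


if __name__ == "__main__":
    print(domain_to_ascii("bücher.example"))
    print(domain_to_unicode("xn--bcher-kva.example"))
    print(email_to_ascii("info@bücher.example"))
```

With something like this, the database would only ever store the ASCII form, and each client would decide whether to display the Unicode form to its user.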

cheers
denis

co-chair DB-WG



On Tuesday, 7 July 2020, 23:54:13 CEST, Nick Hilliard via db-wg <db-wg@ripe.net> wrote:


Gert Doering via db-wg wrote on 07/07/2020 16:21:

> On Tue, Jul 07, 2020 at 05:03:01PM +0200, Benedikt Neuffer wrote:
>> I think at the moment it would be easier to update the clients to
>> convert punycode in whatever encoding you prefer.
>
> Cementing this abomination even more...


No-one likes punycode.

Do you have a better suggestion?  Would raw utf8 work here?  How much
would it break?  If there were broader support in the ripedb codebase
for utf8, would it be possible to work towards a day in the future when
the ripedb could formally support utf8 across multiple different field
types, not just email addresses?

Nick