Report #9: Unicode support

Dmitry Selyutin
Hello everyone!

Here is the latest news about the Unicode support project[0].
You can always check my repository[1].

Over the past few days I had hardware problems (my HDD died
peacefully), so development did not progress as much as before. I have
since resolved them and used the time to fix bugs and reorganize the
code as much as possible. Everything should compile now.

I decided to use __attribute__((constructor)) and
__attribute__((destructor)), since I don't know of a better way to
open a file once at startup and close it when all routines are done.
I've found one or two occurrences of this construct in the FreeBSD
code; AFAICT it is rather common with clang and gcc, so I decided to
use it. Hopefully it will also allow us to use the root collation
database on embedded systems (if any such system really needs a
collation algorithm).
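
For readers unfamiliar with the mechanism, here is a minimal sketch of the constructor/destructor approach. It is not the code from the repository: the names colldb_root, colldb_startup, colldb_shutdown and colldb_init_count are hypothetical, and tmpfile() merely stands in for opening the real root collation database.

```c
#include <stdio.h>

/* Hypothetical handle standing in for the root collation database. */
static FILE *colldb_root;
/* Counts how many times the startup hook has run. */
static int colldb_inits;

/* Runs automatically before main(): acquire the resource once. */
__attribute__((constructor))
static void
colldb_startup(void)
{
	/* Assumption: tmpfile() replaces opening the real database file. */
	colldb_root = tmpfile();
	colldb_inits++;
}

/* Runs automatically after main() returns or exit() is called. */
__attribute__((destructor))
static void
colldb_shutdown(void)
{
	if (colldb_root != NULL) {
		fclose(colldb_root);
		colldb_root = NULL;
	}
}

/* Hypothetical accessor: reports how often the startup hook ran. */
int
colldb_init_count(void)
{
	return (colldb_inits);
}
```

By the time main() starts, the startup hook has already run exactly once, so library routines can use the handle without an explicit initialization call. Both attributes are GCC extensions that clang also implements; they are not part of ISO C.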

As you may know, we need a tool that converts the collation text files
obtained from unicode.org to the new collation database (colldb)
format. There is a version of this tool written in Python
(share/examples/colldb/colldb.py). IIRC we can't rely on Python in the
base system though, so it seems that we need to write such a tool in
C. I was thinking of a lex/yacc combo; I've never tried it, but I
think it shouldn't be too hard to write the tool that way. I'd like to
know your opinions about this task.
I've already written a man page (bin/colldb/colldb.1). The only thing
that seems dubious is that I used the same name as for the library
itself (well, it seems I lack imagination). So we have both colldb.1
and colldb.3 man pages.
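
To give a feeling for the size of the parsing task, here is a rough sketch of reading one entry in plain C. It is a hand-written parser, not lex/yacc, and not the actual colldb tool. It assumes input lines shaped like the entries of the Unicode allkeys.txt file ("0041 ; [.1C47.0020.0008] # comment"), reads only the first code point and the first collation element, and ignores contractions and expansions. The names coll_element and parse_allkeys_line are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_WEIGHTS 4

/* One collation element: a variable flag and up to MAX_WEIGHTS weights. */
struct coll_element {
	int variable;			/* 1 if the element starts with '*' */
	unsigned nweights;		/* number of weights actually read */
	uint32_t weights[MAX_WEIGHTS];
};

/*
 * Parse the first code point and the first collation element of a line.
 * Returns 0 on success and -1 on comments or malformed input.
 */
int
parse_allkeys_line(const char *line, uint32_t *codepoint,
    struct coll_element *ce)
{
	const char *p;
	char *end;
	unsigned long v;

	/* Leading hexadecimal code point; comment lines fail here. */
	v = strtoul(line, &end, 16);
	if (end == line)
		return (-1);
	*codepoint = (uint32_t)v;

	/* Collation elements are enclosed in square brackets. */
	p = strchr(end, '[');
	if (p == NULL)
		return (-1);
	p++;
	ce->variable = (*p == '*');
	ce->nweights = 0;

	/* Each weight is preceded by '.' (or '*' for the first one). */
	while (*p != ']') {
		if (*p != '.' && *p != '*')
			return (-1);
		p++;
		if (ce->nweights == MAX_WEIGHTS)
			return (-1);
		v = strtoul(p, &end, 16);
		if (end == p)
			return (-1);
		ce->weights[ce->nweights++] = (uint32_t)v;
		p = end;
	}
	return (ce->nweights > 0 ? 0 : -1);
}
```

A real converter would also have to handle multi-code-point keys, several elements per entry, and the CLDR-specific files, which is where a lex/yacc grammar might pay off.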

The other thing I'd really like to do is to enforce network byte
order in the collation database format (I'm sure I've seen a way to do
it in Berkeley databases). It's a pity that I have no platform with
big-endian (or even PDP!) byte order. Any help here is highly
appreciated (as well as your thoughts about lex/yacc, i.e. whether it
is a good fit for my task).
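
One way to get a fixed on-disk byte order without access to big-endian hardware is to serialize every integer byte by byte, which behaves the same on any host. The sketch below illustrates that idea only; put_be32 and get_be32 are hypothetical helper names, not part of colldb or of Berkeley DB.

```c
#include <stdint.h>

/* Store a 32-bit value in network (big-endian) byte order. */
void
put_be32(unsigned char *buf, uint32_t value)
{
	buf[0] = (unsigned char)(value >> 24);
	buf[1] = (unsigned char)(value >> 16);
	buf[2] = (unsigned char)(value >> 8);
	buf[3] = (unsigned char)value;
}

/* Load a 32-bit value that was stored in network byte order. */
uint32_t
get_be32(const unsigned char *buf)
{
	return (((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
	    ((uint32_t)buf[2] << 8) | (uint32_t)buf[3]);
}
```

Because the shifts operate on values rather than on memory layout, the result does not depend on the host's endianness. The standard htonl()/ntohl() functions from <arpa/inet.h> are an alternative when whole 32-bit words are written at once.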

Since the Google Summer of Code period has ended, I'd like to thank
both my mentors, Pedro and David, who gave me a helping hand during
this project, and especially Konrad Jankowski, who found time to answer
my questions and help me too. Though GSoC is over, I'd like to stay
with the FreeBSD project. First of all, I want to finish and polish
this project: I don't think it's really finished, especially its
testing part, though it seems that the new collation algorithm can
already be used. Then I'd like to work on other parts of the project,
especially the internationalization parts. I'd also like to improve my
own library, qc, to provide a rich API for *BSD and POSIX systems,
since I acutely feel the lack of such an API. If it is possible to stay
with the project, I'd be very happy to do so. :-)

P.S. Does anyone know how to get a diff for my branch only (i.e. for
my part of the repository)? svn diff -r $FIRST:$LAST seems to include
everything that all of FreeBSD's GSoC students have done, so I need
some other command. Thanks for your help!

[0] https://wiki.freebsd.org/SummerOfCode2014/Unicode
[1] https://socsvn.freebsd.org/socsvn/soc2014/ghostmansd

--
With best regards,
Dmitry Selyutin
_______________________________________________
[hidden email] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-i18n
To unsubscribe, send any mail to "[hidden email]"

Re: Report #9: Unicode support

Baptiste Daroussin-2
First, thank you very much for your work on this subject; it is highly needed.

Concerning the db format, have you thought about using the new NetBSD constant
database format?

It has a simple API that is much easier to use, the db format is endian safe, and
the final file is smaller than the equivalent in bdb format.

Lots of areas of FreeBSD could benefit from using this cdb format as well, imho.

regards,
Bapt

Re: Report #9: Unicode support

Pedro Giffuni-4
Hi Baptiste;

While here, let me congratulate Dmitry. The Unicode Collation Algorithm is
not something easy/fun to work with.

Indeed both David and Konrad suggested it (or tinycdb). The reason for
going bdb was that we had time constraints and bdb is already in libc.

FWIW, Nexenta kindly re-licensed localedef [1] and their collation support
in Illumos, which basically implements its own very efficient format. We
ended up reusing the tools that libc already has so that we could better
focus on the collation part.

Changing it to use NetBSD's cdb support [2] shouldn't be difficult.

As Dmitry noted, there are still details to work out, and we have to run tests
and get the code reviewed, but all in all I am very satisfied with the
progress made in this GSoC.

Best regards,

Pedro.

[1] https://github.com/Nexenta/illumos-nexenta/tree/republish-localedef
[2] http://cvsweb.netbsd.org/bsdweb.cgi/src/lib/libc/cdb/


Re: Report #9: Unicode support

Dmitry Selyutin
Hi, Pedro, Baptiste,

First of all, thanks for your congratulations and kind words! The
project was really harder than anything I've ever dealt with in my
life, but at the same time it was the most interesting one. :-) And it
still is! ;-)

> That is not really uncommon :)
Well, so I can leave it as it is. :-)

> The project does have access to sparc64 machines so if you have some
> self-contained test we can run it for you or we can test it as a routine libc
> test after committing.
Hopefully I can finish it today or in the next two days.

> You never answered my question concerning the fallback options.
Really? I thought that I answered. :-D Well, I'll try to explain
again. DUCET seems to be a somewhat obsolete collation table. In
principle it can be used more or less successfully with real languages,
but in real-world use it is completely unusable, so ICU and others use
the CLDR collation table, which supports more levels. I started with
DUCET since there was much more information about it, but then I found
that it doesn't fit well, so I switched to CLDR. We still have the
DUCET table somewhere in our revisions though; it may still be useful
as a fallback option, so I can restore it if you want.

> Changing it to use the NetBSD's cdb support[2] shouldn't be difficult.
Well, I think I'll do it right after my exams. AFAIK bdb is deprecated
on Linux (though it can still be used as bdb46 or something similar). I
don't know why they did that; it would be great if we could use a tool
that works on different platforms without modifications and tons of
conditional #defines and #undefs.

> It has simple API way easier to use, the db format is endian safe and final file
> is smaller than equivalent in bdb format.
It sounds great!

> I do want to encourage you to go to EuroBSDCon 2014 in Sofia. The
> FreeBSD Foundation will be allocating funds for students that want to go.
> I won’t be there (I am a bit far away) but David and other developers will
> likely be.
Well, that depends on whether I pass my exams for the postgraduate
course or not. I'd really like to listen to more experienced
developers and maybe even talk to other people about the work I did,
to better understand the community's opinions.

--
With best regards,
Dmitry Selyutin

Re: Report #9: Unicode support

Dmitry Selyutin
I've just seen EuroBSDCon's calendar page and it seems that it is
no longer possible to join (i.e. I missed the application deadline).[0]
Well, maybe next year? :-)

--
With best regards,
Dmitry Selyutin

Re: Report #9: Unicode support

Pedro Giffuni-4

On 08/27/14 05:51, Dmitry Selyutin wrote:

> ...
>>> You never answered my question concerning the fallback options.
>> Really? I thought that I answered. :-D Well, I'll try to explain
>> again. DUCET seems to be a somewhat obsolete collation table. In
>> principle it can be used more or less successfully with real languages,
>> but in real-world use it is completely unusable, so ICU and others use
>> the CLDR collation table, which supports more levels. I started with
>> DUCET since there was much more information about it, but then I found
>> that it doesn't fit well, so I switched to CLDR. We still have the
>> DUCET table somewhere in our revisions though; it may still be useful
>> as a fallback option, so I can restore it if you want.


I don't see DUCET ever being used, but we are setting the old
algorithm as a fallback for CLDR.

I was just wondering how DUCET compares to the existing
algorithm. Given that DUCET is in the standard and that you
already implemented it, I thought it would be a better fallback
than the old code. It's your call though.

Pedro.