[Bug 215393] devel/boost-libs: bad encoding conversion

bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=215393

            Bug ID: 215393
           Summary: devel/boost-libs: bad encoding conversion
           Product: Ports & Packages
           Version: Latest
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: Individual Port(s)
          Assignee: [hidden email]
          Reporter: [hidden email]
             Flags: maintainer-feedback?([hidden email])

Created attachment 178070
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=178070&action=edit
test case

By default, boost-libs is compiled with ICONV and ICU support. In my tests this
configuration gives bad results when converting from UTF-8 to an arbitrary
codepage, compared to other OSes.

I've attached screenshots and the test program I use. The test program takes a
predefined UTF-8 string and uses a Python script to convert it from UTF-8 into
the Latin-5 and TIS620.2533-0 (Thai) codepages, then uses Boost to do the same.
To compile it, simply type 'make' (or 'bmake' on Linux); the Boost libraries
and python2 are required.

I've run this program on OpenBSD-current, Debian stable and FreeBSD 11;
screenshots are attached.

If I compile boost-libs with ICU *only*, I get better results: they are the
same as on the other two OSes, except for the TIS620.2533-0 codepage, which
raises an exception about an unknown encoding.

Should such an inconsistency between the locale support in FreeBSD's boost-libs
and other OSes be expected?
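
For readers without the attachment, the Python half of such a test might look
like the sketch below. It is not the attached encode.py. It assumes the script
drops characters that the target codepage cannot represent (errors="ignore"),
which is consistent with the Python output quoted in comment 10. It is written
for Python 3 and uses Python's codec names iso-8859-9 (Latin-5) and tis-620 in
place of the TIS620.2533-0 spelling. reference_conversion is a hypothetical
name, and the code points are taken from the UTF-8 dump in comment 10.

```python
# Sketch only: not the attached encode.py. Assumes unencodable characters
# are dropped (errors="ignore"); reference_conversion is a hypothetical name.

# Test string from the thread, written with escapes to avoid display issues.
TEST_STRING = (
    "Test\u00e9\u00e4\u00f6\u00f2"   # Latin letters with diacritics
    "\u0414\u0398\u011d"             # Cyrillic De, Greek Theta, g-circumflex
    "\u05d3\u0635\u0137\u045b"       # Hebrew dalet, Arabic sad, k-cedilla, tshe
    "\u0e5b\uff88\u0130"             # Thai khomut, halfwidth katakana ne, dotted I
)


def reference_conversion(text, codec):
    """Encode text into codec, silently dropping unrepresentable characters."""
    return text.encode(codec, errors="ignore")


if __name__ == "__main__":
    for codec in ("iso-8859-9", "tis-620"):
        print(codec, reference_conversion(TEST_STRING, codec))
```

Under these assumptions the two results are the bytes Test\351\344\366\362\335
and Test\373 in octal notation, matching the "(python)" lines in comment 10.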

--
You are receiving this mail because:
You are the assignee for the bug.
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-office
To unsubscribe, send any mail to "[hidden email]"

maintainer-feedback requested: [Bug 215393] devel/boost-libs: bad encoding conversion

bugzilla-noreply
[hidden email] has reassigned Bugzilla Automation <[hidden email]>'s
request for maintainer-feedback to [hidden email]:
Bug 215393: devel/boost-libs: bad encoding conversion
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=215393

--- Comment #1 from [hidden email] ---
Created attachment 178071
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=178071&action=edit
linux debian behaviour

--- Comment #2 from [hidden email] ---
Created attachment 178072
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=178072&action=edit
openbsd behaviour

--- Comment #3 from [hidden email] ---
Created attachment 178073
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=178073&action=edit
freebsd default behaviour (iconv and icu, package installation)

--- Comment #4 from [hidden email] ---
Created attachment 178074
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=178074&action=edit
freebsd behaviour (icu only)

--- Comment #5 from Jan Beich (mail not working) <[hidden email]> ---
Comment on attachment 178070
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=178070
test case

utf8/encode.py:
> # -*- coding: utf-8 -*-
[...]
> test_string = u'TestéäöòДΘĝصדķћ๛ネİ'

Can you make sure the string is valid Thai in UTF-8 before testing? Try pasting
anything from https://th.wikipedia.org/

--- Comment #6 from [hidden email] ---
I don't understand. Should I repeat the test without any English or other
foreign letters, using only the Thai alphabet?

I expected to be able to convert from any codepage into any other without
checking whether the string is valid for the destination encoding (at the risk
of losing characters, of course).

At least '๛' inside the string is valid Thai.
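
To see which characters are at risk in a given conversion, a small check like
the following can help. It is a sketch using Python 3 codecs, not part of the
attached test case, and unrepresentable is a hypothetical helper name.

```python
# Sketch only: lists the characters a target codepage cannot represent.
# unrepresentable() is a hypothetical helper, not part of the attached test.

def unrepresentable(text, codec):
    """Return the characters of text that codec cannot encode, in order."""
    lost = []
    for ch in text:
        try:
            ch.encode(codec)          # strict mode raises on failure
        except UnicodeEncodeError:
            lost.append(ch)
    return lost


if __name__ == "__main__":
    sample = "Test\u00e9\u0e5b"       # ASCII, e-acute, Thai khomut
    print(unrepresentable(sample, "tis-620"))      # e-acute is not in TIS-620
    print(unrepresentable(sample, "iso-8859-9"))   # khomut is not in Latin-5
```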

Jan Beich (mail not working) <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|devel/boost-libs: bad       |devel/boost-libs: bad
                   |encoding conversion         |encoding conversion with
                   |                            |base iconv()

--- Comment #7 from Jan Beich (mail not working) <[hidden email]> ---
Never mind, I misunderstood comment 0. FreeBSD 9.3 uses GNU libiconv and doesn't
appear to be affected.

--- Comment #8 from [hidden email] ---
I can confirm that 9.3 isn't affected.

Can we change default options of a boost-libs package to be ICU only?

--- Comment #9 from [hidden email] ---
ping?

--- Comment #10 from Jan Beich <[hidden email]> ---
Created attachment 186691
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=186691&action=edit
Prefer ICU (workaround)

ICU doesn't support TIS620.2533-0, so it'd fall back to iconv().

$ ./utf8 | vis -o
UTF-8: Test\303\251\303\244\303\266\303\262\320\224\316\230\304\235\327\223\330\265\304\267\321\233\340\271\233\357\276\210\304\260
iso-8859-9 (boost): Test\351\344\366\362\335
iso-8859-9 (python): Test\351\344\366\362\335
TIS620.2533-0 (boost): Test'e"a"o`o??^g??k?\373?I
TIS620.2533-0 (python): Test\373
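
The octal notation in this listing can be turned back into bytes for
inspection. The sketch below is a limited stand-in for unvis(1): it assumes the
input contains only three-digit \ddd escapes and plain ASCII, and does not
handle the other vis(3) escape forms. unvis_octal is a hypothetical name.

```python
import re

# Sketch only: a limited stand-in for unvis(1). Handles three-digit octal
# escapes (\000 to \377) and plain ASCII; other vis(3) forms are not covered.

_OCTAL_ESCAPE = re.compile(rb"\\([0-3][0-7]{2})")


def unvis_octal(line):
    """Convert a line of vis -o style output back into raw bytes."""
    raw = line.encode("ascii")
    return _OCTAL_ESCAPE.sub(lambda m: bytes([int(m.group(1), 8)]), raw)


if __name__ == "__main__":
    latin5 = unvis_octal(r"Test\351\344\366\362\335")
    print(latin5.decode("iso-8859-9"))   # text that survived in Latin-5
    thai = unvis_octal(r"Test\373")
    print(thai.decode("tis-620"))        # text that survived in TIS-620
```

Decoding the two "(python)" lines this way shows that only the characters
present in each target codepage survived the conversion.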

Jan Beich <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[hidden email]

--- Comment #11 from Jan Beich <[hidden email]> ---
Tijl, do you think libc and/or Boost can be fixed to skip invalid
sequences?

$ echo 'Test\303\251\303\244\303\266\303\262\320\224\316\230\304\235\327\223\330\265\304\267\321\233\340\271\233\357\276\210\304\260' \
    | unvis | /usr/bin/iconv -t iso-8859-9 2>/dev/null | vis -o
Test\351\344\366\362??^g??k???\335

$ echo 'Test\303\251\303\244\303\266\303\262\320\224\316\230\304\235\327\223\330\265\304\267\321\233\340\271\233\357\276\210\304\260' \
    | unvis | /usr/local/bin/iconv -t iso-8859-9 2>/dev/null | vis -o
Test\351\344\366\362

--- Comment #12 from Tijl Coosemans <[hidden email]> ---
Created attachment 186721
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=186721&action=edit
boost.locale patch

(In reply to Jan Beich from comment #11)
If the input buffer is valid UTF-8 then POSIX says this:
> If iconv() encounters a character in the input buffer that is valid, but for
> which an identical character does not exist in the target codeset, iconv()
> shall perform an implementation-defined conversion on this character.

By default our iconv either replaces such characters with "?" or transliterates
them (e.g. "ĝ" becomes "^g").  GNU iconv returns an error in this case, which I
believe is not POSIX compliant.

The problem reported in this bug is in Boost itself.  Their use of
__ICONV_F_HIDE_INVALID on FreeBSD does not give the desired behaviour.  Please
try the attached patch.  Make sure your ports tree is at least r450634.
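
The two behaviours described above can be modelled with Python 3 codecs. This
is only an illustration: it does not call iconv(3), it does not model
transliteration (such as "ĝ" becoming "^g"), and both function names are
hypothetical.

```python
# Sketch only: models "substitute" versus "report an error" with Python
# codecs. It does not call iconv(3); both function names are hypothetical.

def convert_substituting(text, codec):
    """Replace each unrepresentable character with "?" (no transliteration)."""
    return text.encode(codec, errors="replace")


def convert_stopping(text, codec):
    """Return (converted_prefix, error_index); error_index is None on success."""
    try:
        return text.encode(codec), None
    except UnicodeEncodeError as exc:
        # Keep what was converted before the first unrepresentable character.
        return text[:exc.start].encode(codec), exc.start


if __name__ == "__main__":
    sample = "Test\u00e9\u00e4\u00f6\u00f2\u0414\u0398\u0130"
    print(convert_substituting(sample, "iso-8859-9"))
    print(convert_stopping(sample, "iso-8859-9"))
```

With the substituting model the caller gets a complete result and no signal
that anything was lost. With the stopping model the caller learns where the
conversion failed and can decide what to do next.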

Tijl Coosemans <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #186721|0                           |1
        is obsolete|                            |

--- Comment #13 from Tijl Coosemans <[hidden email]> ---
Created attachment 186738
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=186738&action=edit
boost.locale patch2

Cleaned up version that is probably more acceptable upstream.

--- Comment #15 from [hidden email] ---
A commit references this bug:

Author: tijl
Date: Wed Jan 16 20:36:48 UTC 2019
New revision: 490518
URL: https://svnweb.freebsd.org/changeset/ports/490518

Log:
  Fix use of iconv in Boost Locale.  On FreeBSD it used __ICONV_F_HIDE_INVALID
  which hides invalid sequences, but what Boost really wants is that iconv
  returns an error on invalid sequences like GNU libiconv does by default.
  On FreeBSD ICONV_SET_ILSEQ_INVALID can be used for this.  It has to be set
  via iconvctl.

  PR:           215393

Changes:
  head/devel/boost-libs/Makefile
  head/devel/boost-libs/files/patch-libs_locale_src_encoding_iconv_codepage.ipp
  head/devel/boost-libs/files/patch-libs_locale_src_posix_codecvt.cpp
  head/devel/boost-libs/files/patch-libs_locale_src_util_iconv.hpp
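
The commit log's point, that a caller which wants to skip bad input needs the
converter to report an error, can be illustrated as follows. This is a Python 3
sketch of the general idea, not Boost.Locale's C++ code and not a use of
iconvctl; convert_skipping is a hypothetical name.

```python
# Sketch only: a caller-side "skip" loop built on a converter that reports
# errors. This is Python, not Boost.Locale's C++ code; convert_skipping is a
# hypothetical name.

def convert_skipping(text, codec):
    """Convert text, dropping every character the converter rejects."""
    out = bytearray()
    pos = 0
    while pos < len(text):
        try:
            out += text[pos:].encode(codec)   # strict: raises on a bad character
            break
        except UnicodeEncodeError as exc:
            # exc.start and exc.end are relative to the slice passed to encode().
            out += text[pos:pos + exc.start].encode(codec)
            pos += exc.end                    # resume after the rejected run
    return bytes(out)


if __name__ == "__main__":
    sample = "Test\u00e9\u0414\u0398\u011d\u0e5b\u0130"
    print(convert_skipping(sample, "iso-8859-9"))
    print(convert_skipping(sample, "tis-620"))
```

If the converter hid the failure instead of raising, the loop would have
nothing to react to and could not drop the offending characters.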

Tijl Coosemans <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|New                         |Closed
