initialization problem w/ thread-specific .tbss data on i386

Phil Shafer
I have a problem reported with libxo-based applications running
under FreeBSD-11-stable on i386 boxes that I think is related
to rtld:

When I breakpoint on main() and dump the contents of my uninitialized
thread-specific variable, it has not been initialized to zeroes.

I don't see this problem on 64-bit systems, only on i386 ones.

When I look at the rtld code, it appears to memset the .tbss to
zero (/usr/src/libexec/rtld-elf/rtld.c:allocate_tls) in the
non-arch-specific code so the arch shouldn't matter, but something
is not working right.
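
For reference, the expected behaviour can be sketched in a few lines of C. This is an illustration only, not the rtld source: demo_init_tls_block() and its parameters are hypothetical names, and it assumes the usual ELF TLS layout, where the initialized image (.tdata) is copied from a template and the remainder of the block (.tbss) is zero-filled.

```c
#include <stddef.h>
#include <string.h>

/*
 * Illustrative only: NOT the rtld source.  Fill one thread's TLS block for
 * a module whose initialized image (.tdata) is init_size bytes and whose
 * total TLS size (.tdata + .tbss) is total_size bytes.
 * Precondition: init_size <= total_size.
 */
void
demo_init_tls_block(void *block, const void *init_image,
    size_t init_size, size_t total_size)
{
    unsigned char *p = block;

    /* Initialized thread-local data is copied from the template image. */
    memcpy(p, init_image, init_size);
    /* Everything past the image is .tbss and must read as zero. */
    memset(p + init_size, 0, total_size - init_size);
}
```

If either size handed to a routine like this is wrong for a given module, the tail of the block keeps whatever was in memory before, which would look much like the dump further down.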

So I'm looking for a helpful clue, such as how to debug rtld to see
why this isn't being zeroed.  I thought I'd use:

    gdb /libexec/ld-elf.so.1
    run /usr/bin/uptime

but this doesn't work for me (SEGV with a call stack that doesn't
make sense).

For this instance, the workaround is to initialize the contents
of xo_default_handle to zero so it's not in the .tbss, but I'd like
to understand what's failing.  In truth, I have a hard time
blaming rtld, even though this issue is an obscure intersection
of weird things (.tbss on i386).  Perhaps something is wrong with
how the library is built, or similar.  But given that it's not zeroed
when main() gets control, something's clearly broken.

Details follow:

I declare my variable as:

    #define THREAD_LOCAL(_x) __thread _x
    ...
    static THREAD_LOCAL(xo_handle_t) xo_default_handle;
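
A standalone check of the same property might look like the sketch below. It does not use libxo: demo_handle_t is a hypothetical 328-byte stand-in for xo_handle_t, and demo_count_nonzero() is an illustrative helper.

```c
#include <stddef.h>

/* Hypothetical stand-in for xo_handle_t: 328 bytes, the size reported below. */
typedef struct demo_handle {
    unsigned char dh_bytes[328];
} demo_handle_t;

/* No initializer, so the compiler is expected to place this in .tbss. */
static __thread demo_handle_t demo_default_handle;

/* Return the number of nonzero bytes in the calling thread's copy. */
size_t
demo_count_nonzero(void)
{
    const unsigned char *p = (const unsigned char *)&demo_default_handle;
    size_t i, count = 0;

    for (i = 0; i < sizeof(demo_default_handle); i++)
        if (p[i] != 0)
            count++;
    return count;
}
```

Calling demo_count_nonzero() first thing in main() should return 0 on a correctly working system; a nonzero count would reproduce the symptom without involving libxo.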

To work around gdb's inability to show thread-local variables ("Cannot
find thread-local variables on this target"), I made the following
change to the sources:

    --- contrib/libxo/libxo/libxo.c.save    2018-05-04 17:26:29.079500000 -0400
    +++ contrib/libxo/libxo/libxo.c 2018-05-04 17:28:06.570875000 -0400
    @@ -8349,3 +8349,11 @@
         xop->xo_style = XO_STYLE_ENCODER;
         xop->xo_encoder = encoder;
     }
    +
    +void xo_print_handle (void);
    +void
    +xo_print_handle (void)
    +{
    +    fprintf(stderr, "xo_default_handle: %p %zu\n",
    +            (void *)&xo_default_handle, sizeof(xo_handle_t));
    +}

When I run the failing command (uptime) under gdb and breakpoint
on main, my thread-local variable is not set to zeroes:

    % gdb uptime
    GNU gdb 6.1.1 [FreeBSD]
    ...
    This GDB was configured as "i386-marcel-freebsd"...
    (gdb) b main
    Breakpoint 1 at 0x8049be5: file /usr/src/usr.bin/w/w.c, line 145.
    (gdb) run
    Starting program: /usr/home/phil/work/lib/uptime

    Breakpoint 1, main (argc=1, argv=0xbfbfe60c) at /usr/src/usr.bin/w/w.c:145
    145             (void)setlocale(LC_ALL, "");
    Current language:  auto; currently minimal
    (gdb) call xo_print_handle()
    xo_default_handle: 0x2806aff0 328
    $1 = 34
    (gdb) x/82x 0x2806aff0
    0x2806aff0:     0x00000000      0x00000000      0x00000000      0x280601ef
    0x2806b000:     0x2806b010      0x2806a200      0x00000001      0x280601ef
    0x2806b010:     0x2806b020      0x2806a400      0x0000005d      0x280601ef
    0x2806b020:     0x2806b030      0x2806a600      0x000000a1      0x280601ef
    0x2806b030:     0x2806b040      0x2806a800      0x00000147      0x280601ef
    0x2806b040:     0x00000000      0x2806aa00      0x00000164      0x280601ef
    0x2806b050:     0x2806c000      0x00000000      0x28065e70      0x280601ef
    0x2806b060:     0x2806b070      0x2806ac00      0x00000421      0x280601ef
    0x2806b070:     0x00000000      0x2806aa00      0x0000042d      0x280601ef
    0x2806b080:     0x00000000      0x2806aa00      0x000001ff      0x280601ef
    0x2806b090:     0x2806b0a0      0x2806a800      0x00000976      0x280601ef
    0x2806b0a0:     0x00000000      0x2806aa00      0x00000983      0x280601ef
    0x2806b0b0:     0x00000000      0x2806aa00      0x00000a18      0x280601ef
    0x2806b0c0:     0x00000000      0x2806aa00      0x00000571      0x280601ef
    0x2806b0d0:     0x2806b0e0      0x2806a000      0x00000000      0x280601ef
    0x2806b0e0:     0x2806b0f0      0x2806a200      0x00000000      0x280601ef
    0x2806b0f0:     0x2806b100      0x2806a400      0x00000000      0x280601ef
    0x2806b100:     0x2806b110      0x2806a600      0x00000000      0x280601ef
    0x2806b110:     0x2806b120      0x2806a800      0x00000000      0x280601ef
    0x2806b120:     0x2806b130      0x2806aa00      0x00000000      0x280601ef
    0x2806b130:     0x00000000      0x2806ac00
    (gdb)

objdump shows the lib does have a .tbss:

     14 .tbss         00000658  000181f8  000181f8  000171f8  2**3
                      ALLOC, THREAD_LOCAL

Thanks,
 Phil
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-arch
To unsubscribe, send any mail to "[hidden email]"

Re: initialization problem w/ thread-specific .tbss data on i386

Dimitry Andric-4
On 7 May 2018, at 23:27, Phil Shafer <[hidden email]> wrote:
>
> I have a problem reported with libxo-based applications running
> under FreeBSD-11-stable on i386 boxes that I think is related
> to rtld:
>
> When I breakpoint on main() and dump the contents of my uninitialized
> thread-specific variable, it has not been initialized to zeroes.

Aha, this might very well be the root cause for
https://bugs.freebsd.org/227552, could you please have a look at that?


> I don't see this problem on 64-bit systems, only on i386 ones.
>
> When I look at the rtld code, it appears to memset the .tbss to
> zero (/usr/src/libexec/rtld-elf/rtld.c:allocate_tls) in the
> non-arch-specific code so the arch shouldn't matter, but something
> is not working right.
>
> So I'm looking for a helpful clue, such as how to debug rtld to see
> why this isn't being zeroed.

As discussed in PR227552, it seems that the update to clang 6.0 in
stable/11 is the point at which some programs start crashing, so either
it's some bug in clang's TLS handling, or some subtle change in the
resulting executables is now tripping up rtld.  (I've added John and
Kostik on CC, as they know much more about rtld than me.)

-Dimitry



Re: initialization problem w/ thread-specific .tbss data on i386

Phil Shafer
Dimitry Andric writes:
>Aha, this might very well be the root cause for
>https://bugs.freebsd.org/227552, could you please have a look at that?

Yes, this looks like the same issue.  I've added my email as a
comment to that PR.

Given the size of xo_handle_t (328 bytes), I'm not sure this is an
alignment issue.  Most of the 328 bytes are polluted.
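
As a rough sanity check, the address printed by xo_print_handle() can be compared against the 2**3 alignment that objdump reports for .tbss. The helper below is a hypothetical sketch, not something from libxo or rtld:

```c
#include <stdint.h>

/* Return 1 if addr is a multiple of 2**log2_align, else 0. */
int
demo_is_aligned(uintptr_t addr, unsigned log2_align)
{
    uintptr_t mask = ((uintptr_t)1 << log2_align) - 1;

    return (addr & mask) == 0;
}
```

For the address in the dump, demo_is_aligned(0x2806aff0, 3) is true, so the variable itself sits on an 8-byte boundary.  That says nothing about whether the TLS block as a whole was sized or placed correctly.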

Thanks,
 Phil

Re: initialization problem w/ thread-specific .tbss data on i386

Konstantin Belousov
In reply to this post by Phil Shafer
On Mon, May 07, 2018 at 05:27:03PM -0400, Phil Shafer wrote:

> I have a problem reported with libxo-based applications running
> under FreeBSD-11-stable on i386 boxes that I think is related
> to rtld:
>
> When I breakpoint on main() and dump the contents of my uninitialized
> thread-specific variable, it has not been initialized to zeroes.
>
> I don't see this problem on 64-bit systems, only on i386 ones.
>
> When I look at the rtld code, it appears to memset the .tbss to
> zero (/usr/src/libexec/rtld-elf/rtld.c:allocate_tls) in the
> non-arch-specific code so the arch shouldn't matter, but something
> is not working right.
>
> So I'm looking for a helpful clue, such as how to debug rtld to see
> why this isn't being zeroed.  I thought I'd use:
>
>     gdb /libexec/ld-elf.so.1
>     run /usr/bin/uptime
>
> but this doesn't work for me (SEGV with a call stack that doesn't
> make sense).
You need to supply argv[0].  Read ld-elf.so.1(1); it has a whole
section on direct execution mode.

> objdump shows the lib does have a .tbss:
>
>      14 .tbss         00000658  000181f8  000181f8  000171f8  2**3
>                       ALLOC, THREAD_LOCAL
Can you build the library and the binary with clang 5, or even gcc, and
see how it ends up?

Also, try to link in libpthread.

Re: initialization problem w/ thread-specific .tbss data on i386

Phil Shafer
Konstantin Belousov writes:
>Also, try to link in libpthread.

This was interesting, not that I'm sure what it means:

    % env LD_PRELOAD=/usr/lib/libpthread.so /tmp/uptime
     5:26PM  up 4 days,  9:22, 3 users, load averages: 0.55, 0.52, 0.51

(where /tmp/uptime is a symlink to /usr/obj/.../w/w).
(see https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=227552)

Does this mean that the use of __thread requires -lpthread?  My
understanding was that the startup code handled thread-specific
data for the main thread of execution.

Thanks,
 Phil

Re: initialization problem w/ thread-specific .tbss data on i386

Konstantin Belousov
On Tue, May 08, 2018 at 05:31:57PM -0400, Phil Shafer wrote:

> Konstantin Belousov writes:
> >Also, try to link in libpthread.
>
> This was interesting, not that I'm sure what it means:
>
>     % env LD_PRELOAD=/usr/lib/libpthread.so /tmp/uptime
>      5:26PM  up 4 days,  9:22, 3 users, load averages: 0.55, 0.52, 0.51
>
> (where /tmp/uptime is a symlink to /usr/obj/.../w/w).
> (see https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=227552)
>
> Does this mean that the use of __thread requires -lpthread?  My
> understanding was that the startup code handled thread-specific
> data for the main thread of execution.

No, try to compile libc with e.g. clang 5 and see if it also fixes
libxo.

Re: initialization problem w/ thread-specific .tbss data on i386

Phil Shafer
Konstantin Belousov writes:
>No, try to compile libc with e.g. clang 5 and see if it also fixes
>libxo.

See PR 227552 comment 20, where Dimitry Andric says:

    Hmm, now that we've identified .tbss as a contributor to the
    problem, it looks relevant that the r331838 version of libxo.so.0
    (compiled with the clang 6.0.0 update) does NOT have a "section
    to segment mapping" for .tbss:
    ...

Before clang-6, it was working fine.
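
The run-time view of the same information can be inspected with dl_iterate_phdr(3), which walks the program headers of every loaded object.  The sketch below prints each PT_TLS segment; the demo_* names are hypothetical, and it shows what the loader sees at run time rather than the readelf section-to-segment mapping quoted above.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE     /* glibc needs this for dl_iterate_phdr(); harmless elsewhere */
#endif
#include <link.h>
#include <stddef.h>
#include <stdio.h>

/* Uninitialized TLS in this object, so the program itself has a PT_TLS segment. */
static __thread volatile int demo_tls_probe;

/* Callback: print each PT_TLS program header of one loaded object. */
static int
demo_tls_cb(struct dl_phdr_info *info, size_t size, void *data)
{
    int *found = data;
    int i;

    (void)size;
    for (i = 0; i < info->dlpi_phnum; i++) {
        if (info->dlpi_phdr[i].p_type != PT_TLS)
            continue;
        /* p_filesz covers .tdata; p_memsz - p_filesz is the .tbss part. */
        fprintf(stderr, "%s: PT_TLS filesz=%lu memsz=%lu align=%lu\n",
            (info->dlpi_name != NULL && info->dlpi_name[0] != '\0') ?
            info->dlpi_name : "(main program)",
            (unsigned long)info->dlpi_phdr[i].p_filesz,
            (unsigned long)info->dlpi_phdr[i].p_memsz,
            (unsigned long)info->dlpi_phdr[i].p_align);
        (*found)++;
    }
    return 0;   /* zero means: keep iterating */
}

/* Walk all loaded objects; return how many PT_TLS segments were seen. */
int
demo_report_tls_segments(void)
{
    int found = 0;

    (void)demo_tls_probe;   /* volatile read keeps the TLS variable alive */
    dl_iterate_phdr(demo_tls_cb, &found);
    return found;
}
```

Calling demo_report_tls_segments() from a program linked against the library in question lists one line per object that carries TLS.  A p_memsz that does not cover the object's .tbss would be consistent with the missing mapping, though this check alone does not establish the cause.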

Thanks,
 Phil