Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311


Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311

freebsd-ppc mailing list
[I'm only claiming the new jemalloc is involved and that
reverting avoids the problem.]

I've been reporting to some lists problems with:

dhclient
sendmail
rpcbind
mountd
nfsd

getting SIGSEGV (signal 11) crashes and some core
dumps on the old 2-socket (1 core per socket) 32-bit
PowerMac G4 running head -r360311.

Mikaël Urankar sent a note suggesting that I try
testing reverting head -r360233 for my head -r360311
context. He got it right . . .


Context:

The problem was noticed by an inability to have
other machines do a:

mount -onoatime,soft OLDPOWERMAC-LOCAL-IP:/... /mnt

sort of operation and have it succeed. By contrast, on
the old PowerMac G4 I could initiate mounts against
other machines just fine.

I do not see any such problems on any of (all based
on head -r360311):

powerpc64 (old PowerMac G5 2-sockets with 2 cores each)
armv7 (OrangePi+ 2ed)
aarch64 (Rock64, RPi4, RPi3,
         OverDrive 1000,
         Macchiatobin Double Shot)
amd64 (ThreadRipper 1950X)

So I expect something 32-bit powerpc specific
is somehow involved, even if jemalloc is only
using whatever it is.

(A kyua run with a debug kernel did not find other
unexpected signal 11 sources on the 32-bit PowerMac
compared to past kyua runs, at least that I noticed.
There were a few lock order reversals that I do not
know if they are expected or known-safe or not.
I've reported those reversals to the lists as well.)


Recent experiments based on the suggestion:

Doing the buildworld, buildkernel and installing just
the new kernel and rebooting made no difference.

But then installing the new world and rebooting did
make things work again: I no longer get core files
for the likes of (old cores from before the update):

# find / -name "*.core" -print
/var/spool/clientmqueue/sendmail.core
/rpcbind.core
/mountd.core
/nfsd.core

Nor do I see the various notices for sendmail
signal 11's that did not leave behind a core file
--or for dhclient (no core file left behind).
And I can mount the old PowerMac's drive from
other machines just fine.


Other notes:

I do not actively use sendmail, but it was left
to do its default things, partially to test if
such default things are working. Unfortunately,
PowerMacs have a problematic status under
FreeBSD, and my context includes my historical
experiments with avoiding various problems.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)


Re[2]: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311

nonameless


 
 --- Original message ---
 From: "Mark Millard" <[hidden email]>
 Date: 3 May 2020, 04:47:14
 


> [I'm only claiming the new jemalloc is involved and that
> reverting avoids the problem.]
>
> . . .

Hi Mark,

It should be fixed, but not by reverting to the old version. We can't stay stuck on an old version because of ancient hardware. I think upstream is not interested in supporting such hardware, so it will have to be patched locally.

Thanks.

Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311

Brandon Bergren-3


On Sun, May 3, 2020, at 9:38 AM, Mark Millard via freebsd-ppc wrote:
>
> Observing and reporting the reverting result is an initial
> part of problem isolation. I made no request for FreeBSD
> to give up on using the updated jemalloc. (Unfortunately,
> I'm not sure what a good next step of problem isolation
> might be for the dual-socket PowerMac G4 context.)

I appreciate this testing btw. The only dual-socket G4 I have (my xserve g4) does not have the second socket populated, so I am currently unable to test two-socket ppc32.

> Other than reverting, no patch is known for the issue at
> this point. More problem isolation is needed first.
>
> While I do not have access, https://wiki.freebsd.org/powerpc
> lists more modern 32-bit powerpc hardware as supported:
> MPC85XX evaluation boards and AmigaOne A1222 (powerpcspe).
> (The AmigaOne A1222 seems to be dual-core/single-socket.)

jhibbits has an A1222 that is used as an actual primary desktop, and I will hopefully have one at the end of the year. And I have an RB800 that I use for testing.

powerpcspe is really a different beast than aim32 though. I have been mainly working on aim32 on g4 laptops, although I do have an xserve.

>
> So folks with access to one of those may want to see
> if they also see the problem(s) with head -r360233 or
> later.

Frankly, I wouldn't be surprised if this continues to be down to the timebase skew somehow. I know that jemalloc tends to be sensitive to time problems.

>
> Another interesting context to test could be single-socket
> with just one core. (I might be able to do that on another
> old PowerMac, booting the same media after moving the
> media.)

That's my primary aim32 testing platform. I have a stack of g4 laptops that I test on, and a magically working usb stick (ADATA C008 / 8GB model; for some reason it just works, I've never seen another stick actually work).

>
> If I understand right, the most common 32-bit powerpc
> tier 2 hardware platforms may still be old PowerMac's.
> They are considered supported and "mature", instead of
> just "stable". See https://wiki.freebsd.org/powerpc .
> However, the reality is that there are various problems
> for old PowerMacs (32-bit and 64-bit, at least when
> there is more than one socket present). The wiki page
> does not hint at such. (I'm not sure about
> single socket/multi-core PowerMacs: no access to
> such.)

Yes, neither I nor jhibbits have multiple socket g4 hardware at the moment, and I additionally don't have multiple socket g5 either.

>
> It is certainly possible for some problem to happen
> that would lead to dropping the supported-status
> for some or all old 32-bit PowerMacs, even as tier 2.
> But that has not happened yet and I'd have no say in
> such a choice.

From a kernel standpoint, I for one have no intention of dropping 32 bit support in the foreseeable future. I've actually been putting more work into 32 bit than 64 bit recently.

--
  Brandon Bergren
  [hidden email]

Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311

freebsd-ppc mailing list
In reply to this post by freebsd-ppc mailing list
[At around 4AM local time dhclient got a signal 11,
despite the jemalloc revert. The other examples
have not happened.]

On 2020-May-2, at 18:46, Mark Millard <marklmi at yahoo.com> wrote:

> [I'm only claiming the new jemalloc is involved and that
> reverting avoids the problem.]
>
> . . .

Looking, I see that I got a:

pid 572 (dhclient), jid 0, uid 0: exited on signal 11 (core dumped)

notice under the reverted build. No instances
of the other examples. This is the first time a
dhclient example has produced a .core file.

gdb indicates 0x5180936c for r7 in:

lwz     r8,36(r7)

as leading to the failure. This was in
arena_dalloc_bin_locked_impl (where
arena_slab_reg_dalloc and bitmap_unset
were apparently inlined).

The chain for the example seems to be:
fork_privchld -> dispatch_imsg -> jemalloc
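
To make the faulting access easier to picture, here is a minimal C
sketch (hypothetical names and a simplified layout, not the actual
jemalloc source) of the kind of bit-group update the inlined
bitmap_unset performs. A sane regind stays inside the slab's small
bitmap; the regind of 167842154 seen in frame #1 pushes the group
pointer far past the slab, so the load of the group word faults:

#include <stddef.h>
#include <stdint.h>

typedef uint32_t bitmap_group_t;    /* 32-bit group words on 32-bit powerpc */

static void
slab_bitmap_unset(bitmap_group_t *groups, size_t regind)
{
        size_t goff = regind / 32;              /* which group word */
        bitmap_group_t *gp = &groups[goff];     /* r7 ended up pointing here */
        bitmap_group_t g = *gp;                 /* the faulting group-word load */

        g ^= (bitmap_group_t)1 << (regind % 32);  /* clear the region's bit */
        *gp = g;
}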

For reference . . .

# gdb dhclient /dhclient.core
GNU gdb (GDB) 9.1 [GDB v9.1 for FreeBSD]
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
. . .
Reading symbols from dhclient...
Reading symbols from /usr/lib/debug//sbin/dhclient.debug...
[New LWP 100089]
Core was generated by `dhclient: gem0 [priv]'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  bitmap_unset (bitmap=0x50407164, binfo=<optimized out>, bit=167842154) at /usr/powerpc32_src/contrib/jemalloc/include/jemalloc/internal/bitmap.h:341
341 /usr/powerpc32_src/contrib/jemalloc/include/jemalloc/internal/bitmap.h: No such file or directory.
(gdb) bt -full
#0  bitmap_unset (bitmap=0x50407164, binfo=<optimized out>, bit=167842154) at /usr/powerpc32_src/contrib/jemalloc/include/jemalloc/internal/bitmap.h:341
        goff = <optimized out>
        gp = 0x51809390
        propagate = <optimized out>
        g = <optimized out>
        i = <optimized out>
#1  arena_slab_reg_dalloc (slab=0x50407140, slab_data=0x50407164, ptr=0x50088b50) at jemalloc_arena.c:273
        bin_info = <optimized out>
        binind = 0
        regind = 167842154
#2  arena_dalloc_bin_locked_impl (tsdn=0x5009f018, arena=<optimized out>, slab=<optimized out>, ptr=<optimized out>, junked=<optimized out>) at jemalloc_arena.c:1540
        slab_data = <optimized out>
        binind = <optimized out>
        bin_info = <optimized out>
        bin = <optimized out>
        nfree = <optimized out>
#3  0x502916a8 in __je_arena_dalloc_bin_junked_locked (tsdn=<optimized out>, arena=<optimized out>, extent=<optimized out>, ptr=<optimized out>) at jemalloc_arena.c:1559
No locals.
#4  0x50250d2c in __je_tcache_bin_flush_small (tsd=0x5009f018, tcache=<optimized out>, tbin=0x5009f1c0, binind=<optimized out>, rem=24) at jemalloc_tcache.c:149
        ptr = <optimized out>
        i = 0
        extent = 0x50407140
        bin_arena = 0x50400380
        bin = <optimized out>
        ndeferred = 0
        merged_stats = <optimized out>
        arena = 0x50400380
        nflush = 75
        __vla_expr0 = <optimized out>
        item_extent = 0xffffd1f0
#5  0x502508a0 in __je_tcache_event_hard (tsd=<optimized out>, tcache=0x5009f108) at jemalloc_tcache.c:54
        tbin_info = <optimized out>
        binind = 7
        tbin = 0x5009f1c0
#6  0x5029a684 in __free (ptr=0x500530c0) at /usr/powerpc32_src/contrib/jemalloc/include/jemalloc/internal/rtree.h:374
        tcache = 0x5009f108
        tsd = <optimized out>
        log_var = <optimized out>
        log_var = <optimized out>
#7  0x10025994 in dispatch_imsg (ifix=<optimized out>, fd=10) at /usr/powerpc32_src/sbin/dhclient/privsep.c:215
        hdr = {code = IMSG_SCRIPT_WRITE_PARAMS, len = 3225}
        lease = {next = 0x0, expiry = 1588504529, renewal = 1588504229, rebind = 1588504454, address = {len = 4, iabuf = "\300\250\001i", '\000' <repeats 11 times>}, nextserver = {len = 4,
            iabuf = '\000' <repeats 15 times>}, server_name = 0x0, filename = 0x0, medium = 0x0, is_static = 0, is_bootp = 0, options = {{len = 0, data = 0x0}, {len = 4,
              data = 0x500530c8 "\377\377\377"}, {len = 0, data = 0x0}, {len = 4, data = 0x500530d0 "\300\250\001\001"}, {len = 0, data = 0x0}, {len = 0, data = 0x0}, {len = 4,
              data = 0x500530d8 "\300\250\001\001"}, {len = 0, data = 0x0}, {len = 0, data = 0x0}, {len = 0, data = 0x0}, {len = 0, data = 0x0}, {len = 0, data = 0x0}, {len = 0, data = 0x0}, {
              len = 0, data = 0x0}, {len = 0, data = 0x0}, {len = 20, data = 0x50055200 "hsd1.or.comcast.net."}, {len = 0, data = 0x0} <repeats 35 times>, {len = 4, data = 0x500530e0 ""}, {len = 0,
              data = 0x0}, {len = 1, data = 0x500530e8 "\005"}, {len = 4, data = 0x500530f0 "\300\250\001\001"}, {len = 0, data = 0x0} <repeats 201 times>}}
        medium_len = <optimized out>
        medium = <optimized out>
        totlen = 3225
        filename_len = <optimized out>
        filename = 0x0
        ret = <optimized out>
        buf = <optimized out>
        mtu = <optimized out>
        servername_len = <optimized out>
        servername = 0x0
        reason_len = <optimized out>
        reason = <optimized out>
        prefix_len = <optimized out>
        prefix = 0x500530c0 "new_"
        i = 0
        optlen = 0
#8  0x100189f4 in fork_privchld (fd=10, fd2=<optimized out>) at /usr/powerpc32_src/sbin/dhclient/dhclient.c:2847
        pfd = {{fd = 10, events = 1, revents = 1}}
        nfds = <optimized out>
#9  0x10017a80 in main (argc=<optimized out>, argv=<optimized out>) at /usr/powerpc32_src/sbin/dhclient/dhclient.c:505
        pipe_fd = {10, 11}
        rights = {cr_rights = {1342801412, 18446706484155777024}}
        immediate_daemon = 0
        i = 0
        ch = <optimized out>
        otherpid = 8
        pw = 0x5039b9d8
        fd = <optimized out>
        capmode = <optimized out>

(gdb) disass
Dump of assembler code for function arena_dalloc_bin_locked_impl:
   0x502916b8 <+0>: mflr    r0
   0x502916bc <+4>: stw     r0,4(r1)
   0x502916c0 <+8>: stwu    r1,-48(r1)
   0x502916c4 <+12>: stw     r30,40(r1)
   0x502916c8 <+16>: stw     r24,16(r1)
   0x502916cc <+20>: stw     r25,20(r1)
   0x502916d0 <+24>: stw     r26,24(r1)
   0x502916d4 <+28>: stw     r27,28(r1)
   0x502916d8 <+32>: stw     r28,32(r1)
   0x502916dc <+36>: stw     r29,36(r1)
   0x502916e0 <+40>: bl      0x502916e4 <arena_dalloc_bin_locked_impl+44>
   0x502916e4 <+44>: mr      r27,r3
   0x502916e8 <+48>: mflr    r30
   0x502916ec <+52>: addis   r30,r30,14
   0x502916f0 <+56>: addi    r30,r30,7788
   0x502916f4 <+60>: mr      r28,r4
   0x502916f8 <+64>: lwz     r4,5856(r30)
   0x502916fc <+68>: lwz     r3,4(r5)
   0x50291700 <+72>: mr      r29,r5
   0x50291704 <+76>: andi.   r5,r7,1
   0x50291708 <+80>: mr      r26,r6
   0x5029170c <+84>: lbz     r4,0(r4)
   0x50291710 <+88>: rlwinm  r5,r3,14,25,31
   0x50291714 <+92>: mulli   r24,r5,224
   0x50291718 <+96>: mulli   r25,r5,44
   0x5029171c <+100>: cmpwi   cr1,r4,0
   0x50291720 <+104>: cror    4*cr5+lt,4*cr1+eq,gt
   0x50291724 <+108>: bge     cr5,0x50291a2c <arena_dalloc_bin_locked_impl+884>
   0x50291728 <+112>: lwz     r4,0(r29)
   0x5029172c <+116>: lwz     r6,6036(r30)
   0x50291730 <+120>: lwz     r7,8(r29)
   0x50291734 <+124>: rlwinm  r8,r5,2,0,29
   0x50291738 <+128>: li      r9,1
   0x5029173c <+132>: add     r24,r28,r24
   0x50291740 <+136>: lwzx    r6,r6,r8
   0x50291744 <+140>: subf    r7,r7,r26
   0x50291748 <+144>: mulhwu  r6,r6,r7
   0x5029174c <+148>: rlwinm  r7,r6,29,3,29
   0x50291750 <+152>: add     r7,r29,r7
=> 0x50291754 <+156>: lwz     r8,36(r7)
   0x50291758 <+160>: clrlwi  r10,r6,27
   0x5029175c <+164>: slw     r9,r9,r10
   0x50291760 <+168>: xor     r9,r9,r8
   0x50291764 <+172>: cmplwi  r8,0
   0x50291768 <+176>: stw     r9,36(r7)
   0x5029176c <+180>: bne     0x502917e4 <arena_dalloc_bin_locked_impl+300>
   0x50291770 <+184>: lwz     r7,4408(r30)
   0x50291774 <+188>: mulli   r8,r5,44
   0x50291778 <+192>: add     r5,r7,r8
   0x5029177c <+196>: lwz     r5,16(r5)
   0x50291780 <+200>: cmplwi  r5,2
   0x50291784 <+204>: blt     0x502917e4 <arena_dalloc_bin_locked_impl+300>
. . .

(gdb) info reg
r0             0x502916a8          1344870056
r1             0xffffd1a0          4294955424
r2             0x500a6018          1342857240
r3             0x0                 0
r4             0x0                 0
r5             0x0                 0
r6             0xa01116a           167842154
r7             0x5180936c          1367380844
r8             0x0                 0
r9             0x1                 1
r10            0x1e                30
r11            0x5005d114          1342558484
r12            0x84000c00          2214595584
r13            0x0                 0
r14            0xffffd1f0          4294955504
r15            0xfffffffc          4294967292
r16            0x4a                74
r17            0x4b                75
r18            0x0                 0
r19            0x504009a0          1346374048
r20            0x0                 0
r21            0xffffd1f0          4294955504
r22            0x620               1568
r23            0x50400380          1346372480
r24            0x50400380          1346372480
r25            0x0                 0
r26            0x50088b50          1342737232
r27            0x5009f018          1342828568
r28            0x50400380          1346372480
r29            0x50407140          1346400576
r30            0x50373550          1345795408
r31            0xffffd310          4294955792
pc             0x50291754          0x50291754 <arena_dalloc_bin_locked_impl+156>
msr            <unavailable>
cr             0x42480c00          1112017920
lr             0x502916e4          0x502916e4 <arena_dalloc_bin_locked_impl+44>
ctr            0x5005d114          1342558484
xer            0x0                 0
fpscr          0x0                 0
vscr           <unavailable>
vrsave         <unavailable>


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)


Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311

freebsd-ppc mailing list
This note just reports things from looking at 2
.core files (mountd and rpcbind) from the new
jemalloc context's failures. Maybe someone who
knows more can get something out of it. I've
not included any of the prior message history.


For mountd:

Core was generated by `/usr/sbin/mountd -r'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x50235df0 in cache_bin_dalloc_easy (bin=<optimized out>, bin_info=<optimized out>, ptr=0x50049160) at /usr/src/contrib/jemalloc/include/jemalloc/internal/cache_bin.h:121

warning: Source file is more recent than executable.
121 if (unlikely(bin->ncached == bin_info->ncached_max)) {

It turns out that bin_info traces back to:

        cache_bin_t *bin = tcache_small_bin_get(tcache, alloc_ctx.szind);
        cache_bin_info_t *bin_info = &tcache_bin_info[alloc_ctx.szind];
        if (!cache_bin_dalloc_easy(bin, bin_info, ptr)) {
                return false;
        }

based on:

#define tcache_bin_info JEMALLOC_N(tcache_bin_info)
and:
#  define JEMALLOC_N(n) __je_##n

But gdb reports:

(gdb) print __je_tcache_bin_info
$3 = (cache_bin_info_t *) 0x0

(gdb) print alloc_ctx
$1 = {szind = 0, slab = <synthetic pointer>}

so bin_info = NULL and bin_info->ncached_max would
fail (and does).

By contrast, bin->ncached seems to be from:

(gdb) print *(cache_bin_t*)0x50094018
$6 = {low_water = 65536, ncached = 1, tstats = {nrequests = 4796680610075595179}, avail = 0x0}
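
As a minimal illustration (simplified, hypothetical types; not the
jemalloc code itself) of why the NULL __je_tcache_bin_info plus
szind == 0 turns into a fault at the ncached_max check:

#include <stddef.h>
#include <stdint.h>

typedef struct { uint32_t ncached_max; } cache_bin_info_t;

int
main(void)
{
        cache_bin_info_t *tcache_bin_info = NULL;   /* as gdb reports */
        size_t szind = 0;                           /* as gdb reports */

        /* &NULL[0] is still NULL, so bin_info ends up NULL ... */
        cache_bin_info_t *bin_info = &tcache_bin_info[szind];

        /* ... and the first field access through it faults (SIGSEGV). */
        return (int)bin_info->ncached_max;
}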


For rpcbind:

Core was generated by `/usr/sbin/rpcbind'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x50243fec in rendezvous_request (xprt=<optimized out>, msg=<optimized out>) at /usr/src/lib/libc/rpc/svc_vc.c:323
323 newxprt->xp_raddr = *(struct sockaddr_in *)newxprt->xp_rtaddr.buf;

(gdb) print *newxprt
$5 = {xp_fd = 14, xp_port = 0, xp_ops = 0x50329e1c, xp_addrlen = 0, xp_raddr = {sin_len = 0 '\000', sin_family = 0 '\000', sin_port = 0, sin_addr = {s_addr = 0},
    sin_zero = "\000\000\000\000P1O\374"}, xp_ops2 = 0x50329e34, xp_tp = 0x0, xp_netid = 0x50047310 "unix", xp_ltaddr = {maxlen = 1345322064, len = 1970170232, buf = 0x2020}, xp_rtaddr = {
    maxlen = 268500992, len = 16, buf = 0x0}, xp_verf = {oa_flavor = 0, oa_base = 0x202d2020 <error: Cannot access memory at address 0x202d2020>, oa_length = 538976364}, xp_p1 = 0x6f6f7000,
  xp_p2 = 0x0, xp_p3 = 0x202d2020, xp_type = 538976288}

so newxprt->xp_rtaddr.buf == 0x0. But that is very odd . . .

        /*
         * make a new transporter (re-uses xprt)
         */
        newxprt = makefd_xprt(sock, r->sendsize, r->recvsize);
        newxprt->xp_rtaddr.buf = mem_alloc(len);
        if (newxprt->xp_rtaddr.buf == NULL)
                return (FALSE);
        memcpy(newxprt->xp_rtaddr.buf, &addr, len);
        newxprt->xp_rtaddr.len = len;
#ifdef PORTMAP
        if (addr.ss_family == AF_INET || addr.ss_family == AF_LOCAL) {
                newxprt->xp_raddr = *(struct sockaddr_in *)newxprt->xp_rtaddr.buf;
                newxprt->xp_addrlen = sizeof (struct sockaddr_in);
        }
#endif                          /* PORTMAP */

Or, in machine code terms:

   0x50243f90 <+260>: mr      r28,r3
   0x50243f94 <+264>: lwz     r4,0(r24)
   0x50243f98 <+268>: lwz     r5,4(r24)
   0x50243f9c <+272>: mr      r3,r28
   0x50243fa0 <+276>: bl      0x5024308c <makefd_xprt>
   0x50243fa4 <+280>: lwz     r27,36(r1)
   0x50243fa8 <+284>: mr      r29,r3
   0x50243fac <+288>: li      r3,1
   0x50243fb0 <+292>: mr      r4,r27
   0x50243fb4 <+296>: bl      0x502e3214 <00000000.plt_pic32.calloc>
   0x50243fb8 <+300>: cmplwi  r3,0
   0x50243fbc <+304>: stw     r3,64(r29)
   0x50243fc0 <+308>: beq     0x50244160 <rendezvous_request+724>

Note: getting here means that newxprt->xp_rtaddr.buf
was not NULL. Also, the value was stored to 64(r29).

   0x50243fc4 <+312>: addi    r4,r1,296
   0x50243fc8 <+316>: mr      r5,r27
   0x50243fcc <+320>: bl      0x502e3264 <00000000.plt_pic32.memcpy>

Note: getting here means that memcpy was able to
store where newxprt->xp_rtaddr.buf indicated (as
the r3 value).

   0x50243fd0 <+324>: lbz     r3,297(r1)
   0x50243fd4 <+328>: stw     r27,60(r29)
   0x50243fd8 <+332>: addi    r3,r3,-1
   0x50243fdc <+336>: clrlwi  r3,r3,24
   0x50243fe0 <+340>: cmplwi  r3,1
   0x50243fe4 <+344>: bgt     0x50244014 <rendezvous_request+392>
   0x50243fe8 <+348>: lwz     r3,64(r29)
=> 0x50243fec <+352>: lwz     r4,0(r3)

Note: the failure was above with r3==0x0:

(gdb) info reg
r0             0x50243fb8          1344552888
r1             0xffffb400          4294947840
r2             0x500a1018          1342836760
r3             0x0                 0
r4             0xffffb538          4294948152
r5             0x10                16
r6             0x50047328          1342468904
r7             0x0                 0
r8             0x50047324          1342468900
r9             0x0                 0
r10            0x20                32
r11            0x502d691c          1345153308
r12            0x24200880          606079104
r13            0x0                 0
r14            0x0                 0
r15            0xffffbc28          4294949928
r16            0x10002848          268445768
r17            0x10040000          268697600
r18            0x2                 2
r19            0x0                 0
r20            0x1                 1
r21            0x5004c044          1342488644
r22            0xffffb63c          4294948412
r23            0x80                128
r24            0x50048010          1342472208
r25            0x14                20
r26            0xffffb630          4294948400
r27            0x10                16
r28            0xe                 14
r29            0x500472e0          1342468832
r30            0x5030112c          1345327404
r31            0x10040000          268697600
pc             0x50243fec          0x50243fec <rendezvous_request+352>
msr            <unavailable>
cr             0x84200080          2216689792
lr             0x50243fd0          0x50243fd0 <rendezvous_request+324>
ctr            0x0                 0
xer            0x0                 0
fpscr          0x0                 0
vscr           <unavailable>
vrsave         <unavailable>

Note: The rest of the 2 assignments would have been
done by:

   0x50243ff0 <+356>: stw     r4,16(r29)
   0x50243ff4 <+360>: lwz     r4,4(r3)
   0x50243ff8 <+364>: stw     r4,20(r29)
   0x50243ffc <+368>: lwz     r4,8(r3)
   0x50244000 <+372>: stw     r4,24(r29)
   0x50244004 <+376>: li      r4,16
   0x50244008 <+380>: lwz     r3,12(r3)
   0x5024400c <+384>: stw     r4,12(r29)
   0x50244010 <+388>: stw     r3,28(r29)

My only guesses for alternatives are:

A) A processor context switch where the other
processor did not (yet) see the value stored
via 64(r29) and so it used an older value.

B) Something trashed the memory at 64(r29)
after it was updated by the code above.

C) r29's value was trashed by something,
changing where 64(r29) referenced.

D) r3 was trashed between the two back-to-back
instructions:

   0x50243fe8 <+348>: lwz     r3,64(r29)
=> 0x50243fec <+352>: lwz     r4,0(r3)

E) ???

I've no clue which.


Both .core files seem to have zeros showing up in
unexpected places that were previously non-zero.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)


Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311

freebsd-ppc mailing list
[This report just shows an interesting rpcbind crash:
a pointer was filled with part of a string instead,
leading to a failed memory access attempt from the junk
address produced.]

Core was generated by `/usr/sbin/rpcbind'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x5024405c in rendezvous_request (xprt=<optimized out>, msg=<optimized out>) at /usr/src/lib/libc/rpc/svc_vc.c:335
335 cd->recvsize = r->recvsize;

(gdb) list
330 _setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &len, sizeof (len));
331 }
332
333 cd = (struct cf_conn *)newxprt->xp_p1;
334
335 cd->recvsize = r->recvsize;
336 cd->sendsize = r->sendsize;
337 cd->maxrec = r->maxrec;
338
339 if (cd->maxrec != 0) {

(gdb) print/c *cd
Cannot access memory at address 0x2d202020

FYI:

. . .
   0x50244050 <+452>: bl      0x502e3404 <00000000.plt_pic32._setsockopt>
   0x50244054 <+456>: lwz     r27,80(r29)
   0x50244058 <+460>: lwz     r3,4(r24)
=> 0x5024405c <+464>: stw     r3,436(r27)

Note the 80(r29) use.

(gdb) info reg
r0             0x50244020          1344552992
r1             0xffffb400          4294947840
r2             0x500a1018          1342836760
r3             0x2328              9000
r4             0x32ef559c          854545820
r5             0x0                 0
r6             0xffffb360          4294947680
r7             0xffffb364          4294947684
r8             0x5004733c          1342468924
r9             0x0                 0
r10            0x20                32
r11            0x50252ea0          1344614048
r12            0x24200ca0          606080160
r13            0x0                 0
r14            0x0                 0
r15            0xffffbc28          4294949928
r16            0x10002848          268445768
r17            0x10040000          268697600
r18            0x2                 2
r19            0x0                 0
r20            0x1                 1
r21            0x5004c044          1342488644
r22            0xffffb63c          4294948412
r23            0x80                128
r24            0x50048010          1342472208
r25            0x14                20
r26            0xffffb630          4294948400
r27            0x2d202020          757080096
r28            0xf                 15
r29            0x50047308          1342468872
r30            0x5030112c          1345327404
r31            0x10040000          268697600
pc             0x5024405c          0x5024405c <rendezvous_request+464>
msr            <unavailable>
cr             0x842000a0          2216689824
lr             0x50244020          0x50244020 <rendezvous_request+404>
ctr            0x50252ea0          1344614048
xer            0x0                 0
fpscr          0x0                 0
vscr           <unavailable>
vrsave         <unavailable>

(gdb) x/s 0x50047308+72
0x50047350: " -      -       -\n"

So it tried to use "-   " as a pointer value.
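
As an aside (illustration only; big-endian byte order assumed, as on
the G4), the junk pointer is just the first four bytes of that
neighbouring text reinterpreted as an address:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

int
main(void)
{
        const char junk[4] = { '-', ' ', ' ', ' ' };
        uint32_t as_ptr;

        memcpy(&as_ptr, junk, sizeof(as_ptr));
        /* Prints 2d202020 on a big-endian machine such as the G4. */
        printf("%08x\n", (unsigned)as_ptr);
        return 0;
}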

It appears that the r29 value was from:

   0x50243f90 <+260>: mr      r28,r3
   0x50243f94 <+264>: lwz     r4,0(r24)
   0x50243f98 <+268>: lwz     r5,4(r24)
   0x50243f9c <+272>: mr      r3,r28
   0x50243fa0 <+276>: bl      0x5024308c <makefd_xprt>
   0x50243fa4 <+280>: lwz     r27,36(r1)
   0x50243fa8 <+284>: mr      r29,r3

The makefd_xprt being used as part of:

        /*
         * make a new transporter (re-uses xprt)
         */
        newxprt = makefd_xprt(sock, r->sendsize, r->recvsize);
        newxprt->xp_rtaddr.buf = mem_alloc(len);
        if (newxprt->xp_rtaddr.buf == NULL)
                return (FALSE);
        memcpy(newxprt->xp_rtaddr.buf, &addr, len);
        newxprt->xp_rtaddr.len = len;
#ifdef PORTMAP
        if (addr.ss_family == AF_INET || addr.ss_family == AF_LOCAL) {
                newxprt->xp_raddr = *(struct sockaddr_in *)newxprt->xp_rtaddr.buf;
                newxprt->xp_addrlen = sizeof (struct sockaddr_in);
        }
#endif                          /* PORTMAP */
        if (__rpc_fd2sockinfo(sock, &si) && si.si_proto == IPPROTO_TCP) {
                len = 1;
                /* XXX fvdl - is this useful? */
                _setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &len, sizeof (len));
        }
 
        cd = (struct cf_conn *)newxprt->xp_p1;
 
        cd->recvsize = r->recvsize;
        cd->sendsize = r->sendsize;
        cd->maxrec = r->maxrec;

FYI:

(gdb) print *r
$5 = {sendsize = 9000, recvsize = 9000, maxrec = 9000}

There is more evidence of strings in pointers in *newxprt
(xp_tp, oa_base, xp_p1, xp_p2, xp_p3):

(gdb) print *newxprt
$7 = {xp_fd = 15, xp_port = 0, xp_ops = 0x50329e1c, xp_addrlen = 16, xp_raddr = {sin_len = 16 '\020', sin_family = 1 '\001', sin_port = 0, sin_addr = {s_addr = 0},
    sin_zero = "\000\000\000\000\000\000\000"}, xp_ops2 = 0x756e6978, xp_tp = 0x2020 <error: Cannot access memory at address 0x2020>,
  xp_netid = 0x10010000 <error: Cannot access memory at address 0x10010000>, xp_ltaddr = {maxlen = 0, len = 0, buf = 0x0}, xp_rtaddr = {maxlen = 539828256, len = 16, buf = 0x50047330}, xp_verf = {
    oa_flavor = 0, oa_base = 0x202d2020 <error: Cannot access memory at address 0x202d2020>, oa_length = 538976288}, xp_p1 = 0x2d202020, xp_p2 = 0x20202020, xp_p3 = 0x2d0a0079, xp_type = 543780384}

(gdb) print (char*)(&newxprt->xp_verf.oa_base)
$24 = 0x50047350 " -      -       -\n"

(gdb) print (char*)(&newxprt->xp_p3)+3
$13 = 0x50047363 "y in FreeBSD.\n"

(gdb) print (char*)(&newxprt->xp_type)
$25 = 0x50047364 " in FreeBSD.\n"



===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)


Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311

Justin Hibbits-3
In reply to this post by Brandon Bergren-3
On Sun, 03 May 2020 09:56:02 -0500
"Brandon Bergren" <[hidden email]> wrote:

> On Sun, May 3, 2020, at 9:38 AM, Mark Millard via freebsd-ppc wrote:
> >
> > . . .
>
> . . .

I currently have FreeBSD HEAD from late last week running on a dual G4
MDD (WITNESS kernel), and no segmentation faults from dhclient.  I'm
using the following patch against jemalloc.  Brandon has reported other
results with that patch to me, so I'm not sure it's a correct patch.

- Justin




diff --git a/contrib/jemalloc/include/jemalloc/internal/cache_bin.h b/contrib/jemalloc/include/jemalloc/internal/cache_bin.h
index d14556a3da8..728959a448e 100644
--- a/contrib/jemalloc/include/jemalloc/internal/cache_bin.h
+++ b/contrib/jemalloc/include/jemalloc/internal/cache_bin.h
@@ -88,7 +88,7 @@ JEMALLOC_ALWAYS_INLINE void *
 cache_bin_alloc_easy(cache_bin_t *bin, bool *success) {
        void *ret;
 
-       bin->ncached--;
+       cache_bin_sz_t cached = --bin->ncached;
 
        /*
         * Check for both bin->ncached == 0 and ncached < low_water
@@ -111,7 +111,7 @@ cache_bin_alloc_easy(cache_bin_t *bin, bool *success) {
         * cacheline).
         */
        *success = true;
-       ret = *(bin->avail - (bin->ncached + 1));
+       ret = *(bin->avail - (cached + 1));
 
        return ret;
 }
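
For reference, a simplified sketch (hypothetical types; not the actual
jemalloc source) of what the patch changes: the stock code re-reads
bin->ncached from memory when computing the avail index, while the
patched code reuses the value it just decremented, so the index always
matches the decrement it performed:

typedef int cache_bin_sz_t;

typedef struct {
        cache_bin_sz_t ncached;
        void **avail;
} cache_bin_t;

/* Stock shape: decrement, then read bin->ncached again later. */
static void *
alloc_easy_stock(cache_bin_t *bin)
{
        bin->ncached--;
        /* ... low_water bookkeeping elided ... */
        return *(bin->avail - (bin->ncached + 1));
}

/* Patched shape: keep the decremented value in a local. */
static void *
alloc_easy_patched(cache_bin_t *bin)
{
        cache_bin_sz_t cached = --bin->ncached;
        /* ... low_water bookkeeping elided ... */
        return *(bin->avail - (cached + 1));
}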

Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311

freebsd-ppc mailing list


On 2020-May-6, at 10:02, Justin Hibbits <[hidden email]> wrote:

> On Sun, 03 May 2020 09:56:02 -0500
> "Brandon Bergren" <[hidden email]> wrote:
>
> > . . .
>
> I currently have FreeBSD HEAD from late last week running on a dual G4
> MDD (WITNESS kernel), and no segmentation faults from dhclient.  I'm
> using the following patch against jemalloc.  Brandon has reported other
> results with that patch to me, so I'm not sure it's a correct patch.
>
> - Justin

Thanks.

The status of trying to track this down . . .

I normally use MALLOC_PRODUCTION= in my non-debug
builds. So: no jemalloc asserts. So I tried a "debug" build
without MALLOC_PRODUCTION= --and so far I've not had any
failures after booting with that world-build. Nor have any
assert's failed. It has been longer than usual but it would
probably be a few days before I concluded much. (At some
point I'll reboot just to change the conditions some and
then give it more time as well.)

I had hoped this type of build would detect there being a
problem earlier after things start going bad internally.

I've still no means of directly causing the problem. I've
still only seen the odd SIGSEGV's in dhclient, rpcbind,
mountd, nfsd, and sendmail.

I've really only learned:

A) Either messed up memory contents is involved
   or addresses in registers were pointing to
   the wrong place. (I know not which for sure.)

B) It seems to be probabilistic for when it starts
   in each of the 5 types of context. (Possibly some
   data race involved?)

C) The programs do not all fail together but over time
   more than one type can get failures.

D) Once sendmail's quickly executing subprocess starts
   having the problem during its exit, later instances
   seem to have it as well. (Inheriting bad memory
   content via a fork-ish operation that creates the
   subprocess?)

E) I do have the example failure of one of the contexts
   with the prior jemalloc code. (It was a
   MALLOC_PRODUCTION= style build.) (I reverted to the
   modern jemalloc that seemed to expose the problem
   more.)

So far I've made no progress for isolating the context
for where the problem starts. I've no clue how much is
messed up or for how long it has been messed up by the
time a notice is reported.

I still do not blame jemalloc: as far as I know it could
just be contributing to exposing problem(s) from other
code instead of having problems of its own. Some of the
SIGSEGVs are not in jemalloc code at the time of the
SIGSEGV.


> diff --git a/contrib/jemalloc/include/jemalloc/internal/cache_bin.h b/contrib/jemalloc/include/jemalloc/internal/cache_bin.h
> index d14556a3da8..728959a448e 100644
> --- a/contrib/jemalloc/include/jemalloc/internal/cache_bin.h
> +++ b/contrib/jemalloc/include/jemalloc/internal/cache_bin.h
> @@ -88,7 +88,7 @@ JEMALLOC_ALWAYS_INLINE void *
>  cache_bin_alloc_easy(cache_bin_t *bin, bool *success) {
>         void *ret;
>  
> -       bin->ncached--;
> +       cache_bin_sz_t cached = --bin->ncached;
>  
>         /*
>          * Check for both bin->ncached == 0 and ncached < low_water
> @@ -111,7 +111,7 @@ cache_bin_alloc_easy(cache_bin_t *bin, bool *success) {
>          * cacheline).
>          */
>         *success = true;
> -       ret = *(bin->avail - (bin->ncached + 1));
> +       ret = *(bin->avail - (cached + 1));
>  
>         return ret;
>  }

As things stand, it is messy trying to conclude if something
helps vs. hurts vs. makes little difference. So I'm not
sure how or when I'll try the above. So far I've focused
on reproducing the problem, possibly in a way that
gives better (earlier) information.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)


Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311

freebsd-ppc mailing list
In reply to this post by freebsd-ppc mailing list
[mountd failure example: also at the sz_size2index_lookup assert,
for the same zeroed-memory problem.]

> On 2020-May-7, at 00:46, Mark Millard <[hidden email]> wrote:
>
> [__je_sz_size2index_tab seems messed up in 2 of the
> asserting contexts: first 384 are zero in both. More
> before that is also messed up (all zero). I show the
> details later below.]
>
> On 2020-May-6, at 16:57, Mark Millard <marklmi at yahoo.com> wrote:
>
>> [This explores process crashes that happen during system
>> shutdown, in a context not having MALLOC_PRODUCTION= .
>> So assert failures are reported as the stopping points.]
>>
>> It looks like shutdown -p now, shutdown -r now, and the
>> like can lead some processes to assert during their exit
>> attempt, including a sshd failure (that I've not seen
>> before), rpcbind, and nfsd. I show information about the
>> observed asserts for those below.
>>
>>
>> sshd hit an assert, failing slab == extent_slab_get(extent) :
>>
>> (gdb) bt
>> #0  thr_kill () at thr_kill.S:4
>> #1  0x50927170 in __raise (s=6) at /usr/src/lib/libc/gen/raise.c:52
>> #2  0x50886cc0 in abort () at /usr/src/lib/libc/stdlib/abort.c:67
>> #3  0x508834b0 in arena_dalloc (tsdn=<optimized out>, ptr=<optimized out>, tcache=<optimized out>, alloc_ctx=<optimized out>, slow_path=<optimized out>)
>>   at /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:315
>> #4  idalloctm (tsdn=0x500dd040, ptr=0x5008a180, tcache=0x500dd160, alloc_ctx=<optimized out>, is_internal=<optimized out>, slow_path=<optimized out>)
>>   at /usr/src/contrib/jemalloc/include/jemalloc/internal/jemalloc_internal_inlines_c.h:118
>> #5  0x5087b0a4 in ifree (tsd=0x500dd040, ptr=0x5008a180, tcache=0x500dd160, slow_path=<optimized out>) at jemalloc_jemalloc.c:2590
>> #6  0x5087acac in __je_free_default (ptr=0x5008a180) at jemalloc_jemalloc.c:2784
>> #7  0x5087b294 in __free (ptr=0x5008a180) at jemalloc_jemalloc.c:2852
>> #8  0x10029464 in server_accept_loop (config_s=<optimized out>, sock_in=<optimized out>, sock_out=<optimized out>, newsock=<optimized out>) at /usr/src/crypto/openssh/sshd.c:1185
>> #9  main (ac=<optimized out>, av=0xffffde3c) at /usr/src/crypto/openssh/sshd.c:2009
>>
>> . . .
>> (gdb) up
>> #2  0x50886cc0 in abort () at /usr/src/lib/libc/stdlib/abort.c:67
>> 67 (void)raise(SIGABRT);
>> (gdb) up
>> #3  0x508834b0 in arena_dalloc (tsdn=<optimized out>, ptr=<optimized out>, tcache=<optimized out>, alloc_ctx=<optimized out>, slow_path=<optimized out>)
>>   at /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:315
>> 315 assert(slab == extent_slab_get(extent));
>>
>> (gdb) list
>> 310 rtree_ctx = tsd_rtree_ctx(tsdn_tsd(tsdn));
>> 311 extent_t *extent = rtree_extent_read(tsdn, &extents_rtree,
>> 312    rtree_ctx, (uintptr_t)ptr, true);
>> 313 assert(szind == extent_szind_get(extent));
>> 314 assert(szind < SC_NSIZES);
>> 315 assert(slab == extent_slab_get(extent));
>> 316 }
>> 317
>> 318 if (likely(slab)) {
>> 319 /* Small allocation. */
>>
>> More fully:
>>
>> 285 JEMALLOC_ALWAYS_INLINE void
>> 286 arena_dalloc(tsdn_t *tsdn, void *ptr, tcache_t *tcache,
>> 287    alloc_ctx_t *alloc_ctx, bool slow_path) {
>> 288 assert(!tsdn_null(tsdn) || tcache == NULL);
>> 289 assert(ptr != NULL);
>> 290
>> 291 if (unlikely(tcache == NULL)) {
>> 292 arena_dalloc_no_tcache(tsdn, ptr);
>> 293 return;
>> 294 }
>> 295
>> 296 szind_t szind;
>> 297 bool slab;
>> 298 rtree_ctx_t *rtree_ctx;
>> 299 if (alloc_ctx != NULL) {
>> 300 szind = alloc_ctx->szind;
>> 301 slab = alloc_ctx->slab;
>> 302 assert(szind != SC_NSIZES);
>> 303 } else {
>> 304 rtree_ctx = tsd_rtree_ctx(tsdn_tsd(tsdn));
>> 305 rtree_szind_slab_read(tsdn, &extents_rtree, rtree_ctx,
>> 306    (uintptr_t)ptr, true, &szind, &slab);
>> 307 }
>> 308
>> 309 if (config_debug) {
>> 310 rtree_ctx = tsd_rtree_ctx(tsdn_tsd(tsdn));
>> 311 extent_t *extent = rtree_extent_read(tsdn, &extents_rtree,
>> 312    rtree_ctx, (uintptr_t)ptr, true);
>> 313 assert(szind == extent_szind_get(extent));
>> 314 assert(szind < SC_NSIZES);
>> 315 assert(slab == extent_slab_get(extent));
>> 316 }
>> 317
>> 318 if (likely(slab)) {
>> 319 /* Small allocation. */
>> 320 tcache_dalloc_small(tsdn_tsd(tsdn), tcache, ptr, szind,
>> 321    slow_path);
>> 322 } else {
>> 323 arena_dalloc_large(tsdn, ptr, tcache, szind, slow_path);
>> 324 }
>> 325 }
>
> The following are only shown for contrast with the later
> cases of lots of zeros showing up where below shows
> non-zero values (taken from sshd.core, which failed
> differently):
>
> (gdb) x/4x __je_arenas+16368/4
> 0x50981ab0 <__je_arenas+16368>: 0x00000000 0x00000000 0x00000000 0x00000009
> (gdb) print/x __je_arenas_lock
> $1 = {{{prof_data = {tot_wait_time = {ns = 0x0}, max_wait_time = {ns = 0x0}, n_wait_times = 0x0, n_spin_acquired = 0x0, max_n_thds = 0x0, n_waiting_thds = {repr = 0x0}, n_owner_switches = 0x0,
>        prev_owner = 0x0, n_lock_ops = 0x0}, lock = 0x5008bf00, postponed_next = 0x50974070, locked = {repr = 0x0}}}, witness = {name = 0x50760b04, rank = 0x3, comp = 0x0, opaque = 0x0, link = {
>      qre_next = 0x50981b10, qre_prev = 0x50981b10}}, lock_order = 0x0}
> (gdb) print/x __je_narenas_auto
> $2 = 0x8
> (gdb) print/x malloc_conf
> $3 = 0x0
> (gdb) print/x __je_ncpus
> $4 = 0x2
> (gdb) print/x __je_manual_arena_base
> $5 = 0x9
> (gdb) print/x __je_sz_pind2sz_tab
> $6 = {0x1000, 0x2000, 0x3000, 0x4000, 0x5000, 0x6000, 0x7000, 0x8000, 0xa000, 0xc000, 0xe000, 0x10000, 0x14000, 0x18000, 0x1c000, 0x20000, 0x28000, 0x30000, 0x38000, 0x40000, 0x50000, 0x60000,
>  0x70000, 0x80000, 0xa0000, 0xc0000, 0xe0000, 0x100000, 0x140000, 0x180000, 0x1c0000, 0x200000, 0x280000, 0x300000, 0x380000, 0x400000, 0x500000, 0x600000, 0x700000, 0x800000, 0xa00000, 0xc00000,
>  0xe00000, 0x1000000, 0x1400000, 0x1800000, 0x1c00000, 0x2000000, 0x2800000, 0x3000000, 0x3800000, 0x4000000, 0x5000000, 0x6000000, 0x7000000, 0x8000000, 0xa000000, 0xc000000, 0xe000000,
>  0x10000000, 0x14000000, 0x18000000, 0x1c000000, 0x20000000, 0x28000000, 0x30000000, 0x38000000, 0x40000000, 0x50000000, 0x60000000, 0x70000000, 0x70001000}
> (gdb) print/x __je_sz_index2size_tab
> $7 = {0x8, 0x10, 0x20, 0x30, 0x40, 0x50, 0x60, 0x70, 0x80, 0xa0, 0xc0, 0xe0, 0x100, 0x140, 0x180, 0x1c0, 0x200, 0x280, 0x300, 0x380, 0x400, 0x500, 0x600, 0x700, 0x800, 0xa00, 0xc00, 0xe00, 0x1000,
>  0x1400, 0x1800, 0x1c00, 0x2000, 0x2800, 0x3000, 0x3800, 0x4000, 0x5000, 0x6000, 0x7000, 0x8000, 0xa000, 0xc000, 0xe000, 0x10000, 0x14000, 0x18000, 0x1c000, 0x20000, 0x28000, 0x30000, 0x38000,
>  0x40000, 0x50000, 0x60000, 0x70000, 0x80000, 0xa0000, 0xc0000, 0xe0000, 0x100000, 0x140000, 0x180000, 0x1c0000, 0x200000, 0x280000, 0x300000, 0x380000, 0x400000, 0x500000, 0x600000, 0x700000,
>  0x800000, 0xa00000, 0xc00000, 0xe00000, 0x1000000, 0x1400000, 0x1800000, 0x1c00000, 0x2000000, 0x2800000, 0x3000000, 0x3800000, 0x4000000, 0x5000000, 0x6000000, 0x7000000, 0x8000000, 0xa000000,
>  0xc000000, 0xe000000, 0x10000000, 0x14000000, 0x18000000, 0x1c000000, 0x20000000, 0x28000000, 0x30000000, 0x38000000, 0x40000000, 0x50000000, 0x60000000, 0x70000000}
> (gdb) print/x __je_sz_size2index_tab
> $2 = {0x0, 0x0, 0x1, 0x2, 0x2, 0x3, 0x3, 0x4, 0x4, 0x5, 0x5, 0x6, 0x6, 0x7, 0x7, 0x8, 0x8, 0x9, 0x9, 0x9, 0x9, 0xa, 0xa, 0xa, 0xa, 0xb, 0xb, 0xb, 0xb, 0xc, 0xc, 0xc, 0xc, 0xd, 0xd, 0xd, 0xd, 0xd,
>  0xd, 0xd, 0xd, 0xe, 0xe, 0xe, 0xe, 0xe, 0xe, 0xe, 0xe, 0xf, 0xf, 0xf, 0xf, 0xf, 0xf, 0xf, 0xf, 0x10, 0x10, 0x10, 0x10, 0x10, 0x10, 0x10, 0x10, 0x11 <repeats 16 times>, 0x12 <repeats 16 times>,
>  0x13 <repeats 16 times>, 0x14 <repeats 16 times>, 0x15 <repeats 32 times>, 0x16 <repeats 32 times>, 0x17 <repeats 32 times>, 0x18 <repeats 32 times>, 0x19 <repeats 64 times>,
>  0x1a <repeats 64 times>, 0x1b <repeats 64 times>, 0x1c <repeats 64 times>}
>
>
>> rpcbind hit an assert, failing ret == sz_size2index_compute(size) :
>>
>> (gdb) bt
>> #0  thr_kill () at thr_kill.S:4
>> #1  0x502f2170 in __raise (s=6) at /usr/src/lib/libc/gen/raise.c:52
>> #2  0x50251d04 in abort () at /usr/src/lib/libc/stdlib/abort.c:79
>> #3  0x5024f260 in sz_size2index_lookup (size=<optimized out>) at /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:159
>> #4  sz_size2index (size=<optimized out>) at /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:166
>> #5  imalloc_body (sopts=0xffffb360, dopts=0xffffb340, tsd=0x5009a018) at jemalloc_jemalloc.c:2066
>> #6  0x50244874 in imalloc (sopts=0xffffb360, dopts=0xffffb340) at /usr/src/contrib/jemalloc/include/jemalloc/internal/tsd.h:331
>> #7  0x50244fe8 in __calloc (num=1, size=96) at jemalloc_jemalloc.c:2498
>> #8  0x50265690 in svc_xprt_alloc () at /usr/src/lib/libc/rpc/svc.c:541
>> #9  0x502635f4 in makefd_xprt (fd=14, sendsize=9000, recvsize=9000) at /usr/src/lib/libc/rpc/svc_vc.c:250
>> #10 0x502644b4 in rendezvous_request (xprt=0x5004c000, msg=<optimized out>) at /usr/src/lib/libc/rpc/svc_vc.c:315
>> #11 0x50265a98 in svc_getreq_common (fd=<optimized out>) at /usr/src/lib/libc/rpc/svc.c:640
>> #12 0x50265d1c in svc_getreq_poll (pfdp=<optimized out>, pollretval=1) at /usr/src/lib/libc/rpc/svc.c:739
>> #13 0x10018568 in my_svc_run () at /usr/src/usr.sbin/rpcbind/rpcb_svc_com.c:1167
>> #14 0x10014ad8 in main (argc=<optimized out>, argv=<optimized out>) at /usr/src/usr.sbin/rpcbind/rpcbind.c:250
>> (gdb) up 3
>> #3  0x5024f260 in sz_size2index_lookup (size=<optimized out>) at /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:159
>> 159 assert(ret == sz_size2index_compute(size));
>> (gdb) print ret
>> $1 = 0
>>
>> 154 JEMALLOC_ALWAYS_INLINE szind_t
>> 155 sz_size2index_lookup(size_t size) {
>> 156 assert(size <= SC_LOOKUP_MAXCLASS);
>> 157 szind_t ret = (sz_size2index_tab[(size + (ZU(1) << SC_LG_TINY_MIN) - 1)
>> 158 >> SC_LG_TINY_MIN]);
>> 159 assert(ret == sz_size2index_compute(size));
>> 160 return ret;
>> 161 }
>
> gdb reports for sz_size2index_tab (really JEMALLOC_N(sz_size2index_tab),
> i.e., __je_sz_size2index_tab):
>
> (gdb) print/x __je_sz_size2index_tab
> $1 = {0x0 <repeats 384 times>, 0x1a, 0x1b <repeats 64 times>, 0x1c <repeats 64 times>}
>
> Also:
>
> (gdb) x/4x __je_arenas+16368/4
> 0x5034cab0 <__je_arenas+16368>: 0x00000000 0x00000000 0x00000000 0x00000000
> (gdb) print/x __je_arenas_lock                                                                                                                
> $8 = {{{prof_data = {tot_wait_time = {ns = 0x0}, max_wait_time = {ns = 0x0}, n_wait_times = 0x0, n_spin_acquired = 0x0, max_n_thds = 0x0, n_waiting_thds = {repr = 0x0}, n_owner_switches = 0x0,
>        prev_owner = 0x0, n_lock_ops = 0x0}, lock = 0x0, postponed_next = 0x0, locked = {repr = 0x0}}}, witness = {name = 0x0, rank = 0x0, comp = 0x0, opaque = 0x0, link = {qre_next = 0x0,
>      qre_prev = 0x0}}, lock_order = 0x0}
> (gdb) print/x __je_narenas_auto
> $9 = 0x0
> (gdb) print/x malloc_conf      
> $10 = 0x0
> (gdb) print/x __je_ncpus
> $11 = 0x0
> (gdb) print/x __je_manual_arena_base
> $12 = 0x0
> (gdb) print/x __je_sz_pind2sz_tab  
> $13 = {0x0 <repeats 72 times>}
> (gdb) print/x __je_sz_index2size_tab
> $14 = {0x0 <repeats 104 times>}
>
>
>> nfsd hit an assert, failing ret == sz_size2index_compute(size)
>
> [Correction: That should have referenced sz_index2size_lookup(index).]
>
>> (also, but a different caller of sz_size2index):
>
> [Correction: The "also" comment should be ignored:
> sz_index2size_lookup(index) is referenced below.]
>
>>
>> (gdb) bt
>> #0  thr_kill () at thr_kill.S:4
>> #1  0x502b2170 in __raise (s=6) at /usr/src/lib/libc/gen/raise.c:52
>> #2  0x50211cc0 in abort () at /usr/src/lib/libc/stdlib/abort.c:67
>> #3  0x50206104 in sz_index2size_lookup (index=<optimized out>) at /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:200
>> #4  sz_index2size (index=<optimized out>) at /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:207
>> #5  ifree (tsd=0x50094018, ptr=0x50041028, tcache=0x50094138, slow_path=<optimized out>) at jemalloc_jemalloc.c:2583
>> #6  0x50205cac in __je_free_default (ptr=0x50041028) at jemalloc_jemalloc.c:2784
>> #7  0x50206294 in __free (ptr=0x50041028) at jemalloc_jemalloc.c:2852
>> #8  0x50287ec8 in ns_src_free (src=0x50329004, srclistsize=<optimized out>) at /usr/src/lib/libc/net/nsdispatch.c:452
>> #9  ns_dbt_free (dbt=0x50329000) at /usr/src/lib/libc/net/nsdispatch.c:436
>> #10 vector_free (vec=0x50329000, count=<optimized out>, esize=12, free_elem=<optimized out>) at /usr/src/lib/libc/net/nsdispatch.c:253
>> #11 nss_atexit () at /usr/src/lib/libc/net/nsdispatch.c:578
>> #12 0x5028d958 in __cxa_finalize (dso=0x0) at /usr/src/lib/libc/stdlib/atexit.c:240
>> #13 0x502117f8 in exit (status=0) at /usr/src/lib/libc/stdlib/exit.c:74
>> #14 0x10013f9c in child_cleanup (signo=<optimized out>) at /usr/src/usr.sbin/nfsd/nfsd.c:969
>> #15 <signal handler called>
>> #16 0x00000000 in ?? ()
>>
>> (gdb) up 3
>> #3  0x50206104 in sz_index2size_lookup (index=<optimized out>) at /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:200
>> 200 assert(ret == sz_index2size_compute(index));
>>
>> (ret is optimized out.)
>>
>> 197 JEMALLOC_ALWAYS_INLINE size_t
>> 198 sz_index2size_lookup(szind_t index) {
>> 199 size_t ret = (size_t)sz_index2size_tab[index];
>> 200 assert(ret == sz_index2size_compute(index));
>> 201 return ret;
>> 202 }
>
> (gdb) print/x __je_sz_index2size_tab
> $3 = {0x0 <repeats 104 times>}
>
> Also:
>
> (gdb) x/4x __je_arenas+16368/4
> 0x5030cab0 <__je_arenas+16368>: 0x00000000 0x00000000 0x00000000 0x00000000
> (gdb) print/x __je_arenas_lock                                                                                                                
> $8 = {{{prof_data = {tot_wait_time = {ns = 0x0}, max_wait_time = {ns = 0x0}, n_wait_times = 0x0, n_spin_acquired = 0x0, max_n_thds = 0x0, n_waiting_thds = {repr = 0x0}, n_owner_switches = 0x0,
>        prev_owner = 0x0, n_lock_ops = 0x0}, lock = 0x0, postponed_next = 0x0, locked = {repr = 0x0}}}, witness = {name = 0x0, rank = 0x0, comp = 0x0, opaque = 0x0, link = {qre_next = 0x0,
>      qre_prev = 0x0}}, lock_order = 0x0}
> (gdb) print/x __je_narenas_auto
> $9 = 0x0
> (gdb) print/x malloc_conf      
> $10 = 0x0
> (gdb) print/x __je_ncpus
> $11 = 0x0
> (gdb) print/x __je_manual_arena_base
> $12 = 0x0
> (gdb) print/x __je_sz_pind2sz_tab  
> $13 = {0x0 <repeats 72 times>}
> (gdb) print/x __je_sz_size2index_tab
> $1 = {0x0 <repeats 384 times>, 0x1a, 0x1b <repeats 64 times>, 0x1c <repeats 64 times>}
>
>> Booting and immediately trying something like:
>>
>> service nfsd stop
>>
>> did not lead to a failure. But may be after
>> a while it would and be less drastic than a
>> reboot or power down.
>
> More detail:
>
> So, for rpcbind and nfsd at some point a large part of
> __je_sz_size2index_tab is being stomped on, as is all of
> __je_sz_index2size_tab and more.
>
> For rpcbind, the following area is zero but in a
> live process is not all-zero (I show the partially
> non-zero live-process context instead of the all-zero
> .core file content):
>
> 0x5034cab0 <__je_arenas+16368>: 0x00000000 0x00000000 0x00000000 0x00000009
> 0x5034cac0 <__je_arenas_lock>: 0x00000000 0x00000000 0x00000000 0x00000000
> 0x5034cad0 <__je_arenas_lock+16>: 0x00000000 0x00000000 0x00000000 0x00000000
> 0x5034cae0 <__je_arenas_lock+32>: 0x00000000 0x00000000 0x00000000 0x00000000
> 0x5034caf0 <__je_arenas_lock+48>: 0x00000000 0x00000000 0x00000000 0x00000000
> 0x5034cb00 <__je_arenas_lock+64>: 0x00000000 0x5033f070 0x00000000 0x00000000
> 0x5034cb10 <__je_arenas_lock+80>: 0x5012bb04 0x00000003 0x00000000 0x00000000
> 0x5034cb20 <__je_arenas_lock+96>: 0x5034cb10 0x5034cb10 0x00000000 0x00000000
>
> Then the memory in the crash continues to be zero until:
>
> 0x5034d000 <__je_sz_size2index_tab+384>: 0x1a1b1b1b 0x1b1b1b1b 0x1b1b1b1b 0x1b1b1b1b
>
> Notice the interesting page boundary for where non-zero
> is first available again!
>
> Between __je_arenas_lock and __je_sz_size2index_tab are:
>
> 0x5034cb30 __je_narenas_auto
> 0x5034cb38 malloc_conf
> 0x5034cb3c __je_ncpus
> 0x5034cb40 __je_manual_arena_base
> 0x5034cb80 __je_sz_pind2sz_tab
> 0x5034ccc0 __je_sz_index2size_tab
> 0x5034ce80 __je_sz_size2index_tab
>
> For nfsd, it is similar (again showing the partially
> non-zero live process context instead of the all-zeros
> from the .core file):
>
> 0x5030cab0 <__je_arenas+16368>: 0x00000000 0x00000000 0x00000000 0x00000009
> 0x5030cac0 <__je_arenas_lock>: 0x00000000 0x00000000 0x00000000 0x00000000
> 0x5030cad0 <__je_arenas_lock+16>: 0x00000000 0x00000000 0x00000000 0x00000000
> 0x5030cae0 <__je_arenas_lock+32>: 0x00000000 0x00000000 0x00000000 0x00000000
> 0x5030caf0 <__je_arenas_lock+48>: 0x00000000 0x00000000 0x00000000 0x00000000
> 0x5030cb00 <__je_arenas_lock+64>: 0x00000000 0x502ff070 0x00000000 0x00000000
> 0x5030cb10 <__je_arenas_lock+80>: 0x500ebb04 0x00000003 0x00000000 0x00000000
> 0x5030cb20 <__je_arenas_lock+96>: 0x5030cb10 0x5030cb10 0x00000000 0x00000000
>
> Then the memory in the crash continues to be zero until:
>
> 0x5030d000 <__je_sz_size2index_tab+384>: 0x1a1b1b1b 0x1b1b1b1b 0x1b1b1b1b 0x1b1b1b1b
>
> Notice the interesting page boundary for where non-zero
> is first available again!
>
> Between __je_arenas_lock and __je_sz_size2index_tab are:
>
> 0x5030cb30 __je_narenas_auto
> 0x5030cb38 malloc_conf
> 0x5030cb3c __je_ncpus
> 0x5030cb40 __je_manual_arena_base
> 0x5030cb80 __je_sz_pind2sz_tab
> 0x5030ccc0 __je_sz_index2size_tab
> 0x5030ce80 __je_sz_size2index_tab
>
>
> Note: because __je_arenas is normally
> mostly zero for these contexts, I can
> not tell where the memory trashing
> started, only where it replaced non-zero
> values with zeros.
>

I got a mountd assert failure in sz_size2index_lookup
while attempting __calloc during makefd_xprt.

(gdb) bt
#0  thr_kill () at thr_kill.S:4
#1  0x50301170 in __raise (s=6) at /usr/src/lib/libc/gen/raise.c:52
#2  0x50260cc0 in abort () at /usr/src/lib/libc/stdlib/abort.c:67
#3  0x5025e260 in sz_size2index_lookup (size=<optimized out>) at /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:159
#4  sz_size2index (size=<optimized out>) at /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:166
#5  imalloc_body (sopts=0xffffd000, dopts=0xffffcfe0, tsd=0x50094018) at jemalloc_jemalloc.c:2066
#6  0x50253874 in imalloc (sopts=0xffffd000, dopts=0xffffcfe0) at /usr/src/contrib/jemalloc/include/jemalloc/internal/tsd.h:331
#7  0x50253fe8 in __calloc (num=1, size=96) at jemalloc_jemalloc.c:2498
#8  0x50274690 in svc_xprt_alloc () at /usr/src/lib/libc/rpc/svc.c:541
#9  0x502725f4 in makefd_xprt (fd=10, sendsize=9000, recvsize=9000) at /usr/src/lib/libc/rpc/svc_vc.c:250
#10 0x502734b4 in rendezvous_request (xprt=0x5007b120, msg=<optimized out>) at /usr/src/lib/libc/rpc/svc_vc.c:315
#11 0x50274a98 in svc_getreq_common (fd=<optimized out>) at /usr/src/lib/libc/rpc/svc.c:640
#12 0x502748e0 in svc_getreqset (readfds=<optimized out>) at /usr/src/lib/libc/rpc/svc.c:611
#13 0x1001434c in main (argc=<optimized out>, argv=0xffffde3c) at /usr/src/usr.sbin/mountd/mountd.c:683

(gdb) up 3
#3  0x5025e260 in sz_size2index_lookup (size=<optimized out>) at /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:159
159 assert(ret == sz_size2index_compute(size));

Again there is that area of memory that has been zeroed, with
the same stopping point in __je_sz_size2index_tab:

(gdb) print/x __je_narenas_auto
$2 = 0x0
(gdb) print/x malloc_conf
$3 = 0x0
(gdb) print/x __je_ncpus
$4 = 0x0
(gdb) print/x __je_manual_arena_base
$5 = 0x0
(gdb) print/x __je_sz_pind2sz_tab
$6 = {0x0 <repeats 72 times>}
(gdb) print/x __je_sz_size2index_tab
$7 = {0x0 <repeats 384 times>, 0x1a, 0x1b <repeats 64 times>, 0x1c <repeats 64 times>}

Showing where the zeroing stopped:

. . .
(gdb) x/156x __je_sz_size2index_tab
0x5035be80 <__je_sz_size2index_tab>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5035be90 <__je_sz_size2index_tab+16>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5035bea0 <__je_sz_size2index_tab+32>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5035beb0 <__je_sz_size2index_tab+48>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5035bec0 <__je_sz_size2index_tab+64>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5035bed0 <__je_sz_size2index_tab+80>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5035bee0 <__je_sz_size2index_tab+96>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5035bef0 <__je_sz_size2index_tab+112>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5035bf00 <__je_sz_size2index_tab+128>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5035bf10 <__je_sz_size2index_tab+144>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5035bf20 <__je_sz_size2index_tab+160>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5035bf30 <__je_sz_size2index_tab+176>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5035bf40 <__je_sz_size2index_tab+192>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5035bf50 <__je_sz_size2index_tab+208>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5035bf60 <__je_sz_size2index_tab+224>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5035bf70 <__je_sz_size2index_tab+240>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5035bf80 <__je_sz_size2index_tab+256>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5035bf90 <__je_sz_size2index_tab+272>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5035bfa0 <__je_sz_size2index_tab+288>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5035bfb0 <__je_sz_size2index_tab+304>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5035bfc0 <__je_sz_size2index_tab+320>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5035bfd0 <__je_sz_size2index_tab+336>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5035bfe0 <__je_sz_size2index_tab+352>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5035bff0 <__je_sz_size2index_tab+368>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5035c000 <__je_sz_size2index_tab+384>: 0x1a1b1b1b 0x1b1b1b1b 0x1b1b1b1b 0x1b1b1b1b
0x5035c010 <__je_sz_size2index_tab+400>: 0x1b1b1b1b 0x1b1b1b1b 0x1b1b1b1b 0x1b1b1b1b
0x5035c020 <__je_sz_size2index_tab+416>: 0x1b1b1b1b 0x1b1b1b1b 0x1b1b1b1b 0x1b1b1b1b
0x5035c030 <__je_sz_size2index_tab+432>: 0x1b1b1b1b 0x1b1b1b1b 0x1b1b1b1b 0x1b1b1b1b
0x5035c040 <__je_sz_size2index_tab+448>: 0x1b1c1c1c 0x1c1c1c1c 0x1c1c1c1c 0x1c1c1c1c
0x5035c050 <__je_sz_size2index_tab+464>: 0x1c1c1c1c 0x1c1c1c1c 0x1c1c1c1c 0x1c1c1c1c
0x5035c060 <__je_sz_size2index_tab+480>: 0x1c1c1c1c 0x1c1c1c1c 0x1c1c1c1c 0x1c1c1c1c
0x5035c070 <__je_sz_size2index_tab+496>: 0x1c1c1c1c 0x1c1c1c1c 0x1c1c1c1c 0x1c1c1c1c
0x5035c080 <__je_sz_size2index_tab+512>: 0x1c000000 0x00000000 0x50303474 0x00000000
0x5035c090: 0x00000000 0x00000000 0x00000000 0x00000000
0x5035c0a0: 0x00000000 0x00000000 0x00000000 0x00000000
0x5035c0b0: 0x00000000 0x00000000 0x00000000 0x00000000
0x5035c0c0: 0x00000000 0x00000000 0x00000000 0x00000000
0x5035c0d0: 0x00000000 0x00000000 0x00000000 0x00000000
0x5035c0e0: 0x00000000 0x00000000 0x00000000 0x00000000

Again: a nice page boundary: 0x5035c000 for where the
zeros stop.

Note that, despite the address ranges shifting around
between programs, the same global variables are being
stomped on, with the zeroing stopping at the same index into
__je_sz_size2index_tab in each of the 3 programs. That
stopping point is always page-aligned in my context.
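
For anyone following along, here is a minimal sketch of why a
zeroed lookup table trips this particular assert. This is NOT the
real jemalloc code: the table size, the size-class spacing, and the
compute step are simplified, hypothetical stand-ins.

/* sketch only: simplified stand-ins for the jemalloc names */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint8_t size2index_tab[513];     /* imagine it stomped to all zeros */

/* derives the index arithmetically, so it does not use the table */
static unsigned
size2index_compute(size_t size) {
        unsigned index = 0;
        size_t class = 8;               /* hypothetical smallest class */
        while (class < size) {
                class += 8;             /* hypothetical class spacing */
                index++;
        }
        return index;
}

static unsigned
size2index_lookup(size_t size) {
        unsigned ret = size2index_tab[(size + 7) / 8];  /* 0 once stomped */
        /* with the table zeroed, ret is 0 for every size, but the
           computed value is nonzero for size > 8, so this fails: */
        assert(ret == size2index_compute(size));
        return ret;
}

int
main(void) {
        return (int)size2index_lookup(96);  /* 96, as in svc_xprt_alloc's calloc */
}

The table lookup just returns whatever byte happens to be on the
(now zeroed) page, while the compute path rederives the index from
the size, so the two disagree and abort() follows.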

(The sshd example is different. I've yet to explore it
much.)



===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)


Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311

freebsd-ppc mailing list
[More details for an sshd failure. The other
examples are omitted. The sshd failure also shows
an all-zeros-up-to-a-page-boundary issue, just for
a different address range.]

On 2020-May-7, at 12:06, Mark Millard <[hidden email]> wrote:

>
> [mountd failure example: also at sz_size2index_lookup assert
> for the same zero'd memory problem.]
>
>> On 2020-May-7, at 00:46, Mark Millard <[hidden email]> wrote:
>>
>> [__je_sz_size2index_tab seems messed up in 2 of the
>> asserting contexts: first 384 are zero in both. More
>> before that is also messed up (all zero). I show the
>> details later below.]
>>
>> On 2020-May-6, at 16:57, Mark Millard <marklmi at yahoo.com> wrote:
>>
>>> [This explores process crashes that happen during system
>>> shutdown, in a context not having MALLOC_PRODUCTION= .
>>> So assert failures are reported as the stopping points.]
>>>
>>> It looks like shutdown -p now, shutdown -r now, and the
>>> like can lead some processes to assert during their exit
>>> attempt, including a sshd failure (that I've not seen
>>> before), rpcbind, and nfsd. I show information about the
>>> observed asserts for those below.
>>>
>>>
>>> sshd hit an assert, failing slab == extent_slab_get(extent) :
>>>
>>> (gdb) bt
>>> #0  thr_kill () at thr_kill.S:4
>>> #1  0x50927170 in __raise (s=6) at /usr/src/lib/libc/gen/raise.c:52
>>> #2  0x50886cc0 in abort () at /usr/src/lib/libc/stdlib/abort.c:67
>>> #3  0x508834b0 in arena_dalloc (tsdn=<optimized out>, ptr=<optimized out>, tcache=<optimized out>, alloc_ctx=<optimized out>, slow_path=<optimized out>)
>>>  at /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:315
>>> #4  idalloctm (tsdn=0x500dd040, ptr=0x5008a180, tcache=0x500dd160, alloc_ctx=<optimized out>, is_internal=<optimized out>, slow_path=<optimized out>)
>>>  at /usr/src/contrib/jemalloc/include/jemalloc/internal/jemalloc_internal_inlines_c.h:118
>>> #5  0x5087b0a4 in ifree (tsd=0x500dd040, ptr=0x5008a180, tcache=0x500dd160, slow_path=<optimized out>) at jemalloc_jemalloc.c:2590
>>> #6  0x5087acac in __je_free_default (ptr=0x5008a180) at jemalloc_jemalloc.c:2784
>>> #7  0x5087b294 in __free (ptr=0x5008a180) at jemalloc_jemalloc.c:2852
>>> #8  0x10029464 in server_accept_loop (config_s=<optimized out>, sock_in=<optimized out>, sock_out=<optimized out>, newsock=<optimized out>) at /usr/src/crypto/openssh/sshd.c:1185
>>> #9  main (ac=<optimized out>, av=0xffffde3c) at /usr/src/crypto/openssh/sshd.c:2009
>>>
>>> . . .
>>> (gdb) up
>>> #2  0x50886cc0 in abort () at /usr/src/lib/libc/stdlib/abort.c:67
>>> 67 (void)raise(SIGABRT);
>>> (gdb) up
>>> #3  0x508834b0 in arena_dalloc (tsdn=<optimized out>, ptr=<optimized out>, tcache=<optimized out>, alloc_ctx=<optimized out>, slow_path=<optimized out>)
>>>  at /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:315
>>> 315 assert(slab == extent_slab_get(extent));
>>>
>>> (gdb) list
>>> 310 rtree_ctx = tsd_rtree_ctx(tsdn_tsd(tsdn));
>>> 311 extent_t *extent = rtree_extent_read(tsdn, &extents_rtree,
>>> 312    rtree_ctx, (uintptr_t)ptr, true);
>>> 313 assert(szind == extent_szind_get(extent));
>>> 314 assert(szind < SC_NSIZES);
>>> 315 assert(slab == extent_slab_get(extent));
>>> 316 }
>>> 317
>>> 318 if (likely(slab)) {
>>> 319 /* Small allocation. */
>>>
>>> More fully:
>>>
>>> 285 JEMALLOC_ALWAYS_INLINE void
>>> 286 arena_dalloc(tsdn_t *tsdn, void *ptr, tcache_t *tcache,
>>> 287    alloc_ctx_t *alloc_ctx, bool slow_path) {
>>> 288 assert(!tsdn_null(tsdn) || tcache == NULL);
>>> 289 assert(ptr != NULL);
>>> 290
>>> 291 if (unlikely(tcache == NULL)) {
>>> 292 arena_dalloc_no_tcache(tsdn, ptr);
>>> 293 return;
>>> 294 }
>>> 295
>>> 296 szind_t szind;
>>> 297 bool slab;
>>> 298 rtree_ctx_t *rtree_ctx;
>>> 299 if (alloc_ctx != NULL) {
>>> 300 szind = alloc_ctx->szind;
>>> 301 slab = alloc_ctx->slab;
>>> 302 assert(szind != SC_NSIZES);
>>> 303 } else {
>>> 304 rtree_ctx = tsd_rtree_ctx(tsdn_tsd(tsdn));
>>> 305 rtree_szind_slab_read(tsdn, &extents_rtree, rtree_ctx,
>>> 306    (uintptr_t)ptr, true, &szind, &slab);
>>> 307 }
>>> 308
>>> 309 if (config_debug) {
>>> 310 rtree_ctx = tsd_rtree_ctx(tsdn_tsd(tsdn));
>>> 311 extent_t *extent = rtree_extent_read(tsdn, &extents_rtree,
>>> 312    rtree_ctx, (uintptr_t)ptr, true);
>>> 313 assert(szind == extent_szind_get(extent));
>>> 314 assert(szind < SC_NSIZES);
>>> 315 assert(slab == extent_slab_get(extent));
>>> 316 }
>>> 317
>>> 318 if (likely(slab)) {
>>> 319 /* Small allocation. */
>>> 320 tcache_dalloc_small(tsdn_tsd(tsdn), tcache, ptr, szind,
>>> 321    slow_path);
>>> 322 } else {
>>> 323 arena_dalloc_large(tsdn, ptr, tcache, szind, slow_path);
>>> 324 }
>>> 325 }
>>
>> . . .

The machine code for:

309 if (config_debug) {
310 rtree_ctx = tsd_rtree_ctx(tsdn_tsd(tsdn));
311 extent_t *extent = rtree_extent_read(tsdn, &extents_rtree,
312    rtree_ctx, (uintptr_t)ptr, true);
313 assert(szind == extent_szind_get(extent));
314 assert(szind < SC_NSIZES);
315 assert(slab == extent_slab_get(extent));
316 }

was dropping the address in "extent" in the next instruction
after finding it, replacing it with a field's value.
So by the time the status of the assert could be known,
examining extent was difficult.

So I touched the source code to force the address to be
kept and to give a place to breakpoint on failure before
calling another routine:

# svnlite diff /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h
Index: /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h
===================================================================
--- /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h (revision 360322)
+++ /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h (working copy)
@@ -308,11 +308,11 @@
 
  if (config_debug) {
  rtree_ctx = tsd_rtree_ctx(tsdn_tsd(tsdn));
- extent_t *extent = rtree_extent_read(tsdn, &extents_rtree,
+ extent_t * volatile extent = rtree_extent_read(tsdn, &extents_rtree,
     rtree_ctx, (uintptr_t)ptr, true);
  assert(szind == extent_szind_get(extent));
  assert(szind < SC_NSIZES);
- assert(slab == extent_slab_get(extent));
+ assert((slab == extent_slab_get(extent)) ?true :extent==NULL);
  }
 
  if (likely(slab)) {


The ":extent==NULL" should be guaranteed to produce :false as a result
but with more code involved to get there. It gave me a place to breakpoint
on failure.
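
(For reference, a tiny standalone illustration of the volatile-local
trick, not jemalloc code: the names here are made up. Marking the
local pointer volatile keeps its last value materialized in memory,
so the debugger or a core dump can still show it when the assert
fires, instead of reporting <optimized out>.)

#include <assert.h>
#include <stddef.h>

struct thing { int flag; };

/* hypothetical lookup standing in for rtree_extent_read() */
static struct thing *
lookup_thing(void) {
        static struct thing t = { 0 };
        return &t;
}

int
main(void) {
        /* without volatile the compiler may discard the pointer right
           after reading t->flag; with volatile it must keep it live */
        struct thing * volatile t = lookup_thing();
        assert((t->flag == 1) ? 1 : t == NULL); /* same shape as the tweak above */
        return 0;
}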

(gdb) bt -full
#0  0x50883258 in arena_dalloc (tsdn=<optimized out>, ptr=<optimized out>, tcache=<optimized out>, alloc_ctx=<optimized out>, slow_path=<optimized out>)
    at /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:315
        extent = 0x51007fc0
        slab = <optimized out>
        szind = <optimized out>
        rtree_ctx = 0x500dd06c
        extent = <optimized out>
#1  idalloctm (tsdn=0x500dd040, ptr=0x5008a180, tcache=0x500dd160, alloc_ctx=<optimized out>, is_internal=<optimized out>, slow_path=<optimized out>)
    at /usr/src/contrib/jemalloc/include/jemalloc/internal/jemalloc_internal_inlines_c.h:118
No locals.
#2  0x5087b0e4 in ifree (tsd=0x500dd040, ptr=0x5008a180, tcache=0x500dd160, slow_path=<optimized out>) at jemalloc_jemalloc.c:2590
        alloc_ctx = {szind = 0, slab = true}
        rtree_ctx = <optimized out>
        usize = <optimized out>
#3  0x5087acec in __je_free_default (ptr=0x5008a180) at jemalloc_jemalloc.c:2784
        tcache = 0x500dd160
        tsd = <optimized out>
#4  0x5087b2d4 in __free (ptr=0x5008a180) at jemalloc_jemalloc.c:2852
        log_var = <optimized out>
        log_var = <optimized out>
#5  0x10029464 in server_accept_loop (config_s=<optimized out>, sock_in=<optimized out>, sock_out=<optimized out>, newsock=<optimized out>) at /usr/src/crypto/openssh/sshd.c:1185
        from = {ss_len = 16 '\020', ss_family = 2 '\002', __ss_pad1 = "\317\200\300\250\001\031", __ss_align = 0,
          __ss_pad2 = "\000\000\000\000\000\000\000\002Am", '\000' <repeats 19 times>, "\065\341I\000\000\000\000^\264\234\331", '\000' <repeats 12 times>, "^\262\027\034-a\241H\000\000\000\000\000\000\000\000^\264\235Y,\024\247\030\000\000\000\000\000\000\000\000V\312\331f6-N\370\000\000\000\000\000\000\000\000\000\000\002\000\000\000\000\000\000\000\000\b"}
        rnd = '\000' <repeats 255 times>
        startup_p = {6, 7}
        startups = 1
        i = <optimized out>
        maxfd = 6
        fdset = <optimized out>
        fromlen = <optimized out>
        ret = <optimized out>
        j = <optimized out>
        pid = <optimized out>
        laddr = <optimized out>
        raddr = <optimized out>
#6  main (ac=<optimized out>, av=0xffffde3c) at /usr/src/crypto/openssh/sshd.c:2009
        config_s = {8, 9}
        on = 1
        ssh = 0x0
        logfile = <optimized out>
        newsock = -1
        sock_out = -1
        sock_in = -1
        connection_info = <optimized out>
        i = <optimized out>
        opt = <optimized out>
        line = <optimized out>
        r = <optimized out>
        key = <optimized out>
        pubkey = <optimized out>
        keytype = <optimized out>
        fp = <optimized out>
        j = <optimized out>
        new_umask = <optimized out>
        already_daemon = <optimized out>
        remote_port = <optimized out>
        remote_ip = <optimized out>
        rdomain = <optimized out>
--Type <RET> for more, q to quit, c to continue without paging--
        laddr = <optimized out>
        authctxt = <optimized out>
        obytes = <optimized out>
        ibytes = <optimized out>

(gdb) list
310 rtree_ctx = tsd_rtree_ctx(tsdn_tsd(tsdn));
311 extent_t * volatile extent = rtree_extent_read(tsdn, &extents_rtree,
312    rtree_ctx, (uintptr_t)ptr, true);
313 assert(szind == extent_szind_get(extent));
314 assert(szind < SC_NSIZES);
315 assert((slab == extent_slab_get(extent)) ?true :extent==NULL);
316 }
317
318 if (likely(slab)) {
319 /* Small allocation. */

(gdb) print/x extent
$6 = 0x51007fc0

(gdb) print/x *extent
$2 = {e_bits = 0x0, e_addr = 0x0, {e_size_esn = 0x0, e_bsize = 0x0}, ql_link = {qre_next = 0x0, qre_prev = 0x0}, ph_link = {phn_prev = 0x0, phn_next = 0x0, phn_lchild = 0x0}, {e_slab_data = {
      bitmap = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xfff8}}, {e_alloc_time = {
        ns = 0x0}, e_prof_tctx = {repr = 0x0}}}}

It looks like the prefix of the above has been stomped on
and replaced with zeros. Checking the addresses as well:

(gdb) x/64x extent
0x51007fc0: 0x00000000 0x00000000 0x00000000 0x00000000
0x51007fd0: 0x00000000 0x00000000 0x00000000 0x00000000
0x51007fe0: 0x00000000 0x00000000 0x00000000 0x00000000
0x51007ff0: 0x00000000 0x00000000 0x00000000 0x00000000
0x51008000: 0xffffffff 0xffffffff 0xffffffff 0xffffffff
0x51008010: 0xffffffff 0xffffffff 0xffffffff 0xffffffff
0x51008020: 0xffffffff 0xffffffff 0x0000fff8 0x00000000
0x51008030: 0x00000000 0x00000000 0x00000000 0x00000000
0x51008040: 0x00000800 0x0014f000 0x5008b000 0x00005000
0x51008050: 0x51008040 0x51008040 0x00000000 0x00000000
0x51008060: 0x00000000 0x00000000 0x00000000 0x00000000
0x51008070: 0x00000000 0x00000000 0x00000000 0x00000000
0x51008080: 0x00000000 0x00000000 0x00000000 0x00000000
0x51008090: 0x00000000 0x00000000 0x00000000 0x00000000
0x510080a0: 0x00000000 0x00000000 0x00000000 0x00000000
0x510080b0: 0x00000000 0x00000000 0x00000000 0x00000000

Note that the non-zero values start at: 0x51008000, again a page
boundary, like the other examples in prior notes for other cases.
This appears to be another example of memory having been stomped
on/replaced that likely previously had the intended values.

The first prior non-zero values are at:

0x51005a30: 0x00000000 0x51008740 0x0000000f 0x01000000
0x51005a40: 0x51008740 0x0000000f 0x01000000 0x51008740
0x51005a50: 0x0000000f 0x01000000 0x51008740 0x0000000f
0x51005a60: 0x01000000 0x51008740 0x0000000f 0x01000000
0x51005a70: 0x51008740 0x0000000f 0x01000000 0x51008740
0x51005a80: 0x0000000f 0x01000000 0x51008940 0x00000014
0x51005a90: 0x01000000 0x510089c0 0x00000014 0x01000000
0x51005aa0: 0x51008a40 0x00000018 0x01000000 0x51008ac0
0x51005ab0: 0x00000018 0x01000000 0x51008b40 0x00000018
0x51005ac0: 0x01000000 0x51008bc0 0x00000018 0x01000000
0x51005ad0: 0x51008c40 0x00000018 0x01000000 0x51008cc0
0x51005ae0: 0x00000001 0x01000000 0x00000000 0x00000000

So the pages for address ranges 0x51006yyy and 0x51007yyy seem to be
all-zero, along with the tail of the page for the range 0x51005yyy.

#2 (ifree) in the backtrace shows the alloc_ctx content:

         alloc_ctx = {szind = 0, slab = true}

So slab==true but the bit in extent->e_bits==false, leading to
the failed assert.

Before going to sleep for the night, I could ssh into the
old PowerMac without triggering this assert. After getting up, trying
the same got the failure detection. I did not have the machine
doing anything else special in between.

The other examples in the other programs are similar: just
waiting long enough, with normal background processing going
on, eventually leads the next explicit use (or exit) to
detect the problem.

I'm still nowhere near identifying, in code terms, when the
stomped-on memory range is trashed with zeros in any example
program.

But sshd, dhclient, rpcbind, nfsd, and sendmail all seem to
have some common subject area(s) involved in their
implementations. So I suspect something common across those
programs is essentially involved.


For reference:

In #0:

(gdb) print/x rtree_ctx
$5 = 0x500dd06c

(gdb) print/x *rtree_ctx
$4 = {cache = {{leafkey = 0x50000000, leaf = 0x51004fc0}, {leafkey = 0x1, leaf = 0x0}, {leafkey = 0x1, leaf = 0x0}, {leafkey = 0x50c00000, leaf = 0x51008dc0}, {leafkey = 0x1,
      leaf = 0x0} <repeats 12 times>}, l2_cache = {{leafkey = 0x1, leaf = 0x0}, {leafkey = 0x1, leaf = 0x0}, {leafkey = 0x1, leaf = 0x0}, {leafkey = 0x1, leaf = 0x0}, {leafkey = 0x1, leaf = 0x0}, {
      leafkey = 0x1, leaf = 0x0}, {leafkey = 0x1, leaf = 0x0}, {leafkey = 0x1, leaf = 0x0}}}



===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)


Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311

freebsd-ppc mailing list
In reply to this post by freebsd-ppc mailing list
[I caused nfsd to have things shifted in memory some to
see if it tracked content vs. page boundary for where the
zeros stop. Non-nfsd examples omitted.]

> . . .
>> nfsd hit an assert, failing ret == sz_size2index_compute(size)
>
> [Correction: That should have referenced sz_index2size_lookup(index).]
>
>> (also, but a different caller of sz_size2index):
>
> [Correction: The "also" comment should be ignored:
> sz_index2size_lookup(index) is referenced below.]
>
>>
>> (gdb) bt
>> #0  thr_kill () at thr_kill.S:4
>> #1  0x502b2170 in __raise (s=6) at /usr/src/lib/libc/gen/raise.c:52
>> #2  0x50211cc0 in abort () at /usr/src/lib/libc/stdlib/abort.c:67
>> #3  0x50206104 in sz_index2size_lookup (index=<optimized out>) at /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:200
>> #4  sz_index2size (index=<optimized out>) at /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:207
>> #5  ifree (tsd=0x50094018, ptr=0x50041028, tcache=0x50094138, slow_path=<optimized out>) at jemalloc_jemalloc.c:2583
>> #6  0x50205cac in __je_free_default (ptr=0x50041028) at jemalloc_jemalloc.c:2784
>> #7  0x50206294 in __free (ptr=0x50041028) at jemalloc_jemalloc.c:2852
>> #8  0x50287ec8 in ns_src_free (src=0x50329004, srclistsize=<optimized out>) at /usr/src/lib/libc/net/nsdispatch.c:452
>> #9  ns_dbt_free (dbt=0x50329000) at /usr/src/lib/libc/net/nsdispatch.c:436
>> #10 vector_free (vec=0x50329000, count=<optimized out>, esize=12, free_elem=<optimized out>) at /usr/src/lib/libc/net/nsdispatch.c:253
>> #11 nss_atexit () at /usr/src/lib/libc/net/nsdispatch.c:578
>> #12 0x5028d958 in __cxa_finalize (dso=0x0) at /usr/src/lib/libc/stdlib/atexit.c:240
>> #13 0x502117f8 in exit (status=0) at /usr/src/lib/libc/stdlib/exit.c:74
>> #14 0x10013f9c in child_cleanup (signo=<optimized out>) at /usr/src/usr.sbin/nfsd/nfsd.c:969
>> #15 <signal handler called>
>> #16 0x00000000 in ?? ()
>>
>> (gdb) up 3
>> #3  0x50206104 in sz_index2size_lookup (index=<optimized out>) at /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:200
>> 200 assert(ret == sz_index2size_compute(index));
>>
>> (ret is optimized out.)
>>
>> 197 JEMALLOC_ALWAYS_INLINE size_t
>> 198 sz_index2size_lookup(szind_t index) {
>> 199 size_t ret = (size_t)sz_index2size_tab[index];
>> 200 assert(ret == sz_index2size_compute(index));
>> 201 return ret;
>> 202 }
>
> (gdb) print/x __je_sz_index2size_tab
> $3 = {0x0 <repeats 104 times>}
>
> Also:
>
> (gdb) x/4x __je_arenas+16368/4
> 0x5030cab0 <__je_arenas+16368>: 0x00000000 0x00000000 0x00000000 0x00000000
> (gdb) print/x __je_arenas_lock                                                                                                                
> $8 = {{{prof_data = {tot_wait_time = {ns = 0x0}, max_wait_time = {ns = 0x0}, n_wait_times = 0x0, n_spin_acquired = 0x0, max_n_thds = 0x0, n_waiting_thds = {repr = 0x0}, n_owner_switches = 0x0,
>       prev_owner = 0x0, n_lock_ops = 0x0}, lock = 0x0, postponed_next = 0x0, locked = {repr = 0x0}}}, witness = {name = 0x0, rank = 0x0, comp = 0x0, opaque = 0x0, link = {qre_next = 0x0,
>     qre_prev = 0x0}}, lock_order = 0x0}
> (gdb) print/x __je_narenas_auto
> $9 = 0x0
> (gdb) print/x malloc_conf      
> $10 = 0x0
> (gdb) print/x __je_ncpus
> $11 = 0x0
> (gdb) print/x __je_manual_arena_base
> $12 = 0x0
> (gdb) print/x __je_sz_pind2sz_tab  
> $13 = {0x0 <repeats 72 times>}
> (gdb) print/x __je_sz_size2index_tab
> $1 = {0x0 <repeats 384 times>, 0x1a, 0x1b <repeats 64 times>, 0x1c <repeats 64 times>}
>
>> Booting and immediately trying something like:
>>
>> service nfsd stop
>>
>> did not lead to a failure. But may be after
>> a while it would and be less drastic than a
>> reboot or power down.
>
> More detail:
>
> So, for rpcbind and nfsd at some point a large part of
> __je_sz_size2index_tab is being stomped on, as is all of
> __je_sz_index2size_tab and more.
>
> . . .
>
> For nfsd, it is similar (again showing the partially
> non-zero live process context instead of the all-zeros
> from the .core file):
>
> 0x5030cab0 <__je_arenas+16368>: 0x00000000 0x00000000 0x00000000 0x00000009
> 0x5030cac0 <__je_arenas_lock>: 0x00000000 0x00000000 0x00000000 0x00000000
> 0x5030cad0 <__je_arenas_lock+16>: 0x00000000 0x00000000 0x00000000 0x00000000
> 0x5030cae0 <__je_arenas_lock+32>: 0x00000000 0x00000000 0x00000000 0x00000000
> 0x5030caf0 <__je_arenas_lock+48>: 0x00000000 0x00000000 0x00000000 0x00000000
> 0x5030cb00 <__je_arenas_lock+64>: 0x00000000 0x502ff070 0x00000000 0x00000000
> 0x5030cb10 <__je_arenas_lock+80>: 0x500ebb04 0x00000003 0x00000000 0x00000000
> 0x5030cb20 <__je_arenas_lock+96>: 0x5030cb10 0x5030cb10 0x00000000 0x00000000
>
> Then the memory in the crash continues to be zero until:
>
> 0x5030d000 <__je_sz_size2index_tab+384>: 0x1a1b1b1b 0x1b1b1b1b 0x1b1b1b1b 0x1b1b1b1b
>
> Notice the interesting page boundary for where non-zero
> is first available again!
>
> Between __je_arenas_lock and __je_sz_size2index_tab are:
>
> 0x5030cb30 __je_narenas_auto
> 0x5030cb38 malloc_conf
> 0x5030cb3c __je_ncpus
> 0x5030cb40 __je_manual_arena_base
> 0x5030cb80 __je_sz_pind2sz_tab
> 0x5030ccc0 __je_sz_index2size_tab
> 0x5030ce80 __je_sz_size2index_tab
>
>
> Note: because __je_arenas is normally
> mostly zero for these contexts, I can
> not tell where the memory trashing
> started, only where it replaced non-zero
> values with zeros.
> . . .

I caused the memory content to be shifted some in nfsd.
The resultant zeros-stop-at location from the failure looks like:

(gdb) x/128x __je_sz_size2index_tab
0x5030cf00 <__je_sz_size2index_tab>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5030cf10 <__je_sz_size2index_tab+16>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5030cf20 <__je_sz_size2index_tab+32>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5030cf30 <__je_sz_size2index_tab+48>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5030cf40 <__je_sz_size2index_tab+64>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5030cf50 <__je_sz_size2index_tab+80>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5030cf60 <__je_sz_size2index_tab+96>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5030cf70 <__je_sz_size2index_tab+112>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5030cf80 <__je_sz_size2index_tab+128>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5030cf90 <__je_sz_size2index_tab+144>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5030cfa0 <__je_sz_size2index_tab+160>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5030cfb0 <__je_sz_size2index_tab+176>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5030cfc0 <__je_sz_size2index_tab+192>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5030cfd0 <__je_sz_size2index_tab+208>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5030cfe0 <__je_sz_size2index_tab+224>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5030cff0 <__je_sz_size2index_tab+240>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5030d000 <__je_sz_size2index_tab+256>: 0x18191919 0x19191919 0x19191919 0x19191919
0x5030d010 <__je_sz_size2index_tab+272>: 0x19191919 0x19191919 0x19191919 0x19191919
0x5030d020 <__je_sz_size2index_tab+288>: 0x19191919 0x19191919 0x19191919 0x19191919
0x5030d030 <__je_sz_size2index_tab+304>: 0x19191919 0x19191919 0x19191919 0x19191919
0x5030d040 <__je_sz_size2index_tab+320>: 0x191a1a1a 0x1a1a1a1a 0x1a1a1a1a 0x1a1a1a1a
0x5030d050 <__je_sz_size2index_tab+336>: 0x1a1a1a1a 0x1a1a1a1a 0x1a1a1a1a 0x1a1a1a1a
0x5030d060 <__je_sz_size2index_tab+352>: 0x1a1a1a1a 0x1a1a1a1a 0x1a1a1a1a 0x1a1a1a1a
0x5030d070 <__je_sz_size2index_tab+368>: 0x1a1a1a1a 0x1a1a1a1a 0x1a1a1a1a 0x1a1a1a1a
0x5030d080 <__je_sz_size2index_tab+384>: 0x1a1b1b1b 0x1b1b1b1b 0x1b1b1b1b 0x1b1b1b1b
0x5030d090 <__je_sz_size2index_tab+400>: 0x1b1b1b1b 0x1b1b1b1b 0x1b1b1b1b 0x1b1b1b1b
0x5030d0a0 <__je_sz_size2index_tab+416>: 0x1b1b1b1b 0x1b1b1b1b 0x1b1b1b1b 0x1b1b1b1b
0x5030d0b0 <__je_sz_size2index_tab+432>: 0x1b1b1b1b 0x1b1b1b1b 0x1b1b1b1b 0x1b1b1b1b
0x5030d0c0 <__je_sz_size2index_tab+448>: 0x1b1c1c1c 0x1c1c1c1c 0x1c1c1c1c 0x1c1c1c1c
0x5030d0d0 <__je_sz_size2index_tab+464>: 0x1c1c1c1c 0x1c1c1c1c 0x1c1c1c1c 0x1c1c1c1c
0x5030d0e0 <__je_sz_size2index_tab+480>: 0x1c1c1c1c 0x1c1c1c1c 0x1c1c1c1c 0x1c1c1c1c
0x5030d0f0 <__je_sz_size2index_tab+496>: 0x1c1c1c1c 0x1c1c1c1c 0x1c1c1c1c 0x1c1c1c1c

So, it is the page boundary that the zeroing tracks, not the detailed
placement of the memory contents.


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)


Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311

freebsd-ppc mailing list
[A new kind of experiment and partial results.]

Given the zero'ed memory page(s) that for some of
the example contexts include a page that should not
be changing after initialization in my context
(jemalloc global variables), I have attempted the
following for such examples:

A) Run gdb
B) Attach to one of the live example processes.
C) Check that the page is not zeroed yet.
   (print/x __je_sz_size2index_tab)
D) protect the page containing the start
   of __je_sz_size2index_tab, using 0x1
   as the PROT_READ mask.
   (print (int)mprotect(ADDRESS,1,0x1))
E) detach.

The hope was to discover which of the following
was involved:

A) user-space code trying to write the page should
   get a SIGSEGV. In this case I'd likely be able
   to see what code was attempting the write.

B) kernel code doing something odd to the content
   or mapping of memory would not (or need not)
   lead to SIGSEGV. In this case I'd be unlikely
   to see what code led to the zeros on the page.
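
(For reference, a rough standalone sketch of what the gdb-driven
mprotect() amounts to. Assumptions: the page size comes from
sysconf(), and the static "table" array here is just a stand-in for
__je_sz_size2index_tab in the target process. The gdb form
mprotect(ADDRESS,1,0x1) passes an unaligned address and a 1-byte
length and relies on the kernel covering the containing page; the
sketch aligns explicitly.)

#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int
main(void) {
        static char table[513];         /* stand-in for the jemalloc table */
        uintptr_t addr = (uintptr_t)table;
        long pagesize = sysconf(_SC_PAGESIZE);

        /* round down to the start of the containing page */
        uintptr_t page = addr & ~((uintptr_t)pagesize - 1);

        if (mprotect((void *)page, (size_t)pagesize, PROT_READ) != 0) {
                perror("mprotect");
                return 1;
        }

        table[0] = 1;   /* a user-space write should now die with SIGSEGV */
        printf("unexpected: the write did not fault\n");
        return 0;
}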

So far I've gotten only one failure example, nfsd
during its handling of a SIGUSR1. Previous nfs
mounts and dismounts worked fine, not asserting,
indicating that at the time the page was not
zeroed.

I got no evidence of a SIGSEGV from an attempted user-space
write to the page. But the nfsd.core shows the
page as zeroed and the assert having caused abort().
That suggests the kernel side of things is what
leads to the zeros.

It turns out that just before the "unregistration()"
activity is "killchildren()" activity:

(gdb) list
971
972 static void
973 nfsd_exit(int status)
974 {
975 killchildren();
976 unregistration();
977 exit(status);
978 }

(frame #12) used via:

(gdb) list cleanup
954 /*
955 * Cleanup master after SIGUSR1.
956 */
957 static void
958 cleanup(__unused int signo)
959 {
960 nfsd_exit(0);
961 }
. . .

and (for master):

        (void)signal(SIGUSR1, cleanup);

This suggests the possibility that the zero'd
pages could be associated with killing the
child processes. (I've had a past aarch64
context where forking had problems with
pages that were initially common to parent
and child processes. In that context having
the processes swap out [not just mostly
paged out] and then swap back in was
involved in showing the problem. The issue
was fixed and was aarch64 specific. But it
leaves me willing to consider fork-related
memory management as possibly odd in some
way for 32-bit powerpc.)
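
In case it helps anyone think about that hypothesis, here is a
hypothetical user-level check (a sketch only: it is not known to
reproduce anything, and since the old aarch64 issue needed
swap-out/swap-in to show up, a simple run like this likely would
not be sufficient without also forcing paging):

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

#define NCHILD 4
#define NBYTES (8 * 1024 * 1024)

int
main(void) {
        char *buf = malloc(NBYTES);
        pid_t kids[NCHILD];

        if (buf == NULL)
                return 1;
        memset(buf, 0x5a, NBYTES);      /* parent's pattern */

        for (int i = 0; i < NCHILD; i++) {
                pid_t pid = fork();
                if (pid == 0) {
                        memset(buf, 0xa5, NBYTES);  /* child dirties its COW copy */
                        pause();                    /* wait to be killed */
                        _exit(0);
                }
                kids[i] = pid;
        }
        sleep(1);
        for (int i = 0; i < NCHILD; i++)
                kill(kids[i], SIGKILL);
        while (wait(NULL) > 0)
                ;
        for (size_t i = 0; i < NBYTES; i++) {
                if (buf[i] != 0x5a) {
                        printf("parent data changed at offset %zu\n", i);
                        return 1;
                }
        }
        printf("parent data intact\n");
        return 0;
}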


Notes . . .

Another possible kind of evidence: I've gone far
longer with the machine doing just normal background
processing with nothing failing on its own. This
suggests that the (int)mprotect(ADDRESS,1,0x1) might
be changing the context --or that just doing the attach
and detach in gdb does. I've nothing solid in this
area, so I'll ignore it other than this note.



===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)


Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311

freebsd-ppc mailing list
[Yet another new kind of experiment. But this looks
like I can cause problems in fairly short order on
demand now. Finally! And with that I've much better
evidence for kernel vs. user-space process as to what makes
the zeroed memory appear in, for example, nfsd.]

I've managed to get:

<jemalloc>: /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:258: Failed assertion: "slab == extent_slab_get(extent)"
<jemalloc>: /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:258: Failed assertion: "slab == extent_slab_get(extent)"

and eventually:

[1]   Segmentation fault (core dumped) stress -m 2 --vm-bytes 1700M

from a user program (stress) while another machine was attempting an nfs
mount during the stress activity:

# mount -onoatime,soft ...:/ /mnt && umount /mnt && rpcinfo -s ...
[tcp] ...:/: RPCPROG_MNT: RPC: Timed out

(To get failure I may have to run the commands
multiple times. Timing details against stress's
activity seem to matter.)

That failure lead to:

# ls -ldT /*.core*
-rw-------  1 root  wheel  3899392 May 12 19:52:26 2020 /mountd.core

# ls -ldT *.core*
-rw-------  1 root  wheel  2682880 May 12 20:00:26 2020 stress.core

(Note: which of nfsd, mountd, or rpcbind fails need not be
fully repeatable. stress.core seems to be written twice,
probably because of the -m 2 in use.)

The context that let me do this was to first run (on the 2-socket
G4 with a full 2048 MiByte RAM configuration):

stress -m 2 --vm-bytes 1700M &

Note that the stress command keeps the memory busy
and causes paging to the swap/page space. I've not
tried to make it just fit without paging, or just
barely page, or the like. The original context did not
involve paging or low RAM, so I do not expect paging
to be required, but it can be involved.
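
(For reference, a rough sketch of what each stress vm worker is
doing, based on the values visible in the backtrace below
[do_vm_stride = 4096, do_vm_keep = 0]; this is a simplification,
not the actual stress source: it allocates the requested amount and
keeps dirtying one byte per page, which is what forces the paging
here.)

#include <stdio.h>
#include <stdlib.h>

int
main(void) {
        size_t bytes  = (size_t)1700 * 1024 * 1024;     /* --vm-bytes 1700M */
        size_t stride = 4096;                           /* do_vm_stride */

        for (;;) {
                char *p = malloc(bytes);
                if (p == NULL) {
                        perror("malloc");
                        return 1;
                }
                for (size_t i = 0; i < bytes; i += stride)
                        p[i] = 'Z';     /* touch (dirty) every page */
                free(p);                /* then release and start over */
        }
        /* not reached */
}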

The stress program backtrace is different:

4827 return (tls_get_addr_slow(dtvp, index, offset));
4828 }
(gdb) bt -full
#0  0x41831b04 in tls_get_addr_common (dtvp=0x4186c010, index=2, offset=4294937444) at /usr/src/libexec/rtld-elf/rtld.c:4824
        dtv = 0x0
#1  0x4182bfcc in __tls_get_addr (ti=<optimized out>) at /usr/src/libexec/rtld-elf/powerpc/reloc.c:848
        tp = <optimized out>
        p = <optimized out>
#2  0x41a83464 in __get_locale () at /usr/src/lib/libc/locale/xlocale_private.h:199
No locals.
#3  fprintf (fp=0x41b355f8, fmt=0x1804cbc "%s: FAIL: [%lli] (%d) ") at /usr/src/lib/libc/stdio/fprintf.c:57
        ap = {{gpr = 2 '\002', fpr = 0 '\000', reserved = 20731, overflow_arg_area = 0xffffdb78, reg_save_area = 0xffffdae8}}
        ret = <optimized out>
#4  0x01802348 in main (argc=<optimized out>, argv=<optimized out>) at stress.c:415
        status = <optimized out>
        ret = 6
        do_dryrun = 0
        retval = 0
        children = 1
        do_backoff = <optimized out>
        do_hdd_bytes = <optimized out>
        do_hdd = <optimized out>
        do_vm_keep = 0
        do_vm_hang = -1
        do_vm_stride = 4096
        do_vm_bytes = 1782579200
        do_vm = 108174317627375616
        do_io = <optimized out>
        do_cpu = <optimized out>
        do_timeout = 108176117243859333
        starttime = 1589338322
        i = <optimized out>
        forks = <optimized out>
        pid = 6140
        stoptime = <optimized out>
        runtime = <optimized out>

Apparently the asserts did not stop the code
and it ran until a failure occurred (via
dtv=0x0).

Stress uses a mutex stored on a page that gets
the "turns into zeros" problem, preventing
the mprotect(ADDR,1,1) type of test because
stress will write on the page. (I've not tried
to find a minimal form of stress run.)

The same sort of globals are again
zeroed, such as:

(gdb) print/x __je_sz_size2index_tab
$1 = {0x0 <repeats 513 times>}


Another attempt lost rpcbind instead of
mountd:

# ls -ldT /*.core
-rw-------  1 root  wheel  3899392 May 12 19:52:26 2020 /mountd.core
-rw-------  1 root  wheel  3170304 May 12 20:03:00 2020 /rpcbind.core


I again find that when I use gdb 3 times
to:

attach ???
x/x __je_sz_size2index_tab
print (int)mprotect(ADDRESS,1,1)
quit

for each of rpcbind, mountd, and the nfsd master,
those processes no longer fail during the
mount/umount/rpcinfo sequence (or are far less likely to).

But it turns out that later, when I "service nfsd
stop", nfsd does get the zeroed-memory based assert
and core dumps. (I'd done a bunch of the mount/umount/
rpcinfo sequences before the stop.)

That the failure is during the SIGUSR1 based shutdown
leads me to wonder if killing off some child
process(es) is involved in the problem.

There was *no* evidence of a signal for an attempt
to write the page from the user process. It appears
that the kernel is doing something that changes what
the process sees --instead of the user-space program
stomping on its own memory content.

I've no clue how to track down the kernel activity
that changes what the process sees on some page(s)
of memory.

(Prior testing with a debug kernel did not report
problems, despite getting an example failure. So
that seems insufficient.)

At least a procedure is now known that does not
involve waiting hours or days.


The procedure (adjusted for how much RAM is present
and number of cpus/cores?) could be appropriate to
run in other contexts than the 32-bit powerpc G4.
Part of the context likely should be not using
MALLOC_PRODUCTION --so problems would be detected
sooner via the asserts in jemalloc.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)


Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311

freebsd-ppc mailing list
[stress alone is sufficient to have jemalloc asserts fail
in stress, no need for a multi-socket G4 either. No need
to involve nfsd, mountd, rpcbind or the like. This is not
a claim that I know all the problems to be the same, just
that a jemalloc reported failure in this simpler context
happens and zeroed pages are involved.]

Reminder: head -r360311 based context.


First I show a single CPU/core PowerMac G4 context failing
in stress. (I actually did this later, but it is the
simpler context.) I simply moved the media from the
2-socket G4 to this slower, single-cpu/core one.

cpu0: Motorola PowerPC 7400 revision 2.9, 466.42 MHz
cpu0: Features 9c000000<PPC32,ALTIVEC,FPU,MMU>
cpu0: HID0 8094c0a4<EMCP,DOZE,DPM,EIEC,ICE,DCE,SGE,BTIC,BHT>
real memory  = 1577857024 (1504 MB)
avail memory = 1527508992 (1456 MB)

# stress -m 1 --vm-bytes 1792M
stress: info: [1024] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd
<jemalloc>: /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:258: Failed assertion: "slab == extent_slab_get(extent)"
stress: FAIL: [1024] (415) <-- worker 1025 got signal 6
stress: WARN: [1024] (417) now reaping child worker processes
stress: FAIL: [1024] (451) failed run completed in 243s

(Note: 1792 is the biggest it allowed with M.)

The following still pages in and out and fails:

# stress -m 1 --vm-bytes 1290M
stress: info: [1163] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd
<jemalloc>: /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:258: Failed assertion: "slab == extent_slab_get(extent)"
. . .

By contrast, the following had no problem for as
long as I let it run --and did not page in or out:

# stress -m 1 --vm-bytes 1280M
stress: info: [1181] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd




The 2 socket PowerMac G4 context with 2048 MiByte of RAM . . .

stress -m 1 --vm-bytes 1792M

did not (quickly?) fail or page. 1792
is as large as it would allow with M.

The following also did not (quickly?) fail
(and were not paging):

stress -m 2 --vm-bytes 896M
stress -m 4 --vm-bytes 448M
stress -m 8 --vm-bytes 224M

(Only 1 example was run at a time.)

But the following all did quickly fail (and were
paging):

stress -m 8 --vm-bytes 225M
stress -m 4 --vm-bytes 449M
stress -m 2 --vm-bytes 897M

(Only 1 example was run at a time.)

I'll note that when I exited an su process
I ended up with a:

<jemalloc>: /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:200: Failed assertion: "ret == sz_index2size_compute(index)"
Abort trap (core dumped)

and a matching su.core file. It appears
that stress's activity leads to other
processes also seeing examples of the
zeroed-page(s) problem (probably su had
paged some or had been fully swapped
out):

(gdb) bt
#0  thr_kill () at thr_kill.S:4
#1  0x503821d0 in __raise (s=6) at /usr/src/lib/libc/gen/raise.c:52
#2  0x502e1d20 in abort () at /usr/src/lib/libc/stdlib/abort.c:67
#3  0x502d6144 in sz_index2size_lookup (index=<optimized out>) at /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:200
#4  sz_index2size (index=<optimized out>) at /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:207
#5  ifree (tsd=0x5008b018, ptr=0x50041460, tcache=0x5008b138, slow_path=<optimized out>) at jemalloc_jemalloc.c:2583
#6  0x502d5cec in __je_free_default (ptr=0x50041460) at jemalloc_jemalloc.c:2784
#7  0x502d62d4 in __free (ptr=0x50041460) at jemalloc_jemalloc.c:2852
#8  0x501050cc in openpam_destroy_chain (chain=0x50041480) at /usr/src/contrib/openpam/lib/libpam/openpam_load.c:113
#9  0x50105094 in openpam_destroy_chain (chain=0x500413c0) at /usr/src/contrib/openpam/lib/libpam/openpam_load.c:111
#10 0x50105094 in openpam_destroy_chain (chain=0x50041320) at /usr/src/contrib/openpam/lib/libpam/openpam_load.c:111
#11 0x50105094 in openpam_destroy_chain (chain=0x50041220) at /usr/src/contrib/openpam/lib/libpam/openpam_load.c:111
#12 0x50105094 in openpam_destroy_chain (chain=0x50041120) at /usr/src/contrib/openpam/lib/libpam/openpam_load.c:111
#13 0x50105094 in openpam_destroy_chain (chain=0x50041100) at /usr/src/contrib/openpam/lib/libpam/openpam_load.c:111
#14 0x50105014 in openpam_clear_chains (policy=0x50600004) at /usr/src/contrib/openpam/lib/libpam/openpam_load.c:130
#15 0x50101230 in pam_end (pamh=0x50600000, status=<optimized out>) at /usr/src/contrib/openpam/lib/libpam/pam_end.c:83
#16 0x1001225c in main (argc=<optimized out>, argv=0x0) at /usr/src/usr.bin/su/su.c:477

(gdb) print/x __je_sz_size2index_tab
$1 = {0x0 <repeats 513 times>}


Notes:

Given that the original problem did not involve
paging to the swap partition, maybe just making
it to the Laundry list or some such is sufficient,
something that is also involved when the swap
space is partially in use (according to top). Or
sitting in the inactive list for a long time, if
that has some special status.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)


Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311

freebsd-ppc mailing list
[I'm adding a reference to an old arm64/aarch64 bug that had
pages turning to zero, in case this 32-bit powerpc issue is
somewhat analogous.]

On 2020-May-13, at 00:29, Mark Millard <marklmi at yahoo.com> wrote:

> [stress alone is sufficient to have jemalloc asserts fail
> in stress, no need for a multi-socket G4 either. No need
> to involve nfsd, mountd, rpcbind or the like. This is not
> a claim that I know all the problems to be the same, just
> that a jemalloc reported failure in this simpler context
> happens and zeroed pages are involved.]
>
> Reminder: head -r360311 based context.
>
>
> First I show a single CPU/core PowerMac G4 context failing
> in stress. (I actually did this later, but it is the
> simpler context.) I simply moved the media from the
> 2-socket G4 to this slower, single-cpu/core one.
>
> cpu0: Motorola PowerPC 7400 revision 2.9, 466.42 MHz
> cpu0: Features 9c000000<PPC32,ALTIVEC,FPU,MMU>
> cpu0: HID0 8094c0a4<EMCP,DOZE,DPM,EIEC,ICE,DCE,SGE,BTIC,BHT>
> real memory  = 1577857024 (1504 MB)
> avail memory = 1527508992 (1456 MB)
>
> # stress -m 1 --vm-bytes 1792M
> stress: info: [1024] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd
> <jemalloc>: /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:258: Failed assertion: "slab == extent_slab_get(extent)"
> stress: FAIL: [1024] (415) <-- worker 1025 got signal 6
> stress: WARN: [1024] (417) now reaping child worker processes
> stress: FAIL: [1024] (451) failed run completed in 243s
>
> (Note: 1792 is the biggest it allowed with M.)
>
> The following still pages in and out and fails:
>
> # stress -m 1 --vm-bytes 1290M
> stress: info: [1163] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd
> <jemalloc>: /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:258: Failed assertion: "slab == extent_slab_get(extent)"
> . . .
>
> By contrast, the following had no problem for as
> long as I let it run --and did not page in or out:
>
> # stress -m 1 --vm-bytes 1280M
> stress: info: [1181] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd
>
>
>
>
> The 2 socket PowerMac G4 context with 2048 MiByte of RAM . . .
>
> stress -m 1 --vm-bytes 1792M
>
> did not (quickly?) fail or page. 1792
> is as large as it would allow with M.
>
> The following also did not (quickly?) fail
> (and were not paging):
>
> stress -m 2 --vm-bytes 896M
> stress -m 4 --vm-bytes 448M
> stress -m 8 --vm-bytes 224M
>
> (Only 1 example was run at a time.)
>
> But the following all did quickly fail (and were
> paging):
>
> stress -m 8 --vm-bytes 225M
> stress -m 4 --vm-bytes 449M
> stress -m 2 --vm-bytes 897M
>
> (Only 1 example was run at a time.)
>
> I'll note that when I exited an su process
> I ended up with a:
>
> <jemalloc>: /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:200: Failed assertion: "ret == sz_index2size_compute(index)"
> Abort trap (core dumped)
>
> and a matching su.core file. It appears
> that stress's activity leads to other
> processes also seeing examples of the
> zeroed-page(s) problem (probably su had
> paged some or had been fully swapped
> out):
>
> (gdb) bt
> #0  thr_kill () at thr_kill.S:4
> #1  0x503821d0 in __raise (s=6) at /usr/src/lib/libc/gen/raise.c:52
> #2  0x502e1d20 in abort () at /usr/src/lib/libc/stdlib/abort.c:67
> #3  0x502d6144 in sz_index2size_lookup (index=<optimized out>) at /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:200
> #4  sz_index2size (index=<optimized out>) at /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:207
> #5  ifree (tsd=0x5008b018, ptr=0x50041460, tcache=0x5008b138, slow_path=<optimized out>) at jemalloc_jemalloc.c:2583
> #6  0x502d5cec in __je_free_default (ptr=0x50041460) at jemalloc_jemalloc.c:2784
> #7  0x502d62d4 in __free (ptr=0x50041460) at jemalloc_jemalloc.c:2852
> #8  0x501050cc in openpam_destroy_chain (chain=0x50041480) at /usr/src/contrib/openpam/lib/libpam/openpam_load.c:113
> #9  0x50105094 in openpam_destroy_chain (chain=0x500413c0) at /usr/src/contrib/openpam/lib/libpam/openpam_load.c:111
> #10 0x50105094 in openpam_destroy_chain (chain=0x50041320) at /usr/src/contrib/openpam/lib/libpam/openpam_load.c:111
> #11 0x50105094 in openpam_destroy_chain (chain=0x50041220) at /usr/src/contrib/openpam/lib/libpam/openpam_load.c:111
> #12 0x50105094 in openpam_destroy_chain (chain=0x50041120) at /usr/src/contrib/openpam/lib/libpam/openpam_load.c:111
> #13 0x50105094 in openpam_destroy_chain (chain=0x50041100) at /usr/src/contrib/openpam/lib/libpam/openpam_load.c:111
> #14 0x50105014 in openpam_clear_chains (policy=0x50600004) at /usr/src/contrib/openpam/lib/libpam/openpam_load.c:130
> #15 0x50101230 in pam_end (pamh=0x50600000, status=<optimized out>) at /usr/src/contrib/openpam/lib/libpam/pam_end.c:83
> #16 0x1001225c in main (argc=<optimized out>, argv=0x0) at /usr/src/usr.bin/su/su.c:477
>
> (gdb) print/x __je_sz_size2index_tab
> $1 = {0x0 <repeats 513 times>}
>
>
> Notes:
>
> Given that the original problem did not involve
> paging to the swap partition, maybe just making
> it to the laundry queue or some such is sufficient,
> something that is also involved when the swap
> space is partially in use (according to top). Or
> sitting in the inactive list for a long time, if
> that has some special status.
>

The following was a fix for a "pages magically
turn into zeros" problem on arm64/aarch64. The
original 32-bit powerpc context did not seem a
match to me --but the stress test behavior that
I've just observed seems closer from an
external-test point of view: swapping is involved.

Maybe this will suggest something to someone who
knows what they are doing.

(Note: dsl-only.net closed down, so the E-mail
address reference is no longer valid.)

Author: kib
Date: Mon Apr 10 15:32:26 2017
New Revision: 316679
URL:
https://svnweb.freebsd.org/changeset/base/316679


Log:
 Do not lose dirty bits for removing PROT_WRITE on arm64.

 Arm64 pmap interprets accessed writable ptes as modified, since
 ARMv8.0 does not track Dirty Bit Modifier in hardware. If writable bit
 is removed, page must be marked as dirty for MI VM.

 This change is most important for COW, where fork caused losing
 content of the dirty pages which were not yet scanned by pagedaemon.

 Reviewed by: alc, andrew
 Reported and tested by: Mark Millard <markmi at dsl-only.net>
 PR: 217138, 217239
 Sponsored by: The FreeBSD Foundation
 MFC after: 2 weeks

Modified:
 head/sys/arm64/arm64/pmap.c

Modified: head/sys/arm64/arm64/pmap.c
==============================================================================
--- head/sys/arm64/arm64/pmap.c Mon Apr 10 12:35:58 2017 (r316678)
+++ head/sys/arm64/arm64/pmap.c Mon Apr 10 15:32:26 2017 (r316679)
@@ -2481,6 +2481,11 @@ pmap_protect(pmap_t pmap, vm_offset_t sv
                    sva += L3_SIZE) {
                        l3 = pmap_load(l3p);
                        if (pmap_l3_valid(l3)) {
+                               if ((l3 & ATTR_SW_MANAGED) &&
+                                   pmap_page_dirty(l3)) {
+                                       vm_page_dirty(PHYS_TO_VM_PAGE(l3 &
+                                           ~ATTR_MASK));
+                               }
                                pmap_set(l3p, ATTR_AP(ATTR_AP_RO));
                                PTE_SYNC(l3p);
                                /* XXX: Use pmap_invalidate_range */
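
For anyone who wants to poke at the COW angle from
userland, here is a minimal sketch of my own (not part
of the original report, and it will not show anything
on a healthy kernel): the parent dirties anonymous
pages with a nonzero per-page pattern, forks so the
mappings get write-protected for copy-on-write, and
then re-verifies the pattern. The sizes and names are
arbitrary; the point is just where lost dirty bits
would surface as zeroed pages.

#include <sys/types.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define PAGE   4096UL
#define NPAGES 65536UL                  /* 256 MiB of anonymous memory */

int
main(void)
{
        unsigned char *buf = mmap(NULL, NPAGES * PAGE,
            PROT_READ | PROT_WRITE, MAP_ANON | MAP_PRIVATE, -1, 0);
        if (buf == MAP_FAILED) {
                perror("mmap");
                return 1;
        }

        /* Dirty every page with a nonzero pattern derived from its index. */
        for (unsigned long i = 0; i < NPAGES; i++)
                memset(buf + i * PAGE, (int)(i % 255) + 1, PAGE);

        /* fork() write-protects the parent's pages for copy-on-write. */
        pid_t pid = fork();
        if (pid == -1) {
                perror("fork");
                return 1;
        }
        if (pid == 0)
                _exit(0);               /* child does nothing */
        waitpid(pid, NULL, 0);

        /* Verify the parent's copy afterwards; zeroed pages would fail. */
        unsigned long bad = 0;
        for (unsigned long i = 0; i < NPAGES; i++)
                if (buf[i * PAGE] != (unsigned char)((i % 255) + 1))
                        bad++;
        printf("%lu page(s) lost their pattern\n", bad);
        return 0;
}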


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)


Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311

Justin Hibbits-3
Hi Mark,

On Wed, 13 May 2020 01:43:23 -0700
Mark Millard <[hidden email]> wrote:

> [I'm adding a reference to an old arm64/aarch64 bug that had
> pages turning to zero, in case this 32-bit powerpc issue is
> somewhat analogous.]
>
> On 2020-May-13, at 00:29, Mark Millard <marklmi at yahoo.com> wrote:
>
> > [stress alone is sufficient to have jemalloc asserts fail
> > in stress, no need for a multi-socket G4 either. No need
> > to involve nfsd, mountd, rpcbind or the like. This is not
> > a claim that I know all the problems to be the same, just
> > that a jemalloc reported failure in this simpler context
> > happens and zeroed pages are involved.]
> >
> > Reminder: head -r360311 based context.
> >
> >
> > First I show a single CPU/core PowerMac G4 context failing
> > in stress. (I actually did this later, but it is the
> > simpler context.) I simply moved the media from the
> > 2-socket G4 to this slower, single-cpu/core one.
> >
> > cpu0: Motorola PowerPC 7400 revision 2.9, 466.42 MHz
> > cpu0: Features 9c000000<PPC32,ALTIVEC,FPU,MMU>
> > cpu0: HID0 8094c0a4<EMCP,DOZE,DPM,EIEC,ICE,DCE,SGE,BTIC,BHT>
> > real memory  = 1577857024 (1504 MB)
> > avail memory = 1527508992 (1456 MB)
> >
> > # stress -m 1 --vm-bytes 1792M
> > stress: info: [1024] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd
> > <jemalloc>: /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:258: Failed assertion: "slab == extent_slab_get(extent)"
> > stress: FAIL: [1024] (415) <-- worker 1025 got signal 6
> > stress: WARN: [1024] (417) now reaping child worker processes
> > stress: FAIL: [1024] (451) failed run completed in 243s
> >
> > (Note: 1792 is the biggest it allowed with M.)
> >
> > The following still pages in and out and fails:
> >
> > # stress -m 1 --vm-bytes 1290M
> > stress: info: [1163] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd
> > <jemalloc>: /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:258: Failed assertion: "slab == extent_slab_get(extent)"
> > . . .
> >
> > By contrast, the following had no problem for as
> > long as I let it run --and did not page in or out:
> >
> > # stress -m 1 --vm-bytes 1280M
> > stress: info: [1181] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd
> >
...

> The following was a fix for a "pages magically
> turn into zeros" problem on arm64/aarch64. The
> original 32-bit powerpc context did not seem a
> match to me --but the stress test behavior that
> I've just observed seems closer from an
> external-test point of view: swapping is involved.
>
> Maybe this will suggest something to someone who
> knows what they are doing.
>
> (Note: dsl-only.net closed down, so the E-mail
> address reference is no longer valid.)
>
> Author: kib
> Date: Mon Apr 10 15:32:26 2017
> New Revision: 316679
> URL:
> https://svnweb.freebsd.org/changeset/base/316679
>
>
> Log:
>  Do not lose dirty bits for removing PROT_WRITE on arm64.
>
>  Arm64 pmap interprets accessed writable ptes as modified, since
>  ARMv8.0 does not track Dirty Bit Modifier in hardware. If writable
> bit is removed, page must be marked as dirty for MI VM.
>
>  This change is most important for COW, where fork caused losing
>  content of the dirty pages which were not yet scanned by pagedaemon.
>
>  Reviewed by: alc, andrew
>  Reported and tested by: Mark Millard <markmi at dsl-only.net>
>  PR: 217138, 217239
>  Sponsored by: The FreeBSD Foundation
>  MFC after: 2 weeks
>
> Modified:
>  head/sys/arm64/arm64/pmap.c
>
> Modified: head/sys/arm64/arm64/pmap.c
> ==============================================================================
> --- head/sys/arm64/arm64/pmap.c Mon Apr 10 12:35:58 2017 (r316678)
> +++ head/sys/arm64/arm64/pmap.c Mon Apr 10 15:32:26 2017 (r316679)
> @@ -2481,6 +2481,11 @@ pmap_protect(pmap_t pmap, vm_offset_t sv
>                     sva += L3_SIZE) {
>                         l3 = pmap_load(l3p);
>                         if (pmap_l3_valid(l3)) {
> +                               if ((l3 & ATTR_SW_MANAGED) &&
> +                                   pmap_page_dirty(l3)) {
> +                                       vm_page_dirty(PHYS_TO_VM_PAGE(l3 &
> +                                           ~ATTR_MASK));
> +                               }
>                                 pmap_set(l3p, ATTR_AP(ATTR_AP_RO));
>                                 PTE_SYNC(l3p);
>                                 /* XXX: Use pmap_invalidate_range */
>
>
> ===
> Mark Millard
> marklmi at yahoo.com
> ( dsl-only.net went
> away in early 2018-Mar)
>

Thanks for this reference.  I took a quick look at the 3 pmap
implementations we have (haven't checked the new radix pmap yet), and it
looks like only mmu_oea.c (32-bit AIM pmap, for G3 and G4) is missing
vm_page_dirty() calls in its pmap_protect() implementation, analogous
to the change you posted right above. Given this, I think it's safe to
say that this missing piece is necessary.  We'll work on a fix for
this; looking at moea64_protect(), there may be additional work needed
to support this as well, so it may take a few days.
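
To make the rule concrete, here is a tiny self-contained
model (illustration only -- not FreeBSD kernel code and
not the pending patch; all names are invented): a software
"pte" with managed/dirty/writable bits, and a protect
routine that pushes the hardware-tracked dirty state into
the machine-independent page before clearing write access,
which is what the arm64 hunk above does via vm_page_dirty().

#include <stdbool.h>
#include <stdio.h>

/* Toy stand-ins for an MD pte and an MI vm_page. */
struct toy_pte  { bool valid, managed, writable, dirty; };
struct toy_page { bool mi_dirty; };     /* what vm_page_dirty() would record */

/*
 * Downgrade a mapping to read-only.  The essential step: if the
 * hardware view says the page was modified, tell the MI layer
 * before that information is thrown away.
 */
static void
toy_protect_readonly(struct toy_pte *pte, struct toy_page *pg)
{
        if (!pte->valid)
                return;
        if (pte->managed && pte->dirty)
                pg->mi_dirty = true;    /* analogous to vm_page_dirty() */
        pte->writable = false;
        pte->dirty = false;             /* hardware bit may be reset later */
}

int
main(void)
{
        struct toy_pte pte = { .valid = true, .managed = true,
            .writable = true, .dirty = true };
        struct toy_page pg = { .mi_dirty = false };

        toy_protect_readonly(&pte, &pg);
        printf("MI layer sees the page as %s\n",
            pg.mi_dirty ? "dirty (correct)" : "clean (data could be lost)");
        return 0;
}

Without the managed-and-dirty check, the MI layer would treat a
modified page as clean, and under memory pressure it could be freed
rather than laundered -- one way modified pages can later read back
as zeros.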

- Justin

Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311

freebsd-ppc mailing list


On 2020-May-13, at 08:56, Justin Hibbits <[hidden email]> wrote:

> Hi Mark,

Hello Justin.

> On Wed, 13 May 2020 01:43:23 -0700
> Mark Millard <[hidden email]> wrote:
>
>> [I'm adding a reference to an old arm64/aarch64 bug that had
>> pages turning to zero, in case this 32-bit powerpc issue is
>> somewhat analogous.]
>>
>>> . . .
> ...
>> . . .
>>
>> (Note: dsl-only.net closed down, so the E-mail
>> address reference is no longer valid.)
>>
>> Author: kib
>> Date: Mon Apr 10 15:32:26 2017
>> New Revision: 316679
>> URL:
>> https://svnweb.freebsd.org/changeset/base/316679
>>
>>
>> Log:
>> Do not lose dirty bits for removing PROT_WRITE on arm64.
>>
>> Arm64 pmap interprets accessed writable ptes as modified, since
>> ARMv8.0 does not track Dirty Bit Modifier in hardware. If writable
>> bit is removed, page must be marked as dirty for MI VM.
>>
>> This change is most important for COW, where fork caused losing
>> content of the dirty pages which were not yet scanned by pagedaemon.
>>
>> Reviewed by: alc, andrew
>> Reported and tested by: Mark Millard <markmi at dsl-only.net>
>> PR: 217138, 217239
>> Sponsored by: The FreeBSD Foundation
>> MFC after: 2 weeks
>>
>> Modified:
>> head/sys/arm64/arm64/pmap.c
>>
>> Modified: head/sys/arm64/arm64/pmap.c
>> ==============================================================================
>> --- head/sys/arm64/arm64/pmap.c Mon Apr 10 12:35:58 2017 (r316678)
>> +++ head/sys/arm64/arm64/pmap.c Mon Apr 10 15:32:26 2017 (r316679)
>> @@ -2481,6 +2481,11 @@ pmap_protect(pmap_t pmap, vm_offset_t sv
>>                     sva += L3_SIZE) {
>>                         l3 = pmap_load(l3p);
>>                         if (pmap_l3_valid(l3)) {
>> +                               if ((l3 & ATTR_SW_MANAGED) &&
>> +                                   pmap_page_dirty(l3)) {
>> +                                       vm_page_dirty(PHYS_TO_VM_PAGE(l3 &
>> +                                           ~ATTR_MASK));
>> +                               }
>>                                 pmap_set(l3p, ATTR_AP(ATTR_AP_RO));
>>                                 PTE_SYNC(l3p);
>>                                 /* XXX: Use pmap_invalidate_range */
>>
>> . . .
>>
>
> Thanks for this reference.  I took a quick look at the 3 pmap
> implementations we have (haven't check the new radix pmap yet), and it
> looks like only mmu_oea.c (32-bit AIM pmap, for G3 and G4) is missing
> vm_page_dirty() calls in its pmap_protect() implementation, analogous
> to the change you posted right above. Given this, I think it's safe to
> say that this missing piece is necessary.  We'll work on a fix for
> this; looking at moea64_protect(), there may be additional work needed
> to support this as well, so it may take a few days.

Ping? Any clue when the above might happen?

I've been avoiding the old PowerMacs and leaving
them at head -r360311, pending an update that
would stop the kernel from zeroing pages that it
should not zero. But I've seen that you have been
busy with more modern contexts over roughly the
last month.

And, clearly, my own context has left other, more
involved activities pending for much longer
(compared to just periodically updating to
more recent FreeBSD vintages).

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)


Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311

Justin Hibbits-3
On Wed, 10 Jun 2020 18:56:57 -0700
Mark Millard <[hidden email]> wrote:

> On 2020-May-13, at 08:56, Justin Hibbits <[hidden email]> wrote:
>
> > Hi Mark,  
>
> Hello Justin.

Hi Mark,


>
> > On Wed, 13 May 2020 01:43:23 -0700
> > Mark Millard <[hidden email]> wrote:
> >  
> >> [I'm adding a reference to an old arm64/aarch64 bug that had
> >> pages turning to zero, in case this 32-bit powerpc issue is
> >> somewhat analogous.]
> >>  
> >>> . . .  
> > ...  
> >> . . .
> >>
> >> (Note: dsl-only.net closed down, so the E-mail
> >> address reference is no longer valid.)
> >>
> >> Author: kib
> >> Date: Mon Apr 10 15:32:26 2017
> >> New Revision: 316679
> >> URL:
> >> https://svnweb.freebsd.org/changeset/base/316679
> >>
> >>
> >> Log:
> >> Do not lose dirty bits for removing PROT_WRITE on arm64.
> >>
> >> Arm64 pmap interprets accessed writable ptes as modified, since
> >> ARMv8.0 does not track Dirty Bit Modifier in hardware. If writable
> >> bit is removed, page must be marked as dirty for MI VM.
> >>
> >> This change is most important for COW, where fork caused losing
> >> content of the dirty pages which were not yet scanned by
> >> pagedaemon.
> >>
> >> Reviewed by: alc, andrew
> >> Reported and tested by: Mark Millard <markmi at
> >> dsl-only.net> PR: 217138, 217239
> >> Sponsored by: The FreeBSD Foundation
> >> MFC after: 2 weeks
> >>
> >> Modified:
> >> head/sys/arm64/arm64/pmap.c
> >>
> >> Modified: head/sys/arm64/arm64/pmap.c
> >> ==============================================================================
> >> --- head/sys/arm64/arm64/pmap.c Mon Apr 10 12:35:58 2017 (r316678)
> >> +++ head/sys/arm64/arm64/pmap.c Mon Apr 10 15:32:26 2017 (r316679)
> >> @@ -2481,6 +2481,11 @@ pmap_protect(pmap_t pmap, vm_offset_t sv
> >>                     sva += L3_SIZE) {
> >>                         l3 = pmap_load(l3p);
> >>                         if (pmap_l3_valid(l3)) {
> >> +                               if ((l3 & ATTR_SW_MANAGED) &&
> >> +                                   pmap_page_dirty(l3)) {
> >> +                                       vm_page_dirty(PHYS_TO_VM_PAGE(l3 &
> >> +                                           ~ATTR_MASK));
> >> +                               }
> >>                                 pmap_set(l3p, ATTR_AP(ATTR_AP_RO));
> >>                                 PTE_SYNC(l3p);
> >>                                 /* XXX: Use pmap_invalidate_range */
> >>
> >> . . .
> >>  
> >
> > Thanks for this reference.  I took a quick look at the 3 pmap
> > implementations we have (haven't check the new radix pmap yet), and
> > it looks like only mmu_oea.c (32-bit AIM pmap, for G3 and G4) is
> > missing vm_page_dirty() calls in its pmap_protect() implementation,
> > analogous to the change you posted right above. Given this, I think
> > it's safe to say that this missing piece is necessary.  We'll work
> > on a fix for this; looking at moea64_protect(), there may be
> > additional work needed to support this as well, so it may take a
> > few days.  
>
> Ping? Any clue when the above might happen?
>
> I've been avoiding the old PowerMacs and leaving
> them at head -r360311 , pending an update that
> would avoid the kernel zeroing pages that it
> should not zero. But I've seen that you were busy
> with more modern contexts this last about a month.
>
> And, clearly, my own context has left pending
> (for much longer) other more involved activities
> (compared to just periodically updating to
> more recent FreeBSD vintages).
>
> ===
> Mark Millard
> marklmi at yahoo.com
> ( dsl-only.net went
> away in early 2018-Mar)
>
Sorry for the delay; I got sidetracked with a bunch of other
development.  I did install a newer FreeBSD on my dual G4 and couldn't
see the problem.  That said, the attached patch effectively copies
what's done in OEA64 into OEA pmap.  Can you test it?

- Justin


moea_protect.diff (1K) Download Attachment

Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311

freebsd-ppc mailing list


On 2020-Jun-11, at 13:55, Justin Hibbits <chmeeedalf at gmail.com> wrote:

> On Wed, 10 Jun 2020 18:56:57 -0700
> Mark Millard <[hidden email]> wrote:
>
>> On 2020-May-13, at 08:56, Justin Hibbits <[hidden email]> wrote:
>>
>>> Hi Mark,  
>>
>> Hello Justin.
>
> Hi Mark,

Hello again, Justin.

>>
>>> On Wed, 13 May 2020 01:43:23 -0700
>>> Mark Millard <[hidden email]> wrote:
>>>
>>>> [I'm adding a reference to an old arm64/aarch64 bug that had
>>>> pages turning to zero, in case this 32-bit powerpc issue is
>>>> somewhat analogous.]
>>>>
>>>>> . . .  
>>> ...  
>>>> . . .
>>>>
>>>> (Note: dsl-only.net closed down, so the E-mail
>>>> address reference is no longer valid.)
>>>>
>>>> Author: kib
>>>> Date: Mon Apr 10 15:32:26 2017
>>>> New Revision: 316679
>>>> URL:
>>>> https://svnweb.freebsd.org/changeset/base/316679
>>>>
>>>>
>>>> Log:
>>>> Do not lose dirty bits for removing PROT_WRITE on arm64.
>>>>
>>>> Arm64 pmap interprets accessed writable ptes as modified, since
>>>> ARMv8.0 does not track Dirty Bit Modifier in hardware. If writable
>>>> bit is removed, page must be marked as dirty for MI VM.
>>>>
>>>> This change is most important for COW, where fork caused losing
>>>> content of the dirty pages which were not yet scanned by
>>>> pagedaemon.
>>>>
>>>> Reviewed by: alc, andrew
>>>> Reported and tested by: Mark Millard <markmi at
>>>> dsl-only.net> PR: 217138, 217239
>>>> Sponsored by: The FreeBSD Foundation
>>>> MFC after: 2 weeks
>>>>
>>>> Modified:
>>>> head/sys/arm64/arm64/pmap.c
>>>>
>>>> Modified: head/sys/arm64/arm64/pmap.c
>>>> ==============================================================================
>>>> --- head/sys/arm64/arm64/pmap.c Mon Apr 10 12:35:58 2017 (r316678)
>>>> +++ head/sys/arm64/arm64/pmap.c Mon Apr 10 15:32:26 2017 (r316679)
>>>> @@ -2481,6 +2481,11 @@ pmap_protect(pmap_t pmap, vm_offset_t sv
>>>>                     sva += L3_SIZE) {
>>>>                         l3 = pmap_load(l3p);
>>>>                         if (pmap_l3_valid(l3)) {
>>>> +                               if ((l3 & ATTR_SW_MANAGED) &&
>>>> +                                   pmap_page_dirty(l3)) {
>>>> +                                       vm_page_dirty(PHYS_TO_VM_PAGE(l3 &
>>>> +                                           ~ATTR_MASK));
>>>> +                               }
>>>>                                 pmap_set(l3p, ATTR_AP(ATTR_AP_RO));
>>>>                                 PTE_SYNC(l3p);
>>>>                                 /* XXX: Use pmap_invalidate_range */
>>>>
>>>> . . .
>>>>
>>>
>>> Thanks for this reference.  I took a quick look at the 3 pmap
>>> implementations we have (haven't check the new radix pmap yet), and
>>> it looks like only mmu_oea.c (32-bit AIM pmap, for G3 and G4) is
>>> missing vm_page_dirty() calls in its pmap_protect() implementation,
>>> analogous to the change you posted right above. Given this, I think
>>> it's safe to say that this missing piece is necessary.  We'll work
>>> on a fix for this; looking at moea64_protect(), there may be
>>> additional work needed to support this as well, so it may take a
>>> few days.  
>>
>> Ping? Any clue when the above might happen?
>>
>> I've been avoiding the old PowerMacs and leaving
>> them at head -r360311 , pending an update that
>> would avoid the kernel zeroing pages that it
>> should not zero. But I've seen that you were busy
>> with more modern contexts this last about a month.
>>
>> And, clearly, my own context has left pending
>> (for much longer) other more involved activities
>> (compared to just periodically updating to
>> more recent FreeBSD vintages).
>>
>> . . .
>>
>
> Sorry for the delay, I got sidetracked with a bunch of other
> development.

> I did install a newer FreeBSD on my dual G4 and couldn't
> see the problem.

How did you test?

In my context it was far easier to see the problem
with builds that did not use MALLOC_PRODUCTION --
in other words, with jemalloc's asserts enabled.

The easiest way I found to get the asserts to fail
was to run multiple worker processes (-m) whose
total size is more than enough to force
paging/swapping:

stress -m 2 --vm-bytes 1700M &

(Possibly after first setting up some shells
to be exited later.)

Normally stress itself would hit jemalloc
asserts. Apparently the asserts did not
stop the code, and it ran until a failure
occurred (via dtv=0x0); I never had to
stop the stress processes manually.

If there were no failures during the run, I would
then exit shells that had likely been swapped out
or partially paged out during the stress run. In
my testing, those hit jemalloc asserts during
their cleanup activity.
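
For what it's worth, here is a rough self-checking
substitute for the stress recipe above (a sketch only,
not something from my actual testing): it dirties more
anonymous memory than fits in RAM with a nonzero
per-page pattern, then sweeps it repeatedly and reports
any page whose pattern is gone (for example, read back
as zeros). The 1700 MiB figure is arbitrary; size it to
force paging on the machine under test.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define PAGE   4096UL
#define NPAGES ((1700UL * 1024 * 1024) / PAGE)  /* ~1700 MiB; adjust to exceed RAM */

int
main(void)
{
        unsigned char *buf = malloc(NPAGES * PAGE);
        if (buf == NULL) {
                perror("malloc");
                return 1;
        }

        /* Dirty every page with a nonzero pattern derived from its index. */
        for (unsigned long i = 0; i < NPAGES; i++)
                memset(buf + i * PAGE, (int)(i % 255) + 1, PAGE);

        /* Sweep repeatedly; paging in and out happens in between. */
        for (int pass = 0; pass < 10; pass++) {
                unsigned long bad = 0;
                for (unsigned long i = 0; i < NPAGES; i++) {
                        unsigned char expect = (unsigned char)((i % 255) + 1);
                        unsigned char *p = buf + i * PAGE;
                        for (unsigned long j = 0; j < PAGE; j++) {
                                if (p[j] != expect) {
                                        bad++;
                                        break;
                                }
                        }
                }
                printf("pass %d: %lu corrupted page(s)\n", pass, bad);
                sleep(1);
        }
        free(buf);
        return 0;
}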


> That said, the attached patch effectively copies
> what's done in OEA64 into OEA pmap.  Can you test it?

I'll try it once I get a chance, probably later
today.

I gather from what I see that moea64_protect did not
need the changes that you originally thought might
be required? I only see moea_protect changes in the
patch.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)


Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311

Brandon Bergren-3
An update from my end: I can now test on a dual-processor G4 as well, now that mine is up and running.

On Thu, Jun 11, 2020, at 4:36 PM, Mark Millard wrote:

>
> How did you test?
>
> In my context it was far easier to see the problem
> with builds that did not use MALLOC_PRODUCTION. In
> other words: jemalloc having its asserts tested.
>
> The easiest way I found to get the asserts to fail
> was to do (multiple processes (-m) and totaling to
> more than enough to force paging/swapping):
>
> stress -m 2 --vm-bytes 1700M &
>
> (Possibly setting up some shells first
> to potentially later exit.)
>
> Normally stress itself would hit jemalloc
> asserts. Apparently the asserts did not
> stop the code and it ran until a failure
> occurred (via dtv=0x0). I never had to
> manually stop the stress processes.
>
> If no failures during, then exit shells
> that likely were swapped out or partially
> paged out during the stress run. They
> hit jemalloc asserts during their cleanup
> activity in my testing.
>
>
> > That said, the attached patch effectively copies
> > what's done in OEA64 into OEA pmap.  Can you test it?
>
> I'll try it once I get a chance, probably later
> today.
>
> I gather from what I see that moea64_protect did not
> need the changes that you originally thought might
> be required? I only see moea_protect changes in the
> patch.
>
> ===
> Mark Millard
> marklmi at yahoo.com
> ( dsl-only.net went
> away in early 2018-Mar)
>
>

--
  Brandon Bergren
  [hidden email]