Re: threads/84778: libpthread busy loop/hang with Java when handling signals and Runtime.exec

Nicklas Johnson-3
This is still happening on FreeBSD 6.0-STABLE as of 12/6/2005.

World and kernel were compiled without any CPUTYPE or CFLAGS options, as
was Java.

This problem is trivial to reproduce.  One merely has to start httpd.sh in
Resin 3.0 with no arguments, wait for the JVM to start running, and hit
^Z or ^C.

The process then goes into an infinite loop in kse_release.  truss -p
shows the following:

kse_release(0x8056f44)                           = 0 (0x0)
kse_release(0x8056f44)                           = 0 (0x0)
kse_release(0x8056f4c)                           = 0 (0x0)
kse_release(0x8056f44)                           = 0 (0x0)
kse_release(0x8056f44)                           = 0 (0x0)
kse_release(0x8056f4c)                           = 0 (0x0)

ktrace shows a bit more detail of the infinite loop:

  21357 java     CALL  gettimeofday(0xbf275e70,0)
  21357 java     RET   gettimeofday 0
  21357 java     CALL  gettimeofday(0xbf275ec0,0)
  21357 java     RET   gettimeofday 0
  21357 java     CALL  kse_release(0x8056f4c)
  21357 java     RET   kse_release 0
  21357 java     CALL  gettimeofday(0xbf275e70,0)
  21357 java     RET   gettimeofday 0
  21357 java     CALL  gettimeofday(0xbf275ec0,0)
  21357 java     RET   gettimeofday 0
  21357 java     CALL  kse_release(0x8056f4c)
  21357 java     RET   kse_release 0
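
To confirm from a longer trace that the process really is cycling through the same few syscalls, the CALL records can be tallied. The following is a minimal sketch, not part of the original report: it assumes kdump-style lines laid out like the excerpt above, and the names `CALL_RE`, `count_calls` and `SAMPLE` are made up for illustration.

```python
import re
from collections import Counter

# Matches kdump-style CALL records such as:
#   21357 java     CALL  kse_release(0x8056f4c)
# (layout assumed from the excerpt in this message)
CALL_RE = re.compile(r"^\s*(\d+)\s+(\S+)\s+CALL\s+(\w+)\(")


def count_calls(trace_text):
    """Return a Counter mapping syscall name to its number of CALL records."""
    counts = Counter()
    for line in trace_text.splitlines():
        match = CALL_RE.match(line)
        if match:
            counts[match.group(3)] += 1
    return counts


# One iteration of the loop, copied from the ktrace output above.
SAMPLE = """\
  21357 java     CALL  gettimeofday(0xbf275e70,0)
  21357 java     RET   gettimeofday 0
  21357 java     CALL  gettimeofday(0xbf275ec0,0)
  21357 java     RET   gettimeofday 0
  21357 java     CALL  kse_release(0x8056f4c)
  21357 java     RET   kse_release 0
"""

if __name__ == "__main__":
    for name, n in count_calls(SAMPLE).most_common():
        print(name, n)
```

A trace dominated by only these two syscalls, in a fixed 2:1 ratio, is consistent with the busy loop described here.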

gcore -s 21357; gdb /usr/local/jdk14/bin/java core.21357 shows:

Cannot get thread info: generic error

#0  0x4809f1af in pthread_testcancel () from /usr/lib/libpthread.so.2
#1  0x480a01dd in __error () from /usr/lib/libpthread.so.2
#2  0x4808adeb in pthread_getschedparam () from /usr/lib/libpthread.so.2
#3  0x4808feef in pthread_create () from /usr/lib/libpthread.so.2
#4  0x48465f10 in os::create_thread () from /usr/local/jdk1.4.2/jre/lib/i386/client/libjvm.so
#5  0x484afb28 in JavaThread::JavaThread () from /usr/local/jdk1.4.2/jre/lib/i386/client/libjvm.so
#6  0x483f927c in JVM_StartThread () from /usr/local/jdk1.4.2/jre/lib/i386/client/libjvm.so

followed by a smashed-looking stack for about 60 frames, and then:

#65 0x485735c0 in __JCR_LIST__ () from /usr/local/jdk1.4.2/jre/lib/i386/client/libjvm.so
#66 0x4a3c20d7 in ?? ()
#67 0xbf073d00 in ?? ()
#68 0xbf073d48 in ?? ()
#69 0x4839032b in JavaCalls::call_helper () from /usr/local/jdk1.4.2/jre/lib/i386/client/libjvm.so
Previous frame inner to this frame (corrupt stack?)

Note that even the valid-looking portion of the stack trace isn't valid
above pthread_create() (frames #0 through #2), though this could be a
result of the process continuing to make calls while gcore was running,
since the process does not respond to the STOP signal.

I'm inclined to suspect that this is a Java bug, considering the smashed
stack.

    Nick

--
"The aptly-named morons.org is an obscenity-laced screed..."
  -- Robert P. Lockwood, Catholic League director of research
Nick Johnson, version 2.1                             http://web.morons.org/
_______________________________________________
[hidden email] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-java
To unsubscribe, send any mail to "[hidden email]"

Nicklas Johnson-3
Another bit of information of interest: sending the process a kill -stop
(-17) after trying to stop it with ^Z *will* cause the process to stop.

Sending it a kill -stop without hitting ^Z will also cause the process to
stop normally.

In either case, however, resuming from the shell with fg will cause the
same infinite-loop behaviour.  Resuming with a kill -19 will also cause
the infinite-loop behaviour.

Sending a regular kill signal also causes the infinite-loop behaviour.
The process has to be killed with a kill -9 in all cases once this
behaviour starts.
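
For comparison, the stop/resume/terminate sequence can be exercised against a well-behaved child process. This is a minimal sketch, not part of the original report: it uses symbolic signal names rather than the numbers quoted in this message (-17 and -19), since signal numbers differ between systems, and the helper name `stop_resume_terminate` is made up for illustration.

```python
import os
import signal
import subprocess
import sys


def stop_resume_terminate(child):
    """Send SIGSTOP, SIGCONT and SIGTERM to child.

    Returns (stopped, exit_code), where stopped says whether the child
    was observed in the stopped state after SIGSTOP.
    """
    os.kill(child.pid, signal.SIGSTOP)
    # WUNTRACED makes waitpid report a stopped child, not only an exited one.
    _, status = os.waitpid(child.pid, os.WUNTRACED)
    stopped = os.WIFSTOPPED(status)
    os.kill(child.pid, signal.SIGCONT)
    os.kill(child.pid, signal.SIGTERM)
    # A healthy child with the default SIGTERM disposition exits here;
    # the affected JVM instead needed SIGKILL.
    return stopped, child.wait()


if __name__ == "__main__":
    sleeper = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    print(stop_resume_terminate(sleeper))
```

A process that handles these signals normally stops, resumes and then exits on SIGTERM, which is the behaviour the affected JVM fails to show.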

Nicklas Johnson-3
As I continue to muck about with this I'm finding a few new things out.

I sent the process a ^Z, which caused the process to begin running away.
Then I sent it a kill -stop and waited for the process to stop completely.
Then I ran gcore, and although the first few stack frames are probably
caused by the stop signal, the last stack frame may be telling:

Cannot get thread info: generic error
(gdb) bt
#0  0x480a0090 in __error () from /usr/lib/libpthread.so.2
#1  0x4808d074 in sigaction () from /usr/lib/libpthread.so.2
#2  0x4808d624 in sigaction () from /usr/lib/libpthread.so.2
#3  0x48097bd0 in pthread_mutexattr_init () from /usr/lib/libpthread.so.2
#4  0x00000000 in ?? ()

Looks like sending a ^Z may be causing Java to dereference a null
pointer.  I'll continue investigating to see what else can be discovered.

Nicklas Johnson-3
I have found the problem, at long last.

Resin uses a custom JNI library for acceleration of some features.  In
their compilation process, they specify -lc_r for the threading library
used in their JNI library.

This in turn seems to be the cause of unexpected and erratic behaviour
when running Java with libpthread, and probably explains why the problems
completely disappear when mapping Java to use libc_r.

Modifying the makefile to link the JNI library to libpthread completely
resolves the signal handling problems.  The process can be suspended and
resumed from the keyboard ad nauseam.  Removing the JNI library from the
picture entirely has the same effect (Resin falls back to its pure-Java
code if the JNI library cannot be loaded).

I will follow up with the Resin folks and submit a patch to detect the
thread library that is in use and use that one when compiling the JNI
library.
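
One way to spot this kind of mismatch is to compare the thread libraries reported by ldd for the JVM and for the JNI library. The following is a minimal sketch, not the patch mentioned above: it assumes ldd output in the usual "name => path" layout, the sample ldd text (including the file name `libresin.so` and the library version numbers) is invented for illustration, and `thread_libs` and `conflicts` are hypothetical helper names.

```python
import re

# Thread libraries to look for (the two named in this thread, plus libthr).
THREAD_LIBS = ("libc_r", "libpthread", "libthr")
_LIB_RE = re.compile(r"^\s*(%s)\.so" % "|".join(THREAD_LIBS), re.MULTILINE)


def thread_libs(ldd_output):
    """Return the set of thread libraries named in ldd output."""
    return set(_LIB_RE.findall(ldd_output))


def conflicts(host_ldd, plugin_ldd):
    """Return thread libraries the plugin links that the host does not."""
    return thread_libs(plugin_ldd) - thread_libs(host_ldd)


# Invented sample output; real paths, versions and addresses will differ.
JAVA_LDD = (
    "java:\n"
    "\tlibpthread.so.2 => /usr/lib/libpthread.so.2 (0x0)\n"
    "\tlibc.so.6 => /lib/libc.so.6 (0x0)\n"
)
JNI_LDD = (
    "libresin.so:\n"
    "\tlibc_r.so.6 => /usr/lib/libc_r.so.6 (0x0)\n"
)

if __name__ == "__main__":
    print(conflicts(JAVA_LDD, JNI_LDD))
```

A non-empty result means the JNI library pulls in a thread library the JVM is not using, which is the situation described in this message.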

Please close this bug.

    Nick

Panagiotis Astithas
Congratulations on your debugging work, and thank you for being so
persistent!

Panagiotis