Java stack overflow segfaults

Greg Lewis-2
Hi all,

I'm investigating an issue where, on FreeBSD, Java will crash rather than
throw a StackOverflowError given a simple test program with a function
that just calls itself over and over.  There's an example of such a test
in https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=222146
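
For readers without the bug report to hand, a test of this kind can be sketched as follows. This is a minimal illustration rather than the program attached to the bug; the class and method names (RecursionProbe, recurse, probe) are made up for the example.

```java
public class RecursionProbe {
    // Number of frames entered so far; updated on every call.
    private static int depth = 0;

    // Calls itself with no base case, so the thread stack must eventually run out.
    private static void recurse() {
        depth++;
        recurse();
    }

    // Returns the depth reached when the JVM raised StackOverflowError.
    public static int probe() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // Expected outcome on a JVM whose guard zone handling works.
            return depth;
        }
        return -1; // not reached in practice: recurse() never returns normally
    }

    public static void main(String[] args) {
        System.out.println("StackOverflowError caught at depth " + probe());
    }
}
```

On a JVM that handles the guard zone correctly, main prints the depth at which the error was caught. With the problem described here, the process would be expected to die on SIGSEGV before the catch block runs.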

This affects, I suspect, every native version of Java in the ports tree,
although I've only tried openjdk8 and higher.  My investigation has mostly
focused on openjdk11.

To outline the situation, Java uses pthreads internally for threading.  It
doesn't use pthreads' own guard page(s), but instead creates its own
guard area at the bottom of the stack (which grows downward) using
mprotect.  It then installs a signal handler and examines any SIGSEGV's
fault address to see if it falls within the guard area, and if so throws a
StackOverflowError.  This logic is the same across all of the OSes I've
looked at and works on OpenBSD, Linux, etc.  On FreeBSD though, the fault
address lies in the page above the guard zone, rather than in the guard
zone, which results in a crash rather than throwing StackOverflowError.
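
The range check described above can be modelled with plain address arithmetic. The sketch below is not HotSpot code: the class name, method name, addresses and sizes are all hypothetical, chosen only to show why a fault address slightly above the guard zone fails the test.

```java
public class GuardZoneCheck {
    // True when faultAddr lies in [guardBase, guardBase + guardSize).
    public static boolean inGuardZone(long faultAddr, long guardBase, long guardSize) {
        return faultAddr >= guardBase && faultAddr < guardBase + guardSize;
    }

    public static void main(String[] args) {
        long page = 4096L;                 // assumed page size
        long guardBase = 0x7fffdf000000L;  // hypothetical stack bottom
        long guardSize = 4 * page;         // hypothetical guard zone size

        long inside = guardBase + guardSize - 8;   // last word of the guard zone
        long above = guardBase + guardSize + 512;  // less than one page above it

        System.out.println(inGuardZone(inside, guardBase, guardSize)); // true
        System.out.println(inGuardZone(above, guardBase, guardSize));  // false
    }
}
```

The second case corresponds to what is seen on FreeBSD: the check fails, so the handler does not treat the fault as a stack overflow.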

A diagram may help here:

--- <- Stack top
|
| Untouched memory + stack frames + etc.
|
|
| <-- SIGSEGV signal info fault address (< 1 page above guard zone)
--- <- Start of JVM reserved zone / guard zone
|
| JVM Reserved page
|
--- <- Start of JVM yellow zone
|
| JVM Yellow pages
|
--- <- Start of JVM red zone
|
| JVM Red page
|
--- <- Stack bottom
|
| Pthread guard page(s)
|
---

On my FreeBSD 11.3/amd64 machine the JVM uses a total of four pages for the
guard zone (1 reserved, 2 yellow, 1 red).  The page size is 4K, and I see
the following mprotect calls with truss:

mprotect(stack bottom address, 4K, PROT_NONE) (Just the red zone)
mprotect(stack bottom address, 16K, PROT_NONE) (The entire guard zone)
mprotect(top of red zone address, 12K, PROT_READ|PROT_WRITE) (Reserved + yellow)
mprotect(top of red zone address, 12K, PROT_NONE) (Reserved + yellow)
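
As a cross-check of those lengths, the zone boundaries can be derived from the stack bottom and the page counts given above. The class and method names here are invented for the example, and the stack bottom is taken as 0 so that only offsets are shown.

```java
public class GuardZoneLayout {
    static final long PAGE = 4096L; // 4K pages, as reported above

    // Page counts reported above: 1 red, 2 yellow, 1 reserved.
    static final long RED_PAGES = 1;
    static final long YELLOW_PAGES = 2;
    static final long RESERVED_PAGES = 1;

    // Top of the red zone, which is also the start of the yellow zone.
    public static long redTop(long stackBottom) {
        return stackBottom + RED_PAGES * PAGE;
    }

    // Top of the yellow zone, which is also the start of the reserved zone.
    public static long yellowTop(long stackBottom) {
        return redTop(stackBottom) + YELLOW_PAGES * PAGE;
    }

    // Top of the whole guard zone (end of the reserved page).
    public static long guardTop(long stackBottom) {
        return yellowTop(stackBottom) + RESERVED_PAGES * PAGE;
    }

    public static void main(String[] args) {
        long bottom = 0L; // offsets only; a real stack bottom is an arbitrary address
        System.out.println("red zone:          " + (redTop(bottom) - bottom) / 1024 + "K");
        System.out.println("entire guard zone: " + (guardTop(bottom) - bottom) / 1024 + "K");
        System.out.println("reserved + yellow: " + (guardTop(bottom) - redTop(bottom)) / 1024 + "K");
    }
}
```

This gives 4K, 16K and 12K respectively, matching the lengths passed to mprotect.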

While I've committed a workaround for openjdk8, which just rounds down the
fault address, it isn't entirely satisfactory (it's a hack) and I wondered
if anyone had any insight into what may be going on.  I've done an analysis
of the sizes and addresses being used and used truss to check the parameters
to the mprotect calls, and everything appears to add up.

The same problem also occurs under FreeBSD 12.0/i386 and on aarch64, so it
doesn't appear to be either version or platform specific.  I've simplified
a little here, but am happy to provide additional details and code
references.

-- Greg
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[hidden email]"

Re: Java stack overflow segfaults

Konstantin Belousov
On Mon, Aug 12, 2019 at 09:16:29AM -0700, Greg Lewis wrote:

Can you provide me with the Java class that demonstrates the issue?
What exact environment do I need to reproduce it?

Is amd64 stable/11 with openjdk8 good enough?

Re: Java stack overflow segfaults

Greg Lewis-2
On Mon, Aug 12, 2019 at 09:16:29AM -0700, Greg Lewis wrote:

This appears to be due to the security.bsd.stack_guard_page sysctl: setting
it to different values changes the size of the extra space which may contain
the fault address.
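
To picture this, the range check can be widened by an extra region above the guard zone. The sketch below assumes, purely for illustration, that the extra space sits directly above the JVM guard zone and that its size is a whole number of pages; neither detail is confirmed in this thread, the names are hypothetical, and this is not the workaround committed to openjdk8.

```java
public class GuardZoneSlack {
    // True when faultAddr lies in the guard zone or in the slackBytes directly above it.
    public static boolean inGuardZoneOrSlack(long faultAddr, long guardBase,
                                             long guardSize, long slackBytes) {
        long top = guardBase + guardSize;
        return faultAddr >= guardBase && faultAddr < top + slackBytes;
    }

    public static void main(String[] args) {
        long page = 4096L;                // assumed page size
        long guardBase = 0x7fffdf000000L; // hypothetical stack bottom
        long guardSize = 4 * page;        // four-page guard zone, as on the 11.3/amd64 machine
        long fault = guardBase + guardSize + 512; // less than one page above the zone

        // No slack: the fault is not recognised as a stack overflow.
        System.out.println(inGuardZoneOrSlack(fault, guardBase, guardSize, 0L));   // false
        // One page of assumed slack: the same fault is recognised.
        System.out.println(inGuardZoneOrSlack(fault, guardBase, guardSize, page)); // true
    }
}
```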

-- Greg