process shared mutexes?

Volker Lendecke
Hello!

For Samba's tdb I'm trying to get process shared robust mutexes to
work. However, tdb has a usage pattern that seems to confuse FreeBSD
11 (32-bit x86 if that matters).

The attached program fails in the final pthread_mutex_lock call. If I
comment out the call to

ptr = mmap(NULL, 0xb0, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0x1000);

it works.

FWIW, tdb has used robust shared mutexes on Linux successfully for a while
now. I haven't tried Solaris yet, the only other platform I know about
that has them.

What am I doing wrong?

Thanks,

Volker
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[hidden email]"

Re: process shared mutexes?

Konstantin Belousov
On Mon, Nov 21, 2016 at 02:35:28PM +0100, Volker Lendecke wrote:

> Hello!
>
> For Samba's tdb I'm trying to get process shared robust mutexes to
> work. However, tdb has a usage pattern that seems to confuse FreeBSD
> 11 (32-bit x86 if that matters).
>
> The attached program fails in the final pthread_mutex_lock call. If I
> comment out the call to
>
> ptr = mmap(NULL, 0xb0, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0x1000);
>
> it works.
>
> FWIW, tdb has used robust shared mutexes on Linux successfully for a while
> now. I haven't tried Solaris yet, the only other platform I know about
> that has them.
>
> What am I doing wrong?
>
> Thanks,
>

There is no attached program; please mail it inline or put it
somewhere on the web.

Re: process shared mutexes?

Volker Lendecke
On Mon, Nov 21, 2016 at 03:50:36PM +0200, Konstantin Belousov wrote:

> On Mon, Nov 21, 2016 at 02:35:28PM +0100, Volker Lendecke wrote:
> > Hello!
> >
> > For Samba's tdb I'm trying to get process shared robust mutexes to
> > work. However, tdb has a usage pattern that seems to confuse FreeBSD
> > 11 (32-bit x86 if that matters).
> >
> > The attached program fails in the final pthread_mutex_lock call. If I
> > comment out the call to
> >
> > ptr = mmap(NULL, 0xb0, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0x1000);
> >
> > it works.
> >
> > FWIW, tdb has used robust shared mutexes on Linux successfully for a while
> > now. I haven't tried Solaris yet, the only other platform I know about
> > that has them.
> >
> > What am I doing wrong?
> >
> > Thanks,
> >
>
> There is no attached program; please mail it inline or put it
> somewhere on the web.

Hmm. Inline now.

Volker


#include <stdio.h>
#include <pthread.h>
#include <errno.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <string.h>

int main(int argc, const char *argv[])
{
        int fd, ret;
        void *ptr;
        pthread_mutex_t *m;
        pthread_mutexattr_t attr;

        if (argc != 2) {
                fprintf(stderr, "usage: %s <filename>\n", argv[0]);
                return 1;
        }

        fd = open(argv[1], O_RDWR|O_CREAT, 0644);
        if (fd == -1) {
                perror("open failed");
                return 1;
        }

        ret = ftruncate(fd, 0x1000+0xb0);
        if (ret == -1) {
                perror("ftruncate failed");
                return 1;
        }

        m = mmap(NULL, 0x1000, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
        if (m == MAP_FAILED) {
                perror("mmap failed");
                return 1;
        }

        ret = pthread_mutexattr_init(&attr);
        if (ret != 0) {
                fprintf(stderr, "pthread_mutexattr_init failed: %s\n",
                        strerror(ret));
                return 1;
        }

        ret = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
        if (ret != 0) {
                fprintf(stderr, "pthread_mutexattr_setpshared failed: %s\n",
                        strerror(ret));
                return 1;
        }

        ret = pthread_mutex_init(m, &attr);
        if (ret != 0) {
                fprintf(stderr, "pthread_mutex_init failed: %s\n",
                        strerror(ret));
                return 1;
        }

        ret = munmap(m, 0x1000);
        if (ret == -1) {
                perror("munmap failed");
                return 1;
        }

#if 1
        ptr = mmap(NULL, 0xb0, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0x1000);
        if (ptr == MAP_FAILED) {
                perror("mmap failed");
                return 1;
        }
#endif

        m = mmap(NULL, 0x1000, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
        if (m == MAP_FAILED) {
                perror("mmap failed");
                return 1;
        }

        ret = pthread_mutex_lock(m);
        if (ret != 0) {
                fprintf(stderr, "pthread_mutex_lock failed: %s\n",
                        strerror(ret));
                return 1;
        }

        ret = pthread_mutex_unlock(m);
        if (ret != 0) {
                fprintf(stderr, "pthread_mutex_unlock failed: %s\n",
                        strerror(ret));
                return 1;
        }

        return 0;
}

Re: process shared mutexes?

Konstantin Belousov
On Mon, Nov 21, 2016 at 03:16:16PM +0100, Volker Lendecke wrote:

> On Mon, Nov 21, 2016 at 03:50:36PM +0200, Konstantin Belousov wrote:
> > On Mon, Nov 21, 2016 at 02:35:28PM +0100, Volker Lendecke wrote:
> > > Hello!
> > >
> > > For Samba's tdb I'm trying to get process shared robust mutexes to
> > > work. However, tdb has a usage pattern that seems to confuse FreeBSD
> > > 11 (32-bit x86 if that matters).
> > >
> > > The attached program fails in the final pthread_mutex_lock call. If I
> > > comment out the call to
> > >
> > > ptr = mmap(NULL, 0xb0, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0x1000);
> > >
> > > it works.
> > >
> > > FWIW, tdb has used robust shared mutexes on Linux successfully for a while
> > > now. I haven't tried Solaris yet, the only other platform I know about
> > > that has them.
> > >
> > > What am I doing wrong?
Please see the libthr(3) man page, in particular, read the RUN-TIME
SETTINGS section, the description of the kern.ipc.umtx_vnode_persistent
sysctl.

Does setting the sysctl to 1 allow your program to run ?


Re: process shared mutexes?

Volker Lendecke
On Mon, Nov 21, 2016 at 05:10:40PM +0200, Konstantin Belousov wrote:
> Please see the libthr(3) man page, in particular, read the RUN-TIME
> SETTINGS section, the description of the kern.ipc.umtx_vnode_persistent
> sysctl.
>
> Does setting the sysctl to 1 allow your program to run ?

Yes, that does make it work. The description says that the umtx vnode
is dropped on the last munmap. If I #if 0 the middle mmap, it works,
although there is no mmap around anymore. So the description is not
100% accurate I'd say.

When does the recycling happen exactly? Can it break running
applications?

And -- how can I make sure for Samba that this is set properly at
runtime? We already have a runtime mutex test for some ancient Linux
kernels that were broken. We could add this as a subtest too. But --
what happens if the admin resets this while Samba is running? Does
the kernel make sure that existing files still get the correct
behaviour when the sysctl changes?

Thanks!

Volker

Re: process shared mutexes?

Volker Lendecke
In reply to this post by Konstantin Belousov
On Mon, Nov 21, 2016 at 05:10:40PM +0200, Konstantin Belousov wrote:
> Please see the libthr(3) man page, in particular, read the RUN-TIME
> SETTINGS section, the description of the kern.ipc.umtx_vnode_persistent
> sysctl.

One more question: libthr(3) says that this behaviour is allowed
by POSIX. Do you have a reference that is available online? Or do I
have to buy the POSIX standards documents?

Thanks,

Volker

Re: process shared mutexes?

Konstantin Belousov
In reply to this post by Volker Lendecke
On Mon, Nov 21, 2016 at 04:25:42PM +0100, Volker Lendecke wrote:

> On Mon, Nov 21, 2016 at 05:10:40PM +0200, Konstantin Belousov wrote:
> > Please see the libthr(3) man page, in particular, read the RUN-TIME
> > SETTINGS section, the description of the kern.ipc.umtx_vnode_persistent
> > sysctl.
> >
> > Does setting the sysctl to 1 allow your program to run ?
>
> Yes, that does make it work. The description says that the umtx vnode
> is dropped on the last munmap. If I #if 0 the middle mmap, it works,
> although there is no mmap around anymore. So the description is not
> 100% accurate I'd say.
No, it is not the umtx vnode that is dropped, but the shared umutexes
attached to the file's page.

What you observed is the consequence of an implementation detail.  It is
impossible to execute the umtx cleanup while dereferencing the vm object,
due to locking issues.  An asynchronous task is scheduled to perform the
cleanup.  But by the time the task runs, it is quite possible that your
process has already executed the second mmap() and pthread_mutex_lock(),
creating another reference on the umtx data.

The mmap(NULL, 0xb0, ...) delays execution of that part of the program, so
the task wins more reliably.  This initially puzzled me as well, since I
was not able to observe your reported behaviour on an amd64 host, with
either a 64-bit or a 32-bit binary.

>
> When does the recycling happen exactly? Can it break running
> applications?
See above.  I cannot answer the second question.

>
> And -- how can I make sure for Samba that this is set properly at
> runtime? We already have a runtime mutex test for some ancient Linux
> kernels that were broken. We could add this as a subtest too. But --
> what happens if the admin resets this while Samba is running? Does
> the kernel make sure that existing files still get the correct
> behaviour when the sysctl changes?

It is reasonable to expect that the administrator sets the knob very
early during boot and does not twiddle it at runtime, at least if
explicit instructions to do so are provided.  Samba may, additionally,
check the sysctl value during initialization and provide a hint to the
user if needed.
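
A minimal sketch of such a startup check, assuming the knob is exported
as an int and read with sysctlbyname(3).  The function names are
hypothetical and this is not Samba code; on systems other than FreeBSD
the helper simply reports that the value is unknown.

```c
#include <stddef.h>
#include <stdio.h>
#ifdef __FreeBSD__
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

/*
 * Hypothetical helper: returns 1 if kern.ipc.umtx_vnode_persistent is set,
 * 0 if it is clear, and -1 if the value cannot be read (for example on a
 * system that does not provide this sysctl).
 * Assumption: the sysctl value is an int.
 */
int umtx_vnode_persistent(void)
{
#ifdef __FreeBSD__
        int val = 0;
        size_t len = sizeof(val);

        if (sysctlbyname("kern.ipc.umtx_vnode_persistent", &val, &len,
                         NULL, 0) == -1) {
                return -1;
        }
        return val != 0;
#else
        return -1;
#endif
}

/*
 * Hypothetical startup hook: print a hint when the knob is readable but
 * not set.  Returns the same state as umtx_vnode_persistent().
 */
int warn_if_umtx_not_persistent(FILE *out)
{
        int state = umtx_vnode_persistent();

        if (state == 0) {
                fprintf(out, "hint: consider setting "
                        "kern.ipc.umtx_vnode_persistent=1\n");
        }
        return state;
}
```

A check like this only samples the value once, so it cannot detect an
administrator changing the sysctl later.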

On Mon, Nov 21, 2016 at 04:39:45PM +0100, Volker Lendecke wrote:
> On Mon, Nov 21, 2016 at 05:10:40PM +0200, Konstantin Belousov wrote:
> > Please see the libthr(3) man page, in particular, read the RUN-TIME
> > SETTINGS section, the description of the kern.ipc.umtx_vnode_persistent
> > sysctl.
>
> One more question: libthr(3) says that this behaviour is allowed
> by POSIX. Do you have a reference that is available online? Or do I
> have to buy the POSIX standards documents?

Look at the Single Unix Specification, particularly at the following
paragraph in the mmap() description:

The state of synchronization objects such as mutexes, semaphores,
barriers, and conditional variables placed in shared memory mapped
with MAP_SHARED becomes undefined when the last region in any process
containing the synchronization object is unmapped.

The reference is available at
http://pubs.opengroup.org/onlinepubs/9699919799/functions/mmap.html
but you might need to register before getting online access or
downloading the entire SUSv4tc2 as an archive.

Re: process shared mutexes?

Volker Lendecke
On Mon, Nov 21, 2016 at 05:58:23PM +0200, Konstantin Belousov wrote:

> On Mon, Nov 21, 2016 at 04:25:42PM +0100, Volker Lendecke wrote:
> > On Mon, Nov 21, 2016 at 05:10:40PM +0200, Konstantin Belousov wrote:
> > > Please see the libthr(3) man page, in particular, read the RUN-TIME
> > > SETTINGS section, the description of the kern.ipc.umtx_vnode_persistent
> > > sysctl.
> > >
> > > Does setting the sysctl to 1 allow your program to run ?
> >
> > Yes, that does make it work. The description says that the umtx vnode
> > is dropped on the last munmap. If I #if 0 the middle mmap, it works,
> > although there is no mmap around anymore. So the description is not
> > 100% accurate I'd say.
> No, it is not the umtx vnode that is dropped, but the shared umutexes
> attached to the file's page.
>
> What you observed is the consequence of an implementation detail.  It is
> impossible to execute the umtx cleanup while dereferencing the vm object,
> due to locking issues.  An asynchronous task is scheduled to perform the
> cleanup.  But by the time the task runs, it is quite possible that your
> process has already executed the second mmap() and pthread_mutex_lock(),
> creating another reference on the umtx data.

Hmm. If I do a poll(NULL, 0, 60000) between the munmap and mmap
without the intervening mmap, it still works. It's really the
mmap(NULL,0xb0...) that kills it.

> Look at the Single Unix Specification, particularly at the following
> paragraph in the mmap() description:
>
> The state of synchronization objects such as mutexes, semaphores,
> barriers, and conditional variables placed in shared memory mapped
> with MAP_SHARED becomes undefined when the last region in any process
> containing the synchronization object is unmapped.

Thanks! Hidden deep in mmap(2)... No hint in any of the pthread calls.

So -- all of the above discussion becomes irrelevant if I change tdb
such that it keeps the mutex area mmapped at least once? Then no GC
will kick in regardless of the sysctl? This would be possible, because
we use mutexes on so-called CLEAR_IF_FIRST databases only. When the last
process closes the db, it will be wiped on the next open.

Volker

Re: process shared mutexes?

Konstantin Belousov
On Mon, Nov 21, 2016 at 05:14:54PM +0100, Volker Lendecke wrote:

> On Mon, Nov 21, 2016 at 05:58:23PM +0200, Konstantin Belousov wrote:
> > On Mon, Nov 21, 2016 at 04:25:42PM +0100, Volker Lendecke wrote:
> > > On Mon, Nov 21, 2016 at 05:10:40PM +0200, Konstantin Belousov wrote:
> > > > Please see the libthr(3) man page, in particular, read the RUN-TIME
> > > > SETTINGS section, the description of the kern.ipc.umtx_vnode_persistent
> > > > sysctl.
> > > >
> > > > Does setting the sysctl to 1 allow your program to run ?
> > >
> > > Yes, that does make it work. The description says that the umtx vnode
> > > is dropped on the last munmap. If I #if 0 the middle mmap, it works,
> > > although there is no mmap around anymore. So the description is not
> > > 100% accurate I'd say.
> > No, it is not the umtx vnode that is dropped, but the shared umutexes
> > attached to the file's page.
> >
> > What you observed is the consequence of an implementation detail.  It is
> > impossible to execute the umtx cleanup while dereferencing the vm object,
> > due to locking issues.  An asynchronous task is scheduled to perform the
> > cleanup.  But by the time the task runs, it is quite possible that your
> > process has already executed the second mmap() and pthread_mutex_lock(),
> > creating another reference on the umtx data.
>
> Hmm. If I do a poll(NULL, 0, 60000) between the munmap and mmap
> without the intervening mmap, it still works. It's really the
> mmap(NULL,0xb0...) that kills it.
Yes, because at that time the cleanup task has already completed.  So the
mutex is auto-reinitialized with default attributes, but as shared.  While
the cleanup is pending, the mutex off-page data is marked for pending
removal, and the lookup of that data in pthread_mutex_lock() returns an
error.

This is an unfortunate consequence of the initially limiting ABI, which
we are still trying to preserve.

>
> > Look at the Single Unix Specification, particularly at the following
> > paragraph in the mmap() description:
> >
> > The state of synchronization objects such as mutexes, semaphores,
> > barriers, and conditional variables placed in shared memory mapped
> > with MAP_SHARED becomes undefined when the last region in any process
> > containing the synchronization object is unmapped.
>
> Thanks! Hidden deep in mmap(2)... No hint in any of the pthread calls.
>
> So -- all of the above discussion becomes irrelevant if I change tdb
> such that it keeps the mutex area mmapped at least once? Then no GC
> will kick in regardless of the sysctl? This would be possible, because
> we use mutexes on so-called CLEAR_IF_FIRST databases only. When the last
> process closes the db, it will be wiped on the next open.

If the file is mmapped, then yes, the mutex must not be destroyed.  If it
is, then there is a bug in the current implementation.
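
A rough sketch of the approach discussed here: repeat the unmap/remap
pattern of the test program, but keep one extra mapping of the mutex page
alive for the whole time.  The function name and the "anchor" mapping are
hypothetical, the actual tdb change is not shown in this thread, and the
sketch uses whole pages (via sysconf()) instead of the 0x1000 and 0xb0
sizes from the original program.  It has not been verified on FreeBSD.

```c
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical sketch: the mutex page is mapped twice.  The "anchor"
 * mapping stays in place until the end, so this process never unmaps the
 * last region containing the mutex while it still intends to use it.
 * Returns 0 on success, an errno-style code from the pthread calls, or -1.
 */
int lock_with_anchor_mapping(const char *path)
{
        long pagesize = sysconf(_SC_PAGESIZE);
        pthread_mutexattr_t attr;
        pthread_mutex_t *m;
        void *anchor, *other = MAP_FAILED;
        size_t page;
        int fd, ret;

        assert(path != NULL);
        if (pagesize <= 0) {
                return -1;
        }
        page = (size_t)pagesize;

        fd = open(path, O_RDWR|O_CREAT, 0644);
        if (fd == -1) {
                return -1;
        }
        if (ftruncate(fd, (off_t)(2 * page)) == -1) {
                close(fd);
                return -1;
        }

        /* Anchor mapping of the mutex page, kept until "out". */
        anchor = mmap(NULL, page, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
        if (anchor == MAP_FAILED) {
                close(fd);
                return -1;
        }

        /* Working mapping: initialize the mutex, then unmap it again. */
        ret = -1;
        m = mmap(NULL, page, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
        if (m == MAP_FAILED) {
                goto out;
        }
        ret = pthread_mutexattr_init(&attr);
        if (ret == 0) {
                ret = pthread_mutexattr_setpshared(&attr,
                                                   PTHREAD_PROCESS_SHARED);
                if (ret == 0) {
                        ret = pthread_mutex_init(m, &attr);
                }
                pthread_mutexattr_destroy(&attr);
        }
        munmap(m, page);
        if (ret != 0) {
                goto out;
        }

        /* Unrelated mapping of the second page, as in the test program. */
        ret = -1;
        other = mmap(NULL, page, PROT_READ|PROT_WRITE, MAP_SHARED, fd,
                     (off_t)page);
        if (other == MAP_FAILED) {
                goto out;
        }

        /* Map the mutex page again and lock through the new mapping. */
        m = mmap(NULL, page, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
        if (m == MAP_FAILED) {
                goto out;
        }
        ret = pthread_mutex_lock(m);
        if (ret == 0) {
                ret = pthread_mutex_unlock(m);
        }
        munmap(m, page);
out:
        if (other != MAP_FAILED) {
                munmap(other, page);
        }
        munmap(anchor, page);
        close(fd);
        return ret;
}
```

In a multi-process setup the same idea would mean that every process
using the database holds its own long-lived mapping of the mutex area
for as long as it has the file open.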

Re: process shared mutexes?

Volker Lendecke
On Mon, Nov 21, 2016 at 06:41:09PM +0200, Konstantin Belousov wrote:
> > So -- all of the above discussion becomes irrelevant if I change tdb
> > such that it keeps the mutex area mmapped at least once? Then no GC
> > will kick in regardless of the sysctl? This would be possible, because
> > we use mutexes on so-called CLEAR_IF_FIRST databases only. When the last
> > process closes the db, it will be wiped on the next open.
>
> If the file is mmapped, then yes, the mutex must not be destroyed.  If it
> is, then there is a bug in the current implementation.

Just wanted to say thanks! With a pretty simple change to tdb our
tdbtorture runs smoothly on FreeBSD 11!

Volker