python 3 subprocess performance

classic Classic list List threaded Threaded
12 messages Options
Reply | Threaded
Open this post in threaded view
|

python 3 subprocess performance

Alexander Zagrebin-2
Hi!

I've noticed the subprocess performance issue with python 3.
For example, this simple script takes just 0,15 second to complete
with python 2.7, but more than 5 sec with python 3.6:

import subprocess

for i in range(100):
    p = subprocess.Popen(['uname', '-a'], stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    (stdoutdata, stderrdata) = p.communicate()

Profiling with the cProfile shows, that this excessive 5 seconds was
wasted in the "{built-in method posix.read}"

Could anybody confirm this issue?

--
Alexander Zagrebin
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-ports
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: python 3 subprocess performance

Konstantin Schukraft
Hi,

>I've noticed the subprocess performance issue with python 3.
>For example, this simple script takes just 0,15 second to complete
>with python 2.7, but more than 5 sec with python 3.6:
>
>import subprocess
>
>for i in range(100):
>    p = subprocess.Popen(['uname', '-a'], stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
>    (stdoutdata, stderrdata) = p.communicate()
>
>Profiling with the cProfile shows, that this excessive 5 seconds was
>wasted in the "{built-in method posix.read}"
>
>Could anybody confirm this issue?
I see this on HardenedBSD 12 with python2.7 (fast) and python3.6 and 3.7
(both slow) installed from packages.
Interestingly, I don't see this on my FreeNAS 11.3, where python2.7 and
3.6 are both installed by default and run fast.


Hope that helps,

Konstantin

signature.asc (235 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: python 3 subprocess performance

Jan Bramkamp-2
In reply to this post by Alexander Zagrebin-2
Please run python under truss -f. Does it try to close(2) all possible
file descriptors? Does the runtime decrease if fdescfs is mounted at
/dev/fd?

On 11.04.19 15:16, Alexander Zagrebin wrote:

> Hi!
>
> I've noticed the subprocess performance issue with python 3.
> For example, this simple script takes just 0,15 second to complete
> with python 2.7, but more than 5 sec with python 3.6:
>
> import subprocess
>
> for i in range(100):
>     p = subprocess.Popen(['uname', '-a'], stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
>     (stdoutdata, stderrdata) = p.communicate()
>
> Profiling with the cProfile shows, that this excessive 5 seconds was
> wasted in the "{built-in method posix.read}"
>
> Could anybody confirm this issue?
>
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-ports
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: python 3 subprocess performance

Konstantin Schukraft
>Please run python under truss -f. Does it try to close(2) all possible
>file descriptors? Does the runtime decrease if fdescfs is mounted at
>/dev/fd?
truss -f -p 17407

17407: read(5,0x1265d2e9120,50000)               = 0 (0x0)
17407: close(5)                                  = 0 (0x0)
17407: lseek(3,0x0,SEEK_CUR)                     ERR#29 'Illegal seek'
17407: fstat(3,{ mode=p--------- ,inode=1610971,size=0,blksize=4096 }) = 0 (0x0)
17407: read(3,"FreeBSD midgard 12.0-STABLE-HBSD"...,8192) = 88 (0x58)
17407: read(3,0x1265ceda078,8104)                = 0 (0x0)
17407: close(3)                                  = 0 (0x0)
17407: wait4(1948,{ EXITED,val=0 },0x0,0x0)      = 1948 (0x79c)
17407: write(1,"63\n",3)                         = 3 (0x3)
17407: pipe2(0x63857dad22e8,O_CLOEXEC)           = 0 (0x0)
17407: fstat(3,{ mode=p--------- ,inode=1610975,size=0,blksize=4096 }) = 0 (0x0)
17407: ioctl(3,TIOCGETA,0x63857dad21a0)          ERR#25 'Inappropriate ioctl for device'
17407: lseek(3,0x0,SEEK_CUR)                     ERR#29 'Illegal seek'
17407: pipe2(0x63857dad21e8,O_CLOEXEC)           = 0 (0x0)
17407: sigprocmask(SIG_BLOCK,{ SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGABRT|SIGEMT|SIGFPE|SIGKILL|SIGBUS|SIGSEGV|SIGSYS|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|
SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 },{ }) = 0 (0x0)
97164: <new process>
17407: fork()                                    = 97164 (0x17b8c)
97164: thr_self(0x1265c7e9000)                   = 0 (0x0)
17407: sigprocmask(SIG_SETMASK,{ },0x0)          = 0 (0x0)
97164: sigprocmask(SIG_SETMASK,{ },0x0)          = 0 (0x0)
97164: close(3)                                  = 0 (0x0)
97164: close(5)                                  = 0 (0x0)
17407: close(6)                                  = 0 (0x0)
97164: dup2(4,1)                                 = 1 (0x1)
17407: close(4)                                  = 0 (0x0)
97164: dup2(4,2)                                 = 2 (0x2)
97164: sigprocmask(SIG_SETMASK,{ SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGABRT|SIGEMT|SIGFPE|SIGKILL|SIGBUS|SIGSEGV|SIGSYS|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPRO
F|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 },{ }) = 0 (0x0)
97164: sigaction(SIGPIPE,{ SIG_DFL 0x0 ss_t },{ SIG_IGN 0x0 ss_t }) = 0 (0x0)
97164: sigprocmask(SIG_SETMASK,{ },0x0)          = 0 (0x0)
97164: sigprocmask(SIG_SETMASK,{ SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGABRT|SIGEMT|SIGFPE|SIGKILL|SIGBUS|SIGSEGV|SIGSYS|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPRO
F|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 },{ }) = 0 (0x0)
97164: sigaction(SIGXFSZ,{ SIG_DFL 0x0 ss_t },{ SIG_IGN 0x0 ss_t }) = 0 (0x0)
97164: sigprocmask(SIG_SETMASK,{ },0x0)          = 0 (0x0)
97164: fstatat(AT_FDCWD,"/dev",{ mode=dr-xr-xr-x ,inode=2,size=512,blksize=4096 },0x0) = 0 (0x0)
97164: fstatat(AT_FDCWD,"/dev/fd",{ mode=dr-xr-xr-x ,inode=135,size=512,blksize=4096 },0x0) = 0 (0x0)
97164: getrlimit(RLIMIT_NOFILE,{ cur=940563,max=940563 }) = 0 (0x0)
97164: close(3)                                  ERR#9 'Bad file descriptor'
97164: close(4)                                  = 0 (0x0)
97164: close(5)                                  ERR#9 'Bad file descriptor'
97164: close(7)                                  ERR#9 'Bad file descriptor'
97164: close(8)                                  ERR#9 'Bad file descriptor'
97164: close(9)                                  ERR#9 'Bad file descriptor'

The last error repeats several 10000 times.

Mounting fdescfs does indeed speed python3 up to python2 levels.

>On 11.04.19 15:16, Alexander Zagrebin wrote:
>> Hi!
>>
>> I've noticed the subprocess performance issue with python 3.
>> For example, this simple script takes just 0,15 second to complete
>> with python 2.7, but more than 5 sec with python 3.6:
>>
>> import subprocess
>>
>> for i in range(100):
>>     p = subprocess.Popen(['uname', '-a'], stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
>>     (stdoutdata, stderrdata) = p.communicate()
>>
>> Profiling with the cProfile shows, that this excessive 5 seconds was
>> wasted in the "{built-in method posix.read}"
>>
>> Could anybody confirm this issue?
>>
>_______________________________________________
>[hidden email] mailing list
>https://lists.freebsd.org/mailman/listinfo/freebsd-ports
>To unsubscribe, send any mail to "[hidden email]"

signature.asc (235 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: python 3 subprocess performance

Jan Bramkamp-2
The reason is that that python does something stupid (tm). It tries to
close all file descriptors (except a few whitelisted ones) up to the
maximum file descriptor number. It does this by asking the kernel for
the maximum possible number and closing everything it doesn't want to
keep. Some time later someone came up with an optimization (read the
open file descriptors from /dev/fd). All of this pain and suffering is
caused by good old Ulrich Drepper braindamage:
https://sourceware.org/bugzilla/show_bug.cgi?id=10353.

Most Linux distros have lower default file descriptor limits than
FreeBSD making this workaround less painful. The correct solution would
be to teach python3 about closefrom(2).

On 11.04.19 16:39, Konstantin Schukraft wrote:

>> Please run python under truss -f. Does it try to close(2) all possible
>> file descriptors? Does the runtime decrease if fdescfs is mounted at
>> /dev/fd?
> truss -f -p 17407
>
> 17407: read(5,0x1265d2e9120,50000)               = 0 (0x0)
> 17407: close(5)                                  = 0 (0x0)
> 17407: lseek(3,0x0,SEEK_CUR)                     ERR#29 'Illegal seek'
> 17407: fstat(3,{ mode=p--------- ,inode=1610971,size=0,blksize=4096 }) =
> 0 (0x0)
> 17407: read(3,"FreeBSD midgard 12.0-STABLE-HBSD"...,8192) = 88 (0x58)
> 17407: read(3,0x1265ceda078,8104)                = 0 (0x0)
> 17407: close(3)                                  = 0 (0x0)
> 17407: wait4(1948,{ EXITED,val=0 },0x0,0x0)      = 1948 (0x79c)
> 17407: write(1,"63\n",3)                         = 3 (0x3)
> 17407: pipe2(0x63857dad22e8,O_CLOEXEC)           = 0 (0x0)
> 17407: fstat(3,{ mode=p--------- ,inode=1610975,size=0,blksize=4096 }) =
> 0 (0x0)
> 17407: ioctl(3,TIOCGETA,0x63857dad21a0)          ERR#25 'Inappropriate
> ioctl for device'
> 17407: lseek(3,0x0,SEEK_CUR)                     ERR#29 'Illegal seek'
> 17407: pipe2(0x63857dad21e8,O_CLOEXEC)           = 0 (0x0)
> 17407: sigprocmask(SIG_BLOCK,{
> SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGABRT|SIGEMT|SIGFPE|SIGKILL|SIGBUS|SIGSEGV|SIGSYS|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|
>
> SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 },{ }) = 0 (0x0)
> 97164: <new process>
> 17407: fork()                                    = 97164 (0x17b8c)
> 97164: thr_self(0x1265c7e9000)                   = 0 (0x0)
> 17407: sigprocmask(SIG_SETMASK,{ },0x0)          = 0 (0x0)
> 97164: sigprocmask(SIG_SETMASK,{ },0x0)          = 0 (0x0)
> 97164: close(3)                                  = 0 (0x0)
> 97164: close(5)                                  = 0 (0x0)
> 17407: close(6)                                  = 0 (0x0)
> 97164: dup2(4,1)                                 = 1 (0x1)
> 17407: close(4)                                  = 0 (0x0)
> 97164: dup2(4,2)                                 = 2 (0x2)
> 97164: sigprocmask(SIG_SETMASK,{
> SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGABRT|SIGEMT|SIGFPE|SIGKILL|SIGBUS|SIGSEGV|SIGSYS|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPRO
>
> F|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 },{ }) = 0 (0x0)
> 97164: sigaction(SIGPIPE,{ SIG_DFL 0x0 ss_t },{ SIG_IGN 0x0 ss_t }) = 0
> (0x0)
> 97164: sigprocmask(SIG_SETMASK,{ },0x0)          = 0 (0x0)
> 97164: sigprocmask(SIG_SETMASK,{
> SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGABRT|SIGEMT|SIGFPE|SIGKILL|SIGBUS|SIGSEGV|SIGSYS|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPRO
>
> F|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 },{ }) = 0 (0x0)
> 97164: sigaction(SIGXFSZ,{ SIG_DFL 0x0 ss_t },{ SIG_IGN 0x0 ss_t }) = 0
> (0x0)
> 97164: sigprocmask(SIG_SETMASK,{ },0x0)          = 0 (0x0)
> 97164: fstatat(AT_FDCWD,"/dev",{ mode=dr-xr-xr-x
> ,inode=2,size=512,blksize=4096 },0x0) = 0 (0x0)
> 97164: fstatat(AT_FDCWD,"/dev/fd",{ mode=dr-xr-xr-x
> ,inode=135,size=512,blksize=4096 },0x0) = 0 (0x0)
> 97164: getrlimit(RLIMIT_NOFILE,{ cur=940563,max=940563 }) = 0 (0x0)
> 97164: close(3)                                  ERR#9 'Bad file
> descriptor'
> 97164: close(4)                                  = 0 (0x0)
> 97164: close(5)                                  ERR#9 'Bad file
> descriptor'
> 97164: close(7)                                  ERR#9 'Bad file
> descriptor'
> 97164: close(8)                                  ERR#9 'Bad file
> descriptor'
> 97164: close(9)                                  ERR#9 'Bad file
> descriptor'
>
> The last error repeats several 10000 times.
>
> Mounting fdescfs does indeed speed python3 up to python2 levels.
>
>> On 11.04.19 15:16, Alexander Zagrebin wrote:
>>> Hi!
>>>
>>> I've noticed the subprocess performance issue with python 3.
>>> For example, this simple script takes just 0,15 second to complete
>>> with python 2.7, but more than 5 sec with python 3.6:
>>>
>>> import subprocess
>>>
>>> for i in range(100):
>>>     p = subprocess.Popen(['uname', '-a'], stdout=subprocess.PIPE,
>>> stderr=subprocess.STDOUT)
>>>     (stdoutdata, stderrdata) = p.communicate()
>>>
>>> Profiling with the cProfile shows, that this excessive 5 seconds was
>>> wasted in the "{built-in method posix.read}"
>>>
>>> Could anybody confirm this issue?
>>>
>> _______________________________________________
>> [hidden email] mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-ports
>> To unsubscribe, send any mail to "[hidden email]"
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-ports
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: python 3 subprocess performance

Alexander Zagrebin-2
В Thu, 11 Apr 2019 17:32:42 +0200
Jan Bramkamp <[hidden email]> пишет:

> The reason is that that python does something stupid (tm). It tries to
> close all file descriptors (except a few whitelisted ones) up to the
> maximum file descriptor number. It does this by asking the kernel for
> the maximum possible number and closing everything it doesn't want to
> keep. Some time later someone came up with an optimization (read the
> open file descriptors from /dev/fd). All of this pain and suffering is
> caused by good old Ulrich Drepper braindamage:
> https://sourceware.org/bugzilla/show_bug.cgi?id=10353.
>
> Most Linux distros have lower default file descriptor limits than
> FreeBSD making this workaround less painful. The correct solution
> would be to teach python3 about closefrom(2).

Thank you for hint and testing!

Indeed the problem is in closing more than 400,000 file descriptors in
loop. It seems that all current versions of Python are affected.
Python2 uses False as default value for the close_fds parameter of
the Popen constructor, so this issue is mostly not visible.
Python3 has changed this default to True.

As Jan Bramkamp suggested, I've wrote simple patch to fix an issue (see
attached file). It seems the problem has gone.

--
Alexander Zagrebin
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-ports
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: python 3 subprocess performance

Dima Pasechnik
On Fri, Apr 12, 2019 at 9:11 AM Alexander Zagrebin <[hidden email]> wrote:

>
> В Thu, 11 Apr 2019 17:32:42 +0200
> Jan Bramkamp <[hidden email]> пишет:
>
> > The reason is that that python does something stupid (tm). It tries to
> > close all file descriptors (except a few whitelisted ones) up to the
> > maximum file descriptor number. It does this by asking the kernel for
> > the maximum possible number and closing everything it doesn't want to
> > keep. Some time later someone came up with an optimization (read the
> > open file descriptors from /dev/fd). All of this pain and suffering is
> > caused by good old Ulrich Drepper braindamage:
> > https://sourceware.org/bugzilla/show_bug.cgi?id=10353.
> >
> > Most Linux distros have lower default file descriptor limits than
> > FreeBSD making this workaround less painful. The correct solution
> > would be to teach python3 about closefrom(2).
>
> Thank you for hint and testing!
>
> Indeed the problem is in closing more than 400,000 file descriptors in
> loop. It seems that all current versions of Python are affected.
> Python2 uses False as default value for the close_fds parameter of
> the Popen constructor, so this issue is mostly not visible.
> Python3 has changed this default to True.
>
> As Jan Bramkamp suggested, I've wrote simple patch to fix an issue (see
> attached file). It seems the problem has gone.

The attachment has been stripped out. Could you paste the diff into the message?

If this is a Python issue, shouldn't this be reported upstream, on
https://bugs.python.org ?

Dima
>
> --
> Alexander Zagrebin
> _______________________________________________
> [hidden email] mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-ports
> To unsubscribe, send any mail to "[hidden email]"
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-ports
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: python 3 subprocess performance

Alexander Zagrebin-2
В Fri, 12 Apr 2019 09:36:13 +0200
Dima Pasechnik <[hidden email]> пишет:

> On Fri, Apr 12, 2019 at 9:11 AM Alexander Zagrebin <[hidden email]>
> wrote:
> >
> > В Thu, 11 Apr 2019 17:32:42 +0200
> > Jan Bramkamp <[hidden email]> пишет:
> >  
> > > The reason is that that python does something stupid (tm). It
> > > tries to close all file descriptors (except a few whitelisted
> > > ones) up to the maximum file descriptor number. It does this by
> > > asking the kernel for the maximum possible number and closing
> > > everything it doesn't want to keep. Some time later someone came
> > > up with an optimization (read the open file descriptors
> > > from /dev/fd). All of this pain and suffering is caused by good
> > > old Ulrich Drepper braindamage:
> > > https://sourceware.org/bugzilla/show_bug.cgi?id=10353.
> > >
> > > Most Linux distros have lower default file descriptor limits than
> > > FreeBSD making this workaround less painful. The correct solution
> > > would be to teach python3 about closefrom(2).  
> >
> > Thank you for hint and testing!
> >
> > Indeed the problem is in closing more than 400,000 file descriptors
> > in loop. It seems that all current versions of Python are affected.
> > Python2 uses False as default value for the close_fds parameter of
> > the Popen constructor, so this issue is mostly not visible.
> > Python3 has changed this default to True.
> >
> > As Jan Bramkamp suggested, I've wrote simple patch to fix an issue
> > (see attached file). It seems the problem has gone.  
>
> The attachment has been stripped out. Could you paste the diff into
> the message?

Yes, sure.

--- Modules/_posixsubprocess.c.orig     2018-12-24 00:37:14.000000000
+0300 +++ Modules/_posixsubprocess.c          2019-04-12
09:25:21.549389000 +0300 @@ -235,11 +235,15 @@
_close_fds_by_brute_force(long start_fd, }
         start_fd = keep_fd + 1;
     }
+#if defined(__FreeBSD__)
+    closefrom(start_fd);
+#else
     if (start_fd <= end_fd) {
         for (fd_num = start_fd; fd_num < end_fd; ++fd_num) {
             close(fd_num);
         }
     }
+#endif
 }

> If this is a Python issue, shouldn't this be reported upstream, on
> https://bugs.python.org ?

May be. Rather, it is a FreeBSD-specific optimization.

--
Alexander Zagrebin
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-ports
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: python 3 subprocess performance

Dima Pasechnik
On Fri, Apr 12, 2019 at 9:46 AM Alexander Zagrebin <[hidden email]> wrote:

>
> В Fri, 12 Apr 2019 09:36:13 +0200
> Dima Pasechnik <[hidden email]> пишет:
>
> > On Fri, Apr 12, 2019 at 9:11 AM Alexander Zagrebin <[hidden email]>
> > wrote:
> > >
> > > В Thu, 11 Apr 2019 17:32:42 +0200
> > > Jan Bramkamp <[hidden email]> пишет:
> > >
> > > > The reason is that that python does something stupid (tm). It
> > > > tries to close all file descriptors (except a few whitelisted
> > > > ones) up to the maximum file descriptor number. It does this by
> > > > asking the kernel for the maximum possible number and closing
> > > > everything it doesn't want to keep. Some time later someone came
> > > > up with an optimization (read the open file descriptors
> > > > from /dev/fd). All of this pain and suffering is caused by good
> > > > old Ulrich Drepper braindamage:
> > > > https://sourceware.org/bugzilla/show_bug.cgi?id=10353.
> > > >
> > > > Most Linux distros have lower default file descriptor limits than
> > > > FreeBSD making this workaround less painful. The correct solution
> > > > would be to teach python3 about closefrom(2).
> > >
> > > Thank you for hint and testing!
> > >
> > > Indeed the problem is in closing more than 400,000 file descriptors
> > > in loop. It seems that all current versions of Python are affected.
> > > Python2 uses False as default value for the close_fds parameter of
> > > the Popen constructor, so this issue is mostly not visible.
> > > Python3 has changed this default to True.
> > >
> > > As Jan Bramkamp suggested, I've wrote simple patch to fix an issue
> > > (see attached file). It seems the problem has gone.
> >
> > The attachment has been stripped out. Could you paste the diff into
> > the message?
>
> Yes, sure.
>
> --- Modules/_posixsubprocess.c.orig     2018-12-24 00:37:14.000000000
> +0300 +++ Modules/_posixsubprocess.c          2019-04-12
> 09:25:21.549389000 +0300 @@ -235,11 +235,15 @@
> _close_fds_by_brute_force(long start_fd, }
>          start_fd = keep_fd + 1;
>      }
> +#if defined(__FreeBSD__)
> +    closefrom(start_fd);
> +#else
>      if (start_fd <= end_fd) {
>          for (fd_num = start_fd; fd_num < end_fd; ++fd_num) {
>              close(fd_num);
>          }
>      }
> +#endif
>  }
>
> > If this is a Python issue, shouldn't this be reported upstream, on
> > https://bugs.python.org ?
>
> May be. Rather, it is a FreeBSD-specific optimization.

Well, closefrom() is also available in Darwin (a.k.a. MacOSX :-)),
OpenBSD and NetBSD. (It's not documented in current MacOSX, but it is
there, I just checked)
Anyway, FreeBSD Python maintainers will ask for an upstream PR.

I can do such a PR is noone else is willing to...

Dima


>
> --
> Alexander Zagrebin
> _______________________________________________
> [hidden email] mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-ports
> To unsubscribe, send any mail to "[hidden email]"
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-ports
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: python 3 subprocess performance

Alexander Zagrebin-2
В Fri, 12 Apr 2019 12:41:53 +0200
Dima Pasechnik <[hidden email]> пишет:

> > > > В Thu, 11 Apr 2019 17:32:42 +0200
> > > > Jan Bramkamp <[hidden email]> пишет:
> > > >  
> > > > > The reason is that that python does something stupid (tm). It
> > > > > tries to close all file descriptors (except a few whitelisted
> > > > > ones) up to the maximum file descriptor number. It does this
> > > > > by asking the kernel for the maximum possible number and
> > > > > closing everything it doesn't want to keep. Some time later
> > > > > someone came up with an optimization (read the open file
> > > > > descriptors from /dev/fd). All of this pain and suffering is
> > > > > caused by good old Ulrich Drepper braindamage:
> > > > > https://sourceware.org/bugzilla/show_bug.cgi?id=10353.
> > > > >
> > > > > Most Linux distros have lower default file descriptor limits
> > > > > than FreeBSD making this workaround less painful. The correct
> > > > > solution would be to teach python3 about closefrom(2).  
> > > >
> > > > Thank you for hint and testing!
> > > >
> > > > Indeed the problem is in closing more than 400,000 file
> > > > descriptors in loop. It seems that all current versions of
> > > > Python are affected. Python2 uses False as default value for
> > > > the close_fds parameter of the Popen constructor, so this issue
> > > > is mostly not visible. Python3 has changed this default to True.
> > > >
> > > > As Jan Bramkamp suggested, I've wrote simple patch to fix an
> > > > issue (see attached file). It seems the problem has gone.  
> > >
> > > The attachment has been stripped out. Could you paste the diff
> > > into the message?  
> >
> > Yes, sure.
> >
> > --- Modules/_posixsubprocess.c.orig     2018-12-24
> > 00:37:14.000000000 +0300 +++ Modules/_posixsubprocess.c
> > 2019-04-12 09:25:21.549389000 +0300 @@ -235,11 +235,15 @@
> > _close_fds_by_brute_force(long start_fd, }
> >          start_fd = keep_fd + 1;
> >      }
> > +#if defined(__FreeBSD__)
> > +    closefrom(start_fd);
> > +#else
> >      if (start_fd <= end_fd) {
> >          for (fd_num = start_fd; fd_num < end_fd; ++fd_num) {
> >              close(fd_num);
> >          }
> >      }
> > +#endif
> >  }
> >  
> > > If this is a Python issue, shouldn't this be reported upstream, on
> > > https://bugs.python.org ?  
> >
> > May be. Rather, it is a FreeBSD-specific optimization.  
>
> Well, closefrom() is also available in Darwin (a.k.a. MacOSX :-)),
> OpenBSD and NetBSD. (It's not documented in current MacOSX, but it is
> there, I just checked)
> Anyway, FreeBSD Python maintainers will ask for an upstream PR.
>
> I can do such a PR is noone else is willing to...

This would be good. Thanks!

--
Alexander Zagrebin
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-ports
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: python 3 subprocess performance

Kubilay Kocak
In reply to this post by Dima Pasechnik
On 12/04/2019 8:41 pm, Dima Pasechnik wrote:

> On Fri, Apr 12, 2019 at 9:46 AM Alexander Zagrebin <[hidden email]> wrote:
>>
>> В Fri, 12 Apr 2019 09:36:13 +0200
>> Dima Pasechnik <[hidden email]> пишет:
>>
>>> On Fri, Apr 12, 2019 at 9:11 AM Alexander Zagrebin <[hidden email]>
>>> wrote:
>>>>
>>>> В Thu, 11 Apr 2019 17:32:42 +0200
>>>> Jan Bramkamp <[hidden email]> пишет:
>>>>
>>>>> The reason is that that python does something stupid (tm). It
>>>>> tries to close all file descriptors (except a few whitelisted
>>>>> ones) up to the maximum file descriptor number. It does this by
>>>>> asking the kernel for the maximum possible number and closing
>>>>> everything it doesn't want to keep. Some time later someone came
>>>>> up with an optimization (read the open file descriptors
>>>>> from /dev/fd). All of this pain and suffering is caused by good
>>>>> old Ulrich Drepper braindamage:
>>>>> https://sourceware.org/bugzilla/show_bug.cgi?id=10353.
>>>>>
>>>>> Most Linux distros have lower default file descriptor limits than
>>>>> FreeBSD making this workaround less painful. The correct solution
>>>>> would be to teach python3 about closefrom(2).
>>>>
>>>> Thank you for hint and testing!
>>>>
>>>> Indeed the problem is in closing more than 400,000 file descriptors
>>>> in loop. It seems that all current versions of Python are affected.
>>>> Python2 uses False as default value for the close_fds parameter of
>>>> the Popen constructor, so this issue is mostly not visible.
>>>> Python3 has changed this default to True.
>>>>
>>>> As Jan Bramkamp suggested, I've wrote simple patch to fix an issue
>>>> (see attached file). It seems the problem has gone.
>>>
>>> The attachment has been stripped out. Could you paste the diff into
>>> the message?
>>
>> Yes, sure.
>>
>> --- Modules/_posixsubprocess.c.orig     2018-12-24 00:37:14.000000000
>> +0300 +++ Modules/_posixsubprocess.c          2019-04-12
>> 09:25:21.549389000 +0300 @@ -235,11 +235,15 @@
>> _close_fds_by_brute_force(long start_fd, }
>>           start_fd = keep_fd + 1;
>>       }
>> +#if defined(__FreeBSD__)
>> +    closefrom(start_fd);
>> +#else
>>       if (start_fd <= end_fd) {
>>           for (fd_num = start_fd; fd_num < end_fd; ++fd_num) {
>>               close(fd_num);
>>           }
>>       }
>> +#endif
>>   }
>>
>>> If this is a Python issue, shouldn't this be reported upstream, on
>>> https://bugs.python.org ?
>>
>> May be. Rather, it is a FreeBSD-specific optimization.
>
> Well, closefrom() is also available in Darwin (a.k.a. MacOSX :-)),
> OpenBSD and NetBSD. (It's not documented in current MacOSX, but it is
> there, I just checked)
> Anyway, FreeBSD Python maintainers will ask for an upstream PR.
>
> I can do such a PR is noone else is willing to...
>
> Dima
>
>

Hi Dima,

Issue exists for this:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221700

Pending *upstreamable* patches for lang/python*, that we can carry
locally until released.

_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-ports
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: python 3 subprocess performance

Jan Beich-5
Kubilay Kocak <[hidden email]> writes:

> On 12/04/2019 8:41 pm, Dima Pasechnik wrote:
>
>> On Fri, Apr 12, 2019 at 9:46 AM Alexander Zagrebin <[hidden email]> wrote:
>>>
>>> В Fri, 12 Apr 2019 09:36:13 +0200
>>> Dima Pasechnik <[hidden email]> пишет:
>>>
>>>> On Fri, Apr 12, 2019 at 9:11 AM Alexander Zagrebin <[hidden email]>
>>>> wrote:
>>>>>
>>>>> В Thu, 11 Apr 2019 17:32:42 +0200
>>>>> Jan Bramkamp <[hidden email]> пишет:
>>>>>
>>>>>> The reason is that that python does something stupid (tm). It
>>>>>> tries to close all file descriptors (except a few whitelisted
>>>>>> ones) up to the maximum file descriptor number. It does this by
>>>>>> asking the kernel for the maximum possible number and closing
>>>>>> everything it doesn't want to keep. Some time later someone came
>>>>>> up with an optimization (read the open file descriptors
>>>>>> from /dev/fd). All of this pain and suffering is caused by good
>>>>>> old Ulrich Drepper braindamage:
>>>>>> https://sourceware.org/bugzilla/show_bug.cgi?id=10353.
>>>>>>
>>>>>> Most Linux distros have lower default file descriptor limits than
>>>>>> FreeBSD making this workaround less painful. The correct solution
>>>>>> would be to teach python3 about closefrom(2).
>>>>>
>>>>> Thank you for hint and testing!
>>>>>
>>>>> Indeed the problem is in closing more than 400,000 file descriptors
>>>>> in loop. It seems that all current versions of Python are affected.
>>>>> Python2 uses False as default value for the close_fds parameter of
>>>>> the Popen constructor, so this issue is mostly not visible.
>>>>> Python3 has changed this default to True.
>>>>>
>>>>> As Jan Bramkamp suggested, I've wrote simple patch to fix an issue
>>>>> (see attached file). It seems the problem has gone.
>>>>
>>>> The attachment has been stripped out. Could you paste the diff into
>>>> the message?
>>>
>>> Yes, sure.
>>>
>>> --- Modules/_posixsubprocess.c.orig     2018-12-24 00:37:14.000000000
>>> +0300 +++ Modules/_posixsubprocess.c          2019-04-12
>>> 09:25:21.549389000 +0300 @@ -235,11 +235,15 @@
>>> _close_fds_by_brute_force(long start_fd, }
>>>           start_fd = keep_fd + 1;
>>>       }
>>> +#if defined(__FreeBSD__)
>>> +    closefrom(start_fd);
>>> +#else
>>>       if (start_fd <= end_fd) {
>>>           for (fd_num = start_fd; fd_num < end_fd; ++fd_num) {
>>>               close(fd_num);
>>>           }
>>>       }
>>> +#endif
>>>   }
>>>
>>>> If this is a Python issue, shouldn't this be reported upstream, on
>>>> https://bugs.python.org ?
>>>
>>> May be. Rather, it is a FreeBSD-specific optimization.
>>
>> Well, closefrom() is also available in Darwin (a.k.a. MacOSX :-)),
>> OpenBSD and NetBSD. (It's not documented in current MacOSX, but it is
>> there, I just checked)

FreeBSD was late to the party. Even DragonFly got closefrom() before. ;)

Note, closefrom() looks non-atomic on Solaris.
http://src.illumos.org/source/xref/illumos-gate/usr/src/lib/libc/port/gen/closefrom.c

> Issue exists for this:
>
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221700

Oh, nice. I've recently it hit via www/firefox build.
https://bugzilla.mozilla.org/show_bug.cgi?id=1507655
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-ports
To unsubscribe, send any mail to "[hidden email]"