Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed]

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed]

Konstantin Belousov-3
On Thu, Feb 28, 2019 at 11:39:06PM -0800, Mark Millard wrote:
> [The new, trial code also has truncation occurring.]
In fact no, I do not think it is.

> An example observation of diff_scaled having an overflowed
> value was:
>
> scale_factor            == 0x80da2067ac
> scale_factor*freq overflows unsigned, 64 bit representation.
> tim_offset              ==   0x3da0eaeb
> tim_cnt                 ==   0x42dea3c4
> tim_diff                ==    0x53db8d9
> For reference:                0x1fc9d43 == 0xffffffffffffffffull/scale_factor
> scaled_diff       == 0xA353A5BF3FF780CC (truncated to 64 bits)
>
> But for the new, trail code:
>
> 0x80da2067ac is 40 bits
>    0x53db8d9 is 27 bits
> So 67 bits, more than 63. Then:
>
>    x
> == (0x80da2067ac>>32) * 0x53db8d9
> == 0x80 * 0x53db8d9
> == 0x29EDC6C80
>
>    x>>32
> == 0x2
>
>    x<<32
> == 0x9EDC6C8000000000 (limited to 64 bits)
> Note the truncation of: 0x29EDC6C8000000000.
Right, this is how the patch is supposed to work.  Note that the overflow
bits 'lost' due to overflow of the left shift are the same bits that as
used to increment bt->sec:
        bt->sec += x >> 32;
So the 2 seconds are accounted for.

>
> Thus the "bintime_addx(bt, x << 32)" is still
> based on a truncated value.
>

I must admit that 2 seconds of interval where the timehands where
not updated is too much.  This might be the real cause of all ppc
troubles.  I tried to see if the overflow case is possible on amd64,
and did not get a single case of the '> 63' branch executed during the
/usr/tests/lib/libc run.

Actually, the same overflow-prone code exists in libc, so below is the
updated patch:
- I added __predict_false()
- libc multiplication is also done separately for high-order bits.
(fftclock counterpart is still pending).

diff --git a/lib/libc/sys/__vdso_gettimeofday.c b/lib/libc/sys/__vdso_gettimeofday.c
index 3749e0473af..a14576988ff 100644
--- a/lib/libc/sys/__vdso_gettimeofday.c
+++ b/lib/libc/sys/__vdso_gettimeofday.c
@@ -32,6 +32,8 @@ __FBSDID("$FreeBSD$");
 #include <sys/time.h>
 #include <sys/vdso.h>
 #include <errno.h>
+#include <limits.h>
+#include <strings.h>
 #include <time.h>
 #include <machine/atomic.h>
 #include "libc_private.h"
@@ -62,6 +64,7 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs)
 {
  struct vdso_timehands *th;
  uint32_t curr, gen;
+ uint64_t scale, x;
  u_int delta;
  int error;
 
@@ -78,7 +81,14 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs)
  continue;
  if (error != 0)
  return (error);
- bintime_addx(bt, th->th_scale * delta);
+ scale = th->th_scale;
+ if (__predict_false(fls(scale) + fls(delta) > 63)) {
+ x = (scale >> 32) * delta;
+ scale &= UINT_MAX;
+ bt->sec += x >> 32;
+ bintime_addx(bt, x << 32);
+ }
+ bintime_addx(bt, scale * delta);
  if (abs)
  bintime_add(bt, &th->th_boottime);
 
diff --git a/sys/kern/kern_tc.c b/sys/kern/kern_tc.c
index 2656fb4d22f..be75781e000 100644
--- a/sys/kern/kern_tc.c
+++ b/sys/kern/kern_tc.c
@@ -355,13 +355,22 @@ void
 binuptime(struct bintime *bt)
 {
  struct timehands *th;
- u_int gen;
+ uint64_t scale, x;
+ u_int delta, gen;
 
  do {
  th = timehands;
  gen = atomic_load_acq_int(&th->th_generation);
  *bt = th->th_offset;
- bintime_addx(bt, th->th_scale * tc_delta(th));
+ scale = th->th_scale;
+ delta = tc_delta(th);
+ if (__predict_false(fls(scale) + fls(delta) > 63)) {
+ x = (scale >> 32) * delta;
+ scale &= UINT_MAX;
+ bt->sec += x >> 32;
+ bintime_addx(bt, x << 32);
+ }
+ bintime_addx(bt, scale * delta);
  atomic_thread_fence_acq();
  } while (gen == 0 || gen != th->th_generation);
 }
@@ -388,13 +397,22 @@ void
 bintime(struct bintime *bt)
 {
  struct timehands *th;
- u_int gen;
+ uint64_t scale, x;
+ u_int delta, gen;
 
  do {
  th = timehands;
  gen = atomic_load_acq_int(&th->th_generation);
  *bt = th->th_bintime;
- bintime_addx(bt, th->th_scale * tc_delta(th));
+ scale = th->th_scale;
+ delta = tc_delta(th);
+ if (__predict_false(fls(scale) + fls(delta) > 63)) {
+ x = (scale >> 32) * delta;
+ scale &= UINT_MAX;
+ bt->sec += x >> 32;
+ bintime_addx(bt, x << 32);
+ }
+ bintime_addx(bt, scale * delta);
  atomic_thread_fence_acq();
  } while (gen == 0 || gen != th->th_generation);
 }
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed]

Konstantin Belousov-3
On Sat, Mar 02, 2019 at 05:40:58AM +1100, Bruce Evans wrote:

> On Fri, 1 Mar 2019, Konstantin Belousov wrote:
>
> > On Thu, Feb 28, 2019 at 11:39:06PM -0800, Mark Millard wrote:
> >> [The new, trial code also has truncation occurring.]
> > In fact no, I do not think it is.
> >
> >> An example observation of diff_scaled having an overflowed
> >> value was:
> >>
> >> scale_factor            == 0x80da2067ac
> >> scale_factor*freq overflows unsigned, 64 bit representation.
> >> tim_offset              ==   0x3da0eaeb
> >> tim_cnt                 ==   0x42dea3c4
> >> tim_diff                ==    0x53db8d9
> >> For reference:                0x1fc9d43 == 0xffffffffffffffffull/scale_factor
> >> scaled_diff       == 0xA353A5BF3FF780CC (truncated to 64 bits)
> >>
> >> But for the new, trail code:
> >>
> >> 0x80da2067ac is 40 bits
> >>    0x53db8d9 is 27 bits
> >> So 67 bits, more than 63. Then:
> >>
> >>    x
> >> == (0x80da2067ac>>32) * 0x53db8d9
> >> == 0x80 * 0x53db8d9
> >> == 0x29EDC6C80
> >>
> >>    x>>32
> >> == 0x2
> >>
> >>    x<<32
> >> == 0x9EDC6C8000000000 (limited to 64 bits)
> >> Note the truncation of: 0x29EDC6C8000000000.
> > Right, this is how the patch is supposed to work.  Note that the overflow
> > bits 'lost' due to overflow of the left shift are the same bits that as
> > used to increment bt->sec:
> > bt->sec += x >> 32;
> > So the 2 seconds are accounted for.
> >
> >>
> >> Thus the "bintime_addx(bt, x << 32)" is still
> >> based on a truncated value.
> >
> > I must admit that 2 seconds of interval where the timehands where
> > not updated is too much.  This might be the real cause of all ppc
> > troubles.  I tried to see if the overflow case is possible on amd64,
> > and did not get a single case of the '> 63' branch executed during the
> > /usr/tests/lib/libc run.
>
> The algorithm requires the update interval to be less than 1 second.
> th_scale is 2**64 / tc_frequency, so whatever tc_frequency is, after
> 1 second the value of the multiplication is approximately 2**64 so
> it overflows about then (depending on rounding).
>
> The most useful timecounters are TSC's, and these give another overflow
> in tc_delta() after 1 second when their frequency is 4 GHz (except the
> bogus TSC-low timecounter reduces the frequency to below 2 binary GHz,
> so the usual case is overflow after 2 seconds).
As I said, I was unable to trigger the overflow on amd64.

>
> > Actually, the same overflow-prone code exists in libc, so below is the
> > updated patch:
> > - I added __predict_false()
> > - libc multiplication is also done separately for high-order bits.
> > (fftclock counterpart is still pending).
> >
> > diff --git a/lib/libc/sys/__vdso_gettimeofday.c b/lib/libc/sys/__vdso_gettimeofday.c
> > index 3749e0473af..a14576988ff 100644
> > --- a/lib/libc/sys/__vdso_gettimeofday.c
> > +++ b/lib/libc/sys/__vdso_gettimeofday.c
> > @@ -32,6 +32,8 @@ __FBSDID("$FreeBSD$");
> > #include <sys/time.h>
> > #include <sys/vdso.h>
> > #include <errno.h>
> > +#include <limits.h>
> > +#include <strings.h>
> > #include <time.h>
> > #include <machine/atomic.h>
> > #include "libc_private.h"
> > @@ -62,6 +64,7 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs)
> > {
> > struct vdso_timehands *th;
> > uint32_t curr, gen;
> > + uint64_t scale, x;
> > u_int delta;
> > int error;
> >
> > @@ -78,7 +81,14 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs)
> > continue;
> > if (error != 0)
> > return (error);
> > - bintime_addx(bt, th->th_scale * delta);
> > + scale = th->th_scale;
> > + if (__predict_false(fls(scale) + fls(delta) > 63)) {
>
> This is unnecessarily pessimal.  Updates must be frequent enough to prevent
> tc_delta() overflowing, and it is even easier to arrange that this
> multiplication doesn't overflow (since the necessary update interval for
> the latter is constant).
>
> `scale' is 64 bits, so fls(scale) is broken on 32-bit arches, and
> flls(scale) is an especially large pessimization.  I saw this on my
> calcru1() fixes -- the flls()s take almost as long as long long
> divisions when the use the pessimal C versions.
Ok, fixed.

>
> The algorithm requires tc_delta() to be only 32 bits, since otherwise the
> multiplication would be 64 x 64 bits so would be much slower and harder
> to write.
>
> If tc_freqency is far above 4GHz, then th_scale is far below 4G, so the
> scaling is not so accurate.  But 0.25 parts per billion is much more than
> enough.  Even 1 part per million is enough for a TSC, since TSC instability
> is more than 1ppm.  The overflows could be pushed off to 1024 seconds by
> dividing by 1024 at suitable places.  A 32-bit scale times a 64-bit delta
> would be simple compared with both 64 bits (much like the code here).
>
> The code here can be optimized using values calculated at initialization
> time instead of fls*().  The overflow threshold for delta is approximately
> 2**64 / tc_frequency.
>
> > + x = (scale >> 32) * delta;
> > + scale &= UINT_MAX;
> > + bt->sec += x >> 32;
> > + bintime_addx(bt, x << 32);
> > + }
> > + bintime_addx(bt, scale * delta);
> > if (abs)
> > bintime_add(bt, &th->th_boottime);
>
> When the timecounter is the i8254, as it often was when timecounters
> were new, tc_windup() had to be called more often than every i8254 rollover
> (in practice once every hardclock tick), partly to keep tc_delta() small
> (since rollover gives a form of overflow).  This was not so easy to arrange.
> It requires not losing any hardclock ticks and also not having any with
> high latency and also complications to detect the rollover when there is
> small latency.  Most hardware is easier to handle now.  With tickless
> kernels, hardclock() is often not called for about 1 second, but it must
> be called at least that often to prevent the overflow here.

Updated patch.

diff --git a/lib/libc/sys/__vdso_gettimeofday.c b/lib/libc/sys/__vdso_gettimeofday.c
index 3749e0473af..fdefda08e39 100644
--- a/lib/libc/sys/__vdso_gettimeofday.c
+++ b/lib/libc/sys/__vdso_gettimeofday.c
@@ -32,6 +32,8 @@ __FBSDID("$FreeBSD$");
 #include <sys/time.h>
 #include <sys/vdso.h>
 #include <errno.h>
+#include <limits.h>
+#include <strings.h>
 #include <time.h>
 #include <machine/atomic.h>
 #include "libc_private.h"
@@ -62,7 +64,8 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs)
 {
  struct vdso_timehands *th;
  uint32_t curr, gen;
- u_int delta;
+ uint64_t scale, x;
+ u_int delta, scale_bits;
  int error;
 
  do {
@@ -78,7 +81,19 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs)
  continue;
  if (error != 0)
  return (error);
- bintime_addx(bt, th->th_scale * delta);
+ scale = th->th_scale;
+#ifdef _LP64
+ scale_bits = ffsl(scale);
+#else
+ scale_bits = ffsll(scale);
+#endif
+ if (__predict_false(scale_bits + fls(delta) > 63)) {
+ x = (scale >> 32) * delta;
+ scale &= UINT_MAX;
+ bt->sec += x >> 32;
+ bintime_addx(bt, x << 32);
+ }
+ bintime_addx(bt, scale * delta);
  if (abs)
  bintime_add(bt, &th->th_boottime);
 
diff --git a/sys/kern/kern_tc.c b/sys/kern/kern_tc.c
index 2656fb4d22f..eedea5183c0 100644
--- a/sys/kern/kern_tc.c
+++ b/sys/kern/kern_tc.c
@@ -72,6 +72,7 @@ struct timehands {
  struct timecounter *th_counter;
  int64_t th_adjustment;
  uint64_t th_scale;
+ u_int th_scale_bits;
  u_int th_offset_count;
  struct bintime th_offset;
  struct bintime th_bintime;
@@ -355,13 +356,22 @@ void
 binuptime(struct bintime *bt)
 {
  struct timehands *th;
- u_int gen;
+ uint64_t scale, x;
+ u_int delta, gen;
 
  do {
  th = timehands;
  gen = atomic_load_acq_int(&th->th_generation);
  *bt = th->th_offset;
- bintime_addx(bt, th->th_scale * tc_delta(th));
+ scale = th->th_scale;
+ delta = tc_delta(th);
+ if (__predict_false(th->th_scale_bits + fls(delta) > 63)) {
+ x = (scale >> 32) * delta;
+ scale &= UINT_MAX;
+ bt->sec += x >> 32;
+ bintime_addx(bt, x << 32);
+ }
+ bintime_addx(bt, scale * delta);
  atomic_thread_fence_acq();
  } while (gen == 0 || gen != th->th_generation);
 }
@@ -388,13 +398,22 @@ void
 bintime(struct bintime *bt)
 {
  struct timehands *th;
- u_int gen;
+ uint64_t scale, x;
+ u_int delta, gen;
 
  do {
  th = timehands;
  gen = atomic_load_acq_int(&th->th_generation);
  *bt = th->th_bintime;
- bintime_addx(bt, th->th_scale * tc_delta(th));
+ scale = th->th_scale;
+ delta = tc_delta(th);
+ if (__predict_false(th->th_scale_bits + fls(delta) > 63)) {
+ x = (scale >> 32) * delta;
+ scale &= UINT_MAX;
+ bt->sec += x >> 32;
+ bintime_addx(bt, x << 32);
+ }
+ bintime_addx(bt, scale * delta);
  atomic_thread_fence_acq();
  } while (gen == 0 || gen != th->th_generation);
 }
@@ -1464,6 +1483,11 @@ tc_windup(struct bintime *new_boottimebin)
  scale += (th->th_adjustment / 1024) * 2199;
  scale /= th->th_counter->tc_frequency;
  th->th_scale = scale * 2;
+#ifdef _LP64
+ th->th_scale_bits = ffsl(th->th_scale);
+#else
+ th->th_scale_bits = ffsll(th->th_scale);
+#endif
 
  /*
  * Now that the struct timehands is again consistent, set the new
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed]

freebsd-hackers mailing list
In reply to this post by Konstantin Belousov-3


On 2019-Mar-1, at 13:11, Ian Lepore <ian at freebsd.org> wrote:

> On Fri, 2019-03-01 at 20:57 +0000, Poul-Henning Kamp wrote:
>> --------
>> In message <[hidden email]>, Mark
>> Millard via freebsd-hackers writes:
>>
>>>> I must admit that 2 seconds of interval where the timehands where
>>>> not updated is too much.
>>
>> I have no idea how you got in that situation, but it is very far
>> from how timecounters were designed to work.
>>
>
> I wonder if it's fallout from reducing the number of timehands to 2,
> which always struck me as a really bad idea. I know of at least one arm
> configuration which fails because of it (it takes a combo of a single-
> core system, and a pps capture driver that uses hardware latching of
> the timer and the polling method for reading the latched value; given
> all that, at least 4 sets of timehands are needed to avoid losing PPS
> events due to generation changes).

Thanks for the suggestion.

I tried putting back having 10 timehands structures with my
variant of the original binuptime code. (My investigative
code records some information.) Unfortunately, having the
extra timehands's did not change the PowerMac G5's behavior.


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)

_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed]

Konstantin Belousov
In reply to this post by Konstantin Belousov-3
On Sat, Mar 02, 2019 at 07:38:20AM +1100, Bruce Evans wrote:

> On Fri, 1 Mar 2019, Konstantin Belousov wrote:
> > + scale = th->th_scale;
> > + delta = tc_delta(th);
> > + if (__predict_false(th->th_scale_bits + fls(delta) > 63)) {
>
> Better, but shouldn't be changed (and the bug that causes the large intervals
> remains unlocated), and if it is changed then it should use:
>
>   if (delta >= th->th_large_delta)
>
> > @@ -1464,6 +1483,11 @@ tc_windup(struct bintime *new_boottimebin)
> > scale += (th->th_adjustment / 1024) * 2199;
> > scale /= th->th_counter->tc_frequency;
> > th->th_scale = scale * 2;
> > +#ifdef _LP64
> > + th->th_scale_bits = ffsl(th->th_scale);
> > +#else
> > + th->th_scale_bits = ffsll(th->th_scale);
> > +#endif
>
>   th->th_large_delta = ((uint64_t)1 << 63) / scale;

So I am able to reproduce it with some surprising ease on HPET running
on Haswell.

I looked at the generated code for libc which still uses ffsll() on 32bit,
due to the ABI issues.  At least clang generates two BSF instructions for
this code, so I think that forking vdso_timehands ABI for this is not
reasonable right now.

diff --git a/lib/libc/sys/__vdso_gettimeofday.c b/lib/libc/sys/__vdso_gettimeofday.c
index 3749e0473af..3c3c71207bd 100644
--- a/lib/libc/sys/__vdso_gettimeofday.c
+++ b/lib/libc/sys/__vdso_gettimeofday.c
@@ -32,6 +32,8 @@ __FBSDID("$FreeBSD$");
 #include <sys/time.h>
 #include <sys/vdso.h>
 #include <errno.h>
+#include <limits.h>
+#include <strings.h>
 #include <time.h>
 #include <machine/atomic.h>
 #include "libc_private.h"
@@ -62,7 +64,8 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs)
 {
  struct vdso_timehands *th;
  uint32_t curr, gen;
- u_int delta;
+ uint64_t scale, x;
+ u_int delta, scale_bits;
  int error;
 
  do {
@@ -78,7 +81,19 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs)
  continue;
  if (error != 0)
  return (error);
- bintime_addx(bt, th->th_scale * delta);
+ scale = th->th_scale;
+#ifdef _LP64
+ scale_bits = ffsl(scale);
+#else
+ scale_bits = ffsll(scale);
+#endif
+ if (__predict_false(scale_bits + fls(delta) > 63)) {
+ x = (scale >> 32) * delta;
+ scale &= UINT_MAX;
+ bt->sec += x >> 32;
+ bintime_addx(bt, x << 32);
+ }
+ bintime_addx(bt, scale * delta);
  if (abs)
  bintime_add(bt, &th->th_boottime);
 
diff --git a/sys/kern/kern_tc.c b/sys/kern/kern_tc.c
index 2656fb4d22f..0a11c726e3c 100644
--- a/sys/kern/kern_tc.c
+++ b/sys/kern/kern_tc.c
@@ -72,6 +72,7 @@ struct timehands {
  struct timecounter *th_counter;
  int64_t th_adjustment;
  uint64_t th_scale;
+ uint64_t th_large_delta;
  u_int th_offset_count;
  struct bintime th_offset;
  struct bintime th_bintime;
@@ -355,13 +356,22 @@ void
 binuptime(struct bintime *bt)
 {
  struct timehands *th;
- u_int gen;
+ uint64_t scale, x;
+ u_int delta, gen;
 
  do {
  th = timehands;
  gen = atomic_load_acq_int(&th->th_generation);
  *bt = th->th_offset;
- bintime_addx(bt, th->th_scale * tc_delta(th));
+ scale = th->th_scale;
+ delta = tc_delta(th);
+ if (__predict_false(th->th_large_delta <= delta)) {
+ x = (scale >> 32) * delta;
+ scale &= UINT_MAX;
+ bt->sec += x >> 32;
+ bintime_addx(bt, x << 32);
+ }
+ bintime_addx(bt, scale * delta);
  atomic_thread_fence_acq();
  } while (gen == 0 || gen != th->th_generation);
 }
@@ -388,13 +398,22 @@ void
 bintime(struct bintime *bt)
 {
  struct timehands *th;
- u_int gen;
+ uint64_t scale, x;
+ u_int delta, gen;
 
  do {
  th = timehands;
  gen = atomic_load_acq_int(&th->th_generation);
  *bt = th->th_bintime;
- bintime_addx(bt, th->th_scale * tc_delta(th));
+ scale = th->th_scale;
+ delta = tc_delta(th);
+ if (__predict_false(th->th_large_delta <= delta)) {
+ x = (scale >> 32) * delta;
+ scale &= UINT_MAX;
+ bt->sec += x >> 32;
+ bintime_addx(bt, x << 32);
+ }
+ bintime_addx(bt, scale * delta);
  atomic_thread_fence_acq();
  } while (gen == 0 || gen != th->th_generation);
 }
@@ -1464,6 +1483,7 @@ tc_windup(struct bintime *new_boottimebin)
  scale += (th->th_adjustment / 1024) * 2199;
  scale /= th->th_counter->tc_frequency;
  th->th_scale = scale * 2;
+ th->th_large_delta = ((uint64_t)1 << 63) / scale;
 
  /*
  * Now that the struct timehands is again consistent, set the new

_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed]

Konstantin Belousov
In reply to this post by Konstantin Belousov-3
On Sat, Mar 02, 2019 at 02:49:59PM +0000, Poul-Henning Kamp wrote:
> --------
> In message <[hidden email]>, Konstantin Belousov writes:
> >On Sat, Mar 02, 2019 at 02:17:10PM +0000, Poul-Henning Kamp wrote:
>
> >More the timehands, more the propagation can be delayed by the hardware.
>
> One of the two of us don't understand how timecounters work.

Perhaps look at the code ?  Pre-2015 (or so) timecounters did not worked
on non-TSO machines.
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed]

freebsd-hackers mailing list
In reply to this post by Konstantin Belousov-3


On 2019-Mar-1, at 13:19, Mark Millard <marklmi at yahoo.com> wrote:

> On 2019-Mar-1, at 11:42, Konstantin Belousov <kib at freebsd.org> wrote:
>
>> . . .
>> +#ifdef _LP64
>> + scale_bits = ffsl(scale);
>> +#else
>> + scale_bits = ffsll(scale);
>> +#endif. . .
>> + if (__predict_false(scale_bits + fls(delta) > 63)) {
>
>
> The patch from yesterday uniformly used:
>
> int
> fls(int mask)
> {
>        int bit;
>
>        if (mask == 0)
>                return (0);
>        for (bit = 1; mask != 1; bit++)
>                mask = (unsigned int)mask >> 1;
>        return (bit);
> }
>
> that looks for the most significant 1 bit.
>
> The new patch uses in some places:
>
> int
> ffsl(long mask)
> {
>        int bit;
>
>        if (mask == 0)
>                return (0);
>        for (bit = 1; !(mask & 1); bit++)
>                mask = (unsigned long)mask >> 1;
>        return (bit);
> }
>
> that looks for the least significant 1 bit. Similarly
> for:
>
> int
> ffsll(long long mask)
> {
>        int bit;
>
>        if (mask == 0)
>                return (0);
>        for (bit = 1; !(mask & 1); bit++)
>                mask = (unsigned long long)mask >> 1;
>        return (bit);
> }
>
> Was that deliberate?

Be that as it may:

I've been watching you and Bruce work on a code update. I'm
intending to wait until you let me know you want me to test
before trying again (on the PowerMac G5). (I've not been
testing on anything else: I did not intended to test
systems I've not seen a problem with until after the G5
seemed to be working.)

In part my waiting is because the first patch that I tried
left things unusable and I have to recover from the
consequences of a forced power off. It took a fair amount
of time.

I'd not be surprised if the G5 type of context has another
problem, separate from what I reported and what you are
working on. If so I may not be able to be an effective
tester: the fix may just repeat what I saw the first
time (based on a messed up context).

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)

_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[hidden email]"