Ryzen issues on FreeBSD ?

Ryzen issues on FreeBSD ?

mdtancsa
With the Intel issues exposed by Meltdown, we were looking at possibly
deploying some Ryzen-based servers for FreeBSD.  We got a pair of
ASUS PRIME X370-PRO boards with

CPU: AMD Ryzen 5 1600X Six-Core Processor            (3593.34-MHz K8-class CPU)
  Origin="AuthenticAMD"  Id=0x800f11  Family=0x17  Model=0x1  Stepping=1

Everything is at its default in the BIOS, no overclocking etc.

However, we are seeing random lockups on both boxes. It doesn't seem to
correspond with load/activity, and it's a hard lockup: the keyboard is not
responsive and I can't break to the serial debugger, so it doesn't seem to be
an issue with something in the kernel going into deadlock.

It sort of feels like a hardware issue, but it seems odd that both boxes
are showing the same issue with random lockups like that.  It could be
twice in a day or once every 3 days.

Anyone have any insights ?  Anyone have any suggestions about better
motherboards out there ? We are waiting for Supermicro's Epyc
availability, but nothing yet.  It would be nice if we could find a
board with at least some hardware watchdog on it.


        ---Mike

--
-------------------
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, [hidden email]
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[hidden email]"

Re: Ryzen issues on FreeBSD ?

Pete French-3
I am in much the same situation as you (want to deploy Epyc, waiting for
SM stuff to become available). I currently have here a set of parts to
make a test Ryzen box, so you are ahead of me on that. I should
have that going this week, I hope.

Are you running the latest STABLE? There were some patches for Ryzen
which went in, I believe, and might affect the stability. Specifically the
changes to stop it locking up when executing code in the top page?

I'll get back to you when I have done some more testing...

-pete.


Re: Ryzen issues on FreeBSD ?

Nimrod Levy-2
In reply to this post by mdtancsa
I've been seeing similar issues on Ryzen and asked some questions, here
https://lists.freebsd.org/pipermail/freebsd-stable/2017-December/088121.html

My previous queries didn't go anywhere.

--
Nimrod


Re: Ryzen issues on FreeBSD ?

mdtancsa
On 1/17/2018 8:46 AM, Nimrod Levy wrote:
> I've been seeing similar issues on Ryzen and asked some questions,
> here https://lists.freebsd.org/pipermail/freebsd-stable/2017-December/088121.html
>
> My previous queries didn't go anywhere.  
>

That's not very promising :(  Googling around shows lots of similar
reports on both FreeBSD and Linux, but it's a lot of "I tweaked this BIOS
setting and so far so good" and nothing definitive / conclusive.  Having
to mess about with hardware settings for days on end hoping to fix
random lockups is .... not good.

        ---Mike


--
-------------------
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, [hidden email]
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/

Re: Ryzen issues on FreeBSD ?

mdtancsa
In reply to this post by Pete French-3
On 1/17/2018 8:43 AM, Pete French wrote:
>
> Are you running the latest STABLE? There were some patches for Ryzen
> which went in, I believe, and might affect the stability. Specifically the
> changes to stop it locking up when executing code in the top page?

Hi,
        I was testing with RELENG_11 as of 2 days ago.  The fix seems to be there

# sysctl -A hw.lower_amd64_sharedpage
hw.lower_amd64_sharedpage: 1
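
The check above can be wrapped in a small script when several boxes need to be compared. This is only a sketch: the sysctl name is the one shown in the output above, while the function name `sharedpage_status` and its messages are made up for illustration. A value of 1 is assumed to mean the workaround is active, as the thread suggests.

```shell
#!/bin/sh
# Hypothetical helper: turn the value of hw.lower_amd64_sharedpage into a readable status.
sharedpage_status() {
    case "$1" in
        1)  echo "shared page lowered (workaround active)" ;;
        0)  echo "shared page not lowered (workaround inactive)" ;;
        "") echo "sysctl not present (kernel without the change, or not amd64)" ;;
        *)  echo "unexpected value: $1" ;;
    esac
}

# sysctl -n prints only the value; errors are discarded so a missing
# knob simply yields an empty string.
sharedpage_status "$(sysctl -n hw.lower_amd64_sharedpage 2>/dev/null)"
```

Keeping the interpretation in a function that takes the value as an argument makes it easy to try on a machine that does not have the sysctl at all.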

Would love to find a class of motherboard whose pitch is "You don't need
to dork around with any BIOS settings. It just works.  Oh, and we have a
hardware watchdog too".... IPMI would be stellar.

        ---Mike

Re: Ryzen issues on FreeBSD ?

Don Lewis-5
On 17 Jan, Mike Tancsa wrote:

> On 1/17/2018 8:43 AM, Pete French wrote:
>>
>> Are you running the latest STABLE? There were some patches for Ryzen
>> which went in, I believe, and might affect the stability. Specifically the
>> changes to stop it locking up when executing code in the top page?
>
> Hi,
> I was testing with RELENG_11 as of 2 days ago.  The fix seems to be there
>
> # sysctl -A hw.lower_amd64_sharedpage
> hw.lower_amd64_sharedpage: 1
>
> Would love to find a class of motherboard whose pitch is "You don't need
> to dork around with any BIOS settings. It just works.  Oh, and we have a
> hardware watchdog too".... IPMI would be stellar.

The shared page change fixed the random lockup and silent reboot problem
for me.  I've got a 1700X eight core CPU and a Gigabyte X370 Gaming 5. I
did have to RMA my CPU (it was an early one) because it had the problem
with random segfaults that seemed to be triggered by process migration
between CPU cores.  I still haven't switched over to using it for
package builds because I see more random fallout than on my older
package builder.  I'm not blaming the hardware for that at this point
because I see a lot of the same issues on my older machine, but less
frequently.

One thing to watch (though it should be less critical with a six core
CPU) is VRM cooling.  I removed the stupid plastic shroud over the VRM
sink on my motherboard so that it gets some more airflow.


Re: Ryzen issues on FreeBSD ?

mdtancsa
On 1/17/2018 3:39 PM, Don Lewis wrote:

> The shared page change fixed the random lockup and silent reboot problem
> for me.  I've got a 1700X eight core CPU and a Gigabyte X370 Gaming 5. I
> did have to RMA my CPU (it was an early one) because it had the problem
> with random segfaults that seemed to be triggered by process migration
> between CPU cores.  I still haven't switched over to using it for
> package builds because I see more random fallout than on my older
> package builder.  I'm not blaming the hardware for that at this point
> because I see a lot of the same issues on my older machine, but less
> frequently.
>
> One thing to watch (though it should be less critical with a six core
> CPU) is VRM cooling.  I removed the stupid plastic shroud over the VRM
> sink on my motherboard so that it gets some more airflow.

Thanks! I will confirm the cooling.  I just tried looking at the CPU
fan control in the BIOS and upped it to "turbo" from the default.  Does
amdtemp.ko work with your chipset? Nothing on mine unfortunately, so I
can't tell from the OS if it's running hot.

Is there a way to see if your CPU is old and has that bug? I haven't
seen any segfaults on the few dozen buildworlds I have done. So far it's
always been a total lockup and not a crash with RELENG_11.

x86info v1.31pre
Found 12 identical CPUs
Extended Family: 8 Extended Model: 0 Family: 15 Model: 1 Stepping: 1
CPU Model (x86info's best guess): AMD Zen Series Processor (ZP-B1)
Processor name string (BIOS programmed): AMD Ryzen 5 1600 Six-Core Processor

Monitor/Mwait: min/max line size 64/64, ecx bit 0 support, enumeration extension
SVM: revision 1, 32768 ASIDs, np, lbrVirt, SVMLock, NRIPSave, TscRateMsr, VmcbClean, FlushByAsid, DecodeAssists, PauseFilter, PauseFilterThreshold
Address Size: 48 bits virtual, 48 bits physical
The physical package has 12 of 16 possible cores implemented.
 running at an estimated 3.20GHz
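
The family/model/stepping fields printed by x86info and the `Id=0x800f11` value in the dmesg line at the start of the thread are two views of the same CPUID signature. The sketch below shows how the raw value breaks down into those fields. The function name `decode_cpuid` is made up for illustration, and the rule of adding the extended fields only when the base family is 0xF is the AMD convention as I understand it.

```shell
#!/bin/sh
# Hypothetical helper: split a CPUID signature (as printed by dmesg in "Id=")
# into family, model and stepping.
decode_cpuid() {
    id=$(( $1 ))
    stepping=$(( id & 0xf ))
    base_model=$(( (id >> 4) & 0xf ))
    base_family=$(( (id >> 8) & 0xf ))
    ext_model=$(( (id >> 16) & 0xf ))
    ext_family=$(( (id >> 20) & 0xff ))

    family=$base_family
    model=$base_model
    # Extended fields only apply when the base family is 0xF (15).
    if [ "$base_family" -eq 15 ]; then
        family=$(( base_family + ext_family ))
        model=$(( (ext_model << 4) | base_model ))
    fi
    printf 'Family=0x%x Model=0x%x Stepping=%d\n' "$family" "$model" "$stepping"
}

decode_cpuid 0x800f11
```

For 0x800f11 the base family is 15 and the extended family is 8, which is what x86info reports separately and what dmesg sums to Family=0x17.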




        ---Mike



--
-------------------
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, [hidden email]
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/

Re: Ryzen issues on FreeBSD ?

freebsd-stable mailing list
In reply to this post by mdtancsa
Mike Tancsa mike at sentex.net wrote on:
Wed Jan 17 14:31:50 UTC 2018 :

> On 1/17/2018 8:46 AM, Nimrod Levy wrote:
> > I've been seeing similar issues on Ryzen and asked some questions,
> > here https://lists.freebsd.org/pipermail/freebsd-stable/2017-December/088121.html
> >
> > My previous queries didn't go anywhere.  
> >
>
> That's not very promising :(  Googling around shows lots of similar
> reports on both FreeBSD and Linux, but it's a lot of "I tweaked this BIOS
> setting and so far so good" and nothing definitive / conclusive.  Having
> to mess about with hardware settings for days on end hoping to fix
> random lockups is .... not good.

See Bugzilla 219399 and 221029 :

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219399
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221029

I'm not sure how much stable/11 and the like have been
tracking things that were done in head (12) during this.
My use has only been via versions of head.

My 1800X use was basically after head was updated to deal
with what 219399 eventually was isolated to. (221029 is
from splitting off problems that were not originally known
to be separate.)

While I had problems for 1800X that are what the 221029
bugzilla above is about, I've not had such with a 1950X
in the same sorts of contexts as I had been using the
1800X. But this was under Hyper-V for both processor
variants (with matching boards).

I've only tried the 1950X with a native FreeBSD boot once
(a fair time ago). It showed a lockup problem fairly
quickly, one that took the power switch or pulling the plug
to recover from. I've never seen such (or anything analogous)
under Hyper-V with extensive use.

It does not look like I'll be investigating native FreeBSD
on the 1950X anytime soon. (I no longer have access to the
1800X.)

===
Mark Millard
marklmi26-fbsd at yahoo.com
( markmi at dsl-only.net is going away in 2018-Feb, late)

Re: Ryzen issues on FreeBSD ?

Nimrod Levy-2
In reply to this post by mdtancsa
I'm running 11-STABLE from 12/9.  amdtemp works for me.  It also has the
sysctl indicating that it has the shared page fix. I'm pretty sure I've
seen the lockups since then.  I'll update to the latest STABLE and see
what happens.

One weird thing about my experience is that if I keep something running
continuously like the distributed.net client on 6 of 12 possible threads,
it keeps the system up for MUCH longer than without.  This is a home server
and very lightly loaded (one could argue insanely overpowered for the use
case).

I'm glad to see that there has been some attention on this.  I was a little
disappointed by the earlier thread.

I'm happy to help troubleshoot, but I'm not sure what information I can
gather from a hard locked system that doesn't even show anything on the
console.

--
Nimrod



Re: Ryzen issues on FreeBSD ?

Don Lewis-5
In reply to this post by mdtancsa
On 17 Jan, Mike Tancsa wrote:

> Thanks! I will confirm the cooling.  I just tried looking at the CPU
> fan control in the BIOS and upped it to "turbo" from the default.  Does
> amdtemp.ko work with your chipset? Nothing on mine unfortunately, so I
> can't tell from the OS if it's running hot.
>
> Is there a way to see if your CPU is old and has that bug? I haven't
> seen any segfaults on the few dozen buildworlds I have done. So far it's
> always been a total lockup and not a crash with RELENG_11.
>
> x86info v1.31pre
> Found 12 identical CPUs
> Extended Family: 8 Extended Model: 0 Family: 15 Model: 1 Stepping: 1
> CPU Model (x86info's best guess): AMD Zen Series Processor (ZP-B1)
> Processor name string (BIOS programmed): AMD Ryzen 5 1600 Six-Core
> Processor

My original CPU had a date code of 1708SUT (8th week of 2017 I think),
and the replacement has a date code of 1733SUS.  There's a humungous
discussion thread here <https://community.amd.com/thread/215773> where
date codes are discussed.  As I recall, the first replacement parts
shipped had date codes somewhere in the mid-20s, but I think AMD was
still hand screening parts at that point.  My replacement came in a
sealed box, so it wasn't hand screened and AMD probably was able to
screen for this problem in their production test.
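
For anyone comparing several CPUs, the date codes above can be split mechanically. This is a sketch that assumes the first four characters are a two-digit year followed by a two-digit week, which is the reading given above ("8th week of 2017 I think") and not something confirmed here. The function name `parse_date_code` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: split a date code such as 1708SUT into year and week,
# assuming a YYWW prefix.
parse_date_code() {
    yy=$(printf '%s' "$1" | cut -c1-2)
    ww=$(printf '%s' "$1" | cut -c3-4)
    printf '20%s week %s\n' "$yy" "$ww"
}

parse_date_code 1708SUT   # the original CPU
parse_date_code 1733SUS   # the replacement CPU
```

Under that assumption the two parts come out as week 08 and week 33 of 2017, which puts the replacement well after the mid-20s weeks mentioned for the first replacement parts.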


Re: Ryzen issues on FreeBSD ?

Don Lewis-5
In reply to this post by Nimrod Levy-2
On 17 Jan, Nimrod Levy wrote:

> I'm running 11-STABLE from 12/9.  amdtemp works for me.  It also has the
> sysctl indicating that it has the shared page fix. I'm pretty sure I've
> seen the lockups since then.  I'll update to the latest STABLE and see
> what happens.
>
> One weird thing about my experience is that if I keep something running
> continuously like the distributed.net client on 6 of 12 possible threads,
> it keeps the system up for MUCH longer than without.  This is a home server
> and very lightly loaded (one could argue insanely overpowered for the use
> case).

This sounds like the problem with the deep Cx states that has been
reported by numerous Linux users.  I think some motherboard brands are
more likely to have the problem.  See:
http://forum.asrock.com/forum_posts.asp?TID=5963&title=taichi-x370-with-ubuntu-idle-lock-ups-idle-freeze
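
Besides the BIOS setting, the C-states the OS is allowed to use can be inspected and capped from FreeBSD itself. The sketch below is a rough illustration, not a known fix for these lockups: `dev.cpu.0.cx_supported` and `hw.acpi.cpu.cx_lowest` are the stock ACPI CPU sysctls, the sample value format ("C1/1/1 C2/2/400") is what I would expect to see and may differ between releases, and the function names are made up.

```shell
#!/bin/sh
# Hypothetical helper: list the C-state names in a dev.cpu.N.cx_supported
# value such as "C1/1/1 C2/2/400".
cx_names() {
    for entry in $1; do
        echo "${entry%%/*}"
    done
}

# Hypothetical helper: succeed if any state deeper than C1 is advertised.
has_deep_cx() {
    for name in $(cx_names "$1"); do
        [ "$name" != "C1" ] && return 0
    done
    return 1
}

supported=$(sysctl -n dev.cpu.0.cx_supported 2>/dev/null)
if has_deep_cx "$supported"; then
    echo "deeper C-states available: $supported"
    echo "to cap at C1 for a test: sysctl hw.acpi.cpu.cx_lowest=C1"
else
    echo "no C-state deeper than C1 reported"
fi
```

If capping at C1 turns out to help, the setting could be made persistent with `performance_cx_lowest="C1"` and `economy_cx_lowest="C1"` in /etc/rc.conf.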


Re: Ryzen issues on FreeBSD ?

Nimrod Levy-2
That looks promising. I just found that setting in the BIOS and disabled it.
I'll see how it runs.

Thanks


--
Nimrod

Re: Ryzen issues on FreeBSD ?

Nimrod Levy-2
Looks like disabling the C-states in the BIOS didn't change anything.

--
Nimrod

Re: Ryzen issues on FreeBSD ?

mdtancsa
Drag :( I have mine disabled as well, and I also lowered the RAM freq to 2100
from 2400.  For me the hangs are infrequent.  It's only been a day and a
half, so I'm not sure if it's gone or I have been "lucky"... Either way,
this platform feels way too fragile to deploy on anything :(

        ---Mike


--
-------------------
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, [hidden email]
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/

Re: Ryzen issues on FreeBSD ?

Lucas Holt
In reply to this post by mdtancsa
I have an Asus Prime X370-pro and a Ryzen 7 1700 that I bought in late
April.  Make sure you have the latest BIOS for these boards or else it
will randomly freak out.

While I haven't used it much with FreeBSD, I can confirm that I had a
lot of stability issues solved with a December BIOS update on
MidnightBSD. I backported the shared page fix and amdtemp.  (It's
basically FreeBSD 9.1.)

I couldn't even get it to boot until the August BIOS update.  I've had
my box stay up at least a week, and it's my primary development box so
I'm mostly doing src/ports builds all the time on it.

If you have the latest BIOS, check the memory timings too.  It's rather
picky with some memory modules.

Luke

Re: Ryzen issues on FreeBSD ?

Ryan Root
In reply to this post by mdtancsa
Have you double-checked the qualified vendors list (QVL) for your
motherboard?  Memory modules not on the list will sometimes work, since it's
probably only a list of the ones they've tested, but that might be the problem
in this situation.  If that was already brought up by someone else, sorry
for butting in.

This looks like the QVL for your MB ->
http://download.gigabyte.us/FileList/Memory/mb_memory_ga-ax370-Gaming5.pdf




Re: Ryzen issues on FreeBSD ?

mdtancsa
In reply to this post by Lucas Holt
On 1/19/2018 3:22 PM, Lucas Holt wrote:
> I have an Asus Prime X370-pro and a Ryzen 7 1700 that I bought in late

Thanks! That's the board I have, but no luck with amdtemp.  Did you have
to change the source code for it to work?

dmidecode shows

        Manufacturer: ASUSTeK COMPUTER INC.
        Product Name: PRIME X370-PRO

        Vendor: American Megatrends Inc.
        Version: 3402
        Release Date: 12/11/2017
        Address: 0xF0000
        Runtime Size: 64 kB
        ROM Size: 16 MB
        Characteristics:

memory is

        Type: DDR4
        Type Detail: Synchronous Unbuffered (Unregistered)
        Speed: 2133 MT/s
        Manufacturer: Unknown
        Serial Number: 192BE196
        Asset Tag: Not Specified
        Part Number: CT16G4DFD824A.C16FHD
        Rank: 2
        Configured Clock Speed: 1067 MT/s
        Minimum Voltage: 1.2 V
        Maximum Voltage: 1.2 V
        Configured Voltage: 1.2 V



When I try and load the kld, I get nothing :(

0(ms-v1)# kldload amdtemp
0(ms-v1)# dmesg | tail -2
ums0: at uhub0, port 3, addr 1 (disconnected)
ums0: detached
0(ms-v1)#



> April.  Make sure you have the latest BIOS for these boards or else it
> will randomly freak out.
>
> While I haven't used it much with FreeBSD, I can confirm that I had a
> lot of stability issues solved with a December BIOS update on
> MidnightBSD. I back ported the shared page fix and amdtemp.  (it's
> basically FreeBSD 9.1)
>
> I couldn't even get it to boot until the August BIOS update.  I've had
> my box stay up at least a week, and it's my primary development box so
> I'm mostly doing src/ports builds all the time on it.
>
> If you have the latest BIOS, check the memory timings too.  It's rather
> picky with some memory modules.
>
> Luke
>
>



Re: Ryzen issues on FreeBSD ?

Peter Moody
In reply to this post by mdtancsa
On Fri, Jan 19, 2018 at 12:13 PM, Mike Tancsa <[hidden email]> wrote:
> Drag :( I have mine disabled as well as lowering the RAM freq to 2100
> from 2400.  For me the hangs are infrequent.  It's only been a day and a
> half, so not sure if it's gone or I have been "lucky"... Either way,
> this platform feels way too fragile to deploy on anything :(
>
>         ---Mike
>
> On 1/19/2018 3:08 PM, Nimrod Levy wrote:
>> Looks like disabling the C-states in the BIOS didn't change anything.

It's too early for me to be 100% certain, but disabling SMT in the
BIOS has thus far resulted in a more stable system.

I have a Ryzen 5 1600X and an ASRock AB350M and I've tried just about
everything in all of these threads: disabling C-states (no effect),
setting the sysctl (doesn't exist on my 11.1 RELEASE), tweaking
voltage and cooling settings, RMA'ing the board, the CPU and the
memory. Nothing helped.

Last night I tried disabling SMT and, so far, so good.



Re: Ryzen issues on FreeBSD ?

Lucas Holt
In reply to this post by mdtancsa
We have the same BIOS version.

I have Corsair RAM:

Handle 0x003B, DMI type 17, 40 bytes
Memory Device
         Array Handle: 0x0032
         Error Information Handle: 0x003A
         Total Width: 64 bits
         Data Width: 64 bits
         Size: 16384 MB
         Form Factor: DIMM
         Set: None
         Locator: DIMM_A2
         Bank Locator: BANK 1
         Type: <OUT OF SPEC>
         Type Detail: Synchronous Unbuffered (Unregistered)
         Speed: 2666 MHz
         Manufacturer: Unknown
         Serial Number: 00000000
         Asset Tag: Not Specified
         Part Number: CMK32GX4M2A2666C16
         Rank: 2
         Configured Clock Speed: 1333 MHz
         Minimum voltage:  1.200 V
         Maximum voltage:  1.200 V
         Configured voltage:  1.200 V


I just double checked and amdtemp isn't working correctly.  I was
probably thinking of my other system which has an FX 8350.

Luke
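
For anyone comparing DIMMs against a board's QVL, the fields quoted in the dmidecode listings in this thread can be pulled out mechanically. What follows is only a sketch in Python. It assumes the text comes from `dmidecode -t memory` and that every DIMM record starts with a "Memory Device" line, as in the listing above. The function name `parse_memory_devices` is made up for illustration.

```python
# Sketch only: collects a few fields per DIMM from `dmidecode -t memory` text.
# The field names are the ones shown in the listings in this thread.
FIELDS = ("Part Number", "Speed", "Configured Clock Speed")


def parse_memory_devices(text):
    """Return one dict per 'Memory Device' record found in the text."""
    devices = []
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Handle "):
            # A new DMI record begins; stop filling the previous one.
            current = None
            continue
        if stripped == "Memory Device":
            current = {}
            devices.append(current)
            continue
        if current is None or ":" not in stripped:
            continue
        key, _, value = stripped.partition(":")
        if key in FIELDS:
            current[key] = value.strip()
    return devices
```

To try it, save the output as root with `dmidecode -t memory > mem.txt` and pass the file contents to `parse_memory_devices`. Empty slots are listed too, with whatever placeholder values the BIOS reports, so the part numbers still need to be checked against the QVL by eye.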

Re: Ryzen issues on FreeBSD ?

mdtancsa
In reply to this post by Peter Moody
On 1/19/2018 3:32 PM, Peter Moody wrote:
>
> I have a Ryzen 5 1600X and an ASRock AB350M and I've tried just about
> everything in all of these threads: disabling C-states (no effect),
> setting the sysctl (doesn't exist on my 11.1 RELEASE), tweaking
> voltage and cooling settings, RMA'ing the board, the CPU and the
> memory. Nothing helped.
>
> Last night I tried disabling SMT and, so far, so good.


Is there anything that can be done to trigger the lockup more reliably?
I haven't found any patterns. I have had lockups when the system is 100%
idle and lockups when it is lightly loaded.  I have yet to see any
segfaults or sig 11s while doing buildworld (make -j12 or even make -j16).

        ---Mike
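
One low-tech way to look for patterns in hangs like these is a heartbeat log. A small process appends a timestamp and the 1-minute load average to a file every few seconds, and after the next hard lockup the last line shows roughly when the box stopped and whether it was idle at the time. The following is a minimal sketch in Python, not something anyone in this thread reported using. The function names, the log format and the 10-second interval are all arbitrary choices made for illustration.

```python
import os
import time


def write_heartbeat(path, now=None, load=None):
    """Append 'epoch_seconds load1' to path and push it to disk."""
    stamp = time.time() if now is None else now
    load1 = os.getloadavg()[0] if load is None else load
    with open(path, "a") as fh:
        fh.write(f"{stamp:.0f} {load1:.2f}\n")
        fh.flush()
        os.fsync(fh.fileno())  # ask the OS to write the line out now


def last_heartbeat(path):
    """Return (epoch_seconds, load1) from the last complete line, or None."""
    if not os.path.exists(path):
        return None
    with open(path) as fh:
        # Skip a possibly truncated final line by requiring two fields.
        rows = [ln.split() for ln in fh if len(ln.split()) == 2]
    if not rows:
        return None
    stamp, load1 = rows[-1]
    return float(stamp), float(load1)


def run(path, interval=10.0):
    """Write a heartbeat every `interval` seconds until interrupted."""
    while True:
        write_heartbeat(path)
        time.sleep(interval)
```

Calling `run("/var/log/heartbeat.log")` from a script started at boot would keep the log going. After a reset, `last_heartbeat` on the same file gives the time of the last entry, which can be compared across several hangs to see whether they cluster around idle periods.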
