RFC: Hiding per-CPU kernel output behind bootverbose

classic Classic list List threaded Threaded
27 messages Options
12
Reply | Threaded
Open this post in threaded view
|

RFC: Hiding per-CPU kernel output behind bootverbose

Colin Percival
On large systems (e.g., EC2's x1e.32xlarge instance type, with 128 vCPUs)
the boot time console output contains a large number of lines of the forms

SMP: AP CPU #N Launched!
cpuN: <ACPI CPU> on acpi0
estN: <Enhanced SpeedStep Frequency Control> on cpuN

Having 128 almost-identical lines of output doesn't seem very useful, and
it actually has a nontrivial impact on the time spent booting.

Does anyone mind if I hide these by default, having them only show up if
boot verbosity is requested?

--
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Hiding per-CPU kernel output behind bootverbose

Konstantin Belousov
On Thu, Apr 19, 2018 at 06:06:21PM +0000, Colin Percival wrote:

> On large systems (e.g., EC2's x1e.32xlarge instance type, with 128 vCPUs)
> the boot time console output contains a large number of lines of the forms
>
> SMP: AP CPU #N Launched!
> cpuN: <ACPI CPU> on acpi0
> estN: <Enhanced SpeedStep Frequency Control> on cpuN
>
> Having 128 almost-identical lines of output doesn't seem very useful, and
> it actually has a nontrivial impact on the time spent booting.
>
> Does anyone mind if I hide these by default, having them only show up if
> boot verbosity is requested?

The 'CPU XX Launched' messages are very useful for initial diagnostic
of the SMP startup failures. You need to enable bootverbose to see the
hang details, but for initial hint they are required. Unfortunately, AP
startup hangs occur too often to pretend that this can be delegated to
very specific circumstances.

Rest of the lines you pasted are normal device attach messages, so it is
not clear how would you hide them without ugly hacks.
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Hiding per-CPU kernel output behind bootverbose

Steven Hartland
In reply to this post by Colin Percival
Sounds good to me, I think we could actually benefit from even quieter
modes if I’m honest.

On Thu, 19 Apr 2018 at 21:09, Colin Percival <[hidden email]> wrote:

> On large systems (e.g., EC2's x1e.32xlarge instance type, with 128 vCPUs)
> the boot time console output contains a large number of lines of the forms
>
> SMP: AP CPU #N Launched!
> cpuN: <ACPI CPU> on acpi0
> estN: <Enhanced SpeedStep Frequency Control> on cpuN
>
> Having 128 almost-identical lines of output doesn't seem very useful, and
> it actually has a nontrivial impact on the time spent booting.
>
> Does anyone mind if I hide these by default, having them only show up if
> boot verbosity is requested?
>
> --
> Colin Percival
> Security Officer Emeritus, FreeBSD | The power to serve
> Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid
> _______________________________________________
> [hidden email] mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "[hidden email]"
>
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Hiding per-CPU kernel output behind bootverbose

Colin Percival
In reply to this post by Konstantin Belousov
On 04/19/18 13:44, Konstantin Belousov wrote:

> On Thu, Apr 19, 2018 at 06:06:21PM +0000, Colin Percival wrote:
>> On large systems (e.g., EC2's x1e.32xlarge instance type, with 128 vCPUs)
>> the boot time console output contains a large number of lines of the forms
>>
>> SMP: AP CPU #N Launched!
>> cpuN: <ACPI CPU> on acpi0
>> estN: <Enhanced SpeedStep Frequency Control> on cpuN
>>
>> Having 128 almost-identical lines of output doesn't seem very useful, and
>> it actually has a nontrivial impact on the time spent booting.
>>
>> Does anyone mind if I hide these by default, having them only show up if
>> boot verbosity is requested?
>
> The 'CPU XX Launched' messages are very useful for initial diagnostic
> of the SMP startup failures. You need to enable bootverbose to see the
> hang details, but for initial hint they are required. Unfortunately, AP
> startup hangs occur too often to pretend that this can be delegated to
> very specific circumstances.

Do SMP startup failures need to be debugged often enough to justify having
this verbosity every time a FreeBSD system boots?

> Rest of the lines you pasted are normal device attach messages, so it is
> not clear how would you hide them without ugly hacks.

I would be willing to employ ugly hacks in order to silence unhelpful output
and speed up the boot process.

--
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Hiding per-CPU kernel output behind bootverbose

Conrad Meyer-2
In reply to this post by Konstantin Belousov
On Thu, Apr 19, 2018 at 1:44 PM, Konstantin Belousov
<[hidden email]> wrote:

> On Thu, Apr 19, 2018 at 06:06:21PM +0000, Colin Percival wrote:
>> On large systems (e.g., EC2's x1e.32xlarge instance type, with 128 vCPUs)
>> the boot time console output contains a large number of lines of the forms
>>
>> SMP: AP CPU #N Launched!
>> cpuN: <ACPI CPU> on acpi0
>> estN: <Enhanced SpeedStep Frequency Control> on cpuN
>>
>> Having 128 almost-identical lines of output doesn't seem very useful, and
>> it actually has a nontrivial impact on the time spent booting.
>>
>> Does anyone mind if I hide these by default, having them only show up if
>> boot verbosity is requested?

+1.  For the device attaches, perhaps it makes sense to add a device
'spammy' flag, and set it for per-CPU devices like cpuN or estN.  Then
the generic attach code can choose whether to print spammy attaches
based on bootverbose.  dmesg for device attaches seems mostly
redundant with devinfo(8) for persistent devices like ACPI CPU and
est(4).

> The 'CPU XX Launched' messages are very useful for initial diagnostic
> of the SMP startup failures. You need to enable bootverbose to see the
> hang details, but for initial hint they are required. Unfortunately, AP
> startup hangs occur too often to pretend that this can be delegated to
> very specific circumstances.

Really?  I don't know that I've ever seen an AP startup hang.  How
often do they occur?

Best,
Conrad
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Hiding per-CPU kernel output behind bootverbose

Konstantin Belousov-3
On Thu, Apr 19, 2018 at 02:37:56PM -0700, Conrad Meyer wrote:

> On Thu, Apr 19, 2018 at 1:44 PM, Konstantin Belousov
> <[hidden email]> wrote:
> > On Thu, Apr 19, 2018 at 06:06:21PM +0000, Colin Percival wrote:
> >> On large systems (e.g., EC2's x1e.32xlarge instance type, with 128 vCPUs)
> >> the boot time console output contains a large number of lines of the forms
> >>
> >> SMP: AP CPU #N Launched!
> >> cpuN: <ACPI CPU> on acpi0
> >> estN: <Enhanced SpeedStep Frequency Control> on cpuN
> >>
> >> Having 128 almost-identical lines of output doesn't seem very useful, and
> >> it actually has a nontrivial impact on the time spent booting.
> >>
> >> Does anyone mind if I hide these by default, having them only show up if
> >> boot verbosity is requested?
>
> +1.  For the device attaches, perhaps it makes sense to add a device
> 'spammy' flag, and set it for per-CPU devices like cpuN or estN.  Then
> the generic attach code can choose whether to print spammy attaches
> based on bootverbose.  dmesg for device attaches seems mostly
> redundant with devinfo(8) for persistent devices like ACPI CPU and
> est(4).
>
> > The 'CPU XX Launched' messages are very useful for initial diagnostic
> > of the SMP startup failures. You need to enable bootverbose to see the
> > hang details, but for initial hint they are required. Unfortunately, AP
> > startup hangs occur too often to pretend that this can be delegated to
> > very specific circumstances.
>
> Really?  I don't know that I've ever seen an AP startup hang.  How
> often do they occur?

It was epidemic with Sandy Bridge, mostly correlated to specific BIOS
supplier and its interaction with the x2APIC enablement, see madt.c:170
and below.

There were several recent reports of the issue with Broadwell Xeon
machines, no additional data or resolution.

There are sporadic reports of the problem, where I do not see
a clear commonality.
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Hiding per-CPU kernel output behind bootverbose

Stefan Blachmann
I would also like a solution to this issue, which is looking highly
unprofessional and uselessly wasting space in the dmesg ring buffer.

In the first version of my sysutils/cpupdate tool I also had spammy
output like being complained about.
This was hard-to-read. Thus I changed code so that identical cores
output was collected up in blocks of single output, i.e.: "core #n to
#m: <identical stats>".
If all cpus are identical, cpu stats will be output only once this
way, else in subsequent blocks comprising all identical cores on
different cpus.

To avoid the spamming at bootup and microcode updating, the kernel
would need such a function to read/evaluate *all* cores in a row, like
I did in cpupdate.
This would be only a few lines of code.


On 4/19/18, Konstantin Belousov <[hidden email]> wrote:

> On Thu, Apr 19, 2018 at 02:37:56PM -0700, Conrad Meyer wrote:
>> On Thu, Apr 19, 2018 at 1:44 PM, Konstantin Belousov
>> <[hidden email]> wrote:
>> > On Thu, Apr 19, 2018 at 06:06:21PM +0000, Colin Percival wrote:
>> >> On large systems (e.g., EC2's x1e.32xlarge instance type, with 128
>> >> vCPUs)
>> >> the boot time console output contains a large number of lines of the
>> >> forms
>> >>
>> >> SMP: AP CPU #N Launched!
>> >> cpuN: <ACPI CPU> on acpi0
>> >> estN: <Enhanced SpeedStep Frequency Control> on cpuN
>> >>
>> >> Having 128 almost-identical lines of output doesn't seem very useful,
>> >> and
>> >> it actually has a nontrivial impact on the time spent booting.
>> >>
>> >> Does anyone mind if I hide these by default, having them only show up
>> >> if
>> >> boot verbosity is requested?
>>
>> +1.  For the device attaches, perhaps it makes sense to add a device
>> 'spammy' flag, and set it for per-CPU devices like cpuN or estN.  Then
>> the generic attach code can choose whether to print spammy attaches
>> based on bootverbose.  dmesg for device attaches seems mostly
>> redundant with devinfo(8) for persistent devices like ACPI CPU and
>> est(4).
>>
>> > The 'CPU XX Launched' messages are very useful for initial diagnostic
>> > of the SMP startup failures. You need to enable bootverbose to see the
>> > hang details, but for initial hint they are required. Unfortunately, AP
>> > startup hangs occur too often to pretend that this can be delegated to
>> > very specific circumstances.
>>
>> Really?  I don't know that I've ever seen an AP startup hang.  How
>> often do they occur?
>
> It was epidemic with Sandy Bridge, mostly correlated to specific BIOS
> supplier and its interaction with the x2APIC enablement, see madt.c:170
> and below.
>
> There were several recent reports of the issue with Broadwell Xeon
> machines, no additional data or resolution.
>
> There are sporadic reports of the problem, where I do not see
> a clear commonality.
> _______________________________________________
> [hidden email] mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "[hidden email]"
>
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Hiding per-CPU kernel output behind bootverbose

Rodney W. Grimes-4
In reply to this post by Conrad Meyer-2
> On Thu, Apr 19, 2018 at 1:44 PM, Konstantin Belousov
> <[hidden email]> wrote:
> > On Thu, Apr 19, 2018 at 06:06:21PM +0000, Colin Percival wrote:
> >> On large systems (e.g., EC2's x1e.32xlarge instance type, with 128 vCPUs)
> >> the boot time console output contains a large number of lines of the forms
> >>
> >> SMP: AP CPU #N Launched!
> >> cpuN: <ACPI CPU> on acpi0
> >> estN: <Enhanced SpeedStep Frequency Control> on cpuN
> >>
> >> Having 128 almost-identical lines of output doesn't seem very useful, and
> >> it actually has a nontrivial impact on the time spent booting.
> >>
> >> Does anyone mind if I hide these by default, having them only show up if
> >> boot verbosity is requested?
>
> +1.  For the device attaches, perhaps it makes sense to add a device
> 'spammy' flag, and set it for per-CPU devices like cpuN or estN.  Then
> the generic attach code can choose whether to print spammy attaches
> based on bootverbose.  dmesg for device attaches seems mostly
> redundant with devinfo(8) for persistent devices like ACPI CPU and
> est(4).
>
> > The 'CPU XX Launched' messages are very useful for initial diagnostic
> > of the SMP startup failures. You need to enable bootverbose to see the
> > hang details, but for initial hint they are required. Unfortunately, AP
> > startup hangs occur too often to pretend that this can be delegated to
> > very specific circumstances.
>
> Really?  I don't know that I've ever seen an AP startup hang.  How
> often do they occur?
>

What about if XX > 4 silence the messages unless bootverbose is set?
That gets us kinda both worlds, you still see AP startup occuring,
but once you seen 4 of them go off there isnt much point in seeing
the next N of them.


--
Rod Grimes                                                 [hidden email]
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Hiding per-CPU kernel output behind bootverbose

Julian Elischer-5
In reply to this post by Steven Hartland
On 20/4/18 4:48 am, Steven Hartland wrote:

> Sounds good to me, I think we could actually benefit from even quieter
> modes if I’m honest.
>
> On Thu, 19 Apr 2018 at 21:09, Colin Percival <[hidden email]> wrote:
>
>> On large systems (e.g., EC2's x1e.32xlarge instance type, with 128 vCPUs)
>> the boot time console output contains a large number of lines of the forms
>>
>> SMP: AP CPU #N Launched!
>> cpuN: <ACPI CPU> on acpi0
>> estN: <Enhanced SpeedStep Frequency Control> on cpuN
>>
>> Having 128 almost-identical lines of output doesn't seem very useful, and
>> it actually has a nontrivial impact on the time spent booting.
>>
>> Does anyone mind if I hide these by default, having them only show up if
>> boot verbosity is requested?

make them single line and just type a CR at the end so they cycle
through on the same line.
or "previous line repeated N times  with only 3 characters different"


>> --
>> Colin Percival
>> Security Officer Emeritus, FreeBSD | The power to serve
>> Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid
>> _______________________________________________
>> [hidden email] mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
>> To unsubscribe, send any mail to "[hidden email]"
>>
> _______________________________________________
> [hidden email] mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "[hidden email]"
>
>

_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Hiding per-CPU kernel output behind bootverbose

Colin Percival
In reply to this post by Konstantin Belousov-3
On 04/19/18 14:45, Konstantin Belousov wrote:

> On Thu, Apr 19, 2018 at 02:37:56PM -0700, Conrad Meyer wrote:
>> On Thu, Apr 19, 2018 at 1:44 PM, Konstantin Belousov wrote:
>>> The 'CPU XX Launched' messages are very useful for initial diagnostic
>>> of the SMP startup failures. You need to enable bootverbose to see the
>>> hang details, but for initial hint they are required. Unfortunately, AP
>>> startup hangs occur too often to pretend that this can be delegated to
>>> very specific circumstances.
>>
>> Really?  I don't know that I've ever seen an AP startup hang.  How
>> often do they occur?
>
> It was epidemic with Sandy Bridge, mostly correlated to specific BIOS
> supplier and its interaction with the x2APIC enablement, see madt.c:170
> and below.
>
> There were several recent reports of the issue with Broadwell Xeon
> machines, no additional data or resolution.
>
> There are sporadic reports of the problem, where I do not see
> a clear commonality.

Would it be sufficient for debugging purposes if I change the !bootverbose
case from printing many lines of

SMP: AP CPU #N Launched!

to instead have a single

SMP: Launching AP CPUs: 86 73 111 21 8 77 100 28 57 42 10 60 87 88 41 113 36
19 72 46 92 52 24 81 90 3 107 96 9 14 80 118 29 121 62 74 56 55 1 12 63 18 67
13 45 102 33 94 69 68 93 83 48 31 30 32 51 89 71 78 64 84 123 61 40 47 37 22
54 101 38 4 97 44 17 109 104 5 85 43 2 99 39 65 95 53 58 66 91 125 23 115 16
35 79 112 103 82 7 75 11 6 98 15 126 127 20 70 34 105 27 50 116 120 49 25 108
106 122 117 114 26 110 59 76 124 119

?  (With each AP printing its number as it reaches the appropriate point?)

This yields almost the same gain as silencing the launch messages completely,
while still allowing you to see each CPU announcing itself.

--
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Hiding per-CPU kernel output behind bootverbose

Konstantin Belousov-3
In reply to this post by Konstantin Belousov-3
On Sat, Apr 21, 2018 at 12:11:07AM +0000, Colin Percival wrote:

> On 04/19/18 14:45, Konstantin Belousov wrote:
> > On Thu, Apr 19, 2018 at 02:37:56PM -0700, Conrad Meyer wrote:
> >> On Thu, Apr 19, 2018 at 1:44 PM, Konstantin Belousov wrote:
> >>> The 'CPU XX Launched' messages are very useful for initial diagnostic
> >>> of the SMP startup failures. You need to enable bootverbose to see the
> >>> hang details, but for initial hint they are required. Unfortunately, AP
> >>> startup hangs occur too often to pretend that this can be delegated to
> >>> very specific circumstances.
> >>
> >> Really?  I don't know that I've ever seen an AP startup hang.  How
> >> often do they occur?
> >
> > It was epidemic with Sandy Bridge, mostly correlated to specific BIOS
> > supplier and its interaction with the x2APIC enablement, see madt.c:170
> > and below.
> >
> > There were several recent reports of the issue with Broadwell Xeon
> > machines, no additional data or resolution.
> >
> > There are sporadic reports of the problem, where I do not see
> > a clear commonality.
>
> Would it be sufficient for debugging purposes if I change the !bootverbose
> case from printing many lines of
>
> SMP: AP CPU #N Launched!
>
> to instead have a single
>
> SMP: Launching AP CPUs: 86 73 111 21 8 77 100 28 57 42 10 60 87 88 41 113 36
> 19 72 46 92 52 24 81 90 3 107 96 9 14 80 118 29 121 62 74 56 55 1 12 63 18 67
> 13 45 102 33 94 69 68 93 83 48 31 30 32 51 89 71 78 64 84 123 61 40 47 37 22
> 54 101 38 4 97 44 17 109 104 5 85 43 2 99 39 65 95 53 58 66 91 125 23 115 16
> 35 79 112 103 82 7 75 11 6 98 15 126 127 20 70 34 105 27 50 116 120 49 25 108
> 106 122 117 114 26 110 59 76 124 119
>
> ?  (With each AP printing its number as it reaches the appropriate point?)
>
> This yields almost the same gain as silencing the launch messages completely,
> while still allowing you to see each CPU announcing itself.
I am fine with the behaviour, but I am not sure how would you implement
this.  printf(9) buffers the output, you need to flush it somehow.
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Hiding per-CPU kernel output behind bootverbose

Gary Jennejohn-6
On Sat, 21 Apr 2018 12:20:49 +0300
Konstantin Belousov <[hidden email]> wrote:

> On Sat, Apr 21, 2018 at 12:11:07AM +0000, Colin Percival wrote:
> > On 04/19/18 14:45, Konstantin Belousov wrote:  
> > > On Thu, Apr 19, 2018 at 02:37:56PM -0700, Conrad Meyer wrote:  
> > >> On Thu, Apr 19, 2018 at 1:44 PM, Konstantin Belousov wrote:  
> > >>> The 'CPU XX Launched' messages are very useful for initial diagnostic
> > >>> of the SMP startup failures. You need to enable bootverbose to see the
> > >>> hang details, but for initial hint they are required. Unfortunately, AP
> > >>> startup hangs occur too often to pretend that this can be delegated to
> > >>> very specific circumstances.  
> > >>
> > >> Really?  I don't know that I've ever seen an AP startup hang.  How
> > >> often do they occur?  
> > >
> > > It was epidemic with Sandy Bridge, mostly correlated to specific BIOS
> > > supplier and its interaction with the x2APIC enablement, see madt.c:170
> > > and below.
> > >
> > > There were several recent reports of the issue with Broadwell Xeon
> > > machines, no additional data or resolution.
> > >
> > > There are sporadic reports of the problem, where I do not see
> > > a clear commonality.  
> >
> > Would it be sufficient for debugging purposes if I change the !bootverbose
> > case from printing many lines of
> >
> > SMP: AP CPU #N Launched!
> >
> > to instead have a single
> >
> > SMP: Launching AP CPUs: 86 73 111 21 8 77 100 28 57 42 10 60 87 88 41 113 36
> > 19 72 46 92 52 24 81 90 3 107 96 9 14 80 118 29 121 62 74 56 55 1 12 63 18 67
> > 13 45 102 33 94 69 68 93 83 48 31 30 32 51 89 71 78 64 84 123 61 40 47 37 22
> > 54 101 38 4 97 44 17 109 104 5 85 43 2 99 39 65 95 53 58 66 91 125 23 115 16
> > 35 79 112 103 82 7 75 11 6 98 15 126 127 20 70 34 105 27 50 116 120 49 25 108
> > 106 122 117 114 26 110 59 76 124 119
> >
> > ?  (With each AP printing its number as it reaches the appropriate point?)
> >
> > This yields almost the same gain as silencing the launch messages completely,
> > while still allowing you to see each CPU announcing itself.  
> I am fine with the behaviour, but I am not sure how would you implement
> this.  printf(9) buffers the output, you need to flush it somehow.
>

And printf(9) calls vprintf(9) which in turn calls _vprintf which
will allocate a buffer on the stack (bad idea) if the option
PRINTF_BUFR_SIZE is set in the kernel config file.  So it seems
that output may even be double buffered.

--
Gary Jennejohn
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Hiding per-CPU kernel output behind bootverbose

Stefan Blachmann
There are not many places that call the cpu initializing routine.
Thus it should be easy to add a pointer to the output function to be used.

What about solving the problem using simple code like I did this in
sysutils/cpupdate?
(https://github.com/kernschmelze/cpupdate)

[BEGIN PSEUDOCODE SNIPPET]
print_cpu_stats( stringbuffer *info, int from, int to)
{
        print( "Core %d to %d: %s", from, to, info);
}

initialize_all_cpus()
{
        int startcore = 0;
        int ncore = 1;
        stringbuffer
                *ncoreinfo,
                *startcoreinfo = initializeCPU(0, functionPointerPrintToStringBuffer);
       
        for ( ; ncore < numCores ; ++ncore) {
          ncoreinfo = initialize_one_CPU(ncore, functionPointerPrintToStringBuffer);
    if ( <compare_startcoreinfo_output_is_different_to_ncoreinfo> ) {
          // core #ncore is different than core #startcore..#(ncore-1).
                        // Print core info for that block
                        print_cpu_stats( startcoreinfo, startcore, ncore - 1);
                        startcore = ncore;
                        startcoreinfo = ncoreinfo;
                }
        }
        // make sure also the last block of identical cores are shown
        if (startcore < ncore)
                print_cpu_stats( startcoreinfo, startcore, ncore - 1);
}
[END PSEUDOCODE SNIPPET]

Only such a small change would be necessary in the bootup code and in
cpuctl.c to avoid massive spamming when updating microcode.

But I have given up attempting to contribute to FreeBSD, because of
the gross lack of interest on core developers' side in things
important to people running FreeBSD on bare metal.

For example, it is just amazing to see how all reports about massive
flaws in kernel microcode detection/updating and all offered patches
are being blatantly ignored for many years. See for example:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=192487

Sad, very sad!

On 4/21/18, Gary Jennejohn <[hidden email]> wrote:

> On Sat, 21 Apr 2018 12:20:49 +0300
> Konstantin Belousov <[hidden email]> wrote:
>
>> On Sat, Apr 21, 2018 at 12:11:07AM +0000, Colin Percival wrote:
>> > On 04/19/18 14:45, Konstantin Belousov wrote:
>> > > On Thu, Apr 19, 2018 at 02:37:56PM -0700, Conrad Meyer wrote:
>> > >> On Thu, Apr 19, 2018 at 1:44 PM, Konstantin Belousov wrote:
>> > >>> The 'CPU XX Launched' messages are very useful for initial
>> > >>> diagnostic
>> > >>> of the SMP startup failures. You need to enable bootverbose to see
>> > >>> the
>> > >>> hang details, but for initial hint they are required. Unfortunately,
>> > >>> AP
>> > >>> startup hangs occur too often to pretend that this can be delegated
>> > >>> to
>> > >>> very specific circumstances.
>> > >>
>> > >> Really?  I don't know that I've ever seen an AP startup hang.  How
>> > >> often do they occur?
>> > >
>> > > It was epidemic with Sandy Bridge, mostly correlated to specific BIOS
>> > > supplier and its interaction with the x2APIC enablement, see
>> > > madt.c:170
>> > > and below.
>> > >
>> > > There were several recent reports of the issue with Broadwell Xeon
>> > > machines, no additional data or resolution.
>> > >
>> > > There are sporadic reports of the problem, where I do not see
>> > > a clear commonality.
>> >
>> > Would it be sufficient for debugging purposes if I change the
>> > !bootverbose
>> > case from printing many lines of
>> >
>> > SMP: AP CPU #N Launched!
>> >
>> > to instead have a single
>> >
>> > SMP: Launching AP CPUs: 86 73 111 21 8 77 100 28 57 42 10 60 87 88 41
>> > 113 36
>> > 19 72 46 92 52 24 81 90 3 107 96 9 14 80 118 29 121 62 74 56 55 1 12 63
>> > 18 67
>> > 13 45 102 33 94 69 68 93 83 48 31 30 32 51 89 71 78 64 84 123 61 40 47
>> > 37 22
>> > 54 101 38 4 97 44 17 109 104 5 85 43 2 99 39 65 95 53 58 66 91 125 23
>> > 115 16
>> > 35 79 112 103 82 7 75 11 6 98 15 126 127 20 70 34 105 27 50 116 120 49
>> > 25 108
>> > 106 122 117 114 26 110 59 76 124 119
>> >
>> > ?  (With each AP printing its number as it reaches the appropriate
>> > point?)
>> >
>> > This yields almost the same gain as silencing the launch messages
>> > completely,
>> > while still allowing you to see each CPU announcing itself.
>> I am fine with the behaviour, but I am not sure how would you implement
>> this.  printf(9) buffers the output, you need to flush it somehow.
>>
>
> And printf(9) calls vprintf(9) which in turn calls _vprintf which
> will allocate a buffer on the stack (bad idea) if the option
> PRINTF_BUFR_SIZE is set in the kernel config file.  So it seems
> that output may even be double buffered.
>
> --
> Gary Jennejohn
> _______________________________________________
> [hidden email] mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "[hidden email]"
>
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Hiding per-CPU kernel output behind bootverbose

Cy Schubert-4
In reply to this post by Colin Percival
In message <[hidden email]
il.com>
, Stefan Blachmann writes:
> I would also like a solution to this issue, which is looking highly
> unprofessional and uselessly wasting space in the dmesg ring buffer.

Really? Have you looked at Red Hat (or Oracle) Linux or Solaris dmesg
output?

Why neuter the dmesg ring buffer?

>
> In the first version of my sysutils/cpupdate tool I also had spammy
> output like being complained about.
> This was hard-to-read. Thus I changed code so that identical cores
> output was collected up in blocks of single output, i.e.: "core #n to
> #m: <identical stats>".
> If all cpus are identical, cpu stats will be output only once this
> way, else in subsequent blocks comprising all identical cores on
> different cpus.
>
> To avoid the spamming at bootup and microcode updating, the kernel
> would need such a function to read/evaluate *all* cores in a row, like
> I did in cpupdate.
> This would be only a few lines of code.

Maybe the solution is a Red Hat style of slider bar or a Windows style
graphic to hide messages.

Personally, a Red Hat style of slider might work. Hit the enter to see
the gory details. Hit enter again to see the slider bar again. It
doesn't have to be a slider -- we don't want to look like a RH
derivative. It can be any kind of progress indicator.

Additionally, green and red checkboxes like HP/UX and Red Hat (they
ripped that off from HP) is okay but when something fails you're in the
dark. (At $JOB we use HP SA, with its green checkmarks and red x's on
various dashboard style outputs. All you can tell is something is out
of compliance but not where the actual problem is. This is totally
unacceptable.)

~cy

>
>
> On 4/19/18, Konstantin Belousov <[hidden email]> wrote:
> > On Thu, Apr 19, 2018 at 02:37:56PM -0700, Conrad Meyer wrote:
> >> On Thu, Apr 19, 2018 at 1:44 PM, Konstantin Belousov
> >> <[hidden email]> wrote:
> >> > On Thu, Apr 19, 2018 at 06:06:21PM +0000, Colin Percival wrote:
> >> >> On large systems (e.g., EC2's x1e.32xlarge instance type, with 128
> >> >> vCPUs)
> >> >> the boot time console output contains a large number of lines of the
> >> >> forms
> >> >>
> >> >> SMP: AP CPU #N Launched!
> >> >> cpuN: <ACPI CPU> on acpi0
> >> >> estN: <Enhanced SpeedStep Frequency Control> on cpuN
> >> >>
> >> >> Having 128 almost-identical lines of output doesn't seem very useful,
> >> >> and
> >> >> it actually has a nontrivial impact on the time spent booting.
> >> >>
> >> >> Does anyone mind if I hide these by default, having them only show up
> >> >> if
> >> >> boot verbosity is requested?
> >>
> >> +1.  For the device attaches, perhaps it makes sense to add a device
> >> 'spammy' flag, and set it for per-CPU devices like cpuN or estN.  Then
> >> the generic attach code can choose whether to print spammy attaches
> >> based on bootverbose.  dmesg for device attaches seems mostly
> >> redundant with devinfo(8) for persistent devices like ACPI CPU and
> >> est(4).
> >>
> >> > The 'CPU XX Launched' messages are very useful for initial diagnostic
> >> > of the SMP startup failures. You need to enable bootverbose to see the
> >> > hang details, but for initial hint they are required. Unfortunately, AP
> >> > startup hangs occur too often to pretend that this can be delegated to
> >> > very specific circumstances.
> >>
> >> Really?  I don't know that I've ever seen an AP startup hang.  How
> >> often do they occur?
> >
> > It was epidemic with Sandy Bridge, mostly correlated to specific BIOS
> > supplier and its interaction with the x2APIC enablement, see madt.c:170
> > and below.
> >
> > There were several recent reports of the issue with Broadwell Xeon
> > machines, no additional data or resolution.
> >
> > There are sporadic reports of the problem, where I do not see
> > a clear commonality.
> > _______________________________________________
> > [hidden email] mailing list
> > https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> > To unsubscribe, send any mail to "[hidden email]"
> >
> _______________________________________________
> [hidden email] mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "[hidden email]"
>

--
Cheers,
Cy Schubert <[hidden email]>
FreeBSD UNIX:  <[hidden email]>   Web:  http://www.FreeBSD.org

        The need of the many outweighs the greed of the few.


_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Hiding per-CPU kernel output behind bootverbose

Julian Elischer-5
In reply to this post by Stefan Blachmann
On 21/4/18 9:03 pm, Stefan Blachmann wrote:
> For example, it is just amazing to see how all reports about massive
> flaws in kernel microcode detection/updating and all offered patches
> are being blatantly ignored for many years. See for example:
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=192487
yes I saw that but Only a handful of people know what's going on there
so 99% of the
developers just stood back and said "I hope someone who understands
this picks it up".
I didn't follow the one response I saw. So I couldn't decide on who
had the better argument.


_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Hiding per-CPU kernel output behind bootverbose

Warner Losh
In reply to this post by Colin Percival
On Fri, Apr 20, 2018 at 6:11 PM, Colin Percival <[hidden email]>
wrote:

> On 04/19/18 14:45, Konstantin Belousov wrote:
> > On Thu, Apr 19, 2018 at 02:37:56PM -0700, Conrad Meyer wrote:
> >> On Thu, Apr 19, 2018 at 1:44 PM, Konstantin Belousov wrote:
> >>> The 'CPU XX Launched' messages are very useful for initial diagnostic
> >>> of the SMP startup failures. You need to enable bootverbose to see the
> >>> hang details, but for initial hint they are required. Unfortunately, AP
> >>> startup hangs occur too often to pretend that this can be delegated to
> >>> very specific circumstances.
> >>
> >> Really?  I don't know that I've ever seen an AP startup hang.  How
> >> often do they occur?
> >
> > It was epidemic with Sandy Bridge, mostly correlated to specific BIOS
> > supplier and its interaction with the x2APIC enablement, see madt.c:170
> > and below.
> >
> > There were several recent reports of the issue with Broadwell Xeon
> > machines, no additional data or resolution.
> >
> > There are sporadic reports of the problem, where I do not see
> > a clear commonality.
>
> Would it be sufficient for debugging purposes if I change the !bootverbose
> case from printing many lines of
>
> SMP: AP CPU #N Launched!
>
> to instead have a single
>
> SMP: Launching AP CPUs: 86 73 111 21 8 77 100 28 57 42 10 60 87 88 41 113
> 36
> 19 72 46 92 52 24 81 90 3 107 96 9 14 80 118 29 121 62 74 56 55 1 12 63 18
> 67
> 13 45 102 33 94 69 68 93 83 48 31 30 32 51 89 71 78 64 84 123 61 40 47 37
> 22
> 54 101 38 4 97 44 17 109 104 5 85 43 2 99 39 65 95 53 58 66 91 125 23 115
> 16
> 35 79 112 103 82 7 75 11 6 98 15 126 127 20 70 34 105 27 50 116 120 49 25
> 108
> 106 122 117 114 26 110 59 76 124 119
>
> ?  (With each AP printing its number as it reaches the appropriate point?)
>
> This yields almost the same gain as silencing the launch messages
> completely,
> while still allowing you to see each CPU announcing itself.


The trouble is that you've got N CPUs that are doing output at the same
time. You'll need to synchronize somehow. And how do you know that the last
one is done? Especailly if one of the CPUs doesn't start..

It looks great in theory, but I'm not sure how you'd make it work in
practice.

The other stuff (cpu and per-cpu stuff) is actually easy to pare down
entirely inside of newbus. I'll share a patch to do that.

Warner
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Hiding per-CPU kernel output behind bootverbose

Warner Losh
In reply to this post by Julian Elischer-5
On Sat, Apr 21, 2018 at 11:00 AM, Julian Elischer <[hidden email]>
wrote:

> On 21/4/18 9:03 pm, Stefan Blachmann wrote:
>
>> For example, it is just amazing to see how all reports about massive
>> flaws in kernel microcode detection/updating and all offered patches
>> are being blatantly ignored for many years. See for example:
>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=192487
>>
> yes I saw that but Only a handful of people know what's going on there so
> 99% of the
> developers just stood back and said "I hope someone who understands this
> picks it up".
> I didn't follow the one response I saw. So I couldn't decide on who had
> the better argument.


Except that  I said "sure, I'll integrate it, but you need to do these
things" and Stefan said "oh, lemme get you the code" and I never heard back
anything except complaints that it hasn't been integrated yet. Not playing
well with others makes them less willing to help you.

Warner
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Hiding per-CPU kernel output behind bootverbose

Stefan Blachmann
On 4/21/18, Warner Losh <[hidden email]> wrote:
> Except that  I said "sure, I'll integrate it, but you need to do these
> things" and Stefan said "oh, lemme get you the code" and I never heard back
> anything except complaints that it hasn't been integrated yet. Not playing
> well with others makes them less willing to help you.

Well, I refactored the code to be up to style guidelines, and when it
was ready on Mar 20th, I emailed you and the other guys involved with
microcode stuff.
Didn't get an answer from you, so I thought there would be no more interest.

But, maybe it was just the mail slipping through, or get forgotten due
to much work?
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Hiding per-CPU kernel output behind bootverbose

Conrad Meyer-2
In reply to this post by Warner Losh
On Sat, Apr 21, 2018 at 10:02 AM, Warner Losh <[hidden email]> wrote:

> On Fri, Apr 20, 2018 at 6:11 PM, Colin Percival <[hidden email]>
> wrote:
>> Would it be sufficient for debugging purposes if I change the !bootverbose
>> case from printing many lines of
>>
>> SMP: AP CPU #N Launched!
>>
>> to instead have a single
>>
>> SMP: Launching AP CPUs: 86 73 111 21 8 77 100 28 57 42 10 60 87 88 41 113
>> 36
>> 19 72 46 92 52 24 81 90 3 107 96 9 14 80 118 29 121 62 74 56 55 1 12 63 18
>> 67
>> 13 45 102 33 94 69 68 93 83 48 31 30 32 51 89 71 78 64 84 123 61 40 47 37
>> 22
>> 54 101 38 4 97 44 17 109 104 5 85 43 2 99 39 65 95 53 58 66 91 125 23 115
>> 16
>> 35 79 112 103 82 7 75 11 6 98 15 126 127 20 70 34 105 27 50 116 120 49 25
>> 108
>> 106 122 117 114 26 110 59 76 124 119
>>
>> ?  (With each AP printing its number as it reaches the appropriate point?)
>>
>> This yields almost the same gain as silencing the launch messages
>> completely,
>> while still allowing you to see each CPU announcing itself.
>
>
> The trouble is that you've got N CPUs that are doing output at the same
> time. You'll need to synchronize somehow. And how do you know that the last
> one is done? Especailly if one of the CPUs doesn't start..
>
> It looks great in theory, but I'm not sure how you'd make it work in
> practice.

Yeah.  You could pare down the number of characters per-AP
substantially.  Something like:

SMP: Launching APs:
2
3
4
...

It sounds like EC2 is redrawing on every character emitted, scrolling
or not.  So the additional line breaks shouldn't hurt.

Best,
Conrad
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Hiding per-CPU kernel output behind bootverbose

Warner Losh
In reply to this post by Julian Elischer-5
On Fri, Apr 20, 2018 at 12:11 AM, Julian Elischer <[hidden email]>
wrote:

> On 20/4/18 4:48 am, Steven Hartland wrote:
>
>> Sounds good to me, I think we could actually benefit from even quieter
>> modes if I’m honest.
>>
>> On Thu, 19 Apr 2018 at 21:09, Colin Percival <[hidden email]>
>> wrote:
>>
>> On large systems (e.g., EC2's x1e.32xlarge instance type, with 128 vCPUs)
>>> the boot time console output contains a large number of lines of the
>>> forms
>>>
>>> SMP: AP CPU #N Launched!
>>> cpuN: <ACPI CPU> on acpi0
>>> estN: <Enhanced SpeedStep Frequency Control> on cpuN
>>>
>>> Having 128 almost-identical lines of output doesn't seem very useful, and
>>> it actually has a nontrivial impact on the time spent booting.
>>>
>>> Does anyone mind if I hide these by default, having them only show up if
>>> boot verbosity is requested?
>>>
>>
https://reviews.freebsd.org/D15153 implements that for everything except
the CPU launched.

Warner
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[hidden email]"
12