Are sysctl(8) values useful for measuring system resource consumption?

David Wolfskill
At ${work}, one of my projects is to help obtain information regarding
the "developer experience," what resources are thus consumed, and figure
out ways to mitigate the pain -- the objective, of course, being to help
the developers be more productive within a FreeBSD environment.

A couple of the perceived "pain points" are the time it takes to perform
a CVS checkout and the time it takes to perform a build of the system
(ours, at work -- not FreeBSD itself).

As a step toward obtaining the information, I've cobbled up a Perl
script that essentially acts as a bit of "scaffolding" around time(1);
the script sets things up to invoke time(1) with the "-l" flag (so we
get the rusage structure information) and use "-o" to direct the output
of time(1) to a file in /tmp, which the script then reads.
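
For illustration, here is a minimal sketch of that kind of scaffolding.
It is in Python rather than Perl, and it is not the actual script: the
function names (parse_time_l, run_timed) are hypothetical, and the
parser assumes the usual FreeBSD "time -l" layout of one
"real/user/sys" line followed by "<number>  <label>" lines.

```python
import re
import subprocess
import tempfile


def parse_time_l(text):
    """Parse the text written by `time -l` into a dict of label -> number."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # First line is assumed to look like:
        #   "1.23 real  0.45 user  0.06 sys"
        pairs = re.findall(r"([\d.]+)\s+(real|user|sys)\b", line)
        if pairs:
            for value, label in pairs:
                result[label] = float(value)
            continue
        # Remaining lines are assumed to look like:
        #   "2048  maximum resident set size"
        match = re.match(r"(\d+)\s+(.+)$", line)
        if match:
            result[match.group(2)] = int(match.group(1))
    return result


def run_timed(cmd, time_bin="/usr/bin/time"):
    """Run cmd (a list of strings) under `time -l -o FILE`.

    Returns (exit status, parsed rusage dict). The output file lives in
    /tmp and is removed when the `with` block exits.
    """
    with tempfile.NamedTemporaryFile(mode="r", prefix="time.", dir="/tmp") as out:
        status = subprocess.call([time_bin, "-l", "-o", out.name] + list(cmd))
        return status, parse_time_l(out.read())
```

Something like run_timed(["make", "all"]) would then return the exit
status together with the rusage fields keyed by their labels.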

The script then spits out a bunch of information as a single record in a
CSV (Comma-Separated Values) file (as that's the format my colleague
wanted): start- and stop-timestamps, the hostname where the processes
ran, the current working directory, real- and effective UIDs & GIDs; the
exit code for the invoked command, the output from time(1), and
(finally) the invoked command itself.  (I then use a different script to
read the CSV and update an RRD, then use rrdcgi(1) to generate graphs.)
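
A rough sketch of how one such record might be assembled, again in
Python and not taken from the actual script: the helper names and the
field order are assumptions. The point is that a real CSV writer quotes
any field containing commas or newlines, which the multi-line output of
time(1) would need.

```python
import csv
import io
import os
import socket


def make_record(start, stop, exit_code, time_output, command):
    """Build one record; the field order here is an assumption."""
    return [
        start, stop,                # start and stop timestamps
        socket.gethostname(),       # host where the command ran
        os.getcwd(),                # current working directory
        os.getuid(), os.geteuid(),  # real and effective UID
        os.getgid(), os.getegid(),  # real and effective GID
        exit_code,                  # exit code of the invoked command
        time_output,                # text produced by time(1)
        " ".join(command),          # the invoked command itself
    ]


def to_csv_line(record):
    """Serialize one record as a single CSV line, quoting where needed."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerow(record)
    return buf.getvalue()
```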

This has proved to be interesting, and quite possibly useful, but it
merely provides a view as to the resources used by the processes being
invoked from within the "scaffolding."

I believe we would be well-served by also collecting information about
the resources being consumed by the system as a whole -- for instance,
if there's a lot of other activity on the machine in question, it
might be nice to know that (and it might be even better if we had a
way to characterize the rest of the workload as a whole).

It would be handy if I could arrange to run vmstat(8), iostat(8), or
netstat(1) in such a way that I got counters for the values immediately
prior to starting the command being tested, then got a similar set of
counters just after the test command completed, so I could store the
"counter differences" some place handy.

But that doesn't seem too readily feasible at this time.

I had been trying to think of a decent way to get the overall system
information for precisely the interval that I'm running the test, and
then the thought occurred to me that perhaps I could invoke sysctl(8)
with a suitable set of arguments both before & after invoking the
process being tested; perhaps that would be a reasonable way to get
information of the desired quality.
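
To make the idea concrete, here is a minimal sketch of the
before-and-after approach, in Python. The function names are
hypothetical, and the OIDs listed (kern.cp_time and a few vm.stats
counters) are only examples; whether they are the right ones for this
workload is an assumption. The parser relies on sysctl(8) printing each
value as "name: value ...".

```python
import subprocess

# Example counters only; which OIDs are relevant is an assumption.
COUNTERS = [
    "kern.cp_time",             # CPU ticks: user nice sys intr idle
    "vm.stats.sys.v_swtch",     # context switches
    "vm.stats.sys.v_syscall",   # system calls
    "vm.stats.vm.v_vm_faults",  # page faults
]


def parse_sysctl(text):
    """Parse "name: v1 v2 ..." lines into a dict of name -> list of ints."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(": ")
        if not sep:
            continue
        try:
            values[name] = [int(tok) for tok in rest.split()]
        except ValueError:
            continue  # skip values that are not purely numeric
    return values


def snapshot(names=COUNTERS):
    """Run sysctl(8) once for all names and parse its output."""
    out = subprocess.check_output(["sysctl"] + list(names), text=True)
    return parse_sysctl(out)


def counter_diff(before, after):
    """Element-wise after - before for every name present in both."""
    return {
        name: [a - b for a, b in zip(after[name], before[name])]
        for name in after
        if name in before
    }
```

Usage would be: before = snapshot(), run the command under test,
after = snapshot(), then store counter_diff(before, after) alongside
the time(1) data. Two caveats: the differences cover everything the
machine did during the interval, including the snapshots themselves,
and a counter that wraps between the two snapshots would show up as a
negative difference.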

I do not expect to necessarily be able to install random ports on the
development machines, so there's a significant benefit to using an
approach that doesn't require doing that.  (I can use scripts that I
write, as they are being invoked by the "test" user, from that test
user's environment.)

So this is a reality check:  does that approach make sense?  If not,
what shortcomings does it have with respect to other alternatives?

Please recall that the intent is to be able to place the rusage data
from time(1) in a relevant context.

I'm reasonably open to suggestions & alternatives.

Thanks!

Peace,
david
--
David H. Wolfskill [hidden email]
I submit that "conspiracy" would be an appropriate collective noun for cats.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.
