Removing build metadata, for reproducible kernel builds

Removing build metadata, for reproducible kernel builds

Ed Maste-2
The main issue currently preventing kernel builds from being
reproducible[1] is the build metadata itself that's included (time,
user, host, build path). In order to make the kernel build
reproducible I plan to remove these by default, and add a src.conf
knob to enable them for developers who want them in their own builds.

The user-facing effect of this is that the kern.version sysctl no
longer conveys this information, and uname -a changes from something
like:

FreeBSD ref11-amd64.freebsd.org 11.0-CURRENT FreeBSD 11.0-CURRENT #0
r288681: Mon Oct  5 01:40:11 UTC 2015
[hidden email]:/usr/obj/usr/src/sys/CLUSTER11  amd64

to something like:

FreeBSD feynman 10.2-STABLE FreeBSD 10.2-STABLE #44
r288174+7644546(stable-10) amd64

The current version of the change is available for review at
https://reviews.freebsd.org/D4347.

[1] See https://reproducible-builds.org/ for more information on the
reproducible builds project.
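
As an illustration only (this is not the code in D4347), the two uname -a
forms above can be thought of as one formatter with the metadata fields
switched on or off. The sketch below assumes a made-up struct build_info and
hypothetical helpers format_version() and same_version(); the real string is
generated by the build scripts, not by C code like this.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the fields visible in the uname -a examples. */
struct build_info {
	const char *ostype;	/* e.g. "FreeBSD" */
	const char *release;	/* e.g. "11.0-CURRENT" */
	int	    iteration;	/* the "#N" build counter */
	const char *revision;	/* e.g. "r288681" */
	const char *date;	/* build time */
	const char *user_host;	/* who built it, and where */
	const char *objdir;	/* build path */
};

/*
 * Write the version string into buf.  With include_metadata == 0 the
 * time, user, host and path are left out.  Returns what snprintf() returns.
 */
int
format_version(char *buf, size_t len, const struct build_info *bi,
    int include_metadata)
{
	assert(buf != NULL && bi != NULL);
	if (include_metadata)
		return (snprintf(buf, len, "%s %s #%d %s: %s %s:%s",
		    bi->ostype, bi->release, bi->iteration, bi->revision,
		    bi->date, bi->user_host, bi->objdir));
	return (snprintf(buf, len, "%s %s #%d %s",
	    bi->ostype, bi->release, bi->iteration, bi->revision));
}

/* Return 1 if two builders would embed byte-identical version strings. */
int
same_version(const struct build_info *a, const struct build_info *b,
    int include_metadata)
{
	char va[512], vb[512];

	format_version(va, sizeof(va), a, include_metadata);
	format_version(vb, sizeof(vb), b, include_metadata);
	return (strcmp(va, vb) == 0);
}
```

Two builders of the same revision that differ only in time, user, host and
path compare equal with include_metadata set to 0 and unequal with it set
to 1. The "#N" counter is still part of the short form, so it has to match
as well.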
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-arch
To unsubscribe, send any mail to "[hidden email]"

Re: Removing build metadata, for reproducible kernel builds

Alfred Perlstein-3


On 12/2/15 9:36 AM, Ed Maste wrote:

> The main issue currently preventing kernel builds from being
> reproducible[1] is the build metadata itself that's included (time,
> user, host, build path). In order to make the kernel build
> reproducible I plan to remove these by default, and add a src.conf
> knob to enable them for developers who want them in their own builds.
>
> The user-facing effect of this is that the kern.version sysctl no
> longer conveys this information, and uname -a changes from something
> like:
>
> FreeBSD ref11-amd64.freebsd.org 11.0-CURRENT FreeBSD 11.0-CURRENT #0
> r288681: Mon Oct  5 01:40:11 UTC 2015
> [hidden email]:/usr/obj/usr/src/sys/CLUSTER11  amd64
>
> to something like:
>
> FreeBSD feynman 10.2-STABLE FreeBSD 10.2-STABLE #44
> r288174+7644546(stable-10) amd64
>
> The current version of the change is available for review at
> https://reviews.freebsd.org/D4347.
>
> [1] See https://reproducible-builds.org/ for more information on the
> reproducible builds project.

Can it not be done as a kernel module (containing the strings/numbers)
or injected after the fact by editing the binaries?

This info is very useful.

-Alfred

Re: Removing build metadata, for reproducible kernel builds

John Baldwin
In reply to this post by Ed Maste-2
On Wednesday, December 02, 2015 05:36:52 PM Ed Maste wrote:

> The main issue currently preventing kernel builds from being
> reproducible[1] is the build metadata itself that's included (time,
> user, host, build path). In order to make the kernel build
> reproducible I plan to remove these by default, and add a src.conf
> knob to enable them for developers who want them in their own builds.
>
> The user-facing effect of this is that the kern.version sysctl no
> longer conveys this information, and uname -a changes from something
> like:
>
> FreeBSD ref11-amd64.freebsd.org 11.0-CURRENT FreeBSD 11.0-CURRENT #0
> r288681: Mon Oct  5 01:40:11 UTC 2015
> [hidden email]:/usr/obj/usr/src/sys/CLUSTER11  amd64
>
> to something like:
>
> FreeBSD feynman 10.2-STABLE FreeBSD 10.2-STABLE #44
> r288174+7644546(stable-10) amd64
>
> The current version of the change is available for review at
> https://reviews.freebsd.org/D4347.
>
> [1] See https://reproducible-builds.org/ for more information on the
> reproducible builds project.

As I noted in the review, this will break kgdb -n (and possibly crashinfo,
less certain about that).  Keeping the path (which should not vary if you
build out of the same tree) will be sufficient to let kgdb -n still work
(though it may need some changes to recognize both formats).

Keeping the path also means that 'uname -a' still tells you which kernel
config you are running (I assume you aren't changing 'uname -i', but
'uname -a' doesn't include 'uname -i').

--
John Baldwin

Re: Removing build metadata, for reproducible kernel builds

Ian Lepore-3
On Wed, 2015-12-02 at 12:03 -0800, John Baldwin wrote:

> On Wednesday, December 02, 2015 05:36:52 PM Ed Maste wrote:
> > The main issue currently preventing kernel builds from being
> > reproducible[1] is the build metadata itself that's included (time,
> > user, host, build path). In order to make the kernel build
> > reproducible I plan to remove these by default, and add a src.conf
> > knob to enable them for developers who want them in their own
> > builds.
> >
> > The user-facing effect of this is that the kern.version sysctl no
> > longer conveys this information, and uname -a changes from
> > something
> > like:
> >
> > FreeBSD ref11-amd64.freebsd.org 11.0-CURRENT FreeBSD 11.0-CURRENT
> > #0
> > r288681: Mon Oct  5 01:40:11 UTC 2015
> > [hidden email]:/usr/obj/usr/src/sys/CLUSTER11  amd64
> >
> > to something like:
> >
> > FreeBSD feynman 10.2-STABLE FreeBSD 10.2-STABLE #44
> > r288174+7644546(stable-10) amd64
> >
> > The current version of the change is available for review at
> > https://reviews.freebsd.org/D4347.
> >
> > [1] See https://reproducible-builds.org/ for more information on
> > the
> > reproducible builds project.
>
> As I noted in the review, this will break kgdb -n (and possibly
> crashinfo,
> less certain about that).  Keeping the path (which should not vary if
> you
> build out of the same tree) will be sufficient to let kgdb -n still
> work
> (though it may need some changes to recognize both formats).
>
> Keeping the path also means that 'uname -a' still tells you which
> kernel
> config you are running (I assume you aren't changing 'uname -i', but
> 'uname -a' doesn't include 'uname -i').
>

But in the kinds of venues where reproducible builds are most
important, such as creating images that are part of commercial
products, the build path is one of the things most likely to change
between builds and least likely to be significant in terms of any
differences to the contents of the build.  Likewise the hostname of the
build machine, which it appears is still in the uname output.

-- Ian



Re: Removing build metadata, for reproducible kernel builds

Andriy Gapon
In reply to this post by Ed Maste-2
On 02/12/2015 19:36, Ed Maste wrote:

> The main issue currently preventing kernel builds from being
> reproducible[1] is the build metadata itself that's included (time,
> user, host, build path). In order to make the kernel build
> reproducible I plan to remove these by default, and add a src.conf
> knob to enable them for developers who want them in their own builds.
>
> The user-facing effect of this is that the kern.version sysctl no
> longer conveys this information, and uname -a changes from something
> like:
>
> FreeBSD ref11-amd64.freebsd.org 11.0-CURRENT FreeBSD 11.0-CURRENT #0
> r288681: Mon Oct  5 01:40:11 UTC 2015
> [hidden email]:/usr/obj/usr/src/sys/CLUSTER11  amd64
>
> to something like:
>
> FreeBSD feynman 10.2-STABLE FreeBSD 10.2-STABLE #44
> r288174+7644546(stable-10) amd64
>
> The current version of the change is available for review at
> https://reviews.freebsd.org/D4347.
>
> [1] See https://reproducible-builds.org/ for more information on the
> reproducible builds project.

Personally, I would prefer that, at least initially, KERNEL_METADATA is "yes" by
default.  My thinking is that people who really need reproducible builds would
have no trouble toggling the knob and the rest would have the traditional behavior.

--
Andriy Gapon

Re: Removing build metadata, for reproducible kernel builds

Tim Kientzle-2
In reply to this post by Ed Maste-2

> On Dec 2, 2015, at 9:36 AM, Ed Maste <[hidden email]> wrote:
>
> The main issue currently preventing kernel builds from being
> reproducible[1] is the build metadata itself that's included (time,
> user, host, build path). In order to make the kernel build
> reproducible I plan to remove these by default, and add a src.conf
> knob to enable them for developers who want them in their own builds.
>
> The user-facing effect of this is that the kern.version sysctl no
> longer conveys this information, and uname -a changes from something
> like:
>
> FreeBSD ref11-amd64.freebsd.org 11.0-CURRENT FreeBSD 11.0-CURRENT #0
> r288681: Mon Oct  5 01:40:11 UTC 2015
> [hidden email]:/usr/obj/usr/src/sys/CLUSTER11  amd64
>
> to something like:
>
> FreeBSD feynman 10.2-STABLE FreeBSD 10.2-STABLE #44
> r288174+7644546(stable-10) amd64
>
> The current version of the change is available for review at
> https://reviews.freebsd.org/D4347.
>
> [1] See https://reproducible-builds.org/ for more information on the
> reproducible builds project.

How feasible would it be for the various metadata here to
be overridable by src.conf?

That is, by default, the time, user, host, etc, are taken from
the local environment, but src.conf variables can override them
to produce more predictable results.
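
A rough sketch of that "override, else local environment" rule, written in C
only to match the code quoted elsewhere in this thread (the real logic would
live in the build scripts). pick_field() and its parameters are hypothetical
names, not anything in the tree:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the value to embed for one metadata field.  A non-empty override
 * (as might come from a src.conf variable) wins; otherwise the value taken
 * from the local environment is used.
 */
const char *
pick_field(const char *override, const char *local_value)
{
	assert(local_value != NULL);
	if (override != NULL && strlen(override) > 0)
		return (override);
	return (local_value);
}
```

With every field pinned this way, two builds produce the same string; with
none pinned, the result is the traditional behavior.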

Tim



Re: Removing build metadata, for reproducible kernel builds

Warner Losh
In reply to this post by Ed Maste-2

> On Dec 2, 2015, at 10:36 AM, Ed Maste <[hidden email]> wrote:
>
> The main issue currently preventing kernel builds from being
> reproducible[1] is the build metadata itself that's included (time,
> user, host, build path). In order to make the kernel build
> reproducible I plan to remove these by default, and add a src.conf
> knob to enable them for developers who want them in their own builds.
>
> The user-facing effect of this is that the kern.version sysctl no
> longer conveys this information, and uname -a changes from something
> like:
>
> FreeBSD ref11-amd64.freebsd.org 11.0-CURRENT FreeBSD 11.0-CURRENT #0
> r288681: Mon Oct  5 01:40:11 UTC 2015
> [hidden email]:/usr/obj/usr/src/sys/CLUSTER11  amd64
>
> to something like:
>
> FreeBSD feynman 10.2-STABLE FreeBSD 10.2-STABLE #44
> r288174+7644546(stable-10) amd64
>
> The current version of the change is available for review at
> https://reviews.freebsd.org/D4347.
>
> [1] See https://reproducible-builds.org/ for more information on the
> reproducible builds project.
I noted in the review that I don’t like the default being no.

I also don’t like that we’re growing lots of different knobs that need
to be set to get a repeatable build. Let’s have one, or barring that,
let’s have one that sets all the sub-knobs.

I think that host and path are more worthless than date and time
in many environments. Who builds it likewise. Those are all things
that are likely to change between builds, yet change the kernel
image. I’d rather see it all gone when this option is in effect.
And I’d rather see the default be to the historical behavior.
The build number too is kinda lame here, since that’s just a history
of the number of tries. If you are building from svn, it should be
zero. But if you’re rebuilding, you can easily get that number over
100 as you update from rev to rev and reboot. It’s better to have
the date / time of the build so if you are seeing a problem on a
test machine, you’ll know more firmly if the build has that thing
you fixed yesterday afternoon or not by the date / time it
was built, and by whom (since my kernels after 9:15am
have the fix, but nobody else does before 2:00pm since
that’s when I checked it in).

So I see the need for the feature, in general. But this doesn’t
implement a reproducible build due to the build number, the
user, the host and the path still being encoded into it. That makes
the change to remove date / time completely arbitrary which
is annoying because they are useful in many environments
where it would be difficult to force everybody to ‘opt in’ to
having them included. It’s easier to opt-out the release
process.

Warner


Re: Removing build metadata, for reproducible kernel builds

Ed Maste-2
On 3 December 2015 at 05:51, Warner Losh <[hidden email]> wrote:
>
> I noted in the review that I don’t like the default being no.
>
> I also don’t like that we’re growing lots of different knobs that need
> to be set to get a repeatable build. Let’s have one, or barring that,
> let’s have one that sets all the sub-knobs.

My hope is that we'll have a reproducible build by default, and that
*no* knobs need to be set. That's what I intend with my patch. I can
rename the knob to WITH_/WITHOUT_REPRODUCIBLE_BUILD though if that's
generally desired. If there's a consensus to default to including the
metadata I'm fine with setting it in make release.

> I think that host and path are more worthless than date and time
> in many environments. Who builds it likewise. Those are all things
> that are likely to change between builds, yet change the kernel
> image. I’d rather see it all gone when this option is in effect.

I don't follow -- other than the build iteration number (which I
indeed missed), it is all gone.

Re: Removing build metadata, for reproducible kernel builds

Enji Cooper

> On Dec 2, 2015, at 23:55, Ed Maste <[hidden email]> wrote:
>
> On 3 December 2015 at 05:51, Warner Losh <[hidden email]> wrote:
>>
>> I noted in the review that I don’t like the default being no.
>>
>> I also don’t like that we’re growing lots of different knobs that need
>> to be set to get a repeatable build. Let’s have one, or barring that,
>> let’s have one that sets all the sub-knobs.
>
> My hope is that we'll have a reproducible build by default, and that
> *no* knobs need to be set. That's what I intend with my patch. I can
> rename the knob to WITH_/WITHOUT_REPRODUCIBLE_BUILD though if that's
> generally desired. If there's a consensus to default to including the
> metadata I'm fine with setting it in make release.
>
>> I think that host and path are more worthless than date and time
>> in many environments. Who builds it likewise. Those are all things
>> that are likely to change between builds, yet change the kernel
>> image. I’d rather see it all gone when this option is in effect.
>
> I don't follow -- other than the build iteration number (which I
> indeed missed), it is all gone.

I personally like being able to debug when user A builds on machine X vs user B on machine Y, because it has helped me find issues with people's build environments in the past where I could otherwise have ended up pulling teeth.

I think the single src.conf knob approach is wrong though. Why not document how to do it in build(7) and tweak newvers.sh (which drives this to begin with) to do it? That would generalize the solution, accomplish this goal, and help $work accomplish it too: right now we ($work) hack newvers.sh to change the version information and brand the product appropriately, instead of building upon existing infrastructure, because the existing infrastructure is inflexible, undocumented, and very static.

Thanks,
-NGie

Re: Removing build metadata, for reproducible kernel builds

Erik Cederstrand-3
In reply to this post by John Baldwin

> Den 2. dec. 2015 kl. 21.03 skrev John Baldwin <[hidden email]>:
>
> As I noted in the review, this will break kgdb -n (and possibly crashinfo,
> less certain about that).  Keeping the path (which should not vary if you
> build out of the same tree) will be sufficient to let kgdb -n still work
> (though it may need some changes to recognize both formats).

Would it be feasible to include the relative build path instead of the absolute path? I seem to remember patches floating around for the __FILE__ macro, but I don't know if (k)gdb can work with relative paths.

Erik

Re: Removing build metadata, for reproducible kernel builds

John Baldwin
On Thursday, December 03, 2015 10:28:10 AM Erik Cederstrand wrote:
>
> > Den 2. dec. 2015 kl. 21.03 skrev John Baldwin <[hidden email]>:
> >
> > As I noted in the review, this will break kgdb -n (and possibly crashinfo,
> > less certain about that).  Keeping the path (which should not vary if you
> > build out of the same tree) will be sufficient to let kgdb -n still work
> > (though it may need some changes to recognize both formats).
>
> Would it be feasible to include the relative build path instead of the absolute path? I seem to remember patches floating around for the __FILE__ macro, but I don't know if (k)gdb can work with relative paths.

This is what kgdb -n does:

        /*
         * No kernel image here.  Parse the dump header.  The kernel object
         * directory can be found there and we probably have the kernel
         * image still in it.  The object directory may also have a kernel
         * with debugging info (called kernel.debug).  If we have a debug
         * kernel, use it.
         */
        snprintf(path, sizeof(path), "%s/info.%d", crashdir, nr);
        info = fopen(path, "r");
        if (info == NULL) {
                warn("%s", path);
                return;
        }
        while (fgets(path, sizeof(path), info) != NULL) {
                l = strlen(path);
                if (l > 0 && path[l - 1] == '\n')
                        path[--l] = '\0';
                if (strncmp(path, "    ", 4) == 0) {
                        s = strchr(path, ':');
                        s = (s == NULL) ? path + 4 : s + 1;
                        l = snprintf(path, sizeof(path), "%s/kernel.debug", s);
                        if (stat(path, &st) == -1 || !S_ISREG(st.st_mode)) {
                                path[l - 6] = '\0';
                                if (stat(path, &st) == -1 ||
                                    !S_ISREG(st.st_mode))
                                        break;
                        }
                        kernel = strdup(path);
                        break;
                }
        }
        fclose(info);

It basically pulls the path from the 'version' string in the /var/crash/info.X
file, appends 'kernel.debug' to it, and checks whether a file with that
pathname exists.  If so, it uses it.  This means it doesn't look for a kernel in
some /boot/foo; it looks in the build directory.
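
To make the dependency on the build path concrete, here is a minimal
standalone sketch of just the path-extraction step. It is not the kgdb source:
debug_kernel_path() is a hypothetical helper, it writes into a separate output
buffer, and it skips the stat() checks and the fallback to plain 'kernel'.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Given one line of an info.N file, build "<objdir>/kernel.debug" in out.
 * Only lines that start with four spaces are treated as carrying the path,
 * and the path is whatever follows the first ':' (or the whole line after
 * the indent if there is no ':').
 * Returns 1 if a path was produced and fit in out, 0 otherwise.
 */
int
debug_kernel_path(const char *line, char *out, size_t outlen)
{
	const char *s;
	size_t n;
	int l;

	assert(line != NULL && out != NULL && outlen > 0);
	if (strncmp(line, "    ", 4) != 0)
		return (0);
	s = strchr(line, ':');
	s = (s == NULL) ? line + 4 : s + 1;
	n = strcspn(s, "\n");		/* drop a trailing newline, if any */
	l = snprintf(out, outlen, "%.*s/kernel.debug", (int)n, s);
	return (l > 0 && (size_t)l < outlen);
}
```

If the version string no longer ends in the object directory, this step has
nothing to extract, which is why kgdb -n would need changes to cope with the
shorter format.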

crashinfo instead finds all the 'kernel' files under /boot, extracts the
version string using gdb from each kernel, and does a string compare with the
version string in info.X.  For this reason, crashinfo will still work if each
string is unique.  However, with the proposal, kernels built with different
kernel configs from the same tree would have the same version string, thus being
indistinguishable.
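
The ambiguity can be sketched as follows. This is an assumption-laden
illustration, not crashinfo itself (which is a script that uses gdb);
match_kernel() is a hypothetical helper that does the same kind of exact
string comparison over a list of candidate version strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Compare the version string from a crash dump against the version
 * strings of the candidate kernels.
 * Returns the index of the single match, -1 if nothing matches, or -2 if
 * more than one candidate carries the same string (ambiguous).
 */
int
match_kernel(const char *dump_version, const char *const candidates[],
    size_t n)
{
	int found = -1;
	size_t i;

	assert(dump_version != NULL);
	for (i = 0; i < n; i++) {
		if (strcmp(dump_version, candidates[i]) != 0)
			continue;
		if (found != -1)
			return (-2);	/* second hit: cannot tell them apart */
		found = (int)i;
	}
	return (found);
}
```

With the object directory in the string, two configs built from one tree
differ in their last path component and the lookup is unique; with the short
form both strings are equal and the function reports -2.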

A more robust solution than the string compares would be build-id, but that
requires a newer linker which we don't have.

--
John Baldwin

Re: Removing build metadata, for reproducible kernel builds

Warner Losh
In reply to this post by Ed Maste-2
On Thu, Dec 3, 2015 at 12:55 AM, Ed Maste <[hidden email]> wrote:

> On 3 December 2015 at 05:51, Warner Losh <[hidden email]> wrote:
> >
> > I noted in the review that I don’t like the default being no.
> >
> > I also don’t like that we’re growing lots of different knobs that need
> > to be set to get a repeatable build. Let’s have one, or barring that,
> > let’s have one that sets all the sub-knobs.
>
> My hope is that we'll have a reproducible build by default, and that
> *no* knobs need to be set. That's what I intend with my patch. I can
> rename the knob to WITH_/WITHOUT_REPRODUCIBLE_BUILD though if that's
> generally desired. If there's a consensus to default to including the
> metadata I'm fine with setting it in make release.


I think this is an unwise decision in the current form suggested. The kernel
metadata has saved my butt enough times I really don't want to see it
go by default. But see below for a reasonable (imho) middle ground that
would be a good default.


> > I think that host and path are more worthless than date and time
> > in many environments. Who builds it likewise. Those are all things
> > that are likely to change between builds, yet change the kernel
> > image. I’d rather see it all gone when this option is in effect.
>
> I don't follow -- other than the build iteration number (which I
> indeed missed), it is all gone.
>

Yea I was reading things backwards.

In the review, I suggested that if you've modified the tree (which the SCM
will tell you), then do the old format to preserve useful metadata that's
really really needed and if not to use the shorter version. When you've
modified the tree, reproducible builds aren't a concern at all.

Warner

Re: Removing build metadata, for reproducible kernel builds

Ian Lepore-3
On Thu, 2015-12-03 at 12:53 -0700, Warner Losh wrote:

> On Thu, Dec 3, 2015 at 12:55 AM, Ed Maste <[hidden email]> wrote:
>
> > On 3 December 2015 at 05:51, Warner Losh <[hidden email]> wrote:
> > >
> > > I noted in the review that I don’t like the default being no.
> > >
> > > I also don’t like that we’re growing lots of different knobs that need
> > > to be set to get a repeatable build. Let’s have one, or barring that,
> > > let’s have one that sets all the sub-knobs.
> >
> > My hope is that we'll have a reproducible build by default, and that
> > *no* knobs need to be set. That's what I intend with my patch. I can
> > rename the knob to WITH_/WITHOUT_REPRODUCIBLE_BUILD though if that's
> > generally desired. If there's a consensus to default to including the
> > metadata I'm fine with setting it in make release.
>
>
> I think this is an unwise decision in the current form suggested. The kernel
> metadata has saved my butt enough times I really don't want to see it
> go by default. But see below for a reasonable (imho) middle ground that
> would be a good default.
>

I'm curious why anyone wants this enabled by default, like... are we
missing something?  Does it improve freebsd-update behavior maybe?

If it's just for some general "reproducibility is good" philosophy then
I would counter with "information is even better, so don't throw it
away without a good reason."

Reproducibility is good for some people, and completely useless for
others, and the people who need it aren't going to mind turning on a
knob or two to get what they want.

>
> > > I think that host and path are more worthless than date and time
> > > in many environments. Who builds it likewise. Those are all things
> > > that are likely to change between builds, yet change the kernel
> > > image. I’d rather see it all gone when this option is in effect.
> >
> > I don't follow -- other than the build iteration number (which I
> > indeed missed), it is all gone.
> >
>
> Yea I was reading things backwards.
>
> In the review, I suggested that if you've modified the tree (which the SCM
> will tell you), then do the old format to preserve useful metadata that's
> really really needed and if not to use the shorter version. When you've
> modified the tree, reproducible builds aren't a concern at all.
>

How are you going to determine what constitutes a modified tree?  What
you think of as modifications may be what I call my baseline version.

-- Ian

Re: Removing build metadata, for reproducible kernel builds

Justin Hibbits-2
On Thu, Dec 3, 2015 at 3:15 PM, Ian Lepore <[hidden email]> wrote:

> On Thu, 2015-12-03 at 12:53 -0700, Warner Losh wrote:
>> On Thu, Dec 3, 2015 at 12:55 AM, Ed Maste <[hidden email]> wrote:
>>
>> > On 3 December 2015 at 05:51, Warner Losh <[hidden email]> wrote:
>> > >
>> > > I noted in the review that I don’t like the default being no.
>> > >
>> > > I also don’t like that we’re growing lots of different knobs that need
>> > > to be set to get a repeatable build. Let’s have one, or barring that,
>> > > let’s have one that sets all the sub-knobs.
>> >
>> > My hope is that we'll have a reproducible build by default, and that
>> > *no* knobs need to be set. That's what I intend with my patch. I can
>> > rename the knob to WITH_/WITHOUT_REPRODUCIBLE_BUILD though if that's
>> > generally desired. If there's a consensus to default to including the
>> > metadata I'm fine with setting it in make release.
>>
>>
>> I think this is an unwise decision in the current form suggested. The kernel
>> metadata has saved my butt enough times I really don't want to see it
>> go by default. But see below for a reasonable (imho) middle ground that
>> would be a good default.
>>
>
> I'm curious why anyone wants this enabled by default, like... are we
> missing something?  Does it improve freebsd-update behavior maybe?
>
> If it's just for some general "reproducibility is good" philosophy then
> I would counter with "information is even better, so don't throw it
> away without a good reason."
>
> Reproducibility is good for some people, and completely useless for
> others, and the people who need it aren't going to mind turning on a
> knob or two to get what they want.
>
>>
>> > > I think that host and path are more worthless than date and time
>> > > in many environments. Who builds it likewise. Those are all things
>> > > that are likely to change between builds, yet change the kernel
>> > > image. I’d rather see it all gone when this option is in effect.
>> >
>> > I don't follow -- other than the build iteration number (which I
>> > indeed missed), it is all gone.
>> >
>>
>> Yea I was reading things backwards.
>>
>> In the review, I suggested that if you've modified the tree (which the SCM
>> will tell you), then do the old format to preserve useful metadata that's
>> really really needed and if not to use the shorter version. When you've
>> modified the tree, reproducible builds aren't a concern at all.
>>
>
> How are you going to determine what constitutes a modified tree?  What
> you think of as modifications may be what I call my baseline version.
>
> -- Ian

svnversion resulting in a 'nnnnnnM'?
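
For what it's worth, that check is small. A sketch, assuming the output of
svnversion has already been captured into a string; tree_is_modified() is a
hypothetical name:

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Return 1 if a captured svnversion string reports local modifications.
 * svnversion marks a modified working copy with an 'M' after the revision
 * number (or revision range); requiring a leading digit rejects output
 * that is not a revision at all.
 */
int
tree_is_modified(const char *svnversion_output)
{
	assert(svnversion_output != NULL);
	return (isdigit((unsigned char)svnversion_output[0]) &&
	    strchr(svnversion_output, 'M') != NULL);
}
```

The result could then select between the long and short version formats, as
suggested in the review.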

- Justin

Re: Removing build metadata, for reproducible kernel builds

Jonathan Anderson-2
In reply to this post by Ian Lepore-3
On 3 Dec 2015, at 17:45, Ian Lepore wrote:
> I'm curious why anyone wants this enabled by default, like... are we
> missing something?  Does it improve freebsd-update behavior maybe?

There is value in being able to reproduce the things you run, especially
if you download them from somebody else (like releases or binary
packages). It's not a panacea (see "Reflections on Trusting Trust"), but
it’s helpful, even if you don't always do the reproduction work. The
very fact that someone *can* check a binary release for naughtiness is a
strong incentive for many adversaries not to try their hand.


> If it's just for some general "reproducibility is good" philosophy
> then
> I would counter with "information is even better, so don't throw it
> away without a good reason."

When you're building your own stuff, sure, it might help to know that
this is the kernel you built on "this machine" at "that time". When
running 10.2-RELEASE-p7, however, it’s not very useful to know that it
was built on amd64-builder.daemonology.net, or that the source tree was
located at /usr/src. It *might* be useful to know that {set of people}
all got kernels that hash to {some bit pattern} when they reproduced the
build (like Certificate Transparency). Or, more interestingly, that
{people using some configuration} got a different result. Again, like
Certificate Transparency. :)


> Reproducibility is good for some people, and completely useless for
> others, and the people who need it aren't going to mind turning on a
> knob or two to get what they want.

Possibly. I don't have any strong opinions on whether the default is
"reproducible" or "full of information that helps me identify busted
kernels", just so long as "reproducible" is available and easy to turn
on. And my personal opinion is that it should be turned on for public
releases: I think that being able to validate the kernel is more
important than knowing what machine it was built on.


>> Yea I was reading things backwards.
>>
>> In the review, I suggested that if you've modified the tree (which
>> the SCM
>> will tell you), then do the old format to preserve useful metadata
>> that's
>> really really needed and if not to use the shorter version. When
>> you've
>> modified the tree, reproducible builds aren't a concern at all.
>>
>
> How are you going to determine what constitutes a modified tree?  What
> you think of as modifications may be what I call my baseline version.

Since we host our code in Subversion and have an official Git mirror,
how about svn status || git status? If you're basing your code off of
anything other than an official mirror, you get to deal with the
reproducibility problem yourself, but it sounds like many people in this
camp would prefer the more verbose version string anyway.


Jon
--
Jonathan Anderson
[hidden email]

Re: Removing build metadata, for reproducible kernel builds

Ian Lepore-3
In reply to this post by Justin Hibbits-2
On Thu, 2015-12-03 at 15:35 -0600, Justin Hibbits wrote:

> On Thu, Dec 3, 2015 at 3:15 PM, Ian Lepore <[hidden email]> wrote:
> > On Thu, 2015-12-03 at 12:53 -0700, Warner Losh wrote:
> > > On Thu, Dec 3, 2015 at 12:55 AM, Ed Maste <[hidden email]> wrote:
> > >
> > > > On 3 December 2015 at 05:51, Warner Losh <[hidden email]> wrote:
> > > > >
> > > > > I noted in the review that I don’t like the default being no.
> > > > >
> > > > > I also don’t like that we’re growing lots of different knobs
> > > > > that need to be set to get a repeatable build. Let’s have one,
> > > > > or barring that, let’s have one that sets all the sub-knobs.
> > > >
> > > > My hope is that we'll have a reproducible build by default, and
> > > > that *no* knobs need to be set. That's what I intend with my
> > > > patch. I can rename the knob to WITH_/WITHOUT_REPRODUCIBLE_BUILD
> > > > though if that's generally desired. If there's a consensus to
> > > > default to including the metadata I'm fine with setting it in
> > > > make release.
> > >
> > >
> > > I think this is an unwise decision in the current form suggested.
> > > The kernel metadata has saved my butt enough times I really don't
> > > want to see it go by default. But see below for a reasonable (imho)
> > > middle ground that would be a good default.
> > >
> >
> > I'm curious why anyone wants this enabled by default, like... are we
> > missing something?  Does it improve freebsd-update behavior maybe?
> >
> > If it's just for some general "reproducibility is good" philosophy
> > then I would counter with "information is even better, so don't throw
> > it away without a good reason."
> >
> > Reproducibility is good for some people, and completely useless for
> > others, and the people who need it aren't going to mind turning on a
> > knob or two to get what they want.
> >
> > >
> > > > > I think that host and path are more worthless than date and
> > > > > time in many environments. Who builds it likewise. Those are
> > > > > all things that are likely to change between builds, yet change
> > > > > the kernel image. I’d rather see it all gone when this option
> > > > > is in effect.
> > > >
> > > > I don't follow -- other than the build iteration number (which I
> > > > indeed missed), it is all gone.
> > > >
> > >
> > > Yea I was reading things backwards.
> > >
> > > In the review, I suggested that if you've modified the tree (which
> > > the SCM will tell you), then do the old format to preserve useful
> > > metadata that's really really needed and if not to use the shorter
> > > version. When you've modified the tree, reproducible builds aren't
> > > a concern at all.
> > >
> >
> > How are you going to determine what constitutes a modified tree?
> > What you think of as modifications may be what I call my baseline
> > version.
> >
> > -- Ian
>
> svnversion resulting in a 'nnnnnnM'?
>
> - Justin
>

svnversion isn't going to be able to return anything useful inside one
of my build sandboxes in which there is no hint of svn anything.

-- Ian


Re: Removing build metadata, for reproducible kernel builds

Ed Maste-2
In reply to this post by Justin Hibbits-2
On 3 December 2015 at 21:35, Justin Hibbits <[hidden email]> wrote:
>
> svnversion resulting in a 'nnnnnnM'?

Warner suggested this in the review also, and it might be a good way
to choose a default. In any case it's clear that there's strong (and
reasonable) objection to enabling this by default for all builds, so
I'll not commit the change as-is.

I believe there are three separate issues here:

1) It should be possible to build the kernel reproducibly. I hope this
isn't contentious.

2) Control over enabling reproducible builds -- build knob or no,
default to on/off, based on svnversion including 'M', forced on for
release builds, etc.

3) Some tools rely on the current format / data, and will need to be fixed.

I expect to make a change so that a reproducible build is possible,
but not introduce a new knob or change anything by default. After that
I'll work on the issues in #3 and once that's done we can start the
bikeshed about whether there should be a knob, what the default should
be etc.
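
As an illustration of the svnversion check in item 2, here is a rough sketch of how a build script might read the trailing 'M'. It assumes svnversion prints a revision or a low:high range followed by optional M (modified), S (switched) and P (partial) flags; the function name is made up for this example, and anything it does not recognize is reported as "unknown" rather than guessed at.

```python
import re

# Assumed svnversion shapes: "288681", "288174:288681", optionally followed
# by flag letters such as "M" (modified), "S" (switched), "P" (partial).
_SVNVERSION_RE = re.compile(r"^(\d+)(?::(\d+))?([MSP]*)$")


def tree_is_modified(svnversion_output):
    """Return True or False for a working copy, None if unrecognized.

    None covers cases such as a sandbox with no Subversion metadata,
    where svnversion cannot report a revision at all.
    """
    match = _SVNVERSION_RE.match(svnversion_output.strip())
    if match is None:
        return None
    return "M" in match.group(3)
```

Returning None instead of False keeps "no SCM information" separate from "clean tree", so whatever policy is layered on top can treat the two cases differently.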

Thanks all for the feedback.

Re: Removing build metadata, for reproducible kernel builds

Ian Lepore-3
In reply to this post by Jonathan Anderson-2
On Thu, 2015-12-03 at 18:11 -0330, Jonathan Anderson wrote:

> On 3 Dec 2015, at 17:45, Ian Lepore wrote:
> > I'm curious why anyone wants this enabled by default, like... are we
> > missing something?  Does it improve freebsd-update behavior maybe?
>
> There is value in being able to reproduce the things you run,
> especially if you download them from somebody else (like releases or
> binary packages). It's not a panacea (see "Reflections on Trusting
> Trust"), but it’s helpful, even if you don't always do the
> reproduction work. The very fact that someone *can* check a binary
> release for naughtiness is a strong incentive for many adversaries not
> to try their hand.
>
>
> > If it's just for some general "reproducibility is good" philosophy
> > then I would counter with "information is even better, so don't throw
> > it away without a good reason."
>
> When you're building your own stuff, sure, it might help to know that
> this is the kernel you built on "this machine" at "that time". When
> running 10.2-RELEASE-p7, however, it’s not very useful to know that it
> was built on amd64-builder.daemonology.net, or that the source tree
> was located at /usr/src. It *might* be useful to know that {set of
> people} all got kernels that hash to {some bit pattern} when they
> reproduced the build (like Certificate Transparency). Or, more
> interestingly, that {people using some configuration} got a different
> result. Again, like Certificate Transparency. :)
>
>
> > Reproducibility is good for some people, and completely useless for
> > others, and the people who need it aren't going to mind turning on a
> > knob or two to get what they want.
>
> Possibly. I don't have any strong opinions on whether the default is
> "reproducible" or "full of information that helps me identify busted
> kernels", just so long as "reproducible" is available and easy to turn
> on. And my personal opinion is that it should be turned on for public
> releases: I think that being able to validate the kernel is more
> important than knowing what machine it was built on.
>
>
> > > Yea I was reading things backwards.
> > >
> > > In the review, I suggested that if you've modified the tree (which
> > > the SCM will tell you), then do the old format to preserve useful
> > > metadata that's really really needed and if not to use the shorter
> > > version. When you've modified the tree, reproducible builds aren't
> > > a concern at all.
> > >
> >
> > How are you going to determine what constitutes a modified tree?
> > What you think of as modifications may be what I call my baseline
> > version.
>
> Since we host our code in Subversion and have an official Git mirror,
> how about svn status || git status? If you're basing your code off of
> anything other than an official mirror, you get to deal with the
> reproducibility problem yourself, but it sounds like many people in
> this camp would prefer the more verbose version string anyway.
>

By "we" you must mean "The FreeBSD Project" but surely you also realize
that the universe of freebsd users is much larger than just the
project, and not all of them use subversion or git to check out freebsd
and/or manage their local copies of it.

For a company building products based on freebsd, reproducibility is
important, but they're quite likely to be using something other than
subversion or git to manage the source.  They're also quite likely to
have local modifications that they consider to be part of their
baseline even if they appear to be modifications from the project's
repo at the same svn revision number.  Either way, these folks are
going to want to set some control that enforces reproducibility
regardless of any build system heuristics about what to default to.

For other companies or end users the important factor might be the
ability to reproduce an official release, which one presumes would
start with checking out the official sources using one of the official
SCMs, and then a whole other set of "what constitutes a modification"
rules would apply.

As someone who works for one of those "not-svn, not-git" companies I
just want to make sure there's a "do what I say" knob that overrides
any attempts to be smart about detecting modifications.

-- Ian


Re: Removing build metadata, for reproducible kernel builds

Warner Losh
In reply to this post by Ian Lepore-3
On Thu, Dec 3, 2015 at 2:45 PM, Ian Lepore <[hidden email]> wrote:

> On Thu, 2015-12-03 at 15:35 -0600, Justin Hibbits wrote:
> > On Thu, Dec 3, 2015 at 3:15 PM, Ian Lepore <[hidden email]> wrote:
> > > On Thu, 2015-12-03 at 12:53 -0700, Warner Losh wrote:
> > > > On Thu, Dec 3, 2015 at 12:55 AM, Ed Maste <[hidden email]> wrote:
> > > >
> > > > > On 3 December 2015 at 05:51, Warner Losh <[hidden email]> wrote:
> > > > > >
> > > > > > I noted in the review that I don’t like the default being no.
> > > > > >
> > > > > > I also don’t like that we’re growing lots of different knobs
> > > > > > that need to be set to get a repeatable build. Let’s have
> > > > > > one, or barring that, let’s have one that sets all the
> > > > > > sub-knobs.
> > > > >
> > > > > My hope is that we'll have a reproducible build by default, and
> > > > > that *no* knobs need to be set. That's what I intend with my
> > > > > patch. I can rename the knob to
> > > > > WITH_/WITHOUT_REPRODUCIBLE_BUILD though if that's generally
> > > > > desired. If there's a consensus to default to including the
> > > > > metadata I'm fine with setting it in make release.
> > > >
> > > >
> > > > I think this is an unwise decision in the current form suggested.
> > > > The kernel metadata has saved my butt enough times I really don't
> > > > want to see it go by default. But see below for a reasonable
> > > > (imho) middle ground that would be a good default.
> > > >
> > >
> > > I'm curious why anyone wants this enabled by default, like... are
> > > we missing something?  Does it improve freebsd-update behavior
> > > maybe?
> > >
> > > If it's just for some general "reproducibility is good" philosophy
> > > then I would counter with "information is even better, so don't
> > > throw it away without a good reason."
> > >
> > > Reproducibility is good for some people, and completely useless for
> > > others, and the people who need it aren't going to mind turning on
> > > a knob or two to get what they want.
> > >
> > > >
> > > > > > I think that host and path are more worthless than date and
> > > > > > time in many environments. Who builds it likewise. Those are
> > > > > > all things that are likely to change between builds, yet
> > > > > > change the kernel image. I’d rather see it all gone when this
> > > > > > option is in effect.
> > > > >
> > > > > I don't follow -- other than the build iteration number (which
> > > > > I indeed missed), it is all gone.
> > > > >
> > > >
> > > > Yea I was reading things backwards.
> > > >
> > > > In the review, I suggested that if you've modified the tree
> > > > (which the SCM will tell you), then do the old format to preserve
> > > > useful metadata that's really really needed and if not to use the
> > > > shorter version. When you've modified the tree, reproducible
> > > > builds aren't a concern at all.
> > > >
> > >
> > > How are you going to determine what constitutes a modified tree?
> > > What you think of as modifications may be what I call my baseline
> > > version.
> > >
> > > -- Ian
> >
> > svnversion resulting in a 'nnnnnnM'?
> >
> > - Justin
> >
>
> svnversion isn't going to be able to return anything useful inside one
> of my build sandboxes in which there is no hint of svn anything.
>

Then, in my proposal, you'd get the 'reproducible' format. We already
don't include the SVN info in this case.

Perhaps this isn't desirable for you, but it's my proposal and my
suggestion, and I'd welcome comments on it.
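
To spell the rule out, here is a small sketch of the selection logic as I read it from this thread: a modified tree gets the old verbose string, while a clean tree or one with no SCM information gets the reproducible one. It also includes the "do what I say" override Ian asked for. The function and parameter names are hypothetical, and this is not what the patch in review implements.

```python
def choose_version_format(knob=None, tree_modified=None):
    """Pick "reproducible" or "verbose" for the kernel version string.

    knob: True/False if the builder explicitly requested (or refused) a
          reproducible build via a hypothetical setting; None if unset.
    tree_modified: True/False from an SCM check; None when there is no
          SCM information at all (e.g. a sandbox without svn metadata).
    """
    # An explicit request always wins over any heuristic.
    if knob is not None:
        return "reproducible" if knob else "verbose"
    # Heuristic: only a tree known to be modified keeps the old metadata.
    if tree_modified:
        return "verbose"
    return "reproducible"
```

With this ordering, a company whose baseline looks "modified" relative to the project's repository can still force the reproducible format by setting the knob.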

Warner

Re: Removing build metadata, for reproducible kernel builds

John Baldwin
In reply to this post by Jonathan Anderson-2
On Thursday, December 03, 2015 06:11:27 PM Jonathan Anderson wrote:

> > Reproducibility is good for some people, and completely useless for
> > others, and the people who need it aren't going to mind turning on a
> > knob or two to get what they want.
>
> Possibly. I don't have any strong opinions on whether the default is
> "reproducible" or "full of information that helps me identify busted
> kernels", just so long as "reproducible" is available and easy to turn
> on. And my personal opinion is that it should be turned on for public
> releases: I think that being able to validate the kernel is more
> important than knowing what machine it was built on.

FYI, I think most folks agree that releases should be reproducible (and
in particular the release bits that are shipped).  I think the primary
question people have raised is what the default behavior is if someone
is building a kernel themselves vs a kernel from an ISO or freebsd-update.

Secondly, the whole kgdb/crashinfo thing does sort of matter if we want
users to have usable crash summaries when reporting bugs on release
installs.  (crashinfo matters more here than kgdb -n's hackish thing,
and crashinfo just needs 'version' to be unique)

--
John Baldwin