nosh init system

Re: nosh init system

Warner Losh
On Sun, Feb 10, 2019, 11:34 AM Cy Schubert <[hidden email]> wrote:

> In message <[hidden email]>, Enji Cooper writes:
> > On Feb 9, 2019, at 20:20, Rodney W. Grimes <[hidden email]> wrote:
> >
> > >> In message <[hidden email]>, Conrad Meyer writes:
> > >>> Hi Cy,
> > >>>
> > >>>> On Sat, Feb 9, 2019 at 3:35 PM Cy Schubert <[hidden email]> wrote:
> > >>>> I don't see what's so "incredibly fragile" about rc(8). That's not to
> > >>>> say there aren't better solutions, like SMF.
> > >>>
> > >>> Maybe "incredibly" as a choice of adjective is inappropriate.  I think
> > >>> we (you, me, and ngie@) can all agree it is somewhat fragile, and
> > >>> there are things SMF/systemd/nosh get right that rc(8) does not
> > >>> (today).  Anyway, your next paragraph goes on to be a good start at
> > >>> describing some of rc's fragility.  :-)
> > >>>
> > >>>> Where rc(8) falls down is any port or a customer's (user of FreeBSD) rc
> > >>>> script could fail hosing the boot or worse hosing the system*. Where a
> > >>>> solution like SMF solves the problem is that should a service which
> > >>>> other services depend on fail, only that branch of the startup tree
> > >>>> would fail.
> > >>>
> > >>> Right; that's a great example.
> > >>>
> > >>>> In that scenario, if a service fails but sshd starts, a
> > >>>> sysadmin would still be able to login remotely to resolve the problem.
> > >>>> So in this regard rc(8) is at a disadvantage.
> > >>>>
> > >>>> We could address the above paragraph by starting sshd earlier during
> > >>>> boot thereby allowing the opportunity to fix remotely.
> > >>>
> > >>> I don't think that is really sufficient without substantially
> > >>> modifying init+rc to be closer to something like systemd or SMF,
> > >>> anyway.  And then we'd rather just have something like SMF :-).
> > >>
> > >> I'd rather see SMF but a number felt a CDDL licensed init was
> > >> unacceptable -- except for the fact that SMF doesn't replace init.
> > >>
> > >>>
> > >>> As soon as *any* rc service fails to start (signal, non-zero exit,
> > >>> stop_boot), rc(8) exits non-zero, causing init(8) to go to single
> > >>> user.  All service state is thrown away with rc(8) exit, but any rc.d
> > >>> "services" that managed to start before boot failed are not
> > >>> terminated.  Even if an admin manages to log in and fix the
> > >>> configuration, re-starting rc(8) restarts the runcom process from
> > >>> scratch, as if nothing had already been done, without first stopping
> > >>> anything that was already running.  The only safe, reproducible way to
> > >>> re-start rc(8) is to fully reboot the system.
> > >
> > > It -should- be safe to restart rc, as rc scripts should check to
> > > see if the item they are being requested to start is already running,
> > > rc scripts that fail to have this check are defective and should be
> > > fixed.  You should be able to invoke /etc/rc.d/foo start as many
> > > times as you want in a row and only get 1 instance of foo, with the
> > > other starts returning "foo already running"   Same with stop.
> >
> > I’m not sure if Conrad is referring to the Isilon way of restarting services.
> > If so, the Isilon parallel start process would effectively wipe the slate
> > clean and restart everything if interrupted, which (because of the nature of
> > cleanvar, etc), would wipe out any and all pidfiles, resulting in a weird
> > set of services which fail to start on next run through.
> >
> > In short, I think the fact that Isilon didn’t mount tmpfs to /var/run was
> > begging for pain, as it’s a directory one should only setup once at boot.
>
> Regardless of whether they use tmpfs or not, services should be
> constructed in a manner such that it should still work if the customer
> chooses not to use tmpfs.
>

Correct. If we require this, that's a bug.

> This also goes for those who mount /usr separately like I do (which has
> saved my bacon as recently as a couple of weeks ago). A change made to
> one of the RC scripts assumed /usr was on rootfs. (When I raised the
> issue the reply was "you should have /usr on / anyway.") My point is that we
> assume our way of setting up a server is the only way and we bulldoze.
> In reality FreeBSD and prior to that commercial UNIX were set up
> variously. It's only since Linux became so popular that it has been
> assumed that one size fits all.
>
> These are two examples of why this approach doesn't work. POLA is
> painful.
>

This would also be a bug. I'd just fix the bug. I know people don't want to
think of these things, but we still support separate filesystems. Saying
not to run that way is lame and unhelpful.

>
> > That being said, there are other pseudo services that aren’t necessarily
> > idempotent. If they run twice, the second run could result in breakage to
> > other dependent services run after them.
>
> Cleanvar being the focus of much of our discussion should be able to
> determine it has run before.
>
> I'm purposely not discussing implementation details.
>

Yeah. That's also a sloppy bug. In this case, there is no concept of
restarting... we want to run it only once... maybe that is the real bug
here: we don't adequately have a way to express that notion.
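
As an illustration of the "run it only once" notion, here is a minimal Python sketch rather than anything rc(8) actually does. The `run_once` helper and the marker path are hypothetical. It assumes the marker lives somewhere that is cleared at reboot and is not itself wiped by the action being guarded.

```python
import os
import tempfile


def run_once(marker_path, action):
    """Run action() only if marker_path does not exist yet.

    Returns True if the action ran, False if it was skipped.
    """
    try:
        # O_CREAT | O_EXCL makes creation atomic: only one caller can win.
        fd = os.open(marker_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False  # someone already ran (or is running) the action
    os.close(fd)
    try:
        action()
    except Exception:
        # Do not leave a marker behind for work that did not complete.
        os.unlink(marker_path)
        raise
    return True


if __name__ == "__main__":
    # Hypothetical marker location, used here only for demonstration.
    marker = os.path.join(tempfile.mkdtemp(), "cleanvar.done")
    print(run_once(marker, lambda: print("cleaning")))  # action runs
    print(run_once(marker, lambda: print("cleaning")))  # action is skipped
```

The marker is created before the action runs so that two concurrent callers cannot both proceed. Whether that trade-off suits a one-shot boot step is a design decision this sketch does not settle.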

Of course the bigger issue is that this is the sort of thing you want to be
100% sure is done before anything that depends on it runs. When you have a
complicated topology like our start graph, that makes doing stuff in
parallel hard.
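
The ordering constraint can be sketched as grouping services into "waves": everything in a wave may start in parallel, and a wave starts only after all earlier waves have finished. This is a rough Python illustration using the standard `graphlib` module. The service names are real rc.d script names, but the dependency edges are assumed for the example and are not taken from the actual REQUIRE lines.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical dependency graph: service -> set of services it requires.
EXAMPLE_DEPS = {
    "mountcritlocal": set(),
    "netif": set(),
    "cleanvar": {"mountcritlocal"},
    "syslogd": {"cleanvar"},
    "sshd": {"cleanvar", "netif"},
}


def start_waves(deps):
    """Group services into waves that could be started in parallel.

    A service appears in a wave only after everything it requires
    has appeared in an earlier wave.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for stable output
        waves.append(ready)
        sorter.done(*ready)  # pretend the whole wave finished starting
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(start_waves(EXAMPLE_DEPS), start=1):
        print(number, wave)
```

With the example graph, cleanvar lands alone in the second wave, so nothing that depends on it can start until it has completed. The hard part in practice is that the real graph is much larger and a wave cannot be marked done until every member has truly finished.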

Warner

--

> Cheers,
> Cy Schubert <[hidden email]>
> FreeBSD UNIX:  <[hidden email]>   Web:  http://www.FreeBSD.org
>
>         The need of the many outweighs the greed of the few.
>
>
> _______________________________________________
> [hidden email] mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "[hidden email]"
>

Re: nosh init system

Rodney W. Grimes-4
> On Sun, Feb 10, 2019, 11:34 AM Cy Schubert <[hidden email]> wrote:
>
> > Regardless of whether they use tmpfs or not, services should be
> > constructed in a manner such that it should still work if the customer
> > chooses not to use tmpfs.
> >
>
> Correct. If we require this, that's a bug.
>
> > This also goes for those who mount /usr separately like I do (which has
> > saved my bacon as recently as a couple of weeks ago). A change made to
> > one of the RC scripts assumed /usr was on rootfs. (When I raised the
> > issue the reply was "you should have /usr on / anyway.") My point is that we
> > assume our way of setting up a server is the only way and we bulldoze.
> > In reality FreeBSD and prior to that commercial UNIX were set up
> > variously. It's only since Linux became so popular that it has been
> > assumed that one size fits all.
> >
> > These are two examples of why this approach doesn't work. POLA is
> > painful.
> >
>
> This would also be a bug. I'd just fix the bug. I know people don't want to
> think of these things, but we still support separate filesystems. Saying
> not to run that way is lame and unhelpful.

Then I'll don my Nomex and jump in: separate /usr is
rather seriously broken and neglected, to the point that diskless
booting with separate /usr is marginal, and I actually gave
up fighting it and merged my / and /usr on the diskless server.

I would really like to see this fixed so I can undo that merge.

> > > That being said, there are other pseudo services that aren't
> > > necessarily idempotent. If they run twice, the second run could result in
> > > breakage to other dependent services run after them.
> >
> > Cleanvar being the focus of much of our discussion should be able to
> > determine it has run before.
> >
> > I'm purposely not discussing implementation details.
> >
>
> Yeah. That's also a sloppy bug. In this case, there is no concept of
> restarting... we want to run it only once... maybe that is the real bug
> here: we don't adequately have a way to express that notion.
>
> Of course the bigger issue is that this is the sort of thing you want to be
> 100% sure is done before anything that depends on it runs. When you have a
> complicated topology like our start graph, that makes doing stuff in
> parallel hard.

We do not have to wait for fsck any more, which was a huge upside;
even parallel fsck was at the mercy of your largest partition.

Doesn't the OpenRC thing have this parallel startup stuff in it?
And what happened to that FCP to move forward on that?
Did it end up in the "lacks enough round tuits" basket?

--
Rod Grimes                                                 [hidden email]

Re: nosh init system

Conrad Meyer-2
In reply to this post by Rodney W. Grimes-4
On Sat, Feb 9, 2019 at 8:20 PM Rodney W. Grimes
<[hidden email]> wrote:
> It -should- be safe to restart rc, as rc scripts should check to
> see if the item they are being requested to start is already running,

It isn't, as described in detail in the email Cy replied to.

There is some difficulty in making scripts idempotent even in
relatively happy cases; not everything has a pid file (and pid files
are not a particularly robust system anyway).  Even harder are weird
corner cases like interrupted and resumed boot, that are rarely or
never tested.  Shell is just a poor language for any sophisticated
behavior or robust error handling.
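
For concreteness, the pid-file style of "already running" check discussed above might look like the Python sketch below. All function names are hypothetical and this is not how rc.subr implements it. It deliberately shows the weakness too: a stale pid file whose pid has been reused by an unrelated process will be misread as "already running".

```python
import os
import tempfile


def read_pidfile(path):
    """Return the pid stored in path, or None if missing or malformed."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (FileNotFoundError, ValueError):
        return None


def pid_is_alive(pid):
    """Return True if some process with this pid currently exists."""
    if pid <= 0:
        return False  # 0 and negative values address process groups
    try:
        os.kill(pid, 0)  # signal 0 performs the checks without signalling
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def start_if_needed(pidfile, launch):
    """Call launch() unless the pid file points at a live process.

    launch() must start the service and return its pid.
    """
    pid = read_pidfile(pidfile)
    if pid is not None and pid_is_alive(pid):
        return "already running"  # may be wrong if the pid was reused
    new_pid = launch()
    with open(pidfile, "w") as f:
        f.write(f"{new_pid}\n")
    return "started"


if __name__ == "__main__":
    pidfile = os.path.join(tempfile.mkdtemp(), "foo.pid")
    # os.getpid stands in for a real launcher in this demonstration.
    print(start_if_needed(pidfile, os.getpid))
    print(start_if_needed(pidfile, os.getpid))
```

Even this small version needs three separate error paths, which gives some feel for why the same logic in shell tends to be fragile.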

Conrad

Restarting netif, static addressing/routing (was: nosh init system)

grahamperrin
In reply to this post by Enji Cooper
On 10/02/2019 03:17, Enji Cooper wrote:

 > … Try restarting netif if you have a static route set; it will break
 > routing (until you restart the routing pseudo service), …

Ha, I imagined that it was _me_ breaking things :-)

Somewhere (on the disk of a notebook that was thrown from a window
(don't ask)) I have a long, probably very messy script that used to help
me regain Internet access after e.g. switching from eduroam to a wired
'staff' network with static addressing.
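
One way to think about the netif/routing breakage is that restarting a service should also restart whatever depends on it, in dependency order. The Python sketch below shows that idea only. The `restart_plan` helper and the dependency edges are assumptions for illustration, not the real rc.d REQUIRE data.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical dependency graph: service -> set of services it requires.
EXAMPLE_DEPS = {
    "netif": set(),
    "routing": {"netif"},
    "sshd": {"routing"},
    "syslogd": set(),
}


def restart_plan(service, deps):
    """Return service plus everything that transitively depends on it,
    ordered so that each entry comes after the services it requires."""
    affected = {service}
    changed = True
    while changed:  # grow the set until no new dependents are found
        changed = False
        for name, requires in deps.items():
            if name not in affected and requires & affected:
                affected.add(name)
                changed = True
    # Keep only the edges between affected services, then order them.
    subgraph = {name: deps.get(name, set()) & affected for name in affected}
    return list(TopologicalSorter(subgraph).static_order())


if __name__ == "__main__":
    print(restart_plan("netif", EXAMPLE_DEPS))
```

In this example graph, restarting netif pulls in routing and then sshd, while syslogd is left alone because nothing links it to netif.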

Re: nosh init system

Julian Elischer-5
In reply to this post by freebsd-hackers mailing list
On 2/8/19 12:50 PM, Sidju via freebsd-hackers wrote:

> Hi everyone.
>
> I might be missing something since I have only been in the group for a few months, but is anyone looking at the "nosh" init system ( https://jdebp.eu/Softwares/nosh/ )?
> From what I have read there is some talk of writing a new init system; is nosh known to be bad in some way or just obscure (it did take me a decent while to find)?
>
> From what I can find it is aiming to fill a systemd-shaped hole in a better way while maintaining compatibility with BSD. I am not exceptionally read in and may be missing some pitfalls.
>
> I am curious what you think of it.
>
> /Sidju
>
So I see that no one answered the question.
Why are we as a group completely ignoring the work the author of nosh is
doing if no one has a clue as to why he's doing it or what the supposed
advantages and problems would be? For that matter, what about the Apple
stuff?

It would at least be worth taking a good look at it and discussing it
here.


Re: nosh init system

grahamperrin
On 12/02/2019 02:20, Julian Elischer wrote:

 > … Why are we as a group completely ignoring the work the author of nosh is
 > doing if no one has a clue as to why he's doing it or what the supposed
 > advantages and problems would be. …

Some clues may be found at/around <http://jdebp.eu./about-the-site.html>.

I bookmarked <https://forums.pcbsd.org/thread-20225.html> in relation to
nosh, but sadly, per
<https://web.archive.org/web/20180218153844/https://forums.pcbsd.org/announcement-3.html>,
the forum was taken down, and (sorry) the required post seems not to be
among the things that I captured before the take-down.

Re: nosh init system

grahamperrin
Also amongst my bookmarks (not recent, but maybe of interest):

<https://www.freebsd.org/news/status/report-2015-10-2015-12.html#The-nosh-Project>
– ignore the ntlworld links

<https://www.freebsd.org/news/status/report-2017-07-2017-09.html#The-nosh-Project>

Welcome openrc . The good question is what happened to launchd , runit
and , and... | Hacker News
<https://news.ycombinator.com/item?id=13453068>

Re: nosh init system

Mark Martinec-6
For perspective, it's worth listening to a recent talk by a FreeBSD
developer at a Linux conference:

Benno Rice: The Tragedy of systemd

   linux.conf.au 2019, Christchurch, New Zealand, published on Jan 24, 2019

https://youtu.be/o_AIw9bGogo



Mark