Re: s6 as a systemd alternative from Steve Litt on 2017-06-30 (supervision)

From: Steve Litt <slitt_at_troubleshooters.com>
Date: Fri, 30 Jun 2017 16:38:47 -0400

On Fri, 30 Jun 2017 19:50:17 +0000
"Laurent Bercot" <ska-supervision_at_skarnet.org> wrote:

> >The runsv executable is pretty robust, so it's unlikely to die.
> Yadda yadda yadda. Most daemons are also unlikely to die, so
> following your reasoning, I wonder why we're doing supervision in the
> first place. Hint: we're doing supervision because we are not content
> with "unlikely". We want "impossible".

You want impossible. I'm quite happy with unlikely. With my use
case, rebooting my computer doesn't ruin my whole day. If it *did* ruin
my whole day, my priorities would change and I'd switch to s6.

> > As far as somebody killing it accidentally or on purpose with the
> > kill command, that's a marginal case. But if it were *really*
> > important to protect against, fine, have one link dir per early
> > longrun, and run an individual runsvdir on each of those link
> > directories.
> And you just increased the length of the chain while adding no
> guarantee at all, because now someone can just kill that runsvdir
> first and then go down the chain, like an assassin starting with the
> bodyguards of the bodyguards of the important people. Or the assassin
> might just use a bomb and blow up the whole house in one go: kill -9
> -1.
>
> The main point of supervision is to provide an absolute guarantee
> that some process tree will always be up, no matter what gets killed
> in what order, and even if everything is killed at the same time.

To me, the preceding isn't the main point of supervision. Supervision
benefits I value more are:

* Run my daemon in foreground, so homegrown daemons have no need to
  self-background.
* Consistent and easy handling of log files.
* Under almost all circumstances, dead daemons get restarted.
* Simple config and troubleshooting, lots of test points.
* POSIX methodologies ensure I can easily do special stuff with it.
* Ability to base process dependency on whether the dependee is
  *really* doing its job.
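
To make the first and third bullets concrete, here is a minimal sketch of
a respawn loop in Python. It is not how runsv or s6-supervise are
implemented; the function name supervise and the max_restarts and delay
parameters are assumptions made up for this illustration. The child runs
in the foreground and is started again each time it exits.

```python
import subprocess
import sys
import time


def supervise(argv, max_restarts=None, delay=1.0):
    """Run argv in the foreground and respawn it whenever it exits.

    Returns the list of exit codes seen. max_restarts=None loops forever,
    as a real supervisor would; the limit exists only for demonstration
    (hypothetical parameter, not taken from any supervision suite).
    """
    exit_codes = []
    while True:
        # The child stays in the foreground: no double fork, no pidfile.
        child = subprocess.Popen(argv)
        exit_codes.append(child.wait())
        if max_restarts is not None and len(exit_codes) > max_restarts:
            return exit_codes
        time.sleep(delay)  # short pause so a crash loop does not spin the CPU


if __name__ == "__main__":
    # A stand-in "daemon" that dies immediately with exit status 3.
    flaky = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print(supervise(flaky, max_restarts=2, delay=0.1))
```

Because the daemon never backgrounds itself, the supervisor always knows
the child's PID and learns of its death directly from wait().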

> You
> can only achieve that guarantee by rooting your supervision tree in
> process 1.

Yes.

>
> With runit, only the main runsvdir is supervised - and even then it
> isn't really, because when it dies runit switches to stage 3 and
> reboots the machine. Which is probably acceptable behaviour, but
> still not supervision.

If we're going to get into definitions, then let me start by saying
what I want is daemontools that comes up automatically when the machine
is booted. Whether or not that's supervision isn't something I care
about.

> And everything running outside of that main
> runsvdir is just hanging up in the air - they can be easily killed
> and will not return.

Wellllll, if they kill the runsv that's true, but if they kill the
daemon, no. Either way, I'm willing to live with it.

>
> By adding supervisors to supervisors, you are making probabilistic
> statements, and hoping that nobody will kill all the processes in the
> wrong order. But hope is not a strategy. There is, however, a strategy
> that works 100% of the time, and that is also more lightweight because
> it doesn't require long supervisor chains: rooting the supervision
> tree in process 1. That is what an s6-based init does, and it
> provides real, strong supervision; and unlike with runit, the machine
> is only rebooted when the admin explicitly decides so.

I completely understand your point. I just don't need that level of
indestructibility.
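
The trade-off being argued here can be sketched as a toy model in Python.
It is only an illustration under stated assumptions: the process names
are hypothetical, real signal semantics are ignored, and PID 1 is treated
as something that cannot be killed, which is the premise of the quoted
argument. A dead process comes back only if its supervisor is up.

```python
def survivors(supervisor_of, killed, unkillable=frozenset()):
    """Return the set of processes that are up once respawning settles.

    supervisor_of maps each process name to the name of its supervisor,
    or None if nothing supervises it. killed is the set of processes
    killed at the same instant. unkillable models PID 1 (an assumption
    of this toy model, not a statement about any particular kernel).
    """
    alive = {p for p in supervisor_of if p not in killed or p in unkillable}
    changed = True
    while changed:
        changed = False
        for proc, sup in supervisor_of.items():
            # A dead process is respawned only by a live supervisor.
            if proc not in alive and sup in alive:
                alive.add(proc)
                changed = True
    return alive


if __name__ == "__main__":
    # Tree rooted in PID 1: everything below init is supervised.
    rooted = {"init": None, "scan": "init", "sup": "scan", "daemon": "sup"}
    # Early supervisor started outside any supervised tree.
    loose = {"init": None, "runsv": None, "daemon": "runsv"}

    print(sorted(survivors(rooted, set(rooted), unkillable={"init"})))
    print(sorted(survivors(loose, {"daemon"})))
    print(sorted(survivors(loose, {"runsv", "daemon"})))
```

In the rooted tree every process returns even when all of them are killed
at once. In the loose arrangement the daemon returns if only the daemon is
killed, but nothing returns once its unsupervised runsv is gone, which is
the risk I've said I'm willing to live with.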

>
> If you're not convinced: *even systemd* does better than your
> solution. systemd obviously has numerous other problems, but it does
> the "root the supervision tree in process 1" thing right.

LOL, my whole point is I don't necessarily think "root the supervision
tree in process 1" is right, at least for my use case. I *enjoy* having
a tiny, do-almost-nothing PID1.

Like I said before, if losing control of the system during special
circumstances would ruin my whole day, I'd change my priorities and use
s6.

>
> I appreciate your enthusiasm for supervision suites. I would
> appreciate it more if you didn't stop halfway from understanding
> everything they bring, and if you didn't paint your unwillingness to
> learn more as a technical argument, which it definitely is not, while
> flippantly dismissing contributions from people who know what they
> are talking about.

But I didn't flippantly dismiss anybody or any contributions. I
pointed out that one can, and I'll use different verbiage now, respawn
daemons early in the boot, before some of the one-shots have started.

I'm not an enemy of s6. I'm not an enemy of anything you apply the word
"supervision" to. I think I understand your reasons for doing what you
do. It's just that with my current use case, I've traded some of s6's
process and boot security (you know what I mean) for a simpler PID1 and
a standalone daemon respawner.

If and when I get a use case requiring more durability of processes and
what runs them, I'll for sure use s6 for that.

SteveT

Steve Litt
June 2017 featured book: The Key to Everyday Excellence
http://www.troubleshooters.com/key
Received on Fri Jun 30 2017 - 20:38:47 UTC