Hey Laurent,
Over at LQ, I'm working on importing s6 into LFS again, but this time at a slower pace. I also hope to use the native LFS utilities as much as possible and to include only the init-shim tools (halt, shutdown, pause, and the runlevel scripts and binaries) from Runit-For-LFS for low-level system management, to avoid pulling in more extras.
I have had a thought: why not include symlinkable functionality for halt, poweroff, shutdown, and reboot directly in s6-svscanctl, and move s6-pause into s6 itself to simplify the packages? (You could even have a configure switch, --with-s6-pause, to enable or disable it during the build.) Just a suggestion, but no biggie.
Anyways, I'll be posting more frequently about getting init-stage-1/2/3 drafted correctly and in the execline script language. Avery, maybe you can share your notes on this with me as well, if possible.
Thanks,
Jim
Sent from my Windows Phone
________________________________
From: Laurent Bercot<mailto:ska-supervision_at_skarnet.org>
Sent: 1/2/2015 4:59 AM
To: supervision_at_list.skarnet.org<mailto:supervision_at_list.skarnet.org>
Subject: Re: runit-scripts gone, supervision-scripts progress
Hi Avery,
Happy new year to you!
Congratulations on the achievements so far, even if they're not reaching
the bar you set for yourself.
Just a little note:
> + The ./finish concept needs development and refinement.
>
> + Need to incorporate some kind of alerting or reporting mechanism into
> ./finish, so that the sysadmin receives notifications
./finish is a delicate beast. It is not only run when the admin brings
the service down, which is fine, but also when the service stops in an
untimely fashion; and the service cannot start again as long as ./finish
is running. So, if anything time-consuming, or worse, blocking, happens
in ./finish, the service can be totally hosed.
Services should do all their necessary work in ./run, before executing
into the long-lived process: when they are in ./run, it's a known and
manageable state, they are up, even if they are not ready yet. But in
./finish, it's kind of a limbo state that shouldn't be drawn out. The
service is down, but it's still doing something, can't be brought up
right now, etc. Having a service stuck in "finish" state is about as
infuriating as having a process stuck in "D" state on Linux.
s6-supervise has a built-in protection against misbehaving ./finish
scripts: if ./finish is still around after 5 seconds, it kills it.
(With a SIGKILL. When a service is down is not the time to be polite.)
AFAICT, runsv does not have such a protection, which makes it even more
important to pay attention when writing ./finish scripts.
One way or the other, ./finish should only be used sparingly, for clean-up
duties that absolutely need to happen when the long-lived process has died:
removing stale or temporary files, for instance. Those should be brief
operations and absolutely cannot block.
So, if you're implementing reporting in ./finish, make sure you are using
fast, non-blocking commands that just fail (possibly logging an error
message) if they have trouble doing their job.
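A minimal sketch of a ./finish script in that spirit: brief clean-up,
one best-effort report line, nothing that can block. It assumes the
supervisor passes ./run's exit code and status as $1 and $2, as runsv
and s6-supervise are documented to do. The state directory /run/foobar,
the STATE_DIR variable, and the two function names are hypothetical and
only there for illustration.

```shell
#!/bin/sh
# Sketch of a ./finish script: clean up, report once, never wait on anything.

# Remove stale files left behind by the dead process.
# rm -f never prompts and does not complain about missing files.
cleanup_stale() {
    rm -f "$1"/*.pid "$1"/*.tmp 2>/dev/null
    return 0
}

# Emit a single line on stderr and move on. If the write fails, give up
# instead of retrying: a lost report is better than a service stuck in finish.
report_exit() {
    printf 'foobar exited: code=%s signal=%s\n' \
        "${1:-unknown}" "${2:-unknown}" >&2 || true
}

# /run/foobar is a hypothetical default; STATE_DIR overrides it.
cleanup_stale "${STATE_DIR:-/run/foobar}"
report_exit "$@"
```

Where the report line ends up depends on how stderr of the supervisor
is wired on the local system; sending mail or calling a network service
from here would be exactly the kind of blocking work to avoid.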
The way I would implement reporting wouldn't be based on ./finish, but on
an external set of processes listening to down/up/ready notifications in
/service/foobar/event. It would only work with s6, though.
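A rough sketch of that idea, using s6-svwait (which subscribes to the
service's event directory) rather than reading the fifodir directly.
Only the -d and -u options are used. The notify hook, the SVWAIT
variable, and the function names are assumptions made for this
illustration, not part of s6.

```shell
#!/bin/sh
# Sketch of an external reporter: ./finish stays trivial, and a separate
# process waits for state changes of the service instead.

# s6-svwait ships with s6. The variable exists only so the command can be
# swapped out when experimenting (an assumption of this sketch).
SVWAIT=${SVWAIT:-s6-svwait}

# Hypothetical hook: replace the body with mail, syslog, or anything else.
# It runs outside the supervision tree, so it may take its time.
notify() {
    printf '%s is %s\n' "$1" "$2"
}

# Block until the service is down, report, then block until it is up again.
watch_cycle() {
    "$SVWAIT" -d "$1" || return 1
    notify "$1" down
    "$SVWAIT" -u "$1" || return 1
    notify "$1" up
}

# Repeat until s6-svwait itself fails, e.g. the service directory is gone.
watch_service() {
    while watch_cycle "$1"; do :; done
}

# Usage (not run here): watch_service /service/foobar
```

Because each s6-svwait call is a fresh subscription, a service that
dies and restarts between two calls can go unreported; a long-lived
listener on the event directory would close that gap.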
--
Laurent
Received on Fri Jan 02 2015 - 23:42:39 UTC