Re: [PATCH 0/4] Add info on why process is down to statusfile

From: Olivier Brunel <jjk_at_jjacky.com>
Date: Mon, 19 Jan 2015 01:04:45 +0100

On 01/19/15 00:34, Laurent Bercot wrote:
>
> Hi Olivier !
> Thanks for the patches. A bit of discussion below.
>
>
>> So I wanted to have the return code/signal that occured when a process
>> went down
>> available on the statusfile, as I believe this is useful information
>> to have for
>> all services.
>
> Yes, that makes sense.
>
>
>> Looking into it, I noticed that s6 already did gather that info from
>> wait(), and
>> sends it to ./finish -- that was a nice surprise, but (unless I missed
>> it?)
>> totally undocumented.
>
> I was certain I had documented it somewhere, but I must have been
> wrong, because
> I couldn't find it. So, thanks for the report, and I'll apply that patch or
> update the doc somehow.
>
>
>> First patch fixes the first argument to ./finish, since I assume it
>> was meant to
>> be able to know whether there was a return code or not, but 255
>> actually is a
>> valid return code. Second patch adds a mention of those arguments to
>> the doc.
>
> Yes, I didn't think too much about that, but in hindsight, since it's
> possible
> to use a code outside of the valid range, it's the right thing to do
> indeed.
> What bothers me, though, is that it breaks existing scripts, so it
> requires a
> major (as in second number in my 4-number versioning system) number change,
> loud horns and blinking lights. It can't be the sneaky patch. Totally my
> fault, but compatibility breaks are never fun.
>
>
>> Then third patch does add that info to the statusfile
>
> And here I'm slowing down. I agree it makes sense for the info to be
> there.
> However, wstat is specified as an int, and the mapping of the bits inside
> a wstat is totally unspecified, even if most implementations are the same
> in practice. That means that wstat cannot be stored as an uint32 in the
> status file; if anything, it should be stored as an uint64, to accommodate
> archs where int is 64 bits. Adding 8 bytes to the status file is heavy
> (all things are relative...) I'll probably end up doing that, but I need to
> think about it a bit more first.
>
>
>> and finally as a separate
>> patch I've also added the signal name to s6-svstat as I think it's a
>> nice thing
>> to have.
>
> I'm reluctant to add a number-to-signal-name conversion to a single
> program.
> It really should be a libc function. If anything, I'll add it to
> skalibs, so
> other programs can use the conversion too.

Right, adding this to skalibs sounds like a better idea indeed than svstat.

> More importantly, the s6-svstat output is meant to be parsable, and
> changing
> the numbers into signal names makes parsing harder. So there should be at
> least an option to deactivate the conversion.

Well, with the patch it still shows the number, e.g:
down (signal=15:SIGTERM) 42 seconds, normally up

So it should be still parsable w/out too much difficulty. But there
could be a switch to turn that conversion on/off (after all, ./finish
only gets the number).

> Next time, please express wishes before sending patches - I like design
> discussions and agreement on what to do, I kinda feel trapped or forced
> when
> I get patches, as in "I worked hard on this so you better apply them!".

Sure; though I'm totally fine if you want to discuss them, or would like
me to fix/make changes and re-send, as you see fit. That's what I
expected really.

I did do patches first because I'm used to places where people like to
have/see code, and I felt those were relatively small stuff. Anything
larger I would have asked/talked about first, ofc.

> I have to say your patches are pretty much on-point, though, and almost
> applicable as is, so thank you for that.

You're very welcome :)

-j
Received on Mon Jan 19 2015 - 00:04:45 UTC

This archive was generated by hypermail 2.3.0 : Sun May 09 2021 - 19:44:19 UTC