Re: s6, listen(8), etc.

From: Laurent Bercot <ska-supervision_at_skarnet.org>
Date: Thu, 1 Sep 2016 14:34:18 +0200

On 01/09/2016 07:52, Daniel Kahn Gillmor wrote:
> I think you might have misunderstood the description of the convention,
> actually. There's no unix domain socket for receiving the descriptors
> at all. Rather, the descriptors are already open and present in the
> process.

  OK, after reading your message, then the sd_listen_fds man page, it
appears that I have indeed misunderstood it. My bad. In my defence,
it really is not clear from that page whether the descriptors are already
open in the process that invokes sd_listen_fds(), or whether sd_listen_fds
actually connects to systemd to receive them. I understood the latter, and
I will agree it was my prejudice speaking; the former is much more
reasonable. So let's dive a bit deeper.


> exec listen -udp::53 \
> -tcp::53 \
> -tcp:label=tls:853 \
> -unix:label=control,mode=0600:/run/kresd/control \
> chpst -u kresd -p 1 \
> /usr/sbin/kresd
>
> This means kresd doesn't need to know about dropping privileges, opening
> listening ports, or resource constraints at all (listen and chpst take
> care of that), but kresd can still retain in-memory state that might be
> useful for handling multiple connections with no exec() ever involved.

  I agree with this principle entirely. This is good design. The question
now becomes: how does the daemon know what fd corresponds to what use?

  The simplest, natural answer, used for instance by UCSPI clients and
servers, is: the fd numbers are hardcoded, they're a part of the daemon
interface and that should be documented.
  So, in your example, the kresd documentation should say something like:

  - fd 3 must be a datagram socket, that's where kresd will read its
UDP DNS queries from.
  - fds 4 and 5 must be listening stream sockets, that's where kresd will
read its TCP DNS queries from. kresd uses two sockets for this in order to
allow one of them to be set up over TLS.
  - fd 6 must be a listening Unix domain stream socket, that's where kresd
will read its control messages from.

  To me, that approach is good enough. But you could reasonably argue that's
unwieldy. So the next step is to have a map, a simple key-value store that
maps identifiers to fd numbers. And the simplest implementation of a
key-value store is the environment. So, the kresd documentation could say
something like this:

  kresd will read the values of the UDP_FD, TCP_FD, TLS_FD and CONTROL_FD
environment variables. Those variables should contain an integer which
represents an open file descriptor. kresd will use the descriptor in
$UDP_FD to receive DNS queries over a datagram socket. etc. etc.
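
  A minimal sketch of what "read an integer in an environment variable"
could look like in C. The function names parse_fd and fd_from_env are
made up for this illustration, and the choice to reject anything that is
not a plain non-negative decimal number is an assumption, not something
kresd is known to do:

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse a string as a non-negative decimal fd number.
   Returns -1 if the string is NULL, empty, has a sign or leading
   whitespace, has trailing garbage, or does not fit in an int. */
int parse_fd(const char *s)
{
    char *end;
    long n;

    if (!s || *s < '0' || *s > '9') return -1;
    errno = 0;
    n = strtol(s, &end, 10);
    if (errno || *end || n > INT_MAX) return -1;
    return (int)n;
}

/* Hypothetical helper: fetch the fd number stored in the environment
   variable `name` (e.g. "UDP_FD"), or -1 if it is unset or invalid. */
int fd_from_env(const char *name)
{
    return parse_fd(getenv(name));
}
```

  A daemon would then call something like fd_from_env("UDP_FD") once at
startup and refuse to run if the result is -1.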

  That's much friendlier to the person who writes the run script for kresd,
and that's also pretty easy to implement for the kresd author: read an
integer in an environment variable. All in all, it sounds like a good
solution when the daemon has more than 2 or 3 fds to handle.

  But, you say, that's not generic enough a mechanism! A process manager
cannot pass a flock of environment variables to a daemon, the names of
which depend on the daemon entirely! It has to settle on a few standardized
environment variable names.
  Well, first, yes, it definitely can pass a flock of environment variables,
that's what a run script is for. I'm pretty sure systemd even has syntactic
sugar to change the daemon's environment before spawning it. With s6,
you can also store your variable set into the filesystem and read it via
s6-envdir.
  Second, even if you don't want to do that, it's a simple map, and a map
is easily passed as a single variable in the environment, provided you
reserve a separator character. Let's say your process manager will pass
the LISTEN_FD_MAP environment variable to daemons, and labels aren't
allowed to use commas:
  LISTEN_FD_MAP=udp=3,tcp=4,tls=5,control=6
  It's a bit more work for the daemon, but you can still relatively easily
extract a map from labels to fd numbers from a single environment variable.
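
  As a rough illustration of that extraction, here is one way to look up
a label in such a string. LISTEN_FD_MAP is only the hypothetical variable
suggested above, fd_map_lookup is a name invented for this sketch, and the
sketch assumes labels contain neither ',' nor '=':

```c
#include <limits.h>
#include <string.h>

/* Look up `label` in a map string such as "udp=3,tcp=4,tls=5,control=6".
   Returns the fd number, or -1 if the map is NULL, the label is absent,
   or its value is not a plain decimal number that fits in an int. */
int fd_map_lookup(const char *map, const char *label)
{
    size_t len = strlen(label);

    while (map && *map) {
        const char *end = strchr(map, ',');
        size_t n = end ? (size_t)(end - map) : strlen(map);

        /* An entry matches if it is "label=" followed by at least one char. */
        if (n > len + 1 && !strncmp(map, label, len) && map[len] == '=') {
            const char *p = map + len + 1;
            const char *stop = map + n;
            int fd = 0;

            for (; p < stop; p++) {
                int d;
                if (*p < '0' || *p > '9') return -1;
                d = *p - '0';
                if (fd > (INT_MAX - d) / 10) return -1;  /* overflow */
                fd = fd * 10 + d;
            }
            return fd;
        }
        map = end ? end + 1 : NULL;
    }
    return -1;
}
```

  The daemon would pass getenv("LISTEN_FD_MAP") as the first argument;
no memory is allocated and the environment string is not modified.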

  OK, now let's have a look at LISTEN_FDS.

  LISTEN_FDS is designed to be usable without LISTEN_FDNAMES, i.e. as a
list. And in that case, you have a single piece of information: the number
of open fds. Which means:
  - the starting fd has to be hardcoded (3)
  - the fds have to be consecutive.
  So here you'd have LISTEN_FDS=4 and the fds would be 3,4,5,6. The daemon
has to hardcode that 3 is udp, 4 is tcp, 5 is tls and 6 is control - it's
exactly as unwieldy for the run script author, and it's less flexible
because neither the daemon nor the script author can even choose what fds
are allocated! The numbers are enforced by the API. This is not good.

  Right, so there's the LISTEN_FDNAMES mechanism to help. With LISTEN_FDNAMES,
we have a map: the daemon gets labels to help it identify the fds. This
avoids any number hardcoding, this makes it more flexible for the run
script author.
  Or does it? The API still constrains you: even if you can provide labels,
you still don't send a map, you send a number of fds and a list of labels.
So you still have to make sure the fds you send are 3, 4, 5, 6! And
if your "listen" program happens to have fd 3 open when it's launched
(and there may be very good reasons to have some fd already open when you
run a daemon), well, it has to overwrite it with the udp socket it opens,
because that's what the API forces you to do. If I want the daemon to run
with some open fd that's not explicitly passed via LISTEN_FDS (for instance
if the daemon doesn't have to know it has it open) then I have to make sure
that my fd is at least 3+$LISTEN_FDS. Talk about unwieldy.

  And from the daemon's side? It has to parse the contents of $LISTEN_FDNAMES,
which is a colon-separated list of labels; that's not really easier than
parsing my suggested LISTEN_FD_MAP string.
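
  For comparison, a sketch of that parsing done by hand, without
libsystemd and without heap allocation. The function name fd_from_fdnames
is invented here; the caller is assumed to have already read the count
from $LISTEN_FDS, and the check of $LISTEN_PID that the real
sd_listen_fds() also performs is left out for brevity:

```c
#include <string.h>

#define LISTEN_FDS_START 3  /* first passed fd, fixed by the convention */

/* Find `label` in a colon-separated names string such as
   "udp:tcp:tls:control", describing `nfds` consecutive descriptors
   starting at fd 3. The i-th name (counting from 0) belongs to fd 3+i.
   Returns the matching fd, or -1 if there is none. */
int fd_from_fdnames(const char *names, int nfds, const char *label)
{
    size_t len = strlen(label);
    int i;

    for (i = 0; names && i < nfds; i++) {
        const char *end = strchr(names, ':');
        size_t n = end ? (size_t)(end - names) : strlen(names);

        if (n == len && !strncmp(names, label, len))
            return LISTEN_FDS_START + i;
        names = end ? end + 1 : NULL;
    }
    return -1;
}
```

  Note that the fd number never appears in the string: it is implied by
the position of the label, which is why the fds must be consecutive and
start at 3.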

  So, LISTEN_FDS doesn't know if it wants to pass a list of hardcoded fd
numbers, or a map with labels; it ends up doing a bit of both, badly, and
getting the worst of both worlds.
  Yeah, typical systemd. You may say I'm biased, but it's the same thing
every time I look at one of their interfaces.

  Oh, and I haven't entirely given up my argument of political agenda.
The only reason you could have for designing such a whimsical interface,
apart from obvious lack of experience in software design, is to encourage
daemon authors to use the sd_listen_fds_with_names() interface (and so,
link against libsystemd), because it's too painful to fully implement it
yourself as a daemon author - you have to handle the case when
LISTEN_FDNAMES is set, the case when it's not set, etc. etc. Nobody in
their right mind would do that every time, so it's much easier to just
call sd_listen_fds_with_names(). One more piece of software that gets
patched just to work with systemd!

  Meanwhile, parsing LISTEN_FD_MAP is what, 15 lines of C code tops?

  Oh, and while we're at it, let's praise sd_listen_fds_with_names for
managing to allocate heap memory, that must be freed by the user, while
parsing LISTEN_FDNAMES, just because it wants to provide an array of
null-terminated strings to the user. The user has to free *every single
element of the array*! Congratulations guys, my daemon hasn't
even started running and it's already eating memory and performing
operations that can fail.

  Conclusion: it's definitely possible to design a WORSE interface to pass
fds to a daemon than LISTEN_FDS, but it would require some creative effort.
Do I want to support that interface, or help you support that interface?
Take a wild guess.


> Even better, kresd can now offer neat tricks like universal DNS
> resolution over a unix-domain datagram socket without any change to
> kresd at all, just an additional line in the ./run script:
> -unix:mode=0666:/run/kresd/query.socket \

  Again, I agree with the principle. I don't agree with your choice
of syntax (because it's a bit painful to programmatically generate) but
that's not important. What is important, however, is the interface you
use to tell daemons which fd is which; and I'm not exactly sold on
LISTEN_FDS. :P


> I'm afraid i don't see that here, sorry :/ All listen is doing is
> opening file descriptors and setting environment variables according to
> a simple convention.

  Yeah, sorry, I was wrong about that. The convention is still bad though.


> Hm, i don't see a way to communicate to your daemon which sockets are to
> be used for what purpose. Using kresd above as an example, what if you
> wanted your daemon to offer dns-over-tls on TCP port 443 as well? what
> if you wanted to offer a second control socket in an unusual location
> with unusual ownership or permissions?

  My approach is simply to say that the daemon should hardcode the fds
it's using and document them. But alternatively, I'm not opposed to the
daemon reading environment variables containing fd numbers.

  I have written a small wrapper so that daemons can avoid using
sd_notify(): http://skarnet.org/software/sdnotify-wrapper.c
  I think I should write a similar wrapper so that daemons can avoid using
sd_listen_fds_with_names(), and can just read an fd map from the environment,
while still being usable under systemd, which uses the LISTEN_FDS
mechanism.
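
  No such wrapper exists in this message, so the following is only a
guess at the core conversion it would need: turning the LISTEN_FDS count
and the LISTEN_FDNAMES list into a label=fd map string of the
LISTEN_FD_MAP kind sketched earlier. The name build_fd_map and the output
format are assumptions made for illustration, and labels are assumed to
contain neither ',' nor '=':

```c
#include <stdio.h>
#include <string.h>

/* Convert a colon-separated names string ("udp:tcp:tls:control") that
   describes `nfds` consecutive fds starting at 3 into a map string
   ("udp=3,tcp=4,tls=5,control=6") written into `out` (size `max`).
   Returns the number of entries written, or -1 if `out` is too small. */
int build_fd_map(const char *names, int nfds, char *out, size_t max)
{
    size_t pos = 0;
    int i;

    if (!max) return -1;
    out[0] = 0;
    for (i = 0; names && i < nfds; i++) {
        const char *end = strchr(names, ':');
        int n = end ? (int)(end - names) : (int)strlen(names);
        int w = snprintf(out + pos, max - pos, "%s%.*s=%d",
                         i ? "," : "", n, names, 3 + i);

        if (w < 0 || (size_t)w >= max - pos) return -1;  /* truncated */
        pos += (size_t)w;
        names = end ? end + 1 : NULL;
    }
    return i;
}
```

  A wrapper built around this would export the resulting string in the
environment and then exec the daemon, leaving the fds themselves
untouched.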

-- 
  Laurent
Received on Thu Sep 01 2016 - 12:34:18 UTC
