Re: Linuxisms in s6

From: Jonathan de Boyne Pollard <J.deBoynePollard-newsgroups_at_NTLWorld.com>
Date: Sat, 27 Aug 2016 17:51:55 +0100

Adrian Chadd:

> [...] the uptime stuff really threw us.
>
It's unfair to lay such system time problems at s6's door. Systems whose
system clock jumps 46 years during system bootstrap don't get to blame
s6 for mad time gaps that appear in logs and service start time
records. There is a *lot* of the Unix and Linux worlds that depends
upon time being right. It's not just s6 that is affected by such
things. You note crypto. There are a lot of other things as well that
have unstated, sometimes undocumented, and sometimes surprising
dependencies upon system time being current.

Here's one such.

For quite a while, Linux distributions had rather an odd problem at
bootstrap. They'd repeatedly fsck volumes at every bootstrap when they
need not have. And this didn't affect U.S. or U.K. people, which is in
part why it persisted for so long.

* https://bugs.launchpad.net/ubuntu/+source/e2fsprogs/+bug/63175

* https://bugs.archlinux.org/task/17438

* http://lwn.net/Articles/264498/

The problem was that people were erroneously running their real-time
clocks in local time rather than UTC, and this triggered an odd hidden
dependency upon having the right time in the system clock in countries
where local time was in advance of UTC. The Linux method for handling
RTCs erroneously running in local time is for the system bootstrap to
make a special settimeofday() call that effectively tells the kernel
what the UTC offset is for the RTC hardware. This could happen *after*
the fsck of the root volume, however. So whilst that fsck was
happening, the kernel was assuming that UTC was the local time that it
had taken from the RTC and initialized its system clock with. In
effect, as soon as the special settimeofday() call was executed, the
system clock would jump backwards by one or more hours, to what UTC
actually was.
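
The size and direction of that step can be sketched with a little
arithmetic. This is only a model, not the kernel's code, and it makes no
settimeofday() call; the function names and the UTC+2 example values are
assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def clock_before_warp(rtc_reading: datetime) -> datetime:
    """Before it is told otherwise, the kernel takes the RTC reading as UTC."""
    return rtc_reading


def clock_after_warp(rtc_reading: datetime, local_offset: timedelta) -> datetime:
    """Once the RTC's offset from UTC is known, the clock becomes true UTC."""
    return rtc_reading - local_offset


# Illustrative values: an RTC kept in a UTC+2 local time, reading 14:00.
rtc = datetime(2016, 8, 27, 14, 0, 0)
offset = timedelta(hours=2)

before = clock_before_warp(rtc)
after = clock_after_warp(rtc, offset)
step = after - before

print(before)                     # 2016-08-27 14:00:00
print(after)                      # 2016-08-27 12:00:00
print(int(step.total_seconds()))  # -7200
```

Anything timestamped between those two points carries a time that is two
hours ahead of what the clock says after the step.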

But the ext2/3/4 filesystem format has last checked/mounted/written
timestamps in its superblock. Part of the checking to see whether a
full fsck is needed at bootstrap is comparing them to the current time.
If they are in the future by hours or more, something is clearly wrong,
thinks fsck, and it runs the full check. At bootstrap, when the initial
fsck (of at least the root volume and sometimes other volumes as well)
is run, the system clock is not UTC yet. Comedy results.
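
The comparison described above can be modelled roughly as follows. This
is a sketch of the idea, not e2fsck's actual code: the function name, the
tolerance parameter and the example times are all hypothetical.

```python
from datetime import datetime, timedelta


def timestamps_in_future(superblock_times, now, tolerance=timedelta(0)):
    """Return True if any superblock timestamp lies beyond now + tolerance."""
    return any(t > now + tolerance for t in superblock_times)


# Illustrative values for a UTC+2 locale with the RTC in local time.
true_utc = datetime(2016, 8, 27, 12, 0, 0)
stamped_while_ahead = true_utc + timedelta(hours=2)   # clock not yet stepped
stamped_when_right = true_utc - timedelta(minutes=5)  # clock already UTC

print(timestamps_in_future([stamped_while_ahead], now=true_utc))  # True
print(timestamps_in_future([stamped_when_right], now=true_utc))   # False
```

A timestamp written while the clock still held local time looks like it
comes from the future once the clock has been stepped back, and that is
enough to make the checker suspicious.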

Both systemd and the nosh system manager have to ensure that they do the
special settimeofday() system call before they start off service
management and thus run mount/fsck services, or indeed anything else
that might have a closet dependency upon the system time not being
stepped by hours partway through bootstrap. The nosh system-manager's
manual page has a whole section on this subject.

FreeBSD/PC-BSD has a mechanism for correctly reading an RTC that is
erroneously in local time. One sets up the RTC's offset from UTC in the
machdep.adjkerntz variable in /boot/loader.conf{,.local} and the system
clock never has to jump by hours during bootstrap. I've yet to
experience a FreeBSD/PC-BSD system where the installer actually
configures this, though.
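
As a rough illustration of what such a setting might look like, the
sketch below computes a value and formats a loader.conf line. It assumes
that machdep.adjkerntz is the number of seconds to add to the RTC reading
to obtain UTC (so negative east of Greenwich); check that sign convention
against the system's own documentation before relying on it. The helper
names are hypothetical, and a fixed value takes no account of
daylight-saving changes.

```python
from datetime import timedelta


def adjkerntz_seconds(local_offset: timedelta) -> int:
    """Seconds to add to a local-time RTC reading to reach UTC (assumed sign)."""
    return -int(local_offset.total_seconds())


def loader_conf_line(local_offset: timedelta) -> str:
    """Format the setting in loader.conf's name="value" style."""
    return f'machdep.adjkerntz="{adjkerntz_seconds(local_offset)}"'


print(loader_conf_line(timedelta(hours=2)))   # machdep.adjkerntz="-7200"
print(loader_conf_line(timedelta(hours=-5)))  # machdep.adjkerntz="18000"
```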

Interestingly, FreeBSD/PC-BSD also has a fallback mechanism that uses
the latest volume mount timestamp that it can find as the initial system
time when no hardware clock device registers at bootstrap. Presumably
you have a clock device that registers but it is not battery-backed,
your volumes don't preserve (or they reset) their mount timestamps, or you
are encountering the comedy situation where FreeBSD/PC-BSD is setting
the system clock to 1970-01-01 because the last time around it mounted
the filesystems before the clock was corrected.
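
The fallback and its failure mode can be modelled like this. It is a
sketch of the behaviour as described above, not FreeBSD's code; the
function name and the example timestamps are hypothetical.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def initial_system_time(rtc_reading, mount_timestamps):
    """Prefer a hardware clock; otherwise use the latest mount timestamp."""
    if rtc_reading is not None:
        return rtc_reading
    # With no clock device and no timestamps at all, fall back to the epoch.
    return max(mount_timestamps, default=EPOCH)


# No clock device, but volumes that remember when they were last mounted.
sane = [
    datetime(2016, 8, 20, 9, 0, tzinfo=timezone.utc),
    datetime(2016, 8, 26, 18, 30, tzinfo=timezone.utc),
]
print(initial_system_time(None, sane))   # 2016-08-26 18:30:00+00:00

# No clock device, and volumes last mounted before the clock was corrected.
stale = [EPOCH, EPOCH]
print(initial_system_time(None, stale))  # 1970-01-01 00:00:00+00:00
```

In the second case the fallback faithfully reproduces the wrong time,
because the only evidence that it has was itself recorded with a clock
that had not been set.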