systemd

systemd - An alternative boot manager

systemd is a system and session manager for Linux, compatible with SysV and LSB init scripts. systemd provides aggressive parallelization capabilities, uses socket and D-Bus activation for starting services, offers on-demand starting of daemons, keeps track of processes using Linux cgroups, supports snapshotting and restoring of the system state, maintains mount and automount points and implements an elaborate transactional dependency-based service control logic.

See the systemd home page for further information.

Warning! Experimental code

systemd is under active development. As of January 2012 it has shipped in Fedora 15 and 16, openSUSE 12.1 and Mageia 2 (Alpha).

Current versions of systemd require a customized kernel and a non-standard init process. Both are intrusive changes.

Installing systemd in Ubuntu may limit the amount of help and support available to you. If you have a commercial support agreement then installing systemd would almost certainly invalidate it. Even if you rely on forums etc, you will probably have to reproduce problems on a standard Ubuntu build before anyone can help you much.

If you want to quickly try out systemd, it may be a good idea to create a "sandpit" system for the purpose (e.g. a virtual machine that you can easily re-install or delete afterwards). If you are installing systemd on a system containing data that you care about, please take a full backup first, and make a plan for restoring from backup in the event that the system ends up unbootable.

Status as at December 2010

systemd on Ubuntu can boot the system to the desktop with networking configured and a reasonable number of services running. Some services are not started. This is because they have been converted to "native" upstart jobs in Ubuntu. systemd parses traditional sysvinit boot scripts (under /etc/init.d) but does not parse native upstart jobs (under /etc/init).

Package systemd-extra-units ships additional units (configuration files) that replace the native upstart jobs. Additional services could be started by adding further units to this package.

Ubuntu 12.04 running systemd (termcast)

Ubuntu 12.10 will NOT ship systemd as the default init system

Personal Package Archive location

systemd and related packages are available on this PPA. To use the PPA, first add it to your software sources list as follows.

add-apt-repository ppa:pitti/systemd
apt-get update

Kernel requirements

systemd requires the directory /sys/fs/cgroup as a mountpoint. It doesn't exist in the current Ubuntu kernel (2.6.35). It can't be created with mkdir either because sysfs doesn't allow that. To create the directory this patch from the 2.6.36 kernel must be backported.

The following kernel config options must be selected. The standard Ubuntu kernel meets these config requirements.

General setup  --->
     [*] Control Group support
Device Drivers --->
     Generic Driver Options  --->
          [*] Maintain a devtmpfs filesystem to mount at /dev
          [*]   Automount devtmpfs at /dev, after the kernel mounted the rootfs (NEW)
File systems --->
     < > Kernel automounter support
     <*> Kernel automounter version 4 support (also supports v3)

A suitably patched kernel is available on the PPA. You can install it as follows.

apt-get install linux-image-2.6.35-23-generic=2.6.35-23.41ppa1 linux-headers-2.6.35-23-generic=2.6.35-23.41ppa1 linux-headers-2.6.35-23=2.6.35-23.41ppa1

If you prefer to build your own kernel you will need to apply the patch mentioned above and select the appropriate config options.
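Whether a given kernel has the options listed above selected can be checked against its config file. The following is a minimal sketch, not a tool shipped with any package: the function name check_systemd_kconfig is made up for illustration, the CONFIG_ symbols are the ones these menu entries are assumed to map to in kernels of this era, and accepting autofs4 as a module is also an assumption.

```shell
# Report which of the kernel options needed by systemd are missing from a
# kernel config file. Returns 0 if all are present, 1 otherwise.
# Usage: check_systemd_kconfig "/boot/config-$(uname -r)"
check_systemd_kconfig() {
    config=$1
    missing=0
    # These three must be built in.
    for opt in CONFIG_CGROUPS CONFIG_DEVTMPFS CONFIG_DEVTMPFS_MOUNT; do
        if ! grep -q "^${opt}=y\$" "$config"; then
            echo "missing: $opt"
            missing=1
        fi
    done
    # Assumption: autofs4 may be built in (y) or built as a module (m).
    if ! grep -Eq '^CONFIG_AUTOFS4_FS=[ym]$' "$config"; then
        echo "missing: CONFIG_AUTOFS4_FS"
        missing=1
    fi
    return $missing
}
```

The check only covers the config options; the /sys/fs/cgroup patch mentioned above still has to be verified separately.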

Installing systemd

systemd can be installed from the PPA as follows.

apt-get install systemd libpam-systemd systemd-gui systemd-extra-units

This results in systemd being installed alongside upstart. If you were going to try replacing upstart altogether then you would need the additional package systemd-sysv, which provides replacements for commands such as reboot.

Boot loader configuration

After installation, the machine will still boot under upstart by default. To boot under systemd, the following argument must be specified on the kernel command line:

init=/lib/systemd/systemd

Note that the systemd binary resides in /lib/systemd (/bin/systemd is a symlink to it).

To boot under systemd by default, edit /etc/default/grub and change the GRUB_CMDLINE_LINUX_DEFAULT line so that it includes the init= argument, for example:

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash init=/lib/systemd/systemd"

After modifying any GRUB configuration file such as /etc/default/grub, run the following command to bring the changes into effect.

update-grub
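After rebooting, it can be useful to confirm that the init= argument actually reached the kernel. The sketch below extracts it from a kernel command line string; the function name init_from_cmdline is hypothetical, and on a running system the string would typically come from /proc/cmdline.

```shell
# Print the value of the last init= argument in a kernel command line,
# or print nothing if no such argument is present.
# Usage on a running system: init_from_cmdline "$(cat /proc/cmdline)"
init_from_cmdline() (
    set -f          # no globbing while the string is split into words
    init=
    for arg in $1; do
        case $arg in
            init=*) init=${arg#init=} ;;
        esac
    done
    if [ -n "$init" ]; then
        echo "$init"
    fi
)
```

An empty result means the kernel fell back to its default init, which on a system set up as described here is upstart.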

/etc/mtab

systemd prints the following warning on boot:

/etc/mtab is not a symlink or not pointing to /proc/self/mounts. This is not supported anymore. Please make sure to replace this file by a symlink to avoid incorrect or misleading mount(8) output.

It is advisable to do as suggested and replace /etc/mtab. It is not only mount that will behave incorrectly otherwise, but also df and probably most other commands that look at the list of mounted filesystems. This change can be made as follows.

ln -fs /proc/self/mounts /etc/mtab
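To verify the result, or to check a system before changing it, a small test along the following lines may help. This is only a sketch: the function name mtab_is_symlinked is made up, and it compares the link target literally, so a relative link that happens to resolve to the same file would be reported as wrong.

```shell
# Succeed if the given path (default: /etc/mtab) is a symlink whose target
# is exactly /proc/self/mounts.
mtab_is_symlinked() {
    mtab=${1:-/etc/mtab}
    [ -L "$mtab" ] && [ "$(readlink "$mtab")" = /proc/self/mounts ]
}

# Example:
#   mtab_is_symlinked || echo "/etc/mtab still needs to be replaced"
```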

Using systemd

Booting

To boot under systemd, select the boot menu entry that you created for the purpose. If you didn't bother to create one, just select the entry for your patched kernel, edit the kernel command line directly in grub and add init=/bin/systemd.

If a normal boot under systemd is not successful then it is worth trying with the following parameters:

init=/bin/systemd systemd.unit=emergency.service
  • systemd.unit= specifies the target state that the system should boot to (similar to specifying a run level under sysvinit).

  • emergency.service launches an emergency bash shell on the console without attempting to start any other services.

Controlling systemd once booted

The main command used to control systemd is systemctl. Some of its subcommands are as follows.

  • systemctl list-units - List all units (where unit is the term for a job/service)

  • systemctl start [NAME...] - Start (activate) one or more units
  • systemctl stop [NAME...] - Stop (deactivate) one or more units
  • systemctl enable [NAME...] - Enable one or more unit files
  • systemctl disable [NAME...] - Disable one or more unit files
  • systemctl reboot - Shut down and reboot the system

For the complete list, see systemctl(1).

systemadm is the GUI equivalent to systemctl, if you like that sort of thing.

Remote filesystem mounts

If you have NFS mounts listed in /etc/fstab then systemd will attempt to mount them but will typically do so too early, before networking has been configured. To get the timing correct we need to tell systemd explicitly that the mount depends on networking and on rpc.statd. To do this, create a file under /lib/systemd/system named <mount-unit-name>.mount with contents as follows.

[Unit]
Description=<mountpoint>
Wants=network.target statd.service
After=network.target statd.service

[Mount]
What=<server>:<share>
Where=<mountpoint>
Type=nfs
StandardOutput=syslog
StandardError=syslog

In the above

  • mount-unit-name is the full path to the mountpoint in an escaped format. For example, a mount unit for /usr/local must be named usr-local.mount.

  • mountpoint is the local mountpoint

  • server:share specify the remote filesystem in the same manner as for /etc/fstab

See systemd.unit(5) and systemd.mount(5) for further details.

A similar approach will probably be required for other remote filesystem types such as nfs4 and cifs.
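The naming rule for mount units described above can be sketched as a small shell function. This is a simplified illustration under stated assumptions: the name mount_unit_name is hypothetical, and only the "/" separators are translated. systemd additionally escapes certain characters inside path components (a literal "-", for instance), which this sketch does not attempt.

```shell
# Derive a mount unit file name from an absolute mountpoint path.
# Simplification: only "/" is translated to "-"; characters that systemd
# would escape inside a path component are passed through unchanged.
mount_unit_name() {
    path=${1#/}      # drop the leading slash
    path=${path%/}   # drop a trailing slash, if present
    if [ -z "$path" ]; then
        path=-       # the root filesystem is named "-.mount"
    fi
    echo "$(printf '%s' "$path" | tr / -).mount"
}

# Example: mount_unit_name /usr/local
```

Newer systemd releases ship a systemd-escape tool that performs the full escaping; where it is available it is the safer choice.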

Known issues

This kernel panic is seen sometimes when running with systemd. It occurs most commonly on shutdown.

[   79.173073] BUG: unable to handle kernel NULL pointer dereference at 0000001c
[   79.175071] IP: [<c01443b4>] set_next_entity+0x14/0xe0
[   79.176046] *pde = 2d918067 *pte = 00000000
[   79.176046] Oops: 0000 [#1] SMP
[   79.176046] last sysfs file: /sys/module/pcie_aspm/parameters/policy
[   79.176046] Modules linked in: snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm binfmt_misc snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device snd soundcore snd_page_alloc ppdev parport_pc i2c_piix4 parport psmouse serio_raw autofs4 floppy pcnet32 mii [last unloaded: vboxguest]
[   79.176046]
[   79.176046] Pid: 1138, comm: systemctl Not tainted 2.6.35-22-generic #35ppa1 /VirtualBox
[   79.176046] EIP: 0060:[<c01443b4>] EFLAGS: 00010082 CPU: 0
[   79.176046] EIP is at set_next_entity+0x14/0xe0
[   79.176046] EAX: ef1ea720 EBX: ef1ea720 ECX: edbed19c EDX: 00000000
[   79.176046] ESI: 00000000 EDI: 00000000 EBP: e3927ee4 ESP: e3927ed0
[   79.176046]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[   79.176046] Process systemctl (pid: 1138, ti=e3926000 task=e396f230 task.ti=e3926000)
[   79.176046] Stack:
[   79.176046]  133e53b7 00000000 ef1ea720 00000000 00000000 e3927efc c014457d c5808700
[   79.176046] <0> 00000000 c5808700 00000000 e3927f54 c05c6bb1 c08c3700 c08c3700 e396f230
[   79.176046] <0> c05d8aa0 c08c3700 c08c3700 00000001 e3927f28 c08c3700 c08c3700 c016abff
[   79.176046] Call Trace:
[   79.176046]  [<c014457d>] ? pick_next_task_fair+0xbd/0x100
[   79.176046]  [<c05c6bb1>] ? schedule+0x4f1/0x7a0
[   79.176046]  [<c016abff>] ? switch_task_namespaces+0x1f/0x60
[   79.176046]  [<c014f079>] ? do_exit+0x1b9/0x340
[   79.176046]  [<c014f23e>] ? do_group_exit+0x3e/0xa0
[   79.176046]  [<c014f2b8>] ? sys_exit_group+0x18/0x20
[   79.176046]  [<c05c9114>] ? syscall_call+0x7/0xb
[   79.176046] Code: e4 83 45 ec 08 8b 45 ec 8b 00 85 c0 89 45 e8 75 d8 e9 77 ff ff ff 90 55 89 e5 83 ec 14 89 5d f4 89 75 f8 89 7d fc 0f 1f 44 00 00 <8b> 4a 1c 89 c6 89 d3 85 c9 0f 85 95 00 00 00 8b 46 40 8b 90 4c
[   79.176046] EIP: [<c01443b4>] set_next_entity+0x14/0xe0 SS:ESP 0068:e3927ed0
[   79.176046] CR2: 000000000000001c
[   79.176046] ---[ end trace 19d7c4bdf6f52705 ]---
[   79.176046] Fixing recursive fault but reboot is needed!

The fault is in the process scheduler code and appears to be related to destroying a cpu cgroup. Miklos Vajna reproduced a very similar panic on a recent kernel.org kernel and reported it on LKML; a fix has since been merged upstream.

Implementation issues

Packaging of units

A unit properly belongs in the package of the daemon that it starts. systemd units are already included in some upstream packages and some Debian Experimental packages. They will probably appear in Ubuntu packages in due course, unless Ubuntu maintainers deliberately remove them.

The systemd-extra-units package is intended only to make systemd usable in the short term by shipping some important units that don't yet exist in other Ubuntu packages. It should be scaled back as and when units start to appear in their proper places, and should eventually be dropped.

Dependencies on things not yet available in Ubuntu

Upstream systemd depends on recent upstream changes to several other packages that are not yet available in Ubuntu. These have been worked around by reverting the relevant changes in systemd or disabling the relevant feature. In brief these are:

  • systemd 15 wants libnotify >= 0.7 and vala 0.11

    • Relevant changes have been reverted in order to build with libnotify 0.5.0 and vala 0.9.
  • systemd wants /sbin/agetty but Debian/Ubuntu renames it to /sbin/getty

    • /sbin/agetty will be added as a link in Debian wheezy (Debian bug 603786).

  • systemd invokes agetty -s where the option -s was added by recent upstream commits in util-linux.

    • Units getty@.service and serial-getty@.service have been patched to remove this option.

  • systemd-fsck invokes fsck -l where the option -l was added by recent upstream commits in util-linux

    • systemd-fsck has been patched to remove this option.

Networking

NetworkManager.service just starts NetworkManager and does not wait for it to bring up a network interface. This seems appropriate, but network.target should not become active until a network connection is up. An extra unit, wait-for-network.service, has been created to delay network.target.

The appropriate way to wait for a network connection with NetworkManager appears to be nm-online, but this binary is not shipped in the Debian or Ubuntu packages for some reason. For systemd-extra-units 0.2 a simple shell loop has been used. This is effective, but hardly in the spirit of systemd.
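For illustration, a polling loop of this kind might look as follows. This is not the loop shipped in systemd-extra-units 0.2, whose exact contents are not reproduced here; the function names wait_until and have_default_route are hypothetical, and treating "a default route exists" as "the network is up" is an assumption.

```shell
# Poll a command once per second until it succeeds or the timeout
# (in seconds) expires. Returns 0 on success, 1 on timeout.
wait_until() {
    remaining=$1
    shift
    while ! "$@" >/dev/null 2>&1; do
        [ "$remaining" -gt 0 ] || return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Hypothetical probe: succeed once the kernel has a default route.
have_default_route() {
    ip route show default 2>/dev/null | grep -q .
}

# Example: wait_until 30 have_default_route
```

A service unit wrapping such a loop blocks a boot job for up to the full timeout, which is part of why it sits poorly with systemd's design.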

TODO: Ask whether nm-online could be shipped in the network-manager package, or re-implement something similar if not.

NOTE: The network-manager package from Debian experimental does ship nm-online and will be uploaded to wheezy as soon as squeeze is released.

Remote filesystem mounts

systemd reads /etc/fstab and automatically creates a mount unit for each entry. The automatically generated dependencies for these units are insufficient in the case of remote filesystems. The correct dependencies can be specified explicitly in a mount unit configuration file (see above) but ideally this should be handled automatically.

Proposal: When parsing an /etc/fstab entry, systemd should look for a unit template named after the filesystem type (e.g. nfs@.mount for an NFS mount). If a template is found it should be used, with appropriate substitutions. Otherwise systemd should fall back to creating the mount unit with default settings as per the current behaviour.
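The lookup step of this proposal can be sketched in shell. Nothing like this exists in systemd; the function name find_mount_template and the idea of passing the search directories as arguments are purely illustrative.

```shell
# Print the path of the first "<fstype>@.mount" template found in the given
# unit directories. Returns 1 if none exists, in which case the caller would
# fall back to creating the mount unit with default settings.
find_mount_template() {
    fstype=$1
    shift
    for dir in "$@"; do
        if [ -f "$dir/$fstype@.mount" ]; then
            echo "$dir/$fstype@.mount"
            return 0
        fi
    done
    return 1
}

# Example: find_mount_template nfs /etc/systemd/system /lib/systemd/system
```

Listing /etc/systemd/system first mirrors the usual convention that administrator-provided units take precedence over packaged ones.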

TODO: Write a patch for this and propose it upstream.

portmap

The upstart job for portmap saves state using pmap_dump and restores it using pmap_set. The sysvinit script in Debian does the same thing. These actions appear redundant because portmap itself saves state to /var/run/portmap_mapping after each change. My guess is that this is all based on instructions in the upstream source README, which may be out of date or just misleading.

TODO: Confirm whether pmap_dump/pmap_set really serve some purpose and include them in the systemd unit if so.

anacron

/etc/cron.d/anacron invokes anacron via the upstart job, which won't work while booted under systemd. A change to the anacron package would be needed to address this.

/etc/default

Existing sysvinit style scripts read configuration in the form of variable assignments from a file under /etc/default. A policy decision is needed on whether systemd units will do the same.

Advantages: Separates configuration from code, simplifies package upgrade (i.e. same reasoning that applies to the existing sysvinit scripts).

Disadvantages: Extra complexity. Some may argue that systemd units are simple enough to be treated entirely as configuration files.

/etc/default files can be read using the EnvironmentFile= parameter in a systemd service unit. However, this is not fully compatible with the shell. In particular the quoting rules differ. It is not possible to write an environment variable assignment containing whitespace or literal quotes in a way that both a legacy init script and systemd will understand.
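Until the quoting question is settled, one cautious option is to restrict /etc/default files to assignments that need no quoting at all. The following sketch flags everything else; the function name is_portable_assignment is made up, and the set of characters treated as safe is a deliberately conservative assumption rather than a statement of either parser's exact rules.

```shell
# Succeed if a line from an /etc/default file is blank, a comment, or a
# NAME=value assignment whose value contains no whitespace, quotes or
# other characters that would need quoting in the shell.
is_portable_assignment() {
    printf '%s\n' "$1" |
        grep -Eq '^([[:space:]]*(#.*)?|[A-Za-z_][A-Za-z0-9_]*=[A-Za-z0-9_./:,@%+=-]*)$'
}

# Example: report the lines of a file that may be parsed differently.
#   while IFS= read -r line; do
#       is_portable_assignment "$line" || echo "check: $line"
#   done < /etc/default/ssh
```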

TODO: Decide whether systemd will use /etc/default files. Needs discussion with Debian maintainers. There would be little benefit in Ubuntu going its own way on this.

TODO: Finish off patch for shell compatible quoting support and submit upstream.

Other TODO Items

  • Units for NFS v4 support:
    • gssd
    • idmapd
    • rpc_pipefs
  • Unit for sshd
  • Unit(s) for static network configuration without NetworkManager.

  • Units for remaining native Upstart jobs in a default Ubuntu install that don't currently have systemd equivalents:
    • apport
    • cups
    • log dmesg after boot
    • failsafe X session
    • irqbalance
    • bring up virtual network devices
  • Desktop integration for systemadm
  • Plymouth integration

Workarounds

Workaround #1: Set up a unit for sshd

The systemd package from Martin Pitt's PPA does not ship a unit for the SSH daemon sshd (see "Other TODO Items" above).
Here we set up a unit for the sshd binary shipped with the openssh-server package.

  • Prerequisites: Install openssh-server package:

# apt-get update
# apt-get install openssh-server
  • Change to the directory where systemd stores its system-wide units:

# cd /lib/systemd/system/

NOTE: As an alternative you can put any user-defined units into /etc/systemd/system/ directory.

  • Create a new ssh.service file with the following contents (thanks to the Gentoo folks for the sshd.service sample):

[Unit]
Description=OpenBSD Secure Shell server
After=network.target

[Service]
ExecStartPre=/bin/mkdir -p /var/run/sshd
ExecStart=/usr/sbin/sshd -D
KillMode=process
Restart=always

[Install]
WantedBy=multi-user.target
  • Enable the newly created ssh unit:

# systemctl enable ssh.service
[OUTPUT]
ln -s '/lib/systemd/system/ssh.service' '/etc/systemd/system/multi-user.target.wants/ssh.service'

NOTE: The symlink is created in the user-defined /etc/systemd/system/ directory and points to the unit file in /lib/systemd/system/.

  • It should now look like this:

lrwxrwxrwx 1 root root 46 Sep  9 17:33 /etc/systemd/system/multi-user.target.wants/ssh.service -> /lib/systemd/system/ssh.service
  • Start and stop OpenBSD Secure Shell server via systemctl command

# systemctl start ssh.service

# systemctl status ssh.service
[OUTPUT]
ssh.service - OpenBSD Secure Shell server
          Loaded: loaded (/lib/systemd/system/ssh.service; enabled)
          Active: active (running) since Mon, 10 Sep 2012 12:37:26 +0200; 1s ago
         Process: 4751 ExecStartPre=/bin/mkdir -p /var/run/sshd (code=exited, status=0/SUCCESS)
        Main PID: 4753 (sshd)
          CGroup: name=systemd:/system/ssh.service
                  └ 4753 /usr/sbin/sshd -D

# systemctl stop ssh.service
  • To display more information, try systemctl show $unit:

# systemctl show ssh.service

Workaround #2: Failures on software-upgrades

The post-installation scripts of packages use /sbin/start and /sbin/stop to start and stop (running) services in an upstart environment. Both are symlinks to /sbin/initctl, which is shipped with the upstart package.

Unfortunately, software upgrades break after switching to systemd as an alternative init system on Ubuntu; more precisely, the package management tool dpkg fails.

Example

Bug #1008837 "cups fails to install/upgrade with systemd"

A little demo

# mv /sbin/start /sbin/start.orig

# apt-get install --reinstall openssh-server
[OUTPUT]
/var/lib/dpkg/info/openssh-server.postinst: 429: /var/lib/dpkg/info/openssh-server.postinst: start: not found

The ugly workaround

  • dpkg-divert can be used to set up and update a list of diversions.

# dpkg-divert --divert /sbin/start.upstart /sbin/start
# dpkg-divert --divert /sbin/stop.upstart /sbin/stop
  • List the "true names" of start and stop binaries:

# dpkg-divert --truename /sbin/start
/sbin/start.upstart

# dpkg-divert --truename /sbin/stop
/sbin/stop.upstart
  • As a next step we symlink both binaries to /bin/true:

# ln -sf /bin/true /sbin/start
# ln -sf /bin/true /sbin/stop
  • It should now look like this:

lrwxrwxrwx 1 root root     9 Sep  9 18:09 /sbin/start -> /bin/true*
lrwxrwxrwx 1 root root     9 Sep  9 18:09 /sbin/stop -> /bin/true*

Use-case: Re-install openssh-server package

# apt-get install --reinstall openssh-server

This should now install properly in an upstart or systemd environment.

Update 10-Sep-2012: More elegant solution using an initctl replacement

In further discussion of this topic with Michael Biebl and others, Michael raised concerns about replacing the binaries with symlinks to /bin/true. The idea of an initctl replacement grew out of that discussion.

  • If you followed the 1st solution above, then...
    1. Remove the diversions of /sbin/start and /sbin/stop (to /sbin/start.upstart and /sbin/stop.upstart)

    2. Restore the original symlinks for /sbin/start and /sbin/stop.
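The two clean-up steps above can be sketched as follows. This is an untested outline rather than a verified procedure: the helper names run and undo_true_workaround are hypothetical, and the links are restored as pointing to /sbin/initctl on the strength of the description given earlier in this section. Setting DRY_RUN=1 only prints the commands.

```shell
# Execute a command, or just print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# Undo the /bin/true workaround for both start and stop.
undo_true_workaround() {
    for cmd in start stop; do
        run rm -f "/sbin/$cmd"                  # drop the link to /bin/true
        run dpkg-divert --remove "/sbin/$cmd"   # remove the diversion
        run ln -s /sbin/initctl "/sbin/$cmd"    # restore the upstart link
    done
}

# Preview only:      DRY_RUN=1 undo_true_workaround
# Apply (as root):   undo_true_workaround
```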

  • Rename the original /sbin/initctl binary:

# mv /sbin/initctl /sbin/initctl.upstart
  • Create a new /sbin/initctl replacement script with the following contents (thanks to mbiebl and grawity):

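#!/bin/sh
# Replacement for /sbin/initctl: forward start/stop requests to systemd
# when it is running and has a unit for the service, otherwise to upstart.

service=$1
action=${0##*/}    # name we were invoked as: start, stop or initctl

# Use systemd only if it is running (its cgroup hierarchy is mounted)
# AND a unit for the service exists in either unit directory.
if [ -e /sys/fs/cgroup/systemd ] &&
   { [ -e "/lib/systemd/system/${service}.service" ] ||
     [ -e "/etc/systemd/system/${service}.service" ] ; } ; then

        if [ -n "$DPKG_MAINTSCRIPT_PACKAGE" ]; then
                # If we are called by a maintainer script, chances are
                # good that a new or updated service was installed.
                # Reload daemon to pick up any changes.
                systemctl daemon-reload
        fi

        systemctl "$action" "${service}.service"
else
        if [ "$action" = initctl ] ; then
                initctl.upstart "$@"
        else
                initctl.upstart "$action" "$service"
        fi
fi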
  • Make the new script executable:

# chmod +x /sbin/initctl

  • Create and list a diversion for the original initctl (this prevents e.g. its removal):

# dpkg-divert --divert /sbin/initctl.upstart /sbin/initctl

# dpkg-divert --truename /sbin/initctl
/sbin/initctl.upstart

Software-upgrades touching start/stop of daemons should now work fine within an upstart or systemd environment.

Thanks to user grawity and others on the #systemd IRC channel (freenode) for their vital help!

systemd (last edited 2015-01-22 09:56:01 by pitti)