Systemd

From Lolly's Wiki

systemd

Yes, it is written in lowercase, as daemon names usually are.

What is systemd?

systemd is a replacement for the old and rusty init system of Linux. It extends the classic init system with many new features: it supervises processes after they have been started, lists the sockets owned by the processes it started, adds security features such as capabilities(7), and a lot more.

Maybe it will be as good as SMF (Service Management Facility) of Solaris one day :-).

Take a look with systemctl

List units

As you can see, there are hardware and software related units.

# systemctl list-units
  UNIT                                                            LOAD   ACTIVE SUB       DESCRIPTION
  proc-sys-fs-binfmt_misc.automount                               loaded active running   Arbitrary Executable File Formats File System Automount Point
  sys-devices-pci0000:00-0000:00:02.0-backlight-acpi_video0.device loaded active plugged   /sys/devices/pci0000:00/0000:00:02.0/backlight/acpi_video0
  sys-devices-pci0000:00-0000:00:02.0-drm-card0-card0\x2dLVDS\x2d1-intel_backlight.device loaded active plugged   /sys/devices/pci0000:00/0000:00:02.0/drm
  sys-devices-pci0000:00-0000:00:19.0-net-eth0.device             loaded active plugged   82579LM Gigabit Network Connection
  sys-devices-pci0000:00-0000:00:1a.0-usb1-1\x2d1-1\x2d1.4-1\x2d1.4:1.0-bluetooth-hci0-rfkill3.device loaded active plugged   /sys/devices/pci0000:00/0000
  sys-devices-pci0000:00-0000:00:1a.0-usb1-1\x2d1-1\x2d1.4-1\x2d1.4:1.0-bluetooth-hci0.device loaded active plugged   /sys/devices/pci0000:00/0000:00:1a.0
  sys-devices-pci0000:00-0000:00:1b.0-sound-card0.device          loaded active plugged   6 Series/C200 Series Chipset Family High Definition Audio Contro
  sys-devices-pci0000:00-0000:00:1c.1-0000:03:00.0-ieee80211-phy0-rfkill2.device loaded active plugged   /sys/devices/pci0000:00/0000:00:1c.1/0000:03:00.0
  sys-devices-pci0000:00-0000:00:1c.1-0000:03:00.0-net-wlan0.device loaded active plugged   Centrino Advanced-N 6205 [Taylor Peak] (Centrino Advanced-N 62
  sys-devices-pci0000:00-0000:00:1d.0-usb2-2\x2d1-2\x2d1.4-2\x2d1.4:1.1-tty-ttyACM0.device loaded active plugged   F5521gw
  sys-devices-pci0000:00-0000:00:1d.0-usb2-2\x2d1-2\x2d1.4-2\x2d1.4:1.3-tty-ttyACM1.device loaded active plugged   F5521gw
...
  session-c2.scope                                                loaded active running   Session c2 of user lollypop
  accounts-daemon.service                                         loaded active running   Accounts Service
● anacron.service                                                 loaded failed failed    Run anacron jobs
  apparmor.service                                                loaded active exited    LSB: AppArmor initialization
  apport.service                                                  loaded active exited    LSB: automatic crash report generation
...

In this example you can see that the anacron.service failed to start.
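
On a long unit list the failed entries are easy to miss. The following is a small sketch of a filter for them, not part of the original setup: the function name failed_units is made up for illustration, and it assumes the column layout shown above (with the "●" marker in front of failed units).

```shell
#!/bin/sh
# Print the names of units whose ACTIVE column is "failed".
# Input: lines in `systemctl list-units` format on stdin.
failed_units() {
    awk '
        # Failed units carry a leading "●" marker, which shifts all columns by one.
        $1 == "●" { if ($4 == "failed") print $2; next }
        $3 == "failed" { print $1 }
    '
}

# On a live system (not executed here):
#   systemctl list-units --all | failed_units
# systemctl can also filter by itself:
#   systemctl list-units --state=failed
```

Newer systemctl versions may print a different marker character; in that case `systemctl list-units --plain --no-legend` gives marker-free output that the second awk rule handles.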

Display unit status

# systemctl status anacron
● anacron.service - Run anacron jobs
   Loaded: loaded (/lib/systemd/system/anacron.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Fr 2015-08-28 09:18:13 CEST; 31min ago
  Process: 1591 ExecStart=/usr/sbin/anacron -dsq (code=exited, status=1/FAILURE)
 Main PID: 1591 (code=exited, status=1/FAILURE)

Aug 28 09:18:13 lollybook systemd[1]: Started Run anacron jobs.
Aug 28 09:18:13 lollybook systemd[1]: Starting Run anacron jobs...
Aug 28 09:18:13 lollybook systemd[1]: anacron.service: main process exited, code=exited, status=1/FAILURE
Aug 28 09:18:13 lollybook anacron[1591]: anacron: Can't chdir to /var/spool/anacron: No such file or directory
Aug 28 09:18:13 lollybook systemd[1]: Unit anacron.service entered failed state.
Aug 28 09:18:13 lollybook systemd[1]: anacron.service failed.

Ah, the anacron spool directory has been deleted. ;-)

Restart units

Fix the problem and restart the service.

root@lollybook:~# mkdir /var/spool/anacron
root@lollybook:~# systemctl restart anacron.service
root@lollybook:~# systemctl status anacron
● anacron.service - Run anacron jobs
   Loaded: loaded (/lib/systemd/system/anacron.service; enabled; vendor preset: enabled)
   Active: active (running) since Fr 2015-08-28 09:53:49 CEST; 4s ago
 Main PID: 5179 (anacron)
   CGroup: /system.slice/anacron.service
           └─5179 /usr/sbin/anacron -dsq

Aug 28 09:53:49 lollybook systemd[1]: Started Run anacron jobs.
Aug 28 09:53:49 lollybook systemd[1]: Starting Run anacron jobs...
Aug 28 09:53:49 lollybook anacron[5179]: Anacron 2.3 started on 2015-08-28
Aug 28 09:53:49 lollybook anacron[5179]: Will run job `cron.daily' in 5 min.
Aug 28 09:53:49 lollybook anacron[5179]: Will run job `cron.weekly' in 10 min.
Aug 28 09:53:49 lollybook anacron[5179]: Will run job `cron.monthly' in 15 min.
Aug 28 09:53:49 lollybook anacron[5179]: Jobs will be executed sequentially

Display unit declaration

# systemctl cat zfs.target
# /lib/systemd/system/zfs.target
[Unit]
Description=ZFS startup target
Requires=zfs-mount.service
Requires=zfs-share.service
Wants=zed.service

[Install]
WantedBy=multi-user.target

Sockets

# systemctl list-sockets --all
LISTEN                          UNIT                            ACTIVATES
/run/acpid.socket               acpid.socket                    acpid.service
/run/systemd/fsckd              systemd-fsckd.socket            systemd-fsckd.service
/run/systemd/initctl/fifo       systemd-initctl.socket          systemd-initctl.service
/run/systemd/journal/dev-log    systemd-journald-dev-log.socket systemd-journald.service
/run/systemd/journal/socket     systemd-journald.socket         systemd-journald.service
/run/systemd/journal/stdout     systemd-journald.socket         systemd-journald.service
/run/systemd/journal/syslog     syslog.socket                   rsyslog.service
/run/systemd/shutdownd          systemd-shutdownd.socket        systemd-shutdownd.service
/run/udev/control               systemd-udevd-control.socket    systemd-udevd.service
/run/uuidd/request              uuidd.socket                    uuidd.service
/var/run/avahi-daemon/socket    avahi-daemon.socket             avahi-daemon.service
/var/run/cups/cups.sock         cups.socket                     cups.service
/var/run/dbus/system_bus_socket dbus.socket                     dbus.service
127.0.0.1:631                   cups.socket                     cups.service
[::1]:631                       cups.socket                     cups.service
audit 1                         systemd-journald-audit.socket   systemd-journald.service
kobject-uevent 1                systemd-udevd-kernel.socket     systemd-udevd.service

17 sockets listed.

View dependencies

What depends on zfs.target:

# systemctl list-dependencies --reverse zfs.target 
zfs.target
● ├─basic.target
...
● └─multi-user.target
...

And what do we need to reach the zfs.target?

# systemctl list-dependencies --recursive zfs.target
zfs.target
● ├─zed.service
● ├─zfs-mount.service
● └─zfs-share.service

Get the main PID of a service

$ systemctl show --property=MainPID --value ssh.service
2026
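
Older systemd releases do not know the --value switch. As a sketch for that case, the property can be cut out of the plain KEY=VALUE output of systemctl show; the helper name unit_property is an assumption for illustration only.

```shell
#!/bin/sh
# Extract the value of one property from `systemctl show` output
# (KEY=VALUE lines on stdin).
unit_property() {
    # $1: exact property name, e.g. MainPID
    sed -n "s/^$1=//p"
}

# On a live system (not executed here):
#   systemctl show ssh.service | unit_property MainPID
```

The pattern is anchored at the start of the line, so ExecMainPID does not match when MainPID is requested.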

Security

Use capabilities to drop user privileges (CapabilityBoundingSet)

# systemctl cat  systemd-networkd.service --no-pager
...

[Service]
Type=notify
Restart=on-failure
RestartSec=0
ExecStart=/lib/systemd/systemd-networkd
CapabilityBoundingSet=CAP_NET_ADMIN CAP_NET_BIND_SERVICE CAP_NET_BROADCAST CAP_NET_RAW CAP_SETUID CAP_SETGID CAP_SETPCAP CAP_CHOWN CAP_DAC_OVERRIDE CAP_FOWNER
ProtectSystem=full
ProtectHome=yes
WatchdogSec=1min

...

Now the process is started with exactly the capabilities it needs. Even if it starts as root, all unnecessary capabilities are dropped before the process is started.

I don't want to copy the whole capabilities(7) man page here, but take a look at it to understand what these capabilities are.

BUT beware of programs that only check for UID 0!

Nailing a process to its rights: NoNewPrivileges

Setting NoNewPrivileges=true ensures that the service process and all of its children can never gain more privileges than they already have: execve() no longer honours set-UID/set-GID bits or file capabilities. No set-UID binary will help an attacker get more privileges than the user of the exploited service.
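
To check whether the flag really arrived at a running process, reasonably recent kernels expose it as the NoNewPrivs field in /proc/<pid>/status. A minimal sketch, assuming that field is present; the function name no_new_privs and the unit name example.service are hypothetical.

```shell
#!/bin/sh
# Print the NoNewPrivs flag (0 or 1) found in a /proc/<pid>/status file.
no_new_privs() {
    # $1: path to a status file; defaults to the status of the reading process
    awk '$1 == "NoNewPrivs:" { print $2 }' "${1:-/proc/self/status}"
}

# On a live system (not executed here), for a hypothetical example.service:
#   pid=$(systemctl show --property=MainPID --value example.service)
#   no_new_privs "/proc/$pid/status"
```

A value of 1 means the process can no longer gain privileges through execve(); an empty result means the kernel does not report the field.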

Limiting access to a socket

For example for the check_mk monitoring system:

# systemctl edit check_mk.socket

Deny everything except the monitoring server (172.17.128.193):

[Socket]
IPAddressDeny=any
IPAddressAllow=172.17.128.193

Limiting a socket to IPv4

For example for the check_mk monitoring system:

# systemctl edit check_mk.socket

First reset the old value with an empty assignment, then set the new one:

[Socket]
ListenStream=
ListenStream=0.0.0.0:6556
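
systemctl edit opens an editor, which is awkward in scripts. It stores the result as a drop-in named override.conf in a <unit>.d directory, so the same change can be sketched non-interactively. The UNIT_ROOT variable is an assumption of this sketch: it defaults to a scratch directory so nothing on the system is touched.

```shell
#!/bin/sh
# Directory under which the drop-in is written. Defaults to a scratch directory;
# set UNIT_ROOT=/etc/systemd/system to change the real unit.
UNIT_ROOT="${UNIT_ROOT:-$(mktemp -d)}"
dropin="$UNIT_ROOT/check_mk.socket.d/override.conf"

mkdir -p "$(dirname "$dropin")"
cat > "$dropin" <<'EOF'
[Socket]
# The empty assignment clears the ListenStream list inherited from the unit file.
ListenStream=
ListenStream=0.0.0.0:6556
EOF
echo "wrote $dropin"

# For real use, follow up with (not executed here):
#   systemctl daemon-reload
#   systemctl restart check_mk.socket
```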

systemd-resolved, the name resolution service

Status

$ systemd-resolve --status
Global
          DNS Domain: fritz.box
          DNSSEC NTA: 10.in-addr.arpa
                      168.192.in-addr.arpa
                      corp
                      d.f.ip6.arpa
                      home
                      internal
                      intranet
                      lan
                      local
                      private
                      test

Link 3 (wlan0)
      Current Scopes: none
       LLMNR setting: yes
MulticastDNS setting: no
      DNSSEC setting: no
    DNSSEC supported: no

Link 2 (eth0)
      Current Scopes: DNS
       LLMNR setting: yes
MulticastDNS setting: no
      DNSSEC setting: no
    DNSSEC supported: no
         DNS Servers: 192.168.178.1
          DNS Domain: fritz.box

Cache statistics

$ systemd-resolve --statistics
DNSSEC supported by current servers: no

Transactions
Current Transactions: 0
  Total Transactions: 1824

Cache
  Current Cache Size: 11
          Cache Hits: 1104
        Cache Misses: 771

DNSSEC Verdicts
              Secure: 0
            Insecure: 0
               Bogus: 0
       Indeterminate: 0

Flush the cache

$ systemd-resolve --flush-caches

Check with:

$ systemd-resolve --statistics
DNSSEC supported by current servers: no

Transactions
Current Transactions: 0
  Total Transactions: 1809

Cache
  Current Cache Size: 0       <--- Empty
          Cache Hits: 1099
        Cache Misses: 761

DNSSEC Verdicts
              Secure: 0
            Insecure: 0
               Bogus: 0
       Indeterminate: 0
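
Newer systemd releases ship resolvectl, which offers the same three operations as subcommands (status, statistics, flush-caches), and some distributions no longer install systemd-resolve. A small sketch that picks whichever tool is present; the wrapper name resolve_cmd is made up for illustration, and it only prints the command line instead of running it.

```shell
#!/bin/sh
# Print the command line for a resolver operation, preferring resolvectl
# and falling back to the older systemd-resolve option syntax.
resolve_cmd() {
    case "$1" in
        status|statistics|flush-caches) ;;
        *) echo "usage: resolve_cmd status|statistics|flush-caches" >&2
           return 2 ;;
    esac
    if command -v resolvectl >/dev/null 2>&1; then
        echo "resolvectl $1"
    else
        echo "systemd-resolve --$1"
    fi
}

# Example (not executed here): run the selected command
#   $(resolve_cmd flush-caches)
```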

systemd-timesyncd, an alternative to ntpd

ntpd is a good but heavyweight old workhorse for servers; clients do not necessarily need it. Just give systemd-timesyncd a chance.

It is easily configured through /etc/systemd/timesyncd.conf:

#  This file is part of systemd.
#
#  systemd is free software; you can redistribute it and/or modify it
#  under the terms of the GNU Lesser General Public License as published by
#  the Free Software Foundation; either version 2.1 of the License, or
#  (at your option) any later version.
#
# Entries in this file show the compile time defaults.
# You can change settings by editing this file.
# Defaults can be restored by simply deleting this file.
#
# See timesyncd.conf(5) for details.

[Time]
NTP=ptbtime1.ptb.de hora.cs.tu-berlin.de
FallbackNTP=ntp.ubuntu.com

NTP is a space-separated list of NTP servers. FallbackNTP is a space-separated list of servers that is only used if no other NTP server information is known.

If you want to split the configuration into multiple files or generate it at startup, you can put files ending in .conf into /etc/systemd/timesyncd.conf.d/.

After you have set up the configuration, enable timesyncd via:

# timedatectl set-ntp true

Check the result with:

# timedatectl 
      Local time: Fr 2016-07-01 09:16:24 CEST
  Universal time: Fr 2016-07-01 07:16:24 UTC
        RTC time: Fr 2016-07-01 07:16:24
       Time zone: Europe/Berlin (CEST, +0200)
 Network time on: yes
NTP synchronized: yes
 RTC in local TZ: no

Nice, it worked: NTP synchronized: yes.
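
For monitoring scripts the same check can be automated. This is only a sketch: the function name ntp_synced is made up, it assumes English (LC_ALL=C) output, and it accepts both the label shown above and the "System clock synchronized" label printed by newer timedatectl versions.

```shell
#!/bin/sh
# Succeed if timedatectl output on stdin reports a synchronized clock.
ntp_synced() {
    grep -Eq '^[[:space:]]*(NTP|System clock) synchronized: yes[[:space:]]*$'
}

# On a live system (not executed here):
#   LC_ALL=C timedatectl | ntp_synced && echo "clock is synchronized"
```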

If not, take a look with systemctl:

# systemctl status systemd-timesyncd.service
● systemd-timesyncd.service - Network Time Synchronization
   Loaded: loaded (/lib/systemd/system/systemd-timesyncd.service; enabled; vendor preset: enabled)
  Drop-In: /lib/systemd/system/systemd-timesyncd.service.d
           └─disable-with-time-daemon.conf
   Active: inactive (dead)
Condition: start condition failed at Fr 2016-07-01 10:49:15 CEST; 1h 43min left
     Docs: man:systemd-timesyncd.service(8)

Hmm... let us take a look at ntp:

# systemctl status ntp.service 
● ntp.service - LSB: Start NTP daemon
   Loaded: loaded (/etc/init.d/ntp; bad; vendor preset: enabled)
   Active: active (exited) since Fr 2016-07-01 10:49:19 CEST; 1h 44min left
     Docs: man:systemd-sysv-generator(8)

Maybe we should uninstall or disable ntp first ;-).

# systemctl stop    ntp.service
# systemctl disable ntp.service
# systemctl start systemd-timesyncd.service
# systemctl status systemd-timesyncd.service
● systemd-timesyncd.service - Network Time Synchronization
   Loaded: loaded (/lib/systemd/system/systemd-timesyncd.service; enabled; vendor preset: enabled)
  Drop-In: /lib/systemd/system/systemd-timesyncd.service.d
           └─disable-with-time-daemon.conf
   Active: active (running) since Fr 2016-07-01 09:06:10 CEST; 1s ago
     Docs: man:systemd-timesyncd.service(8)
 Main PID: 12360 (systemd-timesyn)
   Status: "Synchronized to time server 192.53.103.108:123 (ptbtime1.ptb.de)."
   CGroup: /system.slice/systemd-timesyncd.service
           └─12360 /lib/systemd/systemd-timesyncd

Jul 01 09:06:10 lollybook systemd[1]: Starting Network Time Synchronization...
Jul 01 09:06:10 lollybook systemd[1]: Started Network Time Synchronization.
Jul 01 09:06:10 lollybook systemd-timesyncd[12360]: Synchronized to time server 192.53.103.108:123 (ptbtime1.ptb.de).

That's it!

Units

[Unit]

Define dependencies

For example the zfs.target is defined like this:

# systemctl cat zfs.target
# /lib/systemd/system/zfs.target
[Unit]
Description=ZFS startup target
Requires=zfs-mount.service
Requires=zfs-share.service
Wants=zed.service

[Install]
WantedBy=multi-user.target

This means that reaching zfs.target requires zfs-mount.service and zfs-share.service (Requires=), while zed.service is merely wanted (Wants=): systemd starts it as well, but zfs.target is still reached if zed.service fails to start.

Directories

ReadWrite-, ReadOnly- and InaccessibleDirectories

Private Tmp-Directories

Mounts a private instance of /tmp and /var/tmp that only lives as long as the unit is up. When the unit is stopped, the directories are removed. This is implemented with a separate mount namespace for the unit.

[Service]
...
PrivateTmp=true|false
...

If several units should share a private tmp directory, list them in the [Unit] section with JoinsNamespaceOf=<unit1> [<unit2> <unit3>] (space-separated); each of these units needs PrivateTmp=true.
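
A sketch of two units sharing one private /tmp might look like this. Both unit names (producer-example.service, consumer-example.service), the sleep command and the UNIT_ROOT variable are hypothetical; the files are written to a scratch directory by default so they can be inspected without touching the system.

```shell
#!/bin/sh
# Directory the example units are written to (scratch directory by default).
UNIT_ROOT="${UNIT_ROOT:-$(mktemp -d)}"

# First unit: gets its own private /tmp and /var/tmp.
cat > "$UNIT_ROOT/producer-example.service" <<'EOF'
[Unit]
Description=Hypothetical producer with a private /tmp

[Service]
PrivateTmp=true
ExecStart=/bin/sleep infinity
EOF

# Second unit: joins the namespace of the first one. JoinsNamespaceOf= lives in
# [Unit], PrivateTmp= in [Service], and both units must set PrivateTmp=true.
cat > "$UNIT_ROOT/consumer-example.service" <<'EOF'
[Unit]
Description=Hypothetical consumer sharing the producer's /tmp
JoinsNamespaceOf=producer-example.service

[Service]
PrivateTmp=true
ExecStart=/bin/sleep infinity
EOF

echo "units written to $UNIT_ROOT"
```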

[Service]

[Install]

Tools

Testing around with capabilities

For example arping:

# getcap /usr/bin/arping
/usr/bin/arping = cap_net_raw+ep

With this capability set, a normal user can run it:

lollypop $ /usr/bin/arping -I wlan0 192.168.178.1
ARPING 192.168.178.1 from 192.168.178.31 wlan0
Unicast reply from 192.168.178.1 [24:65:11:F0:DC:A8]  1.774ms
Unicast reply from 192.168.178.1 [24:65:11:F0:DC:A8]  1.658ms

If we remove this capability it does not work:

# setcap cap_net_raw=-ep /usr/bin/arping
lollypop $ /usr/bin/arping -I wlan0 192.168.178.1
arping: socket: Operation not permitted

Of course it still works as root, because root has all capabilities:

root@lollybook:~# /usr/bin/arping -I wlan0 192.168.178.1
ARPING 192.168.178.1 from 192.168.178.31 wlan0
Unicast reply from 192.168.178.1 [24:65:11:F0:DC:A8]  2.052ms
Unicast reply from 192.168.178.1 [24:65:11:F0:DC:A8]  1.852ms
Received 2 response(s)

So we better set this capability again:

# setcap cap_net_raw=+ep /usr/bin/arping


Logging with syslog-ng and systemd in a chroot environment

If you have a chroot environment (here I have /var/chroot) some things are a little bit tricky.

The needed logging socket in your chroot is /run/systemd/journal/dev-log

Prepare the mountpoint:

# mkdir -p /var/chroot/run/systemd/journal
# touch /var/chroot/run/systemd/journal/dev-log

Get the name for the needed unit file

The name of a .mount unit file has to be derived from the mount destination path: slashes become dashes, and dashes that are part of the path must be escaped as \x2d. The easiest way to get the resulting name is systemd-escape.

# systemd-escape -p --suffix=mount /var/chroot/run/systemd/journal/dev-log
var-chroot-run-systemd-journal-dev\x2dlog.mount
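
To make the naming rule more tangible, here is a simplified re-implementation in plain shell. It is only a sketch for ASCII paths like the one above (the function name escape_path is made up, and corner cases such as a leading dot, repeated slashes or the root path are ignored); for real use stick to systemd-escape.

```shell
#!/bin/sh
# Simplified sketch of path escaping for unit names (ASCII paths only).
escape_path() {
    p=${1#/}      # drop one leading slash
    p=${p%/}      # drop one trailing slash
    out=
    while [ -n "$p" ]; do
        c=${p%"${p#?}"}   # first character of the remaining path
        p=${p#?}          # rest of the path
        case $c in
            /) out="$out-" ;;                            # path separator becomes a dash
            [A-Za-z0-9:_.]) out="$out$c" ;;              # safe characters are kept
            *) out="$out$(printf '\\x%02x' "'$c")" ;;    # everything else becomes \xXX
        esac
    done
    printf '%s\n' "$out"
}

# Build the mount unit name for the socket path used above.
echo "$(escape_path /var/chroot/run/systemd/journal/dev-log).mount"
```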

Create the unit file /lib/systemd/system/var-chroot-run-systemd-journal-dev\x2dlog.mount for the mount

The file name contains a literal backslash (\x2d stands for the dash -), so remember to escape it in the shell as \\.

# vi /lib/systemd/system/var-chroot-run-systemd-journal-dev\\x2dlog.mount

I want to mount it before syslog-ng and pdns-recursor are up. Put the following content into the file:

[Unit]
Description=Mount /run/systemd/journal/dev-log to chroot
DefaultDependencies=no
ConditionPathExists=/var/chroot/run/systemd/journal/dev-log
ConditionCapability=CAP_SYS_ADMIN
After=systemd-modules-load.service
Before=pdns-recursor.service
Before=syslog-ng.service

[Mount]
What=/run/systemd/journal/dev-log
Where=/var/chroot/run/systemd/journal/dev-log
Type=none
Options=bind

[Install]
WantedBy=multi-user.target

Mount the socket

# systemctl daemon-reload
# systemctl enable var-chroot-run-systemd-journal-dev\\x2dlog.mount
# systemctl start var-chroot-run-systemd-journal-dev\\x2dlog.mount

Check the success:

# grep /var/chroot/run/systemd/journal/dev-log /proc/mounts 
tmpfs /var/chroot/run/systemd/journal/dev-log tmpfs rw,nosuid,noexec,relatime,size=101604k,mode=755 0 0


Tell journald to forward log lines to the socket

/etc/systemd/journald.conf

[Journal]
...
ForwardToSyslog=yes
...

Restart the journal daemon:

# systemctl restart systemd-journald.service

Configure syslog-ng

/etc/syslog-ng/syslog-ng.conf

Take the log from systemd-journald socket:

...
source s_src {
       system();
       internal();
       unix-dgram ("/run/systemd/journal/dev-log"); 
};
...

Example for powerdns recursor

/etc/syslog-ng/conf.d/destination.d/pdns.conf

# PowerDNS authoritative server destination

destination d_pdns          { file("/var/log/powerdns/pdns.log"); };
destination d_pdns_recursor { file("/var/log/powerdns/recursor.log"); };

/etc/syslog-ng/conf.d/filter.d/pdns.conf

# PowerDNS authoritative server filter

filter f_pdns          { program("^pdns$"); };
filter f_pdns_recursor { program("^pdns_recursor$"); };

/etc/syslog-ng/conf.d/log.d/90_pdns.conf

# PowerDNS authoritative server default final file log
log { source(s_src); filter(f_pdns);          destination(d_pdns);          flags(final); };
log { source(s_src); filter(f_pdns_recursor); destination(d_pdns_recursor); flags(final); };

Restart syslog-ng daemon

# systemctl restart syslog-ng.service

systemd-tmpfiles

The housekeeping of temporary directories is done by systemd-tmpfiles-clean.service. This service is triggered by the timer systemd-tmpfiles-clean.timer.

To use this service for PrivateTmp directories, for example those of apache2.service, you can put a config file under /etc/tmpfiles.d/, like this example /etc/tmpfiles.d/apache-cleanup.conf:

e /tmp/systemd-private-%b-apache2.service-*/tmp - - - 6h

This cleans up all files under /tmp/systemd-private-%b-apache2.service-*/tmp that are older than 6 hours every time systemd-tmpfiles-clean.service runs.

The %b in the path is the current boot ID. What is that? An ID that is generated at each boot. You can get the boot ID with:

# journalctl --list-boots

The second field of the last line is the current one, e.g.:

# journalctl --list-boots  | awk 'END {print $2}'
52ae0c2a587a47048ee76818ede269a6
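
The kernel also exposes the boot ID directly, as a UUID with dashes in /proc/sys/kernel/random/boot_id. A sketch that converts it to the dash-free form seen above; the function name current_boot_id is made up, and the optional file argument exists only so the function can be tried on sample data.

```shell
#!/bin/sh
# Print the boot ID without dashes, as it appears in the directory names above.
current_boot_id() {
    # $1: file holding the UUID; defaults to the boot_id file of the running kernel
    tr -d '-' < "${1:-/proc/sys/kernel/random/boot_id}"
}

# On a live system (not executed here):
#   ls -d /tmp/systemd-private-"$(current_boot_id)"-apache2.service-*
```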


When will the cleanup run next? Try:

# systemctl list-timers systemd-tmpfiles-clean.timer
NEXT                          LEFT       LAST PASSED UNIT                         ACTIVATES
Thu 2020-08-13 16:07:24 CEST  46min left n/a  n/a    systemd-tmpfiles-clean.timer systemd-tmpfiles-clean.service

1 timers listed.
Pass --all to see loaded but inactive timers, too.

OK, but you would rather run it once an hour? Just reschedule the timer like this:

# systemctl edit systemd-tmpfiles-clean.timer

and change the interval like this:

[Timer]
OnUnitActiveSec=1h

Well done...

Examples

fwupd.service behind proxy

# systemctl edit fwupd-refresh.service
[Service]
Environment=http_proxy="http://user:passw0rd@proxy.intern.net:8080" https_proxy="http://user:passw0rd@proxy.intern.net:8080"
PassEnvironment=http_proxy https_proxy

Tomcat

/etc/systemd/system/tomcat-example.service

Simple service definition with some security options (ReadOnlyDirectories):

# /etc/systemd/system/tomcat-example.service
[Unit]
Description=Apache Tomcat Web Application Container
After=syslog.target network.target remote-fs.target
ConditionPathExists=/opt/tomcat/bin
ConditionPathExists=/home/tomcat/bin

[Service]
Type=forking
User=tomcat
Group=java
PrivateTmp=true
RuntimeDirectory=tomcat-example
RuntimeDirectoryMode=0700
ReadOnlyDirectories=/etc
ReadOnlyDirectories=/lib
ReadOnlyDirectories=/usr
EnvironmentFile=/home/tomcat/.Tomcat_init_systemd
PIDFile=/run/tomcat-example/tomcat.pid
ExecStart=/opt/tomcat/bin/catalina.sh start
ExecStop=/opt/tomcat/bin/catalina.sh stop
SuccessExitStatus=0

[Install]
WantedBy=multi-user.target

/etc/polkit-1/rules.d/57-tomcat-example.rules

Allow the user tomcat to start/stop the service:

polkit.addRule(function(action, subject) {
    if (action.id == "org.freedesktop.systemd1.manage-units" &&
        action.lookup("unit") == "tomcat-example.service" &&
        subject.user == "tomcat") {
        return polkit.Result.YES;
    }
});

Oracle

UNTESTED, just an example!

Save this as /usr/lib/systemd/system/dbora@.service (SLES12):

#  This file is part of systemd.
#
# Configure instances for your oracle database versions like this
#  # systemctl enable dbora@<product>.service
# e.g.:
#  # systemctl enable dbora@12cR1.service
#
[Unit]
Description=Oracle Database %I
After=syslog.target network.target
 
[Service]
# systemd ignores PAM limits, so set any necessary limits in the service.
# Not really a bug, but a feature.
# https://bugzilla.redhat.com/show_bug.cgi?id=754285
LimitMEMLOCK=infinity
LimitNOFILE=65535
#
Type=simple
RemainAfterExit=yes
User=oracle
Group=dba
Environment="ORACLE_HOME=/opt/oracle/product/%i/db"
# systemd does not run ExecStart=/ExecStop= through a shell, so redirections and "&" do not work here.
ExecStart=/opt/oracle/product/%i/db/bin/dbstart $ORACLE_HOME
ExecStop=/opt/oracle/product/%i/db/bin/dbshut $ORACLE_HOME
 
[Install]
WantedBy=multi-user.target

Reload systemd and enable an instance:

# systemctl daemon-reload
# systemctl enable dbora@12cR2.service
Created symlink from /etc/systemd/system/multi-user.target.wants/dbora@12cR2.service to /usr/lib/systemd/system/dbora@.service.