Running a private mail server for six years, easy peasy

TL;DR – High-level overview of running my own small, private Linux mail server since late 2015. I’ve encountered surprisingly few issues and learned a lot along the way. Initial setup (including monitoring, backups, configuration management) took some time, but recurring maintenance since then has been an estimated 10 – 20 minutes per month. Worth it for me, but probably not for most people. The next best thing, in my opinion, is mailbox.org with one’s own domain.

Motivation

This writeup is meant as a response to a sentiment I keep reading in IT forums: that running your own mail server is “very time demanding”, “impossible to maintain”, “a pain to make sure your mails are being delivered”. I understand the reasons, and tend to agree when it comes to large-ish selfhosted mail deployments with hundreds of users and tens of thousands of mails per day, which also happens to be part of my current day job. It’s true that many IT people understandably don’t want to invest private time into things which appear to be another kind of work assignment. But personally, it fills me with satisfaction to self-host my own infrastructure, my little internet island where I’m root, especially in times of mega corporations trying (and succeeding) to redefine “the internet” as a portfolio of services only they can offer, with little alternative.

The Right Reasons

As mentioned above, I’m working in Linux administration / engineering and know my way around technical aspects of mail systems. I also love to self-host stuff, and it motivates me to approach all kinds of challenges when it comes to making things work. As one of the positive side effects, I’m often able to apply the experience I’ve gained privately during my career.

There’s a few things one should consider before diving into selfhosting mail systems for production use. There’s a nice overview of best practices by Phil Pennock which I mostly agree with. A few more points from my own experience:

  • Knowing that mail in itself is a painfully outdated protocol from the early days of the internet.
  • Knowing what an open relay is, and how to avoid it.
  • Familiarity with DNS, TTLs, records, zone files and specifically SPF, DKIM, DMARC.
  • Hosting should only happen in data centers, with dedicated, public IP addresses. Most residential IP spaces are likely blacklisted by default in many spam filters.
  • Knowing how to read mail headers.
  • Knowing about mail-tester.com, how it’s an awesome tool to debug one’s sending capabilities (and how there’s only a few free tries at a time).
  • Monitoring and backups should be in place, or should at least be considered. More below.
  • Researching the new public IP address with abuse DBs before implementation.
  • Knowing how to keep all parts of the system up to date.
  • Knowing that while the whole setup might be under one’s own control, and it’s possible to only allow the most secure TLS ciphers for user logins, mails between servers might still go unencrypted unless encryption is specifically enforced. Enforcing it might in turn lead to deliverability problems if the other side doesn’t have encryption enabled. The Postfix documentation, citing RFC 2487, even states that enforcing encryption “MUST NOT be applied in case of a publicly-referenced [Postfix] SMTP server”. According to Google, about 90% to 93% of mail is encrypted in transit these days.
  • Knowing that mail is not a real-time communications medium, despite appearances. If a receiving mail server is down, the sending server might try resending for 24 – 48 hours before issuing a bounce to the original sender address. Having short downtimes is usually not a big problem with mail servers.
  • Despite doing everything correctly, sent mails might in some cases never arrive, without receiving a bounce message or any other indication something went wrong (looking at you, Microsoft).
  • Bonus points for having a trusted, technical person knowing about the setup with the ability to access the stuff in case of one’s incapacitation or death.
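
For the header-reading point above: most of what I look for sits in the Authentication-Results header the receiving server adds. Here is a minimal sketch, assuming a raw message on stdin. The function name auth_verdicts and the sample header are made up for illustration and are not part of any mail tool.

```shell
#!/bin/sh
# Print the SPF, DKIM and DMARC verdicts found in a message's
# Authentication-Results header(s). Reads the raw message from stdin.
# Hypothetical helper for illustration only.
auth_verdicts() {
    # Unfold continuation lines (leading whitespace) onto their header,
    # keep only Authentication-Results headers, stop at the first blank
    # line (end of headers), then pull out the verdicts.
    awk '
        /^[ \t]/ { if (keep) buf = buf " " $0; next }
        { if (keep) print buf; keep = 0 }
        /^$/ { exit }
        /^[Aa]uthentication-[Rr]esults:/ { keep = 1; buf = $0 }
        END { if (keep) print buf }
    ' | grep -oE '(spf|dkim|dmarc)=[a-z]+'
}

# Made-up sample message; real headers will carry more detail.
printf '%s\n' \
    'Received: from mail.example.org by mx.example.net' \
    'Authentication-Results: mx.example.net;' \
    '    spf=pass smtp.mailfrom=example.org;' \
    '    dkim=pass header.d=example.org;' \
    '    dmarc=pass header.from=example.org' \
    'Subject: test' \
    '' \
    'body' | auth_verdicts
```

For the sample this prints the three verdicts, one per line. Anything other than pass on a mail I sent myself is a reason to look at DNS first.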

Tech Stack

My mail server is hosted inside a KVM / libvirt VM on a dedicated Hetzner server, a hosting provider I can recommend 100%. My OS of choice is Debian stable with backported kernels. An iptables script takes care of internal forwardings from hypervisor to VM. The mail server itself is Postfix, with Dovecot on top, as well as SpamAssassin, OpenDKIM and MySQL. There’s only one inbox with a catchall rule, where each login or service gets its own mail alias. Everything is monitored by Icinga2 and provisioned using Puppet. Since I’m accessing my mails only through either Thunderbird (Desktop) or K-9 Mail (Mobile), there’s no web frontend.

Implementation

While I’m not going into specifics regarding postfix, dovecot, etc., it’s important to mention a few architectural details. The mail server VM (residing as a qcow2 image file inside an encrypted LV, among others) is backed up twice per week using virsh blockcopy and transferred to another remote server. This setup has proven to be quite portable. I’ve since migrated my system to newer dedicated servers several times by just deploying my basic puppet hypervisor role, executing the iptables script, copying the VM to the new server and updating my DNS records. I also like to test dist-upgrades by spinning up a local copy of the VM.

Monitoring is underrated when it comes to selfhosting. This is something I learned soon after the initial deployment, in 2016, when postfix was down for about 14 hours due to carelessness on my part. I’ve since added Icinga2 to all of my systems for internal checks, plus a secondary remote Icinga2 instance on AWS EC2 which monitors the monitoring server (yo dawg…) as well as various TCP ports from the outside.

(Screenshot: my main Icinga2 instance watching over my mail server)

Monitoring mails are delivered to an inbox outside of my mail setup. Same for cron mails. I can recommend an Android app called aNag, which visualizes Icinga2 state changes through push notifications, but I’m not going so far as to add some kind of oncall alerting. If something’s down, it stays down until I have time to fix it – which, so far, has not been the case with my mail server.

Mail rate and spaminess

My low mail throughput is one of the likely reasons my setup has been working well. Even while being subscribed to a bunch of newsletters and services, there’s only about 20 to 40 incoming mails per week. Looking at my sent folder, there’s just about 550 outgoing mails since late 2015.

I’ve had exactly one problem with deliverability during that time, where someone with a Hotmail account complained about never having received my mail – even though the Microsoft server claimed to have accepted it according to my logs. While Microsoft can be notoriously opaque and unforgiving with (not) accepting mail, in this case it turned out to be a blacklisting issue. I had just moved servers and IP addresses shortly before, and the new IP was on an internal MS blacklist. I raised a ticket with their mail infrastructure department, and to my surprise, the IP was cleared soon after.

I rarely ever see any spam. Once every few months I’ll receive a French SEO mail, which is more of a mild curiosity than a bother, and not really worth looking into.

Ongoing maintenance

As mentioned before, I nowadays spend maybe 2 to 5 hours per year on maintenance, perhaps a bit more if a Debian dist-upgrade comes along. Every once in a while I’ll grep through my mail logs out of curiosity, but there are rarely any surprises there. I recommend implementing some kind of auto-upgrade mechanism for security updates as well as subscribing to relevant mailing lists, such as Debian Security.

Alternatives?

Writing all this down, it does seem to be an insanely inconvenient thing to do, but I’ve invested many hours tuning my setup and it seems rock solid at this point. If I were to give up selfhosting, my first choice would be to migrate my domain to mailbox.org. I consider mailbox.org to be one of the most capable and trustworthy mail providers out there. I also recently went through the steps of setting up someone else’s domain with their MX servers, which was very easy.

Conclusion

If you’re like me, an up and coming Linux sysadmin or enthusiast, hosting your own mail server can add lots of valuable experience. And for better or worse, there’s no one else to blame if something goes wrong. And soon one thing leads to another, with additional monitoring, config management, blog posts…

10/10 would selfhost again.

Up-to-date filebeat for 32bit Raspbian (armhf)

Fiddling around with ELK recently, I’ve been setting up a log server. Deploying filebeat to my Raspbian (RPi 2, 3, 4, nano) systems turned out somewhat challenging, mostly since Elastic doesn’t provide official releases for 32bit ARM. There’s been an open ticket since 2018 asking for official ARM builds, and it seems that Elastic is now at least providing .deb packages for 64bit ARM.

This got me thinking, what if I just compile a filebeat armhf binary and repackage the given arm64 .deb file? Turns out, it’s quite easy. Here’s my all-in-one script, tested on x64 Debian 10 and Ubuntu 20.10:

https://gist.github.com/lazywebm/63ce309cffe6483bb5fc2d8a9e7cf50b

The interesting stuff happens in the four last functions. Here’s a rundown:

  • working directory is ~/Downloads/filebeat_armhf
  • get latest golang amd64 package for cross-compiling, extract to working dir, specifically use its bundled go binary (ignore any global installations)
  • get latest filebeat arm64 .deb package
  • clone beats repo, checkout latest release branch
  • build arm (armhf) filebeat binary with new go release
  • repackage given arm64 .deb with new filebeat binary, removing the other binary (filebeat-god, seems to be irrelevant), update md5sums file and control file
  • working dir cleanup

Result of this poor man’s CI (at the time of writing) is a new deb file, ready to be deployed on Raspbian: ~/Downloads/filebeat_armhf/filebeat-7.11.2-armhf.deb

I have some further automation in place, deploying the new deb to a publicly available web server. A small puppet module is taking it from there:

if $facts['os']['distro']['id'] == 'Raspbian' {

  # 'archive' requires the puppet-archive module
  archive { '/root/filebeat-7.11.2-armhf.deb':
    ensure => 'present',
    source => 'https://example.com/filebeat-7.11.2-armhf.deb',
  }

  # the resource title has to match the package name inside the .deb,
  # otherwise dpkg never reports it as installed
  package { 'filebeat':
    ensure   => 'installed',
    provider => 'dpkg',
    source   => '/root/filebeat-7.11.2-armhf.deb',
    require  => Archive['/root/filebeat-7.11.2-armhf.deb'],
  }

  # filebeat config with pcfens-filebeat module here
}

Debian, QEMU, libvirt, qcow2 and fstrim

After some discussion with colleagues on how to best approach fstrim for qcow2 on libvirt in Debian 10, I sat down one Sunday afternoon researching and applying fstrim to my libvirt VMs.

My hypervisors and VMs are mostly running vanilla Debian stable, which is why this post is not necessarily applicable to other distributions – but perhaps somewhat helpful nonetheless.

Directive

The goal was to have my libvirt VMs (around two dozen across two hypervisors) automatically discard unused space from their underlying qcow2 image files. Apart from saving space, I was hoping to take some time off my online backup mechanism, which can take up to four hours for seven VMs on spinning disks. The two main approaches – as far as I can see – are either to add a discard option to a VM’s fstab, or to use a manual fstrim timer provided by Debian. Some more explanation here. I’ll be using a custom cronjob to invoke the fstrim command manually every few days, more on that later.

State of things

All of my VMs’ root filesystems are hosted inside qcow2 images, which I find to be more flexible than using LVM volumes. Some of these VMs have extra data partitions (e.g. blockchain data, apt-mirrors) which don’t need backups and are therefore arranged as LVM volume groups. That’s why I’ll only be looking at setting up fstrim for root partitions (but extending its functionality across all partitions is trivial). Debian 10 ships with QEMU 3.1. Additionally, there’s one Windows 10 VM.

Research

There’s a really helpful post regarding fstrim and KVM by Chris Irwin, written for (I’m guessing) non-Debian hypervisors. I recommend reading it, but here’s a summary:

  • starting with QEMU 4.0, virtio supports the discard option natively
  • no need to add an additional virtio-scsi controller anymore
  • specific VM machine type has to be pc-q35-4.0 and upwards

Executing kvm -machine help on my hypervisor shows support only up to pc-q35-3.1, which is expected with QEMU 3.1:

root@atlas:~# kvm -machine help
Supported machines are:
pc                   Standard PC (i440FX + PIIX, 1996) (alias of pc-i440fx-3.1)
pc-i440fx-3.1        Standard PC (i440FX + PIIX, 1996) (default)
pc-i440fx-3.0        Standard PC (i440FX + PIIX, 1996)
pc-i440fx-2.9        Standard PC (i440FX + PIIX, 1996)
[...]
q35                  Standard PC (Q35 + ICH9, 2009) (alias of pc-q35-3.1)
pc-q35-3.1           Standard PC (Q35 + ICH9, 2009)
pc-q35-3.0           Standard PC (Q35 + ICH9, 2009)
pc-q35-2.9           Standard PC (Q35 + ICH9, 2009)
[...]

Setup part 1

Luckily, Debian is offering QEMU 5.0 through buster-backports as of now (November 2020). After manually upgrading the respective packages, I’m now able to use pc-q35-5.0.

Note: at this point I recommend shutting down all VMs on the hypervisor that’s being worked on.

root@atlas:~# apt update; apt install qemu qemu-block-extra qemu-system-common qemu-system-data qemu-system-gui qemu-system-x86 -t buster-backports
root@atlas:~# kvm -machine help
Supported machines are:
[...]
pc                   Standard PC (i440FX + PIIX, 1996) (alias of pc-i440fx-5.0)
pc-i440fx-5.0        Standard PC (i440FX + PIIX, 1996) (default)
pc-i440fx-4.2        Standard PC (i440FX + PIIX, 1996)
pc-i440fx-4.1        Standard PC (i440FX + PIIX, 1996)
[...]
q35                  Standard PC (Q35 + ICH9, 2009) (alias of pc-q35-5.0)
pc-q35-5.0           Standard PC (Q35 + ICH9, 2009)
pc-q35-4.2           Standard PC (Q35 + ICH9, 2009)
pc-q35-4.1           Standard PC (Q35 + ICH9, 2009)
[...]

Note: depending on your setup and configuration management, it might be recommended to set some kind of apt-pinning for qemu packages from buster-backports, as to not miss any updates.

Now, using virt-manager and enabling its XML editing setting, several things need to be taken care of:

  • for machine type, I’m using q35, which libvirt automatically extends to pc-q35-5.0
<type arch="x86_64" machine="pc-q35-5.0">hvm</type>
  • the discard option needs to be added to the qcow2 driver
<driver name="qemu" type="qcow2" discard="unmap"/>
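
With two dozen VMs it’s easy to miss one, so here is a minimal sketch for checking a domain’s XML for the discard option. It assumes each driver element sits on a single line, the way virsh dumpxml prints it. The function name qcow2_discard_ok is my own invention.

```shell
#!/bin/sh
# Succeed only if every qcow2 <driver> element in a libvirt domain XML
# (read from stdin) carries discard="unmap". Hypothetical helper;
# assumes one <driver .../> element per line.
qcow2_discard_ok() {
    awk '
        /<driver / && /type=.qcow2./ {
            seen = 1
            if ($0 !~ /discard=.unmap./) missing = 1
        }
        END { exit (seen && !missing) ? 0 : 1 }
    '
}

# Intended use on the hypervisor (not executed here):
#   virsh dumpxml vm01 | qcow2_discard_ok && echo "vm01: discard enabled"
```

The function also fails when no qcow2 disk is found at all, which I’d rather hear about than have silently pass.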

By the way, WordPress won’t let me add angle brackets without selecting the ugly default code type, otherwise it thinks it’s HTML code. Can’t really be bothered to explore it further, but makes me think about converting my entire page to static content some time…

Little detour

At this point I had to apply a few more changes to VMs that were apparently created a while ago, with machine types like pc-i440fx-2.8. In order to apply q35 to their configs, libvirt wanted me to change the PCI controller type from pci-root to pcie-root.

<controller type="pci" index="0" model="pcie-root"/>

After booting any of these VMs, their network did not seem to come up again. With adding pcie-root, Debian cheerfully renamed the interface names according to the new PCIe bus on the virtualized systems, breaking their network settings. The naming scheme always went from the original ens0 to enp0s3. Predictable interface names anyone?

It was quickly rectified after manually logging into each machine’s local root console through virt-manager’s VNC connection and editing the network config.

Note: the Windows 10 VM was one of the Old Ones, but bravely handled the PCIe bus change by informing me that it’s now connected to “Network 2”. Whatever that means.

Setup part 2

With my VMs up and running again, fstrim should now be available:

root@proxy:~# fstrim -v /
/: 56.6 GiB (60766765056 bytes) trimmed

Success!

As mentioned earlier, I’ve opted for my own custom cronjob, with a small puppet module wrapped around it.

class js::module::fstrim_kvm {
 
  package { 'virt-what': ensure => installed }
 
  if $facts['virtual'] == 'kvm' {
 
    cron { 'fstrim-root':
      ensure  => present,
      command => '/sbin/fstrim -v / >> /var/log/fstrim.log',
      user    => 'root',
      minute  => [fqdn_rand(30)],
      hour    => '23',
      weekday => [3,7],
      require => Package['virt-what'],
    }
  }
}

The cronjob requires the package virt-what, which puppet uses via its built-in fact virtual to determine whether the host is a KVM (QEMU) VM. The cronjob executes at a per-host random minute between 23:00 and 23:29 (so that the VMs don’t all run it at the same time), twice a week on Wednesday and Sunday, shortly before my VM backups are running. Also, if there’s a log server to pick up log data, having fstrim stats might be (mildly) interesting.
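
For hosts without puppet, the idea behind fqdn_rand can be approximated in plain shell: derive the minute from the host name, so it differs between hosts but stays the same on every run. This is a minimal sketch of that idea, not puppet’s actual algorithm. The function name stable_minute and the example host name are assumptions.

```shell
#!/bin/sh
# Derive a stable pseudo-random number in [0, max) from a host name.
# Mimics the idea behind fqdn_rand(), not its implementation.
stable_minute() {
    host=$1
    max=$2
    # cksum prints "<crc> <byte count>"; use the CRC as a per-host seed.
    printf '%s' "$host" | cksum | awk -v max="$max" '{ print $1 % max }'
}

# Example host name is a placeholder; on a real system use "$(hostname -f)".
stable_minute proxy.example.com 30
```

The result can then be written into the minute field of a crontab entry once, at provisioning time.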

Results

Comparing qcow2 images on the hypervisor before and after fstrim, the images now take up roughly 70% less space (148G down to 43G in total). Very nice.

Before:

total 148G
 28G -rw-r--r--  1 libvirt-qemu libvirt-qemu  28G Nov 21 16:49 vm01.qcow2
 21G -rw-r--r--  1 libvirt-qemu libvirt-qemu 101G Nov 21 16:49 vm02.qcow2
 14G -rw-r--r--  1 libvirt-qemu libvirt-qemu  14G Nov 21 16:49 vm03.qcow2
 53G -rw-r--r--  1 libvirt-qemu libvirt-qemu  53G Nov 21 16:49 vm04.qcow2
 11G -rw-r--r--  1 libvirt-qemu libvirt-qemu  11G Nov 21 16:49 vm05.qcow2
6,6G -rw-r--r--  1 libvirt-qemu libvirt-qemu 6,7G Nov 21 16:49 vm06.qcow2
 17G -rw-r--r--  1 libvirt-qemu libvirt-qemu  17G Nov 21 16:49 vm07.qcow2

After:

total 43G
9,4G -rw-r--r--  1 libvirt-qemu libvirt-qemu  28G Nov 22 13:10 vm01.qcow2
7,1G -rw-r--r--  1 libvirt-qemu libvirt-qemu 101G Nov 22 13:10 vm02.qcow2
5,2G -rw-r--r--  1 libvirt-qemu libvirt-qemu  14G Nov 22 13:10 vm03.qcow2
5,0G -rw-r--r--  1 libvirt-qemu libvirt-qemu  53G Nov 22 13:10 vm04.qcow2
6,0G -rw-r--r--  1 libvirt-qemu libvirt-qemu  11G Nov 22 13:10 vm05.qcow2
2,8G -rw-r--r--  1 libvirt-qemu libvirt-qemu 6,8G Nov 22 13:10 vm06.qcow2
6,5G -rw-r--r--  1 libvirt-qemu libvirt-qemu  17G Nov 22 13:10 vm07.qcow2
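
The savings figure is simple arithmetic on the two “total” lines. Here it is as a small sketch; the function name percent_saved is made up, and the inputs are the totals from the listings.

```shell
#!/bin/sh
# Percentage of space reclaimed, given the allocated size before and after.
percent_saved() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

percent_saved 148 43    # totals of all seven images -> 70.9
percent_saved 53 5.0    # vm04 alone                 -> 90.6
```

The per-VM numbers vary quite a bit, which mostly reflects how much data had been written and deleted inside each guest over time.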

To do

I’m yet to implement fstrim on my Windows VM (if possible), mostly because it’s only one VM with maybe a couple of gigabytes to reclaim. Also I’m too lazy to look into it. If you have a working solution, please drop a comment.

Persistent postfix config inside PHP docker container

One of my recent tasks included migrating an internal PHP-FPM application from a Debian 9 host (with a global PHP 7.0 installation) to a more flexible docker setup. One of the requirements was to retain the ability for the app to send mails to its users, which meant having a local SMTP server directly accessible to the PHP docker instance, and relaying any mails to a server on the outside.

I decided to set up a dockerized PHP-FPM environment through PHP’s official docker repo, using their image tagged as php:7.4-fpm-buster.

After some trial and error regarding proper RUN commands in the Dockerfile, this is what I came up with, which allows for a persistent mail server setup inside the PHP-FPM container.

FROM php:7.4-fpm-buster
 
ENV TZ="Europe/Berlin"
RUN echo "date.timezone = Europe/Berlin" > /usr/local/etc/php/conf.d/timezone.ini
RUN date
 
RUN echo "postfix postfix/mailname string internalapp.example.com" | debconf-set-selections
RUN echo "postfix postfix/main_mailer_type string 'Internet Site'" | debconf-set-selections
 
RUN apt-get update && apt-get install -y postfix libldap2-dev libbz2-dev \
    && docker-php-ext-install bcmath ldap bz2
 
RUN postconf -e "myhostname = internalapp.example.com"
RUN postconf -e "relayhost = 172.18.0.1"
RUN /etc/init.d/postfix restart

Of course, “internalapp.example.com” is just a placeholder for the actual service URL. It’s important to set the postfix variables early through debconf-set-selections to allow for a promptless postfix installation later on, otherwise the container build gets stuck. I’ve also had to manually set the time zone, confirming its correctness by printing date during the build. Keep in mind that RUN instructions execute at image build time, so the final restart does not by itself leave a postfix process running in the started container.

The relayhost is just the docker host itself, which is – in this case – running a postfix as well. Since I want it to act as a relay for my dockerized app, I’ve had to edit /etc/postfix/main.cf, allowing relay access from my docker network (which has been explicitly persisted in its docker-compose.yml):

mynetworks = 127.0.0.0/8 [::ffff:127.0.0.0]/104 [::1]/128 172.18.0.0/24

One advantage of using the host mail server as a relay is that everything gets logged in its local mail.log, which might be helpful for further debugging or auditing.

Keeping latest kernels in Debian with backports and puppet

I like running Debian stable as well as making use of recent kernels. Since I’m managing most of my infrastructure using puppet, I came up with a simple module which is included in my baseline role deployed on all systems.

The puppet apt module is needed here.

class js::module::kernel_update {

  class { 'apt':
    update => {
      frequency => 'daily',
    },
  }

  $codename = $facts['os']['distro']['codename']

  # track the backports kernel on amd64 hosts running stretch or buster
  if $facts['os']['architecture'] == 'amd64' and $codename in ['stretch', 'buster'] {
    package { 'linux-image-amd64':
      ensure          => latest,
      install_options => ['-t', "${codename}-backports"],
    }
  }
}

Naturally the backports repo needs to be included for this to work. My sources.list.erb (also included in the baseline role) looks like this:

<% if @os['distro']['id'] == 'Debian' -%>

deb http://aptmirror/debian/ <%= @os['distro']['codename'] %> main contrib non-free
deb http://aptmirror/debian-security/ <%= @os['distro']['codename'] %>/updates main contrib non-free
deb http://aptmirror/debian/ <%= @os['distro']['codename'] %>-updates main contrib non-free
deb http://aptmirror/debian/ <%= @os['distro']['codename'] %>-backports main contrib non-free
deb http://apt.puppetlabs.com <%= @os['distro']['codename'] %> puppet

<% end -%>

Just replace ‘aptmirror‘ with an apt mirror to your liking. Or run one yourself.

Moving from Firefox ESR to Firefox Quantum, or bye RequestPolicy

When Firefox Quantum was released last fall I switched to the ESR branch, currently on v52.7.3. My main – and pretty much only – reason for not using Quantum until now was incompatibilities with addons not written as native WebExtensions. It’s been over six months since Quantum’s initial release, and as more WebExtension addons are available, I wanted to see if I’d be comfortable with moving on as well.

First of all, Quantum feels much faster than the old Firefox, even with a dozen enabled addons. My main concern was with RequestPolicy Continued, which I used for years to build my own whitelist in order to keep out as much browser tracking as possible. Since there is still no WebExtension port, I started exploring other addons and found that uBlock Origin is capable of everything RequestPolicy can do. I’ve used UO on Firefox before, but only as a general adblock addon with default settings. By denying any 3rd-party resources globally while using the default filter lists for blocking undesired 1st-party content, uBlock Origin has broader capabilities than RequestPolicy. Here’s a nice explanation. But since there’s no way to export my RP whitelist to UO, I had to start over – which was not as painful as I initially feared. UO is a lot more effective in building a global whitelist for Firefox. The UO github has good explanations of its different blocking modes.

 

Here’s what RequestPolicy Continued on Firefox ESR (52.7.3) vs. uBlock Origin in Hard Mode with Firefox 59.0.2 looks like on heise.de.

        

 

UO is globally rejecting any 3rd-party resource by default and I can create my whitelist on each website below. Note the yellow indicator, which applies the common blocklists to all 1st-party resources. In addition, I disabled web fonts globally in UO (bottom right indicator) which renders websites a little less pretty, but works for me so far.

I had no problem migrating my NoScript whitelist, since it already has a WebExtension port. A few other great privacy-related addons for Quantum include Cookie AutoDelete and Privacy Settings. There’s also an addon disabling Referrers globally, but it’s missing some functionality from RefControl, which I used before.

 

Overall, I’m happy with migrating to Firefox Quantum. It’s faster, less resource-hungry and I was able to transfer all of my privacy-related workflows.

Upgrading to Debian Stretch & fixing Cacti thold notification mails

With the upgrade from Debian Jessie to Stretch, the Cacti package went from 0.8.8b to 0.8.8h. A problem I had – and apparently a few other people, according to the Cacti forums – was that Cacti 0.8.8h in combination with the thold v0.5 plugin and Stretch refused to send “Downed device notifications”, or threshold warnings in general. Sending test emails with the Cacti settings plugin worked just fine, but that was it.

The issue lies with the split() function, which had been deprecated for a while and was removed in PHP 7. Cacti logged the following error:

PHP Fatal error:  Uncaught Error: Call to undefined function split() in /usr/share/cacti/site/plugins/thold/includes/polling.php:28

To fix the problem and have Cacti send mails again, simply replace split() with explode() in polling.php:

sed -i -e 's/split(/explode(/g' /usr/share/cacti/site/plugins/thold/includes/polling.php
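
One caveat with the one-liner: the pattern split( also matches inside names like preg_split( or str_split(, should the file contain any. Also, split() took a regular expression while explode() takes a literal delimiter, so the swap is only safe where the delimiter is a plain string. A slightly more careful variant, as a sketch; the function name replace_split and the sample line are made up and not taken from polling.php.

```shell
#!/bin/sh
# Replace calls to split() with explode(), leaving preg_split(),
# str_split() and similar names alone. Reads PHP source on stdin.
replace_split() {
    # Only match "split(" at line start or after a character that
    # cannot be part of a longer function name.
    sed -E 's/(^|[^_[:alnum:]])split\(/\1explode(/g'
}

# Made-up sample line for illustration.
printf '%s\n' '$a = split(",", $line); $b = preg_split("/x/", $line);' | replace_split
# prints: $a = explode(",", $line); $b = preg_split("/x/", $line);
```

To apply it to the real file, the same expression can be passed to sed -i -E in place of the original one.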

Upgrading to Debian Stretch with dovecot, postfix & opendkim

Debian Stretch is about to be released. I’m already upgrading some of my systems, and want to document a few issues I encountered after upgrading my mail server from Debian Jessie to Stretch.

 

Dovecot forgot what’s SSLv2

Before the upgrade, dovecot was configured to reject login attempts with SSLv2 & SSLv3. The corresponding line in /etc/dovecot/dovecot.conf looked like this:

ssl_protocols = !SSLv3 !SSLv2

After upgrading, logging into the mail server failed. The syslog showed:

dovecot: imap-login: Fatal: Invalid ssl_protocols setting: Unknown protocol 'SSLv2'

With the upgrade to Stretch and openssl 1.1.0, support for SSLv2 was dropped entirely. Dovecot simply doesn’t recognize the argument anymore. Editing dovecot.conf helped.

ssl_protocols = !SSLv3

opendkim using file based sockets (Update 2017-10-13)

UPDATE – previous releases of opendkim on Stretch (v2.11.0) were affected by a bug, ignoring its own config file. See the Debian bug report.

The correct way to (re)configure the systemd daemon is to edit the default conf and regenerate the systemd config.

vi /etc/default/opendkim
# listen on loopback on port 12301:
SOCKET=inet:12301@localhost
/lib/opendkim/opendkim.service.generate
systemctl daemon-reload; systemctl restart opendkim

Tell postfix to use the TCP socket again, if necessary.

vi /etc/postfix/main.cf
# DKIM config
milter_protocol = 2
milter_default_action = accept
smtpd_milters = inet:localhost:12301
non_smtpd_milters = inet:localhost:12301
systemctl restart postfix

This should do it.

——————————————————–

Before the upgrade, opendkim (v2.9.2) was configured as an initd service using loopback to connect to postfix.

/etc/default/opendkim

SOCKET="inet:12301@localhost" # listen on loopback on port 12301

/etc/postfix/main.cf

# DKIM config
milter_protocol = 2
milter_default_action = accept
smtpd_milters = inet:localhost:12301
non_smtpd_milters = inet:localhost:12301
root@host:~# systemctl status opendkim
opendkim.service - LSB: Start the OpenDKIM service
   Loaded: loaded (/etc/init.d/opendkim)
   Active: active (running) since Mi 2017-05-31 15:23:34 CEST; 6 days ago
  Process: 715 ExecStart=/etc/init.d/opendkim start (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/opendkim.service
           ├─791 /usr/sbin/opendkim -x /etc/opendkim.conf -u opendkim -P /var/run/opendkim/opendkim.pid
           └─796 /usr/sbin/opendkim -x /etc/opendkim.conf -u opendkim -P /var/run/opendkim/opendkim.pid

During the system upgrade, the opendkim daemon was reconfigured as a native systemd daemon, which meant /etc/default/opendkim and /etc/init.d/opendkim became obsolete, even though I was asked to install the new package maintainer’s version of /etc/default/opendkim.

Now the opendkim (v2.11.0) systemd daemon looked like this:

opendkim.service - OpenDKIM DomainKeys Identified Mail (DKIM) Milter
   Loaded: loaded (/lib/systemd/system/opendkim.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/opendkim.service.d
           └─override.conf
   Active: active (running) since Wed 2017-06-07 13:10:15 CEST; 23s ago
 Main PID: 4806 (opendkim)
    Tasks: 7 (limit: 4915)
   CGroup: /system.slice/opendkim.service
           ├─4806 /usr/sbin/opendkim -P /var/run/opendkim/opendkim.pid -p local:/var/run/opendkim/opendkim.sock
           └─4807 /usr/sbin/opendkim -P /var/run/opendkim/opendkim.pid -p local:/var/run/opendkim/opendkim.sock

I tried editing /etc/postfix/main.cf & adding the postfix user to the opendkim group to reflect the changes:

# DKIM config
milter_protocol = 2
milter_default_action = accept
smtpd_milters = local:/var/run/opendkim/opendkim.sock
non_smtpd_milters = local:/var/run/opendkim/opendkim.sock
root@host:~# adduser postfix opendkim

Restarting opendkim & postfix, the connection still failed to work.

postfix/smtpd[4451]: warning: connect to Milter service local:/var/run/opendkim/opendkim.sock: No such file or directory

Some research revealed that postfix chroots its processes to /var/spool/postfix (didn’t know that), so the socket path from main.cf is looked up below that directory. To reflect this, I created new subdirectories and edited the systemd daemon.

root@host:~# mkdir -p /var/spool/postfix/var/run/opendkim
root@host:~# chown -R opendkim:opendkim /var/spool/postfix/var
root@host:~# systemctl edit opendkim
[Service]
ExecStart=
ExecStart=/usr/sbin/opendkim -P /var/run/opendkim/opendkim.pid -p local:/var/spool/postfix/var/run/opendkim/opendkim.sock

Note that the double ExecStart isn’t a typo.

After restarting all affected services, my sent mails were getting a valid DKIM signature again.

opendkim[11357]: OpenDKIM Filter v2.11.0 starting (args: -P /var/run/opendkim/opendkim.pid -p local:/var/spool/postfix/var/run/opendkim/opendkim.sock)
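
The path mapping is what tripped me up, so here it is spelled out as a tiny sketch. The function name path_outside_chroot is made up and not a postfix tool. It only shows where a path seen from inside the chroot ends up on the real filesystem.

```shell
#!/bin/sh
# Map a path as seen by a chrooted process to the real path on the host.
# Hypothetical helper to illustrate the lookup.
path_outside_chroot() {
    chroot_dir=${1%/}    # strip one trailing slash, if any
    inner_path=$2
    printf '%s%s\n' "$chroot_dir" "$inner_path"
}

# main.cf says local:/var/run/opendkim/opendkim.sock, postfix is chrooted
# to /var/spool/postfix, so opendkim has to create its socket here:
path_outside_chroot /var/spool/postfix /var/run/opendkim/opendkim.sock
# prints: /var/spool/postfix/var/run/opendkim/opendkim.sock
```

That is the path passed to opendkim via -p in the systemd override, while main.cf keeps the shorter path.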

Encrypt an existing Linux installation with LUKS and LVM

An issue I encountered recently – how to encrypt an existing Xubuntu setup. There are several ways to achieve this. I want to document the process I used.

I’m working with the following assumptions:

  • The Linux installation to be encrypted is the only OS on disk.
  • The system is a (X)Ubuntu or similar (Debian). Commands, paths to config files or package names might differ in other distributions.
  • The system is EFI-enabled. This means there is a 512 MiB FAT partition at the beginning of the disk, containing the EFI loader. This partition has to remain untouched. If your system is using legacy boot, ignore instructions regarding EFI later on.
  • A Live Linux USB stick (e.g. Xubuntu 16.10) and a separate hard disk with at least the same size as the system drive are available and ready. When in doubt, use a disk which is larger than the system drive.
  • The entire process takes time.
  • Mistakes happen. Be ready to lose data from the installed system! Ideally, there are multiple recent backups in place.

Before booting from the USB linux, prepare the Linux system by installing necessary packages & latest updates.

root@host:~# apt update; apt upgrade; apt install cryptsetup pv lvm2 gparted

Remove old kernel images. This might take a while, depending on the age of the Linux installation.

root@host:~# apt autoclean; apt autoremove

Shut down the computer, connect the USB stick and the second hard drive. Boot into the live system. Make sure your keyboard layout is set correctly, since you will be typing the encryption password with it.

root@live:~# dpkg-reconfigure keyboard-configuration

Install necessary packages on the live system as well.

root@live:~# apt update; apt install cryptsetup pv lvm2 gparted

Annoyingly, my live system auto-mounted the old system disk. Unmount if necessary.

Use fdisk -l to check the order of drives. In my case, sda is the old system disk, sdb is the USB stick, sdc is the second hard drive. Use dd to copy the entire system disk to the second drive, with pv monitoring progress. Double-check the if= and of= arguments so you don’t overwrite your system.

root@live:~# dd if=/dev/sda | pv --progress --eta --bytes --rate | dd of=/dev/sdc
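
Before starting the copy, it might be worth checking that the second drive really is large enough. A minimal sketch, assuming the device names from above; target_fits is a hypothetical helper that only compares two byte counts.

```shell
# Succeeds only if the target ($2, bytes) can hold the whole source ($1, bytes).
target_fits() {
    [ "$2" -ge "$1" ]
}

# Usage on the live system (needs root; device names as in the text above):
#   src=$(blockdev --getsize64 /dev/sda)
#   dst=$(blockdev --getsize64 /dev/sdc)
#   target_fits "$src" "$dst" || echo "target too small, do not run dd" >&2
```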

When finished, open gparted and choose your system disk.

Delete the root and swap partition, create a new boot partition (512MiB, ext4, set boot / esp flags) and create a “cleared” partition from remaining available space. Leave the EFI partition untouched. Note: if there’s no EFI boot partition, format the entire disk and create partitions as described.

The result, in my case: the untouched EFI partition (sda1), the new boot partition (sda2) and the cleared partition (sda3).

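Consider securely erasing the old system partition. It takes time, but leaves no trace of unencrypted data on the system drive. The command below maps the partition with a throwaway random key; anything written through that mapping ends up on disk as random-looking data.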

root@live:~# cryptsetup open --type plain /dev/sda3 container --key-file /dev/urandom
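
Opening the mapping alone doesn't overwrite anything yet. A common way to finish the wipe is to write zeros through the mapping and close it afterwards. The sketch below is an assumption on my part, not something I recorded at the time: wipe_partition and DRY_RUN are hypothetical names, and by default the function only prints the commands.

```shell
# Print (default) or execute the wipe steps for one partition.
# Set DRY_RUN=0 to really run them. This destroys all data on $1.
wipe_partition() {
    part=$1
    run=${DRY_RUN:-1}
    for cmd in \
        "cryptsetup open --type plain $part container --key-file /dev/urandom" \
        "dd if=/dev/zero of=/dev/mapper/container bs=1M status=progress" \
        "cryptsetup close container"
    do
        if [ "$run" = 1 ]; then echo "$cmd"; else $cmd; fi
    done
}

# Show the plan for the cleared partition without touching the disk.
DRY_RUN=1 wipe_partition /dev/sda3
```

dd is expected to stop with a “No space left on device” message once the partition is full. The mapping has to be closed before luksFormat can use the partition.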

Proceed to create the encrypted volume on the cleared partition and choose a strong password.

root@live:~# cryptsetup luksFormat -c aes-xts-plain64:sha512 -s 512 /dev/sda3

Open the encrypted volume.

root@live:~# cryptsetup luksOpen /dev/sda3 encrypted_system

Create a LVM volume group and logical volumes on top of the opened LUKS volume. Note: tempo is the name I chose. Feel free to use another name for the volume group, but keep it consistent.

root@live:~# pvcreate /dev/mapper/encrypted_system
root@live:~# vgcreate tempo /dev/mapper/encrypted_system
root@live:~# lvcreate -L 8G tempo -n swap
root@live:~# lvcreate -l 100%FREE tempo -n root

Set up the swap and root volume.

root@live:~# mkswap /dev/mapper/tempo-swap
root@live:~# mkfs.ext4 /dev/mapper/tempo-root

Mount the new root volume to /mnt.

root@live:~# mount /dev/mapper/tempo-root /mnt

Mount the old root partition, which has been copied to the second drive.

root@live:~# mkdir -p /media/old_root
root@live:~# mount /dev/sdc3 /media/old_root/

Navigate to the old root directory and use tar to copy the root system to the new LVM volume. The command doesn’t compress file input but redirects it to stdout. The output is then piped to the 2nd command where tar reads it from stdin. This way, all file & system attributes are preserved.

root@live:~# cd /media/old_root/
root@live:~# tar cvf - . | tar xf - -C /mnt/
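
To see the tar pipe in isolation, here is a small sketch using throwaway directories instead of /media/old_root and /mnt. The file names are made up, and the -p flag is my addition so that permissions are kept even when the demo isn’t run as root. stat -c assumes GNU coreutils.

```shell
# Throwaway directories stand in for the old and the new root.
src=$(mktemp -d)
dst=$(mktemp -d)
trap 'rm -rf "$src" "$dst"' EXIT

printf 'secret\n' > "$src/shadow.example"
chmod 640 "$src/shadow.example"
ln -s shadow.example "$src/link.example"

# Same pipe as above, minus -v: the first tar writes an archive to stdout,
# the second one unpacks it from stdin into the target directory.
(cd "$src" && tar cf - .) | tar xpf - -C "$dst"

# Mode and symlink should have survived the copy.
stat -c '%a %n' "$dst/shadow.example"
ls -l "$dst/link.example"
```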

When finished, delete all contents from the boot directory, since this will be the mount point for the new boot partition. Use the piped tar command to copy the contents from the second drive. Mount the EFI partition as well.

root@live:~# rm -rf /mnt/boot/*
root@live:~# mount /dev/sda2 /mnt/boot
root@live:~# cd /media/old_root/boot/
root@live:~# tar cvf - . | tar xf - -C /mnt/boot/
root@live:~# mount /dev/sda1 /mnt/boot/efi

Get the UUID of the encrypted LUKS volume. We need this later on.

root@live:~# blkid /dev/sda3
/dev/sda3: UUID="0f348572-6937-410f-8e04-1b760d5d11fe" TYPE="crypto_LUKS" PARTUUID="85f58482-8b18-446a-8cb6-cfdfe30c7d55"

Prepare the new root system in /mnt for chroot.

root@live:~# for dir in /dev /dev/pts /proc /sys /run; do mount --bind $dir /mnt/$dir; done
root@live:~# chroot /mnt

In the chrooted environment, we need to create or edit several config files to tell Linux where to look for the LVM swap / root volumes and how to open them. Create /etc/crypttab with the name of the volume group (tempo in my case) and the LUKS UUID we got earlier.

# <target name>  <source device>  <key file>  <options>
encrypted_system  UUID=0f348572-6937-410f-8e04-1b760d5d11fe  none  luks,discard,lvm=tempo
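
To avoid copy and paste mistakes with the UUID, the line could also be generated. A minimal sketch: crypttab_line is a hypothetical helper, and the UUID is hard-coded here so the snippet runs anywhere.

```shell
# Build one crypttab line from its parts.
# $1 = mapping name, $2 = LUKS UUID, $3 = volume group
crypttab_line() {
    printf '%s  UUID=%s  none  luks,discard,lvm=%s\n' "$1" "$2" "$3"
}

# On the live system the UUID could be read directly:
#   uuid=$(blkid -s UUID -o value /dev/sda3)
uuid=0f348572-6937-410f-8e04-1b760d5d11fe
crypttab_line encrypted_system "$uuid" tempo
```

Inside the chroot, the output could be redirected to /etc/crypttab.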

Create a file named /etc/initramfs-tools/conf.d/cryptroot in the chrooted environment. Replace tempo with the name of your volume group and the UUID with that of your LUKS partition.

CRYPTROOT=target=tempo-root,source=/dev/disk/by-uuid/0f348572-6937-410f-8e04-1b760d5d11fe

Run the following command in the chrooted environment. It should pass without issues.

root@live:~# update-initramfs -k all -c

Open /etc/default/grub in the chrooted environment. Find this line:

GRUB_CMDLINE_LINUX=""

Insert the appropriate values (volume group name, LUKS UUID):

GRUB_CMDLINE_LINUX="cryptopts=target=tempo-root,source=/dev/disk/by-uuid/0f348572-6937-410f-8e04-1b760d5d11fe,lvm=tempo"
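
The same edit could be scripted. The sketch below is only an illustration: set_grub_cmdline is a hypothetical helper, it keeps a .bak copy of the file, and the demo runs against a temporary file rather than the real /etc/default/grub.

```shell
# Rewrite the GRUB_CMDLINE_LINUX line in a grub defaults file.
# $1 = file, $2 = volume group, $3 = LUKS UUID
set_grub_cmdline() {
    opts="cryptopts=target=$2-root,source=/dev/disk/by-uuid/$3,lvm=$2"
    # The pattern ends in "=", so GRUB_CMDLINE_LINUX_DEFAULT is left alone.
    sed -i.bak "s|^GRUB_CMDLINE_LINUX=.*|GRUB_CMDLINE_LINUX=\"$opts\"|" "$1"
}

# Demo on a scratch file with the two typical lines.
demo=$(mktemp)
printf 'GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\nGRUB_CMDLINE_LINUX=""\n' > "$demo"
set_grub_cmdline "$demo" tempo 0f348572-6937-410f-8e04-1b760d5d11fe
cat "$demo"
rm -f "$demo" "$demo.bak"
```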

Update grub in the chrooted environment. It will read arguments from /etc/default/grub and create new boot entries.

root@live:~# update-grub

Open /etc/fstab in the chrooted environment. Update the entry for the encrypted root and swap volume. Use blkid to find the UUID of the new boot partition. Leave the EFI partition entry untouched. My new fstab looks like this:

UUID=2886e598-0d5c-4576-87e7-a234011e7725	/boot		ext4	defaults		0	2
UUID=E2F4-2888					/boot/efi	vfat	umask=0077		0	3
/dev/mapper/tempo-root				/		ext4	errors=remount-ro	0	1
/dev/mapper/tempo-swap				none		swap	sw			0	0

That’s it. Close the chrooted environment and shut down the computer. Remove the USB stick and second hard drive. A password prompt should appear during boot. If everything goes well, the newly encrypted system will boot. Check if all partitions are mounted correctly. Reboot again to check if recovery mode is working as well. Note that you still have an exact copy of your system from before the encryption on the second hard drive. After verifying the encrypted system is working as intended, you might want to consider securely erasing the unencrypted disk.
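
Closing the chrooted environment cleanly means leaving it with exit and then unmounting everything in reverse order. A sketch of that order, assuming the mount points and names used above; teardown_plan is a hypothetical helper that only prints the commands.

```shell
# Print the teardown commands in reverse mount order.
# $1 = new root mount point, $2 = volume group, $3 = LUKS mapping name
teardown_plan() {
    root=$1; vg=$2; luks=$3
    # Bind mounts first (innermost first), then boot/efi, boot and the root.
    for dir in /run /sys /proc /dev/pts /dev /boot/efi /boot ""; do
        echo "umount $root$dir"
    done
    echo "vgchange -an $vg"          # deactivate the LVM volumes
    echo "cryptsetup luksClose $luks" # lock the LUKS volume again
}

teardown_plan /mnt tempo encrypted_system
```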

PGP key generation – increase system entropy

While creating a new PGP key pair using Enigmail, the progress bar seems stuck, and there’s no CPU activity.

The problem – missing entropy for /dev/random. Take a look at the available kernel entropy:

user@host:~# watch -n 0.2 cat /proc/sys/kernel/random/entropy_avail

If the number stays below, say, 300, PGP can’t find enough random data through /dev/random and won’t generate keys. There’s still /dev/urandom, which Enigmail/PGP apparently ignores. So in order to generate acceptable levels of entropy for /dev/random and Enigmail, I’m installing haveged, a “random number generator feeding Linux’s random device”.

user@host:~# sudo apt install haveged
user@host:~# sudo systemctl enable haveged.service
user@host:~# sudo systemctl start haveged.service

Now my system’s available entropy is at 1800, enough for Enigmail to generate my PGP keys.
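
The check can also be done once instead of watching the counter. A minimal sketch: entropy_ok is a hypothetical helper, and the threshold of 300 is just the rough figure mentioned above.

```shell
# Succeeds if the entropy counter stored in file $1 is at least $2.
entropy_ok() {
    avail=$(cat "$1") || return 2
    [ "$avail" -ge "$2" ]
}

pool=/proc/sys/kernel/random/entropy_avail
if [ -r "$pool" ]; then
    if entropy_ok "$pool" 300; then
        echo "entropy looks sufficient: $(cat "$pool")"
    else
        echo "entropy is low: $(cat "$pool")"
    fi
fi
```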
