Running a private mail server for six years, easy peasy

TL;DR – A high-level overview of running my own small, private Linux mail server since late 2015. I’ve encountered surprisingly few issues and learned a lot along the way. The initial setup (including monitoring, backups, configuration management) took some time, but recurring maintenance since then has been an estimated 10 – 20 minutes per month. Worth it for me, but probably not for most people. The next best thing, in my opinion, is mailbox.org with one’s own TLD.

Motivation

This writeup is meant as a response to the sentiment I keep reading in IT forums: that selfhosting mail is “very time demanding”, “impossible to maintain”, “a pain to make sure your mails are being delivered”. I understand the reasons, and tend to agree when it comes to large-ish selfhosted mail deployments with hundreds of users and tens of thousands of mails per day, which also happens to be part of my current day job. It’s true that many IT people understandably don’t want to invest private time into something that appears to be just another kind of work assignment. But personally, it fills me with satisfaction to self-host my own infrastructure, my little internet island where I’m root, especially in times of mega corporations trying (and succeeding) to redefine “the internet” as a portfolio of services only they can offer, with little alternative.

The Right Reasons

As mentioned above, I’m working in Linux administration / engineering and know my way around technical aspects of mail systems. I also love to self-host stuff, and it motivates me to approach all kinds of challenges when it comes to making things work. As one of the positive side effects, I’m often able to apply the experience I’ve gained privately during my career.

There are a few things one should consider before diving into selfhosting mail systems for production use. There’s a nice overview of best practices by Phil Pennock which I mostly agree with. A few more points from my own experience:

  • Knowing that mail in itself is a painfully outdated protocol from the early days of the internet.
  • Knowing what an open relay is, and how to avoid it.
  • Familiarity with DNS, TTLs, records, zone files and specifically SPF, DKIM, DMARC.
  • Hosting should only happen in data centers, with dedicated, public IP addresses. Most residential IP space is likely blacklisted by default in many spam filters.
  • Knowing how to read mail headers.
  • Knowing about mail-tester.com, how it’s an awesome tool to debug one’s sending capabilities (and how there’s only a few free tries at a time).
  • Monitoring and backups should be in place, or should at least be considered. More below.
  • Researching the new public IP address with abuse DBs before implementation.
  • Knowing how to keep all parts of the system up to date.
  • Knowing that while the whole setup might be under one’s own control, and it’s possible to only allow the most secure TLS ciphers for user logins, mails between servers might still go unencrypted unless it’s specifically enforced, which might lead to deliverability problems if the other side doesn’t have encryption enabled. RFC2487 even states that enforcing encryption “MUST NOT be applied in case of a publicly-referenced [Postfix] SMTP server”. According to Google, about 90% to 93% of mail is encrypted in transit these days.
  • Knowing that mail is not a real-time communications medium, despite appearances. If a receiving mail server is down, the sending server might try resending for 24 – 48 hours before issuing a bounce to the original sender address. Having short downtimes is usually not a big problem with mail servers.
  • Despite doing everything correctly, sent mails might in some cases never arrive, without receiving a bounce message or any other indication something went wrong (looking at you, Microsoft).
  • Bonus points for having a trusted, technical person who knows about the setup and is able to access everything in case of one’s incapacitation or death.
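To make the DNS point above concrete: a minimal set of SPF, DKIM and DMARC records for a hypothetical example.com zone might look like this (selector, key and policy are placeholders, not my actual setup):

```
example.com.                       IN TXT "v=spf1 mx -all"
myselector._domainkey.example.com. IN TXT "v=DKIM1; k=rsa; p=<public key>"
_dmarc.example.com.                IN TXT "v=DMARC1; p=quarantine; rua=mailto:postmaster@example.com"
```

The SPF record here only allows the domain’s own MX hosts to send; the DMARC rua address receives aggregate reports, which are handy for spotting deliverability problems early.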

Tech Stack

My mail server is hosted inside a KVM / libvirt VM on a dedicated Hetzner server, a hosting provider I can recommend 100%. My OS of choice is Debian stable with backported kernels. An iptables script takes care of internal forwardings from hypervisor to VM. The mail server itself is Postfix, with Dovecot on top, as well as SpamAssassin, OpenDKIM and MySQL. There’s only one inbox with a catchall rule, where each login or service gets its own mail alias. Everything is monitored by Icinga2 and provisioned using Puppet. Since I’m accessing my mails only through either Thunderbird (desktop) or K-9 Mail (mobile), there’s no web frontend.
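My aliases actually live in MySQL, but the catchall idea can be sketched with a plain Postfix virtual alias map (domain and addresses are placeholders):

```
# /etc/postfix/main.cf
virtual_alias_domains = example.com
virtual_alias_maps    = hash:/etc/postfix/virtual

# /etc/postfix/virtual -- specific aliases first, catchall last
shop@example.com   inbox@example.com
@example.com       inbox@example.com
```

After editing the map, `postmap /etc/postfix/virtual` regenerates the hash file Postfix actually reads.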

Implementation

While I’m not going into specifics regarding postfix, dovecot, etc., it’s important to mention a few architectural details. The mail server VM (residing as a qcow2 image file inside an encrypted LV, among others) is backed up twice per week using virsh blockcopy and transferred to another remote server. This setup has proven to be quite portable. I’ve since migrated my system to newer dedicated servers several times by just deploying my basic puppet hypervisor role, executing the iptables script, copying the VM to the new server and updating my DNS records. I also like to test dist-upgrades by spinning up a local copy of the VM.
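The backup step, roughly, as a sketch (domain, disk and target names are hypothetical, and this is not my exact script; note that virsh blockcopy only works on transient domains, hence the undefine/define dance):

```
# keep the domain definition so it can be restored afterwards
virsh dumpxml mailvm > /tmp/mailvm.xml
# a running persistent domain must become transient for blockcopy
virsh undefine mailvm
# copy the live disk into a standalone qcow2 image
virsh blockcopy mailvm vda /backup/mailvm-$(date +%F).qcow2 --wait --verbose --finish
# make the domain persistent again
virsh define /tmp/mailvm.xml
# ship the image to the remote backup host
rsync -a /backup/mailvm-$(date +%F).qcow2 backuphost:/srv/backups/
```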

Monitoring is underrated when it comes to selfhosting. I learned this soon after the initial deployment, in 2016, when postfix was down for about 14 hours due to carelessness on my part. I’ve since added Icinga2 to all of my systems for internal checks, as well as a secondary remote AWS EC2 Icinga2 instance for monitoring the monitoring server (yo dawg…) and various TCP ports from the outside.

my main Icinga2 instance watching over my mail server

Monitoring mails are delivered to an inbox outside of my mail setup. Same for cron mails. I can recommend an Android app called aNag, which visualizes Icinga2 state changes through push notifications, but I’m not going so far as to add some kind of oncall alerting. If something’s down, it stays down until I have time to fix it – which, so far, has not been the case with my mail server.
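For reference, a minimal pair of Icinga2 service definitions for such a mail host could look like this (the host name is a placeholder; the check commands come from the standard Icinga2 template library, my actual checks are more elaborate):

```
object Service "smtp" {
  host_name     = "mailserver"
  check_command = "smtp"
}

object Service "imap" {
  host_name     = "mailserver"
  check_command = "imap"
}
```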

Mail rate and spaminess

My low mail throughput is one of the likely reasons my setup has been working well. Even while being subscribed to a bunch of newsletters and services, there’s only about 20 to 40 incoming mails per week. Looking at my sent folder, there’s just about 550 outgoing mails since late 2015.

I’ve had exactly one problem with deliverability during that time, where someone with a Hotmail account complained that they never received my mail – even though the Microsoft server claimed to have accepted it according to my logs. While Microsoft can be notoriously opaque and unforgiving with (not) accepting mail, in this case it turned out to be a blacklisting issue. I had moved servers and IP addresses shortly before, and the new IP was on an internal MS blacklist. I raised a ticket with their mail infrastructure department, and to my surprise, the IP was cleared soon after.

I rarely ever see any spam. Once every few months I’ll receive a French SEO mail, which is more of a mild curiosity than a bother, and not really worth looking into.

Ongoing maintenance

As mentioned before, I nowadays spend maybe 2 to 5 hours per year on maintenance, perhaps a bit more if a Debian dist-upgrade comes along. Every once in a while I’ll grep through my mail logs out of curiosity, but there’s rarely any surprises there. I recommend implementing some kind of auto-upgrade mechanism for security updates, as well as subscribing to various mailing lists, such as Debian Security.
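On Debian, the standard auto-upgrade mechanism is the unattended-upgrades package, which covers the security archive by default. After installing it, enabling it comes down to this apt configuration (usually written by `dpkg-reconfigure -plow unattended-upgrades`):

```
# /etc/apt/apt.conf.d/20auto-upgrades
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";
```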

Alternatives?

Writing all this down, it does seem like an insanely inconvenient thing to do, but I’ve invested many hours tuning my setup and it seems rock solid at this point. If I were to give up selfhosting, my first choice would be to migrate my TLD to mailbox.org. I consider mailbox.org to be one of the most capable and trustworthy mail providers out there. I also recently went through the steps of setting up someone else’s TLD with their MX servers, which was very easy.

Conclusion

If you’re like me, an up-and-coming Linux sysadmin or enthusiast, hosting one’s own mail server can add lots of valuable experience. And for better or worse, there’s no one else to blame if something goes wrong. Soon one thing leads to another, with additional monitoring, config management, blog posts …

10/10 would selfhost again.

Up-to-date filebeat for 32bit Raspbian (armhf)

Fiddling around with ELK recently, I’ve been setting up a log server. Deploying filebeat to my Raspbian (RPi 2, 3, 4, nano) systems turned out somewhat challenging, mostly because Elastic doesn’t provide official releases for 32bit ARM. There’s been an open ticket since 2018 asking for official ARM builds, and it seems that Elastic is now at least providing .deb packages for 64bit ARM.

This got me thinking, what if I just compile a filebeat armhf binary and repackage the given arm64 .deb file? Turns out, it’s quite easy. Here’s my all-in-one script, tested on x64 Debian 10 and Ubuntu 20.10:

https://gist.github.com/lazywebm/63ce309cffe6483bb5fc2d8a9e7cf50b

The interesting stuff happens in the last four functions. Here’s a rundown:

  • working directory is ~/Downloads/filebeat_armhf
  • get the latest golang amd64 package for cross-compiling, extract it to the working dir, and specifically use its bundled go binary (ignore any global installations)
  • get latest filebeat arm64 .deb package
  • clone beats repo, checkout latest release branch
  • build arm (armhf) filebeat binary with new go release
  • repackage the given arm64 .deb with the new filebeat binary, removing the other binary (filebeat-god, which seems to be irrelevant) and updating the md5sums and control files
  • working dir cleanup
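The cross-compile and repackage steps can be sketched like this (version, paths and the binary location inside the package are assumptions on my part; see the gist for the real thing):

```
# cross-compile filebeat for 32bit ARM with the downloaded go toolchain
( cd beats/filebeat && GOOS=linux GOARCH=arm GOARM=7 go build )

# unpack the arm64 deb including its DEBIAN/ metadata
dpkg-deb -R filebeat-7.11.2-arm64.deb repack
# swap in the armhf binary, drop filebeat-god
cp beats/filebeat/filebeat repack/usr/share/filebeat/bin/filebeat
rm -f repack/usr/share/filebeat/bin/filebeat-god
# fix the architecture field and regenerate checksums
sed -i 's/^Architecture: arm64/Architecture: armhf/' repack/DEBIAN/control
( cd repack && find usr -type f -exec md5sum {} + > DEBIAN/md5sums )
# build the armhf package
dpkg-deb -b repack filebeat-7.11.2-armhf.deb
```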

The result of this poor man’s CI (at the time of writing) is a new deb file, ready to be deployed on Raspbian: ~/Downloads/filebeat_armhf/filebeat-7.11.2-armhf.deb

I have some further automation in place, deploying the new deb to a publicly available web server. A small puppet module takes it from there:

if $facts['os']['distro']['id'] == 'Raspbian' {

  # 'archive' requires the puppet-archive module
  archive { '/root/filebeat-7.11.2-armhf.deb':
    ensure => 'present',
    source => 'https://example.com/filebeat-7.11.2-armhf.deb',
  }

  # the package name must match the Package field in the deb's control file
  package { 'filebeat':
    ensure   => 'installed',
    provider => 'dpkg',
    source   => '/root/filebeat-7.11.2-armhf.deb',
    require  => Archive['/root/filebeat-7.11.2-armhf.deb'],
  }

  # filebeat config with the pcfens-filebeat module here
}

persistent postfix config inside PHP docker container

One of my recent tasks included migrating an internal PHP-FPM application from a Debian 9 host (with a global PHP 7.0 installation) to a more flexible docker setup. One of the requirements was to retain the ability for the app to send mails to its users, which meant having a local SMTP server directly accessible to the PHP docker instance, relaying any mails to a server on the outside.

I decided to set up a dockerized PHP-FPM environment through PHP’s official docker repo, using their image tagged as php:7.4-fpm-buster.

After some trial and error regarding proper RUN commands in the Dockerfile, this is what I came up with, which allows for a persistent mail server setup inside the PHP-FPM container.

FROM php:7.4-fpm-buster
 
ENV TZ="Europe/Berlin"
RUN echo "date.timezone = Europe/Berlin" > /usr/local/etc/php/conf.d/timezone.ini
RUN date
 
RUN echo "postfix postfix/mailname string internalapp.example.com" | debconf-set-selections
RUN echo "postfix postfix/main_mailer_type string 'Internet Site'" | debconf-set-selections
 
RUN apt-get update && apt-get install -y postfix libldap2-dev libbz2-dev \
    && docker-php-ext-install bcmath ldap bz2
 
RUN postconf -e "myhostname = internalapp.example.com"
RUN postconf -e "relayhost = 172.18.0.1"
RUN /etc/init.d/postfix restart

Of course, “internalapp.example.com” is just a placeholder for the actual service URL. It’s important to set the postfix variables early through debconf-set-selections to allow for a promptless postfix installation later on, otherwise the container deployment gets stuck. I’ve also had to manually set the time zone, confirming its correctness by echoing date during the build.

The relayhost is just the docker host itself, which is – in this case – running Postfix as well. Since I want it to act as a relay for my dockerized app, I’ve had to edit /etc/postfix/main.cf, allowing relay access from my docker network (which has been explicitly persisted in its docker-compose.yml):

mynetworks = 127.0.0.0/8 [::ffff:127.0.0.0]/104 [::1]/128 172.18.0.0/24

One advantage of using the host mail server as a relay is that everything gets logged in its local mail.log, which might be helpful for further debugging or auditing.

Keeping latest kernels in Debian with backports and puppet

I like running Debian stable as well as making use of recent kernels. Since I’m managing most of my infrastructure using puppet, I came up with a simple module which is included in my baseline role deployed on all systems.

The puppet apt module is needed here.

class js::module::kernel_update {

  class { 'apt':
    update => {
      frequency => 'daily',
    },
  }

  if $facts['os']['architecture'] == 'amd64' {
    $codename = $facts['os']['distro']['codename']

    if $codename in ['stretch', 'buster'] {
      package { 'linux-image-amd64':
        ensure          => latest,
        install_options => ['-t', "${codename}-backports"],
      }
    }
  }
}

Naturally the backports repo needs to be included for this to work. My sources.list.erb (also included in the baseline role) looks like this:

<% if @os['distro']['id'] == 'Debian' -%>

deb http://aptmirror/debian/ <%= @os['distro']['codename'] %> main contrib non-free
deb http://aptmirror/debian-security/ <%= @os['distro']['codename'] %>/updates main contrib non-free
deb http://aptmirror/debian/ <%= @os['distro']['codename'] %>-updates main contrib non-free
deb http://aptmirror/debian/ <%= @os['distro']['codename'] %>-backports main contrib non-free
deb http://apt.puppetlabs.com <%= @os['distro']['codename'] %> puppet

<% end -%>

Just replace ‘aptmirror‘ with an apt mirror to your liking. Or run one yourself.
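For a one-off upgrade without puppet, the manual equivalent of what the module does is simply:

```
apt-get update
apt-get install -t buster-backports linux-image-amd64
```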

Upgrading to Debian Stretch & fixing Cacti thold notification mails

With the upgrade from Debian Jessie to Stretch, the Cacti package went from 0.8.8b to 0.8.8h. A problem I had – and apparently a few other people, according to the Cacti forums – was that Cacti 0.8.8h in combination with the thold v0.5 plugin and Stretch refused to send “Downed device notifications”, or threshold warnings in general. Sending test emails with the Cacti settings plugin worked just fine, but that was it.

The issue lies with the split() function, which had been deprecated for a while and was removed in PHP 7. Cacti logged the following error:

PHP Fatal error:  Uncaught Error: Call to undefined function split() in /usr/share/cacti/site/plugins/thold/includes/polling.php:28

To fix the problem and have Cacti send mails again, simply replace split() with explode() in polling.php:

sed -i -e 's/split(/explode(/g' /usr/share/cacti/site/plugins/thold/includes/polling.php
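Before touching the live plugin file, the substitution can be sanity-checked on a sample line (the sample file path is arbitrary):

```shell
# write a sample line using the removed PHP function
printf '%s\n' '$parts = split(",", $list);' > /tmp/polling_sample.php
# apply the same substitution as above
sed -i -e 's/split(/explode(/g' /tmp/polling_sample.php
cat /tmp/polling_sample.php
# -> $parts = explode(",", $list);
```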

Upgrading to Debian Stretch with dovecot, postfix & opendkim

Debian Stretch is about to be released. I’m already upgrading some of my systems, and want to document a few issues I encountered after upgrading my mail server from Debian Jessie to Stretch.

 

Dovecot forgot what’s SSLv2

Before the upgrade, dovecot was configured to reject login attempts with SSLv2 & SSLv3. The corresponding line in /etc/dovecot/dovecot.conf looked like this:

ssl_protocols = !SSLv3 !SSLv2

After upgrading, logging into the mail server failed. A look at the syslog:

dovecot: imap-login: Fatal: Invalid ssl_protocols setting: Unknown protocol 'SSLv2'

With the upgrade to Stretch and openssl 1.1.0, support for SSLv2 was dropped entirely. Dovecot simply doesn’t recognize the argument anymore. Editing dovecot.conf helped.

ssl_protocols = !SSLv3

opendkim using file based sockets (Update 2017-10-13)

UPDATE – previous releases of opendkim on Stretch (v2.11.0) were affected by a bug, ignoring its own config file. See the Debian bug report.

The correct way to (re)configure the systemd daemon is to edit the default conf and regenerate the systemd config.

vi /etc/default/opendkim
# listen on loopback on port 12301:
SOCKET=inet:12301@localhost
/lib/opendkim/opendkim.service.generate
systemctl daemon-reload; systemctl restart opendkim

Tell postfix to use the TCP socket again, if necessary.

vi /etc/postfix/main.cf
# DKIM config
milter_protocol = 2
milter_default_action = accept
smtpd_milters = inet:localhost:12301
non_smtpd_milters = inet:localhost:12301
systemctl restart postfix

This should do it.

——————————————————–

Before the upgrade, opendkim (v2.9.2) was configured as an initd service using loopback to connect to postfix.

/etc/default/opendkim

SOCKET="inet:12301@localhost" # listen on loopback on port 12301

/etc/postfix/main.cf

# DKIM config
milter_protocol = 2
milter_default_action = accept
smtpd_milters = inet:localhost:12301
non_smtpd_milters = inet:localhost:12301
root@host:~# systemctl status opendkim
opendkim.service - LSB: Start the OpenDKIM service
   Loaded: loaded (/etc/init.d/opendkim)
   Active: active (running) since Mi 2017-05-31 15:23:34 CEST; 6 days ago
  Process: 715 ExecStart=/etc/init.d/opendkim start (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/opendkim.service
           ├─791 /usr/sbin/opendkim -x /etc/opendkim.conf -u opendkim -P /var/run/opendkim/opendkim.pid
           └─796 /usr/sbin/opendkim -x /etc/opendkim.conf -u opendkim -P /var/run/opendkim/opendkim.pid

During the system upgrade, the opendkim daemon was reconfigured as a native systemd daemon, which meant /etc/default/opendkim and /etc/init.d/opendkim became obsolete, even though I was asked to install the new package maintainer’s version of /etc/default/opendkim.

Now the opendkim (v2.11.0) systemd daemon looked like this:

opendkim.service - OpenDKIM DomainKeys Identified Mail (DKIM) Milter
   Loaded: loaded (/lib/systemd/system/opendkim.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/opendkim.service.d
           └─override.conf
   Active: active (running) since Wed 2017-06-07 13:10:15 CEST; 23s ago
 Main PID: 4806 (opendkim)
    Tasks: 7 (limit: 4915)
   CGroup: /system.slice/opendkim.service
           ├─4806 /usr/sbin/opendkim -P /var/run/opendkim/opendkim.pid -p local:/var/run/opendkim/opendkim.sock
           └─4807 /usr/sbin/opendkim -P /var/run/opendkim/opendkim.pid -p local:/var/run/opendkim/opendkim.sock

I tried editing /etc/postfix/main.cf & adding the postfix user to the opendkim group to reflect the changes:

# DKIM config
milter_protocol = 2
milter_default_action = accept
smtpd_milters = local:/var/run/opendkim/opendkim.sock
non_smtpd_milters = local:/var/run/opendkim/opendkim.sock
root@host:~# adduser postfix opendkim

After restarting opendkim & postfix, the connection still failed to work.

postfix/smtpd[4451]: warning: connect to Milter service local:/var/run/opendkim/opendkim.sock: No such file or directory

Some research revealed that postfix does chroot its process to /var/spool/postfix (didn’t know that). To reflect this, I created new subdirectories and edited the systemd daemon.

root@host:~# mkdir -p /var/spool/postfix/var/run/opendkim
root@host:~# chown -R opendkim:opendkim /var/spool/postfix/var
root@host:~# systemctl edit opendkim
[Service]
ExecStart=
ExecStart=/usr/sbin/opendkim -P /var/run/opendkim/opendkim.pid -p local:/var/spool/postfix/var/run/opendkim/opendkim.sock

Note that the double ExecStart isn’t a typo.

After restarting all affected services, my sent mails were getting a valid DKIM signature again.

opendkim[11357]: OpenDKIM Filter v2.11.0 starting (args: -P /var/run/opendkim/opendkim.pid -p local:/var/spool/postfix/var/run/opendkim/opendkim.sock)

PGP key generation – increase system entropy

While creating a new PGP key pair using Enigmail, the progress bar seems stuck, and there’s no CPU activity.

The problem – missing entropy for /dev/random. Take a look at the available kernel entropy:

user@host:~$ watch -n 0.2 cat /proc/sys/kernel/random/entropy_avail

If the number stays below – say – 300, PGP can’t find enough random data through /dev/random and won’t generate keys. There’s still /dev/urandom, which Enigmail/PGP apparently ignores. So in order to generate acceptable levels of entropy for /dev/random and Enigmail, I’m installing haveged, a “random number generator feeding Linux’s random device”.

user@host:~$ sudo apt install haveged
user@host:~$ sudo systemctl enable haveged.service
user@host:~$ sudo systemctl start haveged.service

Now my system’s available entropy is at 1800, enough for Enigmail to generate my PGP keys.
