2013/09/29: Forward synchronising email

I have been synchronising my (private) mail this way for 4.5 years now. Recently, quite a few people asked about details, so I decided to write things up in a level of detail that should make it easy to reproduce my setup. It might seem like a lot of infrastructure, but it really grew one simple script at a time.

It started with my dissatisfaction with (the then current version of) offlineimap. First, it took ages over a slow connection, as each time the full list of mails in the folder to be synchronised had to be transmitted. Secondly, it didn't cope well with the underlying TCP connection timing out; that situation even led to data loss. So I decided I wanted to understand where the difficulties in mail synchronisation lie. My approach to learning was to write my own mail-synchronisation solution, doing everything the most naive way, and see where it fails. I must say, I haven't learned that much: having used it for all my mail for 4.5 years now, I haven't experienced any problems. Nevertheless, my solution still contains the safety nets one would add when testing a program on live email in the expectation that the program does things wrong.

Forward synchronisation

Admittedly, my situation is quite simple. I have a fixed set of machines on which I read and write emails (my server, my desktop, my laptop), I have a shell on each of them (in fact, root), and this set changes rarely.

So I can connect them into a tree (i.e., an undirected graph that is connected and acyclic) and assume each machine knows its neighbours. Then forward synchronising is easy. Each machine detects local changes and tells its neighbours about them; when it receives such a notification, it applies the change locally and passes it on to all neighbours except the one it received it from. In that way, changes detected locally are propagated to each machine precisely once.
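The "precisely once" property can be checked with a small simulation. The following is only a sketch: the host names and the topology are hypothetical, and the real setup passes patch files between machines rather than calling a function.

```python
from collections import deque

# Hypothetical topology: the server in the middle, laptop and desktop as leaves.
TREE = {
    "server": ["laptop", "desktop"],
    "laptop": ["server"],
    "desktop": ["server"],
}

def propagate(tree, origin):
    """Forward a change detected on `origin`; count how often each host receives it."""
    received = {host: 0 for host in tree}
    # Each queue entry is (host that now holds the change, neighbour it came from).
    queue = deque([(origin, None)])
    while queue:
        host, sender = queue.popleft()
        for neighbour in tree[host]:
            if neighbour != sender:  # never send a change back where it came from
                received[neighbour] += 1
                queue.append((neighbour, host))
    return received

counts = propagate(TREE, "laptop")
print(counts)
```

Every host other than the origin ends up with a count of one. If the graph had a cycle, the loop would never terminate, which is why the tree structure matters.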

maildirdiff

The scripts I use to generate and incorporate the notifications about changes to a maildir, including their man pages, are contained in the maildirdiff shell archive; my general programs page also contains other versions and a gpg-signature of the hashes.

This section only gives an overview of the scripts and data formats involved. The details of the semantics can be found in the corresponding man pages.

The maildir patch format

As explained, the idea is that each machine detects local changes and sends them to all neighbours. To describe the change to a single mail, I use a simple line-based format.

The first line contains the hostname of the machine that generated the patch, followed by an identifier that uniquely identifies that patch among all patches ever generated on this host. This information is actually never used in production (the uniqueness of distribution is guaranteed by the tree structure), but it makes it possible to create log files in which it is very easy to trace the path a patch took.
The second line contains the name of the mail folder the patch is to be applied to.
The third line contains the command, followed by a universally unique identifier of the email, followed (in the case of CREATE and INFO) by the "info" part of the mail, as in the maildir format (i.e., the information on whether the mail is read, replied to, etc). There are three commands.
- CREATE A mail that is to be newly added to the said mailbox.
- INFO The mail exists already, but the "info" part changed.
- DELETE The mail has been deleted.
In case of CREATE the contents of the mail follow.
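To make the layout concrete, here is a minimal parser sketch for such a patch. It assumes that the fields on a line are separated by single spaces, which the description above does not specify; the man pages are the authoritative reference. The `Patch` class, `parse_patch`, and the sample values are names made up for this illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Patch:
    host: str
    patch_id: str
    folder: str
    command: str
    mail_id: str
    info: Optional[str]
    body: Optional[str]

def parse_patch(text: str) -> Patch:
    # Assumption: fields on a line are separated by single spaces.
    header, folder, action, *rest = text.split("\n", 3)
    host, patch_id = header.split(" ", 1)
    fields = action.split(" ", 2)
    command, mail_id = fields[0], fields[1]
    if command not in ("CREATE", "INFO", "DELETE"):
        raise ValueError(f"unknown command: {command}")
    # Only CREATE and INFO carry the maildir "info" part.
    info = fields[2] if command != "DELETE" and len(fields) > 2 else None
    # Only CREATE is followed by the contents of the mail.
    body = rest[0] if command == "CREATE" and rest else None
    return Patch(host, patch_id, folder, command, mail_id, info, body)

sample = "hilbert 42\nINBOX\nCREATE 3f2a-example 2,S\nSubject: hi\n\nHello\n"
created = parse_patch(sample)
deleted = parse_patch("hilbert 43\nINBOX\nDELETE 3f2a-example")
print(created.command, created.info, deleted.command)
```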

The status file format

To recognise what has changed locally since the last inspection, for every mailbox, a status file is kept. It contains a line for every mail with the following information separated by tabs.

The "unique" part of the file name, in the sense of the maildir format. In maildir format, each mail is stored as a separate file.
The "info" part in the sense of the maildir format.
The global id assigned to this mail. In this way, it is not necessary that the mails have the same file name on all machines. In fact, the abstract name space would even allow other mail-storage formats than maildir, should someone find the time to implement them.

maildirdiff, maildirpatch, and the loop

Given a status file and an actual maildir, maildirdiff produces a new status file and a set of patches in the format just described. Given a patch and the location where the status files are stored, maildirpatch applies that patch, updating the status files. To avoid races on the status files, it acquires a lock in the status directory first. Additionally, maildirpatch takes two more arguments, the "deletion directory" and the log file. Concerning the deletion directory: as it all started as an experiment, maildirpatch is, of course, not allowed to delete files; instead, every file to be deleted is moved to this directory. The actual deletion is done by my rotate-maildirdiff script, which is called by a daily cron job.
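The comparison that maildirdiff performs can be sketched roughly as follows. This is not the actual script: it works on a list of file names instead of a directory, ignores locking, and assumes that global ids are random UUIDs, which the text above does not state. It relies only on the maildir convention that a file name has the form "unique:info".

```python
import uuid

def parse_status(text):
    """Map the "unique" part of each file name to (info, global id)."""
    status = {}
    for line in text.splitlines():
        unique, info, global_id = line.split("\t")
        status[unique] = (info, global_id)
    return status

def diff_maildir(status, filenames, new_id=lambda: uuid.uuid4().hex):
    """Return (new_status, changes) for the files currently in a maildir."""
    new_status, changes = {}, []
    for name in filenames:
        # Maildir file names look like "<unique>:<info>"; the info part may be absent.
        unique, _, info = name.partition(":")
        if unique not in status:
            global_id = new_id()  # assumption: ids are freshly generated UUIDs
            changes.append(("CREATE", global_id, info))
        else:
            old_info, global_id = status[unique]
            if old_info != info:
                changes.append(("INFO", global_id, info))
        new_status[unique] = (info, global_id)
    # Everything in the old status that is no longer on disk has been deleted.
    for unique, (_, global_id) in status.items():
        if unique not in new_status:
            changes.append(("DELETE", global_id, None))
    return new_status, changes

old = parse_status("a1\t2,\tid-a\nb2\t2,S\tid-b\n")
new_status, changes = diff_maildir(old, ["a1:2,S", "c3:2,"], new_id=lambda: "id-c")
print(changes)
```

Each entry in `changes` corresponds to one patch; for CREATE, the real patch would additionally carry the contents of the mail.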

Given the primitives maildirdiff and maildirpatch, what is missing is a loop over all incoming patches and all maildirs, which for me are precisely the subdirectories of one directory (/home/aehlig/MAIL in my case). This loop is provided by maildirdiff-sync. It reads a configuration file and then does the following.

iterate through all incoming patches
- unconditionally distribute to all neighbours, except the one received from
- if we are responsible for that maildir, apply the patch
iterate through all subdirectories of the MAILDIR or only the specified ones, if additional args are given
- if we are responsible for this directory, compute the diff as a set of patches, updating the status file, and distribute to all neighbours.
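The control flow of these two passes can be sketched as below. All parameters are hypothetical stand-ins (plain callables and dictionaries) for what the real script does with files and external commands; only the order of the steps is taken from the description above.

```python
def sync(incoming, maildirs, neighbours, responsible, apply_patch, compute_diff, send):
    """One run of the two passes described above.

    incoming:     iterable of (sender name, patch) pairs
    maildirs:     names of the maildirs to inspect
    neighbours:   names of all neighbours in the tree
    responsible:  predicate telling whether a maildir is synchronised here
    apply_patch, compute_diff, send: stand-ins for maildirpatch,
    maildirdiff and the per-neighbour transport command
    """
    for sender, patch in incoming:
        # Forward unconditionally, but never back to where the patch came from.
        for neighbour in neighbours:
            if neighbour != sender:
                send(neighbour, patch)
        if responsible(patch["folder"]):
            apply_patch(patch)
    for maildir in maildirs:
        if responsible(maildir):
            # Locally detected changes go to every neighbour.
            for patch in compute_diff(maildir):
                for neighbour in neighbours:
                    send(neighbour, patch)

sent, applied = [], []
sync(
    incoming=[("isilmar", {"folder": "INBOX", "id": 1}),
              ("isilmar", {"folder": "SPAM", "id": 2})],
    maildirs=["INBOX", "SPAM"],
    neighbours=["isilmar", "other"],
    responsible=lambda folder: folder != "SPAM",
    apply_patch=applied.append,
    compute_diff=lambda maildir: [{"folder": maildir, "id": 3}],
    send=lambda neighbour, patch: sent.append((neighbour, patch["id"])),
)
print(sent, applied)
```

Note that the patch for SPAM is still forwarded even though it is not applied locally: a machine further down the tree may be responsible for that maildir.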

On my laptop, the configuration file looks as follows.

MAILDIR /home/aehlig/MAIL
STATUSDIR /home/aehlig/.maildirdiff/status
INDIR /home/aehlig/uucp-drop
OUTDIR /home/aehlig/.maildirdiff/out
DELDIR /home/aehlig/.maildirdiff/del
REJECTDIR /home/aehlig/.maildirdiff/reject
TMPDIR /home/aehlig/.maildirdiff/tmp
LOG /home/aehlig/.maildirdiff/log
INCLUDE [a-zA-Z0-9]
EXCLUDE SPAM
EXCLUDE MAIRIX
NEIGHBOUR isilmar batch-patch-uucp %s isilmar

A few remarks on the individual stanzas.

MAILDIR is the directory of which all maildirs are immediate subdirectories.
The INCLUDE and EXCLUDE statements describe, by means of regular expressions, which maildirs the synchronisation should apply to.
The STATUSDIR is the directory where all the status files are stored.
The DELDIR is the directory into which files that are to be deleted are moved. Remember, it started as an experiment, so I don't allow any of these scripts to delete anything.
Every patch that cannot be applied (i.e., a DELETE or INFO request that refers to a mail that is not present) ends up in the REJECTDIR.
The NEIGHBOUR lines define the topology. Each line states that there is a neighbour with a given name (in the example isilmar) and gives a format string for a command (in the example batch-patch-uucp %s isilmar). The assumption is that this neighbour will put its patches in a subdirectory of ${INDIR} named after it, and that executing the command with a patch name substituted in will copy the patch to our INDIR on that neighbour. The name of the command already hints at the means of transport I'm using.
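As an illustration of how such a file could be interpreted, here is a small sketch. The function names are invented for this example, and two points are assumptions rather than documented behaviour: that a maildir is synchronised if some INCLUDE pattern matches and no EXCLUDE pattern does, and that patterns are matched unanchored. The man pages define the real semantics.

```python
import re
import shlex

def parse_config(text):
    """Parse "KEY value" lines; INCLUDE, EXCLUDE and NEIGHBOUR may repeat."""
    config = {"INCLUDE": [], "EXCLUDE": [], "NEIGHBOUR": {}}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, value = line.split(None, 1)
        if key in ("INCLUDE", "EXCLUDE"):
            config[key].append(re.compile(value))
        elif key == "NEIGHBOUR":
            name, command = value.split(None, 1)
            config["NEIGHBOUR"][name] = command
        else:
            config[key] = value
    return config

def is_responsible(config, maildir):
    # Assumption: included if any INCLUDE matches and no EXCLUDE matches,
    # with patterns searched anywhere in the name.
    included = any(p.search(maildir) for p in config["INCLUDE"])
    excluded = any(p.search(maildir) for p in config["EXCLUDE"])
    return included and not excluded

def neighbour_command(config, name, patch_path):
    # Substitute the (shell-quoted) patch file name for %s in the format string.
    return config["NEIGHBOUR"][name].replace("%s", shlex.quote(patch_path))

config = parse_config(
    "MAILDIR /home/aehlig/MAIL\n"
    "INCLUDE [a-zA-Z0-9]\n"
    "EXCLUDE SPAM\n"
    "EXCLUDE MAIRIX\n"
    "NEIGHBOUR isilmar batch-patch-uucp %s isilmar\n"
)
print(neighbour_command(config, "isilmar", "/tmp/p1"))
```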

uucp

As a means of putting a patch from one machine into the ${INDIR} on another machine, I use uucp with ssh as the transport layer (specified as a pipe modem). The main advantage is that I have a form of remote copy that I can invoke at any time, without having to care whether the other system is currently reachable or not. Another advantage is that, after an interrupted transfer, uucp can resume at the very byte where the interruption occurred. Additionally, it can transparently fall back to other transport layers, like tunnelling ssh over http using http2tcp.

When working with uucp there are certain things to keep in mind.

uucp does not guarantee any order in which the files are delivered; however, you can reliably assume that a file of less important priority never overtakes a file of more important priority. So if we have 3 priority classes for creation, info, and deletion requests, we'll never have to reject a patch. Since we are prioritising anyway, we can just as well differentiate between important and less important maildirs (like the ones where heavy-traffic mailing lists are received).
uucp is very good at transmitting batches. After each transferred file, however, a handshake occurs, costing two full round trips. That is a problem if your connection has high latency (and I've been to places where I had good throughput, but ping times of several whole seconds). Therefore, we batch the patches of the same priority level, using the usual format, i.e., #<command> <number of lines that follow>.
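A batch in this style can be written and read back as sketched below. The command name `patch` and the helper names are assumptions for the sake of the example; the actual batch-patch-uucp and unbatch-patch-uucp scripts may differ in detail. Because every header states how many lines follow, a line inside a mail that happens to start with "#" is never mistaken for a header.

```python
def split_lines(text):
    """Split on "\\n" only; a trailing newline does not start an extra line."""
    lines = text.split("\n")
    if lines[-1] == "":
        lines.pop()
    return lines

def batch(patches, command="patch"):
    """Concatenate patches, each preceded by "#<command> <line count>"."""
    parts = []
    for patch in patches:
        lines = split_lines(patch)
        parts.append(f"#{command} {len(lines)}\n")
        parts.extend(line + "\n" for line in lines)
    return "".join(parts)

def unbatch(text):
    """Split a batch back into (command, patch) pairs."""
    lines = split_lines(text)
    entries, pos = [], 0
    while pos < len(lines):
        header = lines[pos]
        if not header.startswith("#"):
            raise ValueError(f"expected batch header, got: {header!r}")
        command, count = header[1:].rsplit(" ", 1)
        count = int(count)
        body = lines[pos + 1 : pos + 1 + count]
        if len(body) != count:
            raise ValueError("truncated batch")
        entries.append((command, "".join(line + "\n" for line in body)))
        pos += 1 + count
    return entries

patches = [
    "h 1\nINBOX\nDELETE id-a\n",
    "h 2\nINBOX\nCREATE id-b 2,\n#not a header\n\nbody\n",
]
roundtrip = unbatch(batch(patches))
print(batch(patches[:1]), end="")
```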

The scripts I'm using are batch-patch-uucp to add a patch to the batches, unbatch-patch-uucp to unbatch on my laptop (hilbert) for a given machine, and uucp-batch-drop-hilbert as drop script for my laptop (hilbert) on my server.

Heuristics on when to synchronise

So, the only things missing are some heuristics on when to call maildirdiff-sync and when to unbatch.

For maildirdiff-sync, you catch the bulk of the changes with a simple heuristic: the maildir that has most probably changed is the one you just left in your mail reader. Using mutt, I have the following in my .muttrc.

folder-hook . 'set my_oldrecord=$record; set record=^; set my_folder=$record; set record=$my_oldrecord'
folder-hook . 'push ":set nowait_key<enter>!/home/aehlig/conf/mutt-enter-folder.pl $my_folder<enter>:set wait_key<enter>"'

Here the script mutt-enter-folder.pl simply calls maildirdiff-sync on the old folder.

This heuristic doesn't eliminate the need for a cron job or similar that regularly runs maildirdiff-sync to inspect all folders, but that doesn't have to happen at high frequency.

The question of when to unbatch is a bit more complicated, as it has to be a compromise between large batches and prompt synchronisation.

On my laptop, I only unbatch when the machine goes online. Usually it is offline; to go online, I call a script that not only tells it to obtain an IP address via dhcp, but also to call home (uucico), run maildirdiff-sync and unbatch.
On my desktop, latency is always low and the machine is always online when I'm using it, so it's not worth batching things; all patches are sent directly and immediately.
On my server, I have a daily cron job that unbatches. Additionally, my laptop and desktop occasionally ssh in and trigger an unbatch for themselves; afterwards they run uucico. They do so if they are online and have become aware that an important mail has arrived. They learn about important new mail as follows: if my mail system decides, after all filtering and sorting of a new incoming mail, that the mail is important enough, it delivers it, then runs maildirdiff-sync for the maildir delivered to, and then informs me about it via jabber. The jabber client that receives the notification can then trigger the rest.

Outgoing mail

My approach to outgoing mails is pretty standard. My server is a mail server, and as such speaks SMTP. My desktop and my laptop have not been assigned a static IP address, so they relay all outgoing mail through my server. To do so, they use uucp, using the BSMTP format. BSMTP ("batched SMTP") essentially is everything you would say to an SMTP server, assuming it would always give the canonical positive answer (a corollary of that assumption is that mails cannot be rejected at the gateway, so if anything goes wrong, bounces have to be sent; therefore, only accept BSMTP from machines for which you are willing to relay unconditionally).

Sending BSMTP with postfix

On my desktop and my laptop, I have a local postfix running. To tell it about the BSMTP service, I have the following lines in the master.cf.

bsmtp     unix  -       n       n       -       -       pipe
  flags= user=uucp argv=/root/bsmtp/bsmtp $sender $nexthop $recipient

The referenced executable bsmtp is a simple perl script that takes the sender, next hop, and recipients from the arguments and the mail from stdin, and pipes that information, formatted as BSMTP, to uux - $nexthop!rsmtp.
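For readers who have not seen BSMTP, the formatting step can be sketched as follows. This is a Python illustration, not the perl script itself; whether the real script sends a HELO line, and with which name, is an assumption, and the addresses are placeholders. The result is what would be written to the standard input of the uux command mentioned above.

```python
def format_bsmtp(helo, sender, recipients, message):
    """Render one mail as the SMTP dialogue a client would send."""
    out = [f"HELO {helo}", f"MAIL FROM:<{sender}>"]
    out.extend(f"RCPT TO:<{rcpt}>" for rcpt in recipients)
    out.append("DATA")
    for line in message.splitlines():
        # Dot-stuffing as in SMTP: a leading dot is doubled, so that the
        # lone "." below unambiguously ends the message.
        out.append("." + line if line.startswith(".") else line)
    out.extend([".", "QUIT"])
    return "\n".join(out) + "\n"

bsmtp = format_bsmtp(
    "hilbert",
    "me@example.org",
    ["you@example.org"],
    "Subject: hi\n\n.hidden\nbye\n",
)
print(bsmtp, end="")
```

Since nobody answers during this "dialogue", the sender never sees a rejection, which is exactly why the receiving side has to be willing to relay unconditionally.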

In the main.cf I just specify that the default is to use BSMTP and relaying to my server (called isilmar).

default_transport = bsmtp
relayhost = isilmar

Receiving BSMTP with qmail

On my server, rsmtp is a simple script that parses BSMTP and passes the mail to qmail-queue. It is located in some directory in the command-path and the hosts for which the server is mail relay host have rsmtp listed in their commands.
