2014/05/21: Extracting a mail from an mbox

mbox(5) is a traditional mail storage format that most mail user agents support. Mails are stored sequentially in a single file, and the beginning of each mail is indicated by a line starting with "From " (mind the space). A consequence of this separation is that at least those lines in the body of a mail that start with "From " need to be quoted, conventionally by prefixing them with ">". Also, care has to be taken with locking whenever the file might be accessed concurrently.
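
To make the separator and the quoting concrete, here is a minimal sketch in sh. The function names quote_from and count_mails are made up for this illustration, and the quoting shown is only the simplest variant (prefix body lines starting with "From " with ">"); other mbox flavours quote slightly differently.

```shell
# quote_from: read a mail body on stdin and prefix every line that starts
# with "From " with ">", so it cannot be mistaken for a separator line.
# Assumption: simple ">From " quoting; not every mbox variant works this way.
quote_from() {
    sed 's/^From />From /'
}

# count_mails MBOX: count the separator lines, i.e. the lines that start
# with "From " (including the space). Quoted body lines are not counted.
count_mails() {
    grep -c '^From ' "$1"
}
```

On a private copy, count_mails scratch.mbox gives a quick sanity check that a mail really arrived in the scratch mbox after copying.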

If, however, you have a (private) working copy of an mbox and only want to copy mails into another scratch mbox, things are quite simple. Once you have a line known to belong to the mail, search upwards for a line starting with "From " to find the beginning of the mail, then downwards to the line before the next "From " line (or the end of the file), which is the last line of the mail. As an example, consider the following fragment of an sh script. From a git patch series saved as a mail thread in an mbox, it extracts the patches in the correct order and leaves out the cover letter (note the trailing space after each "From" in the ed commands).


cd "$WRKDIR" || exit 1
# Derive the number of patches from the first subject, e.g. "[PATCH 0/5]" gives 5.
grep -m 1 '^Subject: ' patches.mbox > patchcount
ed patchcount <<EOF
s|[^/]*/||
s|].*||
w
q
EOF
PATCHCOUNT=$(cat patchcount)
mv patches.mbox patches.mbox.work
touch patches.mbox
# Append a sentinel "From " line so that the search for the next mail always succeeds.
ed patches.mbox.work << EOF
\$a
From 
.
w
q
EOF
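# seq -w pads with zeros, matching the numbering git uses (e.g. 01/12).
for PATCHNR in $(seq -w 1 "$PATCHCOUNT")
do
# Find the patch by its subject, mark the preceding "From " line, and append
# everything up to the line before the next "From " line to patches.mbox.
ed patches.mbox.work << EOF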
1
/Subject:.* $PATCHNR\\/$PATCHCOUNT
?^From 
ka
/^From 
-1
'a,.W patches.mbox
Q
EOF
done
rm patches.mbox.work
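
The search-up, search-down idea is not tied to ed. The following is a minimal sketch of it as a shell function, assuming an mbox in which every mail starts with a "From " line. The name extract_mail and its arguments are made up for this illustration and are not part of the script above. It finds the first line matching a pattern, uses awk to locate the enclosing "From " line and the line before the next one, and prints that range with sed.

```shell
# extract_mail MBOX PATTERN
# Print the mail that contains the first line matching PATTERN (a grep
# basic regular expression). Returns non-zero if nothing matches.
# Note: plain sh has no local variables, so the names below are global.
extract_mail() {
    mbox=$1
    pattern=$2

    # Line number of a line known to belong to the mail.
    hit=$(grep -n -m 1 -e "$pattern" "$mbox" | cut -d: -f1)
    [ -n "$hit" ] || return 1

    # Search upwards: the last "From " line at or before the hit.
    # Falls back to line 1 if there is none.
    start=$(awk -v hit="$hit" '
        NR > hit { exit }
        /^From / { s = NR }
        END      { print (s ? s : 1) }
    ' "$mbox")

    # Search downwards: the line before the next "From " line,
    # or the last line of the file if there is no further mail.
    end=$(awk -v hit="$hit" '
        NR > hit && /^From / { print NR - 1; found = 1; exit }
        END                  { if (!found) print NR }
    ' "$mbox")

    sed -n "${start},${end}p" "$mbox"
}
```

For instance, extract_mail patches.mbox '^Subject:.* 03/12' >> scratch.mbox would append the third patch of a twelve-part series to a scratch mbox. Unlike the ed version, this sketch needs no sentinel line because the end of the file is handled explicitly.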