Sendmail/Postfix Milters in Python
by Jim Niemira
and
Stuart D. Gathman
This web page is written by Stuart D. Gathman and sponsored by
Business Management Systems, Inc.
(see LICENSE for copying permissions for this
documentation)
Last updated May 1, 2014
This site is kept for reference. A newer site is at
pymilter.org
What's New
- pymilter 1.0 removes the start.sh glue script. EL6 RPMs for packages
using pymilter (milter, pysrs, pygossip) now use daemonize as a replacement.
ACCEPT is supported as an untrapped exception policy. An optional dir for
getaddrset and getaddrdict in Milter.config supports moving some clutter.
Untrapped exceptions now report the registered milter name. An selinux
subpackage is included for EL6. It provides sqlite support for greylisting, and
Milter.greylist export and Milter.greysql import to migrate data.
- pyspf 2.0.9 adds a new test suite and support for RFC 7208, the
official (non-experimental) RFC for SPF.
- pyspf 2.0.8 adds much improved python3 support. All test suites
now pass with python3 and py3dns. SPF records are restricted to 7-bit
ascii, but some people try to use an extended set anyway, which used to
crash pyspf. We now return PermError for non-ascii SPF records. IP address
parsing and arithmetic are now handled by the ipaddr (ipaddress in python3)
module. I fixed a bug caused by a null CNAME in cache.
- milter 0.8.18 adds test cases and SMTP AUTH policies in sendmail access
for spf-milter. You can now also configure an untrapped exception
policy for spf-milter, and it rejects numeric HELO. For the bms milter,
from words can be in a file, and you can use the BAN feature for
configured public email providers like gmail and yahoo - it bans the
mailbox rather than the entire domain.
- pymilter 0.9.8 adds a test module for unit testing milters.
It fixes a typo that prevented setsymlist from actually working all
these years (misspelled as setsmlist). The untrapped exception message
is changed to "pymilter: untrapped exception in milter app".
- milter 0.8.17 reports keysize of DKIM signatures, adds a simple
DKIM milter, and DKIM policies in the sendmail access file. It
also broke spf-milter for people using SMTP AUTH - sorry guys!
- milter 0.8.16 has dkim signing and an Authentication-Results header.
pymilter-0.9.7 has several improved diagnostics for milter programming errors.
- milter has dkim checking and logging in CVS. It will use DKIM Pass
for reputation tracking, and as an additional acceptable identity
along with HELO, PTR, or SPF.
- pymilter-0.9.4 supports python-2.6
- pymilter-0.9.2 supports the negotiate, data, and unknown callbacks.
Protocol steps are automatically negotiated by the high-level Milter
package by annotating callback methods with @nocallback or @noreply.
- pymilter-0.9.1 supports CHGFROM, introduced with sendmail-8.14,
and also supported by postfix-2.3.
To accommodate other open source projects using pymilter, this package has been
shedding modules which can be used by other packages.
- The pymilter package provides a robust toolkit for Python milters that wraps the C libmilter library. There
is also a
pure Python milter library that implements the milter protocol in Python.
- The milter package provides the beginnings of a
general purpose mail filtering system written in Python. It also includes
a simple spfmilter that supports policy by domain and spf result via
the sendmail access file.
- The pysrs package provides an
SRS
library, SES library, a sendmail socketmap daemon implementing
SRS, and (Real Soon Now) an srsmilter daemon implementing SRS,
now that sendmail-8.14 supports CHGFROM and this is supported in pymilter-0.9.
- The pyspf package
provides the
spf module, a well tested implementation of
the SPF protocol, which is useful for
detecting email forgery.
- The pygossip package provides the
gossip library and server daemon for the GOSSiP protocol, which
exchanges reputation of qualified domains. (Qualified in the milter package
means that example.com:PASS tracks a different reputation than
example.com:NEUTRAL.)
- The pydns package provides the
low level
DNS library for python DNS lookups. It is much smaller
and lighter than the more capable (and bigger)
dnspython library. Low level lookups
are needed to find SPF and MX records for instance.
- The pydspam package wraps an old version of
libdspam for python. The C API changed dramatically for new versions, and I
haven't gotten things updated yet. Another content filter might be in order.
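To make the "qualified domain" idea from the pygossip entry above concrete, here is a small sketch of a tally keyed by domain plus SPF result. It is a toy, not the pygossip API: the ReputationLog class, its method names, and the spam ratio it reports are all hypothetical, chosen only to show that example.com:PASS and example.com:NEUTRAL are tracked separately.

```python
from collections import defaultdict


def qualified(domain, result):
    """Build a qualified-domain key such as 'example.com:PASS'."""
    return "%s:%s" % (domain.lower(), result.upper())


class ReputationLog:
    """Toy in-memory tally (hypothetical, not the pygossip API)."""

    def __init__(self):
        self.counts = defaultdict(lambda: {"ham": 0, "spam": 0})

    def observe(self, domain, result, is_spam):
        """Record one message seen for this qualified domain."""
        key = qualified(domain, result)
        self.counts[key]["spam" if is_spam else "ham"] += 1

    def spam_ratio(self, domain, result):
        """Fraction of observed messages that were spam, or None if unseen."""
        c = self.counts.get(qualified(domain, result))
        if not c:
            return None
        return c["spam"] / (c["ham"] + c["spam"])


log = ReputationLog()
log.observe("example.com", "pass", is_spam=False)
log.observe("example.com", "neutral", is_spam=True)
```

After these two observations the PASS and NEUTRAL variants of example.com carry independent histories, so a forger who only achieves NEUTRAL does not damage the reputation of mail that passes SPF.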
At the lowest level, the milter module provides a thin wrapper around the
sendmail libmilter API. This API lets you register callbacks for
a number of events in the
process of sendmail receiving a message via SMTP.
These events include the initial connection from an MTA,
the envelope sender and recipients, the top level mail headers, and
the message body. There are options to mangle all of these components
of the message as it passes through the milter.
At the next level, the Milter module (note the case difference)
provides a Python friendly object oriented wrapper for the low level API. To
use the Milter module, an application registers a 'factory' to create an object
for each connection from an MTA to sendmail. These connection objects
must provide methods corresponding to the libmilter callback events.
Each event method returns a code to tell sendmail whether to proceed
with processing the message. This is a big advantage of milters over
other mail filtering systems. Unwanted mail can be stopped in its
tracks at the earliest possible point.
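As a rough illustration of the factory and event-method pattern just described, here is a minimal sketch. The class name, the blocked-domain set, the header name, and the socket address are made up for the example. The Milter calls (Milter.Base, setreply, addheader, Milter.factory, Milter.set_flags, Milter.runmilter) are used as they are commonly documented for pymilter, so check them against the Doxygen documentation for your version. The guarded import is only there so the policy function can be tried without pymilter installed.

```python
try:
    import Milter                  # high-level wrapper from the pymilter package
except ImportError:                # lets the policy function be tried on its own
    Milter = None

# Hypothetical policy data, for illustration only.
BLOCKED_DOMAINS = {"spam.example"}


def sender_blocked(mailfrom):
    """Return True if the envelope sender's domain is on the block list."""
    addr = mailfrom.strip("<>").lower()
    domain = addr.rpartition("@")[2]
    return domain in BLOCKED_DOMAINS


if Milter is not None:

    class SampleMilter(Milter.Base):
        """One instance is created for each connection from an MTA."""

        def envfrom(self, mailfrom, *params):
            if sender_blocked(mailfrom):
                self.setreply("550", "5.7.1", "Sender domain not accepted")
                return Milter.REJECT      # stop the message at MAIL FROM
            return Milter.CONTINUE        # let sendmail keep processing

        def eom(self):
            # Changes to the message are made once it is complete.
            self.addheader("X-Sample-Milter", "checked")
            return Milter.ACCEPT

    def main():
        Milter.factory = SampleMilter     # register the per-connection factory
        Milter.set_flags(Milter.ADDHDRS)  # declare that headers may be added
        Milter.runmilter("samplemilter", "inet:8800@localhost", timeout=600)
```

Keeping the policy decision in a plain function such as sender_blocked means it can be unit tested without a running MTA, while the event methods stay thin.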
The Milter.Milter class provides default implementations for event
methods that
do nothing, and also provides wrappers for the libmilter methods to mutate
the message.
The mime module provides a wrapper for the Python email package that
fixes some bugs, and simplifies modifying selected parts of a MIME message.
Finally, the bms.py application is both a sample of how to use the
Milter and spf modules, and the beginnings of a general purpose SPAM filtering,
wiretapping, SPF checking, and Win32 virus protecting milter. It can
make use of the pysrs package when available for
SRS/SES checking and the pydspam package for Bayesian
content filtering. SPF checking
requires
pydns. Configuration documentation is currently included as comments
in the sample config file for the bms.py milter.
See also the HOWTO and
Milter Log Message Tags.
Python milter is under GPL. The authors can probably be convinced to
change this to LGPL if needed.
Milters can run on the same machine as sendmail, or another machine. The
milter can even run with a different operating system or processor than
sendmail.
Sendmail talks to the milter via a local or internet socket.
Sendmail keeps the
milter informed of events as it processes a mail connection. At any
point, the milter can cut the conversation short by telling sendmail
to ACCEPT, REJECT, or DISCARD the message. After receiving a complete
message from sendmail, the milter can again REJECT or DISCARD it, but it
can also ACCEPT it with changes to the headers or body.
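The flow above can be pictured with a toy dispatcher. This is a simulation only, not the libmilter protocol or the pymilter API: the run_events function, the string verdicts, and the event tuples are hypothetical, and the model is simplified to a single message where the first final verdict ends the conversation.

```python
CONTINUE, ACCEPT, REJECT, DISCARD = "CONTINUE", "ACCEPT", "REJECT", "DISCARD"


def run_events(handlers, events):
    """Feed events to handlers in order and stop at the first final verdict.

    Returns (verdict, name of the event that decided it).
    """
    for name, args in events:
        handler = handlers.get(name)
        # Events without a handler are treated as CONTINUE.
        verdict = handler(*args) if handler else CONTINUE
        if verdict != CONTINUE:
            return verdict, name
    # Nothing objected, so the message is accepted at end of message.
    return ACCEPT, "eom"


def envfrom(sender):
    """Hypothetical policy: refuse one sender domain."""
    return REJECT if sender.endswith("@spam.example") else CONTINUE


events = [
    ("connect", ("mail.spam.example",)),
    ("envfrom", ("bob@spam.example",)),
    ("envrcpt", ("me@example.com",)),
    ("header", ("Subject", "hello")),
    ("body", (b"hi\n",)),
    ("eom", ()),
]
result = run_events({"envfrom": envfrom}, events)
```

In this run the verdict is reached at envfrom, so the recipient, header, and body events are never processed, which is the "stopped at the earliest possible point" behaviour in miniature.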
What can you do with a milter?
Documentation for the C API is provided with sendmail.
Documentation for
pymilter is provided via Doxygen. Miltermodule provides a thin python
wrapper for the C API. Milter.py provides a simple OO wrapper on top of that.
The Python milter package includes a sample milter that replaces dangerous
attachments with a warning message, discards mail addressed to
MAILER-DAEMON, and demonstrates several SPAM abatement strategies.
The MimeMessage class used to do this was originally based on the
mimetools and multifile standard python packages.
As of milter version 0.6.0, it is based on the email standard
python packages, which were derived from the
mimelib project.
The MimeMessage class patches several bugs in the email package,
and provides some backward compatibility.
The "defang" function of the sample milter was inspired by
MIMEDefang,
a Perl milter with flexible attachment processing options. The latest
version of MIMEDefang uses an apache style process pool to avoid reloading
the Perl interpreter for each message. This makes it fast enough for
production without using Perl threading.
mailchecker is
a Python project to provide flexible attachment processing for mail. I
will be looking at plugging mailchecker into a milter.
TMDA is a Python project
to require confirmation the first time someone tries to send to your
mailbox. This would be a nice feature to have in a milter.
There is also a Milter community website
where milter software and gory details of the API are discussed.
Is a milter written in python efficient?
The python milter process is multi-threaded and startup cost is incurred
only once. This is much more efficient than some implementations that
start a new interpreter for each connection. Testing in a production
environment did not use a significant percentage of the CPU. Furthermore,
python is easily extended in C for any step requiring expensive CPU
processing.
For example, the HTML parsing feature to remove scripts from HTML attachments
is rather CPU intensive in pure python. Using the C replacement for sgmllib
greatly speeds things up.
Goals
Confirmed Installations
Please email
me if you do not successfully install milter. The confirmed
installations are too numerous to list at this point.
Enough Already!
Nearly a dozen people have emailed me begging for a feature to copy
outgoing and/or incoming mail to a backup directory by user. Ok, it
looks like this is a most requested feature. In the meantime,
here are some things to consider:
- The milter package (bms.py) supports the mail_archive option
in the
[wiretap] section. This is not by user, however.
- If you want the equivalent of a Bcc added to each message, this
is very easy to do in the python code for bms.py. See below.
- If you want to copy to a file in a directory (thus avoiding having to
set up aliases), this is slightly more involved. The bms.py milter already
copies the message to a temporary file for use in replacing the message body
when banned attachments are found. You have to open a file, and copy the
Message object to it in eom().
- Finally, you are probably aware that most email clients already
keep a copy of outgoing mail? Presumably there is a good reason for
keeping another copy on the server.
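For the file copy option in the list above, the following is a minimal sketch of the "open a file and copy the Message object to it" step, written for Python 3 with only the standard library. The archive_copy function, the .eml suffix, and the directory layout are assumptions for illustration, and a plain email.message.Message stands in for whatever message object the milter holds in eom().

```python
import os
import re
import tempfile
from email.message import Message


def archive_copy(msg, user, root):
    """Write msg to a uniquely named file under root/<user>/ and return its path."""
    # Keep the user name from escaping the archive root.
    safe = re.sub(r"[^A-Za-z0-9._-]", "_", user).lstrip(".") or "unknown"
    userdir = os.path.join(root, safe)
    os.makedirs(userdir, exist_ok=True)
    # mkstemp picks a name that does not collide with earlier copies.
    fd, path = tempfile.mkstemp(suffix=".eml", dir=userdir)
    with os.fdopen(fd, "wb") as fp:
        fp.write(msg.as_bytes())
    return path


# Inside a milter's eom() this might be called as (archive path is hypothetical):
#   archive_copy(msg, user, "/var/spool/milter/archive")
```

One file per message avoids locking a shared mailbox file from several milter threads at once, at the cost of many small files.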
To Bcc a message, call self.add_recipient(rcpt) in envfrom after
determining whether you want to copy (e.g. whether the sender is local). For
example,
  def envfrom(...
      ...
      if len(t) == 2:
          self.rejectvirus = t[1] in reject_virus_from
          if t[0] in wiretap_users.get(t[1],()):
              self.add_recipient(wiretap_dest)
          if t[1] == 'mydomain.com':
              self.add_recipient('<copy-%s>' % t[0])
      ...
To make this a generic feature requires thinking about how the configuration
would look. Feel free to make specific suggestions about config file
entries. Be sure to handle both Bcc and file copies, and designating what
mail should be copied. How should "outgoing" be defined? Implementing it is
easy once the configuration is designed.