Problem with STARTTLS in local spampd filter

As part of my anti-spam solution I have Postfix send mail to another Postfix instance running on another port, and I’m still a little confused about how it all works, but basically I had a problem in my mail logs that looked like this:

root@sixsigma:/var/log# tail -f mail.log | grep "\(SSL\)\|\(TLS\)"
Feb  1 03:51:16 sixsigma postfix/smtpd[8636]: setting up TLS connection from localhost[127.0.0.1]
Feb  1 03:51:16 sixsigma postfix/smtpd[8636]: localhost[127.0.0.1]: TLS cipher list "ALL:+RC4:@STRENGTH"
Feb  1 03:51:16 sixsigma postfix/smtpd[8636]: SSL_accept:before/accept initialization
Feb  1 03:52:19 sixsigma postfix/smtpd[8556]: SSL_accept error from localhost[127.0.0.1]: -1
Feb  1 03:52:19 sixsigma postfix/smtpd[8556]: lost connection after STARTTLS from localhost[127.0.0.1]
Feb  1 03:52:19 sixsigma postfix/smtp[8555]: SSL_connect error to 127.0.0.1[127.0.0.1]:10025: -1
Feb  1 03:52:19 sixsigma postfix/smtp[8555]: B97C42542CE: Cannot start TLS: handshake failure
Feb  1 03:52:19 sixsigma postfix/smtpd[8651]: initializing the server-side TLS engine
Feb  1 03:52:19 sixsigma postfix/smtp[8555]: Host offered STARTTLS: [127.0.0.1]
Feb  1 03:55:06 sixsigma postfix/smtpd[8660]: initializing the server-side TLS engine
Feb  1 03:55:06 sixsigma postfix/smtpd[8660]: setting up TLS connection from localhost[127.0.0.1]
Feb  1 03:55:06 sixsigma postfix/smtpd[8660]: localhost[127.0.0.1]: TLS cipher list "ALL:+RC4:@STRENGTH"
Feb  1 03:55:06 sixsigma postfix/smtpd[8660]: SSL_accept:before/accept initialization
Feb  1 03:56:09 sixsigma postfix/smtpd[8664]: initializing the server-side TLS engine
Feb  1 03:56:09 sixsigma postfix/smtpd[8664]: setting up TLS connection from localhost[127.0.0.1]
Feb  1 03:56:09 sixsigma postfix/smtpd[8664]: localhost[127.0.0.1]: TLS cipher list "ALL:+RC4:@STRENGTH"
Feb  1 03:56:09 sixsigma postfix/smtpd[8664]: SSL_accept:before/accept initialization
Feb  1 03:56:16 sixsigma postfix/smtp[8649]: SSL_connect error to 127.0.0.1[127.0.0.1]:10025: -1
Feb  1 03:56:16 sixsigma postfix/smtpd[8636]: SSL_accept error from localhost[127.0.0.1]: -1
Feb  1 03:56:16 sixsigma postfix/smtp[8649]: 5E6172542D0: Cannot start TLS: handshake failure
Feb  1 03:56:16 sixsigma postfix/smtpd[8636]: lost connection after STARTTLS from localhost[127.0.0.1]
Feb  1 03:56:16 sixsigma postfix/smtp[8649]: Host offered STARTTLS: [127.0.0.1]
Feb  1 03:56:54 sixsigma postfix/smtpd[8636]: setting up TLS connection from localhost[127.0.0.1]
Feb  1 03:56:54 sixsigma postfix/smtpd[8636]: localhost[127.0.0.1]: TLS cipher list "ALL:+RC4:@STRENGTH"
Feb  1 03:56:54 sixsigma postfix/smtpd[8636]: SSL_accept:before/accept initialization

You can see an error “SSL_connect error to 127.0.0.1[127.0.0.1]:10025” which means, as far as I can tell, that when the primary Postfix instance uses SMTP to connect to the SMTPD at 127.0.0.1:10025 there is a problem with TLS support. It seems that the software listening on 127.0.0.1:10025 thinks it can support TLS but then can’t.

I did some research and learned about Per-site TLS policies. So I created a policy file that looks like this:

root@sixsigma:/etc# cat postfix/tls_per_site
# JE 2012-02-01: http://www.postfix.org/TLS_LEGACY_README.html#client_tls_per_site
localhost:10025 NONE

Basically it says not to use TLS when connecting to localhost on port 10025. The spampd software is listening on port 100025 and Postfix is using spampd as a content filter:

root@sixsigma:/etc# grep -R 10025 *
default/spampd:LISTENPORT=10025
postfix/main.cf:content_filter = scan:[127.0.0.1]:10025

I think when spampd is done it connects back to Postfix listening on port 10026:

root@sixsigma:/etc# grep -R 10026 *
default/spampd:DESTPORT=10026
postfix/master.cf:localhost:10026 inet    n       -       n       -       10      smtpd

So I configured the Postfix instance on 10026 not to use TLS:

root@sixsigma:/etc# cat postfix/master.cf
...
localhost:10026 inet    n       -       n       -       10      smtpd
        -o content_filter=
        -o local_recipient_maps=
        -o relay_recipient_maps=
        -o myhostname=filter.mynetwork.local
        -o smtpd_helo_restrictions=
        -o smtpd_client_restrictions=
        -o smtpd_sender_restrictions=
        -o smtpd_recipient_restrictions=permit_mynetworks,reject
        -o mynetworks=127.0.0.0/8
        -o smtpd_use_tls=no
        -o smtp_use_tls=no
...

Then I revised my primary Postfix TLS configuration so that it looked like this:

root@sixsigma:/etc# cat postfix/main.cf
...
# JE 2012-01-20: http://www.howtoforge.com/centos-5.1-server-lamp-email-dns-ftp-ispconfig-p5
# JE 2012-02-01: http://www.postfix.org/TLS_LEGACY_README.html#client_tls_per_site

smtpd_use_tls = yes
smtpd_tls_auth_only = no
smtpd_tls_cert_file = /root/cert/blackbrick.com.crt
smtpd_tls_key_file = /root/cert/blackbrick.key
smtpd_tls_CAfile = /root/cert/gd_bundle.crt
smtpd_tls_loglevel = 1
smtpd_tls_received_header = yes
smtpd_tls_session_cache_timeout = 3600s
smtpd_tls_session_cache_database = btree:/var/run/smtpd_tls_session_cache

smtp_use_tls = yes
smtp_tls_note_starttls_offer = yes
smtp_tls_per_site = hash:/etc/postfix/tls_per_site
smtp_tls_cert_file = /root/cert/blackbrick.com.crt
smtp_tls_key_file = /root/cert/blackbrick.key
smtp_tls_CAfile = /root/cert/gd_bundle.crt
smtp_tls_session_cache_database = btree:/var/run/smtp_tls_session_cache

tls_random_source = dev:/dev/urandom

So I have a smtp_tls_per_site parameter referencing my policy file, and the policy file says not to use TLS when connecting to localhost:10025.

Now when I watch the logs I’m not seeing any errors:

root@sixsigma:/var/log# tail -f mail.log | grep "\(SSL\)\|\(TLS\)"
Feb  1 04:33:58 sixsigma postfix/smtpd[8989]: setting up TLS connection from 60-240-67-126.tpgi.com.au[60.240.67.126]
Feb  1 04:33:59 sixsigma postfix/smtpd[8989]: Anonymous TLS connection established from 60-240-67-126.tpgi.com.au[60.240.67.126]: TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)

So it all seems to be working. I guess the only thing that I’m worried about is that I’ve somehow disabled TLS in situations where I want it (which is all of the time except for local traffic). But… it seems like I got it right.

As mentioned over on Re: Postfix Cannot start TLS:

If using Postfix 2.2 or earlier, disable opportunistic TLS for this destination.

  http://www.postfix.org/TLS_LEGACY_README.html#client_tls_per_site

With Postfix 2.3 and later, opportunistic TLS handshake failures trigger a plain-text retry, so no policy table entries are required to send email to sites with broken TLS (provided you are not trying to enforce TLS).

So that explains why I was seeing the errors in the logs but that mail was still being delivered. Postfix was trying again after TLS failed. Anyway, it should be a little faster now, at least. And I won’t have useless errors clogging up my logs anymore.