We recently came across a problem that was affecting mail delivery from servers in Brightbox Cloud. Specifically, it only appeared to affect deliveries to one particular mail provider, BT Internet. It’s a peculiar case and we thought it was worth explaining.
BT’s mail servers were rejecting mail from a Cloud Server claiming that the sender IP address had “no PTR record”.
to=<email@example.com> relay=mx.bt.lon5.cpcloud.co.uk[220.127.116.11]:25 status=deferred (host mx.bt.lon5.cpcloud.co.uk[18.104.22.168] refused to talk to me: 521 rgin04.bt.ext.cpcloud.co.uk Service not available - no PTR record for 109.107.x.x)
A PTR record is used when you look up an IP address to get a DNS name, commonly known as a reverse DNS lookup.
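To make the mechanics concrete: a reverse lookup queries a name built from the address's octets in reverse order under `in-addr.arpa`. Here is a minimal sketch using Python's standard `ipaddress` module. The address `192.0.2.10` is a documentation address chosen for illustration, not one of ours.

```python
import ipaddress


def reverse_name(ip: str) -> str:
    """Return the DNS name that a PTR lookup for this address queries."""
    return ipaddress.ip_address(ip).reverse_pointer


# 192.0.2.10 is a reserved documentation address (illustrative only)
print(reverse_name("192.0.2.10"))  # 10.2.0.192.in-addr.arpa
```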
Every Cloud IP we provide has a default PTR record that looks like this:
$ host 22.214.171.124
126.96.36.199.in-addr.arpa domain name pointer cip-109-107-33-8.gb1.brightbox.com.
And you can customise your own Cloud IPs to reverse to any record you like, so long as you set up an A record first to prove you control the domain:
$ host brightbox.com
brightbox.com has address 188.8.131.52
$ host 184.108.40.206
220.127.116.11.in-addr.arpa domain name pointer brightbox.com.
When running a server that sends mail directly, it’s generally a good idea to set up a reverse DNS record that matches the A record you’re using as your host name. Many receiving mail servers now check this as an anti-spam measure.
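The check that receiving servers perform is often called forward-confirmed reverse DNS: look up the PTR for the connecting IP, then look up the A record for that name and see whether it points back at the same IP. Below is a rough sketch of that idea, not any particular provider's implementation. The function name `is_forward_confirmed` and the injectable `reverse` and `forward` parameters are our own invention, there so the logic can be exercised without touching the network.

```python
import socket
from typing import Callable, List


def default_reverse(ip: str) -> str:
    # socket.gethostbyaddr returns (hostname, aliases, addresses)
    return socket.gethostbyaddr(ip)[0]


def default_forward(name: str) -> List[str]:
    # socket.gethostbyname_ex returns (hostname, aliases, ipv4_addresses)
    return socket.gethostbyname_ex(name)[2]


def is_forward_confirmed(
    ip: str,
    reverse: Callable[[str], str] = default_reverse,
    forward: Callable[[str], List[str]] = default_forward,
) -> bool:
    """True if the PTR name for ip resolves back to the same ip."""
    try:
        name = reverse(ip)
        return ip in forward(name)
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses;
        # a failed lookup in either direction counts as unconfirmed.
        return False
```

Calling `is_forward_confirmed("192.0.2.10")` with the defaults performs real lookups through the system resolver. Passing stub functions lets you try out the matching logic offline.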
So, why would just one mail provider think there is no reverse DNS set up for the IP?
We discovered that someone else had reported a similar problem with the same provider back in 2013. A member of their staff had explained that their systems do a reverse DNS lookup in uppercase and that this had caused problems with some Cisco routers at the time.
We looked at our internal DNS server logs and, lo and behold, their reverse DNS lookups were all arriving in uppercase, and our system was incorrectly answering with NXDOMAIN, the DNS equivalent of a 404 not found:
$ host -t PTR 18.104.22.168.IN-ADDR.ARPA
Host 22.214.171.124.IN-ADDR.ARPA not found: 3(NXDOMAIN)
This turned out to be a bug in our back-end DNS implementation, which was treating PTR record names as case sensitive. The majority of PTR lookups arrive in lowercase and were succeeding, but the minority that used any other case were incorrectly reported as not existing.
RFC 4343 does explicitly state (in quite lengthy and somewhat obtuse detail) that DNS servers should be case insensitive, so this was definitely a bug at our end and we promptly rolled out a fix a couple of days later.
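The class of bug is easy to reproduce in miniature. The sketch below is not our actual back-end code; `PtrZone` is a hypothetical in-memory store that shows the usual fix, which is to fold names to a canonical case both when storing a record and when looking it up. Only ASCII letters are folded, which is the comparison that RFC 4343 describes.

```python
import string
from typing import Dict, Optional

# Fold only ASCII A-Z to a-z, as DNS name comparison requires
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


class PtrZone:
    """Hypothetical in-memory PTR store with case-insensitive names."""

    def __init__(self) -> None:
        self._records: Dict[str, str] = {}

    @staticmethod
    def _key(name: str) -> str:
        # Ignore a trailing root dot and fold ASCII case
        return name.rstrip(".").translate(_ASCII_LOWER)

    def add(self, name: str, target: str) -> None:
        self._records[self._key(name)] = target

    def lookup(self, name: str) -> Optional[str]:
        # None stands in for NXDOMAIN in this sketch
        return self._records.get(self._key(name))


zone = PtrZone()
zone.add("10.2.0.192.in-addr.arpa.", "mail.example.com.")
print(zone.lookup("10.2.0.192.IN-ADDR.ARPA"))  # mail.example.com.
```

A store keyed on the raw query string would return `None` for the uppercase lookup, which is the behaviour we were seeing.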
We analysed the logs from several large mail relays that we operate and found some other occurrences of this from other providers. Only 0.003% of email delivery attempts failed due to a missing PTR record, and all of those were delivered successfully on the second attempt.
This might be due to a pool of mail servers at those providers where some don’t like the missing PTR records and some don’t mind, or some do DNS lookups differently.
So far, BT Internet are the only mail provider we’ve come across that was consistently tripping up on this behaviour. Looking closer, though, even they accepted mail once in a while, usually only after several days of retries. Presumably only a very small number of the mail servers in their pool behaved in our favour, so the chances of reaching one were low.
As a side note, there is another reason that DNS queries might use varying case: the Bit 0x20 Transaction Identity Improvement feature, suggested back in 2008 by Paul Vixie and David Dagon. The details are all there in that draft, but the summary is that the resolver randomly varies the case of the letters in the names it queries and expects the reply to echo the same pattern, which makes replies more difficult to forge.
So any bit-0x20 lookups for our PTR records would also have been answered as nonexistent. The draft never became a standard but has had a little traction and is used in a few places. Google in particular talk about it in their public DNS security document and have enabled it, but only for servers they know support it properly.
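For the curious, the core of the 0x20 idea fits in a few lines. This is a simplified sketch of the technique as the draft describes it, not code from any real resolver; the function names `encode_0x20` and `reply_matches` are ours.

```python
import secrets


def encode_0x20(name: str) -> str:
    """Randomly flip the case of each ASCII letter in a query name."""
    out = []
    for ch in name:
        # One random bit per letter; digits, dots and hyphens pass through
        if ch.isascii() and ch.isalpha() and secrets.randbits(1):
            out.append(ch.swapcase())
        else:
            out.append(ch)
    return "".join(out)


def reply_matches(sent: str, echoed: str) -> bool:
    """Accept a reply only if it echoes the query name with identical case."""
    return sent == echoed


query = encode_0x20("10.2.0.192.in-addr.arpa")
print(query)  # the case pattern differs from run to run
```

An off-path attacker forging a reply has to guess the case pattern as well as the usual transaction ID and port. For this to work, the authoritative server must treat the name case insensitively and copy it back unchanged.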
These kinds of lookups are easy to spot and an analysis of our logs shows that only about 1% of lookups were of this type.
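Spotting them comes down to checking whether a query name mixes upper and lower case. The following is an illustrative sketch rather than the script we used. It assumes the query names have already been extracted from the logs into a list, and the sample names are made up.

```python
from collections import Counter
from typing import Iterable


def case_style(qname: str) -> str:
    """Classify a query name as 'lower', 'upper', 'mixed' or 'none'."""
    letters = [c for c in qname if c.isalpha()]
    if not letters:
        return "none"
    if all(c.islower() for c in letters):
        return "lower"
    if all(c.isupper() for c in letters):
        return "upper"
    return "mixed"  # the signature of a bit-0x20 style lookup


def summarise(qnames: Iterable[str]) -> Counter:
    """Count how many query names fall into each case style."""
    return Counter(case_style(q) for q in qnames)


# Made-up sample names, for illustration only
sample = [
    "10.2.0.192.in-addr.arpa",
    "11.2.0.192.IN-ADDR.ARPA",
    "12.2.0.192.iN-aDdR.ArPa",
]
print(summarise(sample))
```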
This is the story of how just one single bounced email led to the discovery of a bug in our DNS. It would have been easy to dismiss this as an unusual one-off case, or blame the remote mail provider for being finickity, but instead we investigated more thoroughly, learned a few things and fixed a bug. Definitely worth the effort!
You can sign up for Brightbox Cloud in just a couple of minutes and get a £20 free credit to play with.
And if you want help running a mail cluster, or anything else, drop us a line about our managed services.