Varnish, Mollom, and Spam (oh, my!)

By Erik, Sat, 08/03/2013 - 22:15

I received an email the other day with the subject "Mollom's volume limit exceeded". I work with the fine folks from Mollom, so I popped over to ask what was up. This blog is not all that popular in terms of hits, or probably even spam attempts. I do get spam that I have to clear out, but since I enabled Mollom two years ago, it has blocked 160K+ spam attempts… Mollom's free subscription for personal blogs is great, and the spam that it blocks automatically is not counted against you. So the volume limit should only apply to "legitimate" comments, and you're allowed 50 per day.

So I logged on to see what was going on, and there were 400 "ham" comments waiting to be approved. Ham is supposedly "the good stuff", but these were obviously spam. In fact, the largest bulk was from a single user, with titles like:

  • sadaArorway ZerPemPeeri spettysquinny Effelfartetly
  • Awaicktaipt WefEnepAlleld experrope intetefonna
  • Tubsmushlib Fleemnessebra madabains Uncetaestilia
  • Spamyerer Sepstitsexilk Omistatigma trossygriense
  • Garoliasots lopsattisaGak wargeadioge equifiteWip
  • Tookdoubosy UtteftHoole ViariaGetle frigninue
  • MAWMIBQUEUEGE crertitly AdvidoTip PoomGaums
  • TrearlMed Annelfhiple InerbapeAbank lalShealkCerb

And there were definitely some that were insurance or prescription drug related as well. So I asked why Mollom wasn't seeing these all of a sudden. We looked up my account and found that Mollom was seeing these but passing them back to me as "ham" because they were being reported from the IP address of 127.0.0.1. Turns out that Mollom won't block things that don't have an external IP address.

But wait, what could have changed? I logged into my server and began poking around the logs and, sure enough, the Apache logs were all showing the visitor IP as 127.0.0.1 as well. Then I knew… Relatively recently, I had moved my site from my shared hosting space to a small box so I could run it behind Varnish. With Varnish in front, every request reaches Apache from Varnish itself, which on a single box means 127.0.0.1. I had spent a couple of hours one weekend writing a VCL and getting all the ports and configurations set, but only recently had I actually pushed the DNS change to serve the site from the new, speedy box.

So our initial thought was that we needed to turn on the reverse proxy configuration in my settings.php. This turned out to be the key to the spam, but it was only half the battle. Since I'm running Drupal and Varnish on the same box, I simply needed to add two conf variables there.
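For reference, here is a minimal sketch of what those two settings typically look like in a Drupal 7 style settings.php. The exact lines from this setup are not shown in the post, so treat the values as an assumption: Varnish is the only proxy and it runs on the same host as Apache.

```php
// Hypothetical excerpt from sites/default/settings.php (Drupal 7 style).
// Assumption: Varnish is the only proxy and listens on the same host as Apache.

// Tell Drupal that requests arrive through a reverse proxy.
$conf['reverse_proxy'] = TRUE;

// Only trust the forwarded client address when the request comes from one of these proxies.
$conf['reverse_proxy_addresses'] = array('127.0.0.1');
```

With settings like these, Drupal takes the visitor address from the X-Forwarded-For header that the proxy adds, rather than from the connection address, and that is the address a module such as Mollom then reports.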

But Apache's logs were still showing 127.0.0.1 as the visitor IP address. So Mollom was working, but Apache's logging was not. This was just a logging configuration issue, solvable by changing my logging format from:

CustomLog ${APACHE_LOG_DIR}/access.log combined

to

LogFormat "%{X-FORWARDED-FOR}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" eporama_combined
CustomLog ${APACHE_LOG_DIR}/access.log eporama_combined

The standard combined log format is exactly the same, except that in place of %h (the remote host) in the first spot, we log the X-Forwarded-For request header, and voilà. Mollom is happy, Apache logs are happy, and my comment moderation queue is extremely happy.

Comments

Hi,

I found your post incredibly helpful; however, when I went to fix my Apache log format by updating my host file, I got an error when I did an Apache reload. It has to do with your syntax. A couple of the items in there use quotation marks, and if you don't escape them, it causes a problem. Anyhow, my final log format lines looked like this (notice the "\" escape character).

LogFormat "%{X-FORWARDED-FOR}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" mysite_combined
CustomLog ${APACHE_LOG_DIR}/mysiteaccess.log mysite_combined

Once I figured out all the places I needed to add escape characters, Apache loaded up without issue. This article was a huge help, as I am using Varnish and Mollom on a few of my sites.

Thanks!

Nathan

Erik

8 years 11 months ago

Thanks for the catch. I have that in my code, but the input format was stripping the backslashes for me.

I have updated the filter and it should be "better" now.