
Yet another referrer-spam access file tweak

Are you sick of this yet?

If you don’t know what I’m doing here, rather than re-explain it all, I suggest you read where I started and yesterday’s changes. If anybody is finding this vaguely interesting or morbidly amusing, I could tie it all up in a nice summary someday when I’m otherwise unoccupied or want to postpone something tedious.

Suffice it to say that, for one, my server logs suggested that mod_rewrite was not always playing well with my site (requests were hitting the maximum number of redirects and timing out, which suggests a loop), and, for another, julie was still not able to post comments, despite her tenacity in the face of continuing rejection.

So, I rewrote Kasia’s comment-spam hack with mod_access (which, as it happens, makes liberal use of mod_setenvif as well). Here’s what I wound up with:

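# Comment spam rules
# Flag every POST as suspected comment spam...
SetEnvIfNoCase Request_Method POST spam_com
# ...then clear the flag for trackbacks, XML-RPC clients (ecto),
# and forms submitted from this site.
SetEnvIfNoCase Request_URI ".mt-tb\.cgi" !spam_com
SetEnvIfNoCase Request_URI ".mt-xmlrpc\.cgi" !spam_com
SetEnvIfNoCase Referer "flashesofpanic\.com" !spam_com

# Referral spam blacklist
# (patterns are unanchored, so no leading or trailing .* is needed)
SetEnvIfNoCase Referer "\.locators\.com" spam_ref
SetEnvIfNoCase Referer "\.popex\.com" spam_ref

# Access section
# Deny,Allow allows by default; deny anything flagged above.
Order Deny,Allow
Deny from env=spam_ref
Deny from env=spam_com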

The first section assumes that all POST requests are attempts at comment spam, and sets the environment variable spam_com accordingly. We then make three exceptions: for mt-tb.cgi, which allows trackbacks; for mt-xmlrpc.cgi, which allows ecto; and for requests referred from this site, which should allow comments submitted through forms on the site (i.e. legitimate comments). Each of those unsets the spam_com variable if it matches.

The next section sets a similar variable, spam_ref, if the “Referer” (sic) header matches certain known referrer-spam domains. So far, we’ve only used mod_setenvif.

Then, the third section actually issues the mod_access directives: if either of these variables was set in the first two sections, the request is denied and a 403 “Forbidden” error is returned instead.
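For anyone trying this on a newer server: as far as I know, Apache 2.4 deprecates Order and Deny in favor of Require. A rough equivalent of the access section might look like the sketch below. It assumes the same spam_ref and spam_com variables are set by the two SetEnvIf sections above, and I have not run it against this site.

```apache
# Sketch only: Apache 2.4-style replacement for the Order/Deny section.
# Assumes spam_ref and spam_com are set by the SetEnvIf rules above.
<RequireAll>
    # Allow everyone by default...
    Require all granted
    # ...except requests flagged as referrer spam or comment spam.
    Require not env spam_ref
    Require not env spam_com
</RequireAll>
```

The denied requests should still get a 403, same as before.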

I have reason to believe this is working, but when I tested it last night, the comment submission timed out without sending anything back to the browser. The comment was accepted, though, and I’ve had one or two comments since then. If you’re (still) having trouble commenting, please let me know and I’ll try to suss it out. I haven’t taken the time to spoof a request that would trip the tests yet, so my basis for saying, “it’s working,” is just that comment spam and referrer spam are way down here lately.

A weakness of this approach is that the referrer-spam blocking relies on a blacklist, and as referrer spam becomes more widespread, administering that blacklist is rapidly going to become impractical (consider, for example, having to blacklist everyone who spends fifty bucks on Reffy, or nothing for Reef). The comment-spam block is a wholesale lockdown which then whitelists certain conditions; how can we build a similar algorithm for referer values?
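One way the whitelist idea might look, purely as a sketch: flag any request that carries a non-empty Referer, then clear the flag for referrers I trust. The example.org entry is a made-up placeholder, not a real entry from my list, and I have not tested any of this here.

```apache
# Hypothetical whitelist variant of the referrer rules (untested sketch).
# Flag any request whose Referer header is non-empty...
SetEnvIfNoCase Referer "." spam_ref
# ...then clear the flag for this site itself...
SetEnvIfNoCase Referer "^https?://([^/]+\.)?flashesofpanic\.com([/:]|$)" !spam_ref
# ...and for each trusted outside domain (example.org is a placeholder).
SetEnvIfNoCase Referer "^https?://([^/]+\.)?example\.org([/:]|$)" !spam_ref

Order Deny,Allow
Deny from env=spam_ref
```

Requests with no Referer at all should pass through untouched, since the first pattern needs at least one character to match. The obvious cost is that every legitimate inbound link and search engine would be locked out until it was added to the list, which may be a worse administrative burden than the blacklist.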

Now playing: Too Close To Heaven from Too Close To Heaven • The Unreleased Fisherman’s Blues Sessions by The Waterboys
