Kicking comment spam where it hurts
Ian Hickson has been seeing “odd” spam coming in to technical discussion lists at the W3C (World Wide Web Consortium, for those not up on their TLAs). The message is pretty curious: it’s almost on-topic for the list, but when you read it closely, it looks like something Eliza would generate from the message it replies to. And then there are the porn links spamvertized at the bottom of the message. Hmmm, Google gaming, perhaps? The publicly-archived-mailing-list version of comment spam? Almost certainly.
What’s interesting about Hixie noticing this is that he’s actually in a position to do something about it. Thinking in terms of page markup…
I’m thinking that HTML should have an element that basically says “content within this section may contain links from external sources; just because they are here does not mean we are endorsing them” which Google could then use to block Google rank whoring. I know a bunch of people being affected by Web log spam would jump at that chance to use this element if it was put into a spec.
It’s an interesting thought, and definitely a tag you’d see wrapping the comments section of nearly every weblog on earth. Still, when I start imagining the consequences, I’m not as excited. There’s plenty of disagreement within computer science about whether languages (programming, scripting, or markup) should be simple and restrictive (they shouldn’t let their users screw up) or powerful and dangerous (they can do wonderful things, but you’ve got plenty of rope to hang yourself). This tag definitely falls under “powerful and dangerous.”
For one thing, it would need to be widely used to be effective, and look how many websites are still being laid out in tables rather than CSS. For another, it would need to be used judiciously. I’ve drawn a lot of benefit from information posted to just the sort of web archive which might get wrapped in that tag. I suppose if the text of the messages is still indexed, they’d still be reachable, but it would make it notably more difficult to troubleshoot some problems.
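To make the idea a bit more concrete, here’s a rough sketch of how a crawler might treat such a tag: the text inside it still gets indexed, but links inside it don’t count toward anyone’s ranking. The element name `untrusted` is made up for illustration (no spec defines it), the `LinkCollector` class is hypothetical, and none of this reflects how Google’s crawler actually works.

```python
from html.parser import HTMLParser

# Hypothetical element name; no HTML spec defines it.
UNENDORSED_TAG = "untrusted"


class LinkCollector(HTMLParser):
    """Collects page text and links, split by whether the page endorses them."""

    def __init__(self):
        super().__init__()
        self.depth = 0        # nesting depth inside the hypothetical element
        self.endorsed = []    # links that would count toward ranking
        self.unendorsed = []  # links a crawler would ignore for ranking
        self.text = []        # text is indexed either way

    def handle_starttag(self, tag, attrs):
        if tag == UNENDORSED_TAG:
            self.depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                target = self.unendorsed if self.depth else self.endorsed
                target.append(href)

    def handle_endtag(self, tag):
        if tag == UNENDORSED_TAG and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        # Keep the text regardless of where it sits, so archives stay searchable.
        if data.strip():
            self.text.append(data.strip())


page = """
<p>My post, with a <a href="http://example.org/good">link I chose</a>.</p>
<untrusted>
  <p>Nice post! <a href="http://example.com/spam">cheap pills</a></p>
</untrusted>
"""

collector = LinkCollector()
collector.feed(page)
print(collector.endorsed)    # ['http://example.org/good']
print(collector.unendorsed)  # ['http://example.com/spam']
```

Note that the comment’s text still lands in `collector.text`, which is the “still indexed, still reachable” case; only the link loses its vote.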
On the other hand, if someone steps in immediately to “take the bullet” and make these comments and list archives an unattractive target for link spammers, perhaps they won’t get clogged with dross in the first place.
I suppose it’s the comment spammers mucking up web archives for us, just the way the email spammers are making our mail unusable, and the real problem is the unscrupulous gaming the system to the detriment of all. That’s a damn shame, of course. But I’d be really cautious about implementing a tool to hasten the same sort of damage the link spammers are steering us toward anyway.