HMS Forums - Now 95% free of spam
comprug
Forum Regular

Joined: 15 Feb 2006
Posts: 341
Over the past few months, the HostMySite forums were receiving at least 15 posts a day of crude, damaging spam. Needless to say, it was a nightmare for us moderators. Over time, the forum became less and less popular among HMS customers and clients, myself included, and I attribute part of that decline to spam. With "v word" ads more plentiful than coding posts, I began to dread my visits to the forum and felt strained by the need to moderate it.

The HostMySite development and marketing teams worked very hard to stop the spam, trying new captchas and everything else in their arsenal, but nothing worked, and I can only guess that the extra human moderation was costly to productivity.

A couple of weeks ago, I took a look at the offending software, XRumer, through its video tour. While I admire the programmers' ability to build such a complex, seemingly invincible piece of software, I realized that to fight a bot, you must use a bot. I began working on an anti-spam bot that could crawl the forum, classify spam, and delete it. In this post I will explain how I reduced spam by 95% using my "PLOW" system: Posts, Links, and Overt Words.

The first step in addressing spam was to identify it by keywords, such as the v word. By searching for obvious keywords, I could eliminate 50% of spam with zero false positives. But I soon realized that oftentimes the only way to identify spam was by using more common keywords such as "girl". Yet "girl" seemed too general. By having the bot crawl the forum every minute, I could guarantee that it would only ever see the first post of a topic, where legitimate posters would not use words like that. "Girl" mostly turns up in replies, if a flame war breaks out, and bots don't piggyback on existing topics most of the time.
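As a rough illustration of the keyword step (a sketch only, not the bot's actual source), the check could look something like this in Python. The word lists, the `vword` placeholder, and the function name `is_keyword_spam` are all made up for the example, and applying the common words only to first posts is an assumption based on the description above.

```python
import re

# Hypothetical word lists; the real bot's lists are not given in this thread.
OBVIOUS_WORDS = {"vword"}  # placeholder for the obvious spam keywords
COMMON_WORDS = {"girl"}    # too general to flag anywhere but a first post


def is_keyword_spam(text, is_first_post):
    """Return True if the post text looks like spam by keyword alone."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    # Obvious keywords are flagged wherever they appear.
    if words & OBVIOUS_WORDS:
        return True
    # Common keywords only count against the first post of a topic.
    return is_first_post and bool(words & COMMON_WORDS)
```

Splitting the list in two keeps the broad words from producing false positives in ordinary replies.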

To further decrease spam, I also deleted posts from users with Russian email addresses, or with free email services often used by spammers, such as cashette.com. This was very effective, as nobody on this forum will use a .ru address.
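A minimal sketch of the email check, again in Python and again only an illustration: the names `BLOCKED_DOMAINS`, `BLOCKED_SUFFIXES`, and `is_spam_email` are hypothetical, and the lists hold only the two examples mentioned above.

```python
# Hypothetical block lists built from the two examples in this post.
BLOCKED_DOMAINS = {"cashette.com"}
BLOCKED_SUFFIXES = (".ru",)


def is_spam_email(address):
    """Return True if the address is on a blocked domain or country suffix."""
    # Take everything after the last "@" and normalize the case.
    domain = address.rsplit("@", 1)[-1].strip().lower()
    return domain in BLOCKED_DOMAINS or domain.endswith(BLOCKED_SUFFIXES)
```

Matching on the domain rather than the whole address means a name like "guru" before the @ sign is not caught by the .ru rule.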

To reduce spam by a further 45%, I looked at the number of links in a post when its author had only one post. If there were 5 or more links, the post was declared spam.
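The link rule could be sketched like this (illustrative Python, not the real bot). Counting `http://` and `https://` occurrences is an assumption about what counts as a link, and `LINK_THRESHOLD` and `is_link_spam` are names invented for the example.

```python
import re

# Assumption: a "link" is any http:// or https:// occurrence in the post body.
LINK_PATTERN = re.compile(r"https?://", re.IGNORECASE)
LINK_THRESHOLD = 5  # from the post: 5 or more links counts as spam


def is_link_spam(text, author_post_count):
    """Flag link-heavy posts, but only from users with a single post."""
    if author_post_count != 1:
        return False
    return len(LINK_PATTERN.findall(text)) >= LINK_THRESHOLD
```

Restricting the rule to one-post users keeps regulars who paste a list of reference links from being flagged.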

While some spam still gets through, I am happy to say that spam has decreased by 95%. Hopefully the forum will be on the rise again.

If you are interested in the source of the bot, or wish to deploy a similar bot, PM me.

Best Regards,
Comprug
Client Moderator and Bot Developer
http://quirksonrails.blogspot.com/
Jason Weible


Joined: 03 Apr 2007
Posts: 3
Location: Dallas, TX
Did you ever try Freecap? I've been using it on our phpBB forums and haven't had any spammer bots sign up. It is definitely way better than the default one phpBB uses.

http://www.puremango.co.uk/cm_php_captcha_script_113.php
This may be a possibility
comprug
Forum Regular

Joined: 15 Feb 2006
Posts: 341
Jason,
this may be a possibility, although I don't know about the development team's arrangements. Darrell (the forum administrator) did tell me, however, that he had tried many different captchas. At this point it might be possible to install this one, although the bot usually works. The other problem is that it is not clear that XRumer's spam bots don't use mechanical turks. This program is pretty smart.
Jason101
Forum Regular

Joined: 14 Mar 2006
Posts: 548
Location: Harrisburg, PA
It's starting to get ridiculous. There are more spam posts in one day than there are legit posts in one month.

We need something solid implemented, like a moderator post review of all new posters. That will virtually eliminate all spam from being posted on the boards.
PRB


Joined: 29 Nov 2006
Posts: 23
I would just hate to see how much the other 95% of spam would be...
Moderator Review
comprug
Forum Regular

Joined: 15 Feb 2006
Posts: 341
Jason101 wrote: "We need something solid implemented, like a moderator post review of all new posters. That will virtually eliminate all spam from being posted on the boards."
Do we really want our inboxes to be spammed with registration requests? It would be better than the current situation, but I don't know... Wait! I've got it! Add a question to the registration form:
HostMySite is a web (blank)
Spammers won't customize their bot for it. On the PHP page that processes registration requests, put:
Code:
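<?php
// Normalize the answer to the "HostMySite is a web (blank)" question.
$code = isset($_POST['txtvalidate']) ? strtolower(trim($_POST['txtvalidate'])) : '';
if ($code != 'host') {
    // Wrong answer: send the bot to a fake success page and stop processing.
    header('Location: http://forums.hostmysite.com/fake_success_page.php');
    exit;
} else {
    // evaluate request
}
?>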
