Over the last year this site, and other sites using different wiki engines, have been plagued by robots planting lots of links on a random selection of pages. This has become a serious problem and needs a fix. But before that, I would like to thank all the people who have recently despammed this site manually!
On TaviPatches Jay Sheth has provided a [link] to an anti-spam patch he has written for 'Tavi. This patch has been adopted into the main code of 'Tavi, and with release 0.26 it will be available as an option. That is, if you set the variable $UseCaptcha=1 in config.php, the preview window will add a box at the top using [captcha], and will not allow pages to be saved without entering the correct combination of characters.
Hopefully this will prevent this and other sites from being spammed. Note that the 'Tavi implementation does not include the logging of spam attempts. This is for the simple reason that I didn't want to change the database tables. I do, however, challenge someone to make a patch that would allow for this logging, with a description of how to add the necessary tables and so on.
Thanks for implementing the anti-spam feature, Even, and for giving me credit in the source. Your implementation looks good. I will look at the code in more detail, try it some more, and then add feedback here. - JaySheth
Some thoughts on the spam issue:
{IP blacklists are of very limited effectiveness, since it is very easy to change your IP. In fact most ISPs use dynamic IPs which change frequently. Blacklisting IPs has the side effect of potentially blocking legitimate users who happen to get the same IP that was previously used by a spammer. The expiration time on an IP block should therefore usually be short, unless there is a pattern of repeated attacks from the same IP. But consider, for instance, a large school with an internal network and a single public IP: one virus-infected computer blocks the whole school.}
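To make the "short expiry, longer for repeat offenders" idea concrete, here is a minimal sketch in Python. It is not 'Tavi code: the class name `ExpiringBlocklist`, the one-hour starting block, and the doubling rule are all assumptions chosen for illustration, and a real wiki would keep this state in its database rather than in memory.

```python
import time


class ExpiringBlocklist:
    """In-memory IP blocklist whose entries lapse after a short time.

    Hypothetical helper: each repeated offence from the same IP doubles
    the block length, up to max_ttl seconds.
    """

    def __init__(self, base_ttl=3600, max_ttl=7 * 24 * 3600):
        self.base_ttl = base_ttl
        self.max_ttl = max_ttl
        self._strikes = {}   # ip -> number of spam attempts seen so far
        self._expires = {}   # ip -> time (in seconds) at which the block ends

    def report_spam(self, ip, now=None):
        """Record a spam attempt and return the block length in seconds."""
        now = time.time() if now is None else now
        strikes = self._strikes.get(ip, 0) + 1
        self._strikes[ip] = strikes
        ttl = min(self.base_ttl * 2 ** (strikes - 1), self.max_ttl)
        self._expires[ip] = now + ttl
        return ttl

    def is_blocked(self, ip, now=None):
        """Return True while the block for this IP has not yet expired."""
        now = time.time() if now is None else now
        expires = self._expires.get(ip)
        if expires is None:
            return False
        if now >= expires:
            del self._expires[ip]   # block has lapsed; strike count is kept
            return False
        return True
```

A first offence blocks the address for `base_ttl` seconds, so a legitimate user who inherits a spammer's dynamic IP is locked out only briefly. The shared-IP problem (the school example) is not solved by this; it only limits how long the whole school stays blocked.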
The current solution requires the user to accept cookies without telling him so. If you have cookies disabled, you are simply returned to the edit page when saving, as if you had entered a wrong captcha. The anti-spam system should work without cookies, or at least should warn the user to accept them.
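One possible way to drop the cookie requirement is to carry the expected answer in a hidden form field, signed so the client cannot read or forge it. The sketch below is in Python and is only an illustration of that idea, not how 'Tavi does it; `SECRET_KEY`, `make_token`, `check_token`, and the 15-minute lifetime are all hypothetical names and values.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"change-me"  # hypothetical server-side secret, never sent to the client


def _sign(answer, issued):
    """HMAC over the issue time and the normalised answer."""
    msg = f"{issued}:{answer.strip().lower()}".encode("utf-8")
    return hmac.new(SECRET_KEY, msg, hashlib.sha256).hexdigest()


def make_token(answer, now=None):
    """Build the value for a hidden form field shown next to the captcha."""
    issued = int(time.time() if now is None else now)
    return f"{issued}:{_sign(answer, issued)}"


def check_token(token, typed, max_age=900, now=None):
    """Return True if `typed` matches the answer the token was issued for."""
    now = time.time() if now is None else now
    try:
        issued_str, sig = token.split(":", 1)
        issued = int(issued_str)
    except ValueError:
        return False              # malformed token
    if issued > now or now - issued > max_age:
        return False              # token from the future, or too old
    expected = _sign(typed, issued)
    # Compare as bytes so odd characters in a forged token cannot raise.
    return hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8"))
```

The edit form would include `make_token(answer)` as a hidden field, and the save handler would call `check_token(token, typed)`. One caveat: a token solved once by a human can be replayed until it expires, so this only removes the cookie dependency; it does not make the captcha any stronger.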
Does anyone have an idea why we are still being spammed, or how? How do they circumvent the captcha? Has the captcha idea proven to be useless? Do we need other measures?
Your text-based captcha won't stop a determined spammer. Because it is text-based, writing a program that can decipher your captcha is quite a bit of work but not too difficult. Even image-based captchas are not immune: a bunch of computer science students at MIT had a contest and turned in some impressive results with image recognition programs. Also, if there is enough money in it, some spammers have resorted to hiring cheap labor to actually sit at keyboards and manually spam a site, though fortunately this is still pretty rare. Nonetheless, a text-based captcha is sufficient to stop most of the spammers. Some sites have found that a question-and-answer format is very effective, such as: "What is two plus three?". This is especially effective if the questions require knowledge of a particular subject.
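The question-and-answer format can be sketched in a few lines of Python. This is a minimal illustration, not an existing patch: the `QUESTIONS` list and the function names are made up, and the second question is just an example of the subject-specific kind mentioned above.

```python
import random

# Hypothetical question pool: each entry is (question, set of accepted answers).
QUESTIONS = [
    ("What is two plus three?", {"5", "five"}),
    ("Which file holds the $UseCaptcha setting?", {"config.php"}),
]


def pick_question(rng=random):
    """Return (index, question text) for a randomly chosen question."""
    idx = rng.randrange(len(QUESTIONS))
    return idx, QUESTIONS[idx][0]


def check_answer(idx, typed):
    """Return True if `typed` is an accepted answer to question `idx`."""
    if not 0 <= idx < len(QUESTIONS):
        return False
    return typed.strip().lower() in QUESTIONS[idx][1]
```

The edit form would show the text from `pick_question()` and send the index back with the answer. A small fixed pool like this is easy for a targeted bot to learn, so it mainly helps against generic spam robots that are not written for this particular site.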
MediaWiki seems to do this via IP banning and username revocation. While the measures above are good preventatives, there are ways around everything. The captcha is one of the harder things to get around, but it is also very tough for humans to get through. If accessibility is a concern then captchas are just mean! For the most part the real problems are likely to be human. If we can ban IPs and usernames, perhaps even only from certain areas, we will do fine! - Frink