[IMC-Tech] Preventing HTTP based spam by Captchas and content
filters
Alster
alster at indymedia.org
Fri Jan 27 18:34:52 PST 2006
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi!
John Milton wrote:
> 99.3 (current version) does have it, we are being bombed too (hamilton)
> and have had to switch it on...
As previously mentioned by Mark, CAPTCHAs effectively lock impaired
users out. The number of people you lock out can be reduced by
offering both visual and audible [1] CAPTCHAs.
Visual CAPTCHAs can be defeated easily. Software written by students
at UC Berkeley was able to defeat 92% of all tested (visual) CAPTCHAs in
2003 [2]. Another program called PWNtcha [3] is able to defeat many
common implementations, too, including those used by dadaimc when
they consist of digits only (it doesn't seem to handle lowercase
letters, though adding this should be easy) [4]. Some CAPTCHA
implementations even suffer from a faulty design that makes them
susceptible to replay attacks [5].
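To illustrate the replay problem: a challenge that stays valid after it
has been answered can be solved once by a human and then resubmitted by
a bot any number of times. A minimal sketch of a single-use challenge
store follows. The class name, the in-memory dict and the 300 second
lifetime are my assumptions for illustration, not taken from any
existing CAPTCHA implementation.

```python
import hmac
import secrets
import time


class ChallengeStore:
    """Hypothetical in-memory store; each challenge can be answered once."""

    def __init__(self, ttl_seconds=300):
        self.ttl = ttl_seconds
        self._pending = {}  # challenge_id -> (expected_answer, issued_at)

    def issue(self, answer, now=None):
        """Register a new challenge and return its unguessable ID."""
        now = time.time() if now is None else now
        challenge_id = secrets.token_urlsafe(16)
        self._pending[challenge_id] = (answer, now)
        return challenge_id

    def verify(self, challenge_id, answer, now=None):
        """Check an answer; the challenge is consumed either way."""
        now = time.time() if now is None else now
        # pop() removes the entry whether or not the answer is right,
        # so the same challenge ID can never be submitted a second time.
        entry = self._pending.pop(challenge_id, None)
        if entry is None:
            return False
        expected, issued_at = entry
        if now - issued_at > self.ttl:
            return False
        return hmac.compare_digest(str(answer).encode(), str(expected).encode())
```

A second verify() call with the same challenge ID returns False even if
the answer is correct, which is exactly what a replaying bot would hit.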
There are more reasons not to use CAPTCHA, as outlined in the OWASP
guide [6].
Whoever still wants to use one can find ready-made implementations at
various locations on the web [7].
A better countermeasure may be 'whitelisting human behaviour', i.e.
finding out how humans behave and restricting your web application to
accept new postings only within these limits. For example, no human who
is writing an article will submit it within five seconds of loading the
posting form. No human will submit multiple articles within a short
period of time (same session ID). Every human will follow your page
navigation, where page one can inform them about the moderation policy
and set a cookie, and page two can be the actual posting form, which
requires the cookie from page one to be present and to have been set no
more than 5 minutes ago. Some of these checks can also be implemented
in a web application firewall, so a site/server admin is not
necessarily dependent on the application developers here.
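A rough sketch of the timing part of this idea: page one hands out a
signed timestamp (as a cookie or hidden field), and the posting handler
accepts a submission only if that timestamp is at least 5 seconds and
at most 5 minutes old. The function names, the HMAC signing and
SECRET_KEY are assumptions of mine for illustration; the per-session
limit on multiple postings is left out.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"change-me"  # hypothetical server-side secret
MIN_SECONDS = 5            # nobody writes an article faster than this
MAX_SECONDS = 300          # cookie must be no older than 5 minutes


def _sign(session_id, issued):
    message = f"{session_id}:{issued}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def make_form_token(session_id, now=None):
    """Called on page one; the result goes into a cookie or hidden field."""
    issued = int(time.time() if now is None else now)
    return f"{issued}:{_sign(session_id, issued)}"


def submission_allowed(session_id, token, now=None):
    """Called by the posting handler before accepting an article."""
    now = time.time() if now is None else now
    try:
        issued_text, mac = token.split(":", 1)
        issued = int(issued_text)
    except (AttributeError, ValueError):
        return False  # missing or malformed token: page one was skipped
    expected = _sign(session_id, issued)
    if not hmac.compare_digest(mac.encode(), expected.encode()):
        return False  # token was forged or belongs to another session
    age = now - issued
    return MIN_SECONDS <= age <= MAX_SECONDS
```

Signing the timestamp means the server does not have to remember
anything per visitor, and a bot cannot simply invent an older
timestamp to get past the five second check.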
Of course, these logical checks can also be defeated easily, but the
circumvention techniques will have to be written for each codebase
separately, as each implementation differs. For spammers, it may not be
worth the effort just to defeat the IMC codebases' checks.
In addition, there is the content-based approach. No human will post
nothing but a set of URLs in an article, and known bad URLs (there are
DNSBLs and plain-text lists of spamvertized websites) should not be
posted anyway.
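As a sketch of these two content checks: reject a posting if any URL
in it points to a listed domain, or if nothing is left once the URLs
are removed. The domain list here is a hypothetical placeholder
standing in for a real plain-text list; no DNSBL lookup is attempted,
and the function names are my own.

```python
import re
from urllib.parse import urlsplit

URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)

# Hypothetical placeholder for a plain-text list of spamvertized domains.
BLOCKED_DOMAINS = {"spam.example", "pills.example"}


def extract_hosts(text):
    """Return the lowercased host name of every URL found in text."""
    hosts = []
    for url in URL_RE.findall(text):
        try:
            host = urlsplit(url).hostname
        except ValueError:
            continue  # malformed URL, e.g. a broken IPv6 literal
        if host:
            hosts.append(host.rstrip("."))
    return hosts


def is_blocked(host):
    """True if host is a listed domain or a subdomain of one."""
    return any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS)


def looks_like_spam(text):
    hosts = extract_hosts(text)
    if any(is_blocked(h) for h in hosts):
        return True
    # Strip the URLs and see whether any actual words remain.
    words = re.findall(r"\w+", URL_RE.sub(" ", text))
    return bool(hosts) and not words
```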
I'm not currently aware of an implementation that uses Bayesian
filtering to clean text posts on websites. I'm not sure why nobody has
done this yet; maybe there's a good reason I'm missing, but I can't
currently think of one. Bayesian filtering still works pretty well for
email, and email spam is a much older problem than what is known as
guestbook, blog or comment spam, or more generally HTTP-based content
submission spam.
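To make the Bayesian idea concrete, here is a small naive Bayes sketch
with add-one smoothing. This is a textbook variant, not what any
particular mail filter does, and the tokenizer, class name and labels
are my own assumptions. A real deployment would train on moderators'
hide/keep decisions rather than a handful of strings.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesFilter:
    """Hypothetical two-class (spam/ham) word-count classifier."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def _log_score(self, tokens, label, vocab_size):
        total_docs = sum(self.doc_counts.values())
        # Log prior and log likelihoods, both with add-one smoothing.
        score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
        total_words = sum(self.word_counts[label].values())
        for tok in tokens:
            count = self.word_counts[label][tok]
            score += math.log((count + 1) / (total_words + vocab_size))
        return score

    def spam_probability(self, text):
        tokens = tokenize(text)
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        vocab_size = max(len(vocab), 1)
        log_spam = self._log_score(tokens, "spam", vocab_size)
        log_ham = self._log_score(tokens, "ham", vocab_size)
        # Clamp the log odds so math.exp cannot overflow on long posts.
        diff = max(min(log_ham - log_spam, 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))
```

A posting would then be held for moderation when spam_probability()
exceeds some threshold chosen by the site, rather than being dropped
outright.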
Alster
[1] http://community.livejournal.com/lj_dev/588129.html
[2] http://www.cs.sfu.ca/~mori/research/gimpy/
[3] http://www.pwntcha.net/
[4] http://www.pwntcha.net/test.html?file=20060128030934Eky5rP.png
This link may go offline soon, as it points to a temporary page.
[5] http://www.puremango.co.uk/cm_breaking_captcha_115.php
[6] Look for "CAPTCHA" at
http://searchappsecurity.techtarget.com/originalContent/0,289142,sid92_gci1157286,00.html
Original guide (PDF) at http://www.owasp.org
[7] http://en.wikipedia.org/wiki/Captcha#Captcha_implementations
- --
GPG key
http://keys.indymedia.org/cgi-bin/lookup?op=get&search=05059C17
Fingerprint 1B8B 128F 8435 541C B3A5 1B7E CF5A 9D55 0505 9C17
All other http://docs.indymedia.org/view/Main/AlsteR
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (GNU/Linux)
iD8DBQFD2thMz1qdVQUFnBcRAhQKAJ4oAlxbJTpbnc2YSwH76vht6obZvgCffTD1
uPEGi0wQhwuyUnu6c+ZoGQc=
=5DQt
-----END PGP SIGNATURE-----