Monthly Archives: February 2011

How to spam this blog

1 minute, 40 seconds

As a follow-up to last week’s post (How to comment on this blog), this week I bring you the results of the no-captcha test.

After much spam slipped through reCAPTCHA, I decided to nix the captcha altogether. Originally I thought that just requiring a field via JavaScript, with no server-side checking, would work. This was silly of me, of course. The spammers, having the source code of WordPress, would just blindly submit a comment to any post, bypassing any client-side JS checks I had in place.

The fix was to create a field that, unlike reCAPTCHA’s, was not known to spammers. Further, if the field is appended via JavaScript, it is even harder to automate. I wrote the simple-math plugin (have a copy!) and implemented it as follows:

  • Turn off reCAPTCHA
  • Add a field via javascript
  • Ask a simple math question, validated in client side JS
  • Only validate that the field exists, not that the math is right, on the server side
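The server-side half of that list can be sketched roughly like this. It is a minimal illustration in TypeScript, not the plugin's actual code; the field name `simple_math_answer` and the function `acceptComment` are hypothetical names made up for the example.

```typescript
// Hypothetical shape of a submitted comment form: field name -> raw value.
type FormFields = Record<string, string | undefined>;

// Assumed name of the field that the client-side script appends to the form.
const MATH_FIELD = "simple_math_answer";

// Accept the comment only if the script-appended field was submitted at all.
// Deliberately does NOT check that the answer is arithmetically correct.
function acceptComment(fields: FormFields): boolean {
  return (
    Object.prototype.hasOwnProperty.call(fields, MATH_FIELD) &&
    fields[MATH_FIELD] !== undefined
  );
}

// A script that blindly posts only the stock comment fields is rejected...
const botPost: FormFields = { author: "bot", comment: "buy pills" };
// ...while a browser that ran the script passes, even with a wrong answer.
const browserPost: FormFields = {
  author: "ann",
  comment: "nice post",
  [MATH_FIELD]: "99",
};

console.log(acceptComment(botPost)); // false
console.log(acceptComment(browserPost)); // true
```

The point of the sketch is that the server only asks "did this form come from a page that ran my script?", which is enough to stop scripts that post the stock WordPress fields and nothing else.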

The jury is in, and I’m fully vindicated. Here are the stats:

| Period | Hits | Comment Attempts | Comment Successes | Attempts per Visit | Defense Success Rate |
|---|---|---|---|---|---|
| Feb 6th-12th | 1191 | 57 | 17 | 4.79% | 70.18% |
| Feb 12 11pm – Feb 13 10am | 58 | 20 | 13 | 34.48% | 35.00% |
| Feb 13th-Feb 18th | 1204 | 132 | 0 | 10.96% | 100.00% |

Two things are important to note. First, the number of raw hits (excluding me, Yahoo and Google) was about the same week to week. Second, the number of attempts more than doubled (57 to 132), and 100% of them were thwarted (Defense Success Rate). Again, I suspect this is all possible because it’s neither easy nor worthwhile (it’s OK, plip isn’t a big blog, I know…sniff) to automate spamming against one-off solutions like mine.
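For anyone who wants to check the percentages, here is a small sketch of the arithmetic. The counts are the ones from the stats table; the type and function names are mine, and I am assuming "Attempts per Visit" is attempts divided by hits and "Defense Success Rate" is blocked attempts divided by total attempts, which matches the published figures.

```typescript
// Raw counts copied from the stats table in this post.
interface Period {
  label: string;
  hits: number;
  attempts: number;
  successes: number;
}

const periods: Period[] = [
  { label: "Feb 6th-12th", hits: 1191, attempts: 57, successes: 17 },
  { label: "Feb 12 11pm - Feb 13 10am", hits: 58, attempts: 20, successes: 13 },
  { label: "Feb 13th-18th", hits: 1204, attempts: 132, successes: 0 },
];

// Spam comment attempts as a percentage of raw hits.
function attemptsPerVisit(p: Period): number {
  return (p.attempts / p.hits) * 100;
}

// Percentage of comment attempts that did not get through.
function defenseSuccessRate(p: Period): number {
  return ((p.attempts - p.successes) / p.attempts) * 100;
}

for (const p of periods) {
  console.log(
    `${p.label}: ${attemptsPerVisit(p).toFixed(2)}% attempts per visit, ` +
      `${defenseSuccessRate(p).toFixed(2)}% blocked`
  );
}
```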

I should note that I used the free version of Splunk to garner the ad hoc stats for this post. As I was hemming and hawing over whether to count cookies or IPs or hits, it wasn’t worthwhile to use old-school command-line stats. Splunk scoffs at this level of stats and reporting. Really, it’s above it, but it will happily crank out what you ask of it with ease. Here’s a purty graph:

Caveat Emptor: I work at Splunk.

How to comment on this blog

1 minute, 20 seconds

It seems that reCAPTCHA is a victim of its own success. Y’all know I’m a huge, huge fan. However, recently the spammers have started to submit comments, successfully getting past the reCAPTCHA. I suspect this is a Mechanical Turk or some such tomfoolery. Of course the comments don’t get approved, but they’re still a bother to have to delete.

Our friend over at hanskellner.com (guess which friend?) also has the same problem with submitted spam. This makes it clear that reCAPTCHA is being targeted (well, not clear, but it’s better than n=1!). However, he found a solution to stop the spammers. He added a static math question to his comment form. That is, it’s always “what is 5 + 6”, never any other question. Funny enough, his spam stopped altogether. He still has his reCAPTCHA going, but now it’s a two-factor anti-spam.

I posit that the reCAPTCHA code is easy enough to programmatically detect, but some random math question isn’t, so it breaks the spam scripts. Let’s test this theory, shall we? I’ve just written a WordPress plugin called simple-math. Using simple-to-hack, all-client-side JavaScript, there’s now an easy-to-solve math problem on the comment form. It is random, choosing two numbers between 0 and 9. I haven’t tested it too broadly, but you’re welcome to a copy.
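To give a feel for how little is involved, here is a rough sketch of the question logic in TypeScript. This is not the plugin's source: the names `makeQuestion`, `promptText` and `isCorrect` are made up for illustration, and I am assuming the problem is an addition (like the 5 + 6 example) with whole-number operands from 0 to 9 inclusive. The DOM wiring that puts the field on the comment form is left out.

```typescript
// A math question with two operands, each an integer from 0 to 9 inclusive.
interface MathQuestion {
  a: number;
  b: number;
}

// `rand` is injectable so the sketch can be exercised deterministically;
// it must return a value in [0, 1), like Math.random does.
function makeQuestion(rand: () => number = Math.random): MathQuestion {
  const digit = (): number => Math.floor(rand() * 10);
  return { a: digit(), b: digit() };
}

// Text shown next to the answer field (assumed wording).
function promptText(q: MathQuestion): string {
  return `What is ${q.a} + ${q.b}?`;
}

// Client-side check: the answer must be a plain non-negative integer
// equal to the sum. Anything else (blank, words, decimals) fails.
function isCorrect(q: MathQuestion, answer: string): boolean {
  const trimmed = answer.trim();
  return /^\d+$/.test(trimmed) && Number(trimmed) === q.a + q.b;
}

const question = makeQuestion();
console.log(promptText(question));
console.log(isCorrect(question, String(question.a + question.b))); // true
```

Since all of this runs in the browser, anyone reading the page source can see the answer logic, which is exactly the "simple to hack" part.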

I’ll let it run for a week and see how it goes and report back.

Feb 13th Update: I fought the law, and the law won! Spammers got past round one of simple math. I’ve updated it to now check for the existence of the field on post, but there is still no checking for a right answer on the server. As well, the field is now created via JavaScript. Spammers, back to you for round 2.