October 12, 2003

More porn spam
Posted by Teresa at 01:17 PM *

Now it’s “Preteen”. I’ve gotten hit with ten comments so far. I’ve locked them out. They’ve spammed porn advertisements into my Christmas post, and my post about Tim Maroney’s death. This is vile and indecent.

Patrick, from the next room, just said they hit thirty-eight of his topic threads before he got them blocked.

Grrrrrrrrrr.

Block them now: 209.210.176.20

IMPORTANT ADDITIONAL INFO: adamsj says:
Since we’ve gotten attacks from multiple addresses in this range (I’ve had three different ones in the last day), block the whole range, thus, as the MT Support Forum notes:

209.210.176.

That’ll get ‘em all, for now.

Make sure you include the period at the end of the address.

The savvy Erik Olson has been trying to find a way to block this spammer’s range of IP addresses (209.210.176.0 - 209.210.176.63) without blocking unoffending addresses in the adjacent ranges. However, having now looked into MT’s source code, he says:
They [Movable Type] are just doing matching, not netmasking. … You have two choices.

1) Block 209.210.176. which blocks the whole 255, or

2) Block 209.210.176.1, 209.210.176.2, all the way to 209.210.176.63, individually.

That is, if you don’t want to have to block unoffending addresses, you have to block 209.210.176.1, 209.210.176.2, 209.210.176.3, … 209.210.176.63, inclusive, as sixty-three separately blocked IP addresses.
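
The trade-off between the two choices can be sketched in a few lines of Python. This is only an illustration, not Movable Type's actual code: the helper names `prefix_blocked` and `cidr_blocked` are hypothetical, and the sketch assumes, following Erik's reading of the source, that MT's ban list does a plain string-prefix comparison.

```python
import ipaddress

def prefix_blocked(addr, banned_prefix):
    """String-prefix matching, as MT's ban list is assumed to work."""
    return addr.startswith(banned_prefix)

def cidr_blocked(addr, banned_net):
    """Netmask-style matching, which MT's ban list reportedly does not do."""
    return ipaddress.ip_address(addr) in ipaddress.ip_network(banned_net)

spammer = "209.210.176.20"     # inside the spammer's range (0-63)
bystander = "209.210.176.200"  # same /24, outside the spammer's range

# Choice 1: the prefix ban catches the spammer and the bystander alike.
print(prefix_blocked(spammer, "209.210.176."), prefix_blocked(bystander, "209.210.176."))

# A /26 ban would catch only the spammer's range, but needs netmasking.
print(cidr_blocked(spammer, "209.210.176.0/26"), cidr_blocked(bystander, "209.210.176.0/26"))

# Choice 2: the individual addresses to enter by hand, .1 through .63.
individual = [f"209.210.176.{n}" for n in range(1, 64)]
print(len(individual))
```

The list in `individual` could be pasted into the ban list one entry at a time if the broader prefix ban turns out to block legitimate readers.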

Fast-breaking information is likely to show up in the comments following this post, so keep an eye on them.

UPDATES: Matt, over on Electrolite, reports that the guy is also using 62.42.228.6, so block that one too.

In a message thread on Where Worlds Collide, Scott of The Gamer’s Nook also identified 199.20.16.200 as an IP address belonging to this spammer.

Erik Olson has dug up a related Malaysian spammer: 219.95.14.69

Mary says this guy — no evident relation to the Lolita and Preteen spammer — has been posting ads for Viagra and hardcore in the comments of Pacific Views: 80.50.117.113

Wink advocates using this approach.

Movable Type’s authors discuss the issues, problems, and some possible answers to comment spam.

Joe Katzman’s Winds of Change has further suggestions on what needs to be done, and how to do it.

Mitch Wagner has commended Yoz Grahame’s piece in Wistful Chocolate, Seven Quick Tips for a Spam-Free Blog.

Comments on More porn spam:
#1 ::: Ampersand ::: (view all by) ::: October 12, 2003, 01:29 PM:

Thanks for the info!

#2 ::: Yonmei ::: (view all by) ::: October 12, 2003, 01:31 PM:

Bastard!

You've got his postal address? Sign him up. To credit card offers. To magazine subscriptions. To... well, anything that comes through the door. If all your American readers each signed him up to 4 items apiece, how much mail would he end up getting on a daily basis?

#3 ::: David Bilek ::: (view all by) ::: October 12, 2003, 01:34 PM:

The address on the WHOIS lookup is *not* necessarily accurate. People lie on those all the time.

That IP block is owned by SISNA CORP out of Salt Lake City. SISNA contact numbers are:

Main Office Voice/Fax: 801-415-8145
Customer Service: 801-924-2600

#4 ::: adamsj ::: (view all by) ::: October 12, 2003, 01:35 PM:

I feel like a spammer myself, repeating this, but:

Since we've gotten attacks from multiple addresses in this range (I've had three different ones in the last day), block the whole range, thus, as the MT Support Forum notes:

209.210.176.

That'll get 'em all, for now.

#5 ::: Erik V. Olson ::: (view all by) ::: October 12, 2003, 01:40 PM:

testing, ignore

#6 ::: Erik V. Olson ::: (view all by) ::: October 12, 2003, 01:42 PM:

And again... Ignore

#7 ::: Teresa Nielsen Hayden ::: (view all by) ::: October 12, 2003, 01:49 PM:

j, that includes the period at the end, yes?

#8 ::: adamsj ::: (view all by) ::: October 12, 2003, 01:54 PM:

Teresa,

According to the MT Support Forum, yes.

#9 ::: adamsj ::: (view all by) ::: October 12, 2003, 02:05 PM:

Here are the relevant posts:

http://www.movabletype.org/support/index.php?act=ST&f=8&t=29039&hl=ip+banning&s=1b59ac7279842913990f7ef2d33172b7

http://www.movabletype.org/support/index.php?act=ST&f=12&t=26932&hl=ip+banning&s=1b59ac7279842913990f7ef2d33172b7

http://www.movabletype.org/support/index.php?act=ST&f=12&t=28997&hl=ip+banning&s=1b59ac7279842913990f7ef2d33172b7

There is some doubt in my mind whether they're saying add these through MT IP Banning or directly into .htaccess, but I see no harm in doing both. I used the GUI, and have yet to be hit again.

#10 ::: adamsj ::: (view all by) ::: October 12, 2003, 02:06 PM:

Oh, and ampersand--you've got a very dark sense of humor.

#11 ::: Teresa Nielsen Hayden ::: (view all by) ::: October 12, 2003, 02:10 PM:

Ampersand's right. Bloggers who block fast can save themselves a lot of cleanup.

Next: Teresa goes off and deletes thirty-eight spams in Electrolite's comment threads.

#12 ::: Erik V. Olson ::: (view all by) ::: October 12, 2003, 02:14 PM:

.htaccess is the way to go, simply because you can use netmasks or CIDR to limit what you cut, and you don't have to type in a few dozen IPs.

It's dependent on .htaccess configuration being turned on. Contact your sysadmin if this doesn't work. Otherwise, create the following file in the root of your webserver, named ".htaccess". The leading dot is important.

<Files *>
Order Deny,Allow
deny from 209.210.176.0/26
</Files>

Or, if you prefer netmasks to CIDR:
deny from 209.210.176.0/255.255.255.192

Kip's spammer is in a 16-bit netblock (an old Class B) issued to Telecom Malaysia. If they are using DHCP, and this spammer's dialing up, the only way is to kill the whole Class B -- over 65,000 addresses. You can do that in one line:

deny from 219.95.0.0/16
or
deny from 219.95.0.0/255.255.0.0

How much extra damage this will do, I don't know. You may want to try just the one IP first. If you keep getting hit from the 219.95.14.0 block, expand it to the /24; if that doesn't work, go for the whole /16.

And, just so you know, one way to give up is:

deny from 0.0.0.0/0

Though, technically, the trailing /0 is redundant.
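
For anyone who wants to check a CIDR block against its dotted netmask before editing .htaccess, Python's standard `ipaddress` module can do the arithmetic. This is a small sketch for verification only; it does not touch Apache, and the two networks are simply the ones discussed in this thread.

```python
import ipaddress

# The Lolita/Preteen spammer's range: 209.210.176.0 - 209.210.176.63
small = ipaddress.ip_network("209.210.176.0/26")
print(small.netmask, small.num_addresses)   # 255.255.255.192 64
print(small[0], small[-1])                  # 209.210.176.0 209.210.176.63

# The larger block discussed for the Malaysian spammer
big = ipaddress.ip_network("219.95.0.0/16")
print(big.netmask, big.num_addresses)       # 255.255.0.0 65536

# CIDR and dotted-netmask notation describe the same network.
same = ipaddress.ip_network("209.210.176.0/255.255.255.192") == small
print(same)                                 # True
```

Running a candidate block through this first shows exactly how many addresses a single deny line would cut off.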


#13 ::: adamsj ::: (view all by) ::: October 12, 2003, 02:22 PM:

Erik's solution is best, if you can get your web hosting company to do it. (There is an .htaccess file in the Movable Type installation, but it is not the same as an Apache .htaccess file.)

#14 ::: adamsj ::: (view all by) ::: October 12, 2003, 02:33 PM:

I think I was wrong about the last parenthetical--the .htaccess in the webdocs/logs directory does appear to be an Apache-style .htaccess file.

However, it is 444 and owned by root, so:

Erik's solution is still best, if you can get your web hosting company to do it.

#15 ::: catie murphy ::: (view all by) ::: October 12, 2003, 02:52 PM:

Ok, my web hosting company is generally very good about responding to this kind of thing when I tell them about it, but they always ask exactly what the solution is, and I'm not a hundred and ten percent sure of what part Erik is suggesting to ask the hosting company to do. Do I specifically want to ask them to add:


Order Deny,Allow
deny from 209.210.176.0/26

to the .htaccess for their servers?

-Catie, who understands just enough to be dangerous :)

#16 ::: qB ::: (view all by) ::: October 12, 2003, 03:40 PM:

My (gorgeous) host already has MT Blacklist installed and so it's operational over on my site. But not before I had to delete numerous large mammary glands and tweenies from my comments.

It seems to have a long, but manageable, list of things like, to take the Nabokov example, "Sisnai" which traps the whole netblock (or something).

I'll find out when I get up tomorrow if it works.

#17 ::: qB ::: (view all by) ::: October 12, 2003, 03:47 PM:

Actually, it's his existing plug-in apparently. But it's a good stop-gap. Not long for the programme though.

#18 ::: John Cole ::: (view all by) ::: October 12, 2003, 04:07 PM:

Thanks so much. This is so irritating.

#19 ::: Erik V. Olson ::: (view all by) ::: October 12, 2003, 04:11 PM:

Catie --

No, you don't need them to add it. You can upload the .htaccess text file in the same way you put your website there -- usually, FTP, but whatever tool you used will work.

It goes into the same directory that the main index.html file goes into.

BTW, you can't just type < and > into these comments, since they render HTML. HTML tags are marked by angle brackets, so they "render" into nothingness. Instead, you have to use HTML entities for them -- "&gt;" is > "&lt;" is <.

All HTML entities take the form of &FOO;. ‡ The ampersand marks the start, the semicolon, the end, and FOO describes exactly which character you need.

And, yes, you can't just type an ampersand here, you use "&amp;" You can find a comprehensive list of them here.

‡ If you do a view source, you'll discover how to do footnotes -- and that the FOO part *is* case sensitive. †

† See? &Dagger; is different than &dagger;.
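
If typing entities by hand gets tedious, the escaping can be done mechanically. The sketch below uses Python's standard `html` module; the sample string is made up for illustration, and whether a given comment form then displays the result as intended is an assumption to test on your own blog.

```python
import html

# Text you want readers to see literally in a comment.
raw = "<Files *> ... </Files> & more"
escaped = html.escape(raw, quote=False)
print(escaped)  # &lt;Files *&gt; ... &lt;/Files&gt; &amp; more

# Entity names are case sensitive: these decode to two different characters.
lower = html.unescape("&dagger;")
upper = html.unescape("&Dagger;")
print(lower, upper)
```

Paste the escaped string into the comment box and the angle brackets should survive rendering.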

#20 ::: Scott ::: (view all by) ::: October 12, 2003, 04:28 PM:

Thanks for the additional IPs, Teresa; all duly logged into the banning software.

I couldn't believe I was nailed so badly by Preteen (89, for those who don't read my site). Lolita still has residue I'm just too tired to remove at the moment.

#21 ::: adamsj ::: (view all by) ::: October 12, 2003, 04:34 PM:

Erik,

I'll be darned--I did not know that the .htaccess file in webdocs would work like a regular .htaccess file (I am a database guy, not a webserver guy), but it sure enough does.

I did know that someone smarter than me would be along to fix the problem, though--thanks!

Everyone else,

Wherever Erik's comments and mine contradict each other in this thread, listen to him, not to me.

#22 ::: catie murphy ::: (view all by) ::: October 12, 2003, 04:42 PM:

Erik--

Actually, I understood how to do it for my own site. I was just trying to grok if there was in fact something I could tell my hosting provider that they could do on their level so that the twelve zillion other people who host with them wouldn't have to edit their own .htaccess files.

*embarrassed look* Er, I knew the comments wouldn't render the brackets. I just didn't think about it when I posted. *slinks off* :)

Thanks for the help. :)

-Catie

#23 ::: Eric Chapman ::: (view all by) ::: October 12, 2003, 05:24 PM:

Okay, here's a question. Judging from what you've hinted about them advertising, if you have their address etc, why not report them to the feds? This is the sort of thing they LOVE throwing people in jail for: it gets them all sorts of good press.

#24 ::: Graydon ::: (view all by) ::: October 12, 2003, 05:29 PM:

It's not just porn spam; there's a "jewelry store" gibberish post with lots of links in the Iraq Antiquities comment thread.

It's enough to make one have fond thoughts about very large orbiting X-ray lasers.

#25 ::: Tim Hall ::: (view all by) ::: October 12, 2003, 06:31 PM:

Blogdex is listing the victims so far; a lot of Big Names.

Hopefully, if "The Power of the Blogosphere" can destroy lying journalists and racist politicians, it should be able to grind a scumbag spammer into the dust.

#26 ::: Charlie Stross ::: (view all by) ::: October 12, 2003, 07:08 PM:

More to block ...

After Feorag's prattle got hit, I blocked access to the cgi-bin directory for the known offender IP addresses (in Apache). I then renamed mt-comments.cgi to something less obvious.

Examining the server logs since then, I discover (a) lots of attempts to spam being blocked, and then a change of strategy: a number of hosts are (b) trying to call mt-comments.cgi by name and being rebuffed. As mt-comments.cgi is no longer pointed to by the prattle blog, these can only be robots looking for a Movable Type installation.

The offenders are:

38.144.36.13
65.214.36.118
203.54.241.113
4.63.166.229
12.148.209.198
66.196.90.39
62.42.228.6
82.41.201.108
68.194.33.229
81.131.176.87

If you're doing IP-level blacklisting, have fun.
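
A log check of this kind can be sketched as follows. Everything here is illustrative: the sample lines are invented, use reserved documentation addresses rather than real offenders, and assume Apache's Common Log Format. The helper `hits_on` is a hypothetical name.

```python
import re
from collections import Counter

# Invented sample lines in Common Log Format; real logs will differ.
SAMPLE_LOG = """\
192.0.2.10 - - [12/Oct/2003:18:01:02 +0000] "POST /cgi-bin/mt-comments.cgi HTTP/1.0" 404 210
198.51.100.7 - - [12/Oct/2003:18:03:40 +0000] "GET /cgi-bin/mt-comments.cgi HTTP/1.0" 404 210
192.0.2.10 - - [12/Oct/2003:18:05:11 +0000] "POST /cgi-bin/mt-comments.cgi HTTP/1.0" 404 210
203.0.113.5 - - [12/Oct/2003:18:06:00 +0000] "GET /index.html HTTP/1.0" 200 5120
"""

# Captures: client IP, request method, request path.
LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+)')

def hits_on(script, log_text):
    """Count requests per client IP whose path names the given script."""
    counts = Counter()
    for line in log_text.splitlines():
        m = LINE.match(line)
        if m and script in m.group(3):
            counts[m.group(1)] += 1
    return counts

counts = hits_on("mt-comments.cgi", SAMPLE_LOG)
for ip, n in counts.most_common():
    print(ip, n)
```

A request for the old script name is a lead rather than proof, since cached pages and old links point there too; repeated POSTs from one address are the stronger signal.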

#27 ::: --kip ::: (view all by) ::: October 12, 2003, 07:20 PM:

Gack.

First, thanks are due to Erik for digging up the Malaysian spammer, who appears to be much less... resourceful than Lolita Long-in-Tooth.

Second: does anyone know why they keep hitting the same posts over and over? All three waves (Lolita, Preteen 1, Preteen 2) have hit the same five posts (plus others), over a range of dates, some with comments already and some without, all within my Comics category. --I mean, I suppose you'd have to snag a copy of the bot and look at the code; I'm just pondering idly while hosing down my weblog for the second goddamn time today.

#28 ::: Glen Engel-Cox ::: (view all by) ::: October 12, 2003, 07:53 PM:

One person mentioned MT-Blacklist already, but here's the URL for it for those interested.

http://www.jayallen.org/journey/2003/09/killing_comment_spam_for_dummies

Jay's going to be releasing the full plug-in tomorrow, and it looks like a pretty nice solution, especially as the spammers start to modify their IPs. MT-Blacklist will stop the posting of the live URL itself.

#29 ::: Jenn M Lee ::: (view all by) ::: October 12, 2003, 07:54 PM:

Just a quick word of thanks for sharing all this information--and the comment letting me know about it.

Making Light--more than good fun!

#30 ::: Glen Engel-Cox ::: (view all by) ::: October 12, 2003, 07:56 PM:

Doh! Read the other entry and see you've already found Jay's site.

(Must Remember: Read Entire Site Before Posting.)

#31 ::: Erik V. Olson ::: (view all by) ::: October 12, 2003, 08:06 PM:

As mt-comments.cgi is no longer pointed to by the prattle blog, these can only be robots looking for a Movable Type installation

Which is what I was wondering. I'm still wondering how they spider the site -- I'm trying to find a trap that I can install that doesn't involve rewriting parts of MT.

One defense is to rename mt-comments.cgi, call only it, and install a trap cgi that automatically blocks anyone calling mt-comments.cgi.

Problem: Anybody who follows an old link to your comments will get blocked.

Renaming the mt-comments.cgi alone may be enough to stop this spammer's bot -- but it may not last long.

To rename it, you'll need to rename the mt-comments.cgi file, then you'll need to edit your template to change the call. Look for the OpenComments javascript, in it, there will be a "window.open" call that should have, as a first parameter, a URL ending in mt-comments.cgi. Change that to match what you've renamed the comments cgi.

Easiest way to not lose.

1) Copy mt-comments.cgi to something else. Make up a name. I'm deliberately *not* giving a name here, I want all of you to have different ones. (If the spider is looking for names, and you all change to the same name, he'll change the spider.) Make certain it ends in .cgi, though.

2) Edit your template, change the "window.open" call to the new name. On this blog, the function is right at the top, I don't know if that's universally true. Save it off.

Reload the page. Make sure comments still work. Now.

3) Rename mt-comments.cgi to mt-comments.off.

Reload page, make sure comments still work. If they do, then you're done. If they don't, rename mt-comments.off back to mt-comments.cgi, and check your templates to make sure you've changed the OpenComments function.

I wish I could test this, but I don't run MT or have a blog, I'm just a sysadmin. As Knuth famously wrote, "Beware of bugs in the above code; I have only proved it correct, not tried it."

The truly paranoid would back everything up first. The properly paranoid would make sure that the restore worked, as well.
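
For step 1, one way to make sure everyone ends up with a different name is to let a random generator pick it. A minimal sketch, assuming Python is available on your own machine; `random_script_name` is a hypothetical helper, and the copying, template editing, and renaming are still done by hand as described above.

```python
import secrets

def random_script_name(nbytes=8):
    """Return an unguessable file name: random hex digits plus '.cgi'."""
    return secrets.token_hex(nbytes) + ".cgi"

name = random_script_name()
print(name)
```

Each run prints a different sixteen-digit name, so no two blogs following these steps should collide.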

#32 ::: Erik V. Olson ::: (view all by) ::: October 12, 2003, 08:18 PM:

Hmm. Charlie -- anyone with a cached version of your page who hit a comment link would have tried to get the old mt-comments.cgi. I wouldn't condemn those IPs without further proof. At least three of them are DSL lines.

cat baby bathwater > /dev/null

#33 ::: Bruce Baugh ::: (view all by) ::: October 12, 2003, 08:19 PM:

I assume I'm not alone in finding this a somewhat boggling combination of enterprise and pathetic futility on the spammer's part. Surely the return on random links in blog comments must be low?

#34 ::: Erik V. Olson ::: (view all by) ::: October 12, 2003, 08:58 PM:

I bet it's worse than for email spam. The question, however, is how much worse -- and how much effort did the spammer spend on the bot?

If it took him five minutes, and he gets one hit -- he won.

#35 ::: Graydon ::: (view all by) ::: October 12, 2003, 09:04 PM:

Anything that shuts down liberal blogs is probably worth money in the right quarters, too; this doesn't have to be directly commercial to be a net win for someone, especially if the existence of the porn links is an offence under some child pornography statute or other.

Anyone got a distribution-by-political-proclivities of this particular spamming activity?

#36 ::: Mary ::: (view all by) ::: October 12, 2003, 09:35 PM:

I found one more porn comment on Pacific Views from someone named Klaus:

80.50.117.113 (klaus -- selling viagra and hardcore sex)

He left one two days ago.

#37 ::: colin roald ::: (view all by) ::: October 12, 2003, 09:59 PM:

A couple of comments I haven't seen anybody make here, yet.

One, the purpose of a comment spam is different from that of an email spam. Comment spammers don't want *you* to visit their site, necessarily. They want *Google* to know that mighty Making Light thinks that their site is the place to go for anyone searching for "lolita". So the spammer is perfectly happy to sneak their stupid comment into the most ancient unread thread on Teresa's site.

Two, for those comfortable with a bit of unix command line hackery (and I know this isn't most people, but I might as well say it): comments can be deleted automatically with curl. Direct deletion from mysql is probably better, if you can, but for reference, here is another way:

bash% for ((i=84; i<=98; i++)); do 
> curl --cookie 'user=colin%3A%3ANEkSDEADBEEF%3A0' \
   "http://www.yoursite.org/cgi-bin/mt.cgi?__mode=delete&_type=comment&id=$i&blog_id=1&submit=Delete" ; 
> done

The above deletes the fifteen comments with ids between 84 and 98 (inclusive), which worked for me because the spammer's comments all came in a block with nobody else in between. You'll need to change the user auth cookie to your own, and possibly also the blog_id. Blog_id you can get just by looking at the URLs for your MT admin pages. You can get your user auth cookie from your cookies.txt file, or with curl again:

bash% curl --dump-header tmp-headers \
    'http://www.yoursite.org/cgi-bin/mt.cgi?username=colin&password=unguessable'

Then look in tmp-headers and copy the part after Set-Cookie.

Since all of the above is using HTTP calls to MT itself to do the deleting, it doesn't require shell access to your MT server. Any connected machine with curl will do as well.
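
Before running a loop that deletes things, it can help to print the URLs it would hit and check the range by eye. The sketch below is a dry run only: it builds the same query strings as the curl loop above and sends nothing. The host is the placeholder from the example, and `delete_urls` is a hypothetical helper name.

```python
from urllib.parse import urlencode

BASE = "http://www.yoursite.org/cgi-bin/mt.cgi"  # placeholder host

def delete_urls(first_id, last_id, blog_id=1):
    """Build one delete URL per comment id, inclusive of both ends."""
    urls = []
    for comment_id in range(first_id, last_id + 1):
        query = urlencode({
            "__mode": "delete",
            "_type": "comment",
            "id": comment_id,
            "blog_id": blog_id,
            "submit": "Delete",
        })
        urls.append(f"{BASE}?{query}")
    return urls

urls = delete_urls(84, 98)
print(len(urls))   # 15
print(urls[0])
print(urls[-1])
```

If the first and last ids printed are the first and last spam comments, the range is safe to feed to the real loop.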

#38 ::: pericat ::: (view all by) ::: October 12, 2003, 10:03 PM:

I tried to post this this morning and hit the dread "500" error. Perhaps I have an Offensive IP Address? :)

Anyway, Yoz Grahame has a good post up with several tips for blocking spam bots. I was seeing several Lolitas and viagras, et al., and last Thursday put the first two into place: changing mt-comments.cgi to a different name, and relinking the comment form itself in the templates. More details on what steps I took here, which may be of use, I dunno.

#39 ::: James D. Macdonald ::: (view all by) ::: October 12, 2003, 10:04 PM:

You can use this generator to produce names to replace mt-comments.cgi with:

http://www.winguides.com/security/password.php

#40 ::: Marty Helgesen ::: (view all by) ::: October 12, 2003, 10:29 PM:

There is a group that will take reports of porn spam and forward them to the Justice Department and to your U.S. Attorney for possible prosecution.
http://www.obscenitycrimes.com/pub/IOCprivacy.cfm

#41 ::: Mean Dean ::: (view all by) ::: October 12, 2003, 11:15 PM:

I wish I could find out who these slimy 'bastages' are ...

... I came home from church to find 32 such messages ...

I know I said this recently in another comment on your blog (sorry for repeating myself) ... but sometimes I wish I weren't so strait-laced. Otherwise I might consider a solution I learned on slashdot recently:

http://yro.slashdot.org/comments.pl?sid=77014&threshold=-1&commentsort=3&tid=111&mode=thread&cid=6855944

Instead, I'm just resigned to volunteer and to help Jay test his application on an older version of MT.

#42 ::: Kevin J. Maroney ::: (view all by) ::: October 12, 2003, 11:36 PM:

Someone needs to be working on the boilerplate terms-of-use for MT installations which allows people to sue for theft of services when these things happen.

#43 ::: Perfectly Sassy ::: (view all by) ::: October 13, 2003, 12:18 AM:

I got hit by the preteen spam. First e-mails that came in were the ones posted by "Preteen" on my blog. What a way to start the day.

I noticed that the IP numbers are in chronological order.

Lolita used 209.210.176.20
Preteen used 209.210.176.21

I wonder if there's any significance to that...

#44 ::: Charlie Stross ::: (view all by) ::: October 13, 2003, 06:37 AM:

I've been doing some thinking, and I think we are misunderstanding the nature of the problem.

On the social engineering front: Colin Roald is right, the purpose of blog comment spamming is to bomb google and other search engines. If the search engines stop indexing blogs, the spammers will (probably) go away. On the other hand, there's a cost associated with that benefit ...

Meanwhile, on the software engineering front, the current outbreak of blogspamming can be attributed to two things: a nascent software monoculture and an external API that invites antisocial behaviour.

The monoculture is, of course, Movable Type. Not that it's the only blogging tool out there, but it is notably becoming more and more popular, especially among those people who run high-volume blogs themselves. It is a better target for spamming than, say, Blogger or Livejournal because local MT installations have local administrators who may not be paying much attention to what happens down in the old article comments, whereas the big hosted blogs have got administrative staff and software developers in-house.

The API problem is simple: mt-comments.cgi provides a regular, robust tool for comment posting that expects a list of arguments with fixed names in a set order. It's trivial to write a robot that will feed mt-comments.cgi spam -- in fact, I think I could probably do so in a day or thereabouts (in the past I wrote spiders and I've hacked perl for a living; this stuff isn't rocket science).

How can we solve either problem? The monoculture problem or the API problem?

I'm going to focus on the API problem because it's easier.

In a CGI application, the application throws up a page of HTML containing a form. Each field in the form has a name. When you hit the "submit" button, your browser bundles up the data you've entered into those fields and sends them to the CGI application on the server, which receives them as fieldname/content pairs.

All the robot has to know is that URLs ending in mt-comments.cgi expect parameters called "name", "email", "text" and a couple of other things, and that when it spiders a URL ending in mt-comments.cgi it should submit a POST request with the payload attached.

Now. What happens if we employ one-time fieldnames?

A one-time fieldname works like this: our comment system expects the user's name, email, text, and so on. But each time it issues a page, it (a) generates a random number key, which it embeds invisibly in the form, (b) generates a random string for each field name, (c) stores these strings in a database along with their corresponding field name, keyed off the random number (which must be unique), and (d) uses these random strings as the field names in the form. When the user (or spider) sends the filled-in form back in a POST request, it will include the session key; the comments script can then figure out which random string corresponds to which field. A naive spider will not know that "sfggnjqwrthn" is a synonym for "text", and will therefore fail to post the comment.

The first level of retaliation the spammers will employ is to say: "aha! But the fields come up in the same order on the comments form every time! So we simply fill in the fields, whatever they're called, in the order they appear on the page."

The comment system therefore needs to re-order the fields in the form randomly each time it's called. (This may mean replacing a template-based comment form with one generated by software. Not difficult.)

The second level response by the spammers will be, "aha! They may be using random field names and scrambling the order of fields, but the human readable captions remain constant. So I just write a teensy little parser and my AI-augmented spider will be able to continue to spam ..."

This raises the barrier considerably. It can also be defeated in a couple of ways. I personally have a strong dislike for using GIFs of letters that a human has to recognize because, although it works very well, it discriminates against the visually handicapped. But the interesting recent finding about humans recognizing English text despite mis-spellings by the shape of words suggests a way to do this: have the form generator script creatively insert typos in some of the form captions. We can also provide it with a list of synonyms ("Comments:" can be replaced meaningfully with "Inscribe here:", "Vent:", "Tell me all about it:", "Write:", "Pub your ish:" ... none of which functional synonyms contain the same words).

I'm not currently aware of a CGI front-end module that does all this obfuscation, and it would be a pain to write (I figure a week or two to do the job properly). But if written, it would raise the bar to comment-spamming spiders so high that they'd need to be near-as-dammit AI-complete to be able to cope with the morphing, mis-spelled, one-time-fieldname-wielding comment forms.
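
The core of the one-time fieldname idea, with shuffled field order, fits in a short sketch. This is an illustration of the scheme described above, not an existing module: the function names are hypothetical, a plain dictionary stands in for the database, and the caption obfuscation is left out.

```python
import random
import secrets

REAL_FIELDS = ["name", "email", "text"]

# In-memory stand-in for the database: key -> {alias: real field name}
issued_forms = {}

def issue_form():
    """Create a one-time key plus random aliases, returned in shuffled order."""
    key = secrets.token_hex(16)
    aliases = {secrets.token_hex(6): field for field in REAL_FIELDS}
    issued_forms[key] = aliases
    order = list(aliases)      # alias names as they would appear in the form
    random.shuffle(order)
    return key, order

def decode_submission(key, posted):
    """Map aliases back to real names; reject unknown or reused keys."""
    aliases = issued_forms.pop(key, None)   # pop makes the key single-use
    if aliases is None or set(posted) != set(aliases):
        return None
    return {aliases[alias]: value for alias, value in posted.items()}

# A browser sends back exactly the aliases it was given.
key, order = issue_form()
good = decode_submission(key, {alias: "hello" for alias in order})

# A naive robot posts the well-known field names and is rejected.
robot_key, _ = issue_form()
naive = decode_submission(robot_key, {"name": "x", "email": "y", "text": "z"})

print(sorted(good), naive)
```

Because each key is discarded on first use, a robot cannot replay a form it scraped earlier; it has to fetch and parse a fresh page for every comment.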

So, three questions:

1) Does anyone know if I'm reinventing the wheel here?

2) What have I missed?

3) Do we have any perl programmers in the house who'd like to do a public service by writing a module called, say, CGI::Spamproof, or to build equivalent functionality into mt-comments.cgi?


(PS: I'd post this in my own blog, but I'm currently leaving a rather important rant front and centre for a couple of days.)

#45 ::: spacewaitress ::: (view all by) ::: October 13, 2003, 08:17 AM:

Anything that shuts down liberal blogs is probably worth money in the right quarters, too; this doesn't have to be directly commercial to be a net win for someone, especially if the existence of the porn links is an offence under some child pornography statute or other.

Anyone got a distribution-by-political-proclivities of this particular spamming activity?

Graydon: I'm glad I'm not the only one paranoid enough to have thought of this! Many of the blogs who got hit by this bastard mentioned they were considering shutting down their comment threads or closing up shop entirely.

What if the purpose of this spam is not to advertise porn, but rather to shut down a nascent, particularly effective form of political discourse?

#46 ::: adamsj ::: (view all by) ::: October 13, 2003, 09:13 AM:

But--but--but--if we shut down our weblogs, then the terrorists have won!

#47 ::: J Greely ::: (view all by) ::: October 13, 2003, 02:08 PM:

Anyone interested in cross-referencing between sites who were hit by this spam and the first hundred or so sites that Google returns when you search for "mt-comments.cgi"? I'll wager that this is a better explanation than "targeting popular liberal blogs"; Google knows whose comments are most likely to be linked to.

This suggests that the next step for protecting yourself is to have your ISP add all of the standard blog URLs to the robots.txt file.
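
A robots.txt rule of that kind can be checked before it is deployed. The sketch below uses Python's standard `urllib.robotparser`; the rule and the URLs are hypothetical and assume the comment script lives under /cgi-bin/. Note that robots.txt only affects crawlers that choose to honor it, so this keeps the script out of search results rather than stopping a spammer's own robot.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; adjust the path to wherever MT is installed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/mt-comments.cgi
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

script_ok = parser.can_fetch("*", "http://www.yoursite.org/cgi-bin/mt-comments.cgi")
page_ok = parser.can_fetch("*", "http://www.yoursite.org/archives/000123.html")
print(script_ok, page_ok)   # False True
```

The ordinary archive pages stay indexable while the comment script drops out of the crawl.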

Makes me glad I set MT up with mod_perl, so there's no .cgi to betray me.

-j

#48 ::: Tim Hall ::: (view all by) ::: October 13, 2003, 02:30 PM:

I don't think it's specifically liberal blogs; the hard-right libertarian Samizdata got hit badly as well.

This incident was the Blogosphere's equivalent to Usenet's Canter & Siegel "Green Card" spam all those years ago.

#49 ::: James D. Macdonald ::: (view all by) ::: October 13, 2003, 04:34 PM:

Now that one person has figured out how to do it, it won't be long before the others follow?

Before long we'll be snowed under with drifts of Nigerians seeking bank account numbers, folks wanting to tell us about cheap printer toner, how to increase our penis and breast sizes while decreasing our waistlines, home mortgage offers (no credit? no problem!) and nudges toward Hot Times with Underage Hamsters.

Sites without active and able administrators will be lost first. How long will we be able to cope when the active and able administrators can't take a week's vacation, can't go to sleep, can't go to the movies, without coming back to hundreds of posts with links to Generic Viagra?

We'll be going to Registered Users Only, with email verification, passwords, all that. Real soon.

The Balkanization of the 'Net continues. More and more gated communities get built. Tragedy of the Commons continues.

It was nice while it lasted.

#50 ::: Kathryn Cramer ::: (view all by) ::: October 13, 2003, 04:55 PM:

I got one from someone w/ the byline "David" who posted a superficially legit comment w/ a viagra link in the comment section of an old post.

About 36 hours later, I got 2 from Lolita, one to the same older post and the other to a new one. It was my perception that Lolita had noticed that I did not immediately delete Dave the viagra guy. I deleted all three and then realized that I should have banned their IP addresses immediately afterwards. (All this happened late last week.)

#51 ::: Claude Muncey ::: (view all by) ::: October 13, 2003, 05:57 PM:

Charlie:

1. I don't think so -- but I am not a MT blogger at this point.

2. Just maybe. I'm trying to think of a scheme for altering the number of fields that does not completely confuse the user . . .

Also, I think that the strongest defenses will have to be built across sites, not one at a time. One of the keys to the response to this jerk has been the sharing of information between MT bloggers, generally through email and posts on each other's blogs. I wonder about an approach that scans comments across a number of sites simultaneously, perhaps from the database end (more feasible where a number of sites are being hosted on one machine, but still possible on separate ones) to examine and catalog posts with a suspicious pattern to provide a more intelligent early warning system. For the blogger running MT on an individual system, consider an encrypted connection to a server that would collect suspicious posts and push out IPs to ban, similar to some commercial anti-virus software. This might help the MT admin who cannot get to their site for a number of days. Just some initial thoughts.

What we are looking at here is building an immune system for blogging.

3. I wish. My perl is limited to sysadmin/dbadmin work on Win32 systems.
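
The cross-site early warning idea in point 2 might look something like the sketch below. It is purely illustrative: the report format, the site names, the threshold, and the function `ips_to_ban` are all assumptions, and the sample data uses reserved example addresses rather than anything observed.

```python
from collections import defaultdict

# Invented reports: (reporting site, commenter IP, linked URL)
reports = [
    ("site-a.example", "192.0.2.20", "http://spam.example/"),
    ("site-b.example", "192.0.2.20", "http://spam.example/"),
    ("site-c.example", "192.0.2.21", "http://spam.example/"),
    ("site-a.example", "198.51.100.9", "http://friend.example/"),
]

def ips_to_ban(reports, min_sites=2):
    """Flag IPs that posted a link reported by at least min_sites different sites."""
    sites_by_url = defaultdict(set)
    ips_by_url = defaultdict(set)
    for site, ip, url in reports:
        sites_by_url[url].add(site)
        ips_by_url[url].add(ip)
    banned = set()
    for url, sites in sites_by_url.items():
        if len(sites) >= min_sites:
            banned |= ips_by_url[url]
    return sorted(banned)

banned = ips_to_ban(reports)
print(banned)   # ['192.0.2.20', '192.0.2.21']
```

Keying on the advertised link rather than the address means a spammer who hops to a new IP is still caught as soon as the same URL shows up on a second site.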

#52 ::: Alan Bostick ::: (view all by) ::: October 14, 2003, 12:11 AM:

Charlie Stross: How much of a religious war would I be starting if I were to suggest that the task is sufficiently complicated that it is better implemented in python than in perl?

#53 ::: Charlie Stross ::: (view all by) ::: October 14, 2003, 05:12 AM:

Alan Bostick: none at all. I think the particular project needs to be implemented in Python ... and Tcl, and Ruby, and BASIC, and COBOL, and ASP, and every other language anybody uses for processing form-entered data via the web, not just in the Perl CGI environment.

It's just that when you've got a hammer, every problem resembles a nail, and my swiss army pneumatic hammer of choice is Perl. (Which is also probably the most widespread CGI programming language, and therefore the place where an easily available form-obfuscating drop-in replacement for the standard CGI module would do the most good in the shortest time.)
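
(One possible reading of "form-obfuscating", sketched in Python rather than Perl only for illustration: the form's field names are derived from a per-form token, so a robot that posts fixed names such as "author" and "text" straight to the script is rejected. The secret, the token, the field list, and both function names are hypothetical; this is not the API of any existing CGI module.)

```python
import hashlib
import hmac

SECRET = b"change-me"  # hypothetical per-site secret, never sent to the browser


def obfuscated_name(field: str, token: str) -> str:
    """Derive the name a real field carries in one particular rendered form."""
    message = f"{token}:{field}".encode()
    digest = hmac.new(SECRET, message, hashlib.sha256).hexdigest()
    return "f" + digest[:12]


def decode_form(submitted: dict, token: str, fields=("author", "email", "text")) -> dict:
    """Map obfuscated names back to real ones; reject forms missing any field."""
    decoded = {}
    for field in fields:
        name = obfuscated_name(field, token)
        if name not in submitted:
            raise ValueError(f"missing field for {field!r}")
        decoded[field] = submitted[name]
    return decoded
```

The page template would call obfuscated_name() when it renders the form, and the comment script would call decode_form() with the same token before doing anything else. A spammer who fetches the form first can still read the names, so this only raises the cost of blind posting.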

#54 ::: Glen Engel-Cox ::: (view all by) ::: October 14, 2003, 09:25 AM:

The wonderful thing about being on MT is that there's a community of developers who jump in when things like this happen.

Jay Allen's MT-blacklist is now released. It's easy to install (copy three files to three locations in your MT directory structure, CHMOD one of the files, then run configure where you basically press one button), and there's no mods to the MT source code or templates necessary.

I dropped a ten-spot on Jay as a small thank-you for saving me hours of deleting this crap in the future.

#55 ::: James D. Macdonald ::: (view all by) ::: October 14, 2003, 10:38 AM:

Here's why the spammers win:

Ever play Space Invaders?

Shooting down any individual bug is pretty easy.

The game always wins.

How? The bugs keep coming faster and faster, more and more of them, until you're overwhelmed. Game over.

The spammers are like that. More and more, faster and faster, morphing all the time. The trojans and worms and viruses have been turning ordinary home machines out there into spam boxes. More and more of them. More and more IP numbers. More and more variants. More and more spammers. All working feverishly to put their message in front of you, because if 0.00001% of their messages get a response, they win. There's money on the line. It's their full-time job. Some of them really know the 'net and networks because they were engineers laid off or fired in the dot-bomb. Others are talented amateurs. They talk to each other. They listen to us planning. We have to guard every point. All they have to do is find one vulnerability.

Game over, dudes. It's coming.

#57 ::: Patrick Nielsen Hayden ::: (view all by) ::: October 14, 2003, 11:21 AM:

Jim, far be it from me to discourage a sysadmin from the duty of predicting the imminent demise of all that is good and fair, but in fact I suspect your Space Invaders model might not be quite so inevitable.

Weblog spam isn't like email spam. They aren't trying for our attention; they're trying to get their URLs onto sites with Google PageRank mojo, in order to boost their own PageRank.

They're doing it because they can--because Google PageRank is hijackable this way, and because the MT weblog world has come to constitute a highly structured and predictable universe of richly-linked sites sharing similar internals. Thus their robots can merrily course from weblog to weblog, invoking mt-comments.cgi everywhere they go.

None of these conditions is written in stone. We may not have to plug every single hole; it may be sufficient to simply make it harder to game your average MT-based site. It's conceivable that the problem may be addressed from the Google end as well. PageRank isn't an open standard; Google has every reason to tweak it in order to defeat self-serving hacks, and they frequently do. Yes, of course there are criminals in the world, and of course some of them can deploy massive resources toward bad ends. But it's not written in stone that weblog comment sections need to be the most attractive low-hanging fruit in sight.

Most to the point, it's striking to see you, of all people, moaning that no good can ever prevail and everything will come to grief. Well, maybe this system or that system will come to grief, but some people will manage to do good things with it as long as they can, and then they'll build different systems, because that's what people do. And the fuckin' grave is not our fuckin' goal. Now have another cup of coffee!

#58 ::: Teresa Nielsen Hayden ::: (view all by) ::: October 14, 2003, 11:42 AM:

Want to see what we're up against? Go here.

I also Googled up the string ["What a nice site, you know :)" Underage], that being the name and spam-comment used in the third wave. I got 223 hits, which has to be low because Google won't have spidered them all yet.

If webloggers want something they can all do in unison on Fridays, I suggest that conducting mass tests of comment spammers' bandwidth would be a great community-building exercise.

#59 ::: J Greely ::: (view all by) ::: October 14, 2003, 01:28 PM:

Yeah, what he said. MT was targeted for two basic reasons: a lot of popular blogs use it, and Google cheerfully indexes mt-comments.cgi and sorts the best targets to the top (or at least it did; a search that returned 22 million hits yesterday returned only 2 million today, so they may already be plugging that hole for us).

It was easy to hit MT blogs, but like telemarketers calling at all hours, they overdid it, hitting so many people so hard and fast that it became a must-fix problem, and not everyone used the same fix. Worse, (almost) everyone who was hit has now figured out how to ban IP addresses, delete lots of comments quickly, and shut off comments completely until it's over.

Now the spammers, who aren't wizard programmers themselves, have to go hire another clever-but-evil college student to code their next release, and he'll have to work harder. Maybe another blogging system will be an easier target this time. Maybe Google will simply stop indexing certain CGI scripts. Maybe someone will track down one of those clever-but-evil college students and make his life miserable, as an example to the others.

Not that I'd advocate any such action. Oh, no, never that.

-j

#60 ::: Jeremy Leader ::: (view all by) ::: October 14, 2003, 02:11 PM:

That's an interesting point J Greely makes about how these spammers over-did it. It seems like they would have been smarter to post smaller numbers of more varied messages to a wide variety of weblogs.

Think of it as a parasite keeping the host healthy: post too many spams in one place, and someone's going to figure out how to block you. Keep it at an annoying but sub-critical level, and chances are the blog admins will have other more pressing problems to deal with.

Teresa, that link to the Linux SysAdmins of SD blog back in March was amusing; it looked a lot like a snippet of a real Usenet or mailing list or BBS discussion. All it needed was a few "me too!" and "unsubscribe" posts to finish it off.

Overall, this is something that gives me faith in the overall goodness of the Universe: Evil so often is so stupid!

#61 ::: J Greely ::: (view all by) ::: October 14, 2003, 03:16 PM:

On a related note, a few years back one of my friends was interviewing an applicant, and he noticed a certain online ad company on his employment history. "So what did you do when you were working there?"

"I invented the pop-up ad."

He was no longer proud of this achievement.

-j

#62 ::: Teresa Nielsen Hayden ::: (view all by) ::: October 14, 2003, 06:33 PM:

I was guilty of pessimism. Hours later, Google's only found 225 sites that got hit with "Underage".

#63 ::: J Greely ::: (view all by) ::: October 14, 2003, 10:52 PM:

Sigh; just in case I was falling victim to a small sample size (knowing a little about how their server farm works), I just rechecked my Google search. It's back up to 21 million hits for mt-comments.cgi. It's possible that a real change is slowly propagating through their servers, but it's also possible that my earlier search was against a server that just happened to have poor results for that search.

One of the first-page results this time was an entry titled "blogspam countermeasures", that included a link to a free visual CAPTCHA system that adds the "type the obscured number correctly to add a comment" system to MT and other systems. It looks a bit simplistic, and if too many people started to use it it would become a target for a specific countermeasure, but for those comfortable with such things, it's out there.

I don't think the bar has to be set as high for comment systems as it does for Hotmail and Yahoo accounts, though. You don't need an AI-grade Turing test, and you're actually better off not using the same thing everyone else does: 1,000 small-fry all using slightly different tests aren't worth the coding effort.

I think a viable strategy would be a Sesame Street plugin for MT: "one of these things is not like the others". When you install the plugin, you create your own list of categories and objects, and when someone tries to add a comment, they're presented with a list of objects, and asked to choose the one that doesn't belong. No pictures, no fancy formatting, just a decent grasp of the language the blog is written in.

Actually, any challenge/response system that allowed each blog owner to create their own unique challenge would raise the bar enough to stop all but the most determined spammers, I think. You could just copy down questions out of a crossword puzzle book.

"What's a four-letter word for mass-posted commercial advertising?"

-j
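
(A minimal sketch of the "one of these things is not like the others" challenge described above, in Python for illustration only. The category lists are hypothetical stand-ins for whatever each blog owner would write, and make_challenge and check_answer are made-up names, not part of MT.)

```python
import random

# Hypothetical owner-supplied categories; each blog would write its own.
CATEGORIES = {
    "fruit": ["apple", "pear", "plum", "cherry"],
    "tools": ["hammer", "wrench", "chisel", "saw"],
    "planets": ["Mars", "Venus", "Saturn", "Neptune"],
}


def make_challenge(categories, rng, group_size=3):
    """Pick `group_size` items from one category plus one intruder from another.

    Returns (options, intruder); the caller keeps `intruder` server-side.
    """
    main, other = rng.sample(sorted(categories), 2)
    options = rng.sample(categories[main], group_size)
    intruder = rng.choice(categories[other])
    options.append(intruder)
    rng.shuffle(options)
    return options, intruder


def check_answer(answer: str, intruder: str) -> bool:
    """Compare the commenter's answer to the intruder, ignoring case and padding."""
    return answer.strip().lower() == intruder.lower()
```

With four options a blind guess still succeeds a quarter of the time, so the protection comes mostly from every blog having its own lists, which is the point made above about small-fry not being worth the coding effort.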

#64 ::: colin roald ::: (view all by) ::: October 15, 2003, 12:49 AM:

J Greely writes: Actually, any challenge/response system that allowed each blog owner to create their own unique challenge would raise the bar enough to stop all but the most determined spammers, I think.

Yes, I think that's completely right. My own, which I just hacked into MT, is "Are you a spam robot?   () Yes   () No."

I fully expect this to work, as long as it's just me. If it stops being obscure, it's vulnerable to a random-guess strategy. Having to type a word is much more secure, but it means the question has to be quite carefully worded to make sure there is only one answer and that answer is obvious to everyone you want to let comment.

(The hidden-extra-field tactic described by Burningbird and others can be thought of as a multiple-choice question with one answer option. If the spammer knows the question exists, it can pretty much automatically guess the right answer.)
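
(The random-guess point can be put in numbers. This is a small Python sketch under the assumption that a robot knows the question exists and picks uniformly among the answer options on every attempt; the function name is made up for illustration.)

```python
def chance_of_getting_through(num_options: int, attempts: int = 1) -> float:
    """Probability that uniform random guessing succeeds at least once."""
    if num_options < 1 or attempts < 0:
        raise ValueError("need at least one option and a non-negative attempt count")
    miss = 1.0 - 1.0 / num_options  # chance that a single guess is wrong
    return 1.0 - miss ** attempts
```

For the Yes/No question, chance_of_getting_through(2) is one half per attempt, and a one-option question such as the hidden extra field gives chance_of_getting_through(1), which is certainty. A typed word makes num_options effectively very large, which is why it holds up better once the trick stops being obscure.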

#65 ::: Paula Lieberman ::: (view all by) ::: October 20, 2003, 04:21 AM:

I wonder.... if a blog owner posts a "terms of use" end user license, stating terms of use which include wording to the effect of "commercial solicitations posted for the purpose of directing blog readers to advertised URLs which have nothing to do with give and take of live natural persons holding discussions are prohibited. Posting on this blog constitutes acceptance of terms include... violators subject to legal action and recovery of costs for time and effort to remove offending commercial posts."

One could then take the slime to small claims court for breaking the contract and theft of service....

#66 ::: Alan Bostick ::: (view all by) ::: October 20, 2003, 12:23 PM:

Paula Lieberman: One could then take the slime to small claims court for breaking the contract and theft of service....

The costs of pursuing the claim would rival what you could recover in small claims court. Suppose you managed to get to Bismarck, North Dakota, (or wherever) to properly serve notice to the spammer. You get your date in court, and the spammer doesn't show up. Naturally, the judge finds in your favor.

Now it's up to you to enforce the judgment. Another trip to Bismarck....

(You could, of course, sell the judgment to a collection agency, at a deep discount.)

#67 ::: James D. Macdonald ::: (view all by) ::: October 20, 2003, 01:06 PM:

Maybe you won't collect against the guy (though I think you can do the Small Claims thing in your home jurisdiction). Enough of those by enough people and a pile of judgments against the guy will start showing up on his credit history, making his life miserable.

#68 ::: fdsfdsf ::: (view all by) ::: November 21, 2003, 05:45 PM:

gdfgdfgdfgdfg

#69 ::: James D. Macdonald ::: (view all by) ::: November 25, 2003, 12:22 AM:

http://www.cnn.com/2003/TECH/internet/11/24/spam.rage.reut/index.html

Male enlargement ads prompt spam rage

Monday, November 24, 2003 Posted: 12:05 PM EST (1705 GMT)
SAN FRANCISCO, California (Reuters) -- Call it spam rage: A Silicon Valley computer programmer has been arrested for threatening to torture and kill employees of the company he blames for bombarding his computer with Web ads promising to enlarge his penis.

#70 ::: Kate Nepveu (irony alert) ::: (view all by) ::: January 13, 2004, 12:45 PM:

Comment spam by eMule in a thread about comment spam!

#71 ::: wonderyak ::: (view all by) ::: February 16, 2004, 02:52 PM:

a good measure against blog commentspamming is to use image-checking... scripts are unable to see the picture, so they suck!

//regards, wonderyak

#72 ::: Mez sees comment spam, partly in Polish? ::: (view all by) ::: June 15, 2004, 03:44 AM:

Should we start boycotting the ringtone place now?

#73 ::: Teresa Nielsen Hayden ::: (view all by) ::: June 15, 2004, 04:18 PM:

Might as well. Would you want to do business with an outfit that advertises via spam?

#74 ::: Julia Jones finds much comment spam ::: (view all by) ::: July 20, 2004, 08:00 PM:

Lot-n-lots of spam for the product you will need to enjoy the pron spam...

#75 ::: Andy Perrin ::: (view all by) ::: July 20, 2004, 10:14 PM:

New trend in porn spam: With all the vulgar terms blocked, the spammers have resorted to medical terminology. This morning I was invited to view some choice "teen vulva."

We are improving their vOc*bu!ary.

#76 ::: Julia Jones finds comment spam ::: (view all by) ::: August 11, 2004, 01:40 PM:

There's something... weird... about finding meds spam on a thread called "more porn spam".

#77 ::: Andrew Willett finds yet more spam ::: (view all by) ::: September 26, 2004, 12:26 AM:

Jebus, haven't these people anything better to do?

Dire legal notice
Making Light copyright 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020 by Patrick & Teresa Nielsen Hayden. All rights reserved.