Digg Spamming

I wrote about Digg, the social news site, a while back. However, for all the cool stuff it affords, it’s also beginning to show some of the pitfalls of a fast-growing site.

Wikipedia notes a few criticisms, the worst of which seems to be recurring stories. Wikipedia doesn’t note what I think will become an overwhelming problem in the very near future: Digg spam.

I’m not talking about traditional spamming by bots that submit URLs to porn sites, debt consolidation, or poker. It seems like Digg’s staff has that quite well in hand. Rather, Digg spamming (or Damming) for my purposes is a practice wherein the author of a blog signs up on Digg and submits every single article they write. Sound familiar? It should.

That’s what pinging is. The difference is that ping services provide the weblog ping API themselves. They want to aggregate any and all blog articles. This is Technorati’s forte.

Digg doesn’t offer this capability. Nor should it. Imagine, if you will, trying to pick articles out of Technorati to digg. Digg moves fast enough with user submissions (and will move faster as the user base grows). If it starts to move too fast, interesting stories won’t get the number of diggs necessary to move them onto the front page. The cream will stop rising to the top because the milk is getting poured too fast.

Further, most people’s blog entries simply aren’t worthy of a digg, which is the whole reason no one else submits them. For this reason, I’ve never submitted one of my own articles. Of course I find them interesting. I wrote them. The real question is whether someone else does. That is Digg’s first filtering mechanism for the web. Slashdot requires editors to approve stories. That’s also a downfall, as the news gets flavored by the editorial staff’s taste. Because Digg relies on users to promote stories to the front page, such flavoring is avoided. While this is a positive thing, it may not be the best method to avoid Damming.

My theoretical discussion of this issue can only go so far though. A more intuitive understanding can be developed through the use of a concrete example. I’d like to single out the user that exposed this type of abuse to me by doing it so repetitively. macaquentosh is perhaps the worst example I can find.

A quick glance shows that nearly all of his submissions are for a single site, one called TopMac (I refuse to link it, but if you want to see it, add “.blogspot.com” to the name). I’ve seen it before, most notably when it was being spammed into many threads in Macworld’s forums, so when I saw it again, I knew it was worth looking into. Not all of his submissions are for that site though, which may throw you until you notice that the other sites are run by the same guy. Surprise, surprise, one of his sites is all about credit cards and other types of credit. I know that will come as a huge shock.

Maybe I’m being too harsh though. Maybe he has some original, interesting content of his own that he’s sharing. At least then he’d be contributing something. Unfortunately, he fails that test too. He’s copying, verbatim and in their entirety, articles from other websites. He’s not adding his own commentary or anything. He’s just copying the text for publication on his own site. Plagiarism anyone? He at least attributes the text to the originating website in the format “Souce: Macworld” where “Macworld” is a link. Is it a link to the article on that site or the main page of that site? Nope. It’s a link to one of his other sites. Huh? Is this guy also trying to skew Google’s search results?

This guy has no shame, so maybe no one should be surprised. However, what he’s doing at Digg is detrimental for other reasons. For one, he’s submitting articles to Digg that are actually copied from other sites, so he’s getting diggs that should go to those sites. Additionally, because the original articles also get submitted, he’s adding to the problem of duplicate stories (which right now is the most significant problem on the site).

My proposal to Digg would be to automatically review users that consistently submit stories from the same domain, particularly the free blogging domains. This kind of shameless self-promotion will destroy the usefulness of the site if they’re not careful.
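To make the proposal a little more concrete, here is a minimal sketch of how such a review flag might work. This is my own illustration, not anything Digg actually runs: the function names, the thresholds, and the idea of holding free blog hosts to a stricter limit are all assumptions, and the only free blogging domain listed is the one mentioned above.

```python
from collections import Counter, defaultdict
from urllib.parse import urlparse

# Assumed list of free blogging hosts; only blogspot.com comes from this post.
FREE_BLOG_HOSTS = ("blogspot.com",)


def host_of(url):
    """Return the lowercased hostname of a URL, without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def is_free_blog(host):
    """True if the host is a free blogging domain or a subdomain of one."""
    return any(host == h or host.endswith("." + h) for h in FREE_BLOG_HOSTS)


def flag_for_review(submissions, min_submissions=5,
                    max_share=0.8, free_blog_share=0.5):
    """Return a sorted list of usernames worth a manual review.

    submissions is an iterable of (username, url) pairs. A user is flagged
    when a single host accounts for at least max_share of their submissions,
    or at least free_blog_share when that host is a free blogging domain.
    All thresholds are hypothetical defaults, not tuned values.
    """
    hosts_by_user = defaultdict(list)
    for user, url in submissions:
        hosts_by_user[user].append(host_of(url))

    flagged = []
    for user, hosts in hosts_by_user.items():
        if len(hosts) < min_submissions:
            continue  # too little history to judge
        top_host, top_count = Counter(hosts).most_common(1)[0]
        share = top_count / len(hosts)
        limit = free_blog_share if is_free_blog(top_host) else max_share
        if share >= limit:
            flagged.append(user)
    return sorted(flagged)
```

A flag here only queues the account for a human to look at. Plenty of honest users submit mostly from one favorite site, so the count alone shouldn't ban anyone.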

Update: Mark Hasman (aka macaquentosh) is now using two usernames so he can digg the stories that his other pseudonym posts, artificially bringing the number of diggs to 2. I’ve also reported his plagiarism to Macworld and Mac Observer.

Update 2: Digg has removed Mark’s usernames from the system.