
A Plan for Spam

2 Jan 2013, CPOL, 5 min read
How to abate the CodeProject spam crisis.

We are currently under heavy pressure from a narrow group of "TV and Media" spammers who cynically challenge our ability to resist this kind of crime. CodeProject members are making a remarkable effort to exterminate these unwanted parasites, but the measures taken so far do not seem quite satisfactory. This short article grew out of a discussion between Chris Maunder and myself about what we can do:
http://www.codeproject.com/Messages/4462716/Re-Live-streamers.aspx,
http://www.codeproject.com/Messages/4462726/Re-Live-streamers.aspx.

Several hours later, a fresh idea came to mind, a variant of the ideas we had already discussed. I would ask interested members to think about it, discuss it, criticize it, or support it. In general, we need some brainstorming to help Chris and others arm the site with improved protection against spam, in a way that does not threaten legitimate members and does not add too much overhead to using and maintaining the site.

I'm coming back to the idea of Bayesian filtering. I used it successfully on my e-mail a while ago, but eventually replaced it with my own approach (this is not the place to discuss it, because it cannot be applied to the site). I think Bayesian filtering never became dominant in e-mail services for natural reasons, such as the overhead it puts on the human operator or user and the method's unavoidable false negatives and false positives. However, I'm starting to think that with a special twist (which can be discussed further), we can apply this idea to protect CodeProject.

This short article is named after the article "A Plan for Spam" by Paul Graham: http://www.paulgraham.com/spam.html.

See also another article: http://www.paulgraham.com/better.html.

I think the idea will be clear enough after reading these articles.

As to the implementation, please look at this open-source product: http://nbayes.codeplex.com.

There is also a CodeProject article: A Naive Bayesian Spam Filter for C#.

This is just to demonstrate that the implementation won't be a big problem.

Still, the problem remains: how do we decide on the cancellation of a spammer's account? Don't we face the same problems, false negatives/positives and an excessive amount of administrator intervention? Remember that I pointed out the main problem with the workload put on a human administrator: the required chores are not automated, or not optimized to meet the goals.

Now, here is the main idea:

Let's invert the situation socially. Instead of deciding whether to cancel an offender's account, let's make the potential offender apply for the "legalization" of a potentially spamming post. Hold on! Don't reject this idea outright, before I explain how it might look in practice. I'm going to demonstrate that this can be done gently enough.

First of all, let's remember the starting point. At the start, the filter is empty (or all available filters are empty), so, without the intervention of members who care about exterminating spam, nothing is ever filtered out. The filters start to populate as members spot spam and report it as such. There should be a special reporting action for spam, which feeds the spam content into a filter. The filter gradually acquires the ability to detect spam content automatically. Yes, with some false positives/negatives. For the details of this process, please go back to the articles by Paul Graham.
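To make the populating process concrete, here is a minimal Python sketch of a filter in the spirit of Paul Graham's essay. It is an illustration under stated assumptions, not a proposal for the actual site code: the class name `SpamFilter`, its methods, and the crude tokenizer are hypothetical, and the constants (good counts doubled, probabilities clamped to 0.01 and 0.99, 0.4 for tokens with too little data, the 15 most telling tokens combined) follow the essay as I recall it. The sketch also assumes that accepted posts are fed to the filter as "good" examples, which the text above does not spell out.

```python
import math
import re
from collections import Counter

TOKEN_RE = re.compile(r"[A-Za-z0-9$'-]+")


def tokenize(text):
    """Lower-cased word tokens; a deliberately crude tokenizer."""
    return [t.lower() for t in TOKEN_RE.findall(text)]


class SpamFilter:
    """Hypothetical filter that starts empty and learns from reports."""

    def __init__(self, min_occurrences=5):
        self.spam_counts = Counter()  # token -> occurrences in reported spam
        self.ham_counts = Counter()   # token -> occurrences in accepted posts
        self.spam_posts = 0
        self.ham_posts = 0
        self.min_occurrences = min_occurrences

    def report_spam(self, text):
        """The special 'report as spam' action: feed the content to the filter."""
        self.spam_counts.update(tokenize(text))
        self.spam_posts += 1

    def accept(self, text):
        """Assumption: legitimate posts are also fed in as good examples."""
        self.ham_counts.update(tokenize(text))
        self.ham_posts += 1

    def token_probability(self, token):
        g = 2 * self.ham_counts[token]  # good counts doubled to bias against false positives
        b = self.spam_counts[token]
        if g + b < max(1, self.min_occurrences):
            return 0.4  # too little data: treat as mildly innocent
        good_ratio = min(1.0, g / max(1, self.ham_posts))
        bad_ratio = min(1.0, b / max(1, self.spam_posts))
        return max(0.01, min(0.99, bad_ratio / (good_ratio + bad_ratio)))

    def score(self, text, top_n=15):
        """Combined spam probability of a post; 0.0 while nothing was reported."""
        if self.spam_posts == 0:
            return 0.0
        probs = [self.token_probability(t) for t in set(tokenize(text))]
        if not probs:
            return 0.0
        # Keep the tokens whose probability is farthest from the neutral 0.5.
        probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
        top = probs[:top_n]
        spam = math.prod(top)
        ham = math.prod(1.0 - p for p in top)
        return spam / (spam + ham)
```

With an empty filter, `score` returns 0.0 for every post, which matches the starting point described above; scores only begin to rise for content resembling what members have reported.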

As a first step, the flagged post content is not placed on the CodeProject content page (Questions & Answers, or anywhere else). Instead, the potential offender gets a message on a page, something like this:

CodeProject informs:
Sorry, we cannot place your post immediately. It contains some content detected by our filters as potential spam. The detection was based on previous spam reports by CodeProject members. If you believe this is not spam, you will need to post your explanation here: [URL]

The content goes to the database. At the request of the potential spammer, a page with the legalization form is generated, and the submitted report goes to the database, where the status of the prospective post is stored. Again, this should not happen often; legitimate members posting their messages will almost never get this message. I know this from my experience with Bayesian filtering for e-mail.
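The status handling described above could be sketched as follows. This is only an illustration in Python: the `PostStore` class, the `Status` values, and the 0.9 threshold are hypothetical names and numbers, the dictionary stands in for the real database, and `spam_score` is any function returning a spam probability for a text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict


class Status(Enum):
    PUBLISHED = "published"            # visible on the site
    HELD = "held"                      # flagged by the filter, not shown
    APPEAL_PENDING = "appeal_pending"  # author has asked for legalization


@dataclass
class Post:
    post_id: int
    author: str
    text: str
    status: Status
    appeal: str = ""


class PostStore:
    """In-memory stand-in for the site database (hypothetical)."""

    def __init__(self, spam_score: Callable[[str], float], threshold: float = 0.9):
        self.spam_score = spam_score
        self.threshold = threshold
        self.posts: Dict[int, Post] = {}

    def submit(self, author: str, text: str) -> Post:
        """Store the post; publish it unless the filter flags it."""
        flagged = self.spam_score(text) >= self.threshold
        status = Status.HELD if flagged else Status.PUBLISHED
        post = Post(len(self.posts) + 1, author, text, status)
        self.posts[post.post_id] = post
        return post

    def appeal(self, post_id: int, explanation: str) -> None:
        """Record the author's explanation from the legalization form."""
        post = self.posts[post_id]
        if post.status is not Status.HELD:
            raise ValueError("only held posts can be appealed")
        post.appeal = explanation
        post.status = Status.APPEAL_PENDING
```

A held post that is never appealed simply stays in the `HELD` state, so it costs the administrator nothing until someone asks for it to be reviewed.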

Now, at the request of the administrator, all the filtered members' messages are shown on a single page. Usually, one glance at the messages will be enough to judge whether they are spam or not. Importantly, it is quite unlikely that real spammers will plead for legalization of their content. So, I think that the most typical action will be "Yes to all" (pretty much like in the movie "Bruce Almighty", 2003; no, this is not spam, I have no interest in promoting this commercial product and cited it only to illustrate the protection method; I plead for legalization of this post :) ). Of course, this "Yes to all" applies to the posts awaiting approval/legalization. And it will be equally easy to have a single button, "Remove all offending posts and member accounts", for all checked items.
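The two bulk actions on the review page might look like this in a minimal Python sketch. Everything here is hypothetical: `ReviewQueue`, `HeldPost`, and the method names are illustration only, and "removing an account" is reduced to remembering the author's name in a set.

```python
from dataclasses import dataclass


@dataclass
class HeldPost:
    post_id: int
    author: str
    text: str
    appeal: str = ""  # empty if the author never asked for legalization


class ReviewQueue:
    """Hypothetical model of the single review page."""

    def __init__(self, held):
        self.held = {p.post_id: p for p in held}
        self.published = []
        self.removed_accounts = set()

    def pending_appeals(self):
        """Held posts whose authors submitted an explanation."""
        return [p for p in self.held.values() if p.appeal]

    def approve_all_appeals(self):
        """'Yes to all': publish every held post that has an appeal."""
        approved = self.pending_appeals()
        for post in approved:
            self.published.append(self.held.pop(post.post_id))
        return [post.post_id for post in approved]

    def remove_checked(self, checked_ids):
        """Remove the checked posts together with their authors' accounts."""
        for post_id in checked_ids:
            post = self.held.pop(post_id, None)
            if post is not None:
                self.removed_accounts.add(post.author)
```

Both actions are one call each, which is the point of the proposal: the reviewer's effort per spam post stays close to zero.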

If you picture it clearly, you will see that this procedure will be much easier than what we have now.

Access to this approval/legalization and member extermination procedure is a matter for some discussion. This aspect is not as important. I would suggest that the right to finally exterminate offenders' accounts be left to the administration, while the right to legalize posts and the right to exterminate an offender's post (from this page; this right already exists on the question page in the Questions & Answers forum) could be granted to members with some level of reputation.
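One possible reading of this split of rights, as a small Python sketch. The action names and the reputation threshold are made up for illustration; the real level would be part of the discussion.

```python
MIN_REPUTATION = 10_000  # hypothetical threshold, not an actual site setting


def allowed_actions(reputation, is_admin=False):
    """Return the review actions a member may perform under this scheme."""
    actions = set()
    if is_admin or reputation >= MIN_REPUTATION:
        # Trusted members may legalize or remove individual posts.
        actions |= {"legalize_post", "remove_post"}
    if is_admin:
        # Final removal of an account stays with the administration.
        actions.add("remove_account")
    return actions
```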

Please discuss this idea and share your own. Maybe we can come up with some variant of my approach, or with something completely different.

Thank you for your attention to this rather unpleasant matter and for the effort already made to sustain the site.

—SA

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Architect
United States
Physics, physical and quantum optics, mathematics, computer science, control systems for manufacturing, diagnostics, testing, and research, theory of music, musical instruments… Contact me: https://www.SAKryukov.org

