Bayesian filtering for spam

Question

Is there a good, clean object-oriented implementation of Bayesian filtering for spam and text classification? I want it for learning purposes.

Answer

I definitely recommend Weka, which is open-source data mining software written in Java:

Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well-suited for developing new machine learning schemes.

As the description above says, it ships with a range of classifiers such as SVM, Winnow, C4.5, Naive Bayes (of course) and many more (see the API doc). Note that many classifiers are known to perform considerably better than Naive Bayes at spam detection and text classification.

Furthermore, Weka comes with a very powerful GUI.
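
If you mainly want to see how the Naive Bayes approach mentioned above works, it can help to read a small standalone version first. The following is only a minimal sketch and not Weka's implementation: the class name `NaiveBayesTextClassifier`, its methods, and the simple tokenizer are my own assumptions for illustration. It is a multinomial Naive Bayes with add-one (Laplace) smoothing, and it scores in log space to avoid underflow on long messages.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical learning example: multinomial Naive Bayes with add-one smoothing. */
class NaiveBayesTextClassifier {
    // label -> (word -> number of times the word occurred under that label)
    private final Map<String, Map<String, Integer>> wordCounts = new HashMap<>();
    // label -> total number of word tokens seen under that label
    private final Map<String, Integer> totalWords = new HashMap<>();
    // label -> number of training documents with that label
    private final Map<String, Integer> docCounts = new HashMap<>();
    // every distinct word seen during training, across all labels
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    /** Lowercases the text and splits on anything that is not a-z or 0-9. */
    static String[] tokenize(String text) {
        return text.toLowerCase().split("[^a-z0-9]+");
    }

    /** Adds one labelled document (for example "spam" or "ham") to the model. */
    void train(String label, String text) {
        docCounts.merge(label, 1, Integer::sum);
        totalDocs++;
        Map<String, Integer> counts =
                wordCounts.computeIfAbsent(label, k -> new HashMap<>());
        for (String word : tokenize(text)) {
            if (word.isEmpty()) continue; // split() can yield an empty token
            counts.merge(word, 1, Integer::sum);
            totalWords.merge(label, 1, Integer::sum);
            vocabulary.add(word);
        }
    }

    /** log P(label) + sum over words of log P(word | label), with add-one smoothing. */
    private double logScore(String label, String text) {
        Map<String, Integer> counts = wordCounts.get(label);
        int total = totalWords.getOrDefault(label, 0);
        double score = Math.log((double) docCounts.get(label) / totalDocs);
        for (String word : tokenize(text)) {
            if (word.isEmpty()) continue;
            int count = counts.getOrDefault(word, 0);
            score += Math.log((count + 1.0) / (total + vocabulary.size()));
        }
        return score;
    }

    /** Returns the highest-scoring label, or null if nothing has been trained yet. */
    String classify(String text) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String label : docCounts.keySet()) {
            double score = logScore(label, text);
            if (score > bestScore) {
                bestScore = score;
                best = label;
            }
        }
        return best;
    }
}
```

To try it, call `train("spam", ...)` and `train("ham", ...)` with a few example messages and then `classify(...)` on a new one. A real filter would need a better tokenizer and far more training data, which is where a toolkit like Weka becomes useful.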

via Bayesian filtering for spam: http://stackoverflow.com/questions/1083/