hext
This is currently the beginning of a text classification library.
##Installation/Running
Install with `stack install hext`.
Hackage: https://hackage.haskell.org/package/hext-0.1.0.3
To run: `stack build`, then `stack exec hext-exe`.
##Usage
Currently, the only implemented algorithm is Naive Bayes. To classify your own text with it, you need:
- classified data: this can be sourced from a database; the only fields needed are the text itself and its class
- a sample string for the algorithm to classify
To run the program, first convert the classified data described above into a `BayesModel a` using the `teach` function, where `a` is your own data type representing the classes to assign to text. Your class type must be an instance of `Ord` and `Eq`.
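As a rough illustration of what a class type and classified data can look like, here is a minimal sketch. It does not use hext itself: `labelled` and `classCounts` are hypothetical names invented for this example, and the review texts are made up. It also shows one place where an `Ord` instance is useful, namely when the class serves as a `Data.Map` key (whether hext uses it this way internally is an assumption).

```haskell
import qualified Data.Map.Strict as Map

-- A user-defined class type. Deriving Ord lets it be used as a Map key.
data Class = Positive | Negative
  deriving (Eq, Ord, Show)

-- Hypothetical classified data: each entry pairs a text with its class.
labelled :: [(String, Class)]
labelled =
  [ ("a wonderful, moving film", Positive)
  , ("great acting and a great script", Positive)
  , ("dull and far too long", Negative)
  ]

-- Count how many texts carry each class (hypothetical helper, not part of hext).
-- Map.fromListWith needs the Ord instance on the key type.
classCounts :: [(String, Class)] -> Map.Map Class Int
classCounts docs = Map.fromListWith (+) [ (c, 1) | (_, c) <- docs ]
```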
With your new `BayesModel` filled with knowledge, it's time to classify your text using `runBayes`. An example can be seen in `app/Main.hs`, where `data Class = Positive | Negative deriving (Eq, Ord, Show)` is used to label movie reviews as either positive or negative.
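To make the train-then-classify flow concrete, the following is a self-contained sketch of a Naive Bayes classifier with add-one (Laplace) smoothing. It is not hext's implementation or API: `Model`, `train`, `classify`, `tokenize`, and `reviews` are hypothetical names chosen for this example, and the smoothing and tokenization choices are assumptions. It only mirrors the two steps described above, building a model from classified data and then classifying a sample string.

```haskell
import Data.Char (isAlpha, toLower)
import Data.List (maximumBy)
import Data.Ord (comparing)
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

data Class = Positive | Negative
  deriving (Eq, Ord, Show)

-- Hypothetical model: number of texts per class and word counts per class.
data Model c = Model
  { docCounts  :: Map.Map c Int
  , wordCounts :: Map.Map c (Map.Map String Int)
  }

-- Lower-case the text and split it into alphabetic words.
tokenize :: String -> [String]
tokenize = words . map (\ch -> if isAlpha ch then toLower ch else ' ')

-- Training step: fold the classified examples into count tables.
train :: Ord c => [(String, c)] -> Model c
train docs = Model
  { docCounts  = Map.fromListWith (+) [ (c, 1) | (_, c) <- docs ]
  , wordCounts = Map.fromListWith (Map.unionWith (+))
      [ (c, Map.fromListWith (+) [ (w, 1) | w <- tokenize t ]) | (t, c) <- docs ]
  }

-- Classification step: choose the class with the highest log-posterior.
-- Unseen words get add-one (Laplace) smoothing. Returns Nothing for an empty model.
classify :: Ord c => Model c -> String -> Maybe c
classify model text
  | Map.null (docCounts model) = Nothing
  | otherwise = Just (fst (maximumBy (comparing snd) scored))
  where
    totalDocs = fromIntegral (sum (Map.elems (docCounts model))) :: Double
    vocabSize = fromIntegral
      (Set.size (Set.unions (map Map.keysSet (Map.elems (wordCounts model))))) :: Double
    scored = [ (c, score c n) | (c, n) <- Map.toList (docCounts model) ]
    score c n =
      let counts = Map.findWithDefault Map.empty c (wordCounts model)
          total  = fromIntegral (sum (Map.elems counts)) :: Double
          wordLogProb w =
            log ((fromIntegral (Map.findWithDefault 0 w counts) + 1) / (total + vocabSize))
      in log (fromIntegral n / totalDocs) + sum (map wordLogProb (tokenize text))

-- Made-up movie reviews used only for this example.
reviews :: [(String, Class)]
reviews =
  [ ("a wonderful, moving film", Positive)
  , ("great acting and a great script", Positive)
  , ("dull and far too long", Negative)
  , ("a boring, predictable plot", Negative)
  ]
```

Loaded into GHCi, `classify (train reviews) "a great film"` evaluates to `Just Positive`, and `classify (train reviews) "dull and boring"` evaluates to `Just Negative`. Working in log space avoids numeric underflow when a text contains many words.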
##Contributing
I encourage contributing to this package, in the form of implementing algorithms that are not yet in the project, improving efficiency, or the like.