000 04391na a2200973 4500
001 233328
003 koha_MIRAKIL
005 20221226090135.0
008 190118b tu 000 0
040 _aCY-NiCIU
_btur
_cCY-NiCIU
_erda
041 0 _aeng
090 _aYL 391
_b A35 2014
100 1 _aAdi, Abdulwahab O.
245 0 _aDocument classification using naive Bayes algorithm
_cAbdulwahab O. Adi; Supervisor: Erbuğ Çelebi
260 _aNicosia
_bCyprus International University
_c2014
300 _aIX, 49 p.
_bfigure
_c30.5 cm
_eCD
336 _2rdacontent
_atext
_btxt
337 _2rdamedia
_aunmediated
_bn
338 _2rdacarrier
_avolume
_bnc
500 _3Includes CD
520 _aABSTRACT In this study, we implemented a naïve Bayes classifier in the Java language. The classifier was tested on the popular 20 Newsgroups data set, which is used in the majority of document categorization and clustering algorithm implementations. The ultimate objective is to better understand the algorithm as a way of performing automatic document categorization, and to be able to consider new methods that can be proposed for future research. At the end of this research, we successfully tested the performance of our implementation using three methods. The accuracy was measured by comparing it with the accuracies of other algorithms on the same data set. The classifier turned out to work as postulated theoretically in a normal academic setting. We were also able to conclude that the naïve Bayes classifier performs well among similar classifiers, but that it also has its shortcomings. Keywords: Bayes Theorem, Supervised Learning, Document Classification, Naïve Bayes Classifier, Tokenization, Stemming, Machine Learning, Information Retrieval, Java
650 0 0 _aMakine öğrenme
650 0 0 _aMachine learning
650 0 0 _aBayes teoremi
650 0 0 _aBayes Theorem
700 0 _aSupervisor: Çelebi, Erbuğ
_91656
942 _2ddc
_cTS
505 1 _g1
_tCHAPTER ONE
505 1 _g1
_tINTRODUCTION
505 1 _g3
_tObjectives
505 1 _g3
_tOrganization of Thesis
505 1 _g4
_tCHAPTER 2
505 1 _g4
_tLITERATURE REVIEW
505 1 _g4
_tMachine Learning
505 1 _g5
_tSupervised Learning
505 1 _g7
_tUnsupervised Learning
505 1 _g7
_tSemi-Supervised Learning
505 1 _g8
_tReinforcement Learning
505 1 _g8
_tTransduction
505 1 _g10
_tLearning to Learn
505 1 _g10
_tDevelopmental Learning
505 1 _g11
_tPREVIOUS WORK DONE
505 1 _g11
_tNaive Bayes Classifier As A Spam Detector
505 1 _g12
_tNaive Bayes Classifier in Sentiment Analysis
505 1 _g13
_tNaive Bayes Classifier in Cancer Diagnosis
505 1 _g14
_tNaive Bayes Classifier in Plant Species Classification
505 1 _g15
_tCHAPTER 3
505 1 _g15
_tNAIVE BAYES CLASSIFIER
505 1 _g15
_tBayes Theorem
505 1 _g15
_tText Classification Simplified
505 1 _g18
_tPrior Probability, P(c)
505 1 _g19
_tLikelihood Probability, P(d|c)
505 1 _g20
_tLaplace Smoothing
505 1 _g22
_tSimple Text Classification Examples
505 1 _g26
_tCHAPTER 4
505 1 _g26
_tIMPLEMENTATION
505 1 _g26
_tIntroduction
505 1 _g26
_tJava and NLP Libraries
505 1 _g27
_tProgram Design
505 1 _g27
_tExperimental Setup
505 1 _g28
_tLoading the data set
505 1 _g29
_tStop Word Removal
505 1 _g30
_tTokenization
505 1 _g33
_tStemming
505 1 _g36
_tBag of Words Creation
505 1 _g39
_tEvaluation
505 1 _g39
_tClassification
505 1 _g40
_tDesign Summary
505 1 _g41
_tCHAPTER 5
505 1 _g41
_tEVALUATION
505 1 _g41
_tCross Validation method
505 1 _g41
_tComparison with other Classifier Applications
505 1 _g42
_tIcsiboost-bigram
505 1 _g42
_tExpected Maximum algorithm
505 1 _g42
_tVaried Training Set based Evaluation
505 1 _g43
_tRESULTS OF EVALUATION PROCEDURES
505 1 _g43
_tCross Validation Method
505 1 _g44
_tComparison with other Classifier Programs
505 1 _g45
_tVaried Training Set based Evaluation
505 1 _g46
_tCHAPTER 6
505 1 _g46
_tCONCLUSION AND FUTURE WORK
505 1 _g47
_tREFERENCES
999 _c434
_d434