Utilizing text classification methods to monitor criminal activities on Twitter / Omar Al-Debagy; Supervisor: Devrim Seral

Author: Omar Al-Debagy
Contributor(s): Devrim Seral (supervisor)
Language: English
Publication details: Nicosia: Cyprus International University, 2015
Description: XIV, 81 p.; table, figure; 30.5 cm; CD
Content type:
  • text
Media type:
  • unmediated
Carrier type:
  • volume
Subject(s):
Abstract: The purpose of this research is to find the most efficient text classification techniques for monitoring and tracking criminal activities in Arabic-language tweets. We compared the performance of different classifiers under different preprocessing options. Tweets were classified into three classes: supporting ISIS (Islamic State of Iraq and Syria), opposing ISIS, and neutral, meaning unrelated to either of the other two classes. The classifiers used in this study are Naïve Bayes, Support Vector Machine, k-Nearest Neighbors, and Random Forests, with preprocessing options such as TF-IDF, stemming, stop-word removal, and normalization. We concluded that the most efficient classifier for this task is Naïve Bayes, with an accuracy of 88%, using 3000 tweets as the training set and 1200 tweets as the test set.
Material type: Thesis

Includes CD

Includes references (76-81 p.)
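
As an illustration of the kind of evaluation the abstract describes (train a Naïve Bayes classifier on labelled tweets, then measure accuracy on a held-out test set), here is a minimal sketch. It is not the thesis code: the function names, the English placeholder tokens, and the tiny data set are all hypothetical, and it uses raw word counts with Laplace smoothing rather than the TF-IDF, stemming, or normalization options compared in the thesis.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model with add-alpha (Laplace) smoothing."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)

    model = {}
    for label, n_class_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        denom = total_words + alpha * len(vocab)
        log_prior = math.log(n_class_docs / len(docs))
        log_likelihood = {
            w: math.log((word_counts[label][w] + alpha) / denom) for w in vocab
        }
        model[label] = (log_prior, log_likelihood)
    return model


def predict_nb(model, tokens):
    """Return the class with the highest log-posterior; unseen words are skipped."""
    def score(label):
        log_prior, log_likelihood = model[label]
        return log_prior + sum(
            log_likelihood[w] for w in tokens if w in log_likelihood
        )
    return max(model, key=score)


def accuracy(model, docs, labels):
    """Fraction of documents whose predicted class matches the true label."""
    correct = sum(predict_nb(model, d) == y for d, y in zip(docs, labels))
    return correct / len(labels)


# Hypothetical toy data: pre-tokenized placeholder "tweets" in three classes.
train_docs = [
    ["praise", "group"], ["join", "group"],
    ["condemn", "group"], ["reject", "group"],
    ["weather", "today"], ["football", "match"],
]
train_labels = ["support", "support", "oppose", "oppose", "neutral", "neutral"]

test_docs = [["praise", "join"], ["condemn", "reject"], ["football", "today"]]
test_labels = ["support", "oppose", "neutral"]

model = train_nb(train_docs, train_labels)
print("test accuracy:", accuracy(model, test_docs, test_labels))
```

In the setting reported in the abstract, the training and test lists would hold 3000 and 1200 labelled tweets respectively, and the same accuracy measure would be computed for each classifier and preprocessing combination.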

