UTILIZATION OF ADVANCED MACHINE LEARNING TECHNIQUES FOR DETECTING HATE SPEECH ON SOCIAL MEDIA PLATFORMS / ISAAC MUMBA NGOY ; SUPERVISOR, ASST. PROF. DR. KIAN JAZAYERI
Language: English
Year: 2024
Description: 64 sheets ; 30 cm + 1 CD-ROM
Content type:
- text
- unmediated
- volume
Item type | Current library | Collection | Call number | Copy number | Status | Notes | Due date | Barcode | Item holds |
---|---|---|---|---|---|---|---|---|---|
Thesis | CIU LIBRARY Storage | Thesis Collection | YL 3578 N46 2024 | C.1 | Available | Management Information System | | T4025 | |
Suppl. CD | CIU LIBRARY Audiovisual | Thesis Collection | YL 3578 N46 2024 | C.1 | Available | Management Information System | | CDT4025 | |
Thesis (MSc) - Cyprus International University. Institute of Graduate Studies and Research. Management Information System
This study examines the efficacy of diverse machine learning models in detecting hate speech within English-language tweets, with a focus on advanced ensemble methods. The study evaluates a range of models, including Random Forest, Stacking Classifier, Support Vector Machine (SVM), Logistic Regression, Naive Bayes, K-Nearest Neighbors (KNN), AdaBoost, and Gradient Boosting. Random Forest emerged as the top performer, achieving an accuracy of 99.90%, precision of 99.94%, recall of 99.87%, F1-score of 99.90%, and an AUC-ROC of 0.999, closely followed by the Stacking Classifier and SVM. A key contribution of this research lies in its emphasis on preprocessing techniques, particularly lemmatization and contraction expansion, which are applied less often in the field than stemming. These techniques, along with text cleaning, normalization, and tokenization, were crucial in improving the models' accuracy and their ability to capture the nuances of hate speech. Feature extraction was conducted using Term Frequency-Inverse Document Frequency (TF-IDF) weighting, further augmenting the models' ability to differentiate between hate speech and non-hate speech content. The study highlights the significance of sophisticated preprocessing in increasing the robustness of machine learning models for hate speech detection. This research delivers insights that can enhance the effectiveness of hate speech detection systems on social media platforms and establishes a foundation for future studies focused on advanced deep learning approaches and the ethical aspects of deploying these models.
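The preprocessing and TF-IDF steps named in the abstract can be illustrated with a minimal standard-library sketch. It is not the thesis's implementation: the contraction map, the cleaning rules, and the function names `preprocess` and `tfidf` are assumptions made for illustration, and lemmatization is omitted because it needs an external lexical resource. The weighting is the plain textbook form (term frequency times log of inverse document frequency), whereas libraries typically apply smoothed variants.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny contraction map (an assumption for this
# sketch); a real pipeline would use a much fuller list.
CONTRACTIONS = {"don't": "do not", "can't": "cannot", "i'm": "i am", "it's": "it is"}


def preprocess(text):
    """Lowercase, strip URLs and @mentions, expand contractions, tokenize."""
    text = text.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)
    for short, full in CONTRACTIONS.items():
        text = text.replace(short, full)
    # Keep alphabetic tokens only; digits and punctuation are dropped.
    return re.findall(r"[a-z]+", text)


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    tf = count / document length, idf = log(N / document frequency).
    A term that occurs in every document therefore gets weight 0.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in counts.items()}
        )
    return weights


# Made-up example tweets, not data from the thesis.
tweets = ["I don't like this http://t.co/x", "@user it's fine", "I like this"]
tokens = [preprocess(t) for t in tweets]
weights = tfidf(tokens)
```

In this sketch, terms that appear in only one tweet receive a higher weight than terms shared across tweets, which is the property that lets a downstream classifier separate distinctive vocabulary from common words.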