HCAB-SMOTE: A HYBRID CLUSTERED AFFINITIVE BORDERLINE SMOTE APPROACH FOR IMBALANCED DATA BINARY CLASSIFICATION / Hisham AL MAJZOUB; Supervisor: Öykü AKAYDIN; Co-supervisors: Islam ELGEDAWY, Mehtap KÖSE ULUKÖK

Description: p. VII, 102; color figures, tables, color graphics; 30.5 cm; CD
Content type:
  • text
Media type:
  • unmediated
Carrier type:
  • volume
Thesis note: Thesis (Ph.D) - CYPRUS INTERNATIONAL UNIVERSITY INSTITUTE GRADUATE STUDIES AND RESEARCH MANAGEMENT INFORMATION SYSTEMS DEPARTMENT
Material type: Thesis
Holdings
Material type | Current Library | Collection | Call Number | Status | Notes | Due Date | Barcode | Holds
Thesis | CIU LIBRARY | Tez Koleksiyonu (Thesis Collection) | D 210 A46 2020 | Available | Management Information Systems Department | | T2061 |
Total holds: 0

Includes CD


Includes references, p. 96-102

ABSTRACT

In this thesis, three algorithms are developed and implemented to optimize the performance of machine learning oversampling algorithms, thereby increasing the accuracy of classification tasks over datasets with an imbalanced class problem. Machine learning uses historical data to reveal hidden patterns and improve decision-making in business, medical, and other fields. However, it faces many obstacles and challenges; one of them is a dataset structure issue called the imbalanced class problem. In an imbalanced dataset, the distribution of instances is uneven between the classes, leading the classification algorithm to act in a biased manner toward the class with the most instances and to obtain low classification accuracy for instances falling in the minority class. Most often, the goal of using machine learning is to capture the patterns of the minority class so that the model can predict the class of new unlabeled instances, but this process achieves low accuracy when the dataset is imbalanced. Different methods are available to reduce the effect of the imbalanced class problem on the generated model's bias, but to increase classification accuracy these methods must deeply modify the original data, either by removing a large number of majority instances through undersampling or by generating a large number of new instances within the minority class through oversampling. The main focus of this thesis is to optimize the oversampling algorithm SMOTE so as to increase the classification accuracy of the intended class with minimal alteration of the data.

This thesis first proposes Affinitive Borderline-SMOTE (AB-SMOTE), which outperforms the classification accuracy of the earlier Borderline-SMOTE by oversampling new instances within the borderline area instead of oversampling the instances around it. The thesis then develops Clustered Affinitive Borderline-SMOTE (CAB-SMOTE), which partitions the borderline area into smaller clusters and oversamples within these clusters, delivering higher classification accuracy than AB-SMOTE in classifying the minority instances. Finally, the thesis proposes Hybrid Clustered Affinitive Borderline-SMOTE (HCAB-SMOTE), which combines an undersampling step that removes noisy borderline instances from the majority and minority classes with CAB-SMOTE oversampling, obtaining the highest classification accuracy among the compared oversampling techniques. These methods can therefore be used to improve the accuracy of machine learning applications, making them more reliable for decision-making processes that decrease cost and increase profit.
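The borderline-oversampling idea that AB-SMOTE builds on can be sketched in a few lines. The sketch below is illustrative only, not the thesis's implementation: it uses the common Borderline-SMOTE convention that a minority instance is "borderline" when at least half, but not all, of its k nearest neighbours belong to the majority class, followed by the standard SMOTE interpolation step. The function names and the toy dataset are hypothetical.

```python
import numpy as np

def is_borderline(X, y, idx, k=5):
    """A minority instance is 'borderline' (Borderline-SMOTE sense) when at
    least half of its k nearest neighbours are majority instances, but not
    all of them (an all-majority neighbourhood would mark it as noise)."""
    d = np.linalg.norm(X - X[idx], axis=1)   # Euclidean distance to all points
    nn = np.argsort(d)[1:k + 1]              # k nearest, skipping the point itself
    n_majority = np.sum(y[nn] == 0)          # class 0 = majority (assumed here)
    return k / 2 <= n_majority < k

def smote_interpolate(x, neighbor, rng):
    """Create one synthetic instance on the segment between two minority
    instances, as in the original SMOTE interpolation step."""
    return x + rng.random() * (neighbor - x)

# Toy imbalanced dataset: class 0 is the majority, class 1 the minority.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 1.0, size=(50, 2)),   # 50 majority instances
               rng.normal(1.5, 0.5, size=(6, 2))])   # 6 minority instances
y = np.array([0] * 50 + [1] * 6)

# Identify borderline minority instances and oversample around them only.
minority_idx = np.where(y == 1)[0]
border = [i for i in minority_idx if is_borderline(X, y, i)]
synthetic = []
for i in border:
    # interpolate toward a randomly chosen other minority instance
    j = rng.choice([m for m in minority_idx if m != i])
    synthetic.append(smote_interpolate(X[i], X[j], rng))
print(f"{len(border)} borderline instances, {len(synthetic)} synthetic samples")
```

Restricting the interpolation to borderline instances concentrates the new samples where the classifier's decision boundary is actually contested, rather than deep inside the minority region.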
Keywords: Imbalanced Data · Borderline SMOTE · Oversampling · SMOTE · AB-SMOTE · CAB-SMOTE · HCAB-SMOTE · K-Means Clustering
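The clustering step that distinguishes CAB-SMOTE can be sketched in the same spirit: cluster the borderline minority instances (the keywords indicate K-Means is used) and interpolate only between members of the same cluster, so synthetic points stay inside each local region of the borderline area rather than bridging unrelated parts of it. The minimal k-means and the helper function below are hypothetical stand-ins, not the thesis code.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Minimal k-means (Lloyd's algorithm), standing in for a library
    implementation; returns a cluster label for every row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # assign each point to its nearest centre
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # recompute centres; keep the old centre if a cluster empties
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels

def oversample_clusters(X_border, k, n_new, seed=0):
    """Cluster the borderline minority instances, then create synthetic
    instances only between members of the same cluster."""
    rng = np.random.default_rng(seed)
    labels = kmeans(X_border, k, seed=seed)
    synthetic = []
    for _ in range(n_new):
        c = rng.integers(k)
        members = X_border[labels == c]
        if len(members) < 2:          # need two points to interpolate between
            continue
        a, b = members[rng.choice(len(members), size=2, replace=False)]
        synthetic.append(a + rng.random() * (b - a))
    return np.array(synthetic)

# Hypothetical borderline minority instances forming two loose groups.
rng = np.random.default_rng(1)
X_border = np.vstack([rng.normal(0, 0.3, (8, 2)), rng.normal(3, 0.3, (8, 2))])
new_points = oversample_clusters(X_border, k=2, n_new=10)
print(f"generated {len(new_points)} synthetic instances")
```

Without the clustering step, interpolating between two borderline points from opposite ends of the boundary could place a synthetic instance in the middle of the majority region; per-cluster interpolation avoids exactly that failure mode.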
