International Journal of
Computer Engineering Research

  • Abbreviation: Int. J. Comput. Eng. Res.
  • Language: English
  • ISSN: 2141-6494
  • DOI: 10.5897/IJCER
  • Start Year: 2010
  • Published Articles: 33

Full Length Research Paper

Analysis and result of classification algorithm on email classification

Elifenesh Yitagesu Desta
  • Department of Computer Science, College of Computing, Madda Walabu University, P. O. Box 247, Bale Robe, Ethiopia.
Tekalign Tujo Gurmessa
  • Department of Computer Science, College of Computing, Madda Walabu University, P. O. Box 247, Bale Robe, Ethiopia.


  •  Received: 21 January 2019
  •  Accepted: 13 May 2019
  •  Published: 31 July 2019

Abstract

Electronic mail (e-mail) is currently one of the most widely used and fastest forms of communication. However, the growth in the number of e-mail users has led to a dramatic increase in spam e-mails over the past few years. Spam is the use of electronic messaging systems to send bulk data. In this paper, e-mail data were classified as ham or spam using supervised learning algorithms. Three classifiers were used: the Naïve Bayesian (NB) classifier, the K-nearest neighbor (KNN) classifier and the Support Vector Machine (SVM) classifier. The experiment was performed by applying a filtering algorithm to the classifiers, and the results show the difference in each classifier's performance before and after filtering. To examine the performance of the selected classification algorithms, namely Naïve Bayes, SVM and KNN, the true positive, false positive, precision, recall and F-measure values were evaluated. The classification algorithms also differed in the time they required. Sequential minimal optimization (SMO) is an algorithm used to solve the quadratic programming (QP) problem that arises during the training of support vector machines (SVM). Before the filtering algorithm was applied, KNN and SMO were generally the best classifiers among the three; after it was applied, SMO was the best classifier. The data mining tool WEKA was used for this experiment.

Key words: WEKA, classifier, K-nearest neighbor (KNN), support vector machines (SVM), Naïve Bayesian (NB), boosting.
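
The evaluation measures named in the abstract (true positive, false positive, precision, recall and F-measure) can be illustrated with a minimal Python sketch. It is not the WEKA procedure used in the paper; it assumes spam is treated as the positive class, and the names `Counts`, `count_outcomes` and `evaluate` are hypothetical helpers introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Confusion counts, assuming spam is the positive class (hypothetical helper)."""
    tp: int  # spam correctly labelled spam
    fp: int  # ham wrongly labelled spam
    fn: int  # spam wrongly labelled ham
    tn: int  # ham correctly labelled ham


def count_outcomes(actual, predicted, positive="spam"):
    """Tally the four outcome types from parallel lists of labels."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return Counts(tp, fp, fn, tn)


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def evaluate(c):
    """Compute the standard measures from the confusion counts."""
    precision = safe_div(c.tp, c.tp + c.fp)
    recall = safe_div(c.tp, c.tp + c.fn)  # identical to the true positive rate
    fp_rate = safe_div(c.fp, c.fp + c.tn)
    f_measure = safe_div(2 * precision * recall, precision + recall)
    return {
        "tp_rate": recall,
        "fp_rate": fp_rate,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    # Toy labels for illustration only; not data from the paper.
    actual = ["spam", "spam", "ham", "ham", "spam"]
    predicted = ["spam", "ham", "ham", "spam", "spam"]
    print(evaluate(count_outcomes(actual, predicted)))
```

The F-measure here is the harmonic mean of precision and recall, so a classifier scores well only when both are high.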