
Document Type

Article

Keywords

Spam detection, Ensemble learning, Evolutionary algorithm, Data analysis

Abstract

Email is a central channel of modern communication. As the volume of spam grows, more effective antispam filters are needed to detect unwanted messages. Existing spam detection techniques often fall short, which has prompted researchers to apply machine learning and artificial intelligence to improve online security. This study introduces a spam detection technique based on ensemble learning. First, key features are extracted from both spam and non-spam emails using the term frequency-inverse document frequency (TF-IDF) method. Several classification algorithms, namely Cubist, naïve Bayes, support vector machine, rpart, and ctree, are then applied to classify emails on the basis of the extracted features. Finally, a stacking model that uses an evolutionary algorithm is implemented to further increase detection accuracy. The proposed methodology is evaluated on two widely used spam datasets, achieving an accuracy of 98.39% on the Enron dataset and 98% on the SpamAssassin dataset.
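The TF-IDF feature extraction step named in the abstract can be illustrated with a minimal sketch. The abstract does not say which TF-IDF variant, tokenizer, or preprocessing the study uses, so the choices here are assumptions: whitespace tokenization, term frequency as count divided by document length, and inverse document frequency as log(N / df). The function name `tfidf` and the three sample emails are hypothetical and are not taken from the Enron or SpamAssassin datasets.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant (not confirmed by the source):
    tf = count / document length, idf = log(N / df).
    """
    # Lowercase and split on whitespace; real pipelines usually do more cleaning.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus, for illustration only.
emails = [
    "win a free prize now",
    "meeting agenda for monday",
    "free prize inside claim now",
]
vectors = tfidf(emails)
```

Under this variant, a term that occurs in every document receives a weight of zero, while terms concentrated in a few documents receive larger weights. The resulting per-email weight dictionaries are the kind of feature vectors a downstream classifier would consume.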
