Document Type

Article

Keywords

Malware detection, Classification, Feature selection, Ensemble model

Abstract

The rapid spread of Android-based smartphones has exposed these devices to a growing range of malware threats. Over time, Android malware has become increasingly complex and challenging to mitigate. Detection relies on identifying specific sets of features that characterize malicious behavior, and these feature sets have grown larger and more diverse as malware has grown more complex. Traditional approaches often suffer from high-dimensional feature spaces, which increase computational complexity and reduce detection accuracy. This paper therefore proposes a feature optimization approach that selects the most informative malware features and discards redundant and noisy ones to keep computation efficient. An ensemble model based on voting combines three base classifiers (LMT, KStar, and Decision Table) that receive features selected by the Relief algorithm. The proposed models were evaluated in several experiments on three datasets (Drebin, Malgenome, and Prerna) comprising 35,135 samples (10,820 malware and 24,315 benign) across feature settings of 50, 100, 150, and all features. The experimental results show that detection and classification accuracy can be improved by an optimized feature vector. Overall, the model using 150 features achieved the highest performance, 99.61%.
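
The two ideas named in the abstract, Relief-based feature ranking and a voting ensemble, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the basic two-class Relief with feature values scaled to [0, 1] (for example, binary permission flags), it visits every sample instead of drawing a random subset, and it assumes a simple majority vote over predicted labels. The function names `relief_weights`, `top_k_features`, and `majority_vote` are hypothetical, and the base classifiers (LMT, KStar, Decision Table) are not reproduced here.

```python
from collections import Counter


def relief_weights(X, y):
    """Basic two-class Relief, visiting every sample once (assumption:
    no random sampling, feature values already scaled to [0, 1])."""
    n, d = len(X), len(X[0])
    weights = [0.0] * d
    for i in range(n):
        def dist(j):
            # Manhattan distance between sample i and sample j
            return sum(abs(a - b) for a, b in zip(X[i], X[j]))

        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        if not hits or not misses:
            continue  # cannot update without both a hit and a miss
        hit = min(hits, key=dist)     # nearest neighbour of the same class
        miss = min(misses, key=dist)  # nearest neighbour of the other class
        for f in range(d):
            # Reward features that differ on the miss, penalise those
            # that differ on the hit.
            weights[f] += (abs(X[i][f] - X[miss][f])
                           - abs(X[i][f] - X[hit][f])) / n
    return weights


def top_k_features(weights, k):
    """Indices of the k highest-weighted features, best first."""
    ranked = sorted(range(len(weights)), key=lambda f: weights[f], reverse=True)
    return ranked[:k]


def majority_vote(predictions):
    """Hard voting: the label predicted by most base classifiers."""
    return Counter(predictions).most_common(1)[0][0]


# Toy data: feature 0 equals the label, feature 1 is noise.
X = [[1, 0], [1, 1], [0, 0], [0, 1]]
y = [1, 1, 0, 0]
weights = relief_weights(X, y)
selected = top_k_features(weights, 1)
label = majority_vote(["malware", "benign", "malware"])
```

In a pipeline of the kind the abstract describes, `top_k_features` would be called with k set to 50, 100, or 150, and the three base classifiers would be trained on the selected columns before their predictions are combined.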
