Search-Based Wrapper Feature Selection Methods in Software Defect Prediction: An Empirical Analysis

Balogun, A.O. and Basri, S. and Jadid, S.A. and Mahamad, S. and Al-momani, M.A. and Bajeh, A.O. and Alazzawi, A.K. (2020) Search-Based Wrapper Feature Selection Methods in Software Defect Prediction: An Empirical Analysis. Advances in Intelligent Systems and Computing, 1224 A. pp. 492-503.

Full text not available from this repository.
Official URL: https://www.scopus.com/inward/record.uri?eid=2-s2....

Abstract

High dimensionality is a data quality problem that degrades the predictive capability of models in software defect prediction (SDP). Feature selection (FS) has been used as a viable solution to this problem. Existing studies distinguish two basic types of FS methods: filter-based feature selection (FFS) and wrapper feature selection (WFS). Of the two, WFS methods are regarded as having superior performance. However, WFS methods are known to have a high computational cost, because the number of executions required for feature subset search, evaluation and selection is not known in advance. This often leads to overfitting of prediction models, as the search is easily trapped in local maxima. Applying an appropriate search method in the WFS subset evaluator phase can resolve this trapping in local maxima. Best First Search (BFS) and Greedy Step-wise Search (GSS) have been used extensively and conventionally as search methods in WFS, with positive impact. However, metaheuristic search methods can be as effective as BFS and GSS. Consequently, this study conducts an empirical comparative analysis of 13 search methods (11 state-of-the-art metaheuristic search methods and 2 conventional search methods) in WFS methods for SDP. The experimental results showed that metaheuristic search methods (AS, BS, BAT, CS, ES, FS, FLS, GS, NSGA-II, PSOS, RS) in WFS performed better than the conventional search methods (BFS and GSS), although the average computational time of metaheuristic-based WFS methods is relatively high. We recommend metaheuristic search methods as alternative search methods for WFS in SDP. © 2020, Springer Nature Switzerland AG.
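
To illustrate the wrapper idea described in the abstract, here is a minimal sketch of a greedy step-wise (forward) search driving a wrapper evaluator. It is not the authors' experimental setup: the nearest-centroid classifier, the leave-one-out scoring, the synthetic data, and all function names (`nearest_centroid_accuracy`, `greedy_stepwise_search`, `make_toy_data`) are assumptions chosen only to keep the example self-contained.

```python
import random


def nearest_centroid_accuracy(X, y, features):
    """Wrapper evaluator (assumed for illustration): leave-one-out accuracy
    of a nearest-centroid classifier that sees only the given feature indices."""
    if not features:
        return 0.0
    n = len(X)
    correct = 0
    for i in range(n):
        # Build per-class centroids from every row except row i.
        sums, counts = {}, {}
        for j in range(n):
            if j == i:
                continue
            s = sums.setdefault(y[j], [0.0] * len(features))
            for k, f in enumerate(features):
                s[k] += X[j][f]
            counts[y[j]] = counts.get(y[j], 0) + 1

        def sq_dist(label):
            return sum((X[i][f] - sums[label][k] / counts[label]) ** 2
                       for k, f in enumerate(features))

        predicted = min(sorted(sums), key=sq_dist)
        correct += predicted == y[i]
    return correct / n


def greedy_stepwise_search(X, y, evaluate):
    """Forward greedy search: repeatedly add the single feature that most
    improves the evaluator's score, and stop when no addition helps."""
    n_features = len(X[0])
    selected, best_score = [], 0.0
    while len(selected) < n_features:
        candidates = [f for f in range(n_features) if f not in selected]
        # Every candidate subset is scored by training and testing the model,
        # which is why wrapper methods are computationally expensive.
        scored = [(evaluate(X, y, selected + [f]), f) for f in candidates]
        score, feature = max(scored, key=lambda t: (t[0], -t[1]))
        if score <= best_score:
            break  # local maximum reached: no single addition improves the score
        selected.append(feature)
        best_score = score
    return selected, best_score


def make_toy_data(n=60, seed=0):
    """Synthetic data (hypothetical): feature 0 separates the two classes,
    features 1 to 4 are pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        informative = label * 4.0 + rng.gauss(0, 0.5)
        noise = [rng.gauss(0, 1) for _ in range(4)]
        X.append([informative] + noise)
        y.append(label)
    return X, y


X, y = make_toy_data()
selected, score = greedy_stepwise_search(X, y, nearest_centroid_accuracy)
print("selected features:", selected, "score:", round(score, 3))
```

The stopping rule in `greedy_stepwise_search` shows the local-maximum behaviour the abstract mentions: the search halts as soon as no single added feature improves the score. A metaheuristic search would replace this loop with a strategy that can accept or explore non-improving subsets, at the cost of more evaluator calls.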

Item Type: Article
Impact Factor: cited By 2
Uncontrolled Keywords: Computer software; Forecasting; Predictive analytics, Comparative analysis; Computational costs; Empirical analysis; Feature selection methods; High dimensionality; Meta-heuristic search; Predictive capabilities; Software defect prediction, Feature extraction
Depositing User: Ms Sharifah Fahimah Saiyed Yeop
Date Deposited: 27 Aug 2021 05:50
Last Modified: 27 Aug 2021 05:50
URI: http://scholars.utp.edu.my/id/eprint/24728
