Statistical Techniques for feature selection in Machine learning Models
Authors: Vaibhav Tummalapalli, Kiran Konakalla
DOI: https://doi.org/10.37082/IJIRMPS.v13.i3.232566
Short DOI: https://doi.org/g9q356
Country: USA
Abstract: Feature selection is a critical step in the machine learning pipeline, particularly when working with the high-dimensional datasets common in the automotive and marketing domains. Selecting the most informative predictors not only improves model accuracy and interpretability but also enhances computational efficiency and decision-making speed, both key factors in real-time business applications. This paper explores four foundational statistical techniques for feature selection: Information Value (IV), the Chi-Square Test, Analysis of Variance (ANOVA), and Correlation Coefficients. Each method is presented with its theoretical foundation, historical significance, and mathematical formulation. Beyond the academic context, we highlight the practical implications of feature selection for driving operational efficiency, reducing model training costs, and improving the effectiveness of customer segmentation, campaign targeting, and vehicle sales prediction. By understanding and applying these techniques, practitioners can streamline model development and obtain actionable insights that translate into measurable business outcomes.
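As an illustration of the first technique named in the abstract, the following is a minimal sketch of an Information Value calculation for a binned feature against a binary target. It uses the commonly cited Weight of Evidence definition, IV = sum over bins of (share of non-events - share of events) * ln(share of non-events / share of events). The abstract does not reproduce the paper's own formulation, so the function name `information_value`, the `smoothing` parameter, and the additive smoothing used to avoid division by zero are all assumptions made for this example.

```python
import math
from collections import Counter


def information_value(bins, targets, smoothing=0.5):
    """Information Value of a binned feature against a binary target.

    bins      -- iterable of bin labels, one per observation
    targets   -- iterable of 0/1 labels (1 = event), same length as bins
    smoothing -- assumed additive constant that keeps empty cells from
                 producing log(0) or division by zero (not from the paper)
    """
    events = Counter()
    non_events = Counter()
    for label, target in zip(bins, targets):
        if target == 1:
            events[label] += 1
        else:
            non_events[label] += 1

    categories = set(events) | set(non_events)
    n_bins = len(categories)

    # Smoothed totals, so the per-bin shares still sum to 1.
    total_events = sum(events.values()) + smoothing * n_bins
    total_non_events = sum(non_events.values()) + smoothing * n_bins

    iv = 0.0
    for label in categories:
        pct_event = (events[label] + smoothing) / total_events
        pct_non_event = (non_events[label] + smoothing) / total_non_events
        # Weight of Evidence for this bin.
        woe = math.log(pct_non_event / pct_event)
        iv += (pct_non_event - pct_event) * woe
    return iv
```

A feature whose bins carry the same event rate yields an IV near zero, while a feature whose bins separate events from non-events yields a larger value. Any cut-off used to keep or drop a feature is a modelling choice and is not specified in the abstract.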
Keywords:
Paper Id: 232566
Published On: 2025-06-07
Published In: Volume 13, Issue 3, May-June 2025