Chi-Squared and P-Values vs. Machine Learning Feature Selection
Fraunhoffer et al. used the least absolute shrinkage and selection operator (LASSO) and random forest (RF) for feature selection, an approach that may not be ideal for establishing associations. They noted that master regulator transcripts from the neoplastic cell phenotype played a significant role in LASSO feature selection for all drugs, with the highest proportions for GEM and 5-FU [1]. However, features selected by machine learning methods do not necessarily reflect true associations [2,3,4]. Chi-squared tests and p-values should instead be used to establish true associations, rather than relying on LASSO and RF [5,6,7].
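
As a minimal sketch of the kind of test recommended here, the following Python snippet runs a Pearson chi-squared test of independence on a 2x2 table relating a dichotomized transcript level to drug response. The counts, the variable names, and the helper `chi_squared_2x2` are hypothetical and purely illustrative; they are not taken from Fraunhoffer et al. or from any cited study, and no continuity correction is applied.

```python
import math


def chi_squared_2x2(table):
    """Pearson chi-squared test of independence for a 2x2 table.

    Hypothetical helper for illustration; no continuity correction.
    Returns the test statistic and its p-value (1 degree of freedom).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    statistic = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (table[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, the chi-squared survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Illustrative counts only (assumed, not real data):
# rows    = transcript level high / low
# columns = responder / non-responder
observed = [[30, 10],
            [15, 25]]

statistic, p_value = chi_squared_2x2(observed)
print(f"chi-squared = {statistic:.3f}, p = {p_value:.4g}")
```

In this sketch, a small p-value would indicate that transcript level and response are unlikely to be independent in the hypothetical table. Tables larger than 2x2 would need the general chi-squared distribution rather than the one-degree-of-freedom shortcut used above.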