SelectKBest with the chi-square test is applied to select the top 10 features for each domain.
The top 10 features for MMSE turned out to be quite similar to the top 10 for GDS, but with a higher average score, so the final data frame for the demographic domain consists only of the top 10 MMSE features.
The remaining domains likewise keep only their top 10 features, selected with MMSE as the target.
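The selection step described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper `top_k_features` and the toy columns (`signal`, `weak`, `noise`) are hypothetical, and it assumes the target has been encoded as class labels and that all features are non-negative (for example one-hot encoded), as scikit-learn's `chi2` requires.

```python
import pandas as pd
from sklearn.feature_selection import SelectKBest, chi2


def top_k_features(X: pd.DataFrame, y: pd.Series, k: int = 10) -> pd.Series:
    """Return the k highest chi-square scores, indexed by column name.

    Hypothetical helper; assumes non-negative features and a categorical target.
    """
    # Cap k at the number of available columns so small frames do not fail.
    selector = SelectKBest(score_func=chi2, k=min(k, X.shape[1]))
    selector.fit(X, y)
    scores = pd.Series(selector.scores_, index=X.columns)
    # Keep only the selected columns, highest score first.
    return scores[selector.get_support()].sort_values(ascending=False)


# Toy data (hypothetical): 'signal' follows the target exactly,
# 'weak' follows it partially, 'noise' is unrelated to it.
toy_X = pd.DataFrame({
    "signal": [1, 1, 1, 1, 0, 0, 0, 0],
    "weak":   [1, 1, 1, 0, 1, 0, 0, 0],
    "noise":  [1, 0, 1, 0, 1, 0, 1, 0],
})
toy_y = pd.Series([1, 1, 1, 1, 0, 0, 0, 0], name="target")

best = top_k_features(toy_X, toy_y, k=2)
selected_columns = list(best.index)

# Subset the frame to the selected columns, as done per domain in Part A.
toy_selected = toy_X[selected_columns]
```

Running the same helper once per domain with `k=10` would give a ranked list of column names that can be used to subset each domain's data frame, in the same way the hard-coded column lists are used in Part A.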
PART A: Extract the Top 10 best features from each domain
final_demographic = final_demographic[['total_income','AGE','Marital Status_Balu/Duda','NEGERI_Kelantan','GENDER_Lelaki','NEGERI_Selangor','Job Sector Previously_Public Sector','GENDER_Perempuan','Marital Status_Berkahwin','Tinggal _Sendirian']]
final_health = final_health[['Bekas Perokok - Tahun Berhenti','Jika Kurang 1 Tahun Berapa Batang','WHODAS_baseline_None','WHODAS_baseline_Some','Rokok Sehari','Minum Alkohol_Ya','WHODAS_baseline_Moderate','Smoking_Bekas Perokok','WHODAS_baseline_Serious','ADL_Low - Patient very dependent']]
final_social = final_social[['TMOSSF (Tangible Support)_None of the time','Average Total Neighbourhood_Very Satisfied','sumLubben_More Social Engagement','TMOSSF (Informational)_All of the time','TMOSSF (Tangible Support)_All of the time','TMOSSF (Affective Support)_All of the time','TMOSSF (Informational)_None of the time','Total Social Cohesion Scale _Strongly Agree','TMOSSF (Positive Social Interaction)_All of the time','TMOSSF (Affective Support)_None of the time']]
final_psychology = final_psychology[['Total_EpQ(Average)_Yes','Total_Loneliness _Some of the time','Total SWLS_Agree','Total SWLS_Neutral','Quality Of Life','total_flourishing','Total_EpQ(Average)_No','Total SWLS_Disagree','Total_Loneliness _Often','Total_Loneliness _Hardly ever']]
PART B: All features of each domain