The purpose of this study is to examine the data and build a bankruptcy warning alert that can be usefully integrated into the company's current system. Using publicly accessible datasets, the study constructs predictive machine learning models for bankruptcy occurrence, evaluates several classification approaches, and selects the best model for this application according to predetermined performance metrics. The methodology applies five widely used machine learning algorithms: Neural Network, Logistic Regression, Decision Tree, Random Forest, and K-Nearest Neighbors. To select the optimal model, the study proceeds through five stages: Data Retrieval, Data Preparation and Exploration, Data Preprocessing, Data Modelling, and finally Data Evaluation. The findings identify Random Forest as the champion model for predicting the bankruptcy of US companies. Across the metrics used to compare the five algorithms, Random Forest provides the highest accuracy and F1 score of 0.99. Decision Tree, K-Nearest Neighbors, and Neural Network record similar results, with an accuracy of 0.76 and a weighted-average F1 score of 0.86, while Logistic Regression ranks last with an unsatisfactory accuracy of 0.56 and an F1 score of 0.71.
Amira Ibrahim Ali Abdelhady Ghanem
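
The abstract compares models by accuracy and by weighted-average F1 score. As a minimal sketch of how these two metrics are commonly computed, the following pure-Python example evaluates a small set of hypothetical labels. The label lists, the function names `accuracy` and `weighted_f1`, and the assumption that the weighted average weights each class's F1 by its support are illustrative choices and are not taken from the study's data or code.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's support."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    support = Counter(y_true)  # number of true samples per class
    total = 0.0
    for label, n in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0  # F1 = 2TP / (2TP + FP + FN)
        total += f1 * n
    return total / len(y_true)


# Hypothetical toy labels (1 = bankrupt, 0 = not bankrupt); not the study's data.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

print("accuracy:", accuracy(y_true, y_pred))
print("weighted F1:", weighted_f1(y_true, y_pred))
```

Because bankruptcy datasets are typically imbalanced, reporting a support-weighted F1 alongside accuracy shows how each class contributes to the overall score.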