Balancing and variable reduction of firm bankruptcy data

David L. Olson; Bongsug Chae

doi:10.18757/jscms.2022.6243

Authors

David L. Olson University of Nebraska Lincoln
Bongsug Chae Kansas State University

DOI:

https://doi.org/10.18757/jscms.2022.6243

Abstract

Financial stress experienced by any element of a supply chain causes stress for all members. Predictive data mining is a common tool for predicting bankruptcy. Bankruptcy data are often highly imbalanced, with bankrupt firms being by far the minority case, and include a large number of potential variables. This study uses data from four studies of firm bankruptcy and examines the impact of data balancing and variable selection on model accuracy. The models used are random forest and gradient boosting (both based on decision trees), logistic regression, neural networks, and support vector machines. Two methods are used to trim the number of variables and generate reduced variable sets: stepwise regression and entropy from decision trees. For the entropy (decision tree) option, the complexity parameter was used to set the number of variables retained. The impact of reducing variables is examined. The error metrics used were type I and type II error (sensitivity and specificity), overall average error (accuracy), and area under the curve (AUC). Extreme gradient boosting and random forest models were found to have lower average error than support vector machines, which had a slight advantage over logistic regression and neural networks. Variable reduction led to mixed results with respect to relative accuracy. Overall accuracy increased with a slight reduction in the number of variables (using stepwise regression), but deteriorated as the variable set was reduced further. The balancing experiments found that unbalanced data had high error rates, which dropped a great deal with even 10 percent balancing; balancing beyond 10 percent provided little additional accuracy.
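
As an illustration of the balancing and error metrics described in the abstract, the following is a minimal Python sketch, not the authors' procedure. It assumes that "10 percent balancing" means the bankrupt (minority) class makes up about 10 percent of the rows, that balancing is done by randomly undersampling the majority class, and that bankrupt firms are labeled 1. The function names `undersample_majority` and `error_metrics` are hypothetical.

```python
import random


def undersample_majority(labels, minority_share, seed=0):
    """Return sorted row indices of a subsample in which label 1 (assumed to
    mean bankrupt) makes up roughly `minority_share` of the rows.

    Assumption: balancing is done by randomly dropping majority (label 0)
    rows; the abstract does not state which balancing method was used.
    """
    if not 0 < minority_share < 1:
        raise ValueError("minority_share must be strictly between 0 and 1")
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    # Majority rows needed so that minority / (minority + majority) == share.
    n_major = round(len(minority) * (1 - minority_share) / minority_share)
    n_major = min(n_major, len(majority))  # cannot keep more than exist
    rng = random.Random(seed)
    return sorted(minority + rng.sample(majority, n_major))


def error_metrics(actual, predicted):
    """Compute sensitivity, specificity, and accuracy from 0/1 labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    return {
        # Share of bankrupt firms that the model flags as bankrupt.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Share of surviving firms that the model flags as surviving.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        # Share of all firms classified correctly.
        "accuracy": (tp + tn) / len(pairs) if pairs else float("nan"),
    }


if __name__ == "__main__":
    # Toy data: 10 bankrupt firms among 1,000 (1 percent minority share).
    labels = [1] * 10 + [0] * 990
    kept = undersample_majority(labels, minority_share=0.10)
    print(len(kept), sum(labels[i] for i in kept))
    print(error_metrics([1, 1, 0, 0, 0], [1, 0, 0, 0, 1]))
```

On highly imbalanced data, accuracy alone can look good for a model that never predicts bankruptcy, which is why sensitivity and specificity are reported separately in this sketch.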

Published

2022-07-30

How to Cite

Olson, D. L., & Chae, B. (2022). Balancing and variable reduction of firm bankruptcy data. Journal of Supply Chain Management Science, 3(1-2), 3–15. https://doi.org/10.18757/jscms.2022.6243

Issue

Vol. 3 No. 1-2 (2022): January 2022 - June 2022

Section

Research Articles

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

JSCMS is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) licence. The license means that anyone is free to share (to copy, distribute, and transmit the work), to remix (to adapt the work) under the following conditions:

The original authors must be given credit
For any reuse or distribution, it must be made clear to others what the license terms of this work are
Any of these conditions can be waived if the copyright holders give permission
Nothing in this license impairs or restricts the author's moral rights
