Prediction of construction accident outcomes based on an imbalanced dataset through integrated resampling techniques and machine learning methods

Kerim Koc (Department of Civil Engineering, Yildiz Technical University, Istanbul, Turkey)
Ömer Ekmekcioğlu (Department of Civil Engineering, Istanbul Technical University, Istanbul, Turkey)
Asli Pelin Gurgun (Department of Civil Engineering, Yildiz Technical University, Istanbul, Turkey)

Engineering, Construction and Architectural Management

ISSN: 0969-9988

Article publication date: 23 June 2022

Issue publication date: 27 November 2023

Abstract

Purpose

Central to the entire discipline of construction safety management is the concept of construction accidents. Although notable progress has been made in safety management applications over the last decades, the construction industry still accounts for a considerable percentage of all workplace fatalities across the world. This study aims to predict occupational accident outcomes based on national data using machine learning (ML) methods coupled with several resampling strategies.

Design/methodology/approach

An occupational accident dataset recorded in Turkey was collected. To deal with the class imbalance between the number of nonfatal and fatal accidents, the dataset was pre-processed with random under-sampling (RUS), random over-sampling (ROS) and the synthetic minority over-sampling technique (SMOTE). In addition, random forest (RF), Naïve Bayes (NB), K-nearest neighbor (KNN) and artificial neural networks (ANNs) were employed as ML methods to predict accident outcomes.
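The three resampling strategies named above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy data, and the parameters `k` and `seed` are assumptions made for illustration, and the SMOTE variant shown handles only numeric features in a binary problem.

```python
import random
from collections import Counter


def _group_by_class(X, y):
    """Collect feature rows under their class label."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    return groups


def random_under_sample(X, y, seed=0):
    """RUS: randomly drop rows until every class matches the smallest class."""
    rng = random.Random(seed)
    groups = _group_by_class(X, y)
    n_min = min(len(rows) for rows in groups.values())
    X_out, y_out = [], []
    for label, rows in groups.items():
        for row in rng.sample(rows, n_min):
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out


def random_over_sample(X, y, seed=0):
    """ROS: randomly duplicate rows until every class matches the largest class."""
    rng = random.Random(seed)
    groups = _group_by_class(X, y)
    n_max = max(len(rows) for rows in groups.values())
    X_out, y_out = [], []
    for label, rows in groups.items():
        extra = [rng.choice(rows) for _ in range(n_max - len(rows))]
        for row in rows + extra:
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out


def smote(X, y, minority_label, k=3, seed=0):
    """SMOTE (simplified): add synthetic minority rows by interpolating between
    a minority row and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    minority = [row for row, label in zip(X, y) if label == minority_label]
    n_needed = (len(y) - len(minority)) - len(minority)
    X_out, y_out = list(X), list(y)
    for _ in range(max(n_needed, 0)):
        base = rng.choice(minority)
        neighbours = sorted(
            (r for r in minority if r is not base),
            key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        X_out.append([a + gap * (b - a) for a, b in zip(base, other)])
        y_out.append(minority_label)
    return X_out, y_out


# Hypothetical toy data: 9 nonfatal (0) and 3 fatal (1) records, two numeric features.
X = [[float(i), float(i % 3)] for i in range(12)]
y = [1 if i >= 9 else 0 for i in range(12)]

for name, (X_res, y_res) in {
    "RUS": random_under_sample(X, y),
    "ROS": random_over_sample(X, y),
    "SMOTE": smote(X, y, minority_label=1),
}.items():
    print(name, dict(Counter(y_res)))
```

RUS shrinks the majority class and discards information, ROS repeats existing minority records, and SMOTE creates new points between minority neighbours. In practice, resampling would normally be applied to the training split only, so that the test set keeps the original class ratio.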

Findings

The results showed that the RF outperformed the other methods when the dataset was preprocessed with RUS. The permutation importance results obtained through the RF indicated that the number of past accidents in the company, worker's age, material used, number of workers in the company, accident year, and time of the accident were the most significant attributes.
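Permutation importance, the ranking technique mentioned above, measures how much a model's score drops when one attribute's values are shuffled. The following is a minimal sketch under stated assumptions: `toy_predict` is a hypothetical stand-in for a fitted classifier (the study used an RF), accuracy is assumed as the score, and the data are invented.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(y, predict(X))
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the label
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(y, predict(X_perm)))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_predict(X):
    """Hypothetical model that looks only at the first feature."""
    return [1 if row[0] > 0.5 else 0 for row in X]


# Invented data: feature 0 drives the label, feature 1 is noise.
X_toy = [[i / 10, (i * 7) % 10 / 10] for i in range(10)]
y_toy = toy_predict(X_toy)

scores = permutation_importance(toy_predict, X_toy, y_toy)
print(scores)
```

A feature the model ignores scores zero, because shuffling it cannot change any prediction. A larger drop means the model relies more heavily on that attribute.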

Practical implications

The proposed framework can be used on construction sites on a monthly basis to detect workers who have a high probability of experiencing fatal accidents, which can be a valuable decision-making input for safety professionals seeking to reduce the number of fatal accidents.

Social implications

Practitioners and occupational health and safety (OHS) departments of construction firms can focus on the most important attributes identified in the analysis results to enhance workers' quality of life and well-being.

Originality/value

The literature on accident outcome prediction in the construction safety domain is limited in terms of dealing with imbalanced datasets through integrated resampling techniques and ML methods. A novel utilization plan was proposed and enhanced by the analysis results.

Acknowledgements

The authors would like to thank the Republic of Turkey, Social Security Institution (SSI) for its support and for providing the dataset. The authors would like to acknowledge that this paper is submitted in partial fulfillment of the requirements for a PhD degree at Yildiz Technical University.

Citation

Koc, K., Ekmekcioğlu, Ö. and Gurgun, A.P. (2023), "Prediction of construction accident outcomes based on an imbalanced dataset through integrated resampling techniques and machine learning methods", Engineering, Construction and Architectural Management, Vol. 30 No. 9, pp. 4486-4517. https://doi.org/10.1108/ECAM-04-2022-0305

Publisher

Emerald Publishing Limited

Copyright © 2022, Emerald Publishing Limited
