Peng Liu, Elia El‐Darzi, Lei Lei, Christos Vasilakis, Panagiotis Chountas and Wei Huang
Abstract
Purpose – Data preparation plays an important role in data mining, as most real-life data sets contain missing data. This paper aims to investigate different treatment methods for missing data.
Design/methodology/approach – This paper introduces, analyses and compares well-established treatment methods for missing data and proposes new methods based on the naïve Bayesian classifier. These methods have been implemented and compared using a real-life geriatric hospital data set.
Findings – In the case where a large proportion of the data is missing and many attributes have missing data, treatment methods based on the naïve Bayesian classifier perform very well.
Originality/value – This paper proposes an effective missing data treatment method and offers a viable approach to predicting inpatient length of stay from a data set with many missing values.
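To make the idea of a classifier-based treatment concrete, the following is a minimal sketch of how a naïve Bayesian classifier could be used to fill in a missing categorical attribute. It is not the procedure evaluated in this paper, whose details are not given in the abstract. The function name `nb_impute`, the attribute names (`age_band`, `admit`, `stay`) and the Laplace smoothing parameter `alpha` are all assumptions introduced for illustration.

```python
from collections import Counter, defaultdict


def nb_impute(rows, target, alpha=1.0):
    """Fill missing (None) values of `target` with a categorical naive Bayes guess.

    Illustrative sketch only: attribute names and smoothing are assumptions.
    """
    predictors = [k for k in rows[0] if k != target]
    # Train only on records where the target attribute is observed.
    train = [r for r in rows if r[target] is not None]
    class_counts = Counter(r[target] for r in train)

    # cond_counts[attr][cls][value] = number of training records seen
    cond_counts = {a: defaultdict(Counter) for a in predictors}
    values = {a: set() for a in predictors}
    for r in train:
        for a in predictors:
            if r[a] is not None:
                cond_counts[a][r[target]][r[a]] += 1
                values[a].add(r[a])

    def predict(row):
        best, best_score = None, -1.0
        for cls, n_cls in class_counts.items():
            score = n_cls / len(train)  # prior P(class)
            for a in predictors:
                v = row[a]
                if v is None:
                    continue  # a missing predictor contributes no evidence
                counts = cond_counts[a][cls]
                total = sum(counts.values())
                # Laplace-smoothed estimate of P(value | class)
                score *= (counts[v] + alpha) / (total + alpha * max(len(values[a]), 1))
            if score > best_score:
                best, best_score = cls, score
        return best

    # Return new records; the input list is left unchanged.
    return [dict(r, **{target: predict(r)}) if r[target] is None else dict(r)
            for r in rows]


# Hypothetical toy records; None marks a missing value.
rows = [
    {"age_band": "old", "admit": "emergency", "stay": "long"},
    {"age_band": "old", "admit": "emergency", "stay": "long"},
    {"age_band": "young", "admit": "planned", "stay": "short"},
    {"age_band": "young", "admit": "planned", "stay": "short"},
    {"age_band": "old", "admit": None, "stay": None},
    {"age_band": "young", "admit": "planned", "stay": None},
]
filled = nb_impute(rows, "stay")
```

In this sketch a record's missing predictors are skipped when scoring, so a record with several gaps can still be treated using whatever attributes are observed. This is one reason a naïve Bayesian approach may suit data sets in which many attributes have missing values.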