Behaviour analysis of internet survey completion using decision trees: An exploratory study

Che‐Chern Lin (National Kaohsiung Normal University, Kaohsiung City, Taiwan)
Hung‐Jen Yang (National Kaohsiung Normal University, Kaohsiung City, Taiwan)
Lung‐Hsing Kuo (National Kaohsiung Normal University, Kaohsiung City, Taiwan)

Online Information Review

ISSN: 1468-4527

Article publication date: 20 February 2009

Abstract

Purpose

The purpose of this paper is to explore teachers' behaviours in completing an internet survey using decision trees. Furthermore, to reduce the complexity of the decision trees, a statistical technique (factor analysis) was used to decrease the number of input variables.

Design/methodology/approach

A dataset of 47,647 samples, collected from an internet survey of teachers in Taiwan, was used to build the decision trees. The output of the decision trees was the answering time (the time taken to complete the internet questionnaire). Eight variables were selected as the inputs. Two techniques were employed to build the decision trees: the exhaustive chi‐squared automatic interaction detector (ECHAID) and classification and regression tree (CRT) analysis. To reduce the complexity of the decision models, factor analysis was used to decrease the data dimensions (the number of input variables) and to obtain a simplified decision model. One‐way ANOVA was used to validate the effects of the dimension reduction.
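As a rough illustration of how a CRT-style tree picks a split, the sketch below scores candidate thresholds on one input by weighted Gini impurity. Gini is a common CRT criterion, but the abstract does not say which criterion the authors used. The toy data, the "fast"/"slow" binning of answering time, and the function names are all hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_binary_split(values, labels):
    """Return (threshold, weighted_gini) for the best split 'value <= threshold'.

    Returns (None, parent_gini) if no candidate split lowers the impurity.
    """
    n = len(labels)
    best = (None, gini(labels))
    # Every distinct value except the largest is a candidate threshold.
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical toy data (not from the study): teaching years for eight
# respondents and an assumed binning of answering time into two classes.
teaching_years = [1, 3, 5, 8, 12, 15, 20, 25]
answer_class = ["fast", "fast", "fast", "fast", "slow", "slow", "slow", "slow"]

threshold, impurity = best_binary_split(teaching_years, answer_class)
print(threshold, impurity)
```

A full tree would apply the same search recursively across all input variables. With fewer inputs there are fewer candidate splits to evaluate at each node, which is one reason reducing the input set simplifies the model.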

Findings

From the results of the factor analysis, a simplified decision tree is recommended using four input variables – teaching years, school level, sex and area. The classification accuracy of the simplified model is statistically equivalent to that of the original one, which used eight input variables.
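The comparison between the two models can be pictured with a one-way ANOVA F statistic, computed here from scratch. This is only a sketch: the accuracy figures are invented for illustration, and the abstract does not describe how the authors grouped their accuracy measurements.

```python
def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for two or more groups of numbers."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical accuracies from repeated runs (not the paper's results).
acc_eight_inputs = [0.71, 0.69, 0.72, 0.70, 0.68]
acc_four_inputs = [0.70, 0.69, 0.71, 0.70, 0.67]

f_stat, df_b, df_w = one_way_anova_f(acc_eight_inputs, acc_four_inputs)
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

The resulting F value is compared with the critical value for the same degrees of freedom (about 5.32 for F(1, 8) at the 0.05 level). An F below that threshold means the test finds no significant difference between the mean accuracies of the two models.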

Originality/value

The complexity of decision trees theoretically depends on the number of input variables. This study used factor analysis to decrease the number of input variables and thereby reduce the complexity of the decision trees. One‐way ANOVA was then employed to validate that the classification accuracy is not statistically different between the original decision model and the simplified one. The decision models proposed in this paper can be applied to estimate the answering time for completing a questionnaire during an internet survey.

Citation

Lin, C., Yang, H. and Kuo, L. (2009), "Behaviour analysis of internet survey completion using decision trees: An exploratory study", Online Information Review, Vol. 33 No. 1, pp. 117-134. https://doi.org/10.1108/14684520910944427

Publisher

Emerald Group Publishing Limited

Copyright © 2009, Emerald Group Publishing Limited
