Automated classification of HTML forms on e‐commerce web sites
Abstract
Purpose
Most e‐commerce web sites use HTML forms for user authentication, new user registration, newsletter subscription, and searching for products and services. The purpose of this paper is to present a method for automated classification of HTML forms. Such classification is important for search engine applications, e.g. Yahoo Shopping and Google's Froogle, because the classification results can be used to improve the quality of the index and the accuracy of search results.
Design/methodology/approach
The paper describes a technique for classifying HTML forms based on their features. It develops algorithms that automatically generate features from HTML forms, and a neural network that classifies the forms from those features.
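As a rough illustration of what automatic feature generation for an HTML form might look like, the following minimal sketch counts the control types inside each form and returns one numeric vector per form, which could then be fed to a classifier. The feature list (FEATURES), the class name FormFeatureParser and the function extract_form_features are hypothetical and chosen for illustration only; the abstract does not state the paper's actual feature set or network architecture, and the neural network itself is not shown.

```python
from html.parser import HTMLParser

# Hypothetical feature names (an assumption): the abstract does not list
# the features the authors actually generate.
FEATURES = ("text", "password", "checkbox", "radio", "submit", "select", "textarea")


class FormFeatureParser(HTMLParser):
    """Collects one dictionary of control counts per <form> element."""

    def __init__(self):
        super().__init__()
        self.forms = []       # finished count dictionaries, one per form
        self._current = None  # counts for the form being parsed, if any

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self._current = dict.fromkeys(FEATURES, 0)
            return
        if self._current is None:
            return  # ignore controls that sit outside any form
        if tag == "input":
            # An <input> without a type attribute behaves as a text field.
            kind = (dict(attrs).get("type") or "text").lower()
        elif tag in ("select", "textarea"):
            kind = tag
        else:
            return
        if kind in self._current:
            self._current[kind] += 1

    def handle_endtag(self, tag):
        if tag == "form" and self._current is not None:
            self.forms.append(self._current)
            self._current = None


def extract_form_features(html):
    """Return one feature vector (ordered as FEATURES) per form in the page."""
    parser = FormFeatureParser()
    parser.feed(html)
    parser.close()
    return [[counts[name] for name in FEATURES] for counts in parser.forms]


LOGIN = (
    '<form action="/login">'
    '<input type="text" name="user">'
    '<input type="password" name="pw">'
    '<input type="submit" value="Sign in">'
    "</form>"
)
print(extract_form_features(LOGIN))  # [[1, 1, 0, 0, 1, 0, 0]]
```

In this sketch a login form stands out through its single password field, whereas a search form would typically yield one text field and one submit button; a real system would likely need richer features, such as label text or field names, to separate the form classes named in the abstract.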
Findings
The authors tested their classifier on an e‐commerce data set and a randomly retrieved data set and achieved accuracies of 94.7 and 93.9 per cent, respectively. Experimental results show that the classifier is effective and efficient on both test beds, suggesting that it is a promising general‐purpose method.
Originality/value
The paper is of value to those involved with information management and e‐commerce.
Keywords
Citation
Ru, Y. and Horowitz, E. (2007), "Automated classification of HTML forms on e‐commerce web sites", Online Information Review, Vol. 31 No. 4, pp. 451-466. https://doi.org/10.1108/14684520710780412
Publisher
Emerald Group Publishing Limited
Copyright © 2007, Emerald Group Publishing Limited