Search results

1 – 3 of 3
Article
Publication date: 28 August 2009

Vassiliki A. Koutsonikola, Sophia G. Petridou, Athena I. Vakali and Georgios I. Papadimitriou

Abstract

Purpose

Web users' clustering is an important mining task since it contributes to identifying usage patterns, a beneficial task for a wide range of applications that rely on the web. The purpose of this paper is to examine the use of Kullback‐Leibler (KL) divergence, an information theoretic distance, as an alternative option for measuring distances in web users' clustering.
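To make the distance concrete, the sketch below computes KL divergence between two users' page-visit distributions. It is an illustration only, not the paper's implementation: the function names, the additive smoothing constant `eps`, and the symmetrised variant are all assumptions introduced here, and the visit counts are made up.

```python
import math

def visit_distribution(counts, eps=1e-9):
    """Turn raw page-visit counts into a probability distribution.

    eps is a hypothetical smoothing constant (an assumption of this sketch)
    that keeps every probability non-zero so the logarithm stays defined.
    """
    smoothed = [c + eps for c in counts]
    total = sum(smoothed)
    return [c / total for c in smoothed]

def kl_divergence(p, q):
    """KL(p || q) = sum_i p_i * log(p_i / q_i), using the natural logarithm."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def symmetric_kl(p, q):
    """KL divergence is not symmetric; summing both directions is one
    common way to obtain a symmetric dissimilarity."""
    return kl_divergence(p, q) + kl_divergence(q, p)

# Made-up visit counts over three pages for two users.
user_a = visit_distribution([8, 1, 1])
user_b = visit_distribution([1, 1, 8])
distance = symmetric_kl(user_a, user_b)
```

Users with identical visit distributions get a divergence of zero, and the value grows as their browsing profiles diverge.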

Design/methodology/approach

KL‐divergence is compared with other well‐known distance measures and clustering results are evaluated using a criterion function, validity indices, and graphical representations. Furthermore, the impact of noise (i.e. occasional or mistaken page visits) is evaluated, since it is imperative to assess whether a clustering process exhibits tolerance in noisy environments such as the web.
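One possible way to simulate such noise is to add a few visits to randomly chosen pages for each user before clustering. The sketch below is an assumption about how this could be done, not the procedure used in the paper; `add_noise`, `extra_visits`, and `seed` are hypothetical names.

```python
import random

def add_noise(visit_counts, extra_visits, seed=0):
    """Return a copy of a user-by-page count matrix with random visits added.

    Each user (row) receives extra_visits additional visits, each assigned to
    a uniformly chosen page, to mimic occasional or mistaken page visits.
    The uniform choice is an assumption of this sketch.
    """
    rng = random.Random(seed)
    noisy = [row[:] for row in visit_counts]  # leave the input untouched
    for row in noisy:
        for _ in range(extra_visits):
            row[rng.randrange(len(row))] += 1
    return noisy

# Made-up counts: two users, four pages.
clean = [[5, 0, 0, 1], [0, 4, 2, 0]]
noisy = add_noise(clean, extra_visits=3)
```

Clustering the clean and the noisy matrices with the same distance measure and comparing the results gives a rough picture of how tolerant that measure is to noise.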

Findings

The proposed KL clustering approach performs similarly to the other distance measures under both synthetic and real data workloads. Moreover, when extra noise is imposed on real data, the approach deteriorates less than most of the other conventional distance measures.

Practical implications

The experimental results show that a probabilistic measure such as KL‐divergence is quite efficient in noisy environments and thus constitutes a good alternative for the web users' clustering problem.

Originality/value

This work is inspired by the use of divergence in the clustering of biological data, which the authors introduce to the area of web clustering. According to the experimental results presented in this paper, KL‐divergence can be considered a good alternative for measuring distances in noisy environments such as the web.

Details

International Journal of Web Information Systems, vol. 5 no. 3
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 28 August 2009

Ismail Khalil

Details

International Journal of Web Information Systems, vol. 5 no. 3
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 23 November 2010

Details

International Journal of Web Information Systems, vol. 6 no. 4
Type: Research Article
ISSN: 1744-0084
