Preserving privacy on the searchable internet

Ruxia Ma (School of Information, Renmin University of China, Beijing, China)
Xiaofeng Meng (School of Information, Renmin University of China, Beijing, China)
Zhongyuan Wang (Microsoft Research Asia, Beijing, China)

International Journal of Web Information Systems

ISSN: 1744-0084

Article publication date: 24 August 2012

Abstract

Purpose

The Web is the largest repository of information, and personal information is usually scattered across many pages of different websites. Search engines have made this information easier to find: an attacker can use them to collect a user's scattered information and infer private information from it. The authors call this kind of privacy attack a “Privacy Inference Attack via Search Engines”. The purpose of this paper is to provide a user‐side automatic detection service that detects privacy leakage before personal information is published.

Design/methodology/approach

The authors propose a user‐side automatic detection service. Within this service, they construct a user information correlation (UICA) graph to model the associations between pieces of user information returned by search engines. The privacy inference attack is mapped to the decision problem of finding a privacy inferring path with maximal probability in the UICA graph, and this problem is proved to be nondeterministic polynomial time (NP)‐complete by a two‐step reduction. A Privacy Leakage Detection Probability (PLD‐Probability) algorithm is proposed to find the privacy inferring path: it combines two significant factors that influence the vertices' probabilities in the UICA graph and uses a greedy algorithm to find the path.
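
As a rough illustration of what a greedy search for a high‐probability path in a probability‐weighted graph can look like, the following minimal Python sketch may help. It is not the authors' PLD‐Probability algorithm: the abstract does not state the two factors it combines or how vertex probabilities are computed. The function name `greedy_inference_path`, the edge‐probability representation, and the toy graph are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: each vertex maps to (neighbor, probability) pairs.
# The probabilities below are invented for illustration, not taken from the paper.
Graph = Dict[str, List[Tuple[str, float]]]


def greedy_inference_path(
    graph: Graph, start: str, target: str
) -> Optional[Tuple[List[str], float]]:
    """Greedily follow the most probable unvisited edge from start to target.

    Returns (path, product of edge probabilities), or None on a dead end.
    This is a heuristic and is not guaranteed to find the maximal path.
    """
    path = [start]
    probability = 1.0
    visited = {start}
    current = start
    while current != target:
        # Only consider neighbors that have not been visited, to avoid cycles.
        candidates = [
            (nbr, p) for nbr, p in graph.get(current, []) if nbr not in visited
        ]
        if not candidates:
            return None  # dead end: the greedy walk cannot reach the target
        nxt, p = max(candidates, key=lambda edge: edge[1])
        path.append(nxt)
        visited.add(nxt)
        probability *= p
        current = nxt
    return path, probability


# Toy graph (assumed data): public attributes leading to a private one.
toy_graph: Graph = {
    "name": [("employer", 0.9), ("hometown", 0.4)],
    "employer": [("office_address", 0.8)],
    "office_address": [("home_address", 0.5)],
    "hometown": [("home_address", 0.7)],
}

result = greedy_inference_path(toy_graph, "name", "home_address")
if result is not None:
    found_path, found_probability = result
    print(" -> ".join(found_path), round(found_probability, 4))
```

A greedy walk commits to the locally best edge and never backtracks, so it can miss the globally most probable path or stop at a dead end. That trade‐off is the usual reason to prefer a heuristic when the exact problem is NP‐complete.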

Findings

The authors show that privacy inference attacks via search engines are a serious problem in real life. A user‐side automatic detection service is proposed to detect the risk of privacy inference. The authors conduct three kinds of experiments to evaluate the seriousness of the privacy leakage problem and the performance of the proposed methods. The results show that the algorithm for the service is reasonable and effective.

Originality/value

The paper introduces a new family of privacy attacks on the Web, privacy inference attacks via search engines, and presents a privacy inferring model that describes the process and principles of such attacks. A user‐side automatic detection service is proposed to detect privacy inference before personal information is published. For this service, the authors propose a Privacy Leakage Detection Probability (PLD‐Probability) algorithm. Extensive experiments show that these methods are reasonable and effective.

Citation

Ma, R., Meng, X. and Wang, Z. (2012), "Preserving privacy on the searchable internet", International Journal of Web Information Systems, Vol. 8 No. 3, pp. 322-344. https://doi.org/10.1108/17440081211258196

Publisher: Emerald Group Publishing Limited

Copyright © 2012, Emerald Group Publishing Limited
