Search results

1 – 10 of 199
Article
Publication date: 11 September 2007

Peter Willett and Stephen Robertson

Details

Journal of Documentation, vol. 63 no. 5
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 April 1974

KAREN SPARCK JONES

Abstract

This article reviews the state of the art in automatic indexing, that is, automatic techniques for analysing and characterising documents, for manipulating their descriptions in searching, and for generating the index language used for these purposes. It concentrates on the literature from 1968 to 1973. Section I defines the topic and its context. Sections II and III consider work in syntax and semantics respectively in detail. Section IV comments on ‘indirect’ indexing. Section V briefly surveys operating mechanized systems. In Section VI major experiments in automatic indexing are reviewed, and Section VII attempts an overall conclusion on the current state of automatic indexing techniques.

Details

Journal of Documentation, vol. 30 no. 4
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 October 2005

Birger Hjørland and Karsten Nissen Pedersen

Abstract

Purpose

To suggest that a theory of classification for information retrieval (IR), asked for by Spärck Jones in a 1970 paper, presupposes a full implementation of a pragmatic understanding. Part of the Journal of Documentation celebration, “60 years of the best in information research”.

Design/methodology/approach

Literature‐based conceptual analysis, taking Spärck Jones as its starting‐point. Analysis involves distinctions between “positivism” and “pragmatism” and “classical” versus Kuhnian understandings of concepts.

Findings

Classification, both manual and automatic, for retrieval benefits from drawing upon a combination of qualitative and quantitative techniques, a consideration of theories of meaning, and the adding of top‐down approaches to IR in which divisions of labour, domains, traditions, genres, document architectures etc. are included as analytical elements and in which specific IR algorithms are based on the examination of specific literatures. Introduces an example illustrating the consequences of a full implementation of a pragmatist understanding when handling homonyms.

Practical implications

Outlines how to classify from a pragmatic‐philosophical point of view.

Originality/value

Provides, emphasizing a pragmatic understanding, insights of importance to classification for retrieval, both manual and automatic.

Details

Journal of Documentation, vol. 61 no. 5
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 February 1977

S.E. ROBERTSON

Abstract

This paper is concerned with recent work in the theory of information retrieval. More particularly, it is concerned with theories which tackle the problem of retrieval performance, in a sense which will be explained. The aim is not an exhaustive survey of such work; rather it is an analysis and synthesis of those contributions which I feel to be important or find interesting.

Details

Journal of Documentation, vol. 33 no. 2
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 April 1996

ALEXANDER M. ROBERTSON and PETER WILLETT

Abstract

This paper describes the development of a genetic algorithm (GA) for the assignment of weights to query terms in a ranked‐output document retrieval system. The GA involves a fitness function that is based on full relevance information, and the rankings resulting from the use of these weights are compared with the Robertson‐Sparck Jones F4 retrospective relevance weight. Extended experiments with seven document test collections show that the GA can often find weights that are slightly superior to those produced by the deterministic weighting scheme. That said, there are many cases where the two approaches give the same results, and a few cases where the F4 weights are superior to the GA weights. Since the GA has been designed to identify weights yielding the best possible level of retrospective performance, these results indicate that the F4 weights provide an excellent and practicable alternative. Evidence is presented to suggest that negative weights may play an important role in retrospective relevance weighting.

Details

Journal of Documentation, vol. 52 no. 4
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 January 1979

KAREN SPARCK JONES

Abstract

Previous experiments demonstrated the value of relevance weighting for search terms, but relied on substantial relevance information for the terms. The present experiments were designed to study the effects of weights based on very limited relevance information, for example supplied by one or two relevant documents. The tests simulated iterative searching, as in an on‐line system, and show that even very little relevance information can be of considerable value.

Details

Journal of Documentation, vol. 35 no. 1
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 March 1986

S.E. ROBERTSON

Abstract

A Bayesian argument is used to suggest modifications to the Robertson/Sparck Jones relevance weighting formula, to accommodate the addition to the query of terms taken from the relevant documents identified during the search.

Details

Journal of Documentation, vol. 42 no. 3
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 March 1978

D.J. HARPER and C.J. VAN RIJSBERGEN

Abstract

This paper reports experiments with a term weighting model incorporating relevance information in which it is assumed that index terms are distributed dependently. Initially this model was tested with complete relevance information against a similar model which assumes index terms are distributed independently. The experiments demonstrated conclusively that index terms are not independent for a number of diverse document collections. It was concluded that the use of relevance information together with dependence information could potentially improve retrieval effectiveness. As a result of further experiments the initial strict dependence model was modified and in particular a new relevance‐based term weight was developed. This modified dependence model was then used as the basis for relevance feedback, i.e. with partial relevance information only, and significant increases in retrieval effectiveness were achieved. The evaluation method used in the feedback experiments emphasized the effect of the feedback on documents which the potential user would not previously have seen. Finally the incorporation of relevance feedback in an operational system is considered and in particular it is argued that if high recall searches are required, relevance feedback based on the modified dependence model may be superior to the widely used Boolean search.

Details

Journal of Documentation, vol. 34 no. 3
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 March 1973

C.J. VAN RIJSBERGEN and K. SPARCK JONES

Abstract

Many retrieval experiments are intended to discover ways of improving performance, taking the results obtained with some particular technique as a baseline. The fact that substantial alterations to a system often have little or no effect on particular collections is puzzling. This may be due to the initially poor separation of relevant and non‐relevant documents. The paper presents a procedure for characterizing this separation for a collection, which can be used to show whether proposed modifications of the base system are likely to be useful.

Details

Journal of Documentation, vol. 29 no. 3
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 January 1992

DAVID ELLIS

Abstract

This paper explores the role of paradigms in information retrieval research. The nature of a paradigm is outlined and the fundamental sense of a paradigm as an exemplar is identified. The applicability of the paradigm concept to a multi‐disciplinary field such as information science is discussed and it is concluded that paradigms can be a legitimate feature of information science though they may not be connected with the development of normal science. The features of two paradigms operating in information retrieval research, (1) the physical paradigm and (2) the cognitive paradigm, are outlined, and their origins, nature and role examined. It is argued that although most work in information retrieval research takes place within the physical and cognitive paradigms, neither provides the basis for a powerful paradigm‐directed science. An explanation for the failure to develop a powerful body of theory articulated within a well‐developed paradigmatic framework is offered with reference to the inherent categorial duality of the field.

Details

Journal of Documentation, vol. 48 no. 1
Type: Research Article
ISSN: 0022-0418
