Search results | Emerald Insight

Article

Publication date: 11 September 2007

In memoriam: Karen Spärck Jones

Peter Willett and Stephen Robertson

Article

Publication date: 1 February 1972

THE SHANNON MODEL OF IR SYSTEMS

B.C. BROOKES

Abstract

This note was evoked by the reference by Karen Sparck Jones to a paper by Zunde and Slamecka which has recently been reprinted in Introduction to Information Science, edited by Saracevic. Zunde and Slamecka purport to show that, for optimum performance of IR systems, the frequency distribution of descriptor terms should conform with a geometric progression. This result is at variance with the widely accepted result derived from the Shannon model which shows that optimum performance of an IR system occurs when the descriptor terms are equi‐probable, i.e. when their frequency distribution is uniform. The uncertainty arising from these two different solutions to the same problem clearly led Karen Sparck Jones to have some reservations about the theoretical justification for her interesting idea of weighting search terms to give them, in effect, the equal weights that the usual Shannon result demands for optimum performance. But Sparck Jones need have no such reservations. The result obtained by Zunde and Slamecka, though plausible because it has some fortuitous semblance to the distributions of terms found in real systems, is in fact erroneous.

Details

Journal of Documentation, vol. 28 no. 2

Type: Research Article

ISSN: 0022-0418

Article

Publication date: 1 April 1974

AUTOMATIC INDEXING

KAREN SPARCK JONES

Abstract

This article reviews the state of the art in automatic indexing, that is, automatic techniques for analysing and characterising documents, for manipulating their descriptions in searching, and for generating the index language used for these purposes. It concentrates on the literature from 1968 to 1973. Section I defines the topic and its context. Sections II and III consider work in syntax and semantics respectively in detail. Section IV comments on ‘indirect’ indexing. Section V briefly surveys operating mechanized systems. In Section VI major experiments in automatic indexing are reviewed, and Section VII attempts an overall conclusion on the current state of automatic indexing techniques.

Details

Journal of Documentation, vol. 30 no. 4

Type: Research Article

ISSN: 0022-0418

Article

Publication date: 1 October 2005

A substantive theory of classification for information retrieval

Birger Hjørland and Karsten Nissen Pedersen

Abstract

Purpose

To suggest that a theory of classification for information retrieval (IR), asked for by Spärck Jones in a 1970 paper, presupposes a full implementation of a pragmatic understanding. Part of the Journal of Documentation celebration, “60 years of the best in information research”.

Design/methodology/approach

Literature‐based conceptual analysis, taking Spärck Jones as its starting‐point. Analysis involves distinctions between “positivism” and “pragmatism” and “classical” versus Kuhnian understandings of concepts.

Findings

Classification, both manual and automatic, for retrieval benefits from drawing upon a combination of qualitative and quantitative techniques, a consideration of theories of meaning, and the adding of top‐down approaches to IR in which divisions of labour, domains, traditions, genres, document architectures etc. are included as analytical elements and in which specific IR algorithms are based on the examination of specific literatures. Introduces an example illustrating the consequences of a full implementation of a pragmatist understanding when handling homonyms.

Practical implications

Outlines how to classify from a pragmatic‐philosophical point of view.

Originality/value

Provides, emphasizing a pragmatic understanding, insights of importance to classification for retrieval, both manual and automatic.

Details

Journal of Documentation, vol. 61 no. 5

Type: Research Article

ISSN: 0022-0418

Article

Publication date: 1 October 2005

Erratum

Abstract

This article has been withdrawn as it was published elsewhere and accidentally duplicated. The original article can be seen here: 10.1108/eb026488. When citing the article, please cite: KAREN SPARCK JONES, (1970), “SOME THOUGHTS ON CLASSIFICATION FOR RETRIEVAL”, Journal of Documentation, Vol. 26 Iss: 2, pp. 89 - 101.

Details

Journal of Documentation, vol. 61 no. 5

Type: Research Article

ISSN: 0022-0418

Article

Publication date: 1 January 1979

SEARCH TERM RELEVANCE WEIGHTING GIVEN LITTLE RELEVANCE INFORMATION

KAREN SPARCK JONES

Abstract

Previous experiments demonstrated the value of relevance weighting for search terms, but relied on substantial relevance information for the terms. The present experiments were designed to study the effects of weights based on very limited relevance information, for example supplied by one or two relevant documents. The tests simulated iterative searching, as in an on‐line system, and show that even very little relevance information can be of considerable value.

Details

Journal of Documentation, vol. 35 no. 1

Type: Research Article

ISSN: 0022-0418

Article

Publication date: 1 October 2005

Revisiting classification for retrieval

Karen Spärck Jones

Abstract

Purpose

This short note seeks to respond to Hjørland and Pedersen's paper “A substantive theory of classification for information retrieval”, which starts from Spärck Jones's “Some thoughts on classification for retrieval”, originally published in 1970.

Design/methodology/approach

The note comments on the context in which the 1970 paper was written, and on Hjørland and Pedersen's views, emphasising the need for well‐grounded classification theory and application.

Findings

The note maintains that text‐based, a posteriori, classification, as increasingly found in applications, is likely to be more useful, in general, than a priori classification.

Originality/value

The note elaborates on points made in a well‐received earlier paper.

Details

Journal of Documentation, vol. 61 no. 5

Type: Research Article

ISSN: 0022-0418

Keywords

Information retrieval, Classification

Article

Publication date: 1 April 1975

A PERFORMANCE YARDSTICK FOR TEST COLLECTIONS

KAREN SPARCK JONES

Abstract

It would be very helpful in retrieval experiments if good retrieval performance for a test collection was known, so that performance for particular devices could be fully evaluated. This paper presents one performance yardstick, based on optimally weighted request terms, and illustrates its application to different test collections.

Details

Journal of Documentation, vol. 31 no. 4

Type: Research Article

ISSN: 0022-0418

Article

Publication date: 1 November 2006

JDoc60 series: information research over six decades

David Bawden

Article

Publication date: 1 March 1978

AN EVALUATION OF FEEDBACK IN DOCUMENT RETRIEVAL USING CO‐OCCURRENCE DATA

D.J. HARPER and C.J. VAN RIJSBERGEN

Abstract

This paper reports experiments with a term weighting model incorporating relevance information in which it is assumed that index terms are distributed dependently. Initially this model was tested with complete relevance information against a similar model which assumes index terms are distributed independently. The experiments demonstrated conclusively that index terms are not independent for a number of diverse document collections. It was concluded that the use of relevance information together with dependence information could potentially improve retrieval effectiveness. As a result of further experiments the initial strict dependence model was modified and in particular a new relevance‐based term weight was developed. This modified dependence model was then used as the basis for relevance feedback, i.e. with partial relevance information only, and significant increases in retrieval effectiveness were achieved. The evaluation method used in the feedback experiments emphasized the effect of the feedback on documents which the potential user would not previously have seen. Finally the incorporation of relevance feedback in an operational system is considered and in particular it is argued that if high recall searches are required, relevance feedback based on the modified dependence model may be superior to the widely used Boolean search.

Details

Journal of Documentation, vol. 34 no. 3

Type: Research Article

ISSN: 0022-0418
