Search results

Article

Publication date: 1 January 1981

A MODEL OF A DOCUMENT RETRIEVAL SYSTEM BASED ON THE CONCEPT OF A SEMANTIC DISJUNCTIVE NORMAL FORM

Abstract

A new method of document retrieval is presented on the basis of fundamental fuzzy set theory operations and the notion of a semantic disjunctive normal form. Concepts of semantic normal forms are defined, i.e. the semantic disjunctive normal form and the semantic conjunctive normal form, and their elementary properties are presented. The syntax and the semantics of the proposed document retrieval language are given and an algorithm for allocating documents to particular queries is described. The document retrieval strategy based on the concept of a semantic disjunctive normal form is exemplified. A basic advantage of using fuzzy set theory to describe the document retrieval system is that it takes into consideration, in a simple way, the differentiation of descriptor importance, document search patterns and the differentiation of formal relevance grades of individual documents to a given query. In an information system the documents of the highest grades of formal relevance to a given query are retrieved by applying simple operations of fuzzy set theory.

Details

Kybernetes, vol. 10 no. 1

Type: Research Article

ISSN: 0368-492X

Article

Publication date: 1 January 1982

ON A PROBABILISTIC APPROACH TO DETERMINING THE SIMILARITY BETWEEN BOOLEAN SEARCH REQUEST FORMULATIONS

TADEUSZ RADECKI

Abstract

A new and promising approach to document clustering consists of utilizing previously formed clusters of queries to cluster documents. To employ this approach in practice a similarity measure for queries must be available. This requirement does not cause any problem in the case of information retrieval systems in which both the search request formulations and document representations are sets of weighted or unweighted index terms. However, in most operational retrieval systems search request formulations are Boolean combinations of index terms. Research into similarity measures for search request formulations of this type has already been undertaken by the author and reported elsewhere. The present paper provides further results of investigations in this area. The novelty of the approach discussed is the incorporation within the methodology described earlier of a weighting mechanism to indicate the relative importance of particular attributes of a given Boolean search request formulation. A modification suggested is based on the standard probabilistic approach to information retrieval.

Details

Journal of Documentation, vol. 38 no. 1

Type: Research Article

ISSN: 0022-0418

Article

Publication date: 1 March 1982

REDUCING THE PERILS OF MERGING BOOLEAN AND WEIGHTED RETRIEVAL SYSTEMS

TADEUSZ RADECKI

Abstract

A need for developing an information retrieval technique maintaining the appeal of Boolean retrieval schemes and in addition providing the advantages of a ranked search output has been pointed out in the literature for many years. However, a previous attempt to incorporate into the Boolean retrieval schemes a weighting mechanism to produce ranked lists of documents has not been fully successful. Specifically, further research has demonstrated that the theory behind the previous approach is characterized by disturbing ambiguities and inconsistencies, with equivalent Boolean search request formulations yielding different rankings of documents retrieved. As a result of this more recent research an alternative approach has been outlined. However, a closer analysis of this second approach reveals that it is also not free from some intrinsic weaknesses. The present paper provides the results of this new analysis and suggests a more rigorous methodology.

Details

Journal of Documentation, vol. 38 no. 3

Type: Research Article

ISSN: 0022-0418

Article

Publication date: 1 February 1993

ONLINE SEARCH INTERFACE DESIGN

BRIAN VICKERY and ALINA VICKERY

Abstract

There is a huge amount of information and data stored in publicly available online databases that consist of large text files accessed by Boolean search techniques. It is widely held that less use is made of these databases than could or should be the case, and that one reason for this is that potential users find it difficult to identify which databases to search, to use the various command languages of the hosts and to construct the Boolean search statements required. This reasoning has stimulated a considerable amount of exploration and development work on the construction of search interfaces, to aid the inexperienced user to gain effective access to these databases. The aim of our paper is to review aspects of the design of such interfaces: to indicate the requirements that must be met if maximum aid is to be offered to the inexperienced searcher; to spell out the knowledge that must be incorporated in an interface if such aid is to be given; to describe some of the solutions that have been implemented in experimental and operational interfaces; and to discuss some of the problems encountered. The paper closes with an extensive bibliography of references relevant to online search aids, going well beyond the items explicitly mentioned in the text. An index to software appears after the bibliography at the end of the paper.

Details

Journal of Documentation, vol. 49 no. 2

Type: Research Article

ISSN: 0022-0418

Article

Publication date: 1 January 1979

MATHEMATICS AND INFORMATION RETRIEVAL

GERARD SALTON

Abstract

The development of a given discipline in science and technology often depends on the availability of theories capable of describing the processes which control the field and of modelling the interactions between these processes. The absence of an accepted theory of information retrieval has been blamed for the relative disorder and the lack of technical advances in the area. The main mathematical approaches to information retrieval are examined in this study, including both algebraic and probabilistic models, and the difficulties which impede the formalization of information retrieval processes are described. A number of developments are covered where new theoretical understandings have directly led to the improvement of retrieval techniques and operations.

Details

Journal of Documentation, vol. 35 no. 1

Type: Research Article

ISSN: 0022-0418

Article

Publication date: 1 February 1990

A MODEL OF KNOWLEDGE BASED INFORMATION RETRIEVAL WITH HIERARCHICAL CONCEPT GRAPH

YOUNG WHAN KIM and JIN H. KIM

Abstract

This paper discusses a knowledge based information retrieval model with a hierarchical thesaurus. The model computes the conceptual distance between a query and an object, both of which are indexed with weighted terms from a hierarchical thesaurus. The hierarchical thesaurus is represented by a hierarchical‐concept graph (HCG) in which nodes represent concepts and directed edges represent generalisation relationships. Rada et al. have developed a similar model. However, their model considered only a binary indexing scheme and revealed some counter‐intuitive results. Our proposed model extends theirs by allowing the index terms and the edges of the HCG to be weighted. A new concept mapping method is devised to overcome Rada's counter‐intuitive results. In addition, a scheme for allowing Boolean operators in user queries is provided with a formula for computing conceptual distance from negated index terms. Experimental results have shown that our model simulates human performance more closely than Rada's model.

Details

Journal of Documentation, vol. 46 no. 2

Type: Research Article

ISSN: 0022-0418

Article

Publication date: 1 February 1983

OUTLINE OF A GENERAL PROBABILISTIC RETRIEVAL MODEL

ABRAHAM BOOKSTEIN

Abstract

For reasons of technical convenience, current retrieval algorithms based on probabilistic reasoning are derived from models that assume patrons evaluate documents using a two value relevance scale. This paper extends the theory by describing a model which includes a more general relevance scale. This model permits a re‐examination of the earlier theory as a special case of that developed here and leads to a more satisfying interpretation of the ranking principle of the earlier models.

Details

Journal of Documentation, vol. 39 no. 2

Type: Research Article

ISSN: 0022-0418

Article

Publication date: 1 March 1986

INSTRUCT: a teaching package for experimental methods in information retrieval. Part I. The users' view

Ian G Hendry, Peter Willett and Frances E. Wood

Abstract

This paper describes INSTRUCT, an interactive computer program which has been developed as a teaching aid for use within schools of librarianship and information science. The program demonstrates some of the techniques that have been suggested for implementing document retrieval systems in the future, and currently runs on a search file that comprises 6,004 documents from the Library and Information Science Abstracts database. INSTRUCT has facilities for natural language query processing, including the use of a stop‐word list, a stemming algorithm and a fuzzy‐matching routine that allows the automatic identification of a range of word variants; the provision of ranked output using automatic term weighting and a nearest‐neighbour searching procedure; and automatic relevance feedback using probabilistic relevance weights. The program is menu‐driven and can be used by searchers with little or no user training.

Details

Program, vol. 20 no. 3

Type: Research Article

ISSN: 0033-0337

Article

Publication date: 1 February 1978

RANKING IN PRINCIPLE

S.E. ROBERTSON and N.J. BELKIN

Abstract

It is often suggested that information retrieval systems should rank documents rather than simply retrieving a set. Two separate reasons are adduced for this: that relevance itself is a multi‐valued or continuous variable; and that retrieval is an essentially approximate process. These two reasons lead to different ranking principles, one according to degree of relevance, the other according to probability of relevance. This paper explores the possibility of combining the two principles, but concludes that neither is adequate alone and that no single all‐embracing ranking principle can be constructed to replace the two. The only general solution to the problem would be to find an optimal ranking by exploring the effect on the user of every possible ranking. However, some more practical approximate solutions appear possible.

Details

Journal of Documentation, vol. 34 no. 2

Type: Research Article

ISSN: 0022-0418

Article

Publication date: 1 April 1980

Informative Futures: Computer Simulation for Library Management

Pauline A. Oswitch

Abstract

The megadimensional nature of the complex social systems of the twentieth century, and the increasing levels of interrelatedness, present the individual with a bewildering array of information sources and services.

Details

Library Management, vol. 1 no. 4

Type: Research Article

ISSN: 0143-5124
