Applications of n‐grams in textual information systems
Abstract
This paper provides an introduction to the use of n‐grams in textual information systems, where an n‐gram is a string of n, usually adjacent, characters extracted from a section of continuous text. Applications that can be implemented efficiently and effectively using sets of n‐grams include spelling error detection and correction, query expansion, information retrieval with serial, inverted and signature files, dictionary look‐up, text compression, and language identification.
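As an illustration of the definition above, the following minimal Python sketch extracts adjacent character n‐grams from a string and compares two strings by the n‐grams they share. The function names `char_ngrams` and `dice_similarity` are hypothetical, and the Dice coefficient is one common set-overlap measure chosen here as an assumption; the abstract does not say which similarity measure the paper uses.

```python
def char_ngrams(text: str, n: int = 2) -> list[str]:
    """Return every run of n adjacent characters in text, in order."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # A string of length L yields L - n + 1 n-grams (none if L < n).
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def dice_similarity(a: str, b: str, n: int = 2) -> float:
    """Dice coefficient over the sets of distinct n-grams in a and b.

    Assumption: Dice is used purely for illustration; other overlap
    measures could be substituted.
    """
    grams_a = set(char_ngrams(a, n))
    grams_b = set(char_ngrams(b, n))
    if not grams_a and not grams_b:
        return 0.0  # neither string is long enough to contain an n-gram
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


print(char_ngrams("text", 2))  # ['te', 'ex', 'xt']
```

A misspelling such as "retreival" shares most of its bigrams with "retrieval" and few with an unrelated word, which is the kind of property that spelling correction and query expansion can exploit.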
Citation
Robertson, A.M. and Willett, P. (1998), "Applications of
Publisher
MCB UP Ltd
Copyright © 1998, MCB UP Limited