Haihua Chen, Jeonghyun (Annie) Kim, Jiangping Chen and Aisa Sakata
Abstract
Purpose
This study aims to explore the applications of natural language processing (NLP) and data analytics in understanding large-scale digital collections in oral history archives.
Design/methodology/approach
NLP and data analytics were used to analyse the oral interview transcripts of 904 survivors of the Japanese American incarceration camps collected from Densho Digital Repository, relying specifically on descriptive analysis, keyword extraction, topic modelling and sentiment analysis (SA).
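As a rough illustration of the keyword-extraction step, the sketch below ranks terms in each transcript by TF-IDF. The abstract does not state which extraction method the study used, so TF-IDF is an assumption swapped in here; the function names and the three toy sentences are hypothetical and are not taken from the Densho transcripts.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def top_keywords(docs, n=3):
    """Return the n highest-scoring TF-IDF terms for each document."""
    tokenized = [tokenize(d) for d in docs]
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    results = []
    for tokens in tokenized:
        tf = Counter(tokens)
        scores = {
            term: (count / len(tokens)) * math.log(len(docs) / df[term])
            for term, count in tf.items()
        }
        # Sort by descending score; break ties alphabetically.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:n])
    return results


# Invented placeholder sentences, for illustration only.
toy_docs = [
    "we lived in the camp and the camp was cold",
    "we worked on the farm before the war",
    "we moved back after the war",
]
keywords = top_keywords(toy_docs, n=1)
print(keywords[0])
```

Terms that occur in every document (here "we" and "the") score zero, so the ranking favours words that distinguish one transcript from the others.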
Findings
The analysis identified multiple geographic areas with large residential communities of ethnic Japanese people, as well as the place names of the concentration camps. The keywords and topics extracted reflect the deplorable conditions and militaristic nature of the camps and the forced labour of the internees. When remembering history, the main focus for the narrators remains the redress and reparation movement to obtain the restitution of their civil rights. SA further found that the forcible removal and incarceration of Japanese Americans during the Second World War negatively impacted and brought deep trauma to the narrators.
Originality/value
This case study demonstrated how NLP and data analytics could be applied to analyse oral history archives and open avenues for discovery. Archival researchers and the general public may benefit from this type of analysis in making connections between temporal, spatial and emotional elements, which will contribute to a holistic understanding of individuals and communities in terms of their collective memory.
Mahdi Zeynali Tazehkandi and Mohsen Nowkarizi
Abstract
Purpose
The purpose of this paper is to present a review on the use of the recall metric for evaluating information retrieval systems, especially search engines.
Design/methodology/approach
This paper investigates different researchers’ views about recall metrics.
Findings
Five different definitions for recall were identified. For the first group, recall refers to completeness, but it does not specify where all the relevant documents are located. For the second group, recall refers to retrieving all the relevant documents from the collection. However, it seems that the term “collection” is ambiguous. For the third group (first approach), collection means the index of search engines and, for the fourth group (second approach), collection refers to the Web. For the fifth group (third approach), ranking of the retrieved documents should also be accounted for in calculating recall.
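The effect of the choice of "collection" on the recall value can be seen in a small worked example. The sketch below is illustrative only: the document identifiers, the index contents and the use of recall at a rank cutoff k as a rank-aware variant are all assumptions, not definitions taken from the reviewed literature.

```python
def recall(retrieved, relevant):
    """Fraction of the relevant documents that appear among the retrieved ones."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(retrieved) & relevant) / len(relevant)


def recall_at_k(ranked, relevant, k):
    """Recall computed over only the top-k ranked results (assumed rank-aware variant)."""
    return recall(ranked[:k], relevant)


# Hypothetical data: five relevant documents exist on the Web,
# but the engine's crawler has indexed only three of them.
relevant_on_web = {"d1", "d2", "d3", "d4", "d5"}
engine_index = {"d1", "d2", "d3", "d7", "d8"}
ranked_results = ["d7", "d1", "d2", "d8"]

relevant_in_index = relevant_on_web & engine_index

# Collection = the engine's index: only the retrieval algorithm is judged.
recall_index = recall(ranked_results, relevant_in_index)
# Collection = the Web: the crawler's coverage is judged as well.
recall_web = recall(ranked_results, relevant_on_web)
# Rank cutoff applied: the ordering of results now matters too.
recall_top2 = recall_at_k(ranked_results, relevant_on_web, 2)

print(recall_index, recall_web, recall_top2)
```

The same result list scores 2/3 against the index, 2/5 against the Web and 1/5 at a cutoff of two, so a reported recall figure is only interpretable once the collection and any rank cutoff are stated.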
Practical implications
The three approaches evaluate different components of a search engine: the first evaluates the retrieval algorithm alone; the second, the retrieval algorithm and the crawler; and the third, the retrieval algorithm, the crawler and the ranker. To determine how effective search engines are for their users, the third approach to measuring recall is preferable.
Originality/value
This paper collects, identifies and analyses the literature on the use of recall. In addition, it identifies the different views that researchers hold about recall.