Search results

1 – 2 of 2
Per page
102050
Citations:
Loading...
Access Restricted. View access options
Article
Publication date: 25 February 2022

Souheila Ben Guirat, Ibrahim Bounhas and Yahya Slimani

The semantic relations between Arabic word representations were recognized and widely studied in theoretical studies in linguistics many centuries ago. Nonetheless, most of the…

147

Abstract

Purpose

The semantic relations between Arabic word representations were recognized and widely studied in theoretical studies in linguistics many centuries ago. Nonetheless, most of the previous research in automatic information retrieval (IR) focused on stem or root-based indexing, while lemmas and patterns are under-exploited. However, the authors believe that each of the four morphological levels encapsulates a part of the meaning of words. That is, the purpose is to aggregate these levels using more sophisticated approaches to reach the optimal combination which enhances IR.

Design/methodology/approach

The authors first compare the state-of-the art Arabic natural language processing (NLP) tools in IR. This allows to select the most accurate tool in each representation level i.e. developing four basic IR systems. Then, the authors compare two rank aggregation approaches which combine the results of these systems. The first approach is based on linear combination, while the second exploits classification-based meta-search.

Findings

Combining different word representation levels, consistently and significantly enhances IR results. The proposed classification-based approach outperforms linear combination and all the basic systems.

Research limitations/implications

The work stands by a standard experimental comparative study which assesses several NLP tools and combining approaches on different test collections and IR models. Thus, it may be helpful for future research works to choose the most suitable tools and develop more sophisticated methods for handling the complexity of Arabic language.

Originality/value

The originality of the idea is to consider that the richness of Arabic is an exploitable characteristic and no more a challenging limit. Thus, the authors combine 4 different morphological levels for the first time in Arabic IR. This approach widely overtook previous research results.

Peer review

The peer review history for this article is available at: https://publons.com/publon/10.1108/OIR-11-2020-0515

Details

Online Information Review, vol. 46 no. 7
Type: Research Article
ISSN: 1468-4527

Keywords

Available. Open Access. Open Access
Article
Publication date: 4 August 2020

Mohamed Boudchiche and Azzeddine Mazroui

We have developed in this paper a morphological disambiguation hybrid system for the Arabic language that identifies the stem, lemma and root of a given sentence words. Following…

1085

Abstract

We have developed in this paper a morphological disambiguation hybrid system for the Arabic language that identifies the stem, lemma and root of a given sentence words. Following an out-of-context analysis performed by the morphological analyser Alkhalil Morpho Sys, the system first identifies all the potential tags of each word of the sentence. Then, a disambiguation phase is carried out to choose for each word the right solution among those obtained during the first phase. This problem has been solved by equating the disambiguation issue with a surface optimization problem of spline functions. Tests have shown the interest of this approach and the superiority of its performances compared to those of the state of the art.

Details

Applied Computing and Informatics, vol. 20 no. 3/4
Type: Research Article
ISSN: 2634-1964

Keywords

1 – 2 of 2
Per page
102050