Search results

1 – 4 of 4
Article
Publication date: 16 December 2019

Chihli Hung and You-Xin Cao

Abstract

Purpose

This paper proposes a novel approach that integrates collocations and domain concepts for Chinese cosmetic word of mouth (WOM) sentiment classification. Most sentiment analysis works by collecting sentiment scores from each unigram or bigram. However, not every unigram or bigram in a WOM document carries sentiment. Chinese collocations capture the main sentiments of WOM. This paper reduces the complexity of document dimensionality and improves sentiment classification.

Design/methodology/approach

This paper builds two contextual lexicons, one for feature words and one for sentiment words. Based on these contextual lexicons, this paper uses association rules and mutual information to build candidate Chinese collocation sets. This paper applies preference vector modelling as the vector representation approach to capture the relationship between Chinese collocations and their associated concepts.
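As an illustration of how mutual information can rank candidate collocations, the following minimal sketch scores (feature word, sentiment word) pairs by pointwise mutual information over tokenised documents. It is not the authors' implementation: the function name `pmi_scores`, the document-level co-occurrence window, and the toy English tokens are all assumptions made for this example.

```python
import math
from collections import Counter
from itertools import product


def pmi_scores(docs, feature_words, sentiment_words):
    """Score (feature, sentiment) pairs by document-level pointwise mutual information.

    docs: list of token lists; feature_words / sentiment_words: sets of tokens.
    All names here are hypothetical and chosen for illustration only.
    """
    n = len(docs)
    word_df = Counter()  # number of documents containing each word
    pair_df = Counter()  # number of documents containing both words of a pair
    for tokens in docs:
        present = set(tokens)
        word_df.update(present)
        feats = present & feature_words
        sents = present & sentiment_words
        pair_df.update(product(feats, sents))

    scores = {}
    for (feat, sent), joint in pair_df.items():
        # PMI = log2( P(f, s) / (P(f) * P(s)) ), with probabilities
        # estimated as document frequencies divided by n.
        p_joint = joint / n
        p_feat = word_df[feat] / n
        p_sent = word_df[sent] / n
        scores[(feat, sent)] = math.log2(p_joint / (p_feat * p_sent))
    return scores


# Toy data (assumed): each inner list is one tokenised review.
docs = [
    ["skin", "smooth"],
    ["skin", "smooth"],
    ["price", "high"],
    ["skin", "high"],
]
scores = pmi_scores(docs, {"skin", "price"}, {"smooth", "high"})
for pair, value in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(pair, round(value, 3))
```

Pairs with a positive score co-occur more often than chance would predict and are plausible collocation candidates; a threshold on the score (not specified in the abstract) would decide which pairs to keep.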

Findings

This paper compares the proposed preference vector models with benchmarks, using three classification techniques (i.e. support vector machine, J48 decision tree and multilayer perceptron). According to the experimental results, the proposed models outperform all benchmarks evaluated by the criterion of accuracy.

Originality/value

This paper focuses on Chinese collocations and proposes a novel research approach for sentiment classification. The Chinese collocations used in this paper are adaptable to the content and domains. Finally, this paper integrates collocations with the preference vector modelling approach, which not only achieves a better sentiment classification performance for Chinese WOM documents but also avoids the curse of dimensionality.

Details

The Electronic Library, vol. 38 no. 1
Type: Research Article
ISSN: 0264-0473

Article
Publication date: 29 July 2014

Chih-Fong Tsai and Chihli Hung

Abstract

Purpose

Credit scoring is important for financial institutions in order to accurately predict the likelihood of business failure. Related studies have shown that machine learning techniques, such as neural networks, outperform many statistical approaches to solving this type of problem, and advanced machine learning techniques, such as classifier ensembles and hybrid classifiers, provide better prediction performance than single machine learning based classification techniques. However, it is not known which type of advanced classification technique performs better in terms of financial distress prediction. The paper aims to discuss these issues.

Design/methodology/approach

This paper compares neural network ensembles and hybrid neural networks over three benchmarking credit scoring related data sets, which are Australian, German, and Japanese data sets.

Findings

The experimental results show that hybrid neural networks and neural network ensembles outperform the single neural network. Although hybrid neural networks perform slightly better than neural network ensembles in terms of prediction accuracy and errors with two of the data sets, there is no significant difference between the two types of prediction models.

Originality/value

The originality of this paper is in comparing two types of advanced classification techniques, i.e. hybrid and ensemble learning techniques, in terms of financial distress prediction.

Details

Kybernetes, vol. 43 no. 7
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 23 November 2012

Chihli Hung, Chih‐Fong Tsai, Shin‐Yuan Hung and Chang‐Jiang Ku

Abstract

Purpose

A grid information retrieval model has benefits for sharing resources and processing mass information, but cannot handle conceptual heterogeneity without integration of semantic information. The purpose of this research is to propose a concept‐based retrieval mechanism to catch the user's query intentions in a grid environment. This research re‐ranks documents over distributed data sources and evaluates performance based on the user judgment and processing time.

Design/methodology/approach

This research uses the ontology lookup service to build the concept set in the ontology and captures the user's query intentions as a means of query expansion for searching. The Globus toolkit is used to implement the grid service. The modification of the collection retrieval inference (CORI) algorithm is used for re‐ranking documents over distributed data sources.
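To make the re‐ranking step concrete, the sketch below shows the commonly cited, unmodified CORI result‐merging heuristic, in which each document score is min‐max normalised within its source and then boosted by the normalised score of the collection it came from. The abstract does not describe the authors' modification, so this is not their method; the function names, the input layout, and the toy scores are assumptions, and the 0.4 weight follows the usual presentation of the heuristic.

```python
def min_max(value, low, high):
    """Min-max normalise value into [0, 1]; return 0.0 when the range is empty."""
    return 0.0 if high == low else (value - low) / (high - low)


def cori_merge(results, collection_scores):
    """Merge per-collection rankings into one globally ranked list.

    results: {collection: [(doc_id, score), ...]}
    collection_scores: {collection: collection-selection score}
    Both structures are hypothetical layouts chosen for this example.
    """
    c_low = min(collection_scores.values())
    c_high = max(collection_scores.values())
    merged = []
    for collection, docs in results.items():
        if not docs:
            continue
        c_norm = min_max(collection_scores[collection], c_low, c_high)
        d_low = min(score for _, score in docs)
        d_high = max(score for _, score in docs)
        for doc_id, score in docs:
            d_norm = min_max(score, d_low, d_high)
            # Standard CORI merge: (D' + 0.4 * D' * C') / 1.4
            merged.append((doc_id, (d_norm + 0.4 * d_norm * c_norm) / 1.4))
    # Highest global score first; ties keep their original order.
    return sorted(merged, key=lambda pair: pair[1], reverse=True)


# Toy data (assumed): two grid nodes, each returning locally scored documents.
results = {
    "node_a": [("a1", 10.0), ("a2", 5.0)],
    "node_b": [("b1", 3.0), ("b2", 1.0)],
}
collection_scores = {"node_a": 0.9, "node_b": 0.3}
for doc_id, score in cori_merge(results, collection_scores):
    print(doc_id, round(score, 3))
```

Because local scores from different sources are not directly comparable, the normalisation step is what allows a single global ranking to be displayed to the user.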

Findings

The experiments demonstrate that the proposed approach successfully describes the user's query intentions, as evaluated by user judgment. In terms of processing time, building a grid information retrieval model is a suitable strategy for the ontology‐based retrieval model.

Originality/value

Most current semantic grid models focus on construction of the semantic grid, and do not consider re‐ranking search results from distributed data sources. The significance of evaluation from the user's viewpoint is also ignored. This research proposes a method that captures the user's query intentions and re‐ranks documents in a grid based on the CORI algorithm. This proposed ontology‐based retrieval mechanism calculates the global relevance score of all documents in a grid and displays those documents with higher relevance to users.

Details

Online Information Review, vol. 36 no. 6
Type: Research Article
ISSN: 1468-4527

Article
Publication date: 11 April 2008

Chihli Hung and Stefan Wermter

Abstract

Purpose

The purpose of this paper is to examine neural document clustering techniques, e.g. self‐organising map (SOM) or growing neural gas (GNG), which usually assume that textual information is stationary in quantity.

Design/methodology/approach

The authors propose a novel dynamic adaptive self‐organising hybrid (DASH) model, which adapts not only its neural topological structure but also its main parameters to time‐event news collections in a non‐stationary environment. Based on the features of a time‐event news collection in a non‐stationary environment, they review the main current neural clustering models. The main deficiency of these models is the need to pre‐define the thresholds for unit‐growing and unit‐pruning. Thus, the DASH model is designed for a non‐stationary environment.

Findings

The paper compares DASH with SOM and GNG based on an artificial jumping corner data set and a real world Reuters news collection. According to the experimental results, the DASH model is more effective than SOM and GNG for time‐event document clustering.

Practical implications

A real world environment is dynamic. This paper provides an approach to present news clustering in a non‐stationary environment.

Originality/value

Text clustering in a non‐stationary environment is a novel concept. The paper demonstrates DASH, which can deal with a real world data set in a non‐stationary environment.

Details

The Electronic Library, vol. 26 no. 2
Type: Research Article
ISSN: 0264-0473
