Shrawan Kumar Trivedi, Shubhamoy Dey and Anil Kumar
Abstract
Purpose
Sentiment analysis and opinion mining are emerging areas of research for analyzing Web data and capturing users’ sentiments. This research aims to present sentiment analysis of an Indian movie review corpus using natural language processing and various machine learning classifiers.
Design/methodology/approach
In this paper, a comparative study of three machine learning classifiers (Bayesian, naïve Bayesian and support vector machine [SVM]) was performed. All the classifiers were trained on words/features extracted from the corpus using five different feature selection algorithms (Chi-square, info-gain, gain ratio, one-R and relief-F [RF] attributes), and these feature selection approaches were likewise compared. The classifiers and feature selection approaches were evaluated using different metrics (F-value, false-positive [FP] rate and training time).
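As a rough illustration of this kind of evaluation (a sketch with stand-in components, not the paper's exact setup), the scikit-learn snippet below compares a naïve Bayes classifier and an SVM on features chosen by two selectors, chi-square and mutual information (an info-gain analogue), reporting F-value, FP rate and training time on synthetic data:

```python
# Minimal sketch of a feature-selection / classifier comparison of the kind
# described above. Chi-square and mutual information stand in for the five
# selectors; the dataset is synthetic, not the Indian movie review corpus.
import time
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, chi2, mutual_info_classif
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score, confusion_matrix

X, y = make_classification(n_samples=2000, n_features=500, n_informative=40,
                           random_state=0)
X = np.abs(X)  # chi2 and MultinomialNB require non-negative features

selectors = {"chi-square": chi2, "mutual-info": mutual_info_classif}
classifiers = {"naive Bayes": MultinomialNB(), "SVM": LinearSVC()}

for sel_name, score_fn in selectors.items():
    X_sel = SelectKBest(score_fn, k=100).fit_transform(X, y)
    X_tr, X_te, y_tr, y_te = train_test_split(X_sel, y, random_state=0)
    for clf_name, clf in classifiers.items():
        start = time.time()
        clf.fit(X_tr, y_tr)          # training time is one evaluation metric
        elapsed = time.time() - start
        pred = clf.predict(X_te)
        tn, fp, fn, tp = confusion_matrix(y_te, pred).ravel()
        print(f"{sel_name:12s} {clf_name:12s} "
              f"F1={f1_score(y_te, pred):.3f} "
              f"FP-rate={fp / (fp + tn):.3f} train={elapsed:.2f}s")
```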
Findings
The results of this study show that, for the maximum number of features, the RF feature selection approach was found to be the best, with better F-values, a low FP rate and less time needed to train the classifiers, whereas for the least number of features, one-R was better than RF. When the evaluation was performed for machine learning classifiers, SVM was found to be superior, although the Bayesian classifier was comparable with SVM.
Originality/value
This is a novel research where Indian review data were collected and then a classification model for sentiment polarity (positive/negative) was constructed.
Mohd Mustaqeem, Suhel Mustajab and Mahfooz Alam
Abstract
Purpose
Software defect prediction (SDP) is a critical aspect of software quality assurance, aiming to identify and manage potential defects in software systems. In this paper, we have proposed a novel hybrid approach that combines Grey Wolf Optimization with Feature Selection (GWOFS) and a multilayer perceptron (MLP) for SDP. The GWOFS-MLP hybrid model is designed to optimize feature selection, ultimately enhancing the accuracy and efficiency of SDP. Grey Wolf Optimization, inspired by the social hierarchy and hunting behavior of grey wolves, is employed to select a subset of relevant features from an extensive pool of potential predictors. This study investigates the key challenges that traditional SDP approaches encounter and proposes promising solutions to overcome time complexity and the curse of dimensionality.
Design/methodology/approach
The integration of GWOFS and MLP results in a robust hybrid model that can adapt to diverse software datasets. This feature selection process harnesses the cooperative hunting behavior of wolves, allowing for the exploration of critical feature combinations. The selected features are then fed into an MLP, a powerful artificial neural network (ANN) known for its capability to learn intricate patterns within software metrics. MLP serves as the predictive engine, utilizing the curated feature set to model and classify software defects accurately.
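A much-simplified sketch of the core idea, binary grey wolf optimisation wrapped around an MLP, is given below. It is illustrative only: the dataset, the sigmoid transfer function, the size penalty and all hyperparameters are assumptions, not the authors' GWOFS-MLP implementation.

```python
# Toy binary grey wolf optimiser for feature selection, with an MLP's
# cross-validated accuracy as the fitness signal. All settings illustrative.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)
n_feats, n_wolves, n_iters = X.shape[1], 6, 10

def fitness(mask):
    """CV accuracy of an MLP on the selected features, minus a size penalty."""
    if mask.sum() == 0:
        return 0.0
    clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=300, random_state=0)
    acc = cross_val_score(clf, X[:, mask], y, cv=3).mean()
    return acc - 0.01 * mask.mean()

def binarise(pos):
    """Sigmoid transfer function turns continuous positions into 0/1 masks."""
    return rng.random(pos.shape) < 1 / (1 + np.exp(-10 * (pos - 0.5)))

pos = rng.random((n_wolves, n_feats))            # wolves live in [0, 1]^n_feats
best_mask, best_score = None, -np.inf

for t in range(n_iters):
    masks = binarise(pos)
    scores = np.array([fitness(m) for m in masks])
    if scores.max() > best_score:
        best_score, best_mask = scores.max(), masks[scores.argmax()]
    leaders = pos[np.argsort(scores)[::-1][:3]]  # alpha, beta, delta wolves
    a = 2 - 2 * t / n_iters                      # decreases linearly 2 -> 0
    for i in range(n_wolves):
        moves = []
        for leader in leaders:                   # encircling each leader
            A = 2 * a * rng.random(n_feats) - a
            C = 2 * rng.random(n_feats)
            moves.append(leader - A * np.abs(C * leader - pos[i]))
        pos[i] = np.clip(np.mean(moves, axis=0), 0, 1)

print(f"best fitness {best_score:.3f} with {best_mask.sum()} of {n_feats} features")
```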
Findings
The performance evaluation of the GWOFS-MLP hybrid model on a real-world software defect dataset demonstrates its effectiveness. The model achieves a remarkable training accuracy of 97.69% and a testing accuracy of 97.99%. Additionally, the receiver operating characteristic area under the curve (ROC-AUC) score of 0.89 highlights the model’s ability to discriminate between defective and defect-free software components.
Originality/value
Experimental implementations using machine learning-based techniques with feature reduction are conducted to validate the proposed solutions. The goal is to enhance SDP’s accuracy, relevance and efficiency, ultimately improving software quality assurance processes. The confusion matrix further illustrates the model’s performance, with only a small number of false positives and false negatives.
Qingqing Li, Ziming Zeng, Shouqiang Sun and Tingting Li
Abstract
Purpose
Aspect category-based sentiment analysis (ACSA) has been widely used in consumer preference mining and marketing strategy formulation. However, existing studies ignore the variability in features and the intrinsic correlation among diverse aspect categories in ACSA tasks. To address these problems, this paper aims to propose a novel integrated framework.
Design/methodology/approach
The integrated framework consists of three modules: text feature extraction and fusion, adaptive feature selection and category-aware decision fusion. First, text features from global and local views are extracted and fused to comprehensively capture the potential information in the different dimensions of the review text. Then, an adaptive feature selection strategy is devised for each aspect category to determine the optimal feature set. Finally, considering the intrinsic associations between aspect categories, a category-aware decision fusion strategy is constructed to enhance the performance of ACSA tasks.
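A minimal sketch of the adaptive feature selection step is shown below: each aspect category receives its own feature-set size k, chosen by validation F1. The synthetic multi-label data, chi-square scoring and logistic regression classifier are placeholders, not the paper's framework, and the fusion modules are omitted.

```python
# Per-category adaptive feature selection: pick a separate k per aspect
# category by validation F1. Data and components are illustrative only.
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

X, Y = make_multilabel_classification(n_samples=1500, n_features=300,
                                      n_classes=5, random_state=0)
X_tr, X_va, Y_tr, Y_va = train_test_split(X, Y, random_state=0)

for cat in range(Y.shape[1]):            # one detector per aspect category
    best_k, best_f1 = None, -1.0
    for k in (25, 50, 100, 200):         # adaptive choice of feature count
        sel = SelectKBest(chi2, k=k).fit(X_tr, Y_tr[:, cat])
        clf = LogisticRegression(max_iter=1000).fit(sel.transform(X_tr),
                                                    Y_tr[:, cat])
        f1 = f1_score(Y_va[:, cat], clf.predict(sel.transform(X_va)))
        if f1 > best_f1:
            best_k, best_f1 = k, f1
    print(f"category {cat}: best k={best_k}, validation F1={best_f1:.3f}")
```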
Findings
Comparative experimental results demonstrate that the integrated framework can effectively detect aspect categories and their corresponding sentiment polarities from review texts, achieving a macroaveraged F1 score (Fmacro) of 72.38% and a weighted F1 score (F1) of 79.39%, with absolute gains of 2.93% to 27.36% and 4.35% to 20.36%, respectively, compared to the baselines.
Originality/value
This framework can simultaneously detect aspect categories and corresponding sentiment polarities from review texts, thereby assisting e-commerce enterprises in gaining insights into consumer preferences, prioritizing product improvements, and adjusting marketing strategies.
Faris Elghaish, Sandra Matarneh, Essam Abdellatef, Farzad Rahimian, M. Reza Hosseini and Ahmed Farouk Kineber
Abstract
Purpose
Cracks are prevalent signs of pavement distress found on highways globally. The use of artificial intelligence (AI) and deep learning (DL) for crack detection is increasingly considered as an optimal solution. Consequently, this paper introduces a novel, fully connected, optimised convolutional neural network (CNN) model using feature selection algorithms for the purpose of detecting cracks in highway pavements.
Design/methodology/approach
To enhance the accuracy of the CNN model for crack detection, the authors employed a fully connected deep learning layers CNN model along with several optimisation techniques. Specifically, three optimisation algorithms, namely adaptive moment estimation (ADAM), stochastic gradient descent with momentum (SGDM), and RMSProp, were utilised to fine-tune the CNN model and enhance its overall performance. Subsequently, the authors implemented eight feature selection algorithms to further improve the accuracy of the optimised CNN model. These feature selection techniques were thoughtfully selected and systematically applied to identify the most relevant features contributing to crack detection in the given dataset. Finally, the authors subjected the proposed model to testing against seven pre-trained models.
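The optimiser comparison can be illustrated with the hedged Keras sketch below; the toy architecture and random stand-in images are assumptions, not the paper's five-layer pavement-crack model.

```python
# Comparing ADAM, SGDM and RMSProp on a small CNN. Architecture and data
# (random stand-in "images") are illustrative only.
import numpy as np
import tensorflow as tf

X = np.random.rand(200, 64, 64, 3).astype("float32")  # stand-in images
y = np.random.randint(0, 2, 200)                       # crack / no-crack labels

def build_cnn():
    return tf.keras.Sequential([
        tf.keras.Input(shape=(64, 64, 3)),
        tf.keras.layers.Conv2D(16, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])

optimisers = {
    "ADAM": tf.keras.optimizers.Adam(),
    "SGDM": tf.keras.optimizers.SGD(momentum=0.9),  # SGD with momentum
    "RMSProp": tf.keras.optimizers.RMSprop(),
}
for name, opt in optimisers.items():
    model = build_cnn()
    model.compile(optimizer=opt, loss="binary_crossentropy",
                  metrics=["accuracy"])
    hist = model.fit(X, y, epochs=3, validation_split=0.2, verbose=0)
    print(f"{name}: val accuracy={hist.history['val_accuracy'][-1]:.3f}")
```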
Findings
The study's results show that the accuracy of the CNN model with five deep learning layers is 97.4%, 98.2% and 96.09% for the three optimisers (ADAM, SGDM and RMSProp), respectively. Following this, eight feature selection algorithms were applied to the five-layer model to enhance accuracy, with particle swarm optimisation (PSO) achieving the highest F-score at 98.72. The model was then compared with other pre-trained models and exhibited the highest performance.
Practical implications
With an achieved precision of 98.19% and F-score of 98.72% using PSO, the developed model is highly accurate and effective in detecting and evaluating the condition of cracks in pavements. As a result, the model has the potential to significantly reduce the effort required for crack detection and evaluation.
Originality/value
The proposed method for enhancing CNN model accuracy in crack detection stands out for its unique combination of optimisation algorithms (ADAM, SGDM and RMSProp) with the systematic application of multiple feature selection techniques to identify relevant crack detection features, and for its benchmarking of the results against existing pre-trained models.
Jonathan S. Greipel, Regina M. Frank, Meike Huber, Ansgar Steland and Robert H. Schmitt
Abstract
Purpose
To ensure product quality within a manufacturing process, inspection processes are indispensable. One task of inspection planning is the selection of inspection characteristics. To optimize costs and benefits, key characteristics can be defined by which the product quality can be checked with sufficient accuracy. The manual selection of key characteristics requires substantial planning effort and becomes uneconomical when many product variants exist. This paper, therefore, aims to present a method for the efficient determination of key characteristics.
Design/methodology/approach
The authors present a novel Algorithm for the Selection of Key Characteristics (ASKC) based on an auto-encoder and a risk analysis. Given historical measurement data and tolerances, the algorithm clusters characteristics with redundant information and selects key characteristics based on a risk assessment. The authors compare ASKC with the Principal Feature Analysis (PFA) algorithm using artificial and historical measurement data.
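As a simplified stand-in for the clustering step (not the authors' ASKC: no auto-encoder and no risk analysis), the sketch below groups characteristics whose historical measurements are highly correlated and keeps one representative per group:

```python
# Redundancy-based key-characteristic selection via correlation clustering.
# Simplified stand-in for ASKC; data are synthetic measurements.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

rng = np.random.default_rng(0)
base = rng.normal(size=(500, 4))                 # 4 latent process factors
mixing = rng.normal(size=(4, 12))
X = base @ mixing + 0.1 * rng.normal(size=(500, 12))  # 12 measured characteristics

corr_dist = 1 - np.abs(np.corrcoef(X, rowvar=False))  # distance: 1 - |corr|
np.fill_diagonal(corr_dist, 0.0)
Z = linkage(squareform(corr_dist, checks=False), method="average")
labels = fcluster(Z, t=0.3, criterion="distance")     # cut the dendrogram

for cluster in np.unique(labels):
    members = np.flatnonzero(labels == cluster)
    rep = members[X[:, members].var(axis=0).argmax()]  # keep one representative
    print(f"cluster {cluster}: characteristics {members.tolist()}, keep #{rep}")
```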
Findings
The authors find that ASKC delivers superior results to PFA. Findings show that the algorithms enable the cost-efficient selection of key characteristics while maintaining the informative value of the inspection with respect to quality.
Originality/value
This paper fills an identified gap in simplified inspection planning by providing a method for the efficient selection of key characteristics via ASKC.
Hendrik Kohrs, Benjamin Rainer Auer and Frank Schuhmacher
Abstract
Purpose
In short-term forecasting of day-ahead electricity prices, incorporating intraday dependencies is vital for accurate predictions. However, it quickly leads to dimensionality problems, i.e. ill-defined models with too many parameters, which require an adequate remedy. This study addresses this issue.
Design/methodology/approach
In an application for the German/Austrian market, this study derives variable importance scores from a random forest algorithm, feeds the identified variables into a support vector machine and compares the resulting forecasting technique to other approaches (such as dynamic factor models, penalized regressions or Bayesian shrinkage) that are commonly used to resolve dimensionality problems.
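A minimal sketch of the two-step idea, random forest importances feeding a support vector machine, is shown below on synthetic hourly "prices"; the lag structure and sizes are illustrative assumptions, not the study's German/Austrian market setup.

```python
# Derive variable importances from a random forest, then feed only the most
# important lags into an SVM forecaster. Data and lags are illustrative.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.svm import SVR
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
prices = np.sin(np.arange(3000) * 2 * np.pi / 24) + 0.3 * rng.normal(size=3000)

lags = np.arange(1, 169)                       # up to one week of hourly lags
X = np.column_stack([np.roll(prices, l) for l in lags])[168:]
y = prices[168:]
X_tr, X_te, y_tr, y_te = X[:-500], X[-500:], y[:-500], y[-500:]

rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_tr, y_tr)
top = np.argsort(rf.feature_importances_)[::-1][:10]   # importance profile
print("most informative lags (hours):", lags[top].tolist())

svm = SVR().fit(X_tr[:, top], y_tr)
mae = mean_absolute_error(y_te, svm.predict(X_te[:, top]))
print(f"SVM MAE on top-10 lags: {mae:.3f}")
```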
Findings
This study develops full importance profiles stating which hours of which past days have the highest predictive power for specific hours in the future. Using the profile information in the forecasting setup leads to very promising results compared to the alternatives. Furthermore, the importance profiles provide a possible explanation why some forecasting methods are more accurate for certain hours of the day than others. They also help to explain why simple forecast combination schemes tend to outperform the full battery of models considered in the comprehensive comparative study.
Originality/value
With the information contained in the variable importance scores and the results of the extensive model comparison, this study essentially provides guidelines for variable and model selection in future electricity market research.
Yongzheng Zhang, Evangelos Milios and Nur Zincir‐Heywood
Abstract
Purpose
Summarization of an entire web site with diverse content may lead to a summary heavily biased towards the site's dominant topics. The purpose of this paper is to present a novel topic‐based framework to address this problem.
Design/methodology/approach
A two‐stage framework is proposed. The first stage identifies the main topics covered in a web site via clustering and the second stage summarizes each topic separately. The proposed system is evaluated by a user study and compared with the single‐topic summarization approach.
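A rough sketch of the two-stage idea follows: k-means over TF-IDF vectors identifies topics, then the most central page per topic is extracted (standing in for the paper's key-phrase and key-sentence extraction). The toy documents and component choices are assumptions, not the paper's system.

```python
# Stage 1: cluster pages into topics; stage 2: summarize each topic by its
# most centroid-similar page. Toy stand-in for the proposed framework.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

pages = [
    "Our research group studies machine learning and data mining.",
    "Graduate admissions open in fall; apply with transcripts.",
    "New paper on neural networks accepted at the conference.",
    "Tuition fees and scholarship deadlines for applicants.",
]

vec = TfidfVectorizer(stop_words="english")
X = vec.fit_transform(pages)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

for c in range(2):                                    # per-topic summary
    idx = np.flatnonzero(km.labels_ == c)
    sims = (X[idx] @ km.cluster_centers_[c]).ravel()  # cosine-like similarity
    print(f"topic {c} key page: {pages[idx[sims.argmax()]]!r}")
```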
Findings
The user study demonstrates that the clustering‐summarization approach statistically significantly outperforms the plain summarization approach in the multi‐topic web site summarization task. Text‐based clustering based on selecting features with high variance over web pages is reliable; outgoing links are useful if a rich set of cross links is available.
Research limitations/implications
More sophisticated clustering methods than those used in this study are worth investigating. The proposed method should be tested on web content that is less structured than organizational web sites, for example blogs.
Practical implications
The proposed summarization framework can be applied to the effective organization of search engine results and faceted or topical browsing of large web sites.
Originality/value
Several key components are integrated for web site summarization for the first time, including feature selection, link analysis, and key phrase and key sentence extraction. Insight was gained into the respective contributions of links and content to topic‐based summarization. A classification approach is used to minimize the number of parameters.
Abstract
Purpose
The purpose of this study is to demonstrate that the variation between the set torque and the actual torque at which the actuator trips can be minimized using Taguchi's robust engineering methodology. The paper also aims to demonstrate the application of a feature selection approach for the identification of insignificant effects in unreplicated fractional factorial experiments.
Design/methodology/approach
The methodology used was design of experiments, with the set torque as the signal factor and the tripping torque as the response variable. The compounded noise factor was identified based on the type of operations and load variation, which are not under the manufacturer's control. The effects of five control factors (with two levels each) and two interactions were studied. The experiments were designed using an L8 orthogonal array.
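For readers unfamiliar with dynamic Taguchi characteristics, the sketch below computes the signal-to-noise ratio for one factor combination by fitting tripping torque = beta * set torque through the origin and taking S/N = 10 log10(beta^2 / MSE); the torque values are made up for illustration and are not the study's data.

```python
# Dynamic Taguchi S/N ratio for one run of the experiment: higher S/N means
# less variation between set torque and actual tripping torque. Toy numbers.
import numpy as np

set_torque = np.array([10.0, 20.0, 30.0, 10.0, 20.0, 30.0])   # signal levels M
trip_torque = np.array([10.4, 19.5, 30.9, 9.8, 20.6, 29.2])   # responses y

beta = (set_torque @ trip_torque) / (set_torque @ set_torque)  # slope via origin
mse = np.mean((trip_torque - beta * set_torque) ** 2)
sn = 10 * np.log10(beta**2 / mse)
print(f"beta={beta:.3f}, S/N={sn:.1f} dB")
```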
Findings
The result showed that the factors spring height, spring thickness, star washer position and the interaction between drive shaft length and spring height play a significant role in actuator performance. The implementation of the optimum combination of factors resulted in improving the overall capability indices, Cp from 0.52 to 2.12 and Cpk from 0.4 to 1.67.
Practical implications
This study provides valuable information to actuator manufacturers on optimizing actuator performance.
Originality/value
To the best of the author's knowledge, no study has been conducted using Taguchi's robust engineering methodology to optimize actuator performance. In addition, no attempt has been made in the past to identify insignificant factors and interactions using a feature selection approach for unreplicated fractional factorial experiments.
Tian Han, Bo‐Suk Yang and Zhong‐Jun Yin
Abstract
Purpose
The purpose of this paper is to demonstrate the efficiency of vibration signals for a fault diagnosis system for induction motors.
Design/methodology/approach
A fault diagnosis system for induction motors using vibration signals is designed based on pattern recognition. A genetic algorithm is used for feature reduction and neural network tuning.
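A toy sketch of the GA-plus-neural-network combination is given below: a genetic algorithm searches over binary feature masks, with an MLP's cross-validated accuracy as the fitness. The dataset, GA settings and fitness function are illustrative assumptions, not the authors' configuration.

```python
# Genetic algorithm for feature selection wrapped around an MLP classifier.
# Illustrative settings only; vibration features are stood in by digit pixels.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X, y = load_digits(return_X_y=True)
n_feats, pop_size, n_gens = X.shape[1], 10, 5

def fitness(mask):
    if mask.sum() == 0:
        return 0.0
    clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=200, random_state=0)
    return cross_val_score(clf, X[:, mask.astype(bool)], y, cv=3).mean()

pop = rng.integers(0, 2, size=(pop_size, n_feats))
for gen in range(n_gens):
    scores = np.array([fitness(ind) for ind in pop])
    parents = pop[np.argsort(scores)[::-1][: pop_size // 2]]  # truncation selection
    children = []
    for _ in range(pop_size - len(parents)):
        a, b = parents[rng.integers(len(parents), size=2)]
        cut = rng.integers(1, n_feats)               # one-point crossover
        child = np.concatenate([a[:cut], b[cut:]])
        flip = rng.random(n_feats) < 0.02            # bit-flip mutation
        children.append(np.where(flip, 1 - child, child))
    pop = np.vstack([parents, children])

best = pop[np.argmax([fitness(ind) for ind in pop])]
print(f"selected {best.sum()} of {n_feats} features")
```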
Findings
The use of the genetic algorithm improves system performance by selecting significant features and optimizing the network structure. The efficiency of vibration signals is demonstrated.
Practical implications
Condition monitoring and fault diagnosis for induction motors are key parts of industrial maintenance, as motor faults often bring an entire production line to a standstill. In this paper, a fault diagnosis system for induction motors is proposed, based on pattern recognition and combining feature extraction, genetic algorithm and neural network techniques. The paper presents the complete procedure of a feature‐based fault diagnosis system and demonstrates the efficiency of the GA and of vibration signals for motor fault diagnosis. A real test was performed to validate the system's performance; the results indicate that the system is promising for real industrial application.
Originality/value
The use of a genetic algorithm for feature selection and neural network tuning, and the choice of vibration analysis for the fault diagnosis of induction motors.
Wenzhong Gao, Xingzong Huang, Mengya Lin, Jing Jia and Zhen Tian
Abstract
Purpose
The purpose of this paper is to design a short-term load prediction framework that can accurately predict the cooling load of office buildings.
Design/methodology/approach
A feature selection scheme and a stacking ensemble model were proposed to fulfill the cooling load prediction task. Firstly, abnormal data were identified by a data density estimation algorithm. Secondly, the crucial input features were clarified from three aspects (i.e. historical load information, time information and meteorological information). Thirdly, a stacking ensemble model combining a long short-term memory (LSTM) network and a light gradient boosting machine (LightGBM) was utilized to predict the cooling load. Finally, the performance of the proposed framework was verified by predicting the cooling load of office buildings using several evaluation indicators.
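A hedged sketch of the stacking step is shown below: an LSTM and a LightGBM model each predict the load, and a ridge meta-learner combines their outputs. The synthetic hourly loads, window length and architectures are assumptions, not the paper's configuration.

```python
# Stacking an LSTM and a LightGBM regressor with a ridge meta-learner.
# Data are synthetic hourly cooling "loads"; sizes are illustrative.
import numpy as np
import tensorflow as tf
import lightgbm as lgb
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
load = np.sin(np.arange(2000) * 2 * np.pi / 24) + 0.2 * rng.normal(size=2000)
win = 24                                       # one day of history as input
X = np.stack([load[i:i + win] for i in range(len(load) - win)])
y = load[win:]
X_tr, X_te, y_tr, y_te = X[:-300], X[-300:], y[:-300], y[-300:]

lstm = tf.keras.Sequential([
    tf.keras.Input(shape=(win, 1)),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1),
])
lstm.compile(optimizer="adam", loss="mse")
lstm.fit(X_tr[..., None], y_tr, epochs=5, verbose=0)

gbm = lgb.LGBMRegressor(n_estimators=200).fit(X_tr, y_tr)

# Stack the two base predictions with a ridge meta-learner (in-sample here
# for brevity; out-of-fold predictions would be used in practice).
base_tr = np.column_stack([lstm.predict(X_tr[..., None], verbose=0).ravel(),
                           gbm.predict(X_tr)])
base_te = np.column_stack([lstm.predict(X_te[..., None], verbose=0).ravel(),
                           gbm.predict(X_te)])
meta = Ridge().fit(base_tr, y_tr)
print(f"stacked MAE: {mean_absolute_error(y_te, meta.predict(base_te)):.3f}")
```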
Findings
The identified input features improve the prediction performance. The prediction accuracy of the proposed model is superior to that of existing models. The stacking ensemble model is robust to weather forecasting errors.
Originality/value
The stacking ensemble model was used to fulfill the cooling load prediction task, overcoming the shortcomings of single deep learning models. The input features of the model, which receive little attention in most studies, are treated as an important step in this paper.