Search results

Article

Publication date: 1 November 2019

A study of boosted evolutionary classifiers for detecting spam

Abstract

Purpose

Email is a rapid and inexpensive medium for sharing information, whereas unsolicited email (spam) is a constant problem in email communication. The rapid growth of spam creates a need for a reliable and robust spam classifier. This paper aims to present a study of evolutionary classifiers (genetic algorithm [GA] and genetic programming [GP]) with and without the help of an ensemble of classifiers method. In this research, the classifier ensemble has been developed with the adaptive boosting technique.
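As a rough illustration of adaptive boosting, the following minimal sketch implements discrete AdaBoost in plain Python. It is not the paper's implementation: the weak learners here are one-feature decision stumps rather than GA or GP classifiers, and the three word-presence features and the labelling rule are hypothetical toy data chosen so that no single stump is perfect but a weighted vote of three is.

```python
import math
from itertools import product
from typing import List, Sequence, Tuple

# A stump is (feature index, polarity): it predicts `polarity` when the
# feature is present and `-polarity` otherwise. Labels are +1 (spam) / -1 (ham).
Stump = Tuple[int, int]


def stump_predict(stump: Stump, x: Sequence[int]) -> int:
    index, polarity = stump
    return polarity if x[index] else -polarity


def best_stump(X, y, w) -> Tuple[Stump, float]:
    """Return the stump with the lowest weighted error under weights w."""
    best, best_err = (0, 1), float("inf")
    for index in range(len(X[0])):
        for polarity in (1, -1):
            err = sum(wi for xi, yi, wi in zip(X, y, w)
                      if stump_predict((index, polarity), xi) != yi)
            if err < best_err:
                best, best_err = (index, polarity), err
    return best, best_err


def adaboost(X, y, rounds: int) -> List[Tuple[float, Stump]]:
    n = len(X)
    w = [1.0 / n] * n  # start with uniform example weights
    ensemble: List[Tuple[float, Stump]] = []
    for _ in range(rounds):
        stump, err = best_stump(X, y, w)
        if err >= 0.5:
            break  # no weak learner better than chance remains
        err = max(err, 1e-10)  # avoid division by zero when err == 0
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, stump))
        # Misclassified examples gain weight, correct ones lose weight.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, x) -> int:
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


# Toy data (hypothetical): three word-presence features; a message is
# labelled spam (+1) when at least two of the three words occur.
X = [list(bits) for bits in product((0, 1), repeat=3)]
y = [1 if sum(bits) >= 2 else -1 for bits in X]


def accuracy(ensemble) -> float:
    return sum(predict(ensemble, xi) == yi for xi, yi in zip(X, y)) / len(X)
```

On this toy set, comparing `accuracy(adaboost(X, y, 1))` with `accuracy(adaboost(X, y, 3))` shows the single stump misclassifying some messages while the three-round ensemble separates them all, which is the general effect boosting is meant to have on weak classifiers.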

Design/methodology/approach

Text mining methods are applied to classify spam and legitimate emails. Two data sets (Enron and SpamAssassin) are used to test the classifiers concerned. Initially, pre-processing is performed to extract the features/words from email files. An informative feature subset is then selected using the greedy stepwise feature subset search method. Using these informative features, a comparative study is performed, first among the evolutionary classifiers and then against other popular machine learning classifiers (Bayesian, naive Bayes and support vector machine).
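The greedy stepwise feature subset search mentioned above can be sketched as follows. This is a minimal illustration that assumes the forward direction (start empty, add one feature at a time) and a caller-supplied subset score; the abstract does not state the direction or the evaluation measure used. The toy emails, the feature names and the `rule_accuracy` scorer are hypothetical.

```python
from typing import Callable, FrozenSet, List, Sequence


def greedy_stepwise_forward(
    features: Sequence[str],
    score: Callable[[FrozenSet[str]], float],
) -> List[str]:
    """Repeatedly add the single feature that most improves the subset
    score; stop as soon as no addition improves it."""
    selected: List[str] = []
    remaining = list(features)
    best_score = score(frozenset())
    while remaining:
        # Score every one-feature extension of the current subset.
        candidates = [(score(frozenset(selected + [f])), f) for f in remaining]
        top_score, top_feature = max(candidates, key=lambda c: c[0])
        if top_score <= best_score:
            break
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected


# Hypothetical toy corpus: (words in the email, is_spam).
EMAILS = [
    ({"free", "offer"}, True),
    ({"winner", "offer"}, True),
    ({"free", "report"}, True),
    ({"meeting", "report"}, False),
    ({"meeting", "offer"}, False),
    ({"lunch", "report"}, False),
]
FEATURES = ["free", "winner", "offer", "meeting", "report"]


def rule_accuracy(subset: FrozenSet[str]) -> float:
    """Accuracy of the rule 'spam if any selected word is present'."""
    correct = 0
    for words, is_spam in EMAILS:
        predicted_spam = bool(words & subset)
        correct += predicted_spam == is_spam
    return correct / len(EMAILS)


chosen = greedy_stepwise_forward(FEATURES, rule_accuracy)
```

Because the search only ever looks one step ahead, it evaluates far fewer subsets than an exhaustive search, at the price of possibly missing combinations that only help jointly.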

Findings

This study shows that evolutionary algorithms are promising in classification and prediction applications, where genetic programming with adaptive boosting turns out to be not only an accurate classifier but also a sensitive one. Results show that GA initially performs better than GP, but after an ensemble of classifiers is built (over a large number of iterations), GP overtakes GA with significantly higher accuracy. Amongst all classifiers, boosted GP offers not only good classification accuracy but also low false positive (FP) rates, which are considered an important criterion in email spam classification. Also, greedy stepwise feature search is found to be an effective method for feature selection in this application domain.

Research limitations/implications

The research implication of this work is a reduction in the cost incurred because of spam/unsolicited bulk email. Email is a fundamental necessity for sharing information among the units of an organization so that it can remain competitive with its business rivals. In addition, providing the best email services to customers is a continual hurdle for internet service providers. Although organizations and internet service providers are continuously adopting novel spam filtering approaches to reduce the number of unwanted emails, the desired effect has not been seen to a significant degree because of the cost of installation, limited customizability and the threat of misclassifying important emails. This research deals with the issues and challenges faced by internet service providers and organizations.

Practical implications

In this research, the proposed models have not only provided excellent accuracy and sensitivity with a low FP rate and customizable capability but have also helped to reduce the cost of spam. The same models may also be used for other text mining applications, such as sentiment analysis, blog mining or news mining.

Originality/value

A comparison between GP and GA, with and without an ensemble, is presented for the spam classification application domain.

Details

Global Knowledge, Memory and Communication, vol. 69 no. 4/5

Type: Research Article

DOI:

ISSN: 2514-9342

Article

Publication date: 29 October 2018

Analysing user sentiment of Indian movie reviews: A probabilistic committee selection model

Shrawan Kumar Trivedi and Shubhamoy Dey

Abstract

Purpose

To be sustainable and competitive in the current business environment, it is useful to understand users’ sentiment towards products and services. This critical task can be achieved via natural language processing and machine learning classifiers. This paper aims to propose a novel probabilistic committee selection classifier (PCC) to analyse and classify the sentiment polarities of movie reviews.

Design/methodology/approach

An Indian movie review corpus is assembled for this study. Another publicly available movie review polarity corpus is also used to validate the results. The greedy stepwise search method is used to extract the features/words of the reviews. The performance of the proposed classifier is measured using different metrics, such as F-measure, false positive rate, receiver operating characteristic (ROC) curve and training time. Further, the proposed classifier is compared with other popular machine-learning classifiers, such as Bayesian, Naïve Bayes, Decision Tree (J48), Support Vector Machine and Random Forest.
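Two of the metrics named above, F-measure and false positive rate, follow directly from confusion-matrix counts. The sketch below uses the standard definitions (the balanced F1 form of the F-measure is assumed, since the abstract does not give a weighting); the counts in the example are made up for illustration and are not results from the paper.

```python
def precision(tp: int, fp: int) -> float:
    """Share of predicted positives that are truly positive."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Share of true positives that were found."""
    return tp / (tp + fn)


def f_measure(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall (F1)."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)


def false_positive_rate(fp: int, tn: int) -> float:
    """Share of true negatives wrongly flagged as positive."""
    return fp / (fp + tn)


# Hypothetical counts for a 100-review test set.
TP, FP, FN, TN = 45, 5, 15, 35
example_f = f_measure(TP, FP, FN)            # about 0.818
example_fpr = false_positive_rate(FP, TN)    # 0.125
```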

Findings

The results of this study show that the proposed classifier is good at predicting the positive or negative polarity of movie reviews. The performance accuracy and the value of the ROC curve of the PCC are found to be the most suitable among all the classifiers tested in this study. This classifier is also found to be efficient at identifying positive sentiments of reviews, where it gives low false positive rates for both the Indian Movie Review and Review Polarity corpora used in this study. The training time of the proposed classifier is found to be slightly higher than that of Bayesian, Naïve Bayes and J48.

Research limitations/implications

Only movie review sentiments written in English are considered. In addition, the proposed committee selection classifier is prepared only using the committee of probabilistic classifiers; however, other classifier committees can also be built, tested and compared with the present experiment scenario.

Practical implications

In this paper, a novel probabilistic approach is proposed and used for classifying movie reviews, and is found to be highly effective in comparison with other state-of-the-art classifiers. This classifier may be tested for different applications and may provide new insights for developers and researchers.

Social implications

The proposed PCC may be used to classify different product reviews, and hence may be beneficial to organizations to justify users’ reviews about specific products or services. By using authentic positive and negative sentiments of users, the credibility of the specific product, service or event may be enhanced. PCC may also be applied to other applications, such as spam detection, blog mining, news mining and various other data-mining applications.

Originality/value

The constructed PCC is novel and was tested on Indian movie review data.

Details

The Electronic Library, vol. 36 no. 4

Type: Research Article

DOI:

ISSN: 0264-0473

Article

Publication date: 25 October 2018

Capturing user sentiments for online Indian movie reviews: A comparative analysis of different machine-learning models

Shrawan Kumar Trivedi, Shubhamoy Dey and Anil Kumar

Abstract

Purpose

Sentiment analysis and opinion mining are emerging areas of research for analyzing Web data and capturing users’ sentiments. This research aims to present sentiment analysis of an Indian movie review corpus using natural language processing and various machine learning classifiers.

Design/methodology/approach

In this paper, a comparative study between three machine learning classifiers (Bayesian, naïve Bayesian and support vector machine [SVM]) was performed. All the classifiers were trained on the words/features of the corpus extracted, using five different feature selection algorithms (Chi-square, info-gain, gain ratio, one-R and relief-F [RF] attributes), and a comparative study was performed between them. The classifiers and feature selection approaches were evaluated using different metrics (F-value, false-positive [FP] rate and training time).
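Of the five feature selection measures listed above, information gain is perhaps the simplest to write down: it is the drop in label entropy once a feature's value is known. The following sketch uses the textbook definition in plain Python; the tiny label and feature vectors are hypothetical and only show the two extremes (a perfectly informative word and an uninformative one).

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy of a label sequence, in bits."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature: Sequence[Hashable],
                     labels: Sequence[Hashable]) -> float:
    """H(labels) minus the expected entropy of labels given the feature."""
    total = len(labels)
    conditional = 0.0
    for value in set(feature):
        subset = [lab for f, lab in zip(feature, labels) if f == value]
        conditional += (len(subset) / total) * entropy(subset)
    return entropy(labels) - conditional


# Hypothetical data: 1 = positive review, 0 = negative review.
LABELS = [1, 1, 0, 0]
WORD_GREAT = [1, 1, 0, 0]   # present exactly in the positive reviews
WORD_MOVIE = [1, 0, 1, 0]   # present equally often in both classes
```

Ranking all words by `information_gain` and keeping the top-scoring ones is one common way such a measure is turned into a feature selection step.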

Findings

The results of this study show that, for the maximum number of features, the RF feature selection approach was found to be the best, with better F-values, a low FP rate and less time needed to train the classifiers, whereas for the least number of features, one-R was better than RF. When the evaluation was performed for machine learning classifiers, SVM was found to be superior, although the Bayesian classifier was comparable with SVM.

Originality/value

This is novel research in which Indian review data were collected and a classification model for sentiment polarity (positive/negative) was then constructed.

Details

The Electronic Library, vol. 36 no. 4

Type: Research Article

DOI:

ISSN: 0264-0473

Article

Publication date: 8 March 2018

Service failures after online flash sales: role of deal proneness, attribution, and emotion

Khadija Ali Vakeel, K. Sivakumar, K.R. Jayasimha and Shubhamoy Dey

Abstract

Purpose

The purpose of this paper is to focus on failures in online flash sales (OFS) and to explore why consumers participate in an OFS even after experiencing service failure. It also examines the role of deal proneness, attribution, and emotions.

Design/methodology/approach

Using a mixed method approach to gain insights into this relatively unexplored phenomenon of OFS, this research uses netnography followed by a survey study.

Findings

The findings show that deal-prone customers tend to ignore service failures during OFS and re-participate in the future. In the context of OFS, failures attributed to internal locus of attribution (LOA) also have a negative effect on re-participation compared with failures attributed to external LOA. Furthermore, there is a three-way interaction among deal proneness, LOA, and past emotions. The results show that negative past emotions further exacerbate the impact of attribution on the link between deal proneness and re-participation.

Originality/value

In contrast with prior research, the paper shows that consumers participate even after service failure. The proposed difference between customers who experience different LOA and past emotions offers insights into their behavior after service failure in a new context of an online/electronic commerce event: flash sales. This paper specifically explores the role of internal LOA and finds that it has a more negative impact than external LOA on re-participation.

Details

Journal of Service Management, vol. 29 no. 2

Type: Research Article

DOI:

ISSN: 1757-5818

Article

Publication date: 14 November 2016

A novel committee selection mechanism for combining classifiers to detect unsolicited emails

Shrawan Kumar Trivedi and Shubhamoy Dey

Abstract

Purpose

The email is an important medium for sharing information rapidly. However, spam, being a nuisance in such communication, motivates the building of a robust filtering system with high classification accuracy and good sensitivity towards false positives. In that context, this paper aims to present a combined classifier technique using a committee selection mechanism where the main objective is to identify a set of classifiers so that their individual decisions can be combined by a committee selection procedure for accurate detection of spam.

Design/methodology/approach

For training and testing of the relevant machine learning classifiers, text mining approaches are used in this research. Three data sets (Enron, SpamAssassin and LingSpam) have been used to test the classifiers. Initially, pre-processing is performed to extract the features associated with the email files. In the next step, the extracted features are taken through a dimensionality reduction method where non-informative features are removed. Subsequently, an informative feature subset is selected using genetic feature search. Thereafter, the proposed classifiers are tested on those informative features and the results compared with those of other classifiers.

Findings

For building the proposed combined classifier, three different studies have been performed. The first study identifies the effect of boosting algorithms on two probabilistic classifiers: Bayesian and Naïve Bayes. In that study, AdaBoost has been found to be the best algorithm for performance boosting. The second study examined the effect of different kernel functions on the support vector machine (SVM) classifier, where SVM with the normalized polynomial (NP) kernel was observed to be the best. The last study was on combining classifiers with committee selection, where the committee members were the best classifiers identified by the first study, i.e. Bayesian and Naïve Bayes with AdaBoost, and the committee president was selected from the second study, i.e. SVM with NP kernel. Results show that combining the identified classifiers to form a committee machine gives excellent performance accuracy with a low false positive rate.
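The abstract does not spell out how the committee's decisions are combined, so the sketch below shows only one plausible reading, stated here as an assumption: when the members agree, their shared decision stands, and when they disagree, the president's decision is used. The keyword-rule classifiers are hypothetical stand-ins for the boosted Bayesian, boosted Naïve Bayes and SVM classifiers named above.

```python
from typing import Callable, Sequence

# A classifier maps message text to 1 (spam) or 0 (legitimate).
Classifier = Callable[[str], int]


def committee_decision(members: Sequence[Classifier],
                       president: Classifier,
                       message: str) -> int:
    """Assumed rule: unanimous members decide; otherwise defer to the president."""
    votes = [member(message) for member in members]
    if all(vote == votes[0] for vote in votes):
        return votes[0]
    return president(message)


# Hypothetical stand-in classifiers based on single keywords.
def member_a(text: str) -> int:
    return 1 if "free" in text.lower() else 0


def member_b(text: str) -> int:
    return 1 if "winner" in text.lower() else 0


def president(text: str) -> int:
    return 1 if "click here" in text.lower() else 0


MEMBERS = [member_a, member_b]
```

For example, `committee_decision(MEMBERS, president, "Free lunch at the meeting")` splits the members, so the president's verdict (legitimate) is returned.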

Research limitations/implications

This research is focused on the classification of email spam written in the English language. Only the body (text) parts of the emails have been used. Image spam has not been included in this work. We have restricted our work to email messages only. Other types of messages, such as short message service or multimedia messaging service messages, were not part of this study.

Practical implications

This research proposes a method of dealing with the issues and challenges faced by internet service providers and organizations that use email. The proposed model provides not only better classification accuracy but also a low false positive rate.

Originality/value

The proposed combined classifier is a novel classifier designed for accurate classification of email spam.

Details

VINE Journal of Information and Knowledge Management Systems, vol. 46 no. 4

Type: Research Article

DOI:

ISSN: 2059-5891

Open Access

Article

Publication date: 12 July 2022

The impact of knowledge risk management on sustainability

Malgorzata Zieba, Susanne Durst and Christoph Hinteregger

Abstract

Purpose

The purpose of this study is to examine the effect of knowledge risk management (KRM) on organizational sustainability and the role of innovativeness and agility in this relationship.

Design/methodology/approach

The study presents the results of a quantitative survey performed among 179 professionals from knowledge-intensive organizations dealing with knowledge risks and their management in organizations. Data included in this study are from both private and public organizations located all over the world and were collected through an online survey.

Findings

The results have confirmed that innovativeness and agility positively impact the sustainability of organizations; agility also positively impacts organizational innovativeness. The partial influence of KRM on both innovativeness and agility of organizations has been confirmed as well.

Research limitations/implications

The paper findings contribute in different ways to the ongoing debates in the literature. First, they contribute to the general study of risk management by showing empirically its role in organizations in the given case of organizational sustainability. Second, by emphasizing the risks related to knowledge, this study contributes to emerging efforts highlighting the particular role of knowledge for sustained organizational development. Third, by linking KRM and organizational sustainability, this paper contributes empirically to building knowledge in this very recent field of study. This understanding is also useful for future development in the field of KM as a whole.

Originality/value

The paper lays the ground for both a deeper and more nuanced understanding of knowledge risks in organizations in general and regarding sustainability in particular. As such, the paper offers new food for thought for researchers dealing with the topics of knowledge risks, knowledge management and organizational risk management in general.

Details

Journal of Knowledge Management, vol. 26 no. 11

Type: Research Article

DOI:

ISSN: 1367-3270
