Search results
1 – 5 of 5
Abstract
Purpose
The referred data set produces reliable information about network flows and common attacks that meets real-world criteria. Accordingly, this study focuses on the use of the imbalanced intrusion detection benchmark knowledge discovery in databases (KDD) data set, which many researchers prefer for experimentation and analysis. The proposed algorithm, improvised random forest classification with error tuning factors (IRFCETF), is applied to the KDD data set, and its performance is evaluated on the complete set of network traffic features.
Design/methodology/approach
Researchers are increasingly drawn to the many real-time applications that deal with imbalanced data classification (ImDC). Application areas such as artificial intelligence (AI) and the Industrial Internet of Things (IIoT) that deal with ImDC suffer from diverted classification performance because of skewed data distribution (SkDD). The volume of computer network data and related applications has recently grown exponentially, and intrusion detection is one of the demanding applications of ImDC. The proposed study focusses on the imbalanced intrusion detection benchmark KDD data set and other benchmark data sets with the proposed IRFCETF approach. IRFCETF demonstrates improved classification performance on imbalanced data sets over the existing approach. The purpose of this work is to review imbalanced data applications in numerous application areas, including AI and IIoT, and to tune performance with respect to principal component analysis. This study also focusses on the out-of-bag error as a performance-tuning factor.
Findings
Experimental results on the KDD data set show that the proposed algorithm gives improved performance. For the referred intrusion detection data set, the IRFCETF classification accuracy is 99.57% and the error rate is 0.43%.
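The two reported figures are complements of each other (99.57% + 0.43% = 100%). As a minimal sketch of how accuracy and error rate follow from a confusion matrix, and of why a per-class measure such as recall is usually read alongside them on skewed data, the following uses hypothetical counts that are not taken from the study; the function name `summarize` is also an illustrative assumption.

```python
def summarize(tp, fn, fp, tn):
    """Return accuracy, error rate and minority-class recall as percentages."""
    total = tp + fn + fp + tn
    accuracy = 100.0 * (tp + tn) / total
    error_rate = 100.0 - accuracy  # error rate is the complement of accuracy
    recall = 100.0 * tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, error_rate, recall


# Hypothetical skewed test set: 9,900 normal flows and 100 attack flows.
acc, err, rec = summarize(tp=80, fn=20, fp=30, tn=9870)
print(f"accuracy={acc:.2f}% error={err:.2f}% attack recall={rec:.2f}%")
```

With these illustrative counts the overall accuracy is high even though one in five attack flows is missed, which is the kind of effect that imbalanced-classification studies try to control.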
Research limitations/implications
This research work can be extended with further improvements in classification techniques using multiple correspondence analysis (MCA); hierarchical MCA can be explored with classification models for a wide range of skewed data sets.
Practical implications
The metrics enhancement is measurable and helpful in dealing with imbalanced applications related to intrusion detection systems in current application domains such as security, AI and IIoT digitization. Analytical results show improved metrics for the proposed approach compared with traditional machine learning algorithms. Thus, the proposed IRFCETF justifies that the error-tuning parameter has a measurable impact on classification accuracy.
Social implications
The proposed algorithm is useful in numerous IIoT applications such as health care and machinery automation.
Originality/value
This research work addresses a classification metric enhancement approach, IRFCETF. The proposed method yields a test set categorization for each case with an error reduction mechanism.
Bharat Arun Tidke, Rupa Mehta, Dipti Rana, Divyani Mittal and Pooja Suthar
Abstract
Purpose
In online social network analysis, the problem of identifying and ranking influential nodes based on their prominence has attracted immense attention from researchers and practitioners. Identifying and ranking influential nodes on Twitter is challenging, as the data contain heterogeneous features such as tweets, likes, mentions and retweets. The purpose of this paper is to correlate various features, evaluation metrics, approaches and results in order to validate the selection of features as well as the results. In addition, the paper uses well-known techniques to find the topical authority and sentiments of influential nodes, which helps smart city governance make important decisions while understanding the various perceptions of relevant influential nodes.
Design/methodology/approach
The tweets fetched using the Twitter API are stored in Neo4j to generate graph-based relationships between various features of Twitter data such as followers, mentions and retweets. This paper proposes a consensus approach that uses heterogeneous features of Twitter data, such as likes, mentions and retweets, to generate an individual list of top-k influential nodes for each feature.
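The idea of building one top-k list per feature and then reconciling the lists can be sketched as follows. The abstract does not state how the consensus is formed, so the Borda-style point count used here is an assumption, as are the function names `top_k` and `consensus` and the toy scores.

```python
from collections import defaultdict


def top_k(scores, k):
    """Rank nodes by one feature's score (descending) and keep the first k."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [node for node, _ in ranked[:k]]


def consensus(per_feature_scores, k):
    """Merge per-feature top-k lists with a Borda-style count (an assumption)."""
    points = defaultdict(int)
    for scores in per_feature_scores.values():
        for rank, node in enumerate(top_k(scores, k)):
            points[node] += k - rank  # first place earns k points, last earns 1
    return sorted(points, key=lambda n: (-points[n], n))[:k]


# Hypothetical per-node counts for three Twitter features.
features = {
    "likes": {"a": 50, "b": 30, "c": 10, "d": 5},
    "mentions": {"a": 7, "b": 9, "c": 1, "d": 8},
    "retweets": {"a": 20, "b": 25, "c": 2, "d": 3},
}
ranking = consensus(features, k=2)
print(ranking)
```

Ties are broken alphabetically so that the output is deterministic; a real pipeline would likely read the per-feature counts from the graph store instead of a literal dictionary.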
Findings
The heterogeneous features are integrated to accomplish the identification and ranking tasks with low computational complexity, i.e. O(n), which is suitable for large-scale online social networks, and with better accuracy than the baselines.
Originality/value
Identified influential nodes can act as a source in making public decisions, and their opinions give insights to urban governance bodies such as municipal corporations and similar organizations responsible for smart urban governance and smart city development.
Mitali Desai, Rupa G. Mehta and Dipti P. Rana
Abstract
Purpose
Scholarly communications, particularly the questions and answers (Q&A) present on digital scholarly platforms, provide a new avenue to gain knowledge. However, several studies have raised concerns about content anomalies in these Q&A and suggested proper validation before using them in scholarly applications such as influence analysis and content-based recommendation systems. These content anomalies are referred to as disinformation in this research. The purpose of this research is, first, to assess scholarly communications in order to identify disinformation and, second, to help scholarly platforms determine the scholars who probably disseminate such disinformation. These scholars are referred to as the probable sources of disinformation.
Design/methodology/approach
To identify disinformation, the proposed model deduces (1) content redundancy and contextual redundancy in questions, (2) contextual nonrelevance in answers with respect to the questions and (3) the quality of answers with respect to the expertise of the answering scholars. Then, the model determines the probable sources of disinformation using statistical analysis.
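One common way to approximate contextual redundancy between questions is to compare their embedding vectors with cosine similarity. The sketch below is only an illustration of that idea, not the model evaluated in the study: the three-dimensional vectors, the 0.9 threshold and the function names `cosine` and `redundant_pairs` are all assumptions.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def redundant_pairs(embeddings, threshold=0.9):
    """Return question-id pairs whose similarity reaches the threshold."""
    return [
        (p, q)
        for p, q in combinations(sorted(embeddings), 2)
        if cosine(embeddings[p], embeddings[q]) >= threshold
    ]


# Hypothetical question embeddings; a real system would obtain these
# from a trained word or sentence embedding model.
question_vectors = {
    "q1": [0.9, 0.1, 0.0],
    "q2": [0.8, 0.2, 0.0],
    "q3": [0.0, 0.1, 0.9],
}
pairs = redundant_pairs(question_vectors)
print(pairs)
```

The same similarity function could, under the same assumptions, be applied to question and answer vectors to flag answers with low relevance to their question.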
Findings
The model is evaluated on ResearchGate (RG) data. Results suggest that the model efficiently identifies disinformation from scholarly communications and accurately detects the probable sources of disinformation.
Practical implications
Different platforms with communication portals can use this model as a regulatory mechanism to restrict the propagation of disinformation. Scholarly platforms can use this model to build an accurate influence assessment mechanism and to generate relevant recommendations for their scholars.
Originality/value
Existing studies mainly deal with validating the answers using statistical measures. The proposed model focuses on questions as well as answers and performs a contextual analysis using an advanced word embedding technique.
Jenish Dhanani, Rupa Mehta and Dipti P. Rana
Abstract
Purpose
In the Indian judicial system, the court considers interpretations of similar previous judgments for the present case. An essential requirement of legal practitioners is to determine the most relevant judgments from an enormous number of judgments in order to prepare supportive, beneficial and favorable arguments against the opponent. This creates a strong demand for a Legal Document Recommendation System (LDRS) that automates the process. In existing works, a traditionally preprocessed judgment corpus is processed by Doc2Vec to learn a semantically rich judgment embedding space (i.e. vector space). Here, vectors of semantically relevant judgments are in close proximity, as Doc2Vec can effectively capture semantic meanings. The enormous number of judgments produces a huge, noisy corpus and vocabulary, which poses a significant challenge: traditional preprocessing cannot fully eliminate noisy data from the corpus, and as a result Doc2Vec demands huge memory and time to learn the judgment embedding. The noise also adversely affects the recommendation performance in terms of correctness. This paper aims to develop an effective and efficient LDRS to support civilians and the legal fraternity.
Design/methodology/approach
To overcome the previously mentioned challenges, this research proposes an LDRS that uses the proposed Generalized English and Indian Legal Dictionary (GEILD), which keeps only relevant dictionary words in the corpus and discards noisy elements. Accordingly, the proposed LDRS significantly reduces the corpus size, which can potentially improve the space and time efficiency of Doc2Vec.
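Dictionary-based preprocessing of this kind can be sketched as a simple token filter. The snippet below is a minimal illustration under stated assumptions: the five-word `toy_dictionary` stands in for GEILD (whose contents are not given here), and the tokenization rule and the function name `filter_to_dictionary` are hypothetical.

```python
import re


def filter_to_dictionary(text, dictionary):
    """Lowercase the text, split it into alphabetic tokens and keep only
    the tokens that appear in the dictionary."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t in dictionary]


# Hypothetical stand-in for a legal dictionary.
toy_dictionary = {"court", "appeal", "judgment", "dismissed", "petitioner"}
raw = "The Hon'ble Court dismissed the appeal vide order no. 123/2019 of the petitioner."
kept = filter_to_dictionary(raw, toy_dictionary)
print(kept)
```

Because numbers, citation fragments and out-of-dictionary words are dropped before training, the vocabulary passed to an embedding model shrinks, which is the source of the space and time savings described above.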
Findings
The experimental results confirm that the proposed LDRS with GEILD yields superior performance in terms of accuracy, F1-score and MCC score, with significant improvement in space and time efficiency.
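For reference, the F1-score and the Matthews correlation coefficient (MCC) for a binary relevance decision can be computed from confusion-matrix counts as in the sketch below. The counts are hypothetical and are not results from the study; the function name `f1_and_mcc` is an illustrative assumption.

```python
import math


def f1_and_mcc(tp, fp, fn, tn):
    """Return (F1, MCC) for binary confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return f1, mcc


# Hypothetical counts: 40 relevant judgments recommended, 10 missed,
# 10 irrelevant ones recommended, 40 irrelevant ones correctly left out.
f1, mcc = f1_and_mcc(tp=40, fp=10, fn=10, tn=40)
print(f"F1={f1:.2f} MCC={mcc:.2f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix and ranges from -1 to 1, which is why it is often reported alongside the F1-score.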
Originality/value
The proposed LDRS uses customized domain-specific preprocessing and a novel legal dictionary (i.e. GEILD) to precisely recommend the relevant judgments. The proposed LDRS can be incorporated into online legal search repositories and engines to enrich their functionality.
Abstract
Purpose
This paper aims to provide a systematic review of the research focusing on the decarbonization strategy of businesses, stock return performance, and investment styles.
Design/methodology/approach
The paper utilizes bibliometric methods and content analysis to present a broad overview of the research on the association between decarbonization strategies in businesses and financial performance in the last few decades. The final dataset contains 272 records published between 2001 and early 2021, available in the Web of Science (WoS) database.
Findings
The authors find a relatively small number of publications before 2010, and the research focus increases only after 2016. Knowledge of the links between climate change strategies and firm performance remains limited to date. The top management journals have also failed to respond to the importance of decarbonization strategies in firms and their relationship with stock returns and investment styles. Furthermore, there is limited indication that publications from ecology and the environmental sciences in general are included or cited by business and management research studies, which highlights weak network linkages between the two fields.
Research limitations/implications
The study contributes to the literature on the decarbonization strategies of businesses and their relation to firm performance by consolidating the extant research and thereby identifying research gaps and areas that require further investigation.
Practical implications
For industry professionals, this research provides a comprehensive repository of articles on incorporating decarbonization strategies into their decisions on improving firm performance.
Originality/value
This paper examines the history and development of themes related to firms' emission mitigation strategies, firm performance and investment styles across the journal articles in the WoS database published from 2001 to early 2021. In addition, the authors highlight research directions and the need for research on sustainable strategies in businesses, stock returns and investment styles.