Search results
Abstract
Purpose
The referred data set provides reliable information about network flows and common attacks that meet real-world criteria. Accordingly, this study focuses on the imbalanced intrusion detection benchmark known as the knowledge discovery in databases (KDD) data set, which many researchers prefer for experimentation and analysis. The proposed algorithm, improvised random forest classification with error tuning factors (IRFCETF), is run on the KDD data set, and its performance is evaluated on the complete set of network traffic features.
Design/methodology/approach
Researchers are increasingly drawn to the diverse range of existing real-time applications that deal with imbalanced data classification (ImDC). Application areas such as artificial intelligence (AI) and the Industrial Internet of Things (IIoT) that deal with ImDC suffer degraded classification performance because of skewed data distribution (SkDD), and many data applications in these areas see their classification rates distorted as a result. Recent years have also brought exponential growth in the volume of computer network data and in related application development, and intrusion detection is one of the most demanding applications of ImDC. This study applies the proposed IRFCETF approach to the imbalanced intrusion detection benchmark, the KDD data set, and to other benchmark data. IRFCETF demonstrates improved classification performance on imbalanced data over the existing approach. The purpose of this work is to review imbalanced data applications in numerous areas, including AI and IIoT, and to tune performance with respect to principal component analysis. The study also focuses on the out-of-bag error as a performance-tuning factor.
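The out-of-bag (OOB) error named above can be illustrated with a minimal sketch. It is not the IRFCETF algorithm, whose details the abstract does not give. It swaps in bagged one-feature decision stumps for a full random forest, and the toy data and the names `fit_stump` and `oob_error` are hypothetical, introduced only for illustration.

```python
import random
from collections import Counter


def fit_stump(rows):
    """Fit a one-feature decision stump: choose the threshold and label
    orientation that make the fewest mistakes on the given (x, y) rows."""
    best = None
    for thr in sorted({x for x, _ in rows}):
        for left, right in ((0, 1), (1, 0)):
            errors = sum((left if x <= thr else right) != y for x, y in rows)
            if best is None or errors < best[0]:
                best = (errors, thr, left, right)
    _, thr, left, right = best

    def predict(x):
        return left if x <= thr else right

    return predict


def oob_error(data, n_trees=25, seed=0):
    """Estimate the out-of-bag error of a bagged ensemble of stumps.

    Each stump is trained on a bootstrap sample; every point is then
    scored only by the stumps whose sample did NOT contain it."""
    rng = random.Random(seed)
    n = len(data)
    votes = [Counter() for _ in range(n)]
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        in_bag = set(idx)
        stump = fit_stump([data[i] for i in idx])
        for i in range(n):
            if i not in in_bag:  # point is out-of-bag for this stump
                votes[i][stump(data[i][0])] += 1
    scored = [(i, v.most_common(1)[0][0]) for i, v in enumerate(votes) if v]
    if not scored:
        raise ValueError("no point was ever out-of-bag; use more trees")
    wrong = sum(pred != data[i][1] for i, pred in scored)
    return wrong / len(scored)


# Hypothetical, skewed toy data: 95 majority points and 5 minority points.
rng = random.Random(42)
majority = [(rng.uniform(0.0, 1.0), 0) for _ in range(95)]
minority = [(rng.uniform(2.0, 3.0), 1) for _ in range(5)]
toy_data = majority + minority

print(f"OOB error on toy data: {oob_error(toy_data):.3f}")
```

Because every point is scored only by models that never saw it during training, the OOB error serves as a built-in validation estimate, which is presumably why it is attractive as a tuning signal.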
Findings
Experimental results on the KDD data set show that the proposed algorithm gives improved performance. For the referred intrusion detection data set, IRFCETF achieves a classification accuracy of 99.57% and an error rate of 0.43%.
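The two reported figures are complementary (100% − 99.57% = 0.43%). The sketch below shows how accuracy and error rate follow from confusion-matrix counts, and why minority-class recall is worth reporting alongside them on skewed data. The counts are invented for illustration and are not results from the study; the function names are hypothetical.

```python
def accuracy_and_error(tp, tn, fp, fn):
    """Return (accuracy, error rate) from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    return accuracy, 1.0 - accuracy


def minority_recall(tp, fn):
    """Fraction of true minority-class (attack) cases that were detected."""
    return tp / (tp + fn)


# Hypothetical counts for a skewed data set: 50 attacks among 10,000 flows.
tp, fn = 40, 10      # attacks caught / attacks missed
tn, fp = 9900, 50    # normal flows passed / normal flows flagged

acc, err = accuracy_and_error(tp, tn, fp, fn)
print(f"accuracy = {acc:.2%}, error rate = {err:.2%}")
print(f"minority-class recall = {minority_recall(tp, fn):.2%}")
```

With these made-up counts, overall accuracy is high even though one in five attacks is missed, which is the usual caution when reading accuracy on imbalanced data.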
Research limitations/implications
This research can be extended to improve classification techniques with multiple correspondence analysis (MCA); hierarchical MCA could be explored together with classification models for a wide range of skewed data sets.
Practical implications
The improvement in metrics is measurable and helps in dealing with imbalanced applications related to intrusion detection systems in current domains such as security, AI and IIoT digitization. Analytical results show that the proposed approach achieves better metrics than other traditional machine learning algorithms. The results with IRFCETF thus support the claim that the error-tuning parameter has a measurable impact on classification accuracy.
Social implications
The proposed algorithm is useful in numerous IIoT applications, such as health care and machinery automation.
Originality/value
This research presents IRFCETF, an approach for enhancing classification metrics. The proposed method yields a test-set classification for each case, together with an error-reduction mechanism.
Mitali Desai, Rupa G. Mehta and Dipti P. Rana
Abstract
Purpose
Scholarly communications, particularly the questions and answers (Q&A) present on digital scholarly platforms, provide a new avenue for gaining knowledge. However, several studies have raised concerns about content anomalies in these Q&A and have suggested proper validation before using them in scholarly applications such as influence analysis and content-based recommendation systems. This research refers to the content anomalies as disinformation. Its purpose is, first, to assess scholarly communications in order to identify disinformation and, second, to help scholarly platforms determine which scholars probably disseminate such disinformation. These scholars are referred to as the probable sources of disinformation.
Design/methodology/approach
To identify disinformation, the proposed model deduces (1) content redundancy and contextual redundancy in questions, (2) contextual nonrelevance in answers with respect to the questions and (3) the quality of answers with respect to the expertise of the answering scholars. The model then determines the probable sources of disinformation using statistical analysis.
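As a rough illustration of step (1), content redundancy between questions can be sketched as a pairwise similarity check. This is not the authors' model: it swaps in plain token-count cosine similarity for their word-embedding-based contextual analysis, and the function names, the sample questions and the 0.8 threshold are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between the token-count vectors of two texts."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0


def redundant_pairs(questions, threshold=0.8):
    """Return index pairs (i, j), i < j, whose similarity meets the threshold."""
    return [
        (i, j)
        for i in range(len(questions))
        for j in range(i + 1, len(questions))
        if cosine(questions[i], questions[j]) >= threshold
    ]


# Hypothetical questions, invented for illustration.
sample_questions = [
    "How do I handle imbalanced data?",
    "How do I handle imbalanced data ?",
    "What is a word embedding?",
]
print(redundant_pairs(sample_questions))
```

Token counts capture only surface overlap. Detecting the contextual redundancy the abstract describes would need embeddings that place paraphrases close together, which this sketch does not attempt.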
Findings
The model is evaluated on ResearchGate (RG) data. Results suggest that the model efficiently identifies disinformation in scholarly communications and accurately detects the probable sources of disinformation.
Practical implications
Different platforms with communication portals can use this model as a regulatory mechanism to restrict the propagation of disinformation. Scholarly platforms can use it to build an accurate influence assessment mechanism and to generate relevant recommendations for their scholars.
Originality/value
Existing studies mainly deal with validating answers using statistical measures. The proposed model considers questions as well as answers and performs a contextual analysis using an advanced word embedding technique.