Search results
1 – 2 of 2

Ashiqur Rahman, Ehsan Mohammadi and Hamed Alhoori
Abstract
Purpose
With its remarkable capability to reach the public instantly, social media has become integral to sharing scholarly articles and to measuring public response to them. Spamming by bots on social media can steer the conversation and present a false public interest in a given piece of research, ultimately affecting policies that shape people's lives in the real world. This topic therefore warrants critical study and attention.
Design/methodology/approach
We used the Altmetric dataset in combination with data collected through the Twitter Application Programming Interface (API) and the Botometer API. We merged these sources into an extensive dataset containing academic articles, several features describing each article and a label indicating whether the article had excessive bot activity on Twitter. We analyzed the data to assess how the likelihood of bot activity varies with different characteristics of an article, and we trained machine-learning models on this dataset to identify possible bot activity for any given article.
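The pipeline described above — labeling articles by their level of bot activity and training a classifier on article-level features — can be sketched as follows. This is a minimal illustration only: the feature names, the bot-score threshold and the synthetic data are assumptions for demonstration, not details taken from the study.

```python
# Illustrative sketch: label articles as having "excessive bot activity"
# and train a classifier on per-article features. All feature names and
# the 0.6 bot-score threshold are hypothetical, not from the paper.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(42)
n_articles = 500

# Hypothetical per-article features: tweet count, mean Botometer score
# of tweeting accounts, article age in days.
X = np.column_stack([
    rng.poisson(30, n_articles),
    rng.uniform(0.0, 1.0, n_articles),
    rng.integers(1, 365, n_articles),
]).astype(float)

# Label: excessive bot activity if the mean bot score exceeds a threshold.
y = (X[:, 1] > 0.6).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
acc = accuracy_score(y_test, clf.predict(X_test))
print(f"held-out accuracy: {acc:.2f}")
```

In practice the features would come from joining Altmetric records with Twitter API responses and Botometer scores keyed on the article's identifier, rather than from synthetic draws as here.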
Findings
Our machine-learning models identified possible bot activity in academic articles with an accuracy of 0.70. We also found that articles related to “Health and Human Science” are more prone to bot activity than articles from other research areas. Without judging whether the bot activity is malicious, our work presents a tool to identify the presence of bot activity in the dissemination of an academic article and establishes a baseline for future research in this direction.
Research limitations/implications
We considered only the features available from the Altmetric dataset. Extracting additional features, such as the authors' demographics, the location of the publication and the extent of international collaboration, and examining how they relate to bot activity would be an exciting direction for future research.
Practical implications
Since public interest in scientific findings can shape the decisions of policymakers, it is essential to identify possible bot activity in the dissemination of any given scholarly article. Without judging whether social bots are good or bad, and without assessing the validity of a scholarly article, our work proposes a tool for interpreting public interest in an article by identifying the likelihood of bot activity directed toward it. We publish the models and data generated through the study, providing a benchmark and guidelines for future work in this direction.
Originality/value
While the majority of existing research focuses on identifying and preventing bot activity on social media, our work is novel in predicting the possibility of bot activity in the dissemination of an academic article using its Altmetric metadata. Little work has been done in this specific area, and the models developed from our research give policymakers and the public a tool to interpret public interest in a scientific publication with appropriate caution.
Hossein Dehdarirad, Javad Ghazimirsaeid and Ammar Jalalimanesh
Abstract
Purpose
The purpose of this investigation is to identify, evaluate, integrate and summarize relevant and qualified papers by conducting a systematic literature review (SLR) on the application of recommender systems (RSs) to suggest a scholarly publication venue for a researcher's paper.
Design/methodology/approach
To identify the relevant papers published up to August 11, 2018, an SLR was conducted on four databases (Scopus, Web of Science, IEEE Xplore and ScienceDirect). We followed the guidelines presented by Kitchenham and Charters (2007) for performing SLRs in software engineering. The papers were analyzed based on data sources, RS classes, techniques/methods/algorithms, datasets, evaluation methodologies and metrics, as well as future directions.
Findings
A total of 32 papers were identified. The data sources most commonly exploited in these papers were textual data (title/abstract/keywords) and co-authorship data. The RS classes in the selected papers were used almost equally. DBLP was the main dataset utilized. Cosine similarity, social network analysis (SNA) and the term frequency–inverse document frequency (TF–IDF) algorithm were used frequently. In terms of evaluation methodology, 24 papers applied only offline evaluations. Furthermore, precision, accuracy and recall were the most popular performance metrics. In the reviewed papers, “use more datasets” and “new algorithms” were frequently mentioned in the future work sections as well as the conclusions.
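The TF–IDF and cosine-similarity techniques that recur in the reviewed papers can be illustrated with a small venue-recommendation sketch. The venue names and the toy corpus below are invented for demonstration; real systems in this literature would index full abstracts from datasets such as DBLP.

```python
# Toy content-based venue recommendation: represent each venue by a
# representative text, then rank venues by TF-IDF cosine similarity
# to the paper being submitted. Venue names and texts are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

venue_docs = {
    "Scientometrics": "citation analysis bibliometrics impact indicators",
    "JASIST": "information retrieval relevance ranking search behavior",
    "Neural Networks": "deep learning backpropagation convolutional models",
}

paper = "bibliometric study of citation impact indicators"

vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform(list(venue_docs.values()) + [paper])

# Similarity of the paper (last row) to each venue document.
scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
best_venue = list(venue_docs)[scores.argmax()]
print(best_venue)  # the venue whose vocabulary overlaps the paper most
```

This is the simplest content-based variant; many of the reviewed papers combine such textual similarity with co-authorship signals via social network analysis.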
Originality/value
Given that a review study has not previously been conducted in this area, this paper provides insight into the current status of the field and may also contribute to future research.