Florian Fahrenbach, Kate Revoredo and Flavia Maria Santoro
Abstract
Purpose
This paper aims to introduce an information and communication technology (ICT) artifact that uses text mining to support the innovative and standardized assessment of professional competences within the validation of prior learning (VPL). Assessment means comparing identified and documented professional competences against a standard or reference point. The designed artifact is evaluated by matching a set of curricula vitae (CVs) scraped from LinkedIn against a comprehensive model of professional competence.
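As an illustration only (the abstract does not disclose the artifact's exact matching pipeline), comparing a CV against a competence model via text mining could be sketched along the following lines; the competence descriptions and the CV snippet are hypothetical.

```python
# Illustrative sketch: hypothetical competence descriptions and CV text,
# scored with TF-IDF cosine similarity (not necessarily the authors' method).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

competences = {
    "domain competence": "applies specialised domain knowledge and methods to solve problems",
    "social competence": "communicates, cooperates and resolves conflicts within teams",
    "self competence": "plans own work, reflects on results and learns independently",
}
cv_text = "Led a small team, coordinated stakeholders and resolved scheduling conflicts."

vectorizer = TfidfVectorizer(stop_words="english")
matrix = vectorizer.fit_transform(list(competences.values()) + [cv_text])

# Similarity of the CV against each competence description
scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
for name, score in zip(competences, scores):
    print(f"{name}: {score:.2f}")
```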
Design/methodology/approach
A design science approach informed the development and evaluation of the ICT artifact presented in this paper.
Findings
A proof of concept shows that the ICT artifact can support assessors within the validation of prior learning procedure. Rather than replacing assessors, the output of such an ICT artifact can be used to structure documentation in the validation process.
Research limitations/implications
Evaluating the artifact shows that ICT support to assess documented learning outcomes is a promising endeavor but remains a challenge. Further research should work on standardized ways to document professional competences, on ICT artifacts that capture the semantic content of documents, and on refining ontologies of theoretical models of professional competences.
Practical implications
Text mining methods to assess professional competences rely on large bodies of textual data, and thus a thoroughly built and large portfolio is necessary as input for this ICT artifact.
Originality/value
Following the recent call of European policymakers to develop standardized and ICT-based approaches for the assessment of professional competences, an ICT artifact that supports the automated assessment of professional competences within the validation of prior learning is designed and evaluated.
Daniel Šandor and Marina Bagić Babac
Abstract
Purpose
Sarcasm is a linguistic expression that usually carries the opposite meaning of what is being said by words, thus making it difficult for machines to discover the actual meaning. It is mainly distinguished by the inflection with which it is spoken, with an undercurrent of irony, and is largely dependent on context, which makes it a difficult task for computational analysis. Moreover, sarcasm expresses negative sentiments using positive words, allowing it to easily confuse sentiment analysis models. This paper aims to demonstrate the task of sarcasm detection using machine learning and deep learning approaches.
Design/methodology/approach
For the purpose of sarcasm detection, machine and deep learning models were used on a data set consisting of 1.3 million social media comments, including both sarcastic and non-sarcastic comments. The data set was pre-processed using natural language processing methods, and additional features were extracted and analysed. Several machine learning models, including logistic regression, ridge regression, linear support vector and support vector machines, along with two deep learning models based on bidirectional long short-term memory and one bidirectional encoder representations from transformers (BERT)-based model, were implemented, evaluated and compared.
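For orientation, a minimal machine learning baseline of the kind listed above (TF-IDF features fed into logistic regression) might look as follows; the toy comments and labels are invented and do not come from the 1.3 million-comment data set.

```python
# Minimal sarcasm-detection baseline sketch with invented examples
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

comments = [
    "Oh great, another Monday. Exactly what I needed.",
    "Thanks for the clear explanation, it really helped.",
    "Sure, because waiting in line for hours is my favourite hobby.",
    "The new update fixed the crash on my phone.",
]
labels = [1, 0, 1, 0]  # 1 = sarcastic, 0 = non-sarcastic

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      LogisticRegression(max_iter=1000))
model.fit(comments, labels)
print(model.predict(["Wow, what a surprise, the train is late again."]))
```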
Findings
The performance of machine and deep learning models was compared in the task of sarcasm detection, and possible ways of improvement were discussed. Deep learning models showed more promise, performance-wise, for this type of task. Specifically, a state-of-the-art model in natural language processing, namely the BERT-based model, outperformed the other machine and deep learning models.
Originality/value
This study compared the performance of the various machine and deep learning models in the task of sarcasm detection using the data set of 1.3 million comments from social media.
Matjaž Kragelj and Mirjana Kljajić Borštnar
Abstract
Purpose
The purpose of this study is to develop a model for automated classification of old digitised texts to the Universal Decimal Classification (UDC), using machine-learning methods.
Design/methodology/approach
The general research approach is inherent to design science research, in which the problem of UDC assignment of the old, digitised texts is addressed by developing a machine-learning classification model. A corpus of 70,000 scholarly texts, fully bibliographically processed by librarians, was used to train and test the model, which was used for classification of old texts on a corpus of 200,000 items. Human experts evaluated the performance of the model.
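A simplified sketch of how such a model could recommend top-level UDC classes for a new text is shown below; the titles, labels and the choice of a linear SVM are illustrative assumptions rather than the authors' exact pipeline.

```python
# Toy multi-class UDC recommendation sketch (invented titles and labels)
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

texts = [
    "On the solvability of linear differential equations",
    "Field observations of alpine plant communities",
    "A grammar of the early Slovene literary language",
    "Numerical integration of ordinary differential equations",
    "Vegetation succession after glacier retreat",
    "Historical phonology of South Slavic dialects",
]
udc = ["51", "58", "81", "51", "58", "81"]  # 51 mathematics, 58 botany, 81 linguistics

model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(texts, udc)

# Recommend the two most plausible UDC classes for an unseen title
scores = model.decision_function(["Asymptotic behaviour of nonlinear differential systems"])[0]
print(model.classes_[np.argsort(scores)[::-1][:2]])
```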
Findings
Results suggest that machine-learning models can correctly assign the UDC at some level for almost any scholarly text. Furthermore, the model can be recommended for the UDC assignment of older texts. Ten librarians corroborated this on 150 randomly selected texts.
Research limitations/implications
The main limitations of this study were the unavailability of labelled older texts and the limited availability of librarians.
Practical implications
The classification model can provide a recommendation to the librarians during their classification work; furthermore, it can be implemented as an add-on to full-text search in the library databases.
Social implications
The proposed methodology supports librarians by recommending UDC classes, thus saving time in their daily work. By automatically classifying older texts, digital libraries can provide a better user experience by enabling structured searches. These improvements contribute to making knowledge more widely available and usable.
Originality/value
These findings contribute to the field of automated classification of bibliographical information using full texts, especially in cases in which the texts are old and unstructured and use archaic language and vocabulary.
Friso van Dijk, Joost Gadellaa, Chaïm van Toledo, Marco Spruit, Sjaak Brinkkemper and Matthieu Brinkhuis
Abstract
Purpose
This paper argues that privacy research is divided into distinct communities and rarely considered as a singular field, harming its disciplinary identity. The authors collected 119,810 publications and over 3 million references to perform a bibliometric domain analysis as a quantitative approach to uncover the structures within the privacy research field.
Design/methodology/approach
The bibliometric domain analysis consists of a combined directed network and topic model of published privacy research. The network contains 83,159 publications and 462,633 internal references. A latent Dirichlet allocation (LDA) topic model built from the same dataset offers an additional lens on structure by classifying each publication into 36 topics, which are used alongside the network data. The combined outcomes of these methods are used to investigate the structural position and topical make-up of the privacy research communities.
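A minimal sketch of the topic-modelling half of this design, assuming plain abstracts as input; the documents below are invented, and the real model used 36 topics rather than three.

```python
# Tiny LDA sketch on invented privacy-related abstracts
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "differential privacy noise addition for statistical databases",
    "gdpr compliance and the right to be forgotten in european law",
    "k-anonymity and l-diversity for privacy preserving data publishing",
    "location privacy obfuscation for mobile users",
    "patient consent and privacy in electronic health records",
]

vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(docs)

lda = LatentDirichletAllocation(n_components=3, random_state=0)  # 36 in the study
doc_topics = lda.fit_transform(X)  # per-document topic mixtures

terms = vec.get_feature_names_out()
for k, comp in enumerate(lda.components_):
    top_terms = [terms[i] for i in comp.argsort()[::-1][:4]]
    print(f"topic {k}: {top_terms}")
```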
Findings
The authors identified the research communities as well as categorised their structural positioning. Four communities form the core of privacy research: individual privacy and law, cloud computing, location data and privacy-preserving data publishing. The latter is a macro-community of data mining, anonymity metrics and differential privacy. Surrounding the core are applied communities. Further removed are communities with little influence, most notably the medical communities that make up 14.4% of the network. The topic model shows system design as a potentially latent community. Noteworthy is the absence of a centralised body of knowledge on organisational privacy management.
Originality/value
This is the first in-depth, quantitative mapping study of all privacy research.
Linzi Wang, Qiudan Li, Jingjun David Xu and Minjie Yuan
Abstract
Purpose
Mining user-concerned actionable and interpretable hot topics will help management departments fully grasp the latest events and make timely decisions. Existing topic models primarily integrate word embedding and matrix decomposition, which only generates keyword-based hot topics with weak interpretability, making it difficult to meet the specific needs of users. Mining phrase-based hot topics with syntactic dependency structure has been proven to model structure information effectively. A key challenge lies in the effective integration of the above information into the hot topic mining process.
Design/methodology/approach
This paper proposes the nonnegative matrix factorization (NMF)-based hot topic mining method, semantics syntax-assisted hot topic model (SSAHM), which combines semantic association and syntactic dependency structure. First, a semantic–syntactic component association matrix is constructed. Then, the matrix is used as a constraint condition to be incorporated into the block coordinate descent (BCD)-based matrix decomposition process. Finally, a hot topic information-driven phrase extraction algorithm is applied to describe hot topics.
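For intuition only, a plain NMF topic decomposition is sketched below; it omits the semantic-syntactic association constraint that defines SSAHM, and scikit-learn's default coordinate-descent solver is only a simple relative of the BCD scheme described above.

```python
# Unconstrained NMF topic sketch on invented headlines
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import NMF

docs = [
    "city launches new subway line to ease rush hour congestion",
    "subway extension opens with crowded platforms at rush hour",
    "heavy rain floods downtown streets and delays traffic",
    "flood warning issued as rain continues across the city",
]

vec = TfidfVectorizer(stop_words="english")
X = vec.fit_transform(docs)

nmf = NMF(n_components=2, init="nndsvd", random_state=0)
W = nmf.fit_transform(X)  # document-topic weights
H = nmf.components_       # topic-term weights

terms = vec.get_feature_names_out()
for k, row in enumerate(H):
    print(f"topic {k}:", [terms[i] for i in row.argsort()[::-1][:4]])
```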
Findings
The efficacy of the developed model is demonstrated on two real-world datasets, and the effects of dependency structure information on different topics are compared. The qualitative examples further explain the application of the method in real scenarios.
Originality/value
Most prior research focuses on keyword-based hot topics. Thus, the literature is advanced by mining phrase-based hot topics with syntactic dependency structure, which can effectively analyze the semantics. The development of syntactic dependency structure considering the combination of word order and part-of-speech (POS) is a step forward, as word order and POS are only separately utilized in the prior literature. Ignoring this synergy may miss important information, such as grammatical structure coherence and logical relations between syntactic components.
Paramita Ray and Amlan Chakrabarti
Abstract
Social networks have changed communication patterns significantly. Information available from different social networking sites can be well utilized for the analysis of users' opinions. Hence, organizations would benefit from the development of a platform that can analyze public sentiment in social media about their products and services to provide value addition in their business process. Over the last few years, deep learning has become very popular in areas such as image classification and speech recognition. However, research on the use of deep learning methods in sentiment analysis is limited. It has been observed that in some cases the existing machine learning methods for sentiment analysis fail to extract some implicit aspects and might not be very useful. Therefore, we propose a deep learning approach for aspect extraction from text and analysis of users' sentiment corresponding to the aspect. A seven-layer deep convolutional neural network (CNN) is used to tag each aspect in the opinionated sentences. We have combined the deep learning approach with a set of rule-based approaches to improve the performance of the aspect extraction method as well as the sentiment scoring method. We have also tried to improve the existing rule-based approach of aspect extraction through aspect categorization with a predefined set of aspect categories using a clustering method, and compared our proposed method with some of the state-of-the-art methods. It has been observed that the overall accuracy of our proposed method is 0.87, while that of the other state-of-the-art methods, such as the modified rule-based method and CNN, is 0.75 and 0.80, respectively. The overall accuracy of our proposed method thus shows an improvement of 7–12% over the state-of-the-art methods.
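A much smaller per-token tagging CNN than the seven-layer network described above could be sketched as follows; the vocabulary size, tag set and random training batch are placeholders, not the authors' configuration.

```python
# Toy per-token aspect tagging CNN (placeholder data and hyperparameters)
import numpy as np
import tensorflow as tf

VOCAB, MAXLEN, N_TAGS = 5000, 30, 3  # tags: O, B-ASPECT, I-ASPECT (illustrative)

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB, 64),
    tf.keras.layers.Conv1D(128, 3, padding="same", activation="relu"),
    tf.keras.layers.Conv1D(128, 3, padding="same", activation="relu"),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(N_TAGS, activation="softmax")),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Dummy batch: token ids and one tag id per token
x = np.random.randint(0, VOCAB, size=(8, MAXLEN))
y = np.random.randint(0, N_TAGS, size=(8, MAXLEN))
model.fit(x, y, epochs=1, verbose=0)
print(model.predict(x[:1]).shape)  # (1, MAXLEN, N_TAGS)
```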
Milja Niinihuhta, Anja Terkamo-Moisio, Tarja Kvist and Arja Häggman-Laitila
Abstract
Purpose
This study aims to describe nurse leaders’ experiences of work-related well-being and its association with background variables, working conditions, work engagement, sense of coherence and burnout.
Design/methodology/approach
An electronic survey design was used. Data was collected between December 2015 and May 2016 with an instrument that included demographic questions and four internationally validated scales: the Utrecht Work Engagement Scale, QPS Nordic 34+, the shortened Sense of Coherence scale and the Maslach Burnout Inventory. Data was analysed using statistical methods.
Findings
A total of 155 nurse leaders completed the questionnaire, giving a 44% response rate. Most of them worked as nurse managers (89%). Participants’ work-related well-being scores ranged from 8 to 10. Statistically significant relationships were found between participants’ work-related well-being and their leadership skills, current position, sense of coherence and levels of burnout. In addition, there were statistically significant relationships between work-related well-being and all dimensions of working conditions.
Originality/value
This study underlines the fact that work-related well-being should not be evaluated based on a single factor. The participants’ perceived work-related well-being was high, although almost half of them reported always or often experiencing stress. The results suggest that nurse leaders may have resources such as good leadership and problem-solving skills, supportive working conditions and a high sense of coherence that prevent the experienced stress from adversely affecting their work-related well-being.
Lijuan Shi, Zuoning Jia, Huize Sun, Mingshu Tian and Liquan Chen
Abstract
Purpose
This paper aims to study the factors affecting bird nesting on electrified railway catenary lines and the impact of bird-nesting events on railway operation.
Design/methodology/approach
First, with one year of bird-nesting event records collected from the Shanghai Railway Bureau in the form of unstructured natural language, the records were structured with the help of a Python software tool. Second, the method of root cause analysis (RCA) was used to identify all the possible influencing factors that may affect the probability of bird nesting. Third, the possible factors were then classified into two categories for separate subsequent analysis: outside factors (i.e. geography-related factors) and inside factors (i.e. railway-related factors).
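A hypothetical example of the structuring step; the actual layout of the Shanghai Railway Bureau records is not shown in the abstract, so the field format below is invented.

```python
# Invented record layout: extract structured fields from a free-text nest report
import re

record = "2019-05-12 08:32, Nanjing South section, bird nest found on catenary support arm, delay 6 min"
pattern = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}), "
    r"(?P<location>[^,]+), (?P<event>[^,]+), delay (?P<delay>\d+) min"
)
match = pattern.match(record)
print(match.groupdict())
```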
Findings
It was observed that city population and geographic position noticeably affect nesting. It was also demonstrated that neither location nor the equipment part on which nests are built correlates with delay, while railway type had a significant but low correlation with delay.
Originality/value
This paper discloses how bird-nesting events impact railway operation.
Abstract
Purpose
The deformation of the roadbed is easily influenced by the external environment. This study aims to improve the accuracy of high-speed railway subgrade settlement prediction.
Design/methodology/approach
A high-speed railway subgrade settlement interval prediction method is proposed that uses the secretary bird optimization algorithm (SBOA) to optimize a backpropagation (BP) neural network, with influencing factors screened by gray relational analysis.
Findings
The SBOA is used to optimize the BP neural network to obtain the optimal weights and thresholds, which are combined into the prediction model with the best parameters. Data were collected from sensors deployed through the subgrade settlement monitoring system; gray relational analysis verified that all four influencing factors have a strong correlation with subgrade settlement, and the collected data were used to validate the model.
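A hand-rolled gray relational analysis sketch with invented monitoring series; the paper's four influencing factors and settlement data are not reproduced here.

```python
# Gray relational grades of candidate factors against a settlement series (invented data)
import numpy as np

settlement = np.array([1.2, 1.5, 1.9, 2.4, 2.8])   # reference series, mm
factors = np.array([
    [12.0, 13.5, 15.0, 17.0, 18.5],                 # e.g. fill height
    [30.0, 32.0, 31.0, 35.0, 36.0],                 # e.g. ground temperature
])

def grey_relational_grades(ref, series, rho=0.5):
    # Normalise each series to [0, 1] before comparing against the reference
    norm = lambda x: (x - x.min()) / (x.max() - x.min())
    deltas = np.abs(norm(ref) - np.array([norm(s) for s in series]))
    dmin, dmax = deltas.min(), deltas.max()
    coeffs = (dmin + rho * dmax) / (deltas + rho * dmax)
    return coeffs.mean(axis=1)  # one grade per candidate factor

print(grey_relational_grades(settlement, factors))
```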
Originality/value
The experimental results show that the SBOA-BP model achieves higher prediction accuracy than the BP model and produces a wider range of prediction intervals at a given confidence level, which provides greater guiding value for practical engineering applications.