Search results
Daniel Šandor and Marina Bagić Babac
Abstract
Purpose
Sarcasm is a linguistic expression that usually carries the opposite meaning of what is being said by words, thus making it difficult for machines to discover the actual meaning. It is mainly distinguished by the inflection with which it is spoken, with an undercurrent of irony, and is largely dependent on context, which makes it a difficult task for computational analysis. Moreover, sarcasm expresses negative sentiments using positive words, allowing it to easily confuse sentiment analysis models. This paper aims to demonstrate the task of sarcasm detection using the approach of machine and deep learning.
Design/methodology/approach
For the purpose of sarcasm detection, machine and deep learning models were used on a data set consisting of 1.3 million social media comments, including both sarcastic and non-sarcastic comments. The data set was pre-processed using natural language processing methods, and additional features were extracted and analysed. Several machine learning models, including logistic regression, ridge regression, linear support vector and support vector machines, along with two deep learning models based on bidirectional long short-term memory and one bidirectional encoder representations from transformers (BERT)-based model, were implemented, evaluated and compared.
Findings
The performance of machine and deep learning models was compared in the task of sarcasm detection, and possible ways of improvement were discussed. Deep learning models showed more promise, performance-wise, for this type of task. Specifically, a state-of-the-art model in natural language processing, namely, BERT-based model, outperformed other machine and deep learning models.
Originality/value
This study compared the performance of the various machine and deep learning models in the task of sarcasm detection using the data set of 1.3 million comments from social media.
Karlo Puh and Marina Bagić Babac
Abstract
Purpose
As the tourism industry becomes more vital for the success of many economies around the world, the importance of technology in tourism grows daily. Alongside increasing tourism importance and popularity, the amount of significant data grows, too. On a daily basis, millions of people write their opinions, suggestions and views about accommodation, services, and much more on various websites. Well-processed and filtered data can provide a lot of useful information that can be used for making tourists' experiences much better and help us decide when selecting a hotel or a restaurant. Thus, the purpose of this study is to explore machine and deep learning models for predicting sentiment and rating from tourist reviews.
Design/methodology/approach
This paper used machine learning models such as Naïve Bayes, support vector machines (SVM), convolutional neural network (CNN), long short-term memory (LSTM) and bidirectional long short-term memory (BiLSTM) for extracting sentiment and ratings from tourist reviews. These models were trained to classify reviews into positive, negative, or neutral sentiment, and into one to five grades or stars. Data used for training the models were gathered from TripAdvisor, the world's largest travel platform. The models based on multinomial Naïve Bayes (MNB) and SVM were trained using the term frequency-inverse document frequency (TF-IDF) for word representations while deep learning models were trained using global vectors (GloVe) for word representation. The results from testing these models are presented, compared and discussed.
Findings
The machine and deep learning models achieved high accuracy in predicting positive, negative, or neutral sentiments and ratings from tourist reviews. The optimal model architecture for both classification tasks was a deep learning model based on BiLSTM. The study’s results confirmed that deep learning models are more efficient and accurate than machine learning algorithms.
Practical implications
The proposed models allow for forecasting the number of tourist arrivals and expenditure, gaining insights into the tourists' profiles, improving overall customer experience, and upgrading marketing strategies. Different service sectors can use the implemented models to get insights into customer satisfaction with the products and services as well as to predict the opinions given a particular context.
Originality/value
This study developed and compared different machine learning models for classifying customer reviews as positive, negative, or neutral, as well as predicting ratings with one to five stars based on a TripAdvisor hotel reviews dataset that contains 20,491 unique hotel reviews.
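The TF-IDF word representation mentioned in this abstract can be illustrated with a minimal sketch. It uses the plain, unsmoothed textbook definition (term frequency times the log of inverse document frequency). The function name `tf_idf` and the three toy reviews are hypothetical, not taken from the paper, and library implementations often use smoothed variants that give different numbers.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document (unsmoothed TF-IDF)."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy reviews, whitespace-tokenized for illustration only.
reviews = [
    "great hotel great staff".split(),
    "terrible hotel".split(),
    "great location".split(),
]
vectors = tf_idf(reviews)
```

Terms that appear in few reviews (such as "terrible") receive a higher weight than terms shared across reviews (such as "hotel"), which is what makes the representation useful as classifier input.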
Marina Bagić Babac and Vedran Podobnik
Abstract
Purpose
Due to an immense rise of social media in recent years, the purpose of this paper is to investigate who participates in creating content on football websites, and how and why they do so. Specifically, it provides a sentiment analysis of user comments from a gender perspective, i.e. how differently men and women write about football. The analysis is based on user comments published on Facebook pages of the top five 2015-2016 Premier League football clubs during the 1st and the 19th week of the season.
Design/methodology/approach
This analysis uses data collected via a social media website and a sentiment analysis of the collected data.
Findings
Results show certain unexpected similarities in social media activities between male and female football fans. A comparison of the user comments from Facebook pages of the top five 2015-2016 Premier League football clubs revealed that men and women similarly express hard emotions such as anger or fear, while there is a significant difference in expressing soft emotions such as joy or sadness.
Originality/value
This paper provides an original insight into qualitative content analysis of male and female comments published at social media websites of the top five Premier League football clubs during the 1st and the 19th week of the 2015-2016 season.
Josip Gegač, Nikola Greb and Marina Bagić Babac
Abstract
Purpose
The purpose of this paper is to explore the Values in Action (VIA) classification of human strengths and virtues by using unsupervised machine learning techniques, specifically topic modeling algorithms, on a sample of X (formerly known as Twitter) posts. This study aims to investigate if and to what extent the structure of posts with the highest positive sentiment, as determined by topic modeling algorithms, aligns with the structure of the VIA classification.
Design/methodology/approach
This study uses a sample of X posts as the data set for the analysis. Unsupervised machine learning techniques, specifically topic modeling algorithms, are used to extract and categorize topics from X posts. The sentiment analysis algorithm is used to identify posts with the most positive sentiment. The structure and representation of these positive sentiment posts are then compared with the structure of the VIA classification.
Findings
The results of this study reveal a correlation between the structure of posts with the highest positive sentiment, as determined by topic modeling algorithms, and the structure of the VIA classification. This indicates that the topic structures derived from the X posts exhibit similarities to the categorization of character strengths proposed by the VIA classification. The findings of this study provide empirical validation for the VIA classification framework when applied to social media data.
Originality/value
This paper contributes to the literature by using unsupervised machine learning techniques to validate the VIA classification on social media data. The use of these innovative methods adds a novel dimension to the research on character strengths and virtues.
Abstract
Purpose
Social media platforms are highly visible, so politicians try to maximize the benefits of using them, especially during election campaigns. On the other hand, people express their views and sentiments toward politicians and political issues on social media, which makes their online political behavior observable. Therefore, this study aims to investigate user reactions on social media during the 2016 US presidential campaign to determine which candidate invoked stronger emotions on social media.
Design/methodology/approach
To test the proposed hypotheses regarding emotional reactions to social media content during the 2016 presidential campaign, regression analysis was used to analyze a data set that consists of Trump’s 996 posts and Clinton’s 1,253 posts on Facebook. The proposed regression models are based on viral (likes, shares, comments) and emotional Facebook reactions (Angry, Haha, Sad, Surprise, Wow) as well as Russell’s valence, arousal, dominance (VAD) circumplex model.
Findings
The results of regression analysis indicate how Facebook users felt about both presidential candidates. For Clinton’s page, both positive and negative content are equally liked, while Trump’s followers prefer funny and positive emotions. For both candidates, positive and negative content influences the number of comments. Trump’s followers mostly share positive content and the content that makes them angry, while Clinton’s followers share any content that does not make them angry. Based on VAD analysis, less dominant content, with high arousal and more positive emotions, is more liked on Trump’s page, where valence is a significant predictor for commenting and sharing. More positive content is more liked on Clinton’s page, where both positive and negative emotions with low arousal are correlated to commenting and sharing of posts.
Originality/value
Building on an empirical data set from Facebook, this study shows how differently the presidential candidates communicated on social media during the 2016 election campaign. According to the findings, Trump used a hard campaign strategy, while Clinton used a soft strategy.
Mateo Hitl, Nikola Greb and Marina Bagić Babac
Abstract
Purpose
The purpose of this study is to investigate how expressing gratitude and forgiveness on social media platforms relates to the overall sentiment of users, aiming to understand the impact of these expressions on social media interactions and individual well-being.
Design/methodology/approach
The hypothesis posits that users who frequently express gratitude or forgiveness will exhibit more positive sentiment in all posts during the observed period, compared to those who express these emotions less often. To test the hypothesis, sentiment analysis and statistical inference will be used. Additionally, topic modelling algorithms will be used to identify and assess the correlation between expressing gratitude and forgiveness and various topics.
Findings
This research paper explores the relationship between expressing gratitude and forgiveness in X (formerly known as Twitter) posts and the overall sentiment of user posts. The findings suggest correlations between expressing these emotions and the overall tone of social media content. The findings of this study can inform future research on how expressing gratitude and forgiveness can affect online sentiment and communication.
Originality/value
The authors have demonstrated that social media users who frequently express gratitude or forgiveness over an extended period of time exhibit a more positive sentiment compared to those who express these emotions less. Additionally, the authors observed that BERTopic modelling analysis performs better than latent Dirichlet allocation and Top2Vec modelling analyses when analysing short messages from social media. This research, through the application of innovative techniques and the confirmation of previous theoretical findings, paves the way for further studies in the fields of positive psychology and machine learning.
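The hypothesis described in this abstract compares sentiment between users who express gratitude or forgiveness frequently and those who do so less often. The abstract does not say which statistical test was used. As an assumption for illustration, the sketch below computes Welch's t statistic, one common choice for comparing two group means with unequal variances. The function name `welch_t` and the sentiment scores are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Variance of each group mean: sample variance divided by group size.
    se_a = variance(sample_a) / n_a
    se_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical mean sentiment per user; not data from the study.
frequent = [0.6, 0.4, 0.7, 0.5, 0.8]
infrequent = [0.2, 0.3, 0.1, 0.4, 0.0]
t_stat, dof = welch_t(frequent, infrequent)
```

A large positive `t_stat` relative to the t distribution with `dof` degrees of freedom would support the claim that the frequent group has higher average sentiment. Turning the statistic into a p-value requires a t-distribution CDF, which the standard library does not provide.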
Abstract
Purpose
Social media allow for observing different aspects of human behaviour, in particular, those that can be evaluated from explicit user expressions. Based on a data set of posts with user opinions collected from social media, this paper aims to provide insight into how the readers of different news portals react to online content. The focus is on users’ emotions about the content, so the findings of the analysis provide a further understanding of how marketers should structure and deliver communication content such that it promotes positive engagement behaviour.
Design/methodology/approach
More than 5.5 million user comments to posted messages from 15 worldwide popular news portals were collected and analysed, where each post was evaluated based on a set of variables that represent either structural (e.g. embedded in intra- or inter-message structure) or behavioural (e.g. exhibiting a certain behavioural pattern that appeared in response to a posted message) component of expressions. The conclusions are based on a set of regression models and exploratory factor analysis.
Findings
The findings show and theorise the influence of social media content on emotional user engagement. This provides a more comprehensive understanding of the engagement attributed to social media content and, consequently, could be a better predictor of future behaviour.
Originality/value
This paper provides original data analysis of user comments and emotional reactions that appeared on social media news websites in 2018.
Karlo Puh and Marina Bagić Babac
Abstract
Purpose
Predicting stock market prices has always been an interesting topic since it is closely related to making money. Recently, advances in natural language processing (NLP) have opened new perspectives for solving this task. The purpose of this paper is to show a state-of-the-art natural language approach to using language in predicting the stock market.
Design/methodology/approach
In this paper, the conventional statistical models for time-series prediction are implemented as a benchmark. Then, for methodological comparison, various state-of-the-art natural language models ranging from the baseline convolutional and recurrent neural network models to the most advanced transformer-based models are developed, implemented and tested.
Findings
Experimental results show that there is a correlation between the textual information in the news headlines and stock price prediction. The model based on the GRU (gated recurrent unit) cell with one linear layer, which takes pairs of the historical prices and the sentiment score calculated using transformer-based models, achieved the best result.
Originality/value
This study provides an insight into how to use NLP to improve stock price prediction and shows that there is a correlation between news headlines and stock price prediction.
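The best-performing architecture named in this abstract, a GRU cell followed by one linear layer over (historical price, sentiment score) pairs, can be sketched as a forward pass in plain Python. This is an illustrative toy, not the authors' implementation: the class `TinyGRU`, the hidden size, the random untrained weights, the omission of bias terms and the input values are all assumptions.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class TinyGRU:
    """Single GRU cell plus one linear output layer (untrained, biases omitted)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        # Each gate has one weight matrix applied to the concatenated [input, hidden].
        width = input_size + hidden_size
        self.w_update = random_matrix(hidden_size, width, rng)
        self.w_reset = random_matrix(hidden_size, width, rng)
        self.w_cand = random_matrix(hidden_size, width, rng)
        # Linear layer mapping the final hidden state to one predicted value.
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]

    def step(self, x, h):
        z = [sigmoid(a) for a in matvec(self.w_update, x + h)]  # update gate
        r = [sigmoid(a) for a in matvec(self.w_reset, x + h)]   # reset gate
        gated = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a) for a in matvec(self.w_cand, x + gated)]
        # Blend old state and candidate; conventions differ on which side z weights.
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

    def predict(self, sequence):
        h = [0.0] * self.hidden_size
        for price, sentiment in sequence:
            h = self.step([price, sentiment], h)
        return sum(w * hi for w, hi in zip(self.w_out, h))


# Hypothetical (normalized price, sentiment score) pairs, oldest first.
history = [(0.50, 0.1), (0.52, 0.4), (0.51, -0.2)]
model = TinyGRU(input_size=2, hidden_size=4)
prediction = model.predict(history)
```

With random weights the output is meaningless as a forecast. The sketch only shows how each time step combines the price and the sentiment score into a hidden state that the linear layer then reads.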
Antonijo Marijić and Marina Bagić Babac
Abstract
Purpose
Genre classification of songs based on lyrics is a challenging task even for humans; however, state-of-the-art natural language processing has recently offered advanced solutions to this task. The purpose of this study is to advance the understanding and application of natural language processing and deep learning in the domain of music genre classification, while also contributing to the broader themes of global knowledge and communication, and sustainable preservation of cultural heritage.
Design/methodology/approach
The main contribution of this study is the development and evaluation of various machine and deep learning models for song genre classification. Additionally, we investigated the effect of different word embeddings, including Global Vectors for Word Representation (GloVe) and Word2Vec, on the classification performance. The tested models range from benchmarks such as logistic regression, support vector machine and random forest, to more complex neural network architectures and transformer-based models, such as recurrent neural network, long short-term memory, bidirectional long short-term memory and bidirectional encoder representations from transformers (BERT).
Findings
The authors conducted experiments on both English and multilingual data sets for genre classification. The results show that the BERT model achieved the best accuracy on the English data set, whereas cross-lingual language model pretraining based on RoBERTa (XLM-RoBERTa) performed the best on the multilingual data set. This study found that songs in the metal genre were the most accurately labeled, as their text style and topics were the most distinct from other genres. In contrast, songs from the pop and rock genres were more challenging to differentiate. This study also compared the impact of different word embeddings on the classification task and found that models with GloVe word embeddings outperformed those using Word2Vec and a learned embedding layer.
Originality/value
This study presents the implementation, testing and comparison of various machine and deep learning models for genre classification. The results demonstrate that transformer models, including BERT, robustly optimized BERT pretraining approach (RoBERTa), distilled bidirectional encoder representations from transformers (DistilBERT), bidirectional and auto-regressive transformers (BART) and XLM-RoBERTa, outperformed other models.
Marina Bagić Babac and Vedran Podobnik
Abstract
Purpose
Due to the significant rise in the use of social media in recent years, the purpose of this paper is to investigate who participates in creating content on political social networking websites, and how and why they do so, using a content analysis of posts and comments published on Facebook during the 2015 general election campaign in Croatia. It shows the consequences of a transition from traditional to social media campaigns and the effectiveness of social media at activating and moving public opinion during the general election campaign.
Design/methodology/approach
This study uses data collected through a social media website, a classification of data set items by content attributes and a statistical analysis of the classified data.
Findings
Building on an empirical data set from Croatia, this study reveals that different political parties implement different election campaign strategies on social media to influence citizens who, consequently, respond differently to each of them. The results indicate that political messages with positive emotions evoke positive responses from citizens, while neutral content is more likely to invoke negative comments and criticism, and support for the opponent. Another implication of the results is that two-way and tolerant communication of political actors increases citizen engagement, whereas unidirectional communication decreases it.
Originality/value
This paper provides an original insight into qualitative content analysis of posts and user comments published on Facebook during the 2015 general election campaign in Croatia.