Search results | Emerald Insight

Open Access

Article

Publication date: 15 February 2022

Modular framework for similarity-based dataset discovery using external knowledge

Martin Nečaský, Petr Škoda, David Bernhauer, Jakub Klímek and Tomáš Skopal

Downloads

1431

Abstract

Purpose

Semantic retrieval and discovery of datasets published as open data remains a challenging task. The datasets inherently originate in the globally distributed web jungle, lacking the luxury of centralized database administration, database schemes, shared attributes, vocabulary, structure and semantics. The existing dataset catalogs provide basic search functionality relying on keyword search in brief, incomplete or misleading textual metadata attached to the datasets. The search results are thus often insufficient. However, there exist many ways of improving the dataset discovery by employing content-based retrieval, machine learning tools, third-party (external) knowledge bases, countless feature extraction methods and description models and so forth.

Design/methodology/approach

In this paper, the authors propose a modular framework for rapid experimentation with methods for similarity-based dataset discovery. The framework consists of an extensible catalog of components prepared to form custom pipelines for dataset representation and discovery.

Findings

The study proposes several proof-of-concept pipelines including experimental evaluation, which showcase the usage of the framework.

Originality/value

To the best of the authors’ knowledge, there is no similar formal framework for experimentation with various similarity methods in the context of dataset discovery. The framework aims to establish a platform for reproducible and comparable research in the area of dataset discovery. The prototype implementation of the framework is available on GitHub.

Details

Data Technologies and Applications, vol. 56 no. 4

Type: Research Article

ISSN: 2514-9288

Open Access

Article

Publication date: 2 May 2017

Analyzing students online learning behavior in blended courses using Moodle

Rosalina Rebucas Estacio and Rodolfo Callanta Raga Jr

Downloads

51632

Abstract

Purpose

The purpose of this paper is to describe a proposal for a data-driven investigation aimed at determining whether students’ learning behavior can be extracted and visualized from action logs recorded by Moodle. The paper also tried to show whether there is a correlation between the activity level of students in online environments and their academic performance with respect to final grade.

Design/methodology/approach

The analysis was carried out using log data obtained from various courses delivered at a university using a Moodle platform. The study also collected demographic profiles of students and compared them with their activity level in order to analyze how these attributes affect students’ level of activity in the online environment.

Findings

This work has shown that a data mining technique such as the vector space model can be used to aggregate students’ action logs and quantify them as a single numeric value, which can then be used to generate visualizations of students’ level of activity. The current investigation indicates that the correlation between students’ activity level and their final grade varies considerably.
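As a rough illustration of the idea of collapsing action logs into one number, the following minimal sketch counts each student's actions into a vector and takes its Euclidean length. The action names, the sample logs and the choice of the Euclidean norm are assumptions made for illustration; the abstract does not specify the weighting the authors used.

```python
import math
from collections import Counter

# Hypothetical action types; a real Moodle log has many more event names.
ACTIONS = ["view", "submit", "post", "quiz"]

def activity_vector(log):
    """Map a list of logged action names to a count vector over ACTIONS."""
    counts = Counter(log)
    return [counts[a] for a in ACTIONS]

def activity_level(log):
    """Collapse the count vector into one number (Euclidean norm, an assumption)."""
    return math.sqrt(sum(c * c for c in activity_vector(log)))

# Made-up logs for two students.
logs = {
    "s1": ["view", "view", "submit"],
    "s2": ["view"],
}
levels = {student: activity_level(log) for student, log in logs.items()}
```

A per-student value like this could then be plotted or correlated with final grades.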

Practical implications

The value presented in the study can help instructors monitor course progression and enable them to rapidly identify which students are not performing well and adjust their pedagogical strategies accordingly.

Originality/value

A plan to continue the work by developing a complete dashboard style interface that instructors can use is already underway. More data need to be collected and more advanced processing tools are necessary in order to obtain a better perspective on this issue.

Details

Asian Association of Open Universities Journal, vol. 12 no. 1

Type: Research Article

ISSN: 1858-3431

Open Access

Article

Publication date: 9 December 2022

Topic optimization–incorporated collaborative recommendation for social tagging

Xuwei Pan, Xuemei Zeng and Ling Ding

Downloads

657

Abstract

Purpose

With the continuous increase of users, resources and tags, social tagging systems gradually present the characteristics of “big data” such as large volume, fast growth, complexity and unreliable quality, which greatly increase the complexity of recommendation. The tension between the efficiency and effectiveness of recommendation services in social tagging is becoming increasingly prominent. The purpose of this study is to incorporate topic optimization into collaborative filtering to enhance both the effectiveness and the efficiency of personalized recommendations for social tagging.

Design/methodology/approach

Combining the idea of optimization before service, this paper presents an approach that incorporates topic optimization into collaborative recommendations for social tagging. In the proposed approach, the recommendation process is divided into two phases of offline topic optimization and online recommendation service to achieve high-quality and efficient personalized recommendation services. In the offline phase, the tags' topic model is constructed and then used to optimize the latent preference of users and the latent affiliation of resources on topics.

Findings

Experimental evaluation shows that the proposed approach improves both the precision and recall of recommendations and enhances the efficiency of online recommendations compared with the three baseline approaches. The proposed topic optimization–incorporated collaborative recommendation approach can thus improve both the effectiveness and the efficiency of recommendation in social tagging.

Originality/value

With the support of the proposed approach, personalized recommendation in social tagging with high quality and efficiency can be achieved.

Details

Data Technologies and Applications, vol. 58 no. 3

Type: Research Article

ISSN: 2514-9288

Open Access

Article

Publication date: 20 September 2022

Open problems in medical federated learning

Joo Hun Yoo, Hyejun Jeong, Jaehyeok Lee and Tai-Myoung Chung

Downloads

3619

Abstract

Purpose

This study aims to summarize the critical issues in medical federated learning and applicable solutions. Also, detailed explanations of how federated learning techniques can be applied to the medical field are presented. About 80 reference studies described in the field were reviewed, and the federated learning framework currently being developed by the research team is provided. This paper will help researchers to build an actual medical federated learning environment.

Design/methodology/approach

Since machine learning techniques emerged, more efficient analysis was possible with a large amount of data. However, data regulations have been tightened worldwide, and the usage of centralized machine learning methods has become almost infeasible. Federated learning techniques have been introduced as a solution. Even with its powerful structural advantages, there still exist unsolved challenges in federated learning in a real medical data environment. This paper aims to summarize those by category and presents possible solutions.

Findings

This paper provides four critical categorized issues to be aware of when applying the federated learning technique to the actual medical data environment, then provides general guidelines for building a federated learning environment as a solution.

Originality/value

Existing studies have dealt with issues such as heterogeneity problems in the federated learning environment itself, but they did not address how these issues cause problems in actual working tasks. Therefore, this paper helps researchers understand the federated learning issues through examples of actual medical machine learning environments.

Details

International Journal of Web Information Systems, vol. 18 no. 2/3

Type: Research Article

ISSN: 1744-0084

Open Access

Article

Publication date: 18 October 2022

Human resource heterogeneity, hold-up and firm cash holdings

Tingting Huang, Yilin Pan, Kai Zhu and Xinyuan Chen

Downloads

1140

Abstract

Purpose

This paper aims to study the impact of human resource heterogeneity on firms’ cash-holding policies.

Design/methodology/approach

The authors construct a proxy for human resource heterogeneity using the dissimilarity in employees’ skill structure between the firm and its peers in the same industry.

Findings

The authors report evidence that firms with heterogeneous human resources hold more cash than other firms. This effect is more pronounced in labor-intensive firms and firms more susceptible to hold-up by employees, i.e. firms located in regions with more labor disputes and firms surrounded by more external employment opportunities. In addition, the authors demonstrate that high cash holdings triggered by human resource heterogeneity reduce the scale and efficiency of firms’ capital investment.

Originality/value

This study highlights the role of human resource heterogeneity in determining firms’ cash policies. This paper adds to the understanding of labor adjustment costs within the firm and provides insights into firms’ cash-holding decisions.

Details

China Accounting and Finance Review, vol. 25 no. 1

Type: Research Article

ISSN: 1029-807X

Open Access

Article

Publication date: 28 July 2020

A hybrid approach for log signature generation

Prabhat Pokharel, Roshan Pokhrel and Basanta Joshi

Downloads

1253

Abstract

Analysis of log messages is very important for identifying suspicious system and network activity. This analysis requires the correct extraction of variable entities. The variable entities are extracted by comparing the log messages against the log patterns. Each of these log patterns can be represented in the form of a log signature. In this paper, we present a hybrid approach for log signature extraction. The approach consists of two modules. The first module identifies log patterns by generating log clusters. The second module uses Named Entity Recognition (NER) to extract signatures from the extracted log clusters. Experiments were performed on event logs from the Windows operating system, Exchange and Unix, and the results were validated by comparing the signatures and the variable entities against the standard log documentation. The experiments showed that the extracted signatures were ready to be used with a high degree of accuracy.
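To make the notion of a log signature concrete, here is a minimal sketch that derives a signature from a cluster of messages that has already been formed. It is not the paper's method: instead of NER it simply compares tokens position by position and marks any position that varies as a wildcard. The `<*>` placeholder, the function names and the sample messages are all hypothetical.

```python
WILDCARD = "<*>"  # hypothetical placeholder for a variable entity

def signature(cluster):
    """Build a signature from messages assumed to share one log pattern.

    Positions where every message has the same token are kept as constants;
    positions where tokens differ are replaced by WILDCARD.
    """
    tokenized = [message.split() for message in cluster]
    length = len(tokenized[0])
    if any(len(tokens) != length for tokens in tokenized):
        raise ValueError("all messages in a cluster must have the same token count")
    return " ".join(
        column[0] if len(set(column)) == 1 else WILDCARD
        for column in zip(*tokenized)
    )

def extract_variables(sig, message):
    """Return the tokens of `message` that sit at wildcard positions of `sig`."""
    return [tok for s, tok in zip(sig.split(), message.split()) if s == WILDCARD]

# Made-up cluster of two messages.
cluster = [
    "User alice logged in from 10.0.0.1",
    "User bob logged in from 10.0.0.7",
]
sig = signature(cluster)
variables = extract_variables(sig, cluster[0])
```

Position-wise comparison only works when messages in a cluster tokenize to the same length, which is one reason a learned entity recognizer can be attractive for real logs.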

Details

Applied Computing and Informatics, vol. 19 no. 1/2

Type: Research Article

ISSN: 2634-1964

Open Access

Article

Publication date: 5 August 2021

An embedded bandit algorithm based on agent evolution for cold-start problem

Rui Qiu and Wen Ji

Downloads

772

Abstract

Purpose

Many recommender systems are generally unable to provide accurate recommendations to users with limited interaction history, which is known as the cold-start problem. This issue can be addressed by trivial approaches that select random items or the most popular one to recommend to new users. However, these methods perform poorly in many cases. This paper aims to explore how to make accurate recommendations for new users in cold-start scenarios.

Design/methodology/approach

In this paper, the authors propose an embedded-bandit method, inspired by the Word2Vec technique and the contextual bandit algorithm. The authors describe user contextual information with item embedding features constructed by Word2Vec. In addition, based on the intelligence measurement model in Crowd Science, the authors propose a new evaluation method to measure the utility of recommendations.

Findings

The authors introduce the Word2Vec technique for constructing user contextual features, which improves the accuracy of recommendations compared with the traditional multi-armed bandit approach. In addition, under this study’s intelligence measurement model, the proposed method also achieves higher utility.

Practical implications

Improving the accuracy of recommendations during the cold-start phase can greatly raise user stickiness and increase user favorability, which in turn contributes to the commercialization of the app.

Originality/value

The algorithm proposed in this paper shows that user contextual features can be represented by the embedding vectors of clicked items.

Details

International Journal of Crowd Science, vol. 5 no. 3

Type: Research Article

ISSN: 2398-7294

Open Access

Article

Publication date: 14 September 2022

Government subsidization and corporate product strategies: evidence from Chinese exporters

Xiaodong Lu, Jingjun Liu and Janus Jian Zhang

Downloads

1237

Abstract

Purpose

This study aims to take advantage of exporters’ product codes and examine the effects of government subsidization on corporate product strategies by focusing on the dimension of product differentiation.

Design/methodology/approach

This study uses harmonized system (HS) product codes to construct a novel measure of product differentiation among a sample of Chinese exporters during 2000–2012. It uses propensity score matching to construct a comparable sample of control firms for exporters receiving government subsidies, and then a difference-in-differences (DID) analysis is conducted.
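The arithmetic behind a basic difference-in-differences estimate can be sketched as follows. This is the textbook two-group, two-period form, not the paper's regression specification, and it assumes the matched control sample has already been constructed. The function name and all numbers are made up for illustration and are not results from the study.

```python
from statistics import mean

def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period difference-in-differences.

    Returns (change in treated mean) minus (change in control mean).
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change

# Hypothetical product-differentiation scores (illustrative values only).
effect = did_estimate(
    treated_pre=[0.60, 0.62],
    treated_post=[0.50, 0.52],
    control_pre=[0.58, 0.60],
    control_post=[0.57, 0.59],
)
```

With these made-up inputs the treated group falls by 0.10 and the control group by 0.01, so the estimate is about -0.09; subtracting the control group's change is what removes the trend common to both groups.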

Findings

This study finds that product differentiation decreases immediately upon receiving a government subsidy. This finding suggests that in an emerging market, firms use their subsidy to imitate competitors rather than increase innovation. Further analyses show that this effect is concentrated among wholly foreign-owned enterprises and firms that focus on general trade rather than processing trade. In addition, the authors find some evidence that government subsidization leads to an increase in the number of product lines and decreases in domestic value added and export product quality.

Originality/value

This study constructs a novel measure of product differentiation for a large sample of Chinese exporters and provides insights that government subsidization can affect corporate product strategies.

Details

China Accounting and Finance Review, vol. 25 no. 3

Type: Research Article

ISSN: 1029-807X

Open Access

Article

Publication date: 21 October 2021

Analyzing TripAdvisor reviews of wine tours: an approach based on text mining and sentiment analysis

Elena Barbierato, Iacopo Bernetti and Irene Capecchi

Downloads

4628

Abstract

Purpose

Packaged wine tours, as a specific aspect of wine tourism, have so far been neglected in research. The purpose of this study is therefore to examine the key elements for the success of wine tours in Tuscany (Italy), evaluating their points of strength and weakness.

Design/methodology/approach

The study combines approaches of text mining, sentiment analysis and natural language processing, drawing on data from the TripAdvisor platform: 9,616 reviews from 600 tours in the years 2010–2020, obtained through an automatic procedure.

Findings

The authors identified six elements of successful wine tours expressed by research subjects: tour guide; logistical aspects; the quality of the wine; the quality of the food; complementary tourist and recreational activities; the landscape and historic villages. The key strength associated with success was the integration of the leading wine product with food, landscape and historic villages, while the main criticisms were concerned with the organization and planning of the tour. Furthermore, the tour guide also plays a fundamental role in consumer satisfaction.

Research limitations/implications

The limitations of the method are linked to the origin of the data used. The main one is that TripAdvisor does not provide social and personal information about the tourist who wrote the review; therefore, the methods are substantially complementary to traditional questionnaire-based surveys.

Practical implications

The proposed model can be used both by professionals to improve the quality of their products and by policymakers to promote the territorial development of quality wine-growing areas.

Social implications

The proposed model can be useful for policymakers to promote the territorial development of quality wine-growing areas.

Originality/value

The methodology tested is easily transferable to many countries and, to the authors’ knowledge, is the first attempt to combine multidimensional scaling, sentiment analysis and natural language processing approaches.

Details

International Journal of Wine Business Research, vol. 34 no. 2

Type: Research Article

ISSN: 1751-1062

Open Access

Article

Publication date: 7 April 2022

Predictions through Lean startup? Harnessing AI-based predictions under uncertainty

Santo Raneri, Fabian Lecron, Julie Hermans and François Fouss

Downloads

3685

Abstract

Purpose

Artificial intelligence (AI) has started to receive attention in the field of digital entrepreneurship. However, few studies propose AI-based models aimed at assisting entrepreneurs in their day-to-day operations. In addition, extant models from the product design literature, while technically promising, fail to propose methods suitable for opportunity development with a high level of uncertainty. This study develops and tests a predictive model that provides entrepreneurs with a digital infrastructure for automated testing. Such an approach aims at harnessing AI-based predictive technologies while keeping the ability to respond to the unexpected.

Design/methodology/approach

Based on effectuation theory, this study identifies an AI-based, predictive phase in the “build-measure-learn” loop of Lean startup. The predictive component, based on recommendation algorithm techniques, is integrated into a framework that considers both prediction (causal) and controlled (effectual) logics of action. The performance of the so-called active learning build-measure-predict-learn algorithm is evaluated on a data set collected from a case study.

Findings

The results show that the algorithm can predict the desirability level of newly implemented product design decisions (PDDs) in the context of a digital product. The main advantages, in addition to the prediction performance, are the ability to detect cases where predictions are likely to be less precise and an easy-to-assess indicator for product design desirability. The model is found to deal with uncertainty in a threefold way: epistemological expansion through accelerated data gathering, ontological reduction of uncertainty by revealing prior “unknown unknowns” and methodological scaffolding, as the framework accommodates both predictive (causal) and controlled (effectual) practices.

Originality/value

Research about using AI in entrepreneurship is still in a nascent stage. This paper can serve as a starting point for new research on predictive techniques and AI-based infrastructures aiming to support digital entrepreneurs in their day-to-day operations. This work can also encourage theoretical developments, building on effectuation and causation, to better understand Lean startup practices, especially when supported by digital infrastructures accelerating the entrepreneurial process.

Details

International Journal of Entrepreneurial Behavior & Research, vol. 29 no. 4

Type: Research Article

ISSN: 1355-2554
