Search results
1 – 7 of 7
Fan Wu, Ya-Han Hu and Ping-Rong Wang
Abstract
Purpose
Most academic libraries provide book recommendation services to enable readers to recommend books to the libraries. To facilitate decision-making in book acquisition, this study aimed to develop a method to determine the ranking of the recommended books based on the recommender network.
Design/methodology/approach
The recommender network was constructed to establish relationships among book recommenders and their similar readers by using circulation records. Furthermore, social computing techniques were used to evaluate the degree of representativeness of the recommenders, which was subsequently applied as a criterion to rank the recommended books. Empirical studies were performed to demonstrate the effectiveness of the proposed ranking system. Spearman's correlation coefficients between the proposed ranking and the ranking obtained using reader circulation statistics were used as the performance measure.
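The performance measure named above, Spearman's correlation coefficient, can be sketched as the Pearson correlation of two rank vectors. The function names and the tie-handling rule (average ranks) are assumptions made for illustration, not details taken from the study.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    # Note: a constant input has zero rank variance and raises ZeroDivisionError.
    return cov / (var_x * var_y) ** 0.5
```

Here `x` could hold the scores from one ranking system and `y` the circulation counts for the same books; a value near 1 means the two orderings largely agree.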
Findings
The ranking calculated using the proposed ranking mechanism was moderately to highly correlated with the ranking obtained using reader circulation statistics. The ranking of recommended books by the librarians was poorly to moderately correlated with the ranking calculated using reader circulation statistics.
Practical implications
The book recommender can be used to improve the accuracy of book recommendations.
Originality/value
This study is the first to consider the recommender network in library book acquisition. The results also show that the proposed ranking mechanism can facilitate effective book-acquisition decisions in libraries.
Ya-Han Hu, Wen-Ming Shiau, Sheng-Pao Shih and Cho-Ju Chen
Abstract
Purpose
The purpose of this paper is to combine basic movie information factors, external factors and review factors, to predict box-office performance and identify the most crucial factor of influence for box-office performance.
Design/methodology/approach
Five movie genres and first-week movie reviews found on IMDb were collected. The movie reviews were quantified using the sentiment analysis tools SentiStrength and Stanford CoreNLP, and the quantified data were combined with basic movie information and external environment factors to predict movie box-office performance. A prediction model for movie box-office performance was then developed using data mining (DM) technologies, namely M5 model trees (M5P), linear regression (LR) and support vector regression (SVR).
Findings
The results of this paper showed that the inclusion of movie reviews generated more accurate prediction results. Concerning movie review-related factors, the one that exhibited the greatest effect on box-office performance was the number of movie reviews made, whereas movie review content only displayed an effect on box-office performance for specific movie genres.
Research limitations/implications
Because this paper collected movie data from the IMDb, the data were limited and primarily consisted of movies released in the USA; data pertaining to less popular movies or those released outside of the USA were, thus, insufficient.
Practical implications
This paper helps to verify whether considering features extracted from movie reviews can improve the prediction of movie box-office performance.
Originality/value
Through various DM technologies, this paper shows that movie reviews enhanced the accuracy of box-office performance predictions and the content of movie reviews has an effect on box-office performance.
Hsu-Che Wu, Ya-Han Hu and Yen-Hao Huang
Abstract
Purpose
Credit ratings have become one of the primary references for financial institutions to assess credit risk. Conventional credit rating approaches have mainly concentrated on two-class classification (i.e. good or bad credit), which lacks adequate precision to perform credit risk evaluations in practice. In addition, most previous research has focused directly on employing various data mining techniques, and few studies have discussed the influence of data preprocessing before classifier construction. The paper aims to discuss these issues.
Design/methodology/approach
This study considers nine-class classification (i.e. nine credit risk levels) for credit rating prediction. To develop more accurate classifiers, the paper adopts a two-stage analysis, which integrates multiple data preprocessing and supervised learning techniques. Specifically, the first stage applies feature selection, data clustering, and data resampling methods to preprocess the data, and the second stage utilizes several classification techniques and classifier ensembles to construct prediction models.
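The abstract does not say which resampling method was used in the first stage. As one common possibility, the sketch below shows random oversampling, which duplicates rows of the smaller classes until every class is as large as the biggest one. The function name, the `seed` parameter and the data layout are all assumptions for illustration.

```python
import random


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until all classes are the same size.

    This is an illustrative stand-in for the unspecified "data resampling"
    step, not the method used in the study.
    """
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    # Every class is grown to the size of the largest class.
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels
```

In a two-stage setup, the balanced output would be handed to the second-stage classifier; resampling should be applied to the training split only, so that duplicated rows never leak into the evaluation data.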
Findings
The results show that Bagging-DT with the data resampling method achieves excellent accuracy (82.96 percent), indicating that the proposed two-stage prediction model is better than conventional one-stage models.
Originality/value
In practice, the findings of this study can lower credit rating expenses and allow corporations to obtain credit rating information instantly.
Cheng-Che Shen, Ya-Han Hu, Wei-Chao Lin, Chih-Fong Tsai and Shih-Wen Ke
Abstract
Purpose
The purpose of this paper is to examine the research impact of papers written with and without funding. Specifically, the citation analysis method is used to compare the general and funded papers published in two leading international conferences, namely ACM SIGIR and ACM SIGKDD.
Design/methodology/approach
The authors investigate the number of general and funded papers to see whether the number of funded papers is larger than the number of general papers. In addition, the total citations and the number of highly cited papers with and without funding are also compared.
Findings
The analysis results of ACM SIGIR papers show that in most cases the number of funded papers is larger than the number of general papers. Moreover, the total citations, the average number of citations per paper, and the number of highly cited papers all reveal the superiority of funded papers over general papers. However, the findings are somewhat different for the ACM SIGKDD papers. This may be because ACM SIGIR began much earlier than ACM SIGKDD, which relates to the maturity of the research problems addressed in these two conferences.
Originality/value
The value of this paper lies in being the first attempt to examine the research impact of general and funded research papers by the citation analysis method. The research impact of other research areas can be further investigated by other analysis methods.
Shih-Wen Ke, Wei-Chao Lin, Chih-Fong Tsai and Ya-Han Hu
Abstract
Purpose
Conference publications are an important aspect of research activities. There are generally both oral presentations and poster sessions at large international conferences. One can hypothesise that, for the same conferences, the papers presented in oral sessions should have a higher research impact than the papers presented in poster sessions. However, there has been no related study examining the validity of this hypothesis. In other words, the difference in research impact between papers presented orally and in poster sessions has not been discussed in the literature. Therefore, the purpose of this paper is to conduct a citation analysis to compare the research impact of papers presented in oral and poster sessions.
Design/methodology/approach
In this paper, data from three leading conferences in the field of computer vision are examined, namely CVPR (2011 and 2012), ICCV (2011) and ECCV (2012). Several types of citation-related statistics are collected, including the number of highly cited papers (i.e. high number of citations) presented in oral and poster sessions, the total citations of both types of papers, the average citations of oral and poster papers, and the average citations of each frequently cited paper of both types.
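The citation statistics listed above can be computed per group (for example, oral versus poster papers) along the following lines. The abstract does not give a numeric definition of "highly cited", so the `high_threshold` parameter, like the function name and dictionary keys, is an assumption for illustration.

```python
from statistics import mean


def citation_summary(citations, high_threshold):
    """Summarise a list of per-paper citation counts for one group of papers.

    `high_threshold` is a hypothetical cut-off: papers with at least this
    many citations are counted as highly cited.
    """
    highly_cited = [c for c in citations if c >= high_threshold]
    return {
        "papers": len(citations),
        "total_citations": sum(citations),
        "average_citations": mean(citations) if citations else 0.0,
        "highly_cited_papers": len(highly_cited),
        "average_citations_highly_cited": mean(highly_cited) if highly_cited else 0.0,
    }
```

Calling the function once on the oral-session counts and once on the poster-session counts gives two summaries that can be compared field by field.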
Findings
There are three main findings. First, a larger proportion of highly cited papers are from oral sessions than poster sessions. Second, the average number of citations per paper is larger for those presented in oral sessions than poster sessions. Third, the average number of citations for highly cited papers presented in oral sessions is not necessarily greater than for the ones presented in poster sessions.
Originality/value
The originality of this paper is that it is the first attempt to examine the differences in citation impact between conference papers presented in oral and poster sessions. The findings of this study will allow future bibliometrics research to explore this issue further over longer periods and in different fields.
Chih-Fong Tsai, Ya-Han Hu and Shih-Wen George Ke
Abstract
Purpose
Ranking relevant journals is critical for researchers choosing their publication outlets, which can affect their research performance. In the management information systems (MIS) field, many related studies have used surveys as a subjective method for identifying MIS journal rankings. However, very few consider objective methods, such as journals' impact factors and h-indexes. The paper aims to discuss these issues.
Design/methodology/approach
In this paper, the top 50 ranked journals identified by researchers' perceptions are examined in terms of their correlation with the rankings by impact factor and h-index. Moreover, a hybrid method based on the Borda count is used to combine these different rankings and produce new MIS journal rankings.
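A Borda count combines several rankings by awarding each item points according to its position in each ranking and then sorting by total points. The sketch below uses the standard scheme (n - 1 points for first place down to 0 for last); the alphabetical tie-break and the placeholder journal names are assumptions, and the paper's exact variant may differ.

```python
def borda_count(rankings):
    """Merge several best-first rankings of the same items into one ranking.

    Each ranking awards n - 1 points to its first item, n - 2 to the second,
    and so on down to 0. Ties in total score are broken alphabetically,
    which is an arbitrary choice made for this sketch.
    """
    items = set(rankings[0])
    if any(set(r) != items or len(r) != len(items) for r in rankings):
        raise ValueError("every ranking must order the same items exactly once")
    n = len(items)
    scores = {item: 0 for item in items}
    for ranking in rankings:
        for position, item in enumerate(ranking):
            scores[item] += n - 1 - position
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical example: one survey-based and two metric-based rankings.
survey = ["Journal A", "Journal B", "Journal C"]
impact_factor = ["Journal B", "Journal A", "Journal C"]
h_index = ["Journal B", "Journal C", "Journal A"]
combined = borda_count([survey, impact_factor, h_index])
# combined == ["Journal B", "Journal A", "Journal C"]  (scores 5, 3, 1)
```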
Findings
The results show that there are low correlations between the subjective and objective MIS journal rankings. In addition, the new MIS journal rankings produced by the Borda count approach can also be considered in future research.
Originality/value
The contribution of this paper is to apply the Borda count approach to combine different MIS journal rankings produced by subjective and objective methods. The new MIS journal rankings and previous studies can be complementary to allow researchers to determine the top-ranked journals for their publication outlets.
Chih‐Fong Tsai, Ya‐Han Hu, Chia‐Sheng Hung and Yu‐Feng Hsu
Abstract
Purpose
Customer lifetime value (CLV) has received increasing attention in database marketing. Enterprises can retain valuable customers by predicting correctly which customers are valuable. In the literature, many data mining and machine learning techniques have been applied to develop CLV models. Specifically, hybrid techniques have shown their superiority over single techniques. However, it is unknown which hybrid model performs best in customer value prediction. Therefore, the purpose of this paper is to compare two types of commonly‐used hybrid models, based on the classification+classification and clustering+classification hybrid approaches, respectively, in terms of customer value prediction.
Design/methodology/approach
To construct a hybrid model, multiple techniques are usually combined in a two‐stage manner, in which the first stage is based on either clustering or classification techniques, which can be used to pre‐process the data. Then, the output of the first stage (i.e. the processed data) is used to construct the second stage classifier as the prediction model. Specifically, decision trees, logistic regression, and neural networks are used as the classification techniques and k‐means and self‐organizing maps for the clustering techniques to construct six different hybrid models.
Findings
The experimental results over a real case dataset show that the classification+classification hybrid approach performs the best. In particular, combining two stages of decision trees provides the highest rate of accuracy (99.73 percent) and the lowest rates of Type I/II errors (0.22 percent/0.43 percent).
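For readers unfamiliar with the reported measures, accuracy and Type I/II error rates for a two-class prediction can be computed as sketched below. The abstract does not state which class is treated as positive or how the error types were defined, so this sketch assumes the usual convention: Type I is the false positive rate and Type II is the false negative rate. The function and key names are illustrative.

```python
def error_rates(actual, predicted, positive=1):
    """Return accuracy and Type I/II error rates for paired class labels.

    Assumed convention (not confirmed by the source):
      Type I  = false positives / all actual negatives
      Type II = false negatives / all actual positives
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty label sequences of equal length")
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return {
        "accuracy": (tp + tn) / len(actual),
        "type_i": fp / (fp + tn) if fp + tn else 0.0,
        "type_ii": fn / (fn + tp) if fn + tp else 0.0,
    }
```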
Originality/value
The contribution of this paper is to demonstrate that hybrid machine learning techniques perform better than single ones. In addition, this paper allows us to find out which hybrid technique performs best in terms of CLV prediction.