Martyn Harris, Mark Levene, Dell Zhang and Dan Levene
Abstract
Purpose
The purpose of this paper is to present a language-agnostic approach to facilitate the discovery of “parallel passages” stored in historic and cultural heritage digital archives.
Design/methodology/approach
The authors explore a novel and relatively simple approach that combines a character-based statistical language model with a tailored version of the Basic Local Alignment Search Tool (BLAST) to extract exact and approximate string patterns shared between groups of documents.
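The seed-and-extend idea behind BLAST-style matching can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: it omits the character-based language model entirely, and the function names `shared_passages` and `_extend` and the parameters `seed_len` and `max_mismatches` are hypothetical.

```python
from collections import defaultdict


def _extend(a, b, i, j, step, budget):
    """Walk from (i, j) in direction `step` (+1 or -1).

    Returns how many characters to add to the match: the walk tolerates up
    to `budget` mismatches and the result always ends on a matching character.
    """
    mismatches = 0
    kept = 0
    k = 0
    while 0 <= i + k * step < len(a) and 0 <= j + k * step < len(b):
        if a[i + k * step] == b[j + k * step]:
            kept = k + 1
        else:
            mismatches += 1
            if mismatches > budget:
                break
        k += 1
    return kept


def shared_passages(a, b, seed_len=5, max_mismatches=2):
    """Return sorted (start_a, start_b, length) triples for approximate matches.

    Exact character n-grams of length `seed_len` act as seeds; each seed is
    extended in both directions with a separate mismatch budget per side.
    """
    # Index every character n-gram of `a` by its starting offsets.
    index = defaultdict(list)
    for i in range(len(a) - seed_len + 1):
        index[a[i:i + seed_len]].append(i)

    hits = set()
    for j in range(len(b) - seed_len + 1):
        for i in index.get(b[j:j + seed_len], ()):
            left = _extend(a, b, i - 1, j - 1, -1, max_mismatches)
            right = _extend(a, b, i + seed_len, j + seed_len, +1, max_mismatches)
            hits.add((i - left, j - left, left + seed_len + right))
    return sorted(hits)
```

Because the sketch works on raw characters rather than tokens, it makes no assumption about the language of the text, and the mismatch budget is what lets it tolerate spelling variation or OCR noise within a passage.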
Findings
The approach is applicable to a wide range of languages and compensates for variability in the text of the documents arising from differences in dialect, authorship and language change over time, as well as from inaccurate transcriptions and optical character recognition errors introduced during digitisation.
Research limitations/implications
A number of case studies demonstrate that the approach is practical and generalisable to a wide range of archives with documents in different languages, domains and of varying quality.
Practical implications
The approach described can be applied to any digital archive of modern and contemporary texts. This makes it applicable not only to digital archives recording historic texts but also to those composed of more recent material, such as news articles.
Social implications
The analysis of “parallel passages” enables researchers to quantify the presence and extent of text-reuse in a collection of documents, which can provide useful data on author style, text genres and cultural contexts.
Originality/value
The approach is novel and addresses a need among humanities researchers for tools that can identify similar documents and local similarities, represented by shared text sequences, in a potentially vast archive of documents. As far as the authors are aware, no tools currently exist that provide the same level of tolerance to the language of the documents.
Maayan Zhitomirsky-Geffet, Judit Bar-Ilan and Mark Levene
Abstract
Purpose
One of the under-explored aspects in the process of user information seeking behaviour is the influence of time on relevance evaluation. It has been shown in previous studies that individual users might change their assessment of search results over time. It is also known that aggregated judgements of multiple individual users can lead to correct and reliable decisions; this phenomenon is known as the “wisdom of crowds”. The purpose of this paper is to examine whether aggregated judgements will be more stable and thus more reliable over time than individual user judgements.
Design/methodology/approach
In this study, two simple measures are proposed to calculate the aggregated judgements of search results and to compare their reliability and stability with those of individual user judgements. In addition, the aggregated “wisdom of crowds” judgements were used as a means to compare human assessments of search results with search engines’ rankings. A large-scale user study was conducted with 87 participants who evaluated two different queries and four diverse result sets twice, with an interval of two months. Two types of judgements were considered in this study: relevance on a four-point scale, and ranking on a ten-point scale without ties.
Findings
It was found that aggregated judgements are much more stable than individual user judgements, yet they are quite different from search engine rankings.
Practical implications
The proposed “wisdom of crowds”-based approach provides a reliable reference point for the evaluation of search engines. This is also important for exploring the need for personalisation and for adapting search engines’ rankings over time to changes in users’ preferences.
Originality/value
This is a first study that applies the notion of “wisdom of crowds” to examine the phenomenon of “change in time” in user evaluation of relevance, which is under-explored in the literature.
Judit Bar‐Ilan and Mark Levene
Abstract
Purpose
The aim of this paper is to develop a methodology for assessing search results retrieved from different sources.
Design/methodology/approach
This is a two‐phase method: in the first stage, users select and rank the ten best search results from a randomly ordered set. In the second stage they are asked to choose the best pre‐ranked result from a set of possibilities. This two‐stage method allows users to consider each search result separately (in the first stage) and to express their views on the rankings as a whole, as they were retrieved by the search provider. The method was tested in a user study that compared different country‐specific search results of Google and Live Search (now Bing). The users were Israelis and the search results came from six sources: Google Israel, Google.com, Google UK, Live Search Israel, Live Search US and Live Search UK. The users evaluated the results of nine pre‐selected queries, created their own preferred ranking and picked the best ranking from the six sources.
Findings
The results indicate that the group of users in this study preferred their local Google interface, i.e. Google succeeded in its country‐specific customisation of search results. Live Search was much less successful in this aspect.
Research limitations/implications
Search engines are highly dynamic, so the findings of the case study have to be viewed cautiously.
Originality/value
The main contribution of the paper is a two‐phase methodology for comparing and evaluating search results from different sources.
Judit Bar‐Ilan, Mark Levene and Mazlita Mat‐Hassan
Abstract
Purpose
The objective of this paper is to characterize the changes in the rankings of the top ten results of major search engines over time and to compare the rankings between these engines.
Design/methodology/approach
The paper compares rankings of the top‐ten results of the search engines Google and AlltheWeb on ten identical queries over a period of three weeks. Only the top‐ten results were considered, since users do not normally inspect more than the first results page returned by a search engine. The experiment was repeated twice, in October 2003 and in January 2004, in order to assess changes to the top‐ten results of some of the queries during the three‐month interval. In order to assess the changes in the rankings, three measures were computed for each data collection point and each search engine.
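The abstract does not name the three measures that were computed. As an illustration only, two common ways of comparing a pair of top‐ten lists are the size of their overlap and Spearman's footrule distance on the results they share. The sketch below assumes each ranking is a list of distinct result identifiers (for example URLs), best first; the function names `overlap` and `footrule` are hypothetical.

```python
def overlap(rank_a, rank_b):
    """Number of results that appear in both ranked lists."""
    return len(set(rank_a) & set(rank_b))


def footrule(rank_a, rank_b):
    """Normalised Spearman footrule distance over the shared results.

    Shared results are re-ranked by their relative order within each list.
    Returns 0.0 when the relative order is identical, 1.0 when displacement
    is maximal, and None when fewer than two results are shared.
    """
    shared = set(rank_a) & set(rank_b)
    a = [x for x in rank_a if x in shared]
    b = [x for x in rank_b if x in shared]
    n = len(a)
    if n < 2:
        return None
    pos_b = {x: r for r, x in enumerate(b)}
    total = sum(abs(r - pos_b[x]) for r, x in enumerate(a))
    max_total = (n * n) // 2  # largest possible footrule distance for n items
    return total / max_total
```

A measure restricted to shared results says nothing when the overlap is very small, which is one reason a single measure can be unsatisfactory on its own.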
Findings
The findings in this paper show that the rankings of AlltheWeb were highly stable over each period, while the rankings of Google underwent constant yet minor changes, with occasional major ones. Changes over time can be explained by the dynamic nature of the web or by fluctuations in the search engines' indexes. The top‐ten results of the two search engines had surprisingly low overlap. With such small overlap, the task of comparing the rankings of the two engines becomes extremely challenging.
Originality/value
The paper shows that because of the abundance of information on the web, ranking search results is of extreme importance. The paper compares several measures for computing the similarity between rankings of search tools, and shows that none of the measures is fully satisfactory as a standalone measure. It also demonstrates the apparent differences in the ranking algorithms of two widely used search engines.
Abstract
Purpose
Recent years have seen “really simple syndication” or “rich site summary” (RSS) syndication of frequently updated content become ubiquitous across the internet. RSS's XML‐based format allows these data to be stored in a semi‐structured format but, despite the presence of online aggregators and readers, and the related work in clustering feeds and mining subjects by keywords, much potentially useful information present in RSS may remain undiscovered. This paper aims to address this issue in an experimental setting.
Design/methodology/approach
This paper presents two distinct technologies which employ the semi‐structured nature of RSS content to allow users to mine information directly from raw RSS feeds: occurrence mining counts occurrences of text strings in feeds, whilst value mining mines structured ticker tape numeric data. It describes both technologies and their implementation in an experiment, where 35 students mined small numbers of RSS feeds and visualised the data mined.
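Occurrence mining, as described above, amounts to counting how often a text string appears in a feed. A minimal sketch using Python's standard library follows; it is not the tool used in the experiment. It assumes an RSS 2.0 feed without XML namespaces, looks only at item titles and descriptions, and uses a made-up `SAMPLE_FEED` and a hypothetical function name `count_occurrences`.

```python
import xml.etree.ElementTree as ET

# Made-up feed for illustration; a real feed would be fetched from a URL.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Example feed</title>
<item><title>Oil prices rise</title><description>Oil climbed on Monday.</description></item>
<item><title>Markets steady</title><description>Shares flat; oil unchanged.</description></item>
</channel></rss>"""


def count_occurrences(rss_xml, term):
    """Count case-insensitive occurrences of `term` in item titles and descriptions."""
    needle = term.lower()
    if not needle:
        return 0
    root = ET.fromstring(rss_xml)
    total = 0
    for item in root.iter("item"):
        for tag in ("title", "description"):
            text = item.findtext(tag) or ""
            total += text.lower().count(needle)
    return total
```

Running the count at regular intervals and storing the totals with a timestamp would give the kind of series that can then be visualised. Atom feeds use a namespace and different element names, so the tag lookups would need adjusting.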
Findings
This paper analyses the results of the experiment and cites examples of data mined and visualisations produced. The subject matter of data mined is also explored and potential applications of the technologies are considered.
Research limitations/implications
The mining technologies proposed in this paper have been developed to mine textual and numeric data directly from feeds, but can be extended to mine other data types present in RSS and to include other variants like Atom.
Originality/value
These technologies are seen to be applicable to data mining, the role of data and visualisations in social data analysis, issue tracking in news mining and time series analysis.
Darrin Kass and Christian Grandzol
Abstract
This study examined the leadership development of MBA students enrolled in an Organizational Behavior course. Students enrolled in either an in-class section or a section that included an intensive, outdoor training component called Leadership on the Edge. Results from Kouzes and Posner’s Leadership Practices Inventory (2003) showed that students in the outdoor training section demonstrated greater improvements in leadership practices over the course of the semester. Reflective comments from students in the outdoor section indicated it was a transformative personal experience that is unlikely to be emulated in a classroom. Implications for leadership educators are discussed.
Mark Klassen, Grant Alexander Wilson and C. Brooke Dobni
Abstract
Purpose
The purpose of the paper is to emphasize the performance benefits of a long-term innovation and value creation perspective. This paper responds to the recent concept of the imagination premium method for valuing companies. It offers four key takeaways to create a long-term innovation-focused orientation for future value creation.
Design/methodology/approach
The research is based on both consulting experience and insight from several studies of executives that were supported by the U.S. Conference Board.
Findings
The research differentiates how high versus low innovators create long-term perspectives and value. High innovators have explicit processes that support innovation, leadership that focuses on long-term performance, resources committed to long-term projects and innovation and knowledge management systems that transfer knowledge throughout the organization.
Research limitations/implications
The research offers strategic directives aimed at creating long-term value but acknowledges that there are other means to accomplish such objectives.
Practical implications
This paper offers strategies for executives to create an innovation-focused organizational culture that drives lasting long-term value.
Social implications
Focusing on long-term innovation prioritizes larger social, environmental and business objectives over superficial short-term stock price changes, leading to greater value creation.
Originality/value
This paper advocates that leadership play the long game and adopt a longer-term view of innovation due to its long-term competitive, employee engagement, sustainability and performance benefits.
Giacomo Zatini and Armando della Porta
Abstract
Purpose
Researchers have paid limited attention to how the fashion sector has evolved in the years following the pandemic. This study aims to address this gap by providing an overview of the Italian fashion sector and its financial performance related to the concept of resilience.
Design/methodology/approach
The model is based on a segmentation analysis of 5,129 firms in the Italian fashion sector, utilizing financial variables such as return on equity and return on sales. Significance is then assessed using Levene’s test and ANOVA.
Findings
It was discovered that the debt ratio, operating cash flow and aggregate growth ratio (AGR) over a five-year period exhibit significant differences across clusters. Additionally, it was determined that the debt ratio and operating cash flow are key financial indicators of firm resilience. These data have confirmed the resilience of the Italian fashion sector.
Originality/value
This study is among the first to focus on the financial performance of the Italian fashion sector, its resilience and post-pandemic recovery, and to employ a reverse engineering system to identify the most suitable financial indicators for defining a sector’s resilience.