Michael D. Ekstrand, Katherine Landau Wright and Maria Soledad Pera
Abstract
Purpose
This paper investigates how school teachers look for informational texts for their classrooms. Access to current, varied and authentic informational texts improves learning outcomes for K-12 students, but many teachers lack resources to expand and update readings. The Web offers freely available resources, but finding suitable ones is time-consuming. This research lays the groundwork for building tools to ease that burden.
Design/methodology/approach
This paper reports qualitative findings from a study in two stages: (1) a set of semistructured interviews, based on the critical incident technique, eliciting teachers' information-seeking practices and challenges; and (2) observations of teachers using a prototype teaching-oriented news search tool under a think-aloud protocol.
Findings
Teachers articulated different objectives and ways of using readings in their classrooms; goals and self-reported practices varied by experience level. Teachers struggled to formulate queries likely to return readings on specific course topics, instead searching directly for abstract topics. Experience differences did not translate into observable differences in search skill or success in the lab study.
Originality/value
There is limited work on teachers' information-seeking practices, particularly on how teachers look for texts for classroom use. This paper describes how teachers look for information in this context, setting the stage for future development and research on how to support this use case. Understanding and supporting teachers looking for information is a rich area for future research, due to the complexity of the information need and the fact that teachers are not looking for information for themselves.
Oghenemaro Anuyah, Ashlee Milton, Michael Green and Maria Soledad Pera
Abstract
Purpose
The purpose of this paper is to examine strengths and limitations that search engines (SEs) exhibit when responding to web search queries associated with the grade school curriculum.
Design/methodology/approach
The authors employed a simulation-based experimental approach to conduct an in-depth empirical examination of SEs and used web search queries that capture information needs in different search scenarios.
Findings
Outcomes from this study highlight that child-oriented SEs are more effective than traditional ones when filtering inappropriate resources, but often fail to retrieve educational materials. All SEs examined offered resources at reading levels higher than that of the target audience and often prioritized resources with popular top-level domains (e.g. “.com”).
Practical implications
Findings have implications for human intervention, search literacy in schools, and the enhancement of existing SEs. Results shed light on the impact on children’s education that results from introducing misconceptions about SEs when these tools either retrieve no results or offer irrelevant resources in response to web search queries pertinent to the grade school curriculum.
Originality/value
The authors examined child-oriented and popular SEs’ retrieval of resources aligning with task objectives and user capabilities – resources that match user reading skills, do not contain hate speech or sexually explicit content, are non-opinionated, and are curriculum-relevant. Findings identify limitations of existing SEs (both those directly and indirectly supporting young users) and demonstrate the need to improve SE filtering and ranking algorithms.
Maria Soledad Pera and Yiu‐Kai Ng
Abstract
Purpose
The web provides its users with abundant information. Unfortunately, when a web search is performed, both users and search engines must deal with an annoying problem: the presence of spam documents that are ranked among legitimate ones. The mixed results downgrade the performance of search engines and frustrate users, who are required to filter out useless information. To improve the quality of web searches, the number of spam documents on the web must be reduced, if they cannot be eradicated entirely. This paper aims to present a novel approach for identifying spam web documents, which have mismatched titles and bodies and/or a low percentage of hidden content in their markup data structure.
Design/methodology/approach
The paper shows that by considering the degree of similarity among the words in the title and body of a web document D, computed using their word‐correlation factors; using the percentage of hidden content in the markup data structure within D; and/or considering the bigram or trigram phrase‐similarity values of D, it is possible to determine whether D is spam with high accuracy.
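The title/body similarity signal described above can be illustrated with a minimal sketch. Note that the correlation table below is a toy stand-in: the paper relies on word-correlation factors pre-computed from a large corpus, whose construction is not detailed in this abstract, and the threshold value is an assumption for illustration only.

```python
# Toy word-correlation table (assumption): 1.0 for identical words,
# a pre-computed factor for related pairs, 0.0 otherwise.
TOY_CORRELATION = {
    ("cheap", "discount"): 0.8,
    ("pills", "medicine"): 0.7,
}

def correlation(w1: str, w2: str) -> float:
    """Word-correlation factor between two words (toy version)."""
    if w1 == w2:
        return 1.0
    return TOY_CORRELATION.get((w1, w2), TOY_CORRELATION.get((w2, w1), 0.0))

def title_body_similarity(title: str, body: str) -> float:
    """Average best-match correlation of each title word against the body words."""
    title_words = title.lower().split()
    body_words = body.lower().split()
    if not title_words or not body_words:
        return 0.0
    return sum(
        max(correlation(t, b) for b in body_words) for t in title_words
    ) / len(title_words)

def looks_like_spam(title: str, body: str, threshold: float = 0.4) -> bool:
    # A title that poorly matches its body (low similarity) is one spam signal;
    # the threshold here is illustrative, not taken from the paper.
    return title_body_similarity(title, body) < threshold
```

In this sketch a document whose title words find no correlated counterparts in its body scores near zero and is flagged, mirroring the mismatched-title signal the abstract describes.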
Findings
By considering the content and markup of web documents, this paper develops a spam‐detection tool that is: reliable, since it accurately detects 84.5 percent of spam/legitimate web documents; and computationally inexpensive, since the word‐correlation factors used for content analysis are pre‐computed.
Research limitations/implications
Since the bigram‐correlation values employed in the spam‐detection approach are computed from the unigram‐correlation factors, the approach incurs additional computational time during the spam‐detection process and could generate a higher number of misclassified spam web documents.
Originality/value
The paper verifies that the spam‐detection approach outperforms existing anti‐spam methods by at least 3 percent in terms of F‐measure.
Maria Soledad Pera and Yiu‐Kai Ng
Abstract
Purpose
Tens of thousands of news articles are posted online each day, covering topics from politics to science to current events. To better cope with this overwhelming volume of information, RSS (news) feeds are used to categorize newly posted articles. Nonetheless, most RSS users must filter through many articles within the same or different RSS feeds to locate articles pertaining to their particular interests. Due to the large number of news articles in individual RSS feeds, there is a need for further organizing articles to aid users in locating non‐redundant, informative, and related articles of interest quickly. This paper aims to address these issues.
Design/methodology/approach
The paper presents a novel approach which uses the word‐correlation factors in a fuzzy set information retrieval model to: filter out redundant news articles from RSS feeds; shed less‐informative articles from the non‐redundant ones; and cluster the remaining informative articles according to the fuzzy equivalence classes on the news articles.
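The three-stage pipeline above can be sketched as follows. This is an illustrative approximation only: it uses simple word-set overlap in place of the paper's fuzzy-set similarity built from word-correlation factors, greedy grouping in place of fuzzy equivalence classes, and cutoff values chosen for the example rather than taken from the paper.

```python
def jaccard(a: str, b: str) -> float:
    """Word-set overlap; a simple stand-in for fuzzy word-correlation similarity."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def filter_redundant(articles: list[str], dup_cutoff: float = 0.8) -> list[str]:
    """Stage 1: drop articles nearly identical to one already kept."""
    kept: list[str] = []
    for art in articles:
        if all(jaccard(art, k) < dup_cutoff for k in kept):
            kept.append(art)
    return kept

def cluster(articles: list[str], sim_cutoff: float = 0.3) -> list[list[str]]:
    """Stage 3 (greedy approximation): assign each article to the first
    cluster whose representative it sufficiently resembles."""
    clusters: list[list[str]] = []
    for art in articles:
        for c in clusters:
            if jaccard(art, c[0]) >= sim_cutoff:
                c.append(art)
                break
        else:
            clusters.append([art])
    return clusters
```

Running near-duplicate articles through `filter_redundant` leaves one representative per story, and `cluster` then groups the survivors by topical overlap; the informativeness-shedding stage is omitted here for brevity.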
Findings
The clustering approach requires little overhead or computational cost, and experimental results show that it outperforms other existing, well‐known clustering approaches.
Research limitations/implications
The clustering approach as proposed in this paper applies only to RSS news articles; however, it can be extended to other application domains.
Originality/value
The developed clustering tool is highly efficient and effective in filtering and classifying RSS news articles and does not employ any labor‐intensive user‐feedback strategy. Therefore, it can be implemented in real‐world RSS feeds to aid users in locating RSS news articles of interest.