Multimodal presentation of E-commerce product reviews and ratings: empirical investigation using multimodality

Rajab Ghandour (Liverpool John Moores University, Liverpool, UK)

Journal of Trade Science

ISSN: 2815-5793

Article publication date: 11 September 2024

Issue publication date: 13 November 2024

Abstract

Purpose

The aim of the research is to evaluate different modalities for presenting product reviews and their impact on users’ performance, purchase intention and enjoyment.

Design/methodology/approach

The study utilized an experimental approach with 48 opportunistic participants in three groups (16 users per group). Participants were randomly assigned to experimental conditions to ensure unbiased treatment. Data were collected through controlled interventions or manipulations, with pre-defined measures to assess specific outcomes. Statistical techniques such as ANOVA were employed to analyse the data, allowing for comparisons between experimental variables.

Findings

The findings revealed that integrating facial expression avatars and emojis into an e-commerce platform effectively communicates product reviews and ratings. Moreover, the use of animation significantly enhanced user enjoyment. This suggests that visual representations not only convey information effectively but also contribute to a more engaging and enjoyable user experience.

Research limitations/implications

While this experiment offers valuable insights into the impact of different e-commerce presentation layouts on user behaviour, further research could delve deeper into specific aspects such as the influence of individual user characteristics and the long-term effects of layout preferences.

Originality/value

This study contributes original insights by demonstrating the efficacy of facial expressive avatars and emojis in conveying product reviews and ratings within e-commerce platforms. Moreover, it adds value by highlighting the positive impact of animation on user enjoyment. By combining these elements, the research offers a novel approach to enhancing user engagement and understanding of customer feedback in online shopping environments. The findings provide valuable guidance for e-commerce platforms seeking innovative ways to communicate product information effectively and enhance the overall user experience, ultimately benefiting both businesses and consumers.

Citation

Ghandour, R. (2024), "Multimodal presentation of E-commerce product reviews and ratings: empirical investigation using multimodality", Journal of Trade Science, Vol. 12 No. 4, pp. 247-267. https://doi.org/10.1108/JTS-03-2024-0018

Publisher

Emerald Publishing Limited

License

Published in Journal of Trade Science. Published by Emerald Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) licence. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this licence may be seen at http://creativecommons.org/licences/by/4.0/legalcode

1. Introduction

Giving customers the ability to purchase products or services through electronic commerce (e-commerce) is considered a modern approach to gaining customer satisfaction and loyalty. One of the most important features of e-commerce is enabling users to browse a variety of products simultaneously without leaving their physical location. Product reviews are a critical source of information for users during online purchases. According to Duan et al. (2008), consumers evaluate product information, such as reviews, to help them fulfil their consumption goals. When users share reviews of a product or service, they influence other users' purchase decisions. E-commerce interfaces are designed to attract users and facilitate easy and simple purchases.

Many studies have investigated the impact of web design on user performance, emphasizing that the easier an e-commerce interface is to use, the greater its utilization and transaction performance. For example, Hong et al. (2004) suggested that a matrix-based product presentation supports searching, while a list view product presentation format supports browsing. Most user interfaces predominantly use visual stimuli to communicate information, but users may become visually overloaded. People are generally accustomed to multimodal interaction, as they speak, move, gesture and shift their gaze in an effective flow of communication (Oviatt, 1999). Multimodal applications may use non-speech sound, text and hypertext, animation and video, speech, handwriting, gestures and computer vision.

Previous studies have shown the significant impact of reviews and ratings on users’ purchasing decisions and behaviour (Duan et al., 2008; Chevalier and Mayzlin, 2006). For instance, Duan et al. (2008) found that online reviews significantly influence product sales, while Chevalier and Mayzlin (2006) demonstrated that positive reviews can boost product popularity. However, most of these studies focus on traditional text-based reviews. The presentation of product reviews and ratings, particularly through multimodal elements, remains underexplored. This gap is critical because it overlooks the potential benefits of leveraging multimodal elements to create a more immersive and informative shopping experience. For example, Oviatt (1999) discussed the advantages of multimodal interfaces in enhancing user interaction by integrating non-speech sound, text and video. Similarly, recent research by Moreno and Mayer (2007) in educational settings has shown that multimodal presentations can significantly enhance comprehension and retention by distributing cognitive load across different sensory channels. Despite these findings, there is a notable lack of research specifically examining how different modalities for presenting product reviews and ratings affect users' performance, purchase intentions and overall enjoyment in e-commerce contexts (Srinivasan and Murphy, 2011). Moreover, studies by Holzwarth et al. (2006) and Bente et al. (2008) indicated that visual elements such as avatars and facial expressions can enhance user trust and engagement. Yet, the integration of such elements in e-commerce product reviews is seldom investigated. This study aims to fill this gap by exploring the efficacy of facial expressive avatars, emojis and animations in conveying product reviews and ratings, thereby providing a more engaging and enjoyable user experience.

Understanding the impact of multimodal presentation on user experience is vital for e-commerce platforms aiming to enhance user engagement and satisfaction. In an era where user attention spans are short and competition is fierce, employing innovative ways to present information can provide a significant competitive edge. By examining how facial expressive avatars, emojis and animations influence users' interaction with product reviews, this study addresses a pressing need for more dynamic and engaging content delivery methods. Marketing research technologies, such as eye-tracking and facial expression analysis, allow marketers to gain more insights into consumer behaviour (Chinchanachokchai and McKelvey, 2023). This research is particularly important for e-commerce businesses seeking to differentiate themselves and improve customer retention through enhanced user experiences.

Multimodality has been shown to be effective in various contexts beyond e-commerce. In education, recent studies have demonstrated that multimodal learning environments enhance student engagement and understanding by integrating visual, auditory and kinaesthetic learning modalities (Aslan et al., 2019; Cai et al., 2022). Scholars found that students learn better when information is presented in both verbal and visual formats. Multimodal communication in e-learning contexts likewise enhances learners' potential by catering to their educational needs (Haniya et al., 2019). Moreover, multimodal sensory marketing can increase consumer engagement (Lick, 2022). These studies highlight the potential benefits of multimodal approaches in different fields, suggesting that similar benefits could be realized in the context of e-commerce.

1.1 Aim and experimental design

The main aim of this study was to empirically identify the suitability of visual multimodal presentations in communicating review messages and ratings of products. The experiment also aimed to determine whether the inclusion of such metaphors can improve user experience, enjoyment and usability in e-commerce interfaces. An e-commerce experimental platform was developed as the basis for this experiment. Three conditions were implemented and investigated: (a) emoji, (b) avatars with facial expressions, and (c) animation. The study involved three groups to test these different conditions, measuring and comparing the usability performance of the groups in terms of efficiency, effectiveness and enjoyment.

1.2 Contribution

This study makes several key contributions to the field of e-commerce and user experience design. Firstly, it provides empirical evidence on the effectiveness of facial expressive avatars and emojis in conveying product reviews and ratings. This insight can guide e-commerce platforms in adopting more visually appealing and informative review systems. Secondly, by demonstrating the positive impact of animation on user enjoyment, the research highlights an innovative approach to enhancing user engagement. These findings not only contribute to academic knowledge but also offer practical implications for e-commerce businesses looking to optimize their review presentation strategies. Ultimately, this research benefits both businesses and consumers by promoting more effective communication and a more enjoyable shopping experience.

2. Literature review

2.1 Product reviews and rating in E-commerce

As technology has evolved, the number of people performing e-commerce transactions has increased rapidly. For example, as of January 2024, there were approximately 5.18 billion internet users worldwide, representing around 65.6% of the global population (DataReportal, 2024). Many of these users engage in e-commerce activities. Global e-commerce sales are expected to reach $6.3 trillion in 2023, a significant increase from $4.2 trillion in 2020 (Statista a, 2024). Additionally, retail e-commerce sales are estimated to exceed $6.3 trillion worldwide, up from $5.8 trillion in 2023 (Statista b, 2024). E-commerce encompasses not only selling or buying products online but also handling customer queries, integrating payments, promoting products/services and ensuring secure transactions. E-commerce platforms enable customers to compare products based on price, quality and reviews before making a purchase, significantly influencing consumer behaviour. Views on products or services provided by users carry considerable commercial value (Ankita et al., 2023).

Today, internet users share their experiences or review products and services, which in turn influence the purchase decisions of other users. Social media integration with e-commerce platforms allows users to share reviews, thus affecting consumer behaviour from pre-purchase to post-purchase stages. The emergence of user-generated content (UGC) is transforming reputation management practices and prompting organizational changes (Gensler et al., 2013). Electronic word of mouth (e-WOM) is particularly powerful as it can reach a vast audience and is a critical source of product information (King et al., 2014). Positive reviews provide valuable information about satisfactory product experiences, leading to positive outcomes (Chevalier and Mayzlin, 2006). However, the credibility of e-WOM can be questionable due to anonymity, though platforms such as Amazon address this by allowing users to create profiles. Trust in online reviews is growing, with consumers valuing the opinions of online reviewers more than those of traditional marketing professionals (Smith et al., 2012). Consumers are eager to follow the discussions on various web platforms and social media, using this information to make their purchasing or service usage decisions (Ankita et al., 2023).

2.2 User interface and multimodality

The web increases accessibility of top-ranking e-commerce websites, emphasizing the importance of accessible digital platforms in enhancing business reach and consumer interaction across different geographical locations (Acosta-Vargas et al., 2022). Effective web design is crucial for usability, defined by Costabile (2001) as the capability of a software product to be understood, learned, used and attractive to users under specified conditions. Content quality is paramount, as it directly impacts the user experience (Nielsen, 2000). Effective web content design involves a balance of text, images, videos, layout and structure to enhance user comfort and minimize confusion. Multimodal computer systems, which use multiple sensory inputs, can significantly improve user interfaces.

Research in human-computer interaction aims to design systems that are intuitive and accessible. Multimodal systems, which integrate visual and auditory cues, enhance the user experience and effectiveness of e-commerce platforms (Gulliksen et al., 2003). Research indicates that multimedia metaphors assist users in completing tasks with greater accuracy and efficiency. In e-learning, multimodal interfaces improve usability and learning performance (Moreno and Mayer, 2007). The desire to gain deeper consumer insights leads marketers to adopt the latest technologies and tools in their profession (Chinchanachokchai and McKelvey, 2023). Applying multimodal principles to e-commerce, such as using product ratings and reviews with visual metaphors, could similarly enhance user experience.

2.2.1 Facial expression avatar and emojis

Multimodal input systems have been developed to support diverse user needs (Srinivasan and Murphy, 2011). Avatars, which are virtual representations of users, enhance communication in digital contexts (Bailenson et al., 2005). Avatars can be static images or dynamic 3D models and are classified into abstract, realistic and naturalistic types (Chaturvedi et al., 2011). The success of avatars in user interfaces depends on user perceptions and the type of avatar used (Holzwarth et al., 2006).

Facial expressions are crucial for conveying emotions and enhancing communication in digital interactions (Ekman, 1992). The use of 3D animated avatars allows for expressive communication, improving user trust and engagement in e-commerce interfaces (Blascovich et al., 2002). The presence of human faces in digital environments, even as photos, can indicate honesty and reliability (Bente et al., 2008). Facial expressions enhance interpersonal communication by conveying emotions and thoughts that may be difficult to articulate verbally (Keltner and Ekman, 2000). Facial expressions are vital for identifying sentiments, with a smile being a clear indicator of positive emotions (Ankita et al., 2023).

Emojis, developed in the 1990s, have become a popular form of nonverbal communication, extending the concept of emoticons (Gesselman et al., 2019). Emojis and facial expression avatars are used in this study to communicate product reviews and ratings on an experimental e-commerce platform, aiming to enhance user interaction and engagement.

2.3 Theoretical framework

This study is grounded in several key theoretical frameworks to understand the impact of multimodal presentations on user experience in e-commerce:

2.3.1 Media richness theory

Media Richness Theory (MRT) posits that communication media vary in their capacity to convey rich information. Richer media, which include multiple cues such as visual, auditory and contextual information, are more effective for complex communication tasks (Daft and Lengel, 1986). In the context of e-commerce, the use of multimodal elements like facial expressions, emojis and animations can enhance the richness of product reviews. This enhancement occurs as these elements provide more nuanced and immediate forms of feedback, which improve user comprehension and engagement. For instance, facial expressions and emojis can convey emotions and attitudes that are difficult to express through text alone, making reviews more vivid and easier to understand. By leveraging richer media, e-commerce platforms can effectively communicate detailed product information, leading to better user performance and satisfaction.

2.3.2 Cognitive load theory

Cognitive Load Theory (CLT) suggests that learning and comprehension are optimized when extraneous cognitive load is minimized (Sweller, 1988). In an e-commerce setting, this theory is particularly relevant as users are often presented with a large amount of information they need to process quickly. Multimodal presentations can distribute information across different sensory channels, thus reducing cognitive overload. For example, using visual metaphors like emojis and avatars to represent reviews can simplify the information processing task for users, as these elements can be processed more quickly than text. This reduction in cognitive load can lead to improved efficiency and enjoyment, as users are able to understand product reviews more easily and make purchasing decisions more effectively. By minimizing extraneous cognitive load, these multimodal elements enhance the usability and effectiveness of e-commerce interfaces.

Recent studies have further explored these theories in various contexts. For example, Cheng et al. (2022) found that multimodal learning environments enhance student engagement and understanding. Haniya et al. (2019) demonstrated that multimodal communication improves student performance. Lick (2022) showed that multimodal sensory marketing increases consumer engagement.

2.4 Hypotheses development

Based on the assumption that using emojis, facial expression avatars and animations in communicating product reviews and ratings would affect and enhance users' decision-making, performance (effectiveness and efficiency), system usability and enjoyment, the following hypotheses were formulated:

H1. Efficiency of Presentation Metaphors

Emojis provide a simplified and universally understood visual representation of sentiments, which can be quickly interpreted by users (Gesselman et al., 2019). This efficiency should hold across different quantities of product comparisons due to the straightforward nature of emoji symbols.

(a) EMP (emojis presentation) will have the same efficiency in two, three or four product comparisons.

According to MRT, all three multimodal metaphors (emojis, avatars and animations) should enhance communication efficiency by providing richer media. However, the simplicity of emojis and avatars might make them equally efficient across various tasks compared to the more complex animations (Daft and Lengel, 1986).

(b) The presentation metaphors will have the same efficiency for all tasks.

Emojis, being simple and easily interpretable, reduce cognitive load, allowing users to process larger quantities of information more quickly (Sweller, 1988). This is particularly advantageous in scenarios requiring rapid comparison of multiple products.

(c) EMP will be more efficient in communicating a large amount of information in product comparisons than AVP (facial expression avatar) and AMP (animation presentation).

Avatars with facial expressions, while still visually rich, are less time-consuming to interpret than animations, which require more time to watch and comprehend (Holzwarth et al., 2006). Thus, AVP should outperform AMP in efficiency.

(d) AVP will be more efficient than AMP in all product review presentations.

Animations can provide detailed and engaging information for single product reviews but become cumbersome and time-consuming when users need to compare multiple products simultaneously (Bailenson et al., 2005).

(e) AMP in single product reviews will be more efficient than in two, three or four product comparisons.

H2. Effectiveness of Presentation Metaphors

The simplicity and clarity of emojis enhance task performance by minimizing cognitive load and allowing users to focus on the essential information (Gesselman et al., 2019).

(a) EMP will be more effective than AVP and AMP in terms of tasks completed successfully.

Avatars with facial expressions convey nuanced emotional information efficiently without the extended time commitment required for animations, leading to better task completion rates (Blascovich et al., 2002).

(b) AVP will be more effective than AMP in terms of tasks completed successfully.

H3. User Enjoyment

Emojis are not only efficient but also familiar and engaging to users, which enhances enjoyment. Their playful and expressive nature likely contributes to a more enjoyable user experience (Gesselman et al., 2019).

H3. Using EMP will be considered more enjoyable by users compared to AVP and AMP.

H4. User Satisfaction

User satisfaction is influenced by ease of use and effectiveness. Given that emojis are simple, clear and quick to interpret, they are expected to result in higher satisfaction levels (Sweller, 1988; Nielsen, 2000).

H4. Users will be more satisfied with EMP than AVP and AMP in product comparisons.

3. Methodology

3.1 Experimental platform and conceptual framework

An e-commerce platform was designed to investigate different visual multimodal communication conditions, such as facial expression avatars and emojis. All the reviews and ratings presentations were designed to deliver the same information about specific products according to the associated task. The presented page of the product contained sections related to the product (image) and product specification along with the ratings and reviews section. The product and the description were the same for all the multimodal conditions, while the reviews and ratings were different according to the condition being examined. The presentation metaphors used in the platform included: facial expressive avatar presentation (AVP), emoji presentation (EMP) and animation presentation (AMP). Users performed the same tasks varying in complexity (simple, moderate, difficult). This complexity level was increased in every task. Each task was presented separately from the previous task with the use of a different multimodal condition. A between-subjects design was used to ensure that each group tested a presentation metaphor at different complexity levels. The e-commerce platform structure can be seen in Figure 1.

3.2 The implementation of the conditions

3.2.1 Facial expression presentation (AVP)

This presentation used a facially expressive avatar built with the Poser tool. The expressions used were based on guidelines by Fabri et al. (2004) and Rigas and Ghandour (2016). Table 1 shows the mapping between the expression used and the rating value. The design objective of this approach was to communicate the highest possible number of reviews and ratings on the same page, rather than requiring the user to browse several pages. For the experiment, the maximum number of reviews was 100, equating to 100 facially expressive avatars in one product presentation. Figure 2 shows how this approach was used to communicate ratings. The user could read a particular review by hovering the cursor over the corresponding avatar. The same approach was used for two-, three- and four-product comparisons. As the number of products increased, the space available for each product decreased, so fewer than 100 reviews could be shown. The maximum number of reviews communicated in a two-product presentation was 80; in the four-product comparison, the number dropped to 50 reviews per product.

3.2.2 Emoji presentation (EMP)

Emojis were used as the second communication metaphor. The emojis chosen were those suited to communicating product ratings and reviews, namely the angry, sad, neutral, smiley and happy faces. Figure 3 shows the emojis used in the EMP. As with the facial expression avatars, each emoji was mapped to a rating: a rating between 2.5 and 3 was considered neutral, ratings between 2 and 2.4 were communicated using a sad face, and the angry face was used for ratings below 2. The smiley face communicated a rating of 4, and the happy face a rating of 5. All presentations had a consistent layout similar to the ones presented and explained in the AVP; the only difference was the visual metaphor being tested.
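The rating-to-emoji mapping described above can be sketched as a small lookup function. This is only an illustrative sketch, not the platform's actual implementation: the function name `emoji_for_rating` and the returned labels are hypothetical, and the function returns `None` for any rating the text does not explicitly cover (for example, values between 3 and 4) rather than guessing a rule.

```python
from typing import Optional


def emoji_for_rating(rating: float) -> Optional[str]:
    """Return the emoji label stated in the text for a rating.

    Hypothetical helper: only the thresholds given in the paper are
    encoded; ratings the text does not mention yield None.
    """
    if rating < 2:
        return "angry"      # ratings below 2
    if rating <= 2.4:
        return "sad"        # ratings between 2 and 2.4
    if 2.5 <= rating <= 3:
        return "neutral"    # ratings between 2.5 and 3
    if rating == 4:
        return "smiley"     # rating of 4
    if rating == 5:
        return "happy"      # rating of 5
    return None             # range not specified in the text


if __name__ == "__main__":
    for value in (1.5, 2.2, 2.8, 4, 5, 3.5):
        print(value, emoji_for_rating(value))
```

Returning `None` for unspecified ranges keeps the sketch faithful to the description; a working platform would need an explicit rule for those values.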

3.2.3 Animation presentation (AMP)

The implementation of the animation involved using a set of cartoon designs. Abstract avatars, which are cartoon-like interactive characters with limited animation, were utilized (Gazepidis, 2002). A study of 12 users conducted by the authors using the chosen cartoon avatars helped to map the ratings with the characters. Table 2 presents examples of the characters used and the mapping. The design of the animation involved a theatre stage as a background and the characters presenting the reviews appearing on the stage. The type of character presenting the reviews depended on the rating of the review. Figure 4 presents the animation scene. For instance, for a positive rating (above 2.5), a smiley or happy character would present the review. The animation clips were in MPEG4 format and integrated within the experimental platform. The duration of each clip was between 20 and 45 s. The characters with positive or neutral reviews appeared from the left, then moved towards the centre, while characters with negative reviews appeared from the right and moved towards the centre. After presenting a review, the first character faded away before the second character moved towards the centre. This approach aimed to present reviews and ratings like a play, with the user as the audience.

3.3 Experimental design

The aim was to examine the effect of visual multimodal metaphor review and rating presentations and to determine the presentation method that provides better efficiency, effectiveness, user satisfaction and enjoyment for product reviews and ratings. All the multimodal visual presentations were empirically evaluated. In this paper, the efficiency and effectiveness results are presented. A between-subjects approach was applied throughout this experimental investigation. Each group of users tested one presentation metaphor using all the complexity levels. This design ensures that each user tests only one of the methods being evaluated, reducing external factors or influences (such as learning carried over from one condition to another) that could affect users' performance (Field, 2018).

3.4 Sample size and participants

While five users could provide a basic test for system usability, having a larger number of users offers more adequate usability results (Nielsen, 2000). Many scholars have conducted studies with samples ranging between ten and 35 users, while Stahli et al. (2021) experimented with a sample of 45 users. For this study, the sample consisted of 48 opportunistic users, divided into three groups with 16 participants per group. Each group tested one presentation metaphor (AVP, EMP or AMP). The users were familiar with emojis and animation clips but had no prior knowledge of the experimental platform, so the experiment was their first interaction with it. To enhance the reliability of the findings, future studies will aim to increase the sample size to at least 40 participants per group.
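The allocation of 48 participants to three equally sized groups can be illustrated with a short sketch. The paper states that participants were randomly assigned but does not describe the procedure, so the shuffle-and-slice approach, the function name `assign_groups`, the participant IDs and the seed below are all assumptions made for illustration.

```python
import random


def assign_groups(participant_ids, conditions=("AVP", "EMP", "AMP"), seed=None):
    """Randomly split participants into equally sized condition groups.

    Hypothetical helper: shuffles the IDs and slices them into one
    block per condition. Raises if the split cannot be equal.
    """
    ids = list(participant_ids)
    if len(ids) % len(conditions) != 0:
        raise ValueError("participants cannot be split into equal groups")
    rng = random.Random(seed)  # a seed makes the allocation reproducible
    rng.shuffle(ids)
    size = len(ids) // len(conditions)
    return {
        condition: ids[i * size:(i + 1) * size]
        for i, condition in enumerate(conditions)
    }


if __name__ == "__main__":
    # 48 placeholder participant IDs, three groups of 16
    groups = assign_groups(range(1, 49), seed=42)
    for condition, members in groups.items():
        print(condition, len(members), sorted(members))
```

Each participant appears in exactly one group, which matches the between-subjects design in which every group tests a single presentation metaphor.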

3.5 Manipulation of independent variables and manipulation check

In this experiment, two independent variables were manipulated:

(1) Presentation Metaphors (IV1): The visual presentation groups included facial expression avatars (AVP), emojis (EMP) and animations (AMP).
(2) Task Complexity Level (IV2): The tasks' complexity level increased from simple to moderate to complex.

Each participant experienced tasks at different complexity levels with their assigned presentation metaphor. The manipulation check involved asking participants to complete a post-task questionnaire to ensure they understood the differences between the presentation metaphors and the varying complexity levels. This check helped verify the effectiveness of the manipulations.

3.6 Experimental procedure

The experiment was conducted in four parts:

(1) Pre-experiment Questionnaire: Collected demographic information and user profiles.
(2) Task Completion: Participants were presented with the experimental platform conditions and asked to complete tasks according to each task requirement.
(3) Post-task Questionnaire: Gathered data on user performance, satisfaction and enjoyment.
(4) Manipulation Check: Verified participants' understanding of the presentation metaphors and complexity levels.

Table 3 presents the presentation conditions used for each group.

4. Results and discussion

The results were analysed in terms of the time users took to finish the tasks (efficiency), the number of correct choices or selections (effectiveness), post-task satisfaction and post-task enjoyment ratings. Significance was tested using ANOVA (analysis of variance), a widely used statistical method for comparing means among multiple groups to determine whether they differ significantly. These results are linked to the research hypotheses formulated earlier.
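To make the ANOVA comparison concrete, the following minimal sketch computes the one-way ANOVA F statistic from first principles. It is not the analysis pipeline used in the study (the software used is not stated), the function name `one_way_anova_f` is hypothetical, and the numbers in the example are toy values chosen for easy arithmetic, not data from the experiment.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    F is the ratio of the between-group mean square to the
    within-group mean square.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)

    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Toy completion times (minutes) for three hypothetical groups
    toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
    print(one_way_anova_f(toy_groups))
```

For a design with three groups of 16 participants, the degrees of freedom would be 2 (between groups) and 45 (within groups). A large F relative to the F distribution with those degrees of freedom indicates that at least one group mean differs, which is why post hoc tests are then needed to locate the difference.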

The users’ profiles show that they had a similar level of knowledge in the use of the internet, familiarity with online shopping and reading reviews. In addition, the data indicate that the users had broadly similar levels of education (mainly undergraduate) and a good level of computer use. The figures show that the users had similar characteristics and online shopping experience. Hence, differences in the users’ performance can reasonably be attributed to the experimental conditions applied.

4.1 Efficiency (hypotheses H1)

The time taken by users to complete tasks using the three presentations (animation, avatar and emoji) with different complexity levels (easy, moderate and difficult) was measured to assess efficiency.

Figure 5 shows the mean time taken to complete all tasks (a) and the mean time by task complexity (b). As shown in Figure 5 (a), the mean time to complete the tasks using the avatar presentation (AVP) and the emoji presentation (EMP) was lower than that using the animation presentation (AMP). Users took a mean of 2.23 min to complete all tasks with the avatar presentation, similar to the 2.24 min taken with the emoji presentation. In comparison, the animation presentation took significantly more time, with an average of 4.30 min. The mean completion time therefore nearly doubled in the animation group compared to the avatar and emoji groups, a difference of around two minutes. Because the animation presentation (AMP) relied on animation files, tasks were expected to take longer with this metaphor, as users had to watch the animation content.

Figure 5 (b) shows the completion time by task complexity. The tasks were designed to increase in difficulty and comprised two easy tasks, one moderate task and one difficult task. The figure shows that the completion time for the animation presentation was higher at all complexity levels than for the other presentation metaphors. It also shows that completion times for the avatar and emoji presentations were very close at all complexity levels, with a mean difference of up to 0.3 min (30 s). Moreover, the variation in completion time increased when the animation metaphor was tested. The differences in completion time across the presentation metaphors were statistically significant (p < 0.05), as presented in Table 4.

A Levene test confirmed homogeneity of variances among the groups, and an ANOVA test showed significant differences in task completion times (p < 0.001). Because a significant ANOVA indicates only that at least one group differs from the others, a post hoc test is needed to identify which specific groups differ. Fisher’s post hoc LSD test indicated significant differences between the animation and avatar presentations (p < 0.001), but not between the avatar and emoji presentations (p = 0.51). The results showed that the avatar and emoji presentations were significantly more efficient than the animation presentation, supporting hypotheses H1(c) and H1(d). However, hypothesis H1(a) was not supported, as there was no evidence that efficiency remained constant across different product comparisons. Similarly, hypothesis H1(b) was not supported, as efficiency varied across tasks. Hypothesis H1(e) was also not supported: the animation presentation was not more efficient even for single product reviews.
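
To illustrate how a one-way ANOVA followed by Fisher’s LSD comparison works in principle, the following is a minimal sketch in plain Python. The completion times are invented for illustration and are not the study’s data; the function names are hypothetical, and the critical t value (2.262 for 9 within-group degrees of freedom at a two-tailed 0.05 level) is taken from a standard t table rather than computed. The study’s own analysis was presumably run in a statistics package.

```python
from itertools import combinations
from math import sqrt

def one_way_anova(groups):
    """Return (F, df_between, df_within, ms_within) for a dict of samples."""
    all_values = [x for sample in groups.values() for x in sample]
    grand_mean = sum(all_values) / len(all_values)
    means = {name: sum(s) / len(s) for name, s in groups.items()}
    # Between-group and within-group sums of squares.
    ss_between = sum(len(s) * (means[n] - grand_mean) ** 2 for n, s in groups.items())
    ss_within = sum((x - means[n]) ** 2 for n, s in groups.items() for x in s)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    ms_within = ss_within / df_within
    f_stat = (ss_between / df_between) / ms_within
    return f_stat, df_between, df_within, ms_within

def fisher_lsd(groups, ms_within, t_crit):
    """Flag each pair whose mean difference exceeds the least significant difference."""
    means = {name: sum(s) / len(s) for name, s in groups.items()}
    flagged = {}
    for a, b in combinations(groups, 2):
        lsd = t_crit * sqrt(ms_within * (1 / len(groups[a]) + 1 / len(groups[b])))
        flagged[(a, b)] = abs(means[a] - means[b]) > lsd
    return flagged

# Hypothetical completion times in minutes (NOT the study's data).
times = {
    "AVP": [2.1, 2.3, 2.2, 2.4],
    "EMP": [2.2, 2.3, 2.1, 2.4],
    "AMP": [4.1, 4.4, 4.2, 4.5],
}

f_stat, df_b, df_w, ms_w = one_way_anova(times)
# Assumed table value: two-tailed t critical value for alpha = 0.05, df = 9.
pairs = fisher_lsd(times, ms_w, t_crit=2.262)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
print(pairs)
```

With these invented values, the two pairs involving AMP exceed the LSD threshold while the AVP and EMP pair does not, mirroring the pattern reported above. Fisher’s LSD is conventionally applied only after the omnibus F-test is significant, which is the order followed here.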

4.2 Effectiveness (hypotheses H2)

The effectiveness of the different presentation metaphors was evaluated by the number of correct task choices. Each user completed four tasks with varying complexity levels (easy, moderate and difficult). Figure 6 shows the percentage of tasks completed successfully for all tasks (A) and according to task complexity (B). Users performed better using the emoji presentation (EMP) compared to the other metaphors, with a success rate of 89.75%, which is 32% higher than the animation presentation (AMP) and 7% higher than the avatar presentation (AVP).

ANOVA F-tests indicated significant differences in task success rates for easy and difficult tasks (p < 0.05), but not for moderate tasks (p > 0.05), as presented in Table 5. Fisher’s post hoc analyses showed significant differences between the emoji (EMP) and animation (AMP) presentations for easy and difficult tasks (p < 0.05) and between AVP and AMP for difficult tasks (p < 0.05), with no significant difference between the metaphors for the moderate task (p > 0.05), as presented in Table 6. This supports hypotheses H2(a) and H2(b).

4.3 Users’ perceived enjoyment (hypothesis H3)

Enjoyment was measured after each task and again after completing the experiment, using a 5-point Likert scale ranging from 1 (least enjoyable) to 5 (very enjoyable). Each user reported a post-task enjoyment level after completing a task with the presentation metaphor under test. Figure 7 shows the overall enjoyment level, as a percentage, for each presentation metaphor. Users rated EMP as the most enjoyable, with up to 90% agreeing. AVP was second, with an enjoyment level of up to 80.6%, and AMP was lowest at 55.3%. The results indicate that users enjoyed using the system with the EMP interface more than with the AVP and AMP interfaces.

Statistical analyses using ANOVA showed significant differences in enjoyment levels across tasks. For example, Task 1 showed significant differences among the presentation metaphors (F = 11.618, df = 2, p < 0.005). Post hoc analysis using Fisher’s LSD test identified where the significant differences lay among the groups. Tables 7 and 8 show the results of the ANOVA and the post hoc analysis.

4.4 User satisfaction (hypotheses H4)

User satisfaction with the presentation metaphors was measured using a post-task questionnaire based on the System Usability Scale (SUS) from Bangor et al. (2009). Table 9 contains the statements used for system satisfaction. Every user was asked to evaluate the metaphor they had tested using these statements.
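
For readers unfamiliar with the SUS, the sketch below shows the standard scoring rule for the ten-item questionnaire: odd-numbered (positively worded) items contribute the response minus 1, even-numbered (negatively worded) items contribute 5 minus the response, and the sum is multiplied by 2.5 to give a score from 0 to 100. This section reports per-statement agreement frequencies rather than composite scores, so the example is background only; it assumes the statements follow the standard SUS order, and the responses shown are hypothetical.

```python
def sus_score(responses):
    """Compute a standard SUS score (0-100) from ten responses on a 1-5 scale."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly ten responses")
    total = 0
    for item, response in enumerate(responses, start=1):
        if not 1 <= response <= 5:
            raise ValueError("each response must be between 1 and 5")
        if item % 2 == 1:
            total += response - 1   # odd items are positively worded
        else:
            total += 5 - response   # even items are negatively worded
    return total * 2.5

# Hypothetical respondents (NOT the study's data).
best_case = sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1])  # strongest possible approval
neutral = sus_score([3] * 10)                           # midpoint on every item
print(best_case, neutral)
```

The best-case respondent scores 100.0 and the all-neutral respondent scores 50.0, which shows why raw agreement on even-numbered items such as S2 and S10 indicates lower, not higher, satisfaction.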

Figure 8 shows the frequency of users’ agreement with each of the SUS statements in the post-task questionnaire. Users of the different presentation metaphors expressed similar levels of agreement with statement S5, which concerns the functions of the system (I found the various functions in this system were well integrated). However, users were less satisfied with AMP, citing the need to learn before using the system: 50% agreed with S10 (I needed to learn a lot of things before I could get going with this system). S1 asked whether users would use the system frequently (I think that I would like to use this system frequently); 75% agreed for AVP and 70% for EMP, while only 50% agreed for AMP. S3 asked about ease of use (I thought the system was easy to use); users considered EMP and AVP the easiest, with agreement of 100% and 95% respectively. For S2, on the complexity of the system (I found the system unnecessarily complex), 50% agreed for AMP while 0% agreed for AVP and EMP. Overall, users were more satisfied with the EMP and AVP review rating presentations than with the AMP presentation. Statistical analysis using Friedman’s ANOVA test showed a significant difference among the users when using the different presentation metaphors (χ² = 381.994, df = 2, p < 0.05).
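
As background on the Friedman test named above, the following minimal sketch computes the Friedman chi-square statistic by ranking the conditions within each block and comparing the rank sums. The ratings are invented for illustration and are not the study’s data, the function names are hypothetical, and how the blocks were formed in the study is not stated in this section. The sketch omits the correction for ties, so it is exact only when no block contains tied values.

```python
def average_ranks(row):
    """Rank the values in one block from 1 upward, averaging ranks for ties."""
    order = sorted(range(len(row)), key=lambda i: row[i])
    ranks = [0.0] * len(row)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and row[order[end + 1]] == row[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks

def friedman_chi_square(rows):
    """Friedman statistic for n blocks (rows) by k conditions (columns), no tie correction."""
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        for column, rank in enumerate(average_ranks(row)):
            rank_sums[column] += rank
    return 12 / (n * k * (k + 1)) * sum(r ** 2 for r in rank_sums) - 3 * n * (k + 1)

# Hypothetical ratings, one row per block, columns ordered AVP, EMP, AMP.
ratings = [
    [4, 5, 2],
    [5, 4, 1],
    [4, 5, 3],
    [3, 5, 2],
]
chi_square = friedman_chi_square(ratings)
print(chi_square)  # compare with 5.991, the chi-square critical value for df = 2 at 0.05
```

For these invented ratings the statistic is 6.5 with k - 1 = 2 degrees of freedom, which exceeds the 0.05 critical value of 5.991.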

5. Conclusion

The study investigated the impact of visual multimodal facially expressive avatars, emojis and animation presentations on e-commerce product review and rating presentations, considering tasks of varying complexity levels (simple, moderate and difficult). The platform designed for this study included three different multimodal conditions: (1) facial expression avatars (AVP), (2) emojis (EMP), and (3) animation clips (AMP). These conditions were evaluated using a within-subject approach with an opportunistic sample of users (n = 48). The data collected enabled the comparison of the different presentations in terms of efficiency, effectiveness, user satisfaction and enjoyment levels.

Results showed that facially expressive avatars and emoji-based presentations were more efficient than the animation presentation metaphor. Using these two metaphors (EMP and AVP) reduced the time taken by users to complete tasks, and more tasks were completed successfully. While no significant difference was found between EMP and AVP, a significant difference was noted between the animation presentation and the other two conditions. Overall, the study demonstrated that the most enjoyable presentations for reviews involved facially expressive avatars and emojis, highlighting their importance in enhancing user performance on e-commerce interfaces in terms of efficiency, effectiveness, user satisfaction and enjoyment. Although the study showed the role of facially expressive avatars and emojis in communicating product ratings and reviews, the animation presentation metaphor had less impact.

5.1 Theoretical implications

This study contributes to the growing body of literature on multimodal communication in digital environments. The findings support the cognitive theory of multimedia learning (Mayer, 2020), which posits that people learn better from words and pictures than from words alone. Additionally, the results align with cognitive load theory (Sweller et al., 2011), suggesting that simpler visual metaphors (emojis and avatars) reduce cognitive load compared to more complex animations, thus enhancing user performance and satisfaction. The study underscores the importance of effective multimodal communication in digital interfaces, providing empirical evidence that can inform future research in this area.

Furthermore, the studies by Aslan et al. (2019) and Ma et al. (2023) demonstrated the effectiveness of multimodal engagement in real-time educational settings, emphasizing the positive impact of multimodal technologies on engagement and performance. These findings suggest that multimodal approaches are beneficial across various domains, including education and e-commerce, where user engagement is crucial.

5.2 Practical/managerial implications

For e-commerce platforms, these findings have significant practical and managerial implications. Incorporating multimodal elements such as emojis and facial expression avatars into product review sections can improve user experience by making information more accessible and engaging, potentially increasing purchase intentions. The study also highlights the importance of simplicity in design for user engagement and efficiency.

(1) Enhancing User Interface Design:

Emojis and Avatars Integration: Managers and designers should consider integrating emojis and facial expression avatars into the review sections. This can be achieved by developing user-friendly interfaces where these visual elements can be easily added to reviews and ratings. This approach helps to make product reviews more vivid and easily interpretable for users.

(2) Prioritizing Simplicity and Usability:

Simplified Visual Metaphors: The study highlights the importance of simplicity in design for user engagement and efficiency. Managers should direct designers to prioritize easy-to-understand visual metaphors over complex animations. Simple visual elements like emojis and avatars reduce cognitive load and enhance user satisfaction.

(3) Improving User Engagement and Retention:

Engaging Review Systems: To keep users engaged, e-commerce platforms can make the review process more interactive and enjoyable by incorporating visually appealing elements such as emojis and avatars. This can encourage more detailed and frequent reviews, improving the overall richness of product feedback available to potential buyers.

By adopting these strategies, e-commerce platforms can enhance their competitive advantage in the digital marketplace by offering a user-friendly and efficient platform that attracts new users and retains existing ones through an engaging and enjoyable user experience.

5.3 Future research

While this study provides valuable insights, further research is needed to explore the long-term effects of using different visual metaphors on user behaviour and satisfaction. Future studies could investigate the impact of individual user characteristics, such as age and digital literacy, on the effectiveness of these multimodal presentations. Expanding the sample size and diversity would also help generalise the findings to a broader population.

Additionally, future research could explore the use of video presentations of reviews and ratings using emojis to determine if this method yields better results in terms of user performance. Understanding the individual perceptions of animation characters could also provide deeper insights into the effectiveness of animation in e-commerce contexts.

In conclusion, this study underscores the importance of choosing appropriate visual metaphors for presenting product reviews and ratings in e-commerce platforms. By adopting simpler and more intuitive visual elements such as emojis and avatars, businesses can enhance user experience, satisfaction and engagement, ultimately leading to better business outcomes.

Figures

Figure 1

E-commerce platform for the multimodal metaphors presentation of review messages and ratings investigating impact on the user’s performance, purchase intention, satisfaction and enjoyment

Figure 2

Facial expression avatar reviews and ratings

Figure 3

Emojis used in the emoji presentation (EMP)

Figure 4

Animation presentation

Figure 5

Mean time taken by users to complete the tasks, grouped by presentation (A) and by complexity level (B)

Figure 6

Percentage of correctly completed tasks achieved by the users in the three groups for all tasks (A) and for task complexity (B)

Figure 7

Enjoyment percentage for each metaphor presentation

Figure 8

Frequency of users’ agreement for each SUS statement for each presentation metaphor

Table 1

Facial expression mapping with ratings

Table 2

Example characters from the animation clip presentation (AMP)