Exploring Bidirectional Performance of Hotel Attributes through Online Reviews Based on Sentiment Analysis and Kano-IPA Model

Chen, Yanyan; Zhong, Yumei; Yu, Sumin; Xiao, Yan; Chen, Sining

doi:10.3390/app12020692

by Yanyan Chen, Yumei Zhong, Sumin Yu ^*, Yan Xiao and Sining Chen

College of Management, Shenzhen University, Shenzhen 518061, China

^* Author to whom correspondence should be addressed.

Appl. Sci. 2022, 12(2), 692; https://doi.org/10.3390/app12020692

Submission received: 12 November 2021 / Revised: 5 January 2022 / Accepted: 7 January 2022 / Published: 11 January 2022

(This article belongs to the Topic Soft Computing)

Abstract:

As people increasingly rely on online reviews when booking hotels, how to effectively improve customer ratings has become a major concern for hotel managers. Online reviews are a promising data source for enhancing service attributes and thereby improving online bookings. This paper employs online customer ratings and textual reviews to explore the bidirectional performance (good performance in positive reviews and poor performance in negative reviews) of hotel attributes across four hotel star ratings. Sentiment analysis and a combination of the Kano model and importance-performance analysis (IPA) are applied. Feature extraction and sentiment analysis techniques are used to analyze the bidirectional performance of hotel attributes from 1,090,341 online reviews of hotels in London collected from TripAdvisor.com (accessed on 4 January 2022). In particular, a new sentiment lexicon for the hospitality domain is built from numerous online reviews using the PolarityRank algorithm to convert textual reviews into sentiment scores. The Kano-IPA model is applied to explain customers’ rating behaviors and prioritize attributes for improvement. The results identify determinants of high/low customer ratings for different star hotels and suggest that the hotel attributes contributing to high/low customer ratings vary across hotel star ratings. In addition, this paper analyzes the Kano categories and priority rankings of six hotel attributes for each hotel star rating to formulate improvement strategies. Theoretical and practical implications of these results are discussed at the end.

Keywords:

online reviews; hotel attribute; attribute bidirectional performance; sentiment analysis; Kano model; importance-performance analysis

1. Introduction

Whereas people once relied on recommendations from relatives and friends, they now increasingly base hotel booking decisions on online reviews posted on various online travel platforms. Hotel online reviews are posted by numerous customers according to their experiences in hotels and are perceived as more objective, trustworthy and helpful than information provided by hotels [1,2]. Online reviews generally consist of online ratings and textual reviews. Online ratings signal customer satisfaction or dissatisfaction with hotels. Textual reviews contain customers’ actual expectations, feelings and perceptions about hotel services. According to the bounded rationality model, customers are unable to process and extract useful information from large volumes of heterogeneous data, which drives them to rely more on ratings than on textual reviews [3]. As more and more potential customers regard online ratings as a direct reference for hotel quality when selecting hotels, it is crucial for hotels to obtain high customer ratings in order to improve online bookings [4,5]. Therefore, exploring what contributes to the difference in online ratings between satisfied and dissatisfied customers is particularly important for hotels. In other words, to remain sustainably competitive in the hospitality industry, hotels must understand the determinants of customer satisfaction and dissatisfaction, which are proxied by online ratings [6,7].

Existing studies have shown that the performance of multiple hotel attributes is strongly correlated with customer satisfaction [8,9,10]. Most studies have investigated the hotel attributes that lead to customer satisfaction and dissatisfaction through surveys [11,12,13]. Recently, with the development of data mining techniques, online reviews have become a promising data source for customer satisfaction analysis. Several scholars have analyzed attribute performance through online reviews using sentiment analysis methods, and hence found the determinants of customer satisfaction in the hotel industry [14,15]. However, these studies processed hotel reviews as a whole dataset, without distinguishing between positive and negative reviews. Processing hotel reviews as a whole allows the overall performance of multiple attributes to be compared from the perspective of all customers, but it cannot distinguish the contributors to customer satisfaction from the factors resulting in customer dissatisfaction. Previous studies have found that dual-valence reviews (that is, reviews featuring both positive and negative sentiment) exist for hotels of one- to five-star ratings [16,17]. The presence of negative sentiment toward attributes in positive reviews and positive sentiment toward attributes in negative reviews has been observed [18,19,20]. In other words, even if the performance of several hotel attributes does not meet customer expectations, customers may still be satisfied with the hotel and give it high ratings because of the good performance of other hotel attributes. Meanwhile, customers can be very dissatisfied with the hotel and give it low ratings when the performance of certain hotel attributes is poor, even though they think other hotel attributes perform well. Therefore, it is necessary to investigate the following question:

Research Question 1 (RQ1).

Which hotel attribute with good performance contributes to high customer ratings and which hotel attribute with poor performance causes low customer ratings?

In fact, customers’ expectations and perceptions vary across different market segments, such as different hotel star ratings [14,21]. Exploring the determinants of customer satisfaction and dissatisfaction in each market segment is beneficial for making more appropriate and precise strategies [10]. Moreover, comparing the differences in attribute performance across different star hotels helps hotel managers understand customer demands when deciding whether to enter new markets. However, whether the hotel attributes contributing to high/low customer ratings vary across different star hotels has not been verified. Therefore, this study intends to investigate the following question:

Research Question 2 (RQ2).

Does the hotel attribute contributing to high/low customer ratings vary across different star hotels?

To answer the above two questions, it is necessary to analyze the effect of attribute performance on customer satisfaction. Customers’ preferences, expectations and perceptions on each hotel attribute are influenced by comprehensive factors, thus driving positive and negative customer evaluations toward the bidirectional (good and poor) performance of hotel attributes [22]. Traditionally, one unit increase in good performance and one unit decrease in poor performance concerning a certain hotel attribute should cause the same change of customer satisfaction, thus the relationship between attribute performance and customer satisfaction is assumed to be linear or symmetric [23]. However, some studies have demonstrated that some attributes provide more satisfaction than dissatisfaction [24,25,26]. In other words, hotel attributes can have asymmetric effects on customer satisfaction [24]. The Kano model was proposed by Kano et al. (1984) to identify these non-linear or asymmetric relationships between attribute performance and customer satisfaction. The Kano model is often applied to classify hotel attributes into different categories in terms of customer demands, which is helpful for hotel managers to better understand customer expectations and perceptions [27,28]. Meanwhile, considering the limited hotel resources, it is critical to determine attribute priority to maximize customer satisfaction through service improvement. Many studies have shown that applying a combination of the Kano model and importance-performance analysis (IPA) in customer satisfaction analysis can not only analyze customer requirements toward service attributes, but also determine attribute priority [10,29,30,31,32,33]. The IPA is a common and effective technique to formulate improvement strategies according to the importance and performance of the attribute [34]. 
However, existing studies concerning the Kano-IPA model are mainly based on surveys, and few studies use online reviews as a data source for the Kano-IPA model. Two main obstacles limit the application of the Kano-IPA model to online reviews. On the one hand, online textual reviews are unstructured and therefore need to be processed before they can be converted into usable structured data. On the other hand, it is unclear how to apply the processed data to the customer satisfaction model to obtain different Kano categories. Considering that online reviews are a promising data source for analyzing and improving hotel services, this study applies feature extraction and natural language processing (NLP) techniques to conduct Kano-IPA analysis through online reviews.

In summary, this study aims to identify the well-performed attributes contributing to high customer ratings and the poorly performed attributes causing low customer ratings for different star hotels. To this end, we first distinguish between positive and negative reviews for different star hotels according to online ratings. Next, we apply feature extraction and sentiment analysis techniques to explore the bidirectional performance of hotel attributes. In particular, a new sentiment lexicon for the hospitality domain is built from numerous online reviews using the PolarityRank algorithm. To further understand customers’ rating behaviors and demands for hotel service, this study conducts Kano-IPA analysis through online reviews for attribute classification and prioritization. We propose an approach to classify attributes into Kano categories, which facilitates the application of the Kano model to textual reviews. Lastly, a comparative analysis of attribute performance and priority rankings is carried out to enhance the understanding of customers’ demands for different star hotels.

The remainder of this paper is organized as follows. Section 2 briefly reviews the relevant literature to provide the motivation for this study. Section 3 presents the framework and methodology employed in this study. Section 4 presents the results and provides some discussion of this study. Section 5 concludes and offers theoretical and practical implications, limitations, and directions for future research.

2. Literature Review

2.1. Studies on Hotel Online Reviews

Hotel online reviews, in the form of online ratings and textual reviews, represent customers’ emotions and experience of service quality, formed by comparing their expectations with their actual experience. In general, most websites collect customer ratings and opinions on hotels by offering several critical attributes for evaluation. Many researchers have used these hotel attributes to explore customer behaviors. For example, Wang et al. [35] investigated the importance of six attributes offered by TripAdvisor.com (value, location, service, room, cleanliness and sleep quality) in hotel selection decision-making. Liu et al. [9] verified the differences in preferences for these six hotel attributes between domestic and international tourists. Bi et al. [10] also used online reviews from TripAdvisor.com to analyze the asymmetric effect of the performance of these six attributes on overall customer ratings. Nicolau et al. [36] tested the halo effect by analyzing the influence of variations in the ratings of hotel attributes (comfort, staff, services, value for money and cleanliness) on the variation in the ratings of location, where these attributes are offered by Booking.com (accessed on 4 January 2022) for evaluation. Evidently, online reviews contain a variety of information on service quality concerning hotel attributes. Thus, it is important to extract useful information from massive online reviews to help hotel management improve service quality.

Many previous studies focused on the analysis of numerical ratings, including the overall and multi-attribute ratings. Though overall ratings can indicate customers’ overall satisfaction in a straightforward way [37], multi-attribute ratings can provide a better understanding of the factors driving customer satisfaction for different segments within hotels [38]. Sharma et al. [39] classified the multi-attribute ratings into positive, neutral and negative sentiments and then applied interval-valued neutrosophic TOPSIS for ranking hotels. Bi et al. [10] used both overall and multi-attribute ratings to explore the asymmetric effects of attribute performance on customer satisfaction. However, multi-attribute ratings are usually incomplete, which limits their use for obtaining more information about customers’ feelings on the service quality of each attribute.

Online textual reviews, a kind of unstructured data source, contain a wealth of information, including customers’ preferences, expectations, feelings and perceptions toward hotels [6,38], and have gained growing interest among scholars. In particular, with advances in NLP techniques, more and more studies based on text analysis have been conducted. In current hospitality research, topic analysis, which aims to extract the important aspects of a review, has been popular, and a number of topic mining algorithms have been applied. Guo et al. [40] used the Latent Dirichlet Allocation (LDA) topic modeling tool to analyze customers’ preferences for hotel attributes. Hu et al. [41] adopted a structural topic model text analysis method to analyze the causes of customers’ complaints for hotel service improvement. Wang et al. [35] extracted key factors of different attributes for ranking hotels using the term frequency-inverse document frequency (TF-IDF) and Word2Vec methods. While topic mining is useful for identifying key service factors, it cannot reflect whether customers are satisfied with hotel service quality. Sentiment analysis of textual data, which focuses on extracting a review’s sentiment polarity (positive, negative or neutral), can indicate customers’ real emotions and satisfaction toward hotel services [42,43]. Therefore, some hospitality scholars have gradually applied both topic mining and sentiment analysis techniques to capture customers’ concerns, emotions, satisfaction and complaints toward hotel services. Bi et al. [44] applied LDA and IOVO-SVM algorithms to identify hotel attributes and their sentiment strengths to conduct IPA plotting for attribute improvement strategies. Al-Smadi et al. [45] used the bidirectional Long Short-Term Memory (LSTM) to extract opinionated aspects and their polarity from Arabic hotels’ reviews. Nie et al. [46] applied a semantic partitioned sentiment dictionary to obtain sentiment values of different attributes to rank hotels.

Existing studies suggested that online textual reviews reveal details of customers’ demands and perceptions of hotel attributes [6,9]. Although numerous textual reviews have been studied widely to extract critical hotel attributes and their sentiment [14,15,46], few studies have distinguished between positive reviews and negative reviews to identify well-performed attributes contributing to satisfied customers and poorly performed attributes resulting in dissatisfied customers. Therefore, this study attempts to explore the difference between the good performance and poor performance of the same attribute with respect to positive reviews and negative reviews concerning different star hotels through sentiment analysis.

2.2. Studies on Sentiment Analysis

Sentiment analysis has emerged as an important aspect of NLP. Sentiment analysis leverages a variety of NLP techniques to extract the sentiment expressed in texts and determine whether they are positive, negative or neutral [47,48]. Analysis of text sentiments has spread across many fields such as consumer information, marketing, books, application, social media, tourism destination and hotels [49,50,51,52,53,54,55]. The approaches to sentiment analysis can be mainly divided into two types, namely, machine learning and lexicon-based methods.

Machine learning methods represent documents as vectors in a feature space and classify them into predefined sentiment categories [56]. There are several machine learning methods for sentiment classification, such as naive Bayes (NB), maximum entropy and support vector machine (SVM). These classifiers usually use the bag-of-words (BoW) method comprising unigrams or n-grams to determine how the documents are represented [57], resulting in high dimensionality of the feature space. With the help of feature selection techniques, such as part-of-speech (POS) tagging, which aims to disambiguate text sense based on a lexical category, machine learning algorithms can reduce the high-dimensional feature space by eliminating noisy and irrelevant features. Existing studies concluded that some machine learning classifiers perform better than lexicon-based methods [58,59], but these methods have certain defects: (1) classifiers trained for a domain-specific problem do not perform well in other domains [58]; (2) feature construction is critical but hard to implement [60]; and (3) these methods usually rely on a great volume of manually labeled training data [61]. Given these drawbacks, unsupervised methods such as lexicon-based methods are applied.

Lexicon-based methods assume that the sentiment orientation of a text is the sum of the sentiment orientations of its words or phrases, which are obtained by matching each word or phrase against a sentiment lexicon and its associated sentiment scores [62]. In general, adjectives are used as indicators of the semantic orientation of a text [58]. More recently, verbs and nouns have also been compiled into sentiment dictionaries [63]. Such a lexicon or dictionary can be created manually, or automatically by using seed words to expand the list of words [64]. Abdulla et al. [65] built a lexicon for the Arabic language and the proposed approach gained better accuracy than other methods. Taboada et al. [58] constructed a dictionary incorporating intensification and negation to compute text sentiment scores, which is called the semantic orientation calculator (SO-CAL) approach. Dey et al. [66] developed an n-gram sentiment dictionary called Senti-N-Gram for automatic score calculation. Compared to machine learning methods, sentiment lexicons learned from a certain domain preserve the domain-based orientation of words, which provides greater accuracy for sentiment analysis tasks [67]. Furthermore, lexicon-based methods take the lexical and syntactical information in linguistic content into account in order to revise the sentiment valence [56]. That is, in the sentiment scoring process, negation, intensification, and the rhetorical roles of text segments are taken into account as well. Language-dependent features can also be considered in lexicon-based models [68].
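The additive scoring idea described above, with negation and intensification, can be illustrated with a minimal sketch. The lexicon entries, modifier weights and the function name `score_tokens` are hypothetical values chosen for illustration; they are not the lexicon built in this paper, and flipping the sign on negation is a simplifying assumption rather than the SO-CAL treatment.

```python
# Toy lexicon and modifier lists: illustrative assumptions only.
LEXICON = {"clean": 1.0, "friendly": 1.5, "noisy": -1.0, "dirty": -2.0}
INTENSIFIERS = {"very": 1.5, "slightly": 0.5}
NEGATORS = {"not", "never", "no"}


def score_tokens(tokens):
    """Sum word-level sentiment, adjusting for a preceding intensifier/negator."""
    total = 0.0
    for i, tok in enumerate(tokens):
        if tok not in LEXICON:
            continue
        value = LEXICON[tok]
        j = i - 1
        # An intensifier directly before the sentiment word scales its strength.
        if j >= 0 and tokens[j] in INTENSIFIERS:
            value *= INTENSIFIERS[tokens[j]]
            j -= 1
        # A negator before the (possibly intensified) word flips its sign.
        if j >= 0 and tokens[j] in NEGATORS:
            value = -value
        total += value
    return total


print(score_tokens("the room was not very clean".split()))
print(score_tokens("friendly staff but noisy street".split()))
```

In practice the lexicon would be learned from domain reviews, which is the role the PolarityRank step plays in this study.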

In summary, machine learning sentiment analysis is trained on a particular dataset using selected features and may reach quite high accuracy in detecting the polarity of a text. However, it is highly dependent on labeled data, which limits its application. Unsupervised lexicon-based methods, such as knowledge-graph propagation and seed word-based methods, not only overcome the absence of labeled data, but are also able to extract domain-specific sentiment words [69,70]. Thus, lexicon-based methods are considered preferable for sentiment analysis in a certain domain, and in this study an unsupervised lexicon-based sentiment method is used to explore the bidirectional performance of hotel attributes.

2.3. The Kano Model

Traditionally, customer satisfaction has been regarded as one-dimensional or symmetric: the higher the perceived product/service quality is, the higher the customer’s satisfaction is and vice versa [23]. However, continuous improvements in product/service attributes without considering what customers actually want may not engender a higher level of customer satisfaction. Some researchers argued that the relationship between attribute performance and customer satisfaction is nonlinear or asymmetric [23,24,71]. Consequently, Kano et al. (1984) introduced a two-dimensional model, called the Kano model, that clarifies the asymmetric and nonlinear relationship between product/service attribute performance and customer satisfaction, and classifies the attributes into five categories, namely, must-be factors, one-dimensional factors, attractive factors, indifferent factors and reverse factors [72]. Later, a simplified Kano model was proposed that classifies attributes into three factors, namely basic, performance and excitement factors, which correspond to must-be, one-dimensional and attractive factors, respectively [73]. This simplified model has been widely used in different research domains, as shown in Figure 1.

Basic: These attributes are the basic requirements of product/service. Customers are extremely dissatisfied when these attributes don’t meet their expectations. However, when customer expectations are exceeded, customers are just neutral since they take it for granted.
Performance: The performance of these attributes is positively and linearly related to customer satisfaction. In other words, the customer satisfaction increases with the increase in attribute performance, and vice versa.
Excitement: When the performance of these attributes exceeds customer expectations, customers are satisfied, but they are not dissatisfied when these attributes are absent. Therefore, good performance of this category has a stronger impact on customer satisfaction than its poor performance.

Identifying different categories of attributes helps hotels understand the determinants of customer satisfaction and dissatisfaction, and hence improve service attributes effectively [29,74]. Many scholars have applied the Kano model to understand customer expectations and perceptions toward different service attributes in hospitality research, as shown in Table 1.

Most of these studies are based on questionnaire surveys and scarcely extract key attributes from big data such as online reviews. One of the problems limiting the application of the Kano model to big data is that it is hard to classify service attributes into Kano categories using existing methods. In current Kano model analysis, several methods have been introduced to classify quality attributes. Kano et al. (1984) provided an approach using a structured questionnaire with functional and dysfunctional questions for each attribute [73]. The penalty-reward contrast analysis (PRCA) has been used widely to classify quality attributes by regression analysis with two sets of dummy variables for each attribute [25,73]. The moderated regression approach based on a five-point Likert scale, proposed by Lin et al. [77], uses regression coefficients to classify attributes. Another quantitative method called the “importance grid” has also been applied in a variety of studies; it compares the explicit and implicit importance of each attribute to categorize it into one of three factors [30,78]. Qualitative data methods including the critical incident technique (CIT) and the “analysis of complaints and compliments” (ACC) have been applied to categorize attributes by comparing how frequently customers mention an attribute in a positive context versus a negative context [76,79,80]. In conclusion, these methods distinguish between different types of attributes by comparing the impacts of good performance and poor performance of the attribute on customer satisfaction.
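The shared principle of these methods, comparing the impact of good performance with the impact of poor performance, can be sketched in a few lines. This is a generic penalty-reward style illustration and not the classification approach proposed later in this paper; the function name `classify_kano`, the inputs and the tolerance `tol` are hypothetical.

```python
def classify_kano(reward, penalty, tol=0.1):
    """Classify an attribute from two non-negative effect sizes.

    reward:  gain in satisfaction when the attribute performs well.
    penalty: loss in satisfaction (as a magnitude) when it performs poorly.
    tol:     hypothetical tolerance within which the two effects count as equal.
    """
    if abs(reward - penalty) <= tol:
        return "performance"   # symmetric effect on satisfaction
    if reward > penalty:
        return "excitement"    # delights when good, barely hurts when poor
    return "basic"             # barely helps when good, hurts when poor


print(classify_kano(0.6, 0.1))
print(classify_kano(0.1, 0.7))
print(classify_kano(0.40, 0.45))
```

In a PRCA setting the two inputs would typically come from the regression coefficients of the high-performance and low-performance dummy variables.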

Most studies relied on questionnaire surveys when using the above Kano classification methods to categorize attributes, which indicates that existing classification methods may not be suitable for applying the Kano model to large volumes of textual reviews because such data are unstructured. Following the classification principle of the Kano model, this study proposes a novel approach to classify hotel attributes into Kano categories in text analysis. This new approach supports the application of the Kano model to large volumes of unstructured data to explore customer satisfaction.

2.4. Importance-Performance Analysis

Importance-performance analysis (IPA), proposed by Martilla and James (1977), is a graphical tool to classify attributes for improvement and rank their priority based on the importance and performance of each product/service attribute [34,81]. This approach constructs a plot with two dimensions, the importance and the performance of each product/service attribute as perceived by consumers, and classifies attributes into four quadrants associated with different management strategies. An example of IPA is given in Figure 2. Quadrant I is labeled ‘Keep up the good work’, where attributes are considered highly important, and their performance is high. Attributes located in quadrant I can be considered the major strengths of the product/service and should be maintained. Quadrant II is labeled ‘Possible overkill’, where attributes have low importance but high performance. The resources dedicated to these attributes may be excessive, so it is appropriate to reallocate limited resources to other, more important attributes. Quadrant III is labeled ‘Low priority’, where attributes have both low importance and low performance. Attributes in this quadrant are regarded as minor weaknesses and have a low priority for improvement. Quadrant IV is labeled ‘Concentrate here’, where attributes are considered highly important but perform poorly. Attributes in this quadrant are regarded as major weaknesses and should be given a high priority for improvement.
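The quadrant assignment described above can be sketched as follows. The attribute scores are made-up numbers, and placing the crosshairs at the mean importance and mean performance is one common convention assumed here for illustration; the function name `ipa_quadrant` is hypothetical.

```python
def ipa_quadrant(importance, performance, imp_cut, perf_cut):
    """Return the IPA quadrant label, using the quadrant numbering in the text."""
    high_imp = importance >= imp_cut
    high_perf = performance >= perf_cut
    if high_imp and high_perf:
        return "I: Keep up the good work"
    if not high_imp and high_perf:
        return "II: Possible overkill"
    if not high_imp and not high_perf:
        return "III: Low priority"
    return "IV: Concentrate here"


# Hypothetical (importance, performance) scores for four attributes.
scores = {
    "location": (0.8, 0.9),
    "value": (0.3, 0.7),
    "sleep quality": (0.2, 0.2),
    "cleanliness": (0.9, 0.3),
}
# Assumed convention: crosshairs at the mean of each dimension.
imp_cut = sum(i for i, _ in scores.values()) / len(scores)
perf_cut = sum(p for _, p in scores.values()) / len(scores)

for name, (imp, perf) in scores.items():
    print(name, "->", ipa_quadrant(imp, perf, imp_cut, perf_cut))
```

Other cut-off choices, such as scale midpoints or medians, move the crosshairs and can change which attributes fall into ‘Concentrate here’.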

IPA is applied in a wide variety of research domains, partly due to the clear managerial strategies it provides on how to allocate resources and efforts [82], and also due to its ability to identify the strengths and weaknesses of a product/service to guide management in taking effective measures to stay competitive [83]. In hospitality research, IPA is commonly integrated with other techniques, such as SERVQUAL [84,85], data envelopment analysis [86,87], partial least squares path modeling [88] and the Kano model [10,30,31,32,33]. Many scholars have applied a combination of the Kano model and IPA to customer satisfaction analysis. For example, Bi et al. [10] applied the Kano model and asymmetric IPA to explore the asymmetric effects of hotel attribute performance on customer satisfaction through online ratings. Jou and Day [31] integrated the Kano model and IPA into a three-dimensional IPA approach to identify the critical service attributes for hotel online booking through a survey. Tseng [32] constructed an IPA-Kano model for classifying and diagnosing service attributes at the TPE airport. Pai et al. [33] combined the Kano model and IPA to investigate the critical service quality attributes to enhance customer satisfaction in the chain restaurant industry.

However, few hotel studies have conducted Kano-IPA analysis using online textual reviews. Furthermore, few studies have applied the Kano-IPA model to obtain hotel attribute priority rankings that guide resource allocation for improvement across different hotel star ratings. These literature gaps need to be addressed. Thus, considering the effectiveness of the Kano model and IPA in providing constructive guidelines for hotels to enhance customer satisfaction, it is of great significance to explore the application of the Kano-IPA model to hotel textual reviews across different hotel star ratings.

3. Materials and Methods

The main objective of this study is to explore what contributes to the difference in hotel customer ratings for different star hotels. Specifically, this study identifies well-performed attributes contributing to high customer ratings and poorly performed attributes causing low customer ratings in terms of hotel star ratings by exploring the bidirectional performance of hotel attributes. This study also aims to apply the Kano-IPA model in online textual reviews for a better understanding of customers’ rating behaviors and demands, and hence provides effective attribute improvement strategies for different star hotels.

In this section, we propose a methodology to realize the above objectives; the structure of this methodology framework is shown in Figure 3. First, we collected data from TripAdvisor.com and processed the data according to hotel star ratings and customer ratings. Second, sub-attributes of six hotel attributes (value, location, service, room, cleanliness and sleep quality) that customers mentioned in online reviews were extracted. Specifically, similar terms and their similarity under each attribute were identified through the Word2Vec algorithm. Third, a sentiment lexicon for the hospitality domain was built through the PolarityRank algorithm to obtain sentiment values of each attribute and sub-attribute. Fourth, well-performed attributes that contribute to customer satisfaction and poorly performed attributes that cause customer dissatisfaction were identified for hotels of different star ratings through sentiment analysis. Finally, the above text mining results were used to conduct the Kano-IPA analysis for different star hotels. In particular, a novel approach for Kano model classification is proposed. Thus, the improvement strategies and priority of attributes are provided for different star hotels.

3.1. Data Collecting and Processing

We collected hotel online reviews in London from TripAdvisor.com, which is the world’s largest travel-sharing website. TripAdvisor.com contains millions of unbiased user-generated reviews from customers worldwide; thus, it’s feasible to collect a large volume of online reviews. The data collection and processing steps in this paper are as follows.

First, hotels in London were selected as the data source for this research. London is one of the largest financial centers in Europe, as well as one of the world’s most famous tourist destinations. It attracts millions of customers from across the world. Statistically, London recorded 28.47 million bed nights of domestic tourists and 118.9 million nights of international visitors in 2019 [89,90].

Second, we crawled all available hotel-level and review-level information for London using a Python program. Hotels with fewer than 400 reviews in English were removed to ensure the credibility of the research sample. A total of 640 hotels with 1,090,341 reviews in English satisfied our requirements. Hotel-level information contains the hotel name, star, rating, number of reviews and address. Each review-level record contains the reviewer, travel type, posting time, stay time, textual review, overall rating and ratings on six hotel attributes (value, location, service, room, cleanliness and sleep quality).

Finally, we classified the hotel reviews into different datasets according to the hotel star ratings and review overall ratings. Following the studies in [9,91], we categorized the online reviews into four datasets according to the hotel star ratings, namely, two-star and below hotels (1-, 1.5-, 2- and 2.5-star hotels), three-star hotels (3- and 3.5-star hotels), four-star hotels (4- and 4.5-star hotels) and five-star hotels. Review classification based on the review overall rating is controversial [18]. The main point of contention is whether reviews with a rating of 3 should be classified as neutral or negative. Studies have shown that most potential customers regard a rating of 3 as close to a service failure [18,92]. Therefore, in this study, the online reviews of each hotel star group were divided into two sub-datasets according to the review overall rating: reviews rated 1–3 as negative reviews and reviews rated 4–5 as positive reviews. Let $D_{t}^{neg}$ and $D_{t}^{pos}$ respectively denote the negative and positive datasets of $t$-star hotels, $t = 2, 3, 4, 5$. The final distribution of sub-datasets is shown in Table 2.
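The two-level split described above can be sketched in Python as follows. This is an illustrative sketch rather than the program used in the study; the record fields `hotel_stars` and `overall_rating` and all function names are hypothetical.

```python
def star_group(hotel_stars):
    """Map a hotel star rating to the group index t (2, 3, 4 or 5)."""
    if hotel_stars < 3:
        return 2  # two-star and below: 1, 1.5, 2 and 2.5 stars
    if hotel_stars < 4:
        return 3  # 3 and 3.5 stars
    if hotel_stars < 5:
        return 4  # 4 and 4.5 stars
    return 5


def polarity(overall_rating):
    """Ratings 1-3 count as negative reviews, ratings 4-5 as positive."""
    return "neg" if overall_rating <= 3 else "pos"


def split_reviews(reviews):
    """Group review records into sub-datasets keyed by (t, polarity)."""
    datasets = {}
    for review in reviews:
        key = (star_group(review["hotel_stars"]), polarity(review["overall_rating"]))
        datasets.setdefault(key, []).append(review)
    return datasets


sample = [
    {"hotel_stars": 2.5, "overall_rating": 3},
    {"hotel_stars": 4.5, "overall_rating": 5},
]
datasets = split_reviews(sample)
```

Here the key `(2, "neg")` plays the role of $D_{2}^{neg}$ and `(4, "pos")` that of $D_{4}^{pos}$.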

3.2. Text Preprocessing and Sub-Attributes Selection

3.2.1. Text Preprocessing

Text preprocessing was carried out with modules of the Natural Language Toolkit (NLTK) in a Python environment, using the following standard steps:

Correcting spelling errors and normalizing words with variant spellings (e.g., “isn’t” to “is not”);
Sentence segmentation and word tokenization;
Converting capital letters to lowercase;
Removing non-English characters, punctuation and stopwords (using an existing stopword list from https://www.ranks.nl/stopwords, accessed on 30 October 2021);
POS tagging;
Lemmatization (reducing inflected forms to their root forms, e.g., “rooms” to “room”).
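
To make the order of these steps concrete, the following is a minimal standard-library sketch of a reduced pipeline. It is not the NLTK-based implementation used in the study: spelling correction and POS tagging are omitted, and the small `CONTRACTIONS`, `STOPWORDS` and `LEMMAS` tables are hypothetical stand-ins for the full resources.

```python
import re

# Toy stand-ins for the full resources; all entries are illustrative only.
CONTRACTIONS = {"isn't": "is not", "wasn't": "was not", "weren't": "were not"}
STOPWORDS = {"the", "a", "an", "is", "was", "were", "and", "it"}
LEMMAS = {"rooms": "room", "beds": "bed", "hotels": "hotel"}


def preprocess(review):
    """Return a list of sentences, each a list of cleaned tokens."""
    # Lowercase and normalize typographic apostrophes.
    text = review.lower().replace("\u2019", "'")
    # Expand variant spellings such as contractions.
    for short, full in CONTRACTIONS.items():
        text = text.replace(short, full)
    # Naive sentence segmentation on terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    result = []
    for sentence in sentences:
        # Keep alphabetic tokens only, which drops digits and punctuation.
        tokens = re.findall(r"[a-z]+", sentence)
        # Remove stopwords, then map inflected forms to their lemma.
        tokens = [LEMMAS.get(t, t) for t in tokens if t not in STOPWORDS]
        result.append(tokens)
    return result


cleaned = preprocess("The rooms weren't clean. Location is GREAT!")
```

A dictionary lookup is used for lemmatization here only to keep the sketch self-contained; a real lemmatizer relies on POS tags, which is why POS tagging precedes lemmatization in the list above.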

3.2.2. Sub-Attributes Selection

In this study, six key attributes including value, location, service, room, cleanliness and sleep quality [9,10,35,36,46] are selected to explore the role of their bidirectional performance on customer overall ratings. These six attributes are provided by TripAdvisor.com as significant factors for customers to review [35]. Hotel customers use a variety of elements to evaluate the performance of the same attribute [46,93,94]. For example, customers may use “locate”, “place” and “distance” to describe the attribute “location”. Therefore, extracting words that are semantically similar to each hotel attribute is essential to comprehensively understand customers’ opinions.

In this study, we use the Word2Vec algorithm to extract words semantically similar to each hotel attribute from textual reviews. Word2Vec is a word-embedding method that can be used to compare the degree of semantic similarity between two words or two texts. Given a text corpus, Word2Vec learns a vector for each word in the vocabulary using the Continuous Bag-of-Words or the Skip-Gram neural network architecture [95]. Continuous Bag-of-Words is suitable for a small corpus, while Skip-Gram performs better on a large corpus. After the word vector model is trained, the similarity between words can be obtained. For this study, we used the gensim library, which provides a ready-made implementation of the Word2Vec algorithm. We trained word vectors on the dataset of each hotel star group using the Skip-Gram model. With the trained Word2Vec model for each dataset, the similarity value between attribute $A_{i}$ and each word in the dataset is calculated, where $i = 1, 2, \dots, 6$. The words whose similarity value with attribute $A_{i}$ is greater than 0.5 are selected as the sub-attributes $B_{ij}$ of attribute $A_{i}$, where $j = 1, 2, \dots, P$ and $P$ is the number of sub-attributes. Let $SS_{ij}$ denote the similarity value between sub-attribute $B_{ij}$ and attribute $A_{i}$.
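
The thresholding step can be illustrated with a short sketch. It assumes that similarity is measured as the cosine between word vectors and uses hand-written two-dimensional toy vectors in place of trained Skip-Gram embeddings; the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_sub_attributes(attribute, vectors, threshold=0.5):
    """Return {word: SS} for every word whose similarity exceeds the threshold."""
    target = vectors[attribute]
    scores = {word: cosine(target, vec) for word, vec in vectors.items()}
    return {word: s for word, s in scores.items() if s > threshold}


# Toy vectors; real ones would come from the trained model.
toy_vectors = {
    "location": [1.0, 0.0],
    "place": [0.8, 0.6],
    "distance": [0.6, 0.8],
    "breakfast": [0.0, 1.0],
}
sub_attributes = select_sub_attributes("location", toy_vectors)
```

In this toy example "place" and "distance" pass the 0.5 threshold and "breakfast" does not. The attribute term itself is retained with a similarity of 1.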

3.3. Sentiment Lexicon Creation

We used the PolarityRank algorithm to create a sentiment lexicon from hotel reviews. The algorithm has achieved reasonable accuracy in domain-specific sentiment analysis without training [63,96]. PolarityRank is an unsupervised sentiment analysis method based on PageRank. It takes the relevance between nodes into account and propagates both the positive PolarityRank ($PR^{+}$) and the negative PolarityRank ($PR^{-}$) of a node to other nodes along the weighted edges of a graph [63,96,97]. The main idea behind PolarityRank is to calculate two measures of relevance, one positive and one negative, for each node in the graph [63].

Given a text, a graph can be built from its lexical and syntactic dependencies; in NLP, this structure is known as a dependency-based parse tree. The lexical graph is defined as $G = (N, E)$, where $N = \{g_{x}\}$ is a set of nodes and $E$ is a set of bidirectional edges between pairs of nodes according to the syntactic dependencies and between all nodes contained in descendant branches. The edge between nodes $g_{x}$ and $g_{y}$ carries an associated weight denoted by $w_{xy}$. An example of a lexical graph is shown in Figure 4. After the graph is generated, the propagation of $PR^{+}$ and $PR^{-}$ for each node begins. The implementation steps are described in detail in Sections 3.3.1, 3.3.2 and 3.3.3.

3.3.1. Selecting Candidate Sentiment Words

After text preprocessing, the words were lemmatized and tagged as nouns, verbs, adjectives, adverbs, pronouns, etc. Previous studies selected the lemmatized nouns, adjectives and verbs as candidate sentiment words and discarded adverbs on the grounds that they merely alter the degree of polarity of the words they modify and carry no inherent sentiment polarity [96,98]. In fact, many adverbs do carry sentiment polarity. For example, the adverb “luckily” in the sentence “Luckily, there was one room available” expresses positive emotion.

To accurately analyze customers’ feelings, we used all the lemmatized nouns (n), verbs (v), adjectives (a) and adverbs (ad) as candidate sentiment words. The nodes of the graph correspond to the candidate sentiment words from hotel reviews and are connected by bidirectional edges. Following Fernández-Gavilanes et al. [96], the co-occurrence frequency of nodes $g_{x}$ and $g_{y}$ in the whole dataset is assigned as the weight $w_{xy}$ of the edge joining $g_{x}$ and $g_{y}$.

3.3.2. Assigning Initial Values to Candidate Sentiment Words

In this step, each candidate sentiment word is assigned an initial positive value $e^{+}$ and an initial negative value $e^{-}$ from SentiWordNet 3.0 using a Python program. SentiWordNet 3.0 is a publicly available general-purpose sentiment lexicon that provides three sentiment scores for each word, namely positive, negative and objective scores [99]. For each candidate sentiment word, we assigned the positive score from SentiWordNet 3.0 to $e^{+}$ and the negative score to $e^{-}$. For words not included in SentiWordNet 3.0, $e^{+}$ and $e^{-}$ are set to zero.

3.3.3. Calculation of ${PR}^{+}$ and ${PR}^{-}$

With weights for the edges and pairs of initial sentiment values for the nodes, the calculation of $PR^{+}$ and $PR^{-}$ can commence. Let $E(g_{x})$ be the set of indices $y$ of the nodes for which there exists an edge to node $g_{x}$, and let $e_{x}^{+}$ and $e_{x}^{-}$ be the initial positive and negative values of node $g_{x}$, respectively. The parameter $\alpha$ is a damping factor that ensures convergence; it is set to 0.85 following the original definition of PageRank [63,97]. $PR^{+}$ and $PR^{-}$ are estimated as follows:

$$PR^{+}(g_{x}) = (1 - \alpha)\, e_{x}^{+} + \alpha \sum_{y \in E(g_{x})} \frac{w_{xy}}{\sum_{z \in E(g_{y})} w_{yz}}\, PR^{+}(g_{y}) \tag{1}$$

$$PR^{-}(g_{x}) = (1 - \alpha)\, e_{x}^{-} + \alpha \sum_{y \in E(g_{x})} \frac{w_{xy}}{\sum_{z \in E(g_{y})} w_{yz}}\, PR^{-}(g_{y}) \tag{2}$$

The propagation process continues until the calculation converges or the number of iterations reaches a fixed limit. In this study, after testing this process, a maximum of 300 iterations is set as the stopping criterion.
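
The iteration behind Equations (1) and (2) can be sketched as below. This is a minimal illustration under stated assumptions, not the implementation used in the study: the normalizing denominator is read as the total weight of the edges incident to $g_{y}$ (the usual PageRank-style normalization), scores are initialized with the $e$ values, the convergence tolerance `tol` is a hypothetical choice, and the three-word graph is toy data.

```python
def polarity_rank(weights, e_pos, e_neg, alpha=0.85, max_iter=300, tol=1e-8):
    """Propagate positive and negative scores over a weighted undirected graph.

    weights[x][y] holds the co-occurrence weight w_xy and must be symmetric.
    """
    nodes = list(weights)
    # Total edge weight of each node, used to normalize what it passes on.
    total = {y: sum(weights[y].values()) for y in nodes}
    pr_pos = {x: e_pos.get(x, 0.0) for x in nodes}
    pr_neg = {x: e_neg.get(x, 0.0) for x in nodes}
    for _ in range(max_iter):
        new_pos, new_neg = {}, {}
        for x in nodes:
            spread_pos = sum(w / total[y] * pr_pos[y] for y, w in weights[x].items())
            spread_neg = sum(w / total[y] * pr_neg[y] for y, w in weights[x].items())
            new_pos[x] = (1 - alpha) * e_pos.get(x, 0.0) + alpha * spread_pos
            new_neg[x] = (1 - alpha) * e_neg.get(x, 0.0) + alpha * spread_neg
        # Largest per-node change in this sweep, used as the stopping test.
        change = max(
            abs(new_pos[x] - pr_pos[x]) + abs(new_neg[x] - pr_neg[x]) for x in nodes
        )
        pr_pos, pr_neg = new_pos, new_neg
        if change < tol:
            break
    return pr_pos, pr_neg


# Toy graph: "staff" co-occurs twice with "great" and once with "rude".
graph = {
    "great": {"staff": 2.0},
    "staff": {"great": 2.0, "rude": 1.0},
    "rude": {"staff": 1.0},
}
pr_pos, pr_neg = polarity_rank(graph, e_pos={"great": 1.0}, e_neg={"rude": 1.0})
```

In this example "staff" starts with no polarity of its own and acquires both a positive and a negative score from its neighbours, which is the behaviour the propagation is meant to produce for domain words missing from the seed lexicon.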

3.3.4. Calculation of Semantic Orientation

With the final values of $PR^{+}$ and $PR^{-}$, and following Cruz et al. [63], the semantic orientation $SO$ of each candidate sentiment word is normalized as:

$$SO(g_{x}) = 5 \cdot \frac{PR^{+}(g_{x}) - PR^{-}(g_{x})}{PR^{+}(g_{x}) + PR^{-}(g_{x})} \tag{3}$$

Finally, we dropped the candidate sentiment words with a zero $SO$. Thus, the sentiment lexicon built from hotel reviews consists of the words with nonzero $SO$. Let the two-tuple $(g_{k}, SO_{k})$ denote sentiment word $g_{k}$ and its corresponding sentiment value $SO_{k}$, where $SO_{k} \in [-5, 5]$ and $k = 1, 2, \dots, m$, with $m$ representing the number of words in the lexicon.
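
As a small illustration of Equation (3) and of the filtering step, the sketch below converts a pair of scores into a semantic orientation and keeps only words with a nonzero value. Treating a word whose two scores are both zero as neutral is an assumption made here to avoid a division by zero; the function names and sample scores are hypothetical.

```python
def semantic_orientation(pr_pos, pr_neg):
    """Normalize a (PR+, PR-) pair to the range [-5, 5] as in Equation (3)."""
    total = pr_pos + pr_neg
    if total == 0:
        return 0.0  # assumption: no polarity mass at all means neutral
    return 5 * (pr_pos - pr_neg) / total


def build_lexicon(pr_pos, pr_neg):
    """Keep only the words whose semantic orientation is nonzero."""
    lexicon = {}
    for word in pr_pos:
        so = semantic_orientation(pr_pos[word], pr_neg[word])
        if so != 0:
            lexicon[word] = so
    return lexicon


lexicon = build_lexicon(
    {"clean": 0.75, "room": 0.5, "noisy": 0.25},
    {"clean": 0.25, "room": 0.5, "noisy": 0.75},
)
```

With these toy scores "clean" receives 2.5, "noisy" receives −2.5 and "room" is dropped as neutral.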

3.4. Sentiment Analysis of Attributes

3.4.1. Calculation of Sub-Attribute Sentiment Values

According to the principle of lexicon-based sentiment analysis, the polarity of a sentence can be obtained from the polarities of the words in that sentence [62]. To obtain the sentiment value of each sub-attribute in the different sub-datasets, we calculate the sentiment value of each sentence in each sub-dataset and record whether sub-attribute $B_{ij}$ occurs in that sentence. For a single dataset, let $\beta_{q}^{l} = (G_{q}^{l}, SO_{q}^{l})$ be a two-tuple consisting of the $q$th sentiment word $G_{q}^{l}$ of the $l$th sentence and its corresponding sentiment value $SO_{q}^{l}$, where $l = 1, 2, \dots, L$, with $L$ denoting the number of sentences in the dataset, $q = 1, 2, \dots, Q$, with $Q$ denoting the number of sentiment words in that sentence, and $G_{q}^{l}$ belongs to the sentiment lexicon we created. Then, let $\beta^{l} = \{\beta_{1}^{l}, \beta_{2}^{l}, \dots, \beta_{q}^{l}, \dots, \beta_{Q}^{l}\}$ be the set of pairs of sentiment words and corresponding sentiment values in the $l$th sentence. For a sub-attribute $B_{ij}$ occurring in the $l$th sentence, the sentiment value of $B_{ij}$ in the $l$th sentence is calculated by Equation (4):

$$S_{ij}^{l} = \begin{cases} \sum_{q = 1}^{Q} SO_{q}^{l}, & \beta^{l} \neq \emptyset \\ 0, & \beta^{l} = \emptyset \end{cases} \tag{4}$$

where $l = 1, 2, \dots, L$, with $L$ denoting the number of sentences in the dataset.

To improve the accuracy of sub-attributes sentiment polarities, it is important to take the intensifiers and negators into account since these words can affect the sentiment values [46,56]. The sentiment propagation for intensification and negation is described as follows.

Propagation 1: Intensification.

Intensifiers are linguistic terms that primarily combine with adjectives but can also modify nouns, adverbs and verbs. These words influence the strength of a sentiment word, either enhancing or diminishing it. The most common way of identifying these valence shifters is to use a list of words, such as adverbs and adjectives, each associated with a fixed intensifier value [100,101]. In this study, we used a list of intensifiers adapted from Brooke, in which each element is a modifier that emphasizes or attenuates words [102]. Let $\gamma_{r}$ represent the shift value of intensifier $r$, where $r = 1, 2, \dots, R$. Following the above description of the sentiment calculation, if intensifiers occur in the $l$th sentence of a dataset, the sum of their shift values $\gamma_{l}^{sum}$ is calculated; otherwise, $\gamma_{l}^{sum}$ is set to zero. The propagation of $S_{ij}^{l}$ is represented as:

$$\bar{S}_{ij}^{l} = S_{ij}^{l} + S_{ij}^{l}\, \gamma_{l}^{sum} \tag{5}$$

where $l = 1, 2, \dots, L$, with $L$ denoting the number of sentences in the dataset.

Propagation 2: Negation.

In sentiment analysis, negators are words such as “not” that cause negation. Negators can alter the meaning of a word or sentence, or provide a negation context, for instance by converting an affirmative statement into a negative one. The most common way to process negators is to attach them to the nearest word [96]; e.g., in “This story is not interesting”, the word “interesting” is converted into “NOT-interesting”. In this processing method, negators are treated as polarity shifters that produce the opposite polarity. In other words, the polarity value is simply inverted if a polar expression falls within the negation scope [101]. Thus, if the term “perfect” is assigned a positive sentiment value of $+4$, “NOT-perfect” has a sentiment value of $-4$. However, some researchers hold that it is more reasonable to decrease the strength of sentiment words than to invert them directly [96,102]. We use a list of negators adapted from Brooke, in which negators act as sentiment shifters with a default shift value of 4 [102]. If at least one negator occurs in the $l$th sentence, the negation propagation is applied, as represented by Equation (6):

$$\hat{S}_{ij}^{l} = \begin{cases} \bar{S}_{ij}^{l} + 4, & \bar{S}_{ij}^{l} < 0 \\ 0, & \bar{S}_{ij}^{l} = 0 \\ \bar{S}_{ij}^{l} - 4, & \bar{S}_{ij}^{l} > 0 \end{cases} \tag{6}$$

where $l = 1, 2, \dots, L$, with $L$ denoting the number of sentences in the dataset.
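
Equations (4)–(6) can be chained for a single tokenized sentence as in the sketch below. The lexicon scores, the intensifier shift value and the function name are illustrative assumptions; only the negation shift of 4 comes from the text.

```python
NEGATION_SHIFT = 4  # default shift value for negators, as stated in the text


def sentence_sentiment(tokens, lexicon, intensifiers, negators):
    """Sentiment of one sentence after intensification and negation."""
    # Eq. (4): sum the scores of the sentiment words found in the sentence.
    base = sum(lexicon.get(t, 0.0) for t in tokens)
    # Eq. (5): scale by the summed shift values of the intensifiers present.
    gamma_sum = sum(intensifiers.get(t, 0.0) for t in tokens)
    shifted = base + base * gamma_sum
    # Eq. (6): if a negator occurs, move the value toward the other polarity.
    if any(t in negators for t in tokens):
        if shifted > 0:
            shifted -= NEGATION_SHIFT
        elif shifted < 0:
            shifted += NEGATION_SHIFT
    return shifted


# Toy resources; the numbers are made up for illustration.
toy_lexicon = {"clean": 3.0, "dirty": -3.0}
toy_intensifiers = {"very": 0.25}
toy_negators = {"not"}

plain = sentence_sentiment(["room", "clean"], toy_lexicon, toy_intensifiers, toy_negators)
boosted = sentence_sentiment(["room", "very", "clean"], toy_lexicon, toy_intensifiers, toy_negators)
negated = sentence_sentiment(["room", "not", "clean"], toy_lexicon, toy_intensifiers, toy_negators)
```

The resulting value is attributed to every sub-attribute that occurs in the sentence ("room" in this example). Note that the shift in Equation (6) weakens and may flip the polarity rather than mirroring it: a score of 3 becomes −1, not −3.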

3.4.2. Calculation of Attribute Sentiment Values

To ensure that we obtain the purely positive sentiment value of attribute $A_{i}$ in each positive dataset, only the positive sentiment values of the sub-attributes under attribute $A_{i}$ are retained. In other words, the negative sub-attribute sentiment values in the positive dataset are re-assigned to zero. For example, if the sentiment value of sub-attribute $B_{12}$ in the 6th sentence of the five-star positive dataset $D_{5}^{pos}$ is $-3$, i.e., $\hat{S}_{12}^{6} = -3$, it is re-assigned to zero. Similarly, the positive sub-attribute sentiment values in each negative dataset are re-assigned to zero. Let $\tilde{S}_{ij}^{pos\,l}$ denote the re-assigned sentiment value of sub-attribute $B_{ij}$ in the $l$th sentence of the positive dataset, and let $\tilde{S}_{ij}^{neg\,l}$ denote the re-assigned sentiment value of sub-attribute $B_{ij}$ in the $l$th sentence of the negative dataset. These two quantities are computed as follows:

$$\tilde{S}_{ij}^{pos\,l} = \begin{cases} 0, & \hat{S}_{ij}^{l} < 0 \\ \hat{S}_{ij}^{l}, & \hat{S}_{ij}^{l} \geq 0 \end{cases} \tag{7}$$

$$\tilde{S}_{ij}^{neg\,l} = \begin{cases} \hat{S}_{ij}^{l}, & \hat{S}_{ij}^{l} < 0 \\ 0, & \hat{S}_{ij}^{l} \geq 0 \end{cases} \tag{8}$$

where $l = 1, 2, \dots, L$, with $L$ denoting the number of sentences in the dataset.

Given that sub-attribute $B_{ij}$ is semantically similar to attribute $A_{i}$ but not identical to it, it is necessary to take the semantic similarity between $B_{ij}$ and $A_{i}$ into account. Let $SC_{ij}^{pos}$ denote the positive sentiment value of sub-attribute $B_{ij}$ under attribute $A_{i}$ in the positive dataset, and let $SC_{ij}^{neg}$ denote the negative sentiment value of sub-attribute $B_{ij}$ under attribute $A_{i}$ in the negative dataset. Weighting by the semantic similarity $SS_{ij}$ between $B_{ij}$ and $A_{i}$, $SC_{ij}^{pos}$ and $SC_{ij}^{neg}$ are estimated as follows:

$$SC_{ij}^{pos} = SS_{ij} \sum_{l = 1}^{L} \tilde{S}_{ij}^{pos\,l} \tag{9}$$

$$SC_{ij}^{neg} = SS_{ij} \sum_{l = 1}^{L} \tilde{S}_{ij}^{neg\,l} \tag{10}$$

Finally, with the sub-attribute sentiment values, the sentiment values of each attribute in the different datasets can be calculated. The sentiment values of attribute $A_{i}$ in the positive dataset and the negative dataset are calculated as shown in Equations (11) and (12), respectively:

$$SC_{i}^{pos} = \sum_{j = 1}^{P} SC_{ij}^{pos} \tag{11}$$

$$SC_{i}^{neg} = \sum_{j = 1}^{P} SC_{ij}^{neg} \tag{12}$$

In addition, we calculate the sentiment value of each attribute without the re-assignment step for use in the subsequent analysis. The positive and negative datasets of the same hotel star rating are merged, and $SC_{i}$ represents the overall sentiment value of attribute $A_{i}$ without distinguishing positive and negative reviews, which is estimated as:

$$SC_{i} = \sum_{j = 1}^{P} SS_{ij} \sum_{l = 1}^{L'} \hat{S}_{ij}^{l} \tag{13}$$

where $l = 1, 2, \dots, L'$, with $L'$ denoting the total number of sentences in the merged review dataset of each hotel star rating.
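
The aggregation in Equations (7)–(13) amounts to filtering sentence-level values by sign, summing them per sub-attribute, and weighting each sum by its similarity. A minimal sketch follows, with a hypothetical input layout and toy numbers; the caller is assumed to pass the sentence values of the appropriate dataset (positive, negative or merged).

```python
def attribute_sentiment(sentence_values, similarity, keep="all"):
    """Aggregate sub-attribute sentence values into one attribute value.

    sentence_values: {sub_attribute: [value per sentence where it occurs]}
    similarity:      {sub_attribute: SS between sub-attribute and attribute}
    keep: "pos" applies Eq. (7), "neg" applies Eq. (8), "all" keeps everything.
    """
    total = 0.0
    for sub, values in sentence_values.items():
        if keep == "pos":
            kept = [v for v in values if v >= 0]  # negatives re-assigned to zero
        elif keep == "neg":
            kept = [v for v in values if v < 0]   # positives re-assigned to zero
        else:
            kept = values                          # Eq. (13): no re-assignment
        # Eqs. (9)/(10) weight the per-sub-attribute sum; Eqs. (11)/(12) add them up.
        total += similarity[sub] * sum(kept)
    return total


# Toy values for the attribute "room" with sub-attributes "room" and "bed".
values = {"room": [3.0, -1.0], "bed": [2.0]}
ss = {"room": 1.0, "bed": 0.5}

sc_pos = attribute_sentiment(values, ss, keep="pos")
sc_neg = attribute_sentiment(values, ss, keep="neg")
sc_all = attribute_sentiment(values, ss, keep="all")
```

Dropping a value and re-assigning it to zero are equivalent here because only sums are taken.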

3.5. Kano-IPA Analysis

In this study, the Kano-IPA analysis consists of three related parts. First, the six hotel attributes of each hotel star rating are classified into different categories in order to understand the effect of attribute performance on customer satisfaction. Second, we construct the IPA plot for hotels of different star ratings by analyzing the attributes’ importance and performance. Finally, the attribute priority rankings for improvement and resource allocation are given, so that different improvement strategies can be provided for hotels of different star ratings. A detailed description of the Kano-IPA analysis is given below.

3.5.1. Classifying Attributes into Kano Categories

In this study, a new approach to classifying hotel attributes into Kano categories is proposed. As described above, the positive sentiment value $SC_{i}^{pos}$ of attribute $A_{i}$ is obtained from customers whose expectations of hotel attribute $A_{i}$ have been met or even exceeded. Thus, $SC_{i}^{pos}$ indicates the customer satisfaction that attribute $A_{i}$ can bring when it performs well. Likewise, the negative sentiment value $SC_{i}^{neg}$ of attribute $A_{i}$ is obtained from customers who think the actual performance of the attribute has not met their expectations, and it represents the customer dissatisfaction that attribute $A_{i}$ causes when it performs poorly. The overall sentiment value $SC_{i}$ of attribute $A_{i}$ is obtained from all customers who stayed in hotels of the same star rating. Thus, $SC_{i}$ is regarded as the expected customer satisfaction that attribute $A_{i}$ should generate. Based on the obtained $SC_{i}^{pos}$, $SC_{i}^{neg}$ and $SC_{i}$, and following previous index-based classification methods [10,24], we define an index $SI$ to compare the effects of an attribute’s good performance and poor performance on customer satisfaction in hotels of the same star rating. The $SI$ index of attribute $A_{i}$ is calculated as:

$$SI_{i} = \frac{SC_{i}^{pos} - SC_{i}}{SC_{i} - SC_{i}^{neg}} \tag{14}$$

Obviously, $SI_{i} \in [0, +\infty)$. The $SI$ index is the ratio of the customer satisfaction brought by good performance to the customer dissatisfaction caused by poor performance, both measured relative to the expected customer satisfaction of the overall performance of attribute $A_{i}$. To determine the Kano category of each hotel attribute, a cut-off point $\theta$ is defined subjectively. According to the testing results based on different assignment methods in these review datasets, we define $\theta = (SI^{MAX} - SI^{MIN}) / 6$, where $SI^{MAX}$ and $SI^{MIN}$ represent the largest and smallest values of the $SI$ index among the six hotel attributes. Moreover, the mean of the $SI$ index among the six hotel attributes is calculated and denoted by $\overline{SI}$. Hence, hotel attributes can be classified into Kano categories as follows:

If $0 \leq SI_{i} < \overline{SI} - \theta$, attribute $A_{i}$ is regarded as a basic factor, indicating that attributes in this category bring more customer dissatisfaction compared to other attributes.

If $\overline{SI} - \theta \leq SI_{i} \leq \overline{SI} + \theta$, attribute $A_{i}$ is regarded as a performance factor, indicating that attributes in this category bring equal or approximately equal customer satisfaction and dissatisfaction compared to other attributes.

If $SI_{i} > \overline{SI} + \theta$, attribute $A_{i}$ is regarded as an excitement factor, indicating that attributes in this category bring more customer satisfaction compared to other attributes.
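
The classification rule can be written compactly as in the sketch below. The function names and the toy $SI$ values are hypothetical and are not results from the study.

```python
def satisfaction_index(sc_pos, sc_neg, sc_all):
    """SI index of Equation (14)."""
    return (sc_pos - sc_all) / (sc_all - sc_neg)


def classify_kano(si):
    """Assign each attribute a Kano category from its SI value.

    si: {attribute: SI value} for the attributes of one hotel star rating.
    """
    values = list(si.values())
    theta = (max(values) - min(values)) / 6  # cut-off point
    mean = sum(values) / len(values)         # mean SI over the attributes
    categories = {}
    for attribute, value in si.items():
        if value < mean - theta:
            categories[attribute] = "basic"
        elif value > mean + theta:
            categories[attribute] = "excitement"
        else:
            categories[attribute] = "performance"
    return categories


# Toy SI values chosen so that theta = 1 and the mean is 3.
toy_si = {"service": 0.0, "room": 3.0, "value": 6.0}
kano = classify_kano(toy_si)
```

Because both the mean and $\theta$ are computed from the attributes of one star rating, the categories are relative: an attribute is a basic or excitement factor only in comparison with the other attributes in the same group.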

3.5.2. Constructing the IPA Plot

In this section, we construct an IPA plot of the six attributes. From Section 3.4.2, $SC_{i}$ indicates the overall performance of each attribute $A_{i}$, so our next task is to estimate the importance of each attribute. In this study, the term frequency-inverse document frequency (TF-IDF) algorithm is used to estimate the importance of each sub-attribute. TF-IDF is a statistical method that is widely used to evaluate the relative importance of a word to a particular document in a set of documents or a corpus [35,103]. A term’s importance increases with its frequency in the document but decreases with its frequency across the whole corpus. Based on the TF-IDF algorithm, we define $u_{ij}$ as the weight of sub-attribute $B_{ij}$. As mentioned above, sub-attribute $B_{ij}$ is semantically similar to attribute $A_{i}$, and the similarity $SS_{ij}$ indicates the degree of semantic proximity. Therefore, we adopted the method for calculating attribute importance from Wang et al. [35], and the attribute importance is calculated as follows:

$$u_{i} = \frac{\sum_{j = 1}^{P} u_{ij} \cdot SS_{ij}}{\sum_{i = 1}^{6} \sum_{j = 1}^{P} u_{ij} \cdot SS_{ij}} \tag{15}$$

With the performance and importance of each attribute, the IPA plot can be constructed. The IPA plot is drawn with importance on the vertical axis and performance on the horizontal axis, with the crosshair located inside based on the data-centered method [104], as shown in Figure 2. According to IPA, hotel managers should improve the attributes in Q IV and Q III in that order, maintain the attributes in Q I, and finally consider reducing investment for attributes in Q II [10,29].
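
A minimal sketch of Equation (15) and of the quadrant assignment is given below. Several points are assumptions of this illustration: the TF-IDF weights $u_{ij}$ are taken as given inputs, the data-centered crosshair is placed at the mean importance and mean performance of the attributes, values equal to a mean are counted as high, and the quadrant labels are mapped to high/low importance and performance so as to match the actions listed above (Q I maintain, Q II possible reduction, Q III and Q IV improve). All names and numbers are hypothetical.

```python
def attribute_importance(sub_weights, similarity):
    """Normalized attribute importance u_i as in Equation (15).

    sub_weights: {attribute: {sub_attribute: u_ij}}
    similarity:  {attribute: {sub_attribute: SS_ij}}
    """
    raw = {
        attr: sum(u * similarity[attr][sub] for sub, u in subs.items())
        for attr, subs in sub_weights.items()
    }
    total = sum(raw.values())
    return {attr: value / total for attr, value in raw.items()}


def ipa_quadrants(importance, performance):
    """Place each attribute in an IPA quadrant using a mean-based crosshair."""
    mean_imp = sum(importance.values()) / len(importance)
    mean_perf = sum(performance.values()) / len(performance)
    quadrants = {}
    for attr in importance:
        high_imp = importance[attr] >= mean_imp
        high_perf = performance[attr] >= mean_perf
        if high_imp and high_perf:
            quadrants[attr] = "Q I"    # maintain
        elif high_perf:
            quadrants[attr] = "Q II"   # candidate for reduced investment
        elif high_imp:
            quadrants[attr] = "Q IV"   # improve first
        else:
            quadrants[attr] = "Q III"  # improve second
    return quadrants


importance = attribute_importance(
    {"room": {"room": 0.4, "bed": 0.2}, "value": {"value": 0.25}},
    {"room": {"room": 1.0, "bed": 0.5}, "value": {"value": 1.0}},
)
quadrants = ipa_quadrants(
    {"a": 0.4, "b": 0.1, "c": 0.1, "d": 0.4},
    {"a": 10.0, "b": 10.0, "c": 0.0, "d": 0.0},
)
```

The importance values sum to one by construction, so they can be compared directly across attributes within one hotel star rating.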

3.5.3. Analyzing the Attribute Priority Rankings

Because hotel resources and effort are limited, detailed priority rankings for resource allocation within the same quadrant still need to be determined. The Kano model indicates that the effect of attribute performance on customer satisfaction varies across Kano categories. According to the product lifecycle, the attributes of a product or service are regarded as excitement, performance and basic factors [32], which provides a guideline for resource allocation. Specifically, basic factors should be given first priority, performance factors second priority, and excitement factors the lowest priority [10,29]. Therefore, based on the integrated Kano-IPA model, the attribute priority rankings for resource allocation are as shown in Table 3.

4. Results and Discussion

4.1. Results of Sub-Attributes Selection

According to the process described in Section 3.2, the sub-attributes and the corresponding similarity values under each attribute are obtained from online reviews using the Word2Vec algorithm. The sub-attributes of each attribute are sorted by similarity value. Due to space limitations, Table 4 shows only the 10 most similar sub-attributes for each of the six attributes, extracted from the five-star hotel reviews. In Table 4, “Similarity” indicates the similarity between a sub-attribute and the corresponding attribute. Because the six attribute terms also appear in textual reviews, they are also treated as sub-attributes of themselves. For example, room is a sub-attribute of attribute room, with a similarity of 1. As the results in Table 4 show, some terms are sub-attributes of two or more attributes. For example, the similarity between the term “bed” and attribute room is 0.5867, while the similarity between “bed” and attribute sleep quality is 0.6537. That is, the term “bed” is a sub-attribute of both room and sleep quality. This observation is similar to the sub-attribute (or key factor) selection findings of Wang et al. [35] and Nie et al. [46], indicating that the scopes of different attributes may overlap.

4.2. Results of Sentiment Lexicon from Hotel Reviews

According to the process given in Section 3.3, the PolarityRank algorithm is employed to create a sentiment lexicon from the corpus composed of all textual reviews after preprocessing.

Based on the selection criteria, the POS-tagged nouns, adjectives, verbs and adverbs are selected as candidate sentiment words. To ensure the efficiency of the PolarityRank algorithm, the final list of candidate sentiment words is restricted to words that occur in at least 30 reviews. A total of 13,933 candidate sentiment words are obtained, together with the co-occurrence frequency of every pair of nodes in the whole dataset. Subsequently, the initial positive and negative sentiment values of each candidate word are assigned based on SentiWordNet. The candidate sentiment words with their POS, frequency and initial sentiment values are shown in Table 5. In Table 5, “Tag” indicates the POS of each candidate sentiment word, and “Number of Words” indicates the number of times that the candidate sentiment word appears in the whole corpus.

Based on the PolarityRank algorithm, $PR^{+}$ and $PR^{-}$ of each candidate sentiment word are estimated by Equations (1) and (2). The propagation process ran until convergence. Then, the $SO$ of each candidate sentiment word is calculated by Equation (3). The results show that the $SO$ of some candidate sentiment words is equal to zero. A word with a zero $SO$ is dropped because it is neutral, without sentiment polarity. Finally, a sentiment lexicon composed of 5837 sentiment words with nonzero $SO$ is created for attribute sentiment analysis. Due to space limitations, only the top 10 positive and negative sentiment words are shown in Table 6.

The resulting sentiment lexicon contains some words that are rarely used in daily life but nevertheless express emotion. These less common words are identified from the large volume of user-generated data in the hotel domain by the PolarityRank algorithm. Because the lexicon preserves terms particular to the hotel domain, it is a preferable choice for sentiment analysis of hotel attributes and helps ensure greater accuracy [67].

4.3. Results and Discussions of Attribute Bidirectional Performance

4.3.1. Results of Attribute Bidirectional Performance

With the obtained sentiment lexicon, the sentiment values of the sub-attributes under each attribute in the $t$-star negative dataset $D_{t}^{neg}$ and positive dataset $D_{t}^{pos}$ are obtained by Equations (4)–(10). Each sub-attribute has bidirectional performance, represented by a positive sentiment value and a negative sentiment value. The top 20 well-performing sub-attributes with the strongest positive sentiment polarity and the top 20 poorly performing sub-attributes with the strongest negative sentiment polarity with respect to the six attributes in five-star hotel reviews are shown in Table 7 and Table 8. The results indicate that the bidirectional performance of the same sub-attribute may affect customer satisfaction differently. For example, for the sub-attribute “decorate” under attribute room, the positive sentiment value from positive reviews is 18,064.37, ranked 3rd, whereas the negative sentiment value from negative reviews is −61.57, ranked 17th. That is, customers tend to give much more praise when “decorate” performs well, whereas they are probably not sensitive to its poor performance. This observation supports the existence of an asymmetric relationship between attribute performance and customer satisfaction [24,72].

By Equations (11)–(13), the sentiment values of each attribute in hotel reviews of different star ratings are calculated. The positive, negative and overall sentiment values of each attribute are given in Table 9 according to the hotel star ratings. According to the negative sentiment values of the six attributes in negative reviews, the ranking of poorly performing attributes is derived for hotels of each star rating. Three-star hotels and two-star and below hotels share the same ranking of poorly performing attributes, as do four-star and five-star hotels. That is, Room < Cleanliness < Location < Value < Service < Sleep quality in negative reviews of three-star and two-star and below hotels, and Room < Cleanliness < Location < Service < Value < Sleep quality in negative reviews of four-star and five-star hotels. Similarly, according to the positive sentiment values of the six attributes in positive reviews, the ranking of well-performing attributes is derived for hotels of each star rating. That is, Cleanliness > Location > Room > Value > Service > Sleep quality in positive reviews of three-star and two-star and below hotels, Cleanliness > Room > Location > Service > Value > Sleep quality in positive reviews of four-star hotels, and Cleanliness > Room > Location > Service > Sleep quality > Value in positive reviews of five-star hotels.

4.3.2. Comparative Analysis of Attributes’ Bidirectional Performance

To better analyze the antecedents of both high and low customer ratings, the percentages of negative sentiment values and positive sentiment values concerning six attributes in terms of hotel star ratings are respectively calculated, shown in Figure 5 and Figure 6. Results show that room, cleanliness and location account for about 75% of the sum of attribute negative sentiment values, meanwhile these three attributes also account for about 75% of the sum of attribute positive sentiment values for hotels of each star rating. Room, cleanliness and location are core attributes of hotels, in line with some prior research [35,94,105,106]. This finding also implies the main contributors to high customer ratings and causes of low customer ratings are the same for hotels of each star rating, similar to the studies of Berezina et al. [94] and Kitsios et al. [107]. Value, service and sleep quality have less impact on both high and low customer ratings, contrary to some previous research. For instance, the study of Ban et al. [105] implied that intangible service has the greatest impact on customer satisfaction.

The results also imply that the percentage of positive/negative sentiment values concerning location, value and service fluctuate with hotel star rating. Thus, the effect of good/poor performance concerning location, value, service on high/low customer ratings varies across hotel star ratings. For location, its good/poor performance contributes less to customer satisfaction/dissatisfaction in four-star hotels than in other star hotels. For value, the percentages of both positive and negative sentiment values gradually drop with the increase in hotel star ratings above three-star hotels. That is, for value, poor performance in high-star (four-star and five-star) hotels does not cause as many complaints as in low-star (three-star and below) hotels, and good performance brings less satisfaction for customers of high-star hotels. This finding is consistent with common sense that customers who choose low-star hotels lay greater emphasis on value for money [108], and customers in high-star hotels may take it for granted when value performs well because they spend more [10]. On the contrary, for service, good/poor performance contributes markedly more to high/low customer ratings in high-star hotels than in low-star hotels. The results show the same finding as earlier studies which showed that the effect of service’s poor performance on customer dissatisfaction increases with the improvement of hotel level and luxury (i.e., four–five-star ratings) hotel customers emphasize good service [10,21]. Moreover, it is observed that the good performance of sleep quality has the potential to be the incentive for high customer ratings in five-star hotels.

Comparing the lines in Figure 5 and Figure 6 for a given attribute shows that the two directions of its performance affect high and low customer ratings differently. For room and sleep quality, the effect of their poor performance on low customer ratings is stronger than the effect of their good performance on high customer ratings. On the contrary, for cleanliness, location and service, the effect of their good performance on high customer ratings is stronger than the effect of their poor performance on low customer ratings. For value, the effect of its good performance on high customer ratings is stronger than the effect of its poor performance on low customer ratings in low-star hotels, while the opposite is true for high-star hotels. Therefore, the results indicate that the effect of attribute performance on customer ratings is asymmetric, consistent with many previous studies [8,10,23,24,71,76]. Furthermore, the asymmetric effect of value’s performance on customer ratings differs between high-star and low-star hotels.

4.4. Results and Discussions of Kano-IPA Analysis

4.4.1. Attribute Classification Based on the Kano Model

According to the obtained sentiment values and Equation (14), the SI values of six attributes concerning four hotel star ratings can be calculated and further the six attributes are classified into three Kano categories, as shown in Table 10. The final classification of attribute categories is basically consistent with the relative effect of each attribute on customer satisfaction for hotels of the same star rating. On the whole, the categories of all attributes except value vary across different hotel star ratings.

Specifically, value is always classified as an excitement factor, indicating that value can bring more satisfaction when it performs well regardless of hotel star ratings. Unlike the study of Bi et al. [10], this study shows that value is an excitement factor, which supports the finding of Chiang et al. [13] that value and price are the attractive factor for four- and five-star hotels. Location is classified as a performance factor in hotels of three stars and below and as an excitement factor in four-star and five-star hotels. Compared with hotels of three stars and below, the good performance of location can bring more customer satisfaction for four-star and five-star hotels. Luxury hotel customers are willing to pay more for a convenient location [21]. Thus, customers in high-star hotels will be very satisfied when the performance of location, which is a core requirement, exceeds their expectations. Service and sleep quality, which change in the same way as the hotel star rating increases, are classified as basic factors in hotels of three stars and below and as performance factors in four-star and five-star hotels. Thus, customers in hotels of three stars and below may not be sensitive to the good performance of service and sleep quality, but they are dissatisfied when the performance of service and sleep quality is poor. Meanwhile, customers in four-star and five-star hotels are sensitive to the bidirectional performance of service and sleep quality. Room is classified as an excitement factor in hotels of three stars and below and as a performance factor in four-star and five-star hotels. The good performance of room can bring customer satisfaction in hotels of each star rating, while the poor performance of room can bring more customer dissatisfaction in four-star and five-star hotels than in hotels of three stars and below. Cleanliness is classified as a performance factor in hotels of two stars and below and as a basic factor in three-, four- and five-star hotels. This result indicates that customers in hotels of two stars and below may feel satisfied when the room is clean, but customers in hotels of other star ratings take the good performance of cleanliness for granted. These findings differ from the study of Bi et al. [10], who classified service, sleep quality, room and cleanliness as basic factors in hotels of each type.
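
The classification step can be illustrated with a minimal Python sketch. The paper describes the classification as comparing each attribute's SI value with the average SI for hotels of the same star rating; Equation (14) and the exact cut-offs are not reproduced here, so the function name `classify_kano`, the `tolerance` band around the average and the SI numbers below are illustrative assumptions rather than the authors' values.

```python
from statistics import mean


def classify_kano(si_by_attribute, tolerance):
    """Map each attribute to a Kano category from its SI value.

    Assumed rule (illustrative, not taken from Equation (14)):
      SI above average + tolerance  -> excitement factor
      SI below average - tolerance  -> basic factor
      otherwise                     -> performance factor
    """
    average = mean(si_by_attribute.values())
    categories = {}
    for attribute, si in si_by_attribute.items():
        if si > average + tolerance:
            categories[attribute] = "excitement"
        elif si < average - tolerance:
            categories[attribute] = "basic"
        else:
            categories[attribute] = "performance"
    return categories


# Hypothetical SI values for one star rating (not from Table 10).
example_si = {"value": 0.9, "location": 0.5, "service": 0.1}
print(classify_kano(example_si, tolerance=0.2))
```

Because the average is recomputed per star rating, the same SI value could fall into different categories for different star ratings, which is one way the category shifts reported in Table 10 could arise.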

4.4.2. The IPA Plot

Based on the TF-IDF algorithm, we obtained the weight of each sub-attribute with respect to the six attributes concerning four hotel star ratings. The importance of the six attributes concerning four hotel star ratings was then calculated by Equation (15), as shown in Table 11. On the whole, the importance of value, service and sleep quality varies across the hotel star ratings, while the importance of the other attributes fluctuates only slightly and they are considered very important for all hotels. This finding is consistent with previous research that revealed that customers of high-star hotels are more likely to value some intangible attributes (i.e., service and sleep quality) [40]. Specifically, with the increase in hotel stars, the importance of value decreases, while the importance of service and sleep quality increases. In other words, customers who select high-star hotels pay more attention to service and sleep quality, and consider value less important. In contrast, customers who choose low-star hotels strongly emphasize value, but consider service and sleep quality less important. The importance of value shows a significant downward trend as the hotel star rating rises, coinciding with Zhao’s [108] research.

With the obtained importance and performance of each attribute concerning four hotel star ratings, the IPA plots can be constructed, as shown in Figure 7. Location and cleanliness are located in Q I in hotels of all stars, which indicates that location and cleanliness should be well maintained because of their high importance and performance. Value, service and sleep quality are located in Q III in hotels of all stars, with low importance and performance, so they are of low priority for improvement. In contrast, room is located in Q IV in hotels of three stars and of two stars and below, where it urgently needs improvement, while it is located in Q I in four-star and five-star hotels, indicating that it is a strength of these hotels.
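
The quadrant assignment behind an IPA plot can be sketched in a few lines of Python. The quadrant meanings for Q I, Q III and Q IV follow the text above; Q II (low importance, high performance) is inferred by elimination. Using the mean importance and mean performance as the crosshair, and all numbers in `example_scores`, are assumptions made for illustration and are not the values in Table 11 or Figure 7.

```python
from statistics import mean


def ipa_quadrant(importance, performance, importance_cut, performance_cut):
    """Return the IPA quadrant label for one attribute."""
    high_importance = importance >= importance_cut
    high_performance = performance >= performance_cut
    if high_importance and high_performance:
        return "Q I"    # strength: keep up the good work
    if not high_importance and high_performance:
        return "Q II"   # inferred: low importance, high performance
    if not high_importance and not high_performance:
        return "Q III"  # low priority
    return "Q IV"       # high importance, low performance: improve urgently


def ipa_quadrants(scores):
    """scores maps attribute -> (importance, performance)."""
    # Assumption: the crosshair sits at the mean of each axis.
    importance_cut = mean([imp for imp, _ in scores.values()])
    performance_cut = mean([perf for _, perf in scores.values()])
    return {
        attribute: ipa_quadrant(imp, perf, importance_cut, performance_cut)
        for attribute, (imp, perf) in scores.items()
    }


# Hypothetical (importance, performance) pairs shaped like a low-star hotel.
example_scores = {
    "location": (0.90, 0.80),
    "cleanliness": (0.85, 0.75),
    "room": (0.80, 0.20),
    "value": (0.20, 0.30),
    "service": (0.30, 0.35),
    "sleep quality": (0.25, 0.30),
}
print(ipa_quadrants(example_scores))
```

With these made-up inputs, room is the only attribute that lands in Q IV, mirroring the low-star pattern described in the text.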

4.4.3. Suggestions for Attribute Improvement and Priority

With the obtained performance and importance of the six attributes, the attribute priority rankings for resource allocation concerning four hotel star ratings are obtained by integrating the Kano categories of six attributes with the IPA plot, as shown in Table 12. The attribute priority rankings are divided into two groups, namely, Room > Service > Sleep quality > Value > Cleanliness > Location for low-star (three stars, two stars and below) hotels, and Service > Sleep quality > Value > Cleanliness > Room > Location for high-star (five-star and four-star) hotels.

For low-star hotels, room (an excitement factor) is given the first priority for improvement since it is very important but performs poorly from the perspective of customers. According to the sub-attribute results, effective measures to improve room performance include improving the facilities, tidiness, room size and soundproofing. The importance and performance of service, sleep quality and value are low. Service and sleep quality are basic factors which can cause numerous complaints when they perform poorly, while value is an excitement factor which can generate more customer satisfaction when it performs well. Service’s importance is higher than that of sleep quality, so service and sleep quality are given the second and third priority, respectively, for resource allocation. Regarding service, improving staff’s skill and attitude is important, and more professional, friendly and polite staff are needed. Moreover, hotel managers should pay attention to beds, pillows and soundproofing facilities to improve customers’ sleep quality. Value is given the fourth priority for improvement. The importance of value is significantly higher in low-star hotels than in high-star hotels, which indicates that customers who choose low-star hotels are more likely to emphasize value for money. Offering a variety of discounts, reasonable prices and member rewards helps to enhance customer satisfaction. Lastly, cleanliness and location are the strengths of low-star hotels and should be well maintained. Because cleanliness is considered more important than location, cleanliness is given the fifth priority for resource allocation and location the sixth.

For high-star hotels, no attribute needs urgent improvement. However, it is still necessary to invest resources and effort in service, sleep quality and value. Service is given the highest priority for resource allocation. From Figure 7, it can be concluded that service performs much better in five-star hotels than in other hotels and that its importance gradually increases, but it is not yet a strength of five-star hotels. Whereas low-star hotels should strengthen common service aspects, high-star hotels need to improve more advanced service aspects. For example, improving staffing levels and providing more proactive, pet-friendly and infant-related services and multilingual receptionists are preferable ways to obtain more customer satisfaction. Sleep quality (a performance factor) and value (an excitement factor) are given the second and third priorities, respectively, although their importance is very low. If possible, investing resources in improving sleep quality and value (e.g., improving bedding and soundproofing facilities, offering discounts and reasonable prices) can also improve hotel customer ratings. Cleanliness, room and location are the strengths of high-star hotels, and these attributes should be well maintained. In particular, room is a unique strength of high-star hotels, while it is the weakness of low-star hotels. This result is consistent with the hotel star rating system offered by the Automobile Association, under which good room performance is a must-be requirement for hotels to be rated as high-star [109]. According to the attribute categories, cleanliness, room and location are prioritized in that order for resource allocation because they are basic, performance and excitement factors, respectively.
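
One ordering rule that reproduces both rankings can be sketched as follows. The rule is inferred from the discussion above and is not quoted from the paper: sort first by IPA quadrant (Q IV, then Q III, then Q I), then by Kano category (basic, performance, excitement), and break remaining ties by higher importance. The quadrants and categories for the three-star example come from the text; the importance numbers and the position of Q II (which holds no attribute in this study) are hypothetical.

```python
# Lower rank value means higher improvement priority.
QUADRANT_RANK = {"Q IV": 0, "Q III": 1, "Q II": 2, "Q I": 3}  # Q II slot is assumed
KANO_RANK = {"basic": 0, "performance": 1, "excitement": 2}


def rank_attributes(attributes):
    """attributes maps name -> (quadrant, kano_category, importance)."""
    def sort_key(name):
        quadrant, kano, importance = attributes[name]
        # Negate importance so that more important attributes sort first.
        return (QUADRANT_RANK[quadrant], KANO_RANK[kano], -importance)
    return sorted(attributes, key=sort_key)


# Three-star hotels: quadrants and categories from the text,
# importance values are placeholders (not from Table 11).
three_star = {
    "location": ("Q I", "performance", 0.85),
    "cleanliness": ("Q I", "basic", 0.90),
    "value": ("Q III", "excitement", 0.20),
    "sleep quality": ("Q III", "basic", 0.25),
    "service": ("Q III", "basic", 0.30),
    "room": ("Q IV", "excitement", 0.80),
}
print(rank_attributes(three_star))
# ['room', 'service', 'sleep quality', 'value', 'cleanliness', 'location']
```

Applying the same function to the high-star quadrants and categories (service and sleep quality as performance factors in Q III, value as an excitement factor in Q III, and cleanliness, room and location in Q I as basic, performance and excitement factors) yields the high-star order, provided service is assigned a higher importance than sleep quality.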

5. Conclusions

5.1. Theoretical Implications

This study explored the bidirectional performance of hotel attributes by dividing online reviews into positive reviews and negative reviews. The Kano-IPA model was used to further understand customers’ rating behaviors and demands for hotel service. The proposed five-phase methodology, which combines sentiment analysis with the Kano-IPA model, enriches the research on online hotel reviews. The main theoretical contributions are as follows:

First, this study explores the well-performed attributes contributing to high customer ratings and the poorly performed attributes causing low customer ratings. By dividing 1,090,341 online reviews into positive and negative reviews, the six attributes’ good performance (positive sentiment values) in positive reviews and poor performance (negative sentiment values) in negative reviews are calculated through sentiment analysis. Our findings suggest that room, cleanliness and location are the most crucial determinants of both high and low customer ratings for hotels of these four levels. By contrast, other attributes, including value, service and sleep quality, have less impact on customers’ rating behaviors. Therefore, the most crucial hotel attributes influencing customer satisfaction and dissatisfaction are exactly the same. Focusing on improving the service quality of these general attributes, namely room, cleanliness and location, is the key to winning high customer ratings for all hotels.

Second, a comparative analysis of attribute bidirectional performance concerning four hotel star ratings was conducted to verify the difference in hotel attributes contributing to high/low customer ratings among different hotel star ratings. This study indicates that the impact of several attributes on high/low customer ratings varies across different star hotels. On the one hand, the impact of value and service’s poor performance on low customer ratings varies across hotel star ratings. As the hotel level rises, the impact of value’s poor performance on low customer ratings shows a downward trend, while the impact of service’s poor performance on low customer ratings shows an upward trend. For hotels of three stars and below, value’s poor performance contributes more to low customer ratings than service’s poor performance. In contrast, for four-star and five-star hotels, service’s poor performance has a greater impact on low customer ratings. On the other hand, the good performance of room, location, value, service and sleep quality contributes to high customer ratings differently among different star hotels, where the impact of value and service’s good performance on high customer ratings shows a larger range of changes. Interestingly, for value and service, as the hotel level rises, the impact of their good performance on high customer ratings shows the same trend as the impact of their poor performance on low customer ratings. These findings indicate that customers’ expectations and perceptions of the good/poor performance of each attribute may vary across hotel star ratings. Thus, it is necessary to take hotel star ratings into consideration in customer satisfaction research.

Third, this study suggests that the effect of good performance on high customer ratings may not be equal to the effect of poor performance on low customer ratings for the same hotel attribute. In other words, the effect of attribute performance on customer satisfaction is asymmetric. For this reason, the Kano-IPA model was applied to better understand customers’ rating behaviors and demands for hotel service. The Kano categories of five attributes (location, service, room, cleanliness and sleep quality) vary across different hotel star ratings. Furthermore, suggestions on priority for attribute improvement are formulated for hotels of the four star ratings according to the results of the Kano-IPA model.

Fourth, this study proposes a methodology for hotel attribute sentiment analysis based on automated textual analysis techniques, including the Word2Vec and PolarityRank algorithms. A new sentiment lexicon was created from user-generated reviews based on the PolarityRank algorithm, contributing to sentiment analysis in the hotel domain. The advance in sentiment lexicon creation consists of the following two points. On the one hand, we adopted more words (i.e., adverbs) than existing studies for PolarityRank propagation [63,96], which avoids missing some important sentiment words. On the other hand, both the initial positive and negative sentiment values of each candidate sentiment word are assigned by a function from SentiWordNet instead of manually assigning sentiment values to positive and negative seed words, which is considered more objective and trustworthy. In addition, to the best of our knowledge, our sentiment lexicon built from the 1,090,341 textual reviews is an instructive application of the PolarityRank algorithm to million-level datasets. Thus, the comprehensive and complete sentiment propagation supports more precise sentiment calculation.

Lastly, this study proposes a novel index approach for Kano model classification, which makes it possible to apply the Kano-IPA model to numerous textual reviews. The SI index is defined to represent the satisfaction-stimulating ability of any one hotel attribute. Then the six attributes are classified into three Kano categories by comparing each SI index with the average index value for hotels of each star rating. The proposed approach enriches the existing research on the classification of the Kano model. Additionally, based on the TF-IDF algorithm, the importance of each attribute is obtained to construct the IPA plot. This study is a worthwhile attempt to use online reviews to explore the effects of attribute performance on customer satisfaction and to understand customers’ rating behaviors.

5.2. Practical Implications

As consumers’ reliance on the Internet grows, online reviews are increasingly important since customers usually browse a lot of hotel reviews when making hotel choices. It is important to analyze how hotel attributes contribute to high and low customer ratings. This study enables hotel managers and hotel online platforms to understand customers’ rating behaviors, expectations and perceptions on hotel attributes. Furthermore, our findings and discussions provide a reference for hotel managers to allocate resources for attribute improvement and prioritization to achieve higher customer ratings.

First, because the final attribute priority rankings for improvement are divided into two groups, two strategies for attribute improvement are given to low-star (three stars and below) and high-star (four- and five-star) hotels, respectively. For low-star hotels, room, which is an excitement factor, should be given the highest priority for resource allocation. Effective measures such as refurnishing, renovating, providing tidy and spacious rooms and proper decoration could be taken to improve room’s performance in order to enhance customer satisfaction. Service, sleep quality and value are of lower priority for improvement, and they are basic, basic and excitement factors, respectively. Some effective measures should be taken to enhance the performance of service and sleep quality in order to reduce customer dissatisfaction, for instance, staff training to improve work skills and attitude, and quality improvements in beds, pillows and soundproofing. With sufficient resources, low-star hotel managers should also provide attractive discounts or reasonable prices, since value for money is highly important to their customers. For high-star hotels, though nothing calls for urgent improvement, there is still a need for better performance in service, sleep quality and value. Service and sleep quality are performance factors, and their importance is significantly higher for customers in high-star hotels. Service improvement (e.g., higher staffing levels, proactive, pet-friendly and infant-related services and multilingual receptionists) and better sleeping conditions (e.g., better bedding and soundproofing) are preferable methods to enhance customer satisfaction. Moreover, providing proper discounts and prices for customers is also needed.

Second, some strengths should be well maintained for different star hotels. For low-star hotel managers, cleanliness and location are the strengths to win customer satisfaction. Since cleanliness and location are performance or basic factors and of high importance for customers in low-star hotels, it is necessary to invest sufficient resources to ensure their high quality. For high-star hotel managers, cleanliness, room and location are the strengths that need to be well maintained. In contrast to low-star hotels, cleanliness and location are, respectively, basic and excitement factors for customers in high-star hotels. Investing more in hotel location is a preferable way for high-star hotels to enhance customer satisfaction. While it is hard to change an existing location, convenient transportation services, such as free shuttles and attraction brochures, can be offered to improve access to attractions or transport stations. Additionally, room is a unique strength of high-star hotels, while it is a weakness of low-star hotels. These findings are in line with the hotel star rating system offered by the Automobile Association, in which room is a basic and quantitative indicator for hotel star rating [109]. Therefore, hotel managers should pay great attention to room improvement for higher star ratings.

Third, this study indicates that attribute improvement priorities are the same for three-star hotels and hotels of two stars and below. However, compared with hotels of two stars and below, the importance of service and sleep quality is higher but their performance is worse in three-star hotels. Service and sleep quality are basic factors, so their poor performance is more likely to cause great customer dissatisfaction. Customers pay more for a better hotel, so their expectations increase [110]. Thus, three-star hotel managers should pay more attention to improving performance in service and sleep quality to reduce customer dissatisfaction, and further enhance their competitive strengths against hotels of two stars and below. Similarly, five-star hotel managers should stay alert in the pursuit of higher service quality, since the SI index values of location, service, room, cleanliness and sleep quality are lower than those of four-star hotels. This can be explained as follows: customers place much higher expectations on five-star hotels than on four-star hotels, so even very minor service failures can cause serious complaints. Compared to four-star hotels, investing resources to provide customers with more attentive service and better sleep quality is necessary for five-star hotels.

Last but not least, for hotel online platforms, this study has practical significance in two respects. On the one hand, it serves as a reference for online websites when recommending hotels to customers who filter by hotel star rating. Our findings imply that customers have different expectations, preferences and demands for the six attributes when they choose hotels of different star ratings. Thus, assigning different weights to each hotel attribute according to hotel star rating can be considered when designing a hotel recommendation system. On the other hand, we suggest that the six evaluation dimensions on the website should be upgraded. For example, considering that the sub-attribute lists of room and cleanliness are similar, the two dimensions can be merged into one, or each attribute can be given explanatory notes to help customers distinguish between them.
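
The suggestion of star-specific attribute weights can be made concrete with a small Python sketch. The function name `weighted_score`, the two weight profiles and the example ratings are hypothetical; they only echo the qualitative finding that value matters more for low-star hotels, while service and sleep quality matter more for high-star hotels.

```python
def weighted_score(attribute_ratings, weights):
    """Weighted average of per-attribute ratings (weights need not sum to 1)."""
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[name] * attribute_ratings[name] for name in weights)
    return weighted_sum / total_weight


# Hypothetical weight profiles, not derived from the paper's tables.
LOW_STAR_WEIGHTS = {
    "value": 5, "service": 2, "sleep quality": 2,
    "room": 4, "cleanliness": 4, "location": 3,
}
HIGH_STAR_WEIGHTS = {
    "value": 2, "service": 5, "sleep quality": 4,
    "room": 3, "cleanliness": 3, "location": 3,
}

# A made-up hotel that is strong on value but average on service and sleep.
hotel_ratings = {
    "value": 5.0, "service": 3.0, "sleep quality": 3.0,
    "room": 4.0, "cleanliness": 4.0, "location": 4.0,
}
print(weighted_score(hotel_ratings, LOW_STAR_WEIGHTS))
print(weighted_score(hotel_ratings, HIGH_STAR_WEIGHTS))
```

The same hotel scores higher under the low-star profile than under the high-star profile, which is the kind of star-dependent ranking behaviour the recommendation suggestion points to.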

5.3. Limitations

This study also has several limitations, which might serve as avenues for future research. First, the data were collected from one online travel website, which may not provide complete information about customers’ opinions. In addition, not all customers write textual reviews and give ratings to the hotels after leaving. Therefore, future studies could collect hotel reviews from multiple online websites and from customers who book hotels offline. Second, although this study explores the differences in the categories and performance of six attributes across four hotel star ratings, attribute differences between different traveling purposes or different regions may exist. Customers with different traveling purposes and from different districts have different preferences for hotel attributes, which may influence attribute performance and further influence attribute classification in the Kano model. In the future, classifying online reviews by other criteria, such as traveling purpose or region, merits further research. Third, a hotel’s star rating may move up or down when the hotel makes changes such as redecoration or management mode upgrades, or when it becomes run-down. Although the cost of improving hotel star ratings is very high, some hotels may make efforts to achieve higher star ratings. As a result, for some hotels, earlier online reviews may not appropriately reflect their quality under the current star rating. This will affect the results of the attribute bidirectional performance analysis among different hotel star ratings. Thus, in future research it is preferable to select online reviews from the current star rating period or to exclude hotels whose star ratings have changed. Additionally, exploring the difference in determinants of customer satisfaction and dissatisfaction between the previous and current hotel star ratings is a future research direction. Finally, the attributes used in this study are the six evaluation dimensions on TripAdvisor.com, which may not include all topics expressed in textual reviews. To comprehensively understand customer demands, different categories of attributes can be extracted from textual reviews in future research.

Author Contributions

Supervision, conceptualization, project administration, resources, writing—review and editing, Y.C.; conceptualization, methodology, software, investigation, data curation, visualization, writing—original draft preparation, Y.Z.; supervision, conceptualization, methodology, formal analysis, validation, funding acquisition, S.Y.; investigation, data curation, visualization, Y.X.; investigation, data curation, software, S.C. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (No. 71901151).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

Ahani, A.; Nilashi, M.; Yadegaridehkordi, E.; Sanzogni, L.; Tarik, A.R.; Knox, K.; Samad, S.; Ibrahim, O. Revealing customers’ satisfaction and preferences through online review analysis: The case of Canary Islands hotels. J. Retail. Consum. Serv. 2019, 51, 331–343. [Google Scholar] [CrossRef]
Schuckert, M.; Liu, X.; Law, R. Hospitality and Tourism Online Reviews: Recent Trends and Future Directions. J. Travel. Tour. Mark. 2015, 32, 608–621. [Google Scholar] [CrossRef]
Yang, Y.; Park, S.; Hu, X. Electronic word of mouth and hotel performance: A meta-analysis. Tour. Manag. 2018, 67, 248–260. [Google Scholar] [CrossRef]
Ye, Q.; Law, R.; Gu, B.; Chen, W. The influence of user-generated content on traveler behavior: An empirical investigation on the effects of e-word-of-mouth to hotel online bookings. Comput. Hum. Behav. 2011, 27, 634–639. [Google Scholar] [CrossRef]
Gavilan, D.; Avello, M.; Martinez-Navarro, G. The influence of online ratings and reviews on hotel booking consideration. Tour. Manag. 2018, 66, 53–61. [Google Scholar] [CrossRef]
Gao, B.; Li, X.; Liu, S.; Fang, D. How power distance affects online hotel ratings: The positive moderating roles of hotel chain and reviewers’ travel experience. Tour. Manag. 2018, 65, 176–186. [Google Scholar] [CrossRef]
Chatterjee, S. Drivers of helpfulness of online hotel reviews: A sentiment and emotion mining approach. Int. J. Hosp. Manag. 2020, 85, 102356. [Google Scholar] [CrossRef]
Chen, L.-F. Exploring asymmetric effects of attribute performance on customer satisfaction using association rule method. Int. J. Hosp. Manag. 2015, 47, 54–64. [Google Scholar] [CrossRef]
Liu, Y.; Teichert, T.; Rossi, M.; Li, H.; Hu, F. Big data for big insights: Investigating language-specific drivers of hotel satisfaction with 412,784 user-generated reviews. Tour. Manag. 2017, 59, 554–563. [Google Scholar] [CrossRef]
Bi, J.-W.; Liu, Y.; Fan, Z.-P.; Zhang, J. Exploring asymmetric effects of attribute performance on customer satisfaction in the hotel industry. Tour. Manag. 2020, 77, 104006. [Google Scholar] [CrossRef]
Cheng, C.-C.; Chen, C.-T. Creating excellent and competitive motels services. Int. J. Contemp. Hosp. Manag. 2018, 30, 836–854. [Google Scholar] [CrossRef]
Beheshtinia, M.A.; Farzaneh Azad, M. A fuzzy QFD approach using SERVQUAL and Kano models under budget constraint for hotel services. Total Qual. Manag. Bus. Excell. 2017, 30, 808–830. [Google Scholar] [CrossRef]
Chiang, C.-F.; Chen, W.-Y.; Hsu, C.-Y. Classifying technological innovation attributes for hotels: An application of the Kano model. J. Travel. Tour. Mark. 2019, 36, 796–807. [Google Scholar] [CrossRef]
Xu, X.; Li, Y. The antecedents of customer satisfaction and dissatisfaction toward various types of hotels: A text mining approach. Int. J. Hosp. Manag. 2016, 55, 57–69. [Google Scholar] [CrossRef]
Zhao, M.; Zhang, C.; Hu, Y.; Xu, Z.; Liu, H. Modelling Consumer Satisfaction Based on Online Reviews Using the Improved Kano Model from the Perspective of Risk Attitude and Aspiration. Technol. Econ. Dev. Econ. 2021, 27, 550–582. [Google Scholar] [CrossRef]
Fong, L.H.N.; Lei, S.S.I.; Law, R. Asymmetry of Hotel Ratings on TripAdvisor: Evidence from Single-Versus Dual-Valence Reviews. J. Hosp. Mark. Manag. 2016, 26, 67–82. [Google Scholar] [CrossRef]
Filieri, R.; McLeay, F.; Tsui, B.; Lin, Z. Consumer perceptions of information helpfulness and determinants of purchase intention in online consumer reviews of services. Inf. Manag. 2018, 55, 956–970. [Google Scholar] [CrossRef]
Kirilenko, A.P.; Stepchenkova, S.O.; Dai, X. Automated topic modeling of tourist reviews: Does the Anna Karenina principle apply? Tour. Manag. 2021, 83, 104241. [Google Scholar] [CrossRef]
Dinçer, M.Z.; Alrawadieh, Z. Negative Word of Mouse in the Hotel Industry: A Content Analysis of Online Reviews on Luxury Hotels in Jordan. J. Hosp. Mark. Manag. 2017, 26, 785–804. [Google Scholar] [CrossRef]
Vásquez, C. Complaints online: The case of TripAdvisor. J. Pragmat. 2011, 43, 1707–1717. [Google Scholar] [CrossRef] [Green Version]
Zhang, Z.; Ye, Q.; Law, R. Determinants of hotel room price: An exploration of travelers’ hierarchy of accommodation needs. Int. J. Contemp. Hosp. Manag. 2011, 23, 972–981. [Google Scholar] [CrossRef]
Francesco, G.; Roberta, G. Cross-country analysis of perception and emphasis of hotel attributes. Tour. Manag. 2019, 74, 24–42. [Google Scholar] [CrossRef]
Slevitch, L.; Oh, H. Asymmetric relationship between attribute performance and customer satisfaction: A new perspective. Int. J. Hosp. Manag. 2010, 29, 559–569. [Google Scholar] [CrossRef]
Albayrak, T.; Caber, M. The symmetric and asymmetric influences of destination attributes on overall visitor satisfaction. Curr. Issues Tour. 2013, 16, 149–166. [Google Scholar] [CrossRef]
Alegre, J.; Garau, J. The Factor Structure of Tourist Satisfaction at Sun and Sand Destinations. J. Travel. Res. 2009, 50, 78–86. [Google Scholar] [CrossRef]
Füller, J.; Matzler, K. Customer delight and market segmentation: An application of the three-factor theory of customer satisfaction on life style groups. Tour. Manag. 2008, 29, 116–126. [Google Scholar] [CrossRef]
Chang, K.-C.; Chen, M.-C. Applying the Kano model and QFD to explore customers’ brand contacts in the hotel business: A study of a hot spring hotel. Total Qual. Manag. Bus. Excell. 2011, 22, 1–27. [Google Scholar] [CrossRef]
Kuo, C.-M.; Chen, H.-T.; Boger, E. Implementing City Hotel Service Quality Enhancements: Integration of Kano and QFD Analytical Models. J. Hosp. Mark. Manag. 2015, 25, 748–770. [Google Scholar] [CrossRef]
Kuo, Y.-F.; Chen, J.-Y.; Deng, W.-J. IPA–Kano model: A new tool for categorising and diagnosing service quality attributes. Total Qual. Manag. Bus. Excell. 2012, 23, 731–748.
Lai, I.K.W.; Hitchcock, M. A comparison of service quality attributes for stand-alone and resort-based luxury hotels in Macau: 3-Dimensional importance-performance analysis. Tour. Manag. 2016, 55, 139–159.
Jou, R.-C.; Day, Y.-J. Application of Revised Importance–Performance Analysis to Investigate Critical Service Quality of Hotel Online Booking. Sustainability 2021, 13, 2043.
Tseng, C.C. An IPA-Kano model for classifying and diagnosing airport service attributes. Res. Transp. Bus. Manag. 2020, 37, 100499.
Pai, F.-Y.; Yeh, T.-M.; Tang, C.-Y. Classifying restaurant service quality attributes by using Kano model and IPA approach. Total Qual. Manag. Bus. Excell. 2016, 29, 301–328.
Martilla, J.A.; James, J.C. Importance-performance analysis. J. Mark. 1977, 41, 77–79.
Wang, L.; Wang, X.; Peng, J.; Wang, J. The differences in hotel selection among various types of travellers: A comparative analysis with a useful bounded rationality behavioural decision support model. Tour. Manag. 2020, 76, 103961.
Nicolau, J.L.; Mellinas, J.P.; Martín-Fuentes, E. The halo effect: A longitudinal approach. Ann. Tour. Res. 2020, 83, 102938.
Zhao, Y.; Xu, X.; Wang, M. Predicting overall customer satisfaction: Big data evidence from hotel online textual reviews. Int. J. Hosp. Manag. 2019, 76, 111–121.
Jannach, D.; Zanker, M.; Fuchs, M. Leveraging multi-criteria customer feedback for satisfaction analysis and improved recommendations. J. Inf. Technol. Tour. 2014, 14, 119–149.
Sharma, H.; Tandon, A.; Kapur, P.K.; Aggarwal, A.G. Ranking hotels using aspect ratings based sentiment classification and interval-valued neutrosophic TOPSIS. Int. J. Syst. Assur. Eng. Manag. 2019, 10, 973–983.
Guo, Y.; Barnes, S.J.; Jia, Q. Mining meaning from online ratings and reviews: Tourist satisfaction analysis using latent Dirichlet allocation. Tour. Manag. 2017, 59, 467–483.
Hu, N.; Zhang, T.; Gao, B.; Bose, I. What do hotel customers complain about? Text analysis using structural topic model. Tour. Manag. 2019, 72, 417–426.
Philander, K.; Zhong, Y. Twitter sentiment analysis: Capturing sentiment from integrated resort tweets. Int. J. Hosp. Manag. 2016, 55, 16–24.
Yadav, M.L.; Roychoudhury, B. Effect of trip mode on opinion about hotel aspects: A social media analysis approach. Int. J. Hosp. Manag. 2019, 80, 155–165.
Bi, J.-W.; Liu, Y.; Fan, Z.-P.; Zhang, J. Wisdom of crowds: Conducting importance-performance analysis (IPA) through online reviews. Tour. Manag. 2019, 70, 460–478.
Al-Smadi, M.; Talafha, B.; Al-Ayyoub, M.; Jararweh, Y. Using long short-term memory deep neural networks for aspect-based sentiment analysis of Arabic reviews. Int. J. Mach. Learn. Cybern. 2018, 10, 2163–2175.
Nie, R.; Tian, Z.; Wang, J.; Chin, K.S. Hotel selection driven by online textual reviews: Applying a semantic partitioned sentiment dictionary and evidence theory. Int. J. Hosp. Manag. 2020, 88, 102495.
Ravi, K.; Ravi, V. A survey on opinion mining and sentiment analysis: Tasks, approaches and applications. Knowl. Based Syst. 2015, 89, 14–46.
Fang, X.; Zhan, J. Sentiment analysis using product review data. J. Big Data 2015, 2, 1–14.
Jansen, B.J.; Zhang, M.; Sobel, K.; Chowdury, A. Twitter power: Tweets as electronic word of mouth. J. Am. Soc. Inf. Sci. Technol. 2009, 60, 2169–2188.
Bowden, J.; Mirzaei, A. Consumer engagement within retail communication channels: An examination of online brand communities and digital content marketing initiatives. Eur. J. Mark. 2021, 55, 1411–1439.
Almjawel, A.; Bayoumi, S.; Alshehri, D.; Alzahrani, S.; Alotaibi, M. Sentiment analysis and visualization of amazon books’ reviews. In Proceedings of the 2019 2nd International Conference on Computer Applications & Information Security (ICCAIS), Riyadh, Saudi Arabia, 19–21 March 2019; pp. 1–6.
Liang, T.-P.; Li, X.; Yang, C.-T.; Wang, M. What in Consumer Reviews Affects the Sales of Mobile Apps: A Multifacet Sentiment Analysis Approach. Int. J. Electron. Commer. 2015, 20, 236–260.
Kharde, V.A.; Sonawane, S.S. Sentiment Analysis of Twitter Data: A Survey of Techniques. Int. J. Comput. Appl. 2016, 139, 5–15.
González-Rodríguez, M.R.; Martínez-Torres, R.; Toral, S. Post-visit and pre-visit tourist destination image through eWOM sentiment analysis and perceived helpfulness. Int. J. Contemp. Hosp. Manag. 2016, 28, 2609–2627.
Geetha, M.; Singha, P.; Sinha, S. Relationship between customer sentiment and online customer ratings for hotels—An empirical analysis. Tour. Manag. 2017, 61, 43–54.
Araque, O.; Corcuera-Platas, I.; Sánchez-Rada, J.F.; Iglesias, C.A. Enhancing deep learning sentiment analysis with ensemble techniques in social applications. Expert. Syst. Appl. 2017, 77, 236–246.
Yoshikawa, Y.; Iwata, T.; Sawada, H. Latent support measure machines for bag-of-words data classification. In Advances in Neural Information Processing Systems 27; Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N.D., Weinberger, K.Q., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2014; pp. 1961–1969.
Taboada, M.; Brooke, J.; Tofiloski, M.; Voll, K.; Stede, M. Lexicon-Based Methods for Sentiment Analysis. Comput. Linguist. 2011, 37, 267–307.
Basarslan, M.S.; Kayaalp, F. Sentiment Analysis with Machine Learning Methods on Social Media. ADCAIJ Adv. Dist. Comput. Artif. Intell. J. 2020, 9, 5–15.
Bengio, Y.; Courville, A.; Vincent, P. Representation learning: A review and new perspectives. IEEE Trans. Pattern. Anal. Mach. Intell. 2013, 35, 1798–1828.
Ghiassi, M.; Lee, S. A domain transferable lexicon set for Twitter sentiment analysis using a supervised machine learning approach. Expert. Syst. Appl. 2018, 106, 197–216.
Hogenboom, A.; Heerschop, B.; Frasincar, F.; Kaymak, U.; de Jong, F. Multi-lingual support for lexicon-based sentiment analysis guided by semantics. Decis. Support Syst. 2014, 62, 43–53.
Cruz, F.L.; Vallejo, C.G.; Enríquez, F.; Troyano, J.A. PolarityRank: Finding an equilibrium between followers and contraries in a network. Inf. Process. Manag. 2012, 48, 271–282.
Turney, P.D.; Littman, M.L. Measuring praise and criticism. ACM Trans. Inf. Syst. 2003, 21, 315–346.
Abdulla, N.A.; Ahmed, N.A.; Shehab, M.A.; Al-Ayyoub, M. Arabic sentiment analysis: Lexicon-based and corpus-based. In Proceedings of the 2013 IEEE Jordan conference on applied electrical engineering and computing technologies (AEECT), Amman, Jordan, 3–5 December 2013.
Dey, A.; Jenamani, M.; Thakkar, J.J. Senti-N-Gram: An ngram lexicon for sentiment analysis. Expert. Syst. Appl. 2018, 103, 92–105.
Sanagar, S.; Gupta, D. Unsupervised Genre-Based Multidomain Sentiment Lexicon Learning Using Corpus-Generated Polarity Seed Words. IEEE Access 2020, 8, 118050–118071.
Sánchez-Rada, J.F.; Iglesias, C.A. Social context in sentiment analysis: Formal definition, overview of current trends and framework for comparison. Inf. Fusion 2019, 52, 344–356.
Fares, M.; Moufarrej, A.; Jreij, E.; Tekli, J.; Grosky, W. Unsupervised word-level affect analysis and propagation in a lexical knowledge graph. Knowl. Based Syst. 2019, 165, 432–459.
Qiu, G.; Liu, B.; Bu, J.; Chen, C. Expanding domain sentiment lexicon through double propagation. In Proceedings of the 21st International Joint Conference on Artificial Intelligence, Pasadena, CA, USA, 11–17 July 2009.
Mittal, V.; Ross, W.T., Jr.; Baldasare, P.M. The asymmetric impact of negative and positive attribute-level performance on overall satisfaction and repurchase intentions. J. Mark. 1998, 62, 33–47.
Kano, N.; Seraku, N.; Takahashi, F.; Tsuji, S. Attractive Quality and Must-Be Quality. J. Japan. Soc. Qual. Control 1984, 14, 39–48.
Matzler, K.; Sauerwein, E. The factor structure of customer satisfaction. Int. J. Serv. Ind. Manag. 2002, 13, 314–332.
Chen, L.-F. A novel approach to regression analysis for the classification of quality attributes in the Kano model: An empirical test in the food and beverage industry. Omega 2012, 40, 651–659.
Yang, C.-C.; Jou, Y.-T.; Cheng, L.-Y. Using integrated quality assessment for hotel service quality. Qual. Quant. 2009, 45, 349–364.
Tontini, G.; Bento, G.d.S.; Milbratz, T.C.; Volles, B.K.; Ferrari, D. Exploring the nonlinear impact of critical incidents on customers’ general evaluation of hospitality services. Int. J. Hosp. Manag. 2017, 66, 106–116.
Lin, S.-P.; Yang, C.-L.; Chan, Y.; Sheu, C. Refining Kano’s ‘quality attributes–satisfaction’ model: A moderated regression approach. Int. J. Prod. Econ. 2010, 126, 255–263.
Busacca, B.; Padula, G. Understanding the relationship between attribute performance and overall satisfaction. Mark. Intell. Plan. 2005, 23, 543–561.
Friman, M.; Edvardsson, B. A content analysis of complaints and compliments. Manag. Serv. Qual. Int. J. 2003, 13, 20–26.
Mikulić, J.; Prebežac, D. A critical review of techniques for classifying quality attributes in the Kano model. Manag. Serv. Qual. Int. J. 2011, 21, 46–66.
Sampson, S.E.; Showalter, M.J. The Performance-Importance Response Function: Observations and Implications. Serv. Ind. J. 1999, 19, 1–25.
Sever, I. Importance-performance analysis: A valid management tool? Tour. Manag. 2015, 48, 43–53.
Sörensson, A.; von Friedrichs, Y. An importance–performance analysis of sustainable tourism: A comparison between international and national tourists. J. Dest. Mark. Manag. 2013, 2, 14–21.
Parasuraman, A.P.; Zeithaml, V.A.; Berry, L.L. A Conceptual Model of Service Quality and Its Implications for Future Research. J. Mark. 1985, 49, 41–50.
Cheng, Y.-S.; Kuo, N.-T.; Chang, K.-C.; Hu, S.-M. Integrating the Kano model and IPA to measure quality of museum interpretation service: A comparison of visitors from Taiwan and Mainland China. Asia Pac. J. Tour. Res. 2019, 24, 483–500.
Dabestani, R.; Shahin, A.; Saljoughian, M.; Shirouyehzad, H. Importance-performance analysis of service quality dimensions for the customer groups segmented by DEA. Int. J. Qual. Reliab. Manag. 2016, 33, 160–177.
Yin, P.; Chu, J.; Wu, J.; Ding, J.; Yang, M.; Wang, Y. A DEA-based two-stage network approach for hotel performance analysis: An internal cooperation perspective. Omega 2020, 93, 102035.
Phillips, P.; Barnes, S.; Zigan, K.; Schegg, R. Understanding the Impact of Online Reviews on Hotel Performance. J. Travel. Res. 2016, 56, 235–249.
Domestic Overnight Tourism Trips to London. Available online: https://data.london.gov.uk/dataset/domestic-overnight-tourism-trips-to-london (accessed on 1 August 2020).
Number of International Visitors to London. Available online: https://data.london.gov.uk/dataset/number-international-visitors-london (accessed on 1 August 2021).
Keller, D.; Kostromitina, M. Characterizing non-chain restaurants’ Yelp star-ratings: Generalizable findings from a representative sample of Yelp reviews. Int. J. Hosp. Manag. 2020, 86, 102440.
Local Consumer Review Survey. 2015. Available online: https://www.brightlocal.com/research/local-consumer-review-survey-2015/ (accessed on 19 October 2015).
Ban, H.-J.; Choi, H.; Choi, E.-K.; Lee, S.; Kim, H.-S. Investigating Key Attributes in Experience and Satisfaction of Hotel Customer Using Online Review Data. Sustainability 2019, 11, 6570.
Berezina, K.; Bilgihan, A.; Cobanoglu, C.; Okumus, F. Understanding Satisfied and Dissatisfied Hotel Customers: Text Mining of Online Hotel Reviews. J. Hosp. Mark. Manag. 2015, 25, 1–24.
Zhang, D.; Xu, H.; Su, Z.; Xu, Y. Chinese comments sentiment classification based on word2vec and SVMperf. Expert. Syst. Appl. 2015, 42, 1857–1863.
Fernández-Gavilanes, M.; Álvarez-López, T.; Juncal-Martínez, J.; Costa-Montenegro, E.; Javier González-Castaño, F. Unsupervised method for sentiment analysis in online texts. Expert. Syst. Appl. 2016, 58, 57–75.
Page, L.; Brin, S.; Motwani, R.; Winograd, T. The PageRank Citation Ranking: Bringing Order to the Web. Tech. Rep. Stanf. Digit. Libr. Technol. Proj. 1998.
Dragut, E.C.; Fellbaum, C. The role of adverbs in sentiment analysis. In Proceedings of the Frame Semantics in NLP: A Workshop in Honor of Chuck Fillmore (1929–2014), Baltimore, MD, USA, 27 June 2014; pp. 38–41.
Baccianella, S.; Esuli, A.; Sebastiani, F. SentiWordNet 3.0: An enhanced lexical resource for sentiment analysis and opinion mining. In Proceedings of the International Conference on Language Resources and Evaluation, Valletta, Malta, 17–23 May 2010.
Kennedy, A.; Inkpen, D. Sentiment Classification of Movie and Product Reviews Using Contextual Valence Shifters. Comput. Intell. 2006, 22, 110–125.
Polanyi, L.; Zaenen, A. Contextual valence shifters. In Computing Attitude and Affect in Text: Theory and Applications; Shanahan, J.G., Qu, Y., Wiebe, J., Eds.; The Information Retrieval Series; Springer: Dordrecht, The Netherlands, 2006; Volume 20, pp. 1–10.
Brooke, J. A Semantic Approach to Automatic Text Sentiment Analysis. Master’s Thesis, Simon Fraser University, Burnaby, BC, Canada, 31 March 2009.
Kim, D.; Seo, D.; Cho, S.; Kang, P. Multi-co-training for document classification using various document representations: TF–IDF, LDA, and Doc2Vec. Inf. Sci. 2019, 477, 15–29.
Azzopardi, E.; Nash, R. A critical evaluation of importance–performance analysis. Tour. Manag. 2013, 35, 222–233.
O’Connor, P. Managing a Hotel’s Image on TripAdvisor. J. Hosp. Mark. Manag. 2010, 19, 754–772.
Manolitzas, P.; Glaveli, N.; Palamas, S.; Talias, M.; Grigoroudis, E. Hotel guests’ demanding level and importance of attribute satisfaction ratings: An application of MUltiplecriteria Satisfaction Analysis on TripAdvisor’s hotel guests ratings. Curr. Issues Tour. 2021, 24, 1–6.
Kitsios, F.; Kamariotou, M.; Karanikolas, P.; Grigoroudis, E. Digital Marketing Platforms and Customer Satisfaction: Identifying eWOM Using Big Data and Text Mining. Appl. Sci. 2021, 11, 8032.
Zhao, S. Thumb Up or Down? A Text-Mining Approach of Understanding Consumers through Reviews. Decis. Sci. 2021, 52, 699–719.
UK Hotel Star Rating System. Available online: https://www.narehotel.co.uk/uk-hotel-star-rating-system (accessed on 4 January 2022).
Kashyap, R.; Bojanic, D.C. A Structural Analysis of Value, Quality, and Price Perceptions of Business and Leisure Travelers. J. Travel. Res. 2000, 39, 45–51.

Figure 1. The Kano model.

Figure 2. An example of an IPA plot.

Figure 3. Framework of the proposed methodology.

Figure 4. An example of a lexical graph.

Figure 5. The percentage of negative sentiment values concerning six attributes.

Figure 6. The percentage of positive sentiment values concerning six attributes.

Figure 7. The IPA plots concerning four hotel star ratings.

Table 1. Related studies of the Kano model in hospitality research.

Literature	Objective	Classifying Method	Sample Source
Yang et al. (2009) [75]	Offer enhanced value to the hotel customer through low prices while meeting appropriate features, using a refined Kano model and a strategic price model.	Kano’s method	Questionnaire
Chang and Chen (2011) [27]	Use the Kano model and quality function deployment (QFD) to explore hotel brand contact elements perceived by customers.	Kano’s method	Questionnaire
Tontini et al. (2017) [76]	Explore nonlinear effects of service quality on customers’ evaluation of three-star hotels in Rio de Janeiro, Brazil.	CIT and PRCA	TripAdvisor.com
Lai and Hitchcock (2016) [30]	Integrate IPA with the Kano three-factor theory to examine differences in service attribute importance across market segments, using the case of luxury hotels in Macau.	Importance grid	Questionnaire
Beheshtinia and Farzaneh Azad (2017) [12]	Identify customer needs for hotels and prioritize them using a combination of the SERVQUAL and Kano approaches.	Kano’s method	Questionnaire
Cheng and Chen (2018) [11]	Analyze the competitive qualities that require improvement to enhance the service quality of motels in Taiwan.	Moderated regression	Questionnaire
Bi et al. (2020) [10]	Explore the asymmetric effects of attribute performance on customer satisfaction with respect to different market segments.	PRCA	TripAdvisor.com

Table 2. Distribution of sub-datasets according to the hotel star and overall customer ratings.

	Hotel Star Ratings
	Two-Star and Below	Three-Star	Four-Star	Five-Star
Total number of hotels	72	231	237	100
Negative reviews	21,771	86,720	106,585	21,997
Positive reviews	41,486	231,938	408,553	171,291
Total	63,257	318,658	515,138	193,288

Table 3. The attribute priority rankings for resource allocation.

Kano Category	IPA Strategy
	Concentrate Here	Low Priority	Keep Up the Good Work	Possible Overkill
Basic	1	4	7	10
Performance	2	5	8	11
Excitement	3	6	9	12
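
The grid in Table 3 can be read as a lookup from a (Kano category, IPA strategy) pair to a priority rank. The following is a minimal Python sketch of that lookup; the function name `priority_rank` and the three example attributes are illustrative assumptions, not names or results from the paper.

```python
# Row and column labels transcribed from Table 3 (rank 1 = allocate resources first).
KANO = ["Basic", "Performance", "Excitement"]
IPA = ["Concentrate Here", "Low Priority", "Keep Up the Good Work", "Possible Overkill"]


def priority_rank(kano: str, strategy: str) -> int:
    """Return the Table 3 rank for a Kano category and an IPA strategy.

    In the table, ranks grow by 3 per IPA column and by 1 per Kano row,
    so the grid reduces to a single arithmetic expression.
    """
    return 3 * IPA.index(strategy) + KANO.index(kano) + 1


# Hypothetical attribute assignments, used only to show how the ranks
# order a list of attributes; they are not taken from the paper's results.
example = {
    "attribute_a": ("Excitement", "Concentrate Here"),
    "attribute_b": ("Basic", "Low Priority"),
    "attribute_c": ("Basic", "Concentrate Here"),
}
ordered = sorted(example, key=lambda name: priority_rank(*example[name]))
print(ordered)
```

Sorting by this rank puts every "Concentrate Here" attribute ahead of every "Low Priority" one, and within a strategy puts Basic before Performance before Excitement.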

Table 4. Top 10 similar sub-attributes with respect to the six attributes from five-star hotels.

Value		Location		Service
Sub-Attributes	Similarity	Sub-Attributes	Similarity	Sub-Attributes	Similarity
represent	0.6609	position	0.7390	middling	0.6122
competitively	0.6373	localisation	0.6764	exemplar	0.6041
comparative	0.6291	harrow	0.6467	fatless	0.5972
money	0.6283	fatherland	0.6444	outstanding	0.5942
ratio	0.6070	breckenridge	0.6380	approachability	0.5908
introductory	0.6064	geest	0.6367	topnotch	0.5858
comparably	0.6055	truistic	0.6341	cleanness	0.5787
reasonable	0.5948	locality	0.6308	tentativeness	0.5771
affordability	0.5733	heartland	0.6254	exceptional	0.5751
inline	0.5720	kenton	0.6245	staff	0.5668
Room		Cleanliness		Sleep Quality
Sub-Attributes	Similarity	Sub-Attributes	Similarity	Sub-Attributes	Similarity
plushly	0.6636	spotless	0.7262	soundly	0.8287
suite	0.6609	immaculate	0.6335	undisturbed	0.7309
uprate	0.6463	tidy	0.6275	hypnos	0.6987
handspring	0.6425	spacey	0.6169	insomniac	0.6950
furbished	0.6340	furbished	0.6079	restless	0.6841
cubit	0.6321	conformable	0.6057	soundproofed	0.6737
spacious	0.6289	appoint	0.6056	silent	0.6722
pristinely	0.6284	spacious	0.6055	pillow	0.6691
spacey	0.6214	equipped	0.5960	uninterrupted	0.6646
luminous	0.6207	scrupulously	0.5927	blackout	0.6643

Table 5. Candidate sentiment words with tag, frequency and initial sentiment value.

Terms	Tag	Number of Words	$e^{+}$	$e^{-}$
adequate	a	27,210	0.0795	0.0682
close	a	2257	0.1810	0.0697
abusive	a	120	0	0.8750
affable	a	115	0.6250	0
inspirational	a	75	0.6250	0
trusty	a	52	0.5000	0
accomplished	a	51	0.4432	0
mannerly	a	44	0.7500	0
stubborn	a	40	0	0.6667
immune	a	38	0.0900	0.0950
scramble	n	213	0	0.0833
defect	n	95	0	0.0950
blind	n	68	0	0.0300
despair	n	44	0.0833	0.3750
easiness	n	38	0.0568	0.2045
unwillingness	n	33	0	0.5000
enhancement	n	32	0.3750	0
prejudice	n	30	0	0.8750
horribly	r	508	0	0.7500
painfully	r	506	0	0.0833
marvellously	r	70	0.5000	0.1250
pathetically	r	51	0.3333	0.1667
guiltily	r	38	0.1250	0
responsibly	r	31	0.5000	0
stop	v	2146	0	0.0183
hate	v	283	0	0.7500
free	v	269	0.2523	0.0103
adore	v	97	0.5000	0.1250
desire	v	44	0.1705	0.0341
respect	v	34	0.4583	0
…	…	…	…	…

Table 6. Top 10 positive and negative sentiment words.

Sentiment Words	Tag	Number of Words	$PR^{+}$	$PR^{-}$	$SO$
deferentially	r	33	0.8250	0.0051	4.9382
upbeat	n	45	1.2375	0.0077	4.9380
mannerly	a	44	1.8582	0.0134	4.9282
glowing	a	32	1.8582	0.0135	4.9279
topnotch	a	33	1.5494	0.0121	4.9224
uxorious	a	35	1.2398	0.0099	4.9205
fortune	n	41	1.5504	0.0132	4.9157
maestro	n	43	1.5505	0.0132	4.9156
amusingly	r	32	1.8606	0.0159	4.9153
rosy	a	48	1.5389	0.0139	4.9103
mustiness	n	31	0.0128	2.0339	−4.9372
untrustworthy	a	33	0.0119	1.7442	−4.9321
unemployed	a	34	0.0104	1.4540	−4.9290
untypical	a	31	0.0129	1.7452	−4.9265
malodorous	a	42	0.0163	2.0373	−4.9206
bogus	a	32	0.0140	1.7464	−4.9203
prejudice	n	30	0.0170	2.0380	−4.9174
damage	v	30	0.0149	1.7472	−4.9155
ungraded	a	52	0.0124	1.4560	−4.9154
egregious	a	33	0.0177	2.0387	−4.9141
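
The $SO$ column in Table 6 appears consistent with a normalised difference of the two PolarityRank scores scaled to the range $[-5, 5]$. This relationship is inferred here from the tabulated values rather than quoted from Section 3.3.4, so the sketch below is an assumption to check against the paper's own formula. The function name and the `scale` parameter are hypothetical.

```python
def semantic_orientation(pr_pos: float, pr_neg: float, scale: float = 5.0) -> float:
    """Assumed form: scale * (PR+ - PR-) / (PR+ + PR-)."""
    total = pr_pos + pr_neg
    if total == 0:
        # Assumption: a word with no polarity mass is treated as neutral.
        return 0.0
    return scale * (pr_pos - pr_neg) / total


# (word, PR+, PR-, SO) rows copied from Table 6.
rows = [
    ("deferentially", 0.8250, 0.0051, 4.9382),
    ("mustiness", 0.0128, 2.0339, -4.9372),
    ("egregious", 0.0177, 2.0387, -4.9141),
]
for word, pr_pos, pr_neg, so_table in rows:
    so_calc = semantic_orientation(pr_pos, pr_neg)
    # Small gaps are expected because PR+ and PR- are rounded in the table.
    print(f"{word}: computed {so_calc:.4f}, table {so_table:.4f}")
```

Under this assumed form, the computed values agree with the tabulated $SO$ to within the rounding of the published $PR^{+}$ and $PR^{-}$ figures.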

Table 7. Top 20 well-performing sub-attributes with the strongest positive sentiment polarity with respect to six attributes in five-star hotels.

Value		Location		Service
Sub-Attributes	Sentiment Values	Sub-Attributes	Sentiment Values	Sub-Attributes	Sentiment Values
quality	9287.06	hotel	127,281.60	service	83,160.98
price	7723.98	location	51,065.79	staff	77,198.91
value	7457.58	great	47,273.65	excellent	19,643.26
overall	4718.63	London	43,130.88	food	18,286.84
cheap	3431.94	place	15,842.34	experience	17,114.62
money	3167.76	perfect	11,355.46	quality	9531.05
rate	3115.13	close	6832.03	attentive	8627.21
deal	2744.77	station	6794.12	superb	7623.67
compare	1427.76	locate	6678.65	outstanding	5455.66
reasonable	1384.81	central	5362.67	attention	4843.12
opinion	874.96	tube	5162.28	exceptional	3612.60
discount	441.41	overall	5036.61	deliver	2642.43
represent	344.97	near	3440.62	impeccable	1426.53
inexpensive	282.49	distance	3431.87	exemplary	1093.08
bargain	226.16	heart	2727.13	cleanliness	1029.09
affordable	176.04	attraction	2529.99	presentation	415.14
favourably	169.36	convenient	2336.89	unmatched	230.21
competitive	164.94	boutique	2066.52	noteworthy	157.20
ratio	73.53	ideal	1999.84	commendable	155.89
phenomenally	52.47	spot	1966.29	incomparable	70.66
Room		Cleanliness		Sleep Quality
Sub-Attributes	Sentiment Values	Sub-Attributes	Sentiment Values	Sub-Attributes	Sentiment Values
room	215,490.83	room	112,559.79	bed	21,150.78
bed	20,678.19	very	87,177.71	comfortable	18,250.67
decorate	18,064.37	well	29,182.25	sleep	8989.04
bathroom	16,396.52	clean	26,561.49	comfy	2552.32
clean	13,874.17	comfortable	21,693.18	pillow	2535.55
suite	13,413.97	bed	20,495.64	linen	2019.79
spacious	9802.36	nice	20,308.36	peaceful	1063.44
large	9007.25	excellent	18,263.84	bedding	1058.61
although	5901.86	decorate	17,812.75	wake	913.49
size	5253.50	bathroom	17,204.46	restful	606.27
bedroom	4876.74	amenity	15,456.15	mattress	499.52
appoint	4173.68	modern	10,517.20	duvet	301.12
deluxe	3075.58	spacious	10,358.82	heavenly	255.83
apartment	2444.80	extremely	9127.39	sleeper	250.90
tastefully	2220.49	luxurious	6218.94	silent	177.54
spotless	1873.90	appoint	5207.32	soundly	176.52
superior	1617.76	super	4743.52	blackout	162.35
furnish	1525.96	elegant	4253.82	dreamy	107.29
spotlessly	1203.26	nicely	3671.61	supremely	104.01
junior	1133.58	polite	3554.43	soundproof	103.71

Table 8. Top 20 poorly performing sub-attributes with the strongest negative sentiment polarity with respect to six attributes in five-star hotels.

Value		Location		Service
Sub-Attributes	Sentiment Values	Sub-Attributes	Sentiment Values	Sub-Attributes	Sentiment Values
price	−3459.05	hotel	−24,483.27	service	−14,546.04
money	−2399.80	location	−5800.25	staff	−8881.41
value	−1547.67	London	−3815.88	experience	−3077.38
overall	−1298.75	great	−3269.14	food	−2274.30
quality	−1208.69	place	−2614.52	quality	−1240.45
rate	−1145.92	overall	−1386.27	excellent	−708.58
cheap	−861.07	close	−1289.30	deliver	−629.82
deal	−662.34	near	−720.90	attention	−493.67
opinion	−291.08	locate	−560.99	attentive	−261.83
discount	−265.94	station	−509.16	exceptional	−141.31
compare	−265.43	central	−462.92	superb	−138.37
reasonable	−218.69	tube	−332.92	cleanliness	−134.34
represent	−69.34	convenient	−303.54	outstanding	−113.61
bargain	−42.48	base	−302.06	presentation	−50.10
competitive	−27.97	perfect	−287.74	impeccable	−40.52
advertised	−19.33	ideal	−255.15	cleanness	−15.81
disproportionate	−16.96	position	−198.63	exemplary	−10.32
ratio	−16.09	spot	−166.93	servicing	−8.97
commensurate	−13.71	distance	−162.97	middling	−8.62
inexpensive	−12.85	boutique	−145.95	tentativeness	−7.46
Room		Cleanliness		Sleep Quality
Sub-Attributes	Sentiment Values	Sub-Attributes	Sentiment Values	Sub-Attributes	Sentiment Values
room	−53,448.27	room	−27,918.25	bed	−3801.36
bed	−3716.42	very	−11,076.76	sleep	−3001.91
bathroom	−3220.28	clean	−3866.42	comfortable	−886.78
clean	−2019.59	bed	−3683.61	wake	−581.08
although	−1719.86	nice	−3652.38	pillow	−428.58
suite	−1527.98	bathroom	−3378.96	mattress	−300.63
large	−1048.14	well	−3181.78	comfy	−188.85
size	−800.25	comfortable	−1054.04	bedding	−145.26
bedroom	−783.50	extremely	−1009.08	duvet	−136.03
spacious	−417.60	excellent	−658.82	linen	−131.69
deluxe	−364.08	spacious	−441.31	soundproofing	−70.98
superior	−270.01	modern	−437.53	sleepless	−65.18
apartment	−211.28	polite	−300.16	proofing	−51.64
junior	−176.58	luxurious	−235.98	peaceful	−44.81
appoint	−160.08	super	−234.58	insulation	−44.18
furnish	−65.87	amenity	−229.05	soundproof	−40.43
decorate	−61.57	appoint	−199.73	restless	−39.13
spotless	−28.02	comfy	−198.85	comforter	−37.62
furnished	−26.66	maintain	−141.21	disturbed	−34.95
airy	−17.59	functional	−140.51	sleeper	−28.19

Table 9. The negative, positive and overall sentiment values of each attribute according to the hotel star ratings.

Hotel Star Ratings	Sentiment Values	Attribute
		Value	Location	Service	Room	Cleanliness	Sleep Quality	Mean
Two-star and below	Negative	−44,402.05	−77,480.13	−34,111.69	−113,060.25	−88,818.79	−7304.89	−60,862.97
	Positive	75,603.2	141,847.69	60,384.86	136,324.38	147,678.6	5874.08	94,618.8
	Overall	15,868.76	35,830.05	17,723.35	9693.39	31,797.07	490.12	18,567.12
Three-star	Negative	−157,906.9	−277,368.47	−119,342.87	−371,405.85	−318,294	−43,190.77	−214,584.81
	Positive	369,114.39	707,692.72	294,129.77	606,484.11	777,154.88	65,084.39	469,943.38
	Overall	84,434.11	205,247.57	90,848.91	96,265.24	238,493.17	12,018.31	121,217.88
Four-star	Negative	−102,202.45	−242,558.76	−192,196.55	−409,184.1	−364,975.08	−46,462.47	−226,263.23
	Positive	296,090.96	931,501.55	630,134.85	1,058,169.12	1,384,338.72	81,519.98	730,292.53
	Overall	92,485.82	352,708.09	244,041.18	362,762.47	622,552.97	18,611.68	282,193.7
Five-star	Negative	−13,991.48	−48,218.33	−32,808.41	−70,719.73	−63,615.54	−10,346.97	−39,950.08
	Positive	47,803.81	369,732.22	263,145.74	358,219.8	482,105.65	62,616.95	263,937.36
	Overall	14,494.87	164,098.46	128,769.04	169,039.22	274,199.1	28,743.64	129,890.72

Table 10. The three Kano categories of six attributes concerning four hotel star ratings.

Attribute	Two-Star and Below Hotels		Three-Star Hotels		Four-Star Hotels		Five-Star Hotels
	$SI$	Category	$SI$	Category	$SI$	Category	$SI$	Category
Value	0.9911	Excitement	1.1747	Excitement	1.0458	Excitement	1.1693	Excitement
Location	0.9356	Performance	1.0411	Performance	0.9723	Excitement	0.9685	Excitement
Service	0.8230	Basic	0.9671	Basic	0.8851	Performance	0.8317	Performance
Room	1.0316	Excitement	1.0910	Excitement	0.9008	Performance	0.7890	Performance
Cleanliness	0.9607	Performance	0.9674	Basic	0.7714	Basic	0.6154	Basic
Sleep quality	0.6907	Basic	0.9612	Basic	0.9667	Performance	0.8665	Performance
Mean	0.9055		1.0338		0.9237		0.8734
θ	0.0568		0.0356		0.0457		0.0923
Upper threshold	0.9623		1.0693		0.9694		0.9657
Lower threshold	0.8487		0.9982		0.8780		0.7811
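
The thresholds in Table 10 equal Mean ± θ, and every category in the table is consistent with a simple rule: $SI$ above the upper threshold is Excitement, $SI$ below the lower threshold is Basic, and anything in between is Performance. That rule is inferred here from the tabulated numbers, not quoted from Section 3.5.1. A minimal Python sketch follows, with the hypothetical function name `kano_category`. How θ itself is computed is not shown in this part of the article, so it is passed in as given.

```python
def kano_category(si: float, mean: float, theta: float) -> str:
    """Classify one attribute from its SI value (rule inferred from Table 10)."""
    upper = mean + theta  # "Upper threshold" row
    lower = mean - theta  # "Lower threshold" row
    if si > upper:
        return "Excitement"
    if si < lower:
        return "Basic"
    return "Performance"


# SI values, mean and theta for two-star-and-below hotels, copied from Table 10.
two_star_si = {
    "Value": 0.9911,
    "Location": 0.9356,
    "Service": 0.8230,
    "Room": 1.0316,
    "Cleanliness": 0.9607,
    "Sleep quality": 0.6907,
}
two_star = {
    name: kano_category(si, mean=0.9055, theta=0.0568)
    for name, si in two_star_si.items()
}
for name, category in two_star.items():
    print(f"{name}: {category}")
```

Applied to the five-star column (mean 0.8734, θ 0.0923), the same rule also reproduces the categories listed in the table.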

Table 11. The importance of the six attributes concerning four hotel star ratings.

Hotel Star Rating	Attribute
	Value	Location	Service	Room	Cleanliness	Sleep Quality
Two-star and below	0.1354	0.2412	0.0990	0.2566	0.2574	0.0104
Three-star	0.1328	0.2446	0.0962	0.2337	0.2690	0.0238
Four-star	0.0692	0.2267	0.1353	0.2508	0.2978	0.0202
Five-star	0.0315	0.2626	0.1545	0.2315	0.2812	0.0387

Table 12. The attribute priority rankings of the six attributes for resource allocation concerning four hotel star ratings.

Attribute	Hotel Star Rating
	2-Star and Below	3-Star	4-Star	5-Star
Value	4	4	3	3
Location	6	6	6	6
Service	2	2	1	1
Room	1	1	5	5
Cleanliness	5	5	4	4
Sleep quality	3	3	2	2

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Chen, Y.; Zhong, Y.; Yu, S.; Xiao, Y.; Chen, S. Exploring Bidirectional Performance of Hotel Attributes through Online Reviews Based on Sentiment Analysis and Kano-IPA Model. Appl. Sci. 2022, 12, 692. https://doi.org/10.3390/app12020692


