Social media has become the largest data source of public opinion. The application of sentiment analysis to social media texts has great potential, but faces great challenges because of domain heterogeneity. Sentiment orientation of words varies by content domain, but learning context-specific sentiment in social media domains continues to be a major challenge. The language domain poses another challenge since the language used in social media today differs significantly from that used in traditional media. To address these challenges, we propose a method to adapt existing sentiment lexicons for domain-specific sentiment classification using an unannotated corpus and a dictionary. We evaluate our method using two large developing corpora, containing 743,069 tweets related to the stock market and one million tweets related to political topics, respectively, and five existing sentiment lexicons as seeds and baselines. The results demonstrate the usefulness of our method, showing significant improvement in sentiment classification performance.
•We propose a method to adapt existing sentiment lexicons for domain-specific sentiment classification.
•The proposed method addresses challenges from both content domain and language domain.
•We evaluate our method using two large developing corpora and five existing sentiment lexicons as seeds and baselines.
•The evaluation results demonstrate the usefulness of our method.
The era of Web 2.0 is witnessing the proliferation of online social media platforms, which develop new business models by leveraging user-generated content. One rapidly growing source of user-generated data is online reviews, which play a very important role in disseminating information, facilitating trust, and promoting commerce in the e-marketplace. In this paper, we develop and compare several text regression models for predicting the helpfulness of online reviews. In addition to using review words as predictors, we examine the influence of reviewer engagement characteristics such as reputation, commitment, and current activity. We employ a reviewer's RFM (Recency, Frequency, Monetary Value) dimensions to characterize his/her overall engagement and investigate whether the inclusion of those dimensions helps improve the prediction of online review helpfulness. Empirical findings from text mining experiments conducted using reviews from Yelp and Amazon offer strong support for our thesis. We find that both review text and reviewer engagement characteristics help predict review helpfulness. The hybrid approach of combining the textual features of the bag-of-words model with RFM dimensions produces the best prediction results. Furthermore, our approach facilitates the estimation of the helpfulness of new reviews instantly, making it possible for social media platforms to dynamically adjust the presentation of those reviews on their websites.
•We propose hybrid text regression models for predicting online review helpfulness.
•We creatively adapt RFM analysis to characterize a reviewer's online engagement.
•A reviewer's RFM dimensions improve the prediction of review helpfulness.
•The hybrid approach, combining bag-of-words and RFM, produces the best results.
•Our study can help online platforms rate and present new online reviews instantly.
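The hybrid feature construction described above can be illustrated with a minimal sketch. The abstract does not say how each RFM dimension is measured for a reviewer, so the definitions here are assumptions for illustration only: recency is the number of days since the reviewer's latest review, frequency is the number of reviews written, and the "monetary" dimension is approximated by the total votes the reviewer has received. The function names are hypothetical.

```python
from collections import Counter
from datetime import date


def rfm_features(review_dates, votes_received, as_of):
    """Hypothetical RFM adaptation for a reviewer (an assumption, not the paper's definition)."""
    recency = (as_of - max(review_dates)).days  # days since the latest review
    frequency = len(review_dates)               # number of reviews written
    monetary = sum(votes_received)              # assumed stand-in for "monetary value"
    return [recency, frequency, monetary]


def bow_features(text, vocabulary):
    """Bag-of-words counts for a fixed, ordered vocabulary (whitespace tokenization)."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


def hybrid_features(text, vocabulary, review_dates, votes_received, as_of):
    """Concatenate the textual features with the reviewer's RFM dimensions."""
    return bow_features(text, vocabulary) + rfm_features(review_dates, votes_received, as_of)


vocab = ["great", "food", "slow"]
x = hybrid_features(
    "Great food great service",
    vocab,
    review_dates=[date(2020, 1, 1), date(2020, 1, 11)],
    votes_received=[3, 5],
    as_of=date(2020, 1, 21),
)
print(x)
```

The resulting vector could be passed to any regression learner; because it depends only on the review text and the reviewer's history, it is available as soon as a new review is posted.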
Opinion mining of microblog messages has become a popular application of business analytics in recent times. Opinions reflected in microblogs have provided businesses with great opportunities to acquire insights into their operating environments in real time. In particular, the relationship between microblog sentiment and stock returns is of great interest to investment professionals and academic researchers across multiple disciplines. We empirically test this complex relationship in a comprehensive study. We perform vector autoregression on a data set containing close to 18 million microblog messages spanning 4 years at the market and the individual stock levels, and at the daily and the hourly frequencies. The results show that the influence of microblog sentiment on stock returns is both statistically and economically significant at the hour level. Microblog sentiment is also largely driven by movements in the market. Moreover, stock returns have a stronger influence on negative sentiment than on positive sentiment. These findings have important implications for both research and practice.
Reduction in hospital readmissions has long been identified as a target area for healthcare public policy reform by the US government. In October 2012, the Affordable Care Act (ACA) established the Hospital Readmissions Reduction Program (HRRP), which requires the Centers for Medicare and Medicaid Services to reduce payments to hospitals with excess readmissions. Even with recent changes in the ACA, the HRRP is still in place. In this study, we empirically examine the effectiveness of the introduction of the HRRP on hospital readmission and mortality rates. We observe that, in general, the introduction of the HRRP has significantly reduced readmission rates. However, the introduction of the HRRP does not necessarily decrease the mortality rates, which highlights the unintended consequences of public policy. What is more interesting is that the impact of the HRRP is heterogeneous across hospital size and racial groups. First, after the HRRP introduction, large hospitals have experienced a greater reduction in readmission rates than small hospitals. Second, after the introduction of the HRRP, the zip code regions with a higher percentage of Hispanic and African‐American populations have experienced a larger reduction in readmission rates. These results contribute to both theory and practice in public policy and provide important and nuanced policy implications for evaluating the effectiveness of the HRRP. Policy‐makers also need to pay close attention to these results for future implementations of policies similar to the HRRP.
Many online sellers send a review request only a few days after product delivery to gather customer reviews. Yet, the value of this strategy is questionable because buyers with short product exposure are unlikely to have enough time to inspect the product thoroughly and thus may not offer valuable evaluations. We address this question by examining the influence of consumers’ product exposure on the helpfulness of their reviews. Our findings suggest that those with a longer product exposure tend to produce more helpful posts. The subsequent topic modeling analyses reveal that reviewers’ assessments of product utilitarian aspects increase with product exposure. Such information is perceived as less subjective and contains more discussions on product functionality. Lastly, we find that users with prior product domain knowledge do not need a long exposure to produce helpful reviews. Businesses with an urgent need to gain reviews may target them as a priority.
•We incorporate higher-level knowledge constructs, cognitive scripts, for analyzing review helpfulness.
•The scripts-enriched text regression model outperforms both the baseline and bag-of-words models.
•Our model can identify the most helpful reviews quickly and enhance the review website's usefulness.
In this paper, we examine the utility of script analysis for predicting the helpfulness of online customer reviews. We employ the lens of cognitive scripts and posit that people share a cognitive script for what constitutes a helpful review in a given domain. Conceptually, a script includes the salient elements that readers look for before determining whether a review is helpful. To operationalize the construct of cognitive script, we seek the help of human annotators and ask them to highlight phrases that they believe are important for determining review helpfulness. The words in the annotated phrases are collected and become part of the script lexicon for a given domain. The lexicon entries represent the shared conception of essential elements, which are key to the evaluation of review helpfulness. We employ the words in the script lexicon as features in a text regression model to predict review helpfulness. Furthermore, we develop and empirically validate a new approach for combining script analysis and dimension reduction. The purpose of the study is to propose a new method to predict review helpfulness and to evaluate the effectiveness and efficiency of the scripts-enriched model. To demonstrate the efficacy of the scripts-enriched model, we compare it with benchmark models – a Baseline model and a bag-of-words (BOW) model. The results show that the scripts-enriched text regression model not only produces the highest accuracy, but also the lowest training, testing, and feature selection times.
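The construction of a script lexicon and its use as a feature set can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's implementation: the annotated phrases are invented examples, tokenization is a plain regular expression, no stop-word filtering or dimension reduction is applied, and the function names are hypothetical.

```python
import re

TOKEN = re.compile(r"[a-z']+")


def build_script_lexicon(annotated_phrases):
    """Collect every word that appears in the phrases highlighted by annotators."""
    lexicon = set()
    for phrase in annotated_phrases:
        lexicon.update(TOKEN.findall(phrase.lower()))
    return lexicon


def script_features(review, lexicon):
    """Count how often each lexicon word occurs in a review (features for text regression)."""
    tokens = TOKEN.findall(review.lower())
    return {word: tokens.count(word) for word in sorted(lexicon)}


# Invented annotator highlights, for illustration only.
phrases = ["battery life", "easy to set up"]
lexicon = build_script_lexicon(phrases)
features = script_features("The battery died, but set up was easy. Battery is weak.", lexicon)
print(features)
```

Because the feature space is restricted to lexicon words, it is typically far smaller than a full bag-of-words vocabulary, which is consistent with the shorter training and testing times reported above.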
A consumer typically visits an online store a few times before making a purchase decision, and on each visit spends some time browsing the store. The durations of these visits vary not only across consumers but also for a given consumer across multiple visits. We argue that the amount of time that a consumer spends on the first visit to a website depends on how she is drawn to the website. We find that the duration of the first visit is influenced by the advertising tool (banner ad or search engine) used to attract consumers to the website. The durations of subsequent visits are influenced by the durations of earlier visits. The search durations are also influenced by the visit day of the week and time of day. In this paper, we develop a multiple-spell competing risk model to capture the underlying stochastic process, chief elements of which are two interrelated processes: a duration process and a transition process. The multistate, multiple-spell model allows us to identify a window of opportunity, within which the purchase probability is higher than the exit probability. Online salespersons should target site visitors during this window of opportunity. The model, which is calibrated on clickstream data obtained from a major online vendor, can also be used to determine the bid price strategy for search engine ads.
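The idea of a window of opportunity can be illustrated with a small sketch. It assumes that a fitted model has already produced purchase and exit probabilities on a grid of visit times. The numbers below are made up for illustration, and the function name is hypothetical. The code simply locates the longest contiguous stretch in which the purchase probability exceeds the exit probability.

```python
def window_of_opportunity(times, p_purchase, p_exit):
    """Return (start, end) of the longest contiguous run where purchase > exit, or None."""
    best = None   # (start_index, end_index) of the longest run found so far
    start = None  # start index of the current run, if any
    for i, (pp, pe) in enumerate(zip(p_purchase, p_exit)):
        if pp > pe:
            if start is None:
                start = i
            if best is None or (i - start) > (best[1] - best[0]):
                best = (start, i)
        else:
            start = None
    if best is None:
        return None
    return times[best[0]], times[best[1]]


# Hypothetical model output: minutes into the visit and the two transition probabilities.
minutes = [0, 5, 10, 15, 20, 25]
purchase = [0.01, 0.03, 0.06, 0.07, 0.04, 0.02]
exit_ = [0.05, 0.04, 0.05, 0.05, 0.05, 0.06]
print(window_of_opportunity(minutes, purchase, exit_))
```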
Care coordination involves shaping patient care activities and sharing information among all participants concerned with a patient's care to achieve safe and effective care. The objectives of care coordination are to promote sharing of patients’ clinical information, keep patients and families informed, and manage effective referrals and care transitions. Failures in care coordination account for a large amount of waste per year in the United States. Many innovative healthcare organizations have recently recognized the danger of poorly coordinated care and have implemented analytics to improve it. Therefore, more analytics‐based research (especially combining explanatory analytics with predictive analytics) is needed to direct efforts to improve care coordination. This paper focuses on systematically studying the extant literature to understand how analytics play a role in improving care coordination. Our goal is to identify a set of key research questions that would lead to new research areas in the use of analytics for care coordination. Based on these questions, we offer new analytics solution pathways to care coordination problems.
Purpose
Online search effort is routinely measured by the duration of a visit to the website, as obtained from clickstream data or surveys. Measuring search effort by the time spent at a website assumes that all consumers who search for the same duration obtain the same amount of information. This would be acceptable if all consumers possessed the same navigational ability. In practice, different consumers have different levels of ability to navigate a website. The purpose of this study is to find whether an individual’s navigational ability has an influence on visit duration and purchase likelihood.
Design/methodology/approach
The authors use visit duration data from a real website, which makes it possible to partition the visit duration into the times spent on relevant and irrelevant pages. The data were collected through an experimental study. The authors develop an empirical model, comprising hazard and choice models, to assess the relationship between navigational ability and elements of website usage.
Findings
A consumer with poor navigational ability spends more time searching on the Web and has lower purchase probability compared to a consumer with superior ability.
Research limitations/implications
The study is limited to a single data set.
Practical implications
This research has managerial implications for website design, such as link-structure, appearance, size and the number of graphics.
Originality/value
This is the first study to research navigational ability’s influence on online consumer behavior.
While extensive research in data mining has been devoted to developing better classification algorithms, relatively little research has been conducted to examine the effects of feature construction, guided by domain knowledge, on classification performance. However, in many application domains, domain knowledge can be used to construct higher-level features to potentially improve performance. For example, past research and regulatory practice in early warning of bank failures has resulted in various explanatory variables, in the form of financial ratios, that are constructed based on bank accounting variables and are believed to be more effective than the original variables in identifying potential problem banks. In this study, we empirically compare the performance of two sets of classifiers for bank failure prediction, one built using raw accounting variables and the other built using constructed financial ratios. Four popular data mining methods are used to learn the classifiers: logistic regression, decision tree, neural network, and k-nearest neighbor. We evaluate the classifiers on the basis of expected misclassification cost under a wide range of possible settings. The results of the study strongly indicate that feature construction, guided by domain knowledge, significantly improves classifier performance and that the degree of improvement varies significantly across the methods.
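Expected misclassification cost, the evaluation criterion mentioned above, can be sketched with the standard two-class formulation: each error rate is weighted by the prior probability of its class and by the cost of that error. The abstract does not give the exact settings used, so the error rates, prior, and cost ratios below are illustrative assumptions, and the function name is hypothetical.

```python
def expected_misclassification_cost(fn_rate, fp_rate, prior_failure, cost_fn, cost_fp):
    """Standard two-class expected cost.

    fn_rate: share of failing banks classified as healthy (false negative rate)
    fp_rate: share of healthy banks classified as failing (false positive rate)
    prior_failure: assumed prior probability that a bank fails
    cost_fn, cost_fp: assumed costs of a false negative and a false positive
    """
    return prior_failure * fn_rate * cost_fn + (1 - prior_failure) * fp_rate * cost_fp


# Sweep a range of hypothetical cost ratios (false negative cost relative to false positive cost).
for ratio in (1, 10, 50, 100):
    cost = expected_misclassification_cost(
        fn_rate=0.2, fp_rate=0.1, prior_failure=0.05, cost_fn=ratio, cost_fp=1
    )
    print(ratio, round(cost, 4))
```

Comparing two classifiers under each combination of prior and cost ratio, rather than at a single point, shows whether one dominates across the whole range of plausible settings.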