We propose a data mining (DM) approach to predict the success of telemarketing calls for selling bank long-term deposits. A Portuguese retail bank was addressed, with data collected from 2008 to 2013, thus including the effects of the recent financial crisis. We analyzed a large set of 150 features related to bank client, product and social-economic attributes. A semi-automatic feature selection was explored in the modeling phase, performed with the data prior to July 2012, which allowed us to select a reduced set of 22 features. We also compared four DM models: logistic regression, decision trees (DTs), neural network (NN) and support vector machine. Using two metrics, area of the receiver operating characteristic curve (AUC) and area of the LIFT cumulative curve (ALIFT), the four models were tested on an evaluation set, using the most recent data (after July 2012) and a rolling window scheme. The NN presented the best results (AUC=0.8 and ALIFT=0.7), allowing 79% of the subscribers to be reached by selecting the better-classified half of the clients. Also, two knowledge extraction methods, a sensitivity analysis and a DT, were applied to the NN model and revealed several key attributes (e.g., Euribor rate, direction of the call and bank agent experience). Such knowledge extraction confirmed the obtained model as credible and valuable for telemarketing campaign managers.
• Assessment of a real problem of bank telemarketing to sell long-term deposits.
• A data-driven approach using newly proposed social and economic characteristics.
• Focus on feature engineering, resulting in a highly tuned model of 22 features.
• Comparison of four data mining models under a realistic rolling-window scheme.
• Results allow targeting 79% of buyers by selecting the better-classified half of the clients.
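As an illustration of the two metrics named in the abstract above, the following minimal sketch computes the AUC and the area under the cumulative LIFT curve (ALIFT) on toy data. The function names, the toy labels and scores, and the trapezoidal integration are assumptions made for illustration; they are not taken from the paper.

```python
def auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cumulative_lift_curve(y_true, scores):
    """Points (fraction of clients selected, fraction of positives reached).

    Clients are taken in order of decreasing predicted score.
    """
    ranked = [y for _, y in sorted(zip(scores, y_true), key=lambda t: -t[0])]
    total_pos = sum(ranked)
    points = [(0.0, 0.0)]
    found = 0
    for i, y in enumerate(ranked, start=1):
        found += y
        points.append((i / len(ranked), found / total_pos))
    return points


def alift(y_true, scores):
    """Area under the cumulative LIFT curve, by the trapezoidal rule."""
    pts = cumulative_lift_curve(y_true, scores)
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


# Hypothetical toy data: 1 = client subscribed, 0 = client did not subscribe.
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

if __name__ == "__main__":
    print("AUC:", auc(toy_labels, toy_scores))
    print("ALIFT:", alift(toy_labels, toy_scores))
    # Share of subscribers reached when contacting the best-scored half.
    half = cumulative_lift_curve(toy_labels, toy_scores)[len(toy_labels) // 2]
    print("Reached at 50%:", half[1])
```

A statement such as "79% of the subscribers reached by selecting the better-classified half" corresponds to reading the cumulative LIFT curve at the 50% point, as the last lines of the sketch do for the toy data.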
• A recent review on the application of business intelligence to the banking domain.
• Coverage of the last twelve years of scientific literature on those subjects.
• Usage of text mining and the latent Dirichlet allocation to analyze articles.
• Provide new insights and future research trends which may benefit banking business.
This paper analyzes recent literature in the search for trends in business intelligence applications for the banking industry. Searches were performed in relevant journals, resulting in 219 articles published between 2002 and 2013. To analyze such a large number of manuscripts, text mining techniques were used in pursuit of relevant terms in both the business intelligence and banking domains. Moreover, latent Dirichlet allocation modeling was used to group articles into several relevant topics. The analysis was conducted using a dictionary of terms belonging to both banking and business intelligence domains. This procedure allowed the identification of relationships between terms and the topics grouping articles, from which hypotheses regarding research directions emerged. To confirm these hypotheses, relevant articles were collected and scrutinized, which validated the text mining procedure. The results show that credit in banking is clearly the main application trend, particularly predicting risk and thus supporting credit approval or denial. There is also a relevant interest in bankruptcy and fraud prediction. Customer retention seems to be associated, although weakly, with targeting, justifying bank offers to reduce churn. In addition, a large number of articles focused more on business intelligence techniques and their applications, using the banking industry just for evaluation and thus not clearly claiming benefits for the banking business. By identifying these current research topics, this study also highlights opportunities for future research.
This study presents a research approach using data mining for predicting the performance metrics of posts published in brands' Facebook pages. Twelve post performance metrics extracted from a cosmetic company's page including 790 publications were modeled, with the two best results achieving a mean absolute percentage error of around 27%. One of them, the “Lifetime Post Consumers” model, was assessed using sensitivity analysis to understand how each of the seven input features influenced it (category, page total likes, type, month, hour, weekday, paid). The type of content was considered the most relevant feature for the model, with a relevance of 36%. A status post captures around twice the attention of the remaining three types (link, photo, video). We have drawn a decision process flow from the “Lifetime Post Consumers” model, which, by complementing the sensitivity analysis information, may be used to support managers' decisions on whether to publish a post.
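For readers unfamiliar with the error metric quoted above, the following minimal sketch shows how a mean absolute percentage error (MAPE) is commonly computed. The function name and the toy numbers are hypothetical and do not come from the study.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes every actual value is non-zero, since each error is
    divided by the corresponding actual value.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100 * total / len(actual)


# Hypothetical toy values for a post metric: observed versus predicted.
observed = [100, 200]
forecast = [110, 180]

if __name__ == "__main__":
    print("MAPE (%):", mape(observed, forecast))
```

Under this definition, a MAPE of around 27% means that predictions deviate from the observed metric by roughly 27% of its value on average.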
This research aims to assess air travelers' concerns affected by the Coronavirus pandemic, as expressed in the comments they wrote online. A sample of 639 comments written on the Italian National Consumer Union website and related to the airline industry was assessed through an automated sentiment analysis. The results showed that travelers' concerns were directed mainly towards compensation, cancellations, and COVID-19, and that, at the same time, travelers had mixed and unpredictable feelings. This suggests that consumers may have understood that airline companies are facing unsustainable cash-flow and revenue situations. Moreover, all our hypotheses, grounded on existing literature, were refuted. Accordingly, we argue that the current context prevents assessments based on previous assumptions, and studies related to the impact of COVID-19 need to be conducted anew.
This study presents an enhanced automated approach based on literature analysis and synthesis for establishing the dimensions of the ethnic marketing literature, covering a set of 239 journal articles published by nine major publishers. The approach reported is enhanced by two novel procedures to address previously identified limitations, namely: definition of a relevant dictionary based on both a sufficient lexicon extracted from a definition of the core theme and a conditional dictionary, with related but non-core terms; and a visually appealing pictorial representation to summarize the discovered topics. The application of the method to ethnic marketing indicates that ethnic marketing research is characterized by high conceptual heterogeneity, although a clear definition of “ethnic marketing” is imperative for research development. Overall, the paper advances an approach with considerable scalability advantages when compared with extant approaches, an important issue to consider when textual sources become big data.
Given the research interest in Big Data in Marketing, we present a research literature analysis based on a semi-automated text mining approach with the goal of identifying the main trends in this domain. In particular, the analysis focuses on relevant terms and topics related to five dimensions: Big Data, Marketing, Geographic location of authors’ affiliation (countries and continents), Products, and Sectors. A total of 1560 articles published from 2010 to 2015 were scrutinized. The findings revealed that research is bipartite between technological and research domains, with Big Data publications not clearly aligning cutting-edge techniques toward Marketing benefits. Also, few inter-continental co-authored publications were found. Moreover, findings show that research in Big Data applications to Marketing is still in an embryonic stage, thus making it essential to develop more direct efforts toward business for Big Data to thrive in the Marketing arena.
The development of the Internet and mobile devices enabled the emergence of travel and hospitality review sites, leading to a large number of customer opinion posts. While such comments may influence future demand of the targeted hotels, they can also be used by hotel managers to improve customer experience. In this article, sentiment classification of an eco-hotel is assessed through a text mining approach using several different sources of customer reviews. The latent Dirichlet allocation modeling algorithm is applied to gather relevant topics that characterize a given hospitality issue by a sentiment. Several findings were unveiled including that hotel food generates ordinary positive sentiments, while hospitality generates both ordinary and strong positive feelings. Such results are valuable for hospitality management, validating the proposed approach.
The automatic classification of abstract sentences into their main elements (background, objectives, methods, results, conclusions) is a key tool to support scientific database querying, to summarize relevant literature works and to assist in the writing of new abstracts. In this paper, we propose a novel deep learning approach based on a convolutional layer and a bidirectional gated recurrent unit to classify sentences of abstracts. First, the proposed neural network was tested on a publicly available repository containing 20 thousand abstracts from the biomedical domain. Competitive results were achieved, with weight-averaged Precision, Recall and F1-score values around 91%, and an area under the ROC curve (AUC) of 99%, which are higher when compared to a state-of-the-art neural network. Then, a crowdsourcing approach using gamification was adopted to create a new comprehensive set of 4111 classified sentences from the computer science domain, focused on social media abstracts. The results of applying the same deep learning modeling technique trained with 3287 (80%) of the available sentences were below the ones obtained for the larger biomedical dataset, with weight-averaged Precision, Recall and F1-score values between 73 and 76%, and an AUC of 91%. Considering the dataset dimension as a likely important factor for such performance decrease, a data augmentation approach was further applied. This involved the use of text mining to translate sentences of the computer science abstract corpus while retaining the same meaning. This approach resulted in slight improvements (around 2 percentage points) for the weight-averaged Recall and F1-score values.
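The weight-averaged Precision, Recall and F1-score mentioned above are per-class values averaged with weights proportional to each class's number of true sentences. The following minimal sketch shows one common way to compute them; the function name and the toy labels are hypothetical, and the paper's exact evaluation code is not known here.

```python
from collections import Counter


def weighted_prf(y_true, y_pred):
    """Support-weighted average of per-class precision, recall and F1."""
    support = Counter(y_true)  # number of true instances per class
    n = len(y_true)
    w_precision = w_recall = w_f1 = 0.0
    for label, count in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        weight = count / n
        w_precision += weight * precision
        w_recall += weight * recall
        w_f1 += weight * f1
    return w_precision, w_recall, w_f1


# Hypothetical toy labels using element names from the abstract.
true_labels = ["background", "methods", "methods", "results"]
pred_labels = ["background", "methods", "results", "results"]

if __name__ == "__main__":
    p, r, f = weighted_prf(true_labels, pred_labels)
    print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

With this weighting, the weighted Recall equals the overall accuracy, which is one reason the three weighted values often lie close together.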
Digital journalism has faced a dramatic change and media companies are challenged to use data science algorithms to be more competitive in a Big Data era. While this is a relatively new area of study in the media landscape, the use of machine learning and artificial intelligence has increased substantially over the last few years. In particular, the adoption of data science models for personalization and recommendation has attracted the attention of several media publishers. Following this trend, this paper presents a research literature analysis on the role of Data Science (DS) in Digital Journalism (DJ). Specifically, the aim is to present a critical literature review, synthesizing the main application areas of DS in DJ, highlighting research gaps, challenges, and opportunities for future studies. Through a systematic literature review integrating bibliometric search, text mining, and qualitative discussion, the relevant literature was identified and extensively analyzed. The review reveals an increasing use of DS methods in DJ, with almost 47% of the research being published in the last three years. A hierarchical clustering highlighted six main research domains focused on text mining, event extraction, online comment analysis, recommendation systems, automated journalism, and exploratory data analysis along with some machine learning approaches. Future research directions comprise developing models to improve personalization and engagement features, exploring recommendation algorithms, testing new automated journalism solutions, and improving paywall mechanisms.
Health information systems have been developed to help hospital managers steer daily operations, including key performance indicators (KPIs) for monitoring on a time-aggregated basis. Yet, current literature lacks proposals for productivity dashboards to assist hospital stakeholders. This research focuses on two related problems: (1) hospital organizations need access to productivity information to improve access to services; and (2) managers need productivity information to optimize resource allocation. This research consists of the development of dashboards to monitor information obtained from a hospital organization to support decision makers. To develop and evaluate the productivity dashboard, the Design Science Research (DSR) methodology was adopted. The dashboard was evaluated by stakeholders of a large Portuguese hospital who contributed to iteratively improving its design toward a useful decision support tool. Additionally, it was ascertained that monitoring productivity needs more study and that dashboards on these themes are valuable assets for monitoring and for the subsequent decision-making process.