Appropriate risk management is crucial to ensure the competitiveness of financial institutions and the stability of the economy. One widely used financial risk measure is value-at-risk (VaR). VaR estimates based on linear and parametric models can lead to biased results or even an underestimation of risk due to time-varying volatility, skewness and leptokurtosis of financial return series. The paper proposes a nonlinear and nonparametric framework to forecast VaR that is motivated by overcoming the disadvantages of parametric models with a purely data-driven approach. Mean and volatility are modeled via support vector regression (SVR), where the volatility model is motivated by the standard generalized autoregressive conditional heteroscedasticity (GARCH) formulation. Based on this, VaR is derived by applying kernel density estimation (KDE). This approach allows for flexible tail shapes of the profit and loss distribution, adapts to a wide class of tail events and is able to capture complex structures regarding mean and volatility. The SVR-GARCH-KDE hybrid is compared to standard, exponential and threshold GARCH models coupled with different error distributions. To examine the performance in different markets, 1-day-ahead and 10-days-ahead forecasts are produced for different financial indices. Model evaluation using a likelihood-ratio-based test framework for interval forecasts and a test for superior predictive ability indicates that the SVR-GARCH-KDE hybrid performs competitively with benchmark models and significantly reduces potential losses, especially for 10-days-ahead forecasts. In particular, models coupled with a normal distribution are systematically outperformed.
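To make the SVR-GARCH-KDE idea concrete, the following minimal sketch (not the authors' code; the data, lag structure and all settings are illustrative assumptions) fits an SVR mean equation on lagged returns, an SVR volatility equation on lagged squared residuals in the spirit of GARCH(1,1), and reads the VaR quantile off a kernel density estimate of the standardized residuals:

import numpy as np
from sklearn.svm import SVR
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
returns = rng.standard_t(df=5, size=1500) * 0.01   # placeholder return series

# Step 1: SVR mean equation, r_t ~ f(r_{t-1})
X_mean, y_mean = returns[:-1].reshape(-1, 1), returns[1:]
svr_mean = SVR(kernel="rbf", C=1.0, epsilon=1e-4).fit(X_mean, y_mean)
resid = y_mean - svr_mean.predict(X_mean)

# Step 2: SVR volatility equation in the spirit of GARCH(1,1):
# squared residual regressed on the lagged squared residual
X_vol = resid[:-1].reshape(-1, 1) ** 2
y_vol = resid[1:] ** 2
svr_vol = SVR(kernel="rbf", C=1.0, epsilon=1e-6).fit(X_vol, y_vol)
sigma2 = np.clip(svr_vol.predict(X_vol), 1e-10, None)

# Step 3: KDE on standardized residuals, then read off the alpha-quantile
z = resid[1:] / np.sqrt(sigma2)
kde = gaussian_kde(z)
grid = np.linspace(z.min() - 1, z.max() + 1, 2000)
cdf = np.cumsum(kde(grid)); cdf /= cdf[-1]
alpha = 0.01
z_alpha = grid[np.searchsorted(cdf, alpha)]

# Step 4: 1-day-ahead VaR forecast from the latest observation
mu_next = svr_mean.predict(np.array([[returns[-1]]]))[0]
sig_next = np.sqrt(max(svr_vol.predict(np.array([[resid[-1] ** 2]]))[0], 1e-10))
var_1day = mu_next + sig_next * z_alpha
print(f"1-day-ahead {alpha:.0%} VaR: {var_1day:.4f}")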
The paper proposes a novel approach to predict intraday directional movements of currency pairs in the foreign exchange market based on news story events in the economy calendar. Prior work on using textual data for forecasting foreign exchange market developments does not consider economy calendar events. We consider a rich set of text analytics methods to extract information from news story events and propose a novel sentiment dictionary for the foreign exchange market. The paper shows how news events and corresponding news stories provide valuable information to increase forecast accuracy and inform trading decisions. More specifically, using textual data together with technical indicators as inputs to different machine learning models reveals that the accuracy of market predictions shortly after the release of news is substantially higher than in other periods, which suggests the feasibility of news-based trading. Furthermore, empirical results identify a combination of a gradient boosting algorithm, our new sentiment dictionary, and text features based on term frequency weighting to offer the most accurate forecasts. These findings are valuable for traders, risk managers and other consumers of foreign exchange market forecasts and offer guidance on how to design accurate prediction systems.
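A rough sketch of the modeling pipeline described above, assuming placeholder news texts, technical indicators and direction labels (none of this is the authors' data, dictionary or code): term-frequency text features are stacked with technical indicators and fed to a gradient boosting classifier.

import numpy as np
from scipy.sparse import hstack, csr_matrix
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Toy data: one row per news release window (texts, indicators and labels are placeholders)
news_texts = ["ECB raises rates unexpectedly", "US payrolls miss forecasts",
              "central bank signals easing", "strong GDP growth reported"] * 50
tech_indicators = np.random.default_rng(1).normal(size=(len(news_texts), 3))  # e.g. momentum, RSI, MA gap
direction = np.random.default_rng(2).integers(0, 2, size=len(news_texts))     # up/down after release

# Term-frequency text features combined with technical indicators
vectorizer = CountVectorizer(lowercase=True, stop_words="english")
X_text = vectorizer.fit_transform(news_texts)
X = hstack([X_text, csr_matrix(tech_indicators)])

X_tr, X_te, y_tr, y_te = train_test_split(X, direction, test_size=0.3, random_state=0)
model = GradientBoostingClassifier(n_estimators=200, learning_rate=0.05, max_depth=3)
model.fit(X_tr, y_tr)
print("directional accuracy:", accuracy_score(y_te, model.predict(X_te)))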
•We define and characterize Explainable AI for Operational Research (XAIOR).
•The 3 subdimensions of XAIOR are performance, attributable, and responsible analytics.
•Methods, algorithms, and applications of XAIOR in 6 major OR domains are reviewed.
•We present an ambitious agenda for XAIOR research.
The ability to understand and explain the outcomes of data analysis methods, with regard to aiding decision-making, has become a critical requirement for many applications. For example, in operational research domains, data analytics have long been promoted as a way to enhance decision-making. This study proposes a comprehensive, normative framework to define explainable artificial intelligence (XAI) for operational research (XAIOR) as a reconciliation of three subdimensions that constitute its requirements: performance, attributable, and responsible analytics. In turn, this article offers in-depth overviews of how XAIOR can be deployed through various methods with respect to distinct domains and applications. Finally, an agenda for future XAIOR research is defined.
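As a purely illustrative reading of the performance and attributable subdimensions (the choice of task, metric and attribution method below is an assumption for illustration, not part of the framework itself), one could pair an out-of-sample performance measure with a standard attribution technique such as permutation importance:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Toy decision-support task standing in for an OR prediction problem
X, y = make_classification(n_samples=1000, n_features=8, n_informative=4, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

model = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)

# Performance dimension: out-of-sample predictive quality
auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])

# Attributable dimension: which inputs drive the model's predictions
attr = permutation_importance(model, X_te, y_te, n_repeats=10, random_state=0)

print(f"AUC: {auc:.3f}")
for i in np.argsort(attr.importances_mean)[::-1]:
    print(f"feature {i}: importance {attr.importances_mean[i]:.3f}")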
E-mail tracking provides companies with fine-grained behavioral data about e-mail recipients, which can be a threat to individual privacy and enterprise security. This problem is especially severe since e-mail tracking techniques often gather data without the informed consent of the recipients. So far, e-mail recipients have lacked a reliable protection mechanism.
This article presents a novel protection framework against e-mail tracking that closes an important gap in the field of enterprise security and privacy-enhancing technologies. We conceptualize, implement and evaluate an anti-tracking mail server that is capable of identifying tracking images in e-mails via machine learning with very high accuracy, and can selectively replace them with arbitrary images containing warning messages for the recipient. Our mail protection framework implements a selective prevention strategy as enterprise-grade software using the design science research paradigm. It is flexibly extensible, highly scalable, and ready to be applied under actual production conditions. Experimental evaluations show that these goals are achieved through solid software design, adoption of recent technologies and the creation of novel flexible software components.
•Conceptualization of a novel protection framework against e-mail tracking.
•First server-side implementation of reliable e-mail tracking protection.
•Framework shows a high accuracy without any manual user effort.
•Framework is developed as enterprise-grade software (extensible, scalable).
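The classification step at the heart of such a framework could look roughly like the following sketch, which derives simple features from <img> tags and trains a classifier on a placeholder labeled set; the feature set, data and threshold logic are illustrative assumptions, not the implemented system:

import re
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def image_features(img_tag: str) -> list:
    """Derive simple features from an <img> tag (illustrative feature set only)."""
    width = re.search(r'width="?(\d+)', img_tag)
    height = re.search(r'height="?(\d+)', img_tag)
    src = re.search(r'src="([^"]+)"', img_tag)
    url = src.group(1) if src else ""
    return [
        int(width.group(1)) if width else -1,        # declared width (1x1 pixels are suspicious)
        int(height.group(1)) if height else -1,      # declared height
        len(url),                                    # long URLs often carry identifiers
        url.count("?") + url.count("&"),             # number of query parameters
        int(bool(re.search(r'[0-9a-f]{16,}', url)))  # hex-like token in the URL
    ]

# Placeholder training set: img tags labeled 1 = tracking image, 0 = regular image
train_tags = [
    '<img src="https://t.example.com/open?uid=3fa85f6457174562b3fc2c963f66afa6" width="1" height="1">',
    '<img src="https://cdn.example.com/newsletter/header.png" width="600" height="200">',
] * 50
train_labels = [1, 0] * 50

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(np.array([image_features(t) for t in train_tags]), train_labels)

# The mail server would score every image in an incoming e-mail and replace flagged ones
incoming = '<img src="https://track.example.net/p.gif?c=9b2d1e0f4a6c8e1a2b3c" width="1" height="1">'
print("tracking image" if clf.predict([image_features(incoming)])[0] == 1 else "regular image")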
► We devise a novel approach to determine effective parameter settings for metaheuristics by means of advanced regression methodology.
► The approach extracts useful information from data associated with the metaheuristic’s search history and characteristics of the underlying optimization problem.
► Empirical results indicate that the relationship between effective parameter settings and these types of information is sufficiently strong to be exploited for parameter tuning.
► Random Forest Regression is found to be particularly appropriate for the focal application and is recommended as prediction model for an automated parameter tuning system.
The paper is concerned with practices for tuning the parameters of metaheuristics. Settings such as the cooling factor in simulated annealing may greatly affect a metaheuristic’s efficiency as well as its effectiveness in solving a given decision problem. However, procedures for organizing parameter calibration are scarce and commonly limited to particular metaheuristics. We argue that the parameter selection task can appropriately be addressed by means of a data-mining-based approach. In particular, a hybrid system is devised, which employs regression models to learn suitable parameter values from past moves of a metaheuristic in an online fashion. In order to identify a suitable regression method and, more generally, to demonstrate the feasibility of the proposed approach, a case study of particle swarm optimization is conducted. Empirical results suggest that characteristics of the decision problem as well as search history data indeed embody information that allows suitable parameter values to be determined, and that this type of information can successfully be extracted by means of nonlinear regression models.
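A minimal sketch of the regression-based tuning idea, assuming synthetic records of past particle swarm optimization runs (problem characteristics, search-history statistics, the inertia weight used, and the observed improvement); a random forest then suggests the parameter value with the highest predicted improvement for the current search state. All variable names and the data-generating process are illustrative, not the paper's setup.

import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Placeholder training data gathered from past PSO runs: each row holds
# problem characteristics, search-history statistics and the parameter setting used
n = 2000
problem_dim = rng.integers(2, 50, n)            # problem characteristic
swarm_diversity = rng.uniform(0, 1, n)          # search-history statistic
stagnation_iters = rng.integers(0, 30, n)       # iterations without improvement
inertia_weight = rng.uniform(0.3, 1.0, n)       # candidate parameter value

# Target: observed improvement of the global best after using that setting (synthetic here)
improvement = (1 - np.abs(inertia_weight - (0.4 + 0.4 * swarm_diversity))
               + rng.normal(0, 0.1, n))

X = np.column_stack([problem_dim, swarm_diversity, stagnation_iters, inertia_weight])
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, improvement)

# Online tuning: given the current search state, pick the inertia weight with the
# highest predicted improvement from a grid of candidates
state = [10, 0.2, 5]                            # current dim, diversity, stagnation
candidates = np.linspace(0.3, 1.0, 15)
preds = model.predict([state + [w] for w in candidates])
print("suggested inertia weight:", candidates[np.argmax(preds)])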
•Introduce mathematical program of a transductive discrete support vector machine (tDSVM).
•Develop memetic algorithm to construct tDSVM classification models.
•Demonstrate effectiveness of the novel classifier and memetic algorithm through several empirical experiments.
Transductive learning involves the construction and application of prediction models to classify a fixed set of decision objects into discrete groups. It is a special case of classification analysis with important applications in web mining, corporate planning and other areas. This paper proposes a novel transductive classifier that is based on the philosophy of discrete support vector machines. We formalize the task of estimating the class labels of decision objects as a mixed integer program. A memetic algorithm is developed to solve the mathematical program and, thereby, to construct a transductive support vector machine classifier. Empirical experiments on synthetic and real-world data evidence the effectiveness of the new approach and demonstrate that it identifies high-quality solutions in a short time. Furthermore, the results suggest that the class predictions following from the memetic algorithm are significantly more accurate than the predictions of a CPLEX-based reference classifier. Comparisons to other transductive and inductive classifiers provide further support for our approach and suggest that it performs competitively with respect to several benchmarks.
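The following sketch conveys the memetic flavor of the approach on toy data; as a simplification it scores candidate label assignments with the training error of a continuous linear SVM rather than the paper's exact discrete, mixed-integer objective, and the evolutionary operators are deliberately basic. It is an illustrative stand-in, not the authors' algorithm.

import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# Toy transductive setting: a few labeled points, many unlabeled decision objects
X_lab = np.vstack([rng.normal(-2, 1, (10, 2)), rng.normal(2, 1, (10, 2))])
y_lab = np.array([0] * 10 + [1] * 10)
X_unl = np.vstack([rng.normal(-2, 1, (40, 2)), rng.normal(2, 1, (40, 2))])

def fitness(labels_unl):
    """Proxy for the discrete-SVM objective: training error of a linear SVM
    fitted on labeled plus pseudo-labeled data (lower is better)."""
    X = np.vstack([X_lab, X_unl])
    y = np.concatenate([y_lab, labels_unl])
    clf = LinearSVC(C=1.0, max_iter=5000).fit(X, y)
    return np.mean(clf.predict(X) != y)

def local_search(labels):
    """Memetic refinement: greedily flip single labels while that lowers the objective."""
    best, best_fit = labels.copy(), fitness(labels)
    for i in rng.permutation(len(labels)):
        cand = best.copy(); cand[i] = 1 - cand[i]
        f = fitness(cand)
        if f < best_fit:
            best, best_fit = cand, f
    return best, best_fit

# Simple evolutionary loop with crossover, mutation and local search
pop = [rng.integers(0, 2, len(X_unl)) for _ in range(8)]
for gen in range(5):
    scored = sorted((local_search(ind) for ind in pop), key=lambda t: t[1])
    parents = [s[0] for s in scored[:4]]
    children = []
    for _ in range(4):
        a, b = rng.choice(4, 2, replace=False)
        mask = rng.integers(0, 2, len(X_unl)).astype(bool)
        child = np.where(mask, parents[a], parents[b])          # uniform crossover
        flip = rng.random(len(X_unl)) < 0.05                    # mutation
        child[flip] = 1 - child[flip]
        children.append(child)
    pop = parents + children

best_labels, best_fit = min(((ind, fitness(ind)) for ind in pop), key=lambda t: t[1])
print("best training error:", best_fit)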
Leasing is a popular channel for marketing new cars. However, the pricing of leases is complicated because the leasing rate must embody an expectation of the car’s residual value after contract expiration. This paper develops resale price forecasting models in order to aid pricing decisions. One feature of the leasing business is that different forecast errors entail different costs. The primary objective of this paper is to identify effective ways of addressing cost asymmetry. Specifically, this paper contributes to the literature by (i) consolidating prior work in forecasting on asymmetric functions of the cost of errors; (ii) systematically evaluating previous approaches and comparing them to a new approach; and (iii) demonstrating that forecasting using asymmetric cost of error functions improves the quality of decision support in car leasing. For example, if the costs of overestimating resale prices are twice those of underestimating them, incorporating cost asymmetry into forecast model development reduces costs by about 8%.
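One standard way to incorporate such a lin-lin (asymmetric linear) cost, sketched below on synthetic data, is to forecast a conditional quantile: if overestimation costs c_o and underestimation costs c_u per unit of error, the cost-minimizing forecast is the c_u/(c_u+c_o) quantile, i.e. the 1/3 quantile when overestimation is twice as costly. Whether this is the specific approach favored in the paper is not asserted here; the code only illustrates the general principle on placeholder data.

import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

def asymmetric_cost(y_true, y_pred, cost_over=2.0, cost_under=1.0):
    """Lin-lin cost: overestimating the resale price is costlier than underestimating it."""
    err = y_pred - y_true
    return np.mean(np.where(err > 0, cost_over * err, -cost_under * err))

# Placeholder stand-in for car resale data (features: age, mileage, segment, ...)
X, y = make_regression(n_samples=3000, n_features=6, noise=20.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Symmetric baseline: squared-error loss
baseline = GradientBoostingRegressor(loss="squared_error", random_state=0).fit(X_tr, y_tr)

# Asymmetry-aware model: with over-forecast costs twice under-forecast costs,
# the cost-minimizing prediction is the 1/3 conditional quantile
asym = GradientBoostingRegressor(loss="quantile", alpha=1/3, random_state=0).fit(X_tr, y_tr)

print("cost, symmetric model:", asymmetric_cost(y_te, baseline.predict(X_te)))
print("cost, asymmetric model:", asymmetric_cost(y_te, asym.predict(X_te)))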
The Price of Privacy. Baumann, Annika; Haupt, Johannes; Gebert, Fabian ... Business & Information Systems Engineering, Vol. 61, No. 4, 08/2019. Journal article, peer reviewed.
The analysis of clickstream data facilitates the understanding and prediction of customer behavior in e-commerce. Companies can leverage such data to increase revenue. For customers and website users, on the other hand, the collection of behavioral data entails privacy invasion. The objective of the paper is to shed light on the trade-off between privacy and the business value of customer information. To that end, the authors review approaches to convert clickstream data into behavioral traits, which they call clickstream features, and propose a categorization of these features according to the potential threat they pose to user privacy. The authors then examine the extent to which different categories of clickstream features facilitate predictions of online user shopping patterns and approximate the marginal utility of using more privacy-adverse information in behavioral prediction models. Thus, the paper links the literature on user privacy to that on e-commerce analytics and takes a step toward an economic analysis of privacy costs and benefits. In particular, the results of empirical experimentation with large real-world e-commerce data suggest that the inclusion of short-term customer behavior based on session-related information leads to large gains in predictive accuracy and business performance, while storing and aggregating usage behavior over longer horizons has comparably less value.
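The marginal-utility comparison could be set up roughly as below, with synthetic stand-ins for session-level and long-horizon clickstream features (the feature definitions and the data-generating process are assumptions for illustration, not the paper's categorization or data):

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 5000

# Illustrative clickstream features, grouped by how privacy-sensitive they are
session = np.column_stack([           # current-session behavior (less sensitive)
    rng.poisson(8, n),                 # pages viewed this session
    rng.exponential(120, n),           # session duration in seconds
    rng.integers(0, 2, n),             # basket viewed this session
])
history = np.column_stack([           # long-horizon profile (more sensitive)
    rng.poisson(40, n),                # visits over the last 90 days
    rng.exponential(300, n),           # total past purchase value
    rng.integers(0, 24, n),            # typical hour of day of visits
])
# Synthetic purchase label mainly driven by session behavior
buy = (0.2 * session[:, 0] + 0.5 * session[:, 2] + 0.02 * history[:, 0]
       + rng.normal(0, 1, n)) > 2.5

def auc(features):
    clf = GradientBoostingClassifier(random_state=0)
    return cross_val_score(clf, features, buy, cv=5, scoring="roc_auc").mean()

print("session features only:    ", round(auc(session), 3))
print("plus long-horizon history:", round(auc(np.hstack([session, history])), 3))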
In many business applications, including online marketing and customer churn prevention, randomized controlled trials (RCTs) are conducted to investigate the effect of specific treatments (coupon offers, advertisement mailings, etc.). Such RCTs allow for the estimation of average treatment effects (ATE) as well as the training of (uplift) models for the heterogeneity of treatment effects between individuals. The problem with these RCTs is that they are costly, and this cost increases with the number of individuals included in the RCT. For this reason, there is research on how to conduct experiments involving a small number of individuals while still obtaining precise treatment effect estimates. We contribute to this literature a heteroskedasticity-aware stratified sampling (HS) scheme, which leverages the fact that different individuals have different noise levels in their outcomes and that precise treatment effect estimation requires more observations from the "high-noise" individuals than from the "low-noise" individuals. By theory as well as by empirical experiments, we demonstrate that our HS-sampling yields significantly more precise estimates of the ATE, improves uplift models and makes their evaluation more reliable compared to RCT data sampled completely at random. Due to the relative ease of application and the significant benefits, we expect HS-sampling to be valuable in many real-world applications.
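A simplified illustration of the underlying allocation principle, assuming only two strata with known noise levels and equal population shares (the paper's actual HS scheme may differ in detail): sampling proportionally to stratum noise reduces the variance of the ATE estimate relative to an equal split of the experimental budget.

import numpy as np

rng = np.random.default_rng(0)

# Two customer strata with equal treatment effect but different outcome noise
sigmas = np.array([1.0, 5.0])         # stratum noise levels (assumed known / pre-estimated)
true_effect = 2.0
budget = 400                          # total number of individuals in the experiment

def simulate_ate(n_per_stratum):
    """Run one RCT: within each stratum, half treated, half control; return the ATE estimate."""
    effects, weights = [], []
    for sigma, n in zip(sigmas, n_per_stratum):
        treat = true_effect + rng.normal(0, sigma, n // 2)
        ctrl = rng.normal(0, sigma, n // 2)
        effects.append(treat.mean() - ctrl.mean())
        weights.append(0.5)           # strata assumed equally large in the population
    return np.dot(effects, weights)

# Equal allocation: budget split evenly across strata
srs_alloc = np.array([budget // 2, budget // 2])
# Heteroskedasticity-aware allocation: sample sizes proportional to stratum noise
hs_alloc = np.round(budget * sigmas / sigmas.sum()).astype(int)

srs_estimates = [simulate_ate(srs_alloc) for _ in range(5000)]
hs_estimates = [simulate_ate(hs_alloc) for _ in range(5000)]
print("ATE std, equal allocation:      ", round(np.std(srs_estimates), 3))
print("ATE std, noise-aware allocation:", round(np.std(hs_estimates), 3))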
There are various applications in which companies need to decide to which individuals they should best allocate treatment. To support such decisions, uplift models are applied to predict treatment effects on an individual level. Based on the predicted treatment effects, individuals can be ranked and treatment allocation can be prioritized according to this ranking. An implicit assumption, which has not been doubted in the previous uplift modeling literature, is that this treatment prioritization approach tends to bring individuals with high treatment effects to the top and individuals with low treatment effects to the bottom of the ranking. In our research, we show that heteroskedasticity in the training data can bias the uplift model ranking: individuals with the highest treatment effects can accumulate in large numbers at the bottom of the ranking. We explain theoretically how heteroskedasticity can bias the ranking of uplift models and show this process in a simulation and on real-world data. We argue that this problem of ranking bias due to heteroskedasticity might occur in many real-world applications and requires a modification of the treatment prioritization approach to achieve an efficient treatment allocation.
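A setup one could use to probe this effect is sketched below: a segment with both high true uplift and high outcome noise is simulated, a standard two-model uplift estimator is fitted, and the segment's share in the bottom and top of the resulting ranking is inspected. The data-generating process and model choice are illustrative assumptions and are not taken from the paper.

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 20000

# Covariates; one segment has both a high true treatment effect and high outcome noise
X = rng.normal(size=(n, 5))
high_segment = X[:, 0] > 1.0
tau = np.where(high_segment, 3.0, 1.0)          # true individual treatment effects
sigma = np.where(high_segment, 10.0, 1.0)       # heteroskedastic outcome noise
treated = rng.integers(0, 2, n).astype(bool)    # randomized treatment assignment
y = tau * treated + rng.normal(0, sigma, n)

# Two-model uplift approach: separate outcome models for treated and control
m_t = GradientBoostingRegressor(random_state=0).fit(X[treated], y[treated])
m_c = GradientBoostingRegressor(random_state=0).fit(X[~treated], y[~treated])
uplift_pred = m_t.predict(X) - m_c.predict(X)

# Inspect where the high-effect (but high-noise) individuals land in the ranking
order = np.argsort(uplift_pred)                  # ascending: bottom of the ranking first
decile = n // 10
print("share of high-effect segment overall:         ", round(high_segment.mean(), 3))
print("share of high-effect segment in bottom decile:", round(high_segment[order[:decile]].mean(), 3))
print("share of high-effect segment in top decile:   ", round(high_segment[order[-decile:]].mean(), 3))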