Cumulative sum (CUSUM) plots and methods have wide‐ranging applications in healthcare. We review and discuss some issues related to the analysis of surgical learning curve (LC) data with a focus on three types of CUSUM statistical approaches. The underlying assumptions, benefits, and weaknesses of each approach are given. Our primary conclusion is that two types of CUSUM methods are useful in providing visual aids, but are subject to overinterpretation due to the lack of well‐defined decision rules and performance metrics. The third type is based on plotting the CUSUM of the differences between observations and their average value. We show that this commonly applied retrospective method is frequently interpreted incorrectly and is thus unhelpful in the LC application. Curve‐fitting methods are more suitable for meeting many of the goals associated with the study of surgical LCs.
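For reference, the third approach plots the running sum of deviations of each observation from the overall average, S_n = Σ_{i≤n}(x_i − x̄). The Python sketch below illustrates that construction on hypothetical operative-time data; it is an illustration of the general formula, not the authors' code, and the variable names are ours.

```python
import numpy as np

def cusum_about_mean(x):
    """CUSUM of deviations from the series average: S_n = sum_{i<=n} (x_i - x_bar).

    Because x_bar is computed from the full series, the final value S_N is
    always zero, which is one reason such retrospective plots are easy to
    over-read as evidence of a learning effect.
    """
    x = np.asarray(x, dtype=float)
    return np.cumsum(x - x.mean())

# Hypothetical operative times (minutes) for a sequence of consecutive cases.
times = [112, 105, 98, 101, 95, 90, 93, 88, 85, 87]
print(cusum_about_mean(times))
```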
We provide an overview and perspective on the Phase I collection and analysis of data for use in process improvement and control charting. In Phase I, the focus is on understanding the process variability, assessing the stability of the process, investigating process-improvement ideas, selecting an appropriate in-control model, and providing estimates of the in-control model parameters. In our article, we review and synthesize many of the important developments that pertain to the analysis of process data in Phase I. We give our view of the major issues and developments in Phase I analysis. We identify the current best practices and some opportunities for future research in this area.
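As a generic illustration of two Phase I tasks mentioned above, estimating in-control parameters and screening for stability, the Python sketch below computes individuals-chart limits from a Phase I sample using the standard moving-range estimate of sigma. It is a simplified placeholder, not the procedure recommended in the article; in practice the screen-and-re-estimate cycle is iterated and further diagnostics are examined before an in-control model is adopted.

```python
import numpy as np

def phase1_individuals_screen(x, k=3.0):
    """Rudimentary Phase I screen for an individuals chart.

    Estimates the in-control mean and standard deviation (moving-range
    estimate sigma_hat = MR_bar / 1.128, where 1.128 is the d2 constant
    for ranges of size 2) and flags observations outside k-sigma limits.
    """
    x = np.asarray(x, dtype=float)
    sigma_hat = np.abs(np.diff(x)).mean() / 1.128
    center = x.mean()
    lcl, ucl = center - k * sigma_hat, center + k * sigma_hat
    flagged = np.where((x < lcl) | (x > ucl))[0]
    return center, sigma_hat, (lcl, ucl), flagged
```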
Finding the dominant cause(s) of variation in process improvement projects is an important task. Before trying to reduce variation in the dominant cause or mitigate the effect of variation in the dominant cause to reduce output variation, it is strongly recommended that we verify we have identified the true (dominant) cause. This article is about how best to verify we have correctly identified a dominant cause, as the existing literature does not properly answer this question. Although it may seem that a randomized controlled experiment is sufficient for this purpose, we show that experimental studies alone cannot provide all the required information. An experiment identifies whether a suspect is a cause of variation; however, we also require additional information (i.e., from observational studies) to determine whether it is dominant and not just significant. This article lists some viable composite study designs, assesses their relative merits, and recommends proper sample sizes. We also investigate how to systematically conduct a verification study in the era of smart manufacturing. Moreover, we provide a tangible example to illustrate our proposed procedure.
Background. Risk-adjusted control charts have become popular for monitoring processes that involve the management and treatment of patients in hospitals or other healthcare institutions. However, to date, the effect of estimation error on risk-adjusted control charts has not been studied. Methods. We studied the effect of estimation error on risk-adjusted binary cumulative sum (CUSUM) performance using actual and simulated data on patients undergoing coronary artery bypass surgery and assessed for mortality up to 30 days post-surgery. The effect of estimation error was indicated by the variability of the 'true' average run lengths (ARLs) obtained using repeated sampling of the observed data under various realistic scenarios. Results. Results showed that estimation error can have a substantial effect on risk-adjusted CUSUM chart performance in terms of variation of true ARLs. Moreover, the performance was highly dependent on the number of events used to derive the control chart parameters and the specified ARL for an in-control process (ARL₀). However, the results suggest that it is the uncertainty in the overall adverse event rate that is the main component of estimation error. Conclusions. When designing a control chart, the effect of estimation error could be taken into account by generating a number of bootstrap samples of the available Phase I data and then determining the control limit needed to obtain an ARL₀ of a pre-specified level 95% of the time. If limited Phase I data are available, it may be advisable to continue to update model parameters even after prospective patient monitoring is implemented.
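As context for the chart being studied, the Python sketch below implements one common risk-adjusted binary CUSUM formulation (likelihood-ratio weights in the spirit of Steiner et al.), in which each patient's predicted risk p_t comes from the Phase I risk model and the chart accumulates evidence of an increase in the odds of an adverse event. It is a sketch under those assumptions, not the paper's code; the control limit h would be chosen, for example by simulation or by bootstrapping the Phase I data as the conclusions suggest, to achieve the desired ARL₀.

```python
import numpy as np

def risk_adjusted_cusum(y, p, odds_ratio=2.0):
    """Risk-adjusted binary CUSUM with likelihood-ratio weights.

    y : 0/1 outcomes (e.g., 30-day mortality) in patient order.
    p : patient-specific predicted risks from the Phase I risk model.
    The chart accumulates evidence that the odds of an adverse event
    have increased by the factor `odds_ratio`.
    """
    y = np.asarray(y, dtype=float)
    p = np.asarray(p, dtype=float)
    denom = 1.0 - p + odds_ratio * p
    w = np.where(y == 1, np.log(odds_ratio / denom), -np.log(denom))
    s, running = np.zeros(len(y)), 0.0
    for t, wt in enumerate(w):
        running = max(0.0, running + wt)
        s[t] = running
    return s  # signal when s[t] exceeds a control limit h chosen to meet ARL0
```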
A variety of random graph models have been proposed in the literature to model the associations within an interconnected system and to realistically account for various structures and attributes of such systems. In particular, much research has been devoted to modeling the interaction of humans within social networks. However, such networks in real-life tend to be extremely sparse and existing methods do not adequately address this issue. In this article, we propose an extension to ordinary and degree corrected stochastic blockmodels that accounts for a high degree of sparsity. Specifically, we propose hurdle versions of these blockmodels to account for community structure and degree heterogeneity in sparse networks. We use simulation to ensure parameter estimation is consistent and precise, and we propose the use of likelihood ratio-type tests for model selection. We illustrate the necessity for hurdle blockmodels with a small research collaboration network as well as the infamous Enron E-mail exchange network. Methods for determining goodness of fit and performing model selection are also proposed.
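To make the modeling idea concrete, the Python sketch below simulates from one plausible hurdle blockmodel: a Bernoulli hurdle decides whether a dyad has any interaction at all (capturing sparsity), and, conditional on clearing the hurdle, a zero-truncated Poisson generates the interaction count. The block structure, parameter names, and distributional choices here are our illustrative assumptions; the article's likelihood and its degree-corrected variant may differ in detail.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_hurdle_sbm(z, edge_prob, rate):
    """Simulate an undirected network from an illustrative hurdle blockmodel.

    z         : block label (0..K-1) for each node.
    edge_prob : K x K matrix of probabilities that a dyad interacts at all.
    rate      : K x K matrix of Poisson rates for interaction counts,
                used only when the hurdle is cleared (zero-truncated draw).
    """
    n = len(z)
    A = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < edge_prob[z[i], z[j]]:   # hurdle: any edge at all?
                k = 0
                while k == 0:                          # zero-truncated Poisson
                    k = rng.poisson(rate[z[i], z[j]])
                A[i, j] = A[j, i] = k
    return A
```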
• We present an analysis of over 28 million vehicle trips.
• We introduce a novel case/control methodology for studying automotive telematics data and the associated risk of a driver's behaviour.
• We find that speeding is the driver behaviour most strongly linked to crash risk.
Usage-based insurance schemes provide new opportunities for insurers to accurately price and manage risk. These schemes have the potential to better identify risky drivers, which not only allows insurance companies to price their products more accurately but also allows drivers to modify their behaviour, making roads safer and driving more efficient. However, for usage-based insurance products, we need to better understand how driver behaviours influence the risk of a crash or an insurance claim. In this article, we present our analysis of automotive telematics data from over 28 million trips. We use a case-control methodology to study the relationship between crash drivers and crash-free drivers and introduce an innovative method for determining control (crash-free) drivers. We fit a logistic regression model to our data and find that speeding is the driver behaviour most strongly linked to crash risk.
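To show the general shape of such an analysis, the Python sketch below fits a logistic regression to fabricated driver-level summaries; the feature names, sampling scheme, and simulated data are purely illustrative placeholders, not the article's variables. Under case-control sampling the intercept is not interpretable, but the slope coefficients still estimate log odds ratios, which is why this design suits the crash versus crash-free comparison.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 500  # hypothetical drivers (cases and controls combined)
df = pd.DataFrame({
    "speeding_rate": rng.gamma(2.0, 0.02, n),   # share of driving time over the limit
    "harsh_braking": rng.poisson(3, n),         # events per 100 km
    "night_share":   rng.uniform(0.0, 0.4, n),  # share of night-time driving
})
# Simulated crash (case) vs. crash-free (control) labels, for illustration only.
lin = -2.5 + 25.0 * df["speeding_rate"] + 0.1 * df["harsh_braking"]
df["crash"] = rng.binomial(1, 1.0 / (1.0 + np.exp(-lin)))

X = sm.add_constant(df[["speeding_rate", "harsh_braking", "night_share"]])
fit = sm.Logit(df["crash"], X).fit(disp=False)
print(np.exp(fit.params))  # odds ratios; the intercept is not meaningful under case-control sampling
```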
Many approaches for solving problems in business and industry are based on analytics and statistical modeling. Analytical problem solving is driven by the modeling of relationships between dependent (Y) and independent (X) variables, and we discuss three frameworks for modeling such relationships: cause-and-effect modeling, popular in applied statistics and beyond; correlational predictive modeling, popular in machine learning; and deductive (first-principles) modeling, popular in business analytics and operations research. We aim to explain the differences between these types of models, and flesh out the implications of these differences for study design, for discovering potential X/Y relationships, and for the types of solution patterns that each type of modeling could support. We use our account to clarify the popular descriptive-diagnostic-predictive-prescriptive analytics framework, but extend it to offer a more complete model of the process of analytical problem solving, reflecting the essential differences between causal, correlational, and deductive models.