The resource allocation problem (RAP) determines a solution to optimally allocate limited resources to several activities or tasks. In this study, we propose a novel resource allocation problem referred to as the multi-period non-shareable resource allocation problem (MNRAP), which is motivated by the characteristics of resources considered in the stem cell culture process for producing stem cell therapeutics. A resource considered in the MNRAP has the following three characteristics: (i) the resource consumption required to perform an activity and the available resource capacity may change over time; (ii) multiple activities cannot share one resource; and (iii) resource requirements can be satisfied through a combination of different types of resources. The MNRAP selects some of the given activities to maximize the overall profit under limited resources with these characteristics. To address this problem, pattern-based integer programming formulations based on the concept of resource patterns are proposed. These formulations attempt to overcome the limitations of a compact integer programming formulation, which is difficult to use on large-scale problems owing to its complexity. Further, based on a branch-and-price approach to solving the pattern-based formulations, effective heuristic algorithms are proposed to provide high-quality solutions for large instances. Moreover, computational experiments on a wide range of instances, including real-world instances, demonstrate the superiority of the proposed formulations and heuristic algorithms.
•Multi-period non-shareable resource allocation problem is introduced.
•Integer optimization models based on the concept of a resource pattern are proposed.
•Heuristic algorithms based on the branch-and-price framework are also proposed.
•The performances of proposed algorithms are demonstrated for various instances.
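The three resource characteristics listed in the abstract can be made concrete with a tiny brute-force sketch. This is not the pattern-based formulation or the branch-and-price heuristic from the paper, which the abstract does not spell out. The function names, the data, and the assumption that resources may be reassigned independently in each period are all hypothetical, chosen only for illustration.

```python
from itertools import combinations, product

def feasible_in_period(requirements, capacities):
    """True if each activity can be covered by its own disjoint set of resources."""
    n_act = len(requirements)
    # Each resource goes to exactly one activity, or to index n_act ("unused").
    # This encodes non-shareability: no resource serves two activities.
    for assignment in product(range(n_act + 1), repeat=len(capacities)):
        covered = [0] * n_act
        for cap, owner in zip(capacities, assignment):
            if owner < n_act:
                covered[owner] += cap  # several resources may be combined
        if all(c >= r for c, r in zip(covered, requirements)):
            return True
    return False

def best_selection(profits, demand, capacities):
    """demand[a][t]: requirement of activity a in period t.
    capacities[t]: capacities of the resources available in period t."""
    best_profit, best_set = 0, ()
    n = len(profits)
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            ok = all(
                feasible_in_period([demand[a][t] for a in subset], capacities[t])
                for t in range(len(capacities))
            )
            profit = sum(profits[a] for a in subset)
            if ok and profit > best_profit:
                best_profit, best_set = profit, subset
    return best_profit, best_set

# Hypothetical toy data: three activities, two periods, two resources of capacity 4.
profits = [10, 7, 6]
demand = [[3, 3], [3, 2], [2, 3]]
capacities = [[4, 4], [4, 4]]
result = best_selection(profits, demand, capacities)
```

In this toy instance the total demand in each period (8) does not exceed the total capacity (8), yet all three activities cannot run together because only two resources exist and none can be shared. Enumeration of this kind grows exponentially, which is the scaling problem that motivates decomposition approaches for large instances.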
SPARSE HIGH-DIMENSIONAL REGRESSION Bertsimas, Dimitris; Van Parys, Bart
The Annals of Statistics, 02/2020, Volume 48, Issue 1
Journal Article, peer reviewed, open access
We present a novel binary convex reformulation of the sparse regression problem that constitutes a new duality perspective. We devise a new cutting plane method and provide evidence that it can solve to provable optimality the sparse regression problem for sample sizes n and number of regressors p in the 100,000s, that is, two orders of magnitude better than the current state of the art, in seconds. The ability to solve the problem for very high dimensions allows us to observe new phase transition phenomena. Contrary to traditional complexity theory which suggests that the difficulty of a problem increases with problem size, the sparse regression problem has the property that as the number of samples n increases the problem becomes easier in that the solution recovers 100% of the true signal, and our approach solves the problem extremely fast (in fact faster than Lasso), while for small number of samples n, our approach takes a larger amount of time to solve the problem, but importantly the optimal solution provides a statistically more relevant regressor. We argue that our exact sparse regression approach presents a superior alternative over heuristic methods available at present.
Optimal classification trees Bertsimas, Dimitris; Dunn, Jack
Machine Learning, 07/2017, Volume 106, Issue 7
Journal Article, peer reviewed, open access
State-of-the-art decision tree methods apply heuristics recursively to create each split in isolation, which may not capture well the underlying characteristics of the dataset. The optimal decision tree problem attempts to resolve this by creating the entire decision tree at once to achieve global optimality. In the last 25 years, algorithmic advances in integer optimization coupled with hardware improvements have resulted in an astonishing 800 billion factor speedup in mixed-integer optimization (MIO). Motivated by this speedup, we present optimal classification trees, a novel formulation of the decision tree problem using modern MIO techniques that yields the optimal decision tree for axes-aligned splits. We also show the richness of this MIO formulation by adapting it to give optimal classification trees with hyperplanes, which generates optimal decision trees with multivariate splits. Synthetic tests demonstrate that these methods recover the true decision tree more closely than heuristics, refuting the notion that optimal methods overfit the training data. We comprehensively benchmark these methods on a sample of 53 datasets from the UCI machine learning repository. We establish that these MIO methods are practically solvable on real-world datasets with sizes in the 1000s, and give average absolute improvements in out-of-sample accuracy over CART of 1–2% and 3–5% for the univariate and multivariate cases, respectively. Furthermore, we identify that optimal classification trees are likely to outperform CART by 1.2–1.3% in situations where the CART accuracy is high and we have sufficient training data, while the multivariate version outperforms CART by 4–7% when the CART accuracy or dimension of the dataset is low.
In this paper, we investigate dynamic traffic optimization in railway systems, i.e., the behavior of these systems through time when their movements are dictated by solutions to optimization models with finite horizons. As interactions between trains are not considered beyond the limits of finite horizons, the danger of leading the system into a deadlock arises. In this paper we present new procedures to establish finite prediction horizons that are formally guaranteed to operate the system in a way that is compatible with the physical constraints of the network while avoiding deadlocking and minimizing computations. The key to this result is the notion of recursive feasibility. This paper introduces conditions sufficient to attain it. We then discuss several important ramifications of recursive feasibility that enable efficient computations. We examine the possibility of decomposing the underlying optimization models into smaller models with shorter horizons, or into models that only consider subsets of all trains. We also discuss warm starting and anytime approaches. We finally perform numerical experiments verifying these results on models that include a real-world railway system used for freight transport. On harder instances, some of our approaches outperform solving the same models as monolithic MILPs by more than two orders of magnitude in terms of median computation times, while also achieving better worst-case optimality gaps.
•Mixed-integer linear programming model for optimizing renewable energy storage.
•A clustering algorithm to approximate the optimal levelized cost of electricity.
•Case study on providing power under different demand profiles in New York City.
•Motivation for including backup storage options in addition to battery.
Intermittent solar and wind availabilities pose design and operational challenges for renewable power systems because they are asynchronous with consumer demand. To align this supply-demand mismatch, optimization-based design and scheduling models have been developed to minimize the capital and operational costs associated with power production and energy storage. However, hourly time discretization and large time horizons used to describe short- and long-term solar and wind dynamics, demand fluctuations, and price changes significantly increase the computational burden of solving these models. A decomposition algorithm based on agglomerative hierarchical clustering (AHC) is developed to alleviate the model complexity and optimize the system over representative time periods, instead of every hour. An advantage of AHC compared to other clustering methods is the preservation of time chronology, which is important for energy storage applications. The algorithm is applied to investigate a renewable power system with battery storage in New York City. Results show that a few representative time periods (5–15 days) sufficiently capture the system performance within 5% of the true optimal solution. The decomposition algorithm is suitable for investigating any optimization problem with time series data.
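One way to see how agglomerative clustering can preserve time chronology is to allow merges only between neighbouring time periods. The sketch below is a minimal illustration under that assumption. The merge criterion (smallest difference between the means of adjacent clusters), the function name, and the data are hypothetical and are not taken from the paper.

```python
def chronological_clusters(series, n_clusters):
    """Merge adjacent periods until n_clusters contiguous segments remain."""
    # Start with one cluster per time period, kept in time order.
    clusters = [[v] for v in series]
    while len(clusters) > n_clusters:
        means = [sum(c) / len(c) for c in clusters]
        # Only neighbouring clusters are merge candidates, so every cluster
        # always covers a contiguous block of time.
        i = min(range(len(clusters) - 1),
                key=lambda j: abs(means[j] - means[j + 1]))
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
    return clusters

# Hypothetical hourly availability values with a high plateau in the middle.
series = [1.0, 1.2, 5.0, 5.1, 0.9, 1.1]
segments = chronological_clusters(series, 3)
```

A clustering that ignored time order would likely group the first two and last two values together because they are numerically similar. Restricting merges to neighbours keeps them as separate segments, which matters when a storage level carries over from one period to the next.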
Unit commitment, one of the most critical tasks in electric power system operations, faces new challenges as the supply and demand uncertainty increases dramatically due to the integration of variable generation resources such as wind power and price responsive demand. To meet these challenges, we propose a two-stage adaptive robust unit commitment model for the security constrained unit commitment problem in the presence of nodal net injection uncertainty. Compared to the conventional stochastic programming approach, the proposed model is more practical in that it only requires a deterministic uncertainty set, rather than a hard-to-obtain probability distribution on the uncertain data. The unit commitment solutions of the proposed model are robust against all possible realizations of the modeled uncertainty. We develop a practical solution methodology based on a combination of a Benders decomposition type algorithm and the outer approximation technique. We present an extensive numerical study on the real-world large scale power system operated by the ISO New England. Computational results demonstrate the economic and operational advantages of our model over the traditional reserve adjustment approach.
The ALAMO approach to machine learning Wilson, Zachary T.; Sahinidis, Nikolaos V.
Computers & Chemical Engineering, 11/2017, Volume 106
Journal Article, peer reviewed, open access
•The ALAMO framework for building models is reviewed.
•A review of the machine learning literature on model-building is presented.
•ALAMO is illustrated through its application to learning problems in kinetics.
ALAMO is a computational methodology for learning algebraic functions from data. Given a data set, the approach begins by building a low-complexity, linear model composed of explicit non-linear transformations of the independent variables. Linear combinations of these non-linear transformations allow a linear model to better approximate complex behavior observed in real processes. The model is refined as additional data are obtained adaptively through error maximization sampling using derivative-free optimization. Models built using ALAMO can enforce constraints on the response variables to incorporate first-principles knowledge. The ability of ALAMO to generate simple and accurate models for a number of reaction problems is demonstrated. The error maximization sampling is compared with Latin hypercube designs to demonstrate its sampling efficiency. ALAMO's constrained regression methodology is used to further refine concentration models, resulting in models that perform better on validation data and satisfy upper and lower bounds placed on model outputs.
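The idea of a model that is linear in its coefficients but built from explicit non-linear transformations can be illustrated with a deliberately small sketch. It is not ALAMO and does not use its software or its selection procedure. It only fits a single coefficient for each candidate transformation by least squares and keeps the one with the smallest squared error. The candidate set, the names, and the data are hypothetical.

```python
import math

# Hypothetical candidate transformations of the independent variable x.
CANDIDATES = {
    "x": lambda x: x,
    "x^2": lambda x: x ** 2,
    "sqrt(x)": math.sqrt,
    "log(x)": math.log,
}

def fit_single_term(xs, ys):
    """Return (name, coefficient, squared_error) of the best one-term model y = c * f(x)."""
    best = None
    for name, f in CANDIDATES.items():
        fx = [f(x) for x in xs]
        denom = sum(v * v for v in fx)
        if denom == 0:
            continue  # transformation is identically zero on this data
        # Closed-form least-squares coefficient for a single regressor.
        coef = sum(v * y for v, y in zip(fx, ys)) / denom
        error = sum((y - coef * v) ** 2 for v, y in zip(fx, ys))
        if best is None or error < best[2]:
            best = (name, coef, error)
    return best

# Synthetic data generated from y = 3 * x^2, used only to exercise the sketch.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0 * x ** 2 for x in xs]
name, coef, error = fit_single_term(xs, ys)
```

A fuller treatment would select several terms at once and trade accuracy against model size. The single-term version only shows why the fitting step remains a linear regression even though the resulting model is non-linear in x.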
•We quantify the potential of EVs to utilize fluctuating RES through optimized charging.
•We use empirical driving data to model the behavior of key sociodemographic groups.
•Optimized charging can double the utilization of RES compared to simple charging.
•Trip information is more relevant than charger availability to utilize EV flexibility.
Electric vehicles (EVs) are a new load type with considerable temporal flexibility. This work evaluates to what extent EV fleets (based on empirical driving profiles from two distinct sociodemographic groups) can cover their charging requirements by means of variable renewable generation (wind or solar-PV). For this purpose, we formulate a mixed-integer optimization problem minimizing the amount of conventional generation employed. The results indicate that the usage of variable renewable generation can be more than doubled as compared to uncoordinated charging. Furthermore, we analyze how the utilization of renewable generation by EV fleets is affected by different portfolios of renewable generation sources, charging infrastructure specifications, and a reduced optimization horizon.
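The contrast between uncoordinated and coordinated charging can be sketched for a single parked vehicle. This is a toy stand-in for the mixed-integer model described in the abstract, not a reproduction of it: it assumes one vehicle plugged in for the whole window, a fixed charger limit per hour, and made-up renewable availability. For this simplified setting, filling the hours with renewable supply first minimizes the conventional energy needed.

```python
def conventional_energy(charge, renewable):
    """Energy per window that renewable generation cannot cover."""
    return sum(max(0.0, c - r) for c, r in zip(charge, renewable))

def uncoordinated(demand, max_rate, hours):
    """Charge at full rate from the first hour until the demand is met."""
    charge = []
    for _ in range(hours):
        c = min(max_rate, demand)
        charge.append(c)
        demand -= c
    return charge

def coordinated(demand, max_rate, renewable):
    """Use renewable energy first, then top up wherever the charger has headroom."""
    charge = [0.0] * len(renewable)
    # Pass 1: take renewable energy, starting with the hours that have most.
    for t in sorted(range(len(renewable)), key=lambda t: -renewable[t]):
        c = min(max_rate, renewable[t], demand)
        charge[t] = c
        demand -= c
    # Pass 2: any remaining demand has to come from conventional generation.
    for t in range(len(renewable)):
        c = min(max_rate - charge[t], demand)
        charge[t] += c
        demand -= c
    return charge

# Hypothetical data: renewable energy available per hour (kWh), 9 kWh needed,
# charger limited to 3 kWh per hour.
renewable = [0.0, 0.0, 1.0, 4.0, 5.0, 2.0]
simple = uncoordinated(9.0, 3.0, len(renewable))
smart = coordinated(9.0, 3.0, renewable)
simple_conv = conventional_energy(simple, renewable)
smart_conv = conventional_energy(smart, renewable)
```

With this data the uncoordinated schedule charges in the first three hours, before most of the renewable supply arrives, while the coordinated schedule shifts the same energy into later hours. A fleet-level model additionally has to handle trips, several vehicles competing for the same generation, and limited foresight, which is where an optimization formulation becomes necessary.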