The multi-returning secretary problem Bayón, L.; Fortuny Ayuso, P.; Grau, J.M. ...
Discrete Applied Mathematics,
10/2022, Letnik:
320
Journal Article
Recenzirano
Odprti dostop
In this paper we consider the so-called Multi-returning secretary problem, a version of the Secretary problem in which each candidate has m identical copies. The case m=2 has already been completely ...solved by several authors using different methods, but the case m>2 had not been satisfactorily solved yet. Here, under the conjecture of certain (very likely true) uniform convergence results, we provide an efficient algorithm to compute the optimal threshold and the probability of success for every m. Moreover, we give a method to determine their asymptotic values based on the solution of a system of m ODEs.
The optimal stopping problem is concerned with finding an optimal policy to stop a stochastic process in order to maximize the expected return. This problem is critical in stochastic control and can ...be found in many different fields, such as operations research, finance, and healthcare. In this paper, we model the underlying stochastic process of the optimal stopping problem as a Markov decision process and propose a computationally efficient model-free value-based reinforcement learning approach, named ΔV-learning. The efficiency is improved by taking advantage of the unique structural properties of the optimal stopping problem into our algorithm design. We consider two types of the optimal stopping problems: the standard optimal stopping and the regenerative optimal stopping, which differ in their transition dynamics once the stopping action is executed. We conduct numerical experiments on our proposed method and compare its performance against existing reinforcement learning algorithms and rule-based policies. The results show that our ΔV-learning method is able to outperform the benchmark algorithms in all experiments.
•Study the standard and regenerative optimal stopping problems.•Develop a new reinforcement learning method to solve the optimal stopping problems.•Incorporate the structural properties of optimal stopping into the algorithm.•Conduct experiments to demonstrate performance improvements.
In an online random trading problem, m sellers and n buyers arrive in a random sequential order to meet a decision maker. Each seller possesses an item and each buyer demands an item. All items are ...identical. Each agent (seller or buyer) has a positive valuation on one item and reveals her valuation to the decision maker when she arrives. The decision maker, who knows only m and n in advance, uses an online algorithm to make an irrevocable decision on whether or not to trade with the arriving agent at her arrival time. We design online algorithms to maximize the social welfare, i.e., the expected total valuation of the agents who possess items on hand at the end of trading process. For the single-buyer trading, our algorithm achieves a tight competitive ratio of 1+1/m. For the single-seller trading, when n tends to infinity, we establish lower bound 3.258 and upper bound 4.189 on the competitive ratio. When both m and n are sufficiently large, our algorithm achieves an asymptotic competitive ratio no more than 1+O(m−1/3lnm).
We study a variation of the game of best choice (also known as the secretary problem or game of googol) under an additional assumption that the ranks of interview candidates are restricted using ...permutation pattern-avoidance. We develop some general machinery for investigating interview orderings with a non-uniform rank distribution, and give a complete description of the optimal strategies for the pattern-avoiding games under each of the size three permutations. The optimal strategy for the “disappointment-free” (i.e. 321-avoiding) interviews has a form that seems to be new, involving thresholds based on value-saturated left-to-right maxima in the permutation.
Suppose that n items arrive online in random order and the goal is to select k of them such that the expected sum of the selected items is maximized. The decision for any item is irrevocable and must ...be made on arrival without knowing future items. This problem is known as the k-secretary problem, which includes the classical secretary problem with the special case k=1. It is well-known that the latter problem can be solved by a simple algorithm of competitive ratio 1/e which is optimal for n→∞. Existing algorithms beating the threshold of 1/e either rely on involved selection policies already for k=2, or assume that k is large.
In this paper we present results for the k-secretary problem, considering the interesting and relevant case that k is small. We focus on simple selection algorithms, accompanied by combinatorial analyses. As a main contribution we propose a natural deterministic algorithm designed to have competitive ratios strictly greater than 1/e for small k≥2. This algorithm is hardly more complex than the elegant strategy for the classical secretary problem, optimal for k=1, and works for all k≥1. We derive its competitive ratios for k≤100, ranging from 0.41 for k=2 to 0.75 for k=100.
Moreover, we consider an algorithm proposed earlier in the literature, for which no rigorous analysis is known. We show that its competitive ratio is 0.4168 for k=2, implying that the previous analysis was not tight. Our analysis reveals a surprising combinatorial property of this algorithm, which might be helpful to find a tight analysis for all k.
In this paper, we consider the k-submodular secretary problem, in which the items or secretaries arrive one by one in a uniformly random order, and the goal is to select k disjoint sets of items, so ...as to maximize the expectation of a monotone non-negative k-submodular function. A decision for each item must be made immediately and irrevocably after its arrival: accept and assign it to one of the k dimensions, or reject it. We first show that, in the unconstrained setting, there is an offline algorithm that can be transformed to a k2k−1-competitive online algorithm, which is asymptotically the best possible. For the problem with cardinality constraints, we present online algorithms with provable performance guarantees for the total size constraint and individual size constraint settings that depend on the corresponding budget parameters. For the problem under a knapsack constraint, we provide a constant-competitive online algorithm.
We solve the secretary problem in the case that the ranked items arrive in a statistically biased order rather than in uniformly random order. The bias is given by a Mallows distribution with ...parameter q∈(0,1), so that higher ranked items tend to arrive later and lower ranked items tend to arrive sooner. In the classical problem, the asymptotically optimal strategy is to reject the first Mn⁎ items, where Mn⁎∼ne, and then to select the first item ranked higher than any of the first Mn⁎ items (if such an item exists). This yields 1e as the limiting probability of success. The Mallows distribution with parameter q=1 is the uniform distribution. For the regime qn=1−cn, with c>0, the case of weak bias, the optimal strategy occurs with Mn⁎∼n(1clog(1+ec−1e)), with the limiting probability of success being 1e. For the regime qn=1−cnα, with c>0 and α∈(0,1), the case of moderate bias, the optimal strategy occurs with n−Mn∼nαc, with the limiting probability of success being 1e. For fixed q∈(0,1), the case of strong bias, the optimal strategy occurs with Mn⁎=n−L where L−1L<q≤LL+1, with limiting probability of success being (1−q)qL−1L>1e.
The well-known secretary problem in sequential analysis and optimal stopping theory asks one to maximize the probability of finding the optimal candidate in a sequentially examined list under the ...constraint that accept/reject decisions are made in real-time. The problem has received significant interest in the mathematics community and is related to practical questions arising in online search, data streaming, daily purchase modeling and multi-arm bandit mechanisms. A version of the problem is the so-called postdoc problem, for which the question of interest is to devise a strategy that identifies the second-best candidate with highest possible probability of success.
We study the postdoc problem in its combinatorial form. In this setting, a permutation π of length N is sampled according to some distribution over the symmetric group SN and the elements of π are revealed one-by-one from left to right so that at each step, one can only determine the relative orders of the elements revealed so far. At each step, one must decide to either accept or reject the currently presented element and cannot recall the decision in the future. The question of interest is to find the optimal strategy for selecting the position of the second-largest value. We solve the postdoc problem for the untraditional setting where the candidates are not presented uniformly at random but rather according to permutations drawn from the Mallows distribution. The Mallows distribution assigns to each permutation π∈SN a weight θc(π), where the function c represents the Kendall τ distance between π and the identity permutation (i.e., the number of inversions in π). To identify the optimal stopping criteria for the significantly more challenging postdoc problem, we adopt a combinatorial methodology that includes new proof techniques and novel methodological extensions compared to the analysis first introduced in the setting of the secretary problem. The optimal strategies depend on the parameter θ of the Mallows distribution and can be determined exactly by solving well-defined recurrence relations.
Real numbers from the interval 0,1 are randomly selected with uniform distribution. There are n of them and they measure shortcomings of the candidates from the pool of n secretaries in a search ...process; a smaller number indicates a better candidate. The numbers come one by one. We know only their relative ranks but do not know their values. We want to stop on the current element maximizing the probability that its value is smaller than 1n. We provide a complete solution of this model by constructing an optimal stopping algorithm, establishing its asymptotic performance, and showing how to calculate all characteristics of the algorithm with desired accuracy.
We study the secretary problem in which rank-ordered lists are generated by the Mallows model and the goal is to identify the highest ranked candidate through a sequential interview process which ...does not allow rejected candidates to be revisited. The main difference between our formulation and existing models is that, during the selection process, we are given a fixed number of opportunities to query an infallible expert whether the current candidate is the highest-ranked or not. If the response is positive, the selection process terminates, otherwise, the search continues until a new potentially optimal candidate is identified. Our optimal interview strategy, as well as the expected number of candidates interviewed and expected number of queries used can be determined through the evaluation of well-defined recurrence relations. Specifically, if we are allowed to query s−1 times and to make a final selection without querying (thus, making s selections in total) then the optimum scheme is characterized by s thresholds that depend on the parameter θ of the Mallows distribution but are independent on the maximum number of queries.