We propose mS2GD: a method incorporating a mini-batching scheme for improving the theoretical complexity and practical performance of semi-stochastic gradient descent (S2GD). We consider the problem of minimizing a strongly convex function represented as the sum of an average of a large number of smooth convex functions and a simple nonsmooth convex regularizer. Our method first performs a deterministic step (computation of the gradient of the objective function at the starting point), followed by a large number of stochastic steps. The process is repeated a few times, with the last iterate becoming the new starting point. The novelty of our method lies in the introduction of mini-batching into the computation of stochastic steps. In each step, instead of choosing a single function, we sample b functions, compute their gradients, and compute the direction based on these. We analyze the complexity of the method and show that it benefits from two speedup effects. First, we prove that as long as b is below a certain threshold, we can reach any predefined accuracy with less overall work than without mini-batching. Second, our mini-batching scheme admits a simple parallel implementation, and hence is suitable for further acceleration by parallelization.
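The loop described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the least-squares losses, the L1 regularizer with its soft-thresholding proximal step, the fixed inner-loop length `m`, and all function and parameter names (`ms2gd`, `grad_i`, `soft_threshold`, `batch`, `h`) are assumptions made for the example.

```python
import random

def soft_threshold(v, tau):
    """Proximal operator of tau * |.| applied to a scalar."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0

def grad_i(a_i, b_i, x):
    """Gradient of the assumed loss f_i(x) = 0.5 * (a_i . x - b_i)^2."""
    r = sum(p * q for p, q in zip(a_i, x)) - b_i
    return [r * p for p in a_i]

def ms2gd(A, b, lam, x0, epochs, m, h, batch, seed=0):
    """Sketch: minimize (1/n) * sum_i f_i(x) + lam * ||x||_1."""
    rng = random.Random(seed)
    n, d = len(A), len(x0)
    y = list(x0)
    for _ in range(epochs):
        # Deterministic step: full gradient of the smooth part at y.
        # (Per-function gradients are cached only to keep the sketch short.)
        grads_y = [grad_i(A[i], b[i], y) for i in range(n)]
        g = [sum(gr[j] for gr in grads_y) / n for j in range(d)]
        x = list(y)
        for _ in range(m):  # assumption: fixed number of stochastic steps
            # Stochastic step: sample a mini-batch of `batch` functions.
            idx = rng.sample(range(n), batch)
            grads_x = [grad_i(A[i], b[i], x) for i in idx]
            direction = [
                g[j] + sum(gx[j] - grads_y[i][j]
                           for gx, i in zip(grads_x, idx)) / batch
                for j in range(d)
            ]
            # Proximal step handles the nonsmooth regularizer.
            x = [soft_threshold(x[j] - h * direction[j], h * lam)
                 for j in range(d)]
        y = x  # the last iterate becomes the new starting point
    return y
```

The gradients inside one mini-batch are independent of each other, which is where a parallel implementation would plausibly split the work.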
We consider the problem of estimating the arithmetic average of a finite collection of real vectors stored in a distributed fashion across several compute nodes subject to a communication budget constraint. Our analysis does not rely on any statistical assumptions about the source of the vectors. This problem arises as a subproblem in many applications, including reduce-all operations within algorithms for distributed and federated optimization and learning. We propose a flexible family of randomized algorithms exploring the trade-off between expected communication cost and estimation error. Our family contains the full-communication and zero-error method on one extreme, and an ϵ-bit communication and O(1/(ϵn)) error method on the opposite extreme. In the special case where we communicate, in expectation, a single bit per coordinate of each vector, we improve upon existing results by obtaining O(r/n) error, where r is the number of bits used to represent a floating point value.
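To make the communication/error trade-off concrete, here is a small sketch of one generic unbiased scheme that sends one bit per coordinate (plus two floats per vector). It is a standard stochastic-rounding encoder chosen for illustration, not necessarily a member of the family proposed in the abstract above; the names `encode`, `decode`, and `estimate_mean` are hypothetical.

```python
import random

def encode(x, rng):
    """Stochastically round each coordinate of x to min(x) or max(x).

    Returns (lo, hi, bits). Bit j is 1 with probability
    (x[j] - lo) / (hi - lo), so the decoded vector equals x in expectation.
    """
    lo, hi = min(x), max(x)
    if hi == lo:
        return lo, hi, [0] * len(x)
    bits = [1 if rng.random() < (v - lo) / (hi - lo) else 0 for v in x]
    return lo, hi, bits

def decode(lo, hi, bits):
    """Reconstruct a vector from the two endpoints and one bit per coordinate."""
    return [hi if bit else lo for bit in bits]

def estimate_mean(vectors, rng):
    """Average the decoded messages of all nodes (one vector per node)."""
    n, d = len(vectors), len(vectors[0])
    total = [0.0] * d
    for x in vectors:
        y = decode(*encode(x, rng))
        for j in range(d):
            total[j] += y[j]
    return [t / n for t in total]
```

Because each node's decoded vector is unbiased and the nodes randomize independently, the estimation error of the average shrinks as the number of nodes grows.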
With the growth of data and necessity for distributed optimization methods, solvers that work well on a single machine must be re-designed to leverage distributed computation. Recent work in this area has been limited by focusing heavily on developing highly specific methods for the distributed environment. These special-purpose methods are often unable to fully leverage the competitive performance of their well-tuned and customized single machine counterparts. Further, they are unable to easily integrate improvements that continue to be made to single machine methods. To this end, we present a framework for distributed optimization that both allows the flexibility of arbitrary solvers to be used on each (single) machine locally and yet maintains competitive performance against other state-of-the-art special-purpose distributed methods. We give strong primal-dual convergence rate guarantees for our framework that hold for arbitrary local solvers. We demonstrate the impact of local solver selection both theoretically and in an extensive experimental comparison. Finally, we provide thorough implementation details for our framework, highlighting areas for practical performance gains.
We propose a novel stochastic gradient method, semi-stochastic coordinate descent, for the problem of minimizing a strongly convex function represented as the average of a large number of smooth convex functions: $f(x) = \frac{1}{n}\sum_{i=1}^{n} f_i(x)$. Our method first performs a deterministic step (computation of the gradient of $f$ at the starting point), followed by a large number of stochastic steps. The process is repeated a few times, with the last stochastic iterate becoming the new starting point where the deterministic step is taken. The novelty of our method is in how the stochastic steps are performed. In each such step, we pick a random function $f_i$ and a random coordinate $j$ (both using non-uniform distributions) and update a single coordinate of the decision vector only, based on the computation of the $j$th partial derivative of $f_i$ at two different points. Each random step of the method constitutes an unbiased estimate of the gradient of $f$ and, moreover, the squared norm of the steps goes to zero in expectation, meaning that the stochastic estimate of the gradient progressively improves. The computational complexity of the method is the sum of two terms: $O(n \log(1/\epsilon))$ evaluations of gradients $\nabla f_i$ and $O(\hat{\kappa} \log(1/\epsilon))$ evaluations of partial derivatives $\nabla_j f_i$, where $\hat{\kappa}$ is a novel condition number.
In this paper we study the problem of minimizing the average of a large number of smooth convex loss functions. We propose a new method, S2GD (Semi-Stochastic Gradient Descent), which runs for one or several epochs, in each of which a single full gradient and a random number of stochastic gradients are computed, following a geometric law. For strongly convex objectives, the method converges linearly. The total work needed for the method to output an epsilon-accurate solution in expectation, measured in the number of passes over data, is proportional to the condition number of the problem and inversely proportional to the number of functions forming the average. This is achieved by running the method with the number of stochastic gradient evaluations per epoch proportional to the conditioning of the problem. The SVRG method of Johnson and Zhang arises as a special case. To illustrate our theoretical results, S2GD only needs the workload equivalent to about 2.1 full gradient evaluations to find a 10^-6-accurate solution for a problem with 10^9 functions and a condition number of 10^3.
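The epoch structure described in the abstract above might look like the following sketch for scalar least-squares losses. The loss choice, the exact form of the geometric law (weights proportional to (1 - nu*h)^(m - t), which becomes uniform when `nu` is 0), and the names `s2gd`, `nu`, `h`, and `m` are assumptions made for illustration rather than details stated in the abstract.

```python
import random

def s2gd(a, b, x0, epochs, m, h, nu=0.0, seed=0):
    """Sketch: minimize (1/n) * sum_i 0.5 * (a[i] * x - b[i])**2 over scalar x."""
    rng = random.Random(seed)
    n = len(a)

    def grad_i(i, x):
        # Derivative of the assumed loss f_i(x) = 0.5 * (a[i] * x - b[i])**2.
        return a[i] * (a[i] * x - b[i])

    y = x0
    for _ in range(epochs):
        # One full gradient per epoch, evaluated at the starting point y.
        g = sum(grad_i(i, y) for i in range(n)) / n
        # Random number of stochastic steps, drawn from an assumed geometric law.
        weights = [(1.0 - nu * h) ** (m - t) for t in range(1, m + 1)]
        t_steps = rng.choices(range(1, m + 1), weights=weights)[0]
        x = y
        for _ in range(t_steps):
            i = rng.randrange(n)
            # Variance-reduced direction: unbiased estimate of the gradient at x.
            x -= h * (g + grad_i(i, x) - grad_i(i, y))
        y = x  # the last iterate starts the next epoch
    return y
```

Fixing the number of inner steps instead of sampling it gives a loop of the kind used by SVRG, which the abstract names as a special case.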
Background: We report the feasibility and outcomes of the box-lesion ablation technique to treat stand-alone atrial fibrillation (AF). Methods: There were 31 patients with a mean age of 63.3 ± 8.4 years who underwent bilateral totally thoracoscopic ablation of symptomatic paroxysmal AF (n = 8; 25.8%) and long-standing persistent AF (n = 23; 74.2%). The box-lesion procedure included bilateral pulmonary vein and left atrial posterior wall ablation using irrigated bipolar radiofrequency with documentation of conduction block. Results: There were no intra- or perioperative ablation-related complications. There was no operative mortality, no myocardial infarction, and no stroke. Skin-to-skin procedure time was 152.1 ± 36.7 min and the postoperative average length of stay was 6.26 ± 1.24 days. At discharge, 29 patients (93.5%) were in sinus rhythm. Median follow-up time was 20.4 ± 8.3 months. At three months postsurgery, 20 of 30 patients (66.6%) were free from AF without the need for antiarrhythmic drugs. Six patients (20%) underwent catheter reablation. Twenty-three patients (76.6%) were in sinus rhythm at one year after the last performed ablation (surgical ablation or catheter reablation). Conclusion: The thoracoscopic box-lesion ablation procedure is a safe, effective, and minimally invasive method for the treatment of isolated (lone) AF. This procedure provided excellent short-term freedom from AF.
The purpose of this study was to review the outcome of dialysis-dependent patients undergoing cardiac surgery. We retrospectively reviewed 36 dialysis-dependent patients with a mean age of 63 ± 9.4 years who underwent cardiac operations. Surgery included coronary artery bypass grafting (CABG) in 27 patients (75%), valve surgery in 2 (5.5%), combined CABG plus valve surgery in 5 (13.8%), combined valve surgery and MAZE procedure in 1 patient, combined valve surgery, CABG and MAZE procedure in 1 patient, major aortic surgery in 1 patient, suture of an injured right ventricle in 1 patient and extirpation of an infected right atrial thrombus in 1 patient. The in-hospital mortality rate was 11.1%. All the deaths occurred in patients who underwent an urgent procedure. Two of the deaths occurred in patients who underwent on-pump cardiac surgery (ascending aorta replacement and removal of an infected thrombus), one death occurred in a patient who underwent suture of an injured right ventricle, and one death occurred in a patient who underwent conventional myocardial revascularization. Survival at 1 year was 77.8%. Generally suggested predictors of increased late mortality are heart failure, urgent/emergent surgery, the complexity of the surgical procedure (valve surgery, combined CABG, valve and major aortic surgery) and postoperative low cardiac output syndrome. In dialysis-dependent patients, CABG has an acceptable risk. Results in patients affected by valve lesions, whether or not associated with coronary artery disease, are improved by an early referral to surgery, before the onset of symptoms of heart failure.
A team of social and cultural anthropologists from Palacký University Olomouc organized a conference devoted to the methodological aspects of fieldwork, with the aim of providing space for sharing experience, research plans, and theoretical starting points. Emphasis was placed on discussing the various forms of ethnography: the diversity ranged from discussion of "classic" research in the exotic setting of Oceania to dilemmas associated with research on social networks. The effort to provide space for this sharing within a broader Central European framework succeeded only in part, as most of the participants were from the Czech Republic. Nevertheless, the use of English as the working language of the conference gives hope that this type of scholarly event will in future become attractive to a wider group of anthropologists and social scientists from related disciplines, including those from abroad.