Reward is enough Silver, David; Singh, Satinder; Precup, Doina ...
Artificial intelligence,
October 2021, Volume:
299
Journal Article
Peer reviewed
Open access
In this article we hypothesise that intelligence, and its associated abilities, can be understood as subserving the maximisation of reward. Accordingly, reward is enough to drive behaviour that exhibits abilities studied in natural and artificial intelligence, including knowledge, learning, perception, social intelligence, language, generalisation and imitation. This is in contrast to the view that specialised problem formulations are needed for each ability, based on other signals or objectives. Furthermore, we suggest that agents that learn through trial and error experience to maximise reward could learn behaviour that exhibits most if not all of these abilities, and therefore that powerful reinforcement learning agents could constitute a solution to artificial general intelligence.
Natural actor–critic algorithms Bhatnagar, Shalabh; Sutton, Richard S.; Ghavamzadeh, Mohammad ...
Automatica (Oxford),
11/2009, Volume:
45, Issue:
11
Journal Article
Peer reviewed
Open access
We present four new reinforcement learning algorithms based on actor–critic, natural-gradient and function-approximation ideas, and we provide their convergence proofs. Actor–critic reinforcement learning methods are online approximations to policy iteration in which the value-function parameters are estimated using temporal difference learning and the policy parameters are updated by stochastic gradient descent. Methods based on policy gradients in this way are of special interest because of their compatibility with function-approximation methods, which are needed to handle large or infinite state spaces. The use of temporal difference learning in this way is of special interest because in many applications it dramatically reduces the variance of the gradient estimates. The use of the natural gradient is of interest because it can produce better conditioned parameterizations and has been shown to further reduce variance in some cases. Our results extend prior two-timescale convergence results for actor–critic methods by Konda and Tsitsiklis by using temporal difference learning in the actor and by incorporating natural gradients. Our results extend prior empirical studies of natural actor–critic methods by Peters, Vijayakumar and Schaal by providing the first convergence proofs and the first fully incremental algorithms.
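As a reading aid, the general actor–critic pattern described above (a critic learned by temporal difference learning, an actor updated along a policy gradient) can be sketched as follows. This is a minimal illustration under stated assumptions, not one of the four algorithms from the article: it uses the ordinary rather than the natural gradient, a discounted tabular setting, and a made-up two-state environment. The names `train`, `softmax`, the step sizes and the reward rule are all hypothetical.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train(num_steps=20000, gamma=0.9, alpha_critic=0.2, alpha_actor=0.05, seed=0):
    """One-step actor-critic on a hypothetical 2-state, 2-action problem.

    Assumed environment (for illustration only): the reward is 1 when the
    chosen action index equals the state index, else 0, and the next state
    is drawn uniformly at random.
    """
    rng = random.Random(seed)
    n_states, n_actions = 2, 2
    v = [0.0] * n_states                                   # critic: state values
    theta = [[0.0] * n_actions for _ in range(n_states)]   # actor: preferences
    s = 0
    for _ in range(num_steps):
        pi = softmax(theta[s])
        a = 0 if rng.random() < pi[0] else 1
        r = 1.0 if a == s else 0.0
        s_next = rng.randrange(n_states)

        # Critic: temporal difference error and TD(0) value update.
        delta = r + gamma * v[s_next] - v[s]
        v[s] += alpha_critic * delta

        # Actor: move preferences along delta * grad log pi(a | s).
        # The critic uses the larger step size (a two-timescale choice).
        for b in range(n_actions):
            grad_log = (1.0 if b == a else 0.0) - pi[b]
            theta[s][b] += alpha_actor * delta * grad_log

        s = s_next
    return [softmax(row) for row in theta], v


if __name__ == "__main__":
    policy, values = train()
    print("policy per state:", policy)
    print("state values:", values)
```

Under these assumptions the learned policy should come to prefer action 0 in state 0 and action 1 in state 1. The TD error `delta` stands in for the return, which is the variance-reduction role the abstract attributes to temporal difference learning.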
Off-policy prediction, learning the value function for one policy from data generated while following another policy, is one of the most challenging problems in reinforcement learning. This article makes two main contributions: 1) it empirically studies 11 off-policy prediction learning algorithms with emphasis on their sensitivity to parameters, learning speed, and asymptotic error, and 2) based on the empirical results, it proposes two step-size adaptation methods, collectively referred to as the Ratchet algorithms, that help the algorithm with the lowest error from the experimental study learn faster. Many off-policy prediction learning algorithms have been proposed in the past decade, but it remains unclear which algorithms learn faster than others. In this article, we empirically compare 11 off-policy prediction learning algorithms with linear function approximation on three small tasks: the Collision task, the Rooms task, and the High Variance Rooms task. The Collision task is a small off-policy problem analogous to that of an autonomous car trying to predict whether it will collide with an obstacle. The Rooms and High Variance Rooms tasks are designed such that learning fast in them is challenging. In the Rooms task, the product of importance sampling ratios can be as large as $2^{14}$. To control the high variance caused by the product of the importance sampling ratios, the step size must be set small, which, in turn, slows down learning. The High Variance Rooms task is more extreme in that the product of the ratios can become as large as $2^{14} \times 25$. The algorithms considered are Off-policy TD, five Gradient-TD algorithms, two Emphatic-TD algorithms, Vtrace, and variants of Tree Backup and ABQ that are applicable to the prediction setting.
We found that the algorithms' performance is highly affected by the variance induced by the importance sampling ratios. Tree Backup, Vtrace, and ABTD are not affected by the high variance as much as other algorithms, but they restrict the effective bootstrapping parameter in a way that is too limiting for tasks where high variance is not present. We observed that Emphatic TD tends to have lower asymptotic error than other algorithms but might learn more slowly in some cases. Based on the empirical results, we propose two step-size adaptation algorithms, which we collectively refer to as the Ratchet algorithms, with the same underlying idea: keep the step-size parameter as large as possible and ratchet it down only when necessary to avoid overshoot. We show that the Ratchet algorithms are effective by comparing them with other popular step-size adaptation algorithms, such as the Adam optimizer.
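To make the variance argument concrete, the sketch below shows how a product of per-step importance sampling ratios can reach 2^14, and how a single ratio scales an off-policy TD(0) update with linear function approximation. It is an illustration under assumptions, not the article's experimental setup: the 14-step trajectory with a deterministic target policy and a uniform two-action behavior policy is hypothetical, as are the function names. The Ratchet algorithms themselves are not reproduced because the abstract does not specify them.

```python
def importance_ratio_product(target_probs, behavior_probs):
    """Multiply per-step ratios rho_t = pi(a_t | s_t) / b(a_t | s_t)."""
    product = 1.0
    for p_target, p_behavior in zip(target_probs, behavior_probs):
        product *= p_target / p_behavior
    return product


def off_policy_td0_update(w, x, x_next, reward, rho, alpha, gamma):
    """One off-policy TD(0) step with linear values v(s) = w . x(s).

    The importance sampling ratio rho multiplies the whole update, so a
    large rho acts like a large effective step size.
    """
    v = sum(wi * xi for wi, xi in zip(w, x))
    v_next = sum(wi * xi for wi, xi in zip(w, x_next))
    delta = reward + gamma * v_next - v
    return [wi + alpha * rho * delta * xi for wi, xi in zip(w, x)]


# Hypothetical trajectory: the target policy picks each taken action with
# probability 1, the behavior policy with probability 0.5, for 14 steps.
steps = 14
product = importance_ratio_product([1.0] * steps, [0.5] * steps)
print(product)       # 16384.0, i.e. 2**14
print(product * 25)  # 409600.0, i.e. 2**14 * 25

# A single update with rho = 2 moves the weights twice as far as rho = 1.
w = off_policy_td0_update([0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                          reward=1.0, rho=2.0, alpha=0.1, gamma=0.9)
print(w)
```

With ratios this large, a step size that is safe for the worst case is far smaller than what typical transitions would tolerate, which is the tension that step-size adaptation is meant to ease.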
Reflex Atrioventricular Block Sutton, Richard
Frontiers in cardiovascular medicine,
04/2020, Volume:
7
Journal Article
Peer reviewed
Open access
Reflex atrioventricular block is well-recorded although it is considered rare. Recent data suggest that it is less rare than has been supposed. It has been shown to occur in both vasovagal and carotid sinus reflexes. It has to be distinguished from paroxysmal atrioventricular block due to ventricular conduction tissue disease. Low chronic adenosine levels combined with adenosine release may mimic reflex atrioventricular block. Explanations of the mechanism of these phenomena have been lacking until the recent past. The relevance of reflex atrioventricular block to clinical decision-making is as a possible indication for pacing the heart with consideration given to the vasodepressor component of the reflex.
The objective of this study was to describe the correlation between the commercially available assay for anti-S1/RBD IgG and protective serum neutralizing antibodies (nAb) against SARS-CoV-2 in an adult population after SARS-CoV-2 vaccination, and determine if clinical variables impact this correlation.
We measured IgG anti-S1/RBD using the IgG-II CMIA assay and nAb IC50 values against SARS-CoV-2 WA-1 in sera serially collected post-mRNA vaccination in veterans and healthcare workers of the Veterans Affairs Connecticut Healthcare System (VACHS) between December 2020 and January 2022. The correlation between IgG and IC50 was measured using Pearson correlation. Clinical variables (age, sex, race, ethnicity, prior COVID infection defined by RT-PCR, history of malignancy, and estimated glomerular filtration rate (GFR), calculated using the CKD-EPI equation) were collected by manual chart review. The impact of these clinical variables on the IgG-nAb correlation was analyzed first with univariable regression. Variables with a significance of p < 0.15 were analyzed with forward stepwise regression analysis.
From 127 serum samples in 100 unique subjects (age 20-93 years; mean 63.83; SD 15.63; 29% female; 67% White), we found a robust correlation between IgG anti-S1/RBD and nAb IC50 (R² = 0.83, adjusted R² = 0.70, p < 0.0001). Race, ethnicity, and a history of malignancy were not significant on univariable analysis. GFR (p < 0.05) and prior COVID infection (p < 0.001) had a significant impact on the correlation between IgG anti-S1/RBD and nAb IC50. Age (p = 0.06) and sex (p = 0.07) trended towards significance on univariable analysis, but were not significant on multivariable regression.
There was a strong correlation between IgG anti-S1/RBD and nAb IC50 after SARS-CoV-2 vaccination. Clinical comorbidities, such as prior COVID infection and renal function, impacted this correlation. These results may assist the prediction of post-vaccination immune protection in clinical settings using cost-effective commercial platforms.
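For readers unfamiliar with the statistic, the Pearson correlation named in the methods above can be computed as in the sketch below. This is illustrative only: the paired values are invented and are not the study's data, the function name `pearson_r` is a label chosen here, and the abstract does not say whether values were transformed before analysis.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented paired measurements (hypothetical IgG level vs. nAb titre).
igg = [1.0, 2.0, 3.0, 4.0, 5.0]
nab = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(igg, nab)
print("r =", r)
print("R^2 =", r ** 2)  # for a single predictor, R^2 equals r squared
```

The squared coefficient is the share of variance in one measure explained by a straight-line fit on the other, which is how an R² value for a single predictor is usually read.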
For the diagnosis of reflex syncope, diligent history-building with the patient and a witness is required. In the Emergency Department (ED), the assessment of syncope is a challenge which may be addressed by an ED Observation Unit or by a referral to a Syncope Unit. Hospital admission is necessary for those with life-threatening cardiac conditions although risk stratification remains an unsolved problem. Other patients may be investigated with less urgency by carotid sinus massage (>40 years), tilt testing, and electrocardiogram loop recorder insertion resulting in a clear cause for syncope. Management includes, in general terms, patient education, avoidance of circumstances in which syncope is likely, increase in fluid and salt consumption, and physical counter-pressure maneuvers. In older patients, those that will benefit from cardiac pacing are now well defined. In all patients, the benefit of drug therapy is often disappointing and there remains no ideal drug. A role for catheter ablation may emerge for the highly symptomatic reflex syncope patient.
Artificial intelligence is poised to revolutionize the field of medicine; however, significant questions must be answered prior to its implementation on a regular basis. Many artificial intelligence algorithms remain limited by isolated datasets, which may cause selection bias and truncated learning for the program. While a central database may solve this issue, several barriers such as security, patient consent, and management structure prevent this from being implemented. An additional barrier to daily use is device approval by the Food and Drug Administration. In order for this to occur, clinical studies must address new endpoints, including and beyond the traditional bio- and medical statistics. These must showcase artificial intelligence's benefit and answer key questions, including challenges posed in the field of medical ethics.
Rev is an essential regulatory protein of Human Immunodeficiency Virus type 1 (HIV) that is found in the nucleus of infected cells. Rev multimerizes on the Rev-response element (RRE) of HIV RNA to facilitate the export of intron-containing HIV mRNAs from the nucleus to the cytoplasm, and, as such, HIV cannot replicate in the absence of Rev. We have developed cell-intact and cell-free assays based upon a robust firefly split-luciferase complementation system, both of which quantify Rev-Rev interaction. Using the cell-based system we show that additional Crm1 did not impact the interaction, whereas excess Rev reduced it. Furthermore, when a series of mutant Revs were tested, there was a strong correlation between the results of the cell-based assay and the results of a functional Rev trans-complementation infectivity assay. Of interest, a camelid nanobody (NB) that was known to inhibit Rev function enhanced Rev-Rev interaction in the cell-based system. We observed a similar increase in Rev-Rev interaction in a cell-free system, when cell lysates expressing Rev-NLUC or CLUC-Rev were simply mixed. In the cell-free system Rev-Rev interaction occurred within minutes and was inhibited by excess Rev. The levels of interaction between the mutant Revs tested varied by mutant type. Treatment of Rev lysates with RNase minimally reduced the degree of interaction whereas addition of HIV RRE RNA enhanced the interaction. Purified GST-Rev protein inhibited the interaction. The Z-factor (Z') for the cell-free system was ∼0.85 when tested in 96-well format, and the anti-Rev NB enhanced the interaction in the cell-free system. Thus, we have developed both cell-intact and cell-free systems that can reliably, rapidly, and reproducibly quantify Rev-Rev interaction. These assays, particularly the cell-free one, may be useful in screening and identifying compounds that inhibit Rev function on a high throughput basis.
• HIV-1 Rev-Rev interaction can be quantified by cell-based and cell-free split-luciferase complementation assays.
• The Z-factor for the cell-free split-luciferase complementation assay to quantify Rev-Rev interaction was ∼0.85.
• A camelid nanobody known to inhibit Rev function enhanced Rev-Rev interaction in both the cell-based and cell-free systems.
• Split-luciferase complementation assays may be useful in screening and identifying compounds that inhibit Rev function.
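The Z-factor quoted above is a standard screening-quality statistic, commonly defined as Z' = 1 - 3(σ_pos + σ_neg) / |μ_pos - μ_neg| over positive and negative control wells. The sketch below computes it for invented luminescence readings. The numbers and the function name `z_prime` are hypothetical, and the abstract does not state which controls the authors used.

```python
import statistics


def z_prime(positive_controls, negative_controls):
    """Z' = 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    mean_pos = statistics.mean(positive_controls)
    mean_neg = statistics.mean(negative_controls)
    sd_pos = statistics.stdev(positive_controls)  # sample standard deviation
    sd_neg = statistics.stdev(negative_controls)
    return 1.0 - 3.0 * (sd_pos + sd_neg) / abs(mean_pos - mean_neg)


# Invented control-well readings in arbitrary luminescence units.
positive = [98.0, 100.0, 102.0]
negative = [9.0, 10.0, 11.0]
print(z_prime(positive, negative))
```

Values between 0.5 and 1 are conventionally taken to indicate an assay with enough separation between controls for high-throughput screening, which is the context in which a figure of about 0.85 is reported here.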
COVID-19 is a global crisis of unimagined dimensions. Currently, remdesivir is the only fully FDA-licensed therapeutic. A major target of the vaccine effort is the SARS-CoV-2 spike-hACE2 interaction, and assessment of efficacy relies on time-consuming neutralization assays. Here, we developed a cell fusion assay based upon the spike-hACE2 interaction. The system was tested by transient co-transfection of 293T cells, which demonstrated good correlation with standard spike pseudotyping for inhibition by sera and biologics. Stable cell lines, once established, were very well behaved and gave even better correlation with pseudotyping results after a short, overnight co-incubation. Results with the stable cell fusion assay also correlated well with those of a live virus assay. In summary, we have established a rapid, reliable, and reproducible cell fusion assay that will serve to complement the other neutralization assays currently in use, is easy to implement in most laboratories, and may serve as the basis for high throughput screens to identify inhibitors of SARS-CoV-2 virus-cell binding and entry.
In part I of this study, we found that the classical studies on vasovagal syncope, conducted in fit young subjects, overstated vasodilatation as the dominant hypotensive mechanism. Since 1980, blood pressure and cardiac output have been measured continuously using noninvasive methods during tilt, mainly in patients with recurrent syncope, including women and the elderly. This has allowed us to analyze in more detail the complex sequence of hemodynamic changes leading up to syncope in the laboratory. All tilt-sensitive patients appear to progress through 4 phases: (1) early stabilization, (2) circulatory instability, (3) terminal hypotension, and (4) recovery. The physiology responsible for each phase is discussed. Although the order of phases is consistent, the time spent in each phase may vary. In teenagers and young adults, progressive hypotension during phases 2 and 3 can be driven by vasodilatation or falling cardiac output. The fall in cardiac output is secondary to a progressive decrease in stroke volume because blood is pooled in the splanchnic veins. In adults, a fall in cardiac output is the dominant hypotensive mechanism because systemic vascular resistance always remains above baseline levels.
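The hemodynamic reasoning in this abstract rests on two standard textbook relations, sketched here as a reading aid rather than as the authors' own formulation. The symbols MAP (mean arterial pressure), CO (cardiac output), SVR (systemic vascular resistance), HR (heart rate) and SV (stroke volume) are labels chosen here, and central venous pressure is assumed to be negligible:

```latex
\begin{align}
\mathrm{MAP} &\approx \mathrm{CO} \times \mathrm{SVR}, \\
\mathrm{CO}  &= \mathrm{HR} \times \mathrm{SV}.
\end{align}
```

Read this way, if SVR stays at or above baseline, a fall in MAP can only come from a fall in CO, and at a given heart rate a fall in CO reflects a fall in SV.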