mAgic-FPU is the architecture of a family of VLIW cores for configurable system-level integration of floating- and fixed-point computing power. mAgic customization permits the designer to tune basic parameters, such as the computing power/memory access ratio of the core processor, the number of available arithmetic operations per cycle, the register file size and number of ports, and the number of arithmetic operators. The reconfiguration (e.g., of register file size and number of ports, as well as of the number of arithmetic operators) is supported by the software environment MADE (Modular VLIW processor Architecture and Assembler Description Environment). MADE reads an architecture description file and produces a customized assembler-scheduler for the target VLIW architecture, configuring a general-purpose VLIW optimizer-scheduler engine. The mAgic-FPU core architecture satisfies the requirement of portability among silicon foundries. The first members of the mAgic-FPU core family fit the requirements of ‘Smart Antenna for Adaptive Beam-Forming processing’ and ‘Physical Sound Synthesis’. The first 1 GigaFlops mAgic core will run at 100 MHz within an area of 40 mm² in 0.25 μm ATMEL CMOS technology in the first half of 2002.
The deployment of the next generation of computing platforms at ExaFlops scale requires solving new technological challenges, mainly related to the impressive number (up to 10^6) of compute elements required. This impacts system power consumption, in terms of feasibility and costs, as well as system scalability and computing efficiency. In this perspective, the analysis, exploration and evaluation of technologies characterized by low power, high efficiency and a high degree of customization are strongly needed. Among the various European initiatives targeting the design of ExaFlops systems, ExaNeSt and EuroExa are EU-H2020 funded initiatives leveraging high-end MPSoC FPGAs. Last-generation MPSoC FPGAs can be seen as non-mainstream but powerful HPC Exascale enabling components thanks to the integration of embedded multi-core, ARM-based low-power CPUs and a huge number of hardware resources usable to co-design application-oriented accelerators and to develop a low-latency, high-bandwidth network architecture. In this paper we introduce ExaNet, the FPGA-based, scalable, direct network architecture of the ExaNeSt system. ExaNet allows us to explore different interconnection topologies, to evaluate advanced routing functions for congestion control and fault tolerance, and to design specific hardware components for the acceleration of collective operations. After a brief introduction to the motivations and goals of the ExaNeSt and EuroExa projects, we report on the status of the network architecture design and its hardware/software testbed, adding preliminary bandwidth and latency results.
Efficient brain simulation is a scientific grand challenge, a parallel/distributed coding challenge and a source of requirements and suggestions for future computing architectures. Indeed, the human brain includes about 10^15 synapses and 10^11 neurons activated at a mean rate of several Hz. Full brain simulation poses Exascale challenges even if simulated at the highest abstraction level. The WaveScalES experiment in the Human Brain Project (HBP) has the goal of matching experimental measures and simulations of slow waves during deep sleep and anesthesia and the transition to other brain states. The focus is the development of dedicated large-scale parallel/distributed simulation technologies. The ExaNeSt project designs an ARM-based, low-power HPC architecture scalable to millions of cores, developing a dedicated scalable interconnect system, and SWA/AW simulations are included among the driving benchmarks. At the junction of the two projects is the INFN proprietary Distributed and Plastic Spiking Neural Networks (DPSNN) simulation engine. DPSNN can be configured to stress either the networking or the computation features available on the execution platforms. The simulation stresses the networking component when the neural net, composed of a relatively low number of neurons, each one projecting thousands of synapses, is distributed over a large number of hardware cores. As the number of neurons per core grows, computation becomes the dominant component for short-range connections. This paper reports preliminary performance results obtained on an ARM-based HPC prototype developed in the framework of the ExaNeSt project. Furthermore, a comparison is given of instantaneous power, total energy consumption, execution time and energetic cost per synaptic event of SWA/AW DPSNN simulations when executed on either ARM- or Intel-based server platforms.
Recent experimental neuroscience studies are pointing out the role of long-range intra-areal connectivity that can be modeled by a distance-dependent exponential decay of the synaptic probability distribution. This short report provides a preliminary measure of the impact of exponentially decaying lateral connectivity compared to that of shorter-range Gaussian decays on the scaling behaviour and memory occupation of a distributed spiking neural network simulator (DPSNN). Two-dimensional grids of cortical columns composed of point-like spiking neurons have been connected by up to 30 billion synapses using exponential and Gaussian connectivity models. Up to 1024 hardware cores, hosted on a 64-node server platform, executed the MPI processes composing the distributed simulator. The hardware platform was a cluster of IBM NX360 M5 16-core compute nodes, each one containing two Intel Xeon Haswell 8-core E5-2630 v3 processors, with a clock of 2.40 GHz, interconnected through an InfiniBand network. This study is conducted in the framework of the CORTICONIC FET project, also in view of the next-to-start activities foreseen as part of the Human Brain Project (HBP), SubProject 3 Cognitive and Systems Neuroscience, WaveScalES work-package.
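As an illustration of the two connectivity models compared in this report, the sketch below evaluates an exponential and a Gaussian distance-dependent connection probability. The function names, the peak probability `amplitude`, the decay length `lam` and the width `sigma` are hypothetical values chosen only for illustration; they are not the parameters or the code used by DPSNN.

```python
import math


def p_exponential(r, amplitude=0.05, lam=1.0):
    """Connection probability decaying exponentially with distance r.

    amplitude and lam (decay length, in column spacings) are
    illustrative assumptions, not values from the report.
    """
    return amplitude * math.exp(-r / lam)


def p_gaussian(r, amplitude=0.05, sigma=1.0):
    """Connection probability with a Gaussian decay over distance r.

    amplitude and sigma (width, in column spacings) are
    illustrative assumptions, not values from the report.
    """
    return amplitude * math.exp(-(r * r) / (2.0 * sigma * sigma))


def tail_ratio(r, lam=1.0, sigma=1.0):
    """Ratio of exponential to Gaussian probability at distance r."""
    return p_exponential(r, lam=lam) / p_gaussian(r, sigma=sigma)


if __name__ == "__main__":
    # With equal amplitude, the exponential kernel dominates at long range.
    for r in (0.0, 1.0, 2.0, 4.0, 8.0):
        print(f"r={r:4.1f}  exp={p_exponential(r):.3e}  "
              f"gauss={p_gaussian(r):.3e}  ratio={tail_ratio(r):.3e}")
```

With these placeholder parameters the two kernels cross at r = 2 and the exponential tail is larger beyond that distance. A longer tail means each column projects to more remote columns, which is one plausible reason such connectivity could weigh on communication and memory in a distributed simulator.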
This short report describes the scaling, up to 1024 software processes and hardware cores, of a distributed simulator of plastic spiking neural networks. A previous report demonstrated good scalability of the simulator up to 128 processes. Herein we extend the speed-up measurements and the strong and weak scaling analysis of the simulator to the range between 1 and 1024 software processes and hardware cores. We simulated two-dimensional grids of cortical columns including up to ~20G synapses connecting ~11M neurons. The neural network was distributed over a set of MPI processes and the simulations were run on a server platform composed of up to 64 dual-socket nodes, each socket equipped with an Intel Haswell E5-2630 v3 processor (8 cores @ 2.4 GHz clock). All nodes are interconnected through an InfiniBand network. The DPSNN simulator has been developed by INFN in the framework of the EURETILE and CORTICONIC European FET Projects and will be used by the WaveScalES team in the framework of the Human Brain Project (HBP), SubProject 2 - Cognitive and Systems Neuroscience. This report lays the groundwork for a more thorough comparison with the neural simulation tool NEST.
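The speed-up, strong scaling and weak scaling quantities mentioned above follow standard definitions, sketched below. The function names and the timings used in the demonstration are hypothetical placeholders, not measurements from the report.

```python
def speedup(t_ref, t_n):
    """Speed-up of a run taking t_n seconds relative to a reference run."""
    return t_ref / t_n


def strong_scaling_efficiency(t_ref, n_ref, t_n, n):
    """Efficiency at fixed total problem size.

    Ideal behaviour: execution time shrinks as n_ref / n, giving 1.0.
    """
    return (t_ref * n_ref) / (t_n * n)


def weak_scaling_efficiency(t_ref, t_n):
    """Efficiency at fixed problem size per process.

    Ideal behaviour: execution time stays constant, giving 1.0.
    """
    return t_ref / t_n


if __name__ == "__main__":
    # Placeholder timings in seconds, for illustration only.
    t_1, t_8 = 100.0, 14.0
    print("speed-up on 8 processes:", speedup(t_1, t_8))
    print("strong-scaling efficiency:",
          strong_scaling_efficiency(t_1, 1, t_8, 8))
    print("weak-scaling efficiency:", weak_scaling_efficiency(10.0, 12.5))
```

In a strong scaling test the network size is held constant while the number of processes grows; in a weak scaling test the network grows with the number of processes so that the load per process stays the same.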
This short note compares the instantaneous power, total energy consumption, execution time and energetic cost per synaptic event of a spiking neural network simulator (DPSNN-STDP) distributed on MPI processes when executed either on an embedded platform (based on a dual-socket quad-core ARM platform) or on a server platform (an Intel-based quad-core dual-socket platform). We also compare the measurements with those reported by leading custom and semi-custom designs: TrueNorth and SpiNNaker. In summary, we observed that: 1- we spent 2.2 micro-Joule per simulated event on the "embedded platform", approx. 4.4 times lower than what was spent by the "server platform"; 2- the instantaneous power consumption of the "embedded platform" was 14.4 times lower than that of the "server" one; 3- the server platform is a factor 3.3 faster. The "embedded platform" is made of NVIDIA Jetson TK1 boards, interconnected by Ethernet, each mounting a Tegra K1 chip including a quad-core ARM Cortex-A15 at 2.3 GHz. The "server platform" is based on dual-socket quad-core Intel Xeon CPUs (E5620 at 2.4 GHz). The measures were obtained with the DPSNN-STDP simulator (Distributed Simulator of Polychronous Spiking Neural Network with synaptic Spike Timing Dependent Plasticity) developed by INFN, which has already proved its efficient scalability and execution speed-up on hundreds of similar "server" cores and MPI processes, applied to neural nets composed of several billions of synapses.
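The three ratios quoted above can be cross-checked with a short calculation, since energy is power multiplied by execution time. The sketch below assumes that both platforms simulate the same number of synaptic events and reads the power figure as the server drawing 14.4 times the power of the embedded platform; the variable and function names are illustrative.

```python
# Figures quoted in the note.
POWER_RATIO = 14.4           # server power / embedded power
SPEED_RATIO = 3.3            # embedded execution time / server execution time
EMBEDDED_UJ_PER_EVENT = 2.2  # micro-Joule per synaptic event, embedded
REPORTED_ENERGY_RATIO = 4.4  # server energy / embedded energy, as reported


def energy_ratio(power_ratio, speed_ratio):
    """Server-to-embedded energy ratio implied by power and time ratios.

    E = P * T, so E_server / E_embedded
      = (P_server / P_embedded) * (T_server / T_embedded)
      = power_ratio / speed_ratio.
    """
    return power_ratio / speed_ratio


def server_uj_per_event(embedded_uj, ratio):
    """Energy per event on the server implied by the embedded figure."""
    return embedded_uj * ratio


if __name__ == "__main__":
    implied = energy_ratio(POWER_RATIO, SPEED_RATIO)
    print(f"implied energy ratio: {implied:.2f} "
          f"(reported: {REPORTED_ENERGY_RATIO})")
    print(f"implied server cost: "
          f"{server_uj_per_event(EMBEDDED_UJ_PER_EVENT, REPORTED_ENERGY_RATIO):.2f}"
          " micro-Joule per event")
```

Under these assumptions the implied ratio is about 4.36, which agrees with the reported factor of approximately 4.4, and the server cost comes out at roughly 9.7 micro-Joule per synaptic event.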
We introduce a natively distributed mini-application benchmark representative of plastic spiking neural network simulators. It can be used to measure the performance of existing computing platforms and to drive the development of future parallel/distributed computing systems dedicated to the simulation of plastic spiking networks. The mini-application is designed to generate spiking behaviors and synaptic connectivity that do not change when the number of hardware processing nodes is varied, simplifying the quantitative study of scalability on commodity and custom architectures. Here, we present the strong and weak scaling and the profiling of the computational/communication components of the DPSNN-STDP benchmark (Distributed Simulation of Polychronous Spiking Neural Network with synaptic Spike-Timing Dependent Plasticity). In this first test, we used the benchmark to exercise a small-scale cluster of commodity processors (varying the number of used physical cores from 1 to 128). The cluster was interconnected through a commodity network. Bidimensional grids of columns composed of Izhikevich neurons projected synapses locally and toward first, second and third neighboring columns. The size of the simulated network varied from 6.6 Giga synapses down to 200 K synapses. The code proved to be fast and scalable: 10 wall clock seconds were required to simulate one second of activity and plasticity (per Hertz of average firing rate) of a network composed of 3.2 G synapses running on 128 hardware cores clocked @ 2.4 GHz. The mini-application has been designed to be easily interfaced with standard and custom software and hardware communication interfaces. It has been designed from its foundation to be natively distributed and parallel, and should not pose major obstacles to distribution and parallelization on several platforms.
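The headline figure above can be turned into a cost per synaptic event. The sketch below rests on two assumptions that are not stated in the abstract: that "3.2 G" means 3.2e9 synapses, and that at an average firing rate of 1 Hz each synapse transmits one event per simulated second. The function names are illustrative.

```python
def wall_seconds_per_event(wall_s_per_sim_s_per_hz, n_synapses):
    """Aggregate wall-clock time per synaptic event.

    Assumes one event per synapse per simulated second per Hz of
    average firing rate (an assumption, not a statement from the text).
    """
    return wall_s_per_sim_s_per_hz / n_synapses


def core_seconds_per_event(wall_s_per_sim_s_per_hz, n_synapses, n_cores):
    """Core time spent per synaptic event, summed over all cores."""
    return wall_seconds_per_event(wall_s_per_sim_s_per_hz, n_synapses) * n_cores


if __name__ == "__main__":
    # 10 wall-clock s per simulated s per Hz, 3.2e9 synapses, 128 cores.
    aggregate = wall_seconds_per_event(10.0, 3.2e9)
    per_core = core_seconds_per_event(10.0, 3.2e9, 128)
    print(f"aggregate: {aggregate * 1e9:.3f} ns per synaptic event")
    print(f"per core:  {per_core * 1e9:.1f} ns of core time per synaptic event")
```

Under these assumptions the cluster as a whole processes one synaptic event about every 3.1 ns, which corresponds to roughly 400 ns of single-core time per event at 2.4 GHz.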
The EURETILE project required the selection and coding of a set of dedicated benchmarks. The project is about the software and hardware architecture of future many-tile distributed fault-tolerant systems. We focus on dynamic workloads characterised by heavy numerical processing requirements. The ambition is to identify common techniques that could be applied to both the Embedded Systems and HPC domains. This document is the first public deliverable of Work Package 7: Challenging Tiled Applications.
Tivantinib (ARQ 197), a selective, oral MET inhibitor, improved overall survival and progression-free survival compared with placebo in a randomised phase 2 study in patients with high MET expression (MET-high) hepatocellular carcinoma previously treated with sorafenib. The aim of this phase 3 study was to confirm the results of the phase 2 trial.
We did a phase 3, randomised, double-blind, placebo-controlled study in 90 centres in Australia, the Americas, Europe, and New Zealand. Eligible patients were 18 years or older and had unresectable, histologically confirmed, hepatocellular carcinoma, an Eastern Cooperative Oncology Group performance status of 0–1, high MET expression (MET-high; staining intensity score ≥2 in ≥50% of tumour cells), Child-Pugh A cirrhosis, and radiographically-confirmed disease progression after receiving sorafenib-containing systemic therapy. We randomly assigned patients (2:1) in block sizes of three using a computer-generated randomisation sequence to receive oral tivantinib (120 mg twice daily) or placebo (twice daily); patients were stratified by vascular invasion, extrahepatic spread, and α-fetoprotein concentrations (≤200 ng/mL or >200 ng/mL). The primary endpoint was overall survival in the intention-to-treat population. Efficacy analyses were by intention to treat and safety analyses were done in all patients who received any amount of study drug. This study is registered with ClinicalTrials.gov, number NCT01755767.
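For readers unfamiliar with the allocation scheme, the sketch below illustrates 2:1 permuted-block randomisation with a block size of three. It is a simplified illustration only: the function name and the seed are hypothetical, stratification is omitted, and it does not reproduce the randomisation system used in the trial.

```python
import random


def block_randomise(n_patients, seed=None):
    """Assign patients 2:1 (tivantinib:placebo) in permuted blocks of three.

    Each block holds two tivantinib slots and one placebo slot in random
    order. Stratification, which the trial applied, is omitted here.
    """
    rng = random.Random(seed)  # seed is a hypothetical parameter
    assignments = []
    while len(assignments) < n_patients:
        block = ["tivantinib", "tivantinib", "placebo"]
        rng.shuffle(block)
        assignments.extend(block)
    # Drop unused slots from a final, partially filled block.
    return assignments[:n_patients]


if __name__ == "__main__":
    arms = block_randomise(12, seed=0)
    print(arms)
    print("tivantinib:", arms.count("tivantinib"),
          "placebo:", arms.count("placebo"))
```

Within every complete block the allocation is exactly 2:1, so the arms stay close to the target ratio throughout enrolment. With stratified randomisation, incomplete blocks in individual strata can leave the overall totals slightly off an exact 2:1 split.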
Between Dec 27, 2012, and Dec 10, 2015, 340 patients were randomly assigned to receive tivantinib (n=226) or placebo (n=114). At a median follow-up of 18·1 months (IQR 14·1–23·1), median overall survival was 8·4 months (95% CI 6·8–10·0) in the tivantinib group and 9·1 months (7·3–10·4) in the placebo group (hazard ratio 0·97; 95% CI 0·75–1·25; p=0·81). Grade 3 or worse treatment-emergent adverse events occurred in 125 (56%) of 225 patients in the tivantinib group and in 63 (55%) of 114 patients in the placebo group, with the most common being ascites (16 [7%] patients), anaemia (11 [5%] patients), abdominal pain (nine [4%] patients), and neutropenia (nine [4%] patients) in the tivantinib group. 50 (22%) of 226 patients in the tivantinib group and 18 (16%) of 114 patients in the placebo group died within 30 days of the last dose of study medication, and general deterioration (eight [4%] patients) and hepatic failure (four [2%] patients) were the most common causes of death in the tivantinib group. Three (1%) of 225 patients in the tivantinib group died from a treatment-related adverse event (one sepsis, one anaemia and acute renal failure, and one acute coronary syndrome).
Tivantinib did not improve overall survival compared with placebo in patients with MET-high advanced hepatocellular carcinoma previously treated with sorafenib. Although this METIV-HCC trial was negative, the study shows the feasibility of doing integral tissue biomarker studies in patients with advanced hepatocellular carcinoma. Additional randomised studies are needed to establish whether MET inhibition could be a potential therapy for some subsets of patients with advanced hepatocellular carcinoma.
ArQule Inc and Daiichi Sankyo (Daiichi Sankyo Group).
Extracellular nicotinamide phosphoribosyltransferase (eNAMPT) is increased in inflammatory bowel disease (IBD) patients, and its serum levels correlate with a worse prognosis. In the present manuscript, we show that eNAMPT serum levels are increased in IBD patients that fail to respond to anti-TNFα therapy (infliximab or adalimumab) and that its levels drop in patients that are responsive to these therapies, with values comparable with healthy subjects. Furthermore, eNAMPT administration in dinitrobenzene sulfonic acid (DNBS)-treated mice exacerbates the symptoms of colitis, suggesting a causative role of this protein in IBD. To determine the druggability of this cytokine, we developed a novel monoclonal antibody (C269) that neutralizes in vitro the cytokine-like action of eNAMPT and that reduces its serum levels in rodents. Of note, this newly generated antibody is able to significantly reduce acute and chronic colitis in both DNBS- and dextran sulfate sodium (DSS)-induced colitis. Importantly, C269 ameliorates the symptoms by reducing pro-inflammatory cytokines. Specifically, in the lamina propria, a reduced number of inflammatory monocytes, neutrophils, Th1, and cytotoxic T lymphocytes are found upon C269 treatment. Our data demonstrate that eNAMPT participates in IBD and, more importantly, that eNAMPT-neutralizing antibodies are endowed with a therapeutic potential in IBD.
Key messages
What are the new findings?
Higher serum eNAMPT levels in IBD patients might decrease response to anti-TNF therapy.
The cytokine-like activity of eNAMPT may be neutralized with a monoclonal antibody.
Neutralization of eNAMPT ameliorates acute and chronic experimental colitis.
Neutralization of eNAMPT limits the expression of IBD inflammatory signature.
Neutralization of eNAMPT impairs immune cell infiltration in lamina propria.