In the Python world, NumPy arrays are the standard representation for numerical data and enable efficient implementation of numerical computations in a high-level language. As this effort shows, ...NumPy performance can be improved through three techniques: vectorizing calculations, avoiding copying data in memory, and minimizing operation counts.
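The three techniques named in this abstract can be illustrated with a minimal sketch. The function names, the polynomial, and the sample data below are assumptions chosen for illustration and do not come from the article: the loop version is a slow baseline, while the vectorized version uses whole-array operations, in-place updates to avoid temporaries, and Horner's rule to reduce the operation count.

```python
import numpy as np

def poly_loop(x, coeffs):
    """Baseline: evaluate sum(c_k * x**k) one element at a time in pure Python."""
    out = np.empty_like(x)
    for i in range(x.size):
        total = 0.0
        for k, c in enumerate(coeffs):
            total += c * x[i] ** k
        out[i] = total
    return out

def poly_vectorized(x, coeffs):
    """Same polynomial with whole-array operations.

    Horner's rule replaces the explicit powers (fewer operations), and the
    in-place operators reuse one output buffer instead of allocating a new
    temporary array at every step.
    """
    out = np.full_like(x, coeffs[-1])
    for c in coeffs[-2::-1]:
        out *= x  # in place: no new array is allocated
        out += c
    return out

x = np.linspace(-1.0, 1.0, 1001)
coeffs = [1.0, -2.0, 0.5, 3.0]  # 1 - 2x + 0.5x^2 + 3x^3
assert np.allclose(poly_loop(x, coeffs), poly_vectorized(x, coeffs))

# Basic slicing returns a view on the same memory; integer-array indexing copies.
view = x[::2]
copied = x[[0, 2, 4]]
assert np.shares_memory(view, x)
assert not np.shares_memory(copied, x)
```

Whether a given operation returns a view or a copy can be checked with `np.shares_memory`, as in the last lines.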
Python offers basic facilities for interactive work and a comprehensive library on top of which more sophisticated systems can be built. The IPython project provides an enhanced interactive ...environment that includes, among other features, support for data visualization and facilities for distributed and parallel computation.
Python for Scientific Computing. Oliphant, Travis E.
Computing in Science & Engineering, 2007, Volume 9, Issue 3
Journal Article. Peer reviewed. Open access.
Python is an excellent "steering" language for scientific codes written in other languages. However, with additional basic tools, Python transforms into a high-level language suited for scientific ...and engineering code that's often fast enough to be immediately useful but also flexible enough to be sped up with additional extensions.
This open access volume explains the foundations of modern solvers for ordinary differential equations (ODEs). Formulating and solving ODEs is an essential part of mathematical modeling and ...computational science, and numerous solvers are available in commercial and open source software. However, no single ODE solver is the best choice for every single problem, and choosing the right solver requires fundamental insight into how the solvers work. This book will provide exactly that insight, to enable students and researchers to select the right solver for any ODE problem of interest, or implement their own solvers if needed. The presentation is compact and accessible, and focuses on the large and widely used class of solvers known as Runge-Kutta methods. Explicit and implicit methods are motivated and explained, as well as methods for error control and automatic time step selection, and all the solvers are implemented as a class hierarchy in Python.
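As a rough illustration of what a class hierarchy of Runge-Kutta solvers can look like, here is a minimal sketch. It is not the book's code: the class names, the `step`/`solve` interface, and the test problem are assumptions, and only fixed-step explicit methods are covered (no implicit methods, error control, or automatic step selection). Each subclass supplies only its Butcher tableau.

```python
import numpy as np

class ExplicitRungeKutta:
    """Fixed-step explicit Runge-Kutta method defined by a Butcher tableau (a, b, c)."""
    a = b = c = None  # supplied by subclasses

    def __init__(self, f):
        self.f = f  # right-hand side of u' = f(t, u)

    def step(self, t, u, dt):
        stages = []
        for i in range(len(self.b)):
            # Explicit method: stage i depends only on stages 0..i-1.
            increment = sum((self.a[i][j] * stages[j] for j in range(i)), 0.0)
            stages.append(np.asarray(self.f(t + self.c[i] * dt, u + dt * increment), dtype=float))
        return u + dt * sum(bi * ki for bi, ki in zip(self.b, stages))

    def solve(self, u0, t_span, n_steps):
        t0, t_end = t_span
        dt = (t_end - t0) / n_steps
        t = t0 + dt * np.arange(n_steps + 1)
        u = np.empty((n_steps + 1,) + np.shape(u0))
        u[0] = u0
        for n in range(n_steps):
            u[n + 1] = self.step(t[n], u[n], dt)
        return t, u

class ForwardEuler(ExplicitRungeKutta):
    a = [[0.0]]
    b = [1.0]
    c = [0.0]

class RK4(ExplicitRungeKutta):
    a = [[0.0, 0.0, 0.0, 0.0],
         [0.5, 0.0, 0.0, 0.0],
         [0.0, 0.5, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0]]
    b = [1 / 6, 1 / 3, 1 / 3, 1 / 6]
    c = [0.0, 0.5, 0.5, 1.0]

# Test problem (an assumption): u' = -u, u(0) = 1, exact solution exp(-t).
decay = lambda t, u: -u
_, u_euler = ForwardEuler(decay).solve(1.0, (0.0, 1.0), 20)
_, u_rk4 = RK4(decay).solve(1.0, (0.0, 1.0), 20)
assert abs(u_rk4[-1] - np.exp(-1.0)) < abs(u_euler[-1] - np.exp(-1.0))
```

Keeping the stepping logic in the base class means a new explicit method needs nothing beyond its tableau coefficients.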
A Theory of Scientific Programming Efficacy. Pertseva, Elizaveta; Chang, Melinda; Zaman, Ulia ...
2024 IEEE/ACM 46th International Conference on Software Engineering (ICSE), 14 April 2024
Conference Proceeding. Open access.
Scientists write and maintain software artifacts to construct, validate, and apply scientific theories. Despite the centrality of software in their work, their practices differ significantly from ...those of professional software engineers. We sought to understand what makes scientists effective at their work and how software engineering practices and tools can be adapted to fit their workflows. We interviewed 25 scientists and support staff to understand their work. Then, we constructed a theory that relates six factors that contribute to their efficacy in creating and maintaining software systems. We present the theory in the form of a cycle of scientific computing efficacy and identify opportunities for improvement based on the six contributing factors.
• We propose the Sliced Coordinate Format (SCOO) for Sparse Matrix–Vector Multiplication on GPUs.
• An associated CUDA implementation which takes advantage of atomic operations is presented.
• We propose ...partitioning methods to transform a given sparse matrix into SCOO format.
• An efficient Dual-GPU implementation which overlaps computation and communication is described.
• Extensive performance comparisons of SCOO compared to other formats on GPUs and CPUs are provided.
Existing formats for Sparse Matrix–Vector Multiplication (SpMV) on the GPU outperform their corresponding implementations on multi-core CPUs. In this paper, we present a new format called Sliced COO (SCOO) and an efficient CUDA implementation to perform SpMV on the GPU using atomic operations. We compare SCOO performance to existing formats of the NVIDIA Cusp library using large sparse matrices. Our results for single-precision floating-point matrices show that SCOO outperforms the COO and CSR formats for all tested matrices and the HYB format for all tested unstructured matrices on a single GPU. Furthermore, our dual-GPU implementation achieves an efficiency of 94% on average. Due to the lower performance of existing CUDA-enabled GPUs for atomic operations on double-precision floating-point numbers, the SCOO implementation for double precision does not consistently outperform the other formats for every unstructured matrix. Overall, the average speedup of SCOO for the tested benchmark dataset is 3.33 (1.56) compared to CSR, 5.25 (2.42) compared to COO, and 2.39 (1.37) compared to HYB for single (double) precision on a Tesla C2075. Furthermore, a comparison with a Sandy Bridge CPU shows that SCOO on a Fermi GPU outperforms the multi-threaded CSR implementation of the Intel MKL Library on an i7-2700K by a factor between 5.5 (2.3) and 18 (12.7) for single (double) precision.
Source code is available at https://github.com/danghvu/cudaSpmv.
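To make the idea of coordinate-format SpMV with accumulating updates concrete, here is a CPU-only NumPy sketch. It is not the paper's CUDA implementation and does not reproduce its partitioning methods: the function names are hypothetical, a "slice" is assumed to be simply a contiguous block of rows, and `np.add.at` stands in for the atomic additions a GPU kernel would perform.

```python
import numpy as np

def coo_spmv(rows, cols, vals, x, n_rows):
    """y = A @ x for A stored as COO triplets (rows[i], cols[i], vals[i])."""
    y = np.zeros(n_rows, dtype=np.result_type(vals, x))
    # np.add.at accumulates correctly when a row index repeats,
    # playing the role of an atomic add in this sketch.
    np.add.at(y, rows, vals * x[cols])
    return y

def sliced_coo_spmv(rows, cols, vals, x, n_rows, rows_per_slice):
    """Same product, processed one block of rows (a 'slice') at a time.

    Each slice accumulates into a small local buffer, loosely mimicking a
    per-slice scratch area, and then writes its block of the result.
    """
    y = np.zeros(n_rows, dtype=np.result_type(vals, x))
    order = np.argsort(rows, kind="stable")  # group entries by row
    rows, cols, vals = rows[order], cols[order], vals[order]
    for start in range(0, n_rows, rows_per_slice):
        stop = min(start + rows_per_slice, n_rows)
        lo, hi = np.searchsorted(rows, [start, stop])  # entries with start <= row < stop
        local = np.zeros(stop - start, dtype=y.dtype)
        np.add.at(local, rows[lo:hi] - start, vals[lo:hi] * x[cols[lo:hi]])
        y[start:stop] = local
    return y

# Small 4x4 example (made-up data) checked against a dense product.
rows = np.array([3, 0, 1, 0, 2, 3])
cols = np.array([3, 0, 1, 2, 2, 0])
vals = np.array([6.0, 1.0, 3.0, 2.0, 4.0, 5.0])
x = np.array([1.0, 2.0, 3.0, 4.0])

dense = np.zeros((4, 4))
dense[rows, cols] = vals
assert np.allclose(coo_spmv(rows, cols, vals, x, 4), dense @ x)
assert np.allclose(sliced_coo_spmv(rows, cols, vals, x, 4, rows_per_slice=2), dense @ x)
```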
Pythran: Crossing the Python Frontier. Guelton, Serge
Computing in Science & Engineering, March/April 2018, Volume 20, Issue 2
Journal Article. Peer reviewed.
Use of the Python language in scientific computing has always been characterized by the coexistence of interpreted Python code and compiled native code, written in languages like C or Fortran. This ...column takes a fresh look at the problem and introduces Pythran, a new optimization tool designed to efficiently handle unmodified Python code.
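As a small illustration of the workflow, the sketch below shows a plain Python function annotated with a Pythran export comment. The function `rms` and its signature are assumptions made for this example and are not taken from the column. Because the annotation is an ordinary comment, the file still runs unchanged under the standard interpreter; saved as a module and passed to the `pythran` command-line tool, it would be compiled to a native extension module.

```python
#pythran export rms(float64[])
def rms(values):
    """Root mean square of a 1-D sequence of floats, written as a plain loop."""
    total = 0.0
    for v in values:
        total += v * v
    return (total / len(values)) ** 0.5
```

Called from the interpreter, `rms([3.0, 4.0])` returns the square root of 12.5; the compiled version is intended to be called the same way with a NumPy `float64` array.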
Intel Cilk Plus extends C and C++ to enable writing composable deterministic parallel software that can exploit both the thread and vector parallelism commonly available in modern hardware.
If you have been following developments in software engineering in recent years, you have probably noticed that the term DSL (domain-specific language) has become a minor buzzword in that field. You ...may have concluded that this is a hot new idea that is certainly not ready for application in real life. But, as I will show in this article, computational scientists (and others) have been using DSLs for decades. What is new is not DSLs per se, but the name and the attention given to them.