Using hardware-transactional-memory support to implement speculative task execution

Peer reviewed

Salamanca, Juan; Baldassin, Alexandro

Journal of Parallel and Distributed Computing, October 2024, Volume 192

Journal Article

Loops account for most of the execution time of computer programs, so optimizing them to run as fast as possible is an ongoing effort. This task is far from trivial; it remains an open area of research, since many irregular loops are hard to parallelize. Generally, these loops have loop-carried (DOACROSS) dependencies, and whether a dependency appears can depend on the context. Many techniques have been studied to parallelize these loops efficiently; however, the OpenMP standard, for example, offers no efficient way to parallelize them. This article presents Speculative Task Execution (STE), a technique that enables the speculative execution of OpenMP tasks to accelerate certain hot-code regions (such as loops) marked by OpenMP directives. It also presents a detailed analysis of the application of Hardware Transactional Memory (HTM) support for executing tasks speculatively and describes a careful evaluation of the implementation of STE using HTM on modern machines. In particular, we consider the scenario in which speculative tasks are generated by the OpenMP taskloop construct (Speculative Taskloop, STL). As a result, the article provides evidence to support several important claims about the performance of STE over HTM in modern processor architectures. Experimental results reveal that: (a) by implementing STL on top of HTM for hot-code regions, speed-ups of up to 5.39× can be obtained on IBM POWER8 and of up to 2.41× on Intel processors using 4 cores; and (b) STL-ROT, a variant of STL using rollback-only transactions (ROTs), achieves speed-ups of up to 17.70× on an IBM POWER9 processor using 20 cores.
• Proposal of STE/STL to parallelize loops using HTM’s speculative-execution support.
• Modification of the LLVM OpenMP Runtime Library to support monotonic scheduling.
• In-depth evaluation of STL on an Intel processor with TSX-NI support.
• Comparison of the performance of STL and FOR-TLS using an IBM POWER8 processor.
• Assessment of the performance of Speculative Taskloop (with the optimized OpenMP Runtime Library) and FOR-TLS using an IBM POWER8 processor. The results show the potential of the technique when aborts due to order-inversion are mitigated.
• Proposal and evaluation of STL-ROT by comparing it to STL using IBM POWER machines.
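
The general idea behind speculating on a loop with possible loop-carried dependencies can be illustrated with a small software-only simulation. The sketch below is not the authors' implementation and uses neither HTM nor OpenMP: it assumes that each chunk of iterations runs like a transaction against a snapshot of memory (reads tracked, writes buffered), that chunks commit strictly in iteration order, and that a chunk whose reads were overwritten by an earlier chunk is aborted and re-executed. All names (`execute_chunk`, `run_speculative`, `make_bump`) are hypothetical, and the simulation is sequential, so it says nothing about performance.

```python
def execute_chunk(body, iters, snapshot):
    """Run one chunk like a transaction: track reads, buffer writes."""
    reads, writes = set(), {}

    def load(k):
        if k in writes:            # reading our own buffered write
            return writes[k]
        reads.add(k)               # value comes from the shared snapshot
        return snapshot[k]

    def store(k, v):
        writes[k] = v              # buffered until the chunk commits

    for i in iters:
        body(i, load, store)
    return reads, writes


def run_speculative(body, memory, n_iters, chunk, workers):
    """Simulate speculative chunks with in-order commit; return abort count."""
    chunks = [range(s, min(s + chunk, n_iters))
              for s in range(0, n_iters, chunk)]
    next_chunk, aborts = 0, 0
    while next_chunk < len(chunks):
        batch = chunks[next_chunk:next_chunk + workers]
        snapshot = list(memory)    # state all chunks of this round start from
        results = [execute_chunk(body, iters, snapshot) for iters in batch]
        dirty, committed = set(), 0
        for reads, writes in results:      # commit in iteration order
            if reads & dirty:              # stale read: abort this chunk
                break                      # and every later one in the round
            for k, v in writes.items():
                memory[k] = v
            dirty.update(writes)
            committed += 1
        aborts += len(batch) - committed   # these chunks are re-executed
        next_chunk += committed            # oldest chunk always commits
    return aborts


def make_bump(idx):
    """Irregular loop body: a[idx[i]] += 1 (dependencies depend on idx)."""
    def body(i, load, store):
        store(idx[i], load(idx[i]) + 1)
    return body


if __name__ == "__main__":
    idx = [0, 5, 2, 5, 7, 1, 1, 3]
    mem = [0] * 8
    aborts = run_speculative(make_bump(idx), mem, len(idx), chunk=2, workers=4)
    print(mem, "aborted chunk executions:", aborts)
```

Whether chunks conflict depends only on the contents of `idx`, which mirrors the abstract's point that dependencies in irregular loops are context dependent: with distinct indices no chunk aborts, while repeated indices across chunks force re-execution.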
