TRACO: An Automatic Loop Nest Parallelizer for Numerical Applications

Palkowski, Marek; Klimek, Tomasz; Bielecki, Włodzimierz

doi:10.15439/2015F34

Artykuł - szczegóły

Czasopismo

Annals of Computer Science and Information Systems

2015 | 5 | 681--686

Tytuł artykułu

TRACO: An Automatic Loop Nest Parallelizer for Numerical Applications

Autorzy

Marek Palkowski , Tomasz Klimek , Włodzimierz Bielecki

Warianty tytułu

Języki publikacji

Abstrakty

We present the source-to-source TRACO compiler allowing for increasing program locality and parallelizing arbitrarily nested loop sequences in numerical applications. Algorithms for generation of tiled code and extracting synchronization-free slices composed of tiles are presented. Parallelism of arbitrary nested loops is obtained by creating a kernel of computations represented in the OpenMP standard to be executed independently on many CPUs. We consider benchmarks, typical from compute-intensive sequences of algebra operations or numerical computation from industry and engineering. The speed-up of programs generated by TRACO are discussed. Related compilers and techniques are considered. Future work is outlined.(original abstract)

Słowa kluczowe

Numeric algorithms

Algorytmy numeryczne

Czasopismo

Annals of Computer Science and Information Systems

Rocznik

2015

Tom

Strony

681--686

Opis fizyczny

Twórcy

autor

Marek Palkowski

West Pomeranian University of Technology in Szczecin, Poland

autor

Tomasz Klimek

West Pomeranian University of Technology in Szczecin, Poland

autor

Włodzimierz Bielecki

West Pomeranian University of Technology in Szczecin, Poland

Bibliografia

A. Beletska, W. Bielecki, A. Cohen, M. Palkowski, K. Siedlecki, "Coarse-grained loop parallelization: Iteration space slicing vs affine transformations". Parallel Computing, vol. 37, pp. 479-497, 2011.
The TRACO Compiler, http://traco.sourceforge.net, 2015.
OpenMP Specification, version 3.1, http://www.openmp.org, 2014.
W. Pugh, E. Rosser, "Iteration space slicing and its application to communication optimization". In International Conference on Supercomputing, pp. 22-228, 1997.
W. Pugh, D. Wonnacott, "An exact method for analysis of value-based array data dependences". In Sixth Annual Workshop on Programming Languages and Compilers for Parallel Computing, Springer-Verlag, 1993.
W. Kelly et. al., "New User Interface for Petit and Other Extensions", User Guide, 1996.
W. Kelly et. al., "The omega library interface guide". Technical report, College Park, MD, USA, 1995.
W. Kelly, W. Pugh, E. Rosser, T. Shpeisman, "Transitive closure of infinite graphs and its applications", Int. J. Parallel Programming, vol. 24 (6), pp. 579-598, 1996.
S. Verdoolaege, "Integer Set Library - Manual", isl.gforge.inria.fr/ manual.pdf, 2015.
C. Bastoul, "Code Generation in the Polyhedral Model Is Easier Than You Think, PACT'13 IEEE International Conference on Parallel Architecture and Compilation Techniques", Juan-les-Pins, France, pp. 7-16, 2004.
W. Bielecki, M. Palkowski, T. Klimek, "Free scheduling for statement instances of parameterized arbitrarily nested affine loops", Parallel Computing, vol. 38 (9), pp. 518-532, 2012.
A. Darte, Y. Robert, F. Vivien, "Scheduling and Automatic Parallelization", Birkhauser, 2000.
W. Bielecki, M. Palkowski, "Perfectly nested loop tiling transformations based on the transitive closure of the program dependence graph", Soft Computing in Computer and Information Science Advances in Intelligent Systems and Computing, vol. 342, pp. 309-320, 2015.
W. Bielecki, T. Klimek, M. Palkowski, A. Beletska, "An Iterative Algorithm of Computing the Transitive Closure of a Union of Parameterized Affine Integer Tuple Relations", COCOA 2010: Fourth International Conference on Combinatorial Optimization and Applications, Lecture Notes in Computer Science, vol. 6508, pp. 104-113, 2010.
T. Peters, "Livermore Loops coded in C", Kendall Square Res. Corp., http://www.netlib.org/benchmark/livermorec, 1992.
U. Bondhugula, et al., "A practical automatic polyhedral parallelizer and locality optimizer", SIGPLAN Not., vol. 43 (6), pp. 101-113, urlhttp://pluto-compiler.sourceforge.net, 2008.
PoCC the Polyhedral Compiler Collection, pocc.sourceforge.net, 2014.
P. Feautrier, "Some efficient solutions to the affine scheduling problem: I. One-dimensional time", Int. J. Parallel Program., Kluwer Academic Publishers, vol. 21, pp. 313-348, 1992.
P. Feautrier, "Some efficient solutions to the affine scheduling problem. Part II. Multidimensional time", International Journal of Parallel Programming, vol. 21, pp. 389-420, 1992.
J. Ramanujam, P. Sadayappan, "Tiling Multidimensional Iteration Spaces for Multicomputers", Journal of Parallel and Distributed Computing, Volume 16, Issue 2, pp. 108-120, 1992.
A. W. Lim, M. S. Lam, "Communication-free parallelization via affine transformations 24th ACM Symp. on Principles of Programming Languages, Springer-Verlag, pp. 392-106, 1994.
U. Bondhugula, et. al., "Automatic Transformations for Communication- Minimized Parallelization and Locality Optimization in the Polyhedral Model Compiler Constructure", In Proceedings of the CC'08/ETAPS'08, Springer, pp. 132-146, 2008.
M. W. Benabderrahmane, L. N. Pouchet, A. Cohen, C. Bastoul, "The polyhedral model is more widely applicable than you think". Proceedings of the 19th joint European conference on Theory and Practice of Software, International Conference on Compiler Construction, Springer- Verlag, pp. 283-303, 2010.
A. Lim, G. I. Cheong, M. S. Lam, "An Affine Partitioning Algorithm to Maximize Parallelism and Minimize Communication", In Proceedings of the 13th ACM SIGARCH International Conference on Supercomputing, ACM Press, pp. 228-23, 1999.
L. J. Wicker and R. B. Wilhelmson, "Simulation and analysis of tornado development and decay within a three-dimensional supercell thunderstorm". J. Atmos. Sci., vol. 52, pp. 2675-2703, 1995.
The Tensor Contraction Engine, http://www.csc.lsu.edu/~gb/TCE/, 2014.
The Polyhedral Benchmark suite, http://www.cse.ohiostate. edu/ pouchet/software/polybench/, 2014.

Typ dokumentu

Bibliografia

Identyfikatory

DOI

10.15439/2015F34

Identyfikator YADDA

bwmeta1.element.ekon-element-000171424106

Komentarze

Musisz być zalogowany aby pisać komentarze.

Annals of Computer Science and Information Systems

TRACO: An Automatic Loop Nest Parallelizer for Numerical Applications

Zgłoszenie zostało wysłane

Zgłoszenie zostało wysłane