Parameterizing source code for architecture-bound optimization is a common approach to high-performance programming, but one that makes the programmer's task arduous and the resulting code difficult to maintain. Certain parameterizations, such as changing loop order, may require elaborate code instrumentation that distracts from the main objective. In this paper, we propose a templating and automatic code generation approach based on standard Python modules and the OPAL library for algorithm optimization. Advantages of our approach include its programmatic simplicity and the flexibility offered by the templating engine. We provide a complete example for matrix multiplication in which optimization takes place with respect to blocking, loop unrolling, and compiler flags.
Published June 2011, 14 pages
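
To make the templating idea in the abstract concrete, the following is a minimal sketch, not the paper's implementation. It assumes Python's standard `string.Template` as the templating engine and a naive C kernel. The names `KERNEL`, `generate_variants`, and `matmul_<order>` are hypothetical and chosen only for illustration. It generates one C source variant per loop order, which is the kind of parameterization that is awkward to express by hand.

```python
from itertools import permutations
from string import Template

# Hypothetical template for a naive C matrix multiply (C += A * B).
# The placeholders $outer, $middle and $inner select the loop order;
# the loop body is the same for every ordering.
KERNEL = Template("""\
void matmul_$name(int n, const double *A, const double *B, double *C)
{
    for (int $outer = 0; $outer < n; $outer++)
        for (int $middle = 0; $middle < n; $middle++)
            for (int $inner = 0; $inner < n; $inner++)
                C[i*n + j] += A[i*n + k] * B[k*n + j];
}
""")


def generate_variants():
    """Return a dict mapping a loop order (e.g. 'ikj') to C source text."""
    variants = {}
    for order in permutations("ijk"):
        name = "".join(order)
        variants[name] = KERNEL.substitute(
            name=name, outer=order[0], middle=order[1], inner=order[2]
        )
    return variants


if __name__ == "__main__":
    # Print all six generated kernels.
    for name, source in generate_variants().items():
        print(source)
```

Each generated variant could then be written to a file, compiled with a chosen set of flags, and timed, with an optimizer selecting among the candidates. That search step, and the OPAL interface that would drive it, are deliberately left out of this sketch.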