ECS289K (Fall 2003) References
- General reference books
- Sourcebook of Parallel Computing, J. Dongarra et al., editors,
Morgan Kaufmann Publishers, ISBN 1-55860-871-0, 2003
- Uniprocessor performance optimization and benchmarking
- David Parello and Olivier Temam and Jean-Marie Verdun,
On increasing architecture awareness in program
optimizations to bridge the gap between peak and sustained
processor performance: matrix multiply revisited,
Supercomputing 2002.
- R. Clint Whaley and Antoine Petitet and Jack Dongarra,
Automated Empirical Optimizations of Software and the
ATLAS Project, Parallel Computing, vol. 27, pp. 3-25, 2001.
- Richard Vuduc, James W. Demmel, Katherine A. Yelick,
Shoaib Kamil, Rajesh Nishtala, Benjamin Lee,
Performance Optimizations and Bounds for
Sparse Matrix-Vector Multiply, Supercomputing 2002.
- Richard Vuduc, Attila Gyulassy, James W. Demmel,
Katherine A. Yelick,
Memory Hierarchy Optimizations and
Performance Bounds for Sparse A^T*A*x,
ICCS 2003.
- Matteo Frigo and Steven G. Johnson,
FFTW: An Adaptive Software Architecture for the FFT,
1998.
- ATLAS: Automatically Tuned Linear Algebra Software
- BEBOP: Berkeley Benchmarking and Optimization Group
- Cluster programming
- Parallel dense matrix computing
- R. van de Geijn and J. Watts, SUMMA: Scalable Universal Matrix
Multiplication Algorithms, LAPACK Working Notes #96
- E. Elmroth et al., Recursive blocked algorithms and hybrid data
structures for dense matrix library software
- ScaLAPACK, a subset of LAPACK routines redesigned for
distributed memory MIMD parallel computers.
- Sparse matrix computing
Maintained by Zhaojun Bai, bai@cs.ucdavis.edu.