“Concurrent number cruncher - A GPU implementation of a general sparse linear solver”
Luc Buatois, Guillaume Caumon and Bruno Lévy
International Journal of Parallel, Emergent and Distributed Systems, to appear

Abstract: A wide class of numerical methods needs to solve a linear system in which the pattern of non-zero matrix coefficients can be arbitrary. These problems can benefit greatly from the highly multithreaded computational power and large memory bandwidth available on GPUs, especially since dedicated general-purpose APIs such as CTM (AMD-ATI) and CUDA (NVIDIA) have appeared. CUDA even provides a BLAS implementation (CuBLAS), but only for dense matrices. Other existing linear solvers for the GPU are also limited by their internal matrix representation. This paper describes how to combine recent GPU programming techniques and new GPU-dedicated APIs with high-performance computing strategies (namely block compressed row storage, register blocking and vectorization) to implement a general-purpose sparse linear solver. Our implementation of the Jacobi-preconditioned Conjugate Gradient algorithm outperforms leading-edge CPU counterparts by up to a factor of 6.0, making it attractive for applications that are content with single precision.
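
To make the algorithm named in the abstract concrete, here is a minimal CPU-side sketch of a Jacobi-preconditioned Conjugate Gradient solver over a sparse matrix. It is not the paper's implementation: it assumes plain (scalar) compressed row storage rather than block compressed row storage, runs in pure Python with double-precision floats instead of single precision on the GPU, and assumes a symmetric positive definite matrix with a non-zero diagonal. All function and parameter names (csr_matvec, csr_diagonal, jacobi_pcg, row_ptr, col_idx, values) are hypothetical and chosen for illustration only.

```python
def csr_matvec(row_ptr, col_idx, values, x):
    """Return y = A x for a matrix A in compressed row storage (CSR)."""
    n = len(row_ptr) - 1
    y = [0.0] * n
    for i in range(n):
        acc = 0.0
        # Non-zeros of row i live in values[row_ptr[i]:row_ptr[i + 1]].
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def csr_diagonal(row_ptr, col_idx, values):
    """Extract the diagonal of a CSR matrix (0.0 where no entry is stored)."""
    n = len(row_ptr) - 1
    diag = [0.0] * n
    for i in range(n):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            if col_idx[k] == i:
                diag[i] = values[k]
    return diag


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def jacobi_pcg(row_ptr, col_idx, values, b, tol=1e-10, max_iter=1000):
    """Solve A x = b with Conjugate Gradient and a Jacobi preconditioner.

    Assumes A is symmetric positive definite with a non-zero diagonal.
    """
    n = len(b)
    # Jacobi preconditioner: M^-1 = diag(A)^-1.
    inv_diag = [1.0 / d for d in csr_diagonal(row_ptr, col_idx, values)]
    x = [0.0] * n
    r = list(b)                                  # r = b - A x with x = 0
    z = [m * ri for m, ri in zip(inv_diag, r)]   # z = M^-1 r
    p = list(z)
    rz = dot(r, z)
    b_norm2 = dot(b, b) or 1.0
    for _ in range(max_iter):
        # Stop once the relative residual norm drops below tol.
        if dot(r, r) <= tol * tol * b_norm2:
            break
        ap = csr_matvec(row_ptr, col_idx, values, p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = [m * ri for m, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


# Example: the 3x3 matrix [[4, 1, 0], [1, 3, 1], [0, 1, 2]] in CSR form.
row_ptr = [0, 2, 5, 7]
col_idx = [0, 1, 0, 1, 2, 1, 2]
values = [4.0, 1.0, 1.0, 3.0, 1.0, 1.0, 2.0]
b = csr_matvec(row_ptr, col_idx, values, [1.0, 2.0, 3.0])
x = jacobi_pcg(row_ptr, col_idx, values, b)
```

In this sketch the sparse matrix-vector product dominates the cost of each iteration, which is why the storage layout (here CSR, in the paper block compressed row storage with register blocking) matters for performance.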

BibTeX reference

   AUTHOR     = "Luc Buatois and Guillaume Caumon and Bruno Lévy",
   TITLE      = "Concurrent number cruncher - A GPU implementation of a general sparse
                    linear solver",
   JOURNAL    = "International Journal of Parallel, Emergent and Distributed Systems",
   YEAR       = "to appear",
