The OpenCL solver plugin for OpenFoam 1.7.x: clFoam v0.1 released for testing

====== Updated in 2013 ======

This version just tidies the API, using a struct clmat to wrap the 3 parameters; I think this will reduce the chance of mismatched parameters.
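To illustrate why wrapping the parameters helps, here is a minimal sketch. It assumes the 3 parameters are the three arrays of a CSR matrix (row offsets, column indices, values), which the post does not state explicitly. The names `clmat_sketch` and `clmat_sketch_is_valid`, and all field names, are hypothetical and are not taken from the clFoam headers; the real `clmat` may hold OpenCL buffers rather than host pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout, for illustration only: the real clmat in clFoam
 * may differ (for example, it may hold cl_mem buffers). */
typedef struct {
    size_t n_rows;        /* number of matrix rows */
    const int *row_ptr;   /* n_rows + 1 offsets into col_idx / values */
    const int *col_idx;   /* column index of each stored entry */
    const float *values;  /* stored entries, single precision */
} clmat_sketch;

/* Returns 1 if the three arrays are mutually consistent for a matrix
 * with n_cols columns, 0 otherwise. With loose arguments, passing the
 * arrays in the wrong order compiles silently; with one struct the
 * check can live in a single place. */
int clmat_sketch_is_valid(const clmat_sketch *m, size_t n_cols)
{
    assert(m != NULL);
    if (m->row_ptr == NULL || m->row_ptr[0] != 0)
        return 0;
    for (size_t i = 0; i < m->n_rows; ++i) {
        if (m->row_ptr[i + 1] < m->row_ptr[i])
            return 0;  /* offsets must be non-decreasing */
    }
    int nnz = m->row_ptr[m->n_rows];
    if (nnz > 0 && (m->col_idx == NULL || m->values == NULL))
        return 0;
    for (int k = 0; k < nnz; ++k) {
        if (m->col_idx[k] < 0 || (size_t)m->col_idx[k] >= n_cols)
            return 0;  /* column index out of range */
    }
    return 1;
}
```

A solver entry point would then take a single `const clmat_sketch *` instead of three separate pointers, so the arrays cannot be swapped at the call site.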

Since OpenFoam needs to update the boundary conditions and rebuild the matrix A (to solve Ax=b) on every one of its multiple runs, it is not as efficient as a pure Ax=b sparse matrix problem.

I did this piece of work just to practise OpenCL and to understand the OpenFoam source code better while I was job hunting during the writing-up stage of my PhD.

However, I have other responsibilities in my new job. I need to learn more EEE material, so I cannot update the code to OpenFoam 2.x. Since the matrix-solving code is mature, it should work with OpenFoam 2.x, with or without minor tuning.

Sorry about this.


However, I have uploaded the code, and I hope it may help somebody else make a leap in this direction: OpenCL4OpenFoam

clUtils (a C library for vector and sparse matrix multiplication) has been extracted from clFoam v0.1, and it has been tested with a MinGW32 build on Windows 7.
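For readers new to the topic, the kind of operations such a library provides can be sketched on the host in plain C. This is not the clUtils API: the function names `csr_spmv`, `vec_dot` and `vec_axpy` are hypothetical, and the real library runs these operations as OpenCL kernels rather than as CPU loops.

```c
#include <assert.h>
#include <stddef.h>

/* y = A * x for a CSR matrix with n_rows rows.
 * row_ptr has n_rows + 1 entries; col_idx and values have row_ptr[n_rows]. */
void csr_spmv(size_t n_rows, const int *row_ptr, const int *col_idx,
              const float *values, const float *x, float *y)
{
    assert(row_ptr != NULL && x != NULL && y != NULL);
    for (size_t i = 0; i < n_rows; ++i) {
        float sum = 0.0f;
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            sum += values[k] * x[col_idx[k]];
        y[i] = sum;
    }
}

/* Returns the dot product of a and b, both of length n. */
float vec_dot(size_t n, const float *a, const float *b)
{
    assert(a != NULL && b != NULL);
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}

/* y = y + alpha * x, for vectors of length n. */
void vec_axpy(size_t n, float alpha, const float *x, float *y)
{
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; ++i)
        y[i] += alpha * x[i];
}
```

On a GPU, each row of `csr_spmv` would typically be handled by one work-item (or one group of work-items), which is why irregular row lengths affect performance.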



Dear Openfoamers
First announcement on CFD Online

So far, clFoam has been tested in single precision on an ATI 5650M GPU and an NVIDIA Tesla C2050. On the Tesla C2050 it is slightly slower than the CPU for the cavity case with 160000 cells and 4 time steps (clPCG). (See profilingDatasheet.xls in profiling data/ for details.)

160000 cells on the cluster redqueen of Manchester University, single precision (SP) only, DIC preconditioner

AMD quad-core CPU, 2700 MHz
ExecutionTime = 31.93 s  ClockTime = 32 s

Tesla C2050, clPCG only, no interface update
ExecutionTime = 33.85 s  ClockTime = 44 s

Tesla C2050, clPCG only, with interface update
ExecutionTime = 39.95 s  ClockTime = 55 s
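
For context on what a PCG solver such as clPCG computes, below is a minimal host-side sketch of the preconditioned conjugate gradient iteration in single precision. It is not the clFoam implementation: it runs on the CPU, the function name `pcg_jacobi` is hypothetical, and it uses a simple Jacobi (diagonal) preconditioner in place of the DIC preconditioner used in the timings above. It also assumes a symmetric positive definite matrix in CSR form.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* y = A * x, with A stored in CSR form (row_ptr, col_idx, values). */
static void spmv(size_t n, const int *row_ptr, const int *col_idx,
                 const float *values, const float *x, float *y)
{
    for (size_t i = 0; i < n; ++i) {
        float sum = 0.0f;
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            sum += values[k] * x[col_idx[k]];
        y[i] = sum;
    }
}

static float dot(size_t n, const float *a, const float *b)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}

/* Solves A x = b; x holds the initial guess on entry and the result on
 * exit. Returns the number of iterations performed, or -1 if allocation
 * fails or a diagonal entry is missing or not positive. */
int pcg_jacobi(size_t n, const int *row_ptr, const int *col_idx,
               const float *values, const float *b, float *x,
               float tol, int max_iter)
{
    assert(row_ptr && col_idx && values && b && x);
    float *work = malloc(5 * n * sizeof *work);
    if (work == NULL)
        return -1;
    float *r = work, *z = r + n, *p = z + n, *q = p + n, *inv_d = q + n;

    /* Jacobi preconditioner (assumed here instead of DIC): 1 / diag(A). */
    for (size_t i = 0; i < n; ++i) {
        float d = 0.0f;
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            if ((size_t)col_idx[k] == i)
                d = values[k];
        if (d <= 0.0f) { free(work); return -1; }
        inv_d[i] = 1.0f / d;
    }

    spmv(n, row_ptr, col_idx, values, x, q);   /* q = A * x0 */
    for (size_t i = 0; i < n; ++i) {
        r[i] = b[i] - q[i];                    /* initial residual */
        z[i] = inv_d[i] * r[i];                /* preconditioned residual */
        p[i] = z[i];                           /* first search direction */
    }
    float rz = dot(n, r, z);

    int it = 0;
    for (; it < max_iter; ++it) {
        if (dot(n, r, r) <= tol * tol)         /* ||r|| <= tol */
            break;
        spmv(n, row_ptr, col_idx, values, p, q);
        float pq = dot(n, p, q);
        if (pq <= 0.0f)                        /* breakdown: A not SPD */
            break;
        float alpha = rz / pq;
        for (size_t i = 0; i < n; ++i) {
            x[i] += alpha * p[i];
            r[i] -= alpha * q[i];
            z[i] = inv_d[i] * r[i];
        }
        float rz_new = dot(n, r, z);
        float beta = rz_new / rz;
        for (size_t i = 0; i < n; ++i)
            p[i] = z[i] + beta * p[i];
        rz = rz_new;
    }
    free(work);
    return it;
}
```

Each iteration needs one sparse matrix-vector product, two or three dot products and a few vector updates. The dot products force a reduction and a device-to-host synchronisation on a GPU, which is one plausible reason a GPU solver may not beat the CPU at this problem size.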

The OpenCL solver is still promising, as it is a new technology with plenty of room for improvement.

download link:

There is still quite a lot of work to do, and any advice on improving the efficiency is appreciated. Furthermore, there must be some errors in the manual; please do send me an email so I can correct them.

Thanks very much


1. Project Layout

# file system structure of the project generated by command:
There are 3 projects (subfolders) in clFoam:
clUtils/ basic vector and csrMatrix operations written by the author
(BSD licensed)
Tested and profiled on AMD_STREAM_SDK, SP on GPU and DP on CPU

clFoam/ clPCG and clPBICG solver based on clUtils/
(GPLv3 licensed)
Tested and profiled on AMD_STREAM_SDK, single precision on GPU

vclFoam/ a wrapper that calls the ViennaCL BLAS solver
(GPLv3 licensed)
Not finished, there is a bug

# other resource included
doc/ some useful documents, tutorials, install manuals
bin/ some bash scripts
profiling data/
SpeedITOFPlugin1.1/ downloaded from the SpeedIT toolkit website and edited for SP support

**** USABILITY*******
(1) clUtils: single precision works on both AMD and NVIDIA GPUs

double precision passed the test on OpenCL via GPU
double precision on CUDA 3.1 fails with “OUT_OF_RESOURCE”
double precision does NOT work properly on Tesla C2050 with CUDA 3.1

(2) clFoam is usable only in single precision on GPU, with clPCG and clPBiCG

(see profilingDatasheet.xls in profiling data/ for details)
For double precision, it should work but is still buggy.
I did not have hardware at hand for debugging, only ssh access to the remote cluster, which has not been upgraded to CUDA 4.0

(3) vclFoam is not usable at all.
As vclFoam will probably not be faster than clFoam, I did not spend much time on that plugin

**** *****************
2. Requirements
clFoam requires the following:
* A recent C++ compiler (e.g. gcc 4.x.x); GCC > 4.4 is required
* OpenFoam 1.7.X
* OpenCL: for accessing GPUs (shared library and include files)
For AMD GPUs, install the AMD_STREAM_SDK
SEE installation guide:

SEE installation guide:

optional vclFoam
* uBLAS : (shipped with the Boost libraries)
# sudo apt-get install libboost-dev
* ViennaCL 1.1 headers are already included in vclFoam

3. Installation

The installation tutorials are in separate files:


4. Authors and Contact

Qingfeng Xia

June 01 2011
