OpenCL API wrapper for VASmalltalk

After working on my primitive project I started taking a closer look at OpenCL, a computational environment for graphics cards. Both companies, ATI and Nvidia, offer support for OpenCL 1.0 on their graphics cards, though not on all of them. Nvidia has offered this kind of support for longer, under the name CUDA.

With OpenCL you can run programs in parallel on graphics cards. OpenCL is both an environment (an API) and a C-like programming language.

The OpenCL wrapper for VASmalltalk is intended to become a wrapper for the OpenCL API; it is under development and available from vastgoodies. As of today, you can query the available platforms (ATI or Nvidia) and the available devices (CPU or GPU).

Currently it can produce a system report via:

| anInterface aStream |
anInterface := MSKOpenCLInterface new initialize.
aStream := WriteStream on: String new.
anInterface allPlatformInfos do: [ :eachPlatform |
  eachPlatform printReportOn: aStream ].
aStream contents inspect

and the result looks like:

Name of platform                : ATI Stream
Vendor of platform              : Advanced Micro Devices, Inc.
Profile of platform             : FULL_PROFILE
Version of platform             : OpenCL 1.0 ATI-Stream-v2.0.0

Devices of this platform

Name of device                  : AMD Phenom(tm) 9550 Quad-Core Processor
Vendor of device                : AuthenticAMD
Vendor ID of device             : 4098
Version of device               : OpenCL 1.0 ATI-Stream-v2.0.0
driver version                  : 1.0
Device available                : true
Compute Address Space           : 32
Device compiler available       : true
Device is little endian         : true
Error correction support avail. : false
Size of global memory in cache  : 65536
Global Memory cache line        : 64
Global device memory            : 1073741824
This device has NO image support !
Size of local memory arena              : 32768
Maximum clock frequency                 : 2204
Number of parallel compute cores        : 4
Max number of arguments                 : 8
Max size in bytes of a constant buffer  : 65536
Max size of memory object allocation    : 536870912
Max size in bytes of the arguments      : 4096
Pref.Vect.Length for char (1 Byte)      : 16
Pref.Vect.Length for short (2 Byte)     : 8
Pref.Vect.Length for int   (4 Byte)     : 4
Pref.Vect.Length for long  (8 Byte)     : 2
Pref.Vect.Length for Float (4 Byte)     : 4
This device has NO double-precision support !
Max size in bytes of a constant buffer  : 4
Maximum dimensions of     work-item IDs : 3
Item Size               : 1024
Item Size               : 1024
Item Size               : 1024

End of      platform                ATI Stream

or for my machine at work:

Name of platform                : NVIDIA CUDA
Vendor of platform              : NVIDIA Corporation
Profile of platform             : FULL_PROFILE
Version of platform             : OpenCL 1.0 CUDA 3.0.1
Known Extensions for platform   :cl_khr_byte_addressable_store,  cl_khr_gl_sharing,  cl_nv_compiler_options,  cl_nv_device_attribute_query,  cl_nv_pragma_unroll,

Devices of this platform

Name of device                  : GeForce 8800 GT
Vendor of device                : NVIDIA Corporation
Vendor ID of device             : 4318
Version of device               : OpenCL 1.0 CUDA
driver version                  : 196.21
Device available                : true
Compute Address Space           : 32
Device compiler available       : true
Device is little endian         : true
Error correction support avail. : false
Size of global memory in cache  : 0
Global Memory cache line        : 0
Global device memory            : 1073414144
Max number of simultaneous image read by kernel    : 128
Max number of simultaneous image written by kernel : 8
Maximum number of samplers used by kernel          : 16
Max height of 2D image in pixels                   : 8192
Max height of 2D image in pixels                   : 8192
Max width of 2D image in pixels                    : 8192
Max depth of 3D image in pixels                    : 2048
Max height of 3D image in pixels                   : 2048
Max width of 3D image in pixels                    : 2048
Size of local memory arena              : 16384
Maximum clock frequency                 : 1650
Number of parallel compute cores        : 14
Max number of arguments                 : 9
Max size in bytes of a constant buffer  : 65536
Max size of memory object allocation    : 268353536
Max size in bytes of the arguments      : 4352
Pref.Vect.Length for char (1 Byte)      : 1
Pref.Vect.Length for short (2 Byte)     : 1
Pref.Vect.Length for int   (4 Byte)     : 1
Pref.Vect.Length for long  (8 Byte)     : 1
Pref.Vect.Length for Float (4 Byte)     : 1
This device has NO double-precision support !
Device Type                             : GPU
FP Mode                                 : FMA InfNAN RoundToInf RoundToNearest RoundToZero
Maximum dimensions of     work-item IDs : 3
Item Size               : 512
Item Size               : 512
Item Size               : 64
Extensions known                        : cl_khr_byte_addressable_store,  cl_khr_gl_sharing,  cl_nv_compiler_options,  cl_nv_device_attribute_query,  cl_nv_pragma_unroll,  cl_khr_global_int32_base_atomics,  cl_khr_global_int32_extended_atomics,

End of      platform                NVIDIA CUDA
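
The report snippet above writes all platforms into one stream. If you would rather have one report string per platform (for example, to compare the ATI and the NVIDIA machine side by side), a minimal sketch could look like the following. It uses only the selectors shown earlier (allPlatformInfos, printReportOn:) and assumes that allPlatformInfos answers a collection that understands do: and that printReportOn: accepts any WriteStream, as in the snippet above. The variable names are only illustrative.

```smalltalk
| anInterface reports aStream |
anInterface := MSKOpenCLInterface new initialize.
reports := OrderedCollection new.
"Build one report string per platform instead of one combined string."
anInterface allPlatformInfos do: [ :eachPlatform |
  aStream := WriteStream on: String new.
  eachPlatform printReportOn: aStream.
  reports add: aStream contents ].
"Show each report on the Transcript, separated by a blank line."
reports do: [ :eachReport |
  Transcript show: eachReport; cr; cr ]
```

After this, reports holds one string per platform, so a single report can also be inspected on its own with reports first inspect.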