Visualizing OpenCL computations within OpenGL

I've been fiddling with OpenCL computations for different projects in the past. One such project is the implementation of a specific triangulation algorithm for a structured light measurement system. The other one is a parallelized implementation of a 3d binary thinning algorithm, which a student of mine is writing his bachelor thesis about (in time, i hope he will be willing to publish some small description about it online..). We are both using python and the excellent module pyOpenCL by Andreas Klöckner.

As fast as OpenCL might be, to visualize the result we generally have to copy the gpu-buffered data back to the main memory and let frameworks such as Enthoughts mayavi or matplotlib or others draw it. For the end result of a computation this might be a viable solution but if we wanted to visualize how the computation itself progresses, the visualization will ultimately get in the way.

Since the data we need resides in the memory of the graphic card (usually i would use the gpu even though there are cpu-based opencl libraries by now), i guessed the graphics card should be able to use them for the visualization without even bothering the main memory or the cpu for that matter. This article by Ian Johnson who is working on bringing OpenCL acceleration to blenders particle system - or rather create a new particle system implementation using OpenCL - got me started on the interaction between OpenGL and OpenCL, so here it goes. The complete source code will be cloneable via git in time. I would like to describe the fundmental functions i'm using here and will omit any extra code that is not needed per se.

This is how my module import section looks like:

PYTHON:
  1. from OpenGL import GL as gl, GLU as glu, GLUT as glut
  2. from OpenGL.arrays import vbo
  3. import numpy as np
  4. import pyopencl as cl
  5. import scipy
  6. import os, sys

Starting off, we have to setup the glut-interface in which the OpenGL-buffer will be shown. This can be replaced by any GUI-framework, but i guess glut is very simple to setup.

PYTHON:
  1. glut.glutInit()
  2. glut.glutInitDisplayMode(glut.GLUT_RGBA | glut.GLUT_DOUBLE | glut.GLUT_DEPTH)
  3. glut.glutInitWindowSize(self.width, self.height)
  4. glut.glutInitWindowPosition(0, 0)
  5. self.win = glut.glutCreateWindow("Testing...")

This initializes glut, sets up the display mode to support RGBA with two buffers that can be swapped (GLUT_DOUBLE, it also works with only one buffer, but this seems more stable) and enables the depth field. After that we setup the window size, position and then create it with the title "Testing..." ;-) .

After that we need to setup some callbacks for glut. The most important ones are

PYTHON:
  1. glut.glutDisplayFunc(self.draw)
  2. glut.glutTimerFunc(10, self.timer, 10)

The first one sets up the function which draws the content of the glut window, the second one is called every 10 msecs and will be responsible for initiating the redraw.

Now we need to setup our scene. But before that we need some data, that we want to display. I chose a voxel field, gave it some color based on its position. Also, i need a numpy array with random values. Based on the random value of a voxel, i would like to let it disappear or reappear.

PYTHON:
  1. self.randarr = np.random.rand(size, size, size)
  2. self.randarr = np.require(self.randarr, 'f')
  3. arr = [(float(x)/(size-1)-.5,float(y)/(size-1)-.5,float(z)/(size-1)-.5,1.0) for x in xrange(size) for y in xrange(size) for z in xrange(size)]
  4. self.arr = np.require(arr, 'f')
  5. colarr = [(.5-float(x)/size+.2,float(y)/size-.5+.2,float(z)/size-.5+.2,1.0) for x in xrange(size) for y in xrange(size) for z in xrange(size)]
  6. self.colarr = np.require(colarr, 'f')
  7. self.arrvbo = vbo.VBO(data=self.arr, usage=gl.GL_DYNAMIC_DRAW, target=gl.GL_ARRAY_BUFFER)
  8. self.arrvbo.bind()
  9. self.colvbo = vbo.VBO(data=self.colarr, usage=gl.GL_DYNAMIC_DRAW, target=gl.GL_ARRAY_BUFFER)
  10. self.colvbo.bind()

After creating the arrays in python they are transfered to OpenGL-memory via "Vertex Buffer Object"s (vbo). I'm still not sure, what the subsequent bind does but it is not optional anyway. It is important to mind the size of the vectors used. Here, i've used 4 dimensions for both the position (arr) and the color (colarr). The color is thereby an RGBA-color and it seems (though i'm not sure), that the fourth dimension of the position will be used as a scaling factor, so i set it to 1.0.

We can go directly to the drawing function:

PYTHON:
  1. def draw(self):
  2.     # execute the OpenCL computation
  3.     self.execute()
  4.     # clear the gl buffers
  5.     gl.glClear(gl.GL_COLOR_BUFFER_BIT | gl.GL_DEPTH_BUFFER_BIT)
  6.     # tell opengl to use the vertex-array and the color-array
  7.     gl.glEnableClientState(gl.GL_VERTEX_ARRAY)
  8.     gl.glEnableClientState(gl.GL_COLOR_ARRAY)
  9.     # tell opengl where to find the arrays
  10.     self.arrvbo.bind()
  11.     gl.glVertexPointer(4, gl.GL_FLOAT, 0, self.arrvbo)
  12.     self.colvbo.bind()
  13.     gl.glColorPointer(4, gl.GL_FLOAT, 0, self.colvbo)
  14.     # tell gl to draw the arrays (with offset and number of points)
  15.     gl.glDrawArrays(gl.GL_POINTS, 0, self.size**3)
  16.     # disable the state and unbind the buffers
  17.     gl.glDisableClientState(gl.GL_VERTEX_ARRAY)
  18.     gl.glDisableClientState(gl.GL_COLOR_ARRAY)
  19.     self.arrvbo.unbind()
  20.     self.colvbo.unbind()
  21.     # we use double buffer, so we have to swap the active buffer
  22.     # and get the new buffer to the foreground
  23.     glut.glutSwapBuffers()

To start the main loop of glut, we still have to call glut.glutMainLoop().

Now to the nice part: The interaction between OpenGL and OpenCL means, that we have to manipulate OpenGL data within OpenCL. This is a little bit of a illogic paradigm, since it should be OpenCL data that we want to visualize. Well, but once you've moved beyond this slight irregularity, it's easy-peasy:

First, we setup our OpenCL environment: the context and the queue.

PYTHON:
  1. plats = cl.get_platforms()
  2. from pyopencl.tools import get_gl_sharing_context_properties
  3. self.ctx = cl.Context(properties=[
  4.     (cl.context_properties.PLATFORM, plats[0])]
  5.     + get_gl_sharing_context_properties(), devices=None)
  6. self.queue = cl.CommandQueue(self.ctx)

We then load and build our program...

PYTHON:
  1. f = open(filename, 'r')
  2. fstr = f.read()
  3. self.program = cl.Program(self.ctx, fstr).build()

... and bind our OpenGL buffers to OpenCL pointers:

PYTHON:
  1. mf = cl.mem_flags
  2. self.arrvbo.bind()
  3. self.arr_cl = cl.GLBuffer(self.ctx, mf.READ_WRITE, int(self.arrvbo.buffers[0]))
  4. self.colvbo.bind()
  5. self.col_cl = cl.GLBuffer(self.ctx, mf.READ_WRITE, int(self.colvbo.buffers[0]))
  6. self.randarr_cl = cl.Buffer(self.ctx, mf.READ_WRITE | mf.COPY_HOST_PTR, hostbuf=self.randarr)
  7. self.arrvbo.unbind()
  8. self.colvbo.unbind()
  9. self.queue.finish()
  10. # set up the list of GL objects to share with opencl
  11. self.gl_objects = [self.arr_cl, self.col_cl]

We need the list of gl_objects because they have to be acquired each time we want to use them in OpenCL and released afterwards. I'm not sure, but i think this way we ensure that OpenCL and OpenGL don't hamper with the data at the same time. With cl.Buffer we set up the randarr array within OpenCL memory. The OpenCL calculation is done in the aforementioned self.execute():

PYTHON:
  1. def execute(self):
  2.     self.randarr = np.random.rand(self.size, self.size, self.size)
  3.     self.randarr = np.require(self.randarr, 'f')
  4.     cl.enqueue_copy(self.queue, self.randarr_cl, self.randarr)
  5.     cl.enqueue_acquire_gl_objects(self.queue, self.gl_objects).wait()
  6.     global_size = (self.size**3,)
  7.     local_size = None
  8.     kernelargs = (self.arr_cl,
  9.                   self.col_cl,
  10.                   self.randarr_cl,
  11.                   self.dt)
  12.     self.program.mykernel(self.queue, global_size, local_size, *(kernelargs))
  13.     cl.enqueue_release_gl_objects(self.queue, self.gl_objects).wait()
  14.     self.queue.finish()

The important bit is the function cl.enqueue_acquire_gl_objects and the corresponding cl.enqueue_release_gl_objects.

So now to the cl-code. It's actually fairly simple:

C:
  1. __kernel void mykernel(__global float4* pos, __global float4* color, __global float* rand, float dt)
  2. {
  3.     unsigned int i = get_global_id(0);
  4.     if (rand[i] <.05) { color[i].w = 0; }
  5.     if (rand[i]> .999) { color[i].w = 1; }
  6. }

Et voila: I've used some code snippets i found here but changed the image save procedure to use scipy.misc.imsave instead of the much slower matplotlib.

The result can be seen here:

Video not there? Download it: movie.ogv.ogv

Update: repository

The code can be found here: https://gitorious.org/openclgltest/openclgltest

Interactivitiy with Mayavi and python callback

Today i found out that i may use figure.on_mouse_pick(function) to get a callback on a mouse_pick on mayavi-objects and it opened up a whole new way of dealing with mayavi-scenes. Right now, i'm trying to do a semi-automatic point cloud segmentation using different criteria such as distance, gradient and curvature. Since the data i'm using this on is unstructured, the curvature thing is proving to be quite tricky.

So here is what i do, AFTER scatterplotting my points using mayavi:

PYTHON:
  1. figure.scene.disable_render = False # also something new i've learnt!
  2. def picker_callback(picker):
  3.     """ Picker callback: this get called when on pick events.
  4.     """
  5.     global mypts_selection, myplane
  6.     glyph_points = mypts.glyph.glyph_source.glyph_source.output.points.to_array()
  7.     point_id = picker.point_id/glyph_points.shape[0]
  8.     # If the no points have been selected, we have '-1'
  9.     if point_id != -1:
  10.         # Retrieve the coordinnates coorresponding to that data
  11.         # point
  12.         x, y, z = xtr[point_id,0], xtr[point_id,1], xtr[point_id,2]
  13.         # Move the outline to the data point.
  14.         outline.bounds = (x-0.1, x+0.1,
  15.                           y-0.1, y+0.1,
  16.                           z-0.1, z+0.1)
  17.         pt = xtr[point_id,0:3]
  18.         pts = tree.query_ball_point(pt,r=1.)
  19.        
  20.         figure.scene.disable_render = True
  21.         # draw the nearest few points to the selected one as red dots
  22.         if mypts_selection is not None:
  23.             mypts_selection.parent.parent.remove()
  24.         mypts_selection = mlab.points3d(xtr[pts,0],xtr[pts,1],xtr[pts,2],scale_factor=.15,color=(1,0,0),mode='sphere',scale_mode='none',colormap='prism')#,mask_points=10)
  25.        
  26.         # fit a plane to the point neighbourhoud
  27.         B,normd = fitplane(xtr[pts,0:3].T)
  28.         # and draw that plane
  29.         if myplane is not None:
  30.             myplane.remove()
  31.         myplane = mlab.pipeline.builtin_surface()
  32.         myplane.source = 'plane'
  33.         myplane.data_source.normal = B[0:3]
  34.         myplane.data_source.center = Plane(P=Point(B[0],B[1],B[2]),D=-B[3]).projection(Point(xtr[point_id,0],xtr[point_id,1],xtr[point_id,2])).asarray()
  35.         planesurface = mlab.pipeline.surface(myplane)
  36.         figure.scene.disable_render = False
  37.  
  38. # set the picker callback function and mouse tolerance
  39. picker = figure.on_mouse_pick(picker_callback)
  40. picker.tolerance = 0.01
  41.  
  42. # start the magic
  43. mlab.show()

Residual of planefitting using svd

Note to self: Find out about residuals of planefits using singular value decomposition. And while you're at it: find out, how least square fitting works with singular value decomposition anyhow. Right now - as with any transformation matrix bigger than 3x3 - i mostly think it's magic..

Here is my current planefitting code in python:

PYTHON:
  1. import numpy as np
  2. from Geometry import Point, Line, Plane
  3.  
  4. def fitplane(XYZ):
  5.     [rows,npts] = XYZ.shape
  6.    
  7.     if not rows == 3:
  8.         print XYZ.shape
  9.         raise ('data is not 3D')
  10.         return None
  11.  
  12.     if npts <3:
  13.         raise ('too few points to fit plane')
  14.         return None
  15.  
  16.     # Set up constraint equations of the form  AB = 0,
  17.     # where B is a column vector of the plane coefficients
  18.     # in the form   b(1)*X + b(2)*Y +b(3)*Z + b(4) = 0.
  19.     t = XYZ.T
  20.     p = (np.ones((npts,1)))
  21.     A = np.hstack([t,p])
  22.  
  23.     if npts == 3:                       # Pad A with zeros
  24.         A = [A, np.zeros(1,4)]
  25.  
  26.     [u, d, v] = np.linalg.svd(A)        # Singular value decomposition.
  27.     #print v[3,:]
  28.     B = v[3,:];                         # Solution is last column of v.
  29.     nn = np.linalg.norm(B[0:3])
  30.     B = B / nn
  31.     plane = Plane(Point(B[0],B[1],B[2]),D=B[3])
  32.     return plane

The geometry-module is the one i've posted on github. It enables me to do fast transformations and - the most important part - intersections between planes and lines. Go check that out. I would like to hear some feedback..

Back to the problem: I would like to also return a measure of error or residual. If anyone could help..

Update (25.10.2010): There was a small but still significant error in the algorithm. I assumed that B[0:2] returns the elements [B[0],B[1],B[2]], but it appears the slicing in python is in line with the syntax for a "for"-statement. So, the line with the norm had to be changed to

PYTHON:
  1. nn = np.linalg.norm(B[0:3])

On another note, i'd have to look up the np.linalg.solve()-function which should be able to solve my problem a little bit more efficiently.

Trying out Python for image manipulation and matlab-alternative?

Learning new languages can be sometimes a little bit frustrating. Mostly, because you not only have to learn the syntax and conventions of the language but mainly because initially it's hard to know, what kinds of libraries are available - and reasonable - in that specific language.

Python's syntax seems really easy. I'd still have to look out for the indentation. But the real pain is getting to know the libraries and figuring out which one to use to achieve the specific goals. Right now, i'm trying to use opencv (which i've already used with c++) and figuring out how to calibrate a projector for a structured light sensor (fringe projection actually). There are of course many ready-to-use algorithms and even source codes, but i didn't find any yet, that cater to my specific needs.

I'll try reporting on progress concerning the projector calibration, but for now, here is a problem i encountered using the python-binding package of opencv that comes with ubuntu lucid lynx. It only supports the old bindings and there are many methods missing there! An upstream bug report explains it all. When trying to bind opencv via

CODE:
  1. import cv

It couldn't find the module cv. The old wrapper can still be used, though

CODE:
  1. from opencv.cv import *

Of course, i could have built the wrapper myself, but there is a repository with the newest opencv library (2.1) and the right binding library for python.

CODE:
  1. $ sudo add-apt-repository ppa:gijzelaar/opencv2
  2. $ sudo apt-get update
  3. $ sudo apt-get install opencv

via

Multiscale Measurement Systems


Zoom into Human Skin
von Weird_Weird_Science

Nice visualization of the use of multiscale measurement systems. Used sensors include a simple video camera, an opto-microscopic camera (maybe a confocal microscope) and i think they had to use an electronic microscope for the last steps. I like how they were able to smooth out the zoom operation accross the different scales.

More zoom-videos:

Nice Fourier poem

A poem written in honor of the ever-enlightening Fourier-transform, which is often hated by students, mainly because they don't know what it means.. Here is to the Fourier-transform:

One day in a land far away
Some mathematicians at play
Found a transform of convenient form
The basis of physics today.

Convolving would wreck people's brains
Still the advent of Fourier domains
for convolving in one
means multiplication
in the corresponding domain.

Got trouble with an ODE?
Fourier transforms will set you free
When once you would cry,
You now multiply,
by a constant times the frequency.

Fourier transforms backwards and forth
I hope that you now see their worth
For in every domain
Advantages reign
Fourier was the salt of the earth.

by Luke Krieg
Posted to sci.physics, 11 Sep 2000

Thanks to Amara Graps (i was reading on wavelets..).

Creating Gray-Code sequences using gimp

I needed to implement gray-code sequences for a fringe projection system i'm trying to build. To avoid hard to understand and not really intuitive C-code, i used gimp to produce the projection images. In this article i'm going to document the code used.

First is the script-fu function to generate a picture with a given number of fringes or - alternatively - a given width of the period:

CODE:
  1. (define (script-fu-oan-create-fringes fwidth t fcount)
  2. (let*
  3.     (
  4.     (myimg (aref (cadr (gimp-image-list)) 0))
  5.     (mylayer (gimp-image-active-drawable myimg))
  6.     (layers (gimp-image-get-layers  myimg))
  7.     (channels (gimp-image-get-channels myimg))
  8.     (height (car (gimp-drawable-height (car mylayer))))
  9.     )
  10.     (if (= t TRUE) (set! fwidth (/ (car (gimp-drawable-width (car mylayer))) (* anzahl 2)))
  11.                     (set! fcount (/ (car (gimp-drawable-width (car mylayer))) (* breite 2)))
  12.     )
  13.     (define i 0)
  14.     (gimp-selection-all myimg)
  15.     (gimp-edit-clear (car mylayer))
  16.     (gimp-selection-none myimg)
  17.     (while
  18.         (<i fcount)
  19.         (gimp-rect-select myimg (* (* fwidth i) 2) 0 fwidth height CHANNEL-OP-ADD FALSE 0)
  20.         (set! i (+ i 1))
  21.     )
  22.     (gimp-edit-bucket-fill (car mylayer) FG-BUCKET-FILL NORMAL-MODE 100 0 FALSE 0 0)
  23. )
  24. )

Then there is the register-function call to enable the menu item for the function:

CODE:
  1. (script-fu-register
  2. "script-fu-oan-create-fringes"                              ;func name
  3. "Create Fringes"                                            ;menu label
  4. "Create an image of fringes\
  5.  "                                                          ;description
  6. "Omar Abo-Namous"                                           ;author
  7. "copyright 2009, Omar Abo-Namous"                           ;copyright notice
  8. "March 02, 2009"                                            ;date created
  9. ""                                                          ;image type that the script works on
  10. SF-VALUE "Fringe width:" "16"                               ;
  11. SF-TOGGLE "Use Fringe count" FALSE                  ;
  12. SF-VALUE "Fringe count:" "8"                            ;
  13. )
  14. (script-fu-menu-register "script-fu-oan-create-fringes" "<Image>/Xtns")

An older in-depth script-fu article by me.