
Image processing with tinySG

Friends of tinySG,

For almost a decade I refused to accept the importance of 2D image processing. I felt that if it is not 3D, it cannot be cool. Objectively, almost all my attempts to improve digital images taken on holidays came out worse than the original picture (in addition to a vague feeling of "cheating" when manipulating photos).

Artists forced me to reconsider my position on 2D image processing by producing really nice renderings for marketing purposes: they played with light setups, added contrast and effects like glow to create a real atmosphere in the scene, rather than just doing the plain, clean mathematics I used to do.

The image to the right shows a Mercedes SLK model in a night scene, with some environment reflection. The picture feels artificial and generated.

By integrating the image processor into tinySG's utility library, tsgEdit has gained access to a wide range of customisable image processing filters. The image to the left shows the same scene, processed with a glow filter: the picture comes to life, gaining some additional atmosphere.

Come to think of it, there is a huge realm of possibilities to be explored. Just to name a few:

  • color space conversions (gamma correction, greyscale, tonemapping, YUV-conversion for video playout)
  • monitor controls (contrast, brightness, saturation)
  • image feature detection (lines, edges)
  • special effects (glow, halo, lens flare, toon shading, screen space ambient occlusion, ...)
  • image improvements (sharpening, blurring, erosion)
  • generic convolution (you name the convolution kernel; see the sketch right after this list!)
  • real-time 3D effect textures (blurred mirror textures, IBL, ...)
  • distortion correction (curved screen warping)
  • soft-edge/blended multichannel projections
  • watermark generation
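
To give a taste of the generic convolution filter, a classic 3x3 sharpen kernel could be expressed as a fragment shader like this. This is an illustrative sketch, not a filter from the shipped library; texelSize is an assumed uniform holding 1/width and 1/height:

	uniform sampler2D tex;
	uniform vec4      texelSize;   // assumed: (1/width, 1/height, 0, 0)
	void main()
	{
	   vec2 st = gl_TexCoord[0].st;
	   // 3x3 sharpen kernel:   0 -1  0
	   //                      -1  5 -1
	   //                       0 -1  0
	   gl_FragColor = 5.0 * texture2D(tex, st)
	                - texture2D(tex, st + vec2(texelSize.x, 0.0))
	                - texture2D(tex, st - vec2(texelSize.x, 0.0))
	                - texture2D(tex, st + vec2(0.0, texelSize.y))
	                - texture2D(tex, st - vec2(0.0, texelSize.y));
	}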
The images below show some of the filters coming with tinySG's default filter library:
  • A concatenation of Sobel and glow filters.
  • The Aston Martin processed by a blur filter.
  • The ISS model rendered with a Sobel filter.
  • Teapot with 3x color saturation.
  • CAD data rendered in wireframe with logo overlay.
  • Rendering with watermark (e.g. for demo licenses).
  • Input duplication, scale, distortion and luminance conversion.
  • Sepia effect color manipulation.



The tinySG image processor

tinySG's image processor is entirely separate from the render pipeline. An application may choose to utilise it wherever needed: it may serve as a post-processor after rendering a scene, just before the image is finally displayed in the window; as a video exporter, writing renderings out to image or video files; as an image processing pipeline not related to the tinySG scene graph at all; or even as a real-time texture processor while rendering a 3D scene.

[Screenshot: tsgEdit's filter network editor]

The image processor implements a directed, acyclic graph of image sources, filters and sinks, each of them represented by a C++ class. Sources may be tinySG's real-time renderer, video files or image files. The input images are processed by filters that perform color correction, apply special effects, etc. A filter's output is either passed on to another filter or consumed by an image sink like a render window or an image file.
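
In code, wiring up such a graph might look roughly like the sketch below. Apart from csgImageFilter and the csgYUVFilter shown further down, the class and method names here (csgRendererSource, csgGlowFilter, csgWindowSink, Connect) are hypothetical placeholders for the actual API:

       // Illustrative sketch; csgRendererSource, csgGlowFilter, csgWindowSink
       // and Connect() are hypothetical names.
       csgRendererSource source;      // source: tinySG's real-time renderer
       csgYUVFilter      luminance;   // filter: RGB to YUV luminance
       csgGlowFilter     glow;        // filter: glow special effect
       csgWindowSink     window;      // sink: the render window

       // Wire up the DAG: renderer -> luminance -> glow -> window.
       luminance.Connect(&source);
       glow.Connect(&luminance);
       window.Connect(&glow);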

tsgEdit comes with the filter network editor shown to the left. A filter network is created by drag 'n' drop, connecting the available sources, sinks and filters. Filter tweakables are edited on the right-hand side of the dialog.

Filters may be implemented either in GLSL or OpenCL (OpenCL support is yet to come; a prototype is running). GLSL filters may be described in an XML filter library, eliminating the need for a compiler.

Filters may utilise multiple passes that calculate intermediate results. This feature allows for some cheeky performance tricks, providing real-time performance even for large convolution kernels.
As a simple example, here is the XML description of the default contrast filter (its shader simply interpolates, or extrapolates, between the average luminance and the input color):

	<ImageProcessorLibrary>
	  <Filter name="Contrast Filter" icon="img/imgContrastFilter100.png">
		<parameter name="Texture Unit" type="GL_SAMPLER_2D" uniform="tex" tweakable="false"/>
		<parameter name="Contrast" type="GL_FLOAT" size="1" uniform="contrast" tweakable="true" min="0.0" max="2.0" val="1.0"/>
		<parameter name="Average Luminance" type="GL_FLOAT_VEC4" size="1" uniform="avgLuminance" tweakable="true"
				      min="0.0 0.0 0.0 0.0" max="1.0 1.0 1.0 0.0" val="0.5 0.5 0.5 0.0"/>
		<source>
		  <pass ID="0" type="fragment">
			<![CDATA[
			#pragma optimize(on)
			uniform sampler2D tex;
			uniform vec4      avgLuminance;
			uniform float     contrast;
			void main()
			{
			   vec4 color = texture2D(tex,gl_TexCoord[0].st);
			   gl_FragColor = mix(avgLuminance, color, contrast);
			}
			]]>
		  </pass>
		</source>
	  </Filter>
	  <Filter>
		 ...
	  </Filter>
	</ImageProcessorLibrary>
     
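
To illustrate the multi-pass mechanism mentioned above, a separable 7x7 box blur could be split into a horizontal and a vertical pass as in the following sketch. This is not a filter from the shipped library; it assumes that each pass after the first receives the previous pass's result in texture unit 0, and that texelSize holds 1/width and 1/height:

	<Filter name="Box Blur (sketch)">
	  <parameter name="Texture Unit" type="GL_SAMPLER_2D" uniform="tex" tweakable="false"/>
	  <parameter name="Texel Size" type="GL_FLOAT_VEC4" size="1" uniform="texelSize" tweakable="false"
	             min="0.0 0.0 0.0 0.0" max="1.0 1.0 0.0 0.0" val="0.00078 0.0013 0.0 0.0"/>
	  <source>
	    <pass ID="0" type="fragment">
	      <![CDATA[
	      /* horizontal pass: 7-tap box filter along x */
	      uniform sampler2D tex;
	      uniform vec4      texelSize;
	      void main()
	      {
	         vec4 sum = vec4(0.0);
	         for (int i = -3; i <= 3; ++i)
	            sum += texture2D(tex, gl_TexCoord[0].st + vec2(float(i)*texelSize.x, 0.0));
	         gl_FragColor = sum / 7.0;
	      }
	      ]]>
	    </pass>
	    <pass ID="1" type="fragment">
	      <![CDATA[
	      /* vertical pass: 7-tap box filter along y, fed with pass 0's result */
	      uniform sampler2D tex;
	      uniform vec4      texelSize;
	      void main()
	      {
	         vec4 sum = vec4(0.0);
	         for (int i = -3; i <= 3; ++i)
	            sum += texture2D(tex, gl_TexCoord[0].st + vec2(0.0, float(i)*texelSize.y));
	         gl_FragColor = sum / 7.0;
	      }
	      ]]>
	    </pass>
	  </source>
	</Filter>

Applied as a single 2D convolution, the same 7x7 kernel would need 49 texture samples per pixel; the two 1D passes get away with 14.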

Alternatively, filters may derive from class csgImageFilter and process the texture data provided in texture unit 0 by overriding one pure virtual function: csgImageFilter::GetShaderSource(). That's it! The GLSL YUV luminance filter looks like this:
       class csgYUVFilter: public csgImageFilter
       {
          public:
            csgYUVFilter () {}
            virtual ~csgYUVFilter () {}

          protected:
            // Provide the base class's pure virtual interface:
            virtual char* GetShaderSource()
            {
               static char src[] =
               "uniform sampler2D tex;              \
                void main()                         \
                {                                   \
                  vec4 color = texture2D(tex,gl_TexCoord[0].st); \
                  /* Rec.601 luma weights */        \
                  float Y    = dot (vec4 (0.299, 0.587, 0.114, 0.0), color); \
                  gl_FragColor = vec4(Y);           \
                }";
               return src;
            }
       };
     

About Performance

Currently, the image processor performs at least three blits: from the framebuffer into an FBO texture, running a filter that renders into another FBO texture, and copying the result of the filter chain back into the framebuffer. More blit operations happen if a filter works with multiple passes (like the glow filter does) or if several filters are chained one after another.
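
In raw OpenGL 2.x terms, a single enabled filter therefore amounts roughly to the following sequence. This is a sketch only; drawFullScreenQuad() is a hypothetical helper, the texture, FBO and program objects are assumed to have been created elsewhere, and the actual tinySG code path may differ:

	// 1st blit: copy the current framebuffer into the filter's input texture.
	glBindTexture(GL_TEXTURE_2D, inputTex);
	glCopyTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, 0, 0, width, height);

	// Filter pass: render a full-screen quad into an FBO, with the filter's
	// fragment shader sampling the input texture on unit 0.
	glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, filterFBO);
	glUseProgram(filterProgram);
	drawFullScreenQuad();                    // hypothetical helper

	// Final blit: draw the FBO's colour texture back into the window.
	glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, 0);
	glUseProgram(0);
	glBindTexture(GL_TEXTURE_2D, resultTex);
	drawFullScreenQuad();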

By rendering directly into an FBO, the first blit would become unnecessary. The same optimisation could be applied to the final back-copy, so two blits could easily be saved. However, doing so would sacrifice some flexibility and code elegance, so I have decided against it for now.

The following measurements were taken on a MacBook Pro with a GeForce 9600M, running Windows XP. Naturally, the performance of 2D operations is a function of the resolution of the input image. The benchmark was run at 1280x768, measuring frames per second (so higher values are better).

[Chart: image processor performance]
The chart shows three datasets of different complexity. The teapot is simple enough that the 3D pass is even faster than the two blit operations together. Thus, even the image processor's fastest filters cut the framerate almost in half. However, as the framerate stays above 400 fps even on a laptop, this loss hardly matters.
400 fps translates to 2.5 ms of render time per frame when the image processor is disabled. Enabling the image processor with a YUV filter, the framerate drops to 250 fps (4.0 ms/frame). Thus, the cost of the YUV filter is about 1.5 ms per frame at 1280x768, including the blit operations.
The ISS and MountaineerEngine datasets render much slower. Large glow or blur kernels are quite expensive and will hurt for datasets like the ISS. All other filters add a 2D processing overhead of roughly 10%, which again is hardly noticeable.

The following chart shows the performance of tinySG's test systems, running different processors, GPUs and operating systems. The data helps to gauge a platform's or architecture's performance; it is not suitable as a fair comparison of CPUs or GPUs. The MacBook runs Windows XP/32 (!), the Core i7 system runs Windows 7/64 and the AMD machine is a 64-bit Linux/X11 PC.
Once again, the teapot has been used as the test dataset in order to de-emphasise GL rendering and focus on the image processor's performance.
[Chart: image processor performance]



Acknowledgements:

  • The cubemaps used in this story are courtesy of Humus and kindly provided for non-commercial use.
  • The original SLK model is courtesy of some anonymous hero out there on the Internet and free for non-commercial use.
  • Original ISS model is courtesy of NASA.
  • The CAD datasets were taken from the SPECapc NX benchmark.



Copyright by Christian Marten, 2011-2014
Last change: 22.08.2011