Instanced Rendering

However, OpenGL offers a range of extensions to deal with this problem as well. One example is GL_ARB_draw_instanced, which has been part of the core specification since version 3.1. Instanced rendering provides a way to render many copies, or similar versions, of a primitive with a single draw call. Instead of rendering the same object multiple times by looping over glDrawArrays(), glDrawArraysInstanced() takes the loop count as an additional parameter and, with the help of a shader program, can often achieve the same effect as an application loop with just a single call to OpenGL.
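To make the difference concrete, the sketch below counts draw submissions with a stand-in recorder instead of a real OpenGL context. DrawRecorder, drawLooped and drawInstanced are hypothetical names for illustration only; the two recorder methods merely mimic the parameter lists of glDrawArrays() and glDrawArraysInstanced() without the primitive type.

```cpp
#include <cassert>

// Hypothetical stand-in for the OpenGL driver: it only counts what is submitted.
struct DrawRecorder {
    int  drawCalls = 0;          // number of submissions
    long verticesSubmitted = 0;  // total vertices sent down the pipeline

    // Mimics glDrawArrays(mode, first, count) without the mode argument.
    void drawArrays(int first, int count) {
        assert(first >= 0 && count >= 0);
        ++drawCalls;
        verticesSubmitted += count;
    }

    // Mimics glDrawArraysInstanced(mode, first, count, instanceCount).
    void drawArraysInstanced(int first, int count, int instanceCount) {
        assert(first >= 0 && count >= 0 && instanceCount >= 0);
        ++drawCalls;
        verticesSubmitted += static_cast<long>(count) * instanceCount;
    }
};

// Application loop: one submission per copy.
void drawLooped(DrawRecorder& gl, int numVerts, int copies) {
    for (int i = 0; i < copies; ++i)
        gl.drawArrays(0, numVerts);
}

// Instanced: the loop count becomes a parameter of a single submission.
void drawInstanced(DrawRecorder& gl, int numVerts, int copies) {
    gl.drawArraysInstanced(0, numVerts, copies);
}
```

Both variants submit the same number of vertices; only the number of calls into the API differs.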
Instanced rendering is just one brick in the wall of possible overhead
reductions. If you aim to reduce overhead, you may also want to
provide as much state as possible in advance (using, for example, uniform
buffers or texture arrays) before you start to submit draws.
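As one way to provide state in advance, per-instance data can be packed into a single array and uploaded once, so that a shader can look it up by instance. The sketch below shows only the CPU-side packing. InstanceOffset and packOffsets are hypothetical names, not tinySG code, and the padding to four floats assumes a std140 uniform block in which every array element occupies 16 bytes. The actual upload into a uniform buffer is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-instance datum: a translation in object space.
struct InstanceOffset {
    float x, y, z;
};

// Pack offsets as one vec4 per instance (x, y, z, padding).
// Assumption: the shader declares a std140 uniform block such as
//   layout(std140) uniform Offsets { vec4 offset[MAX_INSTANCES]; };
// so each array element starts on a 16-byte boundary.
std::vector<float> packOffsets(const std::vector<InstanceOffset>& offsets)
{
    std::vector<float> packed;
    packed.reserve(offsets.size() * 4);
    for (const InstanceOffset& o : offsets) {
        packed.push_back(o.x);
        packed.push_back(o.y);
        packed.push_back(o.z);
        packed.push_back(0.0f);  // pad the vec3 to a vec4
    }
    assert(packed.size() == offsets.size() * 4);
    return packed;
}
```

The resulting vector can be handed to the buffer upload in one piece, before any draw is submitted.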
Using Instanced Rendering

The host side

The OpenGL API provides two new functions for instanced rendering, which extend glDrawArrays() and glDrawElements() with an instance count parameter: glDrawArraysInstanced() and glDrawElementsInstanced() work exactly like their non-instanced counterparts, except that they send n copies of the primitive down the pipeline.

This is what the OnRender() method of tinySG's csgIndexedTriset node looks like:

    void csgIndexedTriset::OnRender (csgRenderAction *A)
    {
      A->ApplyDeferredState ();

      // Bind VBOs (or set vertex array pointers if VBOs are disabled)
      if ( !BindBuffers(A->GetPipeID()) )
        return;

      if (m_numInstances <= 1)
      {
        glDrawArrays(GL_TRIANGLES,                             // prim type
                     0,                                        // start index in VBO
                     m_indices[CSG_VA_VERTEX].size());         // numVerts
      }
      else
      {
        glDrawArraysInstanced (GL_TRIANGLES,                   // prim type
                               0,                              // start index in VBO
                               m_indices[CSG_VA_VERTEX].size(),// numVerts
                               m_numInstances);                // Pass instance count
      }

      UnbindBuffers (A->GetPipeID());
    }

Note the else-branch: compared to legacy glDrawArrays(), the changes are minimal. If the node has an instance count larger than 1, the triangles of the triangle set are passed to glDrawArraysInstanced(). Even the if-else could be omitted if an instance count of one is used for regular draws.

At this point you may still wonder what is good about rendering the very same triangles 100 times, without different transformations or materials. Well, obviously nothing. To make glDrawArraysInstanced() useful, you need a shader that treats each instance differently.

The shader side

The GLSL vertex shader has access to a new built-in variable called gl_InstanceID. Starting at zero, it is incremented with every instance. The nice thing is that it is entirely up to the shader what to do with the instance ID.
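As a minimal sketch of one such use, assume the instances should be placed on a regular grid. The GLSL source below is a hypothetical vertex shader embedded as a string (the names position, mvp, columns and spacing are assumptions, not tinySG's shader interface), and instanceOffset() mirrors the same arithmetic on the CPU so that the mapping from instance ID to grid cell can be checked without a GL context.

```cpp
#include <cassert>

// Hypothetical vertex shader: shifts every instance to its own grid cell.
// The attribute and uniform names are illustrative assumptions.
const char* kGridVertexShader = R"glsl(
#version 140
in vec3 position;
uniform mat4 mvp;
uniform int columns;    // instances per row
uniform float spacing;  // distance between neighbouring instances
void main()
{
    int col = gl_InstanceID % columns;
    int row = gl_InstanceID / columns;
    vec3 offset = vec3(float(col) * spacing, 0.0, float(row) * spacing);
    gl_Position = mvp * vec4(position + offset, 1.0);
}
)glsl";

struct GridOffset {
    float x, y, z;
};

// CPU mirror of the shader arithmetic above.
GridOffset instanceOffset(int instanceID, int columns, float spacing)
{
    assert(instanceID >= 0 && columns > 0);
    const int col = instanceID % columns;  // position within the row
    const int row = instanceID / columns;  // which row the instance falls into
    return { col * spacing, 0.0f, row * spacing };
}
```

With columns set to 10, 100 instances fill a 10 x 10 grid from a single glDrawArraysInstanced() call.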
Examples

For the sake of simplicity, the forest looks a bit regular. Nevertheless, it is easy to add some randomness to the positions using a noise function. The fragment shader could also index different layers of an array texture to create different trees.

If you are still sceptical about the benefits of instanced rendering, I suggest leaning back and thinking of more use cases. The images below benefit from instanced rendering, although it may not be obvious.
Additional ideas on tinySG's future project list include
Pitfalls

Be careful when rendering instances of individual polygons or sub-objects: the instances of one primitive are rendered sequentially before proceeding to the next primitive. This influences both blending and Z-buffer contents.
An example of what can happen if individual polygons are rendered is shown
below. The wire frame torus on the left shows the mesh structure as well
as the layering. If you look closely, you can also see color gradients
that imitate self-shadowing (inner shells have darker colors).
The artifacts are easily avoided by rendering an entire torus shell per instance in a single draw call (i.e. all triangles in one shot using glDrawArraysInstanced with GL_TRIANGLES, as shown in the code above). The rightmost image shows this.
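The ordering issue can be illustrated with a small CPU-side model. It assumes, as described above, that one instanced draw emits all of its instances before the next draw starts, and that the primitives within one instance are processed in order. DrawRecord, perPrimitiveOrder and perShellOrder are hypothetical names used only for this sketch.

```cpp
#include <cassert>
#include <vector>

// One rendered primitive: which instance (shell) and which polygon it belongs to.
struct DrawRecord {
    int instance;
    int primitive;
};

// One instanced draw per polygon: every instance of polygon 0 is drawn,
// then every instance of polygon 1, and so on. The shells are interleaved.
std::vector<DrawRecord> perPrimitiveOrder(int numPrimitives, int numInstances)
{
    assert(numPrimitives >= 0 && numInstances >= 0);
    std::vector<DrawRecord> order;
    for (int p = 0; p < numPrimitives; ++p)
        for (int i = 0; i < numInstances; ++i)
            order.push_back({i, p});
    return order;
}

// One instanced draw for the whole shell: all polygons of instance 0 are drawn,
// then all polygons of instance 1. Each shell is complete before the next begins.
std::vector<DrawRecord> perShellOrder(int numPrimitives, int numInstances)
{
    assert(numPrimitives >= 0 && numInstances >= 0);
    std::vector<DrawRecord> order;
    for (int i = 0; i < numInstances; ++i)
        for (int p = 0; p < numPrimitives; ++p)
            order.push_back({i, p});
    return order;
}
```

When blending depends on the shells arriving one after another, only the second ordering delivers them that way.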
Scenegraph operations such as picking may be an issue with instances as well,
because they usually work on a scene database. If geometry is created on
the fly by a shader, operations on the scenegraph naturally work for the
first instance only.

Performance and conclusions

Finally, let's have a look at performance. The introduction states that instanced rendering is all about reducing the number of draw batches submitted to OpenGL. So, what happens if we draw the forest with a loop? tinySG offers a LoopGroup node, which traverses its children n times. Using two nested LoopGroups, Transform nodes can achieve the same offset structure as the shader example above.

If you prefer to work with indexed VBOs, no problem: GL_ARB_draw_instanced comes with glDrawElementsInstanced(), which supports indexed vertex data. However, tinySG always "flattens" its indexed vertex attributes before downloading them into a VBO anyway and uses glDrawArraysInstanced() instead, for reasons explained in the notes on index management.
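For readers unfamiliar with the term, the sketch below shows what flattening indexed vertex data means in general: every index is replaced by a copy of the vertex it refers to, so the result can be drawn with glDrawArraysInstanced() instead of glDrawElementsInstanced(). This is not tinySG's implementation; Vec3 and flattenIndexed are hypothetical names, and only a single position attribute is handled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 {
    float x, y, z;
};

// Expand indexed vertices into a plain, non-indexed vertex array.
// The output holds one vertex per index, in index order.
std::vector<Vec3> flattenIndexed(const std::vector<Vec3>& vertices,
                                 const std::vector<unsigned>& indices)
{
    std::vector<Vec3> flat;
    flat.reserve(indices.size());
    for (unsigned idx : indices) {
        assert(idx < vertices.size());  // every index must refer to a vertex
        flat.push_back(vertices[idx]);
    }
    return flat;
}
```

The trade-off is memory: shared vertices are duplicated, but the draw no longer needs an index buffer.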
Keep rendering,
Copyright by Christian Marten, 2014. Last change: 29.11.2014