tinySceneGraph



OpenGL Tessellation shader

Friends of tinySceneGraph,

Since release 4.0, OpenGL is capable of subdividing the geometry given by the application directly on the GPU. You may have seen the cobblestone road or roof tiles in the Unigine Heaven benchmark, which uses GPU tessellation to create detailed geometry. In OpenGL, a shader is responsible for all these details, i.e. how many sub-polygons are created and what they look like.

In a way, this is similar to the geometry shader stage, which can also create new primitives from existing ones. The difference is that the tessellation stage trades some flexibility of the geometry shader for a vast performance improvement. For example, a geometry shader can decide in each invocation exactly how many primitives it creates, while the tessellation stage uses a fixed function block to create primitives according to its configuration. Furthermore, geometry shaders accept multiple primitive types (GL_TRIANGLES, GL_LINES, etc.), while the tessellation shader requires primitives to be sent to the hardware as GL_PATCHES.

However, the geometry shader stage is not gone, so you don't lose anything: It follows the tessellation stage, remains optional and can process the tessellation stage outputs as if they came from the vertex shader. We can use either stage, combine them or omit them entirely, running just a vertex and a fragment stage.

Application setup

If you already use glDrawArrays() on the host side to render your geometry, the required changes are very limited. You may simply specify all your triangles with all vertex attributes like normals, texcoords or generic attributes as before. Just put your data into a vertex buffer object (VBO). It does not matter if you interleave your vertex attributes or store them sequentially, if you use multiple VBOs or just one (although the former choices are better, performance-wise). See Generic Vertex Attributes for ways to do this.

The only real difference is that you are required to render GL_PATCHES instead of GL_TRIANGLES or whatever way you organise your geometry. To render a bunch of 100 triangular patches, call glDrawArrays() like this:

     // Setup and bind your VBO:
     // ...
     
     // Specify how many consecutive vertices make up a patch (e.g. 3 for a 
     // triangular patch, described by just its vertices):
     glPatchParameteri(GL_PATCH_VERTICES, 3);
     
     // Note *GL_PATCHES* as the render mode parameter. The last argument is
     // the number of vertices, so 100 triangular patches need 300 vertices:
     glDrawArrays(GL_PATCHES, 0, 3 * 100);
  
The per-patch input will be processed by the tessellation control stage, which can see all input at once. The three vertices of a triangle shown above are a really simple example. You may decide to use Bezier patches instead and define each patch by 16 vertices/tangents if you like.
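The count argument of glDrawArrays() is a number of vertices, not a number of patches. The following is a minimal sketch of that bookkeeping in plain C; the helper names patches_drawn() and vertices_needed() are made up for illustration and are not part of OpenGL or tinySG.

```c
#include <assert.h>

/* Number of complete patches assembled from a draw call with
 * 'vertex_count' vertices. Trailing vertices that do not fill a
 * complete patch are ignored by OpenGL. */
int patches_drawn(int vertex_count, int vertices_per_patch)
{
    assert(vertex_count >= 0 && vertices_per_patch > 0);
    return vertex_count / vertices_per_patch;
}

/* Vertex count to pass to glDrawArrays(GL_PATCHES, first, count)
 * in order to render 'patch_count' patches. */
int vertices_needed(int patch_count, int vertices_per_patch)
{
    assert(patch_count >= 0 && vertices_per_patch > 0);
    return patch_count * vertices_per_patch;
}
```

For 100 triangular patches, vertices_needed(100, 3) yields 300, while a count of 100 would only cover 33 complete patches. A 16-vertex Bezier patch works the same way with vertices_per_patch set to 16.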

Shader setup

The tessellation shader consists of three sub-stages. Let's have a look at them in reverse order:

Tessellation Evaluation Shader

The 3rd, final and mandatory tessellation sub-stage is the tessellation evaluation shader (TES). It specifies what kind of patches it expects via a layout statement:
  • layout(triangles, equal_spacing, cw) in: The TES creates triangular patches and receives barycentric coordinates for the sub-triangles as input parameters in the built-in variable gl_TessCoord.xyz.
  • layout(quads, equal_spacing, ccw) in: The TES creates quad patches and receives uv-like coordinates in gl_TessCoord.xy, defining sub-quads.
  • layout(isolines, equal_spacing, ccw) in: The TES creates a set of polylines, with the v-coordinate identifying the polyline and the u-coordinate identifying the point within the polyline.
Each TES invocation has access to the entire patch description. Together with the relative coordinates provided in gl_TessCoord it can calculate locations for vertices of a sub-triangle, sub-quad or polyline. For quads, simply think of the coordinates as texture coordinates in the range of 0.0 to 1.0, with respect to the original patch. The TES is supposed to output a vertex for the given sub-location, just like a vertex shader does.

On top of the relative coordinates of a sub-primitive, the tessellation evaluation shader has access to whatever the previous stage has provided. A simple, but complete tessellation evaluation shader dealing with triangular patches may look like this:

   // ** Tessellation evaluation shader **
   #version 150
   #extension GL_ARB_tessellation_shader : enable
   #extension GL_ARB_enhanced_layouts : enable

   layout(triangles, equal_spacing, cw) in;
   
   uniform mat4 Projection;
   uniform mat4 Modelview;

   // Input parameters, provided by an earlier 
   // shader stage (tcs). These are usually 
   // values from patch input:
   in vec3 tcPos[];

   void main()
   {
     // Interpolating the sub-primitive vertices, using 
     // barycentric coordinates in gl_TessCoord provided 
     // by the primitive generator:
     vec3 p0 = gl_TessCoord.x * tcPos[0].xyz;
     vec3 p1 = gl_TessCoord.y * tcPos[1].xyz;
     vec3 p2 = gl_TessCoord.z * tcPos[2].xyz;
     
     vec3 pos = (p0 + p1 + p2);
     gl_Position = Projection * Modelview * vec4(pos,1);
   }
  
[Image: cube made from 12 triangles, tessellated with inner and outer tessellation levels set to 30]
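To see what the barycentric interpolation in the evaluation shader computes, the same weighted sum can be reproduced on the CPU. This is only a sketch in plain C: the Vec3 type and the function name are assumptions for illustration, and (u, v, w) stands in for gl_TessCoord.xyz.

```c
#include <assert.h>

typedef struct { float x, y, z; } Vec3;

/* Weighted sum of the three patch corners. The weights (u, v, w) are
 * barycentric coordinates, so they are expected to sum up to 1. */
Vec3 interpolate_barycentric(Vec3 c0, Vec3 c1, Vec3 c2,
                             float u, float v, float w)
{
    float sum = u + v + w;
    assert(sum > 0.999f && sum < 1.001f);

    Vec3 p;
    p.x = u * c0.x + v * c1.x + w * c2.x;
    p.y = u * c0.y + v * c1.y + w * c2.y;
    p.z = u * c0.z + v * c1.z + w * c2.z;
    return p;
}
```

A coordinate of (1, 0, 0) returns the first corner unchanged, and (1/3, 1/3, 1/3) returns the centre of the triangle. Every vertex the primitive generator creates is just another such weight triple.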

Primitive Generator

The patch-relative coordinates the evaluation stage receives as its input are generated by a fixed function block, the Primitive Generator. There is not much to do here, except advising it how to create these coordinates. This is the job of the tessellation control shader.

Tessellation Control Shader

The 1st sub-stage is the tessellation control shader (TCS). Its job is to instruct the primitive generator how to tessellate the patch. On top of that, it has to provide per-patch input variables for the TES (do calculations like a vertex shader or simple pass-through, postponing any transformation to later stages).

The tessellation levels specify how many subdivisions the primitive generator creates:

  • gl_TessLevelOuter[] gives the number of subdivision segments for each original primitive edge. Use this to ensure adjacent patches match water-tight, without any gaps or T-intersections in the mesh.
  • gl_TessLevelInner[] specifies how many inner sub-primitives to create. Simply use the same value as for gl_TessLevelOuter, or read about the complicated details in the OpenGL wiki (link below).

A simple, but complete TCS may look like this:

   // ** Tessellation control shader **
   #version 150
   #extension GL_ARB_tessellation_shader : enable

   // Specify that the tcs will be invoked 3 times, creating three input vertices
   // for the tes. Each invocation is assumed to write to the 'gl_InvocationID'
   // index in the output array(s).
   layout(vertices = 3) out;
 
   // All patch input is accessible in each invocation:
   in vec3 vPos[];
   
   // We decided by the layout above that we'll pass 3 vertices on to the
   // evaluation stage. Note we can have multiple arrays like tcPos[], all
   // with 3 elements (e.g. for normals):
   out vec3 tcPos[];

   // Have application-provided parameters that define the level of tessellation:
   uniform float tessLevelInner=3;
   uniform float tessLevelOuter=3;

   // gl_InvocationID is the index of the output vertex this invocation
   // writes. With 'layout(vertices = 3) out' above it will be 0, 1 and 2,
   // matching the 3 input vertices per patch we defined via
   // 'glPatchParameteri(GL_PATCH_VERTICES, 3)' on the host side.
   #define ID gl_InvocationID

   void main()
   {
     // Just pass-through original vertices. Leave transformations and
     // barycentric interpolation to the evaluation stage: 
     tcPos[ID] = vPos[ID];

     // Setup primitive generator just once (not for all three vertices):
     if (ID == 0) {
       // In this simple example, the application specifies the tessellation 
       // levels via uniforms. However, we could also make use of the patch 
       // data in vPos, calculate the distance to the camera here and setup a 
       // dynamic level of detail.
       gl_TessLevelInner[0] = tessLevelInner;
       gl_TessLevelOuter[0] = tessLevelOuter;
       gl_TessLevelOuter[1] = tessLevelOuter;
       gl_TessLevelOuter[2] = tessLevelOuter;
     }
   }
  

Making tessellation useful

Obviously, flat triangulation of existing triangles as shown in the example above is not very useful. The code just demonstrates the bare minimum to get you started. Things become more interesting if you apply additional transformations to the sub-primitives, e.g. by using a displacement map or performing a Bezier surface approximation.

Displacement mapping

Displacement mapping uses a heightmap texture to displace vertices of a surface along the surface normal. We can interpolate texture coordinates of patch corner vertices to get texture coordinates for inner sub-primitives and use the height map to displace generated sub-vertices.
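The displacement step itself is a single multiply-add per vertex. Here is a hedged sketch in plain C of what an evaluation shader would do after interpolating position and normal; the Vec3 type, the function name and the scale parameter are assumptions for illustration, and height stands for the value looked up in the height map.

```c
typedef struct { float x, y, z; } Vec3;

/* Move 'position' along 'unit_normal' by the sampled height.
 * 'scale' is a hypothetical, application-defined factor that maps
 * the texture value (usually 0..1) to a world-space distance. */
Vec3 displace(Vec3 position, Vec3 unit_normal, float height, float scale)
{
    float d = height * scale;
    Vec3 p;
    p.x = position.x + d * unit_normal.x;
    p.y = position.y + d * unit_normal.y;
    p.z = position.z + d * unit_normal.z;
    return p;
}
```

With a normal of (0, 1, 0), a height of 0.5 and a scale of 2, the vertex is lifted by exactly one unit. The normal has to be normalised after interpolation, otherwise the displacement shrinks towards the middle of a patch.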
[Images: height map and colormap]
Using the above height map to look up vertical displacements and a colormap, a mesh with 32 triangular patches renders the images below. The original triangles are clearly visible in wireframe at tessellation levels 32 and 64.
[Images: displaced mesh at tessellation level 8, tessellation level 32 and tessellation level 64]
The different levels are set in the tessellation control stage. By calculating the distance of incoming patches to the camera, this can be used to adjust the level of detail of the scene.
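One possible way to derive a tessellation level from the camera distance is a clamped linear ramp. The sketch below is plain C and not the tinySG implementation; the function name and all parameters (near_dist, far_dist, min_level, max_level) are assumptions chosen for illustration. In a real control shader the same arithmetic would feed gl_TessLevelInner and gl_TessLevelOuter.

```c
#include <assert.h>

/* Map a camera distance to a tessellation level: 'max_level' at or
 * below 'near_dist', 'min_level' at or beyond 'far_dist', and a
 * linear blend in between. */
float tess_level_for_distance(float distance,
                              float near_dist, float far_dist,
                              float min_level, float max_level)
{
    assert(far_dist > near_dist && max_level >= min_level);

    float t = (distance - near_dist) / (far_dist - near_dist);
    if (t < 0.0f) t = 0.0f;   /* closer than near_dist */
    if (t > 1.0f) t = 1.0f;   /* farther than far_dist */

    return max_level + t * (min_level - max_level);
}
```

To keep adjacent patches watertight, the outer level of a shared edge should be computed from data both patches agree on, for example the distance to the edge midpoint rather than to the patch centre.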

Isoline tessellation

Isoline tessellation creates a bunch of polylines from each patch. The primitive generator creates uv-coordinates, with v identifying the polyline and u the vertex within the polyline. This tessellation mode is ideal for generating a hell of a lot of lines very quickly. The image to the right is a screenshot of the tinyCoiffeur hair simulation plugin of tinySG. The image shows about 45,000 hairs, with each hair visualised by a polyline with 16 vertices, affected by lighting.

The points of the polylines are stored in a shader storage buffer. The uv coordinates produced by the primitive generator are used to address them and fetch their coordinates in the evaluation stage. Each frame, a GLSL compute shader applies forces like gravity or wind to animate the polyline points.

To make addressing as simple as possible, the application uses instanced rendering to render as many instances of a single dummy point as there are hairs in the shader storage buffer. Each instance serves as a patch defined by one point. The tessellation shader stage then creates multiple polylines from each patch, each representing a strand of hair. This way, tessellation multiplies the number of simulated guide hairs by up to 64x on current hardware.
More details are scheduled for a separate story on tinyCoiffeur.
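Until then, here is a rough sketch of how such buffer addressing could work, written in plain C. It is not taken from tinyCoiffeur: the buffer layout (all points of one guide hair stored consecutively) and the function name are assumptions. In OpenGL's isolines mode, gl_TessCoord.x runs along a line from 0 to 1, while gl_TessCoord.y selects the line.

```c
#include <assert.h>

/* Index of a hair point in a flat buffer that stores
 * 'points_per_hair' consecutive points for each guide hair.
 * 'hair' would be the instance ID, 'u' the coordinate along the
 * polyline (0 = root, 1 = tip). */
int hair_point_index(int hair, int points_per_hair, float u)
{
    assert(hair >= 0 && points_per_hair >= 2);
    assert(u >= 0.0f && u <= 1.0f);

    /* u is a multiple of 1/(points_per_hair - 1); adding 0.5 before
     * truncating guards against floating point error. */
    int vertex = (int)(u * (float)(points_per_hair - 1) + 0.5f);
    return hair * points_per_hair + vertex;
}
```

With 16 points per hair, the polyline needs 15 segments, so the second outer tessellation level would be set to 15. The line coordinate could then be used to offset each generated strand from its guide hair.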

Bezier Surfaces

Bezier surfaces come in at least two flavours: triangle-based and quadrilateral-based surface patches. The latter basically take tangent vectors into account to describe the (cubic) curvature of a surface. They are not yet supported by any tinySG scene node. The former are also known as n-patches and can be rendered in tinySG with csgIndexedTriset nodes and a suitable tessellation shader.

Triangular Bezier patches are defined by the corners of a triangle and their associated surface normals. Without any shader, tinySG renders a csgIndexedTriset as if it were an ordinary set of triangles. But if the render traverser state has encountered an active tessellation shader before, the triangles are rendered as GL_PATCHES and the shader does its magic on the provided vertices and normals. The images below show two such patches. Without a shader, the triangles appear flat. As soon as the n-patch shader is active, the vertex normals provoke a curvature in the surface. Patch boundaries are shown in red, normal vectors in blue, sub-triangles introduced by the tessellator in black.
[Images, left to right:]
  • No active shader node: the IndexedTriset is rendered as triangles.
  • Active shader with inner and outer tessellation level set to 3.
  • Active shader with inner and outer tessellation level set to 30.
N-patches are pretty picky with respect to normals. Make sure the normal vectors match the real (=analytical) normal of the surface you approximate. Normals generated from low-resolution mesh coordinates tend to be not exact enough and result in lighting artefacts on the patch (which is why the Audi TT is only shown in wireframe above - it has ugly normals that mess up all lighting).
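To illustrate why the normals matter so much, here is a sketch of how one edge control point is derived in the commonly used curved PN triangles construction. Whether the tinySG shader uses exactly this construction is an assumption; the Vec3 type and the helper names are made up for illustration. The point at one third of the edge is projected into the tangent plane defined by the corner and its normal.

```c
typedef struct { float x, y, z; } Vec3;

float dot3(Vec3 a, Vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

Vec3 sub3(Vec3 a, Vec3 b)
{
    Vec3 r = { a.x - b.x, a.y - b.y, a.z - b.z };
    return r;
}

/* Control point next to corner 'pi' on the edge towards 'pj':
 *   b = (2*pi + pj - w*ni) / 3,  with  w = dot(pj - pi, ni)
 * 'ni' is the unit normal at 'pi'. The result lies in the tangent
 * plane through 'pi'. */
Vec3 pn_edge_control_point(Vec3 pi, Vec3 pj, Vec3 ni)
{
    float w = dot3(sub3(pj, pi), ni);
    Vec3 b;
    b.x = (2.0f * pi.x + pj.x - w * ni.x) / 3.0f;
    b.y = (2.0f * pi.y + pj.y - w * ni.y) / 3.0f;
    b.z = (2.0f * pi.z + pj.z - w * ni.z) / 3.0f;
    return b;
}
```

If the normal is perpendicular to the edge, the control point stays on the edge and the patch remains flat there. Any error in the normal tilts the tangent plane and therefore bends the surface, which is where the lighting artefacts come from.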

Notes on performance

The tinySG test laptop runs a Geforce GT-750M. This chip seems to be mostly limited by fill rate or memory bandwidth; the compute-intensive tessellation runs on internal caches and has limited influence on performance. The images below show different levels of tessellation with backface culling enabled (the measurements were done with backface culling disabled).
  • No tessellation (fixed function pipeline, 12 triangles): 4425 fps
  • Tessellation level 10 (inner and outer tessellation level set to 10, equal_spacing): 2950 fps
  • Tessellation level 30 (inner and outer tessellation level set to 30, equal_spacing): 1807 fps

Now that you know...

This techguide is just meant to get you started with tessellation shaders. Additional features not mentioned above include
  • Isolines: As shown in the hair example, a patch can be made up of a single point. You can also create a huge mesh from such a patch by using the point data as an index into some data buffer, setting up a triangles or quads layout and emitting enough control points. Since patch descriptions are abstract, it is up to your shader code to decide what to do with incoming patch data or where to source data from.
  • LOD: Tessellation levels can also be floating-point numbers instead of integers as shown above, and the TES can control the spacing of vertices generated by the primitive generator via its layout directive:
    • equal_spacing will round floats up to the nearest integer and create equal distances between all vertices.
    • fractional_even_spacing and fractional_odd_spacing do (almost) the same as equal_spacing, but round tessellation levels up to the nearest even integer (or odd, respectively). Both fractional modes generate two segments that are smaller than the other segments on a patch edge if levels are rounded.
    Although there are no such things as fractional triangles, this feature is still useful to smoothly transition between different levels of detail.
  • Primitive orientation: The primitive generator can create primitives in clockwise or counter-clockwise orientation. Just specify the orientation you need in the layout directive in the evaluation shader.
  • API control: The OpenGL API provides functions that allow you to omit the control shader stage and set up parameters for the primitive generator on the application side.
I recommend having a closer look at the OpenGL wiki in case you want to learn more about these details.
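As a small illustration of the rounding rules for the three spacing modes, the following plain C sketch computes the number of segments an edge ends up with. The enum and function names are assumptions made up for this example. The clamping ranges follow the OpenGL specification, where max_level corresponds to GL_MAX_TESS_GEN_LEVEL (at least 64). The sketch does not model that patches with an outer level of zero or less are discarded.

```c
#include <assert.h>

typedef enum {
    EQUAL_SPACING,
    FRACTIONAL_EVEN_SPACING,
    FRACTIONAL_ODD_SPACING
} Spacing;

/* Round a positive float up to the next integer (avoids libm). */
int ceil_positive(float x)
{
    int n = (int)x;
    return ((float)n < x) ? n + 1 : n;
}

/* Number of segments an edge is divided into for a given level. */
int effective_segments(float level, Spacing mode, int max_level)
{
    assert(max_level >= 2);

    /* Clamp: even spacing starts at 2, odd spacing ends at max - 1. */
    float lo = (mode == FRACTIONAL_EVEN_SPACING) ? 2.0f : 1.0f;
    float hi = (mode == FRACTIONAL_ODD_SPACING) ? (float)(max_level - 1)
                                                : (float)max_level;
    if (level < lo) level = lo;
    if (level > hi) level = hi;

    int n = ceil_positive(level);
    if (mode == FRACTIONAL_EVEN_SPACING && n % 2 != 0) n++;
    if (mode == FRACTIONAL_ODD_SPACING && n % 2 == 0) n++;
    return n;
}
```

A level of 3.2 gives 4 segments with equal_spacing and fractional_even_spacing, and 5 segments with fractional_odd_spacing. In the fractional modes two of these segments are shorter than the rest and grow as the level increases, which is what makes the transition smooth.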

Keep rendering,
Christian


Acknowledgements:

  • The OpenGL wiki on opengl.org is possibly the best source to learn all about tessellation shaders.
  • The little grasshopper has excellent tutorials on OpenGL tessellation, which originally kicked off tinySG's implementation.
  • The earth textures are by courtesy of NASA.
  • The low-polygon Audi TT model is made by Petr005 and offered on ShareCG.com for non-commercial use.


Copyright by Christian Marten, 2015
Last change: 28.08.2015