Tag Archives: buffer

Deferred Rendering

Deferred rendering is an alternative to rendering 3d scenes. The classic rendering approach involves rendering each object and applying lighting passes to it. So, if an object is affected by 6 lights, it will be rendered 6 times, once for each light, in order to accumulate the effect of each light. This approach is referred forward rendering. Deferred rendering takes another approach: first of all of the objects render their lighting related information to a texture, called the G-Buffer.


This includes their colours, normals, depths and any other info that might be relevant to calculating their final colour. Afterwards, the lights in the scene are rendered as geometry (sphere for point light, cone for spotlight and full screen quad for directional light), and they use the G-buffer to calculate the colour contribution of that light to that pixel.

The motive for using deferred rendering is mainly performance related – instead of having a worst case batch count of objects by the amount of lights (if all objects are affected by all lights), you have a fixed cost of objects and lights. There are other pros and cons of the system, but the purpose of this article is not to help decide whether deferred rendering should be used, but how to do it if selected.


The main issue with implementing deferred rendering is that you have to do everything on your own. The regular rendering approach involves rendering each object directly to the output buffer (called ‘forward rendering’). This means that all of the transform & lighting calculations for a single object happen in a single stage of the process. The graphics API that you are working with exposes many options for rendering objects with lights. This is often called the ‘fixed function pipeline’, where you have API calls that control the fashion in which an object is rendered. Since we are splitting up the rendering to two parts, we can’t use these faculties at all, and have to re-implement the basic (and advanced) lighting models ourselves in shaders.


Even shaders written for the forward pipeline won’t be usable, since we use an intermediate layer (the G-Buffer). They will also need to be modified to write to the G-Buffer / read from the G-buffer (usually the first). In addition to that, the architecture of the rendering pipeline changes – objects are rendered regardless of lights, and then geometric representations of the light’s affected area have to be rendered, lighting the scene.

The Goal is to create a deferred rendering pipeline that is as unobtrusive as possible – we do not want the users of the engine to have to use it differently because of the way that its rendered, and we really don’t want the artists to change the way they work just because we use a deferred renderer. So, we want an engine that can:

The material system that stores all of the information that is required to render a single object type besides the geometry. Links to textures, alpha settings, shaders etc. are stored in an object’s material. The common material hierarchy includes two levels:

Technique – When an object will be rendered, it will use exactly one of the techniques specified in the material. Multiple techniques exist to handle different hardware specs (If the hardware has shader support use technique A, if not fall back to technique B), different levels of detail (If object is close to camera use technique ‘High’, otherwise use ‘Low’). In our case, we will create a new technique for objects that will get rendered into the G-buffer.


Pass – An actual render call. A technique is usually not more than a collection of passes. When an object is rendered with a technique, all of its passes are rendered. This is the scope at which the rendering related information is actually stored. Common objects have one pass, but more sophisticated objects (for example detail layers on the terrain or graffiti on top of the object) can have more.


Examples of material systems outside of Ogre are Nvidia’s CGFX and Microsoft’s HLSL FX. Render Queues / Ordering System: When a scene full of objects is about to be rendered, all engines need some control over render order since semi-transparent objects have to be rendered after the opaque ones in order to get the right output. Most engines will give you some control over this, as choosing the correct order can have visual and performance implications:

less overdraw = less pixel shader stress = better performance,


Full Scene / Post Processing Framework: This is probably the most sophisticated and least common of the three, but is still common. Some rendering effects, such as blur and ambient occlusion, require the entire scene to be rendered differently. We need the framework to support directives such as “Render a part of the scene to a texture”, “Render a full screen quad”, allowing us to control the rendering process from a high perspective.

Generating the G-Buffer:

So we know what we want to do, we can now start creating a deferred rendering framework on top of the engine. The problem of deferred rendering can be split up into two problems – creating the G-Buffer and lighting the scene using the G-Buffer. We will tackle both of them individually.


Deciding on a Texture Format:

The first stage of the deferred rendering process is filling up a texture with intermediate data that allows us to light the scene later. So, the first question is, what data do we want? This is an important question – it is the anchor that ties both stages together, so they both have to synchronized with it. The choice has performance (memory requirements), visual quality (accuracy) and flexibility (what doesn’t get into the G-Buffer is lost forever) implications.


We chose two RGBA textures, essentially giving us eight 16 bit floating point data members. It’s possible to use integer formats as well. The first one will contain the colour in RGB, specular intensity in A. The second one will contain the view-space-normal in RGB (we keep all 3 coordinates) and the (linear) depth in A.


Screen Space Ambient Occlusion – Application in games

I’ve been working on a screen space ambient occlusion code for a few weeks and I’ve managed to get a working project going, so I feel I’m qualified to talk about it. So let us begin. In computer graphics, ambient occlusion attempts to approximate the way light radiates in real life, especially off what are normally considered non-reflective surfaces. Like in my example here:


Unlike conventional methods such as Phong shading, ambient occlusion is a global method, meaning the illumination at each point is a function of other geometry in the scene. However, it is a very crude approximation to full global illumination. The soft appearance achieved by ambient occlusion alone is similar to the way an object appears on an overcast day.

The first major game title with Ambient Occlusion support was released in 2007, yes, we’re talking about Crytek’s Crysis 1. Then we have it’s sequel Crysis 2 as shown here:


It’s a really cheap approximation to real ambient occlusion, it’s used in many games. It is cheap, although it does draw out a significant overhead compared to running without SSAO on. Mafia II churns out 30-40 fps when I turn on AO on max settings without PhysX, but when I turn AO off it’s silky-smooth at 50-75.

Without ambient occlusion:


Screen space ambient occlusion is improved in Crysis 2 with higher quality levels based on higher resolution and two passes while the game’s water effects – impressive on all platforms – are upgraded to full vertex displacement on PC (so the waves react more realistically to objects passing into the water). Motion blur and depth of field also get higher-precision upgrades on PC too.

With ambient occlusion, note the presence of darker areas. 


 While there may only be three different visual settings, it’s actually surprising just how low the graphics card requirement is to get a decent experience in Crysis 2. In part, PC owners have the console focus to thank here – Crytek’s requirement of supporting the game on systems with limited GPUs and relatively tiny amounts of RAM required an optimisation effort that can only benefit the computer version.

According to the developer, the base hardware is a 512MB 8800GT: an old, classic card that’s only really started to show its age in the last year or so. On the lowest setting (which is still a visual treat), this is still good for around 30-40FPS at lower resolutions such as 720p and 1280×1024. If you’re looking for decent performance at 1080p, a GTX260 or 8800GTX is recommended while 1080p60 at the Extreme level really requires a Radeon HD 6970 or GTX580.

In terms of the CPU you’ll need, things have definitely moved on from the days of the original Crysis, where the engine code was optimised for dual-core processors. Crysis 2 is quad-core aware, and while it runs perfectly well with just the two cores, ideally you should be targeting something along the lines of a Q6600 or better.

Nowadays we have a bunch of DX10 & DX11 games that make use of Ambient Occlusion natively ( via their own engine ), Crysis 2, Aliens vs Predator 3, BattleField 3, Gears of War 2, etc.

Still, there are several new titles without Ambient Occlusion support, and also older games that would benefit from it. nVIDIA came to the rescue a couple of years ago with an option to force Ambient Occlusion on various games by enabling the corresponding option in their drivers Control Panel.

Initially you could only choose from “Off” and “On”, but from the 25x.xx drivers and on you have three options to choose from, “Off”, “Performance” which balances the effect’s application to enhance your image while keeping the performance numbers relatively close to what you had without A.O. enabled, and “Quality”, this option sacrifices performance, often at a massive rate, but gives you the best image quality you can achieve with A.O.

As a result of extensive testing of all 3 settings of Ambient Occlusion in a couple of games, this is our result:

Ambient occlusion is related to accessibility shading, which determines appearance based on how easy it is for a surface to be touched by various elements (e.g., dirt, light, etc.). It has been popularized in production animation due to its relative simplicity and efficiency. In the industry, ambient occlusion is often referred to as “sky light”.[citation needed]

The Terrain Ambient Occlusion system controls the amount of ambient light in a scene. For example, a dense forest has less ambient light near the ground because most of the light is stopped by the trees. In the current implementation, occlusion information is stored in textures and the effect is applied to the scene in a deferred way.

The images below show the difference (in a scene) when Terrain Ambient Occlusion is enabled and when it is not.


The ambient occlusion shading model has the nice property of offering better perception of the 3d shape of the displayed objects. This was shown in a paper where the authors report the results of perceptual experiments showing that depth discrimination under diffuse uniform sky lighting is superior to that predicted by a direct lighting model.

High 32 is highest smooth frame rate (steady 60fps) that I’ve found while testing with a MSI GTX 680 Lightning.


What are we looking at here? Alright, the best example is the vase held up to the candle. Notice the blurry yet real time shadow effect which is cast behind the vase? There you go, that’s SSAO in Amnesia. The thing is SSAO is entirely dependent on angle: if you were to stand behind the vase the shadow wouldn’t show up yet the vase would have a black glow to it. The bookcase, the fireplace and the dresser show other examples of SSAO effects you’ll see throughout the game. SSAO maxed out on High/128 killed my framerate down to 14 FPS and there was little difference compared to the much more playable setting at Medium/64. Overall, the game looks better with some form of SSAO enabled. In some areas it made things look engulfed in some ugly blurry black aura. Still, a good feature to have in a game that uses shadows for its overall atmosphere and I appreciate the effect SSAO tries to achieve yet I think it does a pretty sloppy job at emulating ‘darkness’. Setting this setting maxed out isn’t a good idea either, you’ll actually get less “black glowing” if it’s set to around 16/32. Now if only they had some form of Anti-Aliasing to play with.

The occlusion at a point on a surface with normal can be computed by integrating the visibility function over the hemisphere with respect to projected solid angle as shown below:


where  is the visibility function at , defined to be zero if  is occluded in the direction  and one otherwise, and  is the infinitesimal solid angle step of the integration variable . A variety of techniques are used to approximate this integral in practice: perhaps the most straightforward way is to use the Monte Carlo method by casting rays from the point  and testing for intersection with other scene geometry (i.e., ray casting). Another approach (more suited to hardware acceleration) is to render the view from  by rasterizing black geometry against a white background and taking the (cosine-weighted) average of rasterized fragments. This approach is an example of a “gathering” or “inside-out” approach, whereas other algorithms (such as depth-map ambient occlusion) employ “scattering” or “outside-in” techniques.

Along with the ambient occlusion value, a bent normal vector is often generated, which points in the average direction of samples that aren’t occluded. The bent normal can be used to look up incident radiance from an environment map to approximate image-based lighting. However, there are some situations in which the direction of the bent normal doesn’t represent the dominant direction of illumination.


So that’s SSAO in a nutshell, it was quite difficult to understand let alone implement due to the amount of calculations, as well as the fact that is puts heavy burdens on your processor. But like since the beginning, games sought to be as realistic as possible in order to uphold its role as a medium that immerses players into experience. Screen Space Ambient Occlusion is another means of doing to simulate the nature of lights and shadows. 

Vertex Buffer Objects, Frame Buffer Objects and Geometry shaders

The modern use of “shader” was introduced to the public by Pixar with their “RenderMan Interface Specification, Version 3.0” originally published in May, 1988.


As graphics processing units evolved, major graphics software libraries such as OpenGL and Direct3D began to support shaders. The first shader-capable GPUs originally only supported pixel shading, but vertex shaders were then introduced when developers realized the power of shaders and sought to take advantage of its potential. Geometry shaders were only fairly recently introduced with Direct3D 10 and OpenGL 3.2, but are currently supported only by high-end video cards.

Geometry in a complete three dimensional scene is lit according to the defined locations of light sources, reflection, and other surface properties. Some hardware implementations of the graphics pipeline compute lighting only at the vertices of the polygons being rendered.


The lighting values between vertices are then interpolated during rasterization. Per-fragment or per-pixel lighting, as well as other effects, can be done on modern graphics hardware as a post-rasterization process by means of a shader program. Modern graphics hardware also supports per-vertex shading through the use of vertex shaders.

Shaders are simple programs that describe the traits of either a vertex or a pixel. Vertex shaders describe the traits such as position, texture coordinates and colors of a vertex, while pixel shaders describe color, z-depth and the alpha value of a fragment. A vertex shader is called for each vertex in a primitive often after tessellation; thus one vertex in, one updated vertex out. Each vertex is then rendered as a series of pixels onto a surface that will be transported to the screen.

Shaders replace a section of video hardware often referred to as the Fixed Function Pipeline (FFP) – so-called because it performs lighting and texture mapping in a hard-coded manner. Shaders provide a programmable alternative to this hard-coded approach for the convenience of the programmers seeking to manage their code better.


The CPU sends instructions (compiled shading language programs) and geometry data to the graphics processing unit, located on the graphics card. In the vertex shader, the geometry is transformed.If a geometry shader is in the graphic processing unit and active, some changes of the geometries in the scene are performed. If a tessellation shader is activated in the graphic processing unit and active, the geometries in the scene can be subdivided.

The calculated geometry is triangulated as the triangles are broken down into fragment quads (one fragment quad is a 2 × 2 fragment primitive). Fragment quads are modified according to the pixel shader, then the depth test is executed, fragments that pass will get written to the screen and might get blended into the frame buffer. The graphic pipeline uses these steps in order to transform three dimensional (and/or two dimensional data into useful two dimensional data for displaying. In general, this is a large pixel matrix or “frame buffer”.


Vertex shaders are passed through once for each vertex given to the graphics processor. The purpose is to transform each vertex’s 3D position in virtual space to the 2D coordinate at which it appears on the screen (as well as a depth value for the Z-buffer).


Vertex shaders are capable of altering properties such as position, color, and texture coordinates, but cannot create new vertices like geometry shaders can. The output of the vertex shader goes to the next stage in the pipeline, which is either a geometry shader if present, or the pixel shader and rasterizer otherwise. Vertex shaders can enable powerful control over the details of position, movement, lighting, and color in any scene involving 3D models.

Geometry shaders are a relatively new type of shader, introduced in Direct3D 10 and OpenGL 3.2; formerly available in OpenGL 2.0+ with the use of extensions. This type of shader can generate new graphics primitives, such as points, lines, and triangles, from those primitives that were sent to the beginning of the graphics pipeline.


Geometry shader programs are executed after vertex shaders. They take as input a whole primitive, possibly with adjacency information. For example, when operating on triangles, the three vertices are the geometry shader’s input. The shader can then emit zero or more primitives, which are rasterized and their fragments ultimately passed to a pixel shader.

Typical uses of a geometry shader include point sprite generation, geometry tessellation in which you cover a surface with a pattern of flat shapes so that there are no overlaps or gaps, shadow volume extrusion where the edges forming the silhouette are extruded away from the light to construct the faces of the shadow volume, and single pass rendering to a cube map. A typical real world example of the benefits of geometry shaders would be automatic mesh complexity modification. A series of line strips representing control points for a curve are passed to the geometry shader and depending on the complexity required the shader can automatically generate extra lines each of which provides a better approximation of a curve.


Pixel shaders, which are also known as fragment shaders, compute color and other attributes of each fragment. Pixel shaders range from always outputting the same color, to applying a lighting value, to performing bump mapping, specular highlights, shadow mapping, translucency and other amazing feats of rendering as shown here.


They can alter the depth of the fragment for Z-buffering, or output more than one color if multiple render targets are active. In 3D graphics, a pixel shader alone cannot produce very complex effects, because it operates only on a single fragment, without knowledge of a scene’s geometry. However, pixel shaders do detect and acknowledge the screen coordinate being drawn, and can sample the screen and nearby pixels if the contents of the entire screen are passed as a texture to the shader. This technique can enable a wide variety of 2D postprocessing effects, such as blur, or edge detection/enhancement for cartoon/cel shading. Pixel shaders may also be applied in intermediate stages to any two-dimensional images in the pipeline, whereas vertex shaders always require a 3D model. For example, a fragment shader is the only type of shader that can act as a postprocessor or filter for a video stream after it has been rasterized.

Shaders, the 3D photoshop Part 2

In my previous blogpost, I went over the many algorithms used in both 2D and 3D computer graphics. I talked about how they are essentially the same. We’ll use a screen shot from my game Under the Radar that I editted in photoshop, before and after respectively.



Drop shadowing in photoshop is the same as shadow mapping in which it checks if a point is visible from the light or not. If a point is visible from the light then it’s obviously not in shadow, otherwise it is. The basic shadow mapping algorithm can be described as short as this:

– Render the scene from the lights view and store the depths as shadow map

– Render the scene from the camera and compare the depths, if the current fragments depth is greater than the shadow depth then the fragment is in shadow

In some instances, drop shadows are used to make objects stand out of the background with a an outline, in shaders this is done with sobel edge filters.

The Sobel operator performs a 2-D spatial gradient measurement on an image and so emphasizes regions of high spatial frequency that correspond to edges. Typically it is used to find the approximate absolute gradient magnitude at each point in an input grayscale image.

In theory at least, the operator consists of a pair of 3×3 convolution kernels. One kernel is simply the other rotated by 90°. This is very similar to the Roberts Cross operator.

These kernels are designed to respond maximally to edges running vertically and horizontally relative to the pixel grid, one kernel for each of the two perpendicular orientations. The kernels can be applied separately to the input image, to produce separate measurements of the gradient component in each orientation (call these Gx and Gy). These can then be combined together to find the absolute magnitude of the gradient at each point and the orientation of that gradient.

In photoshop, filters are added to images to randomize the noise and alter the look. The equivalent in shaders is known as normal mapping. Normal maps are images that store the direction of normals directly in the RGB values of the image. They are much more accurate, as rather than only simulating the pixel being away from the face along a line, they can simulate that pixel being moved at any direction, in an arbitrary way. The drawbacks to normal maps are that unlike bump maps, which can easily be painted by hand, normal maps usually have to be generated in some way, often from higher resolution geometry than the geometry you’re applying the map to.

Normal maps in Blender store a normal as follows:

  • Red maps from (0-255) to X (-1.0 – 1.0)
  • Green maps from (0-255) to Y (-1.0 – 1.0)
  • Blue maps from (0-255) to Z (0.0 – 1.0)

Since normals all point towards a viewer, negative Z-values are not stored, also map blue colors (128-255) to (0.0 – 1.0). The latter convention is used in “Doom 3” as an example.

Those are a majority of effects that shaders use that are similar to photoshop effects, there’s also color adjustment which can be done in color to HSL shaders along with other sorts of effects.

Shaders, the 3D photoshop

The simplest way to describe shaders is that it is the photoshop of 3D graphics; both of them are used to create effects enhancing lighting and mapping, to make the images more vivid and lively, and to give bad photographers, artists and modelers a chance to redeem their miserable work.

Perhaps the greatest thing they have in common are the algorithms used to execute their operations, they’re not just similar, they’re the exact same math operations.


Their primary difference is photoshop is used to manipulate 2D images and shaders alter 3D images, however both images are made up of pixels.


First, the image must be processed, but how? We must define a generic method to filter the image.


As you can see, all elements in the kernel MUST equal 1. We must normalize by dividing all the elements by the sum in the same way we normalize a vector.

The central element of the pixel, which in the case of what we have above is 6, will be placed all over the source pixel which will then be replaced with a weighted sum of itself and pixels nearby.

That’s how it works it works for images normally, but what about in shaders?

Normally we do forward rendering. Forward rendering is a method of rendering which has been in use since the early beginning of polygon-based 3d rendering. The scene is drawn in several passes, then the cull scene becomes renderable against frustum, then the culled Renderable is drawn with the base lighting component (ambient, light probes, etc…).


The problem with shaders is that fragment/pixel shaders output a single color at a time. It does not have access to its neighbours, therefore convolution is not possible.

The first pass is stored in the Frame Buffer Object (i.e. all color is in a texture) and we can sample any pixel value in a texture!

Digital images are created in order to be displayed on our computer monitors. Due to the limits human vision, these monitors support up to 16.7 million colors which translates to 24bits. Thus, it’s logical to store numeric images to match the color range of the display. By example, famous file formats like bmp or jpeg traditionally use 16, 24 or 32 bits for each pixel.


Each pixel is composed of 3 primary colours; red, green and blue. So if a pixel is stored as 24 bits, each component value ranges from 0 to 255. This is sufficient in most cases but this image can only represent a 256:1 contrast ratio whereas a natural scene exposed in sunlight can expose a contrast of 50,000:1. Most computer monitors have a specified contrast ratio between 500:1 and 1000:1.

High Dynamic Range (HDR) involves the use of a wider dynamic range than usual. That means that every pixel represents a larger contrast and a larger dynamic range. Usual range is called Low Dynamic Range (LDR).


HDR is typically employed in two applications; imaging and rendering. High Dynamic Range Imaging is used by photographers or by movie maker. It’s focused on static images where you can have full control and unlimited processing time. High Dynamic Range Rendering focuses on real-time applications like video games or simulations.


Since this is getting quite long, I’ll have to explain the rest in another blog. So stay tuned for part two where we will go over some of the effects of photoshop and shaders and how they’re the same as well as the algorithms behind them.