So, as promised, I will now go over how the Ambient Obscurance (AO) algorithm works, its advantages compared to other algorithms, and how it is implemented in my engine. As usual, I will follow up with some screenshots.
As I briefly mentioned in a previous post where I did a frame breakdown of the deep G-Buffer demo, the chosen algorithm is Scalable Ambient Obscurance (SAO). The researchers who wrote the deep G-Buffer paper also happened to write the SAO paper, and some even worked on the paper SAO was based on (AlchemyAO). For additional details, you can find all of the papers at the following links: Deep G-Buffers, SAO, AlchemyAO.
NOTE: All following images and diagrams are taken from these papers or presentations unless otherwise specified.
It is best to define what we are talking about before getting into the details. So, what is AO? Ambient Occlusion or Ambient Obscurance (I’ll shortly get into the difference) is a global illumination effect that approximates how exposed a point in the scene is to ambient or environmental lighting. In practice, it darkens cracks, corners, and any surface that is in some way sheltered. For example, less light will reach the area under your feet, creating a darkened area where you stand. This is often called a contact shadow, and it ensures that objects in the scene look grounded rather than floating.
Here is an example from a presentation by John Hable about the AO used in Uncharted 2.
It is hard to place exactly where the guard is standing without the contact shadows underfoot. It almost looks like he is floating.
So what is the difference between Ambient Occlusion and Ambient Obscurance? To be honest, I am not 100% sure that I understand it myself, but I will do my best to explain it as I understand it.
Ambient Occlusion is the calculated occlusion over the hemisphere at a single point, effectively giving us a value from 0 to 1 describing how difficult it is for ambient light to reach that point. A value of 1 means no ambient light will reach the point, and a value of 0 means the point is not occluded at all, so ambient light can reach it from any direction within the local hemisphere. The image below, taken from the AlchemyAO paper, helps to explain this.
Here C is the point we are calculating occlusion for, n is the normal at that point, and you can see the hemisphere we are testing, which extends from the tangent plane along the surface normal. As this point is in a deep crevice, the calculated occlusion will likely be quite high.
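To make the idea concrete, here is a toy Monte Carlo sketch of hemisphere occlusion. The `visible` callback, the sampling scheme, and all names here are mine, not from any of the papers; it just illustrates the 0-to-1 occlusion value described above.

```python
import math
import random

def ambient_occlusion(visible, n_samples=1000, seed=0):
    """Toy Monte Carlo estimate of occlusion over the hemisphere.

    `visible(direction)` should return True if a ray leaving the point
    in that direction escapes the scene. Returns a value in [0, 1],
    where 1 means fully occluded (matching the convention above).
    """
    rng = random.Random(seed)
    occluded = 0.0
    total_weight = 0.0
    for _ in range(n_samples):
        # Uniformly sample a direction on the upper hemisphere
        # (z is along the surface normal here).
        z = rng.random()
        phi = 2.0 * math.pi * rng.random()
        r = math.sqrt(max(0.0, 1.0 - z * z))
        d = (r * math.cos(phi), r * math.sin(phi), z)
        w = z  # cosine weight, n . omega
        total_weight += w
        if not visible(d):
            occluded += w
    return occluded / total_weight if total_weight > 0 else 0.0

# A point on an open plane: every hemisphere direction escapes.
print(ambient_occlusion(lambda d: True))   # -> 0.0
# A point at the bottom of a sealed pit: nothing escapes.
print(ambient_occlusion(lambda d: False))  # -> 1.0
```

Real screen-space algorithms never trace these rays, of course; they approximate the same integral from the depth buffer, which is what the rest of the post covers.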
Ambient Obscurance is a modern form of ambient occlusion. The Alchemy paper defines it as: “Ambient Obscurance: is a modern form of ambient occlusion, where the impact of the term falls off with distance so as to smoothly blend high-frequency occlusion discovered by the ambient obscurance algorithm with the occlusion discovered by a lower-frequency global illumination algorithm”. I can’t totally decipher exactly what this means, but I believe it is saying that Ambient Obscurance is a more sophisticated modification of Ambient Occlusion that allows for greater artistic control and easier integration with existing global illumination solvers.
Anyway, now that we have some of the definitions out of the way, let’s get on to the algorithms.
Since a lot of the work presented in the SAO paper relies on an understanding of the AlchemyAO algorithm, I will briefly explain the aims and techniques discussed in that paper.
The technique was published by Vicarious Visions for one of their games in development at the time. They needed an AO algorithm that could produce robust occlusion and contact shadows for both large- and small-scale occluders, and one that was easily integrated with a deferred renderer. They found that most AO algorithms available at the time either failed for fine details, as they were required to work at a lower resolution before upsampling, or failed to produce accurate contact shadows that stuck to the outline of the object. Below you can see a comparison of the Volumetric Obscurance algorithm and the AlchemyAO algorithm.
Comparison of Volumetric Obscurance (left) and AlchemyAO (right). Notice the floating shadows at (a), whereas at (b) they contour to the shape of the object.
The developers derived their own occlusion estimator to calculate the occlusion at a sample point, including modifiable parameters to allow for artistic control of the final result. The final version of the equation is shown below.
To calculate A, we take a sum of the calculated occlusion for a number of points Q within a world-space radius of the point we are calculating for, C. Here s is the number of samples and v is the vector from our point C to the randomly selected point Q. Taking the dot product of v with the normal at point C (n) tells us how far the point sits above the tangent plane perpendicular to n. For example, if we imagine a point sitting on the same plane as C, the vector v will be perpendicular to the normal and the dot product will be zero, meaning no occlusion. Intuitively, this makes sense: a point on the same plane as C cannot occlude it. The diagram below helps to explain this further.
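The estimator can be sketched in a few lines of Python. This is my reading of it, not the paper's exact formulation: in particular, the paper scales the bias term by the camera-space depth of C and raises the result to a contrast power k, both of which I have simplified away here. The names `sigma`, `bias` and `eps` stand in for the artistic/stability controls.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def alchemy_occlusion(c, n, samples, sigma=1.0, bias=1e-4, eps=1e-4):
    """Simplified sketch of the AlchemyAO estimator: a sum over sample
    points Q of a clamped, distance-attenuated dot product.

    c: the shaded point, n: unit normal at c, samples: nearby points Q.
    sigma scales intensity, bias suppresses self-occlusion on flat
    surfaces, eps avoids division by zero.
    """
    s = len(samples)
    total = 0.0
    for q in samples:
        v = sub(q, c)  # v = Q - C
        # A point on C's tangent plane gives dot(v, n) = 0: no occlusion.
        # The v.v denominator makes distant samples contribute less.
        total += max(0.0, dot(v, n) - bias) / (dot(v, v) + eps)
    # Normalise and clamp to [0, 1]; 1 = fully occluded, matching the
    # convention used earlier in the post.
    return min(1.0, (2.0 * sigma / s) * total)

# A sample on the same plane as C cannot occlude it:
print(alchemy_occlusion((0, 0, 0), (0, 0, 1), [(1, 0, 0)]))    # -> 0.0
# A sample directly above C occludes strongly:
print(alchemy_occlusion((0, 0, 0), (0, 0, 1), [(0, 0, 0.5)]))  # -> 1.0
```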
So, now that we understand the mathematics at least on a basic level, let’s move on to discuss the implementation of the algorithm.
After the G-Buffer is generated, the depth buffer and camera-space normals are passed to a full-screen shader. For each pixel, the shader projects the world-space sample radius (i.e. the size of the hemisphere aligned with the tangent plane) into a screen-space radius in pixels. Using a hash of the pixel coordinates, the shader selects a random rotation angle that is used to pick random sampling points within the projected screen-space disk. For each sampled point it reads the camera-space Z from the depth buffer and evaluates the equation to calculate the occlusion. Finally, the sum is divided by the total number of samples and the normalised occlusion value is returned. And that’s it, really. It is fairly simple; it just takes time to tweak the parameters to get the desired artistic result.
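The per-pixel loop above can be sketched CPU-side like this. All names (`sample_term`, `proj_scale`, `turns`) are mine, and the hash constants are only in the spirit of the papers' shaders, not copied from them; `sample_term(sx, sy)` stands in for the per-sample estimator evaluated against the depth buffer.

```python
import math

def ao_pixel(px, py, sample_term, world_radius, proj_scale, z_c,
             n_samples=6, turns=7):
    """Sketch of the per-pixel AO loop.

    px, py: pixel coordinates; z_c: camera-space depth at that pixel;
    proj_scale converts a world-space size at unit depth into pixels;
    sample_term(sx, sy) returns the occlusion contribution of one
    depth-buffer sample.
    """
    # 1. Project the world-space radius into a screen-space radius:
    #    the same hemisphere covers fewer pixels the farther away it is.
    ss_radius = proj_scale * world_radius / max(z_c, 1e-6)
    # 2. Hash the pixel coordinates into a per-pixel rotation angle so
    #    neighbouring pixels use different sample directions.
    base_angle = float((3 * px) ^ (py + px * py)) * 10.0
    total = 0.0
    for i in range(n_samples):
        # 3. Pick sample points on an outward spiral inside the disk.
        alpha = (i + 0.5) / n_samples
        angle = alpha * turns * 2.0 * math.pi + base_angle
        sx = px + int(alpha * ss_radius * math.cos(angle))
        sy = py + int(alpha * ss_radius * math.sin(angle))
        total += sample_term(sx, sy)
    # 4. Normalise by the sample count.
    return total / n_samples

# With a constant per-sample term the result is just that constant.
print(ao_pixel(10, 20, lambda x, y: 0.5, 1.0, 500.0, 2.0))  # -> 0.5
```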
The algorithm scales with the number of samples, the sample radius, and the screen resolution. The authors found that as the sample radius increased, the number of texture cache misses increased, and with it the render time. This is one of the main drawbacks of the algorithm, and it is where the need for our next algorithm, SAO, came from.
Scalable Ambient Obscurance
SAO builds upon the work produced for the AlchemyAO algorithm, leaving all the maths the same and just tweaking the way the algorithm is implemented. As discussed previously, the main drawback of AlchemyAO was the limit on the sample radius, which really only allowed for the calculation of local occlusion effects. SAO presents a set of techniques that allow the algorithm to work with both forward and deferred renderers, as well as guarantee a fixed execution time independent of scene complexity and sample radius.
There are three core optimisations presented in the paper. First, they discuss methods to ensure the accuracy of reconstructed depths is maintained throughout a frame. Second, they show that by pre-filtering the depth buffer into a mip chain, larger radii can be used with fewer cache misses by selecting the most efficient mip level for each sample. Finally, they show that more accurate results can be generated by calculating face normals in the shader directly from the depth buffer: the error in the calculated normal is around 0.2 degrees, which is less than the error introduced by reading normals from an RGBA8 texture.
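The mip-selection part of the second optimisation is the easiest to sketch: the farther a sample is from the shaded pixel on screen, the coarser the depth mip it reads from, so each fetch stays cache-friendly. The constants below are illustrative defaults, not necessarily the paper's exact values.

```python
import math

def choose_mip(ss_offset_px, log_max_offset=3, max_mip=5):
    """Pick a depth mip level from the screen-space distance (in
    pixels) between the shaded pixel and the sample point.

    Nearby samples read full-resolution depth (mip 0); distant samples
    read progressively coarser mips, capped at max_mip.
    """
    if ss_offset_px < 1.0:
        return 0
    mip = int(math.floor(math.log2(ss_offset_px))) - log_max_offset
    return max(0, min(max_mip, mip))

print(choose_mip(4))    # small offset -> full-resolution depth, mip 0
print(choose_mip(512))  # large offset -> coarsest allowed mip, mip 5
```

This is what bounds the cost: no matter how large the world-space radius gets, far-away fetches touch only a small, coarse texture.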
As mentioned above, together these optimisations ensure a constant execution time independent of scene complexity or sample radius, as shown below.
Deep G-Buffer AO
SAO still suffers from issues common to any screen-space technique: it is not temporally stable, and it generally misses information for partially occluded surfaces. Below is an example taken from the deep G-Buffer paper. You can see where AO has been missed by the single-layer version on the left and corrected by the 2-layer G-Buffer on the right.
As you can see, partially occluded surfaces are prone to “ghosting”, where you find halos around foreground objects. The toaster is a good example: the oven is white around the edges of the toaster because the depth information behind it is not available.
The modifications to the technique are relatively minor. When sampling a test point Q, the shader reads the depth of both the first and second layers. It then calculates the occlusion due to these two points and keeps whichever contributes the most. In the example above, when sampling near the edge of the oven, the single-layer shader ends up sampling the toaster in the foreground, which doesn’t contribute to the occlusion. With two layers it also has access to the position behind the toaster, likely close to the oven, which does contribute to the occlusion.
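The per-sample tweak boils down to a max over the two layers. In this sketch, `occlusion_from_depth` stands in for the single-layer estimator evaluated with a point reconstructed at a given depth; the names and the toy estimator are mine.

```python
def deep_ao_sample(occlusion_from_depth, z_layer1, z_layer2):
    """Two-layer variant of one AO sample: evaluate the estimator
    against both depth layers and keep whichever occludes more, so a
    foreground occluder in layer 1 no longer hides the surface
    recorded behind it in layer 2."""
    return max(occlusion_from_depth(z_layer1),
               occlusion_from_depth(z_layer2))

# Toy estimator: occlusion falls off with distance from the shaded
# point, which sits at depth 5.0.
est = lambda z: max(0.0, 1.0 - abs(z - 5.0))

# Layer 1 holds a nearby foreground object (the toaster, z = 1.0)
# that contributes nothing; layer 2 holds the oven wall just behind
# it (z = 5.2), which does occlude.
print(deep_ao_sample(est, 1.0, 5.2))  # layer 2 wins
```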
Using multiple layers does affect the run-time cost, but the benefits are quite impressive. Although it may look like only a minor difference between the two images, what you don’t see is that the inaccurate areas change as the camera moves, creating temporal inconsistency as areas flash dark and bright when objects become occluded.
Integration with my framework
Finally, now that we understand how it all works, it’s time to look at some examples taken from my own implementation, along with some discussion of problem areas and where the project progresses from here.
Below are some examples of the difference between single and double layer AO.
From the above you can see that we are missing some AO around the edges of the pillar in the first picture, and similarly inside the bowl in the second picture.
The final result is very noisy, and that is due to a low sample count: at the moment I am using 6 samples, whereas the deep G-Buffer demo uses 50. It is also because I have yet to implement the final depth-aware blur that diffuses some of the noise; I just wanted to pick something simple for now to display the results. I will follow up with some additional screenshots of the final version with higher sample counts and blurring. This example ran the whole AO pass in 0.43 ms, which is quite slow for just 6 samples. However, the shader is running in debug mode and relies on a few runtime parameters. Eventually I will run a test with optimisations enabled and reduce the reliance on runtime variables, allowing the shader compiler to make further optimisations.
Just to show off a little, I have included a short gif of the fully shaded scene using the calculated AO to modulate an ambient term. As the scene switches between AO enabled and disabled, you can see the impact it has on the visual quality.
Again, please excuse the noise. That will be fixed in the next set of changes.
I am happy with the final implementation; I feel it produces some appealing results. I am also happy with how I have managed to grasp the concept, and I hope that I have given an understandable explanation of how the algorithm works. From here I will make some modifications to ensure performance and quality are up to standard. I will then move on to look at Temporal Supersampling Antialiasing (TSAA) before looking at implementing indirect lighting. As I did for AO, I will break down a frame from the demo to get a better understanding before re-reading the papers and attempting an implementation. I have fallen behind where I really should be with the written work, so I hope to make some progress there in the next couple of weeks. This has ended up as a longer post than I intended (2057 words), so I will likely only post a small update next week unless something interesting happens.