At the beginning of last week, I analysed a single frame from the demo application. For the past week and a bit I have been using that information to implement the indirect lighting effect in my own application. I will now cover the theory and implementation details of the algorithm before finishing with some discussion and analysis and, as always, some screenshots and gifs.
The indirect lighting algorithm used in the deep G-Buffer paper is a modification of the paper “A Deferred Shading Pipeline for Real-Time Indirect Illumination” by Cyril Soler, Olivier Hoel and Frank Rochet. The full paper can be found here
In practice, the algorithm is very similar to the Scalable Ambient Obscurance (SAO) algorithm discussed in a previous post. It makes sense that the algorithms are similar as they are both estimating global lighting effects. The only difference is that one estimates the amount of light that won’t reach a point while the other estimates the amount of light that will reach it.
The above image is a visualisation of how the core of the algorithm works. We estimate the amount of outgoing radiance from point X based on incoming radiance from point Y along vector w. The equation to calculate this contribution is shown below.
Here the outgoing radiance E at point X is the sum of the incoming radiance B from each point Y multiplied by the clamped dot product between vector w and the normal at point X. It is a relatively intuitive algorithm. We just need to sample a number of points around our current pixel and calculate their world-space positions so that we can calculate vector w, which we then dot with the normal at our current pixel, sampled from the G-Buffer. This is extended to two layers by calculating the result for both layers of the G-Buffer and then taking the value where w · Nx > 0 and w · Ny < 0, where Nx and Ny are the normals at X and Y. This ensures that radiance leaving point Y could actually be directed towards point X. A note made in the paper is that the second check can be omitted to improve performance at the cost of accuracy. This can prove to be quite a substantial performance improvement as we don’t need to sample the normal for every Y, which reduces the total bandwidth requirements.
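To make the sum a little more concrete, here is a minimal CPU-side sketch in Python rather than shader code. It assumes the estimator has the form E(X) ≈ (2π / N) Σ B(Y) max(w · Nx, 0); the 2π / N scale is my assumption for a uniform hemisphere estimate and is not taken from my shader. The function name `estimate_irradiance` and the sample layout are hypothetical, and samples from both G-Buffer layers would simply go into the same list.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)

def estimate_irradiance(x_pos, x_normal, samples, check_sample_normal=True):
    """Estimate the light arriving at X from a list of sample points Y.

    samples: list of (position, normal, rgb_radiance) tuples.
    The 2*pi / N scale is an assumed normalisation, not from the post.
    """
    if not samples:
        return (0.0, 0.0, 0.0)
    total = [0.0, 0.0, 0.0]
    for y_pos, y_normal, y_radiance in samples:
        offset = tuple(y - x for x, y in zip(x_pos, y_pos))
        if dot(offset, offset) == 0.0:
            continue  # Y coincides with X, so w is undefined
        w = normalize(offset)
        cos_x = dot(w, x_normal)
        if cos_x <= 0.0:
            continue  # Y is behind the surface at X
        if check_sample_normal and dot(w, y_normal) >= 0.0:
            continue  # surface at Y faces away from X (the optional second test)
        for channel in range(3):
            total[channel] += y_radiance[channel] * cos_x
    scale = 2.0 * math.pi / len(samples)
    return tuple(scale * c for c in total)
```

Passing `check_sample_normal=False` corresponds to dropping the second test, which is the variant that avoids reading the normal at every Y.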
In the previous posts, I broke down the work into a schedule that looked something like the one displayed below.
- Manual mipmap generation for colour and normals
- Scene lighting computed in HDR
- Generate radiosity from the lit scene
- Shade scene using calculated radiosity (Excluding env map)
- Add env map to shading
- Bloom (Optional)
This actually turned out to be a pretty good task estimate based on what I could decipher from the frame analysis. I missed out the radiosity filtering but that was about it. I didn’t end up completing the work in the order given above, so I will instead continue the explanation in the order the work was carried out, starting with pre-lighting.
We start with pre-lighting the scene as it is the first step in the process and one of the simplest to implement. This step calculates basic diffuse lighting for both layers of the G-Buffer, which will be used as the outgoing radiance B in the radiosity calculation. We include the previous frame’s calculated radiosity to approximate multiple bounces of indirect lighting. This was a little complex as it requires temporal reprojection between frames to ensure a correct result when moving. Luckily I have become pretty familiar with reprojection methods as I have had to use them for both G-Buffer generation and temporal anti-aliasing. One interesting addition, which I might touch on in another post, is the inclusion of emissive lighting in this pass, which allows emissive objects to contribute dynamic lighting to the rest of the scene.
At the same time as calculating the diffuse lighting for both layers, we pack the normal for each layer into a single buffer. This helps to reduce the required texture reads in the radiosity calculation as we can get both normals from a single texture. However, since we only have 4 channels we can only store the X and Y components of each normal. We then reconstruct the Z component in the shader. This comes with a reduction in accuracy, although when looking at the original and reconstructed normals I couldn’t really tell the difference, so I doubt the implications are anything to worry about.
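As a rough illustration of the packing, the sketch below stores the X and Y of two normals in one 4-channel value and rebuilds Z from the unit-length constraint. It is written in Python rather than shader code, the function names are hypothetical, and it assumes the normals are unit length with a non-negative Z (for example camera-space normals facing the viewer), since the sign of Z is lost by this scheme.

```python
import math

def pack_normals(normal_layer0, normal_layer1):
    """Keep only X and Y of each normal: (x0, y0, x1, y1)."""
    return (normal_layer0[0], normal_layer0[1],
            normal_layer1[0], normal_layer1[1])

def unpack_normal(x, y):
    """Rebuild Z from x^2 + y^2 + z^2 = 1, assuming Z >= 0."""
    z = math.sqrt(max(0.0, 1.0 - x * x - y * y))
    return (x, y, z)

def unpack_normals(packed):
    """Return the reconstructed normals for both layers."""
    return (unpack_normal(packed[0], packed[1]),
            unpack_normal(packed[2], packed[3]))
```

The `max(0.0, ...)` guard is there because quantised X and Y values can push the sum of squares slightly above one.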
We use multiple render target output to write all of these values to three targets at the same time. This avoids having to set up multiple passes and simplifies the pipeline a little, which is helpful as more stages are added to the algorithm.
Custom MIP-MAP Generation
This works in exactly the same way as it did for the SAO algorithm. Since we are sampling many points in a large radius, the radiosity algorithm benefits from the same reduction in cache misses that we discussed in the AO post. Since we had already done this for downsampling the camera-space depth buffer, it was easy to apply the same process to the diffuse and normals rendered in the previous frame. The only additional work required was the inclusion of a RenderTargetBundleMipView class so that we could access a specific MIP level of all 3 targets and submit them together.
This completes all of the prep work required for the radiosity algorithm so we can now pass this downsampled data to the radiosity shader for computation.
The radiosity shader is actually quite simple. It shares a lot of the same functions that were used in the AO algorithm, so not much new had to be added to get the final result. Once the sample point Y has been chosen, we just sample the data from the textures and feed it into the equation. The only difference is that each result includes a confidence weight that tells us how many correct samples were calculated for the given point. This is important as it tells us how accurate the calculated value is likely to be. Later, this confidence value is used to mix the calculated radiosity with an ambient term taken from a static environment map.
The radiosity algorithm has a few parameters we can modify, including the world-space sample radius and the total number of samples. These two values have the greatest impact on the quality of the final calculation. As you would imagine with any Monte-Carlo style calculation, the more samples we can take the more accurate the final result will be. In the same way, a smaller sample radius will produce a more accurate calculation as the same number of samples covers the area more densely. However, a smaller radius misses out on contribution from other parts of the scene and thus only creates local radiosity effects, which we don’t always want. In practice there is no perfect radius; it really depends on the effect that is required.
As discussed in the theory section we can omit the second normal test which reduces the performance cost substantially allowing for greater sample numbers. This, however, does reduce the accuracy of the final result so you would need to decide if a less noisy result is preferred over accuracy. In most cases, you may decide to take this approach as you are more likely to notice the noise over the reduction in accuracy.
To help reduce some of the noise we put the raw radiosity through some filtering. This includes temporal accumulation with the previous result as well as the same bilateral blur we used to reduce the noise in the AO.
The temporal step works in the same way as the TSAA algorithm, which requires that we add some jittering to the radiosity calculation. This is done by adding the current scene time to the calculated sampling angle. This results in a rotated sampling pattern in screen space, effectively including samples that were missed in the previous frame. We then average these new values with the previous frame, gradually accumulating more and more samples and effectively increasing the perceptual sample count over multiple frames. This works great for reducing the noise in the final result but requires more temporal reprojection to ensure we are accumulating the correct samples when the camera is moving. Just as before, since I have had to do this a few times, this was relatively easy to add.
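The jitter and accumulation can be sketched roughly as follows, in Python rather than shader code. The spiral layout and the `spiral_turns` value are assumptions in the style of the SAO sampling pattern, and `history_weight` is a hypothetical blend factor; the reprojection step is left out entirely.

```python
import math

def sample_offset(index, sample_count, radius, time_seconds, spiral_turns=7):
    """Screen-space offset for one tap; adding the scene time rotates the pattern."""
    alpha = (index + 0.5) / sample_count
    angle = alpha * spiral_turns * 2.0 * math.pi + time_seconds
    distance = alpha * radius
    return (distance * math.cos(angle), distance * math.sin(angle))

def accumulate(previous, current, history_weight=0.9):
    """Blend this frame's radiosity with the (already reprojected) history."""
    return tuple(history_weight * p + (1.0 - history_weight) * c
                 for p, c in zip(previous, current))
```

With a `history_weight` of 0.5 this is a plain average of the two frames; larger values keep more history, which smooths noise further at the cost of slower response to lighting changes.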
Now that we have computed and filtered the indirect lighting, we can apply it in the final shading pass. As discussed previously, we use the confidence value that was calculated with the radiosity to mix the result with a static environment map. At this point, I didn’t have an environment map available, so I instead just mixed it with a constant ambient term, which worked in a similar fashion. Even at this stage, I found that I was able to produce some nice screenshots for maybe a couple of days’ work.
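The mix itself is just a linear interpolation. A minimal sketch in Python, assuming the confidence is a value in [0, 1] (for example the fraction of valid samples, which is my assumption here) and with a hypothetical function name:

```python
def mix_indirect(radiosity, ambient, confidence):
    """Fall back to the ambient term where the radiosity estimate is unreliable."""
    t = min(max(confidence, 0.0), 1.0)  # clamp to [0, 1]
    return tuple(a + (r - a) * t for r, a in zip(radiosity, ambient))
```

A confidence of 1 returns the computed radiosity untouched, while a confidence of 0 returns only the ambient (or environment map) term.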
The result looks a little flat as there is no shadowing and only the constant environment term. However, I think the indirect lighting adds a nice softness to the scene, which you can see on the underside of the arches.
High dynamic range is important for adding contrast to the scene, as it lets us store the high-intensity direct lighting alongside the ambient environment and indirect lighting. This will help to make some of the colour bleeding stand out more. In the above image, there is not as much colour bleeding from the curtains onto other surfaces as we saw in the frame we looked at in the previous post.
The addition of HDR was relatively simple. I moved most of the buffers to floating point formats and included a final tonemapping shader before presentation, as well as adding an intensity to the lights so that we could produce direct lighting outside of the [0, 1] range. I ended up using a filmic tonemapping curve that I strangely found on Twitter here. I don’t know if it is any good, but it includes exposure control and seemed to get a lot of likes from other developers so I just went with it.
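To show the general shape of a tonemapping step with exposure control, here is a small Python sketch. It is not the curve I used: as a stand-in it uses Krzysztof Narkowicz’s well-known ACES filmic fit, and the `exposure` parameter and function names are hypothetical.

```python
def tonemap_channel(value, exposure=1.0):
    """Map one HDR channel into [0, 1] using the Narkowicz ACES fit (stand-in curve)."""
    x = max(value, 0.0) * exposure
    mapped = (x * (2.51 * x + 0.03)) / (x * (2.43 * x + 0.59) + 0.14)
    return min(max(mapped, 0.0), 1.0)

def tonemap(rgb, exposure=1.0):
    """Apply the curve to each channel of an HDR colour."""
    return tuple(tonemap_channel(c, exposure) for c in rgb)
```

Whatever curve is used, the important property is the same: values well above 1 are compressed smoothly towards white rather than clipped.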
One of the most important parts that is not a core step in the algorithm is shadows. They create the contrast between the direct and indirect lighting, as shadowed areas can only be lit by ambient and indirect lighting. I have pretty much already limited the project to the use of directional lights, so this limits the shadow maps to orthographic projection only, which avoids some potential problems. The lights in the engine are components in the scene just as any mesh or camera would be, so they already come with their own transform that we can simply apply to a camera to render our shadow map.
Since we are using orthographic shadow maps, we can also compute the ideal projection matrix to reduce the amount of wasted space in the final shadow map. We do this by taking the bounding box computed for the scene and doing some dot products with the camera’s Forward, Up and Right vectors. The resulting values tell us where along the light’s direction we want to place the camera and where we should place the far clipping plane. They also tell us the orthographic height and width of the scene from the light’s perspective.
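The fitting step can be sketched like this in Python, assuming the light’s Right, Up and Forward vectors are orthonormal and the scene bounds are an axis-aligned box. The function names and the returned dictionary layout are hypothetical.

```python
from itertools import product

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def box_corners(box_min, box_max):
    """All 8 corners of an axis-aligned bounding box."""
    return [tuple(corner) for corner in product(*zip(box_min, box_max))]

def fit_orthographic(box_min, box_max, right, up, forward):
    """Fit an orthographic light camera tightly around the scene bounds."""
    corners = box_corners(box_min, box_max)
    xs = [dot(c, right) for c in corners]    # extent across the light's view
    ys = [dot(c, up) for c in corners]       # extent up the light's view
    zs = [dot(c, forward) for c in corners]  # extent along the light direction
    centre_x = 0.5 * (min(xs) + max(xs))
    centre_y = 0.5 * (min(ys) + max(ys))
    near_z = min(zs)
    # Place the camera on the nearest face of the bounds, centred on the scene.
    position = tuple(centre_x * r + centre_y * u + near_z * f
                     for r, u, f in zip(right, up, forward))
    return {
        "position": position,
        "width": max(xs) - min(xs),
        "height": max(ys) - min(ys),
        "far_plane": max(zs) - near_z,
    }
```

Projecting all eight corners rather than just the min and max points matters because the box is generally rotated relative to the light’s axes.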
Although this doesn’t end up producing fantastic shadows, it at least ensures that we make as much use as possible of the data we have available. Finally, in the shader we apply a brute force PCF filter to slightly blur shadow edges. This again isn’t perfect, but there isn’t really the time to investigate more complex methods, and for most cases this technique produces acceptable shadows.
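A brute force PCF filter boils down to running the depth comparison for every texel in a small kernel and averaging the results. The sketch below shows the idea on the CPU in Python, with the shadow map as a 2D list of depths; the `bias` and `kernel_radius` values are hypothetical and it uses nearest-texel lookups rather than hardware comparison sampling.

```python
def pcf_shadow(shadow_map, u, v, receiver_depth, bias=0.005, kernel_radius=1):
    """Return the lit fraction in [0, 1] for a point projected into the shadow map."""
    height = len(shadow_map)
    width = len(shadow_map[0])
    centre_x = min(int(u * width), width - 1)
    centre_y = min(int(v * height), height - 1)
    lit = 0
    taps = 0
    for dy in range(-kernel_radius, kernel_radius + 1):
        for dx in range(-kernel_radius, kernel_radius + 1):
            # Clamp taps that fall outside the map to the nearest edge texel.
            x = min(max(centre_x + dx, 0), width - 1)
            y = min(max(centre_y + dy, 0), height - 1)
            if receiver_depth - bias <= shadow_map[y][x]:
                lit += 1
            taps += 1
    return lit / taps
```

A radius of 1 gives a 3x3 kernel (9 taps); the cost grows with the square of the kernel width, which is why this is only suitable for a slight blur.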
The effect that both HDR and shadows have on the resulting image can be seen in the below screenshot.
Hopefully, you can see the slight red colour bleeding into the shadow just to the right of the red curtain, as well as the nice soft lighting on the shader ball in the middle.
Environment Map Lighting
One of the main issues with the above image is the darkness of the background. This is because the sample confidence falls the further we get from the camera. To combat this, we want to mix the indirect result with a sample taken from an environment map. Environment maps or cube maps are just ways of storing the 360 degrees of lighting around a single point. You can see examples of these cube maps in the previous post, where we looked at the maps that the demo app was using. I ended up using a couple of tools to filter and finally save cube maps into a dds file. Namely, I used both cmft and AMD’s CubeMapGen. These were ideal since they allowed for saving to dds, making it easy to load them into the engine using the DirectX Tool Kit, which creates our shader resource views for us. Super simple.
The lighting is relatively simple. Morgan McGuire (one of the authors of the deep G-Buffer paper) explains a technique where we can produce reasonably accurate diffuse and specular environment lighting using the standard MIP maps generated for the cube texture. For ambient diffuse lighting, we sample from the lowest MIP level using the world-space normal. The lowest MIP level gives us a single pixel per face, which effectively stores the average colour for that direction. This is something like what we would expect from proper environmental diffuse lighting. For specular environment lighting, we reflect the view vector about the normal to give us a ray in the direction of the light that would be reflected into our eye. We select a MIP level based on the roughness value at the current point. McGuire uses the specular exponent for calculating the MIP level. Since we are using roughness, I instead opted for a linear progression between MIP levels using alpha (roughness squared), the same value we used for the specular lighting calculation. This gives us the most detailed MIP level at roughness 0 and the lowest MIP level at roughness 1.
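The lookup parameters can be sketched as follows in Python. The function names are hypothetical, `mip_count` is assumed to be the total number of MIP levels in the cube map (with level 0 the most detailed), and `reflect` takes the incident view direction pointing from the camera towards the surface.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def reflect(incident, normal):
    """Reflect an incident direction about a unit normal."""
    d = dot(incident, normal)
    return tuple(i - 2.0 * d * n for i, n in zip(incident, normal))

def diffuse_mip_level(mip_count):
    """Ambient diffuse samples the smallest MIP level along the normal."""
    return mip_count - 1

def specular_mip_level(roughness, mip_count):
    """Linear progression through the MIP chain using alpha = roughness squared."""
    alpha = roughness * roughness
    return alpha * (mip_count - 1)
```

The specular level is fractional on purpose, so that a trilinear sampler can blend between the two neighbouring MIP levels.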
With all of this added together, we are at a point where we can compare results between projects.
Below I have included the comparison used in the deep G-Buffer paper and one that I created using my own application.
As you can hopefully see, the final results are very similar. There are some minor differences. I used a dragon (definitely cooler), and I think I went a little OTT with the colour bleeding, but this can be dialled in with a few parameters. I think they also used a roughness texture on the curtain, as my curtains look a little more uniform than theirs. The light direction is a little different, as I think they have the light pointed slightly towards the camera, and finally I think the slightly different tonemapping produces slightly different colours. However, all of that aside, I would say the core aspects of the algorithm are successfully reproduced: the soft lighting on the underside of the dragon, the colour bleeding at extreme angles and the general soft lighting that you don’t get when using just direct lights.
Overall I am VERY happy with the final result. The main reason I picked the project was because of the results that they showed in the original paper, without really reading about the deep G-Buffer. I didn’t think that I would be able to produce results comparable to what they had shown so I am happy that I have managed to produce something that looks even a little bit like what they had originally shown.
I can now see how I totally underestimated the amount of work that was required for this project. I would say that, bar the screen-space reflections, all of the project work is complete. There are still some final improvements I can make to the quality. However, I will be leaving these for the showcase and will now be spending most of my time working on the dissertation, which does mean that I won’t implement the screen-space reflections. I just can’t seem to justify the additional work as I feel like I already have plenty to write about. Any additional work would likely be time I could have spent writing the dissertation.
In the following posts I may include some additional details about the implementation. However, I will mostly focus on adding small progress updates as I continue to add sections to the dissertation. So far I have most of the methodology; I just need to selectively copy & paste some of this information over. Currently, with just the deep G-Buffer and AO it already hits the word count, so there shouldn’t be too many issues finishing that off. I think I have quite a lot of images to include in the results. I will just need to take plenty of performance measurements to compare to those shown in the original paper. I feel like the discussion could be a little difficult to write. At the moment I am not really sure how I am meant to process the results and what my arguments are. I will need to discuss this further with my supervisor to see what they think.
I am amazed at how quickly the deadlines have been creeping up. As of this day, there are only ~4.5 weeks left before the dissertation deadline. I think this gives me plenty of time to write the dissertation. I am just worried about how much spare time I will have to discuss it with my supervisor to go over it and make improvements. Luckily I enjoy writing about the project so it should all go by quite fast.