The purpose of this project is to add realistic lighting to the meshes enabled by the previous project. Implementing this lighting requires some kind of intersection algorithm (part 1). Checking intersection over every primitive in a scene is prohibitively slow, so bounding boxes are also enabled, just like in the rasterizer (part 2). The last parts deal with actually tracing the path of light across the scene, simulating some bouncing and also optimizing this at the very end (part 5). It's important to note that all surfaces in this project are currently diffuse, which means that light bounces off of them in random directions to produce a dull, matte appearance.
So begins our raytracing journey!
The raytracing loop generates the specified number of rays per pixel, shooting one ray out from the center of the pixel if there is only 1 sample and choosing a random position within the pixel otherwise. In generating each ray, the origin of the ray is placed on a sensor plane at z-value 1 (with appropriate conversions) and the t values are set. Later on, the t value will serve as a marker for primitive precedence in the scene. A bug arose during the implementation of this part where the t value wasn't being used correctly, causing a wall that should have been behind a sphere to render in front of it.
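The loop described above can be sketched roughly as follows. The struct names, the camera model (a pinhole at the origin looking through a sensor plane at z = 1), and the coordinate conversion are simplifying assumptions, not the project's exact interface:

```cpp
#include <cmath>
#include <cstdlib>

struct Vec3 { double x, y, z; };
struct Ray  { Vec3 o, d; double min_t, max_t; };

// Hypothetical sketch of camera-ray generation. With one sample we shoot
// through the pixel center; otherwise we jitter uniformly within the pixel.
Ray generate_ray(int px, int py, int w, int h, int num_samples) {
    double dx = 0.5, dy = 0.5;                 // pixel center
    if (num_samples > 1) {                     // random position otherwise
        dx = (double)rand() / RAND_MAX;
        dy = (double)rand() / RAND_MAX;
    }
    // Convert pixel coordinates to sensor-plane coordinates at z = 1.
    double sx = 2.0 * (px + dx) / w - 1.0;
    double sy = 2.0 * (py + dy) / h - 1.0;
    Ray r;
    r.o = Vec3{sx, sy, 1.0};                   // origin on the sensor plane
    double len = std::sqrt(sx * sx + sy * sy + 1.0);
    r.d = Vec3{sx / len, sy / len, 1.0 / len}; // direction away from the pinhole
    r.min_t = 0.0;                             // t interval set here; used later
    r.max_t = 1e30;                            // for primitive precedence
    return r;
}
```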
The ray intersection algorithm uses the Möller-Trumbore algorithm. This algorithm produces 3 numbers: (1) the ray parameter t at the intersection, (2) the first barycentric coordinate b1, and (3) the second barycentric coordinate b2. The third coordinate, b3, is found by:
b3 = 1 - b1 - b2
Finally, we can test whether the intersection lies inside the triangle by checking that the barycentric coordinates are all between 0 and 1, and test whether another primitive obstructs the view by comparing t values. The barycentric coordinates are also useful for shading (i.e., interpolating the normal).
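A minimal sketch of the Möller-Trumbore test described above (the vector helpers and function names are my own; the project's actual code will differ):

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Möller-Trumbore: fills t, b1, b2 and returns true if the ray o + t*d
// hits triangle (p0, p1, p2). b3 = 1 - b1 - b2 follows from the other two.
bool moller_trumbore(Vec3 o, Vec3 d, Vec3 p0, Vec3 p1, Vec3 p2,
                     double &t, double &b1, double &b2) {
    Vec3 e1 = sub(p1, p0), e2 = sub(p2, p0);
    Vec3 pvec = cross(d, e2);
    double det = dot(e1, pvec);
    if (std::fabs(det) < 1e-12) return false;   // ray parallel to triangle
    double inv = 1.0 / det;
    Vec3 tvec = sub(o, p0);
    b1 = dot(tvec, pvec) * inv;                 // first barycentric coordinate
    if (b1 < 0.0 || b1 > 1.0) return false;
    Vec3 qvec = cross(tvec, e1);
    b2 = dot(d, qvec) * inv;                    // second barycentric coordinate
    if (b2 < 0.0 || b1 + b2 > 1.0) return false;
    t = dot(e2, qvec) * inv;                    // ray parameter at the hit
    return t > 0.0;
}
```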
Here are a few images generated by the raytracing algorithm:



Note that the cow image took 800+ seconds to render with 8 threads on a VM.
In this portion we work on accelerating the previous part.
The purpose of the BVH is to focus the intersection checks on small batches of primitives, minimizing needless checks in empty space. To construct the tree of primitive batches, the algorithm recursively splits each node's primitives between two child bounding boxes until the nodes are small enough to become leaves.
The first step of traversal, checking for bounding box intersection, is quite simple. The algorithm checks every axis against the ray's min_t and max_t, updating a local copy of that interval as necessary. The test passes if no check for x, y, or z fails, and the logic is purposefully duplicated so that failures are detected early. For example, if we know the x-interval falls outside the ray's t-interval, then there is no need to check y or z.
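A sketch of that slab test, with the per-axis logic written as a loop rather than unrolled (the early exit is preserved by the `return false` inside the loop; names and signature are assumptions):

```cpp
#include <algorithm>

// Axis-aligned bounding-box test: intersect the ray's [min_t, max_t]
// interval with the slab interval on each axis, bailing out as soon as
// the running interval becomes empty (no need to check remaining axes).
bool bbox_intersect(const double bmin[3], const double bmax[3],
                    const double o[3], const double d[3],
                    double min_t, double max_t) {
    for (int axis = 0; axis < 3; ++axis) {
        double t0 = (bmin[axis] - o[axis]) / d[axis];
        double t1 = (bmax[axis] - o[axis]) / d[axis];
        if (t0 > t1) std::swap(t0, t1);   // handle negative ray direction
        min_t = std::max(min_t, t0);      // shrink a local copy of the interval
        max_t = std::min(max_t, t1);
        if (min_t > max_t) return false;  // early out: interval is now empty
    }
    return true;                          // no axis failed
}
```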
Now, given the same resources, instead of rendering in 800+ seconds, the cow renders in 1.153 seconds... Almost a 700x speedup.

Two additional renders showing off the new computational speed that would have taken hours before:


Let's test the relationship between the number of primitives and rendering time, with the additional variable of 1 or 2 camera rays per pixel.
Wow! One observation here is that rendering time now seems unrelated to the number of primitives, whereas before, increasing the number of primitives had a detrimental effect on performance. Each render, given the same number of samples, takes approximately the same amount of time. The number of primitives did increase the time to construct the BVH, from 0.04 s for the TEAPOT to about 2.0 s for LUCY. Naturally, supersampling increases rendering time, but that's nothing new.
The performance test demonstrates the power of this acceleration technique. I could see how, if there were some way to preprocess the BVH ahead of time, this raytracer would be handier for the real-time rendering required by something graphically demanding like a video game.
The next task is to implement some realistic shading using the previously defined intersection methods. The two methods of estimating the effect of light on the scene are uniform hemisphere sampling and importance sampling over the light sources.
Both methods use some small value epsilon to push the start of the cast ray slightly above the hit point, so it doesn't intersect with itself. The following images use 128 rays per pixel and 64 samples per light.


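The epsilon offset described above can be sketched like this. `EPS_F`, the struct, and the field names are assumptions; the idea is simply that the ray's valid t-interval starts just past the surface so the shadow ray cannot re-hit the surface it left:

```cpp
struct V3 { double x, y, z; };
struct ShadowRay { V3 o, d; double min_t, max_t; };

const double EPS_F = 1e-4;  // assumed small epsilon constant

// Build a shadow ray from a hit point toward a sampled light direction wi.
ShadowRay make_shadow_ray(V3 hit, V3 wi, double dist_to_light) {
    ShadowRay r;
    r.o = hit;                        // origin at the hit point
    r.d = wi;                         // direction toward the light sample
    r.min_t = EPS_F;                  // start slightly above the surface
    r.max_t = dist_to_light - EPS_F;  // stop just short of the light itself
    return r;
}
```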
Importance sampling seems to be much more effective at quickly converging to the proper lighting and eliminating noise. Unfortunately, I worked very hard over the course of a week to eliminate the slight extra darkness in the importance render, to no avail. Note that this is why, in the next two parts, the importance renders are marginally darker. Below, we'll also examine the effect of increasing the samples per area light (all rendered at 1 ray/pixel):




It's important to note that increasing the number of samples per light increases the accuracy of the lighting estimate, but the gritty quality remains because each pixel is still undersampled at 1 camera ray per pixel.


In conclusion, the accuracy of the lighting is apparent in both renders. However, the uniform hemisphere images generally contain more noise than the importance-sampled images at the same sampling rate. At a high enough level of supersampling, the uniform hemisphere image would probably look indistinguishable from the importance-sampled image, but that's a huge computational tax. The images above should demonstrate that sampling over the light source uses far fewer resources and produces better quality even at low sampling rates.
This time, we enable multiple ray bounces. Overall, this method adds light onto the previous part to produce more realistic shading. At minimum, the path tracer will do everything from Part 3. If the specified ray has depth > 1, then at least one additional indirect ray will be cast. The exact number of additional bounces will be between 1 and m, and for the sake of runtime, there is a small chance of ray termination at each bounce (Russian roulette). The indirect ray bounces off the hit point in a random direction, recursively calling the illumination function to determine the irradiance at the next point, as determined by the sampling method (hemisphere or importance).
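The core of one recursive step can be sketched as below. To keep the sketch testable, the direct lighting, the recursively estimated indirect radiance, the BSDF/cosine/pdf weight, and the random number `u` are passed in as plain values; in the real tracer these would come from scene queries and a sampler, so all the names here are assumptions:

```cpp
// One step of the global illumination estimate with Russian roulette.
// `u` in [0, 1) decides termination; we continue only if u < cont_p.
double radiance(int depth, double u, double cont_p,
                double direct, double indirect, double bsdf_cos_over_pdf) {
    double L = direct;            // at minimum, Part 3's direct estimate
    if (depth <= 1) return L;     // no indirect rays at depth 1
    if (u >= cont_p) return L;    // Russian roulette termination
    // Weight the recursive contribution by the sampling factor and by
    // 1/cont_p so the estimator stays unbiased despite random termination.
    L += indirect * bsdf_cos_over_pdf / cont_p;
    return L;
}
```

Dividing by the continuation probability is what keeps the roulette trick from darkening the image on average: the rays that survive are counted more to make up for the ones that were killed.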
Below are hemisphere and importance samples of global illumination. Compare them to the direct illumination above.


Below are the effects of observing only direct lighting vs only indirect lighting (since our algorithm combines them both for rays of depth greater than one) with max ray depth 6.


Below are renders of the same bunny at different max ray depths, again with combined indirect and direct lighting.





Finally, we'll use different sampling rates for the same light to test all the work so far, using the sphere box scene, with max ray depth 6 and 4 light samples.








Using a bit of statistics can cut down on computational resources when part of a scene is consistent over many samples. To do that, for every sample, I keep a running total of the pixel samples' illuminance and another total of the squared illuminance. With some calculations, once a certain number of samples has been taken, we can decide with 95% confidence whether the samples are consistent enough to stop computing. These checks are made only once per specified batch of samples.
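The convergence check can be sketched from those two running totals, using z = 1.96 for the 95% confidence level (the function name and the exact tolerance formulation are assumptions):

```cpp
#include <cmath>

// Adaptive-sampling convergence test. s1 is the running sum of sample
// illuminance, s2 the running sum of squared illuminance, n the number
// of samples taken so far. The pixel is considered converged when the
// 95% confidence half-width I = 1.96 * sigma / sqrt(n) falls within
// max_tolerance times the mean illuminance mu.
bool converged(double s1, double s2, int n, double max_tolerance) {
    double mu  = s1 / n;
    double var = (s2 - s1 * s1 / n) / (n - 1);  // unbiased sample variance
    double I   = 1.96 * std::sqrt(var / n);
    return I <= max_tolerance * mu;
}
```

Because `s1` and `s2` are just running sums, this check costs almost nothing per sample, which is why it only needs to run once per batch.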
The final product of the first part of this project is displayed below, with sampling rate 2048, ray depth 5, adaptive sampling 8 per batch at max tolerance 0.085, and 1 sample per light.


The bunny is sampled at a slightly higher rate than it should be because I had to tweak the max tolerance from 0.05 to 0.085, likely due to the dimness glitch from Part 3. I ran out of time to find the exact max tolerance needed, but after many, many attempts, it is interesting to observe how increasing the max tolerance yields a faster render at the cost of a lower sample rate (blue). This adaptive approach errs on the side of too many samples (red), but shifting the max tolerance up slightly can compensate for that.