This post is a devlog for my pathtracer-playground project. Follow along to see the progress, starting from scratch by opening a simple window.
Filmic tone mapping and better tone mapping controls
2023-12-03
Commit: 37bfce8
The `expose` function from the previous entry was not a very good tone mapping curve, and it was baked into the raytracing shader.
The ACES filmic tone mapping curve is a curve fit of a complex tone mapping algorithm used in the film industry and looks better than the curve used previously.
A simple exposure control multiplier is introduced, defined in terms of stops.
$$ \mathrm{exposure} = \frac{1}{2^{\mathrm{stop}}}, \mathrm{stop} \geq 0. $$
The exposure is halved each time the stop is incremented. The Monte Carlo estimator is scaled by the exposure multiplier and passed through the tone mapping function, which yields the final value.


A linear tone map with stops = 4 yields the following image.
The filmic tone map with stops = 4 yields a much better-looking image.
Integrating a physically-based sky model
2023-11-20
Commit: c040325
The Hosek-Wilkie sky model is integrated into the renderer. The model implementation is a straightforward C port of the hw-skymodel implementation of the Hosek-Wilkie model.
The model provides physically-based radiance values for any direction in the sky, assuming the viewer is on the ground of the Earth. The model was obtained by doing brute-force simulations of the atmospheric scattering process on Earth, and then fitting the results into a simple model that is much faster to evaluate at runtime.
The sky model parameters, such as the sun’s position, can be updated in real-time. This involves recomputing the values in a lookup table and uploading them to the GPU. The lookup table values are used to compute the radiance for each ray which scatters towards the sky:


The model does not return RGB color values, but physical radiance values along each ray. These values are much larger than the [0, 1] RGB color range. The following `expose` function is used to map the radiance values to a valid RGB color:


Here is what the full sky dome looks like, rendered about the vertical y-axis.
The light scattering off the duck model also looks distinctly different from before.
Even better estimator: simple temporal accumulation
2023-11-15
Commit: b10570a
Temporal accumulation is implemented and the sum term is added back to our Monte Carlo estimator:
$$ F_n = \frac{1}{n} \sum_{i = 1}^{n} k_{\textrm{albedo}}L_i(\omega_i), $$
One sample per pixel is accumulated each frame, so the estimator runs over multiple frames. If the camera moves, or different rendering parameters are selected, rendering the image starts from scratch.
The duck after 256 samples per pixel:
Better estimator: importance sampling surface normal directions
2023-11-14
Commit: 4a76335
Cosine-weighted hemisphere sampling is implemented. The term \((\omega_i \cdot \mathrm{n})\) in the rendering equation equals \(\cos \theta\), where \(\theta\) is the zenith angle between the surface normal and the incoming ray direction. Directions which are close to perpendicular to the surface normal contribute very little light due to the cosine. It would be better to sample directions close to the surface normal.
The Monte Carlo estimator for non-uniform random variables \(X_i\) drawn from a probability density function (PDF) \(p(x)\) is
$$ F_n = \frac{1}{n} \sum_{i = 1}^{n} \frac{f(X_i)}{p(X_i)}. $$
Choosing \(p(\omega) = \frac{\cos \theta}{\pi}\) and, like before, taking just one sample, our estimator becomes:
$$ F_1 = \frac{\pi}{\cos \theta} f_{\textrm{lam}}L_i(\omega_i)(\omega_i \cdot \mathrm{n}). $$
We can substitute \((\omega_i \cdot \mathrm{n}) = \cos \theta\) and \(f_{\textrm{lam}} = \frac{k_{\textrm{albedo}}}{\pi}\) to simplify the estimator further:
$$ F_1 = k_{\textrm{albedo}}L_i(\omega_i), $$
where our light direction \(\omega_i\) is now drawn from the cosine-weighted hemisphere. The code is simplified to just one function, which both samples the cosine-weighted direction and evaluates the albedo:


Inverse transform sampling the cosine-weighted unit hemisphere.
The exact same recipe that was used for the unit hemisphere can be followed here to obtain `rngNextInCosineWeightedHemisphere`. While the recipe is introduced in Physically Based Rendering, the book does not present this particular calculation.
Choice of PDF
Set probability density to be \(p(\omega) \propto \cos \theta\).
PDF normalization
Integrate \(p(\omega)\) over hemisphere H again.
$$ \int_H p(\omega) d\omega = \int_{0}^{2 \pi} \int_{0}^{\frac{\pi}{2}} c \cos \theta \sin \theta d\theta d\phi = \frac{c}{2} \int_{0}^{2 \pi} d\phi = 1 $$ $$ \Longrightarrow c = \frac{1}{\pi} $$
Coordinate transformation
Again, using the result \(p(\theta, \phi) = \sin \theta \, p(\omega)\), obtain
$$ p(\theta, \phi) = \frac{1}{\pi} \cos \theta \sin \theta. $$
\(\theta\)’s marginal probability density
$$ p(\theta) = \int_{0}^{2 \pi} p(\theta, \phi) \, d \phi = \int_{0}^{2 \pi} \frac{1}{\pi} \cos \theta \sin \theta \, d \phi = 2 \sin \theta \cos \theta. $$
\(\phi\)’s conditional probability density
$$ p(\phi \mid \theta) = \frac{p(\theta, \phi)}{p(\theta)} = \frac{1}{2 \pi} $$
Calculate the CDFs
$$ P(\theta) = \int_{0}^{\theta} 2 \cos \theta^{\prime} \sin \theta^{\prime} \, d \theta^{\prime} = 1 - \cos^2 \theta $$ $$ P(\phi \mid \theta) = \int_{0}^{\phi} \frac{1}{2 \pi} \, d\phi^{\prime} = \frac{\phi}{2 \pi} $$
Inverting the CDFs in terms of \(u\)
$$ \cos \theta = \sqrt{1 - u_1} = \sqrt{u_1} $$
$$ \phi = 2 \pi u_2 $$
where \(u_1 \mapsto 1 - u_1\) is substituted again. Converting \(\theta\), \(\phi\) back to Cartesian coordinates, note that \(z = \cos \theta = \sqrt{u_1}\).
$$ x = \sin \theta \cos \phi = \sqrt{1 - u_1} \cos(2 \pi u_2) $$ $$ y = \sin \theta \sin \phi = \sqrt{1 - u_1} \sin(2 \pi u_2) $$ $$ z = \sqrt{u_1} $$
In code:


Looking at our duck, before:
And after, with cosine-weighted hemisphere sampling:
The first physically-based pathtracer
2023-11-13
Commit: c83b83e
A simple physically-based pathtracer is implemented. ✨ The rendering equation is solved using the following Monte Carlo estimator:
$$ F_1 = 2 \pi f_r(\mathrm{x_i},\omega_i)L_i(\mathrm{x_i},\omega_i)(\omega_i\cdot\mathrm{n}), $$
where \(2 \pi\) is the surface area of the unit hemisphere, \(f_r\) is the bidirectional reflectance distribution function (BRDF), \(\omega_i\) is the incoming ray direction, and \(L_i\) is the incoming radiance along \(\omega_i\). The estimator is based on the Monte Carlo estimator for \(\int_{a}^{b} f(x) dx\), where the random variable is drawn from a uniform distribution:
$$ F_n = \frac{b-a}{n} \sum_{i=1}^{n} f(X_i), $$
where \(n=1\) is set for now, and the surface area of the hemisphere is used instead of the 1-dimensional range \(b-a\).
For simplicity, only the Lambertian BRDF is implemented at this stage, with \(f_{\textrm{lam}}=\frac{k_{\textrm{albedo}}}{\pi}\). Drawing inspiration from Crash Course in BRDF implementation, a pair of functions are implemented:


Inverse transform sampling the unit hemisphere.
To obtain `rngNextInUnitHemisphere`, we can follow a recipe introduced in Physically Based Rendering. Here’s a quick summary of the calculation, which also partially appears in the book.
The recipe
In order to do Monte Carlo integration, you need to do inverse transform sampling with your selected probability density function (PDF). Here are the general steps to arrive at a function that you can use to draw samples from.
- Choose the probability density function.
- If the density function is multidimensional, compute the marginal density (integrate out one variable) and the conditional density for the other variable.
- Calculate the cumulative distribution function (CDF).
- Invert the CDFs in terms of the canonical uniform random variable, \(u\).
Choice of PDF
Uniform probability density over all directions \(\omega\), \(p(\omega) = c\).
PDF normalization
The integral over the hemisphere H yields the surface area of a hemisphere:
$$ \int_{H} c d \omega = 2 \pi c = 1 $$
$$ c = \frac{1}{2 \pi} $$
Coordinate transformation
Use the result that \(p(\theta, \phi) = \sin \theta \, p(\omega)\) to obtain
$$ p(\theta, \phi) = \frac{\sin \theta}{2 \pi} $$
\(\theta\)’s marginal probability density
$$ p(\theta) = \int_{0}^{2 \pi} p(\theta, \phi) \, d \phi = \int_{0}^{2 \pi} \frac{\sin \theta}{2 \pi} \, d \phi = \sin \theta $$
\(\phi\)’s conditional probability density
In general, \(\phi\)’s probability density would be conditional on \(\theta\) after computing \(\theta\)’s marginal density.
$$ p(\phi \mid \theta) = \frac{p(\theta, \phi)}{p(\theta)} = \frac{1}{2 \pi} $$
Calculate the CDFs
$$ P(\theta) = \int_{0}^{\theta} \sin \theta^{\prime} \, d\theta^{\prime} = 1 - \cos \theta $$
$$ P(\phi \mid \theta) = \int_{0}^{\phi} \frac{1}{2 \pi} \, d\phi^{\prime} = \frac{\phi}{2 \pi} $$
Inverting the CDFs in terms of \(u\)
$$ \theta = \cos^{-1}(1 - u_1) $$
$$ \phi = 2 \pi u_2 $$
\(u_1 \mapsto 1 - u_1\) can be substituted, as \(1 - u_1\) is just another canonical uniform random variable, to obtain
$$ \theta = \cos^{-1}(u_1) $$
$$ \phi = 2 \pi u_2 $$
Convert \(\theta\) and \(\phi\) back to Cartesian coordinates
$$ x = \sin \theta \cos \phi = \sqrt{1 - u_1^2} \cos(2 \pi u_2) $$ $$ y = \sin \theta \sin \phi = \sqrt{1 - u_1^2} \sin(2 \pi u_2) $$ $$ z = u_1 $$
where we used the identity \(\sin \theta = \sqrt{1 - \cos^2 \theta}\). In code:


It generates unit vectors, distributed uniformly in a hemisphere about the z-axis.
The functions are used in the pathtracing loop as follows:


We can now pathtrace our duck. If the noise on the duck exhibits a grid-like pattern, open the full-size image, as the website scales the images down.
Now that real work is being done in the shader, the performance of Sponza on my M2 MacBook Air is starting to become punishing. Framerates are in the 1-10 FPS range at the depicted image sizes.
Simple texture support 🖼️
2023-11-10
Commit: 8062be7
Initial textures support is added. The “base color texture” from the glTF “pbr metallic roughness” material is loaded and passed over to the GPU. The same storage buffer approach from my Weekend raytracing with wgpu, part 2 post is used for texture lookup on the GPU.


On ray-triangle intersection, the model vertex UV coordinates are interpolated and the color is computed from the corresponding texel.


The duck model from earlier looks more like a rubber duck now.
Sponza looks a bit rough at this stage. Two things need work.
- The plants in the foreground need transparency support. On intersection, the texel's transparency also needs to be taken into account when deciding whether the ray intersects the triangle.
- There is an issue with the textures on the ground and in some places on the walls being stretched out.
The stretched textures are likely due to a problem with UV coordinate interpolation. Single colors and suspicious red-black color gradients can be seen when coloring the model according to the triangles’ interpolated UV coordinates.
Interpolated normals, texture coordinates, and GPU timers
2023-11-09
Commit: 82208c1
GPU timers
Timing the render pass on the device timeline allows us to keep track of how long rendering the scene actually takes on the GPU, no matter what I do on the CPU side. The timings are displayed in the UI as a moving average.
My RTX 3080 currently runs the Sponza atrium scene (see below) at roughly 120 FPS at fullscreen resolutions.
Enabling allow_unsafe_apis
WebGPU timestamp queries are disabled by default due to timings being potentially usable for client fingerprinting. Using the timestamp queries requires toggling allow_unsafe_apis (normally toggled via Chrome’s settings) using WebGPU native’s struct extension mechanism:


Loading normals and texture coordinates, triangle interpolation
The model surface normals and texture coordinates are now loaded from the model, which exposes arrays of vertex attributes:


Normals and texture coordinates are interpolated at the intersection point using the intersection’s barycentric coordinates:


Here’s the famous Sponza atrium scene demonstrating the smoothly interpolated normals:
Loading textures
2023-11-08
Commit: 1706265
The glTF model loader now loads textures, and the base color textures are exposed in the model:


Each triangle has a base color texture index, and it can be used to index into the array of base color textures. Textures are just arrays of floating point values.
A new `textractor` executable (short for texture extractor 😄) is added. It loads a glTF model and dumps each loaded texture into a `.ppm` file. This way, it can be visually inspected that the textures have been loaded correctly. For instance, here is the duck model’s `base_color_texture_0`:
Camera and UI interaction
2023-11-07
Commit: 1593e8a
In order to work with arbitrary 3D models, camera positions and parameters should not be hardcoded. To kickstart scene interaction, two things are implemented:
- A “fly camera”, which allows you to move around and pan the scene using the mouse.
- A UI integration, and a panel which allows you to edit the camera’s speed and vertical field of view (vfov).
The UI panel also prints out scene debug information:


A bug with the model loader’s model transform meant that the duck was hundreds of units away from the camera vertically. Without the debug UI, it was hard to tell whether I just couldn’t find the model in space, or whether the camera controller had a problem.
Raytracing triangles on the GPU 🎉
2023-11-06
Commit: e950ea8
Today’s achievement:
The BVH and all intersection testing functions are ported to the shader. The rays from the previous devlog entry are now directly used against the BVH.
Porting the BVH to the GPU involved adjusting the memory layout of a number of structs. `Triangle`s, `Aabb`s, and `BvhNode`s all have vectors in them which need to be 16-byte aligned in memory. This was achieved by padding the structs out. E.g. `BvhNode`:


I also tried using AI (GitHub Copilot) to translate some of the hundreds of lines of intersection testing code to WGSL. It worked well in the sense that only certain patterns of expression had to be fixed. I likely avoided errors that I would have introduced manually porting the code over.
🤓 TIL about the `@must_use` attribute in WGSL. It annotates functions which return values and don’t have side effects. Not using the function’s return value in an assignment or an expression becomes a compile error.


Raytracing spheres on the GPU
2023-11-04
Commit: 16f6255
Before attempting to raytrace triangles on the GPU, first I need to demonstrate that the camera actually works in the shader.
To do so, a simple raytracing scenario is first implemented. An array of spheres, stored directly in the shader, is added. Ray-sphere intersection code is copied over from my weekend-raytracer-wgpu project. The spheres are colored according to their surface normal vectors. The spheres are visible, and now I know that the camera works!
The camera and the frame data struct are bundled together into one uniform buffer.


I used WGSL Offset Computer, a great online resource for enumerating and drawing the memory layout of your WGSL structs, as a sanity check that my memory layout for `renderParams` was correct.
Hello, bounding volume hierarchies
2023-11-02
Commit: 624e5ea
This entry marks the first step towards being able to shoot rays at complex triangle geometry. A bounding volume hierarchy (BVH) is introduced (`common/bvh.hpp`) and tested.
The test (`tests/bvh.cpp`) intersects rays with the BVH, and uses a brute-force test against the full array of triangles as the ground truth. The intersection test result (did intersect or did not) and, in the case of intersection, the intersection point must agree for all camera rays in order for the test to pass.
As an additional end-to-end sanity check, a new executable target (`bvh-visualizer`) is introduced. Primary camera rays are generated, and the number of BVH nodes visited per ray is visualized as an image.
GpuBuffer
2023-11-01
Commit: 0a826c4
Not a lot of visible progress, but `GpuBuffer` was introduced (in `gpu_buffer.hpp`) to significantly reduce the amount of boilerplate due to buffers & bind group descriptors:
Showing 5 changed files with 254 additions and 275 deletions.
On another note, I used immediately invoked lambda functions in a constructor initializer for the first time today:


It’s busy looking, but it enables you to do actual work in the constructor initializer. Doing all the work in the constructor initializers like this could make move constructors unnecessary.
GPUgenerated noise
2023-10-31
Commit: 645c09c
The major highlights of the day include generating noise on the GPU and displaying it, as well as cleaning up the main function using a new `Window` abstraction.
Window
A simple `Window` abstraction is introduced to automate GLFW library and window init and cleanup. `main` does not yet have an exception handler, but this abstraction is a step closer to ensuring all resources are cleaned up even when exceptions occur.
GPU noise
Noise is generated on the GPU using the following functions.


Each frame,
- `rngNextFloat` is used in a compute shader to generate RGB values
- RGB values are stored in a pixel buffer
- The pixel buffer is mapped to the full-screen quad from the previous day.
This approach is a bit more complex than necessary, and could just be done in the fragment shader.
Full-screen quad
2023-10-30
Commit: 51387d7
After 1000 lines of WebGPU plumbing code, the first pair of triangles is displayed on the screen in the shape of a full-screen quad, and the UV coordinates are displayed as a color.
The `pt` executable contains
- A `GpuContext` for initializing the WebGPU instance, surface, adapter, device, queue, and the swap chain.
- A distinct `Renderer` which initializes a render pipeline, a uniform and vertex buffer, using the `GpuContext`.
The renderer draws to the screen using a full-screen quad. For now, the texture coordinates are displayed as colors.
Hello, GLFW!
2023-10-29
Commit: 4df39fa
All projects have to start somewhere.
- Defined a `pt` executable target in CMake. The goal is for this executable to open up a scene and display it by doing pathtracing on the GPU.
- The GLFW library dependency is added using CMake’s `FetchContent`.
- An empty window is opened with a window title. No GPU rendering support yet, only calls to `glfwSwapBuffers`.
A window opens with a black frame.