Day 22 - 🔬 Deep Dive - How IBL Works

General / 01 June 2026

Image-based lighting (IBL) turns a photograph of the world into a light source. Here's what actually happens between the cubemap and the final pixel.

Why Lighting From a Photo Works

Ambient lighting gives surfaces a uniform base color, as if the same light arrived equally from all directions. It's a practical shortcut, but an obviously unrealistic one. In the real world, light arrives with wildly different intensities depending on where you look.

Image-based lighting replaces that flat approximation with a photograph of the actual environment. Every texel of the captured image functions as a light source, so instead of placing discrete lights by hand, the scene is illuminated by the world around it. Think of a VFX shot where a CGI element needs to match on-set footage: IBL is what makes that integration convincing. For games it serves the same purpose: a quick way to establish a grounded, photoreal read without hand-placing dozens of lights.

That's what makes IBL feel different from traditional lighting setups. The environment contributes from every direction simultaneously, which is why IBL-lit scenes look grounded rather than staged.


(Image source: Marmoset)

There are some limitations worth keeping in mind. Environment maps for IBL need to be captured as HDR - typically 32-bit float per channel (FP32) EXR or 16-bit half-float. Even then, extreme dynamic ranges can be a problem. A sky with a visible sun is a common example: the contrast is high, and the capture process itself - cameras and HDR merge workflows - can't reliably record the full range when the sun is in frame. The sun's luminance at noon is approximately 1,600,000,000 cd/m², which FP32 can store numerically (half-float tops out at 65,504, so it cannot without rescaling), but getting there accurately from a real-world capture is the hard part.

For more info on capturing real-life HDR panoramas for IBL: https://marmoset.co/posts/hdr-panorama-photography/

Building the Cubemap

The environment needs to be recorded as a spherical function - incoming radiance from every direction around a point. That data gets stored in one or more images, indexed by direction. This is environment mapping. It's one of the most powerful forms of environment lighting: more memory-intensive than other spherical representations, but simple and fast to decode at runtime. Critically, it can represent arbitrarily high-frequency signals just by increasing resolution, and capture any range of radiance by increasing bit depth - which is why HDR capture matters. A standard 8-bit texture would clip the sky and produce incorrect lighting integrals.
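To make the clipping problem concrete, here is a minimal sketch with made-up numbers: a hypothetical environment of 999 equal-area sky samples at radiance 1.0 and a single sun sample at 5000.0. The function names and values are illustrative assumptions, not measured data, and the average stands in for the lighting integral.

```python
# Hypothetical environment: 999 sky samples at radiance 1.0 plus one
# sun sample at 5000.0 (illustrative values, not measured data).

def mean_radiance(samples):
    """Average radiance over equal-area samples, a stand-in for the lighting integral."""
    return sum(samples) / len(samples)


def clip_to_ldr(samples, white=1.0):
    """Emulate a clipped low-dynamic-range texture by clamping at the white point."""
    return [min(s, white) for s in samples]


hdr_env = [1.0] * 999 + [5000.0]
ldr_env = clip_to_ldr(hdr_env)

print("HDR mean radiance:", mean_radiance(hdr_env))
print("Clipped mean radiance:", mean_radiance(ldr_env))
```

In this toy case the single sun sample carries most of the energy, so clamping it changes the integrated lighting by a large factor even though only one sample was touched.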

Several projection formats exist, each with different trade-offs.

Latitude-longitude (equirectangular) is the dominant exchange format. The reflected view vector is converted to spherical coordinates - longitude φ (azimuthal angle, 0 to 2π) and latitude θ (polar angle, 0 to π) - and mapped to UV by normalizing each angle to the 0-1 range: u = φ / 2π, v = θ / π.

Distortion is unavoidable when projecting a sphere onto a rectangle. The bigger problem is that the sample density is highly non-uniform: the poles are oversampled relative to the equator.
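A minimal sketch of the lat-long lookup and of the non-uniform sample density, in Python. The axis convention (+Y up, longitude measured from +X) and both function names are assumptions for illustration; tools differ on where the seam sits and which way the image is flipped.

```python
import math


def direction_to_equirect_uv(x, y, z):
    """Map a direction to lat-long UV.

    Assumes +Y is up and longitude is measured around Y starting at +X;
    real tools vary in their conventions.
    """
    length = math.sqrt(x * x + y * y + z * z)
    phi = math.atan2(z, x) % (2.0 * math.pi)             # longitude, wrapped to 0..2*pi
    theta = math.acos(max(-1.0, min(1.0, y / length)))   # polar angle, 0..pi
    return phi / (2.0 * math.pi), theta / math.pi


def texel_solid_angle(row, width, height):
    """Solid angle covered by one texel in a given row of a width x height map."""
    theta0 = math.pi * row / height
    theta1 = math.pi * (row + 1) / height
    return (2.0 * math.pi / width) * (math.cos(theta0) - math.cos(theta1))


print(direction_to_equirect_uv(1.0, 0.0, 0.0))
print(texel_solid_angle(0, 64, 32), texel_solid_angle(16, 64, 32))
```

A texel in the top row covers far less solid angle than one at the equator, which is the oversampling at the poles described above. Any integration over a lat-long map has to weight texels by this solid angle.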

Sphere mapping derives the texture from the appearance of the environment as viewed orthographically in a perfectly reflective sphere. Photographing a chrome ball is the classic real-world capture method - the result is called a light probe. The visible front of the sphere reflects almost the entire environment, but directions near the silhouette are heavily compressed and the point directly behind the sphere is a singularity. The mapping is also view-dependent, which limits its use: in practice, sphere maps are typically assumed to operate in view space.

Cube mapping is the runtime standard. The environment is projected onto the six faces of a cube centered at the capture point, stored as six square textures with no wasted space. Synthetically, they're produced by rendering the scene six times with a 90° FOV; for real environments, equirectangular panoramas are reprojected. The key advantages are that it's view-independent, hardware-native, and indexing is trivial - the reflected view vector r is passed directly as a three-component texture coordinate, no trigonometry needed, and doesn't even need to be normalized.
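What the hardware does with that three-component coordinate can be sketched in a few lines. This version assumes the OpenGL face-orientation table; other APIs orient faces differently, and `cube_face_uv` is a hypothetical helper name. A zero-length vector is not handled.

```python
def cube_face_uv(x, y, z):
    """Pick the cube face and face-local UV for direction (x, y, z).

    Follows the OpenGL face orientation table (an assumption here).
    The input does not need to be unit length.
    """
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax >= ay and ax >= az:
        face, ma, sc, tc = ("+X", ax, -z, -y) if x > 0 else ("-X", ax, z, -y)
    elif ay >= az:
        face, ma, sc, tc = ("+Y", ay, x, z) if y > 0 else ("-Y", ay, x, -z)
    else:
        face, ma, sc, tc = ("+Z", az, x, -y) if z > 0 else ("-Z", az, -x, -y)
    # Dividing by the major axis projects onto the face, so any scale works.
    return face, 0.5 * (sc / ma + 1.0), 0.5 * (tc / ma + 1.0)


print(cube_face_uv(1.0, 0.5, 0.25))
print(cube_face_uv(8.0, 4.0, 2.0))   # same direction, different length
```

The two calls return the same face and UV, which is why the lookup vector can be left unnormalized: the divide by the largest component cancels any uniform scale.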

Dual paraboloid mapping uses two textures, each covering one hemisphere via a parabolic projection. Octahedral mapping instead projects the sphere onto an octahedron, then cuts its eight triangular faces apart and arranges them on a square or rectangle. The octahedral layout avoids the filtering artifacts that affect some other projections, and the lookup - dividing the direction by its L₁ norm and folding the lower half outward - is straightforward to implement in a shader.
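The octahedral lookup is short enough to write out in full. This is a sketch of the commonly used square layout, with +Z mapped to the center of the texture; the function names and the choice of axis are assumptions.

```python
import math


def _sign(v):
    """Sign that treats zero as positive, so folds never collapse to 0."""
    return 1.0 if v >= 0.0 else -1.0


def oct_encode(x, y, z):
    """Map a non-zero direction to octahedral UV in the unit square."""
    l1 = abs(x) + abs(y) + abs(z)      # L1 norm projects onto the octahedron
    px, py, pz = x / l1, y / l1, z / l1
    if pz < 0.0:                       # fold the lower hemisphere outward
        px, py = (1.0 - abs(py)) * _sign(px), (1.0 - abs(px)) * _sign(py)
    return 0.5 * px + 0.5, 0.5 * py + 0.5


def oct_decode(u, v):
    """Inverse mapping: unit-square UV back to a unit direction."""
    px, py = 2.0 * u - 1.0, 2.0 * v - 1.0
    pz = 1.0 - abs(px) - abs(py)
    if pz < 0.0:                       # undo the fold for the lower hemisphere
        px, py = (1.0 - abs(py)) * _sign(px), (1.0 - abs(px)) * _sign(py)
    length = math.sqrt(px * px + py * py + pz * pz)
    return px / length, py / length, pz / length


print(oct_decode(*oct_encode(0.3, -0.5, -0.8)))
```

The round trip returns the normalized input direction, and neither function needs trigonometry.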

Splitting Diffuse and Specular

The full IBL integral (incoming radiance × BRDF × cosine term over the hemisphere) is too expensive to evaluate per pixel. The solution is to precompute as much as possible, exploiting one key observation: diffuse and specular respond to the environment at very different frequencies.

Diffuse is low-frequency: only the general hemisphere energy matters, not the exact incoming direction. Specular is high-frequency and view-dependent: smooth surfaces pick out a narrow reflection cone; rough surfaces blur it.

Diffuse - irradiance and spherical harmonics

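The diffuse component is the incoming radiance convolved with a cosine-weighted kernel: every direction in the hemisphere around the normal contributes, scaled by the cosine of its angle to the normal. The result is very smooth, which makes it ideal for compression. Spherical harmonics (SH) are the natural fit.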

SH are an orthonormal basis defined on the sphere. Projecting the irradiance function onto each basis function gives a coefficient capturing how much of that frequency is present. The first three bands (L0 through L2, nine coefficients total) reconstruct diffuse lighting with very little error.

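A key property of an orthonormal basis is that the integral of the product of two functions equals the dot product of their coefficient vectors: ∫ f(ω) g(ω) dω = Σᵢ fᵢ gᵢ, where fᵢ and gᵢ are the SH coefficients of f and g. The equality is exact when both functions are fully captured by the bands used.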

This means the diffuse irradiance integral reduces to a dot product at runtime, which is essentially free. Low-band SH largely avoids ringing (the Gibbs phenomenon) here because diffuse lighting is inherently smooth and well represented by just a few coefficients, although very bright, concentrated sources can still cause some.
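Here is a single-channel sketch of that pipeline: project radiance onto nine SH coefficients, scale each band by the cosine-lobe factor, and evaluate irradiance with a dot product. The basis constants and per-band factors (π, 2π/3, π/4) are the standard ones for real SH up to band 2. The Fibonacci-sphere quadrature and all function names are assumptions made for this example; a real implementation would integrate over cubemap texels per color channel.

```python
import math


def sh9_basis(x, y, z):
    """Real SH basis for bands 0..2, evaluated at unit direction (x, y, z)."""
    return [
        0.282095,
        0.488603 * y, 0.488603 * z, 0.488603 * x,
        1.092548 * x * y, 1.092548 * y * z,
        0.315392 * (3.0 * z * z - 1.0),
        1.092548 * x * z, 0.546274 * (x * x - y * y),
    ]


# Per-band factors for convolving radiance with the clamped cosine lobe.
COSINE_LOBE = [math.pi] + [2.0 * math.pi / 3.0] * 3 + [math.pi / 4.0] * 5


def fibonacci_sphere(count):
    """Roughly uniform unit directions, used here as quadrature points."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    for i in range(count):
        z = 1.0 - (2.0 * i + 1.0) / count
        r = math.sqrt(1.0 - z * z)
        yield r * math.cos(golden * i), r * math.sin(golden * i), z


def project_radiance(radiance, count=4096):
    """Project a radiance function radiance(x, y, z) onto nine SH coefficients."""
    coeffs = [0.0] * 9
    weight = 4.0 * math.pi / count        # solid angle per sample
    for d in fibonacci_sphere(count):
        value = radiance(*d)
        for i, b in enumerate(sh9_basis(*d)):
            coeffs[i] += value * b * weight
    return coeffs


def irradiance(coeffs, nx, ny, nz):
    """Runtime evaluation: a nine-term dot product with the basis at the normal."""
    basis = sh9_basis(nx, ny, nz)
    return sum(a * c * b for a, c, b in zip(COSINE_LOBE, coeffs, basis))


white = project_radiance(lambda x, y, z: 1.0)
print(irradiance(white, 0.0, 0.0, 1.0))   # close to pi for a uniform white environment
```

Only `irradiance` runs per pixel. The projection happens once per environment, which is where the cost of the hemisphere integral goes.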

Specular - prefiltered environment map

Specular is both roughness-dependent and view-dependent. A perfectly smooth surface samples a single reflected direction; increasing roughness blurs the reflection over a wider cone. This is represented as a mip chain of the environment map, where each level stores the environment pre-convolved at a different roughness value. Rough materials sample higher mips; mirror-like materials sample the base level.
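The roughness-to-mip selection can be sketched as below. A linear mapping is assumed here for simplicity; engines often apply their own remapping curve, and `roughness_to_mip` is a hypothetical helper name.

```python
def roughness_to_mip(roughness, mip_count):
    """Linear roughness-to-mip mapping: 0 -> base level, 1 -> last level.

    This is an assumed mapping; real engines may use a different curve.
    """
    clamped = min(max(roughness, 0.0), 1.0)
    return clamped * (mip_count - 1)


# A 128x128 cubemap face has 8 mip levels (128 down to 1).
for r in (0.0, 0.25, 0.5, 1.0):
    print(r, roughness_to_mip(r, 8))
```

The result is fractional on purpose: trilinear filtering blends the two neighboring mips, so intermediate roughness values interpolate between two pre-convolved levels.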

The BRDF Integration Map

The split-sum approximation factors the specular integral into two independent parts. The prefiltered environment map handles the lighting side. The other part captures how the BRDF itself responds to a given roughness and viewing angle - independent of what the environment looks like.

This is baked into a 2D LUT, indexed by NdotV (cosine of the angle between surface normal and view direction) and roughness. The LUT stores two values per texel: a scale and a bias that are applied to the specular color (the Fresnel reflectance at normal incidence). They're computed as if under a uniform white environment, so multiplying them with the prefiltered map is exact only when the incoming radiance is constant. For real environments it is an approximation, but one that holds up well in practice.

The LUT is computed once, offline. At runtime, it's a single texture lookup - (NdotV, roughness) returns (scale, bias) - and a couple of multiply-adds. No per-pixel integration required.
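For reference, one texel of such a LUT can be computed as sketched below. The text above does not name a specific BRDF, so this follows the widely published split-sum formulation as an assumption: GGX importance sampling, a Schlick-style Smith geometry term with k = roughness² / 2, and the Schlick Fresnel weight (1 - VdotH)⁵. All function names are hypothetical.

```python
import math


def radical_inverse_base2(i):
    """Van der Corput sequence: mirror the bits of i around the binary point."""
    result, fraction = 0.0, 0.5
    while i > 0:
        if i & 1:
            result += fraction
        i >>= 1
        fraction *= 0.5
    return result


def g1(n_dot_x, k):
    """Schlick-style approximation of the Smith masking term for one direction."""
    return n_dot_x / (n_dot_x * (1.0 - k) + k)


def integrate_brdf(n_dot_v, roughness, sample_count=256):
    """Return (scale, bias) for one LUT texel, with the normal along +Z.

    Expects 0 < n_dot_v <= 1.
    """
    alpha = roughness * roughness
    k = alpha / 2.0                       # assumed IBL variant of the geometry term
    view = (math.sqrt(1.0 - n_dot_v * n_dot_v), 0.0, n_dot_v)
    scale = bias = 0.0
    for i in range(sample_count):
        xi_x, xi_y = i / sample_count, radical_inverse_base2(i)
        # Importance-sample a GGX half vector around the normal.
        phi = 2.0 * math.pi * xi_x
        cos_t = math.sqrt((1.0 - xi_y) / (1.0 + (alpha * alpha - 1.0) * xi_y))
        sin_t = math.sqrt(max(0.0, 1.0 - cos_t * cos_t))
        half = (sin_t * math.cos(phi), sin_t * math.sin(phi), cos_t)
        v_dot_h = sum(a * b for a, b in zip(view, half))
        n_dot_l = 2.0 * v_dot_h * half[2] - view[2]   # z of the reflected direction
        n_dot_h = half[2]
        if n_dot_l > 0.0 and v_dot_h > 0.0 and n_dot_h > 0.0:
            g_vis = g1(n_dot_v, k) * g1(n_dot_l, k) * v_dot_h / (n_dot_h * n_dot_v)
            fc = (1.0 - v_dot_h) ** 5
            scale += (1.0 - fc) * g_vis
            bias += fc * g_vis
    return scale / sample_count, bias / sample_count


print(integrate_brdf(1.0, 0.05))   # smooth surface, head-on view
print(integrate_brdf(0.1, 0.5))    # grazing view: the bias term grows
```

Filling a full LUT means calling `integrate_brdf` once per texel over a grid of NdotV and roughness values. That is slow in pure Python but only has to happen once.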

From Cubemap to Final Pixel

At render time, three lookups reconstruct the full IBL response:

  1. Diffuse - evaluate the SH coefficients with the surface normal (a dot product), or sample a pre-convolved cubemap. This gives the irradiance for the diffuse term.
  2. Specular lighting - sample the prefiltered environment map at the mip level corresponding to the material's roughness. High roughness → higher mip → blurrier reflection.
  3. Specular BRDF - sample the 2D LUT at (NdotV, roughness) to retrieve the scale and bias.


Practical Takeaway

The mip levels of a prefiltered environment map correspond directly to roughness. A roughness of 0 samples the sharpest mip; 1.0 samples the most blurred. That means roughness map quality shows up directly in the IBL response; banding or precision issues in the texture will produce visible artifacts in the reflection.

IBL also assumes PBR-compliant materials. The split-sum derivation is built on the same energy conservation and Fresnel assumptions that PBR materials follow, so the two are designed to work together, and a non-compliant material will produce incorrect results even with a perfectly captured environment.

Static IBL does have one hard constraint: it can't support a dynamic time-of-day cycle. Studios that need that reach for procedural atmospheric systems instead, giving artists direct control over how the sky and lighting evolve over time.

To recap the runtime side, the shader combines the three lookups: diffuse albedo × irradiance, added to (BRDF scale × specular color + BRDF bias) × prefiltered radiance. Each of those inputs was precomputed. What the shader actually executes is three texture fetches and a few arithmetic operations - not a hemisphere integral.
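Written out per RGB channel, the combination is one line of arithmetic. This sketch takes the formula above literally. The function name, argument names, and input values are illustrative assumptions, and the inputs stand in for the results of the three texture fetches.

```python
def shade_ibl(albedo, irradiance, specular_color, prefiltered, scale, bias):
    """Combine the three precomputed lookups per RGB channel."""
    return tuple(
        a * e + (f0 * scale + bias) * radiance
        for a, e, f0, radiance in zip(albedo, irradiance, specular_color, prefiltered)
    )


# Illustrative inputs standing in for the three texture fetches.
color = shade_ibl(
    albedo=(0.5, 0.5, 0.5),
    irradiance=(1.0, 1.0, 1.0),          # from SH or a pre-convolved cubemap
    specular_color=(0.04, 0.04, 0.04),   # reflectance at normal incidence
    prefiltered=(2.0, 2.0, 2.0),         # from the roughness-selected mip
    scale=0.9, bias=0.1,                 # from the BRDF LUT
)
print(color)
```

A production shader would typically add details this sketch leaves out, such as metalness handling or ambient occlusion, but the core cost stays at a handful of multiply-adds.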

© 2026 Stefan Groenewoud - All views are my own, not those of my employer.