In recent years there have been a lot of awesome GI solutions aimed at high-end GPUs, so I figured I'd instead try to make something aimed at weaker hardware like laptops and mobiles. Here’s a first view of my new realtime lightmapper running on a MacBook Air M1 (work-in-progress).
@SebAaltonen I sympathise with this view. However, to me "inline ray tracing" (aka RayQuery) is a more reasonable compromise than DXR 1.0 ray tracing, because I retain some scheduling control, I avoid the complexities of a new shader type, and I can use wave features like intrinsics, barriers, etc.
@Pjbomb2 I understand that Truetrace is not tracing rays using screen data. By “screen space” I meant that path origins are based on what is on the screen (as opposed to some caches which work differently). Still, you answered my question, thank you.
@miketuritzin Very nice, well done! I know this is not what the video is about but: I never implemented transform gizmos myself and I have wondered how people do it. Are the axis arrows simply meshes you draw on top of everything with depth testing off? How do you handle mouse-over detection?
@Passer1dae Nice!! Do I understand it correctly that you fall back to APV when screen marching fails? If so, how do you reconstruct radiance in this case? I ask because APV stores irradiance, not radiance. Inverse cosine convolution in SH frequency space?
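For reference, a sketch of the deconvolution this question alludes to, assuming the probe stores SH irradiance coefficients E_lm up to band 2 with no extra normalization baked in (an assumption; APV's actual storage convention may differ). Convolving with the clamped cosine scales each SH band by a constant A_l, so dividing by that constant gives back band-limited radiance coefficients L_lm:

```latex
E_{lm} = A_l \, L_{lm},
\qquad A_0 = \pi, \quad A_1 = \tfrac{2\pi}{3}, \quad A_2 = \tfrac{\pi}{4}
\quad\Longrightarrow\quad
L_{lm} = \frac{E_{lm}}{A_l}, \qquad l \le 2
```

This only recovers a low-frequency approximation of the radiance, since bands above l = 2 are not stored.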
@JarkkoPFC @aras_p I think one key difference from vanilla ReSTIR DI is that they cache light visibility in screen space and use this as a guide in their RIS target function: recently occluded lights are less likely to be picked. Pretty neat, but I suspect it has artifacts when visibility changes.
@Passer1dae I see. The horizon approach felt a little too good to be true so I’m not surprised. Can you recommend any resources on “regular” depth ray marching then? I suppose https://t.co/6v31GqKVlf is a classic.
@Passer1dae Looks great! How do you update the elements in the cache? Does each hash entry shoot its own rays or do you piggyback on rays shot from the G-buffer?
@Pjbomb2 Thanks! I don't have anything particular to share at this point but I'd be happy to answer questions. No, this was designed with diffuse in mind only for perf reasons. You could extend it to support specular by having many "directional cells" per voxel, a bit like Nvidia's SHaRC.
With efficient reallocation and defragmentation of voxels in place (see prev tweet), I recently added cascading to my irradiance cache. This enables a (relatively) low probe count without sacrificing voxel resolution near the camera. 🧵
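A minimal sketch of how a cascaded voxel lookup could work, assuming camera-centered cubic cascades where the voxel size doubles with each level. The doubling factor and the names `locateVoxel`, `baseVoxelSize`, `voxelsPerAxis` and `cascadeCount` are hypothetical and not taken from the actual implementation:

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

// Result of a lookup: cascade index (-1 if outside all cascades) and voxel coordinates.
struct CascadeLookup { int cascade; int vx, vy, vz; };

// Picks the finest cascade whose camera-centered cube contains p.
// Assumption: every cascade has voxelsPerAxis^3 voxels and twice the voxel size of the previous one.
CascadeLookup locateVoxel(Vec3 p, Vec3 camera, float baseVoxelSize,
                          int voxelsPerAxis, int cascadeCount) {
    const float dx = p.x - camera.x;
    const float dy = p.y - camera.y;
    const float dz = p.z - camera.z;
    const float maxAbs = std::max({std::fabs(dx), std::fabs(dy), std::fabs(dz)});

    float voxelSize = baseVoxelSize;
    for (int c = 0; c < cascadeCount; ++c, voxelSize *= 2.0f) {
        const float halfExtent = 0.5f * voxelSize * static_cast<float>(voxelsPerAxis);
        if (maxAbs < halfExtent) {
            // Map the offset from the cube's minimum corner to an integer voxel coordinate.
            auto cell = [&](float d) {
                return static_cast<int>(std::floor((d + halfExtent) / voxelSize));
            };
            return {c, cell(dx), cell(dy), cell(dz)};
        }
    }
    return {-1, 0, 0, 0};  // Outside the coarsest cascade.
}
```

With this layout the total voxel count grows linearly with the number of cascades while the covered distance grows exponentially, which is the trade-off described above.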
@john_clayjohn Looks great! Do I understand it correctly that in "After + bicubic" we see the result of bicubic runtime texture sampling of lightmaps which were already antialiased during baking?
Performance-wise I think the algorithm is still doing alright. With the scene in the video, I can run a basic deferred renderer + primary bounce ReSTIR + denoising + cache update/reallocation/defrag in less than 10ms on a MacBook Air M1 (2020).
@_arieleo_ I use ReSTIR for the first bounce and that runs for every pixel. The irradiance cache shown here is only for the remaining bounces. This tweet https://t.co/wKOZwh1AKP shows the impact of the cache. Eventually, I want the cache to cover the entire scene though (via cascading).
Implementing a fast parallel allocation strategy for my irradiance cache was interesting. I considered Kajiya's approach of doing compaction using a per-frame prefix sum, but instead I settled on a ring buffer with time-sliced "compaction". Still needs more work but here's a demo!
I got time-sliced defragmentation working in my irradiance cache. The probes are arranged in a ring buffer and a defrag kernel removes holes by pushing active probes forward into orphaned slots (using a warp-level prefix sum). Green = Available, Red = Orphan, White = Active. 🧵
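A rough CPU-side sketch of one defrag step, to illustrate the idea rather than the actual kernel. It assumes a window of up to 32 slots stands in for one warp, a sequential loop stands in for the warp-level prefix sum, and every non-active slot in the window counts as a hole. The names `Slot`, `probeId` and `defragWindow` are hypothetical:

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class SlotState : std::uint8_t { Available, Orphan, Active };

struct Slot {
    SlotState state = SlotState::Available;
    std::uint32_t probeId = 0;  // Hypothetical payload standing in for probe data.
};

constexpr std::size_t kWarpSize = 32;

// Compacts one window of the ring buffer starting at `start` (indices wrap around).
// Active slots are pushed to the high end of the window in their original order;
// the low end is left Available. Returns the number of non-active slots in the window.
std::size_t defragWindow(std::vector<Slot>& ring, std::size_t start) {
    const std::size_t n = ring.size();
    const std::size_t width = std::min(kWarpSize, n);

    std::array<Slot, kWarpSize> lanes{};               // Each "lane" reads its slot first.
    std::array<std::size_t, kWarpSize> holesBefore{};  // Exclusive prefix sum of hole flags.
    std::size_t holes = 0;
    for (std::size_t lane = 0; lane < width; ++lane) {
        lanes[lane] = ring[(start + lane) % n];
        holesBefore[lane] = holes;
        if (lanes[lane].state != SlotState::Active) ++holes;
    }

    // Clear the window, then scatter the active slots to their destinations.
    for (std::size_t lane = 0; lane < width; ++lane) {
        ring[(start + lane) % n] = Slot{};
    }
    for (std::size_t lane = 0; lane < width; ++lane) {
        if (lanes[lane].state != SlotState::Active) continue;
        const std::size_t holesAfter = holes - holesBefore[lane];
        ring[(start + lane + holesAfter) % n] = lanes[lane];
    }
    return holes;
}
```

Time slicing would then amount to calling `defragWindow` on only one or a few windows per frame and advancing `start` each time, so the cost per frame stays bounded.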
This is what it looks like in a pathological case where the defragmentation cannot keep up because too many probes are deallocated and reallocated. For some reason, I find it quite satisfying to see how it eventually recovers.