Rasmus Rønn Nielsen

@raroni86

Graphics programmer at @unity3d. I like math, GPUs, realtime path tracing, importance sampling, ReSTIR.

Copenhagen, Denmark

Joined May 2008

214 Following

2.4K Followers

1.9K Posts

Pinned Tweet

Rasmus Rønn Nielsen @raroni86

about 4 years ago

In recent years there have been a lot of awesome GI solutions aimed at high-end GPUs, so I figured I'd try to instead make something aimed at weaker hardware like laptops and mobiles. Here’s a first view of my new realtime lightmapper running on a MacBook Air M1 (work-in-progress).

Rasmus Rønn Nielsen @raroni86

over 1 year ago

@SebAaltonen I sympathise with this view. However, to me "inline ray tracing" (aka RayQuery) is a more reasonable compromise than DXR1.0 ray tracing, because I retain some scheduling control, I avoid complexities of a new shader type, I can use wave features like intrinsics, barriers, etc.

Rasmus Rønn Nielsen @raroni86

over 1 year ago

@Pjbomb2 I understand that Truetrace is not tracing rays using screen data. By “screen space” I meant that path origins are based on what is on the screen (as opposed to some caches which work differently). Still, you answered my question, thank you.

Rasmus Rønn Nielsen @raroni86

over 1 year ago

@miketuritzin Very nice, well done! I know this is not what the video is about but: I never implemented transform gizmos myself and I have wondered how people do it. Are the axis arrows simply meshes you draw on top of everything with depth testing off? How do you handle mouse-over detection?

Rasmus Rønn Nielsen @raroni86

over 1 year ago

@Passer1dae Nice!! Do I understand it correctly that you fall back to APV when screen marching fails? If so, how do you reconstruct radiance in this case? I ask because APV stores irradiance, not radiance. Inverse cosine convolution in SH frequency space?
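
For readers unfamiliar with the "inverse cosine convolution" mentioned in the question above, here is a minimal sketch. It assumes a probe holds 9 L2 spherical-harmonic coefficients of irradiance per colour channel in the usual flat order (index = l*l + l + m), and that irradiance was produced by convolving radiance with the clamped cosine lobe. The thread does not confirm either storage detail, and the function names are made up for illustration. Under those assumptions, the convolution scales band l by a known constant, so undoing it is a per-band division.

```python
import math

# Band factors of the clamped cosine lobe for SH bands l = 0, 1, 2.
# Convolving radiance with the lobe multiplies every band-l coefficient by A[l].
COSINE_LOBE = [math.pi, 2.0 * math.pi / 3.0, math.pi / 4.0]


def band_of(index):
    """SH band l of a flat coefficient index (index = l*l + l + m)."""
    return math.isqrt(index)


def radiance_to_irradiance_sh(radiance_sh):
    """Forward direction: cosine convolution in SH (frequency) space."""
    return [c * COSINE_LOBE[band_of(i)] for i, c in enumerate(radiance_sh)]


def irradiance_to_radiance_sh(irradiance_sh):
    """Hypothetical helper: undo the convolution by dividing each band.

    Only bands 0..2 can be recovered; higher-frequency radiance detail was
    never stored, so the result is a low-pass approximation of radiance.
    """
    return [c / COSINE_LOBE[band_of(i)] for i, c in enumerate(irradiance_sh)]
```

Some pipelines store irradiance already divided by pi or with the band factors folded into the basis constants, in which case the divisors would differ.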

Rasmus Rønn Nielsen @raroni86

over 1 year ago

@Passer1dae Congrats on the release, Pavel!

Rasmus Rønn Nielsen @raroni86

over 1 year ago

@BenSimsTech I don’t know why it is slow but I’ll pass your tweet on to a colleague who probably does.

Rasmus Rønn Nielsen @raroni86

over 1 year ago

@JarkkoPFC @aras_p I think one key difference from vanilla ReSTIR DI is that they cache light visibility in screen space and use this as a guide in their RIS target function: Recently occluded lights are less likely to be picked. Pretty neat but I suspect it has artifacts when visibility changes.
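
To make the idea in the tweet above concrete, here is a rough sketch of resampled importance sampling (RIS) over lights where the target function is scaled down for lights flagged as recently occluded. It is not taken from the technique being discussed: the uniform candidate distribution, the `occluded_scale` parameter, and the per-light tuple layout are all assumptions made for illustration.

```python
import random


def target_weight(unshadowed, recently_occluded, occluded_scale=0.1):
    """Assumed target function: unshadowed contribution, reduced (but never
    zeroed) when a cached visibility flag says the light was recently occluded."""
    return unshadowed * (occluded_scale if recently_occluded else 1.0)


def pick_light_ris(lights, num_candidates, rng):
    """Pick one light with streaming RIS (weighted reservoir sampling).

    lights: list of (unshadowed_contribution, recently_occluded) tuples.
    Returns (light_index, W), where W is the weight to multiply the chosen
    light's shadowed contribution by, or (None, 0.0) if nothing was picked.
    """
    source_pdf = 1.0 / len(lights)  # candidates are drawn uniformly
    chosen, chosen_target, weight_sum = None, 0.0, 0.0
    for _ in range(num_candidates):
        i = rng.randrange(len(lights))
        p_hat = target_weight(*lights[i])
        w = p_hat / source_pdf
        weight_sum += w
        # Replace the reservoir sample with probability w / weight_sum.
        if w > 0.0 and rng.random() * weight_sum < w:
            chosen, chosen_target = i, p_hat
    if chosen is None:
        return None, 0.0
    return chosen, weight_sum / (num_candidates * chosen_target)
```

Keeping `occluded_scale` above zero means an occluded light can still be picked occasionally, so a light that becomes visible again is eventually rediscovered. How quickly that happens is one place where the suspected artifacts could show up.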

Rasmus Rønn Nielsen @raroni86

over 1 year ago

@Passer1dae I see. The horizon approach felt a little too good to be true so I’m not surprised. Can you recommend any resources on “regular” depth ray marching then? I suppose https://t.co/6v31GqKVlf is a classic.

Rasmus Rønn Nielsen @raroni86

over 1 year ago

@Passer1dae Looks great! How do you update the elements in the cache? Does each hash entry shoot its own rays or do you piggyback on rays shot from the G-buffer?

Rasmus Rønn Nielsen @raroni86

almost 2 years ago

@Pjbomb2 Thanks! I don't have anything particular to share at this point but I'd be happy to answer questions. No, this was designed with diffuse in mind only for perf reasons. You could extend it to support specular by having many "directional cells" per voxel, a bit like Nvidia's SHaRC.

Rasmus Rønn Nielsen @raroni86

almost 2 years ago

With efficient reallocation and defragmentation of voxels in place (see prev tweet), I recently added cascading to my irradiance cache. This enables a (relatively) low probe count without sacrificing voxel resolution near the camera. 🧵
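
As a rough illustration of how cascading can keep resolution near the camera while bounding the probe count, here is a sketch of one common layout: nested camera-centred cubes where each cascade doubles both its extent and its voxel size. The thread does not describe the actual layout, so the doubling factor, the function names, and every parameter below are assumptions.

```python
import math


def select_cascade(position, camera, base_half_extent, num_cascades):
    """Return the finest cascade whose cube contains `position`, or None.

    Assumes cascade `level` is a camera-centred cube with half extent
    base_half_extent * 2**level (a hypothetical layout).
    """
    distance = max(abs(p - c) for p, c in zip(position, camera))
    for level in range(num_cascades):
        if distance < base_half_extent * (2 ** level):
            return level
    return None


def voxel_key(position, level, base_voxel_size):
    """Integer key (level, x, y, z) of the voxel holding `position`.

    Voxel size doubles per cascade, so every cascade spans the same number
    of voxels even though it covers eight times the volume of the previous one.
    """
    size = base_voxel_size * (2 ** level)
    return (level,) + tuple(math.floor(p / size) for p in position)
```

With this layout, a point 5 units from the camera and a base half extent of 4 lands in cascade 1, where voxels are twice as coarse as next to the camera.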

Rasmus Rønn Nielsen @raroni86

almost 2 years ago

@john_clayjohn Looks great! Do I understand it correctly that in "After + bicubic" we see the result of bicubic runtime texture sampling of lightmaps which were already antialiased during baking?

Rasmus Rønn Nielsen @raroni86

almost 2 years ago

This work was heavily inspired by the Kajiya renderer https://t.co/AEAi5235bA by the awesome @h3r2tic.

Rasmus Rønn Nielsen @raroni86

almost 2 years ago

Performance-wise I think the algorithm is still doing alright. With the scene in the video, I can run a basic deferred renderer + primary bounce ReSTIR + denoising + cache update/reallocation/defrag in less than 10ms on a MacBook Air M1 (2020).

Rasmus Rønn Nielsen @raroni86

almost 2 years ago

@_arieleo_ I use ReSTIR for the first bounce and that runs for every pixel. The irradiance cache shown here is only for the remaining bounces. This tweet https://t.co/wKOZwh1AKP shows the impact of the cache. Eventually, I want the cache to cover the entire scene though (via cascading).

Rasmus Rønn Nielsen @raroni86

almost 2 years ago

Implementing a fast parallel allocation strategy for my irradiance cache was interesting. I considered Kajiya's approach of doing compaction using a per-frame prefix sum, but instead I settled on a ring buffer with time-sliced "compaction". Still needs more work but here's a demo!

Rasmus Rønn Nielsen @raroni86

almost 2 years ago

I got time-sliced defragmentation working in my irradiance cache. The probes are arranged in a ring buffer and a defrag kernel removes holes by pushing active probes forward into orphaned slots (using a warp-level prefix sum). Green = Available, Red = Orphan, White = Active. 🧵
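
The prefix-sum compaction step described above can be sketched on the CPU as follows. This is a simplified stand-in, not the actual kernel: it uses `None` as a made-up orphan marker, treats a window of 32 slots as one "warp", packs active probes toward the start of the window (the direction is a guess), and ignores ring-buffer wraparound, head/tail pointers, and any table that maps voxels to probe slots.

```python
from itertools import accumulate

ORPHAN = None  # hypothetical marker for a slot whose probe was deallocated


def exclusive_prefix_sum(flags):
    """Exclusive scan: element i is the sum of flags[0..i-1]."""
    return list(accumulate(flags, initial=0))[:-1]


def defrag_window(slots, start, width=32):
    """Compact one window of `slots` in place; return its active probe count.

    Each "lane" owns one slot. A lane holding an active probe writes it to
    start + (number of active probes in earlier lanes), which is what an
    exclusive prefix sum over the active flags provides. Holes end up at
    the back of the window and probe order is preserved.
    """
    end = min(start + width, len(slots))
    window = slots[start:end]
    flags = [0 if probe is ORPHAN else 1 for probe in window]
    offsets = exclusive_prefix_sum(flags)
    packed = [ORPHAN] * len(window)
    for lane, probe in enumerate(window):
        if flags[lane]:
            packed[offsets[lane]] = probe
    slots[start:end] = packed
    return sum(flags)
```

Calling `defrag_window` on one window per frame, with the start offset advancing each time, would give a time-sliced variant: each call is cheap, and holes migrate out over several frames rather than all at once.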

Rasmus Rønn Nielsen @raroni86

almost 2 years ago

This is what it looks like in a pathological case where the defragmentation cannot keep up because too many probes are deallocated and reallocated. For some reason, I find it quite satisfying to see how it eventually recovers.

Rasmus Rønn Nielsen @raroni86

almost 2 years ago

Because the defrag kernel is so fast, I can easily run it several times per frame to get faster defragmentation. Here it is running 4 times per frame.
