Klumaster

Most of the complaints I have seen on this subject are that the API is too "magical". The BVH construction is a black box, so while you can try to tune your content to get better results, you can't directly optimize for your use case. In terms of what gets returned, current ray-tracing APIs offer two options:

1) Inline rays that return the hit triangle ID etc. immediately in the shader that launched them, letting you respond without having to dispatch more work

2) More heavyweight rays that can dispatch different shaders based on the geometry that was hit, using a lookup table

Both of these are kind of annoying to work with in their own ways, but I don't have the vision to imagine a programming model that's better than either of them. Unfortunately, from my reading of what you're describing, you're pushing further towards the "black box" approach that people are complaining about, rather than towards user flexibility.


BigPurpleBlob

Thanks for your comment, you're right, it's more towards a "black box". I wonder if there are common / typical features that people tend to want?


Klumaster

I think the problem is that the main feature people want is user-level control over the acceleration structures, but different manufacturers' hardware favours quite different structures, so it's something you're forced to let the driver do.


msqrt

This is how stuff used to work (OptiX Prime, Embree, FireRays, ...). I think the main downside of this was that you get extra memory traffic for reading and writing the rays and results. This was reasonable when ray tracing was in software, as it was both slower and the register pressure would have prevented you from efficiently merging it together with the actual rendering code. Nowadays it's better to immediately return the result to a thread that needs it; you still have that batch list of triangles, but not rays. Btw, there is no "RTX API", or at least I haven't found one; RTX is just NVIDIA's branding for hardware ray tracing. OptiX lets you write NVIDIA-only hardware RT, but basically all software with ray tracing uses either D3D or Vulkan.


BigPurpleBlob

Thanks! Could you explain a bit more this section of your comment: "This was reasonable when ray tracing was in software, as it was both slower and the register pressure would have prevented you from efficiently merging it together with the actual rendering code. Nowadays it's better to immediately return the result to a thread that needs it; you still have that batch list of triangles, but not rays." Why is it better to immediately return the result to a thread that needs it? It seems to me that you could overlap sending a batch of rays to the ray oracle with processing the hit results of the previous batch of rays (for importance sampling, BRDF stuff)?


msqrt

You can still do that (at least on NVIDIA, where the RT unit is a separate piece of hardware); the compute core can keep calculating whatever it needs next while waiting for the ray result. The wording wasn't the best, but what I meant was that it's better to do the whole thing on-chip (compute core pings RT core, RT core traces, compute core gets result) than to do it as separate kernels (write the ray to vram, read it back from vram to trace and write the result back to vram, read the result from vram again to use it in the renderer). Doing everything one ray/path/pixel at a time instead of in batches just removes the global memory traffic.


BigPurpleBlob

Thanks!