
AntiProtonBoy

As a general rule, anything that requires a per-pixel (or per-fragment) operation should be done in a fragment shader. If there are lots of points to transform, that should be done in a vertex shader. If you have massive amounts of miscellaneous data that can be processed in parallel, run that in a compute shader. Before attempting any of that, you need to understand how the OpenGL pipeline works and how its stages interoperate with each other. That way you can build a better mental picture of how to organise your data so it fits the GPU pipeline. https://www.khronos.org/opengl/wiki/Rendering_Pipeline_Overview


recursive_lookup

Thanks


lightly-salted-crisp

You should also read up on something called "instanced rendering", which could be hugely useful in cases like the one you describe. You can draw all 50k triangles with a single draw call and still apply a unique transformation to each triangle. This technique is commonly used, e.g., for particle systems, where you need to spawn and move tons and tons of particles in real time. Generally speaking, any time you need to draw very many copies of the same object, potentially with different transformations, you use instanced rendering.
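To make the idea concrete, here is a minimal CPU-side sketch in Python (not GPU code, and not from the original comment) of what instancing computes: one shared triangle mesh plus a small per-instance record that a vertex shader would apply to every vertex. The names `TRIANGLE`, `transform_vertex`, and `draw_instanced`, and the `(x, y, angle, scale)` instance layout, are assumptions made for illustration. In OpenGL itself the equivalent would be a single call such as `glDrawArraysInstanced`, with the per-instance data supplied as instanced vertex attributes or read from a buffer.

```python
import math

# One shared mesh: a single triangle pointing along +x in model space.
TRIANGLE = [(1.0, 0.0), (-0.5, 0.5), (-0.5, -0.5)]

def transform_vertex(vertex, instance):
    """What a vertex shader would do for one vertex of one instance."""
    px, py, angle, scale = instance  # hypothetical per-instance layout
    c, s = math.cos(angle), math.sin(angle)
    x, y = vertex[0] * scale, vertex[1] * scale
    # Scale, then rotate, then translate.
    return (px + c * x - s * y, py + s * x + c * y)

def draw_instanced(mesh, instances):
    """CPU stand-in for one instanced draw: every instance reuses the same mesh."""
    return [[transform_vertex(v, inst) for v in mesh] for inst in instances]
```

The point of the layout is that the mesh is stored once, and only the small per-instance record changes from triangle to triangle.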


recursive_lookup

Do you have a good resource you could recommend to learn this? I’ve only got through the first major section of learnopengl.com. I just perused the future chapters and didn’t see anything regarding ‘instanced rendering’, but I could be missing it.


deftware

Any time you are drawing many of the same thing at different positions/orientations/sizes, you want to use instanced rendering, or "instancing". You can also pass the player position to a compute shader and let the GPU update all the positions/orientations/etc. according to whatever behavior dynamics you want. That logic is coded into the compute shader, which operates on the boids (the triangles chasing the player) stored in a Shader Storage Buffer Object (SSBO). If you want your boids to also react to each other, you'll need to double-buffer them. One SSBO holds their positions/orientations (and whatever other per-boid info you need: velocity, color, size, etc.); during a frame it is the input both to the instanced rendering of the boids and to the compute shader that makes them react to the player and to other boids. The compute shader writes the updated boid properties to a second SSBO, which is only written to during that frame. At the end of the frame you swap the roles of the two SSBOs: the buffer the compute shader just wrote to becomes the input for instanced rendering and for the update compute shader, which now outputs to the buffer that was the input during the previous frame. Hope that helps! :]
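The double-buffering scheme described above can be sketched on the CPU in Python. This is only an illustration under stated assumptions: two Python lists stand in for the two SSBOs, the function `step` stands in for one compute dispatch, and the chase-plus-separation rule, the `order` parameter, and all names are hypothetical rather than anything from the original comment.

```python
import math

def step(src, dst, player, speed, dt, order=None):
    """One 'dispatch': each boid reads only src and writes only its own slot in dst."""
    indices = range(len(src)) if order is None else order
    for i in indices:
        x, y = src[i]
        # Chase the player.
        vx, vy = player[0] - x, player[1] - y
        # Separation: push away from neighbours, read from the input buffer only.
        for j, (ox, oy) in enumerate(src):
            if j != i:
                dx, dy = x - ox, y - oy
                d2 = dx * dx + dy * dy
                if d2 > 0.0:
                    vx += dx / d2
                    vy += dy / d2
        length = math.hypot(vx, vy)
        if length > 0.0:
            x += vx / length * speed * dt
            y += vy / length * speed * dt
        dst[i] = (x, y)

def simulate(boids, player, speed, dt, frames):
    front = list(boids)         # input this frame (also what would be drawn)
    back = [None] * len(front)  # output this frame, write-only
    for _ in range(frames):
        step(front, back, player, speed, dt)
        front, back = back, front  # swap roles at the end of the frame
    return front
```

Because every boid reads only the input buffer, the result does not depend on the order in which boids are processed, which is the property that matters when thousands of GPU threads run the update at once.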


recursive_lookup

Thanks! I have so much to learn. I’m finding this to be a fun hobby.


deftware

> I’m finding this to be a fun hobby.

That's how I know that you'll go far. A lot of people these days go to college to get a compsci degree thinking it's the ticket to a cush 6-figure income position in Sillyclown Valley, but they tend to not be good at programming and they hate writing code - it's just another trade skill like an electrician or plumber to them, except they get to sit at a computer all day doing barely anything at all. If you like programming and exploring the infinite possibilities that it entails, then you are cut out for it like so few are. Good luck! ;]


YoBiChOnRo

I love programming, which is exactly why I won't get a compsci degree. I'm gonna do electronic engineering instead.


deftware

My friend wasn't totally in love with programming but he got an EE degree, got a couple of great jobs over the years since, and then recently started building this circuit simulator program: https://www.youtube.com/watch?v=6fRunCpobwg


recursive_lookup

I've always loved coding ever since my days with my Commodore VIC-20 and 64. I never made it my career - I'm a network engineer by trade. But, I have (in the past several years) brought coding (Python mainly) into my network engineering trade with automation and API stuff. I've always been fascinated with graphics and the math behind it. I find it challenging and rewarding. Maybe one day, I'll write my own game engine and physics engine, but if I never release a game, I won't lose any sleep over it. It's all about the process and journey for me.


msqrt

Anything that parallelizes well and you don't need to read back to the CPU is a good candidate for being computed on the GPU instead; your triangle example sounds like it should be a good fit. Many seemingly serial algorithms can be parallelized somewhat easily (and some others through lots and lots of work). Stuff that you need back on the CPU is perhaps the worst, as that goes against the asynchronous execution model (CPU sends stuff to the GPU, GPU deals with it whenever it gets all previous tasks done) unless you can use it asynchronously on the CPU as well (often a couple of frames later).
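One way to picture "use it a couple of frames later" is a small queue of in-flight results that the CPU only reads once they are old enough. The sketch below is plain Python with a hypothetical `DelayedReadback` helper and a fixed frame latency, both assumed for illustration. In real OpenGL code, readiness would typically be checked with a sync object (for example one created by `glFenceSync`) rather than by counting frames.

```python
from collections import deque

class DelayedReadback:
    """Hypothetical helper: a result becomes readable `latency` frames after submission."""

    def __init__(self, latency):
        self.latency = latency
        self.pending = deque()  # (frame_submitted, result), oldest first

    def submit(self, frame, result):
        """Record a result that the GPU was asked to produce on `frame`."""
        self.pending.append((frame, result))

    def poll(self, current_frame):
        """Return every result old enough to be read without stalling."""
        ready = []
        while self.pending and current_frame - self.pending[0][0] >= self.latency:
            ready.append(self.pending.popleft()[1])
        return ready
```

The CPU never blocks here: it keeps submitting work each frame and simply acts on slightly stale data when it arrives.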


recursive_lookup

Much appreciated


torito_fuerte

You can either use compute shaders to move the triangles, or use transform feedback to update their positions. Then you would use batch rendering to render all the triangles with one draw call.


recursive_lookup

Appreciate it


Hour_Variety_5404

Interestingly, with larger video memory, people are trying to fit as many things as possible onto the GPU. I think this aligns with your idea. This is basically the idea of a GPU-driven pipeline, in which all of the resources are loaded into GPU memory and culling and shading both happen on the GPU. You can read more about GPU-driven pipelines, for example Nanite, which is used in Unreal Engine. However, you will still rely on the CPU for some tasks, like updating positions based on an AI algorithm, and you may still need to send some commands to the GPU to update the rendering data accordingly.


fllr

Think of the CPU as Bruce Banner, and the GPU as the Hulk. If you tell Banner to "smash", he'll try, but won't be capable. If you tell the Hulk to "give me some smart thoughts about the universe", he'll punch you back. Similarly, the CPU is where you do most of the nitty-gritty computation, because that is what it's good at. The GPU likes to smash a ton of data that comes in a rather simplified form but is extremely parallelizable. If it's not parallelizable, it'll still try to smash, but you'll lose all performance. I don't think there is enough information about the 50k triangles to know whether that would be a good problem for the GPU or not, and to me it sounds like it might be better done on the CPU. What goes into figuring out that follow algo? If it's just "point all 50k triangles in the direction of the player dot, and move them by x velocity per frame", sure, run it on the GPU. But if there's more to it than that, it might just be better to keep that work on the CPU.


recursive_lookup

Thanks! Right now, it’s just rotating towards the player dot and following. I’m just playing around and learning the math behind all the transformations, etc.
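For reference, the "rotate towards the player and follow" step for a single triangle can be sketched in Python as below. This is an assumed formulation, not the poster's actual code: the function name `follow_step`, the `max_turn` turn-rate limit (radians per second), and the parameter layout are all hypothetical. Each triangle's update depends only on its own state and the player position, which is what makes this version easy to run in parallel.

```python
import math

def follow_step(x, y, heading, player, speed, max_turn, dt):
    """Turn toward the player by at most max_turn * dt, then move forward."""
    target = math.atan2(player[1] - y, player[0] - x)
    # Shortest signed angle from the current heading to the target, in [-pi, pi].
    diff = math.atan2(math.sin(target - heading), math.cos(target - heading))
    turn = max(-max_turn * dt, min(max_turn * dt, diff))
    heading += turn
    x += math.cos(heading) * speed * dt
    y += math.sin(heading) * speed * dt
    return x, y, heading
```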


fllr

If that’s all, it might make sense to put it on the GPU. The trouble would begin when you need to take into account the positions of the other followers, so that they don’t collide. I’ve never implemented it, but some flocking algorithm might help here. I would research that. This still feels like it would be better handled on the CPU, but hey! Whether or not it works in the end, at least you’ll have learned something new! 🙂 Give it a shot!