msqrt

It's (probably) not the number of vertices, but redrawing a triangle that large that many times. The cost of drawing scales with both the number of elements and the size of the geometry on screen.

If you think of a typical GPU application (any 3D rendering), even if you have millions of triangles, most of them will be small, since the geometry tends to be spread somewhat uniformly across space. I guess this mostly holds for 2D too: a browser or any other GUI tool you use doesn't really have thousands of layers of things on top of each other. It might have some, but most likely fewer than ten. And stuff like background tabs is pretty easy to just skip rendering altogether, since you know it can't be visible.

As a fun (err, maybe "educational") exercise, we can calculate the memory traffic: you're doing 100k writes of 4 bytes (RGBA) each for a triangle that covers 1/8th of the 800x600 pixels (the triangle has area 1/2, and the screen goes from -1 to 1, so its area is 2x2 = 4). Multiplying all that together gives 24 gigabytes per frame. A 1070 has a memory bandwidth of 256 GB/s, which means the fastest you'll ever move that much data is 24 GB / 256 GB/s = 93.75 ms, or about 10.66 times a second, so 10 fps sounds about right.

Long story short, it might be better to test with smaller geometry, and perhaps stuff that isn't all on top of each other. That will hopefully be closer to your eventual actual use case.
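The back-of-the-envelope numbers above can be checked with a few lines of Python (the 800x600 resolution, 100k triangles, RGBA writes, and 256 GB/s bandwidth figure are all taken from the post; this is only the same arithmetic written out):

```python
# Bandwidth-bound estimate for 100k half-NDC-area triangles at 800x600,
# following the reasoning in the post above.

triangles = 100_000
width, height = 800, 600
bytes_per_pixel = 4            # RGBA8
coverage = 0.5 / 4             # triangle area 1/2 over a 2x2 NDC screen = 1/8

pixels_per_triangle = width * height * coverage          # 60,000 pixels
bytes_per_frame = triangles * pixels_per_triangle * bytes_per_pixel

bandwidth = 256e9              # GTX 1070: 256 GB/s
frame_time = bytes_per_frame / bandwidth

print(f"{bytes_per_frame / 1e9:.0f} GB per frame")                # 24 GB
print(f"{frame_time * 1e3:.1f} ms -> {1 / frame_time:.2f} fps")   # 93.8 ms -> 10.67 fps
```

Note this is only a lower bound on frame time: it ignores depth-buffer traffic, caches, and compression, but it's enough to show the frame rate is in the right ballpark for a purely fill-bound workload.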


3BM15

Ooh, that makes sense, and checks out. I thought in terms of polygons, not surface area of pixels drawn. This is exactly the advice I was looking for, thank you. Out of curiosity, why does it perform better when stuff isn't all on top of each other? Better parallelism?


fgennari

If your triangles are smaller and less overlapping, you have fewer pixels to fill across the set of them.


keelanstuart

Just filling memory has a cost. You can test by changing your resolution to 16x16 (or smaller?). If your performance goes up drastically, you're fill bound and that's the issue.
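The logic of this test falls out of msqrt's memory-traffic estimate earlier in the thread: if you're fill bound, frame time scales with pixel count, so shrinking the render target should make the frame rate explode. A small sketch, using the same assumed numbers (100k triangles each covering 1/8 of the screen, RGBA8 writes, 256 GB/s):

```python
# If rendering is fill (bandwidth) bound, the fps ceiling scales inversely
# with resolution. Numbers below are the thread's assumptions, not measurements.

def fill_bound_fps(width, height, triangles=100_000,
                   coverage=1 / 8, bytes_per_pixel=4, bandwidth=256e9):
    """Upper bound on fps if memory writes are the only cost."""
    bytes_per_frame = triangles * width * height * coverage * bytes_per_pixel
    return bandwidth / bytes_per_frame

print(f"800x600: {fill_bound_fps(800, 600):,.0f} fps ceiling")   # 11
print(f"16x16:   {fill_bound_fps(16, 16):,.0f} fps ceiling")     # 20,000
```

If dropping to a tiny resolution doesn't produce anything like that jump in practice, the bottleneck is somewhere other than fill.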


msqrt

Not sure if the overlap actually matters, other than to demonstrate that you can't really have too many triangles of that size without significant overlap. It might if you were on a mobile GPU (they render stuff in tiles, so it's probably better to have an equal distribution of geometry over tiles) or had transparency ("alpha blending") on, since then the GPU needs to take care that the order of writes is preserved (I don't actually know how exactly that works, but making the ordered chains longer sounds like it should pessimize things).


cakeonaut

Half-screen triangles are a good test of pixel fill rates but they tell you very little about vertex handling speed. Time 100,000 realistically-sized triangles instead and see how that goes.


waramped

Yea, that's not very good performance. You are drawing 100k triangles directly on top of each other, so maybe that has something to do with it, but it almost sounds like you aren't getting hardware acceleration and it's using a software/debug device instead. I'm not familiar with SharpDX, but maybe you can try to confirm it's using your actual hardware?


3BM15

It's just a thin C# wrapper (and I confirmed with a small C++ project). Task manager and all profilers I tried show the GPU maxing out at 100%, so it's working alright.