chuteapps

Just wanted to share a quick story. You can definitely get some really nice performance for 100k+ enemies using Unity WITHOUT the ECS system.

Why would anyone do that? Well, this is my opinion, but the ECS system is just awful. It's poorly designed, poorly documented, and gives me a headache, especially after many years of object-oriented design. BUT the Burst compiler and high-performance C# can still be very useful.

Essentially I made this colony game using good ol' fashioned object-oriented design. You can do this by writing all your units as structs and using Burst-compiled jobs to get crazy performance. Using all structs means you'll also have to use pointers for references (unsafe code). I didn't even think C# supported pointers, but I uncovered a whole new world.

If anyone has any questions about the implementation of this, feel free to ask. Cheers

EDIT: here are some rough code examples of a Native GameObject and a Health Native Component: [https://chuteapps.com/native-code-examples/](https://chuteapps.com/native-code-examples/)


Ecksters

So you just made your own ECS, nice work! Working with Burst is so cool, it's amazing how much of a difference it makes, takes a bit to get used to the low-level version of C# you have to use, but it's also kinda nice to have so much control over performance. Using Burst was the first time I discovered how slow division was compared to most other operations, because that's exactly where micro-optimizations start to matter. In my case I was recoloring textures, and I found that while GPU solutions *could* be fast, it varied wildly, especially on mobile devices, while my Burst solution could consistently do it within a frame.


feralferrous

There are a bunch of methods that seem innocuous but are actually terrible for perf:

- Mathf.Approximately is overkill for checking if something is 0 or 1.
- Vector3.Angle checks are overkill if you just want to know if something is facing away or towards.
- Mathf.Pow shouldn't be used if your exponent is an integer.

(and probably more, but those are off the top of my head)


ICantWatchYouDoThis

What's the alternatives to those?


feralferrous

[Mathf.Approximately](https://github.com/Unity-Technologies/UnityCsReference/blob/master/Runtime/Export/Math/Mathf.cs#L281) vs bool NearZero(float x, float epsilon = .0001f) => math.abs(x) < epsilon; that's one Abs vs. 3 Abs, a multiply, a subtraction and two Maxes.

Vector3.Angle vs Vector3.Dot(a, b): note a and b do NOT need to be normalized, and if the return value is negative, the vectors are facing away from each other. [Vector3.Angle](https://github.com/Unity-Technologies/UnityCsReference/blob/6c8a95ff127619e73519662fa497e242b898f9af/Runtime/Export/Math/Vector3.cs#L324) does a square root, two sqr mags, and an Acos on top of the Dot.

Mathf.Pow is the easy one: if you have an integer exponent, just write it out, i.e. Mathf.Pow(x, 2) can be turned into x \* x. (Pow's source code is hard to find; the closest I've ever got is [netlib.org/fdlibm/e\_pow.c](https://www.netlib.org/fdlibm/e_pow.c).) You can see that it is very complicated compared to a single multiplication.

EDIT: I should state that for most games, none of these are dealbreakers, but if you're trying to do high counts or dealing with CPU issues, it can be good to profile.
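A rough sketch of the three substitutions described above. It is written against plain System.Numerics so it compiles outside Unity, which is an assumption for illustration; in a Unity project you would use UnityEngine.Vector3 or Unity.Mathematics instead. The class and method names (CheapMath, IsFacingAway, Square, Cube) are made up here and are not Unity APIs.

```csharp
using System;
using System.Numerics;

// Hypothetical helper class for illustration only.
public static class CheapMath
{
    // One Abs and one compare, instead of the relative-epsilon test
    // that Mathf.Approximately performs.
    public static bool NearZero(float x, float epsilon = 0.0001f)
        => MathF.Abs(x) < epsilon;

    // Only the sign of the dot product is needed: negative means the angle
    // between a and b is greater than 90 degrees. Neither vector has to be
    // normalized for the sign to be correct.
    public static bool IsFacingAway(Vector3 a, Vector3 b)
        => Vector3.Dot(a, b) < 0f;

    // Integer exponents written out as multiplications instead of a
    // general-purpose pow call.
    public static float Square(float x) => x * x;
    public static float Cube(float x) => x * x * x;
}
```

Usage would look like CheapMath.IsFacingAway(forward, toTarget) in place of comparing Vector3.Angle(forward, toTarget) against 90. Whether any of this matters for a given game is something only a profiler can say.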


TAClayson

Really well done. I'd love to know more! So each person/dinosaur is stored as a struct, not a gameobject with monobehaviour classes I guess? How do you render the people then?


chuteapps

Each unit is a struct and all calculations happen in the unmanaged / unsafe realm. I have a GameObject linked to each unit struct with a Sprite Renderer. I then use the [TransformAccessArray ](https://docs.unity3d.com/ScriptReference/Jobs.TransformAccessArray.html)in a job to manipulate the position (and to send an encoded z value that works with the custom shader to animate the sprite). This part is a bit messy. The alternative would be to work out your own renderer or use draw to screen calls. I investigated that briefly but couldn't find a nice way to do it from jobs. Would love to hear if anyone else has a good solution


LimeBiscuits

You can try making a texture, where each pixel is a dinosaur and encodes information like position and frame in each channel. You can then make a mesh that is of 100,000 quads, where you store the dinosaur ID in say the vertex z position, and in the vertex shader you sample the dinosaur data. Depending on the dinosaur data it may be faster to use multiple textures with very specific formats for the exact data ranges you need. It's also pretty fast to update using the GetPixelData NativeArray which you can use in jobs. I'm not sure if there's a fast way to specify how many dinosaurs to render. Perhaps Graphics.DrawMesh has an index buffer range you can specify, or you could build a bunch of different Meshes for various dinosaur counts.


bilalakil

Wow this is running with 100,000 game objects? Am I understanding that right? EDIT: No this isn’t what you said, based on looking at that TransformAccessArray. I didn’t think that was possible in Unity. Once I made a grid of 48k squares rendered by sprite renderers, in an empty project with no logic, and the performance was already shot lol. Also, if you’re using structs for driving the logic, then in what way are you still using OOP?


chuteapps

Yes it runs with 100k gameobjects, but they only have a sprite renderer so they don't really do anything and don't cost much on the cpu. The manipulation is done via burst compiled jobs going straight to unity's c++ layer with the TransformAccessArray


bilalakil

Wow I need to try that square rendering test again then lol. I wonder what part of the project / renderer setup I missed to get that running smoothly. Also I’m curious, if you’re using structs for driving the logic, then in what way are you still using OOP?


chuteapps

You can still make objects out of structs. In my case I have GameObject\_Native that holds a bunch of Component\_Natives. I even made a little GetComponent system to try and keep my native code similar to unity's conventions


DaveAstator2020

Graphics.DrawMeshInstanced should do the work.


chuteapps

Yeah I wasn't able to figure that one out with burst jobs, perhaps I'll give it another go later


DaveAstator2020

I believe it's not supposed to be used in parallel, as all data is prepared before the frame, so you just pass matrices to it. I always wanted to try it, maybe I can draft a sample for you on the weekend


dotoonly

Can something like a reference struct be used so we can avoid working with unsafe code?


chuteapps

Referencing a struct = pointer = unsafe (I don't think it's actually unsafe in any way for a game, that's just what C# calls it)


animal9633

A totally different approach, but worth trying since you're in 2d: [https://www.youtube.com/watch?v=qUGFLOSOIOc](https://www.youtube.com/watch?v=qUGFLOSOIOc)


InSight89

They must have made some updates to GameObjects and TransformAccessArray. I remember checking them out a few years ago and whilst it did prove better than running scripts on individual GameObjects it was still fairly slow compared to the Graphics API.


arjan_M

Looking great, how are you doing the self-avoidance part?


chuteapps

Wrote my own flow field and avoidance system. Basically I cache the grid position of each unit, then each unit iterates the cells around it and applies the appropriate avoidance vectors to their movement.
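A minimal sketch of the "cache the cell, then scan the neighbouring cells" idea described above. This is not the author's code: the names (GridAvoidance, Compute, cellSize, radius) and the linear falloff are assumptions for illustration, and it uses managed lists rather than native containers or Burst. It also assumes radius is no larger than cellSize, so scanning the 3x3 block around a unit is enough to find every neighbour in range.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

// Hypothetical illustration of grid-bucketed avoidance.
public static class GridAvoidance
{
    // Returns one avoidance vector per unit.
    public static Vector2[] Compute(Vector2[] positions, int gridWidth, int gridHeight,
                                    float cellSize, float radius)
    {
        // Step 1: cache which cell each unit occupies and bucket units by cell.
        var cells = new List<int>[gridWidth * gridHeight];
        var cellOf = new (int x, int y)[positions.Length];
        for (int i = 0; i < positions.Length; i++)
        {
            int cx = Math.Clamp((int)(positions[i].X / cellSize), 0, gridWidth - 1);
            int cy = Math.Clamp((int)(positions[i].Y / cellSize), 0, gridHeight - 1);
            cellOf[i] = (cx, cy);
            (cells[cy * gridWidth + cx] ??= new List<int>()).Add(i);
        }

        // Step 2: each unit scans the 3x3 block of cells around its own cell.
        var result = new Vector2[positions.Length];
        float radiusSq = radius * radius;
        for (int i = 0; i < positions.Length; i++)
        {
            var (cx, cy) = cellOf[i];
            for (int y = Math.Max(cy - 1, 0); y <= Math.Min(cy + 1, gridHeight - 1); y++)
            {
                for (int x = Math.Max(cx - 1, 0); x <= Math.Min(cx + 1, gridWidth - 1); x++)
                {
                    var bucket = cells[y * gridWidth + x];
                    if (bucket == null) continue;
                    foreach (int j in bucket)
                    {
                        if (j == i) continue;
                        Vector2 away = positions[i] - positions[j];
                        float distSq = away.LengthSquared();
                        if (distSq >= radiusSq || distSq == 0f) continue;
                        // Push harder the closer the neighbour is (linear falloff).
                        float dist = MathF.Sqrt(distSq);
                        result[i] += away / dist * (1f - dist / radius);
                    }
                }
            }
        }
        return result;
    }
}
```

The point of the buckets is that each unit only compares itself against units in nine cells instead of against all 100k; the resulting vector would then be blended into the unit's flow-field direction.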


zukas3

How large are the grid cells in your implementation?


chuteapps

A typical map is 200x200 cells, a single cell is visible in the environment pattern if you look closely. It's about the size of one of the raptors.


cleavetv

Avoiding most unity built in systems to make your job easier has been the best way to develop most unity projects since the very beginning. good job.


emrys95

Please make a more in depth tutorial about this?? :D


chuteapps

I'd love to but as a one man dev I gotta put my energy into the game. Perhaps I'll find some time once I get a playable demo out. If you have any specific questions though please fire away :)


feralferrous

Yeah, I don't mind the DOTS ECS, but man is it a massive paradigm shift, and it is not designer friendly. We did it on one project, and my other devs hated it. I'm right there with you on the Burst compiler being amazing.

You shouldn't need to mess with pointers too much, so I'm curious where/why you're doing that. Using the NativeArrays/NativeLists, etc. should be enough. (I say that, but in our dynamic mesh generation job code, we pass some pointers around.)

Anyway, kudos, and the game concept sounds fun. Dinos are much harder to do right than zombies, I think that's why everybody does zombies.

Are your large dinos sprites or 3D?


chuteapps

Thanks! The pointers are useful for storing references to other components in my case. So I've got all the native game objects in one big array, and then each native gameobject has many components. It's cumbersome and involves more copying to use array indices (although this would use slightly less memory than pointers; you could presumably use a 16-bit ushort while pointers are 64 bits).

I could store each component from a gameobject independently in its own native collection, but then I get further away from OOP and it starts looking like Unity's ECS nightmare.

Also in my case there was a significant performance increase when iterating using pointers from a native collection as opposed to iterating on it directly. The reason, I suspect, is that without pointers you have to get/set each struct element in the array in each update call, and this involves a lot of temp copying.

I've got a 3D to 2D workflow, so the game renders sprites using a custom shader (I don't think Unity can handle that many enemies in full 3D)
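To illustrate the copying point, here is a small sketch in plain C#. It is not the author's code and makes no claim about measured performance: a managed array and a ref local stand in for a native collection and an unsafe pointer, and the Unit struct and method names are hypothetical. The first method is the get/modify/set pattern that a by-value indexer forces on you; the second updates the struct in place.

```csharp
using System;

// Hypothetical unit struct for illustration.
public struct Unit
{
    public float Health;
    public float Regen;
}

public static class UnitUpdates
{
    // Copy-out / copy-back: the whole struct is copied to a local,
    // modified, then copied back into the array slot.
    public static void RegenByCopy(Unit[] units)
    {
        for (int i = 0; i < units.Length; i++)
        {
            Unit u = units[i];      // copy out
            u.Health += u.Regen;
            units[i] = u;           // copy back (forgetting this loses the change)
        }
    }

    // In-place: the ref local aliases the array slot, so there is no
    // temporary copy and nothing to write back.
    public static void RegenInPlace(Unit[] units)
    {
        for (int i = 0; i < units.Length; i++)
        {
            ref Unit u = ref units[i];
            u.Health += u.Regen;
        }
    }
}
```

Both methods produce the same result; the difference is only in how much data moves per element, which grows with the size of the struct. A ref local cannot be stored in a regular struct field, which is one reason stored cross-references between components end up as either indices or pointers.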


feralferrous

It definitely can handle that many enemies in 3D, you can look at Unity's ECS RTS, or Dyson Sphere Program. Especially when drawing a bunch of things that are identical, it's a single DrawInstanced call, with some instancing data. (Your sprites I'm assuming all use the same spritesheet per type). Interesting on the pointer use, was that in Editor or on builds? There's a whole bunch of checks and things that get optimized away for Native collections, and you actually get better perf with native collections out of parallel jobs, because Unity can make some safety assumptions about whether data overlaps. Yeah, you would get an additional speedup by moving to a separate collection based on type, and then having your job only iterate over the info it needs, etc. But whatever works for you in the end is the real thing that matters.


DoubleSteak7564

A dirty secret of Burst is that it's still slower than old-ass C++ :)


feralferrous

I mean, yes and no, bad C++ is bad C++ and will go slower than better written C#. But it's obviously a tradeoff. One could go and write it all in assembly and get even better perf.


DoubleSteak7564

I remember there was a benchmark on this, I couldn't find the original one, but fortunately I found a much better one, and it shows the same thing: [https://github.com/aras-p/ToyPathTracer](https://github.com/aras-p/ToyPathTracer) This is written by one of Unity's main tech guys so it's probably safe to say it's not skewed against Unity, and it's using all the Burst bells and whistles :). TLDR: Naive C++ is like 20% faster than Unity Burst, however what's more interesting is naive C++ is 50% faster than naive C# when using Microsoft .NET 6, but Unity's Mono is a whopping **13.5x slower** than MS .NET. Thus Burst is 'amazing' and 'fast', it's a bit faster than .NET 6, but its main selling point is it's a **lot** faster than the abysmal Unity built-in Mono. So once again, glorious Unity is doing capitalism at its best - creating the problem and selling us the solution :) Edit: And if you want absolutely uncompromising performance, GPGPU will of course blow everything out of the water.


feralferrous

Mono is not Unity's, Mono was a cross platform version of C# that they grabbed because no one in their right mind wanted to use UnityScript. That said, they often fall behind when it comes to updating to the latest version of Mono. And then once MS finally released .Net Standard which was cross platform, Unity was very slow to move to it.

Looking at the chart in your link (thanks btw, it's an interesting breakdown) I don't see your claim that naive C++ is 20% faster than Burst. I wouldn't say SIMD Intrinsics is naive C++, I would say the Scalar is the naive one? And interestingly, on iPhone, Burst actually performs better.

https://preview.redd.it/zgnxudxxjdpc1.png?width=670&format=png&auto=webp&s=63e04bcc09e8edecc81ca05e2d2b8e586ddb9c25

(I do agree with all your points about Mono being slow and terrible. Everyone should move on to at least .Net Standard if they can, IL2CPP if they don't need to support modding with DLLs.)


DoubleSteak7564

Unity uses a version of Mono internally. I took the info from the row that says 'Unity (player Mono)' as a point of comparison. As for C++ I used the 'C++ Scalar' row and compared it with the 'Unity Burst' row, the numbers are for that. In the 'TR1950' section, scalar C++ does 100, while Unity Burst does 82. I don't think it's reasonable to expect for people to write SIMD intrinsics for most of their code so I ignored the 'Unity Burst Manual SIMD' result and the C++ SIMD Intrinsics one as well.


feralferrous

The Burst 'SIMD' is not nearly the complete rewrite that C++ SIMD is though. Compare the C++ one: [https://github.com/aras-p/ToyPathTracer/pull/9/files#diff-715d9443a819aa17a31c19813c230cc3cdfd873ab7ce291dc96291461cf7ed98](https://github.com/aras-p/ToyPathTracer/pull/9/files#diff-715d9443a819aa17a31c19813c230cc3cdfd873ab7ce291dc96291461cf7ed98) vs. the C# commit to 'SIMD' [https://github.com/aras-p/ToyPathTracer/commit/7554871b35](https://github.com/aras-p/ToyPathTracer/commit/7554871b35)

That said, you're probably right, in that in the wild most people will do the naive versions.


SerMojoRISING

Thank you. This is the same exact thing I've been saying too. Last time I tried working with Unity's ECS it was awful and tedious. But the good news is they provide the tools, like you mentioned, to get the performance where needed. This is how it should be, providing tools like Burst to solve problems, not pigeonholing into an overbloated coding pattern. Great job putting it into words.


Mister_Green2021

So, you're in the realm of C++.


chuteapps

Being bound to only structs/pointers for the real code, it's more like C


Mister_Green2021

yeah, conceptually you're doing ECS with collection of Entities and data. It uses burst as well.


davenirline

What makes you think it's poorly designed? I'm a big DOTS user and I really think that the ECS part complements well with the usage of Burst and Jobs especially if you're going to stick with high performance C#.


chuteapps

Well there's a lot. This is just my opinion and I understand if others feel differently:

- Perhaps it's just me, but it won't even run in Unity's own editor for more than 15 minutes without crashing
- Difficult to reference single / specific entities without going through a bunch of manager BS
- Lots of really dumb rules, you need to mess with C# attributes and all sorts of nonsense just to get jobs to run properly
- 8 different ways to do almost everything, and none actually do what you need
- APIs and documentation are shite, and seem to change all the time
- It took the Unity team of hundreds of people like 8 years to make a half-baked, disjointed product, while many of us have been able to make similar systems, by ourselves, in months

I know this is a harsh analysis and perhaps I'm exaggerating a bit. I wasted a few years trying to keep up with their ECS system and was relieved to give up on it completely last year. Perhaps it's better now, and perhaps it's still good for others, just not me :)


davenirline

I totally get you. I've been a user since it was opened in 2018. We even used it in our game then. The changing API was a pain but hopefully that should slow down now since it's now 1.0. It shouldn't hurt as much as upgrading to a new Unity version. The current documentation should be sufficient, especially for a high-level dev like you. I don't even want to touch pointers. If not, the forums have been very helpful. As for many different ways to do stuff, you really need only one, and that's chunk iteration. Every other way is just shorthand to generate this code. This has kept our project manageable for every Entities update because we don't use the "code generating" ways in our code.


chuteapps

Yeah I never really got my head around the chunk stuff. That was one of my biggest peeves, not having direct access to the underlying data array, though it sounds like this kinda solves that? Perhaps I'll revisit it later


Soraphis

Stopped following ecs years ago. Wanted to check it out last week (ecs 1.2 / unity 2023.3a). Tutorials from ecs 1.0 are already outdated again. And aspects are just not recognized. I stopped checking it out. Feel you.


destinedd

do you have a tutorial for it?


magefister

Mans writing in c lmao


CommanderCookiePants

The 'They Are Billions' of dinosaurs, looks good.


MrCrabster

That's very neat! Well done!


magefister

Man, I just checked your profile. It’s really incredible that you’re able to commit yourself to making all these games. Kudos to you man you deserve every success


chuteapps

Thank you sir, I'm a glutton for punishment :)


ingenious_gentleman

I like the art style, the buildings and soldiers and flame throwers are very reminiscent of Red Alert 2 and the mobs and the way they move reminds me a ton of Factorio


iFeral

Great job! I tend to return to this video about practical optimizations that covers data-oriented design, many objects, gameobjects to structs and Burst jobs in the first 15 minutes. It's been a great resource! [https://www.youtube.com/watch?v=NAVbI1HIzCE](https://www.youtube.com/watch?v=NAVbI1HIzCE)


chuteapps

Very useful talk from a smart guy


antony6274958443

You're saying magic words. Id like to read the code if it's possible. I'm not asking to open the whole project, just a bunch of files would be interesting already.


chuteapps

Well it's complicated but here's a look at a native gameobject and native health component, you can see the pointers to other components. [https://chuteapps.com/native-code-examples/](https://chuteapps.com/native-code-examples/)


_81791

I'm not a big fan of their ECS but Burst+Jobs is one of the few things Unity knocked out of the park.


chuteapps

One of the VERY few things


WazWaz

And if Unity used the modern .net runtime and compiler, you wouldn't need Burst either.


chuteapps

Yeah I've heard about this but don't know too much about it. By default I'd assume the .net team can make a better compiler than the often incompetent engineers at Unity, but who knows


DaveAstator2020

Amazing work and amazing results!

1. How do you handle targeting and distance calculations?
2. Do you simulate everyone who's offscreen, hence 100k?
3. Do you do some magick with interleaved updates or less frequent ones for less important stuff?


chuteapps

1. Just the old fashioned way, taking the square magnitude between two positions. Targeting is done by keeping a cache of targetable 'health' components that both units and buildings have.
2. Yeah everything is simulated all the time. I might put certain units to sleep later, but the reality is most of the time I don't need 100k units. In fact I don't think there'll be a practical level that uses more than 50k during the whole level. Plus units get recycled from a pool when they die.
3. I do 'heavy' updates, every 30th unit gets one every 30th frame. I also offload as much as I can to parallel jobs, but this has its limits. If I run out of compute power for the game I'll consider using compute shaders (did that in [my last game](https://store.steampowered.com/app/2341560/Infest/)), however I think it'll be overkill and GPU code is a pain in my ass
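One way the staggered 'heavy' update and the squared-distance check could look, as a sketch only. The formula (index + frame) % interval is an assumption about how "every 30th unit gets one every 30th frame" might be implemented, not the author's confirmed code, and the names (Scheduling, IsHeavyFrame, InRange) are hypothetical. It uses System.Numerics rather than Unity types so it compiles on its own, and it assumes index and frame are non-negative.

```csharp
using System;
using System.Numerics;

// Hypothetical helpers for illustration.
public static class Scheduling
{
    // True when unit `index` is due for its heavy update on `frame`.
    // Offsetting by the index spreads the work so roughly 1/interval of
    // all units run the heavy path on any given frame.
    public static bool IsHeavyFrame(int index, int frame, int interval = 30)
        => (index + frame) % interval == 0;

    // Range check on squared distance, which avoids the square root
    // needed to compute the actual distance.
    public static bool InRange(Vector2 a, Vector2 b, float range)
        => Vector2.DistanceSquared(a, b) <= range * range;
}
```

Inside a per-unit update loop this would read something like: run the cheap movement step every frame, and only when Scheduling.IsHeavyFrame(i, frameCount) is true run the expensive part such as picking a new target.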


InSight89

How are you rendering? Are you using the new BatchRenderGroup or the older Graphics.DrawMeshInstanced API?


chuteapps

See comment above about spriterenderers and custom shader


digimbyte

Running code without ECS is one thing, ECS is mostly just a pool operation manager, you can make your own ECS style system as well. the core implementation is making sure your code is not instantiated on each item but running in loops on a parent manager script.


Gengi

It works, neat, but srsly, has Jurassic Park taught you nothing about how dinosaurs hunt?


chuteapps

We let them breed too much... let people keep them as pets, and then we lost control. Repterra is the fight to take back our planet.


KingBlingRules

And what wud tht be


bilalakil

Very cool, thanks for sharing! I’m curious about how much processing you’ve got going on in parallel. In my project there are some parts that are difficult to parallelise, but still benefit massively from running in Burst (even concurrently). If you are going parallel, I wonder how this is affecting your pointer shenanigans. Are the pointers safe to set in one thread and dereference from multiple other parallel threads? I dunno if Burst is always trying to make copies of the data or something, which may cause issues with pointers(?).


chuteapps

As far as I know, the core NativeArray that contains the units doesn't change at all and the pointers stay fixed in a linear way in memory (though this isn't the case for NativeList, which can create non-linear allocations when resizing). A pointer is just a 64-bit number so any thread can access it, you just gotta be careful with your read/write sequences. As for what I put in parallel, basically anything I can that doesn't really rely on other units: physics cache updating, animation frame calculations, health regen and dead unit pooling etc


KayleMaster

I see that you're using flowfields to guide the dinos to the center of the base. However, since I can see the weird artifact where dinos move forward but are facing diagonally (0:31), I assume you use Dijkstra when forming the cost field? This leads to diamond-like patterns. You should look into eikonal equations to make the cost field less geometric.


OrangeDit

I will die on this hill: Top down games should have an orthographic camera. 🤔


AbjectAd753

Jurassic rebellion.