What is an “appropriate” method for storing entities and components within an ECS model?

13 comments, last by All8Up 5 years, 11 months ago

I'm working on an ECS model for a C++ game I'm developing. I feel like I have a good grasp of the ECS model, but I'm struggling to determine how I should actually organize and store the components. The question arose from the dilemma of how I want my engine to iterate over my components and systems.

Here are two routes I could see myself going:

1) Store all components within entities themselves. This model would involve me creating an object pool of an entity type. Each entity type would contain a specific set of components relevant to that entity (e.g. graphics component, physics component, health component, etc.)

I would then register this object pool with the appropriate system (e.g. physics system) which would then iterate over the objects as necessary.

Pros: My systems would only have to iterate over entities that are known to have relevant components. Since the entities contain the actual components, the act of initializing each entity is (subjectively) easier.

Cons: I need to register multiple object pools with each system (e.g. projectiles and enemies both have health components, thus I would need to register each object pool to the health system).
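To make option 1 concrete, here is a minimal sketch. The names (Health, Projectile, Enemy, HealthSystem, RegisterPool) are made up for illustration and are not from any particular engine. It shows the con described above: every pool that contains a health component has to be registered with the health system separately.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical component and entity types, for illustration only.
struct Health { int hp = 0; };
struct Projectile { Health health; };  // each entity type owns its components
struct Enemy { Health health; };

// A system that must be told about every pool containing a Health component.
class HealthSystem {
public:
    // Registers one object pool. The pool must outlive the system,
    // because only a reference to it is stored.
    template <typename Entity>
    void RegisterPool(std::vector<Entity>& pool) {
        visitors_.push_back([&pool](const std::function<void(Health&)>& fn) {
            for (Entity& e : pool) fn(e.health);
        });
    }

    // Subtracts `amount` from every Health in every registered pool.
    void ApplyDamage(int amount) {
        assert(amount >= 0);
        for (auto& visit : visitors_)
            visit([amount](Health& h) { h.hp -= amount; });
    }

private:
    using Visitor = std::function<void(const std::function<void(Health&)>&)>;
    std::vector<Visitor> visitors_;
};
```

With this layout, forgetting a single RegisterPool call silently excludes a whole entity type from the system, which is the maintenance cost of this route.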

2) Store all components of the same type in a container and then give my entities references to these components. I would then register a single component container with a system (e.g. give the physics component array to the physics system). The system would then only need to iterate over a single container instead of multiple containers.

I'm envisioning that all components are held within their respective system (e.g. physics system contains an array of physics components). Each entity would then contain a reference (pointer or ID handle) to an individual component.

My entities would then essentially become objects that just contain references to components, but not the actual components themselves.

Pros: The entire concept of registering object pools with each system becomes obsolete, and each system only has to iterate over a single container.

Cons: The process of initializing an entity becomes (subjectively) difficult. For example, to create a projectile, I would need to request individual components from each system. It then becomes possible that during the creation of an entity, I'm able to obtain one component but not another, leaving the entity only partially created. I would need to account for all of these failure cases.
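The partial-creation problem can be handled with an all-or-nothing factory. The sketch below is only one way to do it and rests on assumptions: ComponentPool, its fixed capacity, and CreateProjectile are hypothetical names, and handles are plain array indices.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

struct Physics { float vx = 0.0f, vy = 0.0f; };
struct Health { int hp = 0; };

// Hypothetical fixed-capacity pool; Acquire fails once every slot is in use.
template <typename T>
class ComponentPool {
public:
    explicit ComponentPool(std::size_t capacity)
        : items_(capacity), used_(capacity, false) {}

    // Returns the index of a freshly reset component, or nullopt when full.
    std::optional<std::size_t> Acquire() {
        for (std::size_t i = 0; i < used_.size(); ++i) {
            if (!used_[i]) { used_[i] = true; items_[i] = T{}; return i; }
        }
        return std::nullopt;
    }

    void Release(std::size_t id) {
        assert(id < used_.size() && used_[id]);
        used_[id] = false;
    }

    std::size_t InUse() const {
        std::size_t n = 0;
        for (bool u : used_) n += u ? 1 : 0;
        return n;
    }

private:
    std::vector<T> items_;
    std::vector<bool> used_;
};

// The entity only holds handles (indices) into the pools.
struct Projectile { std::size_t physics; std::size_t health; };

// Either both components are obtained, or nothing stays allocated.
std::optional<Projectile> CreateProjectile(ComponentPool<Physics>& physics,
                                           ComponentPool<Health>& health) {
    auto p = physics.Acquire();
    if (!p) return std::nullopt;
    auto h = health.Acquire();
    if (!h) { physics.Release(*p); return std::nullopt; }  // roll back
    return Projectile{*p, *h};
}
```

The rollback has to be written once per entity type, which is the extra work this con refers to.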

I'm curious what your thoughts are on either option?


I went with each system (physics, movement, input, etc.) having its own container of components. My entities are just an ID and, in debug mode, a std::bitset: each component type is mapped to a value between 0 and the number of component types, and the matching bit is set so I can easily check what an entity has. There is really no need for the entities to hold the components they have at all, and I plan to remove the bitset part of mine and make them pure IDs only. Physics knows it needs to know about movement components (so it can get the data, read-only), so it gets passed a reference (a pointer from a unique_ptr in my case) to each system it needs to know about. So my physics system basically sends a list of entity IDs to the movement system (since it knows which entities have physics), which gives it back a list of the matching components (contiguous in memory for cache-friendliness).

I've read dozens of articles on different ECS setups and I have yet to see a good reason for entities knowing about their components.  You would think it would be useful, but you really have no real reason to ever query it.  Even with my bitset setup, I only queried it for testing purposes and have not really used it in months.  Hence me removing it soon.

"Those who would give up essential liberty to purchase a little temporary safety deserve neither liberty nor safety." --Benjamin Franklin

The second option is the closest to correct, but by tying components to specific systems you are breaking some of the intended utility of ECS. Of course this may fit your intended usage patterns, but it doesn't work for some of the uses I've put my system to. Take the physics component as an example: I often want to access the physics component from many different systems. Some concrete examples: the audio system uses the velocity for Doppler effects, AI systems use velocity to calculate intercept vectors, and rendering can use velocity for trails and as the initial velocity of new particles from emitters attached to the entity. As to the idea of entities knowing about their components: from the point of view of a single entity, they don't generally need it. But when you are doing things like querying the world for a list of entities around a point, filtering by which components exist on each entity is a good thing, and of course computing something based on the components of the entities found is generally the goal of such a query.

So, overall, you are losing a lot by not having a generalized iteration system for your entities. The way I approached this is by having a core 'system' (otherwise known as the EntityManager) which owns all of the component containers. Any system added afterwards can then request an 'index' from the core system. I don't believe I have a single system in use whose iterator covers a single component; I'm almost always interested in Transform and at least one other component. E.g. rendering generally needs the transform and renderable components, audio needs the transform and audible components, etc.

At its simplest, managing the indexes is actually fairly easy. An index is simply a vector of entity IDs. Every time you add or remove a component on an entity, you look through all the indexes which exist and check whether each index is watching for that component. Removing components is trivial: if the index contains the given entity and a component that index is watching is removed, just remove the entity from the index. Adding is a little more involved: you add the entity to the index only if the component being added is the last one the index needs in order to be interested in the entity.

Implemented in this simple way, it yields a usable ECS without the limitations your two suggestions would introduce. I would highly suggest doing this in a correct manner the first time at least, and then simplifying later. The abilities you are losing by not going with a full implementation could be critical to your ability to actually use such a system.
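The add/remove bookkeeping for indexes described above can be sketched roughly as follows. This is a naive illustration under stated assumptions, not the actual implementation: component storage is left out entirely, an entity's components are tracked as a uint64_t mask, and the class layout and bool return values are invented for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <vector>

using EntityId = std::uint64_t;
using ComponentMask = std::uint64_t;  // one bit per component type

// An index: the entities that currently have every component in `watched`.
struct EntityIndex {
    ComponentMask watched = 0;
    std::vector<EntityId> entities;
};

class EntityManager {
public:
    // Returns the index for a mask, creating and seeding it on first request.
    EntityIndex* GetIndex(ComponentMask watched) {
        assert(watched != 0);
        for (auto& idx : indexes_)
            if (idx->watched == watched) return idx.get();
        auto idx = std::make_unique<EntityIndex>();
        idx->watched = watched;
        for (const auto& [id, mask] : masks_)
            if ((mask & watched) == watched) idx->entities.push_back(id);
        indexes_.push_back(std::move(idx));
        return indexes_.back().get();
    }

    // Returns false if the entity already has the component.
    bool AddComponent(EntityId e, ComponentMask bit) {
        assert(bit != 0 && (bit & (bit - 1)) == 0);  // exactly one bit set
        ComponentMask& mask = masks_[e];
        if (mask & bit) return false;
        mask |= bit;
        // Join an index only when `bit` is watched and was the last one missing.
        for (auto& idx : indexes_)
            if ((idx->watched & bit) && (mask & idx->watched) == idx->watched)
                idx->entities.push_back(e);
        return true;
    }

    // Returns false if the entity does not have the component.
    bool RemoveComponent(EntityId e, ComponentMask bit) {
        auto it = masks_.find(e);
        if (it == masks_.end() || !(it->second & bit)) return false;
        for (auto& idx : indexes_) {
            const bool listed = (it->second & idx->watched) == idx->watched;
            if (listed && (idx->watched & bit)) {
                auto& v = idx->entities;
                v.erase(std::remove(v.begin(), v.end(), e), v.end());
            }
        }
        it->second &= ~bit;
        return true;
    }

private:
    std::unordered_map<EntityId, ComponentMask> masks_;
    std::vector<std::unique_ptr<EntityIndex>> indexes_;  // stable addresses
};
```

Each add or remove touches every index once, so the cost grows with the number of indexes rather than with the number of entities.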

I personally went for entities being simple ids with no pointers or references to components whatsoever.  The main pros are it's simple and does not have the problem of needing to update pointers (therefore removing the risk of dangling pointers to dead entities/components).  For example, if you want to reshuffle the components in a single system for more cache friendly performance, then the entities and other systems don't need to know because each system is completely decoupled.  The main con is components and entities don't directly know about each other so a slightly more complicated way of communicating (e.g. an event messenger or the observer pattern) needs to be designed.  Although there is the temptation to include pointers to components early on when everything is simple, as your engine grows and becomes more complicated, this becomes a horrible spaghetti mess of interdependent systems and components.  Having ids with events/messages to communicate does not get more complicated.  It can get slower if your messaging is not done in a sensible way, but this is more an optimisation problem for later (e.g. allow particular systems to subscribe to certain messages so the overall number of messages sent around is small).
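As a rough illustration of the id-plus-messages approach, here is a minimal subscription-based messenger. Everything in it (Messenger, MessageType, the Message fields) is hypothetical; a real engine would likely queue messages instead of dispatching them immediately.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

using EntityId = std::uint64_t;

enum class MessageType { Damaged, Destroyed };

// Messages refer to entities by id only, never by pointer.
struct Message {
    MessageType type;
    EntityId entity;
    int value;
};

class Messenger {
public:
    using Handler = std::function<void(const Message&)>;

    // A system subscribes only to the message types it cares about.
    void Subscribe(MessageType type, Handler handler) {
        assert(handler);
        handlers_[type].push_back(std::move(handler));
    }

    // Delivers the message immediately; returns how many handlers received it.
    std::size_t Send(const Message& m) const {
        auto it = handlers_.find(m.type);
        if (it == handlers_.end()) return 0;
        for (const Handler& h : it->second) h(m);
        return it->second.size();
    }

private:
    std::unordered_map<MessageType, std::vector<Handler>> handlers_;
};
```

Because delivery is keyed by message type, a message nobody subscribed to costs a single lookup, which is the kind of filtering mentioned above.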

 

I don't know if you've read this online book before, but many of these issues are discussed here ( http://gameprogrammingpatterns.com/component.html )

13 hours ago, CrazyCdn said:

I've read dozens of articles on different ECS setups and I have yet to see a good reason for entities knowing about their components.  You would think it would be useful, but you really have no real reason to ever query it.

How do the systems know which entities they should operate on? They operate on entities that have certain components, not just on components alone. E.g. the motion system would operate on all entities with a position component and a velocity component, while the rendering system would operate on all entities with a position component and a sprite component, etc.

If you couple the components to the systems, how can different systems operate on the same component?

12 hours ago, All8Up said:

So, overall, you are losing a lot by not having a generalized iteration system for your entities. The way I approached this is by having a core 'system' (otherwise known as the EntityManager) which owns all of the component containers. Any system added afterwards can then request an 'index' from the core system. I don't believe I have a single system in use whose iterator covers a single component; I'm almost always interested in Transform and at least one other component. E.g. rendering generally needs the transform and renderable components, audio needs the transform and audible components, etc.

At its simplest, managing the indexes is actually fairly easy. An index is simply a vector of entity IDs. Every time you add or remove a component on an entity, you look through all the indexes which exist and check whether each index is watching for that component. Removing components is trivial: if the index contains the given entity and a component that index is watching is removed, just remove the entity from the index. Adding is a little more involved: you add the entity to the index only if the component being added is the last one the index needs in order to be interested in the entity.

You seem to have sold me on the "systems shouldn't own components" argument. However, I'm not sure if I'm fully understanding the EntityManager and index request system you mention.

Could you expand on what exactly is an index in your context? You say an index is just a vector of entity ID's, but I'm somewhat confused as to how other systems would actually use this index. 

In the solution I use, the entity manager is just the API for working with the containers.  Basically you have functions such as the following:


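#include <cstdint>

enum class EntityHandle : uint64_t {eInvalid = 0};
using ComponentBit = uint64_t;   // one component's bit
using ComponentMask = uint64_t;  // several component bits OR'ed together
class EntityIndex;               // list of entities matching a mask

class EntitySystem {
public:
  EntityHandle CreateEntity();
  bool DestroyEntity(EntityHandle);
  void* AddComponent(EntityHandle, ComponentBit);     // returns the new component's storage
  bool RemoveComponent(EntityHandle, ComponentBit);
  void* GetComponent(EntityHandle, ComponentBit);
  EntityIndex* GetIndex(ComponentMask componentMask);
};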

So, creating an entity is:


auto entity = ecs->CreateEntity();
TransformComponent* transform = static_cast<TransformComponent*>(ecs->AddComponent(entity, TransformComponent::kBit));
*transform = { Matrix33f::Identity(), Vector3f::Zero() };

So, some explanations. Each component in my system has an ID to specify what it is; this is just a uint64_t which I happen to build from a CRC of the name. Each component also has a 'bit' which is computed when it is installed into the ECS itself. The first component is bit 1<<0, the second 1<<1, etc. I'm simplifying things a bit here to be clear, because all of this is done dynamically, but I won't go into the details as they depend on your needs. For instance, the ComponentBit and ComponentMask types in my system are currently uint64_t because I have never needed more than 64 components, but I can switch them out for bitsets if needed later.
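For illustration, handing out bits at install time could look something like the sketch below. The ComponentRegistry class and its Install function are assumptions, and components are keyed directly by name here rather than by a CRC of the name.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

using ComponentBit = std::uint64_t;

// Hypothetical registry that assigns each installed component the next bit.
class ComponentRegistry {
public:
    // Returns 1<<0 for the first component installed, 1<<1 for the second, etc.
    // Installing the same name again returns the bit assigned the first time.
    ComponentBit Install(const std::string& name) {
        auto it = bits_.find(name);
        if (it != bits_.end()) return it->second;
        assert(next_ < 64 && "a uint64_t mask holds at most 64 component types");
        const ComponentBit bit = ComponentBit{1} << next_++;
        bits_.emplace(name, bit);
        return bit;
    }

private:
    std::unordered_map<std::string, ComponentBit> bits_;
    unsigned next_ = 0;  // position of the next unassigned bit
};
```

Moving past 64 component types would mean replacing the uint64_t with something like a std::bitset, as noted above.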

Anyway, to the point: the entity index is an object which contains an array of the entity IDs that match the given ComponentMask. So, let's say I've installed 5 components: Transform, Physics, Renderable, Audible and Spaceship. Assume that the bits match up in that order. I want the AudioSystem to iterate over only the entities which contain the Audible component.


auto index = ecs->GetIndex(ComponentMask{AudibleComponent::kBit});

So, the AudioSystem now has an index and when you run the main loop for the ECS it can do the following:


int32_t count;
const EntityHandle* entities;
index->GetEntities(&count, &entities);
for (int32_t i=0; i<count; ++i)
{
  // Do whatever.
}

Now, assume I have Audio3DSystem, it wants an index which consists of Transform and Audible:


auto index = ecs->GetIndex(ComponentMask{TransformComponent::kBit | AudibleComponent::kBit});

It uses the same code as above and gets the handles to the entities which are represented. But, as mentioned in my initial reply, sometimes I want Doppler effects, so I directly query whether the entity has a physics component. Hence, having the entity know which components it has is a huge optimization.
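A sketch of that optional-component check, under several assumptions: the World struct, its Has function, and the radialSpeed map (standing in for a real physics component) are invented for the example, and the bit positions follow the Transform, Physics, Renderable, Audible order given earlier.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

using EntityId = std::uint64_t;
using ComponentMask = std::uint64_t;

constexpr ComponentMask kTransformBit = 1u << 0;
constexpr ComponentMask kPhysicsBit = 1u << 1;
constexpr ComponentMask kAudibleBit = 1u << 3;

// Hypothetical world: per-entity component masks plus a stand-in for physics.
struct World {
    std::unordered_map<EntityId, ComponentMask> masks;
    // Speed of the entity toward the listener in m/s (physics stand-in).
    std::unordered_map<EntityId, float> radialSpeed;

    // True when the entity has every component in `bits`.
    bool Has(EntityId e, ComponentMask bits) const {
        auto it = masks.find(e);
        return it != masks.end() && (it->second & bits) == bits;
    }
};

// Pitch multiplier for a moving source and a stationary listener.
// Entities without a physics component are left unshifted (factor 1).
float DopplerFactor(const World& w, EntityId e, float speedOfSound = 343.0f) {
    assert(speedOfSound > 0.0f);
    if (!w.Has(e, kPhysicsBit)) return 1.0f;
    const float v = w.radialSpeed.at(e);
    assert(v < speedOfSound);
    return speedOfSound / (speedOfSound - v);
}
```

The mask test is a single AND and compare, so the audio system can afford to run it for every entity in its index.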

A naive implementation of this is probably good enough for most uses and you won't have to go much further. But, to give you an idea of how far you can push things (and maintain the simple API), with a number of optimizations behind the scenes, my ECS stress test runs just over 5 million entities doing a simple wander at ~200Hz on a 16-core Threadripper. Getting to that point is a major undertaking, but it is very doable.

5 hours ago, Jihodg said:

How do the systems know which entities they should operate on? They operate on entities that have certain components, not just on components alone. E.g. the motion system would operate on all entities with a position component and a velocity component, while the rendering system would operate on all entities with a position component and a sprite component, etc.

If you couple the components to the systems, how can different systems operate on the same component?

I was a little sleep deprived when I wrote my original post. I have an EntitySystem that keeps a list of all entity IDs and which components they have, so you can query it to generate lists of all entity IDs with one or more given components. Each system gets a pointer to the EntitySystem for querying. This way my entities will be just pure IDs.


I think method #2 is usually what you want.  When using a pure ECS, systems are usually iterating over collections of components and being able to iterate over those components in contiguous memory is what will be most efficient.  The other iteration scenarios such as iterating over all entities that have a set of components are best solved using views.  A view is a cached list of entities that have those components and should hopefully have them in an order that ensures the most cache hits.

However, not everyone likes a pure ECS approach. The first one of these systems I made was emulating the "entities with components" paradigm I was using in Unity. There, you don't have systems; the components hold all the code. It's great for putting things together in a much more ad-hoc way, and you have certain steps (or events, really) where the entire hierarchy is iterated and certain methods like Update are called. In this case, you probably want #1.

In an engine I did in C, I actually used #1 but taken to the Nth degree: all entities have memory for all components and a flag to determine which component the entity really has.  It wastes memory, but for a small game, who cares?  It'll make more cache misses since the stride of your iterations is going to be large, but again, for small games who cares?  I even had a fixed limit on the number of entities possible in the system, the entire ECS was a single malloc and could be disposed with a single free, the code using it only ever saw a void* and a set of functions to operate on it.  My point in bringing this up is that sometimes beating on a problem like a gorilla and following the path of least resistance is fine.  It breaks all the rules but it was a small game on a modern computer, it's fine.  Most people who sit down to make an engine really overthink things (but I suppose that's the point of an engine, I guess) when literally any solution will be just fine.  Because this solution I did in C was admittedly dumb and inflexible, but it ended up being just fine.
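For illustration, the "every entity has room for every component" layout might look roughly like this. The original was written in C; this sketch uses C++ and made-up names (FatEntity, FatWorld, kMaxEntities), keeping the single malloc and single free.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <new>

constexpr int kMaxEntities = 256;  // fixed limit, chosen arbitrarily here

enum ComponentFlag : std::uint32_t {
    kHasPosition = 1u << 0,
    kHasHealth = 1u << 1,
};

// Every entity reserves memory for every component; flags say which are real.
struct FatEntity {
    std::uint32_t flags;
    float x, y;  // position component
    int hp;      // health component
};

struct FatWorld {
    int count;
    FatEntity entities[kMaxEntities];
};

// The whole world is one allocation, zero-initialized on creation.
FatWorld* CreateWorld() {
    void* memory = std::malloc(sizeof(FatWorld));
    if (!memory) return nullptr;
    return new (memory) FatWorld{};
}

// FatWorld is trivially destructible, so a single free is enough.
void DestroyWorld(FatWorld* w) { std::free(w); }

// Returns the new entity's slot, or -1 when the fixed limit is reached.
int Spawn(FatWorld* w, std::uint32_t flags) {
    assert(w);
    if (w->count >= kMaxEntities) return -1;
    w->entities[w->count].flags = flags;
    return w->count++;
}

// A "system" walks every entity and skips those without the flag.
// Returns how many entities were damaged.
int DamageAll(FatWorld* w, int amount) {
    assert(w);
    int touched = 0;
    for (int i = 0; i < w->count; ++i) {
        if (w->entities[i].flags & kHasHealth) {
            w->entities[i].hp -= amount;
            ++touched;
        }
    }
    return touched;
}
```

The stride of each loop is sizeof(FatEntity) no matter which component is being read, which is where the extra cache misses mentioned above come from.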

On 7/18/2018 at 9:01 AM, All8Up said:

It uses the same code as above and gets the handles to the entities which are represented. But, as mentioned in my initial reply, sometimes I want Doppler effects, so I directly query whether the entity has a physics component. Hence, having the entity know which components it has is a huge optimization.

A naive implementation of this is probably good enough for most uses and you won't have to go much further. But, to give you an idea of how far you can push things (and maintain the simple API), with a number of optimizations behind the scenes, my ECS stress test runs just over 5 million entities doing a simple wander at ~200Hz on a 16-core Threadripper. Getting to that point is a major undertaking, but it is very doable.

Your API seems clean and simple which I like. However, I'm curious how you efficiently handle modifying the index when new entities are added/removed.

More specifically, how do you efficiently update the returned index whenever a new entity/component is added? It seems like it would be inefficient to have to iterate over every entity to generate the new EntityIndex array whenever a single entity is added or removed.

I realize this is an implementation detail that I could optimize in the future.
