(Start by reading Entity Systems are the Future of MMOs Part 1)
It’s been a long time since my last post on this topic. Last year, I stopped working for a big MMO publisher, and since then I’ve been having fun doing MMO Consultancy (helping other teams write their games), and iPhone development (learning how to design and write great iPhone apps).
Previously, I posed some questions and said I’d answer them later:
- how do you define the archetypes for your entities?
- how do you instantiate multiple new entities from a single archetype?
- how do you STORE in-memory entities so that they can be re-instantiated later on?
Let’s answer those first.
A quick warning…
I’m going to write this post using Relational terminology. This is deliberate, for several reasons:
- It’s the most-correct practical way of describing runtime Entities
- It’s fairly trivial to see how to implement this using static and dynamic arrays – the reverse is not so obvious
- If you’re working on MMO’s, you should be using SQL for your persistence / back-end – which means you should already be thinking in Relations.
…and a quick introduction to Relational
If you know literally nothing about Relational data, RDBMS’s, and/or SQL (which is true of most game programmers, sadly), then here’s the idiot’s guide:
- Everything is stored either in arrays, or in 2-dimensional arrays (“arrays-of-arrays”)
- The index into the array is explicitly given a name, some text ending in “_id”; but it’s still just an array-index: an increasing list of integers starting with 0, 1, 2, 3 … etc
- Since you can’t have Dictionaries / HashMaps, you have to use 3 arrays-of-arrays to simulate one Dictionary. This is very very typical, and so obvious you should be able to understand it easily when you see it below. I only do it twice in this whole blog post.
- Where I say “table, with N columns”, I mean “a variable-length array, with each element containing another array: a fixed-size array of N items”
- Where I say “row”, I mean “one of the fixed-size arrays of N items”
- Rather than index the fixed-size arrays by integer from 0…N, we give a unique name (“column name”) to each index. It makes writing code much much clearer. Since the arrays are fixed-size, and we know all these column names before we write the program, this is no problem.
Beyond that … well, go google “SQL Tutorial” – most of them are just 1 page long, and take no more than 5 minutes to read through.
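To make the array analogy concrete, here is a minimal sketch in Java of one "table" held as a variable-length list of fixed-size rows, with named constants standing in for column names. The class name, column names and sample values are assumptions made up for illustration; this is one possible in-memory layout, not a prescribed one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a "table" is a growable list of fixed-size rows.
class ComponentsTable {
    // Column names stand in for integer indices into each fixed-size row.
    static final int OFFICIAL_NAME = 0;
    static final int DESCRIPTION = 1;
    static final int TABLE_NAME = 2;
    static final int COLUMN_COUNT = 3;

    // A row's position in this list doubles as its "_id" value.
    private final List<String[]> rows = new ArrayList<>();

    // Appends a row and returns its index, i.e. the id of the new row.
    int insert(String officialName, String description, String tableName) {
        String[] row = new String[COLUMN_COUNT];
        row[OFFICIAL_NAME] = officialName;
        row[DESCRIPTION] = description;
        row[TABLE_NAME] = tableName;
        rows.add(row);
        return rows.size() - 1;
    }

    // Reads one column of one row, much like SELECT column FROM table WHERE id = ?
    String get(int id, int column) {
        return rows.get(id)[column];
    }

    int rowCount() {
        return rows.size();
    }

    public static void main(String[] args) {
        ComponentsTable components = new ComponentsTable();
        int gunId = components.insert("GUN", "Anything that can fire", "component_data_gun");
        System.out.println(gunId + " -> " + components.get(gunId, TABLE_NAME));
    }
}
```

Because the ids are just list indices, deleting a row and re-using its id would need extra bookkeeping that this sketch leaves out.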
How do you store all your data? Part 2: Runtime Entities + Components (“Objects”)
We’re doing part 2 first, because it’s the bit most of us think of first. When I go onto part 1 later, you’ll see why it’s “theoretically” the first part (and I called it “1”), even though when you write your game, you’ll probably write it second.
Table 3: all components
(yes, I’m starting at 3. You’ll see why later ;))
Table 3: components

component_id | official name | human-readable description | table-name
---|---|---|---
There are N additional tables, one for each row in the Components table. Each row has a unique value of “table-name”, telling you which table to look at for this component. This is optional: you could instead use an algorithmic name based on some criteria like the official_name, or the component_id – but if you ever change the name of a component, or delete one and re-use the id, you’ll get problems.
Table 4: all entities / entity names
Table 4: entities

entity_id | human-readable label FOR DEBUGGING ONLY
---|---
(really, you should only have 1 column in this table – but the second column is really useful when debugging your ES implementation itself!)
…which combines with:…
Table 5: entity/component mapping
Table 5: entity_components

entity_id | component_id | component_data_id
---|---|---
…to tell you which components are in which entity.
Technically, you could decide not to bother with Table 4; just look up the “unique values of entity_id from table 5” whenever you want to deal with Table 4. But there are performance advantages for it – and you get to avoid some multi-threading issues (e.g. when creating a new entity, just create a blank entity in the entity table first, and that fast atomic action “reserves” an entity_id; without Table 4, you have to create ALL the components inside a Synchronized block of code, which is not good practice for MT code).
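The "reserve an entity_id with one fast atomic action" idea could look roughly like this when everything lives in RAM. This is a sketch under assumptions: the class and method names are invented here, and an `AtomicInteger` counter plus a `ConcurrentHashMap` stand in for Table 4.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-memory stand-in for Table 4 (entities).
class EntityTable {
    private final AtomicInteger nextId = new AtomicInteger(0);
    // entity_id -> debug label; ConcurrentHashMap rejects null values,
    // so a missing label is stored as an empty string.
    private final Map<Integer, String> debugLabels = new ConcurrentHashMap<>();

    // One atomic increment reserves the id; no lock is held while the
    // caller goes on to attach components afterwards.
    int createEntity(String debugLabel) {
        int id = nextId.getAndIncrement();
        debugLabels.put(id, debugLabel == null ? "" : debugLabel);
        return id;
    }

    boolean exists(int entityId) {
        return debugLabels.containsKey(entityId);
    }

    int count() {
        return debugLabels.size();
    }

    public static void main(String[] args) throws InterruptedException {
        EntityTable entities = new EntityTable();
        Runnable spawn = () -> {
            for (int i = 0; i < 1000; i++) {
                entities.createEntity("tank");
            }
        };
        Thread a = new Thread(spawn);
        Thread b = new Thread(spawn);
        a.start();
        b.start();
        a.join();
        b.join();
        // Each id was handed out exactly once, so no entity overwrote another.
        System.out.println(entities.count());
    }
}
```

Attaching the components can then happen outside any synchronized block, since the id is already owned by the creating thread.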
Tables 6,7,8…N+5: data for each component for each Entity
Table N+5: component_data_table_N

component_data_id | [1..M columns, one column for each piece of data in your component]
---|---
These N tables store all the live, runtime, data for all the entity/component pairs.
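As a rough illustration of how Table 5 and the per-component data tables fit together, the sketch below keeps them in plain Java collections. All names are hypothetical, every component is assumed to store only floats, and the linear scan over Table 5 is there for clarity, not speed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory version of Table 5 plus the component data tables.
class RuntimeStore {
    // Table 5: each row is {entity_id, component_id, component_data_id}.
    private final List<int[]> entityComponents = new ArrayList<>();
    // Tables 6..N+5: one list of rows per component_id;
    // a row's index in its list is its component_data_id.
    private final Map<Integer, List<float[]>> componentData = new HashMap<>();

    // Adds a data row for the component and records the mapping in Table 5.
    int attach(int entityId, int componentId, float[] defaults) {
        List<float[]> table = componentData.computeIfAbsent(componentId, k -> new ArrayList<>());
        table.add(defaults.clone());
        int dataId = table.size() - 1;
        entityComponents.add(new int[] { entityId, componentId, dataId });
        return dataId;
    }

    // Finds the Table 5 row for the pair, then follows component_data_id.
    // Returns null if the entity does not have that component.
    float[] dataFor(int entityId, int componentId) {
        for (int[] row : entityComponents) {
            if (row[0] == entityId && row[1] == componentId) {
                return componentData.get(componentId).get(row[2]);
            }
        }
        return null;
    }

    // Lists the component_ids attached to an entity, in attachment order.
    List<Integer> componentsOf(int entityId) {
        List<Integer> ids = new ArrayList<>();
        for (int[] row : entityComponents) {
            if (row[0] == entityId) {
                ids.add(row[1]);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        final int GUN = 3; // hypothetical component_id
        RuntimeStore store = new RuntimeStore();
        store.attach(0, GUN, new float[] { 1f, 1f, 1f });
        float[] gun = store.dataFor(0, GUN);
        gun[1] = 10000f; // edits the live row in place
        System.out.println(store.dataFor(0, GUN)[1]);
    }
}
```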
How do you store all your data? Part 1: Assemblages (“Classes”)
So … you want to instantiate 10 new tanks into your game.
How?
Well, you could write code that says:
int newTank()
{
    int new_id = createNewEntity();

    // Attach components to the entity; they will have DEFAULT values
    createComponentAndAddTo( TRACKED_COMPONENT, new_id );
    createComponentAndAddTo( RENDERABLE_COMPONENT, new_id );
    createComponentAndAddTo( PHYSICS_COMPONENT, new_id );
    createComponentAndAddTo( GUN_COMPONENT, new_id );

    // Setup code that EDITS the data in each component, e.g:
    float[] gunData = getComponentDataForEntity( GUN_COMPONENT, new_id );
    gunData[ GUN_SIZE ] = 500;
    gunData[ GUN_DAMAGE ] = 10000;
    gunData[ GUN_FIRE_RATE ] = 0.001f; // float literal: a bare 0.001 is a double
    setComponentDataForEntity( GUN_COMPONENT, new_id, gunData );

    return new_id;
}
…and this is absolutely fine, so long as you remember ONE important thing: the above code is NOT inside a method “because you wanted it in an OOP class”. It’s inside a method “because you didn’t want to type it out every time you have a place in your code where you instantiate tanks”.
i.e. IT IS NOT OOP CODE! (the use of “methods” or “functions” is an idea that predates OOP by decades – it is coincidence that OOP *also* uses methods).
Or, in other words, if you do the above:
NEVER put the above code into a Class on its own; especially NEVER NEVER split the above code into multiple methods, and use OOP inheritance to nest the calls to “createComponentAndAddTo” etc.
But … it means that when you decide to split one Component into 2 Components, you’ll have to go through the source code for EVERY kind of game-object in your game, and change the source, then re-compile.
A neater way to handle this is to extend the ES to not only define “the components in each entity” but also “templates for creating new entities of a given human-readable type”. I previously referred to these templates as “assemblages” to avoid using the confusing term “template” which means many things already in OOP programming…
An assemblage needs:
Table 1: all Assemblages
Table 1: assemblages

assemblage_id | default human-readable label (if you’re using that label in Table 4 above) | official name | human-readable description
---|---|---|---
Table 2: assemblage/component mapping
Table 2: assemblage_components

assemblage_id | component_id
---|---
This table is a cut-down version of Table 5 (entity/component mapping). This table provides the “template” for instantiating a new Entity: you pick an assemblage_id, find out all the component_id’s that exist for it, and then create a new Entity and instantiate one of each of those components and add it to the entity.
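The instantiation steps just described can be sketched as follows. This is an illustrative, data-driven stand-in for a hand-written newTank(): the class, method names and ids are all assumptions, Table 2 is held as a map, and component data with default values is omitted to keep the example short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical factory driven by Table 2 (assemblage_components).
class AssemblageFactory {
    // Table 2: assemblage_id -> the component_ids in that template.
    private final Map<Integer, List<Integer>> assemblageComponents = new HashMap<>();
    // Stand-in for Table 5, keyed by entity_id for brevity.
    private final Map<Integer, List<Integer>> entityComponents = new HashMap<>();
    private int nextEntityId = 0;

    // Fills in the rows of Table 2 for one assemblage.
    void defineAssemblage(int assemblageId, int... componentIds) {
        List<Integer> ids = new ArrayList<>();
        for (int id : componentIds) {
            ids.add(id);
        }
        assemblageComponents.put(assemblageId, ids);
    }

    // Creates a new entity and attaches one of each templated component.
    int instantiate(int assemblageId) {
        List<Integer> template = assemblageComponents.get(assemblageId);
        if (template == null) {
            throw new IllegalArgumentException("unknown assemblage_id " + assemblageId);
        }
        int entityId = nextEntityId++;
        // Copy the list so later edits to the template leave live entities alone.
        entityComponents.put(entityId, new ArrayList<>(template));
        return entityId;
    }

    List<Integer> componentsOf(int entityId) {
        return entityComponents.get(entityId);
    }

    public static void main(String[] args) {
        final int TANK = 0; // hypothetical assemblage_id
        AssemblageFactory factory = new AssemblageFactory();
        factory.defineAssemblage(TANK, 10, 11, 12, 13); // hypothetical component_ids
        for (int i = 0; i < 10; i++) {
            int tankId = factory.instantiate(TANK);
            System.out.println("tank " + tankId + " has " + factory.componentsOf(tankId));
        }
    }
}
```

With this shape, splitting one Component into two becomes a change to the rows of Table 2 instead of a change to the source code of every game-object type.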
Table 3: all components
Table 3: components

component_id | official name | human-readable description | table-name
---|---|---|---
This is the same table from earlier (hence the silly numbering, just to make sure you noticed ;)) – it MUST be the same data, for obvious reasons.
Things to note
DataForEntity( (entity-id) ) – fast lookup
If you know the entity-id, you may only need one table lookup to get the data for an entire component (Table 5 is highly cacheable – it’s small, doesn’t change, and has fixed-size rows).
Splitting Table 5 for performance or parallelization
When your SQL DB is too slow and you want to split to multiple DB servers, OR you’re not using SQL (doing it all in RAM) and want to fit inside your CPU cache, then you’ll split table 5 usually into N sub-tables, where N = number of unique component_id’s.
Why?
Because you run one System at a time, and each System needs all the components with the same component_id – but none of the components without that id.
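A small sketch of that split, assuming an all-in-RAM setup: each component_id gets its own sub-table, and a System is handed only the sub-table it works on. The component ids, row layouts and the physics update are invented for the example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical split of Table 5 into one sub-table per component_id.
class SplitComponentTables {
    static final int PHYSICS = 0; // rows are {x, y, dx, dy}
    static final int GUN = 1;     // rows are {size, damage, fireRate}

    // component_id -> (entity_id -> data row)
    private final Map<Integer, Map<Integer, float[]>> subTables = new HashMap<>();

    void put(int componentId, int entityId, float[] row) {
        subTables.computeIfAbsent(componentId, k -> new LinkedHashMap<>()).put(entityId, row);
    }

    // Returns only the rows for one component_id; empty if none exist.
    Map<Integer, float[]> tableFor(int componentId) {
        Map<Integer, float[]> table = subTables.get(componentId);
        if (table == null) {
            return Collections.emptyMap();
        }
        return table;
    }

    // A physics System: it receives the PHYSICS rows and nothing else,
    // so gun data never has to be loaded or cached while it runs.
    static void runPhysics(Map<Integer, float[]> physicsRows, float dt) {
        for (float[] row : physicsRows.values()) {
            row[0] += row[2] * dt;
            row[1] += row[3] * dt;
        }
    }

    public static void main(String[] args) {
        SplitComponentTables tables = new SplitComponentTables();
        tables.put(PHYSICS, 0, new float[] { 0f, 0f, 2f, 4f });
        tables.put(GUN, 0, new float[] { 500f, 10000f, 0.001f });
        runPhysics(tables.tableFor(PHYSICS), 0.5f);
        float[] row = tables.tableFor(PHYSICS).get(0);
        System.out.println(row[0] + ", " + row[1]);
    }
}
```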
Isolation
The entire data for any given system is fully isolated into its own table. It’s easy to print to screen (for debugging), serialize (for saving / bug reports), and parallelize (different components on different physical DB servers).
Metadata for editing your Assemblages and Entities
a.k.a. “Programmer/Designers: take note…”
It can be tempting to add extra columns to the Entity and Assemblage tables. Really, you shouldn’t be doing this. If you feel tempted to do that, add the extra data as more COMPONENTS – even if the data is NOTHING to do with your game (e.g. “name_of_designer_who_wrote_this_assemblage”).
Here’s a great feature of Entity Systems: it is (literally) trivial for the game to “remove” un-needed information at startup. If, for instance, you have vast amounts of metadata on each entity (e.g. “name of author”, “time of creation”, “what I had for lunch on the day when I wrote this entity”) – then it can all be included and AUTOMATICALLY be stripped-out at runtime. It can even be included in the application, but “not loaded” at startup – so you get the benefits of keeping all the debug data hanging around, with no performance overhead.
You can’t do that with OOP: you can get some *similar* benefits by doing C-Header-File Voodoo, and writing lots of proprietary code … but … so much is dependent upon your header files that unless you really know what you’re doing you probably shouldn’t go there.
Another great example is Subversion / Git / CVS / etc metadata: you can attach to each Entity the full Subversion metadata for that Entity, by creating a “SubversionInformation” System / Component. Then at runtime, if something crashes, load up the SubversionInformation system, and include it in the crash log. Of course, the Components for the SubversionInformation system aren’t actually loaded yet – because the system wasn’t used inside the main game. No problem – now you’ve started the system (in your crash-handler code), it’ll pull in its own data from disk, attach it to whatever entities are in-memory, and all works beautifully.
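One way the "not loaded until the system starts" behaviour might be sketched: the component table holds a loader and only calls it on first access. Everything here is hypothetical, including the class name, the string-valued rows and the sample revision text; reading from disk is replaced by a lambda, and the sketch is not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical component table whose rows stay unloaded until first use.
class LazyComponentTable {
    private final Supplier<Map<Integer, String>> loader;
    private Map<Integer, String> rows; // null until the first lookup

    LazyComponentTable(Supplier<Map<Integer, String>> loader) {
        this.loader = loader;
    }

    boolean isLoaded() {
        return rows != null;
    }

    // The first call pays the load cost; later calls reuse the loaded rows.
    String dataFor(int entityId) {
        if (rows == null) {
            rows = loader.get();
        }
        return rows.get(entityId);
    }

    public static void main(String[] args) {
        LazyComponentTable revisionInfo = new LazyComponentTable(() -> {
            // Stand-in for reading the metadata from disk.
            Map<Integer, String> fromDisk = new HashMap<>();
            fromDisk.put(42, "r1234 by designer_a"); // made-up sample data
            return fromDisk;
        });
        System.out.println("loaded during normal play? " + revisionInfo.isLoaded());
        // A crash handler would be the first caller, triggering the load.
        System.out.println("entity 42: " + revisionInfo.dataFor(42));
    }
}
```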
Wrapping up…
I wanted to cover other things – like transmitting all this stuff over the network (and maybe cover how to do so both fast and efficiently) – but I realise now that this post is going to be long enough as it is.
Next time, I hope to talk about that (binary serialization / loading), and editors (how do you make it easy to edit / design your own game?).
154 replies on “Entity Systems are the Future of MMOs Part 5”
Until recently I worked for a company that was doing something that I think is close to your ‘NEVER NEVER’ – but it isn’t quite clear to me. Can you expand on that, and talk about why you think it’s a bad idea?
FWIW, their software worked, but there were some concerns about scalability.
I started reading your blog because of the first part of this, and I’m still interested to see “how it ends”.
So, interesting post :D Someday I’ll put it in practice…
@Tom H
If you do that, you no longer have an Entity System, you only have half of one. You *may* feel that you still have the most important half (designers would disagree), but it’s pretty dangerous to do this because it almost certainly means you’ve lost your way, and you didn’t mean to do that.
I highlight it because if you ever notice yourself doing it, you should stop what you’re doing, and think very carefully about what you’ve just done – if you just did this, you may also have recently broken the WHOLE of your ES.
So, since this series of posts is about how to make and use an ES … don’t do that! :)
c.f. Part 2, where I pointed out how OOP programmers frequently get halfway through building an ES, and then – accidentally! – turn it back into OOP. Then they complain that the ES is a lot of effort and doesn’t help much. Well … duh.
Ah, OK, here’s the difference:
In my former employer’s system, but trying to use your terminology,
1. Every component can be an assemblage
2. Assemblages mix in components, so there’s an OOP-inheritance-like relation
3. Entities are statically typed to a particular assemblage
4. There is a set of Systems (implemented in C++), but you can also attach scripts to assemblages. Script execution is handled by a System, but this leads to another OOP-like phenomenon, where scripts override or extend one another.
It’s got incredible tool-time flexibility, but doesn’t have the runtime flex that you’re also advocating, and I don’t think it’s engineered to go flyweight or get the other PS3 streaming advantages. Which is a very interesting thought, but I think the architect was coming from a PC/MMO background into a cross-platform situation and didn’t know enough about consoles.
I’m also curious about your apparent flat component model. It seems like there’s a tendency to have rather “fat” components? Their engine has maybe a couple of dozen component types, a dozen more for physics, a dozen more for the trigger system – but then when you get to gameplay you’re defining custom components right and left to store whatever assemblage-specific data you need for this particular assemblage’s scripted activities.
Ah, I see. Sorry, I haven’t been very clear on those areas.
Entities MUST NOT have their components defined by OOP; an Assemblage is just an uninstantiated / “not yet unified” entity … So the same goes for them.
Entities MUST have their components defined by aggregation – ditto assemblages.
But … Assemblages can also be defined by “nested” aggregation, which is just like multiple inheritance in C++. That’s safe. You don’t want it for runtime entities because it doesn’t help and it makes the performance lower (more queries required to read / write data per entity).
Off the top of my head: I like editors that let you use nesting to add/remove features from an assemblage – but they MUST also automatically unify the sub-assemblage with its parent if you remove individual components from the parent that were only there because of the sub.
(eg create Orc chieftain, add the Orc assemblage, and the Human Chieftain assemblage. Now, you want to be able to easily delete all the “humanity” you’ve accidentally picked up, without losing the “chieftainship”).
Or better: make refactoring assemblages easy: the above example would ideally result in creating a new Chieftain assemblage, splitting it out of human chieftain, embedding it back in, and adding it to Orc chieftain – all as a single atomic refactoring.
This leads into a bigger area though: how do you record the initial values for components?
E.g. do you allow a component value to be defined relative to other component values?
“(e.g. when creating a new entity, just create a blank entity in the entity table first, and that fast atomic action “reserves” an entity_id; without Table 4, you have to create ALL the components inside a Synchronized block of code, which is not good practice for MT code).”
For some databases, like PostgreSQL and Oracle, you don’t need a table whose sole purpose is providing that benefit – you can use and create sequences of unique IDs independent of columns and tables. I’m curious what performance advantages you’re talking about – it seems that an index on entity_components.entity_id would be sufficient.
I don’t have much else to say on the general subject at the moment – I’ve been giving various approaches some serious thought lately, but I’m not convinced of any of them. :-)
@Matthew
Yep, I tried to stick to the most basic form of Relational here – that way there’s less to explain to people who don’t know it already :). Also means it’s easier to conceptually move to [platform of choice] and not worry what built-ins it does/doesn’t have.
re: performance … I just meant that you’re likely to have places where you want to iterate over “all entities” with minimal lookup. It’s convenient to have that as an instantaneous lookup. For instance: if you’re caching entity data, you want to know the complete list of entities, and how many there are, stuff like that.
(incidentally…stuff like caching is likely to end up adding more columns/metadata to the Entity table – but since it’s metadata for activities that live “outside” the Entity System itself, i.e. it’s not game-related, that’s probably OK. I think :)).
@Matthew
So … what are these approaches you’re considering, and why haven’t any been chosen yet? :)
I freely admit you’ve both lost me on the technicalities, and that I’m very interested in the designer tools of anyone doing this…
First I just want to say entries like this series are why I love reading your blog.
I’m having a problem buying/understanding the System division. How do systems communicate with each other? Let’s take the Rendering system and the Animation system for example. When something happens and the Animation system decides it needs to move my player from point A to point B, how does the rendering system get this information? Which component inside my player has what information?
The follow up to that (assuming the systems must communicate somehow, either through shared data or messages) is how does this scale? Especially once you’ve reached the point where you’d split table 5 onto different servers.
@adam – the usual suspects, really; binary blobs, entity systems like you describe (and/or simpler ones), and a stronger / more manual approach directly mapping game entities to tables.
I haven’t settled on one because for the last year our team has been focused on prototyping gameplay for something other than your standard Diku/EQ/WoW clone. I was just starting on some aspects of world persistence the last couple of weeks, and now I don’t have to worry about that. A big part of my interest in exploring approaches is balancing production performance and data analysis, and none of the standard approaches do very well at one or the other.
Where do you find ES to be less appropriate than the other approaches?
@andrew
It depends a bit upon how your rendering layer works, obviously…
Beyond that, it depends how you choose to implement the ES internals of your systems.
I’ll think about this properly, and try to include a better response later, but off the top of my head (nb: I reserve the right to later declare some or all of this stupid and wrong ;))…
For instance: if your renderer requires polygon arrays, then your renderable component may include a list of polygons to render. Your animation system might mutate those polygons directly – either by modifying them in-situ, or by reading e.g. a bones list from its own component – the animation component – moving the bones, regenerating the entire polygon list, and overwriting it.
The rendering system would pick up the polygons each frame, and probably pick up the world-co-ords too, and translate the polys into position.
@adam
So taking the example where the Animation system just modifies the polygons in the Render component. How is this best handled when the two systems are running on different servers? (I know it’s not likely that a desktop game with an Animation and Rendering system would be large enough to take two servers, but the problem still holds for larger Systems that need to interact.)
@Andrew
This is just a thought but one way this might be done is if all of the different Components are stored on a separate Database Server and all of the System Servers are connected by a message system then the animation process could work like this:
First, the Animation System Server activates its Animation and updates the Animation Component on the database.
Then it sends a message to the Render System Server that an animation has been activated (this message would probably hold things like the Entity-id).
And finally the Render System Server will use the information in the Animation Message to fetch the animation data from the Database and use that during render time.
Hope this helps.
Good article!
Maybe you can readdress this issue that has been nagging me:
How do systems interact with each other?
Are the systems oblivious of each other?
Let’s imagine two systems:
“ShootSystem”
“RegenerateHealthSystem”
– But the condition of some game unit (tank) is that it cannot shoot while regenerating.
1. Can I not make the ShootSystem somehow query the RegenerateHealthSystem if that entity is regenerating?
2. Should these two Systems actually be one System? Something like “RegenerateHealthAndShootingSystem” ? That doesn’t make sense to me, I thought you said this was supposed to be modular!? :)
3. If these systems can communicate, then what is the significance of this when you have each of those systems running on its own server?
Good stuff. A couple of random comments:
1) If you’re willing to live with the limitation that a given entity can only have a single instance of any specific component (which may not be a limitation at all, depends on design), then the component_data_id columns are unnecessary, and the component data tables would use entity_id as their primary key.
Even if you really needed multiple components of the same type attached to an entity, I don’t know if I’d necessarily use a separate id space to store them. An “instance_id” column (unique only to a given entity_id) would also work, and component data tables would have to be keyed on entity_id + instance_id.
Why do this? Well, I think it’s clearer to have component data indexed on entity_id anyway, but it also means that component data tables will be sorted based on entity_id, which is a good thing for performance.
2) I generally prefer to not store table names in other tables, e.g. the components.table_name column, and would opt for an algorithmic method. My knowledge of SQL is far from complete, but as far as I know, you can’t get a column value and then use that value in a later “static” query as the table name. Instead you have to build the query strings dynamically, which reduces performance (or at least, forces the database to re-analyze/optimize such queries).
Hard to describe exactly what I mean, but let’s assume that SQL included a unix-like backtick operator. The following is what I’d theoretically like to write, but (to my knowledge) cannot:
SELECT * FROM `SELECT table_name FROM components WHERE component_id = $2` WHERE entity_id = $1;
The above query is meant to return (arbitrary) component data, given an entity id and a component id. Then again, I’m not sure how the db optimizer could do much with the above, even if I could legally write it.
I guess my main issue is that I prefer using stored procedures – exclusively – from the application, for a number of reasons. I don’t know that I could do that with the “dynamic” table names utilized in your approach – it would depend on what the database procedural language supported.
And yes, using “known” table names does mean that entity loading queries would hard-code those tables (perhaps in a huge join, even), but in the interests of performance that is something you may ultimately find necessary. I look forward to the next article in this series, and how you might avoid the “N + c database queries for an entity with N components” problem.
@adam: entity systems are really awful at data analysis, because so little of the data model is encoded in it (arguably the data model is just very loose, but the point is that at launch time the game has a data model implied by the data that is more concrete and useful than the entity system’s). They are also not very impressive from a performance standpoint, with lots of iterated queries just to build one assembly.
@Jason: Transact-SQL has ExecuteSQL(), which is about as bad as you imagine. I saw a case where it was useful – an automated database update script had some lines of SQL that referred to non-existent tables and columns if that portion of the script didn’t need to be run. The script was smart enough to check for that condition, but if you left the ALTER TABLE commands as bare SQL they’d be parsed and SQL Server would error on them before anything was actually run. Wrapping them in ExecuteSQL() let us check for whether they were going to be valid, and THEN get the database to parse/execute them.
But yeah, it’s the database equivalent of self-modifying code, with equivalent big angry warning signs flashing around it. :-)
@Jason
re: 1 – good point. As you say, assuming only one instance of a given component is needed per entity … I think that’s a better way of handling it.
re: “N + c database queries for an entity with N components” problem.
…each system is typically only going to need access to a handful of components. If you need all the data for an entity, it’s either for offline / human-readable purposes (load it on screen to show to a player), so that latency isn’t an issue any more, or it’s as part of a mass serialization, in which case you’re probably going to sidestep it and just start dumping tables en masse.
Ideally: rather than fetch entities on-demand, you pre-select all the data needed by a particular system for ALL entities (should be < 10% of the total data of all entities – as per above, you only pick up the components that this system actually uses), and pass it that as an input parameter on each tick of the game-loop. There are obvious optimizations here, e.g. splitting that pre-select into batches, but the use-cases for how much you want to pass to the system are going to depend a lot on where / how you're using the ES (is this a PS3 client? a multi-server cluster? etc).
@Matthew
I might have misunderstood what kind of data you’re talking about, but…
“entity systems are really awful at data analysis” – I’d argue the opposite :), although I agree that what you say is true “by default”.
The difference for me is that any real world ES is going to fall flat on its face as a data-maintenance nightmare if *all* you implement is the runtime system. Practically, you *must* build toolchain features / plugins that support the ES directly in your game-editors.
As soon as you do that, all the data model is once again back – only this time it’s even more explicit (and easier to read and write) than in the alternatives (such as OOP): it’s purer, because the *only* thing affecting the structures you have is the metadata you added to track those structures. In, say, traditional OOP, you’ll “additionally” have occurrences of subclassing and non-subclassing that had to be done for reasons other than the core data structure.
FYI – in my earliest conversations with Scott Bilas about ES’s for Dungeon Siege etc, IIRC … he picked on the editor-support as the biggest single win/lose criteria for an ES being deemed a success at the project / final game level. For offline games like DS, the ES being implemented well or badly alters the output of the programming team, and the final runtime performance and bugginess. But the editor support alters the volume and quality of game-content – which for an RPG (obviously!) is the biggest single factor.
So … yeah … very important topic. One I haven’t gone into in any detail (yet), because with modern games (larger scale and/or multi-threaded and/or online), the ES implementation itself becomes a lot more critical. And it’s easier to screw up, IMHO :). But one I certainly want to cover in detail at some point – although I have less knowledge on how to do that part “right” – I have ideas and wishes on how to do it “better than I did before”, but haven’t tried them out yet.
@Adam
As usual (since it’s all I do), I’m considering how these systems work in an MMO, and in particular on the server side. Loading all (dynamic) entities, even into a caching server, is probably not feasible. (Though for entity templates, this is exactly what you’d want to do.) And the typical MMO architecture has a “world simulation” process that needs access to most, if not all, components of an entity, so partial entity loading isn’t applicable. So, it’s vital to be able to load an entire entity from the database in an efficient manner. Making multiple queries, one per component, does not strike me as satisfying that goal.
In a previous project of mine, we faced this issue. It turned out, after much experimentation, that the fastest way to load an entity from the database (MS SQL Server, fwiw) was to do a join across all of the component-specific instance data tables. If an entity didn’t have a row in one of those tables, then the result would have NULLs in those column positions, which had to be handled appropriately at the application level. Not pretty, and definitely not the easiest thing to maintain, but sometimes that’s how it goes.
It’s worth noting that we only had about 5 components that had any instance data; add a few more, maybe a different approach would be better. If I were starting with a new technical design, I’d probably stick with a simpler method (i.e. query each component table individually), until it became necessary to improve performance.
@Jason
“the typical MMO architecture has a “world simulation” process that needs access to most, if not all, components of an entity”
If it’s working correctly, I don’t see how that would happen?
If you’re accessing most/all components of the entity, then you’ve failed to build ES Systems – instead, you’ve built a monolithic “doEverythingPlease()” method.
Why would you do such a thing? What in your simulation algorithm requires doing everything in a single process in a single place?
@adam: re: what Jason is talking about, yeah. The problem – and this is where e.g. Darkstar falls down too – is that pretty much every piece of data has a good chance of affecting any particular step in the simulation. Are you at least level 35? Do you have a frobitz in your inventory that gives you a bonus to [whatever is happening]? Does your guild have a special power that gives you a bonus? “Relevant stats” at every step of the simulation easily balloons over time until any hard and fast demarcation you make… fails.
It is, in my opinion, a matter of designing the architecture of the game to fit the designers’ and players’ vision as much as possible, rather than restricting the vision to what architecture you have. The latter certainly comes into play quite a bit, but I think our job as programmers is to fight against it as much as possible… which is the challenging part that makes being a game programmer so fun and fulfilling, personally.
Now, re: data analysis on the entity system through tools, that’s precisely the same analysis model as e.g. binary blobs. Except that binary blobs are ridiculously fast to store and load compared to an entity system, require much less database hardware, and so on.
@adam
Just because there is a world simulation process doesn’t mean that the individual game systems within that process aren’t largely independent, and only work with a subset of entity components. For example, the movement system should (ideally) only need to know about an entity’s physics model – collision geometry, position/orientation, velocity, etc. But, if you have decided to distribute the server load based on geography (e.g. one server handles all processing for a portion of the world), you still end up with a process that needs the whole entity.
And, as Matt pointed out, those crazy game designers have ways of creating systems that do require access to numerous components. Combat systems, in particular, tend to grow in that direction. (For this reason, I’d argue, using geographical distribution is still the “best” way to spread the load on the server…but that is a whole other discussion.)
“If you’re working on MMO’s, you should be using SQL for your persistence / back-end”
Should we? Don’t you think it depends on how one wants to store and navigate one’s data? SQL/relational isn’t the golden hammer. I wouldn’t dismiss other storage/database solutions (object, map/reduce inspired systems), especially for a game object entity system, before thinking about a game’s specifics.
… Or maybe do you mean “you should” in the sense that “most of you are”?
@Olivier
Yes – sorry about that – I meant:
– EITHER: you’re already using it, or have used it, and at least know how to use it (most people)
– OR: you’re not using it – but that’s because you’re not using ANYTHING yet, you haven’t got that far (you probably WILL end up using it, and if you’re this new to MMO’s, you really SHOULD use it)
For the record: Alternative persistence backends are a lovely idea and I’m all for them.
OTOH … at AGDC, on our panel, Marty was claiming that only map/reduce was worth using, because “every other DB backend is too slow at writes” … and my response was twofold:
1. Use more functional code
2. Use Entity Systems
I believe you can make a lot of better-known techs (like OOP and SQL) work “well enough” if you modify them by using an ES. I also believe that if you really know what you’re doing, and you can find a robust enough language, and you could find qualified programmers, you’d write everything on the server in a functional language. But I also consider that totally unrealistic for 99% of us.
(if anyone tries it, let me know – I’ll be cheering for you, but don’t ask me to *bet* on you ;))
(Incidentally: 27 comments and still going … this is a big part of why writing anything declarative about ES design takes a lot of time and effort: so many details and edge cases to think about, that it’s worth thinking about, before you open your mouth and say something that implies stuff you didn’t intend :))
@Jason
I have no problem with a single process having everything – but it never needs it all at once. It only needs it sequentially – bringing me back to: when would you ever need all that data at once? (other than for data-dumps to go to a foreign system – offsite backup, etc)
[actually, I do have a problem with monolithic processes – unless it’s got a decent built-in scheduler and built-in threading system, and is micro-managing a load of mini-processes, which would be fine by me. But at that point I’d still like to know what was so wrong with the various OSS and commercial scheduling systems that you felt “I can do better; writing a multi-threaded OS ain’t so hard”; are the alternatives unusable?]
@adam
“I also believe that if you really know what you’re doing, and you can find a robust enough language, and you could find qualified programmers, you’d write everything on the server in a functional language.”
Would be nice if you could elaborate on that… One of these days when you get the time.
I think one of the main challenges of that is the use of external libraries and frameworks; F# should be faring better than Erlang in that matter, for instance… But most of the great frameworks I could think of for an MMO backend are either exclusive to, or available on, the Java platform… Is there a solid functional language running on the Java platform?
@Matthew
“data analysis on the entity system through tools, that’s precisely the same analysis model as e.g. binary blobs. Except that binary blobs are ridiculously fast to store and load compared to an entity system, require much less database hardware, and so on.”
You’ve lost me here. Maybe I’m thinking of a slightly different concept when you say “binary blobs”, because the blobs I’m used to do not exhibit the features you describe.
Firstly, access:
A. ES’s provide direct access – and direct cross-entity joins – to all data.
B. Blobs literally *prevent* direct access – and all cross-entity joins.
The data analysis you can do offline on ES’s can also be done online – and it requires zero setup. To do the same analysis on blobs requires vast amounts of pre-processing (counted in hours or days to convert from the blobs into a usable format)
The inability to join across entities/game-objects is the main reason I would avoid blobs here – IMHO it’s like cutting both your DBMS’s arms off, blinding it, and then asking it to carry a cup of water without spilling a drop.
Secondly, speed:
If you have a crap DBMS, with poor locking (e.g. MySQL about 10 years ago), then yes – blobs significantly increase write speed by circumventing lock-management / consistency maintenance (at the expense that you’re not allowed any strong references between objects; ouch – that causes more performance problems elsewhere). But if you’ve got a good DBMS, circa 2009, that shouldn’t be the case.
You get some minor speed increases because you don’t have to do any CPU work. But you also get some minor speed decreases because you have to send vastly more data than you actually wanted / needed – and you cannot start processing on the received data until you’ve received the lot (unless you want to write some clever code to change your entire semantics of accessing data in RAM).
So … where am I going wrong here? Have there been some big changes in blobs in recent years that passed me by? (oops)
@Olivier
Side-effect free
Trivially (and automatically!) parallelizable
Trivially (and automatically!) cacheable
No more dupe bugs
What’s not to love? ;)
@adam
Sure… But… That means closing the door on a lot of nice/essential things….
Hey Adam,
I’ve read and enjoyed all of the previous posts on Entity Systems (including yours), and they all seem to be targeted towards end-users / clients that run ON fast hardware, e.g. PS3, or gaming PCs running games coded in C/C++.
I know very little about the Cell Architecture, except that it is much faster than a PC, and probably an order of magnitude faster than a mobile device (like the iPhone). So my question is, what do you suggest for slower devices/platforms; mobile, flash, java?
For instance, flash isn’t multithreaded…would you simulate threads for the Systems? Also, forgive my naivete, but don’t languages like Java and AS3 *force* OOP?
@Cameron
Beneath all the other reasons for using an ES, the core reason is simply “to do things in an OOP environment that we cannot otherwise do” – so Java and AS3 in no way undermine usage of an ES.
Well … except that without access to C/C++ structs, it’s hard to do the “streaming” part fast. Various people have reproduced structs in Java (it’s only mildly tricky), and it works fine now that Java has direct (low-level) access to RAM. I have no idea what equivalents are available in AS3 today. You could still use an ES without that, it’s just that *some* of the performance improvements would no longer work.
re: mobile…
I’ve been concentrating on “big” platforms because … well, actually … I’ve been concentrating on “multiprocessor” platforms, where (generally speaking) you want to do as much data-driven programming as you can, and where some of the side-effects of ES’s have no negative side-effect on performance (because they’re “multiprocessor friendly”).
Again, there’s no particular reason to focus on those platforms: I got started with ES’s purely for the improvements they bring to the implementation of game-logic. That’s universal.
But I always intended to veer these blog posts towards “not ONLY are ES’s a better way to code, BUT ALSO they give you higher performance (in certain types of game)” – hence the title ;). A lot of techniques in game development make development easier/faster, but slow down the runtime game a lot. It’s nice to occasionally find some that are easier AND improve performance.
(NB: that’s not the only reason for the title)
Thanks for the articles, ES is a really interesting topic and it has its points. But I fail to understand how “big” a system should be. I mean, can one system operate on many components? If so, how many? If I have a tank, for example, and two systems, a shoot system and a damage system, the first one obviously needs access to the data of the latter. How should I handle that?
@Valentin
Ideally, as many systems as possible. That makes each system as small as possible, and as easy to edit + maintain + improve as possible.
…but use discretion, obviously.
In the end, *depending upon hardware/OS*, additional systems carry little overhead. At some point, though, adding X additional systems adds significant overhead, because you are iterating over “all entities” N+X times rather than N times.
Generally, though, it’s not something to worry about: just have lots of systems now, and merge them later as/when necessary.
NB: one system certainly *can* act on multiple components, in fact most systems *must* act on components from other systems, although often only in a “read only” capacity.
(e.g. the rendering system needs to read the world-co-ords for each object, and add that to the local-co-ords for each polygon, in order to render everything in the right place on screen :). The world-co-ords are being maintained by another system, one to do with movement, but the polygon co-ords are being maintained by the rendering system. Or the animation system, if that’s a separate system (which it probably is))
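As a rough illustration of the “one writer, many readers” idea in the reply above, here is a minimal Java sketch. Everything in it is an assumption made for the example: the `Position` and `Velocity` components, the per-component maps keyed by entity id, and the two system functions are hypothetical, not taken from any particular engine.

```java
import java.util.HashMap;
import java.util.Map;

public class SystemsSketch {
    // Component data only: world position of an entity (hypothetical layout).
    static final class Position {
        float x, y;
        Position(float x, float y) { this.x = x; this.y = y; }
    }

    // Component data only: velocity of an entity (hypothetical layout).
    static final class Velocity {
        final float dx, dy;
        Velocity(float dx, float dy) { this.dx = dx; this.dy = dy; }
    }

    // One "table" per component type, keyed by entity id.
    static final Map<Integer, Position> positions = new HashMap<>();
    static final Map<Integer, Velocity> velocities = new HashMap<>();

    // Movement system: the only code that writes Position.
    static void movementSystem(float dtSeconds) {
        for (Map.Entry<Integer, Velocity> e : velocities.entrySet()) {
            Position p = positions.get(e.getKey());
            if (p == null) continue; // entity has velocity but no position
            p.x += e.getValue().dx * dtSeconds;
            p.y += e.getValue().dy * dtSeconds;
        }
    }

    // Rendering system: reads Position (owned by movement), never writes it.
    // It adds the world co-ords to a local offset that it owns itself.
    static String renderSystem(int entityId, float localX, float localY) {
        Position p = positions.get(entityId);
        return "draw at (" + (p.x + localX) + ", " + (p.y + localY) + ")";
    }

    public static void main(String[] args) {
        positions.put(1, new Position(10f, 20f));
        velocities.put(1, new Velocity(2f, 0f));
        movementSystem(0.5f);
        System.out.println(renderSystem(1, 1f, 1f)); // draw at (12.0, 21.0)
    }
}
```

Splitting or merging systems later only changes which function touches which map; the component data itself stays where it is.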
In the database, the ComponentData tables would, I assume, contain columns of numerical, datetime, string and binary data. These columns would be named appropriately and everything would look perfectly normal (no pun intended).
When we cross over to our application code, typically in an OO language, the component data is the point at which your entity model loses its generic one-size-fits-all structure. Whilst the core entities and components remain 100% dynamic, is there a point at which we transition to a more static data model, in particular for ComponentData?
I would expect code within a GunComponent to be able to call…
if (data.GunRate > 1000) {
… or …
data.GunName = "Colt";
… accessing the data via strongly typed accessors rather than a bag of string data. Perpetual casting and parsing of string data would otherwise cause a hefty performance hit.
This is a general question, but inspired by the following line from your most excellent (dude) article:
float[] gunData = getComponentDataForEntity( GUN_COMPONENT, new_id );
^ So does getComponentDataForEntity always return a float array?
@Edward
The facetious answer would be “use structs”.
My preferred approach is passing around big streams of byte data, and using a single instance of a (pertinent) struct to view that data one record-at-a-time from the OOP code. This allows for query-system-friendly “ask for everything you need at once, then slam it down the pipe in a single batch of bytes” … while also providing sane access semantics.
I tried googling for a link to a technique for a more comprehensive answer, and then found that the terminology I’ve been using for ten years apparently doesn’t exist. I’ll get back to you on that one :(.
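For readers who want something concrete, here is one possible reading of the “single struct instance viewing a byte stream” idea, sketched in Java with the standard `java.nio.ByteBuffer`. This is an assumption about what such code could look like, not the approach actually used in the reply above: the record layout (entity id, rate, ammo), `RECORD_SIZE`, and the `GunView` class are all hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class GunViewSketch {
    // Hypothetical fixed-size record: int entityId, float rate, int ammo.
    static final int RECORD_SIZE = 12;

    // A single reusable "struct" that is pointed at one record at a time.
    static final class GunView {
        private final ByteBuffer data;
        private int base; // byte offset of the current record

        GunView(ByteBuffer data) { this.data = data; }

        // Re-aim the view at another record; no per-record allocation.
        GunView at(int recordIndex) { base = recordIndex * RECORD_SIZE; return this; }

        int entityId() { return data.getInt(base); }
        float rate() { return data.getFloat(base + 4); }
        int ammo() { return data.getInt(base + 8); }
        void setAmmo(int value) { data.putInt(base + 8, value); }
    }

    // Stands in for "one batch of bytes down the pipe" from the data store.
    static ByteBuffer sampleBatch() {
        ByteBuffer b = ByteBuffer.allocate(2 * RECORD_SIZE).order(ByteOrder.LITTLE_ENDIAN);
        b.putInt(1).putFloat(1200f).putInt(6);
        b.putInt(2).putFloat(300f).putInt(30);
        b.flip();
        return b;
    }

    // Typed access (rate() > 1000) with no string parsing or casting.
    static int countFastGuns(ByteBuffer batch) {
        GunView gun = new GunView(batch);
        int records = batch.limit() / RECORD_SIZE;
        int fast = 0;
        for (int i = 0; i < records; i++) {
            if (gun.at(i).rate() > 1000f) fast++;
        }
        return fast;
    }

    public static void main(String[] args) {
        System.out.println(countFastGuns(sampleBatch()));
    }
}
```

The calling code gets strongly typed accessors, while the data underneath stays a flat block of bytes that can be fetched or sent in one go.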
Any chance of having a part 6? I would be interested to see what an editor for this would look like.
Thank you for your really cool series of articles… I really like the concept, but one thing bothers me: how subsystems communicate with each other.
For example: a simple game with 3 subsystems called renderer, physicsSim and input. The game has 2 entities floating about. Both entities have a physics component, a renderable component and an input component set up in the database.
The work the subsystems do is simple: the renderer just renders the renderable components, the physics just simulates the physics components, and the input just updates the input components with the state of input. But where is the part where the input subsystem tells the physics system to apply forces based on input, or the physics system tells the renderer the transform to render at?
Is it OK to do a query in the renderer to look for a physics component with the given entity id to get the transform to render at? If so, I can’t see the benefit, and I think this will cause threading issues. Or am I just still thinking in the wrong way to solve this problem?
Many thanks and I look forward to the next article.
I should have read the previous posts before posting, since I think my question is answered already.
Thanks again.. both article and following discussions everyone.
[…] Gamedata class takes the data storage of an Entity System as its model and implements it in PHP. For details and explanations of how it works […]
[…] wrote a data-storage class (gamedata.class.php). In Zeitgeist, this is modelled on the data storage of an Entity System. In the comments, Gameplorer asked me the following question about the principle: How well […]
We want part 6!
I know this is an older post but I thought I’d ask anyway. We have our ES system running fairly well and it’s very similar to what you have described, even though I did not find your article until today. Our components are simply containers of data and are stored as separate tables in SQL Server, with one table per component type.
We are able to load all the entities and populate the objects in memory by using projections. This all works great and is fairly efficient in terms of the queries involved (one per component table).
The tricky part, and the area I need some help with, is how to persist the data as it changes during the running of the game loop. For example, we have about 100,000 location components with x, y, z coordinates that are being updated once every ten seconds. Trying to save this data back to SQL Server is painfully slow. Serializing to disk is fast but has problems when you want to change the structure or perform maintenance on the data.
How are people handling persistence?
Has anyone used any of the main memory databases for their objects?
Thanks,
Rick
@Rick
NEVER write-through live data to an SQL db, it’s generally far too slow. Instead, send it to an in-memory SQL db of some kind (there are many available, from free to commercial). They work very well as an “intelligent cache that speaks SQL directly, and avoids you having to re-write your code to speak to the cache instead of speaking SQL”.
If the in-memory DBs you can afford are still too slow, get hold of an even more lightweight/minimal SQL implementation (or roll your own), and again run it purely in memory. Or try running it off a RAMDISK, and set up your “real” db as a replication slave or similar.
In general, the db features you need to run the data live are a small subset of the db features you need on the persistent copy of the data. There’s no reason why you have to run the same DB server (or even same DB vendor!) for the two different copies…
(FYI I recently wrote an ES for Android mobile phones; I didn’t run SQL at all, I just approximated it with a lightweight in-mem abstraction layer. I can then serialize that to flash-mem using SQLite as an entirely separate chunk of code. With the new version of Android, I might try running it through SQLite as well – new Android is allegedly up to 5x faster than old Android – could be interesting to try)
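To make the “live copy in memory, persistent copy elsewhere” split a little more concrete, here is a small Java sketch of one way it could be arranged: game code only ever writes to an in-memory table, and a separate step periodically copies the changed rows to the slow store in a single batch. The `PersistentStore` interface, the `LiveLocations` class, and the flush policy are all assumptions made for this example; a real setup would more likely put an in-memory SQL database in that role, as described above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class WriteBehindSketch {
    // Hypothetical stand-in for the slow, persistent database.
    interface PersistentStore {
        void saveBatch(Map<Integer, float[]> rows);
    }

    // Live, in-memory copy of one component table (location: x, y, z).
    static final class LiveLocations {
        private final Map<Integer, float[]> rows = new HashMap<>();
        private final Set<Integer> dirty = new HashSet<>();

        // Called by the game loop; never touches the persistent store.
        void update(int entityId, float x, float y, float z) {
            rows.put(entityId, new float[] {x, y, z});
            dirty.add(entityId);
        }

        // Called on a timer; sends only the rows changed since the last flush.
        int flushTo(PersistentStore store) {
            if (dirty.isEmpty()) return 0;
            Map<Integer, float[]> batch = new HashMap<>();
            for (int id : dirty) batch.put(id, rows.get(id).clone());
            store.saveBatch(batch);
            int flushed = dirty.size();
            dirty.clear();
            return flushed;
        }
    }

    public static void main(String[] args) {
        List<Integer> batchSizes = new ArrayList<>();
        PersistentStore store = rows -> batchSizes.add(rows.size());

        LiveLocations live = new LiveLocations();
        live.update(1, 0f, 0f, 0f);
        live.update(1, 1f, 0f, 0f); // same entity again: still one dirty row
        live.update(2, 5f, 5f, 0f);

        System.out.println(live.flushTo(store)); // 2
        System.out.println(live.flushTo(store)); // 0 (nothing changed since)
        System.out.println(batchSizes);          // [2]
    }
}
```

However many times an entity moves between flushes, the persistent side sees at most one write per entity per flush, which is where most of the saving comes from.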
I have been looking at the various MMDBs but the choices are fairly limited for my platform (.Net & Windows Server).
We’re also trying to work out efficient ways of messaging the entities as they need to react to changes. We have hundreds of thousands of objects so we need to be careful.
Thank you for your thoughts.
@Rick B: Figured I’d see you here. I wonder if you have looked at MongoDB. Seems pretty blazing fast since it runs in memory using memory mapped files:
http://blog.mongodb.org/post/101911655/mongo-db-memory-usage
We often see people trying to do an ES and an event-driven system together, at the same time. I’m usually not sure why people do this. I’ve no idea about your case, but in most cases it proves too much change at once – that’s TWO unusual paradigms that a team is new to, each with complex performance and behavioral changes of their own.
I’m not convinced ES and event-driven systems go well together for “normal” situations, e.g. they could work extremely parallel (both approaches are inherently parallelisable), but few people have enough processors to make that worthwhile these days.
Out of interest, why are you doing both together? Where is it working? Where is it causing you headaches?