Yet another dev blog...: code design

Showing posts with label code design. Show all posts

Monday, October 14, 2013

Memory management 2/2

Wow after a year I finally manage to write this post.

Well, as promised we'll take a look at some memory management implementations (stack allocator, block allocator, pool allocator)

Motivation

Ok, so what's the motiviation behind all this? Why would we even need something like memory management? What's wrong about new and delete (or malloc and free if you're a C programmer)?

Well basically there's nothing wrong with using new and delete. It all depends on your use-case, code base complexity and overall performance.

Many developer out there will tell you to use malloc/new as little as possible and that you're doomed if you won't listen to them. Well that is probably true for AAA title but if you're just doing a small indie game or are just getting started with a little game then there's IMO nothing wrong with using malloc/free or new/delete.

If you're however working on a bigger game or even a console game, memory management will be unavoidable.

The reason for that is primary performance and secondary memory profiling/organisation (e.g. to make sure each subsystem is within its memory budget).

Whenever you're requesting memory by calling new/malloc, your operation system will most likely switch into so called 'Kernel Mode' to search for free memory in order to satisfy your memory request (some operation systems however implement malloc/new in a way that they'll get a bigger block of memory than you requested and will work with that block as long as it has enough free memory to satisfy your request - however to get that memory even these implementation have to switch to 'Kernel Mode' though they won't do it for each time you call malloc/new)

The switch from 'User Mode' to 'Kernel Mode' and back from 'Kernel Mode' to 'User Mode' is something that is considerably slow. If you're having a high amount of allocations per frame you'll run into serious performance problems using raw new/malloc to allocate memory.

Solution

The solution for this problem is - once you know it - pretty simple.

I'll only tell you how I implemented memory management in my engine. Other engines / games / projects will probably use different implementations / strategies...So don't get too fixed on this ;)

The solution is - in short - to just allocate a big block of memory using malloc/new and work with this block.

So instead of calling malloc/new each time you need memory, use the memory from one big block of memory which'll get allocated via malloc/new at the beginning of your application (for example). The costs from switching to 'Kernel Mode' and back will be eliminated by that and you can use some neat memory allocation pattern to make the best use of the allocated memory block.

Another benefit by this approach is IMO that you can exactly control how much memory your application allocates. So if you're working on a console with very limited memory, you can just allocate all available memory at the beginning of the application and then use that to satisfy all your allocations. Once all memory has been allocated and an allocation can not be satisfied you can just print a message and return NULL. (most consoles will probably just crash once you've requesting more memory than the console has to offer, so printing a message is definately a better way than just crashing)

The big question now is "what do we do with the memory block once we allocated it?" unfortunately you can't call malloc/new on that block to get slices of memory from it. What you'll have to do is implement your own memory allocation pattern to get memory out of that block.

Memory allocation pattern is what the next chapter will be about.

Memory allocation patterns

Stack Allocator

The easiest pattern to implement would be the stack allocator.

Basically the stack allocator just consists of a pointer to the current position in the memory block - it's like an offset from the start of the memory block.

For each allocation request, the stack allocator will return the current pointer. After satisfying the request the pointer will get increased by the size of the memory request.

Example:
At the beginning the memory offset of the stack allocator will be 0. So the current pointer will be pointing to the memory at the beginning of the memory block (as the current pointer = start of memory block + offset)

Let's assume the user wants to allocate 10 byte from the stack allocator. What happens is that the current pointer will get returned and the offset will get increased by the size of the requested memory (in this example the size would be 10).

To clarify that example here's some code:

 byte* beginning_of_memory_block = getMemoryBlock();  
 byte* requested_memory = beginning_of_memory_block + offset;   //offset = 0
 offset += sizeof(requested_memory); //10  
 byte* current_pointer = beginning_of_memory_block + offset;  //offset = 10
 setCurrentPointer(current_pointer);  
 return requested_memory

This is simplified pseudo-code to illustrate the concept of the stack allocator.

Deallocation is even simpler. Just decrease the pointer by the size of the deallocated memory.

Note: Deallocation has to be in the reverse order of the allocation order. That's the big contra of using a stack allocator. You can't allocate and deallocate memory in arbitrary order!

Pool allocator

The pool allocator is also not that complicated to implement.

The theory about a pool allocator is it'll split a memory block into memory chunks of a specific size (for example the size of a user defined class). Each time you request memory, the pool allocator will go through its chunks and check if there's a free chunk - if there is, it will return that chunk.

There are multiple ways to implement a pool allocator.

One of the simpler implementations would be to use an array.

If you were to write a pool allocator class, you could use C++ templates to create that array on the stack.

 template<unsigned int SizePerObject,unsigned int AmountObjects>  
 class PoolAllocator  
 {  
 //... Costructor, Destructor etc.  
 private:  
    unsigned char m_Memory[SizePerObject * AmountObjects];  
 };

To keep track of used chunks you could use another array that stores a boolean value per chunk to indicate whether the chunk is free or not

 template<unsigned int SizePerObject,unsigned int AmountObjects>  
 class PoolAllocator  
 {  
 //... Costructor, Destructor etc.  
 private:  
    unsigned char m_Memory[SizePerObject * AmountObjects];  
    bool m_FreeChunks[AmountObjects];  
 };

If the user wants to allocate memory from the PoolAllocator you could implement it in a way that the allocate function iterates over the array until it finds a chunk that is free (m_FreeChunks[i] == true). You could then use the position of the free chunk to get the memory from the memory block itself.

 for(int i = 0;i < AmountObjects; ++i)  
 {  
    if(m_FreeChunks[i])  
    {  
      m_FreeChunks[i] = false;  
      return m_Memory[SizePerObject * i];  
    }  
 }  
 printf("StackAllocator could not satisfy memory request.");  
 return NULL;

Deallocation is - like with the stack allocator- simpler than allocation.
For deallocation you just need to set the specific chunk the memory came from to 'not used' (m_FreeChunks[i] = true in our example).

Note: Due to its layout, a pool allocator can be used to allocate and deallocate objects in arbitrary order. However, the pool allocator can not be used to allocate memory of arbitrary size!

Block allocator

The block allocator is the best of both worlds. Allocations/deallocations can be made in arbitrary order and in arbitrary size. But before you get too euphoric and forget about the two previous allocator pattern, note that the algorithm for allocating memory is more complex than the simplistic allocation algorithm of the pool and stack allogator. Therefore, excessive allocation/deallocation using the block allocator will be considerably slower than using a pool or stack allocator.

The theory behind the block allocator is to split the underlying memory block into several smaller memory blocks. But unlike the pool allocator we don't do that at the beginning using a constant size, but rather during allocation with a variable size.

The used blocks will get saved in a internal array (or linked list) and on deallocation the block allocator tries to merge multiple deallocated blocks into a larger block, so that we'll end up with - again - one large memory block when all allocated memory from the block allocator has been deallocated.

 struct MemoryBlock  
 {  
    unsigned int StartPosition;  
    unsigned int EndPosition;
    bool Free;
 };  
 class BlockAllocator  
 {  
    //Constructor, destructor etc...  
 private:  
    unsigned char* m_Memory;  
    MemoryBlock* m_Blocks[AMOUNT_MEMORY_BLOCKS];  
 };

This is a simplified declaration of a MemoryBlock struct and the BlockAllocator class.

The algorithm for allocating memory could look like this

 for(int i = 0;i < AMOUNT_MEMORY_BLOCKS;++i)  
 {  
    if(m_MemoryBlocks[i]->Free)  
    {  
      unsigned int newEndPos = m_MemoryBlock[i]->StartPosition + sizeof(AllocatedObject);  
      m_MemoryBlock[i]->EndPosition = newEndPos;  
      m_MemoryBlock[i]->Free = false;  
      createMemoryBlock(); //create new memory block  
      return m_Memory + m_MemoryBlock[i]->StartPosition; //return memory  
    }  
 }  
 printf("Block allocator - can not satisfy memory request");  
 return NULL;

Again, this is just pseudo code ;).

I hope the informations provided in this and the last post was enough to give you a basic understanding of memory management in C++ and why it is important to know how to implement your own allocator.

If you still have questions or don't understand my examples, feel free to contact me.

In the next post I'll probably talk about the interaction between your game engine and an external game editor.

Tuesday, February 21, 2012

Use typedef to make your code easier to read.

I want to talk about the C++ keyword "typedef" in this post and how it can make your life easier.

I assume that you often work with hard to read template classes (like 90% of the STL).
Consider this code given the above assumption:

 void foo(){  
   std::list<Class> myList;  
   //...  
   for(std::list<Class>::iterator iter = myList.begin();iter !+ myList.end()iter++){  
    //..  
   }  
  std::tr1::smart_ptr<Class> myInstance(new Class());  
 }

This type of code layout can get very confusing very fast.
If you just add 3 typedef's at the beginning of this code, you'll make the code much easier to read.

 typedef std::list<Class> MyClassList;  
 typedef std::list<Class>::iterator MyClassIterator;  
 typedef std::tr1::shared_ptr<Class> MyClassPtr;

Using these typedefs, I'll show you what that code sample above will look like:

 void foo(){  
   MyClassList myList;  
   //...  
   for(MyClassIterator iter = myList.begin();iter !+ myList.end()iter++){  
    //..  
   }  
  MyClassPtr myInstance(new Class());  
 }

That's much better, isn't it?

Saturday, January 21, 2012

Generic 2D collision system.

Today I want to explain the system I use in my engine for collision detection.
I tried to keep the collision system as generic as possible.

My engine provides a class called "CollisionComponent". You can add one or more components to a game objects to make them collidable. (Collision detection is done via collision rectangles).

To make the collision handling as generic as possible I decided to realize the handling "response" (the point where it'll get decided what to do with the collided objects) with function pointers.

You can "subscribe" gameobjects via typename (String) or a specific id (int) and with a pointer to a function that receives two pointer to game objects(first is the object that did "the impact" and the second is the object that got hit) and returns void.

Here's a little code snippet to make it more clear how that works:

 void collision_handler(GameObject *object1,GameObject *object2)  
 {  
    //...Do fancy stuff here  
 }  
 int main()  
 {  
    GameObject object1("Type1");  
    GameObject object2("Type2");  
    //...  
    //...  
    CollisionComponent *collComp = new CollisionComponent();  
    CollisionComponent *collComp2 = new CollisionComponent();  
    object1.addComponent(collComp);  
    object2.addComponent(collComp2);  
    //..  
    //subscription with type  
    CollisionManager::subscribe(object1.getType(),object2.getType(),collision_handler);  
    //..  
    //subscription with id  
    CollisionManager::subscribe(object1.getID(),object2.getID(),collision_handler);  
    //..  
 }

The difference between subscription via type and id is that multiple game object can have the same type (for very generic collision handling like "bullet hits player" or "weapon hits enemy").If you want to make more specific collision handling you can use the subscribe function that uses ids because the ids of the objects are unique.

The only problem is that each tick each game object gets checked with every other game object.
That's not very performant, but I don't know how to do it better, yet.

Monday, January 9, 2012

Type conversion using template functions

Yesterday I came up with a new way to make type conversions prettier.
I want to explain you how that works in this post.

As you might know, I work with a component based object model in my engine (see here).
In this kind of object model, you sometimes need to modify the component of certain objects after you added them.

To do that I implement a getComponent() function, which - in the past - returned a object of the type of the component interface (called IComponent). Type conversions needed to be made afterwards to work with the desired implementation. That used to look like this:

 GameObject gameobject;  
 RenderableComponent *renderableComponent = new RenderableComponent("graphic.png");  
 gameobject.addComponent(renderableComponent);  
 //Somewhere else in the code in another function.  
 IComponent *component = gameobject.getComponent("RenderableComponent");  
 RenderableComponent *renderableComponent = static_cast<RenderableComponent*>(component);  
 //.. do stuff with renderablecomponent.

I found that code pretty ugly and unnecessary complicated, so I thought of another idea to do that user friendlier. The first step I did was to change the getComponent() function to a template function.

The function looked like this:

 IComponent *GameObject::getComponent(const String &componentName){  
      for(std::list<IComponent*>::iterator i = this->m_components.begin();i != this->m_components.end();i++){  
           if((*i)->getType() == componentName){  
                return (*i);  
           }  
      }  
      return NULL;
 }

With template support it now looks like this:

 template<class T>  
 T *getComponent(const String &componentName){  
      for(std::list<IComponent*>::iterator i = m_components.begin();i != m_components.end();i++){  
           if((*i)->getType() == componentName){  
                return static_cast<T*>((*i));  
           }  
      }  
      return NULL;  
 }

(Hint: If you don't know nothing about template functions, check here)

As you can see, the user determines the return type via the template type ( So there's no ugly conversion needed afterwards.)

The result is (in my opinion) much nicer source code.
The following is the above example reused with the new getComponent() template function:

  GameObject gameobject;   
  RenderableComponent *renderableComponent = new RenderableComponent("graphic.png");   
  gameobject.addComponent(renderableComponent);   
  //Somewhere else in the code in another function.   
  //IComponent *component = gameobject.getComponent("RenderableComponent");   
  //RenderableComponent *renderableComponent = static_cast<RenderableComponent*>(component);  
 RenderableComponent *renderableComponent = gameobject.getComponent<RenderableComponent>("RenderableComponent");   
  //.. do stuff with renderablecomponent.

I don't know what about you, but I find it much nicer ;) .

Compared with a macro like:

 #define GET_COMPONENT(type)(getComponent<type>(#type))

The example could even get more improved to look like this:

  GameObject gameobject;    
  RenderableComponent *renderableComponent = new RenderableComponent("graphic.png");    
  gameobject.addComponent(renderableComponent);    
  //Somewhere else in the code in another function.    
  //IComponent *component = gameobject.getComponent("RenderableComponent");    
  //RenderableComponent *renderableComponent = static_cast<RenderableComponent*>(component);   
  RenderableComponent *renderableComponent = gameobject.GET_COMPONENT(RenderableComponent);  
  //.. do stuff with renderablecomponent.