Yet another dev blog...: Game Engine

Showing posts with label Game Engine. Show all posts

Monday, October 14, 2013

Memory management 2/2

Wow after a year I finally manage to write this post.

Well, as promised we'll take a look at some memory management implementations (stack allocator, block allocator, pool allocator)

Motivation

Ok, so what's the motiviation behind all this? Why would we even need something like memory management? What's wrong about new and delete (or malloc and free if you're a C programmer)?

Well basically there's nothing wrong with using new and delete. It all depends on your use-case, code base complexity and overall performance.

Many developer out there will tell you to use malloc/new as little as possible and that you're doomed if you won't listen to them. Well that is probably true for AAA title but if you're just doing a small indie game or are just getting started with a little game then there's IMO nothing wrong with using malloc/free or new/delete.

If you're however working on a bigger game or even a console game, memory management will be unavoidable.

The reason for that is primary performance and secondary memory profiling/organisation (e.g. to make sure each subsystem is within its memory budget).

Whenever you're requesting memory by calling new/malloc, your operation system will most likely switch into so called 'Kernel Mode' to search for free memory in order to satisfy your memory request (some operation systems however implement malloc/new in a way that they'll get a bigger block of memory than you requested and will work with that block as long as it has enough free memory to satisfy your request - however to get that memory even these implementation have to switch to 'Kernel Mode' though they won't do it for each time you call malloc/new)

The switch from 'User Mode' to 'Kernel Mode' and back from 'Kernel Mode' to 'User Mode' is something that is considerably slow. If you're having a high amount of allocations per frame you'll run into serious performance problems using raw new/malloc to allocate memory.

Solution

The solution for this problem is - once you know it - pretty simple.

I'll only tell you how I implemented memory management in my engine. Other engines / games / projects will probably use different implementations / strategies...So don't get too fixed on this ;)

The solution is - in short - to just allocate a big block of memory using malloc/new and work with this block.

So instead of calling malloc/new each time you need memory, use the memory from one big block of memory which'll get allocated via malloc/new at the beginning of your application (for example). The costs from switching to 'Kernel Mode' and back will be eliminated by that and you can use some neat memory allocation pattern to make the best use of the allocated memory block.

Another benefit by this approach is IMO that you can exactly control how much memory your application allocates. So if you're working on a console with very limited memory, you can just allocate all available memory at the beginning of the application and then use that to satisfy all your allocations. Once all memory has been allocated and an allocation can not be satisfied you can just print a message and return NULL. (most consoles will probably just crash once you've requesting more memory than the console has to offer, so printing a message is definately a better way than just crashing)

The big question now is "what do we do with the memory block once we allocated it?" unfortunately you can't call malloc/new on that block to get slices of memory from it. What you'll have to do is implement your own memory allocation pattern to get memory out of that block.

Memory allocation pattern is what the next chapter will be about.

Memory allocation patterns

Stack Allocator

The easiest pattern to implement would be the stack allocator.

Basically the stack allocator just consists of a pointer to the current position in the memory block - it's like an offset from the start of the memory block.

For each allocation request, the stack allocator will return the current pointer. After satisfying the request the pointer will get increased by the size of the memory request.

Example:
At the beginning the memory offset of the stack allocator will be 0. So the current pointer will be pointing to the memory at the beginning of the memory block (as the current pointer = start of memory block + offset)

Let's assume the user wants to allocate 10 byte from the stack allocator. What happens is that the current pointer will get returned and the offset will get increased by the size of the requested memory (in this example the size would be 10).

To clarify that example here's some code:

 byte* beginning_of_memory_block = getMemoryBlock();  
 byte* requested_memory = beginning_of_memory_block + offset;   //offset = 0
 offset += sizeof(requested_memory); //10  
 byte* current_pointer = beginning_of_memory_block + offset;  //offset = 10
 setCurrentPointer(current_pointer);  
 return requested_memory

This is simplified pseudo-code to illustrate the concept of the stack allocator.

Deallocation is even simpler. Just decrease the pointer by the size of the deallocated memory.

Note: Deallocation has to be in the reverse order of the allocation order. That's the big contra of using a stack allocator. You can't allocate and deallocate memory in arbitrary order!

Pool allocator

The pool allocator is also not that complicated to implement.

The theory about a pool allocator is it'll split a memory block into memory chunks of a specific size (for example the size of a user defined class). Each time you request memory, the pool allocator will go through its chunks and check if there's a free chunk - if there is, it will return that chunk.

There are multiple ways to implement a pool allocator.

One of the simpler implementations would be to use an array.

If you were to write a pool allocator class, you could use C++ templates to create that array on the stack.

 template<unsigned int SizePerObject,unsigned int AmountObjects>  
 class PoolAllocator  
 {  
 //... Costructor, Destructor etc.  
 private:  
    unsigned char m_Memory[SizePerObject * AmountObjects];  
 };

To keep track of used chunks you could use another array that stores a boolean value per chunk to indicate whether the chunk is free or not

 template<unsigned int SizePerObject,unsigned int AmountObjects>  
 class PoolAllocator  
 {  
 //... Costructor, Destructor etc.  
 private:  
    unsigned char m_Memory[SizePerObject * AmountObjects];  
    bool m_FreeChunks[AmountObjects];  
 };

If the user wants to allocate memory from the PoolAllocator you could implement it in a way that the allocate function iterates over the array until it finds a chunk that is free (m_FreeChunks[i] == true). You could then use the position of the free chunk to get the memory from the memory block itself.

 for(int i = 0;i < AmountObjects; ++i)  
 {  
    if(m_FreeChunks[i])  
    {  
      m_FreeChunks[i] = false;  
      return m_Memory[SizePerObject * i];  
    }  
 }  
 printf("StackAllocator could not satisfy memory request.");  
 return NULL;

Deallocation is - like with the stack allocator- simpler than allocation.
For deallocation you just need to set the specific chunk the memory came from to 'not used' (m_FreeChunks[i] = true in our example).

Note: Due to its layout, a pool allocator can be used to allocate and deallocate objects in arbitrary order. However, the pool allocator can not be used to allocate memory of arbitrary size!

Block allocator

The block allocator is the best of both worlds. Allocations/deallocations can be made in arbitrary order and in arbitrary size. But before you get too euphoric and forget about the two previous allocator pattern, note that the algorithm for allocating memory is more complex than the simplistic allocation algorithm of the pool and stack allogator. Therefore, excessive allocation/deallocation using the block allocator will be considerably slower than using a pool or stack allocator.

The theory behind the block allocator is to split the underlying memory block into several smaller memory blocks. But unlike the pool allocator we don't do that at the beginning using a constant size, but rather during allocation with a variable size.

The used blocks will get saved in a internal array (or linked list) and on deallocation the block allocator tries to merge multiple deallocated blocks into a larger block, so that we'll end up with - again - one large memory block when all allocated memory from the block allocator has been deallocated.

 struct MemoryBlock  
 {  
    unsigned int StartPosition;  
    unsigned int EndPosition;
    bool Free;
 };  
 class BlockAllocator  
 {  
    //Constructor, destructor etc...  
 private:  
    unsigned char* m_Memory;  
    MemoryBlock* m_Blocks[AMOUNT_MEMORY_BLOCKS];  
 };

This is a simplified declaration of a MemoryBlock struct and the BlockAllocator class.

The algorithm for allocating memory could look like this

 for(int i = 0;i < AMOUNT_MEMORY_BLOCKS;++i)  
 {  
    if(m_MemoryBlocks[i]->Free)  
    {  
      unsigned int newEndPos = m_MemoryBlock[i]->StartPosition + sizeof(AllocatedObject);  
      m_MemoryBlock[i]->EndPosition = newEndPos;  
      m_MemoryBlock[i]->Free = false;  
      createMemoryBlock(); //create new memory block  
      return m_Memory + m_MemoryBlock[i]->StartPosition; //return memory  
    }  
 }  
 printf("Block allocator - can not satisfy memory request");  
 return NULL;

Again, this is just pseudo code ;).

I hope the informations provided in this and the last post was enough to give you a basic understanding of memory management in C++ and why it is important to know how to implement your own allocator.

If you still have questions or don't understand my examples, feel free to contact me.

In the next post I'll probably talk about the interaction between your game engine and an external game editor.

Friday, June 8, 2012

The next generation?

I just saw a video about the new Unreal Engine 4.
The video was about a technical artist showing what you can do with the
UED (Unreal Editor) 4.

Everything in the video looks very promising, but what caught me the most is what happens at the end of the video (at about 9:40). The guy showing the engine opens Visual Studio from within the UED and changes some c++ code (in the video he's manipulating the player's jump height).

When he returns to the game a little message appears saying "Compiling C++ Code...".
After a few seconds a new message appears saying "C++ Code successfully compiled".

After that he was jumping twice as high as before.
This is IMO one of the most astonishing things I've ever seen. In that way you could use C++ as a scripting language...A very powerful one.

I have absolutely no idea how they've implemented that, but that feature is just amazing and I'm eager to find out how that works ;).

Monday, December 5, 2011

Make your Engine more dynamic!

When I started to program my Engine, I wanted to create a somewhat portable Engine.
I wanted to support different libraries like OpenGL, OpenAL or DirectX to make my Engine run under Windows and other platforms.

My first attempt was to differentiate with the Preprocessor variables given by the OS
(For example _WIN32 under Windows).

That worked, but was somewhat ugly...

It was indeed so ugly that I decided to give it another try.
So I split everything into separate dynamic libraries.

At the end I had one library for every replaceable feature of my engine.
Here are some of them:

K15_GraphicModule.dll / K15_GraphicModule.so
K15_SoundModule.dll / K15_SoundModule.so
K15_InputQueueModule.dll / K15_InputQueueModule.so
They were implement as subsystems.

I'll try to explain the idea behind the subsystems by the example of the GraphicModule.

The GraphicModule is, for example, a subsystem of the Engine's GraphicManager.
The GraphicManager will try to find a "K15_GraphicModule.dll" or "K15_GraphicModule.so" (depends on the OS) in it's initialize function. When it finds the dynamic library, it loads it and tries to perform the function "createGraphicModule()" within the library.

The function createGraphicModule will return an implementation of a GraphicModule interface that is declared inside the Engine's library file.

Here's some code to clarify my example.

 GraphicManager::GraphicManager  
 {  
      typedef createGraphicModule IGraphicManagerModule* (*createGraphicModule)(void);  
      DynLib graphicLib("K15_GraphicModule") // load library (without extension...Determined by the current OS)  
      if(graphicLib.loadedSuccessful()){ //Check if library has been loaded.  
           createGraphicModule func = (createGraphicModule)graphiclib.getFuction("createGraphicModule");  
           if(func != NULL){ // function loaded?  
                this->m_subsystem = func(); //get Subsystem from dynamic library.  
           }  
      }  
 }

This system allows me to switch between OpenGL and DirectX just by replacing the dynamic library files...Or to switch from Windows File Manager to some Unix File Manager just by replacing the library file.

I just implemented this system about a few days ago, so I don't know how well that will work in an actual game...But it's only a matter of time 'till I know it ;-)

Monday, June 27, 2011

Particle implementation.

Hi internet!

I've just successfully finished my oral exam (damn that sounds dirty).
So my time as an apprentice is officially over, which means I have more time now to focus on my 2D engine.

In this post I want to show you how I implemented the particle system in my engine using the decorator pattern.

First some UML:

As you might see here I decided to derive the particle emitter into a finite and an infinite subclass. The finite emitter stopps after it spawned n particle (determined by the setParticleAmount() method) whereas the infinite emitter runs as long as isStarted() return true.

The most difficult thing was to cover all the things a particle system must do (e.g. transparency, spreading, collision detection, etc.).

To provide all those contingencies I decided to use the decorator pattern (I just read the book Head First Design Patterns (a really good book), so all that stuff was still in my mind ;) ).

The particle system runs pretty well, although some of the subsystems (especially the transparency) don't work as expected yet.

Sunday, June 5, 2011

Personal point of view (game engine).

(This post is absolutely IMHO so if you disagree with some points, that's just fine ;) )

Since I'm registered in the GameDev.net forum I noticed many people who were quite clueless wether they should make their own game engine or use a premade one.

I kept asking these questions myself back in the days I started game development and so I though it would be a good idea to make a little post about my personal point of view when it comes to game engines.

Before I start, I'll give you some examples of what I'd like to call good premade game engines:

Ogre3D (which is open source)
Unreal Development Kit (which is free for non commercial games (I was surpised when I heard about this))
Unity3D
RPG Maker (Which I made my first game with back in 2001...Man, that game was shitty!)

Check out the websites of these engines to learn more about them.

The first thing you should ask yourself when it comes to choose between selfmade and premade is why you want to make a game (engine).

Are you doing it for the sake of learning or just to, well..., make a game?
And, secondly, are you just doing it for fun or to make some sort of portfolio to get your foot into the industry (and if so,what kind of job do you wish to get? Developer, designer, etc.)?

Thats the points you should be certain about.

If you just want to make a game and don't care about the stuff going on in the background you should be fine with a premade engine. The learning curve should be much higher compared to a selfmade engine. (If you wish to work as a designer even a level editor like the Hammer editor from Valve (which you can get on Steam), or the Unreal ED (which is included in the above mentioned UDK and a part of nearly every installation of a Unreal related game (Unreal Tournament,Unreal)) should be enough to improve your portfolio).

I know that many people disagree with me on this point, but if you are interested in how things work (and especially as a developer) you should definitely start with your own engine. I learned tons of stuff since I started to work on my own 2D engine. Of course, programming your own engine shouldn't be your first project. I would recommend you to start with 2-3 simple games like Pong,Tetris or Pacman(especially Pacman is harder to realize then you might think at first) to get a feeling what components it takes to make a game.

If you are interested in starting your own engine, I can recommend the following books:

Game Coding Complete (By Ultima Online developer Mr. Mike)
Game Engine Architecture (By Naughty Dog employee Jason Gregory)

If you're stuck at one point and definitely have no idea what's wrong you should register in the above mentioned GameDev forum and search for your issue...Maybe some guy had the same issue before.

Tuesday, May 3, 2011

Thoughts about 2D animations.

I know that I shouldn't think of such things like animation systems (considering that I'll write my final exam tomorrow) but it's just something that got to my mind yesterday while I was falling asleep.

Actually I'm using a fairly simple approach for animations.
To create a animated graphic in my engine you got to use the following syntax (The parameters should be self explanatory):

  GraphicsFactory::createAnimatedGraphic(Uint32  animationStepTimeInMilliseconds,Uint8 animationSteps,bool autostart)

After you created the object you can add a graphic to it via:

 animatedGraphicObject.loadGraphic("graphic.png");

(I know its a strange way, but I'll stick with this for now)

When the graphic gets drawn, only the current animationframe will get drawn.

Example:

This System works great for a normal walkcycle like this, but if you have an animation like this:

The system isn't the best solution.

The reason for that is the bounding box (bounding boxes are used for collision detection and are almost always as big as the animation frame) of each animation frame. In the walkcycle example the size of each frame never changes (its always 24x32), but if you see at the second example, especially this frame:

you'll might notice that this frame (109x49) is way bigger than the other frames (62x44 and 80x44).

If the above mentioned system is used, the bounding box of each frame is equally to the bounding box of the biggest frame. (I hope you get what I'm trying to say)

This is a big disadvantage IMHO because if a bounding box collision is detected, the collision system goes into pixel perfect collision deteciton (at least in my engine), which is way more expensive than the bounding box collision.

To avoid this problem I thought of the following approach.

At the point where the user determines the amount of animation frames, the size of each animation frame is known. (Width of the graphic / amount of animation frames = size of the biggest animation frame).

The next step would be to check each pixel of each animation frame for its color value to determine the real bounding box of each animation frame by comparing the color value of the current pixel with the color value of the images color key (The color key is the color in the image that is invisible later in the actual game).

After that step, each animation frame can be packed into an array, so you get a nice animation with perfectly fitting bounding boxes for each frame :)

Sunday, March 27, 2011

Graphic creation technique finished!

The content of this video is just me testing the AbstractAnimatedGraphic class.
The graphicfile used for this animation is the following:

The animation is just yoshi, turning around his own axis.
In the video you can see that it is possible to regulate the animationspeed by
key pressing (well you can't see the actual key press).
This feature could be used for fancy slow-motion effects ;)

I also just finished the graphic creation factory.
My engine will support 2 different graphic modes.

Pure SDL (DirectDraw under Windows)
OpenGL

When the function getNewGraphic() from the class "GraphicFactory" get's called,
the engine will check specific configuration flags (which you can change on runtime)
to determine the derived type of the returned AbstractGraphic object (AbstractGraphic is the class that SDLObject and OpenGLObject inherit from).All created graphicObjects are saved to an list to support easy conversion on the fly if the user wants to switch the graphics mode during gameplay.

It's time to sleep now.
I had to work the whole weekend and didn't got much sleep.