October 21, 2016
Hot Topics:

More Game Coding Tidbits and Style That Saved My Butt

  • April 1, 2005
  • By Mike McShaffry
  • Send Email »
  • More Articles »

This article picks up where Coding Tidbits and Style That Saved My Butt left off. In this installment I will focus on memory and how it can effect your games.

Using Memory Correctly

Did you ever hear the joke about the programmer trying to beat the Devil in a coding contest? Part of his solution involved overcoming a memory limitation by storing a few bytes in a chain of soundwaves between the microphone and the speaker. That's an interesting idea, and I'll bet we would have tried that one on Ultima VII had someone on our team thought of it.

Memory comes in very different shapes, sizes, and speeds. If you know what you're doing you can write programs that make efficient use of these different memory blocks. If you believe that it doesn't matter how you use memory, you're in for a real shock. This includes assuming that the standard memory manager for your operating system is efficient; it usually isn't and you'll have to think about writing your own.

Understanding the Different Kinds of Memory

The system RAM is the main warehouse for storage as long as the lights are on. Video RAM or VRAM is much smaller and is specifically used for storing objects that will be used by the video card. On top of it all, virtual memory hides a hard disk behind your lightning fast system RAM, and if you're not careful a simple memcpy() could cause the hard drive to seek. You might as well pack up and go to lunch if this happens.

System RAM

Your system RAM is a series of memory sticks that are installed on the mother board. Memory is actually stored in nine bits per byte, with the extra bit used to catch memory parity errors. Depending on the OS, you get to play with a certain addressable range of memory. The operating system keeps some to itself. Of the parts you get to play with, it is divided into three parts when your application loads:

  • Global memory: This memory never changes size. It is allocated when your program loads and stores global variables, text strings, and virtual function tables.
  • Stack: This memory grows as your code calls deeper into core code and it shrinks as the code returns. The stack is used for parameters in function calls and local variables.
  • Heap: This memory grows and shrinks with dynamic memory allocation. It is used for persistent objects and dynamic data structures.

Old timers used to call global memory the DATA segment, harkening back to the days when there used to be near memory and far memory. It was called that because programmers used different pointers to get to it. What a disgusting practice! How I miss 16-bit segmented memory architectures. Not! Everything is much cleaner these days because every pointer is a full 32 bits. (Don't worry, I'm not going to bore you with the "When I went to school I used to load programs from a linear access tape cassette" story.)

Your compiler and linker will attempt to optimize the location of anything you put into the global memory space based on the type of variable. This includes constant text strings. Many compilers, including Visual Studio, will attempt to store text strings only once to save space:

const char *error1 = "Error";
const char *error2 = "Error";

int main()
   printf ("%x \n", (int)error1);
//How quaint. A printf.
   printf ("%x \n", (int)error2);
   return 0;

This code yields interesting results. You'll notice that under Visual C++, the two pointers point to the same text string in the global address space. Even better than that, the text string is one that was already global and stuck in the CRT libraries. It's as if we wasted our time typing "Error." This trick only works for constant text strings, since the compiler knows they can never change. Everything else gets its own space. If you want the compiler to consolidate equivalent text strings, they must be constant text strings.

Don't make the mistake of counting on some kind of rational order to the global addresses. You can't count on anything the compiler or linker will do, especially if you are considering crossing platforms.

On most operating systems, the stack starts at high addresses and grows towards lower addresses. C and C++ parameters get pushed onto the stack from right to left—the last parameter is the first to get pushed onto the stack in a function call. Local parameters get pushed onto the stack in their order of appearance:

void testStack(int x,int y)
   int a = 1;
   int b = 2;

   printf("&x= %-10x &y= %-10x \n", &x, &y);
   printf("&a= %-10x &b= %-10x \n", &a, &b);

This code produces the following output:

&x= 12fdf0    &y= 12fdf4
&a= 12fde0    &b= 12fdd4

Stack addresses grow downward to smaller memory addresses. Thus it should be clear that the order in which the parameters and local variables were pushed was: y, x, a, and b. The next time you're debugging some assembler code you'll be glad to understand this, especially if you are setting your instruction pointer by hand.

C++ allows a high degree of control over the local scope. Every time you enclose code in a set of braces you open a local scope with its own local variables:

int main()
   int a = 0;

   {    //start a local scope here...
      int a = 1;
      printf("%d \n", a);

   printf("%d \n", a);

This code compiles and runs just fine. The two integer variables are completely separate entities. I've written this example to make a clear point, but I'd never actually write code like this. Opening up a local scope just to reuse a variable name is something they shoot programmers for down here in Texas. The real usefulness of this kind of code is for use with C++ objects that perform useful tasks when they are destroyed—you can control the exact moment a destructor is called by closing a local scope.

Video Memory (VRAM)

Video RAM is the memory installed on your video card, unless we're talking about an Xbox. Xbox hardware has unified memory architecture or UMI, so there's no difference between system RAM and VRAM. It would be nice if the rest of the world worked that way. Other hardware such as the Intel architectures must send any data between VRAM and system RAM over a bus. The PS2 has even more different kinds of memory. There are quite a few bus architectures and speeds out there and it is wise to understand how reading and writing data across the bus affects your game's speed.

As long as the CPU doesn't have to read from VRAM everything clicks along pretty fast. If you need to grab a piece of VRAM for something the bits have to be sent across the bus to system RAM. Depending on your architecture, your CPU and GPU must argue for a moment about timing, stream the bits, and go their separate ways. While this painful process is occurring, your game has come to a complete halt.

The hard disk can't write straight to VRAM, so every time a new texture is needed you'll need to stop the presses, so to speak. The smart approach is to load up as many new textures as you can, hopefully limiting any communication needed between the CPU and the video card.

Practice Best

Never perform per pixel operations on data stored in VRAM. If you can, keep a scratch copy in system RAM. If you must do something weird to VRAM, copy the whole thing into system RAM and perform the operation there. When you're done, copy it back to VRAM in one shot, hopefully using an asynchronous copy if your graphics library supports it. Under DirectX, the NO_WAIT flag is the ticket for an asynchronous Blt(). The exception to this rule: If your game can require the latest video cards, most per pixel operations can be programmed in pixel shaders.

If you've been paying attention you'll realize that the GPU in your video card is simply painting the screen using the components in VRAM. If it ever has to stop and ask system RAM for something, your game won't run as fast as it could. Of course, if the CPU never sent anything different for the video card to draw, your game would be pretty boring, unless you like watching images that never change.

Tales from the Pixel Mines

The first texture manager I ever wrote was for Ultima IX. That was before the game was called Ultima: Ascension. I wrote the texture manager for 3DFx's Glide API, and I had all of an hour to do it. We wanted to show some Origin execs what Ultima looked like running under hardware acceleration. Not being the programmer extraordinare, my algorithm had to be pretty simple. I chose a variant of LRU, but since I didn't have time to write the code to sort and organize the textures, I simply threw out every texture in VRAM the moment there wasn't any additional space. I think this code got some nomination for the dumbest texture manager ever written, but it actually worked. The Avatar would walk around for ninety seconds or so before the hard disk lit up and everything came to a halt for two seconds. I'm pretty sure someone rewrote it before U9 shipped. At least, I hope someone rewrote it!

Page 1 of 4

Comment and Contribute


(Maximum characters: 1200). You have characters left.



Enterprise Development Update

Don't miss an article. Subscribe to our newsletter below.

Sitemap | Contact Us

Thanks for your registration, follow us on our social networks to keep up-to-date
Rocket Fuel