Introduction
Memory allocation looks like an easy subject, but only apparently so: memory management has always been a notorious source of trouble in C/C++. The problem is even more acute on platforms with (very) limited memory, such as BREW.
Both heap and stack memory are scarce resources in BREW—phones’ specifications are extremely convincing documents in this respect. That’s why memory allocation (terminology note: allocation implies deallocation) is of paramount importance. Our coverage of this subject will be guided by the patterns described in [1], mentioned throughout the article in capitals.
We will review the main techniques and some of the ‘usual suspects’ and offer possible solutions to memory fragmentation, access speed, and exhaustion issues.
Handling Memory
Automatic storage (MEMORY DISCARD) plays a central role in all handling idioms and has definite advantages:
- It is fast (as this is basically compile time memory—all allocations are merely pointer arithmetic).
- It is safe to use (provides automatic allocation/deallocation).
Being so fast should make it very appealing for BREW use (special allocators have been designed that exploit this memory for very fast access [2]). Unfortunately, this is not the BREW way: the stack is so small that recursion is not recommended and substantial memory allocations have to be done from the heap. Still, the automatic invocation of C++ destructors makes stack allocation an indispensable tool for safe resource manipulation, better known as RAII (the "resource acquisition is initialization" technique, TC++PL3 section 14.4 [3]). In previous articles, we presented several examples using this technique; BrewVector and BrewString are, first of all, safe wrappers around memory pools. Later in this article, we'll see how we can generalize this technique and use it to make memory deallocation more efficient.
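To recall the idea, here is a minimal RAII sketch (ours, for illustration only; BrewVector and BrewString are more complete examples): a wrapper that lives in automatic storage and owns a heap block obtained through the MALLOC/FREE helpers, releasing it no matter how the enclosing scope is left.

// Minimal RAII sketch: the wrapper itself uses automatic storage, the
// buffer it owns comes from the heap and is released in the destructor.
class HeapBlock
{
public:
    explicit HeapBlock(size_t sz) : p_(MALLOC(sz)) {}
    ~HeapBlock() { if (p_) FREE(p_); }   // deterministic release
    void* get() const { return p_; }
private:
    void* p_;
    HeapBlock(const HeapBlock&);                   // not copyable
    const HeapBlock& operator=(const HeapBlock&);
};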
MEMORY DISCARD usage is based on placement new [4]. Please note that different signatures are possible for placement new:
void* operator new(size_t nbytes, Allocator* pool);
void* operator new(size_t nbytes, Allocator& pool);
void* operator new[](size_t nbytes, Allocator* pool);
void* operator new[](size_t nbytes, Allocator& pool);
There is a lesser-known placement delete that matches placement new:
void operator delete(void* v, Allocator* pool);
// etc.
If an exception is thrown during object initialization as part of a placement new call (for example, while the constructor of a class object runs), a matching placement delete is called, with the same additional arguments and values that were passed to placement new.
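To make the pairing concrete, here is a sketch of such a matched pair (assuming an Allocator class exposing allocate/deallocate members, in line with the signatures above and with the BrewAllocator shown later):

// Matched placement new/delete pair for a pool-style Allocator.
// If the constructor invoked by "new (pool) AA" throws, the compiler
// automatically calls the matching placement delete below with the same
// pool argument, so the raw memory is not leaked.
void* operator new(size_t nbytes, Allocator* pool)
{
    return pool ? pool->allocate(nbytes) : 0;
}

void operator delete(void* v, Allocator* pool)
{
    if (pool && v)
        pool->deallocate(v);
}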
FIXED ALLOCATION is a very interesting technique, based on allocating all the necessary memory at the beginning of the program and releasing it at the end, so there is no need for temporary or intermediate memory allocations. There are definite benefits here:
- It is safe—obviously this is an all-or-nothing technique.
- It is deterministic and fast—no heap allocations.
Of course, in most cases this idiom is unacceptable on its own, but FIXED ALLOCATION can easily be combined with other techniques. Possible scenarios are based on so-called "rainy day funds": pre-allocated pools of memory that guarantee the execution of critical paths of code. An interesting example reported in [1] is pre-allocating all the objects needed for a critical operation (dialing 911), thus guaranteeing its availability. As explained later, FIXED ALLOCATION is also central to the instrumentation of allocators.
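A minimal sketch of the "rainy day fund" idea (the class name and interface are ours, for illustration only): a block reserved at start-up and surrendered to the heap only when a critical operation has to run under memory pressure.

// FIXED ALLOCATION sketch: memory reserved up front and given back to the
// heap only when a critical code path needs to allocate under pressure.
class RainyDayFund
{
public:
    explicit RainyDayFund(size_t sz) : reserve_(MALLOC(sz)) {}
    ~RainyDayFund() { spend(); }
    // return the reserve to the heap so the critical operation's own
    // allocations can succeed
    void spend()
    {
        if (reserve_) {
            FREE(reserve_);
            reserve_ = 0;
        }
    }
private:
    void* reserve_;
    RainyDayFund(const RainyDayFund&);
    const RainyDayFund& operator=(const RainyDayFund&);
};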
VARIABLE ALLOCATION is synonymous with what is commonly known as heap allocation. This is the technique most used in BREW, based on helper functions from the MALLOC/FREE family or on global/class new/delete implemented on top of the same MALLOC/FREE. There are a number of issues with VARIABLE ALLOCATION:
1. VARIABLE ALLOCATION usually incurs memory overhead as well as extra processor time.
2. Memory exhaustion.
3. Memory fragmentation.
Items 2 and 3 are of particular interest to our discussion, but some of the techniques presented are pertinent to Item 1, too. Memory exhaustion is generally strongly related to deallocation (or lack of it) and it will be analyzed later.
Memory fragmentation is a significant problem with VARIABLE ALLOCATION. There are two kinds of fragmentation:
- Internal—memory waste inside an allocated structure
- External—memory waste between two allocated structures, as this space is too small to be reused
Interestingly enough, internal and external fragmentation are in balance; in other words, trying to decrease external fragmentation results in an increase in internal fragmentation, complicating the matter even more.
Obviously, the main concern is external fragmentation and there are different strategies to deal with it, all implying first of all that there is a way to measure fragmentation:
- Do nothing if you are not affected. This is the simplest advice and the most effective for small programs or when other allocation techniques (see FIXED ALLOCATION) are available.
- Try to keep allocations/deallocations together.
- Avoid interleaving large blocks with small ones.
- Avoid allocating other objects before releasing large temporaries (see Qualcomm BREW Online Knowledge Base for an example). That’s why realloc might not be your friend.
- Use special techniques. And here is where allocators come in handy. Usually, memory fragmentation is the result of small blocks (usually temporaries) interleaving with larger blocks and blocking their reuse. Allocating small objects from a separate heap allows more control over the process. Another important raison d'être of allocators is speed: standard allocators can be replaced with fine-tuned, fast memory factories. On top of this, using specialized allocators for small objects has the added benefit of avoiding the extra per-object overhead (see [5] for a description and a full implementation). MEMORY DISCARD and POOLED ALLOCATION are examples of such techniques.
Code Examples
Following are some examples of allocators dealing with different facets of memory fragmentation.
First of all, we have to mention that the BREW API offers a memory pool that can be used as an allocator—IRscPool [4]. But from a memory fragmentation perspective, IRscPool is not usable because the pool is shared with the main heap.
Next is a simple example of a temporary heap, used mainly for allocating temporary objects. The idea is to allocate temporary objects from a predefined pool, making allocation as fast as stack allocation and reducing the potential for fragmentation. Allocated objects are discarded in one block, simply by moving the "stack" pointer back to its initial position. Deallocation is more problematic for objects holding external resources: because they were allocated by placement new, their destructors have to be called explicitly. We will see some possible answers to this problem later.
template <size_t POOL_SIZE>
class DiscardableHeap
{
public:
    // bump-pointer allocation from the fixed pool
    void* allocate(size_t sz)
    {
        if (allocated_ + sz >= POOL_SIZE)
            return 0;
        void* p = &pool_[0] + allocated_;
        allocated_ += sz;
        return p;
    }
    // discard everything at once by resetting the "stack" pointer
    void deallocate() { allocated_ = 0; }
    ~DiscardableHeap() {}
    DiscardableHeap() : allocated_(0) { pool_.ensureCapacity(POOL_SIZE); }
private:
    BrewVector<char> pool_;
    size_t allocated_;
private:
    DiscardableHeap(const DiscardableHeap& value);
    const DiscardableHeap& operator=(const DiscardableHeap& rhs);
};
It can be used, for example, as follows:
AA* a = new (th_) AA;
int j = a->guess();
a->~AA();
short* kk = new (rr_) short;
th_->deallocate();
kk = new (rr_) short[10];
An interesting allocator breed is the following:
template <size_t CHUNKS, size_t CHUNK_SZ = 8, size_t POOL_SIZE = CHUNKS * CHUNK_SZ>
class BrewAllocator;
BrewAllocator is more like a heap, managing its internal memory pool and allowing individual blocks to be deallocated and their memory reused. The pool is made of contiguous chunks of memory, which simplifies management and alignment (remember that a heap always has to return aligned memory). You can look at BrewAllocator as a generalization of the POOLED ALLOCATION concept. It can be used for small-object allocation as well as for general allocation purposes. Using BrewAllocator might incur internal fragmentation, as with any POOLED ALLOCATION, but this can easily be minimized (via CHUNK_SZ): with a CHUNK_SZ of 8, for example, a 20-byte request occupies three chunks (24 bytes), wasting only 4 bytes inside the allocation.
template <size_t CHUNKS, size_t CHUNK_SZ = 8, size_t POOL_SIZE = CHUNKS * CHUNK_SZ>
class BrewAllocator
{
public:
    bool isInPool(void* p)
    {
        return ((p >= pool_) && (p < pool_ + POOL_SIZE));
    }
    void* allocate(size_t sz) { return findBestFit(sz); }
    void deallocate(void* p) { restorePool(p); }
    ~BrewAllocator() { unregisterResources(); }
    BrewAllocator() : pool_(new char[POOL_SIZE])
    {
        resources_.ensureCapacity(CHUNKS);
        for (UINT i = 0; i < CHUNKS; ++i) {
            resources_.append(0);            // 0 means the chunk is free
        }
    }
private:
    void unregisterResources() { delete[] pool_; }
    // find the first free run of chunks large enough to hold sz bytes
    void* findBestFit(size_t sz)
    {
        if (!sz) return 0;
        void* fit = 0;
        size_t dif = sz / CHUNK_SZ;          // number of chunks needed
        if ((dif == 0) || ((sz % CHUNK_SZ) > 0)) dif++;
        if (dif > CHUNKS) return 0;
        UINT i = 0;
        while (i < resources_.size()) {
            if (resources_[i] == 0) {
                int idx = i;
                for (UINT k = 1; k <= dif; ++k) {
                    if (resources_[i] == 0) {
                        if (k == dif) {
                            fit = pool_ + CHUNK_SZ * idx;
                        }
                        else {
                            ++i;
                        }
                    }
                    else {
                        fit = 0;
                        break;
                    }
                }
                if (fit) {
                    resources_[idx] = dif;          // head chunk records the run length
                    for (UINT j = 1; j < dif; ++j) {
                        resources_[idx + j] = 1;    // continuation chunks
                    }
                    break;
                }
            }
            else {
                ++i;
            }
        }
        return fit;
    }
    // mark all chunks of the block starting at p as free again
    void restorePool(void* p)
    {
        int i = ((char*)p - pool_) / CHUNK_SZ;
        int dif = resources_[i];
        for (int k = 0; k < dif; ++k) {
            resources_[i + k] = 0;
        }
    }
private:
    char* pool_;
    BrewVector<int> resources_;
private:
    BrewAllocator(const BrewAllocator& value);
    const BrewAllocator& operator=(const BrewAllocator& rhs);
};
resources_ keeps track of allocated chunks, recording both whether a chunk is in use and how many chunks an allocation spans. A findBestFit strategy (far from perfect) selects the memory to hand out.
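Because resources_ makes the pool layout explicit, simple fragmentation metrics (recall the earlier remark about measuring fragmentation) are easy to add. As a hypothetical addition to BrewAllocator (not part of the listing above), the largest contiguous run of free chunks gives a crude measure of external fragmentation inside the pool:

// Hypothetical helper, not in the original BrewAllocator: size in bytes of
// the largest contiguous run of free chunks still available in the pool.
size_t largestFreeRun()
{
    size_t best = 0;
    size_t run = 0;
    for (UINT i = 0; i < resources_.size(); ++i) {
        if (resources_[i] == 0) {    // chunk i is free
            ++run;
            if (run > best) best = run;
        }
        else {
            run = 0;                 // run interrupted by an allocation
        }
    }
    return best * CHUNK_SZ;
}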
Using BrewAllocator is straightforward. Having defined:
typedef BrewAllocator<128> Allocator;
rr_ = new Allocator;

void* operator new(size_t nbytes, Allocator* pool)
{
    if (pool)
        return pool->allocate(nbytes);
    return 0;
}
one can write:
int* j = new (rr_) int();
// rr_->deallocate(j); if not deallocated explicitly, the memory is
// reclaimed automatically when the pool itself goes away
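For objects with non-trivial destructors, one simple convenience (ours, not part of the article's code) is a helper that pairs the explicit destructor call required by placement new with the pool's deallocate:

// Hypothetical helper: destroy an object allocated from a pool and return
// its chunks, pairing the explicit destructor call with deallocate().
template <typename T, typename Pool>
void destroy(T* p, Pool* pool)
{
    if (p) {
        p->~T();               // placement-new objects need an explicit destructor call
        pool->deallocate(p);
    }
}

It would be used as destroy(j, rr_); instead of calling the destructor and deallocate separately.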
Final Remarks
- Allocators usually follow the FIXED ALLOCATION idiom, being allocated at the very beginning and released at the very end of the program. They are usually singletons, meaning that the same instance is accessible all the time via a static call; there is no need to keep a reference to that particular instance. Unfortunately, this is not possible in BREW because no static data is allowed. A clean solution is to use the class instantiated directly by BREW (CPPApp in our examples) as a context, used mainly as a placeholder for different handlers and managers (see the sketch after these remarks). Please note that this is not a real C++ class but a POD structure, because BREW has no direct knowledge of any C++ artifacts, so using it as a repository is natural.
- Using C++ constructs in C ("a better C") or simply using some helper C++ classes as "libraries" definitely confers more robustness and clarity on the code. For example, the allocators described here can be used from C.
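As a minimal sketch of the context idea (the member name allocator_ is ours; CPPApp is the applet structure from the previous installments, assuming the usual layout where it starts with an AEEApplet member and pMe is the pointer BREW hands to the applet):

// Sketch only: the BREW-created applet structure doubles as the home of
// the "singleton" allocator, since static data is not available.
struct CPPApp
{
    AEEApplet  applet;      // standard BREW applet data, always first
    Allocator* allocator_;  // created at applet start-up, deleted at exit
};

// at start-up:  pMe->allocator_ = new Allocator;
// at exit:      delete pMe->allocator_;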
What’s Next?
The main topic of this article was memory allocation and how to avoid fragmentation problems. The next installment will discuss another interesting subject—memory exhaustion. We will present some useful idioms automating deallocation together with other helpful ways of avoiding fragmentation.
References
[1] James Noble, Charles Weir: Small Memory Software: Patterns for Systems with Limited Memory. Addison-Wesley, 2001
[2] Matthew Wilson: "Efficient Variable Automatic Buffers," C/C++ Users Journal, December 2003
[3] Bjarne Stroustrup: The C++ Programming Language. Addison-Wesley, 2000
[4] "For Brew Developers, There Are New Kids in Town: IThread & IRscPool," http://www.developer.com/ws/brew/article.php/3085401
[5] Andrei Alexandrescu: Modern C++ Design. Addison-Wesley, 2001