http://www.developer.com/

Back to article

Utility Libraries for BREW - A String Class


February 27, 2003

Preamble

This article is the first one in a series that will present possible implementations of utility libraries in BREW. We will start today with a string library and continue with vector and hashtable in the next installment.

A legitimate question might be: "We have STL — why something else?" STL is flexible and powerful (I am one of its greatest supporters, believe me!) but sometimes STL is simply "too heavy", especially on platforms like BREW. Examples of possible performance penalties: heavy use of copying ([1]), differences in string implementation([1]), temporaries([2], [3]), etc.

There are (sometime subtle) differences between ports the user has to be aware of.(see the 4 ways to implement a string as an example) Another possible issue is related to the compiler support for C++ features required by STL — exceptions, template specialization, etc. (see STL for eMbedded Visual C++ information page — [4] ) There are compilers and/or platforms that discourage the use of STL from the beginning — see Embedded C++ and Symbian ("The use of C++ for Symbian OS is targeted at suitability for phones, which means that some C++ standard functionality, such as C++ exception handling and the Standard Template Library, is not used" — [5])

Pulling the std::string...

A basic container, widely used, is std::string. The advantages of using strings instead of dynamically allocated arrays are well known ([1]). But there are different ways to implement strings, and from the BREW perspective this is extremely important. Replacing char* with string means we might increase the size of the object we place on the stack from 1 to 7, depending on the implementation. BREW uses AECHAR too, and as AECHAR manipulators don't have standard (wchar_t like) interfaces additional changes are needed. Some temporaries may be avoided for particular uses of string. We will investigate all these aspects one by one.

A very sensitive issue, especially for BREW, is the size of the string. There are 4 different ways to implement a string and each one has a different size. The basic approach is to use directly a size, a capacity, the raw pointer, allocator and reference count (if available). Size is at least 3 times the raw pointer. Performance is very good.

A very conservative way is to embed all the above mentioned class data in a structure and reference the structure via a pointer. Size is equal to the raw pointer but there is a big performance penalty — an extra allocation.

Other implementations exhibit the "small string optimization" idiom — an internal buffer holds strings smaller than 15 characters. If size exceeds this value the buffer holds the pointer to the dynamically allocated memory holding the string. This has seven times the size of the raw pointer but has the best performance.

A fourth approach is uses a variant of 2, like:

This way the size of the string equals the size of the raw pointer and the string is fully populated in one single allocation. Performance is good.

Reference counting (copy-on-write) is less and less used in STL libraries and even in the single-threaded world is a mixed blessing ([3], [7]).

Taking into account the whole picture and the restrictions of the BREW platform BrewString implements the fourth variant and is not referenced counted.

...and the BrewString...

The library is hosted in BrewString.h. It assumes that templates are available (true with ARM compiler) and it doesn't use exceptions (as they are not supported by ARM). The interface replicates the Java String class, as this emphasizes the fact that this is not an STL compatible implementation, having no support for iterators, and also has the additional merit of being a well-proved and popular API. A typedef takes the similarity one step further:

typedef BrewString<char> String;

On the other hand we'll see immediately that the similarity is merely syntactic — BrewString is not an immutable class. Immutable classes have distinct advantages (see Bloch — Effective Java) but are notorious bad-performers.

A PlatformSpecific header takes care of the differences between platforms as well as the global overload of operators new, new[], delete and delete[], as required by BREW. The use of new[] and delete[] is not a specific requirement for this library but is a good practice, insuring code portability. The same time this overload might provide a good starting point for a custom allocator.

To achieve the desired flexibility BrewString uses templates, with minimal code bloat. An example:

template <class U, class V>   const BrewString<T>&   replace( const U& match, const V& replacement );

This deals with all the possible char/string combinations in parameters and is instantiated as needed. You pay only for what you use.

As one can easily see BrewString does almost everything in place, avoiding temporaries. There is even an in-place substring() version:

const BrewString<T>& partialString( UINT right ) ;      const BrewString<T>& partialString( UINT left, UINT right ) ;

If we have to write code like:

String s = "something";s = s.substr(0,4);

then

s.partialString(4);

is at least 2 times faster than the STL implementation.

There are only 2 functions returning a BrewString by value: substring and operator+, and even this two use the so-called "return value optimization". For example:

const BrewString<T>   substring( UINT left, UINT right ) const   {      if ( ( left >= right ) || ( right > length_ ) )      {         return BrewString();         }        return BrewString ( buffer_ + left, right - left );   }String s = str.substr(0, 2);

will be optimized in the sense that there will be no temporary created to pass the value from the function. (Actually the rules for this optimization have been relaxed somehow but there might be still problems when using templates[7]).

There is a notable improvement in implementing operator+, making it 33% faster than the fastest STL implementation of this operator. There is a non-standard constructor:

template < class U, class V>   BrewString (const U& lhs, UINT lnl, const V& rhs, UINT lnr);

This can be used like:

String sc ="outdoor";String h("indoor", 2, sc, 3);String h will contain "inout".

Using the following operator+:

template <  class U, class V>   friend BrewString<T> operator+ ( const U& lhs,                                      const V& rhs )   {      String t;      UINT lnl = t.getLen(lhs);      UINT lnr = t.getLen(rhs);      return String(lhs, lnl,  rhs, lnr);   }

results into the return value optimization (String t is void, so that it adds nothing to the final tab)

BrewString — public interface:

template <class U>      BrewString( const U& rhs , UINT sz = 0);   BrewString( const BrewString<T> &rhs );   BrewString( ) ;   const BrewString<T> &   operator=( const BrewString &rhs );      const BrewString<T> &   operator=( const T* rhs );   template <class U>      const BrewString<T> &   operator+=( const U& rhs );      template <class U>   bool equals( const U& rhs ) ;      template <class U>   bool startsWith( const U &rhs, UINT offset = 0 ) ;   template <class U>   bool endsWith( const U& rhs ) ;   template <class U>   int indexOf( const U &rhs, UINT fromIndex = 0 ) ;      template <class U>      int   lastIndexOf( const U &rhs ) ;   template <class U>   int lastIndexOf( const U &rhs, UINT fromIndex ) ;      template <class U, class V>   const BrewString<T>&   replace( const U& match, const V& replacement );      const BrewString<T>&   replace( T findChar, T replaceChar );      const BrewString<T>   substring( UINT left ) ;      const BrewString<T>   substring( UINT left, UINT right ) ;      const BrewString<T>& partialString( UINT right ) ;      const BrewString<T>& partialString( UINT left, UINT right ) ;      const BrewString<T>& trim() ;      const BrewString<T>& toLowerCase( ) ;         const BrewString<T>& toUpperCase( ) ;      const T* toCharArray() ;   const UINT length( ) ;   const UINT capacity( ) ;   const T charAt( int index ) ;      void ensureCapacity(UINT sz);      void setLength(UINT sz);   

Efficiency

BrewString library is extremely competitive in size. What about efficiency?We conducted some tests with 2 compilers and 2 STL implementations (I1 — uses small string optimization, I2 — basic STL model), to see how the differences between the ports affect the results. I1 had the optimization inhibited to make comparisons viable.

  • operator+ - see above
  • Assignment operator: is largely the same in all the cases
  • Copy constructor: BrewString is 20% faster than I1; BrewString is 3 times faster than I2
  • Substring: BrewString equals. I1; BrewString is 1.5 times faster than I2;
  • partialString : BrewString is 70 times faster than I1; BrewString is 5 times faster than I1 (small string optimization activated)
  • find_first_of : BrewString is 1.5 times faster than I1; BrewString is 2 times faster than I2

BrewString proves to have above average results for the minimum possible size, offering a good alternative to STL.

Epilogue -Deferred Improvements

BrewString was declared as

template <typename T = char  >class BrewString ;

The <typename T = AECHAR> policy is missing. Addditional optimizations are also possible. If there is interest in further developing and using this library (including using AEChar) please contact the author.

References:

[1] Scott Meyers - Effective STL; Addison-Wesley, 2001
[2] Scott Meyers - More Effective C++; Addison-Wesley, 1996
[3] Herb Sutter - More Exceptional C++; Addison-Wesley, 2001
[4] STL for eMbedded Visual C++ information page - http://www.syncdata.it/stlce/stlinfo.html
[5] Symbian OS Guide - http://www.symbian.com/developer/techlib/v70docs/sdl%5Fv7.0/doc%5Fsource/devguides/ProgLanguages.html#OSGuide%2eProgLanguages
[6] Dov Bulka, David Mayhew - Efficient C++ ; Addison-Wesley, 2000
[7] Andrei Alexandrescu: Generic<Programming>: Move Constructors - http://www.cuj.com/experts/2102/alexandr.htm


Downloads: Source Code - 189 kb.

About the Author

Radu Braniste is Director of Technology at Epicad. He can be contacted at rbraniste@epicad.com

# # #

Sitemap | Contact Us

Thanks for your registration, follow us on our social networks to keep up-to-date