http://www.developer.com/ws/android/development-tools/Android-XML-Parser-Performance-3824221-2.htm


Android XML Parser Performance


June 11, 2009

The Android SDK provides two Java XML parsers that most developers are accustomed to: DOM (Document Object Model) and SAX (Simple API for XML). A while back, the Android SDK was enhanced to include an XML Pull Parser, with a note that the change provided "higher performance for small memory applications", such as portable devices.

I had initially assumed that meant the Pull parser would be the XML parser of choice for the Android platform. Otherwise, why would the developers of the Android platform include it so early on with such a change description? Before I began testing, I expected the Pull parser to be the fastest of the three methods available. I had already begun using it as the parser of choice. But was it really the right choice? And, if so, how much faster was it? I had to know.

Understanding How the Different Parsers Work

A DOM parser works by parsing an XML file into a native data structure matching the hierarchy of the XML file. Most of the processing is done up front and the entire file is looked at, so a DOM parser typically uses the most memory of the three parsers. A SAX parser works by having the user implement a class with method handlers for various events, such as finding tags or attributes. A Pull Parser works by creating a loop that continually requests the next event and can then handle that event directly within the loop. The idea with the Pull Parser is that it can easily be stopped at any point, only do processing on demand, and remove the overhead of extra method calls and classes.
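To make the three styles concrete, here is a minimal desktop Java sketch that counts track points in a tiny GPX-like string in each style. It is not the code from this article. The class name, method names, and sample data are hypothetical, and the pull-style loop uses the JDK's StAX XMLStreamReader as a stand-in, because Android's org.xmlpull.v1.XmlPullParser is not part of the standard JDK. The shape of the loop is the same idea: ask for the next event, handle it inline, and stop whenever you like.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not taken from the article's download.
class ParserStylesSketch {

    // Made-up two-record sample in the shape of a GPX track.
    static final String SAMPLE =
        "<gpx><trk><trkseg>"
        + "<trkpt lat=\"43.95048\" lon=\"-71.0852\"><ele>189.530029</ele></trkpt>"
        + "<trkpt lat=\"43.95050\" lon=\"-71.0853\"><ele>190.0</ele></trkpt>"
        + "</trkseg></trk></gpx>";

    // DOM: the whole document becomes an in-memory tree before we query it.
    static int countWithDom(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName("trkpt").getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // SAX: the parser drives and calls back into our handler for each event.
    static int countWithSax(String xml) {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                if ("trkpt".equals(qName)) {
                    count[0]++;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return count[0];
    }

    // Pull style (StAX stand-in): our loop drives, requesting one event at a
    // time, and simply stops reading once `limit` track points were seen.
    static int countWithPull(String xml, int limit) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
            int count = 0;
            while (count < limit && reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "trkpt".equals(reader.getLocalName())) {
                    count++;
                }
            }
            reader.close();
            return count;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling countWithPull with a limit of 1 returns after the first track point without reading the rest of the input, which is the "stop at any point" property described above. The SAX version has no equally direct way to stop short of throwing an exception from the handler.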

Although each of the three parsers is different, they all have the same fundamentals: they are designed for parsing XML. This means that the code for handling events, such as finding and interpreting the tag data and attributes, can remain basically the same across all three implementations. Since this code will remain the same, any performance differences should be due to differences in the parsing algorithms, including their use of memory, which may cause Java garbage collection to run more frequently.
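One way to picture this shared handling code is a small parser-agnostic collector that each parser feeds with the same three calls. The sketch below is an assumption about how such sharing could look, not the article's actual code: PointCollector, viaSax, viaPull, and the sample data are all hypothetical names, and the pull side again uses the JDK's StAX reader in place of Android's XmlPullParser.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not taken from the article's download.
class SharedHandlingSketch {

    // Made-up sample; the second record's values are invented.
    static final String SAMPLE =
        "<gpx>"
        + "<trkpt lat=\"43.95048\" lon=\"-71.0852\"><ele>189.530029</ele>"
        + "<time>2007-09-08T17:04:55Z</time></trkpt>"
        + "<trkpt lat=\"43.9505\" lon=\"-71.0853\"><ele>190.1</ele>"
        + "<time>2007-09-08T17:05:55Z</time></trkpt>"
        + "</gpx>";

    // The shared part: knows about GPX tags, knows nothing about any parser.
    static final class PointCollector {
        final List<String> points = new ArrayList<>();
        private final StringBuilder buffer = new StringBuilder();
        private String lat;
        private String lon;

        void startTag(String name, String latAttr, String lonAttr) {
            buffer.setLength(0); // start collecting text for this element
            if ("trkpt".equals(name)) {
                lat = latAttr;
                lon = lonAttr;
            }
        }

        void text(String chunk) {
            buffer.append(chunk);
        }

        void endTag(String name) {
            if ("ele".equals(name)) {
                points.add(lat + "," + lon + "," + buffer.toString().trim());
            }
        }
    }

    // SAX adapter: forwards callbacks to the shared collector.
    static List<String> viaSax(String xml) {
        final PointCollector c = new PointCollector();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes a) {
                c.startTag(qName, a.getValue("lat"), a.getValue("lon"));
            }
            @Override
            public void characters(char[] ch, int start, int length) {
                c.text(new String(ch, start, length));
            }
            @Override
            public void endElement(String uri, String local, String qName) {
                c.endTag(qName);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return c.points;
    }

    // Pull-style adapter (StAX stand-in): same collector, driven from a loop.
    static List<String> viaPull(String xml) {
        PointCollector c = new PointCollector();
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
            while (r.hasNext()) {
                int event = r.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    c.startTag(r.getLocalName(),
                        r.getAttributeValue(null, "lat"),
                        r.getAttributeValue(null, "lon"));
                } else if (event == XMLStreamConstants.CHARACTERS) {
                    c.text(r.getText());
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    c.endTag(r.getLocalName());
                }
            }
            r.close();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return c.points;
    }
}
```

Because both adapters do nothing except forward events, any timing difference between them would come from the parser itself rather than from the record handling.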

 



 

Crafting a Reasonable Performance Test for Parser Performance

I decided a fun comparison exercise for all three parsers would be to parse a typical GPX XML file. I chose GPX because of the popularity of location-based services on mobile devices and the prevalence of the format among GPS units. A GPX file often has a relatively large number of records, at least when compared to something like a news feed, which may only ever hold the latest twenty items. Each record is relatively simple and contains just a few pieces of data.

Here is an example GPX-format record:

 

  <trkpt lat="43.95048" lon="-71.0852">
    <ele>189.530029</ele>
    <time>2007-09-08T17:04:55Z</time>
  </trkpt>
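Each record arrives from the parser as four strings that then have to be converted to numbers and a date. The following is a rough sketch of that conversion for the record above, using only classes that were available on early Android. The GpxPoint class and its fromStrings factory are hypothetical names, and the timestamp pattern assumes the UTC "Z" form shown in the sample.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical value class for one track point; not the article's code.
final class GpxPoint {
    final double lat;
    final double lon;
    final double ele;
    final Date time;

    private GpxPoint(double lat, double lon, double ele, Date time) {
        this.lat = lat;
        this.lon = lon;
        this.ele = ele;
        this.time = time;
    }

    // Converts the raw attribute and element text into typed fields.
    static GpxPoint fromStrings(String lat, String lon, String ele, String time) {
        return new GpxPoint(
            Double.parseDouble(lat),
            Double.parseDouble(lon),
            Double.parseDouble(ele),
            parseUtc(time));
    }

    // Parses timestamps of the form 2007-09-08T17:04:55Z as UTC.
    static Date parseUtc(String text) {
        // SimpleDateFormat is not thread-safe, so a fresh one is built per call.
        SimpleDateFormat format =
            new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'", Locale.US);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        format.setLenient(false);
        try {
            return format.parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Bad timestamp: " + text, e);
        }
    }
}
```

Building a new SimpleDateFormat for every record is simple but costly. Reusing one instance per parse run is the kind of tuning that would help all three parsers equally.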

 

Since I was using a real-world XML file format, I originally thought I'd use files of real-world sizes. For instance, a GPS unit that collects a location record each minute for a week will generate a GPX file with just over 10,000 records. As it turns out, my quick and dirty parsers were a bit slow for a file of that size, so I pared my test files down and used three different file sizes:

  • Small size of 70 records
  • Medium size of 560 records
  • Large size of 2,796 records

I ran each of the tests between two and six times to help factor out other events that might be going on with the handset. All tests were run on a production T-Mobile G1 handset with Android 1.1.
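One plausible way to combine repeated runs is to average the wall-clock time, which is what this small sketch does. It is an assumption about the approach, not the harness used for these tests, and TimingSketch and averageMillis are hypothetical names.

```java
// Hypothetical timing helper; not the harness used for the article's numbers.
class TimingSketch {

    /** Runs the task the given number of times and returns the mean time in milliseconds. */
    static double averageMillis(Runnable task, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        long totalNanos = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1_000_000.0 / runs;
    }
}
```

A call such as TimingSketch.averageMillis(() -> parseFile(), 4) would time four parses, where parseFile is a placeholder for whichever parser is under test. Averaging smooths out one-off pauses such as a garbage collection or a background sync, though it does not remove them.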

A Quick Disclaimer

I know that the performance can be tuned; however, such tuning should yield similar benefits across all three methods of parsing. Additionally, a lot of conversion from strings to numbers and dates takes place. However, since we're comparing the three methods of parsing, not looking for the fastest way to parse this particular file type, this is an acceptable way of comparing them in a particular real-world situation. That being said, the code is provided for review.

Testing My Assumptions Regarding Parser Performance

As I stated before, I had assumed the new XML Pull Parser would fare best, followed by SAX with DOM coming in last. This did not turn out to be correct.

The chart below shows the results of the three tests I performed (small, medium, and large data sets), in number of seconds to test completion:



[Chart: seconds to complete parsing of the small, medium, and large files with each parser]

As you can see, DOM was still the slowest parsing method, but SAX consistently outperformed the XML Pull Parser slightly. This came as something of a surprise.

I also looked at how many records could be parsed each second. This metric makes it easier to see how each parser scales, and it is an easy number to work with when estimating how long an arbitrarily sized file would take to parse.
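The arithmetic behind this metric is simple enough to sketch. The helper names below are hypothetical, and the 8-second figure in the usage note is a made-up placeholder, not a measurement from these tests.

```java
// Hypothetical helpers for the records-per-second metric.
class ThroughputSketch {

    // Throughput: records parsed divided by elapsed seconds.
    static double recordsPerSecond(int records, double seconds) {
        if (seconds <= 0) {
            throw new IllegalArgumentException("seconds must be positive");
        }
        return records / seconds;
    }

    // Estimated parse time for a file of a given size, assuming the rate
    // stays constant as the file grows (it may not).
    static double estimateSeconds(int records, double recordsPerSecond) {
        if (recordsPerSecond <= 0) {
            throw new IllegalArgumentException("rate must be positive");
        }
        return records / recordsPerSecond;
    }
}
```

For example, if the 560-record file hypothetically took 8 seconds, the rate would be 70 records per second, and the week-long log mentioned earlier (60 × 24 × 7 = 10,080 records) would take an estimated 144 seconds at that rate.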



[Chart: records parsed per second for each parser and file size]

Indeed, the Pull parser was not the fastest parsing method. The results do suggest that on files larger than those I tested, the Pull parser may actually end up faster. But since the total time for any of the methods would be too long to reasonably execute on a handset, this does not matter.

Overall Impressions of Parser Performance on Android

The first surprise I had was at how slow all three methods were. Users don't want to wait long for results on mobile phones, so parsing anything more than a few dozen records may mandate a different method.

When Parsing on the Handset is Required: Stick with SAX

When parsing of XML on the Android handset is required, I would recommend using the SAX parser, especially where the file sizes are relatively small.

Get the code for this article

Author

Shane Conder is a software developer focused on mobile and web technologies. He co-authored Android Wireless Application Development, available from Addison-Wesley. He is currently working at a small mobile software company. Contact Shane at shane+a2@kf6nvr.net.
