A Brief Introduction to OpenOffice.org Writer Files
Digging Beyond the Surface
As may be rather evident, the above discussion barely scratches the surface of the ODT file format. However, with the knowledge I have hopefully imparted thus far, you should have little trouble "reverse engineering" the parts you need for yourself: start with a blank document, add only the element you would like to understand (a table, for example), save the document, unzip the ODT file, and open the content.xml file in an editor. Search through the file for a piece of text you inserted, and then pick things apart from there. I assembled a thin folder of printouts showing how an ODT file implements various aspects of a document, and I have found the format to be remarkably accessible.
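If you would rather script the "unzip and search" step than do it by hand, a minimal sketch in Python might look like the following. Python is simply my choice for illustration here. The function name find_text_in_odt and the sample file name are hypothetical, and the sketch assumes only that the ODT file is a zip archive containing content.xml, as described above.

```python
import zipfile
import xml.etree.ElementTree as ET


def find_text_in_odt(odt_path, needle):
    """Return (tag, text) pairs for elements in content.xml whose text contains needle.

    Only each element's own leading text is checked, so text that follows a
    nested child element (such as a span) is not matched by this sketch.
    """
    # An ODT file is a zip archive; content.xml holds the document body.
    with zipfile.ZipFile(odt_path) as odt:
        root = ET.fromstring(odt.read("content.xml"))

    hits = []
    for element in root.iter():
        if element.text and needle in element.text:
            # ElementTree reports tags as "{namespace-uri}local-name".
            hits.append((element.tag, element.text))
    return hits


# Hypothetical usage, assuming you saved a test document as "table-test.odt"
# after typing the marker text "XYZZY" into a table cell:
#
#     for tag, text in find_text_in_odt("table-test.odt", "XYZZY"):
#         print(tag, text)
```

The tags printed for each hit tell you which element wraps your marker text, which is a reasonable place to start picking the surrounding structure apart.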
At my workplace, the content material we manage, which turns into deliverable PDF files for customers, is stored in DITA XML topic files. (DITA stands for Darwin Information Typing Architecture.) This topic-based storage and management has served our in-house tech editors rather well, affording such things as content re-use, single-sourcing, and conditional content filtering. However, the content owners, those technical people closer to the product itself who own and author the raw content material, are not proficient with the DITA schemas—nor should they be.
Alternatively, you may have a process that needs to go in the other direction. Another use case involves creating a true "template" in Writer (an OTT file) that can be given to a working group as a rather fancy fill-in form. The material in the resulting content.xml file can then be located (by its position in the document, by an applied style name, or by some other mechanism) and converted to another format. Consider the possibility of turning a "requirements" Writer document into a working skeleton for some test cases.
Hopefully, I have shown you just enough of OpenOffice.org's Writer file format to open up some possibilities for you to use it in new ways. By taking a blank document or document template, you can edit or replace the body section of the embedded content.xml file. By taking an existing document, you can find the content and transform it for other purposes. As the material inside the ODT file is readable text and reasonably well-structured XML, it is wide open to full, external, programmatic assault. Being able to take control of your own documents in this fashion is nothing short of powerful.
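To sketch the first of those ideas, the Python fragment below copies a blank document and appends plain paragraphs to the body of its content.xml. Treat it as a starting point under stated assumptions rather than a finished tool: the function name append_paragraphs and the file names are hypothetical, and it assumes content.xml is UTF-8 and uses the conventional "office" and "text" prefixes that Writer emits. It edits the XML as text specifically so that everything else in the file is left byte-for-byte as Writer wrote it.

```python
import zipfile
from xml.sax.saxutils import escape


def append_paragraphs(template_path, output_path, paragraphs):
    """Copy an ODT file, adding one unstyled text:p per string to the end of the body."""
    closing_tag = b"</office:text>"
    # escape() protects &, < and > so arbitrary text stays well-formed XML.
    new_xml = "".join(
        "<text:p>%s</text:p>" % escape(text) for text in paragraphs
    ).encode("utf-8")

    with zipfile.ZipFile(template_path) as src, \
            zipfile.ZipFile(output_path, "w") as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "content.xml":
                if closing_tag not in data:
                    raise ValueError("no </office:text> found in content.xml")
                # Insert the new paragraphs just before the body's closing tag.
                data = data.replace(closing_tag, new_xml + closing_tag, 1)
            # Passing the original ZipInfo keeps each entry's order and
            # compression method (the mimetype entry is stored uncompressed).
            dst.writestr(item, data)


# Hypothetical usage:
#
#     append_paragraphs("blank.odt", "generated.odt",
#                       ["First paragraph.", "Second paragraph."])
```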
About the Author
Rob Lybarger works in a small IT shop in the greater Houston, TX area. Among other duties related to Ant and Java, he has written the XSLT mentioned in the Real-World Applications section of this article and has also performed various in-house customizations of the stock DITA processing and formatting stylesheets. At home, Rob enjoys spending time with his four-month-old daughter and being a highly satisfied owner of a Mac computer.