Compound File Stream and Storage Manipulation

Introduction

Have you ever wanted to group a set of files into a single file for run-time read/write access, without having to design a file format for accessing them? This technique has many uses, such as revision/undo storage, dynamic access to resources, incremental updates, and WAD files. Microsoft’s answer for implementing these types of solutions is a technology known as Compound Files (CFs). Note that compound files are an implementation of the ActiveX structured storage model (from the MSDN article "Containers: Compound Files").

CFs may be viewed as a file system within a file. They allow you to create files (known as streams) and directories (known as sub-storages) within a single file. Compound files offer some of the advantages of a database (such as transactions with rollback) along with general file system functionality. Streams within a CF may be read and written incrementally, just as files are within a normal file system.
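To make the "file system within a file" idea concrete, here is a toy in-memory model of streams and one level of sub-storages, with incremental reads and writes at an offset. It is only a sketch of the concept: the names CompoundFileModel, Storage, Stream, WriteAt and ReadAt are hypothetical, and nothing here touches the real compound file format or the IStorage/IStream interfaces.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy model only: it does not read or write the real compound file format.
using Stream = std::vector<unsigned char>;        // a "file" inside the CF

struct Storage {                                  // a "directory" inside the CF
    std::map<std::string, Stream> streams;
};

struct CompoundFileModel {
    Storage root;                                 // streams at the top level
    std::map<std::string, Storage> subStorages;   // one level of sub-storages
};

// Incremental write: place bytes at an offset, growing the stream if needed.
void WriteAt(Stream& s, std::size_t offset, const std::string& bytes) {
    if (s.size() < offset + bytes.size())
        s.resize(offset + bytes.size());
    std::copy(bytes.begin(), bytes.end(), s.begin() + offset);
}

// Incremental read: return up to count bytes starting at offset.
std::string ReadAt(const Stream& s, std::size_t offset, std::size_t count) {
    if (offset >= s.size())
        return std::string();
    const std::size_t n = std::min(count, s.size() - offset);
    return std::string(s.begin() + offset, s.begin() + offset + n);
}
```

In this model, cf.subStorages["bmps"].streams["a.bmp"] plays the role of a stream named a.bmp inside a sub-storage named bmps, and it can be extended piece by piece with repeated WriteAt() calls.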

The Problem

Application code that accesses a CF usually requires a fair amount of manipulation of the IStorage and IStream interfaces, which many programmers find daunting. In addition, the interfaces must be acquired and released at the right times, and this can cause problems if not handled correctly.

Solutions

What is needed is another model for accessing streams and sub-storages within a CF: a very simple model that every application programmer is already familiar with, namely the concept of files. In MFC, files are managed by the CFile class. Following this model, we can extend that class to accommodate CFs.

This project presents the following solutions.

  • An MFC CFile-derived class (CStgFile) that allows simple CFile-style access to a stream within a file.
  • Methods for creation of CFs (CreateStg()) and the creation of single level sub-storages (MkStg()).
  • An OLE Automation (COM) class ("gstg.core") for manipulating CFs from scripting languages (and/or from an MFC class derived from COleDispatchDriver).
  • JavaScript examples of copying files in/out of a CF.

In addition, there is code to provide the following external file-system support.

  • An MFC class (CScanDir) that is used to scan a file-system directory for a file specification and return the results in a string array. Support for overriding the default behavior is also provided.
  • An OLE Automation (COM) class ("gstg.dir") for accessing directory information from scripting languages (and/or from an MFC class derived from COleDispatchDriver).
  • JavaScript examples of scanning a directory for files and sub-directories.

And finally, an example to demonstrate the functionality:

  • Copy a sub-directory of files from a file system into a sub-storage in a CF.

Development Methodology

  • The core code (CStgFile, CScanDir) is first developed as reusable MFC classes.
  • They are then "wrapped" with an OLE Automation (COM) layer that may be used by OLE Automation scripting engines (VB, VBA, WSH, etc.) and/or by other MFC applications via a COleDispatchDriver-derived interface (generated from the TLB).
  • Finally, JavaScript test scripts are developed to exercise the basic functionality of the code before it is integrated into a more thorough test.

Limitations

To reduce the complexities of illustrating these concepts, the following limitations were imposed.

  • CFs do not use TRANSACTED file semantics. All accesses are DIRECT.
  • The CF implementation is limited to one level of sub-storages.
  • Very little error checking (of return codes) is performed in the OLE Automation wrappers.
  • Methods are not "friendly" to errant programming practices.

Examples of Usage

MFC Example

To illustrate how simple it is to use the CStgFile MFC class for copying an external file to a newly created CF, the following MFC code may be used.

CFile FileSrc( "tmp.tmp", CFile::modeRead ); // open the source file
CStgFile FileStg; // instantiate the CF wrapper

FileStg.CreateStg( "tmp.stg" ); // create the storage (the CF)
FileStg.Open( "tmp.tmp", CFile::modeCreate | CFile::modeWrite ); // create the stream

UINT cB = 0; // number of bytes read on each pass
BYTE rgB[512*8]; // 4 KB copy buffer
while( (cB = FileSrc.Read( rgB, sizeof(rgB) )) > 0 ) // Read() returns 0 at end of file
{
    FileStg.Write( rgB, cB ); // copy the chunk to the stream
}

FileStg.Close(); // close the stream
FileStg.CloseStg(); // close the CF file
FileSrc.Close(); // close the source file

Notice that in this example the single call to CreateStg() is what switches the object to CF access. If this call is omitted, the object uses the normal CFile methods. This may be useful in debugging when you wish to access the streams as normal files.

JavaScript Example

To perform the same operation in JavaScript, the solution is even simpler:

var objStg = WScript.CreateObject( "gstg.core" ); // create object
objStg.Create( "tmp.stg" ); // create the CF
objStg.CopyTo( "tmp.tmp", "tmp.tmp" ); // copy the external file
objStg.Close(); // close the CF

Other Examples

Other script examples are in the scr, scr/stg and scr/dir sub-directories. These include an example to copy a whole directory of files from the file system into a sub-storage of a CF (cp_bmps.js).

Conclusion

Using Compound Files becomes much easier using the CStgFile class for accessing streams and sub-storages. There are many other uses and advantages of using Compound Files that are beyond the scope of this document. Please refer to MSDN for further reading about OLE Compound Files.

Notes

  • In order to run the OLE Automation examples, you must register the DLL(s).
  • In order to run the JavaScript example, you must use the Windows Scripting Host CScript application. This is available for download from Microsoft and comes with Windows 98.
  • Use the DFView application from Microsoft to view CFs created with the CStgFile class.
  • These classes were developed with MS Visual C++ V5.0 and should be compatible with previous releases of the compiler.
  • There are some characters that are considered invalid for stream names (e.g. '!').
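The last note can be turned into a small pre-flight check. The sketch below rejects names before they are passed to a stream or sub-storage call. The rules it encodes (at most 31 characters, none of the characters \ / : !, and the reserved names "." and "..") are my reading of Microsoft's structured storage naming conventions and should be verified against the current documentation. IsValidStreamName is a hypothetical helper, not part of the CStgFile class, and it uses narrow strings for brevity even though the real interfaces take wide-character names.

```cpp
#include <string>

// Hypothetical helper, not part of CStgFile. The rules below are assumptions
// based on the documented structured storage naming conventions.
bool IsValidStreamName(const std::string& name) {
    const std::string::size_type kMaxLength = 31;  // excludes the terminator

    if (name.empty() || name.size() > kMaxLength)
        return false;
    if (name == "." || name == "..")               // reserved names
        return false;

    // Reject the characters that are not allowed in element names.
    return name.find_first_of("\\/:!") == std::string::npos;
}
```

With these rules, a name such as tmp.tmp is accepted, while bad!name or a/b is rejected before any storage call is made.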

Other Uses

After understanding how CFs store streams of data, other uses become apparent:

  • A Web-Site in a file.
  • Resources for localization.
  • BLOB type storage insertion/retrieval without the database overhead.
  • Archival of data with direct access.
  • Etc.

Future

The class for accessing streams within a CF should be extended to support N levels of sub-storages. In addition, support for selecting TRANSACTED access (rather than DIRECT only) should be included.

Download demo project - 46 KB

Date Posted: November 20, 1998
