Using Memory-Mapped Files in .NET 4.0
Introduction
Suppose you need to manipulate multi-gigabyte files, reading and writing data in them. One option is to access the file through a sequential stream, which works well if you read the file from beginning to end. However, things get more problematic when you need random access. Seeking within the stream is a natural solution, but unfortunately a slow one.
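For comparison, stream-based random access might look like the following minimal sketch. The class and method names (SeekSketch, ReadByteAt) and the temporary file are illustrative assumptions, not part of any library; only FileStream.Seek and FileStream.ReadByte are actual framework calls.

```csharp
using System;
using System.IO;

static class SeekSketch
{
    // Hypothetical helper: reads one byte at an arbitrary offset
    // by repositioning the stream before the read.
    public static int ReadByteAt(string path, long offset)
    {
        using (FileStream stream = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            stream.Seek(offset, SeekOrigin.Begin);
            // ReadByte returns -1 when the position is at or past the end of the file.
            return stream.ReadByte();
        }
    }

    static void Main()
    {
        // A small temporary file stands in for the large data file.
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllBytes(path, new byte[] { 10, 20, 30, 40 });
            Console.WriteLine(ReadByteAt(path, 2)); // prints 30
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Every random read in this style pays for a reposition followed by a read call, which is the pattern memory mapping is meant to replace.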
If you have a background in Windows API development, then you might be aware of an old technique called memory-mapped files (sometimes abbreviated MMF). The idea of memory-mapped files, or file mapping, is to map a file into memory so that it appears as a contiguous block in your application's address space. Then, reading and writing to the file is simply a matter of accessing the correct memory location. In fact, when the operating system loader fetches your application's EXE or DLL files to execute their code, file mapping is used behind the scenes.
Using memory-mapped files from .NET applications is not new in itself, as it has been possible to call the underlying operating system APIs through Platform Invoke (P/Invoke) since .NET 1.0. However, in .NET 4.0, memory-mapped files become available to all managed code developers without using the Windows APIs directly.
Memory-mapped files and large files are often associated in the minds of developers, but there's no practical limit to how large or small the files accessed through memory mapping can be. Although using memory mapping for large files makes programming easier, you might observe even better performance when using smaller files, as they can fit entirely in the file system cache.
The information and the code listings in this article are based on the .NET 4.0 Beta 1 release, available since May 2009. As is the case with pre-release software, technical details, class names and available methods might change once the final RTM version of .NET becomes available. This is worth keeping in mind while studying or developing against any beta library.
The New Namespace and its Classes
For .NET 4.0 developers, the interesting classes that work with memory-mapped files live in the new System.IO.MemoryMappedFiles namespace. Presently, this namespace contains four classes and several enumerations to help you access and secure your file mappings. The actual implementation is inside the assembly System.Core.dll.
The most important class for the developer is the MemoryMappedFile class. This class allows you to create a memory-mapped object, from which you can in turn create a view accessor object. You can then use this accessor to directly manipulate the memory block mapped from the file, using its convenient Read and Write methods.
Note that since direct pointers are not considered a sound programming practice in the managed world, such an access object is needed to keep things tidy. In traditional Windows API development in native code, you would simply get a pointer to the beginning of your memory block.
That said, to acquire a memory-mapped file and the necessary accessor object, you need to follow three simple steps. First, you need a file stream object that points to an existing file on disk. Second, you create the mapping object from this stream, and as a final step, you create the accessor object. Here is a code example in C#:
FileStream file = new FileStream(
    @"C:\Temp\MyFile.dat", FileMode.Open);
MemoryMappedFile mmf =
    MemoryMappedFile.CreateFromFile(file);
MemoryMappedViewAccessor accessor =
    mmf.CreateViewAccessor();
The code first opens a file with the System.IO.FileStream class, and then passes the stream object instance to the static CreateFromFile method of the MemoryMappedFile class. The third step is to call the CreateViewAccessor method of the MemoryMappedFile class.
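As a self-contained variant of these steps, here is a minimal sketch that should compile against the released framework. It relies on two assumptions beyond the listing above. First, it uses the CreateFromFile(string, FileMode) overload, which takes a path directly, because the single-argument FileStream overload from Beta 1 may not be available in later releases. Second, it works on a small temporary file instead of C:\Temp\MyFile.dat. The names MappingSketch and ReadFirstByte are purely illustrative.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

static class MappingSketch
{
    // Hypothetical helper: maps the whole file and returns its first byte.
    public static byte ReadFirstByte(string path)
    {
        // Both the mapping and the view hold operating system resources,
        // so each is wrapped in a using block and disposed in reverse order.
        using (MemoryMappedFile mmf = MemoryMappedFile.CreateFromFile(path, FileMode.Open))
        using (MemoryMappedViewAccessor accessor = mmf.CreateViewAccessor())
        {
            return accessor.ReadByte(0);
        }
    }

    static void Main()
    {
        // The file must not be empty: a zero-length file cannot be mapped
        // with the default capacity.
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllBytes(path, new byte[] { 42, 1, 2, 3 });
            Console.WriteLine(ReadFirstByte(path)); // prints 42
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Disposing the accessor and the mapping releases the view and the file handle, which is why the temporary file can be deleted afterwards.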
In the above code, the CreateViewAccessor method is called without any parameters. In this case, the mapping begins at the start of the file (offset zero) and ends at the last byte of the file. However, you can just as easily map any portion of the file. For instance, if your file were one gigabyte in size, you could map a view starting at the one-megabyte mark with a view size of 10,000 bytes. This could be done with the following call:
MemoryMappedViewAccessor accessor =
    mmf.CreateViewAccessor(1024 * 1024, 10000);
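One detail worth illustrating is that positions passed to the accessor are counted from the start of the view, not from the start of the file. The sketch below assumes a two-megabyte temporary file and the path-based CreateFromFile(string, FileMode) overload. The names PartialViewSketch and ReadInsideView and the marker value are hypothetical.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

static class PartialViewSketch
{
    const long ViewOffset = 1024 * 1024; // the one-megabyte mark
    const long ViewSize = 10000;         // view length in bytes

    // Hypothetical helper: reads one byte at positionInView,
    // where position 0 is the first byte of the view.
    public static byte ReadInsideView(string path, long positionInView)
    {
        using (MemoryMappedFile mmf = MemoryMappedFile.CreateFromFile(path, FileMode.Open))
        using (MemoryMappedViewAccessor accessor = mmf.CreateViewAccessor(ViewOffset, ViewSize))
        {
            return accessor.ReadByte(positionInView);
        }
    }

    static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            // Place a marker byte five bytes past the one-megabyte mark.
            byte[] data = new byte[2 * 1024 * 1024];
            data[ViewOffset + 5] = 0x7F;
            File.WriteAllBytes(path, data);

            // Position 5 in the view corresponds to file offset 1 MB + 5.
            Console.WriteLine(ReadInsideView(path, 5)); // prints 127
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```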
Later on, you will see more advanced uses for these mapped views. But first, you need to learn about reading from the view.
