MySQL System Architecture

  • May 18, 2007
  • By Michael Kruckenberg & Jay Pipes

We'll be referring to a number of C and C++ programming paradigms in this article. C source code files are those files in the distribution that end in .c. C++ source files end in .cc or, on some Windows systems, .cpp. Both C and C++ source files can include (using the #include directive) header files, identified by an .h extension. In C and C++, it is customary to declare the functions and variables used in the source files in a header file. Typically, the header file has the same name as the source file, but with an .h extension, although this is not always the case. One of the first tasks you'll attempt when looking at the source code of a system is identifying where the variables and functions are defined. Sometimes, this task involves looking through a vast hierarchy of header files in order to find where a variable or function is officially defined.
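As a minimal sketch of the header/source split described above, the following shows the two halves in a single listing. The file names counter.h and counter.cc and the names total_records and add_records are hypothetical, chosen only for illustration; they do not come from the MySQL distribution.

```cpp
// --- What would typically live in a header file, e.g. counter.h (hypothetical) ---
extern unsigned long total_records;              // declaration of a variable
unsigned long add_records(unsigned long count);  // declaration of a function

// --- What would live in the matching source file, e.g. counter.cc (hypothetical) ---
unsigned long total_records = 0;                 // definition of the variable

// Definition of the function: adds count to the running total and returns it.
unsigned long add_records(unsigned long count)
{
  total_records += count;
  return total_records;
}
```

Any other source file that includes the header can call add_records(), while the function body itself exists in only one place. That is why tracking down a definition sometimes means following a chain of #include directives first.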

Undoubtedly, you're familiar with what variables and functions are, so we won't go into much depth about that. In C and C++ programming, however, some other data types and terms are frequently used. Most notably, we'll be using the following terms in this article:

  • Struct
  • Class
  • Member variable
  • Member method

A struct is essentially a container for a bunch of data. A typical definition for a struct might look something like this:

typedef struct st_heapinfo /* Struct heap_info */
{
   ulong records; /* Records in database */
   ulong deleted; /* Deleted records in database */
   ulong max_records;
   ulong data_length;
   ulong index_length;
   uint reclength; /* Length of one record */
   int errkey;
   ulonglong auto_increment;
} HEAPINFO;

This particular definition came from /include/heap.h. It defines a struct (st_heapinfo) as having a number of member variables of various data types (such as records, max_records) and typedefs (aliases) the word HEAPINFO to represent the st_heapinfo struct. Comments in C code are marked with the // or /* ... */ characters.
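To make the typedef pattern concrete, here is a small sketch that follows the same shape. The struct st_tableinfo, its alias TABLEINFO, and the function data_bytes are hypothetical and deliberately simplified; they are not taken from the MySQL source.

```cpp
/* Hypothetical struct modeled on the typedef pattern above; not MySQL code. */
typedef struct st_tableinfo
{
  unsigned long records;   /* Records stored */
  unsigned long deleted;   /* Records marked as deleted */
  unsigned int reclength;  /* Length of one record, in bytes */
} TABLEINFO;

// A plain function that reads the struct's member variables through a
// pointer. A C struct holds data only, so behavior lives in free functions.
unsigned long data_bytes(const TABLEINFO *info)
{
  return info->records * info->reclength;
}
```

Because of the typedef, struct st_tableinfo and TABLEINFO name the same type, so a variable declared as TABLEINFO info can be passed as &info to any function that expects a struct st_tableinfo pointer.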

A class, on the other hand, is a C++ object-oriented structure that is similar to a C struct, but can also have member methods, as well as member variables. The member methods are functions of the class, and they can be called through an instance of the class.
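The difference is easier to see in code. The class below is a hypothetical illustration (RecordCounter does not exist in the MySQL source) showing member variables alongside member methods that operate on them.

```cpp
// Hypothetical class for illustration only; not part of the MySQL source.
class RecordCounter
{
public:
  RecordCounter() : records(0), deleted(0) {}

  // Member method: counts one more stored record.
  void add_record() { ++records; }

  // Member method: marks one record as deleted.
  // Returns false when there is no live record left to delete.
  bool delete_record()
  {
    if (deleted == records)
      return false;
    ++deleted;
    return true;
  }

  // Member method: reads the member variables without modifying them.
  unsigned long live_records() const { return records - deleted; }

private:
  unsigned long records;  // member variable
  unsigned long deleted;  // member variable
};
```

Given an instance declared as RecordCounter counter, the methods are called through that instance, for example counter.add_record() followed by counter.live_records().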

Doxygen for Source Code Analysis

A recommended way to analyze the source code is to use a tool like Doxygen (www.stack.nl/~dimitri/doxygen/index.html), which enables you to get the code structure from a source distribution. This tool can be extremely useful for navigating through functions in a large source distribution like MySQL, where a single execution can call hundreds of class members and functions. The documented output enables you to see where the classes or structs are defined and where they are implemented.

Doxygen provides the ability to configure the output of the documentation produced by the program, and it even allows for UML inheritance and collaboration diagrams to be produced. It can show the class hierarchies in the source code and provide links to where functions are defined and implemented.
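As a rough sketch of the kind of input Doxygen works from, the following hypothetical classes use Doxygen's comment blocks and its @brief, @param, and @return commands. The names Engine and MemoryEngine are invented for this example and are not MySQL classes. The inheritance relationship between them is what a class hierarchy or inheritance diagram would show.

```cpp
#include <string>

/**
 * @brief Hypothetical base class, used here only to illustrate a hierarchy.
 */
class Engine
{
public:
  virtual ~Engine() {}

  /** @brief Returns a short display name for the engine. */
  virtual std::string name() const = 0;
};

/**
 * @brief Hypothetical in-memory engine derived from Engine.
 */
class MemoryEngine : public Engine
{
public:
  std::string name() const { return "memory"; }

  /**
   * @brief Computes the space taken by fixed-length records.
   * @param records   Number of records.
   * @param reclength Length of one record, in bytes.
   * @return Total number of bytes.
   */
  unsigned long data_length(unsigned long records, unsigned int reclength) const
  {
    return records * reclength;
  }
};
```

Source code does not have to be annotated this way for Doxygen to extract its structure, but comments in this form are carried into the generated pages next to the functions they describe.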

On Unix machines, download the source code from the Doxygen web site, and then follow the manual instructions for installation (also available online at the web site). To produce graphical output, you'll want to first download and install the Graph visualization toolkit from http://www.graphviz.org/. After installing Doxygen, you can use the following command to create a default configuration file for Doxygen to process:

# doxygen -s -g /path/to/newconfig.file

The argument /path/to/newconfig.file is the path of the configuration file that Doxygen will create; a convenient place for it is the directory in which you want to eventually produce your Doxygen documentation. After Doxygen has created the configuration file, open it in your favorite editor and edit the settings you need. Usually, you will need to modify only the OUTPUT_DIRECTORY, INPUT, and PROJECT_NAME settings. Once you've edited the configuration file, execute the following:

# doxygen </path/to/config-file>

For your convenience, a version of the MySQL 5.0.2 Doxygen output is available at http://www.jpipes.com/mysqldox/.

The MySQL Documentation

The internal system documentation is available to you if you download the source code of MySQL. It is in the Docs directory of the source tree, available in the internals.texi TEXI document.

The TEXI documentation covers the following topics in detail:

  • Coding guidelines
  • The optimizer (highly recommended reading)
  • Important algorithms and structures
  • Charsets and related issues
  • How MySQL performs different SELECT operations (very useful information)
  • How MySQL transforms queries
  • Communication protocol
  • Replication
  • The MyISAM record structure
  • The .MYI file structure
  • The InnoDB record structure
  • The InnoDB page structure

Although the documentation is extremely helpful in researching certain key elements of the server (particularly the query optimizer), it is worth noting that the internal documentation does not directly address how the different subsystems interact with each other. To determine this interaction, it is necessary to examine the source code itself and the comments of the developers.

Caution: Even the most recent internals.texi documentation has a number of bad hyperlinks and references, as well as incorrect filenames and paths, so verify what you read against the source code before relying on it. The internals.texi documentation may not be as up-to-date as your MySQL server version!
