MySQL System Architecture, Page 3
TEXI and texi2html Viewing
TEXI is the GNU standard documentation format. A number of utilities can convert the TEXI source documentation to other, perhaps more readable or portable, formats. For those of you using Emacs or some variant of it, that editor supports a TEXI major mode for easy reading.
If you prefer an HTML version, you can use the free Perl-based utility texi2html, which can generate highly configurable HTML output from a TEXI source document. texi2html is available for download from https://texi2html.cvshome.org/. Once you've downloaded this utility, you can install it, like so:
# tar -xzvf texi2html-1.76.tar.gz
# cd texi2html-1.76
# ./configure
# make install
Here, we've untarred the latest (as of this writing) texi2html version and installed the software on our Linux system. Next, we want to generate an HTML version of the internals.texi document available in our source download:
# cd /path/to/mysql-5.0.2-alpha/
# texi2html Docs/internals.texi
After the conversion finishes, you'll notice a new HTML document in the /Docs directory of your source tree called internals.html. You can now navigate the internal documentation via a web browser. For your convenience, this HTML document is also available at http://www.jpipes.com/mysqldox/.
MySQL Architecture Overview
MySQL's architecture consists of a web of interrelated function sets, which work together to fulfill the various needs of the database server. A number of authors[3] have implied that these function sets are indeed components, or entirely encapsulated packages; however, there is little evidence in the source code that this is the case.
Indeed, the architecture includes separate function libraries, composed of functions that handle similar tasks, but there is not, in the traditional object-oriented programming sense, a full component-level separation of functionality. By this, we mean that you will be disappointed if you go into the source code looking for classes called BufferManager or QueryManager. They don't exist. We bring this point up because some developers, particularly ones with Java backgrounds, write code containing a number of "manager" objects, which fulfill the requests of client objects in a very object-centric approach. In MySQL, this simply isn't the case.
In some cases, notably in the source code for the query cache and log management subsystems, the code takes a more object-oriented approach. In most cases, however, system functionality runs through the various function libraries (which pass along a core set of structs) and classes (which do the dirty work of code execution), as opposed to an encapsulated approach, in which components manage their internal execution and provide an API for other components to use. This is due, in part, to the fact that the system architecture is made up of both C and C++ source files, as well as a number of Perl and shell scripts that serve as utilities. C and C++ have different capabilities: C++ has language-level support for object-oriented programming, whereas C is procedural. In the MySQL system architecture, certain libraries have been written entirely in C, making an object-oriented, component-type architecture nearly impossible. The architecture of the server subsystems certainly also has a lot to do with performance and portability concerns.
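To make the contrast concrete, here is a minimal sketch of the procedural style described above: plain functions from different "libraries" operate on a shared struct that the caller passes along, and no manager object owns the state. Every name in it (query_ctx, ctx_parse, ctx_optimize, run_statement) is hypothetical and chosen for illustration; none of them comes from the MySQL source, and the "optimizer" rule is a toy assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical shared struct, NOT a real MySQL type. */
struct query_ctx {
    const char *text;      /* raw statement text */
    size_t      length;    /* filled in by the parsing routine */
    int         use_index; /* filled in by the optimizing routine */
};

/* "Parsing" routine: records facts about the statement on the struct. */
void ctx_parse(struct query_ctx *ctx)
{
    ctx->length = strlen(ctx->text);
}

/* "Optimizing" routine: works on the same struct the parser touched.
   The rule is a toy assumption made only for this illustration. */
void ctx_optimize(struct query_ctx *ctx)
{
    ctx->use_index = (ctx->length > 0 && strstr(ctx->text, "WHERE") != NULL);
}

/* The caller drives both routines and hands the struct from one to the
   next; there is no BufferManager- or QueryManager-style object. */
int run_statement(const char *sql)
{
    struct query_ctx ctx = { sql, 0, 0 };
    ctx_parse(&ctx);
    ctx_optimize(&ctx);
    return ctx.use_index;
}
```

Calling run_statement() with a statement that contains a WHERE clause returns 1, and 0 otherwise. The point is only the shape of the code: state lives in a struct that travels between free functions, rather than inside an encapsulated component.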
Note: As MySQL is an evolving piece of software, you will notice variations in both coding and naming style and consistency. For example, if you compare the source files for the older MyISAM handler files with the newer query cache source files, you'll notice a marked difference in naming conventions, commenting by the developers, and function-naming standards. Additionally, as we go to print, there have been rumors that significant changes to the directory structure and source layout will occur in MySQL 5.1.
Furthermore, if you analyze the source code and internal documentation, you will find little mention of components or packages.[4] Instead, you will find references to various task-related functionality. For instance, the internals TEXI document refers to "The Optimizer," but you will find no component or package in the source code called Optimizer. Instead, as the internals TEXI document states, "The Optimizer is a set of routines which decide what execution path the RDBMS should take for queries." For simplicity's sake, we will refer to each related set of functionality by the term subsystem, rather than component, as it seems to more accurately reflect the organization of the various function libraries.
Each subsystem is designed to both accept information from and feed data into the other subsystems of the server. In order to do this in a standard way, these subsystems expose this functionality through a well-defined function application programming interface (API).[5] As requests and data funnel through the server's pipeline, the subsystems pass information to one another via these clearly defined functions and data structures. As we examine each of the major subsystems, we'll take a look at some of these data structures and methods.
MySQL Server Subsystem Organization
The overall organization of the MySQL server architecture is a layered, but not particularly hierarchical, structure. We make the distinction here that the subsystems in the MySQL server architecture are quite independent of each other.
In a hierarchical organization, subsystems depend on each other in order to function, as components derive from a tree-like set of classes. While there are indeed tree-like organizations of classes within some of the subsystems, notably in the SQL parsing and optimization subsystem, the subsystems themselves do not follow a hierarchical arrangement.
A base function library and a select group of subsystems handle lower-level responsibilities. These libraries and subsystems serve to support the abstraction of the storage engine systems, which feed data to requesting client programs. Figure 1 shows a general depiction of this layering, with different subsystems identified.
Note that client programs interact with an abstracted API for the storage engines. This enables client connections to issue statements that are storage-engine agnostic, meaning the client does not need to know which storage engine is handling the data request. No special client functions are required to return InnoDB records versus MyISAM records. This arrangement enables MySQL to extend its functionality to different storage requirements and media.
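One common way to get this kind of engine-agnostic behavior in C is a table of function pointers that each engine fills in. The sketch below illustrates the idea under that assumption; it is not MySQL's actual handler interface, and the names engine_ops, fetch_row, ENGINE_A, and ENGINE_B are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical engine interface; MySQL's real handler API differs. */
struct engine_ops {
    const char *name;
    /* Copies one record into buf and returns the number of bytes copied. */
    size_t (*read_row)(char *buf, size_t buf_len);
};

/* Helper shared by both toy engines: bounded, NUL-terminated copy. */
size_t copy_row(char *buf, size_t buf_len, const char *row)
{
    size_t n = strlen(row);
    if (buf_len == 0)
        return 0;
    if (n >= buf_len)
        n = buf_len - 1;
    memcpy(buf, row, n);
    buf[n] = '\0';
    return n;
}

size_t engine_a_read(char *buf, size_t buf_len)
{
    return copy_row(buf, buf_len, "row from engine A");
}

size_t engine_b_read(char *buf, size_t buf_len)
{
    return copy_row(buf, buf_len, "row from engine B");
}

const struct engine_ops ENGINE_A = { "engine_a", engine_a_read };
const struct engine_ops ENGINE_B = { "engine_b", engine_b_read };

/* The client-facing call is identical no matter which engine is behind
   it; only the ops table that gets passed in changes. */
size_t fetch_row(const struct engine_ops *engine, char *buf, size_t buf_len)
{
    return engine->read_row(buf, buf_len);
}
```

A caller would write fetch_row(&ENGINE_A, buf, sizeof buf) or fetch_row(&ENGINE_B, buf, sizeof buf) and handle the result the same way in both cases, which is the property the paragraph above describes for InnoDB and MyISAM records.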
Figure 1. MySQL subsystem overview
Base Function Library
All of MySQL's subsystems share the use of a base library of common functions. Many of these functions exist to shield the subsystem (and the developers) from needing to operate directly with the operating system, main memory, or the physical hardware itself.[6] Additionally, the base function library enables code reuse and portability. Most of the functions in this base library are found in the C source files of the /mysys and /strings directories. Table 2 shows a sampling of core files and locations for this base library.
Table 2. Some Core Function Files
|File|Contents|
|/mysys/array.c|Dynamic array functions and definitions|
|/mysys/hash.c/.h|Hash table functions and definitions|
|/mysys/mf_qsort.c|Quicksort algorithms and functions|
|/mysys/string.c|Dynamic string functions|
|/mysys/my_alloc.c|Some memory allocation routines|
|/mysys/mf_pack.c|Filename and directory path packing routines|
|/strings/*|Low-level string and memory manipulation functions, and some data type definitions|
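To give a feel for what a base-library helper of this kind does, here is a minimal sketch of a growable array in C. It is written from scratch for illustration and assumes nothing about the real code in /mysys/array.c; the names int_array, int_array_init, int_array_push, and int_array_free are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable array of ints; not the mysys implementation. */
struct int_array {
    int   *items;     /* heap buffer, NULL while empty */
    size_t count;     /* elements currently stored */
    size_t capacity;  /* elements the buffer can hold */
};

void int_array_init(struct int_array *a)
{
    a->items = NULL;
    a->count = 0;
    a->capacity = 0;
}

/* Appends value, doubling the buffer when it is full.
   Returns 0 on success, -1 if memory could not be allocated. */
int int_array_push(struct int_array *a, int value)
{
    if (a->count == a->capacity) {
        size_t new_cap = a->capacity ? a->capacity * 2 : 4;
        int *grown = realloc(a->items, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;
        a->items = grown;
        a->capacity = new_cap;
    }
    a->items[a->count++] = value;
    return 0;
}

/* Releases the buffer and resets the array to its empty state. */
void int_array_free(struct int_array *a)
{
    free(a->items);
    int_array_init(a);
}
```

Subsystems that need a resizable list can call helpers like these instead of managing realloc() themselves, which is the kind of shielding and code reuse the base library is meant to provide.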
1. At the time of this writing, the MySQL server consists of roughly 500,000 lines of source code.
2. Whether the developers chose to purposefully omit a discussion on the subsystem's communication in order to allow for changes in that communication is up for debate.
3. For examples, see MySQL: The Complete Reference, by Vikram Vaswani (McGraw-Hill/Osborne) and http://wiki.cs.uiuc.edu/cs427/High-Level+Component+Diagram+of+the+MySQL+Architecture.
4. The function init_server_components() in /sql/mysqld.cc is the odd exception. Really, though, this function runs through starting a few of the functional subsystems and initializes the storage handlers and core buffers.
5. This abstraction generally leads to a loose coupling, or dependence, of related function sets to each other. In general, MySQL's components are loosely coupled, with a few exceptions.
6. Certain components and libraries, however, will still interact directly with the operating system or hardware where performance or other benefits may be realized.
Conclusion
We hope this article has been a fun little excursion into the world of database server internals. So, what have we covered in this article? Well, we started off with some instructions on how to get your hands on the source code, and configure and retrieve the documentation in various formats. Then we outlined the general organization of the server's subsystems.
About the Authors
Mike Kruckenberg is a long-time MySQL devotee who has used MySQL personally and professionally since the early days of web-based applications. Besides having been the go-to guy for all things MySQL at his day (and night) jobs over the years, Mike is an active member of the MySQL community. In addition to being the coauthor of Pro MySQL, he is a coauthor on the MySQL Cluster Certification Study Guide and periodically writes about MySQL for Linux Magazine. He did the technical review for the soon-to-be published Expert MySQL (Apress) on MySQL source code modifications. Mike is a member of the MySQL Speakers, Writers, and Experts Guilds, regularly presents at tech conferences, and actively writes about MySQL and other (mostly) technical things at mike.kruckenberg.com.
Jay Pipes is the North American Community Relations Manager at MySQL. Coauthor of Pro MySQL (Apress, 2005), Jay has also written articles for Linux Magazine and regularly assists software developers in identifying how to make the most effective use of MySQL. He has given sessions on performance tuning at the MySQL Users Conference, RedHat Summit, NY PHP Conference, OSCON, and Ohio LinuxFest, among others. He lives in Columbus, Ohio, with his wife, Julie, and his four animals. In his abundant free time, when not being pestered by his two needy cats and two noisy dogs, he daydreams in PHP code and ponders the ramifications of __clone().
Source of This Material
Pro MySQL
By Michael Kruckenberg, Jay Pipes
Published: July 2005, Paperback: 768 pages
Published by Apress
eBook Price: $25.00
This material is from Chapter 4 of the book.
Reprinted with the publisher's permission.