It is easy to see the impact that open source software has had on the developer community. A Google search for “open source” turns up over nineteen million results. That’s more than a similar search for “Oracle”, and nearly one-fifth as many as for “Microsoft”, the king of proprietary software. This article will describe what open-source software is, and examine a few of the products that are out there on the open-source landscape.
What Is Open-Source Software?
According to the Open Source Initiative (http://www.opensource.org), open-source software is any software that is distributed to the developer community with the source code. Traditionally, software vendors distribute only the binaries for their products, leaving developers in the dark as to the inner workings of the products they use. Because open-source software vendors distribute the source code, developers can readily improve the product by creating patches for problems, or by making enhancements.
Although most open-source software is free to the community, there are sometimes strings attached. Let’s take a look at the most common types of licensing for open-source software.
Licensing
GNU General Public License (GPL)
Under the General Public License, or GPL, the licensee is free to distribute or modify the software product, provided that the modifications, both in binary and source code form, remain free to the public under the terms of the GPL. In addition, the licensee must provide all build scripts, interface definitions, and installation scripts necessary to compile and install the program.
The GPL makes no provision for a warranty of any kind, and this fact must be displayed prominently in the source code, as well as on the user interface of the program, if applicable.
Works that contain the original software, or are derived from it, must fall under the GPL. Thus, no one is allowed to create proprietary software based on a GPL-licensed product, ensuring that free software remains free.
Anyone who modifies and redistributes software under the GPL must prominently display the fact that the new version is different from the older version. Thus, if a new version of the product is not up to par, that fact won’t tarnish the reputation of the original product.
GNU Lesser General Public License (LGPL)
The Lesser General Public License (LGPL), the successor to the GNU Library General Public License, is the GPL equivalent for open-source software libraries. Under this license, the libraries themselves, and any works directly based on them, fall under provisions similar to those of the GPL: they must be freely available to the public in both source code and binary form. However, unlike the GPL, the LGPL allows programs that simply use the libraries to be excluded from its terms, and such programs can be proprietary.
Because the LGPL is less restrictive, free software developers who license software under its terms enjoy less of an advantage over those who are paid for their software. However, the LGPL still provides some benefits to the open-source community. First, by providing libraries under the LGPL rather than the GPL, free software developers can encourage the use of their libraries as a default standard in the industry. Second, encouraging the use of the libraries may help to speed the growth of the use of other free software, such as operating systems, even by paid software vendors.
Mozilla Public License
The Mozilla Public License, or MPL, is very similar to the GPL in spirit, with a few exceptions. First of all, this license is more precise, from a legal standpoint, and addresses specifically some areas, such as intellectual property rights, license termination, government licensing, and liability issues, that are not explicitly mentioned in the GPL.
Unlike the GPL, the MPL allows the licensee to create larger works that include the licensed software, while still allowing those programs to be distributed for payment, as long as the portion of the code licensed under the MPL remains covered by it.
Many developers are now embracing this more strictly defined license for their open-source software, rather than continuing to use the GPL. The clause allowing larger works using the licensed product, as with the LGPL, is a boon for developers wishing to make their products a de facto industry standard.
Other Open-Source Licenses
There are many other licenses available to those who create open-source software. The BSD and MIT licenses were created to address products developed at the University of California and MIT, respectively. These licenses are very similar, and are available through the OSI as templates for those wishing to use them for their own open-source products.
Other licenses address open-source products based on software written by particular organizations. For example, the Apache and IBM licenses cover software written for the Apache Foundation and IBM, or products utilizing software developed by them.
The bottom line is that most open-source software is free, as long as you distribute any modifications you make under the terms of the original license agreement, and no warranty is provided, unless otherwise stated, by the developer of the product.
How Do Open-Source Developers Make Money?
If open-source software is free, how can any software developer hope to make money on it? They do so by providing value-added services and products that are not covered by the original licensing agreements.
Because open-source software is usually distributed “as-is”, software developers can earn income by providing warranty coverage for the products they develop.
Some vendors provide support for open-source products for a fee. Many provide various levels of support, just as commercial software vendors do, ranging in price and quality from email-only support up to on-site consulting and troubleshooting.
Open-source software is largely developed by volunteers, and developers are notorious for neglecting documentation other than that found in the source code. To fill this gap, many open-source software developers provide excellent documentation for a nominal fee. Most provide documentation in electronic form, but some will provide hard-copy versions for a larger fee.
Although open-source licenses guarantee licensees the right to receive source code for the products licensed, they do allow licensors to charge for such distributions. Most open-source software, and its associated source code, can be downloaded from the Internet free of charge, but there are many vendors who will provide distributions, sometimes with additional documentation or other bonuses, on CD for a fee.
Now that we have defined open-source software and discussed how it is licensed, let’s have a look at some of the products available to developers.
Operating Systems
Linux
By far, the eight-hundred-pound gorilla of the open-source operating system community is Linux. This UNIX-like operating system is estimated to have up to 27 million users worldwide, according to the OSI, although it is practically impossible to verify exactly how many users are out there. Developed by Linus Torvalds as an academic project in the early 1990s, Linux has now grown into a corporate tool used by Fortune 500 companies and distributed on servers by prominent server vendors, such as IBM.
Linux comes in many flavors, including Red Hat, Mandrake, Slackware, Turbo Linux, SuSE, and others. By far, the most popular version is Red Hat, which is widely distributed by commercial software vendors. This flavor of Linux, unlike some of the others, offers excellent support and documentation, and enjoys the largest following of any Linux flavor.
Remaining true to its MINIX roots, Linux still has a command-line interface; however, open-source desktops, such as Gnome and KDE, that provide Windows-like functionality are available.
With the advent of open-source productivity tools, such as StarOffice and OpenOffice, Linux is even beginning to penetrate the desktop market. The number of applications available for Linux is growing, attracting many who were formerly Windows users out of necessity, but who can now find the programs they need in the open source community.
One downside to Linux, according to an article in Internet Week, is the lack of quality development tools for the platform. According to a survey by research firm Evans Data Corp., 25% of developers surveyed said that critical development tools, such as compilers, are only adequate, or need work, on Linux. Despite this, Linux’s stability and dependability have attracted a huge following.
For more information on Linux, visit http://www.linux.org.
BSD
BSD is a UNIX-based operating system descended from code originally written at UC Berkeley. Although it is not as popular as it once was, due to the arrival of Linux, BSD and its many flavors are still a notable presence on the open source scene.
Various flavors of BSD still exist, including FreeBSD, NetBSD, OpenBSD, and Darwin. The momentum behind BSD has decreased sharply due to the disappearance of several of the companies that provided support and patches for the various operating systems. Also, the popularity of Linux has led more developers to write code for that operating system, leaving BSD languishing.
For more information about BSD, visit http://www.bsd.org/.
Web Servers and Application Servers
Apache HTTP Server
The Apache HTTP server is, by far, the most popular web server on the World Wide Web. According to a Netcraft survey (http://news.netcraft.com/archives/web_server_survey.html), Apache leads the market with over 67% of web servers running Apache. This compares with only 21% for Microsoft, 3% for the Sun One server, and 1.6% for Zeus.
Apache was originally created at the National Center for Supercomputing Applications, under a different name. When development on that HTTP server stopped in 1994, some of the major players in the project met to apply patches to the product and create a more stable release. This new release, which branched from the original NCSA server, has since become the most widely used server on the Web, and is one of the open source movement’s biggest success stories.
For more information about the Apache HTTP server, visit http://www.apache.org/.
Tomcat
With the arrival of Java servlets and JSP on the scene, it was quite natural that the people who brought you the Web’s most popular web server would also build a servlet engine. Tomcat, which serves as the reference implementation for Sun Microsystems’ servlet and JSP specifications, is a part of the Apache Jakarta Project, a project dedicated to providing open-source, Java-based software solutions. Tomcat is available as a module that is pluggable into the Apache modules framework, but may also be used for standalone servlet and JSP development. Tomcat does not support the full J2EE specification, but it is sometimes used as the default servlet container for other open-source products that do support the full spec.
For more information about Tomcat, visit http://jakarta.apache.org/tomcat.
JBoss
JBoss is an open-source application server that supports the full J2EE specification. As mentioned above, JBoss incorporates the Tomcat servlet container as part of its code. For a closer look at JBoss and its features, see “Jump Into JBoss” on this site, or visit http://www.jboss.org/.
Databases
Most enterprise solutions require some sort of database to store their data. However, the high licensing fees, complexity, security issues, and high cost of database administration for commercial databases, such as Oracle and MS SQL Server, have made open-source alternatives attractive to a growing number of businesses. The two major players in this arena are PostgreSQL and MySQL, which we’ll examine in the following sections.
PostgreSQL
PostgreSQL is the open-source descendant of the Postgres database, created at the University of California at Berkeley. It incorporates many features found in commercial databases that developers have come to expect, such as complex queries, triggers, foreign keys, views, transactional integrity, and replication. Developers can also extend the database by defining new data types, functions, operators, aggregate functions, and even procedural languages. This last item is perhaps the most intriguing for developers accustomed to PL/SQL, Tcl, Perl, and Python, because PostgreSQL allows stored procedures to be implemented in these languages or, in the case of PL/SQL, in the closely related PL/pgSQL. Although not all of the features of each language are included in PostgreSQL, the feature sets for each are still powerful, familiar to those who have used them before, and, of course, always improving.
For more information about PostgreSQL, visit http://www.postgresql.org/.
MySQL
MySQL is a powerful but simple database geared toward applications with high transaction volumes and large quantities of data. This performance and scalability, however, come at the price of feature richness. MySQL does not currently support stored procedures or triggers, although they are slated for versions 5.0 and 5.1, respectively. Also, MySQL constrains developers to the SQL-92 specification, which lacks some of the SQL capabilities found in commercial databases.
The administration for MySQL is very simple. Administration is done within the MySQL command-line client, which is similar to SQL*Plus for Oracle. Also, numerous command-line utilities exist for performing various administrative functions, such as loading data, and repairing and analyzing tables.
Although MySQL lacks some features, it is capable of handling large volumes of data. According to the latest MySQL documentation, the database has been used in production with thousands of tables and millions of records without significant performance issues.
For more information about MySQL, visit http://www.mysql.org/.
Development Tools
CVS
Source code control is an integral part of any development effort. The Concurrent Versions System, or CVS, available at http://www.cvshome.org/, is the open-source standard for source code control. Virtually all open-source software projects make their code available via a CVS repository.
CVS runs on numerous platforms, including Red Hat Linux, Win32, Mac OS X, and some versions of VMS. There is a cross-platform Java version available, as well.
CVS is usually used as a command-line tool, but various open-source graphical front ends exist for several platforms. Also, CVS can interface directly with Ant (see below for details).
Ant
Who among us hasn’t struggled with “make” files? The amount of time spent figuring out whether a command didn’t execute because of a pesky leading space must have cost companies millions. In response to this problem, James Duncan Davidson, original author of Apache Tomcat, created a cross-platform build tool called Ant (“Another Neat Tool”) to overcome the problems with “make”, “jam”, and their cousins. The first release of Ant, version 1.1, came out in July of 2000, and Ant has been a standby for Java developers ever since.
Ant automates the build and deployment process for Java developers, handling tasks such as directory cleanup and creation, code check-out, compilation, file copying, JARring and WARring files, and even running JUnit test cases. The complete functionality of Ant is too extensive to discuss here, but visiting http://ant.apache.org/ will provide more information. If the functionality provided in Ant is not sufficient, Ant allows developers to create their own custom tasks.
Eclipse
Eclipse is a cross-platform, Java-based development environment. Although the creators of Eclipse originally developed this environment with Java in mind, Eclipse provides a plug-in interface that allows the IDE to be used for other languages, as well. Currently, the official Eclipse Tools Project has plug-ins for C/C++, COBOL, and, of course, Java, but a host of community-based plug-ins exist for other languages and development tools, as well as for JSP, servlet, and Struts development.
Currently, Eclipse is available on various Linux flavors, Win32, and Solaris. If your platform is not among these but a Java Virtual Machine exists for it, you may still be able to use this IDE.
More information can be found at http://www.eclipse.org
Programming Languages
Perl
Perl, sometimes called “the duct tape of the Internet”, is perhaps the granddaddy of all open-source programming languages. Larry Wall developed it back in 1987. Now, according to http://www.perl.org, Perl has over a million users, and is a widely used tool for Common Gateway Interface (CGI) scripting, which allows web servers to extend their processing capabilities beyond simply serving up web pages.
Perl is an interpreted language, and thus can run on any operating system that has a Perl interpreter. It includes many useful features, including extremely powerful regular-expression and string manipulation functions, database connectivity, and the ability to link to C/C++ native libraries. Developers can write Perl programs either procedurally, or using the object-oriented paradigm.
Python
Python is a cross-platform, object-oriented, extensible scripting language used widely for Internet development. Its standard libraries support regular expressions, advanced string processing, Internet protocols (HTTP, SMTP, POP, and so forth), unit testing, logging, Python language parsing, and operating system calls. The language itself is extensible in either C or C++.
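To give a feel for those standard libraries, here is a minimal sketch that combines regular expressions, string processing, and logging. The helper name extract_hosts and the sample sentence are invented for illustration and are not part of any library.

```python
import logging
import re

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("links")

# Illustrative pattern: an http:// URL runs until whitespace, a comma, or ")".
URL_PATTERN = re.compile(r"http://[^\s,)]+")


def extract_hosts(text):
    """Return the host name of each http:// URL found in text, in order."""
    hosts = []
    for url in URL_PATTERN.findall(text):
        # Drop the scheme, then keep everything before the first "/".
        host = url[len("http://"):].split("/", 1)[0]
        # A URL at the end of a sentence may carry a trailing period.
        hosts.append(host.rstrip("."))
    log.info("found %d URL(s)", len(hosts))
    return hosts


sample = "Visit http://www.python.org or http://ant.apache.org/ for details."
print(extract_hosts(sample))
```

Everything used here (re, logging, and the string methods) ships with the interpreter, so the script needs no third-party packages.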
Python’s author, Guido van Rossum, began developing Python to address complaints he had about an interpreted language he used back in 1989. The first official release came out in 1991, and Python has gained popularity as a CGI language in recent years.
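As a rough illustration of what a CGI program does, the sketch below builds a response the way a web server would expect it: header lines, a blank line, then the page body. It assumes the server passes the query string in the QUERY_STRING environment variable, as the CGI convention specifies; the function name build_response and the "name" parameter are hypothetical.

```python
import os
from html import escape
from urllib.parse import parse_qs


def build_response(query_string):
    """Build a complete CGI response: headers, a blank line, then the body."""
    params = parse_qs(query_string)
    # Fall back to a default greeting when no "name" parameter was sent.
    name = params.get("name", ["world"])[0]
    # Escape user input so it cannot inject markup into the page.
    body = "<html><body><p>Hello, %s!</p></body></html>" % escape(name)
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # The web server places the part of the URL after "?" in QUERY_STRING.
    print(build_response(os.environ.get("QUERY_STRING", "")))
```

Keeping the response logic in a plain function, separate from the environment lookup, makes the script easy to exercise without a web server.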
More information can be found at http://www.python.org
PHP
PHP is a powerful, object-oriented scripting language primarily used for generating dynamic web pages. The language combines syntactic features from C, Java, and Perl, and has extensive support for string processing, Internet protocols, and databases. PHP is also easily extensible, with thousands of modules available in the PHP community.
PHP was originally developed as a Perl extension by Rasmus Lerdorf to fulfill his own personal web site needs. After some additional development, Lerdorf realized the potential for the language, and, along with others, began the development of PHP as it exists today. According to http://www.php.net/, PHP powers millions of Internet sites, accounting for up to 20% of all sites on the Web.
Testing
JUnit
Unit testing has always been a necessary evil for software developers. Now, JUnit, a unit-testing tool for Java developers, can take some of the drudgery out of this important task.
JUnit is basically a Java framework for writing test cases built on assertions, sharing test data, running test suites automatically, and verifying the results. By extending JUnit classes, developers can test their business logic by simply running the JUnit test suite, just like any other Java program. Since developers code the expected outcome into each test case, there is no human intervention required to determine which test cases passed, and which failed.
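The same pattern can be sketched with Python's unittest module, which follows the xUnit style that JUnit popularized. This is an analogous example rather than JUnit itself, and the business function order_total_cents is invented for illustration.

```python
import unittest


def order_total_cents(prices_cents, tax_percent):
    """Hypothetical business logic: sum prices in cents and add whole-percent tax."""
    subtotal = sum(prices_cents)
    return subtotal + subtotal * tax_percent // 100


class OrderTotalTest(unittest.TestCase):
    def setUp(self):
        # Test data shared by every test method in this class.
        self.prices = [1000, 550]

    def test_total_includes_tax(self):
        # The expected outcome is coded into the test itself.
        self.assertEqual(order_total_cents(self.prices, 10), 1705)

    def test_empty_order_is_zero(self):
        self.assertEqual(order_total_cents([], 10), 0)


def run_suite():
    """Collect the test methods into a suite, run it, and return the result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(OrderTotalTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    result = run_suite()
    print("all passed" if result.wasSuccessful() else "failures found")
```

Because each assertion carries its own expected value, the runner can report passes and failures without anyone inspecting the output by hand.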
For more information about JUnit, visit http://www.junit.org/.
Bugzilla
No matter how carefully development is done, bugs always creep into the system. Bugzilla is an open-source defect tracking system based on Perl that incorporates many of the features of expensive commercial products. Some features include inter-bug dependency graphing, advanced reporting capabilities, a stable RDBMS back-end, support for email, XML, console, and HTTP APIs, and integration with various version management systems, including CVS.
Bugzilla still has some challenges ahead. Most notably, future enhancements include adding more flexibility to the bug-reporting portion of the application, as well as addressing performance problems in a few areas. Still, Bugzilla is a robust, widely used defect tracking system that is fast becoming an industry standard.
For more information about Bugzilla, visit http://www.bugzilla.org/.
Cactus
While JUnit is used to test business logic, Cactus, part of the Apache Jakarta project, is used to test server-side Java code, such as servlets, JSPs, and EJBs. This framework extends the capabilities of JUnit so that appropriate HTTP requests are sent to the server, where the processing occurs, and the server-side output is then sent back to Cactus. Developers must extend the Cactus classes, as well as write JUnit test suites, to perform their tests.
For more information about Cactus, visit http://jakarta.apache.org/cactus.
End-User Applications
OpenOffice
The OpenOffice productivity suite is an open-source alternative to the Microsoft Office productivity suite. Although the default format for OpenOffice documents is XML, OpenOffice can read and write documents in MS Office-compatible formats, and can export to PDF and Macromedia Flash formats. OpenOffice is also compatible with Sun Microsystems’ StarOffice suite, a commercial offering based on OpenOffice 1.1 with a few more bells and whistles.
OpenOffice includes a word processor, spreadsheet, presentation tool, drawing tool, and a database tool. Additionally, OpenOffice supports a variety of languages, and allows vertical and bi-directional writing, which are common in languages such as Hebrew and Japanese.
OpenOffice also supports user macros, which can be used to perform simple, repetitive tasks. For more complex tasks, users can extend OpenOffice with the OpenOffice SDK, using a wide variety of languages and technologies, such as C++, Java, Basic, OLE, and XML. Third-party vendors can further extend OpenOffice’s capabilities by implementing tools using the OpenOffice Add-On Framework.
Web Browsers and Email
Mozilla
Mozilla is arguably the most popular open-source web browser/email program. It offers features similar to the Netscape Communicator suite (in fact, Netscape 7.x is based on Mozilla). These features include a web browser, email and newsgroup reader, web page composer, address book, and chat program.
Mozilla Firebird is another open-source browser, based on the Mozilla code base. Firebird is not simply a standalone version of the Mozilla browser included in the Mozilla suite; the user interfaces, as well as some features, are significantly different.
Likewise, Mozilla Thunderbird is an open-source mail reader based on the Mozilla codebase. Its user interface and customization options are different from those offered by the Mozilla email client.
For more information about these products, visit http://www.mozilla.org/.
Amaya
Amaya is an open-source browser/editor offered by the W3C. The project began as an HTML 4.0/CSS browser and editor, but has since expanded to include XML, XHTML, MathML, and SVG. The ultimate goal is to incorporate as many technologies endorsed by the W3C as possible into the product.
For more information about Amaya, visit http://www.w3.org/amaya.
Where to Get Open-Source Software
There are many resources for collections of open-source products. Below is a brief list of some of the most popular sites.
- http://www.apache.org/ – The home site for the Apache HTTP server, as well as the Jakarta Project, a great resource for Java frameworks and development tools.
- http://www.sourceforge.net/ – A hosting site for open-source developer tools.
- http://www.osdir.com/ – Directory of everything and anything open-source. This is probably the best site to visit if you wish to explore what’s out there.
- http://www.freshmeat.net/ – Numerous open-source products, including some not specifically geared towards development.
Conclusion
In this article, we have discussed what open-source software is, what’s out there, and where to find it. The open-source software products discussed here are only the tip of the iceberg. There is a vast world of open-source software out there to be explored, free for the taking.
About the Author
David Thurmond is a Sun Certified Developer with over eleven years of
software development experience. He has worked in the agriculture,
construction equipment, financial, and home improvement industries.