The Myth of Open Source Security Revisited v2.0

This article is a follow-up to an article entitled The Myth of Open Source
Security Revisited. The original article tackled the common misconception
amongst users of Open Source Software (OSS) that OSS is a panacea when it comes
to creating secure software. It presented anecdotal evidence taken from an
article written by John Viega, the original author of GNU Mailman, to illustrate
its point. This article follows up that anecdotal evidence with an analysis of
similar software applications, their development methodology and the frequency
with which security vulnerabilities are discovered in them.

The purpose of this article is to expose the fallacy of the belief in the
“inherent security” of Open Source software and instead point to more reliable
means of ensuring that the security of a piece of software is high.

Apples, Oranges, Penguins and Daemons

When performing experiments to confirm a hypothesis on the effect of a
particular variable on an event or observable occurrence, it is common practice
to utilize control groups. In an attempt to establish cause and effect in such
experiments, one tries to hold constant all variables that may affect the
outcome except for the variable that the experiment is interested in.
Comparisons of the security of software created by Open Source processes and
software produced in a proprietary manner have typically involved several
variables besides development methodology.

A number of articles have been written that compare the security of Open Source
development to proprietary development by comparing security vulnerabilities in
Microsoft products to those in Open Source products. Noted Open Source pundit
Eric Raymond wrote an article on NewsForge in which he compares Microsoft
Windows and IIS to Linux, BSD and Apache. In the article, Raymond states that
Open Source development implies that “security holes will be infrequent, the
compromises they cause will be relatively minor, and fixes will be rapidly
developed and deployed.” However, upon investigation it is disputable that Linux
distributions have fewer or less severe security vulnerabilities than recent
versions of Windows. In fact, the belief in the inherent security of Open Source
software over proprietary software seems to be the product of a single
comparison: Apache versus Microsoft IIS.

There are a number of variables involved when one compares the security of
software such as Microsoft Windows operating systems to Open Source UNIX-like
operating systems, including the disparity in their market share, the
requirements and dispositions of their user base, and the differences in system
design. To better compare the impact of source code licensing on the security of
the software, it is wise to reduce the number of variables that will skew the
conclusion. To this end, it is better to compare software with similar system
design and user base than to compare software applications that are
significantly distinct. The following section analyzes the frequency of the
discovery of security vulnerabilities in UNIX-like operating systems including
HP-UX, FreeBSD, Red Hat Linux, OpenBSD, Solaris, Mandrake Linux, AIX and Debian
GNU/Linux.

Security Vulnerability Face-Off

Below is a listing of UNIX and UNIX-like operating systems with the number of
security vulnerabilities that were discovered in them in 2001 according to the
Security Focus Vulnerability Archive.


  • AIX: 10 vulnerabilities [6 remote, 3 local, 1 both]

  • Debian GNU/Linux: 13 vulnerabilities [1 remote, 12 local] + 1 Linux kernel
    vulnerability [1 local]

  • FreeBSD: 24 vulnerabilities [12 remote, 9 local, 3 both]

  • HP-UX: 25 vulnerabilities [12 remote, 12 local, 1 both]

  • Mandrake Linux: 17 vulnerabilities [5 remote, 12 local] + 12 Linux kernel
    vulnerabilities [5 remote, 7 local]

  • OpenBSD: 13 vulnerabilities [7 remote, 5 local, 1 both]

  • Red Hat Linux: 28 vulnerabilities [5 remote, 22 local, 1 unknown] + 12 Linux
    kernel vulnerabilities [6 remote, 6 local]

  • Solaris: 38 vulnerabilities [14 remote, 22 local, 2 both]

From the above listing one can infer that source licensing is not a primary
factor in determining how prone to security flaws a software application will
be. Specifically, proprietary and Open Source UNIX family operating systems are
represented on both the high and low ends of the frequency distribution.
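
The inference can be checked with a few lines of arithmetic. The sketch below is
only an illustration: it uses the per-system counts from the listing (leaving
out the separately listed Linux kernel counts), and the proprietary/open source
labels and the helper names `ranked` and `count_range` are annotations
introduced here, not part of the original data.

```python
# Vulnerability counts for 2001 as given in the listing (system-specific
# counts only; the separately listed Linux kernel counts are excluded).
# The license labels are this sketch's own annotation.
COUNTS = {
    "AIX": (10, "proprietary"),
    "Debian GNU/Linux": (13, "open source"),
    "FreeBSD": (24, "open source"),
    "HP-UX": (25, "proprietary"),
    "Mandrake Linux": (17, "open source"),
    "OpenBSD": (13, "open source"),
    "Red Hat Linux": (28, "open source"),
    "Solaris": (38, "proprietary"),
}


def ranked(counts):
    """Return (name, count, license) tuples sorted by ascending count."""
    return sorted(
        ((name, n, lic) for name, (n, lic) in counts.items()),
        key=lambda row: row[1],
    )


def count_range(counts, license_class):
    """Return (min, max) vulnerability counts for one license class."""
    values = [n for n, lic in counts.values() if lic == license_class]
    return min(values), max(values)


if __name__ == "__main__":
    for name, n, lic in ranked(COUNTS):
        print(f"{n:3d}  {name:<18} {lic}")
    print("proprietary range:", count_range(COUNTS, "proprietary"))
    print("open source range:", count_range(COUNTS, "open source"))
```

With these figures the proprietary systems span 10 to 38 vulnerabilities and
the Open Source systems span 13 to 28, so the two ranges overlap rather than
separating by license.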


Factors that are known to influence the security and quality of a software
application include practices such as code auditing (peer review),
security-minded architecture design, strict software development practices that
restrict certain dangerous programming constructs (e.g. using the str* or scanf*
family of functions in C) and validation & verification of the design and
implementation of the software. Reducing the focus on deadlines and shipping
only when the system is in a satisfactory state is also important.

Both the Debian and OpenBSD projects exhibit many of the aforementioned
characteristics, which helps explain why they are the Open Source UNIX operating
systems with the best security record. Debian’s track record is particularly
impressive when one realizes that Debian Potato consists of over 55 million
lines of code (compared to Red Hat’s 30 million lines of code).

The Road To Secure Software


Exploitable security vulnerabilities in a software application are typically
evidence of bugs in the design or implementation of the application. Thus the
process of writing secure software is an extension of the process behind writing
robust, high quality software. Over the years a number of methodologies have
been developed to tackle the problem of producing high quality software in a
repeatable manner within time and budgetary constraints. The most successful
methodologies have typically involved using the following software quality
assurance, validation and verification techniques: formal methods, code audits,
design reviews, extensive testing and codified best practices.

  1. Formal Methods: One can use formal proofs based on mathematical
    methods and rigor to verify the correctness of software algorithms. Tools for
    specifying software using formal techniques exist such as VDM and Z. Z
    (pronounced ‘zed’) is a formal specification notation based on set theory and
    first order predicate logic. VDM stands for “The Vienna Development Method”
    which consists of a specification language called VDM-SL, rules for data and
    operation refinement which allow one to establish links between abstract
    requirements specifications and detailed design specifications down to the
    level of code, and a proof theory in which rigorous arguments can be conducted
    about the properties of specified systems and the correctness of design
    decisions. The previous descriptions were taken from the Z FAQ and the VDM
    FAQ respectively. A comparison of both specification languages is available
    in the paper Understanding the differences between VDM and Z by I.J. Hayes
    et al.


  2. Code Audits: Reviews of source code by developers other than the
    author of the code are good ways to catch errors that may have been overlooked
    by the original developer. Source code audits can vary from informal reviews
    with little structure to formal code inspections or walkthroughs. Informal
    reviews typically involve the developer sending the reviewers source code or
    descriptions of the software for feedback on any bugs or design issues. A
    walkthrough involves the detailed examination of the source code of the
    software in question by one or more reviewers. An inspection is a formal
    process where a detailed examination of the source code is directed by
    reviewers who act in certain roles. A code inspection is directed by a
    “moderator”, the source code is read by a “reader” and issues are documented
    by a “scribe”.


  3. Testing: The purpose of testing is to find failures. Unfortunately,
    no known software testing method can discover all possible failures that may
    occur in a faulty application and metrics to establish such details have not
    been forthcoming. Thus a correlation between the quality of a software
    application and the amount of testing it has endured is practically
    non-existent.

    There are various categories of tests including unit, component, system,
    integration, regression, black-box, and white-box tests. There is some
    overlap among these testing categories.


    Unit testing involves testing small pieces of functionality of the
    application such as methods, functions or subroutines. In unit testing it is
    usual for other components that the software unit interacts with to be
    replaced with stubs or dummy methods. Component tests are similar to unit
    tests with the exception that dummy and stub methods are replaced with the
    actual working versions. Integration testing involves testing related
    components that communicate with each other while system tests involve testing
    the entire system after it has been built. System testing is necessary even if
    extensive unit or component testing has occurred because it is possible for
    separate subroutines to work individually but fail when invoked sequentially
    due to side effects or some error in programmer logic. Regression testing
    is the process of ensuring that modifications to a software module,
    component or system have not introduced errors into the software. A lack of
    sufficient regression testing is one of the reasons why certain software
    patches break components that worked prior to installation of the patch.


    Black-box testing, also called functional testing or specification
    testing, tests the behavior of the component or system without requiring
    knowledge of the internal structure of the software. Black-box testing is
    typically used to test that software meets its functional requirements.
    White-box testing, also called structural or clear-box testing, involves
    tests that utilize knowledge of the internal structure of the software.
    White-box testing is useful in ensuring that certain statements in the
    program are exercised and errors discovered. Code coverage tools aid in
    discovering what percentage of a system is being exercised by the tests.


    More information on testing can be found at the comp.software.testing FAQ.


  4. Design Reviews: The architecture of a software application can be
    reviewed in a formal process called a design review. In design reviews the
    developers, domain experts and users verify that the design of the system
    meets the requirements and that it contains no significant flaws of omission
    or commission before implementation occurs.


  5. Codified Best Practices: Some programming languages have libraries
    or language features that are prone to abuse and are thus prohibited in
    certain disciplined software projects. Functions like strcpy,
    gets, and scanf in C are examples of library
    functions that are poorly designed and allow malicious individuals to use
    buffer overflows or format string attacks to exploit the security
    vulnerabilities exposed by using these functions. A number of platforms
    explicitly disallow gets especially since alternatives exist.
    Programming guidelines such as those written by Peter Galvin in a Unix
    Insider article on designing secure software are used by development teams
    to reduce the likelihood of security vulnerabilities in software
    applications.
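
A codified best practice is easiest to enforce when it can be checked
mechanically. The following is a minimal sketch of such a check, assuming a
hypothetical project policy that bans a handful of C library calls; the `BANNED`
list and the `find_banned_calls` helper are made up for illustration and do not
come from any particular project or tool.

```python
import re

# Hypothetical project policy: C library calls that reviewers reject on sight.
# The list is illustrative and would be tuned per project.
BANNED = ("gets", "strcpy", "strcat", "sprintf", "scanf")

# Match a banned name used as a call: a whole identifier followed by "(".
# The leading \b keeps fgets, sscanf and similar names from matching.
_CALL = re.compile(r"\b(" + "|".join(BANNED) + r")\s*\(")


def find_banned_calls(source):
    """Return (line_number, function_name) for each banned call in C source text."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        code = line.split("//", 1)[0]  # drop trailing // comments (naive)
        for match in _CALL.finditer(code):
            hits.append((lineno, match.group(1)))
    return hits


SAMPLE = """\
char buf[16];
gets(buf);                      // unbounded read
fgets(buf, sizeof buf, stdin);  // bounded alternative
strcpy(buf, user_input);
"""

if __name__ == "__main__":
    for lineno, name in find_banned_calls(SAMPLE):
        print(f"line {lineno}: call to banned function {name}()")
```

This is a line-based text scan, not a C parser: it will miss calls hidden
behind macros and can be fooled by string literals or block comments, so it
complements rather than replaces a human code audit.
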
Projects such as the OpenBSD project that utilize most of the aforementioned
techniques in developing software typically have a low incidence of security
vulnerabilities.
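
The difference between unit, component and regression tests described under
Testing can be made concrete with a small sketch. Everything in it is
hypothetical: `is_login_allowed`, `stub_lookup` and `InMemoryAccounts` are
names invented for this illustration, and the lockout rule is an assumption
chosen only to give the tests something to check.

```python
def is_login_allowed(username, lookup_failed_attempts, max_attempts=3):
    """Allow login only while recorded failures stay below max_attempts.

    lookup_failed_attempts is the collaborating component: any callable
    that maps a username to a failure count.
    """
    return lookup_failed_attempts(username) < max_attempts


def stub_lookup(username):
    """Stub standing in for a real account store in unit tests."""
    return {"alice": 0, "mallory": 5}.get(username, 0)


class InMemoryAccounts:
    """Small working component used for the component-level test."""

    def __init__(self):
        self._failures = {}

    def record_failure(self, username):
        self._failures[username] = self._failures.get(username, 0) + 1

    def failed_attempts(self, username):
        return self._failures.get(username, 0)


def run_tests():
    # Unit test: the collaborator is replaced by a stub.
    assert is_login_allowed("alice", stub_lookup)
    assert not is_login_allowed("mallory", stub_lookup)

    # Component test: same function, working collaborator.
    accounts = InMemoryAccounts()
    accounts.record_failure("carol")
    assert is_login_allowed("carol", accounts.failed_attempts)

    # Regression test: pins the boundary case so that a later change to
    # the comparison (for example "<" becoming "<=") is caught.
    for _ in range(3):
        accounts.record_failure("bob")
    assert not is_login_allowed("bob", accounts.failed_attempts)
    return True


if __name__ == "__main__":
    print("all tests passed:", run_tests())
```

Passing the collaborator in as a parameter is one simple way to make the stub
substitution possible; larger projects often use a mocking library instead.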

Issues Preventing Development of Secure Open Source Software


One of the assumptions that is typically made about Open Source software is that
the availability of source code translates to “peer review” of the software
application. However, the anecdotal experience of a number of Open Source
developers including John Viega belies this assumption.

The term “peer review” implies an extensive review of the source code of an
application by competent parties. Many Open Source projects do not get peer
reviewed for a number of reasons, including:

  • the complexity of the code, combined with a lack of documentation, makes it
    difficult for casual users to understand the code well enough to give a
    proper review.

  • developers making improvements to the application typically focus only on
    the parts of the application that affect the feature being added instead of
    the whole system.

  • developers’ ignorance of security concerns.

  • complacency in the belief that because the source is available, it is being
    reviewed by others.

Also, the lack of interest in unglamorous tasks like documentation and testing
amongst Open Source contributors adversely affects the quality of the software.
However, all of these issues can be, and are, solved in projects with a
disciplined software development process, clearly defined roles for the
contributors and a semi-structured leadership hierarchy.

Benefits of Open Source to Security-Conscious Users


Despite the fact that source licensing and source code
availability are not indicators of the security of a software application, there
is still a significant benefit of Open Source to some users concerned about
security. Open Source allows experts to audit their software options
before making a choice and also in some cases to make improvements without
waiting for fixes from the vendor or source code maintainer.

One should note that the complexity and size of the code base constrain how
feasible it is for users to audit the software. For instance, it is unlikely
that a user who is choosing Linux as a web server for a personal homepage will
scrutinize the TCP/IP stack code.


References

  1. Frankl, Phyllis et al. Choosing a Testing Method to Deliver Reliability.
    Proceedings of the 19th International Conference on Software Engineering,
    pp. 68–78, ACM Press, May 1997.
    <http://citeseer.nj.nec.com/frankl97choosing.html>

  2. Hamlet, Dick. Software Quality, Software Process, and Software Testing.
    1994. <http://citeseer.nj.nec.com/hamlet94software.html>

  3. Hayes, I.J., C.B. Jones and J.E. Nicholls. Understanding the differences
    between VDM and Z. Technical Report UMCS-93-8-1, University of Manchester,
    Computer Science Dept., 1993.
    <http://citeseer.nj.nec.com/hayes93understanding.html>

  4. Miller, Todd C. and Theo de Raadt. strlcpy and strlcat – consistent, safe,
    string copy and concatenation. Proceedings of the 1999 USENIX Annual
    Technical Conference, FREENIX Track, June 1999.
    <http://www.usenix.org/events/usenix99/full_papers/millert/millert_html/>

  5. Viega, John. The Myth of Open Source Security. Earthweb.com.
    <http://www.earthweb.com/article/0,,10455_626641,00.html>

  6. Gonzalez-Barahona, Jesus M. et al. Counting Potatoes: The Size of Debian
    2.2. <http://people.debian.org/~jgb/debian-counting/counting-potatoes/>

  7. Wheeler, David A. More Than A Gigabuck: Estimating GNU/Linux’s Size.
    <http://www.counterpane.com/crypto-gram-0003.html>



Acknowledgements

The following people helped in proofreading this article and/or offering
suggestions about content: Jon Beckham, Graham Keith Coleman, Chris Bradfield,
and David Dagon.


About the Author

Dare Obasanjo is a recent graduate of the Georgia Institute of Technology, with a degree with honors in computer science. (This article was written there.) The author is a vigorous participant in discussion forums such as Slashdot, Kuro5hin, and Advogato, on various aspects of software development. He has written numerous articles on the subject.



(c) 2002 Dare Obasanjo
