Apache Server on Windows

  • March 15, 2000
  • By Rich Bowen

History

Before Apache was available on Windows, you had basically two choices. You could use IIS, which was pretty pathetic, especially in those days. Or you could use Netscape server. The Netscape servers were pretty good, because the folks that were working on them (some of them at least) were the same folks that worked on the NCSA server. But as they made the servers fancier and fancier, they became harder and harder to configure and use.

Finally, in October of 1997 Apache 1.3 came out, with support for Windows NT and Windows 95. And there was much rejoicing. I started running Apache on my NT machines about 2 days after that.

Apache on Windows is "entirely experimental."

The Apache documentation contains the following warning:

"Windows support is entirely experimental, and is recommended only for experienced users."

This is quickly followed by:

"Apache on NT has not yet been optimized for performance. Apache still performs best, and is most reliable on Unix platforms."

And one can certainly not argue with those assertions. I'd take Apache on my Linux machines over Apache on NT any day. However, the reasonable thing to compare it to is not Apache on Unix, but IIS on NT, since that is really what the choice is.

Since late in 1997, when I started running Apache on NT, I've had a sneaking feeling that Apache was outperforming IIS, but I did not have any real evidence of this. And everything that I read about this seemed to say that I was way off base, and that IIS was much faster.

In December, 1999 I did some of my own testing. I am running Apache and IIS on the same server, for reasons that I won't go into here. So, the hardware is identical, and the server load is identical. I did some benchmarking on a day immediately after Christmas, when hardly anyone was in the office - this is an internal server - so server load was not a factor. And I did the tests over a 100MB switched network, so network latency was not really an issue either. But, regardless of that, conditions were identical for each HTTP server.

Using a simple Perl program, I tested performance on the two servers. First, I just fetched a 1K HTML document. Next, I got a simple "helloworld" Perl CGI program, to test CGI performance. The results were pleasing:

D:\Apachecon>perl benchmark.pl
GET


Benchmark: timing 2000 iterations of Apache, IIS...
    Apache: 34 wallclock secs ( 6.71 usr +  4.42 sys = 11.13 CPU)
       IIS: 31 wallclock secs ( 6.75 usr +  4.41 sys = 11.16 CPU)
CGI


Benchmark: timing 2000 iterations of Apache, IIS...
    Apache: 62 wallclock secs ( 6.16 usr +  4.13 sys = 10.28 CPU)
       IIS: 65 wallclock secs ( 6.31 usr +  4.02 sys = 10.33 CPU)


D:\Apachecon>perl benchmark.pl
GET


Benchmark: timing 2000 iterations of Apache, IIS...
    Apache: 34 wallclock secs ( 6.50 usr +  4.55 sys = 11.05 CPU)
       IIS: 31 wallclock secs ( 6.65 usr +  4.43 sys = 11.08 CPU)
CGI


Benchmark: timing 2000 iterations of Apache, IIS...
    Apache: 63 wallclock secs ( 5.73 usr +  3.82 sys =  9.54 CPU)
       IIS: 64 wallclock secs ( 6.15 usr +  3.97 sys = 10.12 CPU)

If you're not familiar with the way that the Perl Benchmark module does things, what those results mean is that Apache consistently outperformed IIS on the two simplest things that a web server does - serving HTML pages, and executing CGI programs. Yes, I ran more than just the two tests. These are just sample results. And clearly, a .03 second difference in CPU time over 2000 iterations is not a huge difference, but it was gratifying to see that Apache consistently came out ahead.

In case you'd like to repeat the tests for yourself, here's the code that I used:

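use Benchmark;
use LWP::Simple;

# NOTE: as published, the GET test requests the same CGI URLs as the CGI test.
# To time static pages, point these two URLs at a 1K HTML document instead.
print "GET\n\n";
timethese(2000, {
        'Apache' => 'get("http://servername/cgi-bin/test.pl")',
        'IIS' => 'get("http://servername:90/scripts/test.pl")',
});

# Fetch the "helloworld" CGI program from each server.
print "CGI\n\n";
timethese(2000, {
        'Apache' => 'get("http://servername/cgi-bin/test.pl")',
        'IIS' => 'get("http://servername:90/scripts/test.pl")',
});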
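
The test.pl program that the benchmark fetches is not shown here. A minimal sketch of what such a "helloworld" CGI program might look like follows; the function name hello_response and the exact HTML are my assumptions, not the script that was actually benchmarked.

```perl
#!/usr/bin/perl
# Hypothetical test.pl: a minimal "hello world" CGI program.
use strict;
use warnings;

# Build the complete CGI response: the Content-type header,
# a blank line to end the headers, then a small HTML body.
sub hello_response {
    return "Content-type: text/html\n\n"
         . "<html><body><h1>Hello, world</h1></body></html>\n";
}

print hello_response();
```

Placed in cgi-bin on Apache and in scripts on IIS, the same file can serve as the target for both servers.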

Now, IIS advocates will say that CGI is not the way that you are supposed to do things on IIS. As of this writing, I don't yet know how to do stuff in ASP, so I can't say how it compares to using an Apache scripting language like Perl. However, the Microsoft claim has always been that IIS is substantially faster than Apache, and yet it can't even serve an HTML document faster, on its own native platform.

So, while it is certainly true that Apache performs better on Unix than on NT, it is no slouch on NT, and compares very well to its competition.

Apache or IIS?

When you install NT, there's IIS, along for the ride. So it is, in a sense, free. So why not use it instead of Apache? Allow me to suggest three reasons.

Configuration: IIS gives you only a very limited number of things that you can configure. The configuration GUI, while moderately easy to use, simply does not give you the range of configuration options that you get with Apache. Apache assumes that you might want to configure everything, and makes it possible to do that.

Impossible to extend: IIS has nothing analogous to Apache's modules, which would let you extend the behavior of the server. Obviously, this is a limitation of being closed-source, so I expect that this would be difficult to get around, even if Microsoft were interested in doing so.

Authentication: IIS uses NT accounts to manage authenticated access to the web server. This might be a good idea on an Intranet (although, personally, I don't think so) but is a really bad idea on a web server out on the internet. Sure, you can lock down NT accounts so that they don't have access to stuff, but creating NT user accounts, and then giving people out on the internet those logins, seems like a recipe for getting hacked. And that's in addition to the nightmare of managing user accounts.

Apache and IIS are by no means the only choices for running a Webserver on NT. WebServer Compare lists 23 HTTP servers that run on NT, and I'm aware of at least one that they missed. To me, Apache is the clear choice, but take a look at the list. Perhaps something there will better meet your specific needs.

Differences between Apache Unix and Apache NT

There are some differences between Apache on Unix and Apache on NT. While you may not care too much about the differences in the actual way that the code works, you probably will want to know about the differences in the way that you configure and use the server.

Threading vs. forking

One of the biggest differences between Apache on Unix and Apache on NT is between threading and forking.

On Unix, Apache forks multiple child processes, each of which listens for and serves requests, maintaining contact with the parent process. After a while, a child will die off, and the parent process will fork a new process to take its place. There are a number of configuration directives that let you control this behavior - how many processes are forked, the maximum, and minimum, number of processes that can be going at any one time, how long a process is allowed to live, and so on.

On NT, there's no such thing as forking, and so this had to be handled differently. There are two Apache processes running on your NT machine. One of them is the parent process, and the other is the child process that actually handles requests. Within this child process, there are multiple threads, which can serve requests simultaneously. There can be a large number of threads at the same time, and Apache creates additional threads, as necessary, in much the same way that it forks new child processes on Unix.
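
As a rough illustration of the two models, here is how the tuning directives might look in httpd.conf on each platform. These are standard Apache 1.3 directives, but the values are illustrative only, not recommendations, and the two groups are alternatives for different machines rather than one file.

```apache
# Unix: a pool of forked child processes (illustrative values).
StartServers          5
MinSpareServers       5
MaxSpareServers      10
MaxClients          150
MaxRequestsPerChild  30

# Windows NT: one child process serving requests with many threads.
ThreadsPerChild      50
# 0 means the single child is never recycled after N requests.
MaxRequestsPerChild   0
```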

Whether forking or threading is better, for some value of "better," is a discussion for another day. There are strong opinions on either side of the argument, and this is rather beyond the scope of what we're talking about.

Unix and Windows refer to files differently. Windows has the notion of drive letters, which Unix systems don't have. And, by whatever twist of history, Unix uses forward slashes (/) to denote directories, while Windows/DOS uses back slashes (\).

The general rules are these:

You don't have to specify the drive letter if the resource is on the same drive as your ServerRoot. So if, for example, you have

ServerRoot "c:/httpd"
then you do not need to specify drive letters in other directives:
Alias /perldocs /perl/html
But you can if you want to:
Alias /perldocs C:/perl/html
And, as with Apache on Unix, relative paths are assumed to be relative to the ServerRoot, and so don't require a full disk path:
PidFile logs/httpd.pid
You should use quotes around any disk path with spaces in it. You should probably avoid using spaces in disk paths, on general principle, but Apache handles them correctly, if you feel the need to use them.
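
Putting those rules together, a fragment of httpd.conf might look something like this. The directory names are made up for illustration.

```apache
ServerRoot "C:/httpd"

# A path with spaces must be quoted (hypothetical directory name).
DocumentRoot "C:/My Web Site/htdocs"

# No drive letter needed: same drive as ServerRoot.
Alias /perldocs /perl/html

# Relative path: resolved against ServerRoot, i.e. C:/httpd/logs/error.log
ErrorLog logs/error.log
```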

There was a bug in earlier versions of Apache that caused problems when trying to run CGI programs from paths with spaces in them, which was especially irritating, because Apache installs in C:\Program Files\Apache Group\Apache. That was when I developed the habit of installing in C:\httpd, which has stuck with me, even though the bug has gone away since then.

Use forward slashes, or back slashes - whichever makes you happiest. You may find some strange places where back slashes appear to not work for you, and you can avoid most of these by using forward slashes, but in most cases, you can use whatever you're used to.

For example:

alias /perldocs c:\perl\html
worked, and
alias /perldocs /perl/html
worked, but
alias /perldocs \perl\html
did not. I did not really think that this was a bug, since ordinary users would probably not ever use that notation, so this probably does not actually matter to anyone.

NT-specific configuration directives

Because of the different ways that things are handled on Unix and NT, there are some configuration directives that are specific to NT, there are some other directives that don't mean anything on NT, and there are some directives for which the recommended values are different on NT, for whatever reason.

AccessConfig and ResourceConfig

In recent versions of Apache, the old 3-file configuration system has given way to just having one configuration file. There are files called srm.conf and access.conf, but they contain just comments. However, Apache still opens them up when the server starts, and looks for directives. You can tell Apache not to do this, by setting the AccessConfig and ResourceConfig directives to point at /dev/null on Unix. The equivalent of this on Windows is nul. Every directory contains an imaginary file called nul which acts like /dev/null - that is, it's just a bit bucket.

AccessConfig nul
ResourceConfig nul
nul can also be used for directives like AuthGroupFile, and anywhere else that you might use /dev/null on Unix.

AccessFileName

The default value for this directive on Unix is .htaccess. NT does not care much for filenames that start with a period (.), so the recommended value for this on NT is htaccess. You can get away with using .htaccess if you like, but you might find that some editors will have trouble reading these files, and you might have difficulty actually naming a file that. When I tried to rename a text file to .htaccess in Windows Explorer, I got a dialog box that said "You must type a filename."
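
A sketch of how that might look in httpd.conf. The Files block is my own addition, on the assumption that you would not want the renamed file to be downloadable by visitors.

```apache
# Use a name that Windows tools can create and edit.
AccessFileName htaccess

# Assumption: keep the per-directory config file from being served.
<Files htaccess>
    Order allow,deny
    Deny from all
</Files>
```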

MaxSpareServers, MinSpareServers

These directives refer to the number of child processes that should be left lying around idle before they are killed. Since Apache on Windows has a single child process that serves requests with threads, this is meaningless there, and so these directives have no effect on Windows.

ScriptInterpreterSource

On Unix, the first line of a script file, such as a shell script, or a Perl program, indicates what interpreter is to be used to run the program. That line will look something like:

#!/usr/bin/perl
or
#!/bin/sh
Windows has no concept of looking for the #! ("shebang") line, but does everything based on the file extension.

Apache lets you do things either way, as you like. If you are aiming for platform independence, it is very handy to be able to use the #! line, so that you can run your CGI programs on your Windows machine, and also on your Unix machine. But, if you are developing CGI programs only for Windows, then perhaps this does not matter to you, and you want it to execute based on the file extension.

The ScriptInterpreterSource directive tells Apache which one of these you prefer. The default value, script, tells Apache to look for a #! line to find out what interpreter to use to execute the program. Setting it to registry will cause Apache to look in the registry for the association between the file extension and a script interpreter to run it.
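
In httpd.conf, the two choices would look like this. Only one of the lines should be active at a time.

```apache
# Default: read the #! line at the top of each script.
ScriptInterpreterSource script

# Alternative: look up the file extension in the Windows registry.
#ScriptInterpreterSource registry
```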

Tip: If you will be using Perl CGI programs, and want to maintain some level of portability between your Unix machines and your NT machines, you will want to install Perl in a location on your NT machine that is the same as on your Unix machines. For example, on my Linux machines, Perl is located at /usr/bin/perl and so every Perl program that I write begins with #!/usr/bin/perl. So, when I install Perl on an NT machine, instead of installing it in the default location (which is c:\perl for ActivePerl), I install it in C:\usr so that the Perl executable is located at /usr/bin/perl. This allows me to write code on my Windows machine, and then move it, without changes, to my Linux machine, and have it run there. And vice versa.

Directives that don't work on NT (or, at least, work differently)

Many of the directives that work differently under NT are the ones you'd expect - notably the ones dealing with forking child processes. As discussed above, NT does not use fork, but uses threads, to accomplish what Unix-type systems use fork for.

There are some other directives that you might want to use, but which you will need to think about a little differently on NT.

UserDir

Users on NT have user directories, but, as the comments in httpd.conf note,

# Under Win32, we do not currently try to determine the 
# home directory of a Windows login, so a format such as 
# that below needs to be used.  See the UserDir 
# documentation for details.
#
UserDir "C:/httpd/users/"

What this means is that you should just use UserDir in the normal way, except that you must specify an absolute path.

Under Unix, specifying a relative path causes Apache to append that path to the home directory of the user specified. That is, if the URL requested is http://www.rcbowen.com/~rbowen and UserDir is set to 'public_html', Apache will attempt to serve files from /home/rbowen/public_html.

However, on Win32, there's no nice way to figure out a user's home directory, so that syntax is unavailable.
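
To make the two cases concrete, here is a small Perl sketch of the mapping from a /~user URL to a directory. It is only an illustration of the behavior described above, not Apache's actual code; the function name userdir_path and the $home_root parameter (standing in for wherever Unix home directories live) are my own inventions.

```perl
use strict;
use warnings;

# Hypothetical helper: map a /~user URL path to a filesystem path.
#   $userdir   - value of the UserDir directive
#   $home_root - parent of the Unix home directories (assumed, e.g. /home)
#   $url_path  - path part of the requested URL
sub userdir_path {
    my ($userdir, $home_root, $url_path) = @_;

    # Pull out the user name and anything after it; give up if no ~user.
    my ($user, $rest) = $url_path =~ m{^/~([^/]+)(/.*)?$} or return;
    $rest = '' unless defined $rest;

    # Absolute UserDir (with or without a drive letter):
    # the user name is appended to that directory.
    if ($userdir =~ m{^(?:[A-Za-z]:)?/}) {
        (my $base = $userdir) =~ s{/+$}{};
        return "$base/$user$rest";
    }

    # Relative UserDir (Unix only): appended to the user's home directory.
    return "$home_root/$user/$userdir$rest";
}

print userdir_path('public_html', '/home', '/~rbowen'), "\n";
print userdir_path('C:/httpd/users/', '/home', '/~rbowen/index.html'), "\n";
```

So with the Win32 setting shown in httpd.conf, each user's files would live in a subdirectory of C:/httpd/users/ named after that user.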

XBitHack

XBitHack is an enormously useful directive that turns on SSI processing on files with the execute bit set. This allows a happy medium between changing filenames to something.shtml and having every .html file parsed for SSI directives.

Alas, NT does not have an execute bit to set. So, this wonderful directive does not work.
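
On NT, then, you are left with selecting SSI files by extension. A minimal sketch, assuming you go the .shtml route:

```apache
# Allow server-side includes, then parse only files ending in .shtml.
Options +Includes
AddType text/html .shtml
AddHandler server-parsed .shtml
```

Mapping the server-parsed handler to .html instead would parse every HTML file, which is the other extreme that XBitHack was meant to avoid.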

Authentication

This is not a big difference, but it is worth mentioning. On Unix, the default encryption scheme used for the password files used for HTTP authentication is Unix crypt. On Windows, it is MD5.

In earlier versions of Apache on Windows, the password files were actually plain text, and you will still find online documentation that says that this is still the case. Ignore it. Apache for Windows comes with an htpasswd.exe utility that works exactly like the htpasswd utility on Unix, for creating password files. Or you can use modules from the Perl HTTPD-User-Manage package to manage your password files.
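
For example, protecting a directory with a password file might look something like the following. The paths, realm name, and user name are all hypothetical.

```apache
# Create the password file first, from a command prompt:
#   htpasswd -c C:/httpd/passwd/users rbowen

<Directory "C:/httpd/htdocs/private">
    AuthType Basic
    AuthName "Private Area"
    AuthUserFile C:/httpd/passwd/users
    # No group file in use; nul is the Windows bit bucket.
    AuthGroupFile nul
    require valid-user
</Directory>
```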




