What's in a Web Log?
Lean over to the guy in the cube next to you and ask him what a web log is. It's OK, I'll wait. Did you get an answer that made sense? I tried a quick poll at a client recently and got answers that ranged from "Huh?" to the Windows event log to a blog (weB LOG). The tragedy is that if you're doing web development, the web log is a repository for much of the information you may need to debug your problems.
What is a web log?
Web servers keep track of every request they handle in a web log. Every page you request and every graphics image sent down the pike is dutifully recorded. A web log is in essence an audit trail of every request a web server responds to. Although the contents are configurable, most web logs include the page URL, the IP address of the client, and the status returned to the client. Other information can be included as well; however, just these few pieces can help you understand what is going on with the web browser and what is happening on the server. Reviewing web logs will enable you to see all of the requests from the browser and which ones met with errors.
The HTTP status is one of the key pieces of information. Most of us pay little attention to this status code, which is a part of the HTTP standard. We have inevitably run into the 404 status code indicating the page wasn't found. Occasionally we've stumbled across a 401 error indicating an authentication problem. Of course, there's also the always favorite 500 status indicating an internal server error. However, there is a whole wealth of status codes that are most likely occurring on your server and which can provide useful information. A complete list of standard HTTP status codes can be found at http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
Here's a quick list of statuses you may want to know about:
- 100 - Continue: This is an interim response telling the client that the server has received the first part of the request (typically the headers) and that the client should go ahead and send the rest, such as the body of a large POST. This status does not often appear in the web log.
- 200 - OK: This is the most common status and indicates that the request was successful.
- 301 - Moved Permanently: The content exists at a new location permanently. The reference to the item should be changed. This status doesn't occur often in a Windows IIS world, but is used by some other web servers.
- 302 - Found (Moved Temporarily): The content exists at a new location, but only temporarily. The client should continue to use the same reference to the content without changing it. This is common when Response.Redirect is used in ASP or ASP.NET.
- 304 - Not Modified: The browser requested the content only if it had changed since a date provided by the client. This response indicates to the browser that the content hadn't changed and won't be retransmitted. This status indicates that the browser is working to prevent the retransmission of data - which is a good thing.
- 401 - Unauthorized: The page requires authorization the client does not possess. This can be caused by a lack of authentication with a specific user name or can indicate that the user doesn't possess authorization to the requested resource.
- 403 - Forbidden: Most commonly, this error is seen when the web server is set to require HTTPS/SSL for access to a specific URL but the client hasn't requested the URL with HTTPS/SSL.
- 404 - Not Found: The page requested wasn't found on the web server; therefore, the request could not be fulfilled.
- 500 - Internal Server Error: The server ran into a problem while processing the request. It doesn't indicate an issue with the browser request; instead, it indicates a problem on the server itself.
There are other status codes, but these are the most common ones you will run into. These status codes can show you how the conversation between the browser and the server is going. Knowing how to decode them can help you see where you may need to tune up that conversation to improve performance. These codes can also tell you which pages or requests may be receiving errors.
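As a rough illustration of decoding these codes, here is a minimal Python sketch that groups status values by their first digit, which is how the HTTP standard organizes them. The function names and the sample list are made up for this example; they are not part of IIS or of any log analysis tool.

```python
from collections import Counter

def status_class(code):
    """Return the general class of an HTTP status code from its first digit."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return classes.get(code // 100, "unknown")

def summarize(codes):
    """Count how many responses fall into each status class."""
    return Counter(status_class(code) for code in codes)

# Hypothetical status values pulled from a log, for illustration only.
sample = [200, 200, 304, 302, 404, 500, 200]
for name, count in summarize(sample).most_common():
    print(f"{name}: {count}")
```

A summary like this is only a starting point: a high share of client or server errors tells you where to look, not what went wrong.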
Where can you find the web log?
Finding where a web log is stored isn't as easy as going to a specific directory, unless you're only using the default web site in IIS. If you are using the default web site, you'll find the logs in %WINDIR%\SYSTEM32\LogFiles\W3SVC1\. If not, you'll have to open up the administrative tool, Internet Information Services. Once launched, select the properties on the web site you need the log for. In the bottom section on logging, click the properties button. Concatenate the log file directory text box and the log file name label below to get the directory that the log files will be stored in. This is where you will find the web log files.
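If you want to check the default location from a script, something like the following Python sketch may help. It assumes the default web site and the path given above; the function name and the C:\WINDOWS fallback are assumptions for illustration, and any site other than the default will log to the directory shown in its own properties dialog.

```python
import ntpath
import os

def default_log_dir(environ=os.environ):
    """Build the default web site's log directory from the WINDIR variable."""
    # The fallback value is an assumption for machines where WINDIR is not set.
    windir = environ.get("WINDIR", r"C:\WINDOWS")
    return ntpath.join(windir, "SYSTEM32", "LogFiles", "W3SVC1")

log_dir = default_log_dir()
print(log_dir)

# List whatever files are there, if the directory exists on this machine.
if os.path.isdir(log_dir):
    for name in sorted(os.listdir(log_dir)):
        print(name)
```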
Web logs are generally stored in text files, with a new file started at an interval you set on the dialog referenced above. Because web logs are text files, you can open them up in Notepad or another text editor to see what has happened. Here's what you can expect to see in a typical file:
Listing 1 - Web Log
    #Software: Microsoft Internet Information Services 5.1
    #Version: 1.0
    #Date: 2004-11-19 01:11:06
    #Fields: time c-ip cs-method cs-uri-stem sc-status
    01:11:06 127.0.0.1 GET /Default.htm 200
In this simple example, there is a series of header lines, each beginning with a number/pound sign (#). These tell you about the server that wrote the entries, the log standard being used, the date and time the headers were written, and the fields that will be written for each request. In this case, the file is logging the time (time), client IP address (c-ip), method used (cs-method), page location (cs-uri-stem), and the status returned (sc-status).
Of these pieces of information, only the method may be somewhat unfamiliar at this point. In most cases, the request will use GET. This is the standard method for a web browser. You will, however, likely have some POST requests as well. POST is the way that most HTML forms are transmitted to the server. Other methods are allowed but are less frequently used.
You may notice that the entries for each request are a series of space delimited fields which match the list of fields provided in the #Fields header. Although this can be a bit hard to read with a large number of fields in a file with a large number of requests, with some practice you will find them readable. If you find it difficult to read, you can open the file in Excel as a delimited text file. You'll have to tell Excel that it is space delimited and all of the fields will be lined up into separate columns.
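If you would rather read the file with a script than with Excel, the same idea can be sketched in a few lines of Python: take the field names from the #Fields header and pair them with the space-delimited values on each entry. The function name parse_w3c_log is invented for this example, and the sample text is simply Listing 1; treat this as a minimal sketch rather than a complete parser.

```python
def parse_w3c_log(lines):
    """Yield one dict per request, keyed by the names in the #Fields header."""
    fields = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Only the #Fields header changes how entries are read.
            if line.startswith("#Fields:"):
                fields = line[len("#Fields:"):].split()
            continue
        values = line.split()
        if len(values) != len(fields):
            continue  # skip entries that don't match the header
        yield dict(zip(fields, values))

# The contents of Listing 1, used here as sample input.
sample = """#Software: Microsoft Internet Information Services 5.1
#Version: 1.0
#Date: 2004-11-19 01:11:06
#Fields: time c-ip cs-method cs-uri-stem sc-status
01:11:06 127.0.0.1 GET /Default.htm 200
""".splitlines()

for entry in parse_w3c_log(sample):
    print(entry["time"], entry["cs-uri-stem"], entry["sc-status"])
```

Because the field list is read from the header rather than hard-coded, the same loop should keep working if you change which fields the server logs.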
Using the information
Now that you know what a web log is, where to find it, and how to open it, it's important to consider some of the cases when it might be the way to track down the cause of your problem. Here are just a few scenarios where a web log might be useful in the debugging process:
- Users Reporting Random Errors - We've all had situations where a user says that they got an error but they didn't write it down and don't remember the exact error message, but they're sure it said error. In most cases, you can look through the web log for error responses (401, 404, and 500, for instance). This, coupled with searching for the user's IP address, can often illuminate what they were doing when the error occurred.
- Comparing when the user saw an error to internal operations which may have impacted it - Web servers don't work in a vacuum; they are often impacted by other servers, including database servers, which can adversely impact their ability to deliver pages. Web logs keep a complete record of what happened and when. This can be compared against Windows event logs and the logging of other applications to see if a user's reported error started with the web server or ended up at the web server because the back end infrastructure wasn't working.
- Slow Performance - A quick look at the logs can tell you how many requests the server is servicing and, therefore, whether an unusual number of people are putting a heavy burden on it. Alternatively, it may show that a lot of requests are receiving errors, or even that the requests aren't making it to the server.
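Two of the scenarios above can be sketched in Python as simple filters over parsed log entries. The entries, IP addresses, and function names below are hypothetical and exist only for illustration; the sketch assumes each entry is a dictionary keyed by the field names from the #Fields header.

```python
from collections import Counter

# Hypothetical parsed entries; a real log would supply many more.
entries = [
    {"time": "01:11:06", "c-ip": "127.0.0.1", "cs-uri-stem": "/Default.htm", "sc-status": "200"},
    {"time": "01:11:09", "c-ip": "10.0.0.7", "cs-uri-stem": "/Orders.aspx", "sc-status": "500"},
    {"time": "01:11:41", "c-ip": "10.0.0.7", "cs-uri-stem": "/logo.gif", "sc-status": "404"},
    {"time": "01:12:02", "c-ip": "10.0.0.7", "cs-uri-stem": "/Default.htm", "sc-status": "200"},
]

def errors_for_client(entries, client_ip):
    """Return the entries from one client whose status is 400 or above."""
    return [
        e for e in entries
        if e["c-ip"] == client_ip
        and e["sc-status"].isdigit()
        and int(e["sc-status"]) >= 400
    ]

def requests_per_minute(entries):
    """Count requests in each HH:MM bucket to help spot unusual load."""
    return Counter(e["time"][:5] for e in entries)

# What was this user doing when the error occurred?
for e in errors_for_client(entries, "10.0.0.7"):
    print(e["time"], e["cs-uri-stem"], e["sc-status"])

# How busy was the server, minute by minute?
for minute, count in sorted(requests_per_minute(entries).items()):
    print(minute, count)
```

The timestamps from the first filter are what you would line up against the Windows event log or a database server's log when you suspect the back end.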
Web logs are an immensely powerful tool in the debugging process. Whether you're working on a large production site or a small development workstation, it's worth your time to get familiar with your logs. There are several tools designed for web log analysis, although many of them are designed for usage trending - not detailed analysis of trends and problems. Crowe Chizek has a web log parser program that converts all of the data in a set of web logs into a SQL database. This database is highly normalized, enabling you to search for specific patterns, look for individual problems, or perform any analysis of the details that you need. SQL Server Analysis Services reports are available for broader reporting and analysis as well. Send me an email if you would like more information about the product or would like to share problems you've had with other web log analysis tools.
About the Author
Robert Bogue, MCSE (NT4/W2K), MCSA, A+, Network+, Server+, I-Net+, IT Project+, E-Biz+, CDIA+, has contributed to more than 100 book projects and numerous other publishing projects. He writes on topics from networking and certification to Microsoft applications and business needs. Robert is a strategic consultant for Crowe Chizek in Indianapolis. Some of Robert's more recent books are Mobilize Yourself!: The Microsoft Guide to Mobile Technology, Server+ Training Kit, and MCSA Training Guide (70-218): Managing a Windows 2000 Network. You can reach Robert at Robert.Bogue@CroweChizek.com.