Logging 404s Using a Filter DLL
This article was contributed by Chiraz.
When maintaining a very large website, scanning the log for 404s can be a tedious task. There are some log readers here and there that simplify this process.
I chose to create a filter DLL that logs only 404 (Not Found) responses. The referer URL is never written to the general log file, which means you still don't know which page to change. By using ISAPI, you can parse all HTTP headers and, if the browser sends it along, log the referer URL as well. That way you know which link was broken on which page.
This DLL is the filter DLL. When it initializes, it reads the path to the log file from the registry. This path should be written with escaped backslashes, as in a normal C/C++ application. The DLL then fills a SECURITY_ATTRIBUTES structure. This is necessary if you want to be able to clean the log using an extension DLL. The filter ISAPI DLL does not run under the same user account as an extension DLL (called from a form, for instance), so the extension DLL normally cannot write to the log file, only read it. In this example, I set the permissions on the log file to give "Everyone" "Full control" access.
When a file cannot be found on the IIS server, control passes to the OnLog event handler of this filter DLL. This method first checks whether the reply code sent to the client is 404 (Not Found). Any other status is not handled and the function returns. It then checks that the path to the log file is not zero length, and returns if it is. If the function has not returned, we are allowed to carry on. Because we open, write, and close a file here, we need to enter a critical section for thread safety.
The 'referer' attribute is not included in the log information itself, but if the browser sent it, it is available through the CHttpFilterContext object passed to this function. There I query the server variables for all unparsed HTTP headers. Next I look for http-referer: inside these headers to find what I am looking for. The result is returned in a CString. The log line is then built from the http-referer header and some fields of the HTTP_FILTER_LOG structure. You can add any information you like there.
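The header lookup described above could be sketched roughly as follows. This is only an illustration in portable C++, not the code from the DLL: ExtractReferer is a hypothetical helper name, std::string stands in for CString, and the sketch assumes the header shows up in the raw header block as a line starting with "Referer:" (matched case-insensitively). The exact spelling the server hands back may differ, so adjust the name being searched for.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: returns the value of the first "Referer" header found
// in a block of raw request headers, or an empty string if none is present.
std::string ExtractReferer(const std::string& rawHeaders)
{
    const std::string name = "referer:";   // assumed header name, lower case
    std::size_t pos = 0;
    while (pos < rawHeaders.size()) {
        // Isolate one header line (lines end in "\r\n" or "\n").
        std::size_t end = rawHeaders.find('\n', pos);
        if (end == std::string::npos) end = rawHeaders.size();
        std::string line = rawHeaders.substr(pos, end - pos);
        pos = end + 1;
        if (!line.empty() && line.back() == '\r') line.pop_back();

        // Compare the header name case-insensitively.
        if (line.size() >= name.size()) {
            std::string prefix = line.substr(0, name.size());
            std::transform(prefix.begin(), prefix.end(), prefix.begin(),
                           [](unsigned char c) {
                               return static_cast<char>(std::tolower(c));
                           });
            if (prefix == name) {
                // Skip optional whitespace after the colon.
                std::size_t v = line.find_first_not_of(" \t", name.size());
                return v == std::string::npos ? std::string() : line.substr(v);
            }
        }
    }
    return std::string();   // the browser did not send a referer
}
```

An empty result simply means the browser sent no referer, in which case the log line can be written without one.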
After that, it tries to create the file using the "CreateFile" function. Because of the "CREATE_NEW" constant, this call fails if the file already exists. When that happens, I assume that this is the error that occurred and open the existing file using a plain "fopen" call. If the file did not exist before, it is created. In both cases the log line is then written to the file. Finally, the critical section is left and we can wait for the next 404 logging request.
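For readers who want to try the locking and writing steps outside IIS, here is a minimal sketch in portable C++. It is not the DLL's code: std::mutex is swapped in for the critical section, a std::ofstream opened in append mode replaces the "CreateFile" / "fopen" pair (it creates the file if it is missing), and no security attributes are set. AppendLogLine and g_logMutex are hypothetical names.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <mutex>
#include <string>

// Stand-in for the filter's critical section: one lock shared by all threads.
static std::mutex g_logMutex;

// Hypothetical helper: appends one line to the log, creating the file first
// if it does not exist yet. Returns false if nothing could be written.
bool AppendLogLine(const std::string& logPath, const std::string& line)
{
    if (logPath.empty()) return false;              // no log location configured

    std::lock_guard<std::mutex> lock(g_logMutex);   // released when we return
    std::ofstream out(logPath, std::ios::out | std::ios::app);
    if (!out) return false;                         // could not open or create
    out << line << '\n';
    out.flush();
    return out.good();
}
```

Holding the lock across open, write, and close keeps lines from concurrent requests from interleaving, at the cost of serializing all 404 logging through one lock.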
Because I do not want to restart the service when the log file location changes, I included "/magic_string_reset_log_location". When this URL is requested on the server, the filter re-reads the path to the log file from the registry and stores it in the m_LogPath attribute. Any subsequent logging is done using that file. This comes in handy when you really need to keep your IIS service running and cannot afford to shut it down for a few seconds. The disadvantage is that, since the client connection has already been terminated (we are in the logging phase now), we cannot tell the client that this special URL was recognized, or report success or error messages. Parsing the URL at an earlier stage is something I would not advise, however, because you simply do not want to parse at the URL mapping stage. That would go beyond the point of this logging DLL. This DLL is only called when logging should occur. (However, it often gets called for 200 OK responses as well... hmmm.)
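The reset check could look something like the sketch below. It is an illustration under assumptions rather than the DLL's code: LogFilterState is a hypothetical stand-in for the filter object, and the registry lookup is passed in as a callback (readPathFromRegistry, also hypothetical) so the example stays portable.

```cpp
#include <cassert>
#include <functional>
#include <string>

// The reset URL named in the article.
static const char kResetUrl[] = "/magic_string_reset_log_location";

// Hypothetical stand-in for the filter object; only the log path is modelled.
struct LogFilterState {
    std::string logPath;   // plays the role of m_LogPath
};

// Called during the logging phase. If the requested URL is the reset URL, the
// path is reloaded through readPathFromRegistry (a callback that replaces the
// real registry lookup in this sketch) and the function returns true.
bool HandleResetRequest(LogFilterState& state,
                        const std::string& requestedUrl,
                        const std::function<std::string()>& readPathFromRegistry)
{
    if (requestedUrl != kResetUrl) return false;   // ordinary request
    state.logPath = readPathFromRegistry();
    return true;
}
```

In a multithreaded filter the path update would presumably need to happen inside the same critical section as the logging, so that no request writes while the path is being replaced.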
This DLL was written to support maintenance of the log in a more elegant way. The magic-string approach of the filter DLL is insufficient here, because this time we want a report on whether our function has succeeded or not. This DLL also gives us better means of performing maintenance on our log, and we can set special access rights for its use, or change them frequently, using standard IIS security functionality.
This DLL goes into a directory with authentication, so only some people will be able to call the functions contained here. Consult IIS documentation for how to set this up.
When this DLL is called from a web browser, we need to specify the function we want to use. Since only the "GET" method is used at this time, using this DLL from a form is highly insecure. Anyone who has access to the machine where the web browser was run can read the "Username" and "Password" from the URL history list. It is therefore strongly recommended to use the "POST" method instead, but I could not find a way to use it at short notice (and this is the first ISAPI application I wrote).
The "SetLogLocation" function sets the value in the registry to point to the new location, as entered on the web form. If it succeeds, it returns a page with the magic string set up as a link. The user can then click on that link to make the new location current for the filter DLL (404log.dll).
The "LogClean" function reads the path to the log file from the registry and tries to open it for writing. If that succeeds, the file is truncated. The file is then immediately closed and a success message is returned to the client.
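The truncation step amounts to opening the file for writing and closing it again. A minimal sketch, assuming the path has already been read from the registry (TruncateLog is a hypothetical helper name, not the function in the DLL):

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: empties the log file at logPath. Opening with mode "w"
// truncates an existing file to zero length (and creates it if it is missing).
bool TruncateLog(const std::string& logPath)
{
    if (logPath.empty()) return false;      // no log location configured
    std::FILE* f = std::fopen(logPath.c_str(), "w");
    if (f == nullptr) return false;         // e.g. no write permission
    return std::fclose(f) == 0;
}
```

The return value is what lets the extension DLL send a success or failure message back to the client. A failed open is also where the permission problem mentioned earlier would surface if the log file did not grant write access.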
The validity check on the UserName and Password does not use any system calls. Instead, these parameters are compiled in. Consult the header file for these values. Currently, in this example, these are both set to "Test". A better solution should be
This file is a sample form which you can use for testing. "MfcISAPICommand" is a special hidden field that allows you to use "GET" operations on specific functions inside the ISAPI extension DLL. Many browsers do not behave consistently when "ACTION" is specified with a question mark in it. Consult the MSDN library, chapter "ISAPI Extension DLL", for more details on this topic.
Disclaimer: This DLL is my first project on this topic. The software may contain bugs and it will have security problems. Use this software at your own risk, as I will not accept any responsibility for production loss where this software has been used as-is.