Implementing an Anti-Virus File Scan in JEE Applications
This article will discuss one of the ways to implement antivirus file scanning in Java, particular in the JEE applications. Viruses, Trojan Horses, and different malware and spyware are a real problem for current computer environments, and especially for the Windows operating system. If you are designing any application in Java that has a requirement to be able to upload external files, you have a potential security risk. By uploading, I mean any way to get the external file inside of the corporate firewall be it via HTTP protocol or any other means. It is quite common to have this type of requirement in an enterprise application and with Java being one of the most popular web development platforms, it is unfortunate that this type of gaping security risk is quite often overlooked.
Java's Development Kit (JDK) does not have any means to do the antivirus scan right out of the box. This is primarily because Java is a programming language, and does not have any virus scanning packages. Furthermore, anti-virus software is not Sun's area of expertise or business model. Developing this type of software (or Java package), and more importantly maintaining it, would be a huge task for Sun. Mainly because viruses are constantly evolving and keeping virus definitions up-to-date is a daunting task. Large companies such as McAffee, Symantec, or Zone Labs develop virus detecting and combating products and spend a lot of resources to maintain them.
To implement a virus file scan in Java, a third-party package needs to be used. For the purposes of this article, I will use Symantec Scan Engine (SSE) package, which comes with Java APIs. This package is an application that serves as a TCP/IP server and has a programming interface and enables Java applications to incorporate support for content scanning technologies. For this article, I used Symantec Scan Engine 5.1, which is available as a Unix or Windows install.
If you are using an anti-virus package from the different vendor, you will need to investigate what kind of APIs are available; however, the general approach should be similar. Also, note that my implementation can be used with JEE technology and any modern MVC framework such as Struts or Spring.
The architecture is as follows: A server machine needs to have SSE running at all times. This can be the same machine that hosts your Application Server, but in an enterprise environment this should be a different machine. The Default Port needs to be open through the firewall to allow communication with the scan engine. All JEE applications that need to do file scanning can talk to the SSE server machine through a default port. Also, multiple applications running on different application servers can re-use the same scanning server. For more information, you should refer to the Symantec Scan Engine (SSE) Installation Guide, available on the Symantec web site.
When an external file that needs to be scanned is sent to the SSE via its programming interface (Java APIs using the default port), before any other operation on the file is performed, the SSE returns a result code. For instance, a file is uploaded by an external user into the web email type application as an attachment; then, the SSE API is invoked by the application and the return code of pass or fail determines the outcome of the upload and whether that email can actually be sent. If you have an account on Yahoo mail, you probably have seen that Yahoo is using Norton Antivirus to scan all attachments, although no Java.
Figure 1: Screen shot from Yahoo
For details on the Scan Engine Server Installationm please see the Symantec Scan Engine (SSE) Implementation Guide from Symantec.
Here are some key things to remember about SSE:
- Java 2 SE Runtime (JRE) 5.0 Update 6.0 or later must be installed on the server before the SSE installation is done.
- After installation, verify that the Symantec Scan Engine daemon is running. At the Unix command prompt (if it's a Unix install), type the following command:
ps ea | grep sym.A list of processes similar to the following should appear:
- 5358 ? 0:00 symscan
- 5359 ? 0:00 symscan
If the SSE process did not start, type the following command to restart SSE:
For the purposes of this article, I included a wrapper for the Symantec SSE APIs, av.jar, which has Symantec Java APIs and serves as a client to the SSE server and takes care of all communications with the server. Please refer to the download source section. The av.jar should be included in the Java CLASSPATH to work with the SSE. This jar contains a class called AVClient that takes care of actually sending the files to SSE as byte arrays and returning the result.
In my project setting, I added three variables to be accessed via the System.getProperty mechanism. For example:
AV_SERVER_HOST=192.168.1.150 AV_SERVER_PORT=1344 AV_SERVER_MODE=SCAN
The AV_SERVER_HOST is the host name or IP of the machine where Scan Engine is installed.
The AV_SERVER_PORT is the port where Scan Engine listens for incoming files.
The AV_SERVER_MODE is the scan mode which can be:
- NOSCAN: No scanning will be done (any keyword that does not start with "SCAN" will result in ignoring the call to the Scan Engine and no files will be transferred for scanning).
- SCAN: Files or the byte stream will be scanned, but the scan engine will not try to repair infections.
- SCANREPAIR: Files will be scanned, the scan engine will try to repair infections, but nothing else will be done.
- SCANREPAIRDELETE: Files will be scanned, the scan engine will try to repair infections, and irreparable files will be deleted.
Note: For the file stream (byte array) scanning, the only meaning full values are "SCAN" and "NOSCAN".