Uploading Old Email to Gmail using Java
Discussion and Sample Code
The class named BigDog05parse
The ultimate purpose of this class is to work in conjunction with a class named BigDog05upload to upload legacy Email messages to an Email server such as Gmail.
This program is designed to read an Mbox file produced by the Netscape 7.2 Email client program and to extract each message and write it into a separate output file that is suitable for uploading to an Email server by the class named BigDog05upload.
Each output file has a unique file name based on the number of milliseconds since Jan 1, 1970. A one-millisecond delay is inserted into the program between messages to guarantee that no attempt is made to write two files with the same file name.
Input parameters
The following input values are provided as command-line parameters:
- inputPathAndFile: The path to the Mbox file and the name of the Mbox file.
  Example: ./MailToBeProcessed/Inbox
- workingDir: Where message files are temporarily stored awaiting upload to the destination Email address.
  Example: ./DataFiles/
- destinationAddress: Example: joe@dummy.com
- smtpServer: Example: smtp-server.austin.rr.com
- uploadTag: A string that is prepended onto the Subject line before the message is uploaded. Can be an empty string as in "".
  Example: "ae|"
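In case you are unfamiliar with the format, path names that begin with a period are relative to the current working directory, which is the directory from which the java command is run (often, but not necessarily, the directory containing the Java class files being executed). However, you could just as well specify those directories as absolute paths that begin at the root directory.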
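If you want to check where a relative path name will actually point before running the program, a small sketch along the following lines may help. The class name PathDemo and the method name resolve are hypothetical names chosen for illustration; they are not part of the BigDog05 programs.

```java
import java.io.File;

public class PathDemo {
  // Resolve a possibly relative path name against the current
  // working directory and remove any "." or ".." elements.
  static String resolve(String pathName) {
    return new File(pathName).getAbsoluteFile()
                             .toPath().normalize().toString();
  }

  public static void main(String[] args) {
    // The two example values from the parameter list above.
    System.out.println(resolve("./MailToBeProcessed/Inbox"));
    System.out.println(resolve("./DataFiles/"));
  }
}
```

Running this sketch from different directories shows that the result follows the directory in which the java command is started.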
Some are used and some are passed along
Some of these parameter values are used by the BigDog05parse object, some are simply passed along to the BigDog05upload object, and some are used by objects of both classes.
Both classes were tested using JDK 1.5.0_01 under WinXP. JDK 1.5 or later is required because of the use of generics in the program.
Will discuss in fragments
As is my custom, I will break these classes down and discuss them in fragments, beginning with BigDog05parse. You will find complete listings of both classes in Listings 20 and 21 near the end of the lesson.
The beginning of the first class definition and the declaration of some instance variables are shown in Listing 1.
class BigDog05parse{
  //A directory where output message files are
  // stored awaiting uploading by BigDog05upload.
  String workingDir;
  //Directory and file name for the Mbox file.
  String inputPathAndFile;
  //Save all command-line parameters here
  static String[] params;
Listing 1
The main method
The beginning of the main method is shown in Listing 2.
  public static void main(String[] args){
    if(args.length != 5){
      System.out.println(
              "Usage: java BigDog05parse "
              + "\n inputPathAndFile"
              + "\n workingDir"
              + "\n destinationAddress"
              + "\n smtpServer"
              + "\n uploadTag");
      System.out.println("Terminating");
      System.exit(0);
    }//end if
    //Save and display command-line parameters
    params = args;
    System.out.println("inputPathAndFile: "
                                   + params[0]);
    System.out.println("workingDir: "
                                   + params[1]);
    System.out.println("destinationAddress: "
                                   + params[2]);
    System.out.println("smtpServer: "
                                   + params[3]);
    System.out.println("uploadTag: "
                                   + params[4]);
Listing 2
The code in Listing 2 should be self-explanatory.
Extract messages from Mbox file
The code in Listing 3 instantiates an object of the BigDog05parse class to extract the individual messages from the Mbox file and to save them as individual message files in the working directory.
    new BigDog05parse(params[0],params[1]);
Listing 3
Upload the messages
When the constructor for the BigDog05parse class in Listing 3 returns, all of the messages have been extracted from the Mbox file and have been saved as individual files in the working directory.
Listing 4 instantiates an object of the BigDog05upload class to cause those messages to be uploaded to the specified destination address.
    new BigDog05upload(params[2],
                       params[3],
                       params[1],
                       params[4]);
  }//end main
Listing 4
Gmail doesn't retain duplicate copies
As an aside that may be useful during your testing of the program, it appears that if you upload the same message to Gmail two or more times in succession, Gmail will not retain duplicate copies of the message. Only one copy will be retained in the Gmail archives.
The constructor for the BigDog05parse class
Listing 5 shows the constructor for the BigDog05parse class in its entirety.
  BigDog05parse(String input,String output){
    this.inputPathAndFile = input;
    this.workingDir = output;
    parseMboxFile();
  }//end constructor
Listing 5
The constructor requires two parameters. The first parameter points to the Mbox file. The second parameter points to the directory where the individual message files will be stored awaiting upload to the destination Email address.
The constructor invokes the instance method named parseMboxFile where all the real work is done for this class.
The parseMboxFile method
The method named parseMboxFile begins in Listing 6.
  void parseMboxFile(){
    String data;
    try{//Get input stream object
      BufferedReader mboxInputObj =
                    new BufferedReader(
                      new FileReader(
                        inputPathAndFile));
      DataOutputStream dataOut = null;
      String workingDirAndFile = null;
      String outputFileName = null;
      int msgNumber = 1;
Listing 6
The code in Listing 6 is straightforward and shouldn't require further explanation.
Read and process Mbox file
An Mbox file is simply a text file. According to Wikipedia,
"mbox is a generic term for a family of related file formats used for holding collections of electronic mail messages. In these formats, the messages are concatenated in a single file, with a From line prepended to the beginning of each and a blank line appended to the end of each. The From line begins with the five characters "From ", and may continue with other text."
That is all we need to know
The information in the above quotation is all that is needed for this method to parse the Mbox file, separating it into individual messages, and writing each message into a separate output file. For example, Figure 1 shows a typical beginning for an Mbox file. (Note the first five characters in the first line, which reads "From ".)
From - Fri Aug 12 07:33:13 2005
X-UIDL: 40d1c47e00012ae8
X-Mozilla-Status: 0001
X-Mozilla-Status2: 00000000
Figure 1
Similarly, Figure 2 shows a typical transition from the end of one message to the beginning of the next message in an Mbox file. (Note the blank line followed by the line that begins with "From ".)
or else you will continue to receive email/s.

From - Fri Aug 12 07:33:13 2005
X-UIDL: 40d1c47e00012aea
X-Mozilla-Status: 0001
X-Mozilla-Status2: 00000000
Figure 2
Iterate on lines in Mbox file
As shown in Listing 7, the parseMboxFile method uses a while loop to examine each line of text in the Mbox file and to decide what to do with it. When it finds a line that begins with "From ", the program assumes that the line constitutes the beginning of a new message.
      while((data = mboxInputObj.readLine())
                                       != null){
        if(data.startsWith("From ")){
          if(dataOut != null){
            dataOut.close();
          }//end if(dataOut != null)
Listing 7
If the program currently has an output file open, the code in Listing 7 closes that file in order to start a new output file.
Create a unique fileID
Still in the body of the if statement that begins in Listing 7, the code in Listing 8 creates a unique fileID for the new file name based on the negative value of the current time in milliseconds relative to January 1, 1970.
(The program uses the hexadecimal representation of the negated current time, in two's complement notation, instead of the actual current time. A positive long value would be shown without its leading zeros, whereas the negated value always has its high-order bits set. As a result, these hexadecimal values all begin with fff..., in the lowercase digits that Long.toHexString produces, and are all of the same length.)
          String fileID = Long.toHexString(
                      -(new Date().getTime()));
          //Sleep for 1 ms to guarantee unique
          // file names.
          Thread.currentThread().sleep(1);
Listing 8
Sleep for one millisecond
After getting the unique fileID, the program goes to sleep for one millisecond to guarantee that the next fileID value will be different.
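The effect of negating the time value before converting it to hexadecimal can be seen in isolation with a short sketch such as the following. The class name FileIdDemo and the method name fileId are hypothetical; only the call to Long.toHexString on the negated time is taken from Listing 8.

```java
public class FileIdDemo {
  // Build a file ID the way Listing 8 does: the hexadecimal
  // form of the negated time in milliseconds.
  static String fileId(long millis) {
    return Long.toHexString(-millis);
  }

  public static void main(String[] args) {
    long now = System.currentTimeMillis();
    // Positive value: leading zeros are not shown, so the
    // string is shorter than 16 characters.
    System.out.println(Long.toHexString(now));
    // Negated value: the sign bit is set, so all 16
    // hexadecimal digits are shown.
    System.out.println(fileId(now));
  }
}
```

Because the negated value always has its high-order bit set, the resulting string has the full 16 digits for any current date.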
Complete the if statement
The code in Listing 9 completes the body of the if statement that began in Listing 7. Recall that this is all being done in conjunction with the decision that the line of text constitutes the first line in a new message.
          //Open an output file to save the
          // message. Use the fileID as part of
          // the file name. Use a file name that
          // is compatible with earlier versions
          // of the programs in the BigDog
          // series.
          outputFileName = "+OK " +
                     msgNumber + " " + fileID;
          //Concatenate file name to the output
          // path.
          workingDirAndFile =
                   workingDir + outputFileName;
          //Get an output stream for the file.
          dataOut = new DataOutputStream(
                      new FileOutputStream(
                        workingDirAndFile));
          //Show progress
          System.out.print(msgNumber + " ");
          //Increment the message counter in
          // preparation for processing the next
          // message.
          msgNumber++;
        }//end if(data.startsWith("From "))
Listing 9
The code in Listing 9 is straightforward and shouldn't require further explanation.
The else clause
The code in Listing 10 shows the else clause for the if statement that began in Listing 7. This code is executed if the line of text does not constitute the first line in a new message. In this case, the program writes a newline character ahead of the line, which terminates the previous line in the output file. As a result, every line in an output file except the last one ends with a newline character.
        else{
          dataOut.writeBytes("\n");
        }//end else
        //Write the line into the output file.
        dataOut.writeBytes(data);
      }//end while loop
Listing 10
Finally, the code in Listing 10 writes the line of text into the output file. That constitutes the end of the while loop.
At this point, the program goes back to the top of the while loop and reads the next line of text from the Mbox file.
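The splitting logic described in Listings 7 through 10 can be tried without any files by running it against a string. The following is a minimal sketch of that idea, not the author's code: the class name MboxSplitDemo and the method name split are hypothetical, messages are collected in a list instead of being written to output files, and any text that precedes the first "From " line is skipped, which is an assumption of this sketch.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class MboxSplitDemo {
  // Split Mbox text into messages. Each message begins at a
  // line that starts with "From ".
  static List<String> split(String mboxText) {
    List<String> messages = new ArrayList<>();
    StringBuilder current = null;
    try (BufferedReader in =
             new BufferedReader(new StringReader(mboxText))) {
      String line;
      while ((line = in.readLine()) != null) {
        if (line.startsWith("From ")) {
          // Finish the previous message, if any.
          if (current != null) messages.add(current.toString());
          current = new StringBuilder();
        } else if (current == null) {
          continue; // text before the first "From " line
        } else {
          // Terminate the previous line of this message.
          current.append('\n');
        }
        current.append(line);
      }
    } catch (IOException ex) {
      throw new UncheckedIOException(ex);
    }
    if (current != null) messages.add(current.toString());
    return messages;
  }

  public static void main(String[] args) {
    String mbox =
        "From - Fri Aug 12 07:33:13 2005\nSubject: one\n\nbody one\n\n"
      + "From - Fri Aug 12 07:33:14 2005\nSubject: two\n\nbody two\n";
    System.out.println(split(mbox).size() + " messages");
  }
}
```

As in the program, the newline is written ahead of each line rather than after it, so the last line of each message carries no trailing newline.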
The end of the Mbox file
When the readLine statement in the conditional clause of the while loop that begins in Listing 7 encounters the end of the Mbox file, the readLine method returns null, causing the while loop to terminate. That brings us to Listing 11, which signals the end of the parseMboxFile method and the end of the class definition for the BigDog05parse class.
      //All messages from the Mbox file have been
      // written into separate files.
      if(dataOut != null){
        //Close the last file.
        dataOut.close();
      }//end if(dataOut != null)
      System.out.println();//blank line
      System.out.println(
                    "The Mbox file is parsed");
      System.out.println("Start uploading msgs");
      //Sound an audio alert
      Thread.currentThread().sleep(500);
      Toolkit.getDefaultToolkit().beep();
      Thread.currentThread().sleep(500);
      Toolkit.getDefaultToolkit().beep();
      Thread.currentThread().sleep(500);
    }catch(Exception ex){
      ex.printStackTrace();
      System.exit(0);
    }//end catch
  }//end parseMboxFile
}//end class BigDog05parse
Listing 11
Listing 11 contains some cleanup code, which is straightforward and shouldn't require further explanation.
Returning control to the main method
At this point, the parseMboxFile method will return control to the constructor as shown in Listing 5. The constructor will return control to the main method as shown in Listing 3.
The next thing that happens is the construction of an object of the BigDog05upload class as shown in Listing 4. That brings us to a discussion of the BigDog05upload class shown in Listing 21 near the end of the lesson.
The BigDog05upload class
The ultimate purpose of this class is to work in conjunction with the class named BigDog05parse to upload legacy Email messages to an Email server such as Gmail.
An object of this class is instantiated by the main method of the BigDog05parse class to tag and upload the set of message files written by the BigDog05parse object.
This object tags messages with a tag passed as a constructor parameter. Then the object uploads the messages to a destination Email account that is specified as a constructor parameter using an SMTP server that is also specified as a constructor parameter.
Beginning of the class definition
The beginning of the class definition, along with some variable declarations, is shown in Listing 12.
class BigDog05upload{
  //ID of the destination email account.
  final String destinationAddress;
  //An smtp server through which the user is
  // authorized to send email messages.
  final String smtpServer;
  //Local folder where message files are stored
  // awaiting uploading.
  final String workingDir;
  //Tag that is prepended to the Subject line of
  // the message before uploading.
  final String uploadTag;
  //Concatenation of the working directory and
  // the name of a message file.
  String pathFileName;
  //IDs of msg files that will be deleted from
  // the working directory are stored here.
  Vector<String> msgToDelete =
                          new Vector<String>();
  boolean okToDelete = false;
  int msgNumber = 0;
Listing 12
Constructor for the BigDog05upload class
The constructor is shown in its entirety in Listing 13.
  BigDog05upload(final String destinationAddress,
                 final String smtpServer,
                 final String workingDir,
                 final String uploadTag){
    this.destinationAddress = destinationAddress;
    this.smtpServer = smtpServer;
    this.workingDir = workingDir;
    this.uploadTag = uploadTag;
    System.out.println("Uploading messages:");
    uploadMsgs();
    //All messages have been uploaded.
    System.out.println("Deleting message files");
    deleteMsgFiles();
  }//end constructor
Listing 13
Upload the messages
After saving the constructor parameters and printing a message, the constructor invokes the method named uploadMsgs. This method converts the message files produced by the BigDog05parse object to Email messages and sends them to the destination Email address.
Delete the message files
When the uploadMsgs method returns successfully, after having uploaded the message files to the destination Email address, the constructor invokes the deleteMsgFiles method to delete the message files in order to clean up the working directory.
The uploadMsgs method
The method named uploadMsgs begins in Listing 14.
  void uploadMsgs(){
    //The following code creates a directory
    // listing containing only those files with
    // names that begin with +OK.
    //This is an anonymous implementation of a
    // class that implements FilenameFilter.
    String[] dirList = new File(workingDir).list(
      new FilenameFilter(){
        public boolean accept(
                        File dir,String name){
          if(!(new File(dir,name).
                    isFile())) return false;
          return name.startsWith("+OK");
        }//end accept
      }//end FilenameFilter
    );//end list
    if(dirList.length == 0){
      System.out.println("No files to upload");
      System.out.println("Terminating program");
      System.exit(0);
    }//end if
Listing 14
Create a directory listing
The code in Listing 14 creates a listing of the files in the working directory that are to be uploaded to the destination Email address. If that list is empty, the program aborts with an explanatory message.
The code in Listing 14 is straightforward and should not require further explanation.
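For readers who want to experiment with the file name filter on its own, the following sketch applies the same test (a regular file whose name begins with +OK) to a temporary directory. The class name ListOkFilesDemo and the methods listOkFiles and makeSampleDir are hypothetical. The sketch uses a lambda expression, which requires Java 8 or later, in place of the anonymous class in Listing 14, and it also handles the case where File.list returns null because the directory does not exist, which Listing 14 does not check.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.Arrays;

public class ListOkFilesDemo {
  // Return the sorted names of the regular files in dir whose
  // names begin with "+OK".
  static String[] listOkFiles(File dir) {
    String[] names = dir.list(
        (d, name) -> new File(d, name).isFile()
                     && name.startsWith("+OK"));
    if (names == null) {
      // dir does not exist or is not a directory.
      return new String[0];
    }
    Arrays.sort(names); // File.list guarantees no ordering
    return names;
  }

  // Create a temporary directory holding one message-style
  // file and one unrelated file.
  static File makeSampleDir() {
    try {
      File dir = Files.createTempDirectory("bigdog").toFile();
      new File(dir, "+OK 1 fffffef0").createNewFile();
      new File(dir, "notes.txt").createNewFile();
      return dir;
    } catch (IOException ex) {
      throw new UncheckedIOException(ex);
    }
  }

  public static void main(String[] args) {
    System.out.println(
        Arrays.toString(listOkFiles(makeSampleDir())));
  }
}
```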
Iterate on the directory listing
The uploadMsgs method uses a for loop to iterate on the directory listing, converting each file contained in that listing into an Email message and sending the message to the destination Email address. The beginning of the for loop and the initialization of some working variables are shown in Listing 15.
    for(int msgCounter = 0;
                   msgCounter < dirList.length;
                                 msgCounter++){
      String fileName = dirList[msgCounter];
      pathFileName = workingDir + fileName;
Listing 15
Construct and send the Email message
During each iteration of the for loop, the code in Listing 16 invokes the forwardEmailMsg method to construct an Email message from the message file and send that Email message to the destination Email address.
If the Email message is sent successfully, the forwardEmailMsg method returns true. Otherwise, it returns false.
The third parameter to the forwardEmailMsg method can be used to tag the Subject line of the message to show that it was uploaded. If you don't want to tag the message, just pass an empty string as the third parameter.
      okToDelete = forwardEmailMsg(
                          destinationAddress,
                          smtpServer,
                          uploadTag,
                          pathFileName);
Listing 16
Update the deletion list
If the forwardEmailMsg method returns true, the name of the message file is added to a list of files scheduled for deletion later as shown in Listing 17.
      if(okToDelete){
        msgToDelete.add(pathFileName);
        System.out.print((msgCounter + 1) + " ");
      }//end if
    }//end for loop on directory length
Listing 17
The message number for that message is also displayed as an upload progress indicator by the code in Listing 17.
Listing 17 also signals the end of the for loop. When the for loop terminates, an attempt has been made to upload all of the message files to the destination Email address.
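The pattern used here, in which a file is scheduled for deletion only after it has been sent successfully, can be sketched independently of any mail code. In the following sketch the class name UploadLoopDemo, the method name uploadAll, and the sender parameter are hypothetical; the sender is a stand-in for forwardEmailMsg and simply reports success or failure.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class UploadLoopDemo {
  // Try to send every file and return only the names of the
  // files that were sent successfully.
  static List<String> uploadAll(List<String> fileNames,
                                Predicate<String> sender) {
    List<String> sent = new ArrayList<>();
    for (String name : fileNames) {
      if (sender.test(name)) { // stands in for forwardEmailMsg
        sent.add(name);
      }
    }
    return sent;
  }

  public static void main(String[] args) {
    List<String> files =
        Arrays.asList("+OK 1 a", "+OK 2 b", "+OK 3 c");
    // A pretend sender that fails on the second file.
    List<String> ok =
        uploadAll(files, name -> !name.contains(" 2 "));
    System.out.println("Safe to delete: " + ok);
  }
}
```

A file whose upload fails stays out of the list, so it remains in the working directory and can be picked up by a later run.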
Completion of the uploadMsgs method
Listing 18 shows some additional cleanup code that is executed in the uploadMsgs method after the for loop terminates.
    //Sound an audio alert when all messages have
    // been uploaded.
    try{
      Thread.currentThread().sleep(500);
      Toolkit.getDefaultToolkit().beep();
      Thread.currentThread().sleep(500);
      Toolkit.getDefaultToolkit().beep();
      Thread.currentThread().sleep(500);
      Toolkit.getDefaultToolkit().beep();
      Thread.currentThread().sleep(500);
    }catch(Exception ex){
      ex.printStackTrace();
      System.exit(0);
    }//end catch
    System.out.println();//blank line
    System.out.println("Upload complete");
  }//end uploadMsgs
Listing 18
The code in Listing 18 is straightforward and shouldn't require further explanation.
The forwardEmailMsg method
This is a fairly complex method. You can view the code for the method in Listing 21 near the end of the lesson. You will find a detailed explanation of the method in one of my previously published lessons numbered 2180 and entitled Enlisting Java in the War Against Email Viruses. Therefore, I won't repeat that explanation here.
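Although the method itself is not reproduced in this section, the idea of prepending the upload tag to the Subject line can be illustrated with a small sketch. This is only an assumption about how such tagging might be done and is not the code of forwardEmailMsg; the class name SubjectTagDemo and the method name tagSubject are hypothetical.

```java
public class SubjectTagDemo {
  // Prepend tag to the text of a Subject header line. Any
  // other line is returned unchanged.
  static String tagSubject(String headerLine, String tag) {
    String prefix = "Subject:";
    if (!headerLine.regionMatches(
            true, 0, prefix, 0, prefix.length())) {
      return headerLine;
    }
    String name = headerLine.substring(0, prefix.length());
    String text = headerLine.substring(prefix.length()).trim();
    return name + " " + tag + text;
  }

  public static void main(String[] args) {
    System.out.println(tagSubject("Subject: Hello", "ae|"));
    System.out.println(tagSubject("From: joe@dummy.com", "ae|"));
  }
}
```

Passing an empty string as the tag leaves the Subject text as it was, which matches the statement that the uploadTag parameter can be an empty string.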
The method named deleteMsgFiles
As the name implies, the purpose of this method is to delete the message file for each of the messages that have been successfully uploaded to the destination Email address. The method is shown in its entirety in Listing 19.
  void deleteMsgFiles(){
    //Delete the files in the msgToDelete
    // collection.
    for(int cnt = 0;
              cnt < msgToDelete.size();cnt++){
      pathFileName = msgToDelete.elementAt(cnt);
      File file = new File(pathFileName);
      boolean isDeleted = file.delete();
      if(!isDeleted)System.out.println(
                 "Unable to delete " + file);
    }//End for loop
  }//end deleteMsgFiles
Listing 19
The code in Listing 19 is straightforward and shouldn't require further explanation.
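If you would rather collect the failures than print them, one possible variation is sketched below. The class name DeleteDemo and the method name deleteAll are hypothetical; the sketch relies only on the fact that File.delete returns false when a file cannot be deleted, including when it does not exist.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DeleteDemo {
  // Delete each listed file and return the paths that could
  // not be deleted.
  static List<String> deleteAll(List<String> paths) {
    List<String> failed = new ArrayList<>();
    for (String path : paths) {
      if (!new File(path).delete()) {
        failed.add(path);
      }
    }
    return failed;
  }

  public static void main(String[] args) throws IOException {
    // One real temporary file and one path that does not exist.
    File temp = File.createTempFile("bigdog", ".msg");
    List<String> failed = deleteAll(
        Arrays.asList(temp.getPath(), "no-such-file-bigdog"));
    System.out.println("Unable to delete: " + failed);
  }
}
```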
Run the Program
I encourage you to copy the code from Listings 20 and 21 into your text editor, compile the code, and execute the program. Experiment with it, making changes, and observing the results of your changes.
Also please be aware of the following disclaimer:
THIS PROGRAM IS PROVIDED TO YOU AT NO COST. BY USING THIS PROGRAM TO PROCESS YOUR EMAIL, YOU AGREE THAT YOU ARE USING IT AT YOUR OWN RISK. THE AUTHOR OF THE PROGRAM, RICHARD G. BALDWIN, ACCEPTS NO LIABILITY FOR ANY LOSS THAT YOU MAY INCUR THROUGH THE USE OF THIS PROGRAM.
Summary
In this lesson, I showed you how to write a program that can be used to upload legacy Email messages from local Mbox files to an Email account.
Such a program is particularly useful for persons who would like to upload their collection of legacy Email messages onto the Gmail server so that they can take advantage of the extremely fast search capability that Gmail provides.
What's Next?
In the next lesson in this series, I will show you how to modify the program that I provided in the earlier lesson entitled Consolidating Email using Java so that it can be run in a fully automated unattended mode. That will make it possible to use a task scheduler to automatically move Email messages from one POP3 server to another Email account on an unattended, regularly scheduled basis, such as once per hour.


