
Uploading Old Email to Gmail using Java

  • October 18, 2005
  • By Richard G. Baldwin

Discussion and Sample Code

The class named BigDog05parse

The ultimate purpose of this class is to work in conjunction with a class named BigDog05upload to upload legacy Email messages to an Email server such as Gmail.

This program is designed to read an Mbox file produced by the Netscape 7.2 Email client program and to extract each message and write it into a separate output file that is suitable for uploading to an Email server using the class named BigDog05upload.

Each output file has a unique file name based on the number of milliseconds since Jan 1, 1970.  A one-millisecond delay is inserted into the program between messages to guarantee that no attempt is made to write two files with the same file name.

Input parameters

The following input values are provided as command-line parameters:

  • inputPathAndFile: The path to the Mbox file and the name of the Mbox file.
    Example: ./MailToBeProcessed/Inbox
  • workingDir: Where message files are temporarily stored awaiting upload to the destination Email address.
    Example: ./DataFiles/
  • destinationAddress:  Example: joe@dummy.com
  • smtpServer:  Example: smtp-server.austin.rr.com
  • uploadTag:  A string that is prepended onto the Subject line before the message is uploaded.  Can be an empty string as in "".
    Example: "ae|"

In case you are unfamiliar with the format, path names that begin with a period are relative to the current working directory, which is the directory containing the Java class files if you start the program from that directory.  However, you could just as well specify those directories as absolute paths beginning at the root directory.
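
As a quick check of how a relative path name such as ./DataFiles/ will be resolved on your system, consider the following minimal sketch.  It assumes Java 7 or later for the java.nio.file classes, and the names RelativePathSketch and resolve are hypothetical names chosen for illustration; they are not part of the BigDog05 programs.  The Java virtual machine resolves relative path names against the directory from which the java command was started (the user.dir system property), which is often, but not necessarily, the directory that contains the class files.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class RelativePathSketch {
  // Resolve a possibly relative path name against the current
  // working directory and return it in absolute, normalized form.
  static Path resolve(String pathName) {
    return Paths.get(pathName).toAbsolutePath().normalize();
  }

  public static void main(String[] args) {
    // "user.dir" holds the directory from which the JVM was started.
    System.out.println(
        "Working dir:  " + System.getProperty("user.dir"));
    System.out.println(
        "./DataFiles/: " + resolve("./DataFiles/"));
  }
}
```

If the path that is printed is not the directory that you intended, either start the program from a different directory or pass an absolute path name on the command line.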

Some are used and some are passed along

Some of these parameter values are used by the BigDog05parse object, some are simply passed along to the BigDog05upload object, and some are used by objects of both classes.

Both classes were tested using JDK 1.5.0_01 under WinXP.  JDK 1.5 or later is required because of the use of generics in the program.

Will discuss in fragments

As is my custom, I will break these classes down and discuss them in fragments, beginning with BigDog05parse.  You will find complete listings of both classes in Listings 20 and 21 near the end of the lesson.

The beginning of the first class definition and the declaration of some instance variables are shown in Listing 1.

class BigDog05parse{
  //A directory where output message files are
  // stored awaiting uploading by BigDog05upload.
  String workingDir;
  
  //Directory and file name for the Mbox file.
  String inputPathAndFile;
  
  //Save all command-line parameters here
  static String[] params;

Listing 1

The main method

The beginning of the main method is shown in Listing 2.

  public static void main(String[] args){
    if(args.length != 5){
      System.out.println(
                     "Usage: java BigDog05parse "
                     + "\n  inputPathAndFile"
                     + "\n  workingDir"
                     + "\n  destinationAddress"
                     + "\n  smtpServer"
                     + "\n  uploadTag");
      System.out.println("Terminating");
      System.exit(0);
    }//end if
    
    //Save and display command-line parameters
    params = args;
    
    System.out.println("inputPathAndFile:   " 
                                    + params[0]);
    System.out.println("workingDir:         " 
                                    + params[1]);
    System.out.println("destinationAddress: " 
                                    + params[2]);
    System.out.println("smtpServer:         " 
                                    + params[3]);
    System.out.println("uploadTag:          " 
                                    + params[4]);

Listing 2

The code in Listing 2 should be self-explanatory.

Extract messages from Mbox file

The code in Listing 3 instantiates an object of the BigDog05parse class to extract the individual messages from the Mbox file and to save them as individual message files in the working directory.

    new BigDog05parse(params[0],params[1]);

Listing 3

Upload the messages

When the constructor for the BigDog05parse class in Listing 3 returns, all of the messages have been extracted from the Mbox file and have been saved as individual files in the working directory.

Listing 4 instantiates an object of the BigDog05upload class to cause those messages to be uploaded to the specified destination address.

    new BigDog05upload(params[2],
                       params[3],
                       params[1],
                       params[4]);
  }//end main

Listing 4

Gmail doesn't retain duplicate copies

As an aside that may be useful during your testing of the program, it appears that if you upload the same message to Gmail two or more times in succession, Gmail will not retain duplicate copies of the message.  Only one copy will be retained in the Gmail archives.

The constructor for the BigDog05parse class

Listing 5 shows the constructor for the BigDog05parse class in its entirety.

  BigDog05parse(String input,String output){
    this.inputPathAndFile = input;
    this.workingDir = output;
    parseMboxFile();
  }//end constructor

Listing 5

The constructor requires two parameters.  The first parameter points to the Mbox file.  The second parameter points to the directory where the individual message files will be stored awaiting upload to the destination Email address.

The constructor invokes the instance method named parseMboxFile where all the real work is done for this class.

The parseMboxFile method

The method named parseMboxFile begins in Listing 6.

  void parseMboxFile(){
    String data;
    try{//Get input stream object
      BufferedReader mboxInputObj = 
                          new BufferedReader(
                            new FileReader(
                              inputPathAndFile));
                                  
      DataOutputStream dataOut = null;
      String workingDirAndFile = null;
      String outputFileName = null;
      int msgNumber = 1;

Listing 6

The code in Listing 6 is straightforward and shouldn't require further explanation.

Read and process Mbox file

An Mbox file is simply a text file.  According to Wikipedia,

"mbox is a generic term for a family of related file formats used for holding collections of electronic mail messages. In these formats, the messages are concatenated in a single file, with a From line prepended to the beginning of each and a blank line appended to the end of each. The From line begins with the five characters "From ", and may continue with other text."

That is all we need to know

The information in the above quotation is all that is needed for this method to parse the Mbox file, separating it into individual messages, and writing each message into a separate output file.  For example, Figure 1 shows a typical beginning for an Mbox file.  (Note the first five characters in the first line, which reads "From ".)

From - Fri Aug 12 07:33:13 2005
X-UIDL: 40d1c47e00012ae8
X-Mozilla-Status: 0001
X-Mozilla-Status2: 00000000

Figure 1

Similarly, Figure 2 shows a typical transition from the end of one message to the beginning of the next message in an Mbox file.  (Note the blank line followed by the line that begins with "From ".)

 or else you will continue to receive email/s.

From - Fri Aug 12 07:33:13 2005
X-UIDL: 40d1c47e00012aea
X-Mozilla-Status: 0001
X-Mozilla-Status2: 00000000

Figure 2

Iterate on lines in Mbox file

As shown in Listing 7, the parseMboxFile method uses a while loop to examine each line of text in the Mbox file and to decide what to do with it.  When it finds a line that begins with "From ", the program assumes that the line constitutes the beginning of a new message.

      while((data = mboxInputObj.readLine())
                                        != null){
        if(data.startsWith("From ")){
          if(dataOut != null){
            dataOut.close();
          }//end if(dataOut != null)

Listing 7

If the program currently has an output file open, the code in Listing 7 closes that file in order to start a new output file.

Create a unique fileID

Still in the body of the if statement that begins in Listing 7, the code in Listing 8 creates a unique fileID for the new file name based on the negative value of the current time in milliseconds relative to January 1, 1970.

(The program uses the hexadecimal value of the negative, in two's complement notation, of the current time instead of the actual current time, to eliminate leading zeros in the long time value.  As a result, these hexadecimal values all begin with fff... and are all of the same length, namely 16 hexadecimal digits.)

          String fileID = Long.toHexString(
                        -(new Date().getTime()));
          //Sleep for 1 ms to guarantee unique
          // file names.  (sleep is a static
          // method of the Thread class.)
          Thread.sleep(1);

Listing 8

Sleep for one millisecond

After getting the unique fileID, the program goes to sleep for one millisecond to guarantee that the next fileID value will be different.
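
The following sketch isolates the fileID computation from Listing 8 so that you can experiment with it separately.  The class name FileIdSketch and the method name fileIdFor are hypothetical names introduced here for illustration.  Note that Long.toHexString returns lowercase hexadecimal digits.

```java
import java.util.Date;

class FileIdSketch {
  // Build a fileID from a time value in milliseconds since
  // Jan 1, 1970, using the hexadecimal form of its negative.
  static String fileIdFor(long millis) {
    return Long.toHexString(-millis);
  }

  public static void main(String[] args)
                            throws InterruptedException {
    String first = fileIdFor(new Date().getTime());
    // Wait so that the clock advances before the next ID.
    Thread.sleep(1);
    String second = fileIdFor(new Date().getTime());
    System.out.println(
        first + " (" + first.length() + " digits)");
    System.out.println(
        second + " (" + second.length() + " digits)");
  }
}
```

Because the negative of any positive time value has its sign bit set, the resulting string always contains all 16 hexadecimal digits of the long value, which is why the file names all have the same length.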

Complete the if statement

The code in Listing 9 completes the body of the if statement that began in Listing 7.  Recall that this is all being done in conjunction with the decision that the line of text constitutes the first line in a new message.

          //Open an output file to save the
          // message.  Use the fileID as part of
          // the file name.  Use a file name that
          // is compatible with earlier versions
          // of the programs in the BigDog
          // series.
          outputFileName = "+OK " + 
                        msgNumber + " " + fileID;
          //Concatenate file name to the output
          // path.
          workingDirAndFile =
                     workingDir + outputFileName;
          //Get an output stream for the file.
          dataOut = new DataOutputStream(
                           new FileOutputStream(
                             workingDirAndFile));
          //Show progress
          System.out.print(msgNumber + " ");
          //Increment the message counter in
          // preparation for processing the next
          // message.
          msgNumber++;
        }//end if(data.startsWith("From ")

Listing 9

The code in Listing 9 is straightforward and shouldn't require further explanation.

The else clause

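The code in Listing 10 shows the else clause for the if statement that began in Listing 7.  This code is executed if the line of text does not constitute the first line in a new message.  In this case, the program writes a newline character ahead of the line, so that every line in an output file except for the last line in that file is terminated by a newline character.  (Note that this code assumes that the first line in the Mbox file begins with "From ".  Otherwise, no output file would be open when the else clause is first executed.)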

        else{
          dataOut.writeBytes("\n");
        }//end else
        //Write the line into the output file.
        dataOut.writeBytes(data);
      }//end while loop

Listing 10

Finally, the code in Listing 10 writes the line of text into the output file.  That constitutes the end of the while loop.

At this point, the program goes back to the top of the while loop and reads the next line of text from the Mbox file.
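
The parsing logic discussed in Listings 7 through 10 can be condensed into the following sketch, which works on lines that are already held in memory rather than on files.  This is not the code from BigDog05parse.  The class name MboxSplitSketch and the method name split are hypothetical, and the decision to ignore any lines that precede the first From line is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MboxSplitSketch {
  // Split the lines of an Mbox file into messages.  A line that
  // starts with "From " begins a new message.  Every other line
  // is appended to the current message.
  static List<String> split(List<String> lines) {
    List<String> messages = new ArrayList<String>();
    StringBuilder current = null; // null until first "From "
    for (String line : lines) {
      if (line.startsWith("From ")) {
        if (current != null) {
          messages.add(current.toString()); // finish previous
        }
        current = new StringBuilder();
      } else if (current == null) {
        continue; // assumption: skip text before first "From "
      } else {
        current.append('\n'); // newline ahead of later lines
      }
      current.append(line);
    }
    if (current != null) {
      messages.add(current.toString()); // finish last message
    }
    return messages;
  }

  public static void main(String[] args) {
    List<String> lines = Arrays.asList(
        "From - Fri Aug 12 07:33:13 2005", "Subject: one", "",
        "From - Fri Aug 12 07:33:13 2005", "Subject: two");
    System.out.println(split(lines).size() + " messages");
  }
}
```

As in the original program, the newline character is written ahead of each line other than the first line of a message, so no message ends with an extra newline beyond the blank line that the Mbox format places between messages.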

The end of the Mbox file

When the readLine statement in the conditional clause of the while loop that begins in Listing 7 encounters the end of the Mbox file, the readLine method returns null, causing the while loop to terminate.  That brings us to Listing 11, which signals the end of the parseMboxFile method and the end of the class definition for the BigDog05parse class.

      //All messages from the Mbox file have been
      // written into separate files.
      if(dataOut != null){
        //Close the last file.
        dataOut.close();
      }//end if(dataOut != null)
      
      System.out.println();//blank line
      System.out.println(
                      "The Mbox file is parsed");
      System.out.println("Start uploading msgs");
      //Sound an audio alert
      Thread.sleep(500);
      Toolkit.getDefaultToolkit().beep();
      Thread.sleep(500);
      Toolkit.getDefaultToolkit().beep();
      Thread.sleep(500);
    }catch(Exception ex){
      ex.printStackTrace();
      System.exit(0);
    }//end catch
  }//end parseMboxFile
}//end class BigDog05parse

Listing 11

Listing 11 contains some cleanup code, which is straightforward and shouldn't require further explanation.
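
If you are using Java 7 or later (an assumption, because the listings in this lesson target JDK 1.5), the try-with-resources statement offers another way to make sure that an output file is closed, even if an exception is thrown while the file is being written.  The following sketch is an illustration only.  The names CloseSketch and writeMessage are hypothetical, and the ISO-8859-1 charset is a choice made here to approximate the writeBytes method, which writes the low-order byte of each character.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class CloseSketch {
  // Write one message to a file.  The writer is closed
  // automatically when the try block exits, whether normally
  // or because of an exception.
  static void writeMessage(Path file, String message)
                                      throws IOException {
    try (BufferedWriter out = Files.newBufferedWriter(
                    file, StandardCharsets.ISO_8859_1)) {
      out.write(message);
    }
  }

  public static void main(String[] args) throws IOException {
    Path file = Files.createTempFile("msg", ".txt");
    writeMessage(file,
        "From - Fri Aug 12 07:33:13 2005\nSubject: test");
    System.out.println(
        "Wrote " + Files.size(file) + " bytes to " + file);
    Files.delete(file);
  }
}
```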

Returning control to the main method

At this point, the parseMboxFile method will return control to the constructor as shown in Listing 5.  The constructor will return control to the main method as shown in Listing 3.

The next thing that happens is the construction of an object of the BigDog05upload class as shown in Listing 4.  That brings us to a discussion of the BigDog05upload class shown in Listing 21 near the end of the lesson.

The BigDog05upload class

The ultimate purpose of this class is to work in conjunction with the class named BigDog05parse to upload legacy Email messages to an Email server such as Gmail.

An object of this class is instantiated by the main method of the BigDog05parse class to tag and upload the set of message files that the BigDog05parse object wrote into the working directory.

This object tags messages with a tag passed as a constructor parameter.  Then the object uploads the messages to a destination Email account that is specified as a constructor parameter using an SMTP server that is also specified as a constructor parameter.

Beginning of the class definition

The beginning of the class definition, along with some variable declarations, is shown in Listing 12.

class BigDog05upload{
  //ID of the destination email account.
  final String destinationAddress;
  //An smtp server through which the user is
  // authorized to send email messages.
  final String smtpServer;
  //Local folder where message files are stored
  // awaiting uploading.
  final String workingDir;
  //Tag that is prepended to the Subject line of
  // the message before uploading.
  final String uploadTag;
  //Concatenation of the working directory and
  // the name of a message file.
  String pathFileName;
  //IDs of msg files that will be deleted from
  // the working directory are stored here.
  Vector<String> msgToDelete =
                            new Vector<String>();
  boolean okToDelete = false;
  int msgNumber = 0;

Listing 12

Constructor for the BigDog05upload class

The constructor is shown in its entirety in Listing 13.

  BigDog05upload(final String destinationAddress,
                 final String smtpServer,
                 final String workingDir,
                 final String uploadTag){
    this.destinationAddress = destinationAddress;
    this.smtpServer = smtpServer;
    this.workingDir = workingDir;
    this.uploadTag = uploadTag;
    
    System.out.println("Uploading messages:");
    uploadMsgs();
    
    //All messages have been uploaded.
    System.out.println("Deleting message files");
    deleteMsgFiles();
  }//end constructor

Listing 13

Upload the messages

After saving the constructor parameters and printing a message, the constructor invokes the method named uploadMsgs.  This method converts the message files produced by the BigDog05parse object to Email messages and sends them to the destination Email address.

Delete the message files

When the uploadMsgs method returns successfully, after having uploaded the message files to the destination Email address, the constructor invokes the deleteMsgFiles method to delete the message files in order to clean up the working directory.

The uploadMsgs method

The method named uploadMsgs begins in Listing 14.

  void uploadMsgs(){
    //The following code creates a directory 
    // listing containing only those files with
    // names that begin with +OK.
    //This is an anonymous implementation of a 
    // class that implements FilenameFilter.
    String[] dirList = new File(workingDir).list(
      new FilenameFilter(){
        public boolean accept(
                           File dir,String name){
          if(!(new File(dir,name).
            isFile())) return false;
          return name.startsWith("+OK");
        }//end accept
      }//end FilenameFilter
    );//end list
    
    if(dirList.length == 0){
      System.out.println("No files to upload");
      System.out.println("Terminating program");
      System.exit(0);
    }//end if

Listing 14

Create a directory listing

The code in Listing 14 creates a listing of the files in the working directory that are to be uploaded to the destination Email address.  If that list is empty, the program aborts with an explanatory message.

The code in Listing 14 is straightforward and should not require further explanation.
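
For readers using Java 8 or later (an assumption, because the lesson targets JDK 1.5), the same kind of directory listing can be sketched with a lambda expression in place of the anonymous class.  The names ListingSketch and listMessageFiles are hypothetical.  The sketch also checks for the null value that the list method of the File class returns when the directory doesn't exist or can't be read.

```java
import java.io.File;
import java.io.FilenameFilter;

class ListingSketch {
  // Return the names of the regular files in dir whose names
  // begin with "+OK".  File.list returns null when dir is
  // missing or unreadable, so that case is mapped to an
  // empty array here.
  static String[] listMessageFiles(File dir) {
    FilenameFilter filter = (parent, name) ->
        new File(parent, name).isFile()
        && name.startsWith("+OK");
    String[] names = dir.list(filter);
    return names == null ? new String[0] : names;
  }

  public static void main(String[] args) {
    String[] names =
        listMessageFiles(new File("./DataFiles/"));
    System.out.println(
        names.length + " message file(s) found");
  }
}
```

With this approach, a missing working directory is reported in the same way as an empty one instead of causing a NullPointerException when the length of the list is tested.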

Iterate on the directory listing

The uploadMsgs method uses a for loop to iterate on the directory listing, converting each file contained in that listing into an Email message and sending the message to the destination Email address.  The beginning of the for loop and the initialization of some working variables are shown in Listing 15.

    for(int msgCounter = 0;
                     msgCounter < dirList.length;
                                   msgCounter++){
      String fileName = dirList[msgCounter];
      pathFileName = workingDir + fileName;

Listing 15

Construct and send the Email message

During each iteration of the for loop, the code in Listing 16 invokes the forwardEmailMsg method to construct an Email message from the message file and send that Email message to the destination Email address.

If the Email message is sent successfully, the forwardEmailMsg method returns true.  Otherwise, it returns false.

The third parameter to the forwardEmailMsg method can be used to tag the Subject line of the message to show that it was uploaded.  If you don't want to tag the message, just pass an empty string as the third parameter.

      okToDelete = forwardEmailMsg(
                              destinationAddress,
                              smtpServer,
                              uploadTag,
                              pathFileName);

Listing 16
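
The code for the forwardEmailMsg method is not shown in this part of the lesson, so the following sketch should not be read as the way that method tags a message.  It is only a hypothetical illustration of what prepending a tag such as "ae|" onto a Subject line might look like.  The names SubjectTagSketch and tagSubject are assumptions, as is the decision to leave one space after the colon.

```java
class SubjectTagSketch {
  // Prepend tag to the text of a Subject header line.  Lines
  // that are not Subject headers are returned unchanged.
  static String tagSubject(String line, String tag) {
    String prefix = "Subject:";
    // Compare the start of the line with the prefix,
    // ignoring differences in case.
    if (!line.regionMatches(
                   true, 0, prefix, 0, prefix.length())) {
      return line;
    }
    String header = line.substring(0, prefix.length());
    String text = line.substring(prefix.length()).trim();
    return header + " " + tag + text;
  }

  public static void main(String[] args) {
    System.out.println(tagSubject("Subject: Hello", "ae|"));
    System.out.println(tagSubject("Subject: Hello", ""));
  }
}
```

Passing an empty string as the tag leaves the Subject text unchanged, which matches the behavior described above for an empty uploadTag parameter.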

Update the deletion list

If the forwardEmailMsg method returns true, the name of the message file is added to a list of files scheduled for deletion later as shown in Listing 17.

      if(okToDelete){
        msgToDelete.add(pathFileName);
        System.out.print((msgCounter + 1) + " ");
      }//end if
    }//end for loop on directory length

Listing 17

The message number for that message is also displayed as an upload progress indicator by the code in Listing 17.

Listing 17 also signals the end of the for loop.  When the for loop terminates, an attempt has been made to upload all of the message files to the destination Email address.

Completion of the uploadMsgs method

Listing 18 shows some additional cleanup code that is executed in the uploadMsgs method after the for loop terminates.

    //Sound an audio alert when all messages have
    // been uploaded.
    try{
      Thread.sleep(500);
      Toolkit.getDefaultToolkit().beep();
      Thread.sleep(500);
      Toolkit.getDefaultToolkit().beep();
      Thread.sleep(500);
      Toolkit.getDefaultToolkit().beep();
      Thread.sleep(500);
    }catch(Exception ex){
      ex.printStackTrace();
      System.exit(0);
    }//end catch
    System.out.println();//blank line
    System.out.println("Upload complete");
  }//end uploadMsgs

Listing 18

The code in Listing 18 is straightforward and shouldn't require further explanation.

The forwardEmailMsg method

This is a fairly complex method.  You can view the code for the method in Listing 21 near the end of the lesson.  You will find a detailed explanation of the method in one of my previously published lessons numbered 2180 and entitled Enlisting Java in the War Against Email Viruses.  Therefore, I won't repeat that explanation here.

The method named deleteMsgFiles

As the name implies, the purpose of this method is to delete the message file for each of the messages that have been successfully uploaded to the destination Email address.  The method is shown in its entirety in Listing 19.

  void deleteMsgFiles(){
    //Delete the files in the msgToDelete
    // collection.
    for(int cnt = 0;
                 cnt < msgToDelete.size();cnt++){
      pathFileName = msgToDelete.elementAt(cnt); 
      File file = new File(pathFileName);
      boolean isDeleted = file.delete();
      if(!isDeleted)System.out.println(
             "Unable to delete " + file);
    }//end for loop
  }//end deleteMsgFiles

Listing 19

The code in Listing 19 is straightforward and shouldn't require further explanation.
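
The delete method of the File class reports failure only as a boolean value.  If you would like to know why a file could not be deleted, and you are using Java 7 or later (an assumption), a variation along the following lines may be useful.  The names DeleteSketch and deleteAll are hypothetical, and this sketch is not part of the BigDog05upload class.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DeleteSketch {
  // Try to delete every file named in pathNames.  Return one
  // entry for each file that could not be deleted, consisting
  // of the path name followed by the reason.
  static List<String> deleteAll(List<String> pathNames) {
    List<String> failures = new ArrayList<String>();
    for (String pathName : pathNames) {
      try {
        Files.delete(Paths.get(pathName));
      } catch (IOException ex) {
        failures.add(pathName + ": " + ex);
      }
    }
    return failures;
  }

  public static void main(String[] args) throws IOException {
    String temp =
        Files.createTempFile("msg", ".txt").toString();
    List<String> failures =
        deleteAll(Arrays.asList(temp, temp + ".missing"));
    for (String failure : failures) {
      System.out.println("Unable to delete " + failure);
    }
  }
}
```

Here the delete method of the Files class throws an exception that describes the problem, such as a NoSuchFileException for a file that doesn't exist, and the sketch collects those descriptions instead of stopping at the first failure.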

Run the Program

I encourage you to copy the code from Listings 20 and 21 into your text editor, compile the code, and execute the program.  Experiment with it, making changes and observing the results of your changes.

Also please be aware of the following disclaimer:

THIS PROGRAM IS PROVIDED TO YOU AT NO COST.  BY USING THIS PROGRAM TO PROCESS YOUR EMAIL, YOU AGREE THAT YOU ARE USING IT AT YOUR OWN RISK.  THE AUTHOR OF THE PROGRAM, RICHARD G. BALDWIN, ACCEPTS NO LIABILITY FOR ANY LOSS THAT YOU MAY INCUR THROUGH THE USE OF THIS PROGRAM.

Summary

In this lesson, I showed you how to write a program that can be used to upload legacy Email messages from local Mbox files to an Email account.

Such a program is particularly useful for persons who would like to upload their collection of legacy Email messages onto the Gmail server so that they can take advantage of the extremely fast search capability that Gmail provides.

What's Next?

In the next lesson in this series, I will show you how to modify the program that I provided in the earlier lesson entitled Consolidating Email using Java so that it can be run in a fully automated, unattended mode.  That will make it possible to use a task scheduler to automatically move Email messages from one POP3 server to another Email account on a regularly scheduled basis, such as once per hour.




