http://www.developer.com/

Uploading Old Email to Gmail using Java


October 18, 2005

Java Programming Notes # 2404


Preface

The recent availability of low-cost (or no-cost) Email services such as Google's Gmail, which provide massive storage capacity, advanced features, and lightning-fast search capability, has caused many of us to rethink the way that we manage Email.

A whole new outlook

Up until recently, I managed my Email pretty much the same as almost everyone else.  I downloaded the messages from several different Email accounts into an Email client program, (Netscape Mail in my case), read messages, deleted messages, filed messages, etc.

Since getting my Gmail account in May of 2005, I have discovered that there is a much better way to manage Email.  Quite simply, in my opinion, Gmail is the finest Email program that I have ever seen.  It has completely changed my outlook (no pun intended) on how to manage Email.

Solving some operational problems

However, making the switch to Gmail does involve a few operational problems, none of which are of Google's making.  The lessons in this series show how I have written special Java programs to deal with those operational problems.

Viewing tip

You may find it useful to open another copy of this lesson in a separate browser window.  That will make it easier for you to scroll back and forth among the different listings and figures while you are reading about them.

One lesson in a series of lessons

This is the second lesson in a series of lessons on the general topic of moving Email messages around among servers and local computers.  The first and previous lesson in the series was entitled Consolidating Email using Java.

Supplementary material

I recommend that you also study the other lessons in my extensive collection of online Java tutorials.  You will find those lessons published at Gamelan.com.  However, as of the date of this writing, Gamelan doesn't maintain a consolidated index of my Java tutorial lessons, and sometimes they are difficult to locate there.  You will find a consolidated index at www.DickBaldwin.com.

General Background Information

Historical approach to managing Email

Up until recently, I managed my Email pretty much the same way that others did.  I downloaded messages from several different Email accounts onto my hard drive using a local Email client program.  In my case, that client program was Netscape Mail.

I would delete most of the messages because they were identified as spam either using Netscape filters or the Netscape junk mail capability.  Then I would read the remaining messages.

Delete some, file the rest

Having read the remaining messages, I would delete some of them in order to conserve disk space, and file the rest in an elaborate system of Email folders that had evolved over the years. 

(I have accumulated more than 500 Mbytes of Email messages on my hard disk.)

Hmm, now where did I file that message?

Everything usually went pretty well until I needed to find a message that I had received and filed earlier.  Oftentimes, I would forget which folder I had filed it in.  This often resulted in a long and frequently tedious manual search.  Sometimes I would try running the search feature of the Email client program, but it was so slow making one search pass through 500 Mbytes of Email messages that it wasn't really practical.

Gmail to the rescue

Then along came an opportunity to open a Gmail account, which I did mainly out of curiosity.  Once I became familiar with Gmail, I quickly decided that it would form the basis of my new approach to Email management.

(Access to a Gmail account is currently available by invitation only.  People who already have an account can invite others to open accounts.  As of this writing, I have coupons for fifty invitations.  If you would like to open a Gmail account, send an Email message to baldwin@dickbaldwin.com notifying me of that fact and I will submit your Email address to Google in order to get an invitation for you.  To keep my spam blocker from discarding your message, please include Gmail in the subject of the message.)

Advantages and disadvantages

Gmail offers many advantages over my old approach.  So far, at least, I haven't found any disadvantages.  Although there are advertisements on the right side of the browser window when I read a message, they are unobtrusive and they don't bother me at all.  In fact, I can eliminate them by adjusting the width of my browser window if I want to.

I have read concerns about privacy issues that involve storing Email messages on a server controlled by someone else.  When I am concerned about having confidential information in my Email messages I encrypt them anyway, so that isn't a concern for me.  I have much greater concerns about privacy related to making online purchases and having my credit card number stored in hundreds of servers scattered around the world than I do about having my Email messages stored on the Gmail server.

Virtually unlimited disk capacity

As one advantage, for example, I don't need to be concerned about conserving disk space.  Google makes so much space available that I can even afford to save the messages in the trash folder.  For example, here is a statement that appears on the bottom of my Gmail web page today, "You are currently using 243 MB (10%) of your 2502 MB."

Why would I save the trash?

Why, you might wonder, would I want to save the trash?  As it turns out, having a large number of messages in the trash folder is very valuable.  I periodically do a statistical analysis on those messages to identify emerging spam trends, such as new incorrect spellings for various medications.  I use that information to develop new spam filters using Gmail's excellent filter system.

Foreign-language messages

Also, approximately one-third of the messages that I receive are in languages that I can't read.  Therefore, there is no point in having them clutter my inbox.  Periodically, I use the messages in the trash folder to identify foreign-language characters that can be used to filter out foreign-language messages.  This turns out to be a fairly easy filtering task.  I am able to trap about ninety-five percent of all foreign-language messages and send them straight to the trash folder completely bypassing my inbox.

The main advantage is a lightning-fast search

I could go on and on singing the praises of Gmail, but I won't.  The main advantage of Gmail (the advantage to which this lesson is devoted) is the ability to use a very sophisticated search capability to search all the messages in the archives (and optionally in the spam and trash folders) with lightning speed.

An alternative to Email folders

Although Gmail makes it possible to apply labels to messages (as an alternative to using mail folders), the ability to search very rapidly reduces the need for filing messages in an organized way.

Basically, Gmail has only three folders:

  • All Mail
  • Spam
  • Trash

The All Mail folder is automatically subdivided into the following categories, and you can further subdivide it through the use of labels:

  • Inbox
  • Starred
  • Sent Mail
  • Drafts

A single message may be tagged with none, one, or more labels.

POP3 access and mail forwarding

I now do all of my serious Email processing using Gmail's web mail capability. However, Gmail also provides POP3 access and message forwarding at no cost as well.  I use the POP3 capability to download and save selected messages locally for backup purposes.  So far, I haven't found a use for the forwarding capability, but it is good to know that it is available.

Now back to the search capability

As an example of searching, I recently had a question about the availability of campus parking permits for the upcoming school year.  I remembered having received an Email message that addressed that subject sometime in the recent past.

I was able to search through tens of thousands of messages for the keyword parking in less time than it took for me to type the keyword into the search field.  The search isolated thirteen messages out of the thousands of messages in the archives and it was easy to spot the one that I needed.  If need be, I could have further narrowed down the search using a logical combination of multiple keywords (and/or labels) in conjunction with AND, OR, and NOT.

If I had been looking for this message on my hard drive using my old approach, I would probably still be trying to figure out which folder contained the message.

Just use it and be quiet about it!

By now, you are probably wondering why I don't simply use Gmail and be quiet about it.  The reason is that there are a couple of operational problems in making the switch to Gmail that are not of Google's making.  I want to share the solution to those problems with you just in case you might be interested in making the switch.

Both problems revolve around the fact that in order for Gmail to be most useful, it is important for me to consolidate all of my Email messages on the Gmail server so that I can apply Gmail's fast search capability to all of my Email messages.

Email accounts refuse to forward messages

The first problem is that over the years, I have accumulated several different Email accounts.  Unfortunately, a couple of the most important accounts (including my employer's) refuse to forward my Email messages to Gmail (or to any other Email account, for that matter).  I addressed that problem in an earlier lesson entitled Consolidating Email using Java.  In that lesson, I provided a Java program that can be used to fetch messages from such uncooperative Email accounts and to forward those messages to Gmail (or any other Email account).

Legacy Email messages

The second problem has to do with legacy Email messages.  Over the years, I have accumulated many tens of thousands of Email messages under control of the Netscape Mail program on my local hard disk.  I need to upload those messages to the Gmail server so that they will be included in my newfound search capability.

In this lesson, I will provide and explain a Java program that can be used to upload legacy Email messages to Gmail or to any other Email account.

Mbox format is required

This program requires that the legacy messages be stored in the well-known Mbox Email format.  Many Email client programs, (including Netscape, Eudora, and Mozilla's Thunderbird) use this storage format.  However, some Email client programs, (such as Microsoft's Outlook), do not store their Email messages in the Mbox format.

If you have legacy Email messages stored in Mbox format, you should be able to use this program to upload them to an Email account.

Format conversion may be required

If your legacy messages are stored in some other format, you will need to search the web to find a program that will convert them into Mbox format before attempting to upload them using this program.  Because I have no experience with such conversions, I can't recommend any specific programs for making the conversion.

Email transmission volume limitations

When I first embarked on this project, my initial inclination was to write a recursive routine that would automatically traverse my entire Email folder tree, extracting messages and uploading them to Gmail in one great blast of Email transmissions.  This would entail the sending of many tens of thousands of Email messages.

Then reality set in

Then I realized that if I were to start a program running that would automatically format and send hundreds (possibly thousands) of Email messages per hour over many consecutive hours, my ISP would shut me down after the first several hundred messages were sent.  In other words, such an operation would probably trip a spam alarm causing the ISP management system to conclude that I was conducting a mass spam mailing campaign and they would disable my ability to send Email messages after a few hundred messages.

A more conservative approach

So, I was forced to take a more conservative approach.  The approach that I settled on involves the following steps:

  • One time only, set up a dummy Email account in Netscape Mail that is linked to a dummy POP3 server and a dummy SMTP server.  This account cannot be used to actually process Email outside of my local hard disk.
  • Once each day, use the Netscape Mail program to delete all messages from the inbox folder of the dummy account and to copy several hundred existing messages from other folders into the inbox folder.  (If you do this using Netscape Mail, be sure to Compact Folders after deleting the messages from the inbox folder.  Otherwise, they will remain physically in the Mbox file and will be picked up and processed by the Java program discussed below.)
  • Start the special Java program running to upload the messages from the inbox folder of the dummy account to the Gmail server.  The program knows how to find the inbox folder of the dummy account and how to send those messages to the Gmail server.

May require several weeks to complete

Using this approach, several weeks (maybe several months) will be required for me to upload all of my legacy Email messages to the Gmail server, but the process is relatively painless.  (I have no idea how many messages there are to be uploaded.  I have been uploading about 750 messages per day for several days and haven't made much of a dent in the total.)

Preview

In this lesson, I will present and explain two classes that work together to:

  • Extract individual Email messages from an Mbox file in a specified directory.
  • Send those messages to a specified destination Email address using a specified SMTP server.

The names of the two classes are:

  • BigDog05parse
  • BigDog05upload

Basically, an object of the class named BigDog05parse parses the Mbox file, extracts the individual messages from the Mbox file, and writes each message into an individual file in a working directory.  An object of the class named BigDog05upload sends the messages stored in those files to the destination Email address.

Discussion and Sample Code

The class named BigDog05parse

The ultimate purpose of this class is to work in conjunction with a class named BigDog05upload to upload legacy Email messages to an Email server such as Gmail.

This program is designed to read an Mbox file produced by the Netscape 7.2 Email client program and to extract and write each message into an output file that is compatible for uploading to an Email server using the class named BigDog05upload.

Each output file has a unique file name based on the number of milliseconds since Jan 1, 1970.  A one-millisecond delay is inserted into the program between messages to guarantee that no attempt is made to write two files with the same file name.

Input parameters

The following input values are provided as command-line parameters:

  • inputPathAndFile: The path to the Mbox file and the name of the Mbox file.
    Example: ./MailToBeProcessed/Inbox
  • workingDir: Where message files are temporarily stored awaiting upload to the destination Email address.
    Example: ./DataFiles/
  • destinationAddress:  Example: joe@dummy.com
  • smtpServer:  Example: smtp-server.austin.rr.com
  • uploadTag:  A string that is prepended onto the Subject line before the message is uploaded.  Can be an empty string as in "".
    Example: "ae|"

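In case you are unfamiliar with the format, path names that begin with a period are relative to the current working directory, which is the directory from which the java command is launched.  (That is the same as the directory containing the Java class files if you run the program from there.)  However, you could just as well specify those directories relative to the root directory.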
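
If you are unsure how a relative path such as ./DataFiles/ will be resolved on your machine, a small sketch along the following lines may help.  Java resolves relative paths against the current working directory (the user.dir system property), which coincides with the class-file directory when you launch the program from that directory.  The class name PathCheckSketch and the method name resolve are hypothetical and are not part of the BigDog05 programs.

```java
import java.io.File;

class PathCheckSketch {
  // Resolve a possibly relative path the way java.io.File does:
  // against the current working directory (system property user.dir).
  static String resolve(String path) {
    return new File(path).getAbsolutePath();
  }

  public static void main(String[] args) {
    System.out.println("user.dir:   "
                       + System.getProperty("user.dir"));
    System.out.println("workingDir: "
                       + resolve("./DataFiles/"));
  }
}
```

Running this from the directory where you plan to start BigDog05parse shows exactly where the Mbox file and the working directory will be looked up.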

Some are used and some are passed along

Some of these parameter values are used by the BigDog05parse object, some are simply passed along to the BigDog05upload object, and some are used by objects of both classes.

Both classes were tested using JDK 1.5.0_01 under WinXP.  JDK 1.5 or later is required because of the use of generics in the program.

Will discuss in fragments

As is my custom, I will break these classes down and discuss them in fragments, beginning with BigDog05parse.  You will find complete listings of both classes in Listings 20 and 21 near the end of the lesson.

The beginning of the first class definition and the declaration of some instance variables are shown in Listing 1.

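//Imports needed by the code fragments for this class.
import java.awt.Toolkit;
import java.io.*;
import java.util.Date;

class BigDog05parse{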
  //A directory where output message files are
  // stored awaiting uploading by BigDog05upload.
  String workingDir;
  
  //Directory and file name for the Mbox file.
  String inputPathAndFile;
  
  //Save all command-line parameters here
  static String[] params;

Listing 1

The main method

The beginning of the main method is shown in Listing 2.

  public static void main(String[] args){
    if(args.length != 5){
      System.out.println(
                     "Usage: java BigDog05parse "
                     + "\n  inputPathAndFile"
                     + "\n  workingDir"
                     + "\n  destinationAddress"
                     + "\n  smtpServer"
                     + "\n  uploadTag");
      System.out.println("Terminating");
      System.exit(0);
    }//end if
    
    //Save and display command-line parameters
    params = args;
    
    System.out.println("inputPathAndFile:   " 
                                    + params[0]);
    System.out.println("workingDir:         " 
                                    + params[1]);
    System.out.println("destinationAddress: " 
                                    + params[2]);
    System.out.println("smtpServer:         " 
                                    + params[3]);
    System.out.println("uploadTag:          " 
                                    + params[4]);

Listing 2

The code in Listing 2 should be self-explanatory.

Extract messages from Mbox file

The code in Listing 3 instantiates an object of the BigDog05parse class to extract the individual messages from the Mbox file and to save them as individual message files in the working directory.

    new BigDog05parse(params[0],params[1]);

Listing 3

Upload the messages

When the constructor for the BigDog05parse class in Listing 3 returns, all of the messages have been extracted from the Mbox file and have been saved as individual files in the working directory.

Listing 4 instantiates an object of the BigDog05upload class to cause those messages to be uploaded to the specified destination address.

    new BigDog05upload(params[2],
                       params[3],
                       params[1],
                       params[4]);
  }//end main

Listing 4

Gmail doesn't retain duplicate copies

As an aside that may be useful during your testing of the program, it appears that if you upload the same message to Gmail two or more times in succession, Gmail will not retain duplicate copies of the message.  Only one copy will be retained in the Gmail archives.

The constructor for the BigDog05parse class

Listing 5 shows the constructor for the BigDog05parse class in its entirety.

  BigDog05parse(String input,String output){
    this.inputPathAndFile = input;
    this.workingDir = output;
    parseMboxFile();
  }//end constructor

Listing 5

The constructor requires two parameters.  The first parameter points to the Mbox file.  The second parameter points to the directory where the individual message files will be stored awaiting upload to the destination Email address.

The constructor invokes the instance method named parseMboxFile where all the real work is done for this class.

The parseMboxFile method

The method named parseMboxFile begins in Listing 6.

  void parseMboxFile(){
    String data;
    try{//Get input stream object
      BufferedReader mboxInputObj = 
                          new BufferedReader(
                            new FileReader(
                              inputPathAndFile));
                                  
      DataOutputStream dataOut = null;
      String workingDirAndFile = null;
      String outputFileName = null;
      int msgNumber = 1;

Listing 6

The code in Listing 6 is straightforward and shouldn't require further explanation.

Read and process Mbox file

An Mbox file is simply a text file.  According to Wikipedia,

"mbox is a generic term for a family of related file formats used for holding collections of electronic mail messages. In these formats, the messages are concatenated in a single file, with a From line prepended to the beginning of each and a blank line appended to the end of each. The From line begins with the five characters "From ", and may continue with other text."

That is all we need to know

The information in the above quotation is all that is needed for this method to parse the Mbox file, separating it into individual messages, and writing each message into a separate output file.  For example, Figure 1 shows a typical beginning for an Mbox file.  (Note the first five characters in the first line, which reads "From ".)

From - Fri Aug 12 07:33:13 2005
X-UIDL: 40d1c47e00012ae8
X-Mozilla-Status: 0001
X-Mozilla-Status2: 00000000

Figure 1

Similarly, Figure 2 shows a typical transition from the end of one message to the beginning of the next message in an Mbox file.  (Note the blank line followed by the line that begins with "From ".)

 or else you will continue to receive email/s.

From - Fri Aug 12 07:33:13 2005
X-UIDL: 40d1c47e00012aea
X-Mozilla-Status: 0001
X-Mozilla-Status2: 00000000

Figure 2

Iterate on lines in Mbox file

As shown in Listing 7, the parseMboxFile method uses a while loop to examine each line of text in the Mbox file and to decide what to do with it.  When it finds a line that begins with "From ", the program assumes that the line constitutes the beginning of a new message.

      while((data = mboxInputObj.readLine())
                                        != null){
        if(data.startsWith("From ")){
          if(dataOut != null){
            dataOut.close();
          }//end if(dataOut != null)

Listing 7

If the program currently has an output file open, the code in Listing 7 closes that file in order to start a new output file.

Create a unique fileID

Still in the body of the if statement that begins in Listing 7, the code in Listing 8 creates a unique fileID for the new file name based on the negative value of the current time in milliseconds relative to January 1, 1970.

(The program uses the hexadecimal value of the negative, in two's complement notation, of the current time instead of the actual current time, to eliminate leading zeros in the long time value.  As a result, these hexadecimal values all begin with FFF... and are all of the same length.)

          String fileID = Long.toHexString(
                        -(new Date().getTime()));
          //Sleep for 1 ms to guarantee unique
          // file names.
          Thread.currentThread().sleep(1);

Listing 8

Sleep for one millisecond

After getting the unique fileID, the program goes to sleep for one millisecond to guarantee that the next fileID value will be different.
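
To see why negating the time produces names of uniform length, consider the following minimal sketch.  The class name FileIdSketch and the method name fileIdFor are hypothetical.  The only library calls are Long.toHexString and Date.getTime, the same ones used in Listing 8.  Note that Long.toHexString produces lowercase digits, so the names actually begin with fff rather than FFF.

```java
import java.util.Date;

class FileIdSketch {
  // Hexadecimal form of the negated millisecond time.  A negative
  // long always prints as 16 hex digits in two's complement form.
  static String fileIdFor(long millis) {
    return Long.toHexString(-millis);
  }

  public static void main(String[] args) {
    long now = new Date().getTime();
    String id = fileIdFor(now);
    System.out.println(id + " (" + id.length() + " characters)");
    // For comparison, the positive value prints with fewer digits
    // because leading zeros are suppressed.
    System.out.println(Long.toHexString(now));
  }
}
```

Because every fileID has the same length, the file names line up neatly in a directory listing, although they sort in reverse chronological order.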

Complete the if statement

The code in Listing 9 completes the body of the if statement that began in Listing 7.  Recall that this is all being done in conjunction with the decision that the line of text constitutes the first line in a new message.

          //Open an output file to save the
          // message.  Use the fileID as part of
          // the file name.  Use a file name that
          // is compatible with earlier versions
          // of the programs in the BigDog
          // series.
          outputFileName = "+OK " + 
                        msgNumber + " " + fileID;
          //Concatenate file name to the output
          // path.
          workingDirAndFile =
                     workingDir + outputFileName;
          //Get an output stream for the file.
          dataOut = new DataOutputStream(
                           new FileOutputStream(
                             workingDirAndFile));
          //Show progress
          System.out.print(msgNumber + " ");
          //Increment the message counter in
          // preparation for processing the next
          // message.
          msgNumber++;
        }//end if(data.startsWith("From ")

Listing 9

The code in Listing 9 is straightforward and shouldn't require further explanation.
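
One detail worth noting is that Listing 9 builds the full path by simple string concatenation, so workingDir needs to end with a path separator (as in the example ./DataFiles/).  The following sketch shows one possible alternative that tolerates either form by using the two-argument File constructor.  The class name OutputNameSketch, its method names, and the sample fileID value are hypothetical and are not taken from the BigDog05 programs.

```java
import java.io.File;

class OutputNameSketch {
  // Build the "+OK <msgNumber> <fileID>" name used for message files.
  static String outputFileName(int msgNumber, String fileID) {
    return "+OK " + msgNumber + " " + fileID;
  }

  // Join directory and name.  File inserts a separator if needed.
  static File outputFile(String workingDir, String name) {
    return new File(workingDir, name);
  }

  public static void main(String[] args) {
    // The fileID below is a made-up sample value.
    String name = outputFileName(1, "fffffef8b1a2c3d4");
    // Both calls refer to the same file.
    System.out.println(
                 outputFile("./DataFiles/", name).getPath());
    System.out.println(
                 outputFile("./DataFiles", name).getPath());
  }
}
```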

The else clause

The code in Listing 10 shows the else clause for the if statement that began in Listing 7.  This code is executed if the line of text is not the first line in a new message.  In this case, the program writes a newline character ahead of the line.  As a result, every line in an output file except the last one is terminated by a newline character.

        else{
          dataOut.writeBytes("\n");
        }//end else
        //Write the line into the output file.
        dataOut.writeBytes(data);
      }//end while loop

Listing 10

Finally, the code in Listing 10 writes the line of text into the output file.  That constitutes the end of the while loop.

At this point, the program goes back to the top of the while loop and reads the next line of text from the Mbox file.
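
The logic of the loop in Listings 7 through 10 can be sketched in memory as follows, without any file I/O.  This is only an illustration of the splitting rule, not a replacement for the parseMboxFile method.  The class name MboxSplitSketch and the method name split are hypothetical.  I am also assuming that any body line beginning with "From " has already been escaped by the mail client, because otherwise it would be mistaken for the start of a new message, both here and in the loop above.

```java
import java.util.ArrayList;
import java.util.List;

class MboxSplitSketch {
  // Split Mbox text into messages.  A line that starts with "From "
  // is treated as the first line of a new message.
  static List<String> split(String mboxText) {
    List<String> messages = new ArrayList<String>();
    StringBuilder current = null;
    for (String line : mboxText.split("\r?\n")) {
      if (line.startsWith("From ")) {
        // Finish the previous message, if any, and start a new one.
        if (current != null) messages.add(current.toString());
        current = new StringBuilder(line);
      } else if (current != null) {
        // Newline goes ahead of the line, as in Listing 10.
        current.append("\n").append(line);
      }
      // Lines ahead of the first "From " line are ignored.
    }
    if (current != null) messages.add(current.toString());
    return messages;
  }

  public static void main(String[] args) {
    String mbox = "From - Fri Aug 12 07:33:13 2005\n"
                + "Subject: first\n"
                + "\n"
                + "Body of first message.\n"
                + "\n"
                + "From - Fri Aug 12 07:33:13 2005\n"
                + "Subject: second\n"
                + "\n"
                + "Body of second message.\n";
    System.out.println(split(mbox).size() + " messages found");
  }
}
```

Unlike the code in Listing 10, this sketch skips any lines that appear ahead of the first "From " line instead of attempting to write them.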

The end of the Mbox file

When the readLine statement in the conditional clause of the while loop that begins in Listing 7 encounters the end of the Mbox file, the readLine method returns null, causing the while loop to terminate.  That brings us to Listing 11, which signals the end of the parseMboxFile method and the end of the class definition for the BigDog05parse class.

      //All messages from the Mbox file have been
      // written into separate files.
      if(dataOut != null){
        //Close the last file.
        dataOut.close();
      }//end if(dataOut != null)
      
      System.out.println();//blank line
      System.out.println(
                      "The Mbox file is parsed");
      System.out.println("Start uploading msgs");
      //Sound an audio alert
      Thread.currentThread().sleep(500);
      Toolkit.getDefaultToolkit().beep();
      Thread.currentThread().sleep(500);
      Toolkit.getDefaultToolkit().beep();
      Thread.currentThread().sleep(500);
    }catch(Exception ex){
      ex.printStackTrace();
      System.exit(0);
    }//end catch
  }//end parseMboxFile
}//end class BigDog05parse

Listing 11

Listing 11 contains some cleanup code, which is straightforward and shouldn't require further explanation.

Returning control to the main method

At this point, the parseMboxFile method will return control to the constructor as shown in Listing 5.  The constructor will return control to the main method as shown in Listing 3.

The next thing that happens is construction of an object of the BigDog05upload class as shown in Listing 4.  That brings us to a discussion of the BigDog05upload class shown in Listing 21 near the end of the lesson.

The BigDog05upload class

The ultimate purpose of this class is to work in conjunction with the class named BigDog05parse to upload legacy Email messages to an Email server such as Gmail.

An object of this class is instantiated by the program named BigDog05parse to tag and upload a set of message files written by the program named BigDog05parse.

This object tags messages with a tag passed as a constructor parameter.  Then the object uploads the messages to a destination Email account that is specified as a constructor parameter using an SMTP server that is also specified as a constructor parameter.

Beginning of the class definition

The beginning of the class definition, along with some variable declarations, is shown in Listing 12.

class BigDog05upload{
  //ID of the destination email account.
  final String destinationAddress;
  //An smtp server through which the user is
  // authorized to send email messages.
  final String smtpServer;
  //Local folder where message files are stored
  // awaiting uploading.
  final String workingDir;
  //Tag that is prepended to the Subject line of
  // the message before uploading.
  final String uploadTag;
  //Concatenation of the working directory and
  // the name of a message file.
  String pathFileName;
  //IDs of msg files that will be deleted from
  // the working directory are stored here.
  Vector <String> msgToDelete = 
                            new Vector<String>();
  boolean okToDelete = false;
  int msgNumber = 0;

Listing 12

Constructor for the BigDog05upload class

The constructor is shown in its entirety in Listing 13.

  BigDog05upload(final String destinationAddress,
                 final String smtpServer,
                 final String workingDir,
                 final String uploadTag){
    this.destinationAddress = destinationAddress;
    this.smtpServer = smtpServer;
    this.workingDir = workingDir;
    this.uploadTag = uploadTag;
    
    System.out.println("Uploading messages:");
    uploadMsgs();
    
    //All messages have been uploaded.
    System.out.println("Deleting message files");
    deleteMsgFiles();
  }//end constructor

Listing 13

Upload the messages

After saving the constructor parameters and printing a message, the constructor invokes the method named uploadMsgs.  This method converts the message files produced by the BigDog05parse object to Email messages and sends them to the destination Email address.

Delete the message files

When the uploadMsgs method returns successfully, after having uploaded the message files to the destination Email address, the constructor invokes the deleteMsgFiles method to delete the message files in order to clean up the working directory.

The uploadMsgs method

The method named uploadMsgs begins in Listing 14.

  void uploadMsgs(){
    //The following code creates a directory 
    // listing containing only those files with
    // names that begin with +OK.
    //This is an anonymous implementation of a 
    // class that implements FilenameFilter.
    String[] dirList = new File(workingDir).list(
      new FilenameFilter(){
        public boolean accept(
                           File dir,String name){
          if(!(new File(dir,name).
            isFile())) return false;
          return name.startsWith("+OK");
        }//end accept
      }//end FilenameFilter
    );//end list
    
    if(dirList.length == 0){
      System.out.println("No files to upload");
      System.out.println("Terminating program");
      System.exit(0);
    }//end if

Listing 14

Create a directory listing

The code in Listing 14 creates a listing of the files in the working directory that are to be uploaded to the destination Email address.  If that list is empty, the program aborts with an explanatory message.

The code in Listing 14 is straightforward and should not require further explanation.
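
One point that may be worth guarding against is that File.list returns null, rather than an empty array, when the directory does not exist or cannot be read.  The following sketch shows one way to handle that case while applying the same +OK filter.  The class name ListingSketch and the method name pendingMessageFiles are hypothetical and do not appear in the BigDog05 programs.

```java
import java.io.File;
import java.io.FilenameFilter;

class ListingSketch {
  // Return the names of regular files in workingDir that begin with
  // "+OK", or an empty array if the directory is missing or
  // unreadable.
  static String[] pendingMessageFiles(String workingDir) {
    String[] names = new File(workingDir).list(
      new FilenameFilter() {
        public boolean accept(File dir, String name) {
          return name.startsWith("+OK")
                 && new File(dir, name).isFile();
        }//end accept
      }//end FilenameFilter
    );//end list
    return (names == null) ? new String[0] : names;
  }

  public static void main(String[] args) {
    String dir = (args.length > 0) ? args[0] : "./DataFiles/";
    System.out.println(pendingMessageFiles(dir).length
                       + " file(s) awaiting upload in " + dir);
  }
}
```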

Iterate on the directory listing

The uploadMsgs method uses a for loop to iterate on the directory listing, converting each file contained in that listing into an Email message and sending the message to the destination Email address.  The beginning of the for loop and the initialization of some working variables are shown in Listing 15.

    for(int msgCounter = 0;
                     msgCounter < dirList.length;
                                   msgCounter++){
      String fileName = dirList[msgCounter];
      pathFileName = workingDir + fileName;

Listing 15

Construct and send the Email message

During each iteration of the for loop, the code in Listing 16 invokes the forwardEmailMsg method to construct an Email message from the message file and send that Email message to the destination Email address.

If the Email message is sent successfully, the forwardEmailMsg method returns true.  Otherwise, it returns false.

The third parameter to the forwardEmailMsg method can be used to tag the Subject line of the message to show that it was uploaded.  If you don't want to tag the message, just pass an empty string as the third parameter.

      okToDelete = forwardEmailMsg(
                              destinationAddress,
                              smtpServer,
                              uploadTag,
                              pathFileName);

Listing 16
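
The tagging step can be illustrated in isolation. In Listing 21, the tag is inserted nine characters past the position returned by indexOf("Subject: "), so a message with no Subject line would receive the tag at character position 8, because indexOf returns -1 in that case. The following is a minimal sketch of a more defensive approach, under the assumption that lines are separated by "\n" (as they are after the readLines method in Listing 21 has processed the file). The class name SubjectTagger and the method name tagSubject are hypothetical.

```java
// Hypothetical helper class; not part of the BigDog05 programs.
class SubjectTagger {
  // Inserts tag right after "Subject: " on the first header line that
  // starts with it. The message is returned unchanged if the tag is
  // empty or if no such line appears before the blank line that ends
  // the headers.
  static String tagSubject(String message, String tag) {
    final String key = "Subject: ";
    if (tag.isEmpty()) {
      return message;
    }
    // The headers end at the first blank line, if there is one.
    int headerEnd = message.indexOf("\n\n");
    if (headerEnd < 0) {
      headerEnd = message.length();
    }
    int lineStart = 0;
    while (lineStart < headerEnd) {
      if (message.startsWith(key, lineStart)) {
        int at = lineStart + key.length();
        return message.substring(0, at) + tag
               + message.substring(at);
      }
      int nl = message.indexOf('\n', lineStart);
      if (nl < 0) {
        break;
      }
      lineStart = nl + 1;
    }
    return message;
  }

  public static void main(String[] args) {
    String msg = "From: joe@dummy.com\nSubject: Hello\n\nBody text";
    System.out.println(tagSubject(msg, "ae|"));
  }
}
```

Restricting the search to the header section also avoids tagging a line in the message body that happens to contain the text "Subject: ".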

Update the deletion list

If the forwardEmailMsg method returns true, the name of the message file is added to a list of files scheduled for deletion later as shown in Listing 17.

      if(okToDelete){
        msgToDelete.add(pathFileName);
        System.out.print((msgCounter + 1) + " ");
      }//end if
    }//end for loop on directory length

Listing 17

The message number for that message is also displayed as an upload progress indicator by the code in Listing 17.

Listing 17 also signals the end of the for loop.  When the for loop terminates, an attempt has been made to upload all of the message files to the destination Email address.

Completion of the uploadMsgs method

Listing 18 shows some additional cleanup code that is executed in the uploadMsgs method after the for loop terminates.

    //Sound an audio alert when all messages have
    // been uploaded.
    try{
      Thread.currentThread().sleep(500);
      Toolkit.getDefaultToolkit().beep();
      Thread.currentThread().sleep(500);
      Toolkit.getDefaultToolkit().beep();
      Thread.currentThread().sleep(500);
      Toolkit.getDefaultToolkit().beep();
      Thread.currentThread().sleep(500);
    }catch(Exception ex){
      ex.printStackTrace();
      System.exit(0);
    }//end catch
    System.out.println();//blank line
    System.out.println("Upload complete");
  }//end uploadMsgs

Listing 18

The code in Listing 18 is straightforward and shouldn't require further explanation.

The forwardEmailMsg method

This is a fairly complex method.  You can view the code for the method in Listing 21 near the end of the lesson.  You will find a detailed explanation of the method in my previously published lesson number 2180, entitled Enlisting Java in the War Against Email Viruses.  Therefore, I won't repeat that explanation here.

The method named deleteMsgFiles

As the name implies, the purpose of this method is to delete the message file for each of the messages that have been successfully uploaded to the destination Email address.  The method is shown in its entirety in Listing 19.

  void deleteMsgFiles(){
    //Delete the files in the msgToDelete
    // collection.
    for(int cnt = 0;
                 cnt < msgToDelete.size();cnt++){
      pathFileName = msgToDelete.elementAt(cnt); 
      File file = new File(pathFileName);
      boolean isDeleted = file.delete();
      if(!isDeleted)System.out.println(
             "Unable to delete " + file);
      }//End for loop
  }//end deleteMsgFiles

Listing 19

The code in Listing 19 is straightforward and shouldn't require further explanation.
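
The delete method of the File class returns only true or false, so the message printed by Listing 19 cannot say why a deletion failed. On Java 7 or later, the Files.delete method throws an exception that describes the cause. The following is a minimal sketch of that alternative, not the code used in the program. The class name MessageFileDeleter and the method name deleteAll are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; not part of the BigDog05 programs.
class MessageFileDeleter {
  // Attempts to delete every path in the list. Returns the paths that
  // could not be deleted and prints the reason for each failure.
  static List<String> deleteAll(List<String> paths) {
    List<String> failed = new ArrayList<String>();
    for (String path : paths) {
      try {
        Files.delete(Paths.get(path));
      } catch (IOException | RuntimeException ex) {
        // IOException covers a missing file, for example;
        // RuntimeException covers an invalid path string or a
        // security restriction.
        System.out.println(
            "Unable to delete " + path + ": " + ex);
        failed.add(path);
      }
    }
    return failed;
  }

  public static void main(String[] args) {
    List<String> paths = new ArrayList<String>();
    for (String arg : args) {
      paths.add(arg);
    }
    System.out.println(
        deleteAll(paths).size() + " file(s) not deleted");
  }
}
```

Returning the list of failures gives the caller the option of retrying or reporting them in one place.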

Run the Program

I encourage you to copy the code from Listings 20 and 21 into your text editor, compile the code, and execute the program.  Experiment with it, making changes, and observing the results of your changes.

Also please be aware of the following disclaimer:

THIS PROGRAM IS PROVIDED TO YOU AT NO COST.  BY USING THIS PROGRAM TO PROCESS YOUR EMAIL, YOU AGREE THAT YOU ARE USING IT AT YOUR OWN RISK.  THE AUTHOR OF THE PROGRAM, RICHARD G. BALDWIN, ACCEPTS NO LIABILITY FOR ANY LOSS THAT YOU MAY INCUR THROUGH THE USE OF THIS PROGRAM.

Summary

In this lesson, I showed you how to write a program that can be used to upload legacy Email messages from local Mbox files to an Email account.

Such a program is particularly useful for persons who would like to upload their collection of legacy Email messages onto the Gmail server so that they can take advantage of the extremely fast search capability that Gmail provides.

What's Next?

In the next lesson in this series, I will show you how to modify the program that I provided in the earlier lesson entitled Consolidating Email using Java so that it can be run in a fully automated unattended mode.  That will make it possible to use a task scheduler to automatically move Email messages from one POP3 server to another Email account on a regular, unattended schedule, such as once per hour.

Complete Program Listings

Complete listings of the programs discussed in this lesson are shown in Listing 20 and Listing 21 below.
 
/*File BigDog05parse.java 
Copyright 2005, R.G.Baldwin
Rev 08/16/05
DISCLAIMER:  THIS PROGRAM IS PROVIDED TO YOU AT 
NO COST.  BY USING THIS PROGRAM TO PROCESS YOUR 
EMAIL, YOU AGREE THAT YOU ARE USING IT AT YOUR 
OWN RISK.  THE AUTHOR OF THE PROGRAM, RICHARD G.
BALDWIN, ACCEPTS NO LIABILITY FOR ANY LOSS THAT 
YOU MAY INCUR THROUGH THE USE OF THIS PROGRAM.
The ultimate purpose of this class is to work in 
conjunction with a class named BigDog05upload to
upload legacy Email messages to an Email server 
such as Gmail.
This program is designed to read an Mbox file 
produced by the Netscape 7.2 Email client and to 
extract and write each msg into an output file 
that is compatible for uploading to an Email 
server using the class named BigDog05upload.
Each output file has a unique file name based on 
the number of milliseconds since Jan 1, 1970.  A
one-millisecond delay is inserted into the 
program between messages to guarantee that no 
attempt is made to write two files within one
millisecond, which would otherwise lead to an
attempt to duplicate file names.  This, in turn,
would lead to some messages being lost.
The following input values are provided as 
command-line parameters:
inputPathAndFile: The path to the Mbox file and
  the name of the file. 
  Example: ./MailToBeProcessed/Inbox
workingDir: Where message files are temporarily
  stored.  Example: ./DataFiles/
destinationAddress: Example: joe@dummy.com
smtpServer: Example: smtp-server.austin.rr.com
uploadTag: A string that is prepended onto the
  Subject line before the message is uploaded.
  Can be an empty string as in "". Example: "ae|"
Tested using JDK 1.5.0_01 under WinXP
************************************************/
import java.io.*;
import java.util.*;
import java.awt.*;
class BigDog05parse{
  //A directory where output message files are
  // stored awaiting uploading by BigDog05upload.
  String workingDir;
  
  //Directory and file name for the Mbox file.
  String inputPathAndFile;
  
  //Save all command-line parameters here
  static String[] params;
  public static void main(String[] args){
    if(args.length != 5){
      System.out.println(
                     "Usage: java BigDog05parse "
                     + "\n  inputPathAndFile"
                     + "\n  workingDir"
                     + "\n  destinationAddress"
                     + "\n  smtpServer"
                     + "\n  uploadTag");
      System.out.println("Terminating");
      System.exit(0);
    }//end if
    
    //Save and display command-line parameters
    params = args;
    
    System.out.println("inputPathAndFile:   " 
                                    + params[0]);
    System.out.println("workingDir:         " 
                                    + params[1]);
    System.out.println("destinationAddress: " 
                                    + params[2]);
    System.out.println("smtpServer:         " 
                                    + params[3]);
    System.out.println("uploadTag:          " 
                                    + params[4]);
    
    //Instantiate an object of the BigDog05parse
    // class to process the Mbox file.
    new BigDog05parse(params[0],params[1]);
    
    //Messages have been extracted from the Mbox
    // file and saved as individual files.
    // Instantiate an object of the 
    // BigDog05upload class to upload the files
    // to the destination address.
    new BigDog05upload(params[2],
                       params[3],
                       params[1],
                       params[4]);
  }//end main
  //===========================================//
  //Constructor requires inputPathAndFile as
  // first parameter and workingDir as second
  // parameter.
  BigDog05parse(String input,String output){
    this.inputPathAndFile = input;
    this.workingDir = output;
    parseMboxFile();
  }//end constructor
  //===========================================//
  
  void parseMboxFile(){
    String data;
    try{//Get input stream object
      BufferedReader mboxInputObj = 
                          new BufferedReader(
                            new FileReader(
                              inputPathAndFile));
                                  
      DataOutputStream dataOut = null;
      String workingDirAndFile = null;
      String outputFileName = null;
      int msgNumber = 1;
                                  
      //Read and process each line from the
      // Mbox file.
      while((data = mboxInputObj.readLine())
                                        != null){
        if(data.startsWith("From ")){
          //This is the beginning of a new
          // message in the Mbox file. Close the
          // current output file in  order to
          // start a new output file.
          if(dataOut != null){
            dataOut.close();
          }//end if(dataOut != null)
          //Create a unique fileID for the new
          // file name based on the negative
          // value of the current time in ms
          // relative to 1 Jan 1970. Use the
          // negative of the current time instead
          // of the current time to eliminate 
          // leading zeros in the long time
          // value.
          String fileID = Long.toHexString(
                        -(new Date().getTime()));
          //Sleep for 1 ms to guarantee unique
          // file names.
          Thread.currentThread().sleep(1);
          //Open an output file to save the
          // message.  Use the fileID as part of
          // the file name.  Use a file name that
          // is compatible with earlier versions
          // of the programs in the BigDog
          // series.
          outputFileName = "+OK " + 
                        msgNumber + " " + fileID;
          //Concatenate file name to the output
          // path.
          workingDirAndFile =
                     workingDir + outputFileName;
          //Get an output stream for the file.
          dataOut = new DataOutputStream(
                           new FileOutputStream(
                             workingDirAndFile));
          //Show progress
          System.out.print(msgNumber + " ");
          //Increment the message counter in
          // preparation for processing the next
          // message.
          msgNumber++;
        }//end if(data.startsWith("From "))
        else{
          //Write a NewLine at the end of every
          // input line except for the last line
          // in the file.
          dataOut.writeBytes("\n");
        }//end else
        //Write the line into the output file.
        dataOut.writeBytes(data);
      }//end while loop
      
      //All messages from the Mbox file have been
      // written into separate files.
      if(dataOut != null){
        //Close the last file.
        dataOut.close();
      }//end if(dataOut != null)
      
      System.out.println();//blank line
      System.out.println(
                      "The Mbox file is parsed");
      System.out.println("Start uploading msgs");
      //Sound an audio alert
      Thread.currentThread().sleep(500);
      Toolkit.getDefaultToolkit().beep();
      Thread.currentThread().sleep(500);
      Toolkit.getDefaultToolkit().beep();
      Thread.currentThread().sleep(500);
    }catch(Exception ex){
      ex.printStackTrace();
      System.exit(0);
    }//end catch
  }//end parseMboxFile
}//end class BigDog05parse
//=============================================//


Listing 20
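
The splitting rule used in Listing 20 (a line that begins with "From " starts a new message) can be sketched apart from the file handling. This is only an illustration under stated assumptions: the class name MboxSplitter and the method name splitMessages are hypothetical, and the sketch works on lines already held in memory instead of writing one file per message. It also skips any lines that appear before the first "From " line, a case in which Listing 20 as written would call dataOut.writeBytes while dataOut is still null.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; not part of the BigDog05 programs.
class MboxSplitter {
  // Groups lines into messages. A line that starts with "From " opens
  // a new message. Lines seen before the first such line are skipped.
  // Lines within a message are joined with "\n", with no trailing
  // newline after the last line.
  static List<String> splitMessages(List<String> lines) {
    List<String> messages = new ArrayList<String>();
    StringBuilder current = null;
    for (String line : lines) {
      if (line.startsWith("From ")) {
        if (current != null) {
          messages.add(current.toString());
        }
        current = new StringBuilder(line);
      } else if (current != null) {
        current.append('\n').append(line);
      }
    }
    if (current != null) {
      messages.add(current.toString());
    }
    return messages;
  }

  public static void main(String[] args) {
    List<String> lines = Arrays.asList(
        "From joe@dummy.com Mon",
        "Subject: one",
        "",
        "first body",
        "From sue@dummy.com Tue",
        "Subject: two");
    System.out.println(
        splitMessages(lines).size() + " messages found");
  }
}
```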

 

/*File BigDog05upload.java
Copyright 2005, R.G.Baldwin
Rev 08/16/05
DISCLAIMER:  THIS PROGRAM IS PROVIDED TO YOU AT 
NO COST.  BY USING THIS PROGRAM TO PROCESS YOUR 
EMAIL, YOU AGREE THAT YOU ARE USING IT AT YOUR 
OWN RISK.  THE AUTHOR OF THE PROGRAM, RICHARD G.
BALDWIN, ACCEPTS NO LIABILITY FOR ANY LOSS THAT 
YOU MAY INCUR THROUGH THE USE OF THIS PROGRAM.
The ultimate purpose of this class is to work in 
conjunction with a class named BigDog05parse to 
upload legacy Email messages to an Email server 
such as Gmail.
An object of this class is instantiated by the 
program named BigDog05parse to tag and upload a 
set of message files written by the program named
BigDog05parse.    
This object tags messages with a tag passed as a 
constructor parameter.  Then the object uploads 
the messages to a destination email account that 
is specified as a constructor parameter using an 
SMTP server that is specified as a constructor 
parameter.
See additional comments at the beginning of 
BigDog05parse.java.
Tested using JDK 1.5.0_01 under WinXP
************************************************/
import java.io.*;
import java.util.*;
import java.awt.*;
import sun.net.smtp.SmtpClient;
class BigDog05upload{
  //ID of the destination email account.
  final String destinationAddress;
  //An smtp server through which the user is
  // authorized to send email messages.
  final String smtpServer;
  //Local folder where message files are stored
  // awaiting uploading.
  final String workingDir;
  //Tag that is prepended to the Subject line of
  // the message before uploading.
  final String uploadTag;
  //Concatenation of the working directory and
  // the name of a message file.
  String pathFileName;
  //IDs of msg files that will be deleted from
  // the working directory are stored here.
  Vector <String> msgToDelete = 
                            new Vector<String>();
  boolean okToDelete = false;
  int msgNumber = 0;
  //===========================================//
  //Constructor
  BigDog05upload(final String destinationAddress,
                 final String smtpServer,
                 final String workingDir,
                 final String uploadTag){
    this.destinationAddress = destinationAddress;
    this.smtpServer = smtpServer;
    this.workingDir = workingDir;
    this.uploadTag = uploadTag;
    
    System.out.println("Uploading messages:");
    uploadMsgs();
    
    //All messages have been uploaded.
    System.out.println("Deleting message files");
    deleteMsgFiles();
  }//end constructor
  //===========================================//
  
  void uploadMsgs(){
    //The following code creates a directory 
    // listing containing only those files with
    // names that begin with +OK.
    //This is an anonymous implementation of a 
    // class that implements FilenameFilter.
    String[] dirList = new File(workingDir).list(
      new FilenameFilter(){
        public boolean accept(
                           File dir,String name){
          if(!(new File(dir,name).
            isFile())) return false;
          return name.startsWith("+OK");
        }//end accept
      }//end FilenameFilter
    );//end list
    
    if(dirList.length == 0){
      System.out.println("No files to upload");
      System.out.println("Terminating program");
      System.exit(0);
    }//end if
    //Now upload the files in the directory
    // listing.
    for(int msgCounter = 0;
                     msgCounter < dirList.length;
                                   msgCounter++){
      String fileName = dirList[msgCounter];
      pathFileName = workingDir + fileName;

      //This code uploads the message to the
      // destination email account.  The third 
      // parameter can be used to tag the message
      // to show that it was uploaded. Just pass
      // an empty string, "", if you don't want
      // to tag the message.  Return value will
      // be true if the message was successfully
      // uploaded.
      okToDelete = forwardEmailMsg(
                              destinationAddress,
                              smtpServer,
                              uploadTag,
                              pathFileName);
  
      if(okToDelete){
        //Identify the message file for deletion
        // later.
        msgToDelete.add(pathFileName);
        //Display progress
        System.out.print((msgCounter + 1) + " ");
      }//end if
    }//end for loop on directory length
    //Sound an audio alert when all messages have
    // been uploaded.
    try{
      Thread.currentThread().sleep(500);
      Toolkit.getDefaultToolkit().beep();
      Thread.currentThread().sleep(500);
      Toolkit.getDefaultToolkit().beep();
      Thread.currentThread().sleep(500);
      Toolkit.getDefaultToolkit().beep();
      Thread.currentThread().sleep(500);
    }catch(Exception ex){
      ex.printStackTrace();
      System.exit(0);
    }//end catch
    System.out.println();//blank line
    System.out.println("Upload complete");
  }//end uploadMsgs
  //===========================================//
  
  void deleteMsgFiles(){
    //Delete the files in the msgToDelete
    // collection.
    for(int cnt = 0;
                 cnt < msgToDelete.size();cnt++){
      pathFileName = msgToDelete.elementAt(cnt); 
      File file = new File(pathFileName);
      boolean isDeleted = file.delete();
      if(!isDeleted)System.out.println(
             "Unable to delete " + file);
      }//End for loop
  }//end deleteMsgFiles
  //===========================================//
  
  //This method reads and saves lines of data
  // from a file starting with the line that
  // startsWith firstLine and ending with the
  // line that startsWith lastLine.
  //If firstLine is null, data is saved beginning
  // with the first line in the file.
  //If lastLine is null, data is saved to the end
  // of the file.
  //The lines of data from the file are saved by
  // concatenating them into a single string with
  // a newline inserted into the string at the
  // end of each line.
  //The name and path to the file is given by
  // pathFileName.
  public String readLines(String pathFileName,
                          String firstLine,
                          String lastLine){
    StringBuffer strBuf = new StringBuffer();
    try{
      BufferedReader inDataMsg
        = new BufferedReader(new FileReader(
                                  pathFileName));
      String data;
      boolean isSave = false;
      while((data = inDataMsg.readLine())
                                        != null){
        if( ((firstLine == null) ||
             (data.startsWith(firstLine))) &&
             (isSave == false)){
          isSave = true;
        }//end if
        if(isSave){
          strBuf.append(data + "\n");
        }//end if
        if((lastLine != null) &&
           (data.startsWith(lastLine))){
          break;//no need to read any more
        }//end if
      }//end while loop
      inDataMsg.close();//Close file
    }catch(Exception e){e.printStackTrace();}
    return new String(strBuf);
  }//end readLines
  //===========================================//
  
  //This method is used to construct an email
  // message and send it to the 
  // destinationAddress.
  public boolean forwardEmailMsg(
                       String destinationAddress,
                       String smtpServer,
                       String uploadTag,
                       String pathFileName){
      StringBuffer message = new StringBuffer(
                             "No message found");
      try{
        //Pass a string containing the name of
        // the smtp server as a parameter to the
        // following constructor.
        SmtpClient smtp =
                      new SmtpClient(smtpServer);
        //Pass any valid email address to the
        // from() method.
        smtp.from(destinationAddress);
        //Pass the email address of the
        // destinationAddress to the to() method.
        smtp.to(destinationAddress);
        //Construct the message as a single
        // StringBuffer object by concatenating
        // all of the lines in the message file.
        message = new StringBuffer(readLines(
                        pathFileName,null,null));
        //Insert uploadTag in subject
        message = message.insert(message.indexOf(
                       "Subject: ")+9,uploadTag);
        //Get an output stream for the message
        PrintStream msg = smtp.startMessage();
        //Write the message into the output
        // stream.
        msg.println(new String(message));
        //Close the stream and send the message
        smtp.closeServer();
        return true;
      }catch( Exception e ){
        e.printStackTrace();
        System.out.println(
                       "while forwarding email");
        //Sound an alarm.
        Toolkit.getDefaultToolkit().beep();
        try{
          Thread.currentThread().sleep(500);
        }catch(Exception ex){
          System.out.println(ex);
        }//end catch
        Toolkit.getDefaultToolkit().beep();
        
        //Return false to indicate that the msg
        // was not successfully forwarded.
        return false;
      }//end catch
  }//end forwardEmailMsg
  //===========================================//
}//end class BigDog05upload
//=============================================//

Listing 21


Copyright 2005, Richard G. Baldwin.  Reproduction in whole or in part in any form or medium without express written permission from Richard Baldwin is prohibited.

About the author

Richard Baldwin is a college professor (at Austin Community College in Austin, TX) and private consultant whose primary focus is a combination of Java, C#, and XML. In addition to the many platform and/or language independent benefits of Java and C# applications, he believes that a combination of Java, C#, and XML will become the primary driving force in the delivery of structured information on the Web.

Richard has participated in numerous consulting projects and he frequently provides onsite training at the high-tech companies located in and around Austin, Texas.  He is the author of Baldwin's Programming Tutorials, which have gained a worldwide following among experienced and aspiring programmers. He has also published articles in JavaPro magazine.

In addition to his programming expertise, Richard has many years of practical experience in Digital Signal Processing (DSP).  His first job after he earned his Bachelor's degree was doing DSP in the Seismic Research Department of Texas Instruments.  (TI is still a world leader in DSP.)  In the following years, he applied his programming and DSP expertise to other interesting areas including sonar and underwater acoustics.

Richard holds an MSEE degree from Southern Methodist University and has many years of experience in the application of computer technology to real-world problems.

Baldwin@DickBaldwin.com
