Creating a Portable Bookmark Library using Java, Part 2

  • December 28, 2005
  • By Richard G. Baldwin

Java Programming Notes # 2408


Preface

This is Part 2 of a two-part lesson that will teach you how to write a Java program to create and maintain a portable bookmark (Favorites) library that will follow you from browser to browser, machine to machine, and operating system to operating system.  Part 1 of this lesson was entitled Creating a Portable Bookmark Library using Java.

Viewing tip

You may find it useful to open another copy of this lesson in a separate browser window.  That will make it easier for you to scroll back and forth among the different listings and figures while you are reading about them.

Supplementary material

I recommend that you also study the other lessons in my extensive collection of online Java tutorials.  You will find those lessons published at Gamelan.com.  However, as of the date of this writing, Gamelan doesn't maintain a consolidated index of my Java tutorial lessons, and sometimes they are difficult to locate there.  You will find a consolidated index at www.DickBaldwin.com.

General Background Information

Part 1 of this lesson provided general background information on creating and maintaining a portable bookmark library.  I won't repeat that information here, but will simply refer you to that document.

We need a program

Having suggested how you can use your web mail account as a portable bookmark library, I provided and explained a program that you can:

  • Run initially to populate your web mail bookmark library with the hundreds or thousands of bookmarks that you have accumulated in the past.
  • Run periodically thereafter to update and consolidate your web mail bookmark library to incorporate new bookmarks that you have created using one or more different browsers on one or more different computers as you do your normal web research.

Firefox, Netscape, and Internet Explorer

The program that I provided in Part 1 can be used to consolidate Firefox, Netscape, and Internet Explorer bookmarks into your web mail bookmark library.  The code that is specific to the different browser formats is isolated in two methods.  If you use a browser that creates and maintains its bookmarks in a format that is different from the list of browsers given above, you should be able to write a new method to handle the bookmark format for that browser.

My favorite web mail is Google Gmail

Although I occasionally use three different web mail accounts, the one that I use most consistently and the one that I know the most about is Google Gmail.  Therefore, most of the discussion in this lesson has been slanted toward the use of Gmail for this purpose.  However, this approach should work equally well with just about any web mail account provided that it has sufficient capacity, search capability, and longevity.

Preview

As mentioned earlier, this document is the second part of a two-part lesson.  Part 1 explained the overall control code, using method stubs for the following two methods in place of the methods that actually extract bookmarks from specific browser bookmark libraries.

  • getFireFoxBookmarks
  • getIEBookmarks

This part of the lesson provides and explains those two methods in detail.  At the completion of this lesson you should have everything you need to create, populate, and maintain your own portable bookmark library.

Discussion and Sample Code

The program named Bookmarks02 that I will provide and explain in this lesson is the same as the program named Bookmarks02a (which I provided and explained in Part 1), except that the two methods listed above are actual methods in this version instead of the stubs used in Part 1.

Purpose of the program

The purpose of this program is to:

  • Extract bookmarks from one or more Firefox, Netscape, or Internet Explorer (IE) browsers within a specified range of bookmark indices.
  • Construct an Email message containing the name and the URL of each extracted bookmark.
  • Send the messages to a specified destination Email address.

Email message format

Each Email message is formatted with the name of the bookmark as the Subject and the URL for the bookmark in the body of the message.  To use this message as a bookmark later, simply open the message in the archives of your web mail account and double-click the URL.

A bookmark code

The Subject of the Email message may optionally be prepended with a text string provided by the user and identified as bkMrkCode below.  The string that is prepended to the subject can be used by the Email client to recognize the message as a bookmark message.
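
As a minimal sketch of the message layout described above, the following hypothetical helper builds the Subject and the body for one bookmark.  The class name, the method names, and the single space placed between bkMrkCode and the bookmark name are assumptions made for illustration.  The actual program may join the two strings differently.

```java
class MessageFormatSketch {
  // Build the Subject: the optional bookmark code followed by the name.
  // The separating space is an assumption, not taken from the program.
  static String makeSubject(String bkMrkCode, String name) {
    if (bkMrkCode == null || bkMrkCode.length() == 0) {
      return name;
    }
    return bkMrkCode + " " + name;
  }

  // The body of the message is simply the URL of the bookmark.
  static String makeBody(String url) {
    return url;
  }

  public static void main(String[] args) {
    System.out.println(makeSubject("BkMrk", "Austin Java Users Group"));
    System.out.println(makeBody("http://www.austinjug.org/"));
  }
}
```

A short, unusual code string works best here because it lets a web mail search or filter pick out bookmark messages without matching ordinary mail.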

A bookmark history list

The program maintains a list of bookmarks that have previously been sent to the destination Email address.  The purpose of this list is to prevent the sending of duplicate bookmark messages to the destination Email address during successive runs of the program.

The history list is maintained in a text file that can easily be edited by the user if such editing becomes necessary.  Several backup copies of the file are automatically generated and maintained by the program. 

History file is automatically created, backed up, and maintained

The history file is named BkMrkHistory.txt.  If it doesn't already exist, it is automatically created (in a folder specified by the user and identified below as dataOutPath) the first time the program is run.  Once the file is created, it is backed up at the beginning of each successive run of the program.  Then it is updated during that run.

Prevention of duplicate Email messages

Bookmarks identified in the history list are not sent to the destination Email address.  This prevents the sending of duplicate bookmark messages during successive runs of the program.

Once all the bookmarks in the Firefox or Netscape bookmark file (or the IE Favorites folder) have been sent to the destination Email address, this program will send only those bookmarks that have been added or changed since the last time the program was run.
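
The duplicate check can be sketched roughly as follows.  This is not the code from the program.  It assumes that each bookmark is represented by one string that combines the name and the URL, and that the history has already been read into a Set.  The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class HistoryFilterSketch {
  // Return only the entries that are not yet in the history, and record
  // them in the history so that a later run will skip them.
  static List<String> selectUnsent(List<String> entries,
                                   Set<String> history) {
    List<String> unsent = new ArrayList<String>();
    for (String entry : entries) {
      // Set.add returns false when the entry is already present.
      if (history.add(entry)) {
        unsent.add(entry);
      }
    }
    return unsent;
  }
}
```

Because the whole string is compared, a bookmark whose name or URL has been edited no longer matches its old history entry, so it is treated as new.  That is consistent with sending bookmarks that were "added or changed" since the last run.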

Don't overwhelm the SMTP server

The number of messages sent the first time the program is run could overwhelm the SMTP server (causing your ISP to erroneously conclude that you are distributing SPAM email).  To avoid this, the program allows the user to specify the index of the first bookmark to be sent and the maximum number of bookmarks to be sent during any particular run.
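
One way to picture the effect of these two limits is the following sketch, which selects a window from a list.  It assumes that the first-bookmark index is zero-based and that out-of-range values should be clamped rather than rejected.  Neither assumption is confirmed by the program itself, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class BatchWindowSketch {
  // Return at most numToProc items starting at index lowBkMrkLimit.
  // Out-of-range values are clamped instead of throwing an exception.
  static <T> List<T> window(List<T> all, int lowBkMrkLimit,
                            int numToProc) {
    int from = Math.min(Math.max(lowBkMrkLimit, 0), all.size());
    // Use long arithmetic so a very large numToProc cannot overflow.
    long wanted = (long) from + Math.max(numToProc, 0);
    int to = (int) Math.min(wanted, (long) all.size());
    return new ArrayList<T>(all.subList(from, to));
  }
}
```

With 1,000 bookmarks, for example, successive runs with a starting index of 0, 100, 200, and so on, and a count of 100 would cover the whole library in ten modest batches.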

Controlled by command-line parameters

This program is controlled using command-line parameters, making it very easy to run the program periodically using a batch file or a script.

(Once you create the batch file, all that you need to do to run the program is to start the batch file.  If you put a shortcut to the batch file on your desktop, you can run the program simply by double-clicking the shortcut icon.)

The required command-line parameters

The following values must be provided as command-line parameters.  Although all command-line parameters are provided as strings, several of the parameters must be convertible to the int and boolean types shown below.

  • String destAdr:  Email address to which bookmarks are to be sent.
  • String smtpServer:  An SMTP server that can be used to send the messages.
  • String bkMrkPath:  Path to the folder containing the Firefox or Netscape bookmark file (which is an HTML file) or containing the IE Favorites files (which Microsoft refers to as Internet Shortcut files).
  • String bkMrkFile:  Name of the Firefox or Netscape bookmark file.  (Provide a dummy file name if you are processing IE Favorites.)
  • String dataOutPath:  Path to a folder where output files will be stored.  (Make sure this folder is on the path for your regular backups because you don't want to lose these files in case of a disk failure.)
  • String bkMrkCode:  Unique code that is prepended to the Subject of the Email message to identify the message as a bookmark message.
  • int lowBkMrkLimit:  First bookmark index to process.
  • int numToProc:  Number of bookmarks to process.
  • String browser:  Type of browser: F for Firefox, N for Navigator, or I for Internet Explorer.
  • boolean sendMsgs:  Specify true or false.  Messages will be sent on true.  On false, messages will not be sent, but bookmark statistics will be displayed.
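
The string-to-type conversions mentioned above can be sketched as follows.  The sketch assumes that the parameters arrive in the order listed, which the list suggests but does not guarantee, and it keeps only a few of the ten values.  The class name is hypothetical.

```java
class CommandLineSketch {
  final String destAdr;
  final int lowBkMrkLimit;
  final int numToProc;
  final String browser;
  final boolean sendMsgs;

  CommandLineSketch(String[] args) {
    if (args.length != 10) {
      throw new IllegalArgumentException(
          "Expected 10 parameters, got " + args.length);
    }
    destAdr = args[0];
    // Integer.parseInt throws NumberFormatException for non-numeric text.
    lowBkMrkLimit = Integer.parseInt(args[6]);
    numToProc = Integer.parseInt(args[7]);
    browser = args[8].toUpperCase();
    // Boolean.parseBoolean yields true only for "true" (ignoring case).
    sendMsgs = Boolean.parseBoolean(args[9]);
  }
}
```

Note that Boolean.parseBoolean never throws.  Any text other than "true" quietly becomes false, which here has the safe effect of displaying statistics without sending any messages.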

Program testing

This program was tested using J2SE 5.0 under WinXP.  J2SE 5.0 or later is required due to the use of generics.

The Bookmarks02 class

The class definition and the main method begin in Listing 1.  Much of the code in Listing 1 was deleted for brevity.  However, all of the code that was deleted from Listing 1 was explained in Part 1 of this lesson.

Listing 1 picks up at the point in the program code that is significant relative to the explanation of the following two methods in this part of the lesson.

  • getFireFoxBookmarks
  • getIEBookmarks

class Bookmarks02{

  public static void main(String[] args){

    //Code deleted for brevity

    if(browser.toUpperCase().equals("F")){
      //Process FireFox bookmarks.
      thisObj.copyBkMrkFile(bkMrkPath,bkMrkFile,
                                dataOutPath,tempBkMrkFile);
      theBookmarks = thisObj.getFireFoxBookmarks(
                                dataOutPath,tempBkMrkFile);
    }else if(browser.toUpperCase().equals("N")){
      //Process Netscape Navigator bookmarks.  Same format
      // as FireFox
      thisObj.copyBkMrkFile(bkMrkPath,bkMrkFile,
                                dataOutPath,tempBkMrkFile);
      theBookmarks = thisObj.getFireFoxBookmarks(
                                dataOutPath,tempBkMrkFile);
    }else if(browser.toUpperCase().equals("I")){
      //Process Internet Explorer favorites.
      theBookmarks = thisObj.getIEBookmarks(
                                   bkMrkPath,theBookmarks);
    }else{
      System.out.println("Don't recognize browser");
      System.out.println("Terminating program");
      System.exit(0);
    }//end else

Listing 1

Decide among Firefox, Netscape, and IE

The code in Listing 1 uses one of the command-line parameters to decide whether to invoke the getFireFoxBookmarks method or the getIEBookmarks method to extract the bookmark information from the browser file(s) and to save the results in an ArrayList object.

Firefox and Netscape bookmark formats are the same

As near as I can determine, the bookmark file formats for Firefox and Netscape are the same.  Therefore, the code in Listing 1 invokes the same method regardless of whether the user specifies Firefox or Netscape Navigator in the command-line parameter.

(If I learn later that the Netscape format is different from the Firefox format, I will write a new method to accommodate the Netscape format and reference the new method in Listing 1.  This is also the place where you would need to modify the program if you were to develop a method to extract bookmark information from some browser other than the three supported by this program as written.)

The code in Listing 1 causes the bookmark information to be extracted from the browser file(s) and stored as objects of type Bookmark in the ArrayList object that was instantiated in Listing 6.  That ArrayList object is referred to by theBookmarks.

Very little browser-specific code

When these methods return, the bookmark information is ready to be processed, independently of the specific browser involved.  The only code in this program that is browser-specific is the code in Listing 1 and the two methods named getFireFoxBookmarks and getIEBookmarks.  These two methods extract bookmark information from the browser file(s).

At this point, I will set the main method aside and explain these two methods in detail.  I will return to my explanation of the main method later.

The getFireFoxBookmarks method

The getFireFoxBookmarks method begins in Listing 2.  The purpose of this method is to extract all of the bookmarks from a Firefox or Netscape bookmark file and to encapsulate them in an ArrayList object.  Each element in the ArrayList object is an object of the inner class named Bookmark.  Each Bookmark object contains the name and the URL for one bookmark.

The Bookmark class

The Bookmark class is a very simple class whose sole purpose is to encapsulate two String objects:

  • The name of the bookmark
  • The URL for the bookmark

You can view the definition of the Bookmark class in Listing 17.
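
Listing 17 is not reproduced in this part of the text, so the following is only a guess at what such a class might look like.  The field and accessor names are assumptions, the sketch is a top-level class rather than an inner class, and the constructor argument order simply mirrors the call new Bookmark(theName,theUrl) that appears later in this lesson.

```java
class BookmarkSketch {
  private final String name;
  private final String url;

  // Argument order follows new Bookmark(theName,theUrl).
  BookmarkSketch(String name, String url) {
    this.name = name;
    this.url = url;
  }

  String getName() {
    return name;
  }

  String getUrl() {
    return url;
  }
}
```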

Some background information on bookmarks

Before getting into the details of the getFireFoxBookmarks method, I need to provide you with some background information regarding the Firefox, Netscape, and IE approaches to bookmarks.

Entirely different approaches

Firefox (along with Netscape) and IE use entirely different approaches to the creation and maintenance of the bookmark library.  Therefore, the ability to handle these two different approaches is written into two completely different methods in this program.

A well-behaved HTML file

The Firefox approach is to encapsulate the bookmark information into a well-behaved HTML file.  This file can have quite a lot of structure, particularly if the user has created a complex system of folders.  In the final analysis, however, all of the information of interest for each bookmark is contained in an HTML element named A.  The URL for the bookmark is defined by the value of an attribute named HREF.  The name of the bookmark is provided as the content of the element.

(If you are unfamiliar with this terminology, see my previously published tutorials in the section entitled XML for Beginners.)

A sample element

Figure 1 shows an example of such an element.  Note, however, that all of this material appears on a single line of text in the Firefox HTML file.  It was necessary for me to manually insert several line breaks in Figure 1 to cause the material to fit in this narrow publication format.


<A HREF="http://www.austinjug.org/" 
ADD_DATE="989249911" 
LAST_VISIT="989249897" 
LAST_MODIFIED="989249897" 
ID="rdf:#$FOszP3">
Austin Java Users Group
</A>

Figure 1

The name and the URL

The information of interest in Figure 1 is the URL (the value of the HREF attribute), which appears on the first line, and the content of the element (the name of the bookmark), which appears near the end.

Other attributes

The element in Figure 1 shows four other attributes that this program ignores:

  • ADD_DATE
  • LAST_VISIT
  • LAST_MODIFIED
  • ID

You may decide that you want to modify the program to also include this information in your portable bookmark library, and it wouldn't be difficult to do so.

In addition, some of the elements have attributes not shown in Figure 1 (such as an ICON attribute), which are also ignored by this program.
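
As a rough sketch of how one of these ignored attributes could be captured, the following hypothetical helper looks up an attribute by name using String.indexOf.  It is not part of the program, and it assumes that every attribute is preceded by a space and has its value enclosed in double quotes, as in Figure 1.

```java
class AttributeSketch {
  // Return the value of the named attribute, or null if it is absent.
  static String getAttribute(String line, String attrName) {
    // The leading space keeps ID from matching inside a longer name.
    String key = " " + attrName + "=\"";
    int keyIndex = line.indexOf(key);
    if (keyIndex == -1) {
      return null;
    }
    int start = keyIndex + key.length();
    int end = line.indexOf("\"", start);
    if (end == -1) {
      return null;
    }
    return line.substring(start, end);
  }

  public static void main(String[] args) {
    String line = "<A HREF=\"http://www.austinjug.org/\" "
        + "ADD_DATE=\"989249911\" LAST_VISIT=\"989249897\" "
        + "LAST_MODIFIED=\"989249897\" ID=\"rdf:#$FOszP3\">"
        + "Austin Java Users Group</A>";
    System.out.println(getAttribute(line, "ADD_DATE"));
    System.out.println(getAttribute(line, "ICON"));
  }
}
```

For the sample line, the first call prints 989249911 and the second prints null because Figure 1 has no ICON attribute.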

Not difficult to parse

Fortunately, it is not difficult to parse this HTML file to extract the URL and the name for each bookmark.  The code to accomplish that begins in Listing 2, which shows the beginning of the getFireFoxBookmarks method.

  ArrayList <Bookmark> getFireFoxBookmarks(
                  String dataOutPath,String tempBkMrkFile){
    int urlIndex = 0;
    int startIndex = 0;
    int endIndex = 0;
    ArrayList <Bookmark> theBookmarks = 
                                new ArrayList <Bookmark>();

Listing 2

The code in Listing 2 declares and initializes some local working variables.  It also instantiates the ArrayList object that will be populated with bookmark data and returned when the method terminates.

Read each line of text

Listing 3 shows the beginning of a while loop that reads each line of text from the bookmark file.

    try{
      BufferedReader bufRdr = new BufferedReader(
          new InputStreamReader(new FileInputStream(
                            dataOutPath + tempBkMrkFile)));
      String theName = null;
      String theUrl = null;
      String data = null;

      while((data = bufRdr.readLine()) != null){

Listing 3

Does the line contain a URL?

Each line of text will be tested to determine if it contains an HREF attribute.  If so, the program will conclude that the line of text contains one (and only one) element named A.  The code in the loop will extract the URL from the value of the HREF attribute, and will extract the name of the bookmark from the content of the element named A that is contained in that line of text.

(The fact that one, and only one, entire element named A is contained in a single line of text greatly simplifies the code required to parse the file and to extract the name and URL of each bookmark.)
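
The line-at-a-time test can be tried without a real bookmark file by reading from a string.  The sketch below does that and simply counts the matching lines.  The class name, the method name, and the simplified sample file contents are assumptions for illustration only.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class LineScanSketch {
  // Count the lines that contain the start of an A element with an HREF.
  static int countBookmarkLines(Reader source) throws IOException {
    int count = 0;
    BufferedReader bufRdr = new BufferedReader(source);
    try {
      String data;
      while ((data = bufRdr.readLine()) != null) {
        if (data.indexOf("A HREF=\"") != -1) {
          count++;
        }
      }
    } finally {
      bufRdr.close();
    }
    return count;
  }

  public static void main(String[] args) throws IOException {
    // A made-up, heavily simplified bookmark file with one bookmark.
    String sample = "<DL><p>\n"
        + "<DT><H3>Java</H3>\n"
        + "<DT><A HREF=\"http://www.austinjug.org/\">"
        + "Austin Java Users Group</A>\n"
        + "</DL><p>\n";
    System.out.println(countBookmarkLines(new StringReader(sample)));
  }
}
```

Only the third line of the sample contains the substring, so the folder heading and the list tags are skipped.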

Test for existence of HREF attribute

Listing 4 invokes the indexOf method on the String object that describes the line of text to determine if it contains an attribute named HREF.  This method attempts to find a numeric index that defines the location of the following substring:

A HREF=".

        urlIndex = data.indexOf("A HREF=\"");

Listing 4

The index value that is returned by the indexOf method is stored in the variable named urlIndex.

The value of urlIndex will be -1 if the line doesn't contain the specified substring.  In that case, the program will simply ignore that line of text and read the next line.  If the value of the index is not -1, that value will be used later to extract the URL.

Get the URL for the bookmark

The code in Listing 5 extracts and saves the URL for the case where the value of the index is not -1.

        if(urlIndex != -1){
          //Find the index of the quotation marks at the
          // beginning and the end of the URL.
          startIndex = urlIndex+8;//Index of first quote+1
          //Index of quotation mark at the end of the URL.
          endIndex = data.indexOf("\"",startIndex);
          //Extract and save the URL
          theUrl = data.substring(startIndex,endIndex);

Listing 5

The code in Listing 5 (as explained by the comments) is straightforward and shouldn't require further explanation.

Get the name of the bookmark

The code in Listing 6 gets and saves the name of the bookmark, which is the content of the element named A.  You may want to refer back to Figure 1 to gain a better understanding of exactly how this code works.

          // Get the index of the beginning of the content.
          startIndex = data.indexOf(">",urlIndex) +1;
          //Get the index of the end of the content.
          endIndex = data.indexOf("</A>",startIndex);
          //Get and save the content
          if(endIndex > startIndex){
            //The A element is not empty.
            theName = data.substring(startIndex,endIndex);
          }else{
            //The A element is empty
            theName = "No bookmark name found.";
          }//end else

Listing 6

Once again, the code in Listing 6 (as explained by the comments) is straightforward and shouldn't require further explanation.  Note, however, that in the unlikely event that there is no content, an artificial name for the bookmark is generated.
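
To see the index arithmetic at work on one line, the following sketch gathers the same steps into a single hypothetical method that can be run against the element in Figure 1.  The class name, the method name, and the array return type are assumptions.  The guard against a missing closing quotation mark is an addition that does not appear in the listings.

```java
class AnchorParseSketch {
  // Return {name, url} for a line holding one A element, or null if the
  // line has no HREF attribute or the attribute value is unterminated.
  static String[] parseLine(String data) {
    String key = "A HREF=\"";
    int urlIndex = data.indexOf(key);
    if (urlIndex == -1) {
      return null;
    }
    // key.length() is 8, the same offset that is used in Listing 5.
    int startIndex = urlIndex + key.length();
    int endIndex = data.indexOf("\"", startIndex);
    if (endIndex == -1) {
      return null; // added guard, not in the original listings
    }
    String theUrl = data.substring(startIndex, endIndex);

    // The content lies between the end of the start tag and </A>.
    startIndex = data.indexOf(">", urlIndex) + 1;
    endIndex = data.indexOf("</A>", startIndex);
    String theName = (endIndex > startIndex)
        ? data.substring(startIndex, endIndex)
        : "No bookmark name found.";
    return new String[] {theName, theUrl};
  }
}
```

For the element in Figure 1, written as a single line, the method returns the name Austin Java Users Group and the URL http://www.austinjug.org/.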

Process next line of text

The code in Listing 7 adds the bookmark just extracted to the ArrayList object, and then goes back to the top of the while loop that began in Listing 3 to process the next line of text.  As you can see, the name and the URL are encapsulated in a Bookmark object, whose reference is added to the ArrayList object.

          theBookmarks.add(new Bookmark(theName,theUrl));
        }//end if
      }//end while

Listing 7

Return the populated ArrayList object

When all of the lines of text in the bookmark file have been processed, control transfers from the while loop to the code in Listing 8.

The code in Listing 8 terminates the getFireFoxBookmarks method, returning a reference to the ArrayList object in the process.  This returns control to that portion of the main method shown in Listing 1.

      bufRdr.close();
    }catch(Exception e){
      e.printStackTrace();
      System.exit(0);
    }//end catch
    
    return theBookmarks;
  }//end getFireFoxBookmarks

Listing 8

As you will recall from Part 1 of this lesson, the program goes on at that point to invoke the processBkMrks method to process the contents of the ArrayList object.

Processing IE Favorites

As mentioned earlier, the approach that Microsoft uses to create and maintain the IE Favorites library is entirely different from the approach used by Firefox.  The IE Favorites library is simply a directory tree structure rooted in a Windows folder at a location similar to the following:

C:\Documents and Settings\Owner\Favorites

Each bookmark is stored in a separate text file having an extension of URL.

(The Microsoft properties dialog refers to these files as Internet Shortcut files.)

The name and the URL for the bookmark

The name of the bookmark is simply the name of the Internet Shortcut file.

The URL for the bookmark, along with some other information, is stored in the Internet Shortcut file.

Bookmark library structure

Folders in the bookmark library are created by creating ordinary Windows folders as children, grandchildren, etc., of the folder named Favorites.
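
Given that layout, finding every bookmark amounts to walking a directory tree.  The sketch below shows one way that could be done with java.io.File.  It is not the getIEBookmarks method, and the class and method names are hypothetical.  Stripping the .url extension to form the bookmark name is also an assumption.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

class FavoritesWalkSketch {
  // Collect every file ending in .url under root, including subfolders.
  static List<File> findShortcutFiles(File root) {
    List<File> found = new ArrayList<File>();
    File[] children = root.listFiles();
    if (children == null) {
      return found; // root is not a readable directory
    }
    for (File child : children) {
      if (child.isDirectory()) {
        found.addAll(findShortcutFiles(child));
      } else if (child.getName().toLowerCase().endsWith(".url")) {
        found.add(child);
      }
    }
    return found;
  }

  // Assumes the file name ends with the four characters ".url".
  static String bookmarkName(File shortcut) {
    String fileName = shortcut.getName();
    return fileName.substring(0, fileName.length() - 4);
  }
}
```

The null check matters because File.listFiles returns null, rather than an empty array, when the path is not a directory or cannot be read.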

A typical Internet Shortcut file

Figure 2 shows the contents of a typical Internet Shortcut file.  Note, however, that it was necessary for me to manually insert several line breaks to cause this material to fit in this narrow publication format.

[DEFAULT]
BASEURL=http://securityresponse.symantec.com/avcenter/
download/pages/US-NAVCE.html
[InternetShortcut]
URL=http://securityresponse.symantec.com/avcenter/
download/pages/US-NAVCE.html
Modified=90C11AFFF005C40124
IconFile=http://securityresponse.symantec.com/favicon.ico
IconIndex=1

Figure 2

The BASEURL item

I'm unsure as to the purpose of the item at the beginning of the file identified as BASEURL=http...  This item isn't contained in all Internet Shortcut files.  When it is contained in the file, it is often a duplicate of the item that begins with URL=http...

The URL item

The item that begins with URL=http... does seem to be contained in all Internet Shortcut files.  Rightly or wrongly, while writing this program, I assumed that this is the bookmark URL that needs to be extracted.

The getTheUrl method

Since one of the obvious tasks of this program is to extract the URL from a Microsoft Internet Shortcut file, I will begin the discussion with a helper method named getTheUrl, which does exactly that.  As you will see later, this method is called by the getIEBookmarks method.

This method is shown in its entirety in Listing 9.

  String getTheUrl(String pathAndFile){
    try{
      BufferedReader inData = new BufferedReader(
                              new FileReader(pathAndFile));
      String data; //temp holding area

      while((data = inData.readLine()) != null){
        if(data.startsWith("URL=")){
          String theUrl = data.substring(4);
          inData.close();//Close input file
          return theUrl;
        }//end if
      }//end while loop
      inData.close();//Close input file
    }catch(Exception e){
      e.printStackTrace();
    }//end catch
    System.out.println("No URL Found");
    return "No URL Found";
  }//end getTheUrl

Listing 9

This method is straightforward.  Given what you have been told, and given the sample in Figure 2 to look at, no further explanation should be required. 

However, it is worth mentioning that the entire URL is contained in a single line of text in the Internet Shortcut file, whereas it was necessary for me to break the URL into two lines for display purposes in Figure 2.

For the unexpected case where the Internet Shortcut file doesn't contain a line that matches the template used in Listing 9, the code in Listing 9 returns a String containing an artificial URL.
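
The same matching logic can be exercised without touching the disk by reading from a string.  The following variant is a sketch for experimentation rather than a replacement for Listing 9.  The class name, the method name, and the Reader parameter are assumptions, and the sample text is an abbreviated form of Figure 2.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class ShortcutUrlSketch {
  // Return the text after "URL=" on the first line that starts with it,
  // or the artificial value "No URL Found" if there is no such line.
  static String readUrl(Reader source) throws IOException {
    BufferedReader inData = new BufferedReader(source);
    try {
      String data;
      while ((data = inData.readLine()) != null) {
        if (data.startsWith("URL=")) {
          return data.substring(4);
        }
      }
      return "No URL Found";
    } finally {
      inData.close(); // runs on every exit path
    }
  }

  public static void main(String[] args) throws IOException {
    String url = "http://securityresponse.symantec.com/avcenter/"
        + "download/pages/US-NAVCE.html";
    String shortcut = "[DEFAULT]\n"
        + "BASEURL=" + url + "\n"
        + "[InternetShortcut]\n"
        + "URL=" + url + "\n"
        + "IconIndex=1\n";
    System.out.println(readUrl(new StringReader(shortcut)).equals(url));
  }
}
```

The use of startsWith is what keeps the BASEURL line from matching.  A test based on indexOf("URL=") would find the substring inside BASEURL= and return the wrong portion of that line.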

The getIEBookmarks method

The code in Listing 1 invokes the getIEBookmarks method to extract the names and the URLs from IE Favorites.  The getIEBookmarks method begins in Listing 10.

The code in Listing 10 declares and initializes some local working variables.

  ArrayList <Bookmark> getIEBookmarks(
       String bkMrkPath,ArrayList <Bookmark> theBookmarks){
   
    String theName = null;
    String theUrl = null;
    String fileName = null;
    String pathAndFile = null;

Listing 10




