
Using Java to Clean Up Your Bookmark Library

  • March 21, 2006
  • By Richard G. Baldwin

Discussion and Sample Code

The program named Bookmarks10

I will explain this program in fragments.  You can view a complete listing of the program in Listing 20 near the end of the lesson.

The class definition begins in Listing 1.  The code in Listing 1 simply declares several variables used to produce the output files.

class Bookmarks10{
  //Output text file streams
  DataOutputStream file000;
  DataOutputStream file100;
  DataOutputStream file200;
  DataOutputStream file300;
  DataOutputStream file400;
  DataOutputStream file500;
  DataOutputStream file600;

Listing 1

The main method

The main method begins in Listing 2.

  public static void main(String[] args){
    //Confirm correct number of command-line parameters.
    // If the number is not correct, display a usage msg
    // and terminate the program.
    if(args.length != 5){
      System.out.println("Command-line parameter error");
      System.out.println();
      System.out.println("Usage: java Bookmarks10");
      System.out.println("followed by:");
      System.out.println("Bookmark path");
      System.out.println("Bookmark file");
      System.out.println("Low bookmark limit");
      System.out.println("Number bookmarks to process");
      System.out.println("Browser, F, N, or I");
      
      System.out.println();
      System.out.println("Terminating Program");
      System.exit(0);      
    }//end if
    
    //The following values are provided as command-line
    // parameters.

    //Path to the folder containing a Firefox bookmark
    // file or containing a multitude of IE .url files.
    String bkMrkPath = args[0];
    //Name of the Firefox bookmark file.  Just use a 
    // dummy name for this parameter when processing IE
    // favorites
    String bkMrkFile = args[1];
    //Index of first bookmark to process.
    int lowBkMrkLimit = Integer.parseInt(args[2]);
    //Number of bookmarks to process.
    int numToProc = Integer.parseInt(args[3]);
    //Type of browser: F for Firefox, N for Navigator,
    // or I for Internet Explorer.
    String browser = args[4];
    //End of command-line parameters

Listing 2

The code in Listing 2 simply deals with the required command-line parameters and shouldn't require further explanation.
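One detail worth noting is that Listing 2 passes args[2] and args[3] directly to Integer.parseInt, which throws a NumberFormatException if either value is not a valid integer.  The following is a minimal sketch of one way to report such input more clearly.  The class name ArgSketch and the method parseNonNegative are hypothetical and are not part of Bookmarks10.

```java
// Hypothetical helper, not part of Bookmarks10: parses a non-negative
// integer command-line argument and reports a readable error otherwise.
class ArgSketch {
  static int parseNonNegative(String name, String value) {
    int n;
    try {
      n = Integer.parseInt(value);
    } catch (NumberFormatException e) {
      throw new IllegalArgumentException(
          name + " must be an integer: " + value);
    }
    if (n < 0) {
      throw new IllegalArgumentException(
          name + " must not be negative: " + value);
    }
    return n;
  }

  public static void main(String[] args) {
    // Example values only; a real run would pass args[2] and args[3].
    int low = parseNonNegative("Low bookmark limit", "2869");
    int count = parseNonNegative("Number bookmarks to process", "5050");
    System.out.println(low + " " + count);
  }
}
```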

Instantiate an object of this class

The code in Listing 3 instantiates an object of this class and stores its reference in a reference variable named thisObj.

    Bookmarks10 thisObj = new Bookmarks10();

Listing 3

The reference variable named thisObj will be used later to invoke instance methods belonging to the object.

Get name and URL for each bookmark

The code in Listing 4 gets the name and the URL for each of the bookmarks and encapsulates them in an object of type Bookmark.  All of the Bookmark objects are encapsulated in an object of type ArrayList.

    //The following collection encapsulates all of the
    // bookmarks awaiting final processing.  The
    // getIEBookmarks method requires that a method
    // parameter points to the ArrayList object on input
    // because of its recursive nature.  The
    // getFireFoxBookmarks method is not recursive and it
    // overwrites this object with a new ArrayList object
    // that it creates.
    ArrayList <Bookmark> theBookmarks = 
                                new ArrayList <Bookmark>();
    if(browser.toUpperCase().equals("F")){
      //Process Firefox bookmarks.
      theBookmarks = thisObj.getFireFoxBookmarks(
                                      bkMrkPath,bkMrkFile);
    }else if(browser.toUpperCase().equals("N")){
      //Process Netscape Navigator bookmarks.  Same format
      // as Firefox
      theBookmarks = thisObj.getFireFoxBookmarks(
                                      bkMrkPath,bkMrkFile);
    }else if(browser.toUpperCase().equals("I")){
      //Process Internet Explorer favorites.
      theBookmarks = thisObj.getIEBookmarks(
                                   bkMrkPath,theBookmarks);
    }else{
      System.out.println("Don't recognize browser");
      System.out.println("Terminating program");
      System.exit(0);
    }//end else

Listing 4

Code was explained in an earlier lesson

The code in Listing 4, along with the methods named getFireFoxBookmarks and getIEBookmarks, is very similar to code that I explained in the earlier lesson entitled Creating a Portable Bookmark Library using Java, Part 2.  Therefore, I won't explain that code again here.  Rather, I will simply refer you to that earlier lesson.  You can view those methods in Listing 20 near the end of the lesson.

Once the code in Listing 4 has executed, all of the required bookmark information has been encapsulated in an ArrayList object referred to by theBookmarks.

Process the bookmarks

Continuing with the main method, the code in Listing 5 invokes the method named processBkMrks to process all of the bookmarks that have been encapsulated in the ArrayList object.

    thisObj.processBkMrks(lowBkMrkLimit,numToProc,
                                             theBookmarks);
  }// end main

Listing 5

Listing 5 also signals the end of the main method.

The processBkMrks method

The method named processBkMrks begins in Listing 6.  This method processes bookmarks previously stored in an ArrayList object referred to by theBookmarks.

  void processBkMrks(int lowBkMrkLimit,
                     int numToProc,
                     ArrayList <Bookmark> theBookmarks){
    int eligibleCounter = 0;
    String theName = null;
    String theUrl = null;

Listing 6

This method receives a reference to the ArrayList object containing bookmark information along with information identifying the bookmarks to process.  The parameter named lowBkMrkLimit specifies the index of the first bookmark to process.  The parameter named numToProc specifies the number of bookmarks to process.

Listing 6 declares and initializes some local working variables.

Create the output files

The code in Listing 7 creates the seven output files and places one line of explanatory text in each file.

    try{
      file000 = new DataOutputStream(
                          new FileOutputStream("000.txt"));
      file000.writeBytes(
                   "This file contains all headers\n\n");

      file100 = new DataOutputStream(
                          new FileOutputStream("100.txt"));
      file100.writeBytes(
        "This file contains all 100-series headers\n\n");

      file200 = new DataOutputStream(
                          new FileOutputStream("200.txt"));
      file200.writeBytes(
        "This file contains all 200-series headers\n\n");

      file300 = new DataOutputStream(
                          new FileOutputStream("300.txt"));
      file300.writeBytes(
        "This file contains all 300-series headers\n\n");

      file400 = new DataOutputStream(
                          new FileOutputStream("400.txt"));
      file400.writeBytes(
        "This file contains all 400-series headers\n\n");

      file500 = new DataOutputStream(
                          new FileOutputStream("500.txt"));
      file500.writeBytes(
        "This file contains all 500-series headers\n\n");

      file600 = new DataOutputStream(
                          new FileOutputStream("600.txt"));
      file600.writeBytes(
              "This file contains exception output\n\n");
    }catch(IOException e){
      try{
        file600.writeBytes(e + "\n\n");
      }catch(Exception ex){
        ex.printStackTrace();
      }//end catch
      e.printStackTrace();
      System.exit(0);
    }//end catch

Listing 7

The code in Listing 7 is straightforward and shouldn't require further explanation.
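Because the five series files differ only in their series number, the same pattern could also be written as a loop.  The following is a minimal sketch of that idea, not the approach used in Bookmarks10.  The class name OutputFilesSketch and the helper named open are hypothetical.

```java
import java.io.DataOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical sketch, not part of Bookmarks10.
class OutputFilesSketch {
  // Opens one output file and writes its heading followed by a blank line.
  static DataOutputStream open(String fileName, String heading)
      throws IOException {
    DataOutputStream out =
        new DataOutputStream(new FileOutputStream(fileName));
    out.writeBytes(heading + "\n\n");
    return out;
  }

  public static void main(String[] args) throws IOException {
    String[] series = {"100", "200", "300", "400", "500"};
    DataOutputStream[] files = new DataOutputStream[series.length];
    // Creates 100.txt through 500.txt in the current directory.
    for (int i = 0; i < series.length; i++) {
      files[i] = open(series[i] + ".txt",
          "This file contains all " + series[i] + "-series headers");
    }
    for (DataOutputStream f : files) {
      f.close();
    }
  }
}
```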

Iterate on the ArrayList object

Listing 8 shows the beginning of a for loop that is used to iterate on the ArrayList object and to examine each bookmark encapsulated in the object.

    for(int msgCntr = 0;msgCntr < theBookmarks.size();
                                                msgCntr++){
      theName = theBookmarks.get(msgCntr).bkMrkName;
      theUrl = theBookmarks.get(msgCntr).bkMrkUrl;

Listing 8

The code in Listing 8 extracts and saves the name and the URL for each bookmark that it examines.

Determine eligibility

Listing 9 shows the beginning of an if statement that determines the eligibility of the current bookmark for processing based on the specified range of bookmark indices and the protocol.

      if((msgCntr >= lowBkMrkLimit) && 
                    (msgCntr < lowBkMrkLimit + numToProc)){
        //Strip off the protocol for the HTTP protocol only
        if(theUrl.substring(0,7).toUpperCase().
                                        equals("HTTP://")){
          theUrl = theUrl.substring(7);
          //This bookmark is eligible for processing.
          eligibleCounter++;
          //Display progress on standard output
          System.out.println("\n" + msgCntr + " " 
                                 + theName + " " + theUrl);
                                 
          //Try to connect to the server to retrieve the
          // response headers.
          tryToConnect(msgCntr,theName,theUrl);

Listing 9

In order to be eligible for processing, the bookmark must specify the HTTP protocol and the index of the bookmark must fall within the range specified by the user.

If the bookmark is determined to be eligible, the URL for the bookmark, along with some other information, is passed to a method named tryToConnect.  This method, which I will explain later, contains the code that attempts to connect to the server specified by the URL and to retrieve the response header for the specified resource.
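Note that the expression theUrl.substring(0,7) in Listing 9 throws a StringIndexOutOfBoundsException for any URL shorter than seven characters.  The following is a minimal sketch of the same two eligibility tests written so that short URLs are simply treated as ineligible.  The class name EligibilitySketch and its method names are hypothetical and do not appear in Bookmarks10.

```java
// Hypothetical sketch of the eligibility tests from Listing 9.
class EligibilitySketch {
  // True if index lies in the range [low, low + count).
  static boolean inRange(int index, int low, int count) {
    return index >= low && index < low + count;
  }

  // True if the URL begins with "http://", ignoring case.
  // regionMatches returns false, rather than throwing, when
  // the URL is shorter than seven characters.
  static boolean isHttp(String url) {
    return url.regionMatches(true, 0, "http://", 0, 7);
  }

  // Removes the leading "http://" from an eligible URL.
  static String stripProtocol(String url) {
    return isHttp(url) ? url.substring(7) : url;
  }

  public static void main(String[] args) {
    String url = "HTTP://www.example.com/index.html";
    if (inRange(3, 0, 10) && isHttp(url)) {
      System.out.println(stripProtocol(url));
    }
  }
}
```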

If protocol is not HTTP

Continuing for the moment with the method named processBkMrks, the code in the else clause in Listing 10 deals with those bookmarks for which the index is in the specified range, but for which the protocol is not HTTP.

        }else{
          //This protocol can't be handled by this program.
          // Document that fact in the file named 000.txt.
          try{
            file000.writeBytes(msgCntr + " " + 
                         "Can't handle this protocol.\n");
            file000.writeBytes(
                       theName + "   " + theUrl +"\n\n");
          }catch(IOException e){
            try{
              file600.writeBytes(e + "\n\n");
            }catch(Exception ex){
              ex.printStackTrace();
            }//end catch
            e.printStackTrace();
            System.exit(0);
          }//end catch
        }//end else regarding protocol
      }//end if regarding the bookmark indices
    }//end for loop iterating on the ArrayList object

Listing 10

The code in the else clause in Listing 10 writes a notification into the file named 000.txt to the effect that the protocol is not eligible for processing.

Listing 10 also contains a catch block and the closing braces for several blocks, including the for loop that began in Listing 8 and iterates on the bookmarks encapsulated in the ArrayList object.

Store summary information

Listing 11 stores summary information about the run at the end of the file named 000.txt and closes all output text files.

    try{
      
      file000.writeBytes("Number eligible bookmarks = " 
                                + eligibleCounter + "\n");
      file000.writeBytes("Bookmark range = " 
            + lowBkMrkLimit
           + " to " + (lowBkMrkLimit + numToProc) + "\n");
      file000.writeBytes("Total number bookmarks = " 
                            + theBookmarks.size() + "\n");
      file000.close();
      file100.close();
      file200.close();
      file300.close();
      file400.close();
      file500.close();
      file600.close();
    }catch(IOException e){
      try{
        file600.writeBytes(e + "\n\n");
      }catch(Exception ex){
        ex.printStackTrace();
      }//end catch
      e.printStackTrace();
      System.exit(0);
    }//end catch
  }//end processBkMrks

Listing 11

Listing 11 also signals the end of the method named processBkMrks.

Sample summary information

Figure 15 shows a sample of the summary information that resulted from running the program on my Firefox bookmark library.

Number eligible bookmarks = 1203
Bookmark range = 2869 to 7919
Total number bookmarks = 4088
Figure 15

As you can see, there were a little over 1200 available bookmarks between the specified beginning index and the end of the library at an index of 4087.  Of this total, 1203 were deemed to be eligible for processing.  Presumably the remaining bookmarks specified the wrong protocol.
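The arithmetic behind these numbers can be sketched as follows.  The upper limit of 7919 reported in Figure 15 is the sum of the low limit and the number of bookmarks to process, so it implies a count of 5050.  The class name SummarySketch and the method named availableInRange are hypothetical and are not part of Bookmarks10.

```java
// Hypothetical sketch of the arithmetic behind Figure 15.
class SummarySketch {
  // Number of bookmarks whose index lies in [low, low + count)
  // for a library containing total bookmarks.
  static int availableInRange(int low, int count, int total) {
    int upper = Math.min(low + count, total); // exclusive upper index
    return Math.max(0, upper - low);
  }

  public static void main(String[] args) {
    int available = availableInRange(2869, 5050, 4088);
    System.out.println(available);        // prints 1219
    System.out.println(available - 1203); // prints 16, the ineligible count
  }
}
```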

The method named tryToConnect

Listing 12 shows the beginning of the method named tryToConnect, which is invoked on all eligible bookmarks in Listing 9.

The purpose of this method is to try to connect to the server specified by a given URL and to download the response header lines for the specified resource.

  void tryToConnect(int cnt, String theName,String URL){
    String server = "";
    String theFile = "";

    //Handle cases with a file specified or with no file
    // specified but a trailing slash on the URL.
    if(URL.indexOf("/") != -1){
      server = URL.substring(0,URL.indexOf("/"));
      theFile = URL.substring(URL.indexOf("/"));
    }else
      //Handle the case of no slash and no file specified.
      if(URL.indexOf("/") == -1){
        server = URL;
        theFile = "/";
    }//end if

Listing 12

After declaring and initializing a couple of local working variables, the code in Listing 12 gets values for the server and the resource that is requested by the bookmark.

Different URL formats

The code in Listing 12 deals with the fact that URLs can come in different formats.  For example, some URLs specify a resource and some do not.  In the latter case, the expectation is that the server will deliver a default resource, such as a file named index.html.  In this case, the resource needs to be specified as a single forward-slash character when the HEAD request is sent to the server.
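The three cases handled by Listing 12 can be illustrated with the following minimal sketch, which mirrors that logic in a standalone method.  The class name UrlSplitSketch and the method named split are hypothetical.  Like Listing 12, the sketch assumes that the leading http:// has already been removed from the URL.

```java
// Hypothetical sketch mirroring the logic of Listing 12.
class UrlSplitSketch {
  // Returns {server, resource} for a URL with no protocol prefix.
  static String[] split(String url) {
    int slash = url.indexOf('/');
    if (slash == -1) {
      // No slash and no file specified: request the default resource.
      return new String[] {url, "/"};
    }
    // File specified, or no file but a trailing slash.
    return new String[] {url.substring(0, slash), url.substring(slash)};
  }

  public static void main(String[] args) {
    String[] samples = {
      "www.example.com",
      "www.example.com/",
      "www.example.com/docs/page.html"
    };
    for (String s : samples) {
      String[] parts = split(s);
      System.out.println(parts[0] + " " + parts[1]);
    }
  }
}
```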

Get a Socket connection to the server

The code in Listing 13 tries to get a socket connection to the server on port 80, the standard HTTP port.

    int port = 80; //http port
    try{
      Socket socket = new Socket(server,port);//get socket

      //Get input and output streams from the socket      
      BufferedReader inputStream = 
                  new BufferedReader(new InputStreamReader(
                                 socket.getInputStream()));
      PrintWriter outputStream = 
                    new PrintWriter(new OutputStreamWriter(
                           socket.getOutputStream()),true);

Listing 13

If the connection is achieved, Listing 13 gets input and output streams on the socket by which the program can send a request to the server and read the response provided by the server.

If the attempt to get the socket connection fails, the code in a catch block shown later in Listing 19 will be executed to cause that failure to be noted in the output file named 600.txt.

Request the response headers

The code in Listing 14 sends a HEAD request to the server asking it to send back the response header lines pertaining to the resource specified by theFile using the HTTP 1.1 protocol.

      outputStream.println(
                          "HEAD " + theFile + " HTTP/1.1");
      outputStream.println("Host: " + server);
      //May need to modify the following for non-Windows
      // systems, (see Wikipedia reference) to cause hard
      // line breaks consisting of both a carriage return
      // and a line feed to be sent to the server.
      outputStream.println();
      outputStream.println();

Listing 14

You can read more about the format requirements of the HTTP 1.1 protocol at Wikipedia.

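(Note the comment in Listing 14 regarding hard line breaks and non-Windows systems.  The println method ends each line with the line separator of the platform on which the program is running, whereas HTTP expects each line to end with a carriage return followed by a line feed.)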
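One way to avoid the dependence on the platform is to build the request text with explicit carriage return and line feed characters instead of relying on println.  The following is a minimal sketch of that idea, not the code used in Bookmarks10.  The class name HeadRequestSketch and the method named build are hypothetical.  The sketch sends a single empty line after the Host header, which is what marks the end of the request header.

```java
// Hypothetical sketch: a HEAD request with explicit CRLF line endings.
class HeadRequestSketch {
  static String build(String server, String theFile) {
    return "HEAD " + theFile + " HTTP/1.1\r\n"
         + "Host: " + server + "\r\n"
         + "\r\n"; // empty line ends the request header
  }

  public static void main(String[] args) {
    // www.example.com is an example server name only.
    System.out.print(build("www.example.com", "/"));
  }
}
```

The resulting string could then be sent with the print method of the PrintWriter object from Listing 13, followed by a call to flush, because print does not trigger the automatic flushing that println does.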

Read and save the first response line header

The code in Listing 15 reads and saves the first line sent back by the server in the response header for the resource.  For the purposes of this program, we don't care about the other lines in the response header, so we don't read them.

      String line = inputStream.readLine();

Listing 15

Save the first line for all bookmarks

The code in Listing 16 writes the index value, the name, and the URL for the bookmark to the file named 000.txt.  The first response header line itself is written to the same file by the code at the beginning of Listing 17.  This information can be useful later for reference purposes.

      file000.writeBytes(cnt + " " + theName + " " + URL 
                                                  + "\n");

Listing 16

Distribute the first line among different output files

The code in Listing 17 distributes another copy of the first response header line among five different output files based on the first character of the status code.  For example, all lines for which the status code begins with 2 go into the file named 200.txt, and all lines for which the status code begins with 4 go into the file named 400.txt.

      if(line.startsWith("HTTP/1.0")){
        file000.writeBytes(
                   "HTTP/1.0 results are not reliable\n");
      }//end if
      file000.writeBytes(line + "\n");
      file000.writeBytes("\n");

      //Save first line of all 100 series headers in the
      // file named 100.txt
      if(line.substring(9,10).equals("1")){
        file100.writeBytes(cnt + " " + theName + " " + URL
                                                  + "\n");
        if(line.startsWith("HTTP/1.0")){
          file100.writeBytes(
                   "HTTP/1.0 results are not reliable\n");
        }//end if
        file100.writeBytes(line + "\n");
        file100.writeBytes("\n");
      }//end if

      //Save first line of all 200 series headers in the
      // file named 200.txt
      if(line.substring(9,10).equals("2")){
        file200.writeBytes(cnt + " " + theName + " " + URL
                                                  + "\n");
        if(line.startsWith("HTTP/1.0")){
          file200.writeBytes(
                   "HTTP/1.0 results are not reliable\n");
        }//end if
        file200.writeBytes(line + "\n");
        file200.writeBytes("\n");
      }//end if

      //Save first line of all 300 series headers in the
      // file named 300.txt
      if(line.substring(9,10).equals("3")){
        file300.writeBytes(cnt + " " + theName + " " + URL
                                                  + "\n");
        if(line.startsWith("HTTP/1.0")){
          file300.writeBytes(
                   "HTTP/1.0 results are not reliable\n");
        }//end if
        file300.writeBytes(line + "\n");
        file300.writeBytes("\n");
      }//end if

      //Save first line of all 400 series headers in the
      // file named 400.txt
      if(line.substring(9,10).equals("4")){
        file400.writeBytes(cnt + " " + theName + " " + URL
                                                  + "\n");
        if(line.startsWith("HTTP/1.0")){
          file400.writeBytes(
                   "HTTP/1.0 results are not reliable\n");
        }//end if
        file400.writeBytes(line + "\n");
        file400.writeBytes("\n");
      }//end if

      //Save first line of all 500 series headers in the
      // file named 500.txt
      if(line.substring(9,10).equals("5")){
        file500.writeBytes(cnt + " " + theName + " " + URL
                                                  + "\n");
        if(line.startsWith("HTTP/1.0")){
          file500.writeBytes(
                   "HTTP/1.0 results are not reliable\n");
        }//end if
        file500.writeBytes(line + "\n");
        file500.writeBytes("\n");
      }//end if

Listing 17
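The five blocks in Listing 17 differ only in the digit being tested, and the expression line.substring(9,10) assumes that the status code always begins at index 9 of the line.  The following is a minimal sketch of an alternative that parses the status code and derives the output file name from its first digit.  The class name StatusLineSketch and its methods are hypothetical.  Mapping a missing or malformed line to 600.txt is an assumption of the sketch, chosen because Bookmarks10 records exceptions in that file.

```java
// Hypothetical sketch: choose an output file from a status line
// such as "HTTP/1.1 404 Not Found".
class StatusLineSketch {
  // Returns the status code, or -1 if the line is missing or malformed.
  static int statusCode(String line) {
    if (line == null) {
      return -1; // readLine returns null at end of stream
    }
    String[] parts = line.trim().split("\\s+");
    if (parts.length < 2 || !parts[0].startsWith("HTTP/")) {
      return -1;
    }
    try {
      int code = Integer.parseInt(parts[1]);
      return (code >= 100 && code <= 599) ? code : -1;
    } catch (NumberFormatException e) {
      return -1;
    }
  }

  // Maps the status line to 100.txt through 500.txt, or to 600.txt
  // (an assumption of this sketch) when no valid code is found.
  static String fileFor(String line) {
    int code = statusCode(line);
    return code == -1 ? "600.txt" : (code / 100) + "00.txt";
  }

  public static void main(String[] args) {
    System.out.println(fileFor("HTTP/1.1 200 OK"));        // prints 200.txt
    System.out.println(fileFor("HTTP/1.1 404 Not Found")); // prints 400.txt
  }
}
```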

Close the connection

The code in Listing 18 closes the Socket connection.

      socket.close();
    }//end try

Listing 18

Listing 18 also signals the end of the try block that began in Listing 13.

Unable to connect

Listing 19 shows the catch block that is associated with the try block that began in Listing 13.

    catch(Exception e){
      try{
        file600.writeBytes(cnt + " " + theName + "\n");
        file600.writeBytes(server + theFile + "\n");
        file600.writeBytes(e + "\n");
        file600.writeBytes("\n");
      }catch(IOException ex){
        ex.printStackTrace();
      }//end catch
    }//end catch
  }//end tryToConnect

}//end class Bookmarks10 definition

Listing 19

The code in Listing 19 is executed if the program is unable to make the connection with the server specified by the bookmark.  In this event, information regarding the problem is recorded in the output file named 600.txt.  Figure 11 shows an example of such output.

Listing 19 also signals the end of the method named tryToConnect and the end of the class named Bookmarks10.




