
Using Java to Clean Up Your Bookmark Library

  • March 21, 2006
  • By Richard G. Baldwin

Java Programming Notes # 2410


Preface

Many of us who have used web browsers for years have accumulated vast bookmark libraries that contain a large number of broken bookmarks.  In my own case, before I embarked on my bookmark cleanup campaign, I had accumulated more than 5,200 bookmarks, many of which had probably been broken for years.

In this lesson, I will show you how to write a program that will identify potentially broken bookmarks so that you can either delete them from your library or repair them.  The program works for Firefox and Netscape bookmark libraries as well as Internet Explorer Favorites libraries.

Viewing tip

You may find it useful to open another copy of this lesson in a separate browser window.  That will make it easier for you to scroll back and forth among the different listings and figures while you are reading about them.

Supplementary material

I recommend that you also study the other lessons in my extensive collection of online Java tutorials.  You will find those lessons published at Gamelan.com.  However, as of the date of this writing, Gamelan doesn't maintain a consolidated index of my Java tutorial lessons, and sometimes they are difficult to locate there.  You will find a consolidated index at www.DickBaldwin.com.

General Background Information

A Firefox bookmark library

Firefox and Netscape use the same technique for creating and maintaining a bookmark library.  In particular, by default, the bookmarks are stored in a file named bookmarks.html that you will find somewhere on your hard disk in an area that is dedicated to the browser.

(Internet Explorer, on the other hand, uses a completely different approach to creating and maintaining its library of Favorites.  This program is compatible with the approaches used by all three programs.)
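
As a rough illustration of what reading such a file involves, here is a minimal sketch that pulls bookmark names and URLs out of the HTML text.  It assumes that each bookmark is stored as an anchor of the form <A HREF="url" ...>name</A>.  The class name BookmarkAnchors, the regular expression, and the sample data are hypothetical, and this is not necessarily how the program in this lesson reads the file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the program in this lesson.
class BookmarkAnchors {
    // Assumption: each bookmark is an anchor such as <A HREF="url" ...>name</A>.
    private static final Pattern ANCHOR = Pattern.compile(
        "<A\\s+[^>]*?HREF=\"([^\"]*)\"[^>]*>(.*?)</A>",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns {name, url} pairs in the order in which they appear in the text.
    static List<String[]> extract(String html) {
        List<String[]> result = new ArrayList<String[]>();
        Matcher m = ANCHOR.matcher(html);
        while (m.find()) {
            result.add(new String[] { m.group(2).trim(), m.group(1) });
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample lines, used only to exercise the pattern.
        String sample =
            "<DT><A HREF=\"http://www.dickbaldwin.com/\" ADD_DATE=\"1\">Baldwin</A>\n"
          + "<DT><A HREF=\"http://www.mozilla.org/\">Mozilla</A>\n";
        for (String[] b : extract(sample)) {
            System.out.println(b[0] + " " + b[1]);
        }
    }
}
```

Because the matches are returned in file order, the position of each pair in the list can serve as the bookmark's index in the library.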

Path to the Firefox bookmark file

For example, here is the path to the Firefox bookmark file on my computer running under Windows XP:

C:\Documents and Settings\Owner\Application Data\
Mozilla\Firefox\Profiles\athy94h2.default\
bookmarks.html

Note that by default everything in and beyond the folder named Application Data is hidden.  You must select "Show hidden files and folders" under Folder Options in order to be able to see the bookmark file.  Depending on your operating system, your bookmark file may or may not be similarly located on your hard disk.

Also note that the folder named athy94h2.default appears to be a random folder name that is established when you install Firefox.

A browser view of a Firefox bookmark file

Figure 1 shows a cropped rectangular section of the Firefox browser window when it has been loaded with its own bookmark file named bookmarks.html.


Figure 1

The Firefox Bookmarks Manager view

Figure 2 shows a screen shot of the Firefox Bookmarks Manager screen, adjusted so as to view the same set of bookmarks shown in Figure 1.  You should be able to correlate the material in the upper portion of the left pane and all of the material in the right pane in Figure 2 with the material in Figure 1.  As you can see, the general structure of the browser view and the Bookmarks Manager view of the file named bookmarks.html are very similar.  As you will learn later, this is a very fortunate circumstance.


Figure 2

IE Favorites

As mentioned earlier, the approach that Microsoft uses to create and maintain the IE Favorites library is entirely different from the approach used by Firefox and Netscape.  The IE Favorites library is simply a directory tree structure rooted in a Windows folder at a location similar to the following:

C:\Documents and Settings\Owner\Favorites

Each Favorite item (bookmark) is stored in a separate text file having an extension of url.

(The Microsoft properties dialog refers to these files as Internet Shortcut files.)

The name and the URL for the bookmark

The name of the bookmark is the name of the Internet Shortcut file.

The URL for the bookmark along with some other information is stored in the Internet Shortcut file.
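
The following minimal sketch shows one way to recover the name and the URL from an Internet Shortcut file.  It assumes that the file is plain text containing a line that begins with URL= (as in the usual [InternetShortcut] layout) and that the bookmark name is the file name without the url extension.  The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the program in this lesson.
class ShortcutUrl {
    // Returns the value of the first line that starts with "URL=", or null if none.
    static String extractUrl(List<String> lines) {
        for (String line : lines) {
            String t = line.trim();
            if (t.regionMatches(true, 0, "URL=", 0, 4)) {
                return t.substring(4).trim();
            }
        }
        return null;
    }

    // Assumption: the bookmark name is the file name minus the .url extension.
    static String bookmarkName(String fileName) {
        return fileName.toLowerCase().endsWith(".url")
            ? fileName.substring(0, fileName.length() - 4)
            : fileName;
    }

    public static void main(String[] args) {
        // Made-up file content, used only to exercise the methods.
        List<String> lines = Arrays.asList(
            "[InternetShortcut]",
            "URL=http://www.dickbaldwin.com/");
        System.out.println(bookmarkName(".NET Development.url"));
        System.out.println(extractUrl(lines));
    }
}
```

In practice, the lines could be read from disk with java.nio.file.Files.readAllLines before being passed to extractUrl.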

Bookmark library structure

Folders in the IE Favorites library are created by creating ordinary Windows folders as children, grandchildren, etc., of the folder named Favorites.

The Windows Explorer view

I'm going to show you three views of the IE Favorites, which unfortunately bear little resemblance to one another.  Figure 3 shows a screen shot of an ordinary Windows Explorer window in which the files have been sorted according to the Name by clicking the sorting bar at the top.


Figure 3

With the exception of the file named aacmd.bat, each of the files in Figure 3 represents an item in the Favorites library (a bookmark).  There are, in addition, other bookmarks in the folders named Adobe Studio, HP Recommended Sites, Links, and Media.  The order of the files and the folders in the view shown in Figure 3 depends on which of the sorting bars at the top has most recently been clicked.

Connecting to a server via an Internet Shortcut file

Double-clicking one of the Internet Shortcut files shown in Figure 3 will cause the default browser to attempt to connect to the server whose URL is contained in the Internet Shortcut file.

The IE Favorites view

The view shown in Figure 4 is the view taken from inside the IE browser after having clicked the button with the large gold star near the top.


Figure 4

The order is controlled by the user

As you can see, the order of the items in Figure 4 doesn't match the order of the items in Figure 3.  In fact, the user can change the order of the items in Figure 4 by selecting an item and dragging it up or down to a new location.  The user can also change the order of the items by clicking the button labeled Organize and making use of tools that are found there (see Figure 12).  Note, however, that neither of these approaches to rearranging the items in this view has any effect on the order of the actual files in the folder.

This ability to rearrange the items makes the Favorites library more convenient to use but, as you will see later, it also makes it more difficult to clean up the library by deleting or repairing broken links.

The view with the most natural order

The view that shows the Favorites items in the most natural order is the view shown in Figure 5.  This view is the result of opening a command window and executing a DIR command in the Favorites folder.

In Figure 5, you can see the names of the individual files having an extension of url.  These are the Internet Shortcut files.  The names of these files match the names of the Favorites items that appear in the view shown in Figure 4.


Figure 5

Will use this order

The order of the Internet Shortcut files shown in Figure 5 matches the processing order of the program that I will explain later.  The program processes the Favorites directory listing recursively.  Thus in the case shown in Figure 5, the program begins by processing the following three files having the url extension in the order shown:

  • .NET Development.url
  • .NET Framework Home Page.url
  • ACC WebMail Login,baldwin,ACC Email Pwd...

Then the program makes a recursive call and processes all of the files in the directory named Adobe Studio.

Once all the files in the directory named Adobe Studio have been processed (along with the files in its sub-directories, if any), the program returns to the level shown in Figure 5 and processes the file named Antivirus daily download.url.  It will continue processing files in the order shown until it encounters the directory named HP Recommended Sites.  At that point, it makes a recursive call to process the files in that directory and its sub-directories.
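
The recursive processing order described above can be sketched as follows.  This is only an illustration under stated assumptions: the class name FavoritesWalker is hypothetical, and sorting the entries by name without regard to case is my approximation of the order shown in the DIR listing, not a confirmed detail of the program.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper for illustration; not part of the program in this lesson.
class FavoritesWalker {
    // Appends every .url file under dir to out, descending into each folder
    // at the point where that folder is encountered in the sorted listing.
    static void collect(File dir, List<File> out) {
        File[] entries = dir.listFiles();
        if (entries == null) {
            return; // not a directory, or not readable
        }
        // Assumption: name order, ignoring case, approximates the DIR order.
        Arrays.sort(entries, new Comparator<File>() {
            public int compare(File a, File b) {
                return a.getName().compareToIgnoreCase(b.getName());
            }
        });
        for (File entry : entries) {
            if (entry.isDirectory()) {
                collect(entry, out); // recursive call for a sub-folder
            } else if (entry.getName().toLowerCase().endsWith(".url")) {
                out.add(entry);      // skip files such as aacmd.bat
            }
        }
    }

    public static void main(String[] args) {
        List<File> found = new ArrayList<File>();
        collect(new File(args.length > 0 ? args[0] : "."), found);
        for (File f : found) {
            System.out.println(f.getPath());
        }
    }
}
```

The position of each file in the resulting list then serves as the positional index of the corresponding Favorites item.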

IE Favorites can be difficult to locate

What you will see later is that even when you have identified a Favorites item with a broken link, it can be difficult to locate it in the IE Favorites view shown in Figure 4 in order to delete or repair the item.  As near as I have been able to determine, that view does not provide a mechanism by which you can search for a specified item (but perhaps I overlooked that capability).

Let me see the headers ...

Generally, this program operates by attempting to contact the server specified in the URL for each bookmark and asking that server to send back the response header lines for the resource specified by the URL.

(The program requests that the server send only the response header lines and not the entire resource in order to preserve bandwidth and improve speed.)

HTTP requests

According to Wikipedia, whenever an HTTP client contacts an HTTP server, it can send one of the requests shown in Figure 6.

HTTP request methods
  • GET:  By far the most common method, used to request a specified URL.
  • HEAD:  Identical to GET, except that the page content is not returned; just the headers are. Useful for retrieving meta-information.
  • POST:  Similar to GET, except that a message body, typically containing key-value pairs from an HTML form submission, is included in the request.
  • PUT:  Used for uploading files to a specified URI on a web server.
  • DELETE:  Rarely implemented, deletes a resource (i.e. a file).
  • TRACE:  Echoes back the received request, so that a client can see what intermediate servers are adding or changing in the request.
  • OPTIONS:  Returns the HTTP methods that the server supports. This can be used to check the functionality of a web server.
  • CONNECT:  Rarely implemented, for use with a proxy that can change to being an SSL tunnel.

HTTP servers are supposed to implement at least the GET and HEAD methods and, whenever possible, the OPTIONS method as well.

Figure 6

Response header lines

When this program contacts a server, it sends a HEAD request using the HTTP 1.1 protocol, requesting that only the response header lines be returned.

(You can view request and response headers for any URL at http://web-sniffer.net/.)

For example, the entry of HTTP://WWW.DICKBALDWIN.COM/ABC into the web sniffer page shown above produced the output shown in Figure 7.

HTTP/1.1 404 Not Found
Date: Sat, 17 Sep 2005 13:56:02 GMT
Server: Apache
Content-Length: 320
Connection: close
Content-Type: text/html; charset=iso-8859-1
Figure 7
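
A HEAD request of this kind can be sketched with the standard java.net.HttpURLConnection class.  This is a minimal sketch and not necessarily the way the program in this lesson talks to the server.  The class name HeadProbe, the ten-second timeouts, and the decision not to follow redirects are my assumptions.  The sketch also relies on the behavior of the JDK's HTTP handler, which returns the status line as header field 0.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper for illustration; not part of the program in this lesson.
class HeadProbe {
    // Creates a connection object configured for HEAD. Nothing is sent yet.
    static HttpURLConnection openHead(String url) throws IOException {
        HttpURLConnection conn =
            (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("HEAD");          // ask for the headers only
        conn.setInstanceFollowRedirects(false); // report 301/302 rather than follow them
        conn.setConnectTimeout(10000);          // assumed timeout, in milliseconds
        conn.setReadTimeout(10000);
        return conn;
    }

    // Sends the request and returns the first response header line.
    static String statusLine(String url) throws IOException {
        HttpURLConnection conn = openHead(url);
        try {
            conn.getResponseCode();        // connects; throws IOException on failure
            return conn.getHeaderField(0); // the status line with the JDK's handler
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        String url = args.length > 0 ? args[0] : "http://www.dickbaldwin.com/ABC";
        System.out.println(statusLine(url));
    }
}
```

Calling getResponseCode before reading the header matters here, because getHeaderField by itself returns null instead of throwing when the connection attempt fails.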

Ignore all but the status line

This program ignores all but the first response header line, taking the content of that line as an indication of the quality of the bookmark.

According to HTTP Made Really Easy, the initial response line, often called the status line, has three parts separated by spaces:

  • The HTTP version
  • A response status code that gives the result of the request
  • An English reason phrase describing the status code

Typical HTTP 1.1 status lines

Typical HTTP 1.1 status lines from different servers are shown in Figure 8.

HTTP/1.1 200 OK
HTTP/1.1 301 Moved Permanently
HTTP/1.1 302 Moved Temporarily
HTTP/1.1 302 Found
HTTP/1.1 302 Object moved
HTTP/1.1 400 Bad Request
HTTP/1.1 401 Authorization Required
HTTP/1.1 403 Access Forbidden
HTTP/1.1 403 Invalid method
HTTP/1.1 404 Not found
HTTP/1.1 404 Object Not Found
HTTP/1.1 405 Method Not Allowed
HTTP/1.1 405
HTTP/1.1 500 Server Error
HTTP/1.1 500 Internal Server Error
HTTP/1.1 501 Method Not Implemented
HTTP/1.1 501 Method Not Supported
Figure 8

As you can see in Figure 8, the reason phrase for the same response status code varies from one server to another.

The status code

Also according to HTTP Made Really Easy,

  • The status code is meant to be computer-readable; the reason phrase is meant to be human-readable, and may vary.
  • The status code is a three-digit integer, and the first digit identifies the general category of response:
    • 1xx indicates an informational message only
    • 2xx indicates success of some kind
    • 3xx redirects the client to another URL
    • 4xx indicates an error on the client's part
    • 5xx indicates an error on the server's part
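
The three-part structure of the status line can be illustrated with a minimal parsing sketch.  The class name StatusLine and its field names are hypothetical.  The sketch tolerates a missing reason phrase because one of the lines in Figure 8 consists of the version and the code only.

```java
// Hypothetical helper for illustration; not part of the program in this lesson.
class StatusLine {
    final String version; // for example "HTTP/1.1"
    final int code;       // three-digit status code
    final String reason;  // human-readable phrase; may be empty

    private StatusLine(String version, int code, String reason) {
        this.version = version;
        this.code = code;
        this.reason = reason;
    }

    // Splits "HTTP/1.1 404 Not Found" into its three space-separated parts.
    static StatusLine parse(String line) {
        String[] parts = line.trim().split(" ", 3);
        if (parts.length < 2) {
            throw new IllegalArgumentException("Not a status line: " + line);
        }
        int code = Integer.parseInt(parts[1]);
        String reason = parts.length == 3 ? parts[2] : "";
        return new StatusLine(parts[0], code, reason);
    }

    // First digit of the status code: 1 through 5.
    int category() {
        return code / 100;
    }

    public static void main(String[] args) {
        StatusLine s = parse("HTTP/1.1 404 Object Not Found");
        System.out.println(s.version + " | " + s.code + " | " + s.reason
            + " | category " + s.category());
    }
}
```

Limiting the split to three parts keeps a multi-word reason phrase such as Object Not Found in one piece.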

Program output

This program processes a specified bookmark library (Firefox, Netscape, or IE) and produces seven separate reports that indicate the quality of each bookmark in the library.

(For cases where the bookmark library is large, the user is allowed to specify a subset of bookmarks to process based on the positional indices of the bookmarks in the library.)

Six of the seven reports contain the status line plus additional information about the bookmarks; the seventh, 600.txt, contains exception messages instead of status lines.  The reports are written into text files named 000.txt through 600.txt.

Why do we need seven different reports?

The file named 000.txt contains information about every bookmark in the subset of bookmarks being processed.

In addition, the bookmarks are partitioned into five categories based on the first character in the status code.  The files named 100.txt through 500.txt contain information about bookmarks where the first character in the status code matches the first character in the file name.

(For example, only those bookmarks that produced a response status code beginning with the character 4, indicating an error on the client's part, are contained in the file named 400.txt.  Furthermore, those bookmarks are not contained in any other report other than 000.txt, which contains all bookmarks.)
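
The partitioning rule can be sketched as a small function that maps a status line to the names of the reports that should receive it.  The class and method names are hypothetical, and the sketch covers only the status-line reports, not the file for exceptions.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the program in this lesson.
class ReportRouter {
    // Maps, for example, "HTTP/1.1 404 Not found" to "400.txt" using the
    // first character of the status code.
    static String reportFor(String statusLine) {
        String[] parts = statusLine.trim().split(" ");
        if (parts.length < 2 || parts[1].length() != 3) {
            throw new IllegalArgumentException("Not a status line: " + statusLine);
        }
        char first = parts[1].charAt(0);
        if (first < '1' || first > '5') {
            throw new IllegalArgumentException("Unexpected status code: " + parts[1]);
        }
        return first + "00.txt";
    }

    // Every processed bookmark goes to 000.txt and to exactly one category file.
    static List<String> reportsFor(String statusLine) {
        return Arrays.asList("000.txt", reportFor(statusLine));
    }

    public static void main(String[] args) {
        System.out.println(reportsFor("HTTP/1.1 301 Moved Permanently"));
        System.out.println(reportsFor("HTTP/1.1 404 Not found"));
    }
}
```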

A report on exceptions

The file named 600.txt contains information about bookmarks for which the program was unable to successfully communicate with the specified server.  Figure 9 shows some typical examples in this category.


java.net.ConnectException: Connection timed out: connect
java.net.SocketException: Network is unreachable: connect
Figure 9

Most important results for cleanup effort

Referring back to the meaning of the different status codes, it is apparent that the contents of the files named 400.txt, 500.txt, and 600.txt are the most important with regard to the task of identifying and either deleting or repairing broken bookmarks.

Sample output

Figure 10 shows an example of the type of output that is provided for the bookmarks in the files from 000.txt through 500.txt.

57 Portfolio Rules 
www2.austin.cc.tx.us./ftfac/PPportfolio.htm
HTTP/1.1 301 Moved Permanently
Figure 10

The position, name, and URL

The number at the beginning shows the position of the bookmark in the library, beginning with an index value of 0.  This number is followed by the bookmark name on the same line.  Then, the URL for the bookmark follows the bookmark name on the same line separated by a space.

(Note that in Figure 10, it was necessary to move the URL from the first line to the second line to cause the material to fit in this narrow publication format.)

The status line

The last (second) line of output for each bookmark is the status line from the response header.  This is the information that is used to categorize the bookmarks and to place the information for each bookmark in the files with names ranging from 100.txt through 500.txt.

The exception output format

The format of the information in the file named 600.txt is somewhat different from the other six files.  Figure 11 shows a typical entry in this file.

1048 WebTk Download Page
redsonja.sunlabs.com/research/tcl/
java.net.UnknownHostException: redsonja.sunlabs.com
Figure 11

The number at the beginning shows the index of the bookmark in the bookmark library.  This is followed on the same line by the bookmark name.

The second line in Figure 11 shows the URL for the bookmark.

The bookmarks in this file are those for which an exception was thrown when the program attempted to connect to the server and to request a resource.  The third line shows the error message encapsulated in the exception object.  In the case of Figure 11, for example, the exception occurred when the program contacted the Domain Name Server in an attempt to resolve the IP address for the server named redsonja.sunlabs.com.
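
The two report formats can be sketched as a pair of formatting methods.  The class and method names are hypothetical.  I am also assuming that the third line of an exception entry is simply the string form of the exception object, which for the standard exception classes is the class name followed by a colon and the message, as in Figure 11.

```java
import java.net.UnknownHostException;

// Hypothetical helper for illustration; not part of the program in this lesson.
class ReportEntries {
    // Entry for 000.txt through 500.txt: index, name, and URL on the first
    // line, followed by the status line.
    static String statusEntry(int index, String name, String url, String statusLine) {
        return index + " " + name + " " + url + "\n" + statusLine;
    }

    // Entry for 600.txt: index and name, then the URL, then the exception text.
    static String exceptionEntry(int index, String name, String url, Exception e) {
        return index + " " + name + "\n" + url + "\n" + e;
    }

    public static void main(String[] args) {
        System.out.println(statusEntry(57, "Portfolio Rules",
            "www2.austin.cc.tx.us./ftfac/PPportfolio.htm",
            "HTTP/1.1 301 Moved Permanently"));
        System.out.println(exceptionEntry(1048, "WebTk Download Page",
            "redsonja.sunlabs.com/research/tcl/",
            new UnknownHostException("redsonja.sunlabs.com")));
    }
}
```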

Using the information in the reports

Before getting into the program code, I want to make a few comments about how you can use the information contained in the various reports to clean up your bookmark library.

Not a silver bullet

To begin with, this program is not a silver bullet that resolves all of your bookmark library problems when you run it.  Rather, while this program is extremely useful in helping you to identify broken bookmarks, it is still up to you to either delete those bookmarks from your bookmark library, or to repair them.

As mentioned earlier, for the purpose of cleaning up your bookmark library, the information contained in the files named 400.txt, 500.txt, and 600.txt is probably the most important.  These are the files that contain information about bookmarks that are potentially broken.  In addition, the file named 000.txt contains information about all of the bookmarks in the order that they appear in the bookmark library.  This information is sometimes useful for reference purposes.

Firefox bookmark problems are the easiest to deal with

As I mentioned earlier, it is easier to deal with the problems in the Firefox and Netscape bookmark libraries than it is to deal with problems in the IE Favorites library.  Therefore, I will begin my discussion with Firefox.  Since Netscape uses the same approach to creating and maintaining the bookmark library, these comments apply also to Netscape.

I recommend that you begin your cleanup effort with the file named 600.txt.  After you deal with all the bookmarks for which you are unable to connect to the server, you can process the information in the file named 400.txt.  After that, you can finish up with the file named 500.txt.

However, if you prefer a different order, you can process the files in any order that suits your needs.

Three windows on your screen

Regardless of the order in which you process the files, my recommendation is that you open three windows on your screen.

The bookmarks in browser view

The first window that you should open provides a browser view of the bookmark file named bookmarks.html.  Locate this file on your disk and copy it into another folder.  Then open the copy in your favorite browser producing a screen output similar to that shown in Figure 1.

When your library contains hundreds and possibly thousands of bookmarks, it can be very difficult to locate an individual bookmark in the library.  This view of the bookmarks is very useful in helping you to locate a bookmark that has been identified by the program as potentially broken.  You can use the search feature of the browser to search and find a bookmark with a given name in this view.

The bookmarks are hyperlinks

Also note that the bookmarks are hyperlinks in this view.  All that is needed to manually test the quality of a bookmark is to click on the hyperlink with your mouse.  That will cause the browser to attempt to connect to the server and to download the requested resource.

The bookmarks in Bookmarks Manager view

The second window that you should open on your screen is the Firefox Bookmarks Manager, producing a screen output similar to that shown in Figure 2.  This is the view that you should use to either delete or to repair bookmarks.

(While it is possible to edit the Firefox bookmark file directly, that is a bad idea unless you are very skilled at editing HTML.  It is probably also a bad idea to modify that file while Firefox is running even if you are skilled at editing HTML.)

Deleting a bookmark

You can delete a bookmark in the Bookmarks Manager view by highlighting the bookmark in this view and clicking the large red X in the top of the window (not shown in the cropped image in Figure 2).

Repairing a bookmark

You can repair a bookmark in this view by right-clicking a bookmark and selecting Properties.  This will produce a dialog in which you can edit the URL, changing it from a broken URL to a good URL.

Locating a specific bookmark

Recall that the order of the bookmarks in the browser view of Figure 1 is the same as the order of the bookmarks in the Bookmarks Manager view of Figure 2.  Once you get used to the formats involved, there is a strong visual correlation between the formats of Figure 1 and Figure 2.  Thus, once you have used the search feature of the browser to locate a bookmark in the browser view of Figure 1, it is usually an easy task to manually locate that bookmark in the Bookmarks Manager view of Figure 2.

(The Bookmarks Manager also has a very respectable search capability.  However, in the version of Firefox that I was using when this lesson was originally written, you could neither delete nor repair a bookmark in the search view of the Bookmarks Manager, even though the search gave you access to the bookmark itself (or more probably a copy of the bookmark).  Author's update:  With version 1.5, a bookmark that has been located in the search view can be deleted, or can be repaired by accessing its properties.  Even so, the result of a search doesn't provide any information about the actual location of the bookmark in the library.  Therefore, if the number of bookmarks in the library is large, something like the browser view of Figure 1 is needed to locate bookmarks in the Bookmarks Manager view.)

The broken bookmarks window

Depending on which type of problem you are addressing, the third window that you should open is one of the text files produced by the program.  If you open the files named 400.txt or 500.txt in a text editor, you should see something similar to Figure 10.  If you open the file named 600.txt in a text editor, you should see something similar to Figure 11.

Potentially broken bookmarks

Regardless of which type of problem you are addressing, each text file contains information about potentially broken bookmarks.

The order of the bookmarks in the text file is the same as the order of the bookmarks in the browser view of Figure 1 and the order of the bookmarks in the Bookmarks Manager view of Figure 2.  Thus, you can easily start at the top of the broken-bookmarks list and work your way down, or start at the bottom and work your way up.

The basic approach

The basic approach is to copy a bookmark name from the broken-bookmarks window, paste it into the search field of the browser view, and search for the bookmark.  Then scroll the Bookmarks Manager view to the same bookmark and either delete or repair it.

Since it is possible to have two or more bookmarks with the same name in the library, once you locate the bookmark in the browser view, you should compare its URL with the URL shown in the broken-bookmarks window to confirm that you have located the correct bookmark.

(In Firefox, if you point to a bookmark in the browser view, the URL for that hyperlink appears at the bottom of the browser window.)

Dealing with exceptions

My recommendation for dealing with exceptions is that you manually test each bookmark for which the program threw an exception when the attempt was made to connect to the server.  (The server may simply have been down for maintenance when the program was run.)  All that is necessary to manually test the bookmark in browser view is to click on the hyperlink identifying that bookmark.

If the manual test with the browser view indicates that a problem still exists, scroll the Bookmarks Manager view to locate the same bookmark in that view.  Then either delete or repair the bookmark.

Dealing with 400-series errors

As shown in Figure 8, there are several different kinds of errors that you are likely to encounter in the 400-series. You will need to interpret the meaning of the different kinds of errors to determine what to do about them.

My experience is that a 404 error (indicating that the requested resource could not be found) is usually a pretty reliable indication of a broken bookmark.  After manually testing a number of bookmarks of this type in the browser view, I concluded that unless a bookmark was one that I considered to be very important, it was not worth the effort to test it manually.  After that, whenever I encountered a 404 error, I simply scrolled the Bookmarks Manager view to that same bookmark and deleted it.

However, the true meaning of the other errors in the 400 series seems to be less definitive.  For those cases, I manually tested each bookmark before deleting it from the bookmark library.

Dealing with 500-series errors

I found the 500-series errors to be very unreliable indicators of a broken bookmark.  For the most part, I manually tested all of the bookmarks that produced 500-series errors in the browser view before deleting them.

Dealing with IE Favorites

I wish that I could give you similarly helpful suggestions as to how to deal with IE Favorites that show up in the reports as being potentially broken.  Unfortunately, I don't have quite as much to offer in this regard.

(Although Microsoft doesn't refer to their Favorites as bookmarks, for simplicity of writing, I will often refer to them as bookmarks in this lesson.)

Finding a specific IE bookmark

As near as I can determine, the only way to find a specific bookmark in the IE bookmark view shown in Figure 4 is to search for it visually and manually.

(The IE bookmark view shown in Figure 4 is exposed by clicking on the button with the large gold star and the word Favorites at the top of an IE browser window.)

Apparently no search capability is available

If the IE bookmark view provides any way to automatically search for a specific bookmark, I have been unable to find it.

(I confess, however, that I rarely use IE and therefore may have overlooked a search capability.)

Deleting a problem bookmark

Therefore, after you run the program and identify bookmarks that are potentially broken in your IE bookmarks library, you may need to manually and visually search the bookmarks view to locate those bookmarks if you want to delete them.

(I will show you another possible but somewhat questionable way to delete IE bookmarks later in this lesson.)

The IE Favorites organizer view

You can delete bookmarks from the IE Favorites library shown in Figure 4 by clicking the Organize link shown at the top of Figure 4 in order to produce the organizer view shown in Figure 12.


Figure 12

You can select a bookmark in the organizer view and click the Delete button to delete it from the Favorites library.

Repairing an IE bookmark

Having located a bookmark in either the IE Favorites view shown in Figure 4 or the organizer view shown in Figure 12, you can right-click on that bookmark and select Properties to expose a dialog that you can use to repair the bookmark.

However, if you just want to repair an IE bookmark, it may be easier to use the standard Windows Search tool shown in Figure 13 to find the Internet Shortcut file that represents the bookmark of interest.


Figure 13

If you are an IE user, you are probably already aware that you activate this search tool by clicking the button with the picture of the magnifying glass and the word Search at the top of a standard Windows XP Explorer window.

Searching for Internet Shortcut files

To search for a specific Internet Shortcut file representing an IE bookmark, open an Explorer window on the Favorites folder, which will probably have a path similar to the following:

C:\Documents and Settings\Owner\Favorites

Then open the search tool and enter the name of the file (which is also the name of the bookmark) in the search dialog that appears in the left pane of Figure 13.  Click the Search button.  If the file exists in the Favorites folder or one of its sub-folders, a link to the file will appear in the right pane of Figure 13 when the search is complete.

Double-click to test the bookmark

At this point, you can double-click the link in the right pane to manually test the bookmark that the file represents if such a test is needed.  You can also right-click the link and select Properties to expose a dialog that will allow you to edit the URL in order to repair it.

Deleting the file to delete the bookmark

You could also delete the file showing in the right pane of Figure 13 to delete the bookmark.  However, I'm not absolutely certain that is a safe thing to do.  Because Windows has the ability to maintain the order of the bookmarks in the IE bookmarks view (Figure 4), according to the arrangement that you create by dragging the bookmarks up and down, the Internet Shortcut files don't exist in a vacuum.  There is some linkage (possibly an index file) between the existence of the Internet Shortcut files and IE.  It is possible that deleting those files outside of IE could cause a problem with IE's ability to manage the bookmarks represented by those files.

(However, I frequently drag shortcuts onto the Links toolbar and delete shortcuts from the Links toolbar with no apparent ill effects.  The Links toolbar is apparently just another view of the Links folder shown in Figure 4.  On the basis of that experience, I suspect that it is probably safe to delete an Internet Shortcut file in order to delete an IE bookmark.  However, you might want to be a little cautious in this regard.  For example, it might be a good idea to make certain that IE isn't running when you delete the files.)

Program Preview

This section provides a preview of the program named Bookmarks10.

Purpose

The purpose of this program is to help you to clean up your bookmark library by identifying potentially broken bookmarks.  The program is compatible with bookmark libraries for the following browsers:

  • Firefox
  • Netscape
  • Internet Explorer

Processes HTTP bookmarks only

This program does not attempt to connect to secure web sites using the HTTPS protocol.  It also does not support FTP or any other protocol besides HTTP.  If the bookmark library contains bookmarks that specify a protocol other than HTTP, those bookmarks are simply ignored.
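
A protocol filter of this kind can be sketched in a few lines.  The class and method names are hypothetical, and the sketch assumes that each bookmark URL begins with an explicit scheme such as http:// that can be compared without regard to case.

```java
// Hypothetical helper for illustration; not part of the program in this lesson.
class ProtocolFilter {
    // True only for URLs that start with "http://" in any mix of upper
    // and lower case. HTTPS, FTP, and everything else is rejected.
    static boolean isPlainHttp(String url) {
        return url != null
            && url.trim().regionMatches(true, 0, "http://", 0, 7);
    }

    public static void main(String[] args) {
        String[] samples = {
            "HTTP://WWW.DICKBALDWIN.COM/ABC",
            "https://www.example.com/",
            "ftp://ftp.example.com/file.txt"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isPlainHttp(s));
        }
    }
}
```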

Methodology

The program attempts to connect to the server using the HTTP 1.1 protocol and to retrieve the response headers from the server for each bookmark within a specified range of bookmarks in the bookmark library.

The program uses the first line in the response header to categorize the response into one of five categories as described at http://www.jmarshall.com/easy/http/.

According to the source given above, the initial response line, often called the status line, has three parts separated by spaces:

  • The HTTP version
  • A response status code that gives the result of the request
  • An English reason phrase describing the status code.

The HTTP version is in the format "HTTP/x.x".

The status code is meant to be computer-readable.

The reason phrase is meant to be human-readable, and may vary.

Format and meaning of the status code

The status code is a three-digit integer, and the first digit identifies the general category of response:

  • 1xx indicates an informational message only
  • 2xx indicates success of some kind
  • 3xx redirects the client to another URL
  • 4xx indicates an error on the client's part
  • 5xx indicates an error on the server's part

Some typical status lines follow:

  • HTTP/1.1 200 OK
  • HTTP/1.1 301 Moved Permanently
  • HTTP/1.1 302 Moved Temporarily
  • HTTP/1.1 302 Found
  • HTTP/1.1 302 Object moved
  • HTTP/1.1 400 Bad Request
  • HTTP/1.1 401 Authorization Required
  • HTTP/1.1 403 Access Forbidden
  • HTTP/1.1 403 Invalid method
  • HTTP/1.1 404 Not found
  • HTTP/1.1 404 Object Not Found
  • HTTP/1.1 405 Method Not Allowed
  • HTTP/1.1 405
  • HTTP/1.1 500 Server Error
  • HTTP/1.1 500 Internal Server Error
  • HTTP/1.1 501 Method Not Implemented
  • HTTP/1.1 501 Method Not Supported

Note that the reason phrase does vary from one web server to another.  Also note that I haven't seen any status lines that show a status code in the 1xx range.

Program output

The first response header line, along with additional information about each bookmark within the specified range, is stored in a set of output text files named 100.txt through 500.txt.  The user can examine the information provided in those text files to determine the quality of each bookmark.

For those bookmarks that appear to be broken on the basis of the web server response, the user can either delete the bookmark from the library, or attempt to repair it.

The program produces two more output files in addition to the five output files described above.  A file named 000.txt contains information about every bookmark within the range of specified bookmarks.  A file named 600.txt contains information about each bookmark for which the program threw an exception when trying to connect to the server.  Some sample exceptions follow:

  • java.net.UnknownHostException: www.BadBookmark.com
  • java.net.ConnectException: Connection timed out: connect
  • java.net.SocketException: Network is unreachable: connect

Program input

The following five values must be provided as command-line parameters.  All command-line parameters are provided as strings, but must be convertible to the types shown below.

  • String bkMrkPath:  Path to the folder containing a Firefox bookmark file or containing a multitude of IE url files.
  • String bkMrkFile:  Name of the Firefox bookmark file.  Use a dummy name for this parameter when processing IE favorites.
  • int lowBkMrkLimit:  Index of first bookmark to process.  Indices begin with 0 for the first bookmark.
  • int numToProc:  Number of bookmarks to process.
  • String browser:  Type of browser:  F for Firefox, N for Netscape, or I for Internet Explorer.

Figure 14 shows the contents of a typical batch file used to process 200 bookmarks beginning with bookmark index 100 in an IE Favorites library.

java Bookmarks10 
"C:/Documents and Settings/Owner/Favorites/" 
DummyFileName 
100 
200 
I
Figure 14

Note that it was necessary to display each of the command-line parameters on a different line in Figure 14 to force this material to fit in this narrow publication format.
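
Converting the five command-line strings into the types listed above can be sketched as follows.  The field names match the parameter names in the list, but the class name BookmarkArgs, the error messages, and the acceptance of a lower-case browser letter are my assumptions.

```java
// Hypothetical helper for illustration; not part of the program in this lesson.
class BookmarkArgs {
    final String bkMrkPath;
    final String bkMrkFile;
    final int lowBkMrkLimit;
    final int numToProc;
    final char browser; // 'F', 'N', or 'I'

    private BookmarkArgs(String path, String file, int low, int num, char browser) {
        this.bkMrkPath = path;
        this.bkMrkFile = file;
        this.lowBkMrkLimit = low;
        this.numToProc = num;
        this.browser = browser;
    }

    // Converts the five command-line strings, rejecting anything malformed.
    static BookmarkArgs parse(String[] args) {
        if (args.length != 5) {
            throw new IllegalArgumentException("Expected 5 parameters, got " + args.length);
        }
        int low = Integer.parseInt(args[2]); // NumberFormatException if not an int
        int num = Integer.parseInt(args[3]);
        if (low < 0 || num < 0) {
            throw new IllegalArgumentException("Index and count must not be negative");
        }
        String b = args[4].trim().toUpperCase();
        if (!(b.equals("F") || b.equals("N") || b.equals("I"))) {
            throw new IllegalArgumentException("Browser must be F, N, or I");
        }
        return new BookmarkArgs(args[0], args[1], low, num, b.charAt(0));
    }

    public static void main(String[] args) {
        BookmarkArgs a = parse(new String[] {
            "C:/Documents and Settings/Owner/Favorites/", "DummyFileName", "100", "200", "I" });
        System.out.println(a.bkMrkPath + " " + a.lowBkMrkLimit + " "
            + a.numToProc + " " + a.browser);
    }
}
```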

Program testing

This program was tested using J2SE 5.0 under WinXP.  J2SE 5.0 or later is required due to the use of generics.




