Instance-Based Learning: A Java Implementation
By Sione Palu


Training examples (items) are first run through a FilterEngine object. The FilterEngine performs hit-or-miss, SQL-style filtering: it removes the items you do not want considered in the similarity classification. For example, with the people database of Table 2, if you do not want to include anyone shorter than 180 cm, the FilterEngine eliminates those training examples before the rest are passed to the SimilarityEngine for classification.
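The hit-or-miss step can be pictured with a minimal sketch. The Person class, its fields, and the filterByMinHeight() helper are hypothetical names chosen for illustration; they are not part of the Selection Engine API, and the sample rows are made up rather than taken from Table 2.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for one training example (item);
// not a Selection Engine class.
class Person {
    final String name;
    final int heightCm;

    Person(String name, int heightCm) {
        this.name = name;
        this.heightCm = heightCm;
    }
}

class FilterSketch {
    // Hit-or-miss filter: an item either passes the test or is dropped.
    // No partial score is produced at this stage.
    static List<Person> filterByMinHeight(List<Person> people, int minHeightCm) {
        List<Person> kept = new ArrayList<>();
        for (Person p : people) {
            if (p.heightCm >= minHeightCm) {
                kept.add(p);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Person> people = new ArrayList<>();
        people.add(new Person("A", 175));   // made-up sample rows
        people.add(new Person("B", 180));
        people.add(new Person("C", 192));

        // Only the survivors would be handed on for similarity scoring.
        for (Person p : filterByMinHeight(people, 180)) {
            System.out.println(p.name + " " + p.heightCm);
        }
    }
}
```

The point of filtering first is that discarded items never receive a similarity score at all, which is different from scoring them low.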

Listing 2 (computeSimilarity() method)

  public SimilarItems computeSimilarity( Items items,
                                         SimilarityCriteria criteria,
                                         SimilarityWeights weights,
                                         DataSetStatistics statistics ) {

    //--- This is the item that we want to compare ourselves to
    //---   to see how similar we are
    SimilarityCriterionScores targetValues =
        getTargetValues( items.getTraitDescriptors(),
                         criteria, weights, statistics );

    float maxDistance = getMaxDistance( criteria, weights );

    //--- Create a similarity descriptor for each item
    SimilarItems similarItems = new SimilarItems();
    Iterator itemList = items.iterator();

    while (itemList.hasNext()) {
      Item item = (Item) itemList.next();
      SimilarityDescription descriptor = new SimilarityDescription();
      descriptor.setItem( item );

      SimilarityCriterionScores itemValues =
          normalizeValues( item, criteria, weights, statistics );
      float distance = computeDistance( targetValues, itemValues );
      float percentDifference = (distance / maxDistance);
      float percentSimilarity = (1 - percentDifference);
      descriptor.setPercentSimilarity( percentSimilarity );
      similarItems.add( descriptor );
    }

    //--- Now that we know how similar everyone is, let's go
    //---   and rank them so that the caller has an easy way to
    //---   sort the items and select the k-best matches
    similarItems.rankItems();

    //--- All done.
    return similarItems;
  }

The computeSimilarity() method of the SimilarityEngine class also takes SimilarityCriteria and SimilarityWeights arguments. SimilarityCriteria is a collection of SimilarityCriterion objects. A SimilarityCriterion object describes a simple relationship made up of:

  • an attribute
  • a value
  • an operator (3 types)
    1. ~ means "around"   (Use with numbers)
    2. % means "prefer"   (Use with strings and booleans)
    3. !% means "try to avoid"   (Use with strings and booleans)

Examples of these operators, applied to the hypothetical training examples in Table 2:

  • "age ~ 37" (literally means around 37 years of age)
  • "name % 'Daniel'" (literally means prefer Daniel)
  • "name !% 'Daniel'" (literally means try to avoid Daniel)
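To make the attribute/operator/value structure concrete, here is a minimal sketch of how such a criterion string might be split apart. CriterionSketch and its parse() method are hypothetical and only illustrate the idea; the actual SimilarityCriterion class may be built quite differently.

```java
import java.util.Locale;

// Hypothetical holder for one parsed criterion; field names are illustrative.
class CriterionSketch {
    final String attribute;
    final String operator;
    final String value;

    CriterionSketch(String attribute, String operator, String value) {
        this.attribute = attribute;
        this.operator = operator;
        this.value = value;
    }

    // Splits text such as "age ~ 37" into attribute, operator, and value.
    static CriterionSketch parse(String text) {
        String[] operators = { "!%", "~", "%" };
        int bestAt = -1;
        String bestOp = null;
        // Pick the operator that occurs earliest in the text. "!%" starts
        // one character before the "%" it contains, so it wins over "%".
        for (String op : operators) {
            int at = text.indexOf(op);
            if (at >= 0 && (bestAt < 0 || at < bestAt)) {
                bestAt = at;
                bestOp = op;
            }
        }
        if (bestOp == null) {
            throw new IllegalArgumentException("No similarity operator in: " + text);
        }
        // Attribute names are lower-cased so that lookups ignore case.
        String attribute = text.substring(0, bestAt).trim().toLowerCase(Locale.ROOT);
        String value = text.substring(bestAt + bestOp.length()).trim();
        // Strip one pair of surrounding single quotes, if present.
        if (value.length() >= 2 && value.startsWith("'") && value.endsWith("'")) {
            value = value.substring(1, value.length() - 1);
        }
        return new CriterionSketch(attribute, bestOp, value);
    }
}
```

For instance, CriterionSketch.parse("name !% 'Daniel'") yields the attribute name, the operator !%, and the value Daniel.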

The Selection Engine treats attribute names as case insensitive. The filter operators used by the FilterEngine are =, !=, <, >, <=, and >=; these perform hit-or-miss (boolean) filtering. For example, "age < 30" is satisfied only by items whose age is under 30.

For numeric attributes, the SimilarityEngine recognizes two special values, [MAX_VAL] and [MIN_VAL]. These are relative rather than absolute values. The SimilarityEngine translates them into absolute numbers by determining the maximum and minimum values of each attribute across the items.
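The translation from relative to absolute values might look like the following sketch. RelativeValueSketch and resolve() are hypothetical names, and the sketch assumes the observed values of one attribute are available as a plain array; the Selection Engine itself passes a DataSetStatistics object for this purpose.

```java
class RelativeValueSketch {
    // Returns the absolute number for a criterion value. The observed
    // array holds every value of the attribute across the items.
    static double resolve(String value, double[] observed) {
        if (observed.length == 0) {
            throw new IllegalArgumentException("no observed values");
        }
        double min = observed[0];
        double max = observed[0];
        for (double v : observed) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        if (value.equals("[MAX_VAL]")) {
            return max;   // relative value: largest value seen in the data
        }
        if (value.equals("[MIN_VAL]")) {
            return min;   // relative value: smallest value seen in the data
        }
        return Double.parseDouble(value);   // already an absolute number
    }
}
```

With made-up ages of 18, 37, and 64, a criterion such as "age ~ [MAX_VAL]" would then be treated as "age ~ 64".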

As Listing 2 shows, the overloaded computeSimilarity() method is passed a third parameter, SimilarityWeights, which is a collection of SimilarityWeight objects. A SimilarityWeight is also a name/value pair, where the name is the name of an attribute and the value is its weight. A weight can be any integer value; the default weight of every attribute is 1. Weights affect only the similarity operators ( ~, %, !% ), not the filter operators ( =, !=, <, >, <=, and >= ).
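The name/value behaviour with a default of 1 can be sketched as a small map wrapper. WeightsSketch, setWeight(), and getWeight() are hypothetical names used only for illustration and are not the SimilarityWeights API.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical weight table: attribute name -> integer weight.
class WeightsSketch {
    private final Map<String, Integer> weights = new HashMap<>();

    void setWeight(String attribute, int weight) {
        // Keys are lower-cased because attribute names are case insensitive.
        weights.put(attribute.toLowerCase(Locale.ROOT), weight);
    }

    int getWeight(String attribute) {
        // Any attribute without an explicit weight falls back to 1.
        return weights.getOrDefault(attribute.toLowerCase(Locale.ROOT), 1);
    }
}
```

Defaulting to 1 means that a query with no weights at all treats every attribute as equally important.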

The Selection Engine computes similarity as follows:

  • Compute the maximum possible distance from the criteria and their weights (getMaxDistance() in Listing 2).
  • Calculate the Euclidean distance between the query instance xq and each training example (item).
  • Divide the Euclidean distance by the maximum distance (a normalization step).
  • Subtract the normalized value from 1 to obtain the percentage similarity between the query instance xq and the item.
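The four steps above can be sketched in a few lines. This is an illustration under stated assumptions, not the Selection Engine's own code: it assumes every attribute value has already been normalized to the range 0 to 1, that a weight simply scales the per-attribute difference, and that the maximum distance is the one reached when every attribute differs by the full range. SimilaritySketch and its methods are hypothetical names.

```java
class SimilaritySketch {
    // Step 1: largest possible distance, reached when every normalized
    // attribute differs by 1 (assumption: weights scale the differences).
    static double maxDistance(int[] weights) {
        double sum = 0.0;
        for (int w : weights) {
            sum += (double) w * w;
        }
        return Math.sqrt(sum);
    }

    // Step 2: weighted Euclidean distance between the query instance
    // and one item, both given as normalized attribute values.
    static double distance(double[] target, double[] item, int[] weights) {
        double sum = 0.0;
        for (int i = 0; i < target.length; i++) {
            double diff = weights[i] * (target[i] - item[i]);
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // Steps 3 and 4: normalize by the maximum distance, then subtract from 1.
    static double percentSimilarity(double[] target, double[] item, int[] weights) {
        double max = maxDistance(weights);
        if (max == 0.0) {
            // Assumption: with nothing to compare, treat the items as identical.
            return 1.0;
        }
        return 1.0 - distance(target, item, weights) / max;
    }
}
```

Under these assumptions an item identical to the query scores 1, and an item at the opposite extreme on every attribute scores 0. Ranking the items by this score is what lets the caller pick the k best matches.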

Listing 3 is a test class, called SelectionEngineTest, which comes with the Selection Engine's distribution.

Listing 3

import net.sourceforge.selectionengine.*;

/**
 * This is the MAIN CONTROLLING class. It calls everything else.
 * @author baylor wetzel (Selection Engine)
 */
public class SelectionEngineTest {

  public static void main( String[] args ) {
    SelectionEngineTest demo = new SelectionEngineTest();
    demo.start();
  } //--- main

  /**
   * This is the main routine. It takes items that matched a
   * given target and displays the results to stdout. It's good
   * for running tests.
   */
  public void start() {
    try {
      MatchingItemsManager matchingItemsManager =
                   new MatchingItemsManager();
      DisplayManager displayManager = new DisplayManager();
      matchingItemsManager.load();

      TraitDescriptors traitDescriptors =
                   matchingItemsManager.getTraitDescriptors();
      FilterCriteria filterCriteria =
                   matchingItemsManager.getQueryManager()
                                       .getFilterCriteria();

      Items items = matchingItemsManager.getItems();
      Items filteredItems =
                   matchingItemsManager.getFilteredItems();

      SimilarItems similarItems =
                   matchingItemsManager.getSimilarItems();
      SimilarityCriteria sc =
                   matchingItemsManager.getSimilarityCriteria();
      SimilarityWeights sw =
                   matchingItemsManager.getSimilarityWeights();

      displayManager.displayTraitDescriptors( traitDescriptors );
      displayManager.displayItems( traitDescriptors, items );
      displayManager.displayFilterCriteria( filterCriteria );
      displayManager.displayItems( traitDescriptors,
                                   filteredItems );
      displayManager.displaySimilarityCriteria( sc );
      displayManager.displaySimilarityWeights( sw );
      displayManager.displaySimilarItems( traitDescriptors,
                                          similarItems );
      System.out.println( "" );
    }
    catch( Exception e ) {
      Log.log( "Error: " + e );
    }
  } //--- start

} //--- SelectionEngineTest
