Training examples (items) are first run through a FilterEngine object. The FilterEngine applies hit-or-miss, SQL-style filters that remove the items you do not want considered in the similarity classification. For example, if you do not want to include anyone from the people database of Table 2 who is shorter than 180 cm, the FilterEngine can eliminate those training examples before the rest are passed to the SimilarityEngine for classification.
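The idea of a hit-or-miss filter can be sketched in a few lines of Java. This is only an illustration, not the Selection Engine's FilterEngine: the Person and FilterSketch classes, the keepMatching() method, and the sample heights are all hypothetical and are not taken from Table 2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for one row of a people table (not a Selection Engine class).
final class Person {
    final String name;
    final int heightCm;

    Person(String name, int heightCm) {
        this.name = name;
        this.heightCm = heightCm;
    }
}

// Hypothetical helper that mimics hit-or-miss filtering.
final class FilterSketch {
    // Returns only the items that pass the test; every other item is dropped.
    static List<Person> keepMatching(List<Person> items, Predicate<Person> test) {
        List<Person> kept = new ArrayList<>();
        for (Person p : items) {
            if (test.test(p)) {
                kept.add(p);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up sample data for illustration only.
        List<Person> people = Arrays.asList(
                new Person("A", 172),
                new Person("B", 185),
                new Person("C", 180));

        // Drop everyone shorter than 180 cm before any similarity scoring.
        List<Person> tallEnough = keepMatching(people, p -> p.heightCm >= 180);
        for (Person p : tallEnough) {
            System.out.println(p.name + " " + p.heightCm);
        }
    }
}
```

An item either passes the test or it does not; there is no partial score at this stage, which is what separates filtering from the similarity ranking that follows.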
Listing 2 (computeSimilarity() method)
public SimilarItems computeSimilarity( Items items,
                                       SimilarityCriteria criteria,
                                       SimilarityWeights weights,
                                       DataSetStatistics statistics )
{
    //--- This is the item that we want to compare ourselves to
    //--- to see how similar we are
    SimilarityCriterionScores targetValues = getTargetValues(
            items.getTraitDescriptors(),
            criteria, weights,
            statistics );
    float maxDistance = getMaxDistance( criteria, weights );

    //--- Create a similarity descriptor for each item
    SimilarItems similarItems = new SimilarItems();
    Iterator itemList = items.iterator();
    while (itemList.hasNext()) {
        Item item = (Item) itemList.next();
        SimilarityDescription descriptor = new SimilarityDescription();
        descriptor.setItem( item );

        SimilarityCriterionScores itemValues = normalizeValues(
                item, criteria, weights,
                statistics );
        float distance = computeDistance( targetValues, itemValues );
        float percentDifference = (distance / maxDistance);
        float percentSimilarity = (1 - percentDifference);
        descriptor.setPercentSimilarity( percentSimilarity );
        similarItems.add( descriptor );
    }

    //--- Now that we know how similar everyone is, let's go
    //--- and rank them so that the caller has an easy way to
    //--- sort the items and can more easily select the
    //--- k-best matches
    similarItems.rankItems();

    //--- All done.
    return similarItems;
}
The computeSimilarity() method of the SimilarityEngine class also takes SimilarityCriteria and SimilarityWeights objects as arguments. SimilarityCriteria is a collection of SimilarityCriterion objects. A SimilarityCriterion object describes a simple relationship made up of:
an attribute
a value
an operator (3 types)
~ means "around" (Use with numbers)
% means "prefer" (Use with strings and booleans)
!% means "try to avoid" (Use with strings and booleans)
Examples of these operators, using the hypothetical training examples of Table 2:
"age ~ 37" (literally means around 37 years of age)
"name % 'Daniel'" (literally means prefer Daniel)
"name !% 'Daniel'" (literally means try to avoid Daniel)
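To make the attribute/operator/value structure concrete, here is a minimal sketch that splits strings like the examples above into their three parts. It is not the Selection Engine's own SimilarityCriterion class: the CriterionSketch class, its parse() method, and the assumption that the three parts are separated by whitespace are all hypothetical.

```java
import java.util.Locale;

// Hypothetical holder for one similarity criterion (not the Selection Engine class).
final class CriterionSketch {
    final String attribute;
    final String operator;
    final String value;

    private CriterionSketch(String attribute, String operator, String value) {
        this.attribute = attribute;
        this.operator = operator;
        this.value = value;
    }

    // Assumes the form: <attribute> <operator> <value>, separated by whitespace.
    static CriterionSketch parse(String text) {
        String[] parts = text.trim().split("\\s+", 3);
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected attribute, operator, value: " + text);
        }
        String op = parts[1];
        if (!op.equals("~") && !op.equals("%") && !op.equals("!%")) {
            throw new IllegalArgumentException("Unknown similarity operator: " + op);
        }
        String value = parts[2].trim();
        // Strip one pair of surrounding single quotes from string values.
        if (value.length() >= 2 && value.startsWith("'") && value.endsWith("'")) {
            value = value.substring(1, value.length() - 1);
        }
        // Attribute names are stored in lower case so lookups ignore case.
        return new CriterionSketch(parts[0].toLowerCase(Locale.ROOT), op, value);
    }

    public static void main(String[] args) {
        CriterionSketch c = parse("name !% 'Daniel'");
        System.out.println(c.attribute + " | " + c.operator + " | " + c.value);
    }
}
```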
The Selection Engine treats attribute names as case insensitive. The filter operators used by the FilterEngine are =, !=, <, >, <=, and >=, which perform hit-or-miss (Boolean) filtering. For example, "age < 30" filters on items whose age is under 30.
For numeric attributes, the SimilarityEngine recognizes two special values, [MAX_VAL] and [MIN_VAL]. These are relative values rather than absolute values. The SimilarityEngine translates them into absolute numbers by determining the maximum and minimum values of each attribute across the items.
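One way to picture this translation is the sketch below, which replaces a relative value with the largest or smallest value observed for an attribute. The RelativeValueSketch class and its resolve() method are hypothetical and do not come from the Selection Engine; the sample ages are made up.

```java
// Hypothetical helper (not part of the Selection Engine) that turns a
// relative value into an absolute one using the values seen in the data.
final class RelativeValueSketch {
    // 'observed' holds every value of one numeric attribute; it must not be empty.
    static double resolve(String raw, double[] observed) {
        double min = observed[0];
        double max = observed[0];
        for (double v : observed) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        if ("[MAX_VAL]".equals(raw)) {
            return max;
        }
        if ("[MIN_VAL]".equals(raw)) {
            return min;
        }
        // Anything else is treated as an ordinary absolute number.
        return Double.parseDouble(raw);
    }

    public static void main(String[] args) {
        double[] ages = {31, 45, 27}; // made-up sample data
        System.out.println(resolve("[MAX_VAL]", ages));
        System.out.println(resolve("[MIN_VAL]", ages));
        System.out.println(resolve("37", ages));
    }
}
```

With a helper like this, a criterion such as "age ~ [MAX_VAL]" would be scored against the largest age in the data set rather than against a number fixed in advance.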
As Listing 2 shows, the overloaded computeSimilarity() method is passed a third parameter, SimilarityWeights, which is a collection of SimilarityWeight objects. A SimilarityWeight is also a name/value pair, where the name is the name of an attribute and the value is its weight. A weight can be any integer value. The default weight of every attribute is 1. Weights affect only the similarity operators (~, %, !%), not the filter operators (=, !=, <, >, <=, and >=).
The Selection Engine computes similarity as follows:
Compute the maximum distance over the n attributes in the criteria.
Calculate the Euclidean distance between the query instance xq and each training example (item).
Divide the Euclidean distance by the maximum distance (a normalization process).
Subtract the normalized value from 1 to get the percentage similarity between the query instance xq and that training example; the ranked results then yield the k nearest neighbours.
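The four steps can be sketched numerically as follows. This is an illustration under stated assumptions, not the Selection Engine's implementation: it assumes each attribute has already been normalized to the range 0 to 1 and that a weight simply scales its attribute's difference, so the maximum distance is the square root of the sum of the squared weights. The SimilaritySketch class and its method names are hypothetical.

```java
// Hypothetical sketch of the distance-to-similarity steps (not Selection Engine code).
final class SimilaritySketch {
    // Step 1: largest possible distance, assuming every attribute is
    // normalized to [0, 1] and scaled by its weight.
    static double maxDistance(double[] weights) {
        double sum = 0.0;
        for (double w : weights) {
            sum += w * w;
        }
        return Math.sqrt(sum);
    }

    // Step 2: weighted Euclidean distance between the target and one item.
    static double distance(double[] target, double[] item, double[] weights) {
        double sum = 0.0;
        for (int i = 0; i < target.length; i++) {
            double d = weights[i] * (target[i] - item[i]);
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Steps 3 and 4: normalize by the maximum distance, then subtract from 1.
    static double percentSimilarity(double[] target, double[] item, double[] weights) {
        double percentDifference = distance(target, item, weights) / maxDistance(weights);
        return 1.0 - percentDifference;
    }

    public static void main(String[] args) {
        double[] weights = {3, 4};     // made-up weights
        double[] target = {1.0, 1.0};  // normalized query instance
        double[] item = {0.0, 1.0};    // normalized training example
        System.out.println(percentSimilarity(target, item, weights));
    }
}
```

In this made-up case the distance is 3 and the maximum distance is 5, so the item differs from the target by 60 percent and is therefore 40 percent similar. An item identical to the target scores 1, and an item at the opposite extreme on every attribute scores 0.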
Listing 3 is a test class, called SelectionEngineTest, which comes with the Selection Engine's distribution.