Architecture & DesignUnderstanding Java Tree APIs

Understanding Java Tree APIs

A Tree is a non-linear data structure where data objects are organized in terms of hierarchical relationship. The structure is non-linear in the sense that, unlike simple array and linked list implementation, data in a tree is not organized linearly. Each data element is stored in a structure called a node. The topmost or starting node of the (inverted) tree is called the root node. All nodes are linked with an edge and form hierarchical sub trees beginning with the root node. Tree data structure is useful on occasions where linear representation of data do not suffice, such as creating a family tree. Java provides two in-built classes, TreeSet and TreeMap, in Java Collection Framework that cater to the needs of the programmer to describe data elements in the aforesaid form.

Learn JAVA and Start your Free Trial today!

What is the Java Tree Data Structure?

Java tree classes are actual implementations of a specific variety of the tree data structure. Therefore, before going into the details of its usage, we must have some idea of its organization.

A tree, T, by definition, is a non-empty set of elements where one of these elements is called the root and the remaining elements (which may or may not be present in case of an empty tree rhyming with empty set of Set Theory) are partitioned further into sub trees of T.

For example, think of the hierarchical relationship in the following family tree.

A family tree
Figure 1: A family tree

The descendants of Tim are arranged in a hierarchical fashion. Tim, at the top of the (inverted) tree, represents the root node. Tim‘s children are Bobby, Ron, and Amy. Bobby has no children, Ron has one, and Mary has two. They are listed in the same hierarchical manner. The link between each of the nodes is called an edge. This link signifies the relationship that one node has with another, such as Amy‘s children, Bobby‘s sibling, Tim‘s descendant, and so forth. Sometimes, the ending nodes of the tree are called leaves.

Now, according to the organizational structure, a tree can be implemented in many ways. Of its many forms, the binary tree is of special importance because it is simple and efficient to implement. The constraint with binary tree is that it allows at most two children to descend from a parent node; they are called the left-child and the right-child, respectively. The tree progressing from left-child is called left-sub tree, and the tree progressing from right-child is called right-sub tree. This is common for any type of binary tree, because a binary tree also has various implementation schemes. Each of these schemes has certain clear defined norms for creation and maintenance, which directly impacts the access mechanics of the data elements, usually measured in Big O notation. For example, if a data element to be searched in a particular type of binary tree takes, say O(n2) times, and another type of binary tree implementation takes say, O(log2 n) times in worst cases, the second implementation is more efficient than the first.

Binary tree implementation
Figure 2: Binary tree implementation

Some of the common binary tree types are termed as full-binary tree, complete-binary tree, binary search tree (BST), height balance tree (AVL), red-black tree, and so on.

What is the Red and Black Tree?

Among the various types of binary trees, here we are interested in the red-black tree because Java tree API implementation is an instance of this data structure. There is a reason for Java API designers culled this binary tree scheme. The reason is that it is one of the many balanced search tree schemes that guarantees basic dynamic set operation to complete in O(log2 n) times even in worst cases.

The properties of the red-black tree schemes are as follows:

  • The red-black tree keeps one extra bit of information per node to denote its color, red or black.
  • The root is always black.
  • Every leaf is black.
  • Both the children of a red node must be black.
  • For each node, all simple paths from node to descendant leaves contain the same number of black nodes.

Therefore, the implementation must ascertain that these properties are maintained at any instance of time, especially during INSERTION and DELETION of a node. And, to keep it intact dynamically, sub-trees are rotated left or right with intricate logic. If you want to dive deeper into the algorithm and find out how exactly it works at the low-level then read, Introduction to Algorithms by Cormen, Leiserson, Rivest, and Stein.

Java Tree APIs

When using the Java API library, fortunately or unfortunately we do not have to deal with the low-level complex implementation logic of the red-black tree. Instead, we can focus on solving the problem at hand rather than implementing the scheme from scratch. Just import the relevant package and create an instance of the ready-made tree classes available. That is all we need to do.

As mentioned earlier, there are two variety of tree implementation in Java collection framework. One is TreeSet and another is TreeMap. They are defined in the java.util package as follows:

public class TreeSet<E>
   extends AbstractSet<E>
   implements NavigableSet<E>, Cloneable, Serializable


public class TreeMap<K, V>
   extends AbstractMap<K, V>
   implements NavigableMap<K, V>, Cloneable, Serializable

The intriguing feature of both the tree classes is that one is furnished as a Set and another as Map. The Set and Map are interfaces implemented by the abstract classes AbstractSet and AbstractMap, respectively.

The Set represents a collection of distinct elements—an element e1 must not be equal to another element e2 of the same set by e1.equal(e2). Also, there may only be one null element in the set. The property it imposes upon the collection of elements is based upon the mathematical set abstraction model.

The Map property imposes that the collection of elements must have a key, value pair. Each key maps to one value only; that means it disallows duplicate keys. A value with a distinct key may be duplicated, however.

The tree classes, TreeSet and TreeMap, adhere to the specific norms derived from their respective interfaces apart from organizing its internal data structure in binary tree form.

A Quick Example of TreeSet

As we know, the property of Set implementation ensures that the tree shall not contain any duplicates when storing the data element in a tree. In contrast to other set implementations, the TreeSet guarantees that the data elements stored will be sorted by default according to the natural ordering of the elements. Let’s have a glimpse of its usage with the help of a simple example.

package org.mano.example;

import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;

public class TreeSetDemo {

   public static void main(String[] args) {

      Integer[] nums={2,4,1,6,3,7,9,5};
      SortedSet<Integer> tree=new TreeSet<>(Arrays.asList(nums));

      // Print first and last element
      System.out.println(tree.first());
      System.out.println(tree.last());

      printAll(tree);
      // False. Set does not allow duplicates,
      // so this will not be added.
      System.out.println(tree.add(1));

      // But, this will be added because 11 is not a duplicate
      System.out.println(tree.add(11));
      printAll(tree);

      printAll(tree.headSet(7));

   }

   public static void printAll(SortedSet<Integer> tree){
      for(int s: tree){
         System.out.println(s);
      }
      System.out.println();
   }
}

A Quick Example of TreeMap

A key, value pair class much like another Map implementation, called a HashMap. Apart from storing its data elements in a red-black tree structure, TreeMap maintains an order in the keys stored. Therefore, when we iterate over the keys, we find that they are naturally ordered. Let us see with the help of a simple example.

package org.mano.example;

import java.util.Map;
import java.util.TreeMap;

public class TreeMapDemo {

   public static void main(String[] args){

      TreeMap<String, Double> treeMap=new TreeMap<>();

      treeMap.put("Paradise Lost", 23.56);
      treeMap.put("Golden Treasury", 12.47);
      treeMap.put("Moon and the Sixpence", 65.28);
      treeMap.put("Holinshed", 7.68);
      treeMap.put("Ancient Mariner", 45.36);

      printAll(treeMap);

      // Keys cannot be duplicates. This will not be stored.
      treeMap.put("Paradise Lost", 23.56);
      printAll(treeMap);

      // Values may be duplicates. This will be stored.
      treeMap.put("Paradise Regained", 23.56);
      printAll(treeMap);

   }

   public static void printAll(TreeMap<String, Double> treeMap){
      for(Map.Entry<String, Double> et:treeMap.entrySet()){
         System.out.println(et.getKey()+": "+et.getValue());
      }
      System.out.println();
   }
}

Conclusion

The TreeSet and TreeMap classes are the most obvious implementation of binary tree data structure in the Java API Library. For the high-level users, the rules of data organization do not make any difference in its usages. But, the tree structure is slightly more complicated and inefficient than its non-tree or linear counterparts, like HashSet and HashMap, due to the numerous rules to maintain the norms of a balanced tree structure. Therefore, unless a specific need arises, its counterpart class should be used more often than trees.

To learn more about Java check out TechRepublic Academy!

Get the Free Newsletter!

Subscribe to Developer Insider for top news, trends & analysis

Latest Posts

Related Stories