# Understanding Java Tree APIs

A *Tree* is a non-linear data structure where data objects are organized in terms of hierarchical relationship. The structure is non-linear in the sense that, unlike simple array and linked list implementation, data in a tree is not organized linearly. Each data element is stored in a structure called a *node*. The topmost or starting node of the (inverted) tree is called the *root node*. All nodes are linked with an edge and form hierarchical sub trees beginning with the root node. Tree data structure is useful on occasions where linear representation of data do not suffice, such as creating a family tree. Java provides two in-built classes, *TreeSet *and *TreeMap*, in Java Collection Framework that cater to the needs of the programmer to describe data elements in the aforesaid form.

## Tree Data Structure

Java tree classes are actual implementations of a specific variety of the tree data structure. Therefore, before going into the details of its usage, we must have some idea of its organization.

A tree, T, by definition, is a non-empty set of elements where one of these elements is called the root and the remaining elements (which may or may not be present in case of an empty tree rhyming with empty set of Set Theory) are partitioned further into sub trees of T.

For example, think of the hierarchical relationship in the following family tree.

**Figure 1:** A family tree

The descendants of *Tim* are arranged in a hierarchical fashion. *Tim*, at the top of the (inverted) tree, represents the root node. *Tim*'s children are *Bobby*, *Ron*, and *Amy*. *Bobby* has no children, *Ron* has one, and *Mary* has two. They are listed in the same hierarchical manner. The link between each of the nodes is called an *edge*. This link signifies the relationship that one node has with another, such as *Amy*'s children, *Bobby*'s sibling, *Tim*'s descendant, and so forth. Sometimes, the ending nodes of the tree are called *leaves*.

Now, according to the organizational structure, a tree can be implemented in many ways. Of its many forms, the binary tree is of special importance because it is simple and efficient to implement. The constraint with binary tree is that it allows at most two children to descend from a parent node; they are called the left-child and the right-child, respectively. The tree progressing from left-child is called left-sub tree, and the tree progressing from right-child is called right-sub tree. This is common for any type of binary tree, because a binary tree also has various implementation schemes. Each of these schemes has certain clear defined norms for creation and maintenance, which directly impacts the access mechanics of the data elements, usually measured in *Big O* notation. For example, if a data element to be searched in a particular type of binary tree takes, say O(n^{2}) times, and another type of binary tree implementation takes say, O(log_{2} n) times in worst cases, the second implementation is more efficient than the first.

**Figure 2:** Binary tree implementation

Some of the common binary tree types are termed as full-binary tree, complete-binary tree, binary search tree (BST), height balance tree (AVL), red-black tree, and so on.

## Red and Black Tree

Among the various types of binary trees, here we are interested in the red-black tree because Java tree API implementation is an instance of this data structure. There is a reason for Java API designers culled this binary tree scheme. The reason is that it is one of the many balanced search tree schemes that guarantees basic dynamic set operation to complete in O(log_{2} n) times even in worst cases.

The properties of the red-black tree schemes are as follows:

- The red-black tree keeps one extra bit of information per node to denote its color, red or black.
- The root is always black.
- Every leaf is black.
- Both the children of a red node must be black.
- For each node, all simple paths from node to descendant leaves contain the same number of black nodes.

Therefore, the implementation must ascertain that these properties are maintained at any instance of time, especially during INSERTION and DELETION of a node. And, to keep it intact dynamically, sub-trees are rotated left or right with intricate logic. If you want to dive deeper into the algorithm and find out how exactly it works at the low-level then read, *Introduction to Algorithms* by Cormen, Leiserson, Rivest, and Stein.

## Java Tree APIs

When using the Java API library, fortunately or unfortunately we do not have to deal with the low-level complex implementation logic of the red-black tree. Instead, we can focus on solving the problem at hand rather than implementing the scheme from scratch. Just import the relevant package and create an instance of the ready-made tree classes available. That is all we need to do.

As mentioned earlier, there are two variety of tree implementation in Java collection framework. One is *TreeSet* and another is *TreeMap*. They are defined in the *java.util* package as follows:

public class TreeSet<E> extends AbstractSet<E> implements NavigableSet<E>, Cloneable, Serializable public class TreeMap<K, V> extends AbstractMap<K, V> implements NavigableMap<K, V>, Cloneable, Serializable

The intriguing feature of both the tree classes is that one is furnished as a *Set* and another as *Map*. The *Set* and *Map* are interfaces implemented by the abstract classes *AbstractSet* and *AbstractMap*, respectively.

The *Set* represents a collection of distinct elements—an element e1 must not be equal to another element e2 of the same set by e1.equal(e2). Also, there may only be one *null* element in the set. The property it imposes upon the collection of elements is based upon the mathematical set abstraction model.

The *Map* property imposes that the collection of elements must have a key, value pair. Each key maps to one value only; that means it disallows duplicate keys. A value with a distinct key may be duplicated, however.

The tree classes, *TreeSet* and *TreeMap*, adhere to the specific norms derived from their respective interfaces apart from organizing its internal data structure in binary tree form.

## A Quick Example of *TreeSet*

As we know, the property of *Set* implementation ensures that the tree shall not contain any duplicates when storing the data element in a tree. In contrast to other set implementations, the TreeSet guarantees that the data elements stored will be sorted by default according to the natural ordering of the elements. Let's have a glimpse of its usage with the help of a simple example.

packageorg.mano.example;importjava.util.Arrays;importjava.util.SortedSet;importjava.util.TreeSet;public classTreeSetDemo {public static voidmain(String[] args) { Integer[] nums={2,4,1,6,3,7,9,5}; SortedSet<Integer> tree=newTreeSet<>(Arrays.asList(nums)); // Print first and last element System..println(tree.first()); System.out.println(tree.last());outprintAll(tree); // False. Set does not allow duplicates, // so this will not be added. System..println(tree.add(1)); // But, this will be added because 11 is not a duplicate System.out.println(tree.add(11));outprintAll(tree);printAll(tree.headSet(7)); }public static voidprintAll(SortedSet<Integer> tree){for(ints: tree){ System..println(s); } System.out.println(); } }out

## A Quick Example of *TreeMap*

A key, value pair class much like another *Map* implementation, called a *HashMap*. Apart from storing its data elements in a red-black tree structure, *TreeMap* maintains an order in the keys stored. Therefore, when we iterate over the keys, we find that they are naturally ordered. Let us see with the help of a simple example.

packageorg.mano.example;importjava.util.Map;importjava.util.TreeMap;public classTreeMapDemo {public static voidmain(String[] args){ TreeMap<String, Double> treeMap=newTreeMap<>(); treeMap.put("Paradise Lost", 23.56); treeMap.put("Golden Treasury", 12.47); treeMap.put("Moon and the Sixpence", 65.28); treeMap.put("Holinshed", 7.68); treeMap.put("Ancient Mariner", 45.36);printAll(treeMap); // Keys cannot be duplicates. This will not be stored. treeMap.put("Paradise Lost", 23.56);printAll(treeMap); // Values may be duplicates. This will be stored. treeMap.put("Paradise Regained", 23.56);printAll(treeMap); }public static voidprintAll(TreeMap<String, Double> treeMap){for(Map.Entry<String, Double> et:treeMap.entrySet()){ System..println(et.getKey()+": "+et.getValue()); } System.out.println(); } }out

## Conclusion

The *TreeSet* and *TreeMap* classes are the most obvious implementation of binary tree data structure in the Java API Library. For the high-level users, the rules of data organization do not make any difference in its usages. But, the tree structure is slightly more complicated and inefficient than its non-tree or linear counterparts, like *HashSet* and *HashMap*, due to the numerous rules to maintain the norms of a balanced tree structure. Therefore, unless a specific need arises, its counterpart class should be used more often than trees.

*This article was originally published on May 25, 2017*